andygrove opened a new issue, #3164:
URL: https://github.com/apache/datafusion-comet/issues/3164
## What is the problem the feature request solves?
> **Note:** This issue was generated with AI assistance. The specification
> details have been extracted from Spark documentation and may need verification.
Comet does not currently support Spark's `map_contains_key` function, so
queries that use it fall back to Spark's JVM execution instead of running
natively on DataFusion.
The `MapContainsKey` expression checks whether a given key exists in a map.
It is implemented as a runtime-replaceable expression that internally uses
`ArrayContains` on the map's keys to perform the lookup.
Supporting this expression would allow more Spark workloads to benefit from
Comet's native acceleration.
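The runtime replacement described above can be sketched in plain Rust over ordinary collections. This is an illustration of the rewrite's semantics only; the free functions below are not Comet, Spark, or DataFusion APIs.

```rust
use std::collections::BTreeMap;

// map_keys: extract the map's keys as an array
fn map_keys<K: Clone + Ord, V>(m: &BTreeMap<K, V>) -> Vec<K> {
    m.keys().cloned().collect()
}

// array_contains: linear containment check over the key array
fn array_contains<K: PartialEq>(keys: &[K], key: &K) -> bool {
    keys.contains(key)
}

// map_contains_key(m, k)  ==>  array_contains(map_keys(m), k)
fn map_contains_key<K: Clone + Ord, V>(m: &BTreeMap<K, V>, key: &K) -> bool {
    array_contains(&map_keys(m), key)
}
```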
## Describe the potential solution
### Spark Specification
**Syntax:**
```sql
map_contains_key(map_expr, key_expr)
```
**Arguments:**
| Argument | Type | Description |
|----------|------|-------------|
| map_expr | MapType | The map to search in |
| key_expr | Same as map key type or compatible | The key to search for |
**Return Type:** Returns `BooleanType` - `true` if the key exists in the
map, `false` otherwise.
**Supported Data Types:**
- Map input: Any `MapType` with orderable key types
- Key input: Must be the same type as the map's key type or a type that can
be coerced to it through type widening
- Null key inputs are not supported and will result in a type check error
**Edge Cases:**
- **Null key handling**: Null keys are explicitly rejected during type
checking and will cause a `DataTypeMismatch` error with `NULL_TYPE` subclass
- **Type mismatch**: If key type cannot be coerced to map key type, throws
`DataTypeMismatch` with `MAP_FUNCTION_DIFF_TYPES` subclass
- **Non-orderable keys**: Key types that don't support ordering operations
will fail type validation
- **Empty maps**: Returns `false` for any key lookup in empty maps
- **Null maps**: Standard null propagation rules apply - null map input
returns null result
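The empty-map and null-map cases above can be modeled in plain Rust, using `Option` for SQL `NULL`. This is a semantics sketch only, not Comet code; null keys are omitted because Spark rejects them at type-check time, so only a null map needs modeling.

```rust
use std::collections::BTreeMap;

// Null map (None) propagates to a null result; an empty map yields
// Some(false) for every probe key.
fn map_contains_key_nullable<K: Ord, V>(
    m: Option<&BTreeMap<K, V>>,
    key: &K,
) -> Option<bool> {
    m.map(|map| map.contains_key(key))
}
```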
**Examples:**
```sql
-- Basic usage
SELECT map_contains_key(map(1, 'a', 2, 'b'), 1);
-- Returns: true
SELECT map_contains_key(map(1, 'a', 2, 'b'), 3);
-- Returns: false
-- With string keys
SELECT map_contains_key(map('name', 'John', 'age', '30'), 'name');
-- Returns: true
```
```scala
// DataFrame API usage
import org.apache.spark.sql.functions._

df.select(map_contains_key(col("my_map"), lit("search_key")))

// Creating a map and checking for a key
df.select(map_contains_key(
  map(lit("key1"), lit("value1"), lit("key2"), lit("value2")),
  lit("key1")
))
```
### Implementation Approach
See the [Comet guide on adding new
expressions](https://datafusion.apache.org/comet/contributor-guide/adding_a_new_expression.html)
for detailed instructions.
1. **Scala Serde**: Add expression handler in
`spark/src/main/scala/org/apache/comet/serde/`
2. **Register**: Add to appropriate map in `QueryPlanSerde.scala`
3. **Protobuf**: Add message type in `native/proto/src/proto/expr.proto` if
needed
4. **Rust**: Implement in `native/spark-expr/src/` (check if DataFusion has
built-in support first)
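Since Spark already rewrites `MapContainsKey` to `ArrayContains(MapKeys(map), key)`, one option for step 4 is to reuse existing kernels rather than writing a new one. As a plain-Rust sketch of the vectorized kernel such an implementation would need, over an offsets-plus-values layout like the one Arrow's `MapArray` uses (keys flattened, per-row offsets): the names below are illustrative, not actual DataFusion or arrow-rs APIs.

```rust
// For each row, check whether its slice of the flattened key buffer
// contains the probe key. An empty slice (empty map) yields false.
fn map_contains_key_kernel(offsets: &[usize], keys: &[i64], probe: i64) -> Vec<bool> {
    offsets
        .windows(2)
        .map(|w| keys[w[0]..w[1]].contains(&probe))
        .collect()
}
```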
## Additional context
**Difficulty:** Medium
**Spark Expression Class:**
`org.apache.spark.sql.catalyst.expressions.MapContainsKey`
**Related:**
- `MapKeys` - Extracts all keys from a map
- `ArrayContains` - Underlying implementation for containment check
- `MapValues` - Extracts all values from a map
- `ElementAt` - Retrieves value by key from map
---
*This issue was auto-generated from Spark reference documentation.*