Re: [PR] docs: Rewrite supported expressions page to show complete overview of what is and is not supported by Comet [datafusion-comet]

via GitHub Mon, 01 Jun 2026 14:28:28 -0700


comphead commented on code in PR #4550:
URL: https://github.com/apache/datafusion-comet/pull/4550#discussion_r3337285635



##########
docs/source/user-guide/latest/expressions.md:
##########
@@ -17,354 +17,652 @@
   under the License.
 -->
 
-# Supported Spark Expressions
-
-Comet supports the following Spark expressions. See the [Comet Compatibility Guide] for details on known
-incompatibilities and unsupported cases.
-
-All expressions are enabled by default, but most can be disabled by setting
-`spark.comet.expression.EXPRNAME.enabled=false`, where `EXPRNAME` is the expression name as specified in
-the following tables, such as `Length`, or `StartsWith`. See the [Comet Configuration Guide] for a full list
-of expressions that be disabled.
-
-## Conditional Expressions
-
-| Expression | SQL                                         |
-| ---------- | ------------------------------------------- |
-| CaseWhen   | `CASE WHEN expr THEN expr ELSE expr END`    |
-| If         | `IF(predicate_expr, true_expr, false_expr)` |
-
-## Predicate Expressions
-
-| Expression         | SQL           |
-| ------------------ | ------------- |
-| And                | `AND`         |
-| EqualTo            | `=`           |
-| EqualNullSafe      | `<=>`         |
-| GreaterThan        | `>`           |
-| GreaterThanOrEqual | `>=`          |
-| ILike              | `ILIKE`       |
-| In                 | `IN`          |
-| InSet              | `IN (...)`    |
-| IsNotNull          | `IS NOT NULL` |
-| IsNull             | `IS NULL`     |
-| LessThan           | `<`           |
-| LessThanOrEqual    | `<=`          |
-| Not                | `NOT`         |
-| Or                 | `OR`          |
-
-## String Functions
-
-| Expression      |
-| --------------- |
-| Ascii           |
-| BitLength       |
-| Chr             |
-| Concat          |
-| ConcatWs        |
-| Contains        |
-| Decode          |
-| EndsWith        |
-| InitCap         |
-| Left            |
-| Length          |
-| Like            |
-| Lower           |
-| OctetLength     |
-| Reverse         |
-| Right           |
-| RLike           |
-| Split           |
-| StartsWith      |
-| StringInstr     |
-| StringRepeat    |
-| StringReplace   |
-| StringLPad      |
-| StringRPad      |
-| StringSpace     |
-| StringTranslate |
-| StringTrim      |
-| StringTrimBoth  |
-| StringTrimLeft  |
-| StringTrimRight |
-| Substring       |
-| SubstringIndex  |
-| Upper           |
-
-## JSON Functions
-
-| Expression    |
-| ------------- |
-| GetJsonObject |
-
-## Date/Time Functions
-
-| Expression        | SQL                          |
-| ----------------- | ---------------------------- |
-| AddMonths         | `add_months`                 |
-| ConvertTimezone   | `convert_timezone`           |
-| CurrentTimeZone   | `current_timezone`           |
-| DateAdd           | `date_add`                   |
-| DateDiff          | `datediff`                   |
-| DateFormat        | `date_format`                |
-| DateFromUnixDate  | `date_from_unix_date`        |
-| DateSub           | `date_sub`                   |
-| DatePart          | `date_part(field, source)`   |
-| Days              | `days`                       |
-| Extract           | `extract(field FROM source)` |
-| FromUnixTime      | `from_unixtime`              |
-| Hour              | `hour`                       |
-| LastDay           | `last_day`                   |
-| LocalTimestamp    | `localtimestamp`             |
-| MakeDate          | `make_date`                  |
-| MakeTime          | `make_time`                  |
-| MakeTimestamp     | `make_timestamp`             |
-| MicrosToTimestamp | `timestamp_micros`           |
-| MillisToTimestamp | `timestamp_millis`           |
-| Minute            | `minute`                     |
-| MonthsBetween     | `months_between`             |
-| NextDay           | `next_day`                   |
-| Second            | `second`                     |
-| TimestampSeconds  | `timestamp_seconds`          |
-| ToUnixTimestamp   | `to_unix_timestamp`          |
-| TruncDate         | `trunc`                      |
-| TruncTimestamp    | `date_trunc`                 |
-| UnixDate          | `unix_date`                  |
-| UnixMicros        | `unix_micros`                |
-| UnixMillis        | `unix_millis`                |
-| UnixSeconds       | `unix_seconds`               |
-| UnixTimestamp     | `unix_timestamp`             |
-| Year              | `year`                       |
-| Month             | `month`                      |
-| DayOfMonth        | `day`/`dayofmonth`           |
-| DayOfWeek         | `dayofweek`                  |
-| WeekDay           | `weekday`                    |
-| DayOfYear         | `dayofyear`                  |
-| WeekOfYear        | `weekofyear`                 |
-| Quarter           | `quarter`                    |
-| ToTime            | `to_time`                    |
-| TryToTime         | `try_to_time`                |
-
-## Math Expressions
-
-| Expression     | SQL            |
-| -------------- | -------------- |
-| Abs            | `abs`          |
-| Acos           | `acos`         |
-| Acosh          | `acosh`        |
-| Add            | `+`            |
-| Asin           | `asin`         |
-| Asinh          | `asinh`        |
-| Atan           | `atan`         |
-| Atan2          | `atan2`        |
-| Atanh          | `atanh`        |
-| Bin            | `bin`          |
-| BRound         | `bround`       |
-| Cbrt           | `cbrt`         |
-| Ceil           | `ceil`         |
-| Cos            | `cos`          |
-| Cosh           | `cosh`         |
-| Cot            | `cot`          |
-| Csc            | `csc`          |
-| Divide         | `/`            |
-| Exp            | `exp`          |
-| Expm1          | `expm1`        |
-| Factorial      | `factorial`    |
-| Floor          | `floor`        |
-| Hex            | `hex`          |
-| IntegralDivide | `div`          |
-| IsNaN          | `isnan`        |
-| Log            | `log`          |
-| Log2           | `log2`         |
-| Log10          | `log10`        |
-| Multiply       | `*`            |
-| Pi             | `pi`           |
-| Pow            | `power`        |
-| Rand           | `rand`         |
-| Randn          | `randn`        |
-| Remainder      | `%`            |
-| Rint           | `rint`         |
-| Round          | `round`        |
-| Sec            | `sec`          |
-| Signum         | `signum`       |
-| Sin            | `sin`          |
-| Sinh           | `sinh`         |
-| Sqrt           | `sqrt`         |
-| Subtract       | `-`            |
-| Tan            | `tan`          |
-| Tanh           | `tanh`         |
-| ToDegrees      | `degrees`      |
-| ToRadians      | `radians`      |
-| TryAdd         | `try_add`      |
-| TryDivide      | `try_div`      |
-| TryMultiply    | `try_mul`      |
-| TrySubtract    | `try_sub`      |
-| UnaryMinus     | `-`            |
-| Unhex          | `unhex`        |
-| WidthBucket    | `width_bucket` |
-
-## Hashing Functions
-
-| Expression  |
-| ----------- |
-| Crc32       |
-| Md5         |
-| Murmur3Hash |
-| Sha1        |
-| Sha2        |
-| XxHash64    |
-
-## Bitwise Expressions
-
-| Expression         | SQL   |
-| ------------------ | ----- |
-| BitwiseAnd         | `&`   |
-| BitwiseCount       |       |
-| BitwiseGet         |       |
-| BitwiseOr          | `\|`  |
-| BitwiseNot         | `~`   |
-| BitwiseXor         | `^`   |
-| ShiftLeft          | `<<`  |
-| ShiftRight         | `>>`  |
-| ShiftRightUnsigned | `>>>` |
-
-## Aggregate Expressions
-
-| Expression    | SQL        |
-| ------------- | ---------- |
-| Average       |            |
-| BitAndAgg     |            |
-| BitOrAgg      |            |
-| BitXorAgg     |            |
-| BoolAnd       | `bool_and` |
-| BoolOr        | `bool_or`  |
-| CollectSet    |            |
-| Corr          |            |
-| Count         |            |
-| CountIf       | `count_if` |
-| CovPopulation |            |
-| CovSample     |            |
-| First         |            |
-| Last          |            |
-| Max           |            |
-| Min           |            |
-| StddevPop     |            |
-| StddevSamp    |            |
-| Sum           |            |
-| VariancePop   |            |
-| VarianceSamp  |            |
-
-## Window Functions
-
-```{warning}
-Window support is disabled by default due to known correctness issues. Tracking issue: [#2721](https://github.com/apache/datafusion-comet/issues/2721).
-```
-
-Comet supports using the following aggregate functions within window contexts with PARTITION BY and ORDER BY clauses.
-
-| Expression |
-| ---------- |
-| Count      |
-| Max        |
-| Min        |
-| Sum        |
-
-**Note:** Dedicated window functions such as `rank`, `dense_rank`, `row_number`, `lag`, `lead`, `ntile`, `cume_dist`, `percent_rank`, and `nth_value` are not currently supported and will fall back to Spark.
-
-## Array Expressions
-
-| Expression     |
-| -------------- |
-| ArrayAppend    |
-| ArrayCompact   |
-| ArrayContains  |
-| ArrayDistinct  |
-| ArrayExcept    |
-| ArrayFilter    |
-| ArrayInsert    |
-| ArrayIntersect |
-| ArrayJoin      |
-| ArrayMax       |
-| ArrayMin       |
-| ArrayPosition  |
-| ArrayRemove    |
-| ArrayRepeat    |
-| ArraysZip      |
-| ArrayUnion     |
-| ArraysOverlap  |
-| CreateArray    |
-| ElementAt      |
-| Flatten        |
-| GetArrayItem   |
-| Size           |
-| SortArray      |
-
-## Map Expressions
-
-| Expression     |
-| -------------- |
-| GetMapValue    |
-| MapContainsKey |
-| MapEntries     |
-| MapFromArrays  |
-| MapFromEntries |
-| MapKeys        |
-| MapValues      |
-| StringToMap    |
-
-## Struct Expressions
-
-| Expression           |
-| -------------------- |
-| CreateNamedStruct    |
-| GetArrayStructFields |
-| GetStructField       |
-| JsonToStructs        |
-| StructsToJson        |
-
-## URL Functions
-
-| Expression   |
-| ------------ |
-| TryUrlDecode |
-| UrlDecode    |
-| UrlEncode    |
-
-## Conversion Expressions
-
-| Expression |
-| ---------- |
-| Cast       |
-
-## SortOrder
-
-| Expression |
-| ---------- |
-| NullsFirst |
-| NullsLast  |
-| Ascending  |
-| Descending |
-
-## Other
-
-| Expression                   |
-| ---------------------------- |
-| Alias                        |
-| AttributeReference           |
-| BloomFilterMightContain      |
-| Coalesce                     |
-| CheckOverflow                |
-| KnownFloatingPointNormalized |
-| Literal                      |
-| MakeDecimal                  |
-| MonotonicallyIncreasingID    |
-| NormalizeNaNAndZero          |
-| PromotePrecision             |
-| RegExpReplace                |
-| ScalarSubquery               |
-| SparkPartitionID             |
-| ToPrettyString               |
-| UnscaledValue                |
-
-[Comet Configuration Guide]: configs.md
-[Comet Compatibility Guide]: compatibility/expressions/index.md
+# Spark Expression Support
+
+This page is the complete reference for how Apache Comet handles each Spark built-in
+expression. Comet accelerates expressions either with a native (Rust) implementation or by
+dispatching to a Spark-compatible codegen path. When an expression is not supported, Comet
+transparently falls back to Spark for that part of the plan; results are unaffected.
+
+Expressions marked ✅ Supported are enabled by default. Expressions marked ⚠️ Supported
+(caveats) include cases that are known to diverge from Spark; those cases fall back to Spark
+by default and must be opted into per expression with
+`spark.comet.expression.EXPRNAME.allowIncompatible=true` (where `EXPRNAME` is the Spark
+expression class name, for example `Cast`). There is no global opt-in.
+
+Most expressions can also be disabled with `spark.comet.expression.EXPRNAME.enabled=false`, where
+`EXPRNAME` is the Spark expression class name (for example `Length` or `StartsWith`). See the
+[Comet Configuration Guide](configs.md) for the full list.
+
+## Status legend
+
+| Status                 | Meaning                                                                                                                                                                                 |
+| ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ✅ Supported           | Native or codegen path; compatible with Spark by default.                                                                                                                               |
+| ⚠️ Supported (caveats) | Works, but may diverge from Spark in some cases: incompatible, flag-gated (`allowIncompatible`), or restricted to certain types. See the [Compatibility Guide](compatibility/index.md). |
+| 🔜 Planned             | Intended; tracked by an open issue or pull request.                                                                                                                                     |
+| 🚫 Out of scope        | Deliberately not planned.                                                                                                                                                               |
+
+## Out of scope
+
+Comet focuses acceleration on mainstream relational, string, datetime, math, and collection
+expressions. Some Spark function families are **out of scope**: specialized functionality with
+narrow real-world analytics use and high implementation cost. These will fall back to Spark and
+are not on the roadmap:
+
+- **Probabilistic sketches and approximate top-k** (`kll_sketch_*`, `hll_*`, `theta_*`, `count_min_sketch`, `bitmap_*`, `approx_top_k*`): specialized data structures with exact-correctness traps.
+- **XML / XPath** (`from_xml`, `to_xml`, `schema_of_xml`, `xpath*`): legacy text format, rare in accelerated workloads.
+- **Geospatial** (`st_*`): brand-new Spark 4.1 functionality, specialized.
+- **Avro / Protobuf codecs** (`from_avro`, `to_avro`, `from_protobuf`, `to_protobuf`, `schema_of_avro`): format conversion belongs at the IO layer, not expression evaluation.
+- **JVM reflection** (`java_method`, `reflect`): niche, and they invoke arbitrary JVM methods (a security concern).
+- **CSV functions** (`from_csv`, `to_csv`, `schema_of_csv`): row-level CSV parsing and formatting in expressions is niche and better handled at the data source layer.
+- **UTF-8 validation** (`is_valid_utf8`, `make_valid_utf8`, `validate_utf8`, `try_validate_utf8`): niche Spark 4.x string-validation helpers.
+- **File metadata** (`input_file_name`, `input_file_block_start`, `input_file_block_length`): require scan-internal per-row file information, outside the expression layer.
+- **Miscellaneous niche** (`histogram_numeric`, `version`, `sentences`, `quote`, `uuid`): low-value or specialized functions with little benefit from native acceleration.
+
+Note that `approx_count_distinct`, `approx_percentile` / `percentile_approx`, `median`, and `mode`
+are _not_ out of scope: although approximate, they are mainstream and are planned.
+
+The tables below list every Spark built-in expression with its current status.
+
+## agg_funcs
+
+| Function                | Status | Notes                                                            |
+| ----------------------- | ------ | ---------------------------------------------------------------- |
+| `any`                   | ✅     |                                                                  |
+| `any_value`             | ✅     |                                                                  |
+| `approx_count_distinct` | 🔜     | tracking #4098                                                   |
+| `approx_percentile`     | 🔜     | [#3189](https://github.com/apache/datafusion-comet/issues/3189)  |
+| `array_agg`             | 🔜     | Array aggregate (related to `collect_list`, #2524)               |
+| `avg`                   | ⚠️     | Interval types (YearMonth, DayTime) fall back                    |
+| `bit_and`               | ✅     |                                                                  |
+| `bit_or`                | ✅     |                                                                  |
+| `bit_xor`               | ✅     |                                                                  |
+| `bool_and`              | ✅     |                                                                  |
+| `bool_or`               | ✅     |                                                                  |
+| `collect_list`          | 🔜     | [#2524](https://github.com/apache/datafusion-comet/issues/2524)  |
+| `collect_set`           | ✅     |                                                                  |
+| `corr`                  | ✅     |                                                                  |
+| `count`                 | ✅     |                                                                  |
+| `count_if`              | ✅     |                                                                  |
+| `covar_pop`             | ✅     |                                                                  |
+| `covar_samp`            | ✅     |                                                                  |
+| `every`                 | ✅     |                                                                  |
+| `first`                 | ✅     |                                                                  |
+| `first_value`           | ✅     |                                                                  |
+| `grouping`              | 🔜     | Grouping indicator for ROLLUP/CUBE/GROUPING SETS                 |
+| `grouping_id`           | 🔜     | Grouping indicator for ROLLUP/CUBE/GROUPING SETS                 |
+| `kurtosis`              | 🔜     | tracking #4098                                                   |
+| `last`                  | ✅     |                                                                  |
+| `last_value`            | ✅     |                                                                  |
+| `listagg`               | 🔜     | String aggregation                                               |
+| `max`                   | ✅     |                                                                  |
+| `max_by`                | 🔜     | [#3841](https://github.com/apache/datafusion-comet/issues/3841)  |
+| `mean`                  | ✅     |                                                                  |
+| `median`                | 🔜     | tracking #4098                                                   |
+| `min`                   | ✅     |                                                                  |
+| `min_by`                | 🔜     | [#3841](https://github.com/apache/datafusion-comet/issues/3841)  |
+| `mode`                  | 🔜     | [#3970](https://github.com/apache/datafusion-comet/issues/3970)  |
+| `percentile`            | 🔜     | #4542                                                            |
+| `percentile_approx`     | 🔜     | [#3189](https://github.com/apache/datafusion-comet/issues/3189)  |
+| `percentile_cont`       | 🔜     | Percentile aggregate                                             |
+| `percentile_disc`       | 🔜     | Percentile aggregate                                             |
+| `regr_avgx`             | ✅     | Native: Spark rewrites to `Average` (tests in #4551)             |
+| `regr_avgy`             | ✅     | Native: Spark rewrites to `Average` (tests in #4551)             |
+| `regr_count`            | ✅     | Native: Spark rewrites to `Count` (tests in #4551)               |
+| `regr_intercept`        | 🔜     | Falls back; can reuse `covar_pop`/`var_pop` accumulators (#4552) |
+| `regr_r2`               | 🔜     | Falls back; can reuse the `corr` accumulator (#4552)             |
+| `regr_slope`            | 🔜     | Falls back; can reuse `covar_pop`/`var_pop` accumulators (#4552) |
+| `regr_sxx`              | 🔜     | Falls back; can reuse `var_pop` accumulator (#4552)              |
+| `regr_sxy`              | 🔜     | Falls back; can reuse `covar_pop` accumulator (#4552)            |
+| `regr_syy`              | 🔜     | Falls back; can reuse `var_pop` accumulator (#4552)              |
+| `skewness`              | 🔜     | tracking #4098                                                   |
+| `some`                  | ✅     |                                                                  |
+| `std`                   | ✅     |                                                                  |
+| `stddev`                | ✅     |                                                                  |
+| `stddev_pop`            | ✅     |                                                                  |
+| `stddev_samp`           | ✅     |                                                                  |
+| `string_agg`            | 🔜     | String aggregation (alias of `listagg`)                          |
+| `sum`                   | ✅     |                                                                  |
+| `try_avg`               | 🔜     | tracking #4098                                                   |
+| `try_sum`               | 🔜     | tracking #4098                                                   |
+| `var_pop`               | ✅     |                                                                  |
+| `var_samp`              | ✅     |                                                                  |
+| `variance`              | ✅     |                                                                  |
+
+---
+
+## array_funcs
+
+| Function          | Status | Notes                                                                                                                     |
+| ----------------- | ------ | ------------------------------------------------------------------------------------------------------------------------- |
+| `array`           | ✅     |                                                                                                                           |
+| `array_append`    | ⚠️     | On Spark 4.0+ rewrites to `array_insert`; inherits its incompatibilities                                                  |
+| `array_compact`   | ✅     |                                                                                                                           |
+| `array_contains`  | ⚠️     | NaN-canonicalization may differ for float/double arrays ([#4481](https://github.com/apache/datafusion-comet/issues/4481)) |
+| `array_distinct`  | ⚠️     | NaN/signed-zero canonicalization may differ ([#4481](https://github.com/apache/datafusion-comet/issues/4481))             |
+| `array_except`    | ⚠️     | Null handling and ordering may differ; `Incompatible`, flag-gated                                                         |
+| `array_insert`    | ✅     |                                                                                                                           |
+| `array_intersect` | ⚠️     | Result element order may differ when right array is longer than left                                                      |
+| `array_join`      | ⚠️     | Null handling may differ ([#3178](https://github.com/apache/datafusion-comet/issues/3178)); `Incompatible`, flag-gated    |
+| `array_max`       | ⚠️     | NaN ordering may differ for float/double ([#4482](https://github.com/apache/datafusion-comet/issues/4482))                |
+| `array_min`       | ⚠️     | NaN ordering may differ for float/double ([#4482](https://github.com/apache/datafusion-comet/issues/4482))                |
+| `array_position`  | ⚠️     | Falls back for binary/struct/map/null element types                                                                       |
+| `array_prepend`   | 🔜     | Sibling of `array_append`                                                                                                 |
+| `array_remove`    | ✅     |                                                                                                                           |
+| `array_repeat`    | ✅     |                                                                                                                           |
+| `array_union`     | ⚠️     | NaN/signed-zero canonicalization may differ ([#4481](https://github.com/apache/datafusion-comet/issues/4481))             |
+| `arrays_overlap`  | ✅     |                                                                                                                           |
+| `arrays_zip`      | ✅     |                                                                                                                           |
+| `element_at`      | ⚠️     | Only `ArrayType` input; `MapType` input falls back                                                                        |
+| `flatten`         | ⚠️     | Falls back for binary/struct/map child element types                                                                      |
+| `get`             | ✅     |                                                                                                                           |
+| `sequence`        | 🔜     | #4538                                                                                                                     |
+| `shuffle`         | 🔜     | Random array shuffle                                                                                                      |
+| `slice`           | ✅     | Native (#4149)                                                                                                            |
+| `sort_array`      | ⚠️     | Incompatible under strict floating-point; falls back for nested struct/null arrays                                        |
+
+---
+
+## bitwise_funcs
+
+| Function             | Status | Notes                                                |
+| -------------------- | ------ | ---------------------------------------------------- |
+| `&`                  | ✅     |                                                      |
+| `<<`                 | ✅     |                                                      |
+| `>>`                 | ✅     |                                                      |
+| `>>>`                | ✅     | Operator alias for `shiftrightunsigned` (Spark 4.0+) |
+| `^`                  | ✅     |                                                      |
+| `bit_count`          | ✅     |                                                      |
+| `bit_get`            | ✅     |                                                      |
+| `getbit`             | ✅     |                                                      |
+| `shiftright`         | ✅     |                                                      |
+| `shiftrightunsigned` | ✅     |                                                      |
+| `\|`                 | ✅     |                                                      |
+| `~`                  | ✅     |                                                      |
+
+---
+
+## collection_funcs
+
+| Function      | Status | Notes                                                                                                                            |
+| ------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------- |
+| `array_size`  | ⚠️     | Lowers to `size`; accelerated, but returns -1 instead of NULL for NULL input (#4560)                                             |
+| `cardinality` | ⚠️     | Alias for `size`; `MapType` input falls back ([#4472](https://github.com/apache/datafusion-comet/issues/4472))                   |
+| `concat`      | ⚠️     | Only `StringType` children; `BinaryType`/`ArrayType` fall back ([#4471](https://github.com/apache/datafusion-comet/issues/4471)) |
+| `reverse`     | ⚠️     | Array with `BinaryType` elements is `Incompatible`, flag-gated ([#2763](https://github.com/apache/datafusion-comet/issues/2763)) |
+| `size`        | ⚠️     | `MapType` input falls back ([#4472](https://github.com/apache/datafusion-comet/issues/4472))                                     |
+
+---
+
+## conditional_funcs
+
+| Function     | Status | Notes                             |
+| ------------ | ------ | --------------------------------- |
+| `coalesce`   | ✅     |                                   |
+| `if`         | ✅     |                                   |
+| `ifnull`     | ✅     |                                   |
+| `nanvl`      | 🔜     | #4538                             |
+| `nullif`     | ✅     |                                   |
+| `nullifzero` | ✅     | Lowers to `if`/`=` (Spark 4.0+)   |
+| `nvl`        | ✅     |                                   |
+| `nvl2`       | ✅     |                                   |
+| `when`       | ✅     |                                   |
+| `zeroifnull` | ✅     | Lowers to `coalesce` (Spark 4.0+) |
+
+---
+
+## conversion_funcs
+
+The type-name conversion functions (`bigint`, `binary`, `boolean`, `date`, `decimal`, `double`, `float`, `int`, `smallint`, `string`, `timestamp`, `tinyint`) are SQL aliases for `CAST(... AS <type>)` and share the support and caveats of `cast`.
+
+| Function | Status | Notes                                                                                                              |
+| -------- | ------ | ------------------------------------------------------------------------------------------------------------------ |
+| `cast`   | ⚠️     | Many type pairs supported; float-to-decimal rounding may differ; see [Compatibility Guide](compatibility/index.md) |
+
+---
+
+## datetime_funcs
+
+| Function              | Status | Notes |
+| --------------------- | ------ | ----- |
+| `add_months`          | ✅     |       |
+| `convert_timezone`    | ✅     |       |
+| `curdate`             | ✅     | Constant-folded to a literal (alias of `current_date`) |
+| `current_date`        | ✅     | Constant-folded to a literal before Comet sees the plan |
+| `current_time`        | 🔜     | Blocked on Spark 4.1 TIME type support (#4288) |
+| `current_timestamp`   | ✅     | Constant-folded to a literal before Comet sees the plan |
+| `current_timezone`    | ✅     |       |
+| `date_add`            | ✅     |       |
+| `date_diff`           | ✅     |       |
+| `date_format`         | ✅     |       |
+| `date_from_unix_date` | ✅     |       |
+| `date_part`           | ✅     |       |
+| `date_sub`            | ✅     |       |
+| `date_trunc`          | ✅     |       |
+| `dateadd`             | ✅     |       |
+| `datediff`            | ✅     |       |
+| `datepart`            | ✅     |       |
+| `day`                 | ✅     |       |
+| `dayname`             | 🔜     | #4544 |
+| `dayofmonth`          | ✅     |       |
+| `dayofweek`           | ✅     |       |
+| `dayofyear`           | ✅     |       |
+| `extract`             | ✅     |       |
+| `from_unixtime`       | ✅     |       |
+| `from_utc_timestamp`  | ⚠️     | Legacy zone forms (`GMT+1`, `PST`) throw a native parse error |
+| `hour`                | ✅     |       |
+| `last_day`            | ✅     |       |
+| `localtimestamp`      | ✅     |       |
+| `make_date`           | ✅     |       |
+| `make_dt_interval`    | 🔜     | #4541 |
+| `make_interval`       | 🔜     | Produces legacy CalendarInterval; tracked by #4540 |
+| `make_time`           | 🔜     | Spark 4.1 TIME type; tracked by #4288 |
+| `make_timestamp`      | ✅     |       |
+| `make_timestamp_ltz`  | ⚠️     | 6-arg form runs via the codegen dispatcher; 2-arg `(date, time)` form (Spark 4.1 TIME type) falls back |
+| `make_timestamp_ntz`  | ⚠️     | 6-arg form runs via the codegen dispatcher; 2-arg `(date, time)` form (Spark 4.1 TIME type) falls back |
+| `make_ym_interval`    | 🔜     | #4541 |
+| `minute`              | ✅     |       |
+| `month`               | ✅     |       |
+| `monthname`           | 🔜     | #4544 |
+| `months_between`      | ✅     |       |
+| `next_day`            | ✅     |       |
+| `now`                 | ✅     | Constant-folded to a literal (alias of `current_timestamp`) |
+| `quarter`             | ✅     |       |
+| `second`              | ✅     |       |
+| `session_window`      | 🔜     | Time-window grouping; tracked by #4553 |
+| `time_diff`           | 🔜     | Spark 4.1 TIME type; tracked by #4288 |
+| `time_trunc`          | 🔜     | Spark 4.1 TIME type; tracked by #4288 |
+| `timestamp_micros`    | ✅     |       |
+| `timestamp_millis`    | ✅     |       |
+| `timestamp_seconds`   | ✅     |       |
+| `to_date`             | ✅     | Rewrites to `Cast` (or `Cast(GetTimestamp)` with a format) before Comet sees the plan |
+| `to_time`             | 🔜     | Spark 4.1 TIME type; tracked by #4288 |
+| `to_timestamp`        | ✅     | Rewrites to `Cast` (or `GetTimestamp` with a format) before Comet sees the plan |
+| `to_timestamp_ltz`    | ✅     | Rewrites to `to_timestamp` (`TimestampType`) |
+| `to_timestamp_ntz`    | ✅     | Rewrites to `to_timestamp` (`TimestampNTZType`) |
+| `to_unix_timestamp`   | ✅     |       |
+| `to_utc_timestamp`    | ⚠️     | Legacy zone forms (`GMT+1`, `PST`) throw a native parse error |
+| `trunc`               | ✅     |       |
+| `try_make_interval`   | 🔜     | Produces legacy CalendarInterval; tracked by #4540 |
+| `try_make_timestamp`  | ⚠️     | Runs natively for valid inputs, but returns wrong values for invalid inputs instead of NULL (#4554) |
+| `try_to_date`         | 🔜     | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by #4556 |
+| `try_to_time`         | 🔜     | Spark 4.1 TIME type; tracked by #4288 |
+| `try_to_timestamp`    | 🔜     | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by #4556 |
+| `unix_date`           | ✅     |       |
+| `unix_micros`         | ✅     |       |
+| `unix_millis`         | ✅     |       |
+| `unix_seconds`        | ✅     |       |
+| `unix_timestamp`      | ✅     |       |
+| `weekday`             | ✅     |       |
+| `weekofyear`          | ✅     |       |
+| `window`              | 🔜     | Time-window grouping; tracked by #4553 |
+| `window_time`         | 🔜     | Time-window grouping; tracked by #4553 |
+| `year`                | ✅     |       |
+
+---
+
+## generator_funcs
+
+`explode` and `posexplode` are supported via `CometExplodeExec` (operator-level, not
+expression-level). The `outer` variants are wired but marked `Incompatible`; they require
+`spark.comet.exec.explode.enabled=true` and `allowIncompatible`.
+
+| Function           | Status | Notes |
+| ------------------ | ------ | ----- |
+| `explode`          | ✅     | via `CometExplodeExec` |
+| `explode_outer`    | ⚠️     | `outer=true` incompatible; needs `allowIncompatible` |
+| `inline`           | 🔜     | Operator-level generator (like `explode`) |
+| `inline_outer`     | 🔜     | Operator-level generator (like `explode`) |
+| `posexplode`       | ✅     | via `CometExplodeExec` |
+| `posexplode_outer` | ⚠️     | `outer=true` incompatible; needs `allowIncompatible` |
+| `stack`            | 🔜     | Operator-level generator |
+
+---
+
+## hash_funcs
+
+| Function   | Status | Notes |
+| ---------- | ------ | ----- |
+| `crc32`    | ✅     |       |
+| `hash`     | ✅     |       |
+| `md5`      | ✅     |       |
+| `sha`      | ✅     |       |
+| `sha1`     | ✅     |       |
+| `sha2`     | ✅     |       |
+| `xxhash64` | ✅     |       |
+
+---
+
+## json_funcs
+
+| Function            | Status | Notes |
+| ------------------- | ------ | ----- |
+| `from_json`         | ⚠️     | Partial native support (requires explicit schema, marked `Incompatible`); fuller support via codegen dispatch (#4305) |
+| `get_json_object`   | ⚠️     | Single-quoted JSON and unescaped control chars require `allowIncompatible` |
+| `json_array_length` | 🔜     | tracking #4098 |
+| `json_object_keys`  | 🔜     | [#3161](https://github.com/apache/datafusion-comet/issues/3161) |
+| `json_tuple`        | 🔜     | [#3160](https://github.com/apache/datafusion-comet/issues/3160) |
+| `schema_of_json`    | 🔜     | [#3163](https://github.com/apache/datafusion-comet/issues/3163) |
+| `to_json`           | ⚠️     | Partial native support (options and map/array inputs fall back); fuller support via codegen dispatch (#4305) |
+
+---
+
+## lambda_funcs
+
+All higher-order functions are planned via [#4224](https://github.com/apache/datafusion-comet/issues/4224).
+
+| Function           | Status | Notes |
+| ------------------ | ------ | ----- |
+| `aggregate`        | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `array_sort`       | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `exists`           | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `filter`           | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `forall`           | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `map_filter`       | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `map_zip_with`     | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `reduce`           | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `transform`        | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `transform_keys`   | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `transform_values` | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+| `zip_with`         | 🔜     | [#4224](https://github.com/apache/datafusion-comet/issues/4224) |
+
+---
+
+## map_funcs
+
+| Function           | Status | Notes |
+| ------------------ | ------ | ----- |
+| `element_at`       | ⚠️     | Only `ArrayType` input; `MapType` input falls back |
+| `map`              | 🔜     | Constructs a map |
+| `map_concat`       | 🔜     | Concatenates maps |
+| `map_contains_key` | ✅     |       |
+| `map_entries`      | ✅     |       |
+| `map_from_arrays`  | ✅     |       |
+| `map_from_entries` | ⚠️     | `BinaryType` key/value falls back unless `allowIncompatible` |
+| `map_keys`         | ✅     |       |
+| `map_values`       | ✅     |       |
+| `str_to_map`       | ✅     |       |
+| `try_element_at`   | ✅     | Lowers to `element_at`; array input (MapType falls back) |
+
+---
+
+## math_funcs
+
+| Function       | Status | Notes |
+| -------------- | ------ | ----- |
+| `%`            | ⚠️     | `try_mod` form (`EvalMode.TRY`) falls back ([#4484](https://github.com/apache/datafusion-comet/issues/4484)) |
+| `*`            | ⚠️     | Interval multiplication falls back |
+| `+`            | ✅     |       |
+| `-`            | ✅     |       |
+| `/`            | ✅     |       |
+| `abs`          | ⚠️     | Interval types fall back; ANSI overflow for integer min value |
+| `acos`         | ✅     |       |
+| `acosh`        | ✅     |       |
+| `asin`         | ✅     |       |
+| `asinh`        | ✅     |       |
+| `atan`         | ✅     |       |
+| `atan2`        | ✅     |       |
+| `atanh`        | ✅     |       |
+| `bin`          | ✅     |       |
+| `bround`       | 🔜     | #4538 |
+| `cbrt`         | ✅     |       |
+| `ceil`         | ⚠️     | Two-arg `ceil(expr, scale)` form falls back |
+| `ceiling`      | ✅     |       |
+| `conv`         | 🔜     | #4538 |
+| `cos`          | ✅     |       |
+| `cosh`         | ✅     |       |
+| `cot`          | ✅     |       |
+| `csc`          | ✅     |       |
+| `degrees`      | ✅     |       |
+| `div`          | ✅     |       |
+| `e`            | ✅     | Folds to a literal (like `pi`) |
+| `exp`          | ✅     |       |
+| `expm1`        | ✅     |       |
+| `factorial`    | ✅     |       |
+| `floor`        | ⚠️     | Two-arg `floor(expr, scale)` form falls back |
+| `greatest`     | ✅     |       |
+| `hex`          | ✅     |       |
+| `hypot`        | 🔜     | #4538 |
+| `least`        | ✅     |       |
+| `ln`           | ✅     |       |
+| `log`          | ✅     |       |
+| `log10`        | ✅     |       |
+| `log1p`        | 🔜     | #4538 |
+| `log2`         | ✅     |       |
+| `mod`          | ✅     |       |
+| `negative`     | ✅     |       |
+| `pi`           | ✅     |       |
+| `pmod`         | 🔜     | #4538 |
+| `positive`     | ✅     |       |
+| `pow`          | ✅     |       |
+| `power`        | ✅     |       |
+| `radians`      | ✅     |       |
+| `rand`         | ✅     |       |
+| `randn`        | ✅     |       |
+| `random`       | ✅     | Alias for `rand` (Spark 4.0+); seed must be a literal |
+| `randstr`      | 🔜     | Random string (Spark 4.0+) |
+| `rint`         | ✅     |       |
+| `round`        | ⚠️     | Float/Double inputs always fall back; integer/decimal HALF_UP supported |
+| `sec`          | ✅     |       |
+| `shiftleft`    | ✅     |       |
+| `sign`         | ✅     |       |
+| `signum`       | ✅     |       |
+| `sin`          | ✅     |       |
+| `sinh`         | ✅     |       |
+| `sqrt`         | ✅     |       |
+| `tan`          | ✅     |       |
+| `tanh`         | ✅     |       |
+| `try_add`      | ⚠️     | Datetime/interval form falls back; numeric form supported |
+| `try_divide`   | ✅     |       |
+| `try_mod`      | 🔜     | Lowers to `Remainder` with TRY eval mode, which falls back (#4484) |
+| `try_multiply` | ✅     |       |
+| `try_subtract` | ✅     |       |
+| `unhex`        | ✅     |       |
+| `uniform`      | ✅     | Constant-folded; literal arguments only (Spark 4.0+) |
+| `width_bucket` | ⚠️     | Wired via shim, bypasses support-level framework ([#4485](https://github.com/apache/datafusion-comet/issues/4485)) |
+
+---
+
+## misc_funcs
+
+| Function                      | Status | Notes |
+| ----------------------------- | ------ | ----- |
+| `aes_decrypt`                 | 🔜     | Falls back; `StaticInvoke` not allowlisted; planned via codegen dispatch (#4558) |
+| `aes_encrypt`                 | 🔜     | Falls back; planned via codegen dispatch (#4558); nondeterministic IV by default |
+| `assert_true`                 | 🔜     | Lowers to `RaiseError`, which falls back |
+| `current_catalog`             | ✅     | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) |
+| `current_database`            | ✅     | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) |
+| `current_schema`              | ✅     | Alias of `current_database`; resolved to a literal by the analyzer |
+| `current_user`                | ✅     | Resolved to a literal by the analyzer; same as `user` |
+| `equal_null`                  | ✅     | Lowers to `<=>` (`EqualNullSafe`) |
+| `is_variant_null`             | 🔜     | tracking #4098 |
+| `monotonically_increasing_id` | ✅     |       |
+| `parse_json`                  | 🔜     | tracking #4098 |
+| `raise_error`                 | 🔜     | Raises a runtime error |
+| `rand`                        | ✅     | Seed must be a literal |
+| `randn`                       | ✅     | Seed must be a literal |
+| `schema_of_variant`           | 🔜     | tracking #4098 |
+| `schema_of_variant_agg`       | 🔜     | tracking #4098 |
+| `session_user`                | ✅     | Alias of `current_user`; resolved to a literal by the analyzer |
+| `spark_partition_id`          | ✅     |       |
+| `to_variant_object`           | 🔜     | tracking #4098 |
+| `try_aes_decrypt`             | 🔜     | Falls back; planned via codegen dispatch (#4558) |
+| `try_parse_json`              | 🔜     | tracking #4098 |
+| `try_variant_get`             | 🔜     | tracking #4098 |
+| `typeof`                      | ✅     | Foldable; resolved to a literal before Comet sees the plan |
+| `user`                        | ✅     | Resolved to a literal by the Spark analyzer before reaching Comet |
+| `variant_get`                 | 🔜     | tracking #4098 |
+
+---
+
+## predicate_funcs
+
+| Function      | Status | Notes |
+| ------------- | ------ | ----- |
+| `!`           | ✅     |       |
+| `<`           | ✅     |       |
+| `<=`          | ✅     |       |
+| `<=>`         | ✅     |       |
+| `=`           | ✅     |       |
+| `==`          | ✅     |       |
+| `>`           | ✅     |       |
+| `>=`          | ✅     |       |
+| `and`         | ✅     |       |
+| `between`     | ✅     |       |
+| `ilike`       | ✅     |       |
+| `in`          | ✅     |       |
+| `isnan`       | ✅     |       |
+| `isnotnull`   | ✅     |       |
+| `isnull`      | ✅     |       |
+| `like`        | ✅     |       |
+| `not`         | ✅     |       |
+| `or`          | ✅     |       |
+| `regexp`      | ⚠️     | Alias for `rlike`; uses Rust `regex` crate, requires `allowIncompatible` |
+| `regexp_like` | ⚠️     | Alias for `rlike`; uses Rust `regex` crate, requires `allowIncompatible` |
+| `rlike`       | ⚠️     | Uses Rust `regex` crate; requires `allowIncompatible`; results may differ from Java `Pattern` |
+
+---
+
+## string_funcs
+
+| Function             | Status | Notes |
+| -------------------- | ------ | ----- |
+| `ascii`              | ✅     |       |
+| `base64`             | 🔜     | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back |
+| `bit_length`         | ✅     |       |
+| `btrim`              | ✅     |       |
+| `char`               | ✅     |       |
+| `char_length`        | ✅     |       |
+| `character_length`   | ✅     |       |
+| `chr`                | ✅     |       |
+| `collate`            | 🔜     | Spark collation (umbrella #2190) |
+| `collation`          | ✅     | Constant-folded to a literal (Spark 4.0+) |
+| `concat_ws`          | ✅     |       |
+| `contains`           | ✅     |       |
+| `decode`             | ✅     |       |
+| `elt`                | 🔜     | #4538 |
+| `encode`             | 🔜     | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back |
+| `endswith`           | ✅     |       |
+| `find_in_set`        | 🔜     | #4538 |
+| `format_number`      | 🔜     | #4538 |
+| `format_string`      | 🔜     | #4538 |
+| `initcap`            | ✅     |       |
+| `instr`              | ✅     |                                               
                                   |
+| `lcase`              | ✅     |                                               
                                   |
+| `left`               | ✅     |                                               
                                   |
+| `len`                | ✅     |                                               
                                   |
+| `length`             | ✅     |                                               
                                   |
+| `levenshtein`        | 🔜     | #4538                                         
                                   |
+| `locate`             | 🔜     | #4538                                         
                                   |
+| `lower`              | ✅     |                                               
                                   |
+| `lpad`               | ✅     |                                               
                                   |
+| `ltrim`              | ✅     |                                               
                                   |
+| `luhn_check`         | ✅     | Native via `StaticInvoke` (tests: 
luhn_check.sql)                                |
+| `mask`               | 🔜     | Data masking                                  
                                   |
+| `octet_length`       | ✅     |                                               
                                   |
+| `overlay`            | 🔜     | #4538                                         
                                   |
+| `position`           | 🔜     | #4538                                         
                                   |
+| `printf`             | 🔜     | #4538                                         
                                   |
+| `regexp_count`       | 🔜     | tracking #4098                                
                                   |
+| `regexp_extract`     | 🔜     | tracking #4098                                
                                   |
+| `regexp_extract_all` | 🔜     | tracking #4098                                
                                   |
+| `regexp_instr`       | 🔜     | tracking #4098                                
                                   |
+| `regexp_replace`     | ✅     |                                               
                                   |
+| `regexp_substr`      | 🔜     | tracking #4098                                
                                   |
+| `repeat`             | ✅     |                                               
                                   |
+| `replace`            | ✅     |                                               
                                   |
+| `right`              | ✅     |                                               
                                   |
+| `rpad`               | ✅     |                                               
                                   |
+| `rtrim`              | ✅     |                                               
                                   |
+| `soundex`            | 🔜     | #4538                                         
                                   |
+| `space`              | ✅     |                                               
                                   |
+| `split`              | ✅     |                                               
                                   |
+| `split_part`         | 🔜     | Lowers to `element_at(StringSplitSQL(...))`; 
`StringSplitSQL` falls back (#4561) |
+| `startswith`         | ✅     |                                               
                                   |
+| `substr`             | ✅     |                                               
                                   |
+| `substring`          | ✅     |                                               
                                   |
+| `substring_index`    | ✅     |                                               
                                   |
+| `to_binary`          | ⚠️     | Only the hex format is accelerated (lowers 
to `unhex`); UTF-8/base64 fall back   |
+| `to_char`            | 🔜     | #4538                                         
                                   |
+| `to_number`          | 🔜     | #4538                                         
                                   |
+| `to_varchar`         | 🔜     | #4538                                         
                                   |
+| `translate`          | ✅     |                                               
                                   |
+| `trim`               | ✅     |                                               
                                   |
+| `try_to_binary`      | 🔜     | Lowers to `TryEval(...)`, which falls back    
                                   |
+| `try_to_number`      | 🔜     | TRY variant of `to_number`                    
                                   |
+| `ucase`              | ✅     |                                               
                                   |
+| `unbase64`           | 🔜     | #4538                                         
                                   |
+| `upper`              | ✅     |                                               
                                   |
+
+---
+
+## struct_funcs
+
+| Function       | Status | Notes                                    |
+| -------------- | ------ | ---------------------------------------- |
+| `named_struct` | ⚠️     | Duplicate field names fall back to Spark |
+| `struct`       | ✅     |                                          |
+
+---
+
+## url_funcs
+
+| Function         | Status | Notes |
+| ---------------- | ------ | ----- |
+| `parse_url`      | ✅     |       |
+| `try_url_decode` | ✅     |       |
+| `url_decode`     | ✅     |       |
+| `url_encode`     | ✅     |       |
+
+---
+
+## window_funcs
+
+Window functions run via `CometWindowExec`. Window support is disabled by default due to known
+correctness issues (tracking [#2721](https://github.com/apache/datafusion-comet/issues/2721)).
+When enabled, `lag` and `lead` are explicitly wired; aggregate window functions (`count`, `min`,
+`max`, `sum`) are also supported. Ranking functions (`rank`, `dense_rank`, `row_number`,
+`ntile`, `percent_rank`, `cume_dist`, `nth_value`) are not yet wired in the window serde and
+fall back to Spark.
+
+| Function       | Status | Notes                             |
+| -------------- | ------ | --------------------------------- |
+| `cume_dist`    | 🔜     | Window function; tracked by #2721 |
+| `dense_rank`   | 🔜     | Window function; tracked by #2721 |
+| `lag`          | ✅     | via `CometWindowExec`             |
+| `lead`         | ✅     | via `CometWindowExec`             |
+| `nth_value`    | 🔜     | Window function; tracked by #2721 |
+| `ntile`        | 🔜     | Window function; tracked by #2721 |
+| `percent_rank` | 🔜     | Window function; tracked by #2721 |
+| `rank`         | 🔜     | Window function; tracked by #2721 |
+| `row_number`   | 🔜     | Window function; tracked by #2721 |
+
+---
+
+## Out-of-scope function list

Review Comment:
   Isn't this a duplicate?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

