Re: [PR] fix: Sort on single struct should fallback to Spark [datafusion-comet]

via GitHub Sun, 11 Aug 2024 22:48:09 -0700


viirya commented on code in PR #811:
URL: https://github.com/apache/datafusion-comet/pull/811#discussion_r1713194759



##########
docs/source/user-guide/configs.md:
##########
@@ -63,8 +63,8 @@ Comet provides the following configuration settings.
 | spark.comet.nativeLoadRequired | Whether to require Comet native library to load successfully when Comet is enabled. If not, Comet will silently fallback to Spark when it fails to load the native lib. Otherwise, an error will be thrown and the Spark job will be aborted. | false |
 | spark.comet.parquet.enable.directBuffer | Whether to use Java direct byte buffer when reading Parquet. By default, this is false | false |
 | spark.comet.regexp.allowIncompatible | Comet is not currently fully compatible with Spark for all regular expressions. Set this config to true to allow them anyway using Rust's regular expression engine. See compatibility guide for more information. | false |
-| spark.comet.sparkToColumnar.supportedOperatorList | A comma-separated list of operators that will be converted to Comet columnar format when 'spark.comet.sparkToColumnar.enabled' is true | Range,InMemoryTableScan |
 | spark.comet.scan.enabled | Whether to enable Comet scan. When this is turned on, Spark will use Comet to read Parquet data source. Note that to enable native vectorized execution, both this config and 'spark.comet.exec.enabled' need to be enabled. By default, this config is true. | true |
 | spark.comet.scan.preFetch.enabled | Whether to enable pre-fetching feature of CometScan. By default is disabled. | false |
 | spark.comet.scan.preFetch.threadNum | The number of threads running pre-fetching for CometScan. Effective if spark.comet.scan.preFetch.enabled is enabled. By default it is 2. Note that more pre-fetching threads means more memory requirement to store pre-fetched row groups. | 2 |
 | spark.comet.shuffle.preferDictionary.ratio | The ratio of total values to distinct values in a string column to decide whether to prefer dictionary encoding when shuffling the column. If the ratio is higher than this config, dictionary encoding will be used on shuffling string column. This config is effective if it is higher than 1.0. By default, this config is 10.0. Note that this config is only used when `spark.comet.exec.shuffle.mode` is `jvm`. | 10.0 |
+| spark.comet.sparkToColumnar.supportedOperatorList | A comma-separated list of operators that will be converted to Comet columnar format when 'spark.comet.sparkToColumnar.enabled' is true | Range,InMemoryTableScan |

Review Comment:
   The document is updated automatically when `make release` is run locally.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

