Re: [PR] Extract parquet statistics from timestamps with timezones [datafusion]

2024-06-03 Thread via GitHub
xinlifoobar commented on code in PR #10766: URL: https://github.com/apache/datafusion/pull/10766#discussion_r1623933162 ## datafusion/core/tests/parquet/mod.rs: ## @@ -403,9 +437,13 @@ fn make_timestamp_batch(offset: Duration) -> RecordBatch { schema, vec![

[I] Merge `ScalarUDFImpl`'s `invoke_no_args` and `invoke` into one method [datafusion]

2024-06-03 Thread via GitHub
lewiszlw opened a new issue, #10773: URL: https://github.com/apache/datafusion/issues/10773 ### Is your feature request related to a problem or challenge? The current API design is not very elegant, and users need to handle the no-args and args cases separately. ### Describe the solution

[PR] Update rstest requirement from 0.20.0 to 0.21.0 [datafusion]

2024-06-03 Thread via GitHub
dependabot[bot] opened a new pull request, #10774: URL: https://github.com/apache/datafusion/pull/10774 Updates the requirements on [rstest](https://github.com/la10736/rstest) to permit the latest version. Release notes Sourced from https://github.com/la10736/rstest/releases rste

Re: [I] Extract parquet statistics from `Duration` columns [datafusion]

2024-06-03 Thread via GitHub
marvinlanhenke commented on issue #10754: URL: https://github.com/apache/datafusion/issues/10754#issuecomment-2144639343 @alamb ...also while looking into this. I think `Duration` is not [supported](https://github.com/apache/arrow-rs/blob/master/parquet/src/arrow/schema/mod.rs#L451), th

[PR] Fix extract parquet statistics from LargeBinary columns [datafusion]

2024-06-03 Thread via GitHub
xinlifoobar opened a new pull request, #10775: URL: https://github.com/apache/datafusion/pull/10775 ## Which issue does this PR close? Closes #10753 ## Rationale for this change ## What changes are included in this PR? ## Are these changes

[I] Misaligned datapoints [datafusion]

2024-06-03 Thread via GitHub
maronavenue opened a new issue, #10776: URL: https://github.com/apache/datafusion/issues/10776 ### Describe the bug I have implemented a custom `TableProvider` which sources a test csv that can be found [here](https://github.com/maronavenue/datafusion-example/blob/main/tests/data/exa

[PR] Fix extract parquet statistics from Decimal256 columns [datafusion]

2024-06-03 Thread via GitHub
xinlifoobar opened a new pull request, #10777: URL: https://github.com/apache/datafusion/pull/10777 ## Which issue does this PR close? Closes #10755 . ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [I] Library Guide: Building LogicalPlans [datafusion]

2024-06-03 Thread via GitHub
alamb commented on issue #7306: URL: https://github.com/apache/datafusion/issues/7306#issuecomment-2144852119 I think @andygrove started on a pre-processor that extracted the markdown and tested it here https://github.com/apache/datafusion/pull/7956 Maybe we can revive that idea

Re: [I] Library Guide: Building LogicalPlans [datafusion]

2024-06-03 Thread via GitHub
edmondop commented on issue #7306: URL: https://github.com/apache/datafusion/issues/7306#issuecomment-2144879448 I have created https://github.com/apache/datafusion/issues/10768 which I think would be a replacement of the Python scripts with something maintained somewhere else

[PR] Document Committer and PMC process [datafusion]

2024-06-03 Thread via GitHub
alamb opened a new pull request, #10778: URL: https://github.com/apache/datafusion/pull/10778 ## Which issue does this PR close? Closes https://github.com/apache/datafusion/issues/10479 ## Rationale for this change As part of governing DataFusion in the open and via t

Re: [PR] Document Committer and PMC process [datafusion]

2024-06-03 Thread via GitHub
alamb commented on code in PR #10778: URL: https://github.com/apache/datafusion/pull/10778#discussion_r1624248642 ## docs/source/contributor-guide/inviting.md: ## @@ -0,0 +1,427 @@ + + +# Inviting New Committers and PMC Members + +This is a cookbook of the recommended DataFusion

Re: [PR] feat: support unparsing LogicalPlan::Window nodes [datafusion]

2024-06-03 Thread via GitHub
devinjdangelo commented on code in PR #10767: URL: https://github.com/apache/datafusion/pull/10767#discussion_r1624265772 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -127,7 +127,10 @@ fn roundtrip_statement() -> Result<()> { UNION ALL SELECT j2_

Re: [PR] feat: support unparsing LogicalPlan::Window nodes [datafusion]

2024-06-03 Thread via GitHub
devinjdangelo commented on code in PR #10767: URL: https://github.com/apache/datafusion/pull/10767#discussion_r1624273646 ## datafusion/sql/src/unparser/utils.rs: ## @@ -82,3 +91,28 @@ pub(crate) fn unproject_agg_exprs(expr: &Expr, agg: &Aggregate) -> Result })

Re: [PR] Introduce Sum UDAF [datafusion]

2024-06-03 Thread via GitHub
jayzhan211 commented on PR #10651: URL: https://github.com/apache/datafusion/pull/10651#issuecomment-2144982759 🚀 Thanks @mustafasrepo and @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Introduce Sum UDAF [datafusion]

2024-06-03 Thread via GitHub
jayzhan211 merged PR #10651: URL: https://github.com/apache/datafusion/pull/10651

Re: [I] Make ASF public press release [datafusion]

2024-06-03 Thread via GitHub
alamb commented on issue #10403: URL: https://github.com/apache/datafusion/issues/10403#issuecomment-2145167166 Update here is we have submitted a draft to the ASF publicity chairs who are working the process

Re: [I] Create presentation for DataFusion SIGMOD 2024 paper [datafusion]

2024-06-03 Thread via GitHub
alamb commented on issue #10480: URL: https://github.com/apache/datafusion/issues/10480#issuecomment-2145170179 I got good feedback and I am feeling pretty good about the slides. I plan to practice a recording of the talk later this week and post the recording to youtube

Re: [I] API in ParquetExec to pass in RowSelections to `ParquetExec` (enable custom indexes, finer grained pushdown) [datafusion]

2024-06-03 Thread via GitHub
alamb commented on issue #9929: URL: https://github.com/apache/datafusion/issues/9929#issuecomment-2145171946 Update here is I have the API for specifying the selection sketched out here: https://github.com/apache/datafusion/pull/10738

Re: [PR] Move `Count` to `functions-aggregate` [datafusion]

2024-06-03 Thread via GitHub
jayzhan211 commented on code in PR #10484: URL: https://github.com/apache/datafusion/pull/10484#discussion_r1624466476 ## datafusion-cli/Cargo.toml: ## @@ -26,7 +26,7 @@ license = "Apache-2.0" homepage = "https://datafusion.apache.org" repository = "https://github.com/apache/

Re: [PR] build(deps): upgrade sqlparser to 0.47.0 [datafusion]

2024-06-03 Thread via GitHub
alamb commented on PR #10392: URL: https://github.com/apache/datafusion/pull/10392#issuecomment-2145223156 My plan for this PR is to disable the failing test on windows and file a ticket to investigate

Re: [I] DataFusion weekly project plan (Andrew Lamb) - May 27, 2024 [datafusion]

2024-06-03 Thread via GitHub
alamb commented on issue #10699: URL: https://github.com/apache/datafusion/issues/10699#issuecomment-2145246707 Next week https://github.com/apache/datafusion/issues/10779

Re: [I] DataFusion weekly project plan (Andrew Lamb) - May 27, 2024 [datafusion]

2024-06-03 Thread via GitHub
alamb closed issue #10699: DataFusion weekly project plan (Andrew Lamb) - May 27, 2024 URL: https://github.com/apache/datafusion/issues/10699

Re: [PR] feat: Use enum to represent CAST eval_mode in expr.proto [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #415: URL: https://github.com/apache/datafusion-comet/pull/415#discussion_r1624491362 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -535,12 +549,14 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {

Re: [PR] fix: use total ordering in the min & max accumulator for floats [datafusion]

2024-06-03 Thread via GitHub
alamb commented on PR #10627: URL: https://github.com/apache/datafusion/pull/10627#issuecomment-2145251913 @westonpace what is the status / plan with this PR? It has failing CI tests but is not marked as a draft. Are you still planning on working with it? Do you need help to push it along?
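For readers unfamiliar with what "total ordering" means for floats in this thread's title, here is a minimal Rust sketch. It relies on the standard library's `f64::total_cmp`, which implements the IEEE 754 `totalOrder` predicate. The helper names `min_total` and `max_total` are hypothetical, and nothing here is taken from the PR itself, whose approach is not shown in this listing.

```rust
/// Hypothetical helper: maximum of a slice under IEEE 754 total ordering.
/// Unlike `partial_cmp`, `total_cmp` always returns an ordering: -0.0 sorts
/// below +0.0, positive NaN sorts above +infinity, and negative NaN sorts
/// below -infinity.
fn max_total(values: &[f64]) -> Option<f64> {
    values.iter().copied().max_by(|a, b| a.total_cmp(b))
}

/// Hypothetical helper: minimum of a slice under the same ordering.
fn min_total(values: &[f64]) -> Option<f64> {
    values.iter().copied().min_by(|a, b| a.total_cmp(b))
}
```

With these helpers, `min_total(&[0.0, -0.0])` yields negative zero and `max_total(&[-0.0, 0.0])` yields positive zero, a distinction that comparison with `<` cannot make.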

Re: [PR] feat: add hex scalar function [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #449: URL: https://github.com/apache/datafusion-comet/pull/449#discussion_r1624513049 ## core/src/execution/datafusion/expressions/scalar_funcs/hex.rs: ## @@ -0,0 +1,296 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or mo

Re: [PR] feat: Add "Comet Fuzz" fuzz-testing utility [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #472: URL: https://github.com/apache/datafusion-comet/pull/472#discussion_r1624519642 ## fuzz-testing/src/main/scala/org/apache/comet/fuzz/QueryGen.scala: ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#discussion_r1624542497 ## build_for_centos7.sh: ## @@ -0,0 +1,5 @@ +docker build -t comet_build_env_centos7:1.0 -f core/comet_build_env_centos7.dockerfile Review Comment: This f

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#discussion_r1624542868 ## core/comet_build_env_centos7.dockerfile: ## @@ -0,0 +1,36 @@ +FROM centos:7 Review Comment: This file needs an ASF licence header

[I] [NOT A BUG] Why comet does not convert the HashAggregate expression to native in my query? [datafusion-comet]

2024-06-03 Thread via GitHub
SemyonSinchenko opened a new issue, #503: URL: https://github.com/apache/datafusion-comet/issues/503 ### Describe the bug I'm running a query that does the following: 1. Read parquet files 2. Generate a lot of case-when columns 3. Run groupBy + agg on top of those columns

Re: [I] java.lang.NoSuchMethodError: org.apache.spark.sql.execution.FileSourceScanExec.copy$default$9() [datafusion-comet]

2024-06-03 Thread via GitHub
SemyonSinchenko closed issue #190: java.lang.NoSuchMethodError: org.apache.spark.sql.execution.FileSourceScanExec.copy$default$9() URL: https://github.com/apache/datafusion-comet/issues/190

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#issuecomment-2145375131 I ran into a compilation error when running `make release PROFILES="-Pspark-3.5"`: ``` [ERROR] /home/andy/git/apache/datafusion-comet/spark/src/test/scala/org/apache/sp

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#issuecomment-2145401283 I hacked my local copy to call `getNormalizedQueryExecutionResult` instead of `getNormalizedResult` and was then able to run the example from the installation guide. :rocket:

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#discussion_r1624619475 ## spark/src/main/spark-3.3/org/apache/comet/parquet/CometParquetFileFormat.scala: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[PR] Support negatives in split part [datafusion]

2024-06-03 Thread via GitHub
tshauck opened a new pull request, #10780: URL: https://github.com/apache/datafusion/pull/10780 ## Which issue does this PR close? Closes #10761 ## Rationale for this change Postgres at least supports negative indexes in split_part. It'd be nice for datafusion to simila
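To illustrate the behaviour this PR asks for, here is a minimal standalone sketch of a Postgres-style `split_part` in which a negative index counts from the end of the string. It is not DataFusion's implementation. The function name, the `Option` return type, and the choice to return `None` for index 0 (where Postgres raises an error) are assumptions made for this example.

```rust
/// Sketch of a Postgres-style `split_part` that accepts negative indexes.
/// Positive `n` counts fields from the start (1-based); negative `n` counts
/// from the end (-1 is the last field). Index 0 is invalid, modelled here as
/// `None`. An out-of-range index yields an empty string.
fn split_part(input: &str, delimiter: &str, n: i64) -> Option<String> {
    if n == 0 {
        return None;
    }
    // Assumption: an empty delimiter treats the whole input as one field.
    let parts: Vec<&str> = if delimiter.is_empty() {
        vec![input]
    } else {
        input.split(delimiter).collect()
    };
    let len = parts.len() as i64;
    // Map the 1-based (or negative) index onto a 0-based position.
    let idx = if n > 0 { n - 1 } else { len + n };
    if idx < 0 || idx >= len {
        return Some(String::new());
    }
    Some(parts[idx as usize].to_string())
}
```

For example, `split_part("abc~@~def~@~ghi", "~@~", -1)` gives the last field, `ghi`, and an index of `-4` on the same input gives an empty string because only three fields exist.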

Re: [PR] feat: add hex scalar function [datafusion-comet]

2024-06-03 Thread via GitHub
viirya commented on code in PR #449: URL: https://github.com/apache/datafusion-comet/pull/449#discussion_r1624630592 ## core/src/execution/datafusion/expressions/scalar_funcs/hex.rs: ## @@ -0,0 +1,296 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#discussion_r1624640327 ## spark/src/main/spark-3.3/org/apache/comet/parquet/CometParquetFileFormat.scala: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF

Re: [PR] build(deps): upgrade sqlparser to 0.47.0 [datafusion]

2024-06-03 Thread via GitHub
alamb commented on code in PR #10392: URL: https://github.com/apache/datafusion/pull/10392#discussion_r1624642596 ## datafusion/expr/src/logical_plan/ddl.rs: ## @@ -341,29 +341,8 @@ pub struct CreateFunctionBody { pub language: Option, /// IMMUTABLE | STABLE | VOLATILE

Re: [PR] build(deps): upgrade sqlparser to 0.47.0 [datafusion]

2024-06-03 Thread via GitHub
tisonkun commented on code in PR #10392: URL: https://github.com/apache/datafusion/pull/10392#discussion_r1624648204 ## datafusion/expr/src/logical_plan/ddl.rs: ## @@ -341,29 +341,8 @@ pub struct CreateFunctionBody { pub language: Option, /// IMMUTABLE | STABLE | VOLAT

Re: [I] Extract parquet statistics from timestamps with timezones [datafusion]

2024-06-03 Thread via GitHub
comphead closed issue #10758: Extract parquet statistics from timestamps with timezones URL: https://github.com/apache/datafusion/issues/10758

Re: [PR] feat: add hex scalar function [datafusion-comet]

2024-06-03 Thread via GitHub
viirya commented on code in PR #449: URL: https://github.com/apache/datafusion-comet/pull/449#discussion_r1624649245 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -1038,6 +1038,20 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSpar

Re: [PR] Extract parquet statistics from timestamps with timezones [datafusion]

2024-06-03 Thread via GitHub
comphead merged PR #10766: URL: https://github.com/apache/datafusion/pull/10766

Re: [PR] Extract parquet statistics from Time32 and Time64 columns [datafusion]

2024-06-03 Thread via GitHub
comphead commented on code in PR #10771: URL: https://github.com/apache/datafusion/pull/10771#discussion_r1624656349 ## datafusion/core/tests/parquet/mod.rs: ## @@ -442,6 +450,55 @@ fn make_int_batches(start: i8, end: i8) -> RecordBatch { .unwrap() } +/// Return record b

Re: [PR] feat: add hex scalar function [datafusion-comet]

2024-06-03 Thread via GitHub
viirya commented on code in PR #449: URL: https://github.com/apache/datafusion-comet/pull/449#discussion_r1624657434 ## core/src/execution/datafusion/expressions/scalar_funcs/hex.rs: ## @@ -0,0 +1,306 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

Re: [PR] Minor: Add tests for extracting dictionary parquet statistics [datafusion]

2024-06-03 Thread via GitHub
comphead merged PR #10729: URL: https://github.com/apache/datafusion/pull/10729

Re: [PR] Speed up arrow_statistics test [datafusion]

2024-06-03 Thread via GitHub
comphead commented on code in PR #10735: URL: https://github.com/apache/datafusion/pull/10735#discussion_r1624659827 ## datafusion/core/tests/parquet/arrow_statistics.rs: ## @@ -159,9 +159,9 @@ impl TestReader { } /// Defines a test case for statistics extraction -struct Tes

Re: [PR] Speed up arrow_statistics test [datafusion]

2024-06-03 Thread via GitHub
comphead commented on PR #10735: URL: https://github.com/apache/datafusion/pull/10735#issuecomment-2145522349 We got lots of statistics PRs so conflicts are unavoidable

Re: [PR] Document Committer and PMC process [datafusion]

2024-06-03 Thread via GitHub
viirya commented on code in PR #10778: URL: https://github.com/apache/datafusion/pull/10778#discussion_r1624661849 ## docs/source/contributor-guide/inviting.md: ## @@ -0,0 +1,427 @@ + + +# Inviting New Committers and PMC Members + +This is a cookbook of the recommended DataFusio

Re: [PR] Minor: consider precision/length parameter for varchar/char types [datafusion]

2024-06-03 Thread via GitHub
comphead commented on code in PR #10746: URL: https://github.com/apache/datafusion/pull/10746#discussion_r1624663831 ## datafusion/sqllogictest/test_files/strings.slt: ## @@ -78,3 +78,14 @@ e1 p2 p2e1 p2m1e1 + + +# Truncate +# would error since the length is less than current

Re: [PR] Minor: consider precision/length parameter for varchar/char types [datafusion]

2024-06-03 Thread via GitHub
comphead commented on code in PR #10746: URL: https://github.com/apache/datafusion/pull/10746#discussion_r1624664619 ## datafusion/sql/src/expr/mod.rs: ## @@ -274,6 +276,34 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { if let Some(format) = format {

Re: [PR] Document Committer and PMC process [datafusion]

2024-06-03 Thread via GitHub
viirya commented on code in PR #10778: URL: https://github.com/apache/datafusion/pull/10778#discussion_r1624680195 ## docs/source/contributor-guide/inviting.md: ## @@ -0,0 +1,427 @@ + + +# Inviting New Committers and PMC Members + +This is a cookbook of the recommended DataFusio

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#discussion_r1624688003 ## spark/src/main/spark-3.3/org/apache/comet/parquet/CometParquetFileFormat.scala: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF

Re: [I] Precision/length parameter of varchar/char types is ignored [datafusion]

2024-06-03 Thread via GitHub
lowka commented on issue #10743: URL: https://github.com/apache/datafusion/issues/10743#issuecomment-2145553724 I updated the issue to make it clear that it is not only a problem with literals.

Re: [PR] Minor: consider precision/length parameter for varchar/char types [datafusion]

2024-06-03 Thread via GitHub
lowka commented on PR #10746: URL: https://github.com/apache/datafusion/pull/10746#issuecomment-214366 @Lordworms Thank you for looking into this issue. > Currently, I think it is tricky to do 2 and 3 since Datafusion would transform the ast::DataType into arrow::DataType ...

Re: [I] [NOT A BUG] Why comet does not convert the HashAggregate expression to native in my query? [datafusion-comet]

2024-06-03 Thread via GitHub
viirya commented on issue #503: URL: https://github.com/apache/datafusion-comet/issues/503#issuecomment-2145556454 Could you run a simple query to verify if Comet shuffle can be triggered?

Re: [I] [NOT A BUG] Why comet does not convert the HashAggregate expression to native in my query? [datafusion-comet]

2024-06-03 Thread via GitHub
viirya commented on issue #503: URL: https://github.com/apache/datafusion-comet/issues/503#issuecomment-2145563035 Oh, could you disable `spark.sql.adaptive.coalescePartitions.enabled` and retry?

Re: [PR] feat: Support Ansi mode in abs function [datafusion-comet]

2024-06-03 Thread via GitHub
vaibhawvipul commented on code in PR #500: URL: https://github.com/apache/datafusion-comet/pull/500#discussion_r1624694390 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -1474,15 +1481,14 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSer

Re: [PR] feat: Update Parquet row filtering to handle type coercion [datafusion]

2024-06-03 Thread via GitHub
alamb commented on code in PR #10716: URL: https://github.com/apache/datafusion/pull/10716#discussion_r1624676825 ## datafusion/core/src/datasource/schema_adapter.rs: ## @@ -75,9 +75,16 @@ pub trait SchemaAdapter: Send + Sync { /// Creates a `SchemaMapping` that can be used t

Re: [I] XxHash64 hash function support [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove closed issue #344: XxHash64 hash function support URL: https://github.com/apache/datafusion-comet/issues/344

Re: [PR] feat: Add xxhash64 function support [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove merged PR #424: URL: https://github.com/apache/datafusion-comet/pull/424

Re: [PR] feat: Use enum to represent CAST eval_mode in expr.proto [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on PR #415: URL: https://github.com/apache/datafusion-comet/pull/415#issuecomment-2145580025 @viirya @kazuyukitanimura @parthchandra @huaxingao I plan on merging this one soon unless you want to review?

Re: [PR] feat: add hex scalar function [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove merged PR #449: URL: https://github.com/apache/datafusion-comet/pull/449

Re: [PR] feat: add hex scalar function [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on code in PR #449: URL: https://github.com/apache/datafusion-comet/pull/449#discussion_r1624713927 ## core/src/execution/datafusion/expressions/scalar_funcs/hex.rs: ## @@ -0,0 +1,306 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or mo

[I] Regression in `first_value` coercsion [datafusion]

2024-06-03 Thread via GitHub
appletreeisyellow opened a new issue, #10781: URL: https://github.com/apache/datafusion/issues/10781 ### Describe the bug There is a regression in `first_value` coercion after https://github.com/apache/datafusion/pull/10648 was merged. The error message looks like: `Error duri

Re: [I] Switch back to released version of DataFusion and arrow-rs after Arrow Java 16 is released [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on issue #248: URL: https://github.com/apache/datafusion-comet/issues/248#issuecomment-2145607388 The release vote for arrow-rs 52 has started, and there is a draft PR in DataFusion to upgrade to this version: https://github.com/apache/datafusion/pull/10765

Re: [I] Switch back to released version of DataFusion and arrow-rs after Arrow Java 16 is released [datafusion-comet]

2024-06-03 Thread via GitHub
viirya commented on issue #248: URL: https://github.com/apache/datafusion-comet/issues/248#issuecomment-2145613047 Cool. I will update the PR accordingly once we get DataFusion and arrow-rs releases.

Re: [I] Regression in `first_value` coercsion [datafusion]

2024-06-03 Thread via GitHub
appletreeisyellow commented on issue #10781: URL: https://github.com/apache/datafusion/issues/10781#issuecomment-2145615556 Seems like it is fixed in https://github.com/apache/datafusion/issues/10703. I'm verifying

Re: [PR] Move `DynamicFileCatalog` back to core [datafusion]

2024-06-03 Thread via GitHub
goldmedal commented on code in PR #10745: URL: https://github.com/apache/datafusion/pull/10745#discussion_r1624740586 ## datafusion/core/Cargo.toml: ## @@ -158,6 +159,16 @@ tokio-postgres = "0.7.7" [target.'cfg(not(target_os = "windows"))'.dev-dependencies] nix = { version = "

Re: [PR] feat: Add "Comet Fuzz" fuzz-testing utility [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on PR #472: URL: https://github.com/apache/datafusion-comet/pull/472#issuecomment-2145648950 Thanks for the review @viirya

Re: [PR] feat: Add "Comet Fuzz" fuzz-testing utility [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove merged PR #472: URL: https://github.com/apache/datafusion-comet/pull/472

Re: [PR] Move `DynamicFileCatalog` back to core [datafusion]

2024-06-03 Thread via GitHub
goldmedal commented on PR #10745: URL: https://github.com/apache/datafusion/pull/10745#issuecomment-2145649820 > I'm not sure if it's related to the `macos-latest` GitHub runner. I can't reproduce it locally. I have checked th

Re: [I] Regression in `first_value` coercsion [datafusion]

2024-06-03 Thread via GitHub
appletreeisyellow closed issue #10781: Regression in `first_value` coercsion URL: https://github.com/apache/datafusion/issues/10781

Re: [I] Regression in `first_value` coercsion [datafusion]

2024-06-03 Thread via GitHub
appletreeisyellow commented on issue #10781: URL: https://github.com/apache/datafusion/issues/10781#issuecomment-2145651031 I verified that https://github.com/apache/datafusion/pull/10651 fixed the regression. Thank you @jayzhan211!

Re: [I] [NOT A BUG] Why comet does not convert the HashAggregate expression to native in my query? [datafusion-comet]

2024-06-03 Thread via GitHub
SemyonSinchenko commented on issue #503: URL: https://github.com/apache/datafusion-comet/issues/503#issuecomment-2145654862 Wow! With a disabled `spark.sql.adaptive.coalescePartitions.enabled` it works! May I open a PR with updates to documentation? Looks like I need to update [this page](

Re: [I] [NOT A BUG] Why comet does not convert the HashAggregate expression to native in my query? [datafusion-comet]

2024-06-03 Thread via GitHub
viirya commented on issue #503: URL: https://github.com/apache/datafusion-comet/issues/503#issuecomment-2145662099 Yea, you can open a PR to update the document. It should be a temporary limitation, though, and we are working on removing it.

Re: [PR] feat: Use enum to represent CAST eval_mode in expr.proto [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove merged PR #415: URL: https://github.com/apache/datafusion-comet/pull/415

Re: [I] chore: Use enum to represent CAST eval_mode in expr.proto [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove closed issue #361: chore: Use enum to represent CAST eval_mode in expr.proto URL: https://github.com/apache/datafusion-comet/issues/361

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
kazuyukitanimura commented on code in PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#discussion_r1624838287 ## spark/src/main/spark-3.3/org/apache/comet/parquet/CometParquetFileFormat.scala: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundati

[PR] chore: Switch to stable Rust [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove opened a new pull request, #505: URL: https://github.com/apache/datafusion-comet/pull/505 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes t

Re: [PR] feat: Support Ansi mode in abs function [datafusion-comet]

2024-06-03 Thread via GitHub
planga82 commented on code in PR #500: URL: https://github.com/apache/datafusion-comet/pull/500#discussion_r1624851281 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -1474,15 +1481,14 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde w

Re: [PR] feat: Switch to use Rust stable by default [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove closed pull request #373: feat: Switch to use Rust stable by default URL: https://github.com/apache/datafusion-comet/pull/373

Re: [PR] feat: Switch to use Rust stable by default [datafusion-comet]

2024-06-03 Thread via GitHub
andygrove commented on PR #373: URL: https://github.com/apache/datafusion-comet/pull/373#issuecomment-2145817199 I branched from this and created https://github.com/apache/datafusion-comet/pull/505, so closing this one. Thanks @sunchao for getting this started.

Re: [PR] Profile spark3.5.1 and centos7 for compatible on spark 3.5.1 and centos7 old glic 2.7 [datafusion-comet]

2024-06-03 Thread via GitHub
parthchandra commented on code in PR #491: URL: https://github.com/apache/datafusion-comet/pull/491#discussion_r1624873799 ## spark/src/main/spark-3.3/org/apache/comet/parquet/CometParquetFileFormat.scala: ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Update rstest requirement from 0.20.0 to 0.21.0 [datafusion]

2024-06-03 Thread via GitHub
alamb merged PR #10774: URL: https://github.com/apache/datafusion/pull/10774

Re: [I] Refactor size estimation of Hashset into a function [datafusion]

2024-06-03 Thread via GitHub
alamb closed issue #8764: Refactor size estimation of Hashset into a function URL: https://github.com/apache/datafusion/issues/8764

Re: [PR] Minor: Refactor memory size estimation for HashTable [datafusion]

2024-06-03 Thread via GitHub
alamb merged PR #10748: URL: https://github.com/apache/datafusion/pull/10748

Re: [PR] Minor: Refactor memory size estimation for HashTable [datafusion]

2024-06-03 Thread via GitHub
alamb commented on PR #10748: URL: https://github.com/apache/datafusion/pull/10748#issuecomment-2145882319 Thanks again @marvinlanhenke

Re: [PR] Speed up arrow_statistics test [datafusion]

2024-06-03 Thread via GitHub
alamb commented on PR #10735: URL: https://github.com/apache/datafusion/pull/10735#issuecomment-2145883313 > We got lots of statistics PRs so conflicts are unavoidable Thanks for the heads up @comphead -- I plan to get the other statistics tests in first and then I will merge / fixup

Re: [PR] Reduce code repetition in `datafusion/functions` mod files [datafusion]

2024-06-03 Thread via GitHub
alamb merged PR #10700: URL: https://github.com/apache/datafusion/pull/10700

Re: [I] Reduce repetition in datafusion::functions using macros [datafusion]

2024-06-03 Thread via GitHub
alamb closed issue #10397: Reduce repetition in datafusion::functions using macros URL: https://github.com/apache/datafusion/issues/10397

Re: [PR] Reduce code repetition in `datafusion/functions` mod files [datafusion]

2024-06-03 Thread via GitHub
alamb commented on code in PR #10700: URL: https://github.com/apache/datafusion/pull/10700#discussion_r1624896673 ## datafusion/functions/src/macros.rs: ## @@ -36,25 +36,31 @@ ///] /// } /// ``` +/// +/// Exported functions accept: +/// - `Vec` argument (single argument f

Re: [PR] Cleanup GetIndexedField [datafusion]

2024-06-03 Thread via GitHub
alamb commented on PR #10769: URL: https://github.com/apache/datafusion/pull/10769#issuecomment-2145890532 Thank you @lewiszlw . I merged up from main to resolve a conflict.

Re: [PR] Minor: (Doc) Enable rt-multi-thread feature for sample code [datafusion]

2024-06-03 Thread via GitHub
alamb merged PR #10770: URL: https://github.com/apache/datafusion/pull/10770

Re: [I] Update split_part to support negative indexes vs failing [datafusion]

2024-06-03 Thread via GitHub
alamb closed issue #10761: Update split_part to support negative indexes vs failing URL: https://github.com/apache/datafusion/issues/10761

Re: [PR] Support negatives in split part [datafusion]

2024-06-03 Thread via GitHub
alamb merged PR #10780: URL: https://github.com/apache/datafusion/pull/10780

Re: [PR] Support negatives in split part [datafusion]

2024-06-03 Thread via GitHub
alamb commented on code in PR #10780: URL: https://github.com/apache/datafusion/pull/10780#discussion_r1624904885 ## datafusion/functions/src/string/split_part.rs: ## @@ -97,14 +97,21 @@ fn split_part(args: &[ArrayRef]) -> Result { .zip(n_array.iter()) .map(|(

Re: [PR] Fix extract parquet statistics from Decimal256 columns [datafusion]

2024-06-03 Thread via GitHub
alamb commented on code in PR #10777: URL: https://github.com/apache/datafusion/pull/10777#discussion_r1624905947 ## datafusion/core/src/datasource/physical_plan/parquet/statistics.rs: ## @@ -594,6 +629,42 @@ mod test { .unwrap(), ), }

Re: [PR] Reduce code repetition in `datafusion/functions` mod files [datafusion]

2024-06-03 Thread via GitHub
MohamedAbdeen21 commented on PR #10700: URL: https://github.com/apache/datafusion/pull/10700#issuecomment-2145901923 Thanks for the review! @jayzhan211 @alamb
