[PR] Implement unparse `ScalarVariable` to String [datafusion]

2024-05-16 Thread via GitHub
reswqa opened a new pull request, #10541: URL: https://github.com/apache/datafusion/pull/10541 ## Which issue does this PR close? Closes #10518. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

Re: [PR] Implement unparse `ScalarVariable` to String [datafusion]

2024-05-16 Thread via GitHub
reswqa commented on code in PR #10541: URL: https://github.com/apache/datafusion/pull/10541#discussion_r1602756692 ## datafusion/sql/src/unparser/expr.rs: ## @@ -388,8 +388,18 @@ impl Unparser<'_> { expr: Box::new(sql_parser_expr), })

Re: [PR] Move min_max unit tests to slt [datafusion]

2024-05-16 Thread via GitHub
jayzhan211 merged PR #10539: URL: https://github.com/apache/datafusion/pull/10539 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@dat

Re: [PR] Move min_max unit tests to slt [datafusion]

2024-05-16 Thread via GitHub
jayzhan211 commented on PR #10539: URL: https://github.com/apache/datafusion/pull/10539#issuecomment-2114288420 Thanks @xinlifoobar and @yyy1000

[PR] Implement Unparse TryCast Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
xinlifoobar opened a new pull request, #10542: URL: https://github.com/apache/datafusion/pull/10542 ## Which issue does this PR close? Closes #10520 ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [I] `GroupingSet` Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
xinlifoobar commented on issue #10521: URL: https://github.com/apache/datafusion/issues/10521#issuecomment-2114571082 Hi @alamb, I am wondering whether there is a way to implement the `cube` and `rollup` functions easily. The inner vector is expanded in the planner.

Re: [PR] Implement conversion from ColumnStatistics to NullableInterval [datafusion]

2024-05-16 Thread via GitHub
dmitrybugakov commented on code in PR #10510: URL: https://github.com/apache/datafusion/pull/10510#discussion_r1602926580 ## datafusion/expr/src/interval_arithmetic.rs: ## @@ -1469,6 +1472,8 @@ pub enum NullableInterval { MaybeNull { values: Interval }, /// The value i

Re: [I] select multiple columns in a single `Expr` [datafusion]

2024-05-16 Thread via GitHub
jayzhan211 commented on issue #10102: URL: https://github.com/apache/datafusion/issues/10102#issuecomment-2114676662 > However, in the case of COLUMNS('number\d+'), you need to have all the columns, and only return few of them from the function I agree, we can't get all the columns by

Re: [PR] test: some tests to write data to a parquet file and read its metadata [datafusion]

2024-05-16 Thread via GitHub
tustvold commented on code in PR #10537: URL: https://github.com/apache/datafusion/pull/10537#discussion_r1603007745 ## datafusion/core/src/datasource/physical_plan/parquet/arrow_statistics.rs: ## @@ -0,0 +1,43 @@ +use arrow_array::ArrayRef; +use arrow_schema::DataType; +use dat

Re: [PR] test: some tests to write data to a parquet file and read its metadata [datafusion]

2024-05-16 Thread via GitHub
tustvold commented on code in PR #10537: URL: https://github.com/apache/datafusion/pull/10537#discussion_r1603013478 ## datafusion/core/tests/parquet/arrow_statistics.rs: ## @@ -0,0 +1,528 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [I] `OuterColumnReference` Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
goldmedal commented on issue #10523: URL: https://github.com/apache/datafusion/issues/10523#issuecomment-2114779250 take

Re: [PR] Implement unparse `IsNotFalse` to String [datafusion]

2024-05-16 Thread via GitHub
crepererum merged PR #10538: URL: https://github.com/apache/datafusion/pull/10538

Re: [PR] Implement Unparse TryCast Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
crepererum merged PR #10542: URL: https://github.com/apache/datafusion/pull/10542

Re: [I] `TryCast` Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
crepererum closed issue #10520: `TryCast` Expr --> String Support URL: https://github.com/apache/datafusion/issues/10520

Re: [PR] Implement unparse `Placeholder` to String [datafusion]

2024-05-16 Thread via GitHub
crepererum commented on PR #10540: URL: https://github.com/apache/datafusion/pull/10540#issuecomment-2114804519 needs a rebase (I've merged a bunch of other unparse PRs).

Re: [PR] Implement unparse `ScalarVariable` to String [datafusion]

2024-05-16 Thread via GitHub
crepererum commented on PR #10541: URL: https://github.com/apache/datafusion/pull/10541#issuecomment-2114805801 needs a rebase (I've merged a bunch of other unparse PRs).

Re: [PR] Docs: Update PR workflow documentation [datafusion]

2024-05-16 Thread via GitHub
crepererum commented on code in PR #10532: URL: https://github.com/apache/datafusion/pull/10532#discussion_r1603076109 ## docs/source/contributor-guide/index.md: ## @@ -66,24 +66,33 @@ ideas with the community to get feedback on implementation. ## Pull Request Overview -We

Re: [PR] Implement unparse `IsNotFalse` to String [datafusion]

2024-05-16 Thread via GitHub
goldmedal commented on PR #10538: URL: https://github.com/apache/datafusion/pull/10538#issuecomment-2114840538 Thanks @crepererum !

[PR] Add reference visitor `TreeNode` APIs [datafusion]

2024-05-16 Thread via GitHub
peter-toth opened a new pull request, #10543: URL: https://github.com/apache/datafusion/pull/10543 ## Which issue does this PR close? Part of https://github.com/apache/datafusion/issues/10505, required for https://github.com/apache/datafusion/issues/10426. ## Rationale for thi

Re: [PR] Better CSE identifier [datafusion]

2024-05-16 Thread via GitHub
peter-toth commented on PR #10473: URL: https://github.com/apache/datafusion/pull/10473#issuecomment-2114895780 > I think @peter-toth plans to break this PR up into smaller ones, so marking it as a draft to make it clear it isn't waiting on more feedback. If I am mistaken, please let me kno

Re: [I] Support `Union` as a function [datafusion]

2024-05-16 Thread via GitHub
vaibhawvipul commented on issue #10206: URL: https://github.com/apache/datafusion/issues/10206#issuecomment-2114931422 take

[PR] Implement unparse `OuterReferenceColumn` to String [datafusion]

2024-05-16 Thread via GitHub
goldmedal opened a new pull request, #10544: URL: https://github.com/apache/datafusion/pull/10544 ## Which issue does this PR close? Closes #10523 ## Rationale for this change IMO, `OuterReferenceColumn` is a column that can be resolved to an outside field of the current plan. Typica

Re: [I] datafusion-cli not installed after pip install datafusion [datafusion-python]

2024-05-16 Thread via GitHub
Michael-J-Ward commented on issue #587: URL: https://github.com/apache/datafusion-python/issues/587#issuecomment-2114989127 @l1t1 Did this used to be the case? I would expect that you need to use `cargo install` or some other method from the [datafusion-cli installation guide](https

Re: [I] Discussion: make it easier for specify SQL --> function translation [datafusion]

2024-05-16 Thread via GitHub
jayzhan211 commented on issue #10534: URL: https://github.com/apache/datafusion/issues/10534#issuecomment-2115002223 I rethought the issue in #10102 and found that it is strongly related to the user-defined parser mentioned here, in that we can define the returned `Expr` given the registered funct

Re: [I] `GroupingSet` Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
xinlifoobar commented on issue #10521: URL: https://github.com/apache/datafusion/issues/10521#issuecomment-2115020694 I think the `Vec<Expr>` cannot be mapped back to a unique `Vec<Vec<Expr>>`, and hence unparsing the cube statement would be difficult. How about store another copy of origin Vec i
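
To make the round-trip concern above concrete, here is a minimal sketch of how `cube` and `rollup` could expand a flat list of grouping columns into a list of grouping sets. The function names, the use of `&str` in place of real expressions, and the ordering of the resulting sets are assumptions for illustration only, not DataFusion's planner code.

```rust
/// Hypothetical expansion of CUBE(exprs): every subset of the input columns.
/// Columns are plain strings here; the ordering of subsets is an assumption.
fn cube<'a>(exprs: &[&'a str]) -> Vec<Vec<&'a str>> {
    let n = exprs.len();
    (0..(1usize << n))
        .map(|mask| {
            exprs
                .iter()
                .enumerate()
                // Keep column `i` when bit `i` of the mask is set.
                .filter(|&(i, _)| (mask >> i) & 1 == 1)
                .map(|(_, e)| *e)
                .collect()
        })
        .collect()
}

/// Hypothetical expansion of ROLLUP(exprs): every prefix of the input columns.
fn rollup<'a>(exprs: &[&'a str]) -> Vec<Vec<&'a str>> {
    (0..=exprs.len()).map(|k| exprs[..k].to_vec()).collect()
}
```

With a single column, `cube(&["a"])` and `rollup(&["a"])` produce the same grouping sets in this sketch, which is one way to see why the expanded form alone may not identify the original statement.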

Re: [PR] Docs: Update PR workflow documentation [datafusion]

2024-05-16 Thread via GitHub
alamb commented on code in PR #10532: URL: https://github.com/apache/datafusion/pull/10532#discussion_r1603220677 ## docs/source/contributor-guide/index.md: ## @@ -66,24 +66,33 @@ ideas with the community to get feedback on implementation. ## Pull Request Overview -We welco

Re: [I] `Placeholder` Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
alamb closed issue #10522: `Placeholder` Expr --> String Support URL: https://github.com/apache/datafusion/issues/10522

Re: [PR] Implement unparse `Placeholder` to String [datafusion]

2024-05-16 Thread via GitHub
alamb merged PR #10540: URL: https://github.com/apache/datafusion/pull/10540

Re: [I] Ensure examples stay updated in CI. [datafusion-python]

2024-05-16 Thread via GitHub
timsaucer commented on issue #696: URL: https://github.com/apache/datafusion-python/issues/696#issuecomment-2115120820 I have a branch ready that corrects the examples to match the spec output. Most of it was numerical errors between float and decimal representation, and converting 3 month

[I] AggregateUDF expression API design [datafusion]

2024-05-16 Thread via GitHub
jayzhan211 opened a new issue, #10545: URL: https://github.com/apache/datafusion/issues/10545 File an issue to track the design of UDAF API Perhaps something like ```rust // form `FIRST_VALUE(a ORDER BY b)` let agg_expr = AggregateUDF::call(first_value()) .args(col("a"))
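
As a rough illustration of the builder style the issue proposes, the sketch below builds `FIRST_VALUE(a ORDER BY b)` with a chained API. Everything here is hypothetical: `AggExpr`, `AggBuilder`, and the `order_by` and `build` methods are stand-ins invented for this example, and only the `call(...).args(...)` chaining mirrors the snippet in the issue. It is not DataFusion's `AggregateUDF` API.

```rust
use std::fmt;

/// Hypothetical stand-in for an aggregate expression (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
struct AggExpr {
    func: String,
    args: Vec<String>,
    order_by: Vec<String>,
}

/// Hypothetical builder that accumulates optional clauses before `build()`.
struct AggBuilder {
    expr: AggExpr,
}

impl AggBuilder {
    /// Start building a call to the named aggregate function.
    fn call(func: &str) -> Self {
        AggBuilder {
            expr: AggExpr { func: func.to_string(), args: vec![], order_by: vec![] },
        }
    }

    /// Append one argument (a column name in this sketch).
    fn args(mut self, arg: &str) -> Self {
        self.expr.args.push(arg.to_string());
        self
    }

    /// Append one ORDER BY column (assumed method name).
    fn order_by(mut self, col: &str) -> Self {
        self.expr.order_by.push(col.to_string());
        self
    }

    /// Finish and return the expression.
    fn build(self) -> AggExpr {
        self.expr
    }
}

impl fmt::Display for AggExpr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}({}", self.func, self.args.join(", "))?;
        if !self.order_by.is_empty() {
            write!(f, " ORDER BY {}", self.order_by.join(", "))?;
        }
        write!(f, ")")
    }
}
```

A call such as `AggBuilder::call("FIRST_VALUE").args("a").order_by("b").build()` keeps optional clauses out of the constructor, which is the main appeal of a builder for aggregates with many optional parts.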

Re: [PR] fix: Compute murmur3 hash with dictionary input correctly [datafusion-comet]

2024-05-16 Thread via GitHub
advancedxy commented on code in PR #433: URL: https://github.com/apache/datafusion-comet/pull/433#discussion_r1603321044 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -1452,17 +1452,55 @@ class CometExpressionSuite extends CometTestBase with Adaptiv

Re: [PR] Create default jekyll site with old datafusion posts [datafusion-site]

2024-05-16 Thread via GitHub
andygrove commented on PR #1: URL: https://github.com/apache/datafusion-site/pull/1#issuecomment-2115224167 @alamb It turns out that we do not have rat checks in the arrow-site repo, so I hope it is ok if we omit them here too, at least to get started

[I] Example for building an external index for parquet files [datafusion]

2024-05-16 Thread via GitHub
alamb opened a new issue, #10546: URL: https://github.com/apache/datafusion/issues/10546 ### Is your feature request related to a problem or challenge? It is common in databases and other analytic systems to have additional external "indexes" (perhaps stored in the "metadata catalog",

Re: [PR] Implement unparse `OuterReferenceColumn` to String [datafusion]

2024-05-16 Thread via GitHub
alamb commented on PR #10544: URL: https://github.com/apache/datafusion/pull/10544#issuecomment-2115236957 Thanks again @goldmedal

Re: [PR] Implement unparse `OuterReferenceColumn` to String [datafusion]

2024-05-16 Thread via GitHub
alamb merged PR #10544: URL: https://github.com/apache/datafusion/pull/10544

Re: [I] `OuterColumnReference` Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
alamb closed issue #10523: `OuterColumnReference` Expr --> String Support URL: https://github.com/apache/datafusion/issues/10523

Re: [PR] Implement unparse `OuterReferenceColumn` to String [datafusion]

2024-05-16 Thread via GitHub
goldmedal commented on PR #10544: URL: https://github.com/apache/datafusion/pull/10544#issuecomment-2115238959 Thanks @alamb !

Re: [PR] test: some tests to write data to a parquet file and read its metadata [datafusion]

2024-05-16 Thread via GitHub
alamb commented on code in PR #10537: URL: https://github.com/apache/datafusion/pull/10537#discussion_r1603344193 ## datafusion/core/src/datasource/physical_plan/parquet/arrow_statistics.rs: ## @@ -0,0 +1,43 @@ +use arrow_array::ArrayRef; +use arrow_schema::DataType; +use datafu

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
advancedxy commented on code in PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#discussion_r1603360782 ## spark/src/test/scala/org/apache/spark/sql/CometTestBase.scala: ## @@ -249,7 +249,7 @@ abstract class CometTestBase var dfSpark: Dataset[Row] = null

Re: [PR] test: some tests to write data to a parquet file and read its metadata [datafusion]

2024-05-16 Thread via GitHub
alamb commented on code in PR #10537: URL: https://github.com/apache/datafusion/pull/10537#discussion_r1603364922 ## datafusion/core/tests/parquet/arrow_statistics.rs: ## @@ -0,0 +1,528 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [I] cannot import datafusion-37.1.0 in python 3.8 of windows 7 x64 [datafusion]

2024-05-16 Thread via GitHub
l1t1 commented on issue #10513: URL: https://github.com/apache/datafusion/issues/10513#issuecomment-2115277047 Is this the same cause as this error? https://github.com/pola-rs/polars/issues/15450#issuecomment-2056143302

Re: [PR] fix: Compute murmur3 hash with dictionary input correctly [datafusion-comet]

2024-05-16 Thread via GitHub
advancedxy commented on PR #433: URL: https://github.com/apache/datafusion-comet/pull/433#issuecomment-2115279498 @viirya @kazuyukitanimura @sunchao PTAL when you have time.

Re: [PR] Add reference visitor `TreeNode` APIs [datafusion]

2024-05-16 Thread via GitHub
peter-toth commented on PR #10543: URL: https://github.com/apache/datafusion/pull/10543#issuecomment-2115281194 cc @alamb, @berkaysynnada, @ozankabak

[PR] Update changelog for 38.0.0 [datafusion-python]

2024-05-16 Thread via GitHub
andygrove opened a new pull request, #704: URL: https://github.com/apache/datafusion-python/pull/704 # Which issue does this PR close? N/A # Rationale for this change We need to update the changelog before releasing # What changes are included in this

Re: [I] datafusion-cli not installed after pip install datafusion [datafusion-python]

2024-05-16 Thread via GitHub
l1t1 commented on issue #587: URL: https://github.com/apache/datafusion-python/issues/587#issuecomment-2115296335 I think this method is more friendly to a normal user. https://github.com/apache/datafusion/pull/9452

[PR] fix: `array_slice` panics [datafusion]

2024-05-16 Thread via GitHub
jonahgao opened a new pull request, #10547: URL: https://github.com/apache/datafusion/pull/10547 ## Which issue does this PR close? Closes #10425. ## Rationale for this change Note: Negative `from` or `to` implies counting from the end of the list and both are s
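
The note about negative `from` and `to` can be illustrated with a small standalone sketch. It assumes 1-based, inclusive bounds and clamps out-of-range bounds to the list; both the helper name `slice_1_based` and the clamping behaviour are assumptions made for this example, not the behaviour fixed by the PR.

```rust
/// Hypothetical helper: slice `list` using 1-based inclusive bounds, where a
/// negative bound counts from the end (-1 is the last element).
/// Out-of-range bounds are clamped; that choice is an assumption of this sketch.
fn slice_1_based<T: Clone>(list: &[T], from: i64, to: i64) -> Vec<T> {
    let len = list.len() as i64;
    // Translate a negative bound into a 1-based position from the start.
    let resolve = |i: i64| if i < 0 { len + i + 1 } else { i };
    let start = resolve(from).max(1);
    let end = resolve(to).min(len);
    if start > end {
        // Empty or inverted range yields an empty result instead of panicking.
        return Vec::new();
    }
    list[(start - 1) as usize..end as usize].to_vec()
}
```

Resolving and clamping the bounds before indexing is what keeps the slice expression from going out of range, which is the class of panic the PR title refers to.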

Re: [PR] feat: Implement Spark-compatible CAST from String to Date [datafusion-comet]

2024-05-16 Thread via GitHub
andygrove commented on PR #383: URL: https://github.com/apache/datafusion-comet/pull/383#issuecomment-2115341055 @vidyasankarv I would suggest that we skip the test for now when running against Spark 3.2 and file a follow on issue to fix 3.2 compatibility (this may not be a high priority si

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
viirya commented on code in PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#discussion_r1603425800 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -1399,7 +1399,7 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSpark

[I] `array_slice` can't correctly handle parameters being NULL or too small [datafusion]

2024-05-16 Thread via GitHub
jonahgao opened a new issue, #10548: URL: https://github.com/apache/datafusion/issues/10548 ### Describe the bug These queries will give an error or incorrect results. ### To Reproduce Run queries in CLI: ```sh DataFusion CLI v38.0.0 0 row(s) fetched. Elapsed

Re: [PR] Create default jekyll site with old datafusion posts [datafusion-site]

2024-05-16 Thread via GitHub
andygrove commented on PR #1: URL: https://github.com/apache/datafusion-site/pull/1#issuecomment-2115378256 PR demonstrating lack of rat checks in arrow-site (checks passed): https://github.com/apache/arrow-site/pull/518

Re: [PR] Add reference visitor `TreeNode` APIs [datafusion]

2024-05-16 Thread via GitHub
berkaysynnada commented on PR #10543: URL: https://github.com/apache/datafusion/pull/10543#issuecomment-2115389289 > cc @alamb, @berkaysynnada, @ozankabak Thanks @peter-toth. After a quick look, I started thinking it might be better to use this new ref_visitor API instead of keeping a

Re: [PR] build: Switch back to released version of DataFusion and arrow-rs after Arrow Java 16 is released [datafusion-comet]

2024-05-16 Thread via GitHub
advancedxy commented on PR #403: URL: https://github.com/apache/datafusion-comet/pull/403#issuecomment-2115404901 > We only need to wait for new release of arrow-rs and DataFusion. Do we have an estimate on when new releases of arrow-rs and DataFusion will be available? I'm asking bec

Re: [I] Ensure examples stay updated in CI. [datafusion-python]

2024-05-16 Thread via GitHub
Michael-J-Ward commented on issue #696: URL: https://github.com/apache/datafusion-python/issues/696#issuecomment-2115415705 After a little experimenting, I'm pretty sure this will work. Create a `./example/tpch/_tests.py` ```python def test_q01_pricing_summary_report():

Re: [PR] build: Switch back to released version of DataFusion and arrow-rs after Arrow Java 16 is released [datafusion-comet]

2024-05-16 Thread via GitHub
viirya commented on PR #403: URL: https://github.com/apache/datafusion-comet/pull/403#issuecomment-2115419825 arrow-rs and DataFusion have a fast release cycle. We can hold Comet release after the new releases of arrow-rs and DataFusion.

Re: [I] Ensure examples stay updated in CI. [datafusion-python]

2024-05-16 Thread via GitHub
Michael-J-Ward commented on issue #696: URL: https://github.com/apache/datafusion-python/issues/696#issuecomment-2115422069 We could make it stupid simple by making these snapshot tests and using https://pypi.org/project/pytest-snapshot/

Re: [PR] Add reference visitor `TreeNode` APIs [datafusion]

2024-05-16 Thread via GitHub
peter-toth commented on PR #10543: URL: https://github.com/apache/datafusion/pull/10543#issuecomment-2115422692 > Thanks @peter-toth. After a quick look, I started thinking it might be better to use this new ref_visitor API instead of keeping also the original one. I'll take a closer look t

Re: [PR] build: Switch back to released version of DataFusion and arrow-rs after Arrow Java 16 is released [datafusion-comet]

2024-05-16 Thread via GitHub
advancedxy commented on PR #403: URL: https://github.com/apache/datafusion-comet/pull/403#issuecomment-2115424225 > We can hold Comet release after the new releases of arrow-rs and DataFusion. Thanks for the clarification. I second this.

Re: [PR] test: some tests to write data to a parquet file and read its metadata [datafusion]

2024-05-16 Thread via GitHub
NGA-TRAN commented on code in PR #10537: URL: https://github.com/apache/datafusion/pull/10537#discussion_r1603501230 ## datafusion/core/src/datasource/physical_plan/parquet/arrow_statistics.rs: ## @@ -0,0 +1,43 @@ +use arrow_array::ArrayRef; +use arrow_schema::DataType; +use dat

Re: [PR] Minor: Add `PullUpCorrelatedExpr::new` and improve documentation [datafusion]

2024-05-16 Thread via GitHub
comphead commented on code in PR #10500: URL: https://github.com/apache/datafusion/pull/10500#discussion_r1603509067 ## datafusion/optimizer/src/decorrelate.rs: ## @@ -38,23 +38,63 @@ use datafusion_physical_expr::execution_props::ExecutionProps; /// 'Filter'. It adds the inne

Re: [PR] fix: `array_slice` panics [datafusion]

2024-05-16 Thread via GitHub
jonahgao commented on code in PR #10547: URL: https://github.com/apache/datafusion/pull/10547#discussion_r1603518324 ## datafusion/functions-array/src/extract.rs: ## @@ -418,19 +418,16 @@ where if let (Some(from), Some(to)) = (from_index, to_index) { let

Re: [PR] Create default jekyll site with old datafusion posts [datafusion-site]

2024-05-16 Thread via GitHub
andygrove commented on PR #1: URL: https://github.com/apache/datafusion-site/pull/1#issuecomment-2115464435 Perhaps rat checks get applied to the generated content when we create PRs against the asf-site branch? :thinking:

Re: [PR] Minor: Add `PullUpCorrelatedExpr::new` and improve documentation [datafusion]

2024-05-16 Thread via GitHub
comphead commented on code in PR #10500: URL: https://github.com/apache/datafusion/pull/10500#discussion_r1603532658 ## datafusion/optimizer/src/decorrelate.rs: ## @@ -38,23 +38,63 @@ use datafusion_physical_expr::execution_props::ExecutionProps; /// 'Filter'. It adds the inne

Re: [PR] Minor: Add `PullUpCorrelatedExpr::new` and improve documentation [datafusion]

2024-05-16 Thread via GitHub
jackwener commented on code in PR #10500: URL: https://github.com/apache/datafusion/pull/10500#discussion_r1603532943 ## datafusion/optimizer/src/decorrelate.rs: ## @@ -38,23 +38,63 @@ use datafusion_physical_expr::execution_props::ExecutionProps; /// 'Filter'. It adds the inn

Re: [PR] Update changelog for 38.0.0 [datafusion-python]

2024-05-16 Thread via GitHub
andygrove merged PR #704: URL: https://github.com/apache/datafusion-python/pull/704

Re: [PR] Stop copying LogicalPlan and Exprs in `PushDownLimit` [datafusion]

2024-05-16 Thread via GitHub
comphead commented on code in PR #10508: URL: https://github.com/apache/datafusion/pull/10508#discussion_r1603551871 ## datafusion/optimizer/src/push_down_limit.rs: ## @@ -183,6 +169,65 @@ impl OptimizerRule for PushDownLimit { } } +/// Wrap the input plan with a limit n

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
comphead commented on code in PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#discussion_r1603557818 ## spark/src/test/scala/org/apache/spark/sql/CometTestBase.scala: ## @@ -249,7 +249,7 @@ abstract class CometTestBase var dfSpark: Dataset[Row] = null

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
viirya commented on code in PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#discussion_r1603601211 ## spark/src/test/scala/org/apache/spark/sql/CometTestBase.scala: ## @@ -259,7 +259,7 @@ abstract class CometTestBase dfSpark.queryExecution.explainString(

[PR] Alamb/external parquet index [datafusion]

2024-05-16 Thread via GitHub
alamb opened a new pull request, #10549: URL: https://github.com/apache/datafusion/pull/10549 ## Which issue does this PR close? Closes https://github.com/apache/datafusion/issues/10546 ## Rationale for this change See https://github.com/apache/datafusion/issues/1

Re: [PR] Implement conversion from ColumnStatistics to NullableInterval [datafusion]

2024-05-16 Thread via GitHub
dmitrybugakov commented on PR #10510: URL: https://github.com/apache/datafusion/pull/10510#issuecomment-2115634506 @alamb Do we want to introduce a Type in `ColumnStats` in this PR? I found that the changes will affect Proto and may require other changes that are not clear to me at the

Re: [PR] Add example for building an external index for parquet files [datafusion]

2024-05-16 Thread via GitHub
alamb commented on code in PR #10549: URL: https://github.com/apache/datafusion/pull/10549#discussion_r1603675508 ## datafusion-examples/examples/parquet_index.rs: ## @@ -0,0 +1,608 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-16 Thread via GitHub
Blizzara commented on code in PR #10531: URL: https://github.com/apache/datafusion/pull/10531#discussion_r1601979954 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -1277,6 +1407,56 @@ pub(crate) fn from_substrait_literal(lit: &Literal) -> Result {

Re: [PR] Example for simple Expr --> SQL conversion [datafusion]

2024-05-16 Thread via GitHub
yyy1000 commented on code in PR #10528: URL: https://github.com/apache/datafusion/pull/10528#discussion_r1603727701 ## datafusion-examples/README.md: ## @@ -63,6 +63,7 @@ cargo run --example csv_sql - [`parquet_sql.rs`](examples/parquet_sql.rs): Build and run a query plan from

Re: [PR] feat: Supports UUID column [datafusion-comet]

2024-05-16 Thread via GitHub
viirya commented on code in PR #395: URL: https://github.com/apache/datafusion-comet/pull/395#discussion_r1603743733 ## common/src/main/java/org/apache/comet/vector/CometPlainVector.java: ## @@ -111,7 +115,12 @@ public UTF8String getUTF8String(int rowId) { byte[] result =

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
kazuyukitanimura commented on code in PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#discussion_r1603760627 ## spark/src/test/scala/org/apache/spark/sql/CometTestBase.scala: ## @@ -249,7 +249,7 @@ abstract class CometTestBase var dfSpark: Dataset[Row] = n

Re: [I] `GroupingSet` Expr --> String Support [datafusion]

2024-05-16 Thread via GitHub
alamb commented on issue #10521: URL: https://github.com/apache/datafusion/issues/10521#issuecomment-2115936209 Hi @xinlifoobar - another thing that we could potentially do is to work at this from the SQL statement level (rather than the Expr level). As in implement a round trip test

Re: [PR] Optimization: make the most of Hint::AcceptsSingular when call make_scalar_function to Improve performance [datafusion]

2024-05-16 Thread via GitHub
alamb commented on PR #10054: URL: https://github.com/apache/datafusion/pull/10054#issuecomment-2115940200 Sorry @JasonLi-cn -- marking as ready for review so we can give it another look

Re: [PR] Optimization: make the most of Hint::AcceptsSingular when call make_scalar_function to Improve performance [datafusion]

2024-05-16 Thread via GitHub
alamb commented on PR #10054: URL: https://github.com/apache/datafusion/pull/10054#issuecomment-2115940825 It seems like this PR is failing a CI check: https://github.com/apache/datafusion/actions/runs/8771259487/job/24068788239?pr=10054

Re: [PR] Add reference visitor `TreeNode` APIs [datafusion]

2024-05-16 Thread via GitHub
alamb commented on PR #10543: URL: https://github.com/apache/datafusion/pull/10543#issuecomment-2115952198 I agree with @berkaysynnada in https://github.com/apache/datafusion/pull/10543#issuecomment-2115389289 that in an ideal world we would change `TreeNode::visit` and `TreeNode::apply`, ho

Re: [PR] test: some tests to write data to a parquet file and read its metadata [datafusion]

2024-05-16 Thread via GitHub
alamb commented on code in PR #10537: URL: https://github.com/apache/datafusion/pull/10537#discussion_r1603880876 ## datafusion/core/tests/parquet/arrow_statistics.rs: ## @@ -0,0 +1,528 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [I] Discussion: make it easier for specify SQL --> function translation [datafusion]

2024-05-16 Thread via GitHub
alamb commented on issue #10534: URL: https://github.com/apache/datafusion/issues/10534#issuecomment-2115958620 In general, I think the idea of allowing users to customize the behavior of the sql planner is reasonable. However I am not entirely sure if we need to modify the planner itself,

Re: [I] AggregateUDF expression API design [datafusion]

2024-05-16 Thread via GitHub
alamb commented on issue #10545: URL: https://github.com/apache/datafusion/issues/10545#issuecomment-2115959437 related to https://github.com/apache/datafusion/issues/6747 (for window APIs) -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] Stop copying LogicalPlan and Exprs in `PushDownFilter` (4%-6% faster planning) [datafusion]

2024-05-16 Thread via GitHub
alamb commented on code in PR #10444: URL: https://github.com/apache/datafusion/pull/10444#discussion_r1603886068 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -941,22 +935,65 @@ impl OptimizerRule for PushDownFilter { None => extension_plan.node.i

Re: [PR] Implement unparse `ScalarVariable` to String [datafusion]

2024-05-16 Thread via GitHub
alamb commented on PR #10541: URL: https://github.com/apache/datafusion/pull/10541#issuecomment-2115969536 I took the liberty of merging up from main to resolve the conflicts -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Implement unparse `ScalarVariable` to String [datafusion]

2024-05-16 Thread via GitHub
alamb commented on code in PR #10541: URL: https://github.com/apache/datafusion/pull/10541#discussion_r1603891849 ## datafusion/sql/src/unparser/expr.rs: ## @@ -388,8 +388,18 @@ impl Unparser<'_> { expr: Box::new(sql_parser_expr), })

[I] Add an example of how to convert LogicalPlan to/from SQL Strings [datafusion]

2024-05-16 Thread via GitHub
alamb opened a new issue, #10550: URL: https://github.com/apache/datafusion/issues/10550 ### Is your feature request related to a problem or challenge? Having a good example helps to make features easier to use in DataFusion. In this case the use case is programmatic construction

Re: [PR] Example for simple Expr --> SQL conversion [datafusion]

2024-05-16 Thread via GitHub
alamb commented on PR #10528: URL: https://github.com/apache/datafusion/pull/10528#issuecomment-2115986516 I filed https://github.com/apache/datafusion/issues/10550 for the logical plan version too -- This is an automated message from the Apache Git Service. To respond to the message, ple

Re: [PR] feat: Add support for TryCast expression in Spark 3.2 and 3.3 [datafusion-comet]

2024-05-16 Thread via GitHub
viirya commented on code in PR #416: URL: https://github.com/apache/datafusion-comet/pull/416#discussion_r1603936564 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -948,10 +948,7 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlanHelper {

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
viirya commented on code in PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#discussion_r1603940165 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -1399,7 +1399,7 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSpark

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
viirya merged PR #436: URL: https://github.com/apache/datafusion-comet/pull/436 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@dataf

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
viirya commented on PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#issuecomment-2116045697 Merged. Thanks @kazuyukitanimura @andygrove @advancedxy @comphead -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
kazuyukitanimura commented on PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#issuecomment-2116047350 Thank you all! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] Excessive memory consumption on sorting [datafusion]

2024-05-16 Thread via GitHub
alamb commented on issue #10511: URL: https://github.com/apache/datafusion/issues/10511#issuecomment-2116051494 @samuelcolvin can you share the query plan for this query? Specifically what is the output of this query? ```sql explain select span_name from records order by bit_lengt

Re: [PR] Minor: Add `PullUpCorrelatedExpr::new` and improve documentation [datafusion]

2024-05-16 Thread via GitHub
alamb commented on PR #10500: URL: https://github.com/apache/datafusion/pull/10500#issuecomment-2116052776 > lgtm thanks @alamb its good to go, but we probably want to revisit this count bug sometime Thank you -- I will file a ticket to track the issue and leave a note in the comment

Re: [PR] test: Fix explain with exteded info comet test [datafusion-comet]

2024-05-16 Thread via GitHub
kazuyukitanimura commented on code in PR #436: URL: https://github.com/apache/datafusion-comet/pull/436#discussion_r1603947683 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -1399,7 +1399,7 @@ class CometExpressionSuite extends CometTestBase with Ada

Re: [PR] Create default jekyll site with old datafusion posts [datafusion-site]

2024-05-16 Thread via GitHub
alamb commented on code in PR #1: URL: https://github.com/apache/datafusion-site/pull/1#discussion_r1603948157 ## README.md: ## @@ -1,3 +1,42 @@ -# Apache DataFusion Web Site +# Apache DataFusion Blog Content -Coming soon +This repository contains the Apache DataFusion blog co

Re: [PR] Add reference visitor `TreeNode` APIs [datafusion]

2024-05-16 Thread via GitHub
ozankabak commented on PR #10543: URL: https://github.com/apache/datafusion/pull/10543#issuecomment-2116059936 Breaking this into two PRs (with the latter one removing the old usage and migrating to the new one) sounds reasonable to me. It could make sense to do the follow-on quickly becaus

Re: [PR] Example for simple Expr --> SQL conversion [datafusion]

2024-05-16 Thread via GitHub
backkem commented on PR #10528: URL: https://github.com/apache/datafusion/pull/10528#issuecomment-2116068547 > Using the `expr_to_sql` api, we get the following error: > > ``` > assertion `left == right` failed > left: "((\"a\" < 5) OR (\"a\" = 8))" > right: "a < 5 OR a = 8
