Re: [PR] Support Arrays for the Map scalar functions [datafusion]

2024-08-02 Thread via GitHub
dharanad commented on PR #11712: URL: https://github.com/apache/datafusion/pull/11712#issuecomment-2264702046 Sorry, this change took more time than expected. The good thing is I deepened my understanding of Arrow. @goldmedal @jayzhan211 requesting your help to review this change. -- T

Re: [PR] fix: Optimize decimal creation macros [datafusion-comet]

2024-08-02 Thread via GitHub
kazuyukitanimura commented on PR #764: URL: https://github.com/apache/datafusion-comet/pull/764#issuecomment-2264746540 ``` OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5 Apple M1 Max TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate

Re: [PR] fix: Optimize decimal creation macros [datafusion-comet]

2024-08-02 Thread via GitHub
kazuyukitanimura commented on PR #764: URL: https://github.com/apache/datafusion-comet/pull/764#issuecomment-2264748079 ### Before ![Screenshot 2024-08-01 at 11 22 56  PM](https://github.com/user-attachments/assets/6115cb62-0375-4dd6-9377-4e235160354f) ### After ![Screenshot 2024

Re: [PR] perf: improve decimal read performance in CometVector [datafusion-comet]

2024-08-02 Thread via GitHub
kazuyukitanimura commented on PR #756: URL: https://github.com/apache/datafusion-comet/pull/756#issuecomment-2264760991 Oops, I did not see this earlier... related https://github.com/apache/datafusion-comet/pull/758 -- This is an automated message from the Apache Git Service. To respond t

Re: [I] Create `name` of aggregate expression from expressions [datafusion]

2024-08-02 Thread via GitHub
lewiszlw commented on issue #11707: URL: https://github.com/apache/datafusion/issues/11707#issuecomment-2264772430 For example, for `select count(1) as count from t`, the generated name is `count(1)`; we need to be able to store the alias. `AggregateExec`'s `aggr_expr` field doesn't support alias, s

[PR] Optionally create name of aggregate expression from expressions [datafusion]

2024-08-02 Thread via GitHub
lewiszlw opened a new pull request, #11776: URL: https://github.com/apache/datafusion/pull/11776 ## Which issue does this PR close? Closes https://github.com/apache/datafusion/issues/11707. ## Rationale for this change ## What changes are included in this

Re: [I] Evaluate ValuesExec's exprs during execution [datafusion]

2024-08-02 Thread via GitHub
lewiszlw commented on issue #11736: URL: https://github.com/apache/datafusion/issues/11736#issuecomment-2264820665 For example, `explain select * from values (generate_series(1, 10));` will take 8s to complete on my machine.

[PR] Single multi groupby v2 [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 opened a new pull request, #11777: URL: https://github.com/apache/datafusion/pull/11777 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested

Re: [PR] fix: Add additional required expression for natural join [datafusion]

2024-08-02 Thread via GitHub
jonahgao commented on PR #11713: URL: https://github.com/apache/datafusion/pull/11713#issuecomment-2264913915 I checked that the final logical plan is correct, but the calling of [expand_wildcard](https://github.com/apache/datafusion/blob/d010ce90f40f2866904a4eea563afbbff72497cc/datafusion/e

Re: [PR] Single multi groupby v2 [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on code in PR #11777: URL: https://github.com/apache/datafusion/pull/11777#discussion_r1701548962 ## datafusion/core/benches/high_cardinality.rs: ## @@ -0,0 +1,129 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [PR] Single multi groupby v2 [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on code in PR #11777: URL: https://github.com/apache/datafusion/pull/11777#discussion_r1701555605 ## datafusion/core/benches/low_cardinality.rs: ## @@ -0,0 +1,95 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licens

Re: [PR] Single multi groupby v2 [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on code in PR #11777: URL: https://github.com/apache/datafusion/pull/11777#discussion_r1701566977 ## datafusion/core/benches/high_cardinality.rs: ## @@ -0,0 +1,129 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

Re: [PR] [Minor] Short circuit `ApplyFunctionRewrites` if there are no function rewrites [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on PR #11765: URL: https://github.com/apache/datafusion/pull/11765#issuecomment-2265009235 Thanks @gruuya

Re: [PR] [Minor] Short circuit `ApplyFunctionRewrites` if there are no function rewrites [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 merged PR #11765: URL: https://github.com/apache/datafusion/pull/11765

Re: [PR] Fix #11692: Improve doc comments within macros [datafusion]

2024-08-02 Thread via GitHub
alamb commented on PR #11694: URL: https://github.com/apache/datafusion/pull/11694#issuecomment-2265079093 I think we can always make docs better; this is a step in the right direction. Thanks again.

Re: [PR] Fix #11692: Improve doc comments within macros [datafusion]

2024-08-02 Thread via GitHub
alamb merged PR #11694: URL: https://github.com/apache/datafusion/pull/11694

Re: [PR] Extract `CoalesceBatchesStream` to a struct [datafusion]

2024-08-02 Thread via GitHub
alamb commented on PR #11610: URL: https://github.com/apache/datafusion/pull/11610#issuecomment-2265212592 Thank you @ozankabak and @andygrove for the review.

Re: [PR] Extract `CoalesceBatchesStream` to a struct [datafusion]

2024-08-02 Thread via GitHub
alamb merged PR #11610: URL: https://github.com/apache/datafusion/pull/11610

Re: [PR] refactor: move ExecutionPlan and related structs into dedicated mod [datafusion]

2024-08-02 Thread via GitHub
alamb merged PR #11759: URL: https://github.com/apache/datafusion/pull/11759

Re: [I] Error in building using the release or from source - build failure [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on issue #763: URL: https://github.com/apache/datafusion-comet/issues/763#issuecomment-2265248152 Spark 3.4 only supports Java 8, 11, and 17. Could you try with 17?

Re: [I] Ballista 0.13.0 Release [datafusion-ballista]

2024-08-02 Thread via GitHub
andrewwebber commented on issue #974: URL: https://github.com/apache/datafusion-ballista/issues/974#issuecomment-2265252176 It would bring great confidence to the community and/or unsure observers if there were a regular release cadence. Would you have an update regarding this ticket?

Re: [PR] feat: Implement basic version of RLIKE [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove merged PR #734: URL: https://github.com/apache/datafusion-comet/pull/734

Re: [PR] feat: Add GetStructField expression [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on PR #731: URL: https://github.com/apache/datafusion-comet/pull/731#issuecomment-2265316215 Thanks for addressing the feedback @Kimahriman. Could you fix the merge conflict?

[PR] Add docs and rename param for `Signature::numeric` [datafusion]

2024-08-02 Thread via GitHub
matthewmturner opened a new pull request, #11778: URL: https://github.com/apache/datafusion/pull/11778 ## Which issue does this PR close? I was creating a UDF and noticed that `Signature::numeric` was missing docs and also the param name confused me at first, so I aligned it t

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-08-02 Thread via GitHub
berkaysynnada commented on PR #11516: URL: https://github.com/apache/datafusion/pull/11516#issuecomment-2265355871 Are we tracking this TODO? https://github.com/apache/datafusion/blob/0332eb569a5428ac385fe892ce7b5fb40d52c8c0/datafusion/core/src/datasource/listing_table_factory.rs#L55

Re: [PR] fix: optimize some bit functions [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on code in PR #718: URL: https://github.com/apache/datafusion-comet/pull/718#discussion_r1701817076 ## native/core/src/common/bit.rs: ## @@ -131,6 +131,18 @@ pub fn read_num_bytes_u32(size: usize, src: &[u8]) -> u32 { trailing_bits(v as u64, size * 8) as

[I] Align UDF name to lowercase or uppercase [datafusion]

2024-08-02 Thread via GitHub
edmondop opened a new issue, #11779: URL: https://github.com/apache/datafusion/issues/11779 ### Describe the bug As a part of performing [this](https://github.com/apache/datafusion/pull/11013) change, @jayzhan211 recommended using `min` and `max` as the default names for the new UDFs r

Re: [PR] Move min and max to user defined aggregate function [datafusion]

2024-08-02 Thread via GitHub
edmondop commented on code in PR #11013: URL: https://github.com/apache/datafusion/pull/11013#discussion_r1701819442 ## datafusion/functions-aggregate/src/min_max.rs: ## @@ -914,69 +888,56 @@ impl Accumulator for SlidingMaxAccumulator { } } -/// MIN aggregate expression

Re: [I] Comet should fallback to Spark for unsupported partitioning [datafusion-comet]

2024-08-02 Thread via GitHub
viirya closed issue #760: Comet should fallback to Spark for unsupported partitioning URL: https://github.com/apache/datafusion-comet/issues/760

Re: [PR] fix: Fallback to Spark for unsupported partitioning [datafusion-comet]

2024-08-02 Thread via GitHub
viirya merged PR #759: URL: https://github.com/apache/datafusion-comet/pull/759

Re: [PR] fix: Fallback to Spark for unsupported partitioning [datafusion-comet]

2024-08-02 Thread via GitHub
viirya commented on PR #759: URL: https://github.com/apache/datafusion-comet/pull/759#issuecomment-2265362695 Merged. Thanks @kazuyukitanimura @andygrove

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-08-02 Thread via GitHub
findepi commented on PR #11516: URL: https://github.com/apache/datafusion/pull/11516#issuecomment-2265372798 @berkaysynnada yes, https://github.com/apache/datafusion/issues/11600

Re: [I] Align UDF name to lowercase or uppercase [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on issue #11779: URL: https://github.com/apache/datafusion/issues/11779#issuecomment-2265371418 Sadly, this is the last name to be lowercase; the others are already converted. I don't have a strong preference for upper case or lower case as long as they are all consistent

[PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
goldmedal opened a new pull request, #11780: URL: https://github.com/apache/datafusion/pull/11780 ## Which issue does this PR close? Closes #11434 ## Rationale for this change ## What changes are included in this PR? ## Are these changes te

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1701831607 ## datafusion/sqllogictest/test_files/map.slt: ## @@ -310,3 +310,36 @@ VALUES (MAP(['a'], [1])), (MAP(['b'], [2])), (MAP(['c', 'a'], [3, 1])) {a: 1} {b: 2} {

[PR] Feature/eliminate aggregate [datafusion]

2024-08-02 Thread via GitHub
mertak-synnada opened a new pull request, #11781: URL: https://github.com/apache/datafusion/pull/11781 ## Which issue does this PR close? Closes #. ## Rationale for this change An inefficient plan is produced when a redundant DISTINCT & GROUP BY clause is used t

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-08-02 Thread via GitHub
alamb commented on PR #11516: URL: https://github.com/apache/datafusion/pull/11516#issuecomment-2265409603 > @berkaysynnada yes, #11600 I suggest adding a comment in the code with the link to the ticket to help others find this ticket

Re: [I] Create `name` of aggregate expression from expressions [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on issue #11707: URL: https://github.com/apache/datafusion/issues/11707#issuecomment-2265407609 I see.

Re: [PR] Optionally create name of aggregate expression from expressions [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on code in PR #11776: URL: https://github.com/apache/datafusion/pull/11776#discussion_r1701855415 ## datafusion/core/src/physical_planner.rs: ## @@ -1908,9 +1648,9 @@ pub fn create_aggregate_expr_and_maybe_filter( ) -> Result { // unpack (nested) alias

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1701857458 ## datafusion/sqllogictest/test_files/map.slt: ## @@ -310,3 +310,36 @@ VALUES (MAP(['a'], [1])), (MAP(['b'], [2])), (MAP(['c', 'a'], [3, 1])) {a: 1} {b: 2} {

Re: [PR] Support Arrays for the Map scalar functions [datafusion]

2024-08-02 Thread via GitHub
alamb commented on code in PR #11712: URL: https://github.com/apache/datafusion/pull/11712#discussion_r1701859111 ## datafusion/functions-nested/src/map.rs: ## @@ -202,3 +226,128 @@ fn get_element_type(data_type: &DataType) -> datafusion_common::Result<&DataType ),

Re: [I] Update ClickBench benchmarks with DataFusion 40 [datafusion]

2024-08-02 Thread via GitHub
pmcgleenon commented on issue #11567: URL: https://github.com/apache/datafusion/issues/11567#issuecomment-2265439558 Hi @alamb, the [ClickBench PR](https://github.com/ClickHouse/ClickBench/pull/210) has been merged; the DataFusion version 40 results are now visible on the main

Re: [I] Update ClickBench benchmarks with DataFusion 40 [datafusion]

2024-08-02 Thread via GitHub
alamb commented on issue #11567: URL: https://github.com/apache/datafusion/issues/11567#issuecomment-2265469454 Thank you so much @pmcgleenon 🙏 -- I am pretty excited to complete our in-progress work (like stringview and high cardinality aggregates) and run these again with a newer version

Re: [I] Update ClickBench benchmarks with DataFusion 40 [datafusion]

2024-08-02 Thread via GitHub
alamb closed issue #11567: Update ClickBench benchmarks with DataFusion 40 URL: https://github.com/apache/datafusion/issues/11567

Re: [PR] Enhance the formatting for Column [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on code in PR #11724: URL: https://github.com/apache/datafusion/pull/11724#discussion_r1701892563 ## datafusion/common/src/column.rs: ## @@ -372,7 +372,7 @@ impl FromStr for Column { impl fmt::Display for Column { fn fmt(&self, f: &mut fmt::Formatter

Re: [PR] Enhance the formatting for Column [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11724: URL: https://github.com/apache/datafusion/pull/11724#discussion_r1701902126 ## datafusion/common/src/column.rs: ## @@ -372,7 +372,7 @@ impl FromStr for Column { impl fmt::Display for Column { fn fmt(&self, f: &mut fmt::Formatter)

Re: [I] Evaluate ValuesExec's exprs during execution [datafusion]

2024-08-02 Thread via GitHub
alamb commented on issue #11736: URL: https://github.com/apache/datafusion/issues/11736#issuecomment-2265500845 > For example, `explain select * from values (generate_series(1, 10));` will take 8s to complete on my machine. It would be fascinating to run a profile / flamegraph and

Re: [PR] Enhance the formatting for Column [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on code in PR #11724: URL: https://github.com/apache/datafusion/pull/11724#discussion_r1701923500 ## datafusion/common/src/column.rs: ## @@ -372,7 +372,7 @@ impl FromStr for Column { impl fmt::Display for Column { fn fmt(&self, f: &mut fmt::Formatter

[I] Simplify the `name` function in datafusion [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 opened a new issue, #11782: URL: https://github.com/apache/datafusion/issues/11782 ### Is your feature request related to a problem or challenge? There are `display_name`, `create_name`, `write_name`, `impl fmt::Display for Expr`, `create_function_physical_name`, `create_ph

Re: [PR] Enhance the formatting for Column [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on code in PR #11724: URL: https://github.com/apache/datafusion/pull/11724#discussion_r1701939306 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2752,11 +2752,10 @@ fn calc_func_dependencies_for_project( .iter() .filter_map(|expr| {

Re: [PR] perf: Avoid redundant copying of arrays in scan->filter->join [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on PR #762: URL: https://github.com/apache/datafusion-comet/pull/762#issuecomment-2265548008 I'm not seeing any performance improvement so far ## Before ``` OpenJDK 64-Bit Server VM 11.0.24+8-post-Ubuntu-1ubuntu322.04 on Linux 6.5.0-41-generic AMD Ryzen 9

Re: [PR] feat: show executed native plan with metrics when in debug mode [datafusion-comet]

2024-08-02 Thread via GitHub
huaxingao merged PR #746: URL: https://github.com/apache/datafusion-comet/pull/746

Re: [PR] Enhance the formatting for Column [datafusion]

2024-08-02 Thread via GitHub
jayzhan211 commented on code in PR #11724: URL: https://github.com/apache/datafusion/pull/11724#discussion_r1701954822 ## datafusion/common/src/column.rs: ## @@ -372,7 +372,7 @@ impl FromStr for Column { impl fmt::Display for Column { fn fmt(&self, f: &mut fmt::Formatter

Re: [PR] feat: Add GetStructField expression [datafusion-comet]

2024-08-02 Thread via GitHub
Kimahriman commented on PR #731: URL: https://github.com/apache/datafusion-comet/pull/731#issuecomment-2265582122 > Thanks for addressing the feedback @Kimahriman. Could you fix the merge conflict? Done

Re: [I] Incorrect result returned by `UNION ALL` (SQLancer-TLP) [datafusion]

2024-08-02 Thread via GitHub
2010YOUY01 commented on issue #11742: URL: https://github.com/apache/datafusion/issues/11742#issuecomment-2265610351 Note to myself Maybe duplicate: ``` TLP-Aggregate oracle violated: Q's result is not equalt to MIN(Q1, Q2, Q3): RS(Q) - MIN(RS(Q1), RS(Q2), RS(Q3)) is :0.1

[PR] fix: Unsupported types for SinglePartition should fallback to Spark [datafusion-comet]

2024-08-02 Thread via GitHub
viirya opened a new pull request, #765: URL: https://github.com/apache/datafusion-comet/pull/765 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes test

Re: [PR] Enhance the formatting for Column [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11724: URL: https://github.com/apache/datafusion/pull/11724#discussion_r1701979383 ## datafusion/common/src/column.rs: ## @@ -372,7 +372,7 @@ impl FromStr for Column { impl fmt::Display for Column { fn fmt(&self, f: &mut fmt::Formatter)

Re: [PR] fix: Unsupported types for SinglePartition should fallback to Spark [datafusion-comet]

2024-08-02 Thread via GitHub
viirya commented on code in PR #765: URL: https://github.com/apache/datafusion-comet/pull/765#discussion_r1701981437 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2931,8 +2931,8 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde with

[I] Aggregate SQL query with filter not working in datafusion CLI [datafusion]

2024-08-02 Thread via GitHub
2010YOUY01 opened a new issue, #11783: URL: https://github.com/apache/datafusion/issues/11783 ### Describe the bug Aggregate function with filter is supported, see this [testcase](https://github.com/apache/datafusion/blob/0332eb569a5428ac385fe892ce7b5fb40d52c8c0/datafusion/sqllogictes

[PR] Add references to github issue [datafusion]

2024-08-02 Thread via GitHub
findepi opened a new pull request, #11784: URL: https://github.com/apache/datafusion/pull/11784 Add links to #11600

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-08-02 Thread via GitHub
findepi commented on PR #11516: URL: https://github.com/apache/datafusion/pull/11516#issuecomment-2265643243 good idea! https://github.com/apache/datafusion/pull/11784

Re: [PR] fix: Unsupported types for SinglePartition should fallback to Spark [datafusion-comet]

2024-08-02 Thread via GitHub
viirya commented on PR #765: URL: https://github.com/apache/datafusion-comet/pull/765#issuecomment-2265652992 Thanks @huaxingao

Re: [PR] perf: improve decimal read performance in CometVector [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on code in PR #756: URL: https://github.com/apache/datafusion-comet/pull/756#discussion_r1702017416 ## common/src/main/java/org/apache/comet/vector/CometVector.java: ## @@ -125,14 +144,14 @@ byte[] getBinaryDecimal(int i) { /** Reads a 16-byte byte array

Re: [PR] perf: improve decimal read performance in CometVector [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on code in PR #756: URL: https://github.com/apache/datafusion-comet/pull/756#discussion_r1702020395 ## common/src/main/java/org/apache/comet/vector/CometVector.java: ## @@ -89,6 +90,8 @@ public Decimal getDecimal(int i, int precision, int scale) { retu

Re: [PR] perf: improve decimal read performance in CometVector [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on code in PR #756: URL: https://github.com/apache/datafusion-comet/pull/756#discussion_r1702022378 ## common/src/main/java/org/apache/comet/vector/CometVector.java: ## @@ -125,14 +144,14 @@ byte[] getBinaryDecimal(int i) { /** Reads a 16-byte byte array

Re: [PR] perf: improve decimal read performance in CometVector [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on code in PR #756: URL: https://github.com/apache/datafusion-comet/pull/756#discussion_r1702024030 ## common/src/main/java/org/apache/comet/vector/CometVector.java: ## @@ -125,14 +144,14 @@ byte[] getBinaryDecimal(int i) { /** Reads a 16-byte byte array

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
comphead commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1702027202 ## datafusion/sql/src/expr/mod.rs: ## @@ -714,6 +717,37 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { not_impl_err!("Unsupported dictionary literal:

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
comphead commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1702029699 ## datafusion/sqllogictest/test_files/map.slt: ## @@ -310,3 +310,36 @@ VALUES (MAP(['a'], [1])), (MAP(['b'], [2])), (MAP(['c', 'a'], [3, 1])) {a: 1} {b: 2} {c

Re: [PR] fix: Optimize getDecimal for small precision [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on PR #758: URL: https://github.com/apache/datafusion-comet/pull/758#issuecomment-2265689651 Single run of TPC-DS before and after the change in this PR: ![tpcds_allqueries](https://github.com/user-attachments/assets/f60fb3c7-de78-426d-89b7-fb998ea52521)

Re: [PR] Enhance the formatting for Column [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11724: URL: https://github.com/apache/datafusion/pull/11724#discussion_r1702037283 ## datafusion/common/src/column.rs: ## @@ -372,7 +372,7 @@ impl FromStr for Column { impl fmt::Display for Column { fn fmt(&self, f: &mut fmt::Formatter)

Re: [PR] perf: improve decimal read performance in CometVector [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove commented on code in PR #756: URL: https://github.com/apache/datafusion-comet/pull/756#discussion_r1702060804 ## common/src/main/java/org/apache/comet/vector/CometVector.java: ## @@ -125,14 +144,14 @@ byte[] getBinaryDecimal(int i) { /** Reads a 16-byte byte array

Re: [I] Unsupported types for SinglePartition should fallback to Spark [datafusion-comet]

2024-08-02 Thread via GitHub
viirya closed issue #766: Unsupported types for SinglePartition should fallback to Spark URL: https://github.com/apache/datafusion-comet/issues/766

Re: [PR] fix: Unsupported types for SinglePartition should fallback to Spark [datafusion-comet]

2024-08-02 Thread via GitHub
viirya merged PR #765: URL: https://github.com/apache/datafusion-comet/pull/765

Re: [PR] chore: Skip creation of BigDecimal during getDecimal call [datafusion-comet]

2024-08-02 Thread via GitHub
viirya closed pull request #682: chore: Skip creation of BigDecimal during getDecimal call URL: https://github.com/apache/datafusion-comet/pull/682

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1702121012 ## datafusion/sqllogictest/test_files/map.slt: ## @@ -341,5 +341,121 @@ SELECT MAP {'a':1, 'b':2, 'c':3 }['a'] FROM t; # # {} +# values contain null +qu

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
dharanad commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1702122361 ## datafusion/sql/src/expr/mod.rs: ## @@ -714,6 +717,37 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { not_impl_err!("Unsupported dictionary literal:

Re: [PR] fix: Add additional required expression for natural join [datafusion]

2024-08-02 Thread via GitHub
Lordworms commented on PR #11713: URL: https://github.com/apache/datafusion/pull/11713#issuecomment-2265833859 > I checked that the final logical plan is correct, but the calling of [expand_wildcard](https://github.com/apache/datafusion/blob/d010ce90f40f2866904a4eea563afbbff72497cc/datafusio

Re: [PR] fix: Add additional required expression for natural join [datafusion]

2024-08-02 Thread via GitHub
Lordworms commented on PR #11713: URL: https://github.com/apache/datafusion/pull/11713#issuecomment-2265834154 > > I checked that the final logical plan is correct, but the calling of [expand_wildcard](https://github.com/apache/datafusion/blob/d010ce90f40f2866904a4eea563afbbff72497cc/datafus

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
dharanad commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1702128317 ## datafusion/sql/src/expr/mod.rs: ## @@ -714,6 +717,37 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { not_impl_err!("Unsupported dictionary literal:

[I] Access a Map with a non-string keys [datafusion]

2024-08-02 Thread via GitHub
goldmedal opened a new issue, #11785: URL: https://github.com/apache/datafusion/issues/11785 ### Describe the bug Currently, we support creating a map like ``` SELECT MAKE_MAP(1, 'a', 2, 'b', 3, 'c'); {1: a, 2: b, 3: c} ``` However, we can't access this map if it

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1702133809 ## datafusion/sqllogictest/test_files/map.slt: ## @@ -341,5 +341,121 @@ SELECT MAP {'a':1, 'b':2, 'c':3 }['a'] FROM t; # # {} +# values contain null +qu

Re: [I] Access a Map with a non-string keys [datafusion]

2024-08-02 Thread via GitHub
dharanad commented on issue #11785: URL: https://github.com/apache/datafusion/issues/11785#issuecomment-2265852901 Interesting, I can give it a try.

Re: [PR] feat: Add GetStructField expression [datafusion-comet]

2024-08-02 Thread via GitHub
Kimahriman commented on code in PR #731: URL: https://github.com/apache/datafusion-comet/pull/731#discussion_r1702143071 ## spark/src/main/scala/org/apache/spark/sql/comet/CometRowToColumnarExec.scala: ## @@ -60,8 +62,17 @@ case class CometRowToColumnarExec(child: SparkPlan)

[I] Support `starts_with` for UTF8 [datafusion]

2024-08-02 Thread via GitHub
tshauck opened a new issue, #11786: URL: https://github.com/apache/datafusion/issues/11786 ### Is your feature request related to a problem or challenge? From what I can tell, `starts_with` when called with utf8view will have its arguments coerced to a regular utf8 array type.

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1702147467 ## datafusion/sql/src/expr/mod.rs: ## @@ -714,6 +717,37 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { not_impl_err!("Unsupported dictionary literal:

Re: [I] Support `starts_with` for `Utf8View` [datafusion]

2024-08-02 Thread via GitHub
tshauck commented on issue #11786: URL: https://github.com/apache/datafusion/issues/11786#issuecomment-2265867742 take

Re: [PR] fix: unwrap dictionaries in CreateNamedStruct [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove merged PR #754: URL: https://github.com/apache/datafusion-comet/pull/754

Re: [I] CreateNamedStruct fails at runtime on dictionary-encoded string arrays [datafusion-comet]

2024-08-02 Thread via GitHub
andygrove closed issue #750: CreateNamedStruct fails at runtime on dictionary-encoded string arrays URL: https://github.com/apache/datafusion-comet/issues/750

[PR] feat: support `Utf8View` for `starts_with` [datafusion]

2024-08-02 Thread via GitHub
tshauck opened a new pull request, #11787: URL: https://github.com/apache/datafusion/pull/11787 ## Which issue does this PR close? Closes #11786 ## Rationale for this change Utf8Views don't appear to be supported by `starts_with` yet. ## What changes are included in t

Re: [I] Error in building using the release or from source - build failure [datafusion-comet]

2024-08-02 Thread via GitHub
zelda89 closed issue #763: Error in building using the release or from source - build failure URL: https://github.com/apache/datafusion-comet/issues/763

Re: [I] Error in building using the release or from source - build failure [datafusion-comet]

2024-08-02 Thread via GitHub
zelda89 commented on issue #763: URL: https://github.com/apache/datafusion-comet/issues/763#issuecomment-2265874126 Thanks! That worked; it might be helpful to update the documentation.

Re: [PR] Support planning `Map` literal [datafusion]

2024-08-02 Thread via GitHub
goldmedal commented on code in PR #11780: URL: https://github.com/apache/datafusion/pull/11780#discussion_r1702154448 ## datafusion/sql/src/expr/mod.rs: ## @@ -714,6 +717,37 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { not_impl_err!("Unsupported dictionary literal:

Re: [PR] feat: support `Utf8View` for `starts_with` [datafusion]

2024-08-02 Thread via GitHub
tshauck commented on code in PR #11787: URL: https://github.com/apache/datafusion/pull/11787#discussion_r1702161338 ## datafusion/functions/src/string/starts_with.rs: ## @@ -81,18 +106,73 @@ impl ScalarUDFImpl for StartsWithFunc { } fn return_type(&self, _arg_types:
