Re: [PR] Implement `kurtosis_pop` UDAF [datafusion]

2024-09-04 Thread via GitHub
goldmedal commented on PR #12273: URL: https://github.com/apache/datafusion/pull/12273#issuecomment-2328104660 Thanks @jayzhan211 @2010YOUY01 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] feat: Add projection to FilterExec [datafusion]

2024-09-04 Thread via GitHub
berkaysynnada commented on code in PR #12281: URL: https://github.com/apache/datafusion/pull/12281#discussion_r1743240999 ## datafusion/physical-expr/src/equivalence/projection.rs: ## @@ -82,6 +82,11 @@ impl ProjectionMapping { .map(|map| Self { map }) } +

[PR] Update arrow-schema requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
dependabot[bot] opened a new pull request, #12312: URL: https://github.com/apache/datafusion/pull/12312 Updates the requirements on [arrow-schema](https://github.com/apache/arrow-rs) to permit the latest version. Changelog Sourced from https://github.com/apache/arrow-rs/blob/master

Re: [PR] perf: avoid repeat format in calc_func_dependencies_for_project [datafusion]

2024-09-04 Thread via GitHub
lewiszlw merged PR #12305: URL: https://github.com/apache/datafusion/pull/12305

Re: [I] significant degraded performance of `select * from large tables` in main compared to datafusion v41.0.0 [datafusion]

2024-09-04 Thread via GitHub
lewiszlw closed issue #12304: significant degraded performance of `select * from large tables` in main compared to datafusion v41.0.0 URL: https://github.com/apache/datafusion/issues/12304

[PR] Update arrow-buffer requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
dependabot[bot] opened a new pull request, #12314: URL: https://github.com/apache/datafusion/pull/12314 Updates the requirements on [arrow-buffer](https://github.com/apache/arrow-rs) to permit the latest version. Changelog Sourced from https://github.com/apache/arrow-rs/blob/master

[PR] Update arrow-ipc requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
dependabot[bot] opened a new pull request, #12315: URL: https://github.com/apache/datafusion/pull/12315 Updates the requirements on [arrow-ipc](https://github.com/apache/arrow-rs) to permit the latest version. Changelog Sourced from https://github.com/apache/arrow-rs/blob/master/CH

[PR] Update arrow-array requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
dependabot[bot] opened a new pull request, #12316: URL: https://github.com/apache/datafusion/pull/12316 Updates the requirements on [arrow-array](https://github.com/apache/arrow-rs) to permit the latest version. Changelog Sourced from https://github.com/apache/arrow-rs/blob/master/

[PR] Update arrow-ord requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
dependabot[bot] opened a new pull request, #12317: URL: https://github.com/apache/datafusion/pull/12317 Updates the requirements on [arrow-ord](https://github.com/apache/arrow-rs) to permit the latest version. Changelog Sourced from https://github.com/apache/arrow-rs/blob/master/CH

[PR] Update arrow requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
dependabot[bot] opened a new pull request, #12318: URL: https://github.com/apache/datafusion/pull/12318 Updates the requirements on [arrow](https://github.com/apache/arrow-rs) to permit the latest version. Changelog Sourced from https://github.com/apache/arrow-rs/blob/master/CHANGE

Re: [PR] Handle type coercion in signature for `ApproxPercentileCont` [datafusion]

2024-09-04 Thread via GitHub
Blizzara commented on code in PR #12274: URL: https://github.com/apache/datafusion/pull/12274#discussion_r1743339844 ## datafusion/functions-aggregate/src/approx_median.rs: ## @@ -116,4 +112,8 @@ impl AggregateUDFImpl for ApproxMedian { acc_args.exprs[0].data_type(a

Re: [PR] chore: Close dictionary provider when iterator is closed [datafusion-comet]

2024-09-04 Thread via GitHub
Kontinuation commented on PR #904: URL: https://github.com/apache/datafusion-comet/pull/904#issuecomment-2328336839 This fix looks correct. The dictionary vectors also [hold the reference counts](https://github.com/apache/arrow/blob/r-16.1.0/java/c/src/main/java/org/apache/arrow/c/ReferenceC

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
mesejo commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1743393765 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[DataType

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
mesejo commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1743402139 ## datafusion/core/src/dataframe/mod.rs: ## @@ -1803,6 +1803,17 @@ mod tests { Ok(()) } +#[tokio::test] +async fn test_coalesce_schema() ->

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
jayzhan211 commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1743420867 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[Data

Re: [PR] feat: Enforce the uniqueness of map key name for the map/make_map function [datafusion]

2024-09-04 Thread via GitHub
Weijun-H merged PR #12153: URL: https://github.com/apache/datafusion/pull/12153

Re: [I] Enforce the uniqueness of map key name for the map/make_map function [datafusion]

2024-09-04 Thread via GitHub
Weijun-H closed issue #11437: Enforce the uniqueness of map key name for the map/make_map function URL: https://github.com/apache/datafusion/issues/11437

Re: [PR] Handle type coercion in signature for `ApproxPercentileCont` [datafusion]

2024-09-04 Thread via GitHub
jayzhan211 commented on code in PR #12274: URL: https://github.com/apache/datafusion/pull/12274#discussion_r1743431560 ## datafusion/functions-aggregate/src/approx_median.rs: ## @@ -116,4 +112,8 @@ impl AggregateUDFImpl for ApproxMedian { acc_args.exprs[0].data_type

Re: [PR] Use `filtered_null_mask` in `CountGroupsAccumulator ` and `PrimitiveGroupsAccumulator` [datafusion]

2024-09-04 Thread via GitHub
Rachelint commented on PR #11825: URL: https://github.com/apache/datafusion/pull/11825#issuecomment-2328457120 Hi, I am curious about the slower cases. I merged and ran some benchmarks locally, and found some cases faster and none slower... ``` $ bash bench.sh compare main test-quick

Re: [PR] Update arrow-flight requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
alamb commented on PR #12313: URL: https://github.com/apache/datafusion/pull/12313#issuecomment-2328533566 Included in https://github.com/apache/datafusion/pull/12032

Re: [PR] Update arrow-buffer requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
alamb commented on PR #12314: URL: https://github.com/apache/datafusion/pull/12314#issuecomment-2328533711 Included in https://github.com/apache/datafusion/pull/12032

Re: [PR] Update arrow requirement from 52.2.0 to 53.0.0 [datafusion]

2024-09-04 Thread via GitHub
alamb commented on PR #12318: URL: https://github.com/apache/datafusion/pull/12318#issuecomment-2328534420 Included in https://github.com/apache/datafusion/pull/12032

Re: [PR] Improve & unify validation in LogicalPlan::with_new_exprs [datafusion]

2024-09-04 Thread via GitHub
alamb commented on code in PR #12264: URL: https://github.com/apache/datafusion/pull/12264#discussion_r1743517939 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -1013,92 +1046,157 @@ impl LogicalPlan { } LogicalPlan::Distinct(distinct) => {

Re: [PR] Update to `arrow`/`parquet` `53.0.0`, `tonic`, `prost`, `object_store`, `pyo3` [datafusion]

2024-09-04 Thread via GitHub
alamb commented on PR #12032: URL: https://github.com/apache/datafusion/pull/12032#issuecomment-2328555310 This PR is now ready for review 🎉 (finally, 53.0.0 is released)

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
mesejo commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1743548464 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[DataType

Re: [PR] Fix Possible Congestion Scenario in `SortPreservingMergeExec` [datafusion]

2024-09-04 Thread via GitHub
tustvold commented on PR #12302: URL: https://github.com/apache/datafusion/pull/12302#issuecomment-2328599689 I'm afraid I don't have capacity to review this, and am not likely to in the foreseeable future, however, one thing to be aware of is that SortPreservingMerge must be stable. Theref

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
jayzhan211 commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1743596400 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[Data

Re: [PR] Fix Possible Congestion Scenario in `SortPreservingMergeExec` [datafusion]

2024-09-04 Thread via GitHub
alamb commented on code in PR #12302: URL: https://github.com/apache/datafusion/pull/12302#discussion_r1743625178 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -156,12 +164,22 @@ impl SortPreservingMergeStream { } // try to initialize the loser tree

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
Kontinuation commented on code in PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#discussion_r1743675528 ## common/src/main/scala/org/apache/comet/vector/NativeUtil.scala: ## @@ -110,12 +132,12 @@ class NativeUtil { * @return * a list of Comet vectors

Re: [PR] Improve binary scalars display [datafusion]

2024-09-04 Thread via GitHub
alamb merged PR #12192: URL: https://github.com/apache/datafusion/pull/12192

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
Kontinuation commented on code in PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#discussion_r1743702560 ## native/core/src/execution/utils.rs: ## @@ -96,6 +100,25 @@ impl SparkArrowConvert for ArrayData { Ok((array as i64, schema as i64)) } + +

[PR] Minor: Reduce string allocations in ScalarValue::binary display [datafusion]

2024-09-04 Thread via GitHub
alamb opened a new pull request, #12322: URL: https://github.com/apache/datafusion/pull/12322 ## Which issue does this PR close? N/A ## Rationale for this change Follow on to https://github.com/apache/datafusion/pull/12192 from @lewiszlw I noticed some improvements

Re: [PR] Update to `arrow`/`parquet` `53.0.0`, `tonic`, `prost`, `object_store`, `pyo3` [datafusion]

2024-09-04 Thread via GitHub
Omega359 commented on code in PR #12032: URL: https://github.com/apache/datafusion/pull/12032#discussion_r1743791472 ## datafusion-cli/Cargo.toml: ## @@ -30,9 +30,16 @@ rust-version = "1.76" readme = "README.md" [dependencies] -arrow = { version = "52.2.0" } +arrow = { versi

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
mesejo commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1743854073 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[DataType

Re: [PR] feat: Array element extraction [datafusion-comet]

2024-09-04 Thread via GitHub
andygrove commented on code in PR #899: URL: https://github.com/apache/datafusion-comet/pull/899#discussion_r1743856474 ## native/spark-expr/src/list.rs: ## @@ -0,0 +1,322 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] Support the custom terminator for the CSV file format [datafusion]

2024-09-04 Thread via GitHub
goldmedal commented on code in PR #12263: URL: https://github.com/apache/datafusion/pull/12263#discussion_r1743863746 ## datafusion/core/src/datasource/physical_plan/csv.rs: ## @@ -1210,6 +1242,105 @@ mod tests { crate::assert_batches_eq!(expected, &result); } +

Re: [PR] Support the custom terminator for the CSV file format [datafusion]

2024-09-04 Thread via GitHub
goldmedal commented on code in PR #12263: URL: https://github.com/apache/datafusion/pull/12263#discussion_r1743885465 ## datafusion/sqllogictest/test_files/csv_files.slt: ## @@ -336,3 +336,23 @@ id message 05)good test 4 unquoted value end + +statement ok +CREATE EXTERNAL TAB

[PR] Logfire 2024 08 29 [datafusion]

2024-09-04 Thread via GitHub
samuelcolvin opened a new pull request, #12324: URL: https://github.com/apache/datafusion/pull/12324 Just to view

[PR] Avo/match input schema [datafusion]

2024-09-04 Thread via GitHub
Blizzara opened a new pull request, #12325: URL: https://github.com/apache/datafusion/pull/12325 ## Which issue does this PR close? Fixes a correctness issue when consuming Substrait plans against tables that don't exactly match in schema. Currently DF may in those cases use wrong co

[I] Converting binary data to utf8 string [datafusion]

2024-09-04 Thread via GitHub
nathanielc opened a new issue, #12326: URL: https://github.com/apache/datafusion/issues/12326 ### Is your feature request related to a problem or challenge? I often work with a column with a binary type however its known that the binary data is a valid utf8 string. I'd like a mechanis

Re: [PR] feat: Array element extraction [datafusion-comet]

2024-09-04 Thread via GitHub
Kimahriman commented on code in PR #899: URL: https://github.com/apache/datafusion-comet/pull/899#discussion_r1743937320 ## native/spark-expr/src/list.rs: ## @@ -0,0 +1,322 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreemen

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
mesejo commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1743962226 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[DataType

Re: [PR] perf: Apply DataFusion's projection pushdown rule [datafusion-comet]

2024-09-04 Thread via GitHub
comphead commented on code in PR #907: URL: https://github.com/apache/datafusion-comet/pull/907#discussion_r1743985266 ## native/core/src/execution/jni_api.rs: ## @@ -249,7 +254,14 @@ fn prepare_datafusion_session_context( let runtime = RuntimeEnv::new(rt_config).unwrap()

Re: [PR] perf: Apply DataFusion's projection pushdown rule [datafusion-comet]

2024-09-04 Thread via GitHub
andygrove commented on code in PR #907: URL: https://github.com/apache/datafusion-comet/pull/907#discussion_r1743989603 ## native/core/src/execution/jni_api.rs: ## @@ -249,7 +254,14 @@ fn prepare_datafusion_session_context( let runtime = RuntimeEnv::new(rt_config).unwrap(

Re: [PR] Support `skewness(x)` in Aggregation function [datafusion]

2024-09-04 Thread via GitHub
comphead commented on code in PR #12295: URL: https://github.com/apache/datafusion/pull/12295#discussion_r1744003521 ## docs/source/user-guide/sql/aggregate_functions.md: ## @@ -527,6 +528,19 @@ regr_sxy(expression_y, expression_x) - **expression_x**: Independent variable. C

[PR] minor: Add PartialEq, Eq traits to StatsType [datafusion]

2024-09-04 Thread via GitHub
andygrove opened a new pull request, #12327: URL: https://github.com/apache/datafusion/pull/12327 ## Which issue does this PR close? N/A ## Rationale for this change DataFusion Comet currently has a copy of this enum with `PartialEq` and `Eq` added. I wou

Re: [PR] Minor: improve performance of `ScalarValue::Binary*` debug [datafusion]

2024-09-04 Thread via GitHub
Dandandan commented on code in PR #12323: URL: https://github.com/apache/datafusion/pull/12323#discussion_r1744004720 ## datafusion/common/src/scalar/mod.rs: ## @@ -3646,18 +3646,22 @@ fn fmt_list(arr: ArrayRef, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "{value_fo

Re: [PR] Support `skewness(x)` in Aggregation function [datafusion]

2024-09-04 Thread via GitHub
comphead commented on PR #12295: URL: https://github.com/apache/datafusion/pull/12295#issuecomment-2329386911 Basically, I feel we have too many user .md files that duplicate the function descriptions and are poorly connected to each other, which is confusing. @alamb WDYT about hav

Re: [PR] chore: Upgrade to latest DataFusion revision [datafusion-comet]

2024-09-04 Thread via GitHub
andygrove commented on code in PR #909: URL: https://github.com/apache/datafusion-comet/pull/909#discussion_r1744020900 ## native/spark-expr/src/utils.rs: ## @@ -30,23 +29,6 @@ use arrow::{ }; use chrono::{DateTime, Offset, TimeZone}; -use datafusion_physical_plan::PhysicalE

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
viirya commented on code in PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#discussion_r1744021968 ## common/src/main/scala/org/apache/comet/vector/NativeUtil.scala: ## @@ -110,12 +132,12 @@ class NativeUtil { * @return * a list of Comet vectors *

Re: [PR] perf: Apply DataFusion's projection pushdown rule [datafusion-comet]

2024-09-04 Thread via GitHub
andygrove commented on PR #907: URL: https://github.com/apache/datafusion-comet/pull/907#issuecomment-2329407232 The benchmark results are not very exciting, and the improvements could just be noise. ![tpch_queries_speedup_abs](https://github.com/user-attachments/assets/3ac8abf0-da8

Re: [PR] Minor: improve performance of `ScalarValue::Binary*` debug [datafusion]

2024-09-04 Thread via GitHub
Dandandan commented on code in PR #12323: URL: https://github.com/apache/datafusion/pull/12323#discussion_r1744036133 ## datafusion/common/src/scalar/mod.rs: ## @@ -3646,18 +3646,22 @@ fn fmt_list(arr: ArrayRef, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "{value_fo

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
viirya commented on code in PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#discussion_r1744037235 ## native/core/src/execution/utils.rs: ## @@ -96,6 +100,25 @@ impl SparkArrowConvert for ArrayData { Ok((array as i64, schema as i64)) } + +//

Re: [PR] Minor: Reduce string allocations in ScalarValue::binary display [datafusion]

2024-09-04 Thread via GitHub
crepererum merged PR #12322: URL: https://github.com/apache/datafusion/pull/12322

Re: [PR] Improve & unify validation in LogicalPlan::with_new_exprs [datafusion]

2024-09-04 Thread via GitHub
findepi commented on code in PR #12264: URL: https://github.com/apache/datafusion/pull/12264#discussion_r1744097551 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -1013,92 +1046,157 @@ impl LogicalPlan { } LogicalPlan::Distinct(distinct) => {

Re: [I] Ability to inspect type of an expression [datafusion]

2024-09-04 Thread via GitHub
findepi commented on issue #12272: URL: https://github.com/apache/datafusion/issues/12272#issuecomment-2329530054 > What do you mean `SQL type`? e.g. `varchar(n)`, `bigint`. As in `SELECT CAST('123' AS bigint)` which is a valid expression accepted by DF. > Do you have a example

Re: [PR] Support the custom terminator for the CSV file format [datafusion]

2024-09-04 Thread via GitHub
goldmedal commented on code in PR #12263: URL: https://github.com/apache/datafusion/pull/12263#discussion_r1744114242 ## datafusion/sqllogictest/test_files/csv_files.slt: ## @@ -336,3 +336,23 @@ id message 05)good test 4 unquoted value end + +statement ok +CREATE EXTERNAL TAB

Re: [PR] chore: Upgrade to latest DataFusion revision [datafusion-comet]

2024-09-04 Thread via GitHub
andygrove commented on PR #909: URL: https://github.com/apache/datafusion-comet/pull/909#issuecomment-2329618394 @huaxingao There are quite a few changes to aggregates in this PR due to upstream API changes. Could you review when you get a chance?

Re: [PR] Update to `arrow`/`parquet` `53.0.0`, `tonic`, `prost`, `object_store`, `pyo3` [datafusion]

2024-09-04 Thread via GitHub
XiangpengHao commented on code in PR #12032: URL: https://github.com/apache/datafusion/pull/12032#discussion_r1744229803 ## datafusion/functions/src/regex/regexpreplace.rs: ## @@ -401,8 +401,7 @@ fn _regexp_replace_static_pattern_replace( DataType::Utf8View => {

Re: [PR] chore: Upgrade to latest DataFusion revision [datafusion-comet]

2024-09-04 Thread via GitHub
kazuyukitanimura commented on PR #909: URL: https://github.com/apache/datafusion-comet/pull/909#issuecomment-2329697386 Oops, some test failures

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
viirya commented on code in PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#discussion_r1744257806 ## native/core/src/execution/utils.rs: ## @@ -96,6 +99,22 @@ impl SparkArrowConvert for ArrayData { Ok((array as i64, schema as i64)) } + +///

[PR] Added array_any_value function [datafusion]

2024-09-04 Thread via GitHub
athultr1997 opened a new pull request, #12329: URL: https://github.com/apache/datafusion/pull/12329 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

Re: [PR] chore: Upgrade to latest DataFusion revision [datafusion-comet]

2024-09-04 Thread via GitHub
andygrove commented on code in PR #909: URL: https://github.com/apache/datafusion-comet/pull/909#discussion_r1744279713 ## native/core/src/execution/datafusion/planner.rs: ## @@ -1347,12 +1349,24 @@ impl PhysicalPlanner { match datatype {

Re: [PR] chore: Upgrade to latest DataFusion revision [datafusion-comet]

2024-09-04 Thread via GitHub
viirya commented on code in PR #909: URL: https://github.com/apache/datafusion-comet/pull/909#discussion_r1744282163 ## native/core/src/execution/datafusion/planner.rs: ## @@ -1347,12 +1349,24 @@ impl PhysicalPlanner { match datatype { Dat

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
mesejo commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1744452513 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[DataType

Re: [PR] minor: reuse SessionStateBuilder methods for default builder [datafusion]

2024-09-04 Thread via GitHub
Omega359 commented on PR #12330: URL: https://github.com/apache/datafusion/pull/12330#issuecomment-2330059849 Ah, nice improvement.

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
viirya commented on code in PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#discussion_r1744475477 ## native/core/src/execution/utils.rs: ## @@ -96,6 +99,22 @@ impl SparkArrowConvert for ArrayData { Ok((array as i64, schema as i64)) } + +///

[PR] date_add expression [datafusion-comet]

2024-09-04 Thread via GitHub
mbutrovich opened a new pull request, #910: URL: https://github.com/apache/datafusion-comet/pull/910 ## Which issue does this PR close? Does not close an individual PR, but addresses an item in #858. ## Rationale for this change ## What changes are include

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
viirya commented on PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#issuecomment-2330295213 Okay. The latest commit fixes the SIGSEGV errors.

Re: [I] DataFusion does not validate that Substrait NamedScan schemas match registered tables [datafusion]

2024-09-04 Thread via GitHub
vbarua commented on issue #12223: URL: https://github.com/apache/datafusion/issues/12223#issuecomment-2330309852 From conversations with @Blizzara, the requirement that the DataFusion and Substrait schemas match exactly is stricter than it needs to be. In practice, if the Substrait schema i

[PR] build: Add maven-compiler-plugin [datafusion-comet]

2024-09-04 Thread via GitHub
viirya opened a new pull request, #911: URL: https://github.com/apache/datafusion-comet/pull/911 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes test

Re: [PR] build: Add maven-compiler-plugin [datafusion-comet]

2024-09-04 Thread via GitHub
viirya commented on code in PR #911: URL: https://github.com/apache/datafusion-comet/pull/911#discussion_r1744612980 ## pom.xml: ## @@ -811,6 +811,17 @@ under the License. + + org.apache.maven.plugins + maven-compiler

Re: [PR] validate and adjust Substrait NamedTable schemas (#12223) [datafusion]

2024-09-04 Thread via GitHub
vbarua commented on code in PR #12245: URL: https://github.com/apache/datafusion/pull/12245#discussion_r1744627789 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -850,6 +868,31 @@ pub async fn from_substrait_rel( } } +/// Validates that the given Substrait s

Re: [I] Parquet statistics missing when reading `Utf8` as `Utf8View` [datafusion]

2024-09-04 Thread via GitHub
wiedld commented on issue #12123: URL: https://github.com/apache/datafusion/issues/12123#issuecomment-2330337635 take

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
jayzhan211 commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1744632046 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[Data

Re: [PR] fix: coalesce schema issues [datafusion]

2024-09-04 Thread via GitHub
jayzhan211 commented on code in PR #12308: URL: https://github.com/apache/datafusion/pull/12308#discussion_r1744643400 ## datafusion/functions/src/core/coalesce.rs: ## @@ -60,12 +60,16 @@ impl ScalarUDFImpl for CoalesceFunc { } fn return_type(&self, arg_types: &[Data

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
Kontinuation commented on code in PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#discussion_r1744699978 ## native/core/src/execution/utils.rs: ## @@ -96,6 +99,22 @@ impl SparkArrowConvert for ArrayData { Ok((array as i64, schema as i64)) } + +

Re: [PR] chore: Revise array import to more follow C Data Interface semantics [datafusion-comet]

2024-09-04 Thread via GitHub
viirya commented on code in PR #905: URL: https://github.com/apache/datafusion-comet/pull/905#discussion_r1744713645 ## native/core/src/execution/utils.rs: ## @@ -96,6 +99,22 @@ impl SparkArrowConvert for ArrayData { Ok((array as i64, schema as i64)) } + +///

Re: [PR] implement max_by aggregate function [datafusion]

2024-09-04 Thread via GitHub
Lordworms commented on PR #12284: URL: https://github.com/apache/datafusion/pull/12284#issuecomment-2330570159 repushed it

Re: [PR] Impl `convert_to_state` for `GroupsAccumulatorAdapter`. [datafusion]

2024-09-04 Thread via GitHub
Rachelint commented on code in PR #11827: URL: https://github.com/apache/datafusion/pull/11827#discussion_r1744783828 ## datafusion/physical-expr/src/aggregate/groups_accumulator/adapter.rs: ## @@ -342,6 +374,50 @@ impl GroupsAccumulator for GroupsAccumulatorAdapter { fn si

Re: [I] Support Connection through Arrow Flight RPC / ADBC [datafusion-comet]

2024-09-04 Thread via GitHub
v-kessler commented on issue #913: URL: https://github.com/apache/datafusion-comet/issues/913#issuecomment-2330679120 @vaibhawvipul as discussed here is the issue

Re: [PR] Fix subquery alias table definition unparsing for SQLite [datafusion]

2024-09-04 Thread via GitHub
phillipleblanc commented on code in PR #12331: URL: https://github.com/apache/datafusion/pull/12331#discussion_r1744845781 ## datafusion/sql/src/unparser/plan.rs: ## @@ -450,10 +453,33 @@ impl Unparser<'_> { Ok(()) } LogicalPlan::Subque

[PR] Simplify `update_skip_aggregation_probe` method [datafusion]

2024-09-04 Thread via GitHub
lewiszlw opened a new pull request, #12332: URL: https://github.com/apache/datafusion/pull/12332 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested?

Re: [PR] Fix subquery alias table definition unparsing for SQLite [datafusion]

2024-09-04 Thread via GitHub
sgrebnov commented on code in PR #12331: URL: https://github.com/apache/datafusion/pull/12331#discussion_r1744899415 ## datafusion/sql/src/unparser/rewrite.rs: ## @@ -257,6 +257,37 @@ pub(super) fn subquery_alias_inner_query_and_columns( (outer_projections.input.as_ref(), c