Re: [PR] test: interval analysis unit tests [datafusion]

2025-01-19 Thread via GitHub
berkaysynnada commented on code in PR #14189: URL: https://github.com/apache/datafusion/pull/14189#discussion_r1921925967 ## datafusion/physical-expr/src/analysis.rs: ## @@ -246,3 +246,124 @@ fn calculate_selectivity( acc * cardinality_ratio(&initial.interval, &targ

Re: [PR] fix: fetch is missed in the EnsureSorting [datafusion]

2025-01-19 Thread via GitHub
berkaysynnada commented on PR #14192: URL: https://github.com/apache/datafusion/pull/14192#issuecomment-2601607057 I'll be reviewing this today -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] fix: fetch is missed in the EnsureSorting [datafusion]

2025-01-19 Thread via GitHub
xudong963 commented on PR #14192: URL: https://github.com/apache/datafusion/pull/14192#issuecomment-2601603557 Thanks, @akurmustafa! I plan to merge the PR after @alamb or someone else who's interested has a look.

Re: [PR] Update logical-types to main [datafusion]

2025-01-19 Thread via GitHub
tobixdev commented on PR #14202: URL: https://github.com/apache/datafusion/pull/14202#issuecomment-2601483855 I'll have a look at it today.

Re: [PR] Add union_extract scalar function [datafusion]

2025-01-19 Thread via GitHub
gstvg commented on PR #12116: URL: https://github.com/apache/datafusion/pull/12116#issuecomment-2601470429 Thanks @jayzhan211 🙏, this is ready for review cc @tobixdev

Re: [I] Add support for function chaining and the dot syntax for function calls [datafusion]

2025-01-19 Thread via GitHub
gstvg commented on issue #12206: URL: https://github.com/apache/datafusion/issues/12206#issuecomment-2601463163 I filed #14205 to track support for lambda functions

Re: [I] Spark+HDFS cannot decompress parquet files lz4 compressed by DataFusion [datafusion]

2025-01-19 Thread via GitHub
hayman42 closed issue #14105: Spark+HDFS cannot decompress parquet files lz4 compressed by DataFusion URL: https://github.com/apache/datafusion/issues/14105

Re: [I] Spark+HDFS cannot decompress parquet files lz4 compressed by DataFusion [datafusion]

2025-01-19 Thread via GitHub
hayman42 commented on issue #14105: URL: https://github.com/apache/datafusion/issues/14105#issuecomment-2601382444 @tustvold Confirmed that using `lz4_raw` option + spark 3.5 works well. Thank you for your detailed explanation.

Re: [I] Implement Display for ColumnarValue [datafusion]

2025-01-19 Thread via GitHub
Kunal-Singh-Dadhwal commented on issue #14176: URL: https://github.com/apache/datafusion/issues/14176#issuecomment-2601371233 @zjregee Yes

Re: [I] Implement Display for ColumnarValue [datafusion]

2025-01-19 Thread via GitHub
zjregee commented on issue #14176: URL: https://github.com/apache/datafusion/issues/14176#issuecomment-2601368836 Hi, @Kunal-Singh-Dadhwal, can I take this?

Re: [PR] Feat: Support array_join [datafusion-comet]

2025-01-19 Thread via GitHub
jatin510 commented on PR #1290: URL: https://github.com/apache/datafusion-comet/pull/1290#issuecomment-2601261342 lgtm 👍

Re: [I] Find a safe alternative to `LogicalPlan::using_columns()` [datafusion]

2025-01-19 Thread via GitHub
jonahgao commented on issue #14118: URL: https://github.com/apache/datafusion/issues/14118#issuecomment-2601178862 > 2. Make exclude using columns shallower/non-recursive i.e. do not let it search for redundant column in sub-query. I think it is the correct approach. Since unnamed sub

[I] LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-19 Thread via GitHub
haohuaijin opened a new issue, #14204: URL: https://github.com/apache/datafusion/issues/14204 ### Describe the bug see the plan ``` | physical_plan after OutputRequirements | GlobalLimitExec: skip=0, fetch=10

Re: [PR] Minor add ticket references to deprecated code [datafusion]

2025-01-19 Thread via GitHub
jonahgao merged PR #14174: URL: https://github.com/apache/datafusion/pull/14174

Re: [PR] Add hook for sharing join state in distributed execution [datafusion]

2025-01-19 Thread via GitHub
github-actions[bot] commented on PR #12523: URL: https://github.com/apache/datafusion/pull/12523#issuecomment-2601169692 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Improve `case` expr constant handling for when `` [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 commented on PR #14159: URL: https://github.com/apache/datafusion/pull/14159#issuecomment-2601131708 Thanks @alamb @Omega359

Re: [PR] Improve `case` expr constant handling for when `` [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 merged PR #14159: URL: https://github.com/apache/datafusion/pull/14159

Re: [PR] refactor: switch BooleanBufferBuilder to NullBufferBuilder in functions-nested functions [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 merged PR #14201: URL: https://github.com/apache/datafusion/pull/14201

Re: [PR] feat: Use `SchemaRef` in `JoinFilter` [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 merged PR #14182: URL: https://github.com/apache/datafusion/pull/14182

Re: [I] Use `SchemaRef` in `JoinFilter` [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 closed issue #14177: Use `SchemaRef` in `JoinFilter` URL: https://github.com/apache/datafusion/issues/14177

Re: [PR] feat: Use `SchemaRef` in `JoinFilter` [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 commented on PR #14182: URL: https://github.com/apache/datafusion/pull/14182#issuecomment-2601127141 Thanks @irenjj @alamb

[PR] Add union_extract scalar function [datafusion]

2025-01-19 Thread via GitHub
gstvg opened a new pull request, #12116: URL: https://github.com/apache/datafusion/pull/12116 ## Which issue does this PR close? Closes #11081 ## What changes are included in this PR? union_extract implementation and docs Add table with union column on the session con

Re: [PR] Update logical-types to main [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 commented on PR #14202: URL: https://github.com/apache/datafusion/pull/14202#issuecomment-2601124055 ![Screenshot 2025-01-20 at 9 00 09  AM](https://github.com/user-attachments/assets/9a4e2a33-3913-43b1-ab81-8d66e3d7cb2e) I compare it with the `main`, it includes unexpected chan

Re: [PR] Update logical-types to main [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 commented on PR #14202: URL: https://github.com/apache/datafusion/pull/14202#issuecomment-2601119156 I will just merge this, since it is a rebase to `logical-types`

Re: [PR] Update logical-types to main [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 merged PR #14202: URL: https://github.com/apache/datafusion/pull/14202

Re: [PR] Introduce `return_type_from_args ` for ScalarFunction. [datafusion]

2025-01-19 Thread via GitHub
jayzhan211 commented on code in PR #14094: URL: https://github.com/apache/datafusion/pull/14094#discussion_r1921677988 ## datafusion/common/src/utils/mod.rs: ## @@ -1201,4 +1202,13 @@ mod tests { assert_eq!(expected, transposed); Ok(()) } + +#[test] +

Re: [PR] Add union_extract scalar function [datafusion]

2025-01-19 Thread via GitHub
gstvg commented on PR #12116: URL: https://github.com/apache/datafusion/pull/12116#issuecomment-2601112958 Hi @jayzhan211, can you reopen this, or should I open a new PR? Sorry for the delay. Thanks!

Re: [PR] chore: extract math_funcs expressions to folders based on spark grouping [datafusion-comet]

2025-01-19 Thread via GitHub
andygrove merged PR #1219: URL: https://github.com/apache/datafusion-comet/pull/1219

Re: [PR] Feat: Support array_join [datafusion-comet]

2025-01-19 Thread via GitHub
codecov-commenter commented on PR #1290: URL: https://github.com/apache/datafusion-comet/pull/1290#issuecomment-2601083015 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1290?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Move `EnforceDistribution` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-19 Thread via GitHub
logan-keede commented on code in PR #14190: URL: https://github.com/apache/datafusion/pull/14190#discussion_r1921648964 ## datafusion/physical-optimizer/Cargo.toml: ## @@ -48,6 +49,7 @@ futures = { workspace = true } itertools = { workspace = true } log = { workspace = true }

Re: [PR] Move `EnforceDistribution` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-19 Thread via GitHub
logan-keede commented on code in PR #14190: URL: https://github.com/apache/datafusion/pull/14190#discussion_r1921278241 ## datafusion/physical-optimizer/Cargo.toml: ## @@ -48,6 +49,7 @@ futures = { workspace = true } itertools = { workspace = true } log = { workspace = true }

Re: [I] Move `ProjectionPushdown` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-19 Thread via GitHub
logan-keede commented on issue #14184: URL: https://github.com/apache/datafusion/issues/14184#issuecomment-2601035917 @alamb Please let me know your thoughts on this and #14190, whenever possible, Thanks, Logan

Re: [PR] Deprecate max statistics size properly [datafusion]

2025-01-19 Thread via GitHub
logan-keede commented on PR #14188: URL: https://github.com/apache/datafusion/pull/14188#issuecomment-2601030958 > I pushed another commit for `#[allow(deprecated)]` Can I ask for the rationale? Is it just good practice, or was the compiler giving warnings? (I am relatively new to Rust.)

Re: [I] Move `ProjectionPushdown` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-19 Thread via GitHub
logan-keede commented on issue #14184: URL: https://github.com/apache/datafusion/issues/14184#issuecomment-2601028884 I think this would require making a separate crate for `datasource` since `ProjectionPushdown` has a dependency on `datasource` in `core` unless I am missing some major and

Re: [PR] Feat: Support array_join [datafusion-comet]

2025-01-19 Thread via GitHub
erenavsarogullari commented on PR #1290: URL: https://github.com/apache/datafusion-comet/pull/1290#issuecomment-2601018854 > I think the handling of the third argument (nullReplacement) is missing see: https://docs.databricks.com/en/sql/language-manual/functions/array_join.html#arguments.

Re: [PR] Update verson to 0.54.0 and update changelog [datafusion-sqlparser-rs]

2025-01-19 Thread via GitHub
alamb commented on PR #1668: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1668#issuecomment-2601008082 Awesome -- thanks @iffyio -- I'll plan to make one tomorrow

Re: [PR] Feat: Support array_intersect [datafusion-comet]

2025-01-19 Thread via GitHub
erenavsarogullari commented on PR #1271: URL: https://github.com/apache/datafusion-comet/pull/1271#issuecomment-2600994861 > Thanks @erenavsarogullari. It would be great to have help with this. I will try and add some more notes to the issue with suggestions for how we can improve coverage

Re: [I] Add documentation for `<=>` operator [datafusion]

2025-01-19 Thread via GitHub
Spaarsh commented on issue #14203: URL: https://github.com/apache/datafusion/issues/14203#issuecomment-2600980758 take

Re: [I] RFC: Should we remove pyarrow feature from datafusion core [datafusion]

2025-01-19 Thread via GitHub
andygrove commented on issue #14197: URL: https://github.com/apache/datafusion/issues/14197#issuecomment-2600980715 This seems logical to me

Re: [PR] Add benchmark for planning sorted unions [datafusion]

2025-01-19 Thread via GitHub
comphead merged PR #14157: URL: https://github.com/apache/datafusion/pull/14157

Re: [PR] Support spaceship operator (`<=>`) support (alias for `IS NOT DISTINCT FROM` [datafusion]

2025-01-19 Thread via GitHub
comphead merged PR #14187: URL: https://github.com/apache/datafusion/pull/14187

Re: [PR] Support spaceship operator (`<=>`) support (alias for `IS NOT DISTINCT FROM` [datafusion]

2025-01-19 Thread via GitHub
comphead commented on PR #14187: URL: https://github.com/apache/datafusion/pull/14187#issuecomment-2600973473 Filed https://github.com/apache/datafusion/issues/14203 for documentation

Re: [I] Spaceship operator (<=>) not supported [datafusion]

2025-01-19 Thread via GitHub
comphead closed issue #14098: Spaceship operator (<=>) not supported URL: https://github.com/apache/datafusion/issues/14098

Re: [PR] Added job board as a separate header in the documentation [datafusion]

2025-01-19 Thread via GitHub
comphead merged PR #14191: URL: https://github.com/apache/datafusion/pull/14191

Re: [PR] Minor: Rename extended test job name [datafusion]

2025-01-19 Thread via GitHub
comphead merged PR #14199: URL: https://github.com/apache/datafusion/pull/14199

Re: [PR] Merge SortMergeJoin filtered batches into larger batches [datafusion]

2025-01-19 Thread via GitHub
comphead commented on PR #14160: URL: https://github.com/apache/datafusion/pull/14160#issuecomment-2600971996 Tbh, I was not able to find `BatchCoalescer` in joins; the closest was `CoalesceBatchesExec` in a bunch of physical plan nodes including `sort_preserving_merge.rs` 🤔 But it will make