Re: [PR] Add extension hooks for encoding and decoding UDAFs and UDWFs [datafusion]

2024-07-15 Thread via GitHub
lewiszlw commented on PR #11417: URL: https://github.com/apache/datafusion/pull/11417#issuecomment-2230141932 I got this error before, running `cargo update` fixed it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

Re: [I] Move `sql_compound_identifier_to_expr` to `ExprPlanner` [datafusion]

2024-07-15 Thread via GitHub
dharanad commented on issue #11473: URL: https://github.com/apache/datafusion/issues/11473#issuecomment-2230136006 QQ: @alamb In `sql_compound_identifier_to_expr` the only this which is using `get_function_meta` is `get_field` ScalarUDF. I think ideally we should move `get_field` to `ExprPl

Re: [PR] Add extension hooks for encoding and decoding UDAFs and UDWFs [datafusion]

2024-07-15 Thread via GitHub
joroKr21 commented on PR #11417: URL: https://github.com/apache/datafusion/pull/11417#issuecomment-2230131517 @alamb do you get this issue locally on `main`? ``` error[E0432]: unresolved import `chrono::MappedLocalTime` --> datafusion/functions/src/datetime/to_local_time.rs:34:24

Re: [I] EnforceDistribution fails, seems to turn all the types of the schema to UInt64 [datafusion]

2024-07-15 Thread via GitHub
mustafasrepo commented on issue #10421: URL: https://github.com/apache/datafusion/issues/10421#issuecomment-2230106142 Thanks @fabianmurarui. I have debugged your code, the problem seems to be these two lines [line1](https://github.com/fabianmurariu/df_sql_bug/blob/4301fd122dde58bfdd9f912

Re: [PR] Unparser datetime to timestamp and interval [datafusion]

2024-07-15 Thread via GitHub
y-f-u commented on code in PR #11466: URL: https://github.com/apache/datafusion/pull/11466#discussion_r1678778548 ## datafusion/sql/src/unparser/expr.rs: ## @@ -1108,6 +1102,123 @@ impl Unparser<'_> { } } +fn interval_scalar_to_sql(&self, v: &ScalarValue) ->

Re: [PR] Unparser datetime to timestamp and interval [datafusion]

2024-07-15 Thread via GitHub
sgrebnov commented on code in PR #11466: URL: https://github.com/apache/datafusion/pull/11466#discussion_r1678766134 ## datafusion/sql/src/unparser/expr.rs: ## @@ -1108,6 +1102,123 @@ impl Unparser<'_> { } } +fn interval_scalar_to_sql(&self, v: &ScalarValue)

[PR] Move `GetField` to `ExprPlanner` [datafusion]

2024-07-15 Thread via GitHub
dharanad opened a new pull request, #11487: URL: https://github.com/apache/datafusion/pull/11487 ## Which issue does this PR close? Part of #11473 ## Rationale for this change ## What changes are included in this PR? ## Are these changes tes

Re: [PR] Remove element's nullability of array_agg function [datafusion]

2024-07-15 Thread via GitHub
eejbyfeldt commented on PR #11447: URL: https://github.com/apache/datafusion/pull/11447#issuecomment-2230004067 I don't think I am really following the motivation here. For me, the reason to have the correct nullability of the field inside the returned list is the same as having correct nul

Re: [PR] feat: Optimize for CASE WHEN predicate THEN expr ELSE null END [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove commented on PR #672: URL: https://github.com/apache/datafusion-comet/pull/672#issuecomment-2229965416 I also see a ~5% performance improvement with TPC-DS q43 @ sf=100 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
wiedld commented on code in PR #11444: URL: https://github.com/apache/datafusion/pull/11444#discussion_r1678713488 ## datafusion/common/src/config.rs: ## @@ -1874,4 +2042,193 @@ mod tests { let parsed_metadata = table_config.parquet.key_value_metadata; assert_e

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
wiedld commented on code in PR #11444: URL: https://github.com/apache/datafusion/pull/11444#discussion_r1678627090 ## datafusion/common/src/config.rs: ## @@ -1874,4 +2042,193 @@ mod tests { let parsed_metadata = table_config.parquet.key_value_metadata; assert_e

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
wiedld commented on code in PR #11444: URL: https://github.com/apache/datafusion/pull/11444#discussion_r1678627090 ## datafusion/common/src/config.rs: ## @@ -1874,4 +2042,193 @@ mod tests { let parsed_metadata = table_config.parquet.key_value_metadata; assert_e

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
wiedld commented on code in PR #11444: URL: https://github.com/apache/datafusion/pull/11444#discussion_r1678713488 ## datafusion/common/src/config.rs: ## @@ -1874,4 +2042,193 @@ mod tests { let parsed_metadata = table_config.parquet.key_value_metadata; assert_e

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
wiedld commented on code in PR #11444: URL: https://github.com/apache/datafusion/pull/11444#discussion_r1678713488 ## datafusion/common/src/config.rs: ## @@ -1874,4 +2042,193 @@ mod tests { let parsed_metadata = table_config.parquet.key_value_metadata; assert_e

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
wiedld commented on code in PR #11444: URL: https://github.com/apache/datafusion/pull/11444#discussion_r1678713904 ## datafusion/common/src/config.rs: ## @@ -1874,4 +2042,193 @@ mod tests { let parsed_metadata = table_config.parquet.key_value_metadata; assert_e

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
wiedld commented on code in PR #11444: URL: https://github.com/apache/datafusion/pull/11444#discussion_r1678713488 ## datafusion/common/src/config.rs: ## @@ -1874,4 +2042,193 @@ mod tests { let parsed_metadata = table_config.parquet.key_value_metadata; assert_e

Re: [I] Move `sql_compound_identifier_to_expr` to `ExprPlanner` [datafusion]

2024-07-15 Thread via GitHub
dharanad commented on issue #11473: URL: https://github.com/apache/datafusion/issues/11473#issuecomment-2229949761 > Sorry @dharanad -- I didn't see #11244 before No worries at all. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] feat: support group by unnest [datafusion]

2024-07-15 Thread via GitHub
JasonLi-cn commented on PR #11469: URL: https://github.com/apache/datafusion/pull/11469#issuecomment-2229933369 > Thank you @JasonLi-cn -- I think this PR looks good to me. Thank you for the contribution > > I think adding some comments about what `try_process_group_by_unnest` is doi

Re: [PR] feat: Optimize for CASE WHEN predicate THEN expr ELSE null END [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove closed pull request #672: feat: Optimize for CASE WHEN predicate THEN expr ELSE null END URL: https://github.com/apache/datafusion-comet/pull/672 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] [Epic] Extract catalog functionality from the core to make it more modular [datafusion]

2024-07-15 Thread via GitHub
comphead commented on issue #10782: URL: https://github.com/apache/datafusion/issues/10782#issuecomment-2229887288 This graph is amazing btw, first it shows how tightly coupled we are, the second is it gives really nice understandable picture -- This is an automated message from the Apach

Re: [PR] Fix parse `'1'::interval` as month by default [datafusion]

2024-07-15 Thread via GitHub
nix010 commented on code in PR #11454: URL: https://github.com/apache/datafusion/pull/11454#discussion_r1678672697 ## datafusion/expr/src/columnar_value.rs: ## @@ -195,19 +195,39 @@ impl ColumnarValue { kernels::cast::cast_with_options(array, cast_type, &cast_o

Re: [PR] Python wrapper classes for all user interfaces [datafusion-python]

2024-07-15 Thread via GitHub
kylebarron commented on PR #750: URL: https://github.com/apache/datafusion-python/pull/750#issuecomment-2229872515 > I am currently testing mkdocs per @kylebarron recommendation. Given that this project is currently using sphinx for documentation, it would be a pretty big change to m

[PR] Replace to_lowercase with to_string in sql exmaple [datafusion]

2024-07-15 Thread via GitHub
lewiszlw opened a new pull request, #11486: URL: https://github.com/apache/datafusion/pull/11486 ## Which issue does this PR close? Closes #. ## Rationale for this change Complete todo item in sql example. ## What changes are included in this PR?

Re: [PR] Adding complex expressions projections for Subquery [datafusion]

2024-07-15 Thread via GitHub
github-actions[bot] commented on PR #9719: URL: https://github.com/apache/datafusion/pull/9719#issuecomment-2229839661 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or th

Re: [PR] Get expr planners when creating new planner [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on code in PR #11485: URL: https://github.com/apache/datafusion/pull/11485#discussion_r1678638723 ## datafusion/sql/src/planner.rs: ## @@ -196,12 +193,6 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { Self::new_with_options(context_provider, Parse

Re: [PR] Get expr planners when creating new planner [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on PR #11485: URL: https://github.com/apache/datafusion/pull/11485#issuecomment-2229803166 We may even remove planners in SqlToRel 🤔 ? ```rust /// SQL query planner pub struct SqlToRel<'a, S: ContextProvider> { pub(crate) context_provider: &'a S,

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
wiedld commented on code in PR #11444: URL: https://github.com/apache/datafusion/pull/11444#discussion_r1678627090 ## datafusion/common/src/config.rs: ## @@ -1874,4 +2042,193 @@ mod tests { let parsed_metadata = table_config.parquet.key_value_metadata; assert_e

[PR] Expr or null [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove opened a new pull request, #672: URL: https://github.com/apache/datafusion-comet/pull/672 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes t

Re: [I] [Epic] Extract catalog functionality from the core to make it more modular [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on issue #10782: URL: https://github.com/apache/datafusion/issues/10782#issuecomment-2229757379 The dependencies in core is quite complex, take a note for it Draft the dependency graph, incomplete ```mermaid graph TD; CatalogProvider --> TableP

Re: [PR] upgrade sqlparser 0.47 -> 0.48 [datafusion]

2024-07-15 Thread via GitHub
comphead commented on PR #11453: URL: https://github.com/apache/datafusion/pull/11453#issuecomment-2229755920 Thanks @alamb for your analysis, I think this PR is good to go, but we need to think on recursive calls. -- This is an automated message from the Apache Git Service. To respond t

Re: [PR] Support SortMergeJoin spilling [datafusion]

2024-07-15 Thread via GitHub
comphead commented on code in PR #11218: URL: https://github.com/apache/datafusion/pull/11218#discussion_r1678595256 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -1503,12 +1579,39 @@ fn get_buffered_columns( buffered_batch_idx: usize, buffered_indice

[PR] Get expr planners when creating new planner [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 opened a new pull request, #11485: URL: https://github.com/apache/datafusion/pull/11485 ## Which issue does this PR close? Closes #11477 . ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [PR] Support SortMergeJoin spilling [datafusion]

2024-07-15 Thread via GitHub
comphead commented on code in PR #11218: URL: https://github.com/apache/datafusion/pull/11218#discussion_r1678572836 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -867,12 +951,16 @@ impl SMJStream { while !self.buffered_data.batches.is_empt

Re: [I] ExprPlanner not propagated to SqlToRel [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on issue #11477: URL: https://github.com/apache/datafusion/issues/11477#issuecomment-2229713743 I see. Let me see if it is possible to avoid the additional copied -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

Re: [I] ExprPlanner not propagated to SqlToRel [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on issue #11477: URL: https://github.com/apache/datafusion/issues/11477#issuecomment-2229701907 I see. You are right, we can get the planners without additional step. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] Support SortMergeJoin spilling [datafusion]

2024-07-15 Thread via GitHub
comphead commented on code in PR #11218: URL: https://github.com/apache/datafusion/pull/11218#discussion_r1678537635 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -602,15 +622,38 @@ impl BufferedBatch { + mem::size_of::>() + mem::size_

Re: [PR] feat: support `COUNT()` [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on PR #11229: URL: https://github.com/apache/datafusion/pull/11229#issuecomment-2229677324 @tshauck Is it better to merge the logic of count-wild and count empty into single rule, given the similarity of them? -- This is an automated message from the Apache Git Servic

Re: [I] ExprPlanner not propagated to SqlToRel [datafusion]

2024-07-15 Thread via GitHub
wjones127 commented on issue #11477: URL: https://github.com/apache/datafusion/issues/11477#issuecomment-2229673349 > We assume planner should exist, at least with datafusion's builtin one, otherwise we don't know how to handle it thus throws error for the case. I'm not sure I underst

Re: [I] Add nullable in `StateFieldArgs` [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on issue #11433: URL: https://github.com/apache/datafusion/issues/11433#issuecomment-2229650572 > This is a recent change introduced in #11093. > > https://github.com/apache/datafusion/blob/0965455486b7dcbd8c9a5efa8d2370ca5460bb9f/datafusion/functions-aggregate/sr

Re: [PR] Support SortMergeJoin spilling [datafusion]

2024-07-15 Thread via GitHub
comphead commented on code in PR #11218: URL: https://github.com/apache/datafusion/pull/11218#discussion_r1678511553 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -577,6 +595,8 @@ struct BufferedBatch { /// The indices of buffered batch that failed the joi

Re: [I] ExprPlanner not propagated to SqlToRel [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on issue #11477: URL: https://github.com/apache/datafusion/issues/11477#issuecomment-2229631902 After https://github.com/apache/datafusion/pull/11403, I think you can easily get the default planners with ```rust SessionStateBuilder::new() .with_d

Re: [I] ExprPlanner not propagated to SqlToRel [datafusion]

2024-07-15 Thread via GitHub
jayzhan211 commented on issue #11477: URL: https://github.com/apache/datafusion/issues/11477#issuecomment-2229629221 We assume planner should exist, at least with datafusion's builtin one, otherwise we don't know how to handle it thus throws error for the case. -- This is an automated mes

Re: [PR] chore: Add microbenchmarks [datafusion-comet]

2024-07-15 Thread via GitHub
codecov-commenter commented on PR #671: URL: https://github.com/apache/datafusion-comet/pull/671#issuecomment-2229625195 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/671?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campai

Re: [PR] Support SortMergeJoin spilling [datafusion]

2024-07-15 Thread via GitHub
comphead commented on code in PR #11218: URL: https://github.com/apache/datafusion/pull/11218#discussion_r1678505039 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -602,15 +622,38 @@ impl BufferedBatch { + mem::size_of::>() + mem::size_

Re: [PR] Support SortMergeJoin spilling [datafusion]

2024-07-15 Thread via GitHub
comphead commented on code in PR #11218: URL: https://github.com/apache/datafusion/pull/11218#discussion_r1678503959 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -602,15 +622,38 @@ impl BufferedBatch { + mem::size_of::>() + mem::size_

Re: [I] Add retract_bach method for xor accumulator [datafusion]

2024-07-15 Thread via GitHub
drewhayward commented on issue #7666: URL: https://github.com/apache/datafusion/issues/7666#issuecomment-2229611397 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] Support SortMergeJoin spilling [datafusion]

2024-07-15 Thread via GitHub
comphead commented on code in PR #11218: URL: https://github.com/apache/datafusion/pull/11218#discussion_r1678496388 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -602,15 +622,38 @@ impl BufferedBatch { + mem::size_of::>() + mem::size_

Re: [PR] Use IfExpr to check when input to log2 is <=0 and return null [datafusion-comet]

2024-07-15 Thread via GitHub
kazuyukitanimura commented on PR #506: URL: https://github.com/apache/datafusion-comet/pull/506#issuecomment-2229606328 And thanks @viirya -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

Re: [PR] Use IfExpr to check when input to log2 is <=0 and return null [datafusion-comet]

2024-07-15 Thread via GitHub
kazuyukitanimura commented on PR #506: URL: https://github.com/apache/datafusion-comet/pull/506#issuecomment-2229605518 Merged, thanks @PedroMDuarte @andygrove -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

Re: [PR] Use IfExpr to check when input to log2 is <=0 and return null [datafusion-comet]

2024-07-15 Thread via GitHub
kazuyukitanimura merged PR #506: URL: https://github.com/apache/datafusion-comet/pull/506 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubsc

[I] Optimize CASE WHEN for "expression or null" case [datafusion]

2024-07-15 Thread via GitHub
andygrove opened a new issue, #11484: URL: https://github.com/apache/datafusion/issues/11484 ### Is your feature request related to a problem or challenge? Given the expression `CASE WHEN predicate THEN expr ELSE null END` (as seen in some TPC-DS queries) we could do better than using

[PR] chore: Add microbenchmarks [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove opened a new pull request, #671: URL: https://github.com/apache/datafusion-comet/pull/671 ## Which issue does this PR close? N/A ## Rationale for this change When running the complex TPC-* queries, it is challenging to debug why Comet is slower

Re: [PR] feat: support `COUNT()` [datafusion]

2024-07-15 Thread via GitHub
tshauck commented on PR #11229: URL: https://github.com/apache/datafusion/pull/11229#issuecomment-2229514088 @jayzhan211 @alamb I've reverted this PR its original form. It adds a logical rule and updates the COUNT() function as a basis for discussion. I also explored a little bit the

[PR] chore: Add criterion benchmark for CaseExpr [datafusion]

2024-07-15 Thread via GitHub
andygrove opened a new pull request, #11482: URL: https://github.com/apache/datafusion/pull/11482 ## Which issue does this PR close? N/A ## Rationale for this change I would like to see if it is possible to improve performance of CaseExpr in some cases an

[PR] Minor: rename `row_groups.rs` to `row_group_filter.rs` [datafusion]

2024-07-15 Thread via GitHub
alamb opened a new pull request, #11481: URL: https://github.com/apache/datafusion/pull/11481 ## Which issue does this PR close? Part of https://github.com/apache/datafusion/issues/11480 ## Rationale for this change While working on https://github.com/apache/datafusio

Re: [PR] feat: add raw aggregate udf planner [datafusion]

2024-07-15 Thread via GitHub
tshauck commented on PR #11371: URL: https://github.com/apache/datafusion/pull/11371#issuecomment-2229478589 I'm going to close this as I don't think it is necessary for count() though it seemed to be an incremental step towards supporting planned aggregate udf expressions. For `coun

Re: [PR] feat: add raw aggregate udf planner [datafusion]

2024-07-15 Thread via GitHub
tshauck closed pull request #11371: feat: add raw aggregate udf planner URL: https://github.com/apache/datafusion/pull/11371 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] chore: Move protobuf files to separate crate [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove merged PR #661: URL: https://github.com/apache/datafusion-comet/pull/661 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@da

Re: [PR] Use upstream `StatisticsConverter` from arrow-rs in DataFusion [datafusion]

2024-07-15 Thread via GitHub
alamb commented on code in PR #11479: URL: https://github.com/apache/datafusion/pull/11479#discussion_r1678376971 ## datafusion/core/src/datasource/physical_plan/parquet/page_filter.rs: ## @@ -521,6 +518,31 @@ macro_rules! get_min_max_values_for_page_index { }}; } +// Co

Re: [PR] Use upstream `StatisticsConverter` from arrow-rs in DataFusion [datafusion]

2024-07-15 Thread via GitHub
alamb commented on code in PR #11479: URL: https://github.com/apache/datafusion/pull/11479#discussion_r1678376343 ## datafusion/core/src/datasource/physical_plan/parquet/row_groups.rs: ## @@ -356,20 +356,24 @@ impl<'a> RowGroupPruningStatistics<'a> { &'a self,

[PR] Use upstream `StatisticsConverter` from arrow-rs in DataFusion [datafusion]

2024-07-15 Thread via GitHub
alamb opened a new pull request, #11479: URL: https://github.com/apache/datafusion/pull/11479 Note this needs to wait until the next arrow release (`52.2.0`) as the API is not yet available TODO: - [ ] Use StatisticsConverter API for DataPage statistics (TODO find link + find tic

Re: [I] `Unnest` query does not allow one expression being referenced multiple times [datafusion]

2024-07-15 Thread via GitHub
duongcongtoai commented on issue #11198: URL: https://github.com/apache/datafusion/issues/11198#issuecomment-2229345078 https://github.com/apache/datafusion/blob/58f79e143e1a90e5caa59eecc9b36dbdd082a7eb/datafusion/sql/src/select.rs#L344 At this line, before create new projection nodes, we

Re: [I] `Unnest` query does not allow one expression being referenced multiple times [datafusion]

2024-07-15 Thread via GitHub
duongcongtoai commented on issue #11198: URL: https://github.com/apache/datafusion/issues/11198#issuecomment-2229340200 Although theoretically #6543 can solve this ticket, but i think the bug presented here comes from the internal implementation inside try_process_unnest function itself, an

Re: [I] Error building rev c434872 [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove closed issue #662: Error building rev c434872 URL: https://github.com/apache/datafusion-comet/issues/662 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscrib

Re: [PR] fix: Remove nightly flag in release-nogit target in Makefile [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove merged PR #667: URL: https://github.com/apache/datafusion-comet/pull/667 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@da

Re: [I] ExprPlanner not propagated to SqlToRel [datafusion]

2024-07-15 Thread via GitHub
wjones127 commented on issue #11477: URL: https://github.com/apache/datafusion/issues/11477#issuecomment-2229260251 @jayzhan211 could you confirm what the intended behavior is for this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] minor: non-overlapping `repart_time` and `send_time` metrics [datafusion]

2024-07-15 Thread via GitHub
alamb commented on code in PR #11440: URL: https://github.com/apache/datafusion/pull/11440#discussion_r1678305488 ## datafusion/physical-plan/src/repartition/mod.rs: ## @@ -303,8 +312,8 @@ impl BatchPartitioner { let batch =

Re: [PR] Use IfExpr to check when input to log2 is <=0 and return null [datafusion-comet]

2024-07-15 Thread via GitHub
PedroMDuarte commented on PR #506: URL: https://github.com/apache/datafusion-comet/pull/506#issuecomment-2229258745 thanks for the guidance here, really appreciate you all taking the time to show me how to contribute this fix. -- This is an automated message from the Apache Git Service.

[PR] Alamb/rename field [datafusion]

2024-07-15 Thread via GitHub
alamb opened a new pull request, #11478: URL: https://github.com/apache/datafusion/pull/11478 Draft until https://github.com/apache/datafusion/pull/11440 merges ## Which issue does this PR close? None ## Rationale for this change While reviewing https://github.com/apac

Re: [PR] Introduce user defined SQL planner API [datafusion]

2024-07-15 Thread via GitHub
wjones127 commented on code in PR #11180: URL: https://github.com/apache/datafusion/pull/11180#discussion_r1678307891 ## datafusion/sql/src/expr/mod.rs: ## @@ -341,7 +278,17 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { } }; -

[I] ExprPlanner not propagated to SqlToRel [datafusion]

2024-07-15 Thread via GitHub
wjones127 opened a new issue, #11477: URL: https://github.com/apache/datafusion/issues/11477 ### Describe the bug When upgrading to DataFusion 40, I kept on encountering this error: ``` Internal error: Expected a simplified result, but none was found. ``` This happe

Re: [PR] fix: Optimize some functions to rewrite dictionary-encoded strings [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove merged PR #627: URL: https://github.com/apache/datafusion-comet/pull/627 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@da

Re: [PR] fix: Optimize some functions to rewrite dictionary-encoded strings [datafusion-comet]

2024-07-15 Thread via GitHub
andygrove commented on PR #627: URL: https://github.com/apache/datafusion-comet/pull/627#issuecomment-2229237105 Thanks @vaibhawvipul and thanks for the review @parthchandra -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Unparser datetime to timestamp and interval [datafusion]

2024-07-15 Thread via GitHub
alamb commented on code in PR #11466: URL: https://github.com/apache/datafusion/pull/11466#discussion_r1678292019 ## datafusion/sql/src/unparser/dialect.rs: ## @@ -95,4 +145,76 @@ impl Dialect for CustomDialect { fn identifier_quote_style(&self, _: &str) -> Option {

Re: [PR] feat: support group by unnest [datafusion]

2024-07-15 Thread via GitHub
alamb commented on code in PR #11469: URL: https://github.com/apache/datafusion/pull/11469#discussion_r1678290570 ## datafusion/sqllogictest/test_files/unnest.slt: ## @@ -556,4 +554,126 @@ physical_plan 05)RepartitionExec: partitioning=RoundRobinBatch(4), input_partitio

Re: [PR] Fix parse `'1'::interval` as month by default [datafusion]

2024-07-15 Thread via GitHub
alamb commented on code in PR #11454: URL: https://github.com/apache/datafusion/pull/11454#discussion_r1678281210 ## datafusion/expr/src/columnar_value.rs: ## @@ -195,19 +195,39 @@ impl ColumnarValue { kernels::cast::cast_with_options(array, cast_type, &cast_op

Re: [PR] Fix parse `'1'::interval` as month by default [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11454: URL: https://github.com/apache/datafusion/pull/11454#issuecomment-2229201179 > Also where should I put the tests for this ? I recommend sqllogictests Here are the instructions: https://github.com/apache/datafusion/tree/main/datafusion/sqllogict

Re: [PR] feat: precompile literal regex pattern [datafusion]

2024-07-15 Thread via GitHub
alamb commented on code in PR #11455: URL: https://github.com/apache/datafusion/pull/11455#discussion_r1678277164 ## datafusion/physical-expr/src/expressions/binary.rs: ## @@ -41,20 +39,24 @@ use datafusion_expr::type_coercion::binary::get_result_type; use datafusion_expr::{Col

Re: [I] Add nullable in `StateFieldArgs` [datafusion]

2024-07-15 Thread via GitHub
jcsherin commented on issue #11433: URL: https://github.com/apache/datafusion/issues/11433#issuecomment-2229187055 This is a recent change introduce by #11093. https://github.com/apache/datafusion/blob/0965455486b7dcbd8c9a5efa8d2370ca5460bb9f/datafusion/functions-aggregate/src/nth_value.

Re: [PR] Minor: Make execute_input_stream Accessible for Any Sinking Operators [datafusion]

2024-07-15 Thread via GitHub
alamb commented on code in PR #11449: URL: https://github.com/apache/datafusion/pull/11449#discussion_r1678263132 ## datafusion/physical-plan/src/lib.rs: ## @@ -805,6 +805,97 @@ pub fn execute_stream_partitioned( Ok(streams) } +/// Executes an input stream and ensures th

Re: [PR] Enable `clone_on_ref_ptr` clippy lint on functions* [datafusion]

2024-07-15 Thread via GitHub
alamb merged PR #11468: URL: https://github.com/apache/datafusion/pull/11468 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] fix: `regexp_replace` fails when pattern or replacement is a scalar `NULL` [datafusion]

2024-07-15 Thread via GitHub
alamb merged PR #11459: URL: https://github.com/apache/datafusion/pull/11459 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] fix: `regexp_replace` fails when pattern or replacement is a scalar `NULL` [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11459: URL: https://github.com/apache/datafusion/pull/11459#issuecomment-2229173658 Thank you @Weijun-H and @jonahgao -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] regexp_replace fails when pattern or replacement is a scalar NULL [datafusion]

2024-07-15 Thread via GitHub
alamb closed issue #11410: regexp_replace fails when pattern or replacement is a scalar NULL URL: https://github.com/apache/datafusion/issues/11410 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] Remove element's nullability of array_agg function [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11447: URL: https://github.com/apache/datafusion/pull/11447#issuecomment-2229171437 Let's plan to merge this tomorrow to allow anyone else who may wish to review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] Docs: Document creating new extension APIs [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11425: URL: https://github.com/apache/datafusion/pull/11425#issuecomment-2229169995 I plan to merge this PR tomorrow unless anyone else would like additional time to review -- This is an automated message from the Apache Git Service. To respond to the message, pleas

Re: [I] Add nullable in `StateFieldArgs` [datafusion]

2024-07-15 Thread via GitHub
jcsherin commented on issue #11433: URL: https://github.com/apache/datafusion/issues/11433#issuecomment-2229152203 When adding trace statements in `nth_value` aggregate I can see that the following are executed in order: - `update_batch()` - `state()` - `merge_batch()` - `evaluat

Re: [PR] chore: Move protobuf files to separate crate [datafusion-comet]

2024-07-15 Thread via GitHub
viirya commented on code in PR #661: URL: https://github.com/apache/datafusion-comet/pull/661#discussion_r1678242523 ## native/proto/README.md: ## @@ -0,0 +1,23 @@ + + +# Apache DataFusion Comet: Intermediate Representation of Query Plan + +This crate contains code generated fro

Re: [PR] chore: Move protobuf files to separate crate [datafusion-comet]

2024-07-15 Thread via GitHub
viirya commented on code in PR #661: URL: https://github.com/apache/datafusion-comet/pull/661#discussion_r1678242025 ## native/proto/README.md: ## @@ -0,0 +1,23 @@ + + +# Apache DataFusion Comet: Intermediate Representation of Query Plan + +This crate contains code generated fro

Re: [PR] upgrade sqlparser 0.47 -> 0.48 [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11453: URL: https://github.com/apache/datafusion/pull/11453#issuecomment-2229123663 What I suggest we do is: 1. File a ticket to reduce the stack size more on debug builds somehow 2. Merge this PR -- This is an automated message from the Apache Git Serv

Re: [PR] upgrade sqlparser 0.47 -> 0.48 [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11453: URL: https://github.com/apache/datafusion/pull/11453#issuecomment-2229122607 🤔 the comments suggest this is not a new problem: https://github.com/apache/datafusion/blob/48a1754b33e983a8201ca3fefa36136fa44f0c55/datafusion/sql/src/expr/value.rs#L134-L135 --

Re: [I] Release DataFusion `40.0.0` [datafusion]

2024-07-15 Thread via GitHub
hveiga commented on issue #11077: URL: https://github.com/apache/datafusion/issues/11077#issuecomment-2229120630 Thanks @alamb ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [PR] upgrade sqlparser 0.47 -> 0.48 [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11453: URL: https://github.com/apache/datafusion/pull/11453#issuecomment-2229119318 ``` RUST_LOG=info cargo test --features=backtrace --test sqllogictests -- array.slt ``` Shows the issue appears to be in the handling of this query ```sql select

Re: [PR] upgrade sqlparser 0.47 -> 0.48 [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11453: URL: https://github.com/apache/datafusion/pull/11453#issuecomment-2229115590 Here is a smaller reproducer: ```shell cargo test --features=backtrace --test sqllogictests -- array.slt ``` -- This is an automated message from the Apache Git Service. To r

Re: [PR] upgrade sqlparser 0.47 -> 0.48 [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11453: URL: https://github.com/apache/datafusion/pull/11453#issuecomment-2229111742 I ran the program under lldb and here is the stack trace on my M3 mac ``` (lldb) target create "/Users/andrewlamb/Software/datafusion/target/debug/deps/sqllogictests-ac426

Re: [I] DataFusion weekly project plan (Andrew Lamb) - July 15, 2024 [datafusion]

2024-07-15 Thread via GitHub
alamb commented on issue #11474: URL: https://github.com/apache/datafusion/issues/11474#issuecomment-2229102024 Review Queue Arrow - [x] https://github.com/apache/arrow-rs/pull/6053 - [ ] https://github.com/apache/arrow-rs/pull/6046 - [ ] https://github.com/apache/arrow-rs/pul

Re: [PR] Refactor: more clearly delineate btwn writer options vs session configuration [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11444: URL: https://github.com/apache/datafusion/pull/11444#issuecomment-2229100270 Thanks @wiedld A suggestion that I think will help speed up reviews / merging is to try and keep the PRs smaller (like for example make one PR to add docs, and another PR to a

Re: [I] sql_compound_identifier_to_expr [datafusion]

2024-07-15 Thread via GitHub
alamb commented on issue #11473: URL: https://github.com/apache/datafusion/issues/11473#issuecomment-2229095759 Sorry @dharanad -- I didn't see https://github.com/apache/datafusion/issues/11244 before -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [I] [Epic] Extract catalog functionality from the core to make it more modular [datafusion]

2024-07-15 Thread via GitHub
alamb commented on issue #10782: URL: https://github.com/apache/datafusion/issues/10782#issuecomment-2229094411 Update: https://github.com/apache/datafusion/pull/11403 has been merged and I think building SessionState is much nicer now -- This is an automated message from the Apache Git S

Re: [PR] Return scalar result when all inputs are constants in `map` and `make_map` [datafusion]

2024-07-15 Thread via GitHub
alamb commented on PR #11461: URL: https://github.com/apache/datafusion/pull/11461#issuecomment-2229092662 Thank you @Rachelint @dharanad @goldmedal and @jayzhan211 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

  1   2   >