Re: [I] How do I bring dependencies in my binding? [datafusion-python]

2024-06-22 Thread via GitHub
Michael-J-Ward commented on issue #737: URL: https://github.com/apache/datafusion-python/issues/737#issuecomment-2182977368 I haven't previously done what you're trying to do, so a minimum github repo to reproduce would be helpful. This error is the python runtime trying to import `d

Re: [I] How do I bring dependencies in my binding? [datafusion-python]

2024-06-22 Thread via GitHub
Michael-J-Ward commented on issue #737: URL: https://github.com/apache/datafusion-python/issues/737#issuecomment-2182988044 I haven't done any digging to see why the code is like this, but the end-result is that you probably will need `datafusion` as a `python` dependency. https://g

Re: [PR] Better CSE identifier [datafusion]

2024-06-22 Thread via GitHub
peter-toth commented on PR #10473: URL: https://github.com/apache/datafusion/pull/10473#issuecomment-2182987502 This PR is more or less ready for review, tests are passing except for the MSRV. I focused only on the 3 performance improvements and deliberately kept the code as close to the

[PR] Queries for clickbench [datafusion]

2024-06-22 Thread via GitHub
gayyappan opened a new pull request, #11050: URL: https://github.com/apache/datafusion/pull/11050 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested?

Re: [PR] Queries for clickbench [datafusion]

2024-06-22 Thread via GitHub
gayyappan closed pull request #11050: Queries for clickbench URL: https://github.com/apache/datafusion/pull/11050 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [PR] Compute gcd with u64 instead of i64 because of overflows [datafusion]

2024-06-22 Thread via GitHub
jonahgao commented on PR #11036: URL: https://github.com/apache/datafusion/pull/11036#issuecomment-2182995239 > so i think the only way to fix that issue is to return a `u64`, which will alter the API. @LorrensP-2158466 How about returning an overflow error when the final result can

Re: [I] Complete Possible Join Type Handling for EmptyRelation Propagation Rule [datafusion]

2024-06-22 Thread via GitHub
LorrensP-2158466 commented on issue #10967: URL: https://github.com/apache/datafusion/issues/10967#issuecomment-2183468763 Alright, I've never done the rebase thing, it does sound nicer than 2 separate branches, I guess I can try it out now. You'll see the PR for cases 1, 2 and 3 tomorrow.

[I] Order of Interval Addition Should Affect Final Output [datafusion]

2024-06-22 Thread via GitHub
vbarua opened a new issue, #11055: URL: https://github.com/apache/datafusion/issues/11055 ### Describe the bug In various engines, the order in which intervals are added to dates can affect the final value. This is especially noticeable with leap years. Datafusion appears to co
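
A small sketch of why the order can matter. It assumes that adding a month clamps the day to the last day of the target month; that is an assumption about engine behavior, not something stated in the issue text above. The `Date` alias and all helper names are hypothetical and are not DataFusion APIs.

```rust
/// A calendar date as (year, month, day); a stand-in for a real date type.
type Date = (i32, u32, u32);

/// Gregorian leap-year rule.
fn is_leap(y: i32) -> bool {
    (y % 4 == 0 && y % 100 != 0) || y % 400 == 0
}

/// Number of days in month `m` (1..=12) of year `y`.
fn days_in_month(y: i32, m: u32) -> u32 {
    match m {
        4 | 6 | 9 | 11 => 30,
        2 => {
            if is_leap(y) {
                29
            } else {
                28
            }
        }
        _ => 31,
    }
}

/// Add one month, clamping the day to the end of the target month
/// (assumed semantics for this sketch).
fn add_month((y, m, d): Date) -> Date {
    let (y, m) = if m == 12 { (y + 1, 1) } else { (y, m + 1) };
    (y, m, d.min(days_in_month(y, m)))
}

/// Add one day, rolling over month and year boundaries.
fn add_day((y, m, d): Date) -> Date {
    if d < days_in_month(y, m) {
        (y, m, d + 1)
    } else if m == 12 {
        (y + 1, 1, 1)
    } else {
        (y, m + 1, 1)
    }
}
```

Under these assumptions, starting from `(2024, 1, 30)`, `add_day(add_month(..))` gives `(2024, 3, 1)` while `add_month(add_day(..))` gives `(2024, 2, 29)`, so the two orders disagree.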

[PR] chore: add test to show current behavior of `AT TIME ZONE` for string vs. timestamp [datafusion]

2024-06-22 Thread via GitHub
appletreeisyellow opened a new pull request, #11056: URL: https://github.com/apache/datafusion/pull/11056 ## Which issue does this PR close? Closes #. ## Rationale for this change Add test cases to show the current behavior of `AT TIME ZONE` for string vs.

[PR] handle overflow in gcd and return this as an error [datafusion]

2024-06-22 Thread via GitHub
LorrensP-2158466 opened a new pull request, #11057: URL: https://github.com/apache/datafusion/pull/11057 ## Which issue does this PR close? Closes #11053. ## Rationale for this change ## What changes are included in this PR? GCD and LCM functions can no
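
A minimal sketch of the idea discussed in PRs #11036 and #11057, not the code from either PR: compute the GCD on unsigned magnitudes so that `i64::MIN` does not overflow on negation, then return an error if the result does not fit back into `i64`. The names `checked_gcd`, `gcd_u64`, and `Overflow` are hypothetical.

```rust
use std::convert::TryFrom;

/// Error returned when the GCD does not fit in `i64`
/// (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct Overflow;

/// Euclid's algorithm on unsigned magnitudes.
fn gcd_u64(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a
}

/// GCD of two signed values. `unsigned_abs` maps `i64::MIN` to 2^63
/// without overflowing; the final conversion reports the one case
/// where the result cannot be represented as an `i64`.
pub fn checked_gcd(x: i64, y: i64) -> Result<i64, Overflow> {
    let g = gcd_u64(x.unsigned_abs(), y.unsigned_abs());
    i64::try_from(g).map_err(|_| Overflow)
}
```

With this sketch, `checked_gcd(i64::MIN, 0)` returns `Err(Overflow)` because the mathematical result, 2^63, exceeds `i64::MAX`, while `checked_gcd(i64::MIN, 6)` returns `Ok(2)`. The public signature stays on `i64`, which matches the suggestion of reporting an overflow error rather than changing the return type to `u64`.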

Re: [PR] Add `advanced_parquet_index.rs` example of index in into parquet files [datafusion]

2024-06-22 Thread via GitHub
alamb commented on code in PR #10701: URL: https://github.com/apache/datafusion/pull/10701#discussion_r1649429690 ## datafusion-examples/examples/advanced_parquet_index.rs: ## @@ -0,0 +1,662 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

Re: [PR] Boolean parquet get datapage stat [datafusion]

2024-06-22 Thread via GitHub
alamb commented on code in PR #11054: URL: https://github.com/apache/datafusion/pull/11054#discussion_r1649431236 ## datafusion/core/src/datasource/physical_plan/parquet/statistics.rs: ## @@ -613,6 +625,14 @@ macro_rules! get_data_page_statistics { ($stat_type_prefix: ident

Re: [PR] Boolean parquet get datapage stat [datafusion]

2024-06-22 Thread via GitHub
LorrensP-2158466 commented on code in PR #11054: URL: https://github.com/apache/datafusion/pull/11054#discussion_r1649435311 ## datafusion/core/src/datasource/physical_plan/parquet/statistics.rs: ## @@ -613,6 +625,14 @@ macro_rules! get_data_page_statistics { ($stat_type_pr

Re: [PR] feat: Implement more efficient version of xxhash64 [datafusion-comet]

2024-06-22 Thread via GitHub
parthchandra commented on code in PR #575: URL: https://github.com/apache/datafusion-comet/pull/575#discussion_r1649411618 ## core/src/execution/datafusion/expressions/xxhash64.rs: ## @@ -0,0 +1,186 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more co

Re: [PR] Support dictionary data type in array_to_string [datafusion]

2024-06-22 Thread via GitHub
alamb commented on code in PR #10908: URL: https://github.com/apache/datafusion/pull/10908#discussion_r1649449221 ## datafusion/functions-array/src/string.rs: ## @@ -281,6 +283,49 @@ pub(super) fn array_to_string_inner(args: &[ArrayRef]) -> Result { Ok(arg)

Re: [PR] fix: Improve error "BroadcastExchange is not supported" [datafusion-comet]

2024-06-22 Thread via GitHub
parthchandra commented on code in PR #577: URL: https://github.com/apache/datafusion-comet/pull/577#discussion_r1649457828 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -594,22 +630,41 @@ class CometSparkSessionExtensions if (isCo

[PR] build(deps): bump syn from 2.0.66 to 2.0.67 [datafusion-python]

2024-06-22 Thread via GitHub
dependabot[bot] opened a new pull request, #738: URL: https://github.com/apache/datafusion-python/pull/738 Bumps [syn](https://github.com/dtolnay/syn) from 2.0.66 to 2.0.67. Release notes Sourced from syn's releases (https://github.com/dtolnay/syn/releases). 2.0.67 Pr

[PR] build(deps): bump url from 2.5.0 to 2.5.2 [datafusion-python]

2024-06-22 Thread via GitHub
dependabot[bot] opened a new pull request, #739: URL: https://github.com/apache/datafusion-python/pull/739 Bumps [url](https://github.com/servo/rust-url) from 2.5.0 to 2.5.2. Commits 54346fa (https://github.com/servo/rust-url/commit/54346fa288e16b25b71c45149d7067c752b450e0) Rev

Re: [PR] build(deps): bump url from 2.5.0 to 2.5.1 [datafusion-python]

2024-06-22 Thread via GitHub
dependabot[bot] closed pull request #733: build(deps): bump url from 2.5.0 to 2.5.1 URL: https://github.com/apache/datafusion-python/pull/733

Re: [PR] build(deps): bump url from 2.5.0 to 2.5.1 [datafusion-python]

2024-06-22 Thread via GitHub
dependabot[bot] commented on PR #733: URL: https://github.com/apache/datafusion-python/pull/733#issuecomment-2184157014 Superseded by #739.

Re: [PR] feat: Create datafusion-distributed crate with shuffle reader/writer [datafusion]

2024-06-22 Thread via GitHub
thinkharderdev commented on PR #11070: URL: https://github.com/apache/datafusion/pull/11070#issuecomment-2184161637 > @thinkharderdev @Dandandan @avantgardnerio Just fyi and wanted to get your opinion on whether this is useful for you This seems like a good idea although I'm not sure

Re: [PR] feat: Create datafusion-distributed crate with shuffle reader/writer [datafusion]

2024-06-22 Thread via GitHub
andygrove commented on PR #11070: URL: https://github.com/apache/datafusion/pull/11070#issuecomment-2184166286 > Edit: Adding the distributed scheduler to this crate would be great though and something we'd definitely be interested in using and contributing to, especially if it can abstrac

Re: [PR] build(deps): bump syn from 2.0.66 to 2.0.67 [datafusion-python]

2024-06-22 Thread via GitHub
viirya merged PR #738: URL: https://github.com/apache/datafusion-python/pull/738

Re: [PR] Implement min/max for interval types [datafusion]

2024-06-22 Thread via GitHub
maxburke commented on code in PR #11015: URL: https://github.com/apache/datafusion/pull/11015#discussion_r1649777954 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -1785,29 +1785,42 @@ select min(t), max(t) from (select '00:00:00' as t union select '00:00:01' unio

Re: [PR] Migrate more code from `Expr::to_columns` to `Expr::column_refs` [datafusion]

2024-06-22 Thread via GitHub
comphead commented on code in PR #11067: URL: https://github.com/apache/datafusion/pull/11067#discussion_r1649769840 ## datafusion/optimizer/src/utils.rs: ## @@ -66,6 +66,16 @@ pub fn optimize_children( } } +/// Returns true if all columns in col_refs are in `schema_cols

Re: [PR] Convert Average to UDAF #10942 [datafusion]

2024-06-22 Thread via GitHub
dharanad commented on code in PR #10964: URL: https://github.com/apache/datafusion/pull/10964#discussion_r1649703615 ## datafusion/proto/tests/cases/roundtrip_physical_plan.rs: ## @@ -281,17 +283,6 @@ fn roundtrip_window() -> Result<()> { Arc::new(window_frame), ))