Re: [PR] chore: use `NullBuffer::union` for Spark `concat` [datafusion]

2025-10-15 Thread via GitHub
Jefffrey commented on code in PR #18087: URL: https://github.com/apache/datafusion/pull/18087#discussion_r2434783947 ## datafusion/spark/src/function/string/concat.rs: ## @@ -122,13 +127,13 @@ fn spark_concat(args: ScalarFunctionArgs) -> Result { apply_null_mask(result, nu

Re: [PR] Add independent configs for topk/join dynamic filter [datafusion]

2025-10-15 Thread via GitHub
2010YOUY01 commented on code in PR #18090: URL: https://github.com/apache/datafusion/pull/18090#discussion_r2434598480 ## docs/source/user-guide/configs.md: ## @@ -132,7 +132,9 @@ The following configuration settings are available: | datafusion.optimizer.enable_round_robin_repa

Re: [PR] Chore: Code hygiene - warn-numeric-widen [datafusion-comet]

2025-10-15 Thread via GitHub
codecov-commenter commented on PR #2588: URL: https://github.com/apache/datafusion-comet/pull/2588#issuecomment-3409142125 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/2588?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] chore: add stack overflow warning for Visitor and VisitorMut [datafusion-sqlparser-rs]

2025-10-15 Thread via GitHub
niebayes commented on code in PR #2068: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2068#discussion_r2434384608 ## src/ast/visitor.rs: ## @@ -124,6 +124,10 @@ visit_noop!(bigdecimal::BigDecimal); /// node and `post_visit_` methods are invoked after visiting all

Re: [I] Support smaller decimal types through SQL interface [datafusion]

2025-10-15 Thread via GitHub
AdamGS commented on issue #17747: URL: https://github.com/apache/datafusion/issues/17747#issuecomment-3401227536 There has been a bunch of work to improve decimal support (by me and others), but the SQL part remains open. My current understanding and thinking here is: 1. Supporting decim

Re: [PR] feat: Reuse existing file instead of reopening during shuffle write [datafusion-comet]

2025-10-15 Thread via GitHub
zuston commented on PR #2577: URL: https://github.com/apache/datafusion-comet/pull/2577#issuecomment-3408950394 > Also, could you provide more information about the `Rationale for this change`? updated. -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] fix: Potential data lost without flush on writeSortedFileNative [datafusion-comet]

2025-10-15 Thread via GitHub
zuston closed pull request #2578: fix: Potential data lost without flush on writeSortedFileNative URL: https://github.com/apache/datafusion-comet/pull/2578

Re: [PR] fix: Potential data lost without flush on writeSortedFileNative [datafusion-comet]

2025-10-15 Thread via GitHub
zuston commented on code in PR #2578: URL: https://github.com/apache/datafusion-comet/pull/2578#discussion_r2434401241 ## native/core/src/execution/shuffle/row.rs: ## @@ -839,6 +839,7 @@ pub fn process_sorted_row_partition( .open(&output_path)?; output_da

Re: [I] extended tests failures on main [datafusion]

2025-10-15 Thread via GitHub
comphead commented on issue #18084: URL: https://github.com/apache/datafusion/issues/18084#issuecomment-3408689000 @alamb I created https://github.com/apache/datafusion/issues/18088 to protect some critical core areas with `extended` suite.

Re: [PR] Short circuit complex case evaluation modes as soon as possible [datafusion]

2025-10-15 Thread via GitHub
pepijnve commented on PR #17898: URL: https://github.com/apache/datafusion/pull/17898#issuecomment-3408463654 I’ll have a look at the existing micro benchmarks tomorrow. Not sure if there’s anything in there already with sufficient branches that you would notice the impact.

Re: [I] Unexpected "type mismatch" when filtering bool list column using `array_distinct` and `make_array` [datafusion]

2025-10-15 Thread via GitHub
dqkqd commented on issue #17416: URL: https://github.com/apache/datafusion/issues/17416#issuecomment-3408396628 take

Re: [I] Release DataFusion `50.3.0` (minor) [datafusion]

2025-10-15 Thread via GitHub
andygrove commented on issue #18072: URL: https://github.com/apache/datafusion/issues/18072#issuecomment-3408376240 https://github.com/apache/datafusion/pull/18013 would be nice to have

Re: [PR] Short circuit complex case evaluation modes as soon as possible [datafusion]

2025-10-15 Thread via GitHub
alamb commented on PR #17898: URL: https://github.com/apache/datafusion/pull/17898#issuecomment-3408228378 🤖 `./gh_compare_branch_bench.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh) Running Linux aal-dev 6.14.0-1016-gcp #17~

Re: [I] Single File Per ParquetExec, AvroExec, etc... [datafusion]

2025-10-15 Thread via GitHub
alamb commented on issue #2293: URL: https://github.com/apache/datafusion/issues/2293#issuecomment-3408193200 I believe this was completed in the great `DataSource` extraction, so closing this out https://docs.rs/datafusion/latest/datafusion/datasource/source/trait.DataSource.html

Re: [I] ListingTable provider does not prune partitions when no filters are supplied [datafusion]

2025-10-15 Thread via GitHub
alamb closed issue #17957: ListingTable provider does not prune partitions when no filters are supplied URL: https://github.com/apache/datafusion/issues/17957

Re: [PR] Enable placeholders with extension types [datafusion]

2025-10-15 Thread via GitHub
paleolimbot commented on PR #17986: URL: https://github.com/apache/datafusion/pull/17986#issuecomment-3407895662 Thank you...I'd love that! I just need to finish up a few tests...I didn't quite get there today but I'm confident I can get this ready for eyes tomorrow.

Re: [PR] chore: [branch-0.11] Bump version to 0.11.0 [datafusion-comet]

2025-10-15 Thread via GitHub
andygrove commented on code in PR #2583: URL: https://github.com/apache/datafusion-comet/pull/2583#discussion_r2433738233 ## docs/source/contributor-guide/benchmarking_aws_ec2.md: ## @@ -104,7 +104,7 @@ make release Set `COMET_JAR` environment variable. ```shell -export COM

Re: [PR] chore: [branch-0.11] Bump version to 0.11.0 [datafusion-comet]

2025-10-15 Thread via GitHub
codecov-commenter commented on PR #2583: URL: https://github.com/apache/datafusion-comet/pull/2583#issuecomment-3407977636 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/2583?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

[I] Convert `ExecutionOptions::time_zone` to `Option` [datafusion]

2025-10-15 Thread via GitHub
Weijun-H opened a new issue, #18081: URL: https://github.com/apache/datafusion/issues/18081 ### Is your feature request related to a problem or challenge? We may want to update the doc for ExecutionOptions::time_zone slightly as well, as currently it's very targeted at 'Extract'.

Re: [I] Config changes related to on-heap / testing [datafusion-comet]

2025-10-15 Thread via GitHub
andygrove commented on issue #2569: URL: https://github.com/apache/datafusion-comet/issues/2569#issuecomment-3407840495 This was implemented in https://github.com/apache/datafusion-comet/pull/2538

Re: [I] Config changes related to on-heap / testing [datafusion-comet]

2025-10-15 Thread via GitHub
andygrove closed issue #2569: Config changes related to on-heap / testing URL: https://github.com/apache/datafusion-comet/issues/2569

[PR] refactor: move ListingTable over to the catalog-listing-table crate [datafusion]

2025-10-15 Thread via GitHub
timsaucer opened a new pull request, #18080: URL: https://github.com/apache/datafusion/pull/18080 ## Which issue does this PR close? - This addresses part of https://github.com/apache/datafusion/issues/17713 ## Rationale for this change In order to remove the `datafusion`

[I] [Substrait] Planning errors: Missing column with multiple consecutive joins [datafusion]

2025-10-15 Thread via GitHub
hareshkh opened a new issue, #18079: URL: https://github.com/apache/datafusion/issues/18079 ### Describe the bug We get the following error: ``` The left or right side of the join does not have all columns on \"on\": \nMissing on the left: {Column { name: \"target_unit:1\", inde

Re: [I] Expand use of sql parsing string expressions in DataFrame [datafusion-python]

2025-10-15 Thread via GitHub
milenkovicm commented on issue #1278: URL: https://github.com/apache/datafusion-python/issues/1278#issuecomment-3407425098 I have no strong opinion, Tim. I'm split between keeping it as it is and changing it.

Re: [I] Release 50.1.0 [datafusion-python]

2025-10-15 Thread via GitHub
milenkovicm commented on issue #1279: URL: https://github.com/apache/datafusion-python/issues/1279#issuecomment-3407417672 Would it be possible to get DF.parse_to_sql released in 50.1? We can time the others for later.

Re: [I] Release 50.1.0 [datafusion-python]

2025-10-15 Thread via GitHub
timsaucer commented on issue #1279: URL: https://github.com/apache/datafusion-python/issues/1279#issuecomment-3407405807 cc @milenkovicm in case you feel strongly about timing of the SQL parsing features

Re: [PR] Respect execution timezone in `to_timestamp` and related functions [datafusion]

2025-10-15 Thread via GitHub
Omega359 commented on code in PR #18025: URL: https://github.com/apache/datafusion/pull/18025#discussion_r2433324044 ## datafusion/functions/src/datetime/common.rs: ## @@ -42,6 +44,167 @@ pub(crate) fn string_to_timestamp_nanos_shim(s: &str) -> Result { string_to_timestamp

Re: [PR] Respect execution timezone in `to_timestamp` and related functions [datafusion]

2025-10-15 Thread via GitHub
Omega359 commented on code in PR #18025: URL: https://github.com/apache/datafusion/pull/18025#discussion_r2433317581 ## datafusion/functions/src/datetime/common.rs: ## @@ -42,6 +44,167 @@ pub(crate) fn string_to_timestamp_nanos_shim(s: &str) -> Result { string_to_timestamp

Re: [I] LimitPushPastWindows returns incorrect results for queries with `lead()` [datafusion]

2025-10-15 Thread via GitHub
alamb commented on issue #18028: URL: https://github.com/apache/datafusion/issues/18028#issuecomment-3407274997 We are hoping to include this in 50.3.0: - https://github.com/apache/datafusion/issues/18072

Re: [PR] docs: Documentation updates [datafusion-comet]

2025-10-15 Thread via GitHub
codecov-commenter commented on PR #2581: URL: https://github.com/apache/datafusion-comet/pull/2581#issuecomment-3407351514 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/2581?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] docs: Documentation updates [datafusion-comet]

2025-10-15 Thread via GitHub
comphead commented on code in PR #2581: URL: https://github.com/apache/datafusion-comet/pull/2581#discussion_r2433246439 ## docs/source/user-guide/latest/compatibility.md: ## @@ -83,9 +83,6 @@ The `native_datafusion` scan has some additional limitations: Comet will fall back to

Re: [PR] Fix bug in LimitPushPastWindows [datafusion]

2025-10-15 Thread via GitHub
alamb commented on PR #18029: URL: https://github.com/apache/datafusion/pull/18029#issuecomment-3407278255 @hareshkh has created a ticket to track the 50.3.0 release: - https://github.com/apache/datafusion/issues/18072 @avantgardnerio can you please create a backport PR to branch-5

Re: [I] [EPIC] Complete `datafusion-spark` Spark Compatible Functions [datafusion]

2025-10-15 Thread via GitHub
alamb commented on issue #15914: URL: https://github.com/apache/datafusion/issues/15914#issuecomment-3407253577 For anyone else following along, something that would be super helpful would be to find existing PRs for porting spark functions into `datafusion-spark` and help ensure they are a

Re: [PR] fix: Use dynamic timezone in now() function for accurate timestamp [datafusion]

2025-10-15 Thread via GitHub
alamb commented on code in PR #18017: URL: https://github.com/apache/datafusion/pull/18017#discussion_r2433180096 ## datafusion/functions/src/datetime/now.rs: ## @@ -54,6 +57,15 @@ impl NowFunc { Self { signature: Signature::nullary(Volatility::Stable),

Re: [PR] fix: Use dynamic timezone in now() function for accurate timestamp [datafusion]

2025-10-15 Thread via GitHub
Omega359 commented on code in PR #18017: URL: https://github.com/apache/datafusion/pull/18017#discussion_r2433182045 ## datafusion/functions/src/datetime/now.rs: ## @@ -54,6 +57,15 @@ impl NowFunc { Self { signature: Signature::nullary(Volatility::Stable),

Re: [PR] Feat: Make current_time aware of execution timezone. [datafusion]

2025-10-15 Thread via GitHub
comphead commented on code in PR #18040: URL: https://github.com/apache/datafusion/pull/18040#discussion_r2433125632 ## datafusion/sqllogictest/test_files/current_time_timezone.slt: ## @@ -0,0 +1,100 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more con

[PR] docs: Documentation updates [WIP] [datafusion-comet]

2025-10-15 Thread via GitHub
andygrove opened a new pull request, #2581: URL: https://github.com/apache/datafusion-comet/pull/2581 ## Which issue does this PR close? Closes #. ## Rationale for this change Preparing for 0.11.0 release. ## What changes are included in this PR?

[I] CI is failing on main [datafusion]

2025-10-15 Thread via GitHub
alamb opened a new issue, #18062: URL: https://github.com/apache/datafusion/issues/18062 ### Describe the bug An example failure: https://github.com/apache/datafusion/actions/runs/18508018377/job/52742095719 ``` running 1 test test engines::conversion::tests::test_b

[I] Documentation site rendering issue [datafusion-comet]

2025-10-15 Thread via GitHub
andygrove opened a new issue, #2580: URL: https://github.com/apache/datafusion-comet/issues/2580 ### Describe the bug The right-hand navigation menu is getting in the way of the content. https://github.com/user-attachments/assets/6752d05e-b0d6-4ac1-9e23-6668bca1e4a4

[I] Relation Planner Extension API [datafusion]

2025-10-15 Thread via GitHub
geoffreyclaude opened a new issue, #18078: URL: https://github.com/apache/datafusion/issues/18078 ### Is your feature request related to a problem or challenge? _No response_ ### Describe the solution you'd like _No response_ ### Describe alternatives you've consid

Re: [I] Expand use of sql parsing string expressions in DataFrame [datafusion-python]

2025-10-15 Thread via GitHub
milenkovicm commented on issue #1278: URL: https://github.com/apache/datafusion-python/issues/1278#issuecomment-3406712056 Have you changed your mind about select, Tim? I'm stuck for the next few days but may help if needed

Re: [PR] refactor: add dialect enum [datafusion]

2025-10-15 Thread via GitHub
dariocurr commented on code in PR #18043: URL: https://github.com/apache/datafusion/pull/18043#discussion_r2432299275 ## datafusion/common/src/config.rs: ## @@ -292,6 +292,88 @@ config_namespace! { } } +#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)] Review Commen

Re: [PR] Project RecordBatch before evaluating `case` [datafusion]

2025-10-15 Thread via GitHub
alamb commented on PR #18055: URL: https://github.com/apache/datafusion/pull/18055#issuecomment-3406578159 > I already need the schema anyway in order to decide if it makes sense to project or not. One simple solution is to just keep a reference to that one. But things get a bit weird when

Re: [I] Expand use of sql parsing string expressions in DataFrame [datafusion-python]

2025-10-15 Thread via GitHub
timsaucer commented on issue #1278: URL: https://github.com/apache/datafusion-python/issues/1278#issuecomment-3406551401 cc @milenkovicm @K-dash

Re: [PR] Project RecordBatch before evaluating `case` [datafusion]

2025-10-15 Thread via GitHub
pepijnve commented on PR #18055: URL: https://github.com/apache/datafusion/pull/18055#issuecomment-3406549307 > There is similar code for filtering here (namely that evaluates the filter expression first, and then only calls `filter` with columns that are needed) This touches on one

[PR] chore(deps): bump taiki-e/install-action from 2.61.8 to 2.62.30 [datafusion-sandbox]

2025-10-15 Thread via GitHub
dependabot[bot] opened a new pull request, #35: URL: https://github.com/apache/datafusion-sandbox/pull/35 Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.61.8 to 2.62.30. Release notes Sourced from https://github.com/taiki-e/install-action/releases

Re: [PR] chore(deps): bump taiki-e/install-action from 2.61.8 to 2.62.30 [datafusion-sandbox]

2025-10-15 Thread via GitHub
dependabot[bot] commented on PR #35: URL: https://github.com/apache/datafusion-sandbox/pull/35#issuecomment-340596 ### Labels The following labels could not be found: `auto-dependencies`. Please create it before Dependabot can add it to a pull request. Please fix the a

Re: [PR] Added support for MATCH syntax and unified column option ForeignKey [datafusion-sqlparser-rs]

2025-10-15 Thread via GitHub
iffyio merged PR #2062: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/2062

Re: [I] Use an enum to express the nullability of an array [datafusion]

2025-10-15 Thread via GitHub
alamb closed issue #18047: Use an enum to express the nullability of an array URL: https://github.com/apache/datafusion/issues/18047

Re: [PR] chore: Use an enum to express the different kinds of nullability in an array [datafusion]

2025-10-15 Thread via GitHub
alamb merged PR #18048: URL: https://github.com/apache/datafusion/pull/18048

Re: [I] Make ClickBench Q23 Go Faster [datafusion]

2025-10-15 Thread via GitHub
alamb commented on issue #15177: URL: https://github.com/apache/datafusion/issues/15177#issuecomment-3405759427 YES!

[I] Add different configs for topk/join dynamic filter [datafusion]

2025-10-15 Thread via GitHub
xudong963 opened a new issue, #18071: URL: https://github.com/apache/datafusion/issues/18071 Currently, we use the `enable_dynamic_filter_pushdown` config to control topk and join dynamic filter iiuc. I think it'll be better to give users more flexibility to choose which dynamic filt

Re: [PR] feat: add temporary view option for into_view [datafusion-python]

2025-10-15 Thread via GitHub
timsaucer merged PR #1267: URL: https://github.com/apache/datafusion-python/pull/1267

Re: [PR] feat: `ClassicJoin` for PWMJ [datafusion]

2025-10-15 Thread via GitHub
2010YOUY01 commented on PR #17482: URL: https://github.com/apache/datafusion/pull/17482#issuecomment-3405108110 The sqlite tests are passing 1. manually change the `enable_piesewise_merge_join` default to `true` 2. run `INCLUDE_SQLITE=true cargo test --profile release-nonlto --test sql

Re: [PR] chore(deps): bump taiki-e/install-action from 2.61.8 to 2.62.28 [datafusion-sandbox]

2025-10-15 Thread via GitHub
dependabot[bot] commented on PR #32: URL: https://github.com/apache/datafusion-sandbox/pull/32#issuecomment-3401433513 Superseded by #33.

Re: [I] Add defensive debug asserts in simplify to ensure schema doesn't change [datafusion]

2025-10-15 Thread via GitHub
EeshanBembi commented on issue #18001: URL: https://github.com/apache/datafusion/issues/18001#issuecomment-3405031849 I've investigated the codebase and identified where the defensive debug assertions should be added. The key location is in https://github.com/apache/datafusion/blob/mai

Re: [PR] feat: Reuse existing file instead of reopening during shuffle write [datafusion-comet]

2025-10-15 Thread via GitHub
codecov-commenter commented on PR #2577: URL: https://github.com/apache/datafusion-comet/pull/2577#issuecomment-3404932696 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/2577?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca