Re: [PR] Add datafusion.extract [datafusion-python]

2024-11-25 Thread via GitHub
kosiew closed pull request #958: Add datafusion.extract URL: https://github.com/apache/datafusion-python/pull/958 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [PR] Fix omitted predicate in parquet reading [datafusion]

2024-11-25 Thread via GitHub
Sevenannn closed pull request #13120: Fix omitted predicate in parquet reading URL: https://github.com/apache/datafusion/pull/13120

Re: [PR] Enhance the nested type access for Generic and DuckDB dialect [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
iffyio commented on code in PR #1541: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1541#discussion_r1857813412 ## src/parser/mod.rs: ## @@ -2935,12 +2935,23 @@ impl<'a> Parser<'a> { }) } else if Token::LBracket == tok { if diale

Re: [PR] [minor] Update Doc of required_indices.rs [datafusion]

2024-11-25 Thread via GitHub
akurmustafa merged PR #13555: URL: https://github.com/apache/datafusion/pull/13555

Re: [PR] feat: Add GroupColumn `Decimal128Array` [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 commented on code in PR #13564: URL: https://github.com/apache/datafusion/pull/13564#discussion_r1857786838 ## datafusion/physical-plan/src/aggregates/group_values/multi_group_by/primitive.rs: ## @@ -190,9 +191,13 @@ impl GroupColumn assert!(nulls.is_non

Re: [PR] ScalarUDFImpl invoke improvements [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 closed pull request #13507: ScalarUDFImpl invoke improvements URL: https://github.com/apache/datafusion/pull/13507

Re: [PR] Enhance the nested type access for Generic and DuckDB dialect [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
goldmedal commented on code in PR #1541: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1541#discussion_r1857742702 ## src/parser/mod.rs: ## @@ -2935,12 +2935,23 @@ impl<'a> Parser<'a> { }) } else if Token::LBracket == tok { if di

[PR] feat: Add GroupColumn `Decimal128Array` [datafusion]

2024-11-25 Thread via GitHub
jonathanc-n opened a new pull request, #13564: URL: https://github.com/apache/datafusion/pull/13564 ## Which issue does this PR close? Closes #13505. ## Rationale for this change ## What changes are included in this PR? Added group column for `Decimal12

Re: [PR] [minor]: Update median implementation [datafusion]

2024-11-25 Thread via GitHub
akurmustafa commented on code in PR #13554: URL: https://github.com/apache/datafusion/pull/13554#discussion_r1857688000 ## datafusion/functions-aggregate/src/median.rs: ## @@ -310,6 +310,18 @@ impl Accumulator for DistinctMedianAccumulator { } } +/// Get maximum entry i

Re: [PR] Add generate_series() udtf (and introduce 'lazy' `MemoryExec`) [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 commented on code in PR #13540: URL: https://github.com/apache/datafusion/pull/13540#discussion_r1857677256 ## datafusion/physical-plan/src/memory.rs: ## @@ -365,8 +365,165 @@ impl RecordBatchStream for MemoryStream { } } +pub trait StreamingBatchGenerator: Se

Re: [PR] Support unparsing plans after applying `optimize_projections` rule [datafusion]

2024-11-25 Thread via GitHub
sgrebnov closed pull request #13267: Support unparsing plans after applying `optimize_projections` rule URL: https://github.com/apache/datafusion/pull/13267

Re: [PR] Support unparsing plans after applying `optimize_projections` rule [datafusion]

2024-11-25 Thread via GitHub
sgrebnov commented on PR #13267: URL: https://github.com/apache/datafusion/pull/13267#issuecomment-2499506539 @alamb - makes sense, thank you. I'm going to close this PR as I don't think it makes sense to add this to DF, instead I'll create separate PR with just unparser improvement and wil

Re: [PR] Replace `OnceLock` with `LazyLock`, update MSRV to 1.80 [datafusion]

2024-11-25 Thread via GitHub
github-actions[bot] closed pull request #11690: Replace `OnceLock` with `LazyLock`, update MSRV to 1.80 URL: https://github.com/apache/datafusion/pull/11690

Re: [I] [EPIC] Improved aggregate function performance [datafusion]

2024-11-25 Thread via GitHub
Rachelint commented on issue #13548: URL: https://github.com/apache/datafusion/issues/13548#issuecomment-2499442858 > @Dandandan - thanks to the great work by @SemyonSinchenko, it's easy to generate these datasets with [falsa](https://github.com/mrpowers-io/falsa). > > Here's the comm

Re: [I] [EPIC] Improved aggregate function performance [datafusion]

2024-11-25 Thread via GitHub
Rachelint commented on issue #13548: URL: https://github.com/apache/datafusion/issues/13548#issuecomment-2499442851 > @Dandandan - thanks to the great work by @SemyonSinchenko, it's easy to generate these datasets with [falsa](https://github.com/mrpowers-io/falsa). > > Here's the comm

Re: [PR] Update substrait requirement from 0.48 to 0.49 [datafusion]

2024-11-25 Thread via GitHub
jonahgao merged PR #13556: URL: https://github.com/apache/datafusion/pull/13556

Re: [I] [EPIC] Improved aggregate function performance [datafusion]

2024-11-25 Thread via GitHub
MrPowers commented on issue #13548: URL: https://github.com/apache/datafusion/issues/13548#issuecomment-2499396354 @Dandandan - thanks to the great work by @SemyonSinchenko, it's easy to generate these datasets with [falsa](https://github.com/mrpowers-io/falsa). Here's the command to

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 closed pull request #13404: feat: Add implicit casting to `TypeSignature::String` URL: https://github.com/apache/datafusion/pull/13404

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2499346571 > @jayzhan211 So should this PR be closed? Since there is no need for implicit casting in TypeSignature::String? I just worked on this because of your comment on implicit casting

Re: [PR] Move many udf implementations from `invoke` to `invoke_batch` [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 merged PR #13491: URL: https://github.com/apache/datafusion/pull/13491

Re: [PR] Move many udf implementations from `invoke` to `invoke_batch` [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 commented on PR #13491: URL: https://github.com/apache/datafusion/pull/13491#issuecomment-2499324628 Thanks @joseph-isaacs @alamb

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-25 Thread via GitHub
jonathanc-n commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2499318140 @jayzhan211 So should this PR be closed? Since there is no need for implicit casting in TypeSignature::String? I just worked on this because of your comment on implicit casting i

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2499306321 > this behaviour seems to have been deprecated in DuckDB. Did they deprecate?

Re: [I] [Proposal] String function data type handling requirements [datafusion]

2024-11-25 Thread via GitHub
jayzhan211 commented on issue #13552: URL: https://github.com/apache/datafusion/issues/13552#issuecomment-2499304530 Handling `Dict(_, String)` is no different from handling `String` > String functions MAY choose to allow non-contiguous data types for data arguments but it is NOT RECO

Re: [PR] updated maturin version and ccargo build to build yml [datafusion-ballista]

2024-11-25 Thread via GitHub
tbar4 commented on PR #1136: URL: https://github.com/apache/datafusion-ballista/pull/1136#issuecomment-2499297595 @andygrove I will try to get the rest of the jobs fixed tomorrow.

Re: [PR] feat: Support `Utf8View` for `get_wider_type` + `binary_to_string_coercion` functions [datafusion]

2024-11-25 Thread via GitHub
jonathanc-n commented on PR #13370: URL: https://github.com/apache/datafusion/pull/13370#issuecomment-2499260072 @alamb I believe this one is also waiting on the numeric to utf8view coercion PR in arrow-rs; correct me if I'm wrong, though.

Re: [I] Support data source sampling with TABLESAMPLE [datafusion]

2024-11-25 Thread via GitHub
theirix commented on issue #13563: URL: https://github.com/apache/datafusion/issues/13563#issuecomment-2499248049 take

Re: [I] Support data source sampling with TABLESAMPLE [datafusion]

2024-11-25 Thread via GitHub
theirix commented on issue #13563: URL: https://github.com/apache/datafusion/issues/13563#issuecomment-2499247966 Thank you for the initial analysis. I will first submit a PR to datafusion-sqlparser-rs with extended grammar.

Re: [PR] fix: Support `Utf8View` in `numeric_string_coercion` [datafusion]

2024-11-25 Thread via GitHub
jonathanc-n commented on PR #13366: URL: https://github.com/apache/datafusion/pull/13366#issuecomment-2499243060 @alamb I believe this is waiting on the numeric to utf8view pr for arrow-rs.

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-25 Thread via GitHub
jonathanc-n commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2499212644 @alamb @jayzhan211 I'm a bit confused with this pr, are we looking to support implicit casting with TypeSignature or not? As can be seen above, this behaviour seems to have been

Re: [PR] Support duplicate column aliases in queries [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13489: URL: https://github.com/apache/datafusion/pull/13489#issuecomment-2499131431 > Bugs happen and these are caught by ValidateDependenciesChecker (example failure [trinodb/trino#22806](https://github.com/trinodb/trino/issues/22806)). That error message is 😍

Re: [PR] Fixed imports in custom_datasource.rs example [datafusion]

2024-11-25 Thread via GitHub
alamb merged PR #13561: URL: https://github.com/apache/datafusion/pull/13561

Re: [PR] Move many udf implementations from `invoke` to `invoke_batch` [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13491: URL: https://github.com/apache/datafusion/pull/13491#issuecomment-2499120623 > I think there was a transient failure I retriggered the tests

Re: [PR] Handle alias when parsing sql(parse_sql_expr) [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #12939: URL: https://github.com/apache/datafusion/pull/12939#issuecomment-2499094525 Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look

[PR] updated maturin version and ccargo build to build yml [datafusion-ballista]

2024-11-25 Thread via GitHub
tbar4 opened a new pull request, #1136: URL: https://github.com/apache/datafusion-ballista/pull/1136 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes

Re: [PR] feat: Optimize `SortPreservingMergeExec` to avoid merging non-overlapping partitions [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13296: URL: https://github.com/apache/datafusion/pull/13296#issuecomment-2499107581 > > FYI an update here is that I don't think I am going to be able to work on Statistics for the next month or two. Though I think @mhilton from InfluxData was thinking of potentially

Re: [PR] fix: Support `Utf8View` in `numeric_string_coercion` [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13366: URL: https://github.com/apache/datafusion/pull/13366#issuecomment-2499104479 @jonathanc-n do you have some time to add some SLT tests to this PR as suggested above?

Re: [PR] feat: Support `Utf8View` for `get_wider_type` + `binary_to_string_coercion` functions [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13370: URL: https://github.com/apache/datafusion/pull/13370#issuecomment-2499103831 I am trying to clean up the review backlog -- what is the status of this PR? It seems like it is waiting on an end to end (e.g. sql or dataframe test) that actually has a proble

Re: [PR] Support duplicate column aliases in queries [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13489: URL: https://github.com/apache/datafusion/pull/13489#issuecomment-2499093201 > > PhysicalExpr columns use ordinal offsets and they don't seem to have generated too many debugging headaches > > a gin has been invoked - #13559 🤣 😭

Re: [PR] Fix join with sort push down [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13560: URL: https://github.com/apache/datafusion/pull/13560#issuecomment-2499091744 This appears to be code that was introduced in https://github.com/apache/datafusion/pull/12559 from @berkaysynnada @berkaysynnada is there any way you can help review this PR?

Re: [PR] feat: [substrait] support-try-cast [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13562: URL: https://github.com/apache/datafusion/pull/13562#issuecomment-2499088430 Thank you @eatthepear 🙏

Re: [PR] feat: [substrait] support-try-cast [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13562: URL: https://github.com/apache/datafusion/pull/13562#issuecomment-2499088149 FYI @vbarua @Blizzara

Re: [PR] support `json_object('k':'v')` in postgres [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
lovasoa commented on PR #1546: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1546#issuecomment-2499087257 Great ! https://github.com/apache/datafusion-sqlparser-rs/pull/1547 should now be mergeable upon rebase

Re: [PR] Added documentation for SortMergeJoin [datafusion]

2024-11-25 Thread via GitHub
alamb merged PR #13469: URL: https://github.com/apache/datafusion/pull/13469

Re: [PR] Added documentation for SortMergeJoin [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13469: URL: https://github.com/apache/datafusion/pull/13469#issuecomment-2499081979 I changed the PR description to say "part of" rather than closing https://github.com/apache/datafusion/issues/10357 Thanks @athultr1997 @comphead and @viirya

Re: [PR] fix: Ignore names of technical inner fields (of List and Map types) when comparing datatypes for logical equivalence [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13522: URL: https://github.com/apache/datafusion/pull/13522#issuecomment-2499080028 > I went ahead and added it in [b714d2e](https://github.com/apache/datafusion/commit/b714d2ee41ee28efabda177b02a2e998743796f9). FYI @alamb since you had already approved. Thank

Re: [I] Substrait consumer's schema check can fail for list (probs also map) columns [datafusion]

2024-11-25 Thread via GitHub
alamb closed issue #13437: Substrait consumer's schema check can fail for list (probs also map) columns URL: https://github.com/apache/datafusion/issues/13437

Re: [PR] fix: Ignore names of technical inner fields (of List and Map types) when comparing datatypes for logical equivalence [datafusion]

2024-11-25 Thread via GitHub
alamb merged PR #13522: URL: https://github.com/apache/datafusion/pull/13522

Re: [PR] Update tests and resolve TODOs after arrow update [datafusion]

2024-11-25 Thread via GitHub
alamb merged PR #13538: URL: https://github.com/apache/datafusion/pull/13538

Re: [PR] chore: Remove unused dependencies [datafusion]

2024-11-25 Thread via GitHub
alamb merged PR #13541: URL: https://github.com/apache/datafusion/pull/13541

Re: [I] Incorrect LIKE and ILIKE result for NULL input, `%', and '%%' pattern [datafusion]

2024-11-25 Thread via GitHub
alamb closed issue #12637: Incorrect LIKE and ILIKE result for NULL input, `%', and '%%' pattern URL: https://github.com/apache/datafusion/issues/12637

Re: [PR] chore: Remove unused dependencies [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13541: URL: https://github.com/apache/datafusion/pull/13541#issuecomment-2499078542 🚀 -- thank you @findepi

Re: [I] warning: variant `UTF8` is never constructed [datafusion]

2024-11-25 Thread via GitHub
alamb closed issue #13530: warning: variant `UTF8` is never constructed URL: https://github.com/apache/datafusion/issues/13530

Re: [PR] test: allow external_access_plan run on windows [datafusion]

2024-11-25 Thread via GitHub
alamb merged PR #13531: URL: https://github.com/apache/datafusion/pull/13531

Re: [PR] feat: Optimize `SortPreservingMergeExec` to avoid merging non-overlapping partitions [datafusion]

2024-11-25 Thread via GitHub
suremarc commented on PR #13296: URL: https://github.com/apache/datafusion/pull/13296#issuecomment-2499072825 > The implementation is really nice. I'm wondering is it convenient to move the stream concat logic into `StreamingMergeBuilder`, like > > ```rust > let result = StreamingM

Re: [PR] Support custom field metadata in UDF [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13458: URL: https://github.com/apache/datafusion/pull/13458#issuecomment-2499071332 > I think a test would be in order that would showcase why the new metadata method exists and what problem it solves. I agree.

Re: [PR] feat: Optimize `SortPreservingMergeExec` to avoid merging non-overlapping partitions [datafusion]

2024-11-25 Thread via GitHub
suremarc commented on PR #13296: URL: https://github.com/apache/datafusion/pull/13296#issuecomment-2499070756 > FYI an update here is that I don't think I am going to be able to work on Statistics for the next month or two. Though I think @mhilton from InfluxData was thinking of potentially

Re: [I] Release DataFusion `44.0.0` [datafusion]

2024-11-25 Thread via GitHub
alamb commented on issue #13334: URL: https://github.com/apache/datafusion/issues/13334#issuecomment-2499069355 @Omega359 and @andygrove suggested https://github.com/apache/datafusion/issues/13525#issuecomment-2496487538 that for this release we > we file tickets as part of every re

Re: [I] [DISCUSSION] Making it easier to use DataFusion (lessons from GlareDB) [datafusion]

2024-11-25 Thread via GitHub
alamb commented on issue #13525: URL: https://github.com/apache/datafusion/issues/13525#issuecomment-2499064530 > It's not really captured by semver, but it would be nice if there were a distinction between breaking changes that simply require fixing compilation errors in a straightforward

Re: [I] [Proposal] String function data type handling requirements [datafusion]

2024-11-25 Thread via GitHub
alamb commented on issue #13552: URL: https://github.com/apache/datafusion/issues/13552#issuecomment-2499059230 FYI @findepi and @jayzhan211 in case you have opinions in this area

Re: [I] [Proposal] String function data type handling requirements [datafusion]

2024-11-25 Thread via GitHub
alamb commented on issue #13552: URL: https://github.com/apache/datafusion/issues/13552#issuecomment-2499058829 > String functions MUST accept scalar values for all config arguments but MAY accept both scalar and array if appropriate for the function. is a "config" argument well defin

Re: [PR] Add csv loading benchmarks. [datafusion]

2024-11-25 Thread via GitHub
alamb commented on code in PR #13544: URL: https://github.com/apache/datafusion/pull/13544#discussion_r1857348347 ## benchmarks/src/csv/load.rs: ## @@ -0,0 +1,75 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See th

Re: [PR] feat: Add ConfigOptions to ScalarFunctionArgs [datafusion]

2024-11-25 Thread via GitHub
alamb commented on PR #13527: URL: https://github.com/apache/datafusion/pull/13527#issuecomment-2499032455 Yeah, it is a tricky one for sure

Re: [I] [EPIC] Improved aggregate function performance [datafusion]

2024-11-25 Thread via GitHub
alamb commented on issue #13548: URL: https://github.com/apache/datafusion/issues/13548#issuecomment-2499030684 I would also expect this to help: - https://github.com/apache/datafusion/pull/11627

[PR] on condition requirement for join [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
demetribu opened a new pull request, #1552: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1552 Closes: #1550

Re: [PR] support `json_object('k':'v')` in postgres [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
alamb commented on PR #1546: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1546#issuecomment-2499024746 🚀 -- thanks again

Re: [PR] support `json_object('k':'v')` in postgres [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
alamb merged PR #1546: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1546

Re: [PR] Fallback to identifier parsing if expression parsing fails [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
alamb merged PR #1513: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1513

Re: [PR] Fallback to identifier parsing if expression parsing fails [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
alamb commented on PR #1513: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1513#issuecomment-2499023423 Thanks @yoavcloud

Re: [I] Support data source sampling with TABLESAMPLE [datafusion]

2024-11-25 Thread via GitHub
alamb commented on issue #13563: URL: https://github.com/apache/datafusion/issues/13563#issuecomment-2498989436 I looked around in sqlparser-rs briefly and it seems this syntax is not yet supported https://github.com/search?q=repo%3Aapache%2Fdatafusion-sqlparser-rs%20tablesample&type

Re: [PR] fix: [comet-parquet-exec] Use RDD partition index [datafusion-comet]

2024-11-25 Thread via GitHub
viirya commented on code in PR #1120: URL: https://github.com/apache/datafusion-comet/pull/1120#discussion_r1857303548 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -1029,12 +1029,20 @@ class CometSparkSessionExtensions var firstNativ

[I] Can get wrong results when querying Delta tables [datafusion-comet]

2024-11-25 Thread via GitHub
Kimahriman opened a new issue, #1121: URL: https://github.com/apache/datafusion-comet/issues/1121 ### Describe the bug Because Delta scans work by using a subclass of `ParquetFileFormat` within a normal Hadoop relation, Comet will see this and simply replace it with a `CometParquetFi

Re: [I] `with_checksum_algorithm` fails when used in with `CsvWriter` on large files [datafusion]

2024-11-25 Thread via GitHub
avantgardnerio commented on issue #13528: URL: https://github.com/apache/datafusion/issues/13528#issuecomment-2498786083 Confirmed this is an issue in `object_store`, filing a new issue there.

Re: [I] Filters on `RANDOM()` are applied incorrectly when pushdown_filters is enabled. [datafusion]

2024-11-25 Thread via GitHub
theirix commented on issue #13268: URL: https://github.com/apache/datafusion/issues/13268#issuecomment-2498941989 > I think it makes sense to add TABLESAMPLE. @theirix would you want to create an issue about this? fwiw here is Trino docs on the topic: https://trino.io/docs/current/sql/selec

[I] Support data source sampling with TABLESAMPLE [datafusion]

2024-11-25 Thread via GitHub
theirix opened a new issue, #13563: URL: https://github.com/apache/datafusion/issues/13563 ### Is your feature request related to a problem or challenge? It is helpful to have sampling support for queries to ease the exploration of data. ### Describe the solution you'd like

Re: [I] Fix Issue with maturin Cargo.lock file [datafusion-ballista]

2024-11-25 Thread via GitHub
tbar4 commented on issue #1135: URL: https://github.com/apache/datafusion-ballista/issues/1135#issuecomment-2498929836 @andygrove just received the same error locally. Once I ran `cargo build` first, then `maturin develop` it cleared the issue. Will check the new build script for a cargo b

[I] Fix Issue with maturin Cargo.lock file [datafusion-ballista]

2024-11-25 Thread via GitHub
tbar4 opened a new issue, #1135: URL: https://github.com/apache/datafusion-ballista/issues/1135 **Describe the bug** ![image](https://github.com/user-attachments/assets/9b25614b-e1bf-4be2-bd86-1ad5595663e7) Issue in build pipeline found on #1134 **To Reproduce** Steps to rep

[I] fixed size list type is not retained when writing to parquet [datafusion-python]

2024-11-25 Thread via GitHub
matko opened a new issue, #957: URL: https://github.com/apache/datafusion-python/issues/957 When I create a parquet file from an arrow table with a fixed size array as one of the columns, then read back the resulting parquet, the column is no longer a fixed size array, but instead a dynamic

Re: [I] Filters on `RANDOM()` are applied incorrectly when pushdown_filters is enabled. [datafusion]

2024-11-25 Thread via GitHub
findepi commented on issue #13268: URL: https://github.com/apache/datafusion/issues/13268#issuecomment-2498864767 I think it makes sense to add TABLESAMPLE. @theirix would you want to create an issue about this? fwiw here is Trino docs on the topic: https://trino.io/docs/current/sql/sele

Re: [I] Referencing a column from `select` and `order by` clauses triggers duplicate expression error [datafusion]

2024-11-25 Thread via GitHub
findepi commented on issue #13558: URL: https://github.com/apache/datafusion/issues/13558#issuecomment-2498860833 @jatin510 what's your idea for the fix? fwiw i tried to fix this for select clauses (https://github.com/apache/datafusion/pull/13489), but ran into some fundamental issues wi

[PR] [substrait] support-try-cast [datafusion]

2024-11-25 Thread via GitHub
eatthepear opened a new pull request, #13562: URL: https://github.com/apache/datafusion/pull/13562 ## Which issue does this PR close? Closes #13419 . ## Rationale for this change See issue. ## What changes are included in this PR? ##

Re: [PR] Support duplicate column aliases in queries [datafusion]

2024-11-25 Thread via GitHub
findepi commented on PR #13489: URL: https://github.com/apache/datafusion/pull/13489#issuecomment-2498838693 > PhysicalExpr columns use ordinal offsets and they don't seem to have generated too many debugging headaches a gin has been invoked - https://github.com/apache/datafusion/issu

Re: [PR] Multipart signature issue [datafusion]

2024-11-25 Thread via GitHub
avantgardnerio closed pull request #13529: Multipart signature issue URL: https://github.com/apache/datafusion/pull/13529

Re: [I] `with_checksum_algorithm` fails when used in with `CsvWriter` on large files [datafusion]

2024-11-25 Thread via GitHub
avantgardnerio closed issue #13528: `with_checksum_algorithm` fails when used in with `CsvWriter` on large files URL: https://github.com/apache/datafusion/issues/13528

Re: [PR] fix: Ignore names of technical inner fields (of List and Map types) when comparing datatypes for logical equivalence [datafusion]

2024-11-25 Thread via GitHub
Blizzara commented on PR #13522: URL: https://github.com/apache/datafusion/pull/13522#issuecomment-2498764862 > Whichever you like. This one was approved so we could merge and add another ticket. Or if you want to just add it in, I'll review it right away. Or if you want me to take it on, I

Re: [PR] copy build.yaml from datafusion-python [datafusion-ballista]

2024-11-25 Thread via GitHub
andygrove commented on PR #1134: URL: https://github.com/apache/datafusion-ballista/pull/1134#issuecomment-2498756896 @milenkovicm @tbar4 fyi

Re: [PR] chore: dependancy updates [datafusion-ballista]

2024-11-25 Thread via GitHub
andygrove merged PR #1131: URL: https://github.com/apache/datafusion-ballista/pull/1131

[PR] copy build.yaml from datafusion-python [datafusion-ballista]

2024-11-25 Thread via GitHub
andygrove opened a new pull request, #1134: URL: https://github.com/apache/datafusion-ballista/pull/1134 # Which issue does this PR close? Part of https://github.com/apache/datafusion-ballista/issues/1120 # Rationale for this change We need a release process

Re: [I] Release Minor DataFusion 43.1.0 release [datafusion]

2024-11-25 Thread via GitHub
timsaucer commented on issue #13499: URL: https://github.com/apache/datafusion/issues/13499#issuecomment-2498711967 @alamb is there anything I can do to help with this?

Re: [PR] Update DataFusion to 43 [datafusion-ballista]

2024-11-25 Thread via GitHub
milenkovicm commented on PR #1125: URL: https://github.com/apache/datafusion-ballista/pull/1125#issuecomment-2498708158 once we merge this patch, ballista will read deltalake (custom codecs needed) ```rust let config = SessionConfig::new_with_ballista() .with_target_pa

Re: [I] Ballista 43.0.0 Release [datafusion-ballista]

2024-11-25 Thread via GitHub
milenkovicm commented on issue #974: URL: https://github.com/apache/datafusion-ballista/issues/974#issuecomment-2498700555 I have no objections. Other projects put the bar very high, we might get under pressure 😀

Re: [PR] chore: Remove unused dependencies [datafusion]

2024-11-25 Thread via GitHub
findepi commented on PR #13541: URL: https://github.com/apache/datafusion/pull/13541#issuecomment-2498682026 Similar in arrow - https://github.com/apache/arrow-rs/pull/6792

Re: [PR] fix: Ignore names of technical inner fields (of List and Map types) when comparing datatypes for logical equivalence [datafusion]

2024-11-25 Thread via GitHub
timsaucer commented on PR #13522: URL: https://github.com/apache/datafusion/pull/13522#issuecomment-2498674943 Whichever you like. This one was approved so we could merge and add another ticket. Or if you want to just add it in, I'll review it right away. Or if you want me to take it on, I

Re: [PR] Extensions API for third-party table/catalog providers [datafusion-ray]

2024-11-25 Thread via GitHub
ccciudatu commented on code in PR #43: URL: https://github.com/apache/datafusion-ray/pull/43#discussion_r1857074011 ## src/context.rs: ## @@ -45,22 +46,30 @@ pub struct PyContext { pub(crate) fn execution_plan_from_pyany( py_plan: &Bound, +py: Python, ) -> PyResult>

Re: [I] Ballista 0.13.0 Release [datafusion-ballista]

2024-11-25 Thread via GitHub
andygrove commented on issue #974: URL: https://github.com/apache/datafusion-ballista/issues/974#issuecomment-2498666419 I don't have a strong opinion, but if we are going to change the versioning scheme, I would rather do it once than twice. Why don't we just jump straight to 43.0.0 for t

Re: [PR] Extensions API for third-party table/catalog providers [datafusion-ray]

2024-11-25 Thread via GitHub
timsaucer commented on code in PR #43: URL: https://github.com/apache/datafusion-ray/pull/43#discussion_r1857021166 ## src/context.rs: ## @@ -45,22 +46,30 @@ pub struct PyContext { pub(crate) fn execution_plan_from_pyany( py_plan: &Bound, +py: Python, ) -> PyResult>

Re: [PR] Add csv loading benchmarks. [datafusion]

2024-11-25 Thread via GitHub
berkaysynnada commented on PR #13544: URL: https://github.com/apache/datafusion/pull/13544#issuecomment-2498574013 > I don't have a strong opinion on the location of the benchmarks, so I'm happy to follow recommendations. > > For my future reference, how do you differentiate this func

[PR] build(deps): bump rustls from 0.23.16 to 0.23.18 [datafusion-python]

2024-11-25 Thread via GitHub
dependabot[bot] opened a new pull request, #956: URL: https://github.com/apache/datafusion-python/pull/956 Bumps [rustls](https://github.com/rustls/rustls) from 0.23.16 to 0.23.18. Commits https://github.com/rustls/rustls/commit/33af2c38b0f1e4abf44d59d5b74ccf12f5cf5e56 33af2c3

Re: [PR] Enhance the nested type access for Generic and DuckDB dialect [datafusion-sqlparser-rs]

2024-11-25 Thread via GitHub
goldmedal commented on code in PR #1541: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1541#discussion_r1856938487 ## src/parser/mod.rs: ## @@ -2935,12 +2935,23 @@ impl<'a> Parser<'a> { }) } else if Token::LBracket == tok { if di

Re: [PR] Rename `BuiltInWindow*` to `StandardWindow*` [datafusion]

2024-11-25 Thread via GitHub
alamb merged PR #13536: URL: https://github.com/apache/datafusion/pull/13536
