Re: [I] `SessionContext::create_physical_expr` does not unwrap casts (and thus is not always optimal) [datafusion]

2025-03-04 Thread via GitHub
jayzhan211 commented on issue #14987: URL: https://github.com/apache/datafusion/issues/14987#issuecomment-2696573839 What do you mean by "simplify"? Is it `SimplifyExpressions` or other logic? If it is `SimplifyExpressions`, then you have logical plan optimization involved; if not and you go d

Re: [PR] _repr_ and _html_repr_ show '... and additional rows' message [datafusion-python]

2025-03-04 Thread via GitHub
kosiew commented on code in PR #1041: URL: https://github.com/apache/datafusion-python/pull/1041#discussion_r1978735197 ## src/dataframe.rs: ## @@ -90,59 +90,108 @@ impl PyDataFrame { } fn __repr__(&self, py: Python) -> PyDataFusionResult { -let df = self.df

Re: [PR] _repr_ and _html_repr_ show '... and additional rows' message [datafusion-python]

2025-03-04 Thread via GitHub
kosiew commented on code in PR #1041: URL: https://github.com/apache/datafusion-python/pull/1041#discussion_r1978864670 ## src/dataframe.rs: ## @@ -90,59 +90,108 @@ impl PyDataFrame { } fn __repr__(&self, py: Python) -> PyDataFusionResult { -let df = self.df

Re: [I] `SessionContext::create_physical_expr` does not unwrap casts (and thus is not always optimal) [datafusion]

2025-03-04 Thread via GitHub
ion-elgreco commented on issue #14987: URL: https://github.com/apache/datafusion/issues/14987#issuecomment-2696634942 @jayzhan211 it's using the expression simplifier: ``` let logical_filter = self.filter.map(|expr| { // Simplify the expression first

Re: [PR] add method SessionStateBuilder::new_with_defaults() [datafusion]

2025-03-04 Thread via GitHub
alamb commented on code in PR #14998: URL: https://github.com/apache/datafusion/pull/14998#discussion_r1979267227 ## datafusion/core/src/execution/session_state.rs: ## @@ -1109,6 +1109,22 @@ impl SessionStateBuilder { .with_table_function_list(SessionStateDefaults:

Re: [PR] chore: Update `SessionStateBuilder::with_default_features` does not replace existing features [datafusion]

2025-03-04 Thread via GitHub
alamb merged PR #14935: URL: https://github.com/apache/datafusion/pull/14935 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [I] minor: `SessionStateBuilder::with_default_features` ergonomics [datafusion]

2025-03-04 Thread via GitHub
alamb closed issue #14899: minor: `SessionStateBuilder::with_default_features` ergonomics URL: https://github.com/apache/datafusion/issues/14899

Re: [PR] add method SessionStateBuilder::new_with_defaults() [datafusion]

2025-03-04 Thread via GitHub
alamb commented on PR #14998: URL: https://github.com/apache/datafusion/pull/14998#issuecomment-2697248930 > Why don't we just implement Default trait? There is some more backstory here: - https://github.com/apache/datafusion/pull/14935#pullrequestreview-2654150748 Basically

Re: [PR] chore: Update `SessionStateBuilder::with_default_features` does not replace existing features [datafusion]

2025-03-04 Thread via GitHub
alamb commented on PR #14935: URL: https://github.com/apache/datafusion/pull/14935#issuecomment-2697244168 Thanks again @irenjj and @milenkovicm It seems as if @shruti2522 already has a PR to implement the new_with_default_features - https://github.com/apache/datafusion/pull/14998

Re: [PR] Fix documentation warnings and error if anymore occur [datafusion]

2025-03-04 Thread via GitHub
AmosAidoo commented on PR #14952: URL: https://github.com/apache/datafusion/pull/14952#issuecomment-2697309878 @alamb my pleasure

[I] Implement `tree` explain for `FilterExec` [datafusion]

2025-03-04 Thread via GitHub
alamb opened a new issue, #15000: URL: https://github.com/apache/datafusion/issues/15000 ### Is your feature request related to a problem or challenge? - Part of https://github.com/apache/datafusion/issues/14914 @irenjj added a new `tree` explain mode in https://github.com/apa

Re: [PR] fix: Executor memory overhead overriding [datafusion-comet]

2025-03-04 Thread via GitHub
wForget commented on code in PR #1462: URL: https://github.com/apache/datafusion-comet/pull/1462#discussion_r1979412638 ## spark/src/test/scala/org/apache/spark/CometPluginsSuite.scala: ## @@ -143,3 +143,36 @@ class CometPluginsNonOverrideSuite extends CometTestBase { asser

Re: [PR] Implement `tree` explain for FilterExec [datafusion]

2025-03-04 Thread via GitHub
irenjj commented on code in PR #15001: URL: https://github.com/apache/datafusion/pull/15001#discussion_r1979377790 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -185,6 +188,32 @@ physical_plan 26)│ DataSourceExec ││ DataSourceExec │ 27)└──

Re: [I] Expose to `AccumulatorArgs` whether all the groups are sorted [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14991: URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2697939824 In the case where the data is already sorted on the group expressions, it should use a different grouping operator, specifically https://docs.rs/datafusion/latest/datafusion/ph

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-04 Thread via GitHub
parthchandra commented on PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#issuecomment-2698007418 > Would it be difficult to split this PR? The title makes it seem like a minor refactor to the ParquetExec instantiation, but it's actually bringing in a lot of the object

Re: [PR] feat: add read array support [datafusion-comet]

2025-03-04 Thread via GitHub
andygrove commented on PR #1456: URL: https://github.com/apache/datafusion-comet/pull/1456#issuecomment-2698077082 Some tests are failing due to https://github.com/apache/datafusion-comet/issues/1289 I think the root cause is that we are trying to shuffle with arrays and Comet shuff

Re: [I] Unsupported Arrow Vector for export: class org.apache.arrow.vector.complex.ListVector [datafusion-comet]

2025-03-04 Thread via GitHub
andygrove commented on issue #1289: URL: https://github.com/apache/datafusion-comet/issues/1289#issuecomment-2698079511 I think the root cause is that we are trying to shuffle with complex types and Comet shuffle does not support complex types yet. We need to fall back to Spark for these s

Re: [PR] feat: add read array support [datafusion-comet]

2025-03-04 Thread via GitHub
andygrove commented on PR #1456: URL: https://github.com/apache/datafusion-comet/pull/1456#issuecomment-2698094548 In `CometExecRule` we check to see if we support the partitioning types for the shuffle but do not check that we support the types of other columns. @comphead Do you wan

Re: [PR] Add all missing table options to be handled in any order [datafusion-sqlparser-rs]

2025-03-04 Thread via GitHub
iffyio commented on PR #1747: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1747#issuecomment-2697471150 Marking the PR as draft in the meantime as it is not currently pending review, @benrsatori feel free to undraft and ping when ready!

Re: [PR] add manual trigger for extended tests in pull requests [datafusion]

2025-03-04 Thread via GitHub
alamb commented on code in PR #14331: URL: https://github.com/apache/datafusion/pull/14331#discussion_r1979614246 ## .github/workflows/extended.yml: ## @@ -33,16 +33,46 @@ on: push: branches: - main + issue_comment: +types: [created] + +permissions: + pull-r

Re: [I] [EPIC] Complete `SQL EXPLAIN` Tree Rendering [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14914: URL: https://github.com/apache/datafusion/issues/14914#issuecomment-2697451179 I made a first PR to show how to add tree explain - https://github.com/apache/datafusion/pull/15001 Once we get that one in I plan to file a bunch of other tickets for the

[PR] Snowflake: Support dollar quoted comment of table, view and field [datafusion-sqlparser-rs]

2025-03-04 Thread via GitHub
7phs opened a new pull request, #1755: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1755 Snowflake [supports](https://docs.snowflake.com/en/sql-reference/literals-table) dollar-quoted comments as string literals when creating tables, views, and their fields. For examp

Re: [PR] add method SessionStateBuilder::new_with_defaults() [datafusion]

2025-03-04 Thread via GitHub
shruti2522 commented on code in PR #14998: URL: https://github.com/apache/datafusion/pull/14998#discussion_r1979327789 ## datafusion/core/src/execution/session_state.rs: ## @@ -1109,6 +1109,22 @@ impl SessionStateBuilder { .with_table_function_list(SessionStateDefa

[PR] Implement `tree` explain for FilterExec [datafusion]

2025-03-04 Thread via GitHub
alamb opened a new pull request, #15001: URL: https://github.com/apache/datafusion/pull/15001 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/15000 ## Rationale for this change Let's have nice explain plans! I wanted

Re: [PR] Make `create_ordering` pub and add doc for it [datafusion]

2025-03-04 Thread via GitHub
xudong963 merged PR #14996: URL: https://github.com/apache/datafusion/pull/14996

Re: [PR] fix: Executor memory overhead overriding [datafusion-comet]

2025-03-04 Thread via GitHub
LukMRVC commented on PR #1462: URL: https://github.com/apache/datafusion-comet/pull/1462#issuecomment-2697480797 That one should be fine, they passed with current changes to overriding spark executor memory, but fail without them.

Re: [PR] Implement `tree` explain for FilterExec [datafusion]

2025-03-04 Thread via GitHub
irenjj commented on code in PR #15001: URL: https://github.com/apache/datafusion/pull/15001#discussion_r1979381479 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -185,6 +188,32 @@ physical_plan 26)│ DataSourceExec ││ DataSourceExec │ 27)└──

Re: [I] Implement Nicer / DuckDB style explain plans [datafusion]

2025-03-04 Thread via GitHub
alamb closed issue #9371: Implement Nicer / DuckDB style explain plans URL: https://github.com/apache/datafusion/issues/9371

Re: [PR] feat: Add `tree` / pretty explain mode [datafusion]

2025-03-04 Thread via GitHub
alamb merged PR #14677: URL: https://github.com/apache/datafusion/pull/14677

Re: [I] `SessionContext::create_physical_expr` does not unwrap casts (and thus is not always optimal) [datafusion]

2025-03-04 Thread via GitHub
jayzhan211 commented on issue #14987: URL: https://github.com/apache/datafusion/issues/14987#issuecomment-2697401042 I see. We can try moving `UnwrapCastInComparison` into the `Simplifier`'s large pattern-matching rule

Re: [I] Release DataFusion `46.0.0` [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14123: URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2697426444 We have merged the fix here - https://github.com/apache/datafusion/pull/14990 To simplify voting I suggest we backport that fix to the `branch-46` line and make a second

Re: [PR] feat: Add `tree` / pretty explain mode [datafusion]

2025-03-04 Thread via GitHub
alamb commented on PR #14677: URL: https://github.com/apache/datafusion/pull/14677#issuecomment-2697397568 🚀 thank you again so much @irenjj -- this is so cool. A great step towards better explain plans. I am working on enhancing the follow on ticket you filed in https://github.co

Re: [PR] Refactor SortPushdown using the standard top-down visitor and using `EquivalenceProperties` [datafusion]

2025-03-04 Thread via GitHub
alamb commented on PR #14821: URL: https://github.com/apache/datafusion/pull/14821#issuecomment-2698190011 @berkaysynnada @wiedld and I had a brief meeting. The outcomes from my perspective are: 1. @berkaysynnada plans to review this PR and test on the synnada fork to ensure it still wor

Re: [PR] fix: Executor memory overhead overriding [datafusion-comet]

2025-03-04 Thread via GitHub
LukMRVC commented on code in PR #1462: URL: https://github.com/apache/datafusion-comet/pull/1462#discussion_r1979775668 ## spark/src/test/scala/org/apache/spark/CometPluginsSuite.scala: ## @@ -143,3 +143,36 @@ class CometPluginsNonOverrideSuite extends CometTestBase { asser

Re: [I] Expose to `AccumulatorArgs` whether all the groups are sorted [datafusion]

2025-03-04 Thread via GitHub
rluvaton commented on issue #14991: URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2698202576 Actually it uses GroupAccumulator even if it is fully sorted. You can see this by adding a breakpoint to https://github.com/apache/datafusion/blob/ac79ef3442e65f6197c7234da9fad9

Re: [I] Epic: Statistics improvements [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #8227: URL: https://github.com/apache/datafusion/issues/8227#issuecomment-2698552710 @clflushopt I don't really know as I am not driving the statistics rework myself and thus don't have much visibility into what is planned The only remaining thing I know of

Re: [I] Substrait plan read relation baseSchema does not include the struct with type information [datafusion]

2025-03-04 Thread via GitHub
amoeba commented on issue #12244: URL: https://github.com/apache/datafusion/issues/12244#issuecomment-2699036156 The behavior was modified in #12245 and the original issue looks addressed but the plans I'm getting don't validate. DataFusion currently hardcodes struct nullability to `NULLABI

Re: [PR] Minor: Add identation to EnforceDistribution test plans. [datafusion]

2025-03-04 Thread via GitHub
alamb commented on code in PR #15007: URL: https://github.com/apache/datafusion/pull/15007#discussion_r1980306194 ## datafusion/core/tests/physical_optimizer/enforce_distribution.rs: ## @@ -572,33 +573,33 @@ fn multi_hash_joins() -> Result<()> { // Should in

Re: [I] Some aggregates silently ignore `IGNORE NULLS` and `ORDER BY` on arguments [datafusion]

2025-03-04 Thread via GitHub
vbarua commented on issue #9924: URL: https://github.com/apache/datafusion/issues/9924#issuecomment-2698960345 I opened up an issue that's potentially related to this when it comes to supporting IGNORE NULLS https://github.com/apache/datafusion/issues/15006

Re: [I] Substrait plan read relation baseSchema does not include the struct with type information [datafusion]

2025-03-04 Thread via GitHub
Blizzara commented on issue #12244: URL: https://github.com/apache/datafusion/issues/12244#issuecomment-2699060966 I didn't see anything on substrait.io where it'd be explicitly stated, but the example there (https://substrait.io/tutorial/sql_to_substrait/#types-and-schemas) does use NULLAB

Re: [PR] feat: Upgrade to DataFusion 46.0.0-rc2 [datafusion-comet]

2025-03-04 Thread via GitHub
andygrove commented on PR #1423: URL: https://github.com/apache/datafusion-comet/pull/1423#issuecomment-2699132289 @kazuyukitanimura I'd like to go ahead and merge this now to give us a few days to test with D 46 before the final release. WDYT?

[PR] Do not double alias Exprs [datafusion]

2025-03-04 Thread via GitHub
alamb opened a new pull request, #15008: URL: https://github.com/apache/datafusion/pull/15008 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/14895 ## Rationale for this change Nested `Expr::Alias` is confusing for display a

Re: [I] Substrait plan read relation baseSchema does not include the struct with type information [datafusion]

2025-03-04 Thread via GitHub
amoeba commented on issue #12244: URL: https://github.com/apache/datafusion/issues/12244#issuecomment-2699149322 Thanks @Blizzara. I'll file a PR to change this soon and we can close this issue out. I asked for some clarification on the Substrait Slack so we can be sure it's the right chang

Re: [PR] Implement `tree` explain for FilterExec [datafusion]

2025-03-04 Thread via GitHub
alamb commented on code in PR #15001: URL: https://github.com/apache/datafusion/pull/15001#discussion_r1979586990 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -185,6 +188,32 @@ physical_plan 26)│ DataSourceExec ││ DataSourceExec │ 27)└───

Re: [I] Weekly Plan (Andrew Lamb) March 3, 2025 [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14978: URL: https://github.com/apache/datafusion/issues/14978#issuecomment-2697852067 DataFusion: Bugs/UX/Performance - [ ] https://github.com/apache/datafusion/pull/14331 - [ ] https://github.com/apache/datafusion/pull/14935 - [ ] https://github.com/apa

Re: [I] Expr simplifier doesn't simplify exprs that are same if you swap lhs with rhs regardless of cycles [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14943: URL: https://github.com/apache/datafusion/issues/14943#issuecomment-2697865461 @jayzhan211 has a PR to fix this here: - https://github.com/apache/datafusion/pull/14994

Re: [I] Expr simplifier doesn't simplify exprs that are same if you swap lhs with rhs regardless of cycles [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14943: URL: https://github.com/apache/datafusion/issues/14943#issuecomment-2697880588 > We simplify A = B and B = A in common_sub_expression_eliminate but not simplify_expressions I wonder if this is another example of code that would be beneficial to pull i

Re: [PR] Simplify Between expression to Eq [datafusion]

2025-03-04 Thread via GitHub
alamb commented on code in PR #14994: URL: https://github.com/apache/datafusion/pull/14994#discussion_r1979605743 ## datafusion/sqllogictest/test_files/simplify_expr.slt: ## @@ -0,0 +1,34 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor lic

Re: [I] Expr simplifier doesn't simplify exprs that are same if you swap lhs with rhs regardless of cycles [datafusion]

2025-03-04 Thread via GitHub
ion-elgreco commented on issue #14943: URL: https://github.com/apache/datafusion/issues/14943#issuecomment-2698150878 > [@jayzhan211](https://github.com/jayzhan211) has a PR to fix this here: > > * [Simplify Between expression to Eq  #14994](https://github.com/apache/datafusion/pull/14

Re: [I] [EPIC] Complete `SQL EXPLAIN` Tree Rendering [datafusion]

2025-03-04 Thread via GitHub
milenkovicm commented on issue #14914: URL: https://github.com/apache/datafusion/issues/14914#issuecomment-2698392966 This looks great! Would it be possible to extend this with svg output? It would be great for ballista ui

[I] [Question] Skip Partial stage aggregation stage for sorted input [datafusion]

2025-03-04 Thread via GitHub
rluvaton opened a new issue, #15002: URL: https://github.com/apache/datafusion/issues/15002 If the input is sorted by the group keys, can't we just skip the partial aggregation stage?

Re: [PR] Fix: to_char Function Now Correctly Handles DATE Values in DataFusion [datafusion]

2025-03-04 Thread via GitHub
alamb commented on code in PR #14970: URL: https://github.com/apache/datafusion/pull/14970#discussion_r1979911036 ## datafusion/functions/src/datetime/to_char.rs: ## @@ -234,15 +226,21 @@ fn _to_char_scalar( }; let formatter = ArrayFormatter::try_new(array.as_ref(),

Re: [I] Improve EnforceDistribution testings. [datafusion]

2025-03-04 Thread via GitHub
wiedld commented on issue #15003: URL: https://github.com/apache/datafusion/issues/15003#issuecomment-2698352844 take

Re: [PR] Scrollable python notebook table rendering [datafusion-python]

2025-03-04 Thread via GitHub
kevinjqliu commented on code in PR #1036: URL: https://github.com/apache/datafusion-python/pull/1036#discussion_r1979772203 ## src/dataframe.rs: ## @@ -100,46 +106,153 @@ impl PyDataFrame { } fn _repr_html_(&self, py: Python) -> PyDataFusionResult { -let mut

Re: [PR] _repr_ and _html_repr_ show '... and additional rows' message [datafusion-python]

2025-03-04 Thread via GitHub
kevinjqliu commented on PR #1041: URL: https://github.com/apache/datafusion-python/pull/1041#issuecomment-2698204265 btw #1036 also changes `_repr_html_`

Re: [PR] add manual trigger for extended tests in pull requests [datafusion]

2025-03-04 Thread via GitHub
danila-b commented on code in PR #14331: URL: https://github.com/apache/datafusion/pull/14331#discussion_r1979865173 ## .github/workflows/extended.yml: ## @@ -33,16 +33,46 @@ on: push: branches: - main + issue_comment: +types: [created] + +permissions: + pul

Re: [PR] [WIP] chore: Add detailed error for sum::coerce_type [datafusion]

2025-03-04 Thread via GitHub
dentiny commented on PR #14710: URL: https://github.com/apache/datafusion/pull/14710#issuecomment-2698883962 I'm really sorry for the delay; I got stuck with something else. I will get back to the PR by EoW.

Re: [I] Release sqlparser-rs version `0.55.0` [datafusion-sqlparser-rs]

2025-03-04 Thread via GitHub
comphead commented on issue #1671: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1671#issuecomment-2698903812 +1 Verified on M3 Mac Thanks Andrew 27 February 2025, 09:37:51, From "Andrew Lamb" ***@***.***>: I made a release candidate. Here is the draft email. Howe

Re: [I] Expose to `AccumulatorArgs` whether all the groups are sorted [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14991: URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2698391616 ```sql > CREATE TABLE test_table ( col_i32 INT, col_u32 INT UNSIGNED ) as VALUES ( NULL,NULL), ( -2147483648, 0), ( -2147483648, 0), ( 100,

Re: [I] Table function supports non-literal args [datafusion]

2025-03-04 Thread via GitHub
Lordworms commented on issue #14958: URL: https://github.com/apache/datafusion/issues/14958#issuecomment-2698453390 take

Re: [I] Improved error for expand wildcard rule [datafusion]

2025-03-04 Thread via GitHub
Jiashu-Hu commented on issue #15004: URL: https://github.com/apache/datafusion/issues/15004#issuecomment-2698791268 take

Re: [PR] chore: Refactor CometScanRule to avoid duplication and improve fallback messages [datafusion-comet]

2025-03-04 Thread via GitHub
andygrove commented on PR #1474: URL: https://github.com/apache/datafusion-comet/pull/1474#issuecomment-2698801229 @mbutrovich, Could you review this since this affects the new experimental native scan support?

Re: [PR] Support WITHIN GROUP syntax to standardize certain existing aggregate functions [datafusion]

2025-03-04 Thread via GitHub
vbarua commented on code in PR #13511: URL: https://github.com/apache/datafusion/pull/13511#discussion_r1979981659 ## datafusion/expr/src/expr.rs: ## @@ -295,6 +295,8 @@ pub enum Expr { /// See also [`ExprFunctionExt`] to set these fields. /// /// [`ExprFunctionEx

Re: [PR] Refactor test suite in EnforceDistribution, to use standard test config. [datafusion]

2025-03-04 Thread via GitHub
wiedld commented on code in PR #15010: URL: https://github.com/apache/datafusion/pull/15010#discussion_r1980473165 ## datafusion/core/tests/physical_optimizer/enforce_distribution.rs: ## @@ -371,46 +371,91 @@ macro_rules! plans_matches_expected { } } +fn test_suite_defau

Re: [PR] Support WITHIN GROUP syntax to standardize certain existing aggregate functions [datafusion]

2025-03-04 Thread via GitHub
vbarua commented on PR #13511: URL: https://github.com/apache/datafusion/pull/13511#issuecomment-2699344130 Pinging @Dandandan for committer review as they filed the ticket this fix is for.

[I] Incorrect URLs in Cargo.toml files [datafusion-ballista]

2025-03-04 Thread via GitHub
andygrove opened a new issue, #1193: URL: https://github.com/apache/datafusion-ballista/issues/1193 **Describe the bug** We still use Arrow URLs in some places: ``` homepage = "https://github.com/apache/arrow-ballista" repository = "https://github.com/apache/arrow-ballista"

Re: [I] Support datatype cast for insert api same as insert into sql [datafusion]

2025-03-04 Thread via GitHub
zhuqi-lucas commented on issue #15015: URL: https://github.com/apache/datafusion/issues/15015#issuecomment-2699799674 take

[PR] Minor: cleanup unused code [datafusion]

2025-03-04 Thread via GitHub
qazxcdswe123 opened a new pull request, #15016: URL: https://github.com/apache/datafusion/pull/15016 ## Which issue does this PR close? - Closes nothing ## Rationale for this change cleanup unused deadcode ## What changes are included in this PR?

Re: [PR] BUG: schema_force_view_type configuration not working for CREATE EXTERNAL TABLE [datafusion]

2025-03-04 Thread via GitHub
zhuqi-lucas commented on code in PR #14922: URL: https://github.com/apache/datafusion/pull/14922#discussion_r1980680580 ## datafusion/sqllogictest/test_files/insert_to_external.slt: ## @@ -456,13 +456,16 @@ explain insert into table_without_values select c1 from aggregate_test_

Re: [PR] Bug: Fix multi-lines printing issue for datafusion-cli [datafusion]

2025-03-04 Thread via GitHub
zhuqi-lucas commented on code in PR #14954: URL: https://github.com/apache/datafusion/pull/14954#discussion_r1980745507 ## datafusion-cli/tests/cli_integration.rs: ## @@ -51,6 +51,163 @@ fn init() { ["--command", "show datafusion.execution.batch_size", "--format", "json",

Re: [PR] Support WITHIN GROUP syntax to standardize certain existing aggregate functions [datafusion]

2025-03-04 Thread via GitHub
Garamda commented on code in PR #13511: URL: https://github.com/apache/datafusion/pull/13511#discussion_r1980826096 ## datafusion/sql/src/expr/function.rs: ## @@ -349,15 +365,49 @@ impl SqlToRel<'_, S> { } else { // User defined aggregate functions (UDAF) h

Re: [PR] Support WITHIN GROUP syntax to standardize certain existing aggregate functions [datafusion]

2025-03-04 Thread via GitHub
Garamda commented on code in PR #13511: URL: https://github.com/apache/datafusion/pull/13511#discussion_r1980817866 ## datafusion/sql/src/expr/function.rs: ## @@ -349,15 +365,49 @@ impl SqlToRel<'_, S> { } else { // User defined aggregate functions (UDAF) h

Re: [I] Table function supports non-literal args [datafusion]

2025-03-04 Thread via GitHub
Lordworms commented on issue #14958: URL: https://github.com/apache/datafusion/issues/14958#issuecomment-2700016967 I have done some basic research on how Postgres deals with table functions with columns; for example, something like this would work in PostgreSQL ``` SELECT t.id, t.start_v

Re: [I] Table function supports non-literal args [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14958: URL: https://github.com/apache/datafusion/issues/14958#issuecomment-2699172543 I think this one is likely pretty tricky -- it might be worth working on a design / writeup at first of how it would work / what the plans would look like

Re: [PR] feat: Add array reading support to native_datafusion scan [datafusion-comet]

2025-03-04 Thread via GitHub
andygrove closed pull request #1324: feat: Add array reading support to native_datafusion scan URL: https://github.com/apache/datafusion-comet/pull/1324

[PR] Add dependency checks to verify-release-candidate script [datafusion]

2025-03-04 Thread via GitHub
waynexia opened a new pull request, #15009: URL: https://github.com/apache/datafusion/pull/15009 ## Which issue does this PR close? - Closes #. ## Rationale for this change Check all build dependencies are available before running the script. Backgr

Re: [I] Expr simplifier doesn't simplify exprs that are same if you swap lhs with rhs regardless of cycles [datafusion]

2025-03-04 Thread via GitHub
alamb commented on issue #14943: URL: https://github.com/apache/datafusion/issues/14943#issuecomment-2698393138 I think it will not be present in DataFusion 46 (we have the RC out for voting now)

Re: [PR] Simplify Between expression to Eq [datafusion]

2025-03-04 Thread via GitHub
jayzhan211 merged PR #14994: URL: https://github.com/apache/datafusion/pull/14994

Re: [PR] Simplify Between expression to Eq [datafusion]

2025-03-04 Thread via GitHub
jayzhan211 commented on PR #14994: URL: https://github.com/apache/datafusion/pull/14994#issuecomment-2699326547 > I don't think it closes https://github.com/apache/datafusion/issues/14943 I agree.

Re: [PR] Support WITHIN GROUP syntax to standardize certain existing aggregate functions [datafusion]

2025-03-04 Thread via GitHub
vbarua commented on code in PR #13511: URL: https://github.com/apache/datafusion/pull/13511#discussion_r1980465883 ## datafusion/expr/src/udaf.rs: ## @@ -845,6 +861,19 @@ pub trait AggregateUDFImpl: Debug + Send + Sync { ScalarValue::try_from(data_type) } +//

[PR] Refactor test suite in EnforceDistribution, to use standard test config. [datafusion]

2025-03-04 Thread via GitHub
wiedld opened a new pull request, #15010: URL: https://github.com/apache/datafusion/pull/15010 ## Which issue does this PR close? Part of #15003 ## Rationale for this change Make it easier to determine the differences in test configuration, for each test case. Also c

Re: [PR] Refactor test suite in EnforceDistribution, to use standard test config. [datafusion]

2025-03-04 Thread via GitHub
wiedld commented on code in PR #15010: URL: https://github.com/apache/datafusion/pull/15010#discussion_r1980468330 ## datafusion/core/tests/physical_optimizer/enforce_distribution.rs: ## @@ -371,46 +371,95 @@ macro_rules! plans_matches_expected { } } +fn test_suite_defau

Re: [PR] Refactor test suite in EnforceDistribution, to use standard test config. [datafusion]

2025-03-04 Thread via GitHub
wiedld commented on code in PR #15010: URL: https://github.com/apache/datafusion/pull/15010#discussion_r1980470202 ## datafusion/core/tests/physical_optimizer/enforce_distribution.rs: ## @@ -371,46 +371,95 @@ macro_rules! plans_matches_expected { } } +fn test_suite_defau

[I] Support datatype cast for insert api same as insert into sql [datafusion]

2025-03-04 Thread via GitHub
zhuqi-lucas opened a new issue, #15015: URL: https://github.com/apache/datafusion/issues/15015 ### Is your feature request related to a problem or challenge? With insert into SQL, we now automatically cast the datatype to be consistent with the table datatype; we had better support dat

Re: [I] Project Ideas for GSoC 2025 (Google Summer of Code) [datafusion]

2025-03-04 Thread via GitHub
oznur-synnada commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2700032123 Hi @mkarbo and @waynexia - we have been approved as a mentoring organization for GSoC 2025. I'm going to invite you to the GSoC portal as mentors so could you share your e

Re: [PR] BUG: schema_force_view_type configuration not working for CREATE EXTERNAL TABLE [datafusion]

2025-03-04 Thread via GitHub
2010YOUY01 commented on PR #14922: URL: https://github.com/apache/datafusion/pull/14922#issuecomment-2700038867 > Thank you @2010YOUY01 for review, addressed comments in latest PR. LGTM, thank you. I don't know this code well so let's wait for others to approve it.

Re: [PR] Support WITHIN GROUP syntax to standardize certain existing aggregate functions [datafusion]

2025-03-04 Thread via GitHub
Garamda commented on code in PR #13511: URL: https://github.com/apache/datafusion/pull/13511#discussion_r1980817133 ## datafusion/sql/src/expr/function.rs: ## @@ -349,15 +365,49 @@ impl SqlToRel<'_, S> { } else { // User defined aggregate functions (UDAF) h

Re: [PR] Support WITHIN GROUP syntax to standardize certain existing aggregate functions [datafusion]

2025-03-04 Thread via GitHub
Garamda commented on code in PR #13511: URL: https://github.com/apache/datafusion/pull/13511#discussion_r1980829367 ## datafusion/sql/src/expr/function.rs: ## @@ -349,15 +365,49 @@ impl SqlToRel<'_, S> { } else { // User defined aggregate functions (UDAF) h

Re: [PR] feat: instrument spawned tasks with current tracing span when `tracing` feature is enabled [datafusion]

2025-03-04 Thread via GitHub
geoffreyclaude commented on PR #14547: URL: https://github.com/apache/datafusion/pull/14547#issuecomment-2697021679 @alamb: I've gone ahead and refactored to allow "injecting" the tracing behavior at runtime. As predicted, the code is a bit scary looking, especially due to the Box/Unbox dan

Re: [PR] Fix doc logo [datafusion]

2025-03-04 Thread via GitHub
khushishukla2813 commented on code in PR #14989: URL: https://github.com/apache/datafusion/pull/14989#discussion_r1979134788 ## datafusion/macros/src/macros.rs: ## @@ -0,0 +1,10 @@ +#[macro_export] Review Comment: > Maybe add a `proc-macros` crate, move the proc macros there

[I] Bug: calling "with_new_exprs" on join after optimization unexpectedly fails [datafusion]

2025-03-04 Thread via GitHub
niebayes opened a new issue, #14999: URL: https://github.com/apache/datafusion/issues/14999 ### Describe the bug Before optimization, specifically the `ExtractEquijoinPredicate` rule, calling `with_new_exprs` succeeds. However, after being optimized by `ExtractEquijoinPredicate`, calling

Re: [PR] Bug: Fix multi-lines printing issue for datafusion-cli [datafusion]

2025-03-04 Thread via GitHub
zhuqi-lucas commented on PR #14954: URL: https://github.com/apache/datafusion/pull/14954#issuecomment-2696829174 I am continuing to polish the code, in addition to https://github.com/apache/datafusion/issues/14886, which will add the streaming state struct.

Re: [PR] Split out avro, parquet, json and csv into individual crates [datafusion]

2025-03-04 Thread via GitHub
AdamGS commented on code in PR #14951: URL: https://github.com/apache/datafusion/pull/14951#discussion_r1979201859 ## datafusion/core/src/datasource/file_format/avro.rs: ## @@ -15,163 +15,31 @@ // specific language governing permissions and limitations // under the License.

Re: [I] Allow sorting to improve `FixedSizeBinary` filtering [datafusion]

2025-03-04 Thread via GitHub
samuelcolvin closed issue #11170: Allow sorting to improve `FixedSizeBinary` filtering URL: https://github.com/apache/datafusion/issues/11170

Re: [I] Allow sorting to improve `FixedSizeBinary` filtering [datafusion]

2025-03-04 Thread via GitHub
samuelcolvin commented on issue #11170: URL: https://github.com/apache/datafusion/issues/11170#issuecomment-2696717653 Closed

Re: [PR] Split out avro, parquet, json and csv into individual crates [datafusion]

2025-03-04 Thread via GitHub
alamb commented on PR #14951: URL: https://github.com/apache/datafusion/pull/14951#issuecomment-2697208596 Thanks again @AdamGS and @logan-keede -- this is amazing progress. Something I have thought was important for the last t

Re: [PR] Split out avro, parquet, json and csv into individual crates [datafusion]

2025-03-04 Thread via GitHub
alamb merged PR #14951: URL: https://github.com/apache/datafusion/pull/14951

Re: [PR] Split out avro, parquet, json and csv into individual crates [datafusion]

2025-03-04 Thread via GitHub
alamb commented on PR #14951: URL: https://github.com/apache/datafusion/pull/14951#issuecomment-2697209311 I am merging this one in so it doesn't accumulate conflicts
