Re: [PR] Add generate_series() udtf (and introduce 'lazy' `MemoryExec`) [datafusion]

2024-11-28 Thread via GitHub
berkaysynnada commented on code in PR #13540: URL: https://github.com/apache/datafusion/pull/13540#discussion_r1863073808 ## datafusion/functions-table/src/generate_series.rs: ## @@ -0,0 +1,180 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

Re: [PR] Test sort merge join on TPC-H benchmark [datafusion]

2024-11-28 Thread via GitHub
Dandandan commented on PR #13572: URL: https://github.com/apache/datafusion/pull/13572#issuecomment-2507250856 Thanks for the review @comphead -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] Test TPCH with sort merge join [datafusion]

2024-11-28 Thread via GitHub
Dandandan closed issue #13573: Test TPCH with sort merge join URL: https://github.com/apache/datafusion/issues/13573

Re: [PR] Test sort merge join on TPC-H benchmark [datafusion]

2024-11-28 Thread via GitHub
Dandandan merged PR #13572: URL: https://github.com/apache/datafusion/pull/13572

Re: [PR] Improve unparsing after optimize_projections optimization [datafusion]

2024-11-28 Thread via GitHub
sgrebnov commented on code in PR #13599: URL: https://github.com/apache/datafusion/pull/13599#discussion_r1863025876 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -926,12 +926,25 @@ fn test_table_scan_pushdown() -> Result<()> { let query_from_table_scan_with_projectio

Re: [PR] Implement GroupsAccumulator for corr(x,y) aggregate function [datafusion]

2024-11-28 Thread via GitHub
Dandandan commented on code in PR #13581: URL: https://github.com/apache/datafusion/pull/13581#discussion_r1863046233 ## datafusion/functions-aggregate/src/correlation.rs: ## @@ -263,3 +283,307 @@ impl Accumulator for CorrelationAccumulator { Ok(()) } } + +#[deriv

[PR] Improve unparsing after optimize_projections optimization [datafusion]

2024-11-28 Thread via GitHub
sgrebnov opened a new pull request, #13599: URL: https://github.com/apache/datafusion/pull/13599 ## Which issue does this PR close? Follow-up item for [Support unparsing plans after applying the `optimize_projections` rule PR](https://github.com/apache/datafusion/pull/13267), focusin

Re: [PR] refactor: add `get_available_parallelism` function [datafusion]

2024-11-28 Thread via GitHub
jonahgao merged PR #13595: URL: https://github.com/apache/datafusion/pull/13595

Re: [I] Move available_parallelism() into utility function [datafusion]

2024-11-28 Thread via GitHub
jonahgao closed issue #13591: Move available_parallelism() into utility function URL: https://github.com/apache/datafusion/issues/13591

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-28 Thread via GitHub
jonahgao commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2506957606 > The rationale is that this makes the SLT test output closer to the output a DataFusion user would typically see, in datafusion-cli, when writing float outputs to CSV or when using

Re: [PR] Simplify spilling merge logic in GroupedHashAggregate [datafusion]

2024-11-28 Thread via GitHub
github-actions[bot] closed pull request #12517: Simplify spilling merge logic in GroupedHashAggregate URL: https://github.com/apache/datafusion/pull/12517

Re: [PR] Implement SHOW FUNCTIONS [datafusion]

2024-11-28 Thread via GitHub
github-actions[bot] commented on PR #12266: URL: https://github.com/apache/datafusion/pull/12266#issuecomment-2506952118 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Enable parquet pushdown_filter by default [datafusion]

2024-11-28 Thread via GitHub
github-actions[bot] closed pull request #12524: Enable parquet pushdown_filter by default URL: https://github.com/apache/datafusion/pull/12524

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
jonahgao commented on PR #13596: URL: https://github.com/apache/datafusion/pull/13596#issuecomment-2506949744 Thanks @findepi @comphead @Dandandan

Re: [I] Fix build issues on latest stable Rust toolchain (1.83) [datafusion]

2024-11-28 Thread via GitHub
jonahgao closed issue #13597: Fix build issues on latest stable Rust toolchain (1.83) URL: https://github.com/apache/datafusion/issues/13597

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
jonahgao merged PR #13596: URL: https://github.com/apache/datafusion/pull/13596

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-28 Thread via GitHub
jayzhan211 commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2506916421 An alternative approach is that we need to differentiate `string literal` and `varchar` like Postgres and DuckDB. Only untyped `string literal` is able to cast to any other types,

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-28 Thread via GitHub
jonathanc-n commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2506910402 @jayzhan211 That sounds good. However, I think implicit coercion should be the default or it'll cause regressions for users. Are you able to open this PR back up? I can add the c

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-28 Thread via GitHub
jayzhan211 commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2506897643 And set implicit coercion as the default

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-28 Thread via GitHub
jayzhan211 commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2506894562 @Omega359 How about we make this configurable? Enable implicit coercion if we want the ease of use and the casting cost is acceptable, disable it if we prefer explicit

Re: [PR] refactor: add `get_available_parallelism` function [datafusion]

2024-11-28 Thread via GitHub
comphead commented on PR #13595: URL: https://github.com/apache/datafusion/pull/13595#issuecomment-2506830432 Since this is a first time contribution I'll be waiting for another review before merging it in

Re: [PR] feat: Add ConfigOptions to ScalarFunctionArgs [datafusion]

2024-11-28 Thread via GitHub
Omega359 commented on PR #13527: URL: https://github.com/apache/datafusion/pull/13527#issuecomment-2506774637 > > Maybe we could change the semantics so that `SessionConfig` has a `Arc` which was cloned when it was modified (`Arc::unwrap_or_clone()` style) 🤔 > > Certainly possible, I

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
findepi commented on PR #13596: URL: https://github.com/apache/datafusion/pull/13596#issuecomment-2506762313 [clippy](https://github.com/apache/datafusion/actions/runs/12075224493/job/33674907768?pr=13596#logs) job passed with toolchain pin dropped.

Re: [PR] Reject CREATE TABLE/VIEW with duplicate column names [datafusion]

2024-11-28 Thread via GitHub
findepi commented on code in PR #13517: URL: https://github.com/apache/datafusion/pull/13517#discussion_r1862694059 ## datafusion/expr/src/logical_plan/ddl.rs: ## @@ -303,12 +531,154 @@ pub struct CreateMemoryTable { pub or_replace: bool, /// Default values for columns

Re: [PR] [minor]: Update median implementation [datafusion]

2024-11-28 Thread via GitHub
comphead commented on code in PR #13554: URL: https://github.com/apache/datafusion/pull/13554#discussion_r1862667080 ## datafusion/functions-aggregate/src/median.rs: ## @@ -310,6 +311,20 @@ impl Accumulator for DistinctMedianAccumulator { } } +/// Get maximum entry in t

Re: [PR] refactor: add `get_available_parallelism` function [datafusion]

2024-11-28 Thread via GitHub
alan910127 commented on PR #13595: URL: https://github.com/apache/datafusion/pull/13595#issuecomment-2506747529 @comphead Rebased.

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
findepi commented on PR #13596: URL: https://github.com/apache/datafusion/pull/13596#issuecomment-2506738154 now that https://github.com/apache/datafusion/pull/13598 is merged, let me rebase, otherwise we no longer test with 1.83

Re: [PR] Temporarily pin toolchain version to avoid clippy [datafusion]

2024-11-28 Thread via GitHub
findepi commented on PR #13598: URL: https://github.com/apache/datafusion/pull/13598#issuecomment-2506737917 thanks for the merge!

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
findepi commented on PR #13596: URL: https://github.com/apache/datafusion/pull/13596#issuecomment-2506737552 > I think we need to `#[allow(missing_docs)]` as the clippy now complains on that :) just added

Re: [PR] refactor: add `get_available_parallelism` function [datafusion]

2024-11-28 Thread via GitHub
comphead commented on PR #13595: URL: https://github.com/apache/datafusion/pull/13595#issuecomment-2506736916 @alan910127 please rebase from the latest main

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
comphead commented on PR #13596: URL: https://github.com/apache/datafusion/pull/13596#issuecomment-2506736284 I think we need to `#[allow(missing_docs)]` as the clippy now complains on that

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
comphead commented on code in PR #13596: URL: https://github.com/apache/datafusion/pull/13596#discussion_r1862668685 ## datafusion/physical-plan/src/sorts/sort_preserving_merge.rs: ## @@ -746,7 +746,7 @@ mod tests { // Split the provided record batch into multiple batch_s

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
findepi commented on code in PR #13596: URL: https://github.com/apache/datafusion/pull/13596#discussion_r1862667573 ## datafusion/optimizer/src/replace_distinct_aggregate.rs: ## @@ -54,8 +54,6 @@ use datafusion_expr::{Aggregate, Distinct, DistinctOn, Expr, LogicalPlan}; /// )

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
findepi commented on code in PR #13596: URL: https://github.com/apache/datafusion/pull/13596#discussion_r1862667940 ## datafusion/physical-plan/src/sorts/sort_preserving_merge.rs: ## @@ -746,7 +746,7 @@ mod tests { // Split the provided record batch into multiple batch_si

Re: [PR] Minor: Add example of backporting / `cherry-pick`ing to release branch [datafusion]

2024-11-28 Thread via GitHub
comphead merged PR #13565: URL: https://github.com/apache/datafusion/pull/13565

Re: [PR] Implement RightSemi join for SortMergeJoin [datafusion]

2024-11-28 Thread via GitHub
comphead commented on code in PR #13584: URL: https://github.com/apache/datafusion/pull/13584#discussion_r1862655033 ## datafusion/core/tests/fuzz_cases/join_fuzz.rs: ## @@ -209,6 +209,30 @@ async fn test_semi_join_1k_filtered() { .await } +#[tokio::test] +async fn test_

Re: [PR] Temporarily pin toolchain version to avoid clippy [datafusion]

2024-11-28 Thread via GitHub
Dandandan merged PR #13598: URL: https://github.com/apache/datafusion/pull/13598

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
Dandandan commented on code in PR #13596: URL: https://github.com/apache/datafusion/pull/13596#discussion_r1862643735 ## datafusion/expr/src/logical_plan/display.rs: ## @@ -181,7 +181,7 @@ impl<'a, 'b> GraphvizVisitor<'a, 'b> { } } -impl<'n, 'a, 'b> TreeNodeVisitor<'n> f

Re: [I] [EPIC] Add support for all array expressions [datafusion-comet]

2024-11-28 Thread via GitHub
SemyonSinchenko commented on issue #1042: URL: https://github.com/apache/datafusion-comet/issues/1042#issuecomment-2506696139 I would like to work on `array_zip`

Re: [PR] refactor: add `get_available_parallelism` function [datafusion]

2024-11-28 Thread via GitHub
alan910127 commented on PR #13595: URL: https://github.com/apache/datafusion/pull/13595#issuecomment-2506695237 Hi @comphead, Thanks for your review! I just checked the pipeline logs, the clippy errors are not related to my changes. Perhaps it's what @findepi mentioned in the previous comme

Re: [PR] Apply clippy fixes for Rust 1.83 [datafusion]

2024-11-28 Thread via GitHub
comphead commented on code in PR #13596: URL: https://github.com/apache/datafusion/pull/13596#discussion_r1862634805 ## datafusion/functions-nested/src/concat.rs: ## @@ -438,7 +438,7 @@ fn concat_internal(args: &[ArrayRef]) -> Result { Ok(Arc::new(list_arr)) } -/// Kern

Re: [PR] Temporarily pin toolchain version to avoid clippy [datafusion]

2024-11-28 Thread via GitHub
findepi commented on PR #13598: URL: https://github.com/apache/datafusion/pull/13598#issuecomment-2506687640 cc @Dandandan @alamb let's maybe merge this in, since the PR builds already started to fail -- https://github.com/apache/datafusion/pull/13595#issuecomment-2506686762

Re: [PR] refactor: add `get_available_parallelism` function [datafusion]

2024-11-28 Thread via GitHub
findepi commented on PR #13595: URL: https://github.com/apache/datafusion/pull/13595#issuecomment-2506686762 The [clippy](https://github.com/apache/datafusion/actions/runs/12073608931/job/33672780242?pr=13595#logs) job failed. This is probably nothing wrong with this PR, see https://github

Re: [I] Improve performance of db-benchmark query 8 [datafusion]

2024-11-28 Thread via GitHub
alan910127 commented on issue #13586: URL: https://github.com/apache/datafusion/issues/13586#issuecomment-2506668979 take

Re: [PR] Fix clippy warnings on rust 1.83 [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
iffyio commented on PR #1570: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1570#issuecomment-250558 cc @alamb

[PR] Fix clippy warnings on rust 1.83 [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
iffyio opened a new pull request, #1570: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1570 Fixes [clippy warnings](https://github.com/apache/datafusion-sqlparser-rs/actions/runs/12071816871/job/33664408289?pr=1552) on 1.83

[PR] Temporarily pin toolchain version to avoid clippy [datafusion]

2024-11-28 Thread via GitHub
findepi opened a new pull request, #13598: URL: https://github.com/apache/datafusion/pull/13598 Temporarily pin toolchain version until problems reported by newer clippy release are solved. Workaround for https://github.com/apache/datafusion/issues/13597

[PR] Apply clippy fixes [datafusion]

2024-11-28 Thread via GitHub
findepi opened a new pull request, #13596: URL: https://github.com/apache/datafusion/pull/13596 `dev/rust_lint.sh` no longer passes for me, maybe because of `rustup update`. This is the first portion of fixes suggested by clippy.

Re: [I] Spark support only i32 indexed arrays while comet is trying to support both i32 and i64 [datafusion-comet]

2024-11-28 Thread via GitHub
SemyonSinchenko closed issue #1114: Spark support only i32 indexed arrays while comet is trying to support both i32 and i64 URL: https://github.com/apache/datafusion-comet/issues/1114

Re: [PR] chore: Make list.rs non generic & simplify the code [datafusion-comet]

2024-11-28 Thread via GitHub
SemyonSinchenko closed pull request #1118: chore: Make list.rs non generic & simplify the code URL: https://github.com/apache/datafusion-comet/pull/1118

Re: [PR] feat: support array_insert [datafusion-comet]

2024-11-28 Thread via GitHub
SemyonSinchenko commented on code in PR #1073: URL: https://github.com/apache/datafusion-comet/pull/1073#discussion_r1862564708 ## native/spark-expr/src/list.rs: ## @@ -413,14 +426,297 @@ impl PartialEq for GetArrayStructFields { } } +#[derive(Debug, Hash)] +pub struct A

Re: [PR] chore: Make list.rs non generic & simplify the code [datafusion-comet]

2024-11-28 Thread via GitHub
SemyonSinchenko commented on PR #1118: URL: https://github.com/apache/datafusion-comet/pull/1118#issuecomment-2506605090 There is a [valid argument](https://github.com/apache/datafusion-comet/pull/1073#discussion_r1862550710) against it: > The difference I think is that a LargeList can

[PR] refactor: add `get_available_parallelism` function [datafusion]

2024-11-28 Thread via GitHub
alan910127 opened a new pull request, #13595: URL: https://github.com/apache/datafusion/pull/13595 ## Which issue does this PR close? Closes #13591. ## Rationale for this change In https://github.com/apache/datafusion/pull/13579#pullrequestreview-2465606280,

Re: [PR] feat: support array_insert [datafusion-comet]

2024-11-28 Thread via GitHub
Kimahriman commented on code in PR #1073: URL: https://github.com/apache/datafusion-comet/pull/1073#discussion_r1862550710 ## native/spark-expr/src/list.rs: ## @@ -413,14 +426,297 @@ impl PartialEq for GetArrayStructFields { } } +#[derive(Debug, Hash)] +pub struct ArrayI

Re: [PR] feat: add expression array_size [datafusion-comet]

2024-11-28 Thread via GitHub
Kimahriman commented on code in PR #1122: URL: https://github.com/apache/datafusion-comet/pull/1122#discussion_r1862531038 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2220,6 +2220,16 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerd

Re: [PR] Add `#[recursive]` [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
Eason0729 commented on PR #1522: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1522#issuecomment-2506522979 It seems like we reached the decision to add `recursive` instead of using the underlying dependency (stacker).

Re: [I] Move available_parallelism() into utility function [datafusion]

2024-11-28 Thread via GitHub
alan910127 commented on issue #13591: URL: https://github.com/apache/datafusion/issues/13591#issuecomment-2506469259 take

Re: [PR] Support parsing optional nulls handling for unique constraint [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
iffyio commented on code in PR #1567: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1567#discussion_r1862450915 ## tests/sqlparser_postgres.rs: ## @@ -594,6 +594,13 @@ fn parse_alter_table_constraints_rename() { } } +#[test] +fn parse_alter_table_constrain

[I] Main branch, linter failure on new Rust version [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
demetribu opened a new issue, #1569: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1569 It seems the main branch is currently failing Clippy lint checks against Rust 1.83.0. The previous successful run was on stable-x86_64-unknown-linux-gnu unchanged - rustc 1.82.0. cc @ala

Re: [PR] Fix displaying WORK or TRANSACTION after BEGIN [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
iffyio commented on code in PR #1565: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1565#discussion_r1862429680 ## src/ast/mod.rs: ## @@ -5133,6 +5138,24 @@ pub enum TruncateCascadeOption { Restrict, } +/// Transaction started with [ TRANSACTION | WORK ] +

Re: [PR] Fix: JOIN should require ON condition [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
demetribu commented on PR #1552: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1552#issuecomment-2506430850 @iffyio It seems the main branch is currently failing Clippy lint checks against Rust 1.83.0. cc @alamb

Re: [PR] Fix MySQL parsing of GRANT, REVOKE, and CREATE VIEW [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
iffyio commented on code in PR #1538: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1538#discussion_r1862426564 ## src/ast/mod.rs: ## @@ -7377,15 +7420,84 @@ pub enum MySQLColumnPosition { impl Display for MySQLColumnPosition { fn fmt(&self, f: &mut fmt::For

Re: [I] [DISCUSSION] Making it easier to use DataFusion (lessons from GlareDB) [datafusion]

2024-11-28 Thread via GitHub
waynexia commented on issue #13525: URL: https://github.com/apache/datafusion/issues/13525#issuecomment-2506407906 For the `object_store` specific problem, it doesn't have wasm support so far as I know. I've used opendal as a workaround in https://github.com/datafusion-contrib/datafusion-wa

Re: [I] [EPIC] Improved aggregate function performance [datafusion]

2024-11-28 Thread via GitHub
Rachelint commented on issue #13548: URL: https://github.com/apache/datafusion/issues/13548#issuecomment-2506349949 > > I am using a M3 Macbook with 16 GB of RAM. How much RAM does your machine have? Perhaps DataFusion only struggles with query 9 when the machine doesn't have lots of extra

Re: [I] [DISCUSSION] Making it easier to use DataFusion (lessons from GlareDB) [datafusion]

2024-11-28 Thread via GitHub
goldmedal commented on issue #13525: URL: https://github.com/apache/datafusion/issues/13525#issuecomment-2506337026 > I didn't mean to imply it, just that it's not something actively tested/developed for in DataFusion. I can't recall the exact issue, but late last year (2023) I was working

Re: [I] Improve vectorized operations of `GroupColumn` [datafusion]

2024-11-28 Thread via GitHub
Rachelint commented on issue #13275: URL: https://github.com/apache/datafusion/issues/13275#issuecomment-2506293751 > I tried the possible optimizations for `vectorized_equal_to`: [#12996 (comment)](https://github.com/apache/datafusion/pull/12996#discussion_r1818601807) [#12996 (comment)](h

Re: [PR] Enhance the nested type access for Generic and DuckDB dialect [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
goldmedal commented on PR #1541: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1541#issuecomment-2506297347 Moved to another design, #1551; see https://github.com/apache/datafusion-sqlparser-rs/pull/1541#discussion_r1861002348 for the details.

Re: [PR] Enhance the nested type access for Generic and DuckDB dialect [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
goldmedal commented on code in PR #1541: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1541#discussion_r1862331967 ## src/parser/mod.rs: ## @@ -2935,12 +2935,23 @@ impl<'a> Parser<'a> { }) } else if Token::LBracket == tok { if di

Re: [PR] Enhance the nested type access for Generic and DuckDB dialect [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
goldmedal closed pull request #1541: Enhance the nested type access for Generic and DuckDB dialect URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1541

Re: [PR] Support relation visitor to visit the `Option` field [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
goldmedal commented on code in PR #1556: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1556#discussion_r1862328322 ## derive/src/lib.rs: ## @@ -256,3 +265,16 @@ fn visit_children( Data::Union(_) => unimplemented!(), } } + +fn is_option(ty: &Type) ->

Re: [I] Improve vectorized operations of `GroupColumn` [datafusion]

2024-11-28 Thread via GitHub
Rachelint commented on issue #13275: URL: https://github.com/apache/datafusion/issues/13275#issuecomment-2506290147 I tried the possible optimizations for `vectorized_equal_to`: https://github.com/apache/datafusion/pull/12996#discussion_r1818601807 https://github.com/apache/datafusion/pu

Re: [PR] Support relation visitor to visit the `Option` field [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
goldmedal commented on code in PR #1556: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1556#discussion_r1862311298 ## src/ast/mod.rs: ## @@ -7653,6 +7653,7 @@ impl fmt::Display for ShowStatementInParentType { pub struct ShowStatementIn { pub clause: ShowStat

Re: [PR] Doc gen: Attributes to support `related_udf`, `alternative_syntax` [datafusion]

2024-11-28 Thread via GitHub
Omega359 commented on code in PR #13575: URL: https://github.com/apache/datafusion/pull/13575#discussion_r1862252991 ## datafusion/doc/src/lib.rs: ## @@ -86,29 +90,30 @@ pub struct DocSection { /// description: None, /// }; /// -/// let documentation = Documen

Re: [PR] chore: Update python files [datafusion-ballista]

2024-11-28 Thread via GitHub
milenkovicm closed pull request #1141: chore: Update python files URL: https://github.com/apache/datafusion-ballista/pull/1141

Re: [PR] Implement GroupsAccumulator for corr(x,y) aggregate function [datafusion]

2024-11-28 Thread via GitHub
Dandandan commented on code in PR #13581: URL: https://github.com/apache/datafusion/pull/13581#discussion_r1862087935 ## datafusion/functions-aggregate-common/src/aggregate/groups_accumulator/accumulate.rs: ## @@ -371,6 +371,69 @@ pub fn accumulate( } } +/// Accumulates

[PR] feat(substrait): remove dependency on datafusion default features [datafusion]

2024-11-28 Thread via GitHub
notfilippo opened a new pull request, #13594: URL: https://github.com/apache/datafusion/pull/13594 ## Which issue does this PR close? Closes #13593 ## What changes are included in this PR? - `physical` feature, which enables production and consumption of physical substr

Re: [I] [EPIC] Improved aggregate function performance [datafusion]

2024-11-28 Thread via GitHub
2010YOUY01 commented on issue #13548: URL: https://github.com/apache/datafusion/issues/13548#issuecomment-2505971804 > I am using a M3 Macbook with 16 GB of RAM. How much RAM does your machine have? Perhaps DataFusion only struggles with query 9 when the machine doesn't have lots of extra R

Re: [I] [substrait] make dependency on parquet optional [datafusion]

2024-11-28 Thread via GitHub
notfilippo commented on issue #13593: URL: https://github.com/apache/datafusion/issues/13593#issuecomment-2505968846 take

[I] [substrait] make dependency on parquet optional [datafusion]

2024-11-28 Thread via GitHub
notfilippo opened a new issue, #13593: URL: https://github.com/apache/datafusion/issues/13593 ### Is your feature request related to a problem or challenge? The `datafusion-substrait` package currently depends on the `datafusion` crate with all default features **enabled**. This can f

Re: [PR] Implement GroupsAccumulator for corr(x,y) aggregate function [datafusion]

2024-11-28 Thread via GitHub
2010YOUY01 commented on code in PR #13581: URL: https://github.com/apache/datafusion/pull/13581#discussion_r1862055020 ## datafusion/functions-aggregate-common/src/aggregate/groups_accumulator/accumulate.rs: ## @@ -371,6 +371,87 @@ pub fn accumulate( } } +/// Accumulates

Re: [PR] Fix `LogicalPlan::..._with_subqueries` methods [datafusion]

2024-11-28 Thread via GitHub
peter-toth commented on code in PR #13589: URL: https://github.com/apache/datafusion/pull/13589#discussion_r1862019495 ## datafusion/expr/src/logical_plan/tree_node.rs: ## @@ -710,13 +714,12 @@ impl LogicalPlan { node: &LogicalPlan, f: &mut F,

Re: [PR] Fix `LogicalPlan::..._with_subqueries` methods [datafusion]

2024-11-28 Thread via GitHub
peter-toth commented on PR #13589: URL: https://github.com/apache/datafusion/pull/13589#issuecomment-2505915082 > I verified test coverage by running the test without the code changes and it failed > > ``` > assertion failed: !filter_found > thread 'logical_plan::plan::tests::te

Re: [PR] Support relation visitor to visit the `Option` field [datafusion-sqlparser-rs]

2024-11-28 Thread via GitHub
alamb commented on code in PR #1556: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1556#discussion_r1862005838 ## src/ast/mod.rs: ## @@ -7653,6 +7653,7 @@ impl fmt::Display for ShowStatementInParentType { pub struct ShowStatementIn { pub clause: ShowStatemen

Re: [PR] Deprecate `adjust_output_array` in favor of `PrimitiveArray::with_data_type` [datafusion]

2024-11-28 Thread via GitHub
alamb commented on code in PR #13585: URL: https://github.com/apache/datafusion/pull/13585#discussion_r1861280467 ## datafusion/physical-plan/src/aggregates/topk/heap.rs: ## @@ -151,10 +151,11 @@ where } fn drain(&mut self) -> (ArrayRef, Vec) { +let nulls = N

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-28 Thread via GitHub
alamb commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2505853129 Thank you @leoyvens -- this looks epic. I will review this PR but I may not have a chance to do so for a day or two. It looks awesome

Re: [PR] Fix `LogicalPlan::..._with_subqueries` methods [datafusion]

2024-11-28 Thread via GitHub
alamb commented on code in PR #13589: URL: https://github.com/apache/datafusion/pull/13589#discussion_r1861976956 ## datafusion/expr/src/logical_plan/tree_node.rs: ## @@ -710,13 +714,12 @@ impl LogicalPlan { node: &LogicalPlan, f: &mut F, ) ->

[PR] Add SimpleScalarUDF::new_with_signature [datafusion]

2024-11-28 Thread via GitHub
findepi opened a new pull request, #13592: URL: https://github.com/apache/datafusion/pull/13592 This is helpful for simple function implementations or function stubs, when `Signature::exact` is not desired.

[I] Move available_parallelism() into utility function [datafusion]

2024-11-28 Thread via GitHub
Dandandan opened a new issue, #13591: URL: https://github.com/apache/datafusion/issues/13591 ### Is your feature request related to a problem or challenge? As noted by @comphead in https://github.com/apache/datafusion/pull/13579#pullrequestreview-2465606280 We can move the repeate
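For readers following this thread, a minimal sketch of what such a utility function might look like, using only the Rust standard library. The function name `get_available_parallelism` and the fallback value of 1 are assumptions made for illustration; they are not taken from the issue text or from the DataFusion code base.

```rust
use std::thread;

/// Hypothetical helper (name assumed for illustration): returns the number of
/// threads the program can reasonably use, so callers do not have to repeat
/// the same `available_parallelism()` boilerplate at every call site.
pub fn get_available_parallelism() -> usize {
    // `available_parallelism` returns `io::Result<NonZeroUsize>`; it can fail
    // on platforms where the value cannot be queried.
    thread::available_parallelism()
        .map(|n| n.get())
        // Assumed fallback: treat an unknown value as a single thread.
        .unwrap_or(1)
}
```

Call sites would then read `let partitions = get_available_parallelism();` instead of handling the `Result` and `NonZeroUsize` each time. Whether 1 is the right fallback depends on the caller.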

Re: [PR] [Minor] Use std::thread::available_parallelism instead of `num_cpus` [datafusion]

2024-11-28 Thread via GitHub
Dandandan merged PR #13579: URL: https://github.com/apache/datafusion/pull/13579

[PR] More rigorous treatment of floats in tests [datafusion]

2024-11-28 Thread via GitHub
leoyvens opened a new pull request, #13590: URL: https://github.com/apache/datafusion/pull/13590 ## Which issue does this PR close? My motivation was to improve DF testing of float outputs, even at the least-significant digits. The situation in #13569 seemed a bit uncomfortable

Re: [PR] Implement GroupsAccumulator for corr(x,y) aggregate function [datafusion]

2024-11-28 Thread via GitHub
Dandandan commented on code in PR #13581: URL: https://github.com/apache/datafusion/pull/13581#discussion_r1861872899 ## datafusion/functions-aggregate-common/src/aggregate/groups_accumulator/accumulate.rs: ## @@ -371,6 +371,87 @@ pub fn accumulate( } } +/// Accumulates

Re: [PR] Fix `LogicalPlan::..._with_subqueries` methods [datafusion]

2024-11-28 Thread via GitHub
peter-toth commented on PR #13589: URL: https://github.com/apache/datafusion/pull/13589#issuecomment-2505714389 cc @alamb , @berkaysynnada

Re: [PR] chore: Fix jdk documentation for spark [datafusion-comet]

2024-11-28 Thread via GitHub
adi-kmt closed pull request #979: chore: Fix jdk documentation for spark URL: https://github.com/apache/datafusion-comet/pull/979

Re: [PR] chore: rename known project ZincObserve to OpenObserve [datafusion]

2024-11-28 Thread via GitHub
Weijun-H merged PR #13587: URL: https://github.com/apache/datafusion/pull/13587

Re: [PR] Supplement as_*_array functions [datafusion]

2024-11-28 Thread via GitHub
Weijun-H commented on PR #13580: URL: https://github.com/apache/datafusion/pull/13580#issuecomment-2505646747 Thanks @alamb for reviewing
