Re: [PR] fix: panic and incorrect results in `LogFunc::output_ordering()` [datafusion]

2024-07-22 Thread via GitHub
berkaysynnada commented on code in PR #11571: URL: https://github.com/apache/datafusion/pull/11571#discussion_r1686068918 ## datafusion/functions/src/math/log.rs: ## @@ -334,4 +341,94 @@ mod tests { assert_eq!(args[0], lit(2)); assert_eq!(args[1], lit(3));

[I] Add NullState::is_null public method [datafusion]

2024-07-22 Thread via GitHub
joroKr21 opened a new issue, #11591: URL: https://github.com/apache/datafusion/issues/11591 ### Is your feature request related to a problem or challenge? For some group accumulators it would be necessary to check if the previous value was `null` explicitly. E.g. consider implementing

Re: [PR] Normalize float zeros (-0.0 -> 0.0) before binary operations [datafusion]

2024-07-22 Thread via GitHub
MohamedAbdeen21 commented on code in PR #11585: URL: https://github.com/apache/datafusion/pull/11585#discussion_r1686180731 ## datafusion/physical-expr/src/expressions/binary.rs: ## @@ -289,6 +289,14 @@ impl PhysicalExpr for BinaryExpr { return apply_cmp_for_nested(
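The PR title refers to the two IEEE 754 encodings of zero. The snippet above is truncated, so the following is only a minimal standalone sketch of why normalizing `-0.0` to `0.0` can matter; `normalize_zero` is a hypothetical helper for illustration and is not taken from the PR or from DataFusion.

```rust
/// Hypothetical helper: map -0.0 to +0.0 and leave every other value
/// (including NaN) untouched. Both zeros compare equal to 0.0 under `==`.
fn normalize_zero(x: f64) -> f64 {
    if x == 0.0 {
        0.0
    } else {
        x
    }
}

fn main() {
    let neg = -0.0_f64;
    let pos = 0.0_f64;

    // IEEE 754 comparison treats the two zeros as equal...
    println!("equal under ==: {}", neg == pos);
    // ...but their bit patterns differ, so bitwise or total-order
    // comparisons can tell them apart.
    println!("same bits: {}", neg.to_bits() == pos.to_bits());
    println!("total_cmp: {:?}", neg.total_cmp(&pos));

    // After normalization the bit patterns agree.
    println!(
        "same bits after normalization: {}",
        normalize_zero(neg).to_bits() == pos.to_bits()
    );
}
```

Any code path that compares or hashes floats by bit pattern, rather than by `==`, would see the two zeros as different values unless they are normalized first.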

Re: [PR] fix: panic and incorrect results in `LogFunc::output_ordering()` [datafusion]

2024-07-22 Thread via GitHub
jonahgao commented on code in PR #11571: URL: https://github.com/apache/datafusion/pull/11571#discussion_r1686182162 ## datafusion/sqllogictest/test_files/order.slt: ## @@ -512,7 +519,7 @@ CREATE EXTERNAL TABLE aggregate_test_100 ( ) STORED AS CSV WITH ORDER(c11) -WITH ORDER(

[PR] Add NullState::is_valid and NullState::is_null [datafusion]

2024-07-22 Thread via GitHub
joroKr21 opened a new pull request, #11592: URL: https://github.com/apache/datafusion/pull/11592 ## Which issue does this PR close? Closes #11591. ## Rationale for this change Provide more flexibility for implementing `GroupsAccumulator` which often make

Re: [PR] fix: panic and incorrect results in `LogFunc::output_ordering()` [datafusion]

2024-07-22 Thread via GitHub
jonahgao commented on code in PR #11571: URL: https://github.com/apache/datafusion/pull/11571#discussion_r1686201995 ## datafusion/functions/src/math/log.rs: ## @@ -82,7 +82,13 @@ impl ScalarUDFImpl for LogFunc { } fn output_ordering(&self, input: &[ExprProperties])

[I] Typo in for datafusion::physical_plan::Partitioning [datafusion]

2024-07-22 Thread via GitHub
waruto210 opened a new issue, #11593: URL: https://github.com/apache/datafusion/issues/11593 ### Describe the bug In https://docs.rs/datafusion/latest/datafusion/physical_plan/enum.Partitioning.html ![image](https://github.com/user-attachments/assets/ea26e8b4-3ed9-4df1-a80c-3af238

Re: [PR] Feature/alternate function extension [datafusion]

2024-07-22 Thread via GitHub
jayzhan211 commented on PR #11582: URL: https://github.com/apache/datafusion/pull/11582#issuecomment-2242536741 I had tried to directly modify `Expr` before, but ends up builder API. I couldn't remember the reason -- This is an automated message from the Apache Git Service. To respond to

Re: [I] Internal error when there is a bitwise operation in `order by` clause (SQLancer) [datafusion]

2024-07-22 Thread via GitHub
nix010 commented on issue #11561: URL: https://github.com/apache/datafusion/issues/11561#issuecomment-2242586367 take

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-22 Thread via GitHub
berkaysynnada commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1686318175 ## datafusion/functions/src/math/bounds.rs: ## @@ -0,0 +1,137 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-22 Thread via GitHub
berkaysynnada commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1686325658 ## datafusion/functions/src/math/bounds.rs: ## @@ -0,0 +1,137 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-22 Thread via GitHub
berkaysynnada commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1686326772 ## datafusion/functions/src/math/monotonicity.rs: ## @@ -15,24 +15,17 @@ // specific language governing permissions and limitations // under the License.

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-22 Thread via GitHub
berkaysynnada commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1686333818 ## datafusion/functions/src/math/bounds.rs: ## @@ -0,0 +1,137 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [I] Prototype implementing DataFusion functions / operators using `arrow-udf` library [datafusion]

2024-07-22 Thread via GitHub
wangrunji0408 commented on issue #11413: URL: https://github.com/apache/datafusion/issues/11413#issuecomment-2242629113 > > An example would be > > FWIW this implementation of concat would likely perform pretty poorly compared to a hand written one as it will both create and allocate

Re: [I] Clippy is not happy now [datafusion-comet]

2024-07-22 Thread via GitHub
Xuanwo commented on issue #700: URL: https://github.com/apache/datafusion-comet/issues/700#issuecomment-2242674685 Hi, @andygrove and @viirya, do you think it's a good idea to introduce clippy checks in CI? I'm happy to finish it.

[I] Expression Simplifier doesn't consider associativity (`(i + 1) + 2)` is not simplified to `i + 3`) [datafusion]

2024-07-22 Thread via GitHub
alamb opened a new issue, #11594: URL: https://github.com/apache/datafusion/issues/11594 ### Is your feature request related to a problem or challenge? DataFusion will simplify expressions like this: `i + (1 + 2)` => `i + 3` However, it will not simplify `i + 1 + 2` (remains `i
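The behavior described in this issue can be sketched on a toy expression tree. The `Expr` enum and both functions below are hypothetical illustrations and are not DataFusion's `Expr` type or its simplifier. `fold` only folds `literal + literal`, so it leaves `(i + 1) + 2` alone; `fold_assoc` additionally rewrites `(x + a) + b` into `x + (a + b)` when `a` and `b` are literals.

```rust
// Toy expression tree for illustration only; NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Col(String),
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
}

fn add(l: Expr, r: Expr) -> Expr {
    Expr::Add(Box::new(l), Box::new(r))
}

/// Plain constant folding: only `Lit + Lit` is simplified.
fn fold(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Lit(a), Expr::Lit(b)) => Expr::Lit(a + b),
            (l, r) => add(l, r),
        },
        other => other,
    }
}

/// Folding plus one associativity rule: `(x + a) + b` => `x + (a + b)`.
fn fold_assoc(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => match (fold_assoc(*l), fold_assoc(*r)) {
            (Expr::Lit(a), Expr::Lit(b)) => Expr::Lit(a + b),
            (Expr::Add(x, a), Expr::Lit(b)) => match *a {
                // The left operand ends in a literal: merge the two literals.
                Expr::Lit(a) => add(*x, Expr::Lit(a + b)),
                // Otherwise rebuild the expression unchanged.
                a => add(add(*x, a), Expr::Lit(b)),
            },
            (l, r) => add(l, r),
        },
        other => other,
    }
}

fn main() {
    let i = || Expr::Col("i".to_string());
    let left_nested = add(add(i(), Expr::Lit(1)), Expr::Lit(2)); // (i + 1) + 2
    println!("fold:       {:?}", fold(left_nested.clone()));
    println!("fold_assoc: {:?}", fold_assoc(left_nested));
}
```

A real rewrite of this kind would also need to consider integer overflow and the fact that floating-point addition is not associative; the sketch ignores both.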

Re: [I] Expression Simplifier doesn't consider associativity (`(i + 1) + 2)` is not simplified to `i + 3`) [datafusion]

2024-07-22 Thread via GitHub
alamb commented on issue #11594: URL: https://github.com/apache/datafusion/issues/11594#issuecomment-2242687644 Here is the full example from Discord: ```rust use datafusion::arrow::datatypes::{DataType, Field, Schema, TimeUnit}; use datafusion::optimizer::simplify_expressions::E

Re: [PR] Fix unparser invalid sql for query with order [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11527: URL: https://github.com/apache/datafusion/pull/11527

Re: [PR] Fix unparser invalid sql for query with order [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11527: URL: https://github.com/apache/datafusion/pull/11527#issuecomment-2242688838 Thanks @y-f-u

Re: [PR] Consistent approach to setting parameters on aggregate functions and window functions [datafusion]

2024-07-22 Thread via GitHub
timsaucer commented on PR #11550: URL: https://github.com/apache/datafusion/pull/11550#issuecomment-2242694088 Copying @jayzhan211 's comment from > I had tried to directly modify Expr before, but ends up builder API. I couldn't remember the reason My *guess* is that it has to

Re: [PR] Normalize float zeros (-0.0 -> 0.0) before binary operations [datafusion]

2024-07-22 Thread via GitHub
MohamedAbdeen21 commented on code in PR #11585: URL: https://github.com/apache/datafusion/pull/11585#discussion_r1686376343 ## datafusion/physical-expr/src/expressions/binary.rs: ## @@ -289,6 +289,14 @@ impl PhysicalExpr for BinaryExpr { return apply_cmp_for_nested(

Re: [PR] Normalize float zeros (-0.0 -> 0.0) before binary operations [datafusion]

2024-07-22 Thread via GitHub
MohamedAbdeen21 commented on code in PR #11585: URL: https://github.com/apache/datafusion/pull/11585#discussion_r1686392227 ## datafusion/physical-expr/src/expressions/binary.rs: ## @@ -289,6 +289,14 @@ impl PhysicalExpr for BinaryExpr { return apply_cmp_for_nested(

Re: [I] Expression Simplifier doesn't consider associativity (`(i + 1) + 2)` is not simplified to `i + 3`) [datafusion]

2024-07-22 Thread via GitHub
timsaucer commented on issue #11594: URL: https://github.com/apache/datafusion/issues/11594#issuecomment-2242754551 Off the cuff, if we go down this avenue I'd imagine wanting to check for associative, commutative, and distributive properties. For example, `(3*x + 8) + (5 + x*9) = 12x

Re: [PR] Normalize float zeros (-0.0 -> 0.0) before binary operations [datafusion]

2024-07-22 Thread via GitHub
ozankabak commented on code in PR #11585: URL: https://github.com/apache/datafusion/pull/11585#discussion_r1686423397 ## datafusion/physical-expr/src/expressions/binary.rs: ## @@ -289,6 +289,14 @@ impl PhysicalExpr for BinaryExpr { return apply_cmp_for_nested(self.o

Re: [PR] Normalize float zeros (-0.0 -> 0.0) before binary operations [datafusion]

2024-07-22 Thread via GitHub
MohamedAbdeen21 closed pull request #11585: Normalize float zeros (-0.0 -> 0.0) before binary operations URL: https://github.com/apache/datafusion/pull/11585

Re: [PR] Change default Parquet writer settings to match arrow-rs (except for compression & statistics) [datafusion]

2024-07-22 Thread via GitHub
alamb commented on code in PR #11558: URL: https://github.com/apache/datafusion/pull/11558#discussion_r1686428586 ## docs/source/user-guide/configs.md: ## @@ -58,15 +58,15 @@ Environment variables are read during `SessionConfig` initialisation so they mus | datafusion.executio

Re: [PR] Change default Parquet writer settings to match arrow-rs (except for compression & statistics) [datafusion]

2024-07-22 Thread via GitHub
alamb commented on code in PR #11558: URL: https://github.com/apache/datafusion/pull/11558#discussion_r1686431340 ## datafusion/common/src/file_options/parquet_writer.rs: ## @@ -712,35 +662,13 @@ mod tests { "should see the extern parquet's default over-riding dataf

Re: [PR] Change default Parquet writer settings to match arrow-rs (except for compression & statistics) [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11558: URL: https://github.com/apache/datafusion/pull/11558#issuecomment-2242778687 I restarted the failed pyarrow check as it seemed to be an infrastructure problem

Re: [PR] [draft] Add `LogicalType`, try to support user-defined types [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #8143: URL: https://github.com/apache/datafusion/pull/8143#issuecomment-2242780414 FYI https://github.com/apache/datafusion/pull/11160 tracks a new proposal for this feature. It seems to be gaining traction

Re: [PR] Add NullState::is_valid and NullState::is_null [datafusion]

2024-07-22 Thread via GitHub
joroKr21 commented on code in PR #11592: URL: https://github.com/apache/datafusion/pull/11592#discussion_r1686448252 ## datafusion/physical-expr-common/src/aggregate/groups_accumulator/accumulate.rs: ## @@ -324,6 +324,18 @@ impl NullState { } } +/// Check if

Re: [PR] chore: Make rust clippy happy [datafusion-comet]

2024-07-22 Thread via GitHub
alamb commented on PR #701: URL: https://github.com/apache/datafusion-comet/pull/701#issuecomment-2242808267 Happy Rust 1.80 day! https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-180-2024-07-25

Re: [PR] Add blog post announcing Comet 0.1.0 [datafusion-site]

2024-07-22 Thread via GitHub
alamb commented on code in PR #7: URL: https://github.com/apache/datafusion-site/pull/7#discussion_r1686446998 ## _posts/2024-07-20-datafusion-comet-0.1.0.md: ## @@ -0,0 +1,153 @@ +--- +layout: post +title: "Apache DataFusion Comet 0.1.0 Release" +date: "2024-07-20 00:00:00" +au

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-07-22 Thread via GitHub
findepi commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2242825107 I like the idea of separating logical types from arrow types, but it would be great to understand the exact consequences. DataFusion is both a SQL execution engine (so it has

Re: [PR] Initial support for regex_replace on `StringViewArray` [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11556: URL: https://github.com/apache/datafusion/pull/11556

Re: [PR] Add NullState::is_valid and NullState::is_null [datafusion]

2024-07-22 Thread via GitHub
alamb commented on code in PR #11592: URL: https://github.com/apache/datafusion/pull/11592#discussion_r1686464191 ## datafusion/physical-expr-common/src/aggregate/groups_accumulator/accumulate.rs: ## @@ -324,6 +324,18 @@ impl NullState { } } +/// Check if the

Re: [PR] chore: Make rust clippy happy [datafusion-comet]

2024-07-22 Thread via GitHub
Xuanwo commented on PR #701: URL: https://github.com/apache/datafusion-comet/pull/701#issuecomment-2242831348 > Happy Rust 1.80 day! https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-180-2024-07-25 Will we need to bump our rust-toolchain to 1.80? I'm happy to create an

Re: [PR] chore: Minor cleanup `simplify_demo()` example [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11576: URL: https://github.com/apache/datafusion/pull/11576

Re: [I] Why is there no content in `Extending DataFusion’s operators: custom LogicalPlan and Execution Plan` [datafusion]

2024-07-22 Thread via GitHub
alamb commented on issue #11590: URL: https://github.com/apache/datafusion/issues/11590#issuecomment-2242838596 Btw I would love to review a PR for this content 🎣

Re: [I] Why is there no content in `Extending DataFusion’s operators: custom LogicalPlan and Execution Plan` [datafusion]

2024-07-22 Thread via GitHub
alamb commented on issue #11590: URL: https://github.com/apache/datafusion/issues/11590#issuecomment-2242838216 I think the answer is that we are waiting for someone to write it (or maybe adapt it from the existing examples) ![Screenshot 2024-07-22 at 8 28 18  AM](https://github.c

Re: [PR] Minor: move `Column` related tests [datafusion]

2024-07-22 Thread via GitHub
alamb commented on code in PR #11573: URL: https://github.com/apache/datafusion/pull/11573#discussion_r1686471430 ## datafusion/physical-expr/src/expressions/column.rs: ## @@ -100,49 +100,3 @@ impl PartialEq for UnKnownColumn { false } } - Review Comment: It s

Re: [I] Implement `evaluate_bounds` for math unary functions [datafusion]

2024-07-22 Thread via GitHub
alamb commented on issue #11583: URL: https://github.com/apache/datafusion/issues/11583#issuecomment-2242842713 FYI @mustafasrepo and @boazberman

Re: [PR] Move Datafusion Query Optimizer to library user guide [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11563: URL: https://github.com/apache/datafusion/pull/11563

Re: [PR] Move Datafusion Query Optimizer to library user guide [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11563: URL: https://github.com/apache/datafusion/pull/11563#issuecomment-2242844487 Thanks again @devesh-2002

Re: [PR] Move Datafusion Query Optimizer to library user guide [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11563: URL: https://github.com/apache/datafusion/pull/11563#issuecomment-2242844287 I also filed https://github.com/apache/datafusion/issues/11581 to track updating this content

Re: [PR] feat: Error when a SHOW command is passed in with an accompanying non-existent variable [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11540: URL: https://github.com/apache/datafusion/pull/11540

Re: [I] `SHOW NONSENSE` does not error [datafusion]

2024-07-22 Thread via GitHub
alamb closed issue #11529: `SHOW NONSENSE` does not error URL: https://github.com/apache/datafusion/issues/11529

Re: [PR] feat: Error when a SHOW command is passed in with an accompanying non-existent variable [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11540: URL: https://github.com/apache/datafusion/pull/11540#issuecomment-2242845119 Thanks again @itsjunetime

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11516: URL: https://github.com/apache/datafusion/pull/11516#issuecomment-2242850429 So it sounds to me like we are converging on a consensus here. I think the next steps for this PR are to 1. resolve the outstanding comments / file tickets for additional follow on

Re: [PR] fix: CASE with NULL [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11542: URL: https://github.com/apache/datafusion/pull/11542

Re: [I] `CASE` with `NULL` branch does not coerce when passed to aggregate function [datafusion]

2024-07-22 Thread via GitHub
alamb closed issue #11258: `CASE` with `NULL` branch does not coerce when passed to aggregate function URL: https://github.com/apache/datafusion/issues/11258

Re: [PR] fix: CASE with NULL [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11542: URL: https://github.com/apache/datafusion/pull/11542#issuecomment-2242851561 Thanks @Weijun-H and @jonahgao

Re: [PR] Provide DataFrame API for `map` and move `map` to `functions-array` [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11560: URL: https://github.com/apache/datafusion/pull/11560

Re: [I] Easier Dataframe API for `map` [datafusion]

2024-07-22 Thread via GitHub
alamb closed issue #11546: Easier Dataframe API for `map` URL: https://github.com/apache/datafusion/issues/11546

Re: [PR] Move OutputRequirements to datafusion-physical-optimizer crate [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11579: URL: https://github.com/apache/datafusion/pull/11579#issuecomment-2242856193 🚀

Re: [PR] Move OutputRequirements to datafusion-physical-optimizer crate [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11579: URL: https://github.com/apache/datafusion/pull/11579

Re: [PR] Add NullState::is_valid and NullState::is_null [datafusion]

2024-07-22 Thread via GitHub
joroKr21 commented on code in PR #11592: URL: https://github.com/apache/datafusion/pull/11592#discussion_r1686483705 ## datafusion/physical-expr-common/src/aggregate/groups_accumulator/accumulate.rs: ## @@ -324,6 +324,18 @@ impl NullState { } } +/// Check if

Re: [I] Why is there no content in `Extending DataFusion’s operators: custom LogicalPlan and Execution Plan` [datafusion]

2024-07-22 Thread via GitHub
sunheyi6 commented on issue #11590: URL: https://github.com/apache/datafusion/issues/11590#issuecomment-2242878483 > Btw I would love to review a PR for this content 🎣 OK, I will try it

Re: [PR] Add NullState::is_valid and NullState::is_null [datafusion]

2024-07-22 Thread via GitHub
joroKr21 commented on code in PR #11592: URL: https://github.com/apache/datafusion/pull/11592#discussion_r1686495035 ## datafusion/physical-expr-common/src/aggregate/groups_accumulator/accumulate.rs: ## @@ -324,6 +324,18 @@ impl NullState { } } +/// Check if

[I] Serialization of UDF might lose aliases [datafusion]

2024-07-22 Thread via GitHub
edmondop opened a new issue, #11595: URL: https://github.com/apache/datafusion/issues/11595 ### Describe the bug As a part of migrating min/max to user-defined aggregate function, the test `'cases::roundtrip_physical_plan::roundtrip_scalar_udf_extension_codec' (datafusion/proto/tests

Re: [PR] Move min and max to user defined aggregate function [datafusion]

2024-07-22 Thread via GitHub
edmondop commented on PR #11013: URL: https://github.com/apache/datafusion/pull/11013#issuecomment-2242879782 I think this is now blocked by https://github.com/apache/datafusion/issues/11595

[I] Add remaining non-wrapped functions [datafusion-python]

2024-07-22 Thread via GitHub
timsaucer opened a new issue, #767: URL: https://github.com/apache/datafusion-python/issues/767 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** We still have a few classes that do not yet have wrapper functions. Namely `datafu

[PR] Initial commit of blog post of datafusion python updates [datafusion-site]

2024-07-22 Thread via GitHub
timsaucer opened a new pull request, #8: URL: https://github.com/apache/datafusion-site/pull/8 Per request. Dates will need to be changed when release 40.0 comes out

Re: [PR] GC `StringViewArray` in `CoalesceBatchesStream` [datafusion]

2024-07-22 Thread via GitHub
XiangpengHao commented on code in PR #11587: URL: https://github.com/apache/datafusion/pull/11587#discussion_r1686519520 ## datafusion/physical-plan/src/coalesce_batches.rs: ## @@ -216,6 +218,41 @@ impl CoalesceBatchesStream { match input_batch { Po

Re: [PR] GC `StringViewArray` in `CoalesceBatchesStream` [datafusion]

2024-07-22 Thread via GitHub
XiangpengHao commented on code in PR #11587: URL: https://github.com/apache/datafusion/pull/11587#discussion_r1686531840 ## datafusion/physical-plan/src/coalesce_batches.rs: ## @@ -216,6 +218,41 @@ impl CoalesceBatchesStream { match input_batch { Po

Re: [PR] GC `StringViewArray` in `CoalesceBatchesStream` [datafusion]

2024-07-22 Thread via GitHub
XiangpengHao commented on code in PR #11587: URL: https://github.com/apache/datafusion/pull/11587#discussion_r1686536860 ## datafusion/physical-plan/src/coalesce_batches.rs: ## @@ -216,6 +218,41 @@ impl CoalesceBatchesStream { match input_batch { Po

Re: [PR] GC `StringViewArray` in `CoalesceBatchesStream` [datafusion]

2024-07-22 Thread via GitHub
XiangpengHao commented on PR #11587: URL: https://github.com/apache/datafusion/pull/11587#issuecomment-2242943692 > This logic (batch in, batch out) should be a separate helper function (maybe living somewhere else, as it could be useful in other contexts too). Agree! I moved the log

Re: [PR] Add support for Utf8View for date/temporal codepaths [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11518: URL: https://github.com/apache/datafusion/pull/11518#issuecomment-2242966262 Thank you very much @a10y I took the liberty of pushing some commits to this branch to get CI to pass. Also I think by doing so the CI will run automatically for this PR from no

Re: [PR] Minor: move `Column` related tests [datafusion]

2024-07-22 Thread via GitHub
jonahgao commented on code in PR #11573: URL: https://github.com/apache/datafusion/pull/11573#discussion_r1686573611 ## datafusion/physical-expr/src/expressions/column.rs: ## @@ -100,49 +100,3 @@ impl PartialEq for UnKnownColumn { false } } - Review Comment: R

Re: [PR] Blog post for release 40.0.0 [datafusion-site]

2024-07-22 Thread via GitHub
alamb commented on code in PR #6: URL: https://github.com/apache/datafusion-site/pull/6#discussion_r1686579649 ## _posts/2024-07-09-datafusion-40.0.0.md: ## @@ -0,0 +1,450 @@ +--- +layout: post +title: "Apache Arrow DataFusion 40.0.0 Released" +date: "2024-07-09 00:00:00" +autho

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on code in PR #11516: URL: https://github.com/apache/datafusion/pull/11516#discussion_r1686590748 ## datafusion/core/src/lib.rs: ## @@ -535,6 +535,11 @@ pub use common::config; // NB datafusion execution is re-exported in the `execution` module +/// re-exp

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on code in PR #11516: URL: https://github.com/apache/datafusion/pull/11516#discussion_r168664 ## datafusion/catalog/src/session.rs: ## @@ -0,0 +1,102 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] Blog post for release 40.0.0 [datafusion-site]

2024-07-22 Thread via GitHub
2010YOUY01 commented on code in PR #6: URL: https://github.com/apache/datafusion-site/pull/6#discussion_r1686602669 ## _posts/2024-07-09-datafusion-40.0.0.md: ## @@ -0,0 +1,450 @@ +--- +layout: post +title: "Apache Arrow DataFusion 40.0.0 Released" +date: "2024-07-09 00:00:00" +

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on code in PR #11516: URL: https://github.com/apache/datafusion/pull/11516#discussion_r1686603262 ## datafusion/core/src/datasource/listing/table.rs: ## @@ -736,13 +736,16 @@ impl TableProvider for ListingTable { async fn scan( &self, -s

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on code in PR #11516: URL: https://github.com/apache/datafusion/pull/11516#discussion_r1686606121 ## datafusion-cli/src/catalog.rs: ## @@ -237,7 +236,7 @@ fn substitute_tilde(cur: String) -> String { mod tests { use super::*; -use datafusion::catalo

Re: [PR] Provide DataFrame API for `map` and move `map` to `functions-array` [datafusion]

2024-07-22 Thread via GitHub
goldmedal commented on PR #11560: URL: https://github.com/apache/datafusion/pull/11560#issuecomment-2243051488 Thanks @jayzhan211 and @alamb for reviewing!

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on PR #11516: URL: https://github.com/apache/datafusion/pull/11516#issuecomment-2243060801 Squashed current code and rebased to resolve conflicts with `main`. No other changes yet.

Re: [PR] Consistent approach to setting parameters on aggregate functions and window functions [datafusion]

2024-07-22 Thread via GitHub
jayzhan211 commented on PR #11550: URL: https://github.com/apache/datafusion/pull/11550#issuecomment-2243056293 I guess the chain of `?` is probably the reason why I choose builder, so we have `build()` for checking validity at once. But I would like to optimize for `user experience`

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on PR #11516: URL: https://github.com/apache/datafusion/pull/11516#issuecomment-2243062441 > we might need more granular trait than single CatalogSession for each trait I agree with you @jayzhan211 that having separate traits for TableProvider, CatalogProvider, Schem

[I] Propagation of ordered `SortProperties` should consider `nulls_first` [datafusion]

2024-07-22 Thread via GitHub
jonahgao opened a new issue, #11596: URL: https://github.com/apache/datafusion/issues/11596 ### Describe the bug As pointed out by @2010YOUY01 in https://github.com/apache/datafusion/pull/11571#discussion_r1685702554 For binary expressions or multi-input functions, if their args a

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on code in PR #11516: URL: https://github.com/apache/datafusion/pull/11516#discussion_r1686626658 ## datafusion/catalog/src/catalog.rs: ## @@ -0,0 +1,173 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] fix: panic and incorrect results in `LogFunc::output_ordering()` [datafusion]

2024-07-22 Thread via GitHub
jonahgao commented on PR #11571: URL: https://github.com/apache/datafusion/pull/11571#issuecomment-2243071315 > The fix looks good to me. Thanks, @jonahgao. Could you also address the missing cases of nulls order checks, or do we report a ticket? Filed https://github.com/apache/datafu

Re: [PR] Minor: move `Column` related tests and rename `column.rs` [datafusion]

2024-07-22 Thread via GitHub
alamb commented on PR #11573: URL: https://github.com/apache/datafusion/pull/11573#issuecomment-2243091038 🚀 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [PR] Minor: move `Column` related tests and rename `column.rs` [datafusion]

2024-07-22 Thread via GitHub
alamb merged PR #11573: URL: https://github.com/apache/datafusion/pull/11573 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [I] Add a sub-project for map udf functions [datafusion]

2024-07-22 Thread via GitHub
zhuliquan closed issue #11572: Add a sub-project for map udf functions URL: https://github.com/apache/datafusion/issues/11572 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [I] Add a sub-project for map udf functions [datafusion]

2024-07-22 Thread via GitHub
zhuliquan commented on issue #11572: URL: https://github.com/apache/datafusion/issues/11572#issuecomment-2243093070 > > I plan to move the `map` UDF to `functions-array` in #11560 first, and then create another PR to rename `functions-array`. I guess it is just a renaming task. So, I think

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-22 Thread via GitHub
tshauck commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1686644243 ## datafusion/functions/src/math/bounds.rs: ## @@ -0,0 +1,137 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on code in PR #11516: URL: https://github.com/apache/datafusion/pull/11516#discussion_r1686650408 ## datafusion/core/src/datasource/listing_table_factory.rs: ## @@ -49,16 +49,18 @@ impl ListingTableFactory { impl TableProviderFactory for ListingTableFactory {

Re: [PR] Change default Parquet writer settings to match arrow-rs (except for compression & statistics) [datafusion]

2024-07-22 Thread via GitHub
wiedld commented on code in PR #11558: URL: https://github.com/apache/datafusion/pull/11558#discussion_r1686650020 ## datafusion/common/src/file_options/parquet_writer.rs: ## @@ -712,35 +662,13 @@ mod tests { "should see the extern parquet's default over-riding data

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-22 Thread via GitHub
findepi commented on code in PR #11516: URL: https://github.com/apache/datafusion/pull/11516#discussion_r1686655186 ## datafusion/core/src/datasource/listing_table_factory.rs: ## @@ -49,16 +49,18 @@ impl ListingTableFactory { impl TableProviderFactory for ListingTableFactory {

Re: [PR] Parsing SQL strings to Exprs with the qualified schema [datafusion]

2024-07-22 Thread via GitHub
goldmedal commented on code in PR #11562: URL: https://github.com/apache/datafusion/pull/11562#discussion_r1686658173 ## datafusion/sql/src/expr/identifier.rs: ## @@ -47,40 +47,58 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { // compound identifiers, but this is

Re: [PR] GC `StringViewArray` in `CoalesceBatchesStream` [datafusion]

2024-07-22 Thread via GitHub
2010YOUY01 commented on code in PR #11587: URL: https://github.com/apache/datafusion/pull/11587#discussion_r1686678665 ## datafusion/physical-plan/src/coalesce_batches.rs: ## @@ -216,6 +218,8 @@ impl CoalesceBatchesStream { match input_batch { Poll:

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-22 Thread via GitHub
tshauck commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1686692384 ## datafusion/functions/src/math/monotonicity.rs: ## @@ -15,24 +15,17 @@ // specific language governing permissions and limitations // under the License. -use

[PR] feat: use Substrait's PrecisionTimestamp and PrecisionTimestampTz instead of deprecated Timestamp [datafusion]

2024-07-22 Thread via GitHub
Blizzara opened a new pull request, #11597: URL: https://github.com/apache/datafusion/pull/11597 ## Which issue does this PR close? N/A ## Rationale for this change DF was using the Substrait type_variations on a Substrait Timestamp to indicate whether a timestamp is sec

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-22 Thread via GitHub
tshauck commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1686702799 ## datafusion/functions/src/math/bounds.rs: ## @@ -0,0 +1,137 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr
