Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-23 Thread via GitHub
berkaysynnada commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1687532640 ## datafusion/expr/src/interval_arithmetic.rs: ## @@ -332,6 +332,46 @@ impl Interval { Ok(Self::new(unbounded_endpoint.clone(), unbounded_endpoint))

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-23 Thread via GitHub
berkaysynnada commented on code in PR #11584: URL: https://github.com/apache/datafusion/pull/11584#discussion_r1687531392 ## datafusion/expr/src/interval_arithmetic.rs: ## @@ -332,6 +332,46 @@ impl Interval { Ok(Self::new(unbounded_endpoint.clone(), unbounded_endpoint))

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-23 Thread via GitHub
berkaysynnada commented on PR #11584: URL: https://github.com/apache/datafusion/pull/11584#issuecomment-2244418683 Thanks @tshauck. This is a nice step towards having comprehensive interval analysis. I have left a small suggestion. Once that is addressed, the PR will be good to go 🚀

Re: [PR] Add was_valid parameter to NullState callbacks [datafusion]

2024-07-23 Thread via GitHub
Dandandan commented on code in PR #11592: URL: https://github.com/apache/datafusion/pull/11592#discussion_r1687564010 ## datafusion/physical-expr-common/src/aggregate/groups_accumulator/prim_op.rs: ## @@ -102,9 +102,13 @@ where values, opt_filter,

Re: [PR] Add was_valid parameter to NullState callbacks [datafusion]

2024-07-23 Thread via GitHub
joroKr21 commented on code in PR #11592: URL: https://github.com/apache/datafusion/pull/11592#discussion_r1687603952 ## datafusion/physical-expr-common/src/aggregate/groups_accumulator/prim_op.rs: ## @@ -102,9 +102,13 @@ where values, opt_filter,

Re: [PR] Add was_valid parameter to NullState callbacks [datafusion]

2024-07-23 Thread via GitHub
joroKr21 commented on code in PR #11592: URL: https://github.com/apache/datafusion/pull/11592#discussion_r1687604948 ## datafusion/physical-expr-common/src/aggregate/groups_accumulator/prim_op.rs: ## @@ -102,9 +102,13 @@ where values, opt_filter,

Re: [PR] Extract CoalesceBatchesStream to a struct [datafusion]

2024-07-23 Thread via GitHub
ozankabak commented on PR #11610: URL: https://github.com/apache/datafusion/pull/11610#issuecomment-2244524858 We plan to submit fetching support to `CoalesceBatchesExec` in a couple days. It'd be good to do the reorg after that to minimize total effort

Re: [PR] feat: support Map literals in Substrait consumer and producer [datafusion]

2024-07-23 Thread via GitHub
Blizzara commented on code in PR #11547: URL: https://github.com/apache/datafusion/pull/11547#discussion_r1687639877 ## datafusion/common/src/hash_utils.rs: ## @@ -692,6 +730,48 @@ mod tests { assert_eq!(hashes[0], hashes[1]); } +#[test] +// Tests actual

[PR] chore(deps): update substrait requirement from 0.36.0 to 0.38.0 [datafusion]

2024-07-23 Thread via GitHub
dependabot[bot] opened a new pull request, #11613: URL: https://github.com/apache/datafusion/pull/11613 Updates the requirements on [substrait](https://github.com/substrait-io/substrait-rs) to permit the latest version. Release notes Sourced from https://github.com/substrait-io/su

Re: [PR] Parsing SQL strings to Exprs with the qualified schema [datafusion]

2024-07-23 Thread via GitHub
jonahgao commented on code in PR #11562: URL: https://github.com/apache/datafusion/pull/11562#discussion_r1687702567 ## datafusion/sql/src/expr/identifier.rs: ## @@ -47,40 +47,58 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { // compound identifiers, but this is

Re: [PR] Consistent approach to setting parameters on aggregate functions and window functions [datafusion]

2024-07-23 Thread via GitHub
jayzhan211 commented on code in PR #11550: URL: https://github.com/apache/datafusion/pull/11550#discussion_r1687794069 ## datafusion/expr/src/expr_fn.rs: ## @@ -676,6 +679,266 @@ pub fn interval_month_day_nano_lit(value: &str) -> Expr { Expr::Literal(ScalarValue::IntervalMo

[PR] Enforce uniqueness of `named_struct` field names [datafusion]

2024-07-23 Thread via GitHub
dharanad opened a new pull request, #11614: URL: https://github.com/apache/datafusion/pull/11614 ## Which issue does this PR close? Closes #11438 ## Rationale for this change ## What changes are included in this PR? ## Are these changes test

Re: [PR] GC `StringViewArray` in `CoalesceBatchesStream` [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11587: URL: https://github.com/apache/datafusion/pull/11587#issuecomment-2244814903 > This logic (batch in, batch out) should be a separate helper function (maybe living somewhere else, as it could be useful in other contexts too). This way, the main logic of the Coal

Re: [PR] support Decimal256 type in datafusion-proto [datafusion]

2024-07-23 Thread via GitHub
alamb merged PR #11606: URL: https://github.com/apache/datafusion/pull/11606 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [I] Decimal256 type is not supported in datafusion-proto [datafusion]

2024-07-23 Thread via GitHub
alamb closed issue #11607: Decimal256 type is not supported in datafusion-proto URL: https://github.com/apache/datafusion/issues/11607

Re: [PR] support Decimal256 type in datafusion-proto [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11606: URL: https://github.com/apache/datafusion/pull/11606#issuecomment-2244816353 Thanks @leoyvens and @andygrove

Re: [PR] Extract `CoalesceBatchesStream` to a struct [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11610: URL: https://github.com/apache/datafusion/pull/11610#issuecomment-2244825431 > We plan to submit fetching support to CoalesceBatchesExec in a couple days. It'd be good to do the reorg after that to minimize total effort @ozankabak Sounds good to me -- I

Re: [PR] Extract `CoalesceBatchesStream` to a struct [datafusion]

2024-07-23 Thread via GitHub
alamb commented on code in PR #11610: URL: https://github.com/apache/datafusion/pull/11610#discussion_r1687851341 ## datafusion/physical-plan/src/coalesce_batches.rs: ## @@ -290,26 +277,106 @@ pub fn concat_batches( arrow::compute::concat_batches(schema, batches) } +///

[PR] Minor: Use upstream `concat_batches` from arrow-rs [datafusion]

2024-07-23 Thread via GitHub
alamb opened a new pull request, #11615: URL: https://github.com/apache/datafusion/pull/11615 ## Which issue does this PR close? N/A ## Rationale for this change While working on https://github.com/apache/datafusion/pull/11610 I found that there is a wrapper over `concat

Re: [PR] Minor: Use upstream `concat_batches` from arrow-rs [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11615: URL: https://github.com/apache/datafusion/pull/11615#issuecomment-2244905451 @ozankabak mentioned https://github.com/apache/datafusion/pull/11610#issuecomment-2244524858 there was work in progress for `CoalesceBatches` so we might want to make this PR a draft

Re: [PR] Minor: Use upstream `concat_batches` from arrow-rs [datafusion]

2024-07-23 Thread via GitHub
alamb commented on code in PR #11615: URL: https://github.com/apache/datafusion/pull/11615#discussion_r1687859576 ## datafusion/physical-plan/src/coalesce_batches.rs: ## @@ -276,20 +266,6 @@ impl RecordBatchStream for CoalesceBatchesStream { } } -/// Concatenates an arra

Re: [PR] Extract `CoalesceBatchesStream` to a struct [datafusion]

2024-07-23 Thread via GitHub
ozankabak commented on PR #11610: URL: https://github.com/apache/datafusion/pull/11610#issuecomment-2244971934 For reference, here is the [PR](https://github.com/synnada-ai/datafusion-upstream/pull/27) we are preparing (and will submit to the upstream repo soon):

Re: [PR] Enforce uniqueness of `named_struct` field names [datafusion]

2024-07-23 Thread via GitHub
dharanad commented on PR #11614: URL: https://github.com/apache/datafusion/pull/11614#issuecomment-2244980703 Based on my understanding, since both struct field names and dictionary keys are of type `Ident`, it seems challenging to distinguish between a `NULL` value and a `'NULL'` string. F

Re: [PR] Enforce uniqueness of `named_struct` field names [datafusion]

2024-07-23 Thread via GitHub
dharanad commented on code in PR #11614: URL: https://github.com/apache/datafusion/pull/11614#discussion_r1687937536 ## datafusion/functions/src/core/named_struct.rs: ## @@ -57,6 +57,15 @@ fn named_struct_expr(args: &[ColumnarValue]) -> Result { .into_iter() .
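
A minimal sketch of the uniqueness check that this PR's title describes, assuming the field names are already available as string slices. The helper name `check_unique_field_names` and its error text are hypothetical and are not taken from the PR diff, which is truncated above.

```rust
use std::collections::HashSet;

/// Returns an error naming the first repeated field name, if any.
/// Hypothetical helper for illustration; not DataFusion's actual API.
fn check_unique_field_names(names: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::with_capacity(names.len());
    for name in names {
        // `HashSet::insert` returns false when the value was already present.
        if !seen.insert(*name) {
            return Err(format!("duplicate field name in named_struct: {name}"));
        }
    }
    Ok(())
}
```

Reporting the first offending name keeps the error actionable; the real implementation may validate at a different stage or use a different error type.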

Re: [PR] Minor: Use upstream `concat_batches` from arrow-rs [datafusion]

2024-07-23 Thread via GitHub
Dandandan commented on code in PR #11615: URL: https://github.com/apache/datafusion/pull/11615#discussion_r1687964534 ## datafusion/physical-plan/src/joins/nested_loop_join.rs: ## @@ -364,7 +364,7 @@ async fn collect_left_input( let stream = merge.execute(0, context)?;

Re: [PR] Consistent approach to setting parameters on aggregate functions and window functions [datafusion]

2024-07-23 Thread via GitHub
timsaucer commented on code in PR #11550: URL: https://github.com/apache/datafusion/pull/11550#discussion_r1687978397 ## datafusion/expr/src/expr_fn.rs: ## @@ -676,6 +679,266 @@ pub fn interval_month_day_nano_lit(value: &str) -> Expr { Expr::Literal(ScalarValue::IntervalMon

[PR] fix: rat check error caused by logos [datafusion-comet]

2024-07-23 Thread via GitHub
PengleiShi opened a new pull request, #707: URL: https://github.com/apache/datafusion-comet/pull/707 ## Which issue does this PR close? fix rat check error Closes #. ## Rationale for this change ## What changes are included in this PR? ## How

[PR] Chore/fifo tests cleanup [datafusion]

2024-07-23 Thread via GitHub
ozankabak opened a new pull request, #11616: URL: https://github.com/apache/datafusion/pull/11616 ## Which issue does this PR close? Closes #. ## Rationale for this change FIFO tests did not have enough comments and were hard to read/understand. Refactored and sim

Re: [I] Update Optimizer documentation [datafusion]

2024-07-23 Thread via GitHub
edmondop commented on issue #11581: URL: https://github.com/apache/datafusion/issues/11581#issuecomment-2245153268 take

Re: [PR] test: get file size by func metadata [datafusion]

2024-07-23 Thread via GitHub
zhuliquan commented on PR #11575: URL: https://github.com/apache/datafusion/pull/11575#issuecomment-2245159846 > Thanks @zhuliquan -- this makes sense to me. > > I don't understand how this issue isn't caught on the CI runner though I also got confused, `#[cfg(target_os)]` not e

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-23 Thread via GitHub
ozankabak commented on PR #11584: URL: https://github.com/apache/datafusion/pull/11584#issuecomment-2245191841 Apart from @berkaysynnada's outstanding suggestions the only thing I see is that we would need to use two different π values, depending on whether it is in the lower bound or the u
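
One way to read the point about two π values, sketched under the assumption that the goal is outward rounding: the lower endpoint must not exceed the real number π and the upper endpoint must not fall below it. The names `PI_LOWER` and `pi_upper` are illustrative and do not come from the PR.

```rust
use std::f64::consts::PI;

/// Lower bound on the real number pi. `std::f64::consts::PI` is the nearest
/// f64 to pi and happens to lie slightly below it, so it is usable as-is.
const PI_LOWER: f64 = PI;

/// Upper bound on the real number pi: the next representable f64 above `PI`.
fn pi_upper() -> f64 {
    // For a positive finite float, adding 1 to the bit pattern moves up by
    // exactly one unit in the last place.
    f64::from_bits(PI.to_bits() + 1)
}
```

With these two values, an interval such as [PI_LOWER, pi_upper()] is guaranteed to contain π, which a single rounded constant used for both endpoints cannot guarantee.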

Re: [PR] Consistent approach to setting parameters on aggregate functions and window functions [datafusion]

2024-07-23 Thread via GitHub
timsaucer commented on PR #11550: URL: https://github.com/apache/datafusion/pull/11550#issuecomment-2245192874 Question: it looks like we have some functions that have both aggregate functions and window functions defined. Specifically looking at `first_value` and `last_value`. My thought i

Re: [PR] Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [datafusion]

2024-07-23 Thread via GitHub
phillipleblanc commented on code in PR #11516: URL: https://github.com/apache/datafusion/pull/11516#discussion_r1688021260 ## datafusion/catalog/src/session.rs: ## @@ -0,0 +1,102 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license ag

Re: [PR] Consistent approach to setting parameters on aggregate functions and window functions [datafusion]

2024-07-23 Thread via GitHub
timsaucer commented on code in PR #11550: URL: https://github.com/apache/datafusion/pull/11550#discussion_r1688043042 ## datafusion/expr/src/expr_fn.rs: ## @@ -676,6 +679,266 @@ pub fn interval_month_day_nano_lit(value: &str) -> Expr { Expr::Literal(ScalarValue::IntervalMon

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-07-23 Thread via GitHub
notfilippo commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2245251759 > Now, i have add_one() function that can take 64-bit integer values and add +1 to them. The +1 operation is perfectly valid operation for i64 -- it's valid for sql long/bigi

[PR] ExprBuilder for Physical Aggregate Expr [datafusion]

2024-07-23 Thread via GitHub
jayzhan211 opened a new pull request, #11617: URL: https://github.com/apache/datafusion/pull/11617 ## Which issue does this PR close? Part of #11543. Since `create_aggregate_expr` is not stable yet, probably will change after refactor #11359 . I only create build

[I] Improve consistency and documentation on error handling in in UDFs [datafusion]

2024-07-23 Thread via GitHub
edmondop opened a new issue, #11618: URL: https://github.com/apache/datafusion/issues/11618 ### Is your feature request related to a problem or challenge? When writing a new UDF, a developer needs to decide how to perform error management in functions that return `Result`, such as `re

Re: [PR] Fix Internal Error for an INNER JOIN query [datafusion]

2024-07-23 Thread via GitHub
xinlifoobar commented on code in PR #11578: URL: https://github.com/apache/datafusion/pull/11578#discussion_r1688069755 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -518,6 +510,14 @@ impl LogicalPlan { Ok(using_columns) } +fn as_col(expr: Expr) -> Resul

Re: [PR] Add parser option enable_options_value_normalization [datafusion]

2024-07-23 Thread via GitHub
xinlifoobar commented on PR #11330: URL: https://github.com/apache/datafusion/pull/11330#issuecomment-2245264178 > > @berkaysynnada 's solution allows option-specific normalization but I don't think that kind of fine-grained control over option values is needed. > > We actually do nee

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-07-23 Thread via GitHub
notfilippo commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2245278317 > if we opt for logicalType, how do users then specify the physical types? @doki23 -- Through the use of a custom implementation of the `TypeRelation` trait, which then

Re: [I] Update ClickBench benchmarks with DataFusion 40 [datafusion]

2024-07-23 Thread via GitHub
xinlifoobar commented on issue #11567: URL: https://github.com/apache/datafusion/issues/11567#issuecomment-2245279213 take

Re: [PR] feat: support Map literals in Substrait consumer and producer [datafusion]

2024-07-23 Thread via GitHub
goldmedal commented on code in PR #11547: URL: https://github.com/apache/datafusion/pull/11547#discussion_r1688090369 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -1785,6 +1798,98 @@ fn from_substrait_literal( } } } +

Re: [I] Update ClickBench benchmarks with DataFusion 40 [datafusion]

2024-07-23 Thread via GitHub
xinlifoobar commented on issue #11567: URL: https://github.com/apache/datafusion/issues/11567#issuecomment-2245316926 Sorry @alamb, I just found out that I could not create AWS account at this time. Is it fine to use Azure VM, e.g., F16sv2, instead? If not please unassign me... ![im

Re: [I] abs returns incorrect value in some cases [datafusion-comet]

2024-07-23 Thread via GitHub
vaibhawvipul commented on issue #666: URL: https://github.com/apache/datafusion-comet/issues/666#issuecomment-2245317880 @andygrove I have a doubt here. According to #642 there were relevant tests added. However I see the following tests pass. `test("abs Overflow legacy mode")` and `

Re: [PR] Add parser option enable_options_value_normalization [datafusion]

2024-07-23 Thread via GitHub
ozankabak commented on PR #11330: URL: https://github.com/apache/datafusion/pull/11330#issuecomment-2245314151 Sure -- let's just make sure to avoid API churn and complete the work 🙂

Re: [PR] Consistent approach to setting parameters on aggregate functions and window functions [datafusion]

2024-07-23 Thread via GitHub
jayzhan211 commented on code in PR #11550: URL: https://github.com/apache/datafusion/pull/11550#discussion_r1688132417 ## datafusion/expr/src/expr_fn.rs: ## @@ -676,6 +679,266 @@ pub fn interval_month_day_nano_lit(value: &str) -> Expr { Expr::Literal(ScalarValue::IntervalMo

Re: [PR] Chore/fifo tests cleanup [datafusion]

2024-07-23 Thread via GitHub
ozankabak merged PR #11616: URL: https://github.com/apache/datafusion/pull/11616

Re: [PR] Chore/fifo tests cleanup [datafusion]

2024-07-23 Thread via GitHub
ozankabak commented on PR #11616: URL: https://github.com/apache/datafusion/pull/11616#issuecomment-2245368658 Since this is a very narrow change that only improves test-readability without any material change, I'm going to merge it quickly given @mustafasrepo reviewed it. If anyone sees an

Re: [PR] Rename `functions-array` to `functions-nested` [datafusion]

2024-07-23 Thread via GitHub
goldmedal commented on code in PR #11602: URL: https://github.com/apache/datafusion/pull/11602#discussion_r1688149520 ## datafusion/core/Cargo.toml: ## @@ -40,8 +40,10 @@ name = "datafusion" path = "src/lib.rs" [features] -# Used to enable the avro format nested_expressions

Re: [I] Improve consistency and documentation on error handling in in UDFs [datafusion]

2024-07-23 Thread via GitHub
2010YOUY01 commented on issue #11618: URL: https://github.com/apache/datafusion/issues/11618#issuecomment-2245388504 We should definitely explain them better in the doc. Here is my understanding. I'm wondering if anyone has additional thoughts or if I'm understanding something wrong.

[PR] refactor: simplify `DFSchema::field_with_unqualified_name` [datafusion]

2024-07-23 Thread via GitHub
jonahgao opened a new pull request, #11619: URL: https://github.com/apache/datafusion/pull/11619 ## Which issue does this PR close? N/A ## Rationale for this change The only difference between `qualified_field_with_unqualified_name` and `field_with_unqualified_name` is t

Re: [PR] Implement `DynamicFileSchemaProvider` in the core [datafusion]

2024-07-23 Thread via GitHub
goldmedal commented on PR #11035: URL: https://github.com/apache/datafusion/pull/11035#issuecomment-2245430181 > BTW I think #11516 is very related to this PR -- maybe once we get that one in then the API changes needed for this feature will become more natural / easier to fit Agreed

[PR] Doc: A tiny typo in scalar function's doc [datafusion]

2024-07-23 Thread via GitHub
2010YOUY01 opened a new pull request, #11620: URL: https://github.com/apache/datafusion/pull/11620 ## Which issue does this PR close? Closes #. ## Rationale for this change Fix a tiny typo ## What changes are included in this PR? ## Are t

Re: [I] always failed test on datasource::file_format::csv::tests::test_csv_parallel_one_col::case_6 on windows machine [datafusion]

2024-07-23 Thread via GitHub
zhuliquan closed issue #11574: always failed test on datasource::file_format::csv::tests::test_csv_parallel_one_col::case_6 on windows machine URL: https://github.com/apache/datafusion/issues/11574

[I] Incorrect predicate evaluation result in a query (SQLancer-NoREC) [datafusion]

2024-07-23 Thread via GitHub
2010YOUY01 opened a new issue, #11621: URL: https://github.com/apache/datafusion/issues/11621 ### Describe the bug In the following reproducer, two rows in each table should first be joined, then get evaluated to `true` by the predicate, and finally output to the final result. But

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-07-23 Thread via GitHub
notfilippo commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2245458843 > One small nit: I don't think I would lump together FixedSizeBinary with Binary and FixedSizeList with List. The fixed lengths often have semantics that should be considered

[I] Internal error when regex operator `~` is used with `List`s (SQLancer) [datafusion]

2024-07-23 Thread via GitHub
2010YOUY01 opened a new issue, #11622: URL: https://github.com/apache/datafusion/issues/11622 ### Describe the bug The following query has an invalid operation, so it should return a planning error. However, it's returning an internal error to indicate a potential bug ``` > s

Re: [PR] Consistent approach to setting parameters on aggregate functions and window functions [datafusion]

2024-07-23 Thread via GitHub
timsaucer commented on PR #11550: URL: https://github.com/apache/datafusion/pull/11550#issuecomment-2245490998 From ^ comment I think things like the helper function `first_value()` and such are adding technical debt. I'm going to dig in tomorrow. If we're updating the window functions now

Re: [PR] Add blog post announcing Comet 0.1.0 [datafusion-site]

2024-07-23 Thread via GitHub
andygrove commented on PR #7: URL: https://github.com/apache/datafusion-site/pull/7#issuecomment-2245536377 Thanks for the review @alamb. I plan on merging this later today once the release is available.

Re: [PR] Add blog post announcing Comet 0.1.0 [datafusion-site]

2024-07-23 Thread via GitHub
viirya commented on PR #7: URL: https://github.com/apache/datafusion-site/pull/7#issuecomment-2245540968 Still in draft?

Re: [PR] Blog post for release 40.0.0 [datafusion-site]

2024-07-23 Thread via GitHub
alamb commented on code in PR #6: URL: https://github.com/apache/datafusion-site/pull/6#discussion_r1688258419 ## _posts/2024-07-09-datafusion-40.0.0.md: ## @@ -0,0 +1,450 @@ +--- +layout: post +title: "Apache Arrow DataFusion 40.0.0 Released" +date: "2024-07-09 00:00:00" +autho

Re: [PR] Blog post for release 40.0.0 [datafusion-site]

2024-07-23 Thread via GitHub
alamb commented on code in PR #6: URL: https://github.com/apache/datafusion-site/pull/6#discussion_r1688260514 ## _posts/2024-07-23-datafusion-40.0.0.md: ## @@ -0,0 +1,492 @@ +--- +layout: post +title: "Apache DataFusion 40.0.0 Released" +date: "2024-07-21 00:00:00" +author: pmc

[I] Incorrect `NULL` handling for regex match `~` (SQLancer) [datafusion]

2024-07-23 Thread via GitHub
2010YOUY01 opened a new issue, #11623: URL: https://github.com/apache/datafusion/issues/11623 ### Describe the bug Query `select 'foo' ~ null;` should be valid, since `null` can be treated as a missing string regex, so the final result should be `NULL` Current datafusion-cli: ``
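
The expected behavior described in this issue is ordinary SQL NULL propagation: if either operand is NULL, the match yields NULL rather than an error. A minimal sketch, modeling NULL as `None`; the function name is hypothetical, and `str::contains` is only a stand-in for a real regex engine so the example needs no external crate.

```rust
/// NULL-propagating match: returns `None` (SQL NULL) when either side is NULL.
/// Hypothetical illustration; not DataFusion's implementation.
fn regex_match_nullable(value: Option<&str>, pattern: Option<&str>) -> Option<bool> {
    match (value, pattern) {
        // Stand-in for a real regex match: plain substring search.
        (Some(v), Some(p)) => Some(v.contains(p)),
        // Any NULL operand makes the whole result NULL.
        _ => None,
    }
}
```

Under this model, `'foo' ~ null` corresponds to `regex_match_nullable(Some("foo"), None)`, which evaluates to `None`.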

Re: [PR] Add blog post announcing Comet 0.1.0 [datafusion-site]

2024-07-23 Thread via GitHub
viirya commented on code in PR #7: URL: https://github.com/apache/datafusion-site/pull/7#discussion_r1688267997 ## _posts/2024-07-20-datafusion-comet-0.1.0.md: ## @@ -0,0 +1,154 @@ +--- +layout: post +title: "Apache DataFusion Comet 0.1.0 Release" +date: "2024-07-20 00:00:00" +a

Re: [PR] Add blog post announcing Comet 0.1.0 [datafusion-site]

2024-07-23 Thread via GitHub
viirya commented on code in PR #7: URL: https://github.com/apache/datafusion-site/pull/7#discussion_r1688269937 ## _posts/2024-07-20-datafusion-comet-0.1.0.md: ## @@ -0,0 +1,154 @@ +--- +layout: post +title: "Apache DataFusion Comet 0.1.0 Release" +date: "2024-07-20 00:00:00" +a

Re: [PR] feat: support Map literals in Substrait consumer and producer [datafusion]

2024-07-23 Thread via GitHub
Blizzara commented on code in PR #11547: URL: https://github.com/apache/datafusion/pull/11547#discussion_r1688306249 ## datafusion/common/src/hash_utils.rs: ## @@ -692,6 +732,64 @@ mod tests { assert_eq!(hashes[0], hashes[1]); } +#[test] +// Tests actual

Re: [PR] Blog post for release 40.0.0 [datafusion-site]

2024-07-23 Thread via GitHub
alamb commented on PR #6: URL: https://github.com/apache/datafusion-site/pull/6#issuecomment-2245617456 Thanks everyone for the comments. I plan to publish this tomorrow Is there any chance one of the committers could approve this PR so I can do so? @andygrove maybe? ![Screens

Re: [I] Typo in doc of datafusion::physical_plan::Partitioning [datafusion]

2024-07-23 Thread via GitHub
alamb closed issue #11593: Typo in doc of datafusion::physical_plan::Partitioning URL: https://github.com/apache/datafusion/issues/11593

Re: [PR] Fix typo in doc of Partitioning [datafusion]

2024-07-23 Thread via GitHub
alamb merged PR #11612: URL: https://github.com/apache/datafusion/pull/11612

Re: [PR] feat: support Map literals in Substrait consumer and producer [datafusion]

2024-07-23 Thread via GitHub
Blizzara commented on code in PR #11547: URL: https://github.com/apache/datafusion/pull/11547#discussion_r1688326679 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -1785,6 +1798,98 @@ fn from_substrait_literal( } } } +S

Re: [PR] feat: support Map literals in Substrait consumer and producer [datafusion]

2024-07-23 Thread via GitHub
Blizzara commented on code in PR #11547: URL: https://github.com/apache/datafusion/pull/11547#discussion_r1685012930 ## datafusion/sqllogictest/test_files/map.slt: ## @@ -302,3 +302,11 @@ SELECT MAP(arrow_cast(make_array('POST', 'HEAD', 'PATCH'), 'LargeList(Utf8)'), a {POST: 4

Re: [PR] fix: rat check error caused by logos [datafusion-comet]

2024-07-23 Thread via GitHub
parthchandra commented on PR #707: URL: https://github.com/apache/datafusion-comet/pull/707#issuecomment-2245703570 > there is a rat exclude list and also we have a list in the maven pom.xml I'm in favor of adding this to the `dev/release/rat_exclude_files.txt` file rather than the ma

Re: [PR] Implement physical plan serialization for csv COPY plans , add `as_any`, `Debug` to `FileFormatFactory` [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11588: URL: https://github.com/apache/datafusion/pull/11588#issuecomment-2245716531 > The fuzz test failed seems to be unrelated to this PR I restarted the CI check. I think the failure is https://github.com/apache/datafusion/issues/11555 -- the test was disable

Re: [PR] feat: support Map literals in Substrait consumer and producer [datafusion]

2024-07-23 Thread via GitHub
alamb merged PR #11547: URL: https://github.com/apache/datafusion/pull/11547

Re: [PR] feat: support Map literals in Substrait consumer and producer [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11547: URL: https://github.com/apache/datafusion/pull/11547#issuecomment-2245720581 Thanks @goldmedal and @Blizzara -- 👌 very nice

[PR] Add some zero column tests covering LIMIT, GROUP BY, WHERE, JOIN, and WINDOW [datafusion]

2024-07-23 Thread via GitHub
Kev1n8 opened a new pull request, #11624: URL: https://github.com/apache/datafusion/pull/11624 ## Which issue does this PR close? Related: #5713. ## Rationale for this change To make sure there are not any other execs having issues when it comes to zero-c

Re: [PR] fix: dictionary decimal vector optimization [datafusion-comet]

2024-07-23 Thread via GitHub
parthchandra commented on code in PR #705: URL: https://github.com/apache/datafusion-comet/pull/705#discussion_r1688389477 ## common/src/main/java/org/apache/comet/vector/CometDictionary.java: ## @@ -100,17 +102,21 @@ public void close() { } private void initialize() { -

Re: [I] Support `date_bin` on timestamps with timezone, properly accounting for Daylight Savings Time [datafusion]

2024-07-23 Thread via GitHub
Omega359 commented on issue #10602: URL: https://github.com/apache/datafusion/issues/10602#issuecomment-2245748787 FYI - I came across the [Jiff crate](https://github.com/BurntSushi/jiff) the other day that looks to have some nice support for tz aware calendar handling. From the benchmarks

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-23 Thread via GitHub
tshauck commented on PR #11584: URL: https://github.com/apache/datafusion/pull/11584#issuecomment-2245810011 Thanks to both of you for the thoughtful feedback. @berkaysynnada, I updated infinity to use unbounded in [231a717](https://github.com/apache/datafusion/pull/11584/commits/231a

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-07-23 Thread via GitHub
wjones127 commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2245817331 > While I agree that having fixed length constraint for a list of logical types makes sense I am not convinced about FixedSizeBinaries. What would be the use case? Do you have

Re: [PR] feat: add bounds for unary math scalar functions [datafusion]

2024-07-23 Thread via GitHub
ozankabak commented on PR #11584: URL: https://github.com/apache/datafusion/pull/11584#issuecomment-2245829436 IIRC nextafter is the IEEE standard name for this function, and Rust's next_up does something similar (in the upward direction). If it is in nightly, we probably should just create
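
If a standard-library `next_up` is not available on the toolchain in use, a small helper can be written by hand. The following is a sketch of one such helper, assuming IEEE 754 binary64 semantics; the free function `next_up` here is illustrative and is not the code from the PR.

```rust
/// Returns the smallest f64 strictly greater than `x`.
/// NaN and positive infinity are returned unchanged.
fn next_up(x: f64) -> f64 {
    if x.is_nan() || x == f64::INFINITY {
        return x;
    }
    if x == 0.0 {
        // Covers both +0.0 and -0.0: the next value up is the smallest
        // positive subnormal.
        return f64::from_bits(1);
    }
    let bits = x.to_bits();
    if x > 0.0 {
        // Positive floats are ordered like their bit patterns.
        f64::from_bits(bits + 1)
    } else {
        // For negative floats, a smaller magnitude is a larger value.
        f64::from_bits(bits - 1)
    }
}
```

A matching `next_down` can be obtained as `-next_up(-x)`, which gives a pair of helpers for rounding interval endpoints outward.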

Re: [PR] Fix Internal Error for an INNER JOIN query [datafusion]

2024-07-23 Thread via GitHub
alamb merged PR #11578: URL: https://github.com/apache/datafusion/pull/11578

Re: [PR] Fix Internal Error for an INNER JOIN query [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11578: URL: https://github.com/apache/datafusion/pull/11578#issuecomment-2245843727 Thanks again @xinlifoobar

Re: [PR] Minor: Use upstream `concat_batches` from arrow-rs [datafusion]

2024-07-23 Thread via GitHub
alamb commented on code in PR #11615: URL: https://github.com/apache/datafusion/pull/11615#discussion_r1688456284 ## datafusion/physical-plan/src/joins/nested_loop_join.rs: ## @@ -364,7 +364,7 @@ async fn collect_left_input( let stream = merge.execute(0, context)?; /

Re: [I] Internal Error for an INNER JOIN query (SQLancer) [datafusion]

2024-07-23 Thread via GitHub
alamb closed issue #11412: Internal Error for an INNER JOIN query (SQLancer) URL: https://github.com/apache/datafusion/issues/11412

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2024-07-23 Thread via GitHub
notfilippo commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2245845809 > Example uses of fixed size binary are representing arrays of data types not supported in Arrow, such as f16 or i128 (UUID). A value with a different number of bytes would b

Re: [PR] test: get file size by func metadata [datafusion]

2024-07-23 Thread via GitHub
alamb merged PR #11575: URL: https://github.com/apache/datafusion/pull/11575

Re: [PR] test: get file size by func metadata [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11575: URL: https://github.com/apache/datafusion/pull/11575#issuecomment-2245846814 Anyhow, thanks again @zhuliquan

Re: [PR] test: get file size by func metadata [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11575: URL: https://github.com/apache/datafusion/pull/11575#issuecomment-2245846616 > I'am not sure that ci run unittests on real windows machine according to workflow yaml. Or maybe it runs in [WSL](https://learn.microsoft.com/en-us/windows/wsl/faq) or somethi

Re: [I] always failed test on datasource::file_format::csv::tests::test_csv_parallel_one_col::case_6 on windows machine [datafusion]

2024-07-23 Thread via GitHub
alamb closed issue #11574: always failed test on datasource::file_format::csv::tests::test_csv_parallel_one_col::case_6 on windows machine URL: https://github.com/apache/datafusion/issues/11574

Re: [PR] Doc: A tiny typo in scalar function's doc [datafusion]

2024-07-23 Thread via GitHub
alamb merged PR #11620: URL: https://github.com/apache/datafusion/pull/11620

Re: [PR] Add parser option enable_options_value_normalization [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11330: URL: https://github.com/apache/datafusion/pull/11330#issuecomment-2245863417 > Sure -- let's just make sure to avoid API churn and complete the work 🙂 How about we file a ticket explaining what else is needed prior to merging this PR. I can do this but I

Re: [PR] chore(deps): update substrait requirement from 0.36.0 to 0.38.0 [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11613: URL: https://github.com/apache/datafusion/pull/11613#issuecomment-2245866405 Looks like this needs the prost update that is coming in arrow 52.2.0

Re: [PR] Improve unparser MySQL compatibility [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11589: URL: https://github.com/apache/datafusion/pull/11589#issuecomment-2245876211 Thanks again @sgrebnov

Re: [PR] Improve unparser MySQL compatibility [datafusion]

2024-07-23 Thread via GitHub
alamb merged PR #11589: URL: https://github.com/apache/datafusion/pull/11589

Re: [PR] Enforce uniqueness of `named_struct` field names [datafusion]

2024-07-23 Thread via GitHub
alamb commented on code in PR #11614: URL: https://github.com/apache/datafusion/pull/11614#discussion_r1688478700 ## datafusion/functions/src/core/named_struct.rs: ## @@ -57,6 +57,15 @@ fn named_struct_expr(args: &[ColumnarValue]) -> Result { .into_iter() .unz

Re: [PR] Enforce uniqueness of `named_struct` field names [datafusion]

2024-07-23 Thread via GitHub
alamb commented on code in PR #11614: URL: https://github.com/apache/datafusion/pull/11614#discussion_r1688480974 ## datafusion/functions/src/core/named_struct.rs: ## @@ -57,6 +57,15 @@ fn named_struct_expr(args: &[ColumnarValue]) -> Result { .into_iter() .unz

Re: [PR] Change default Parquet writer settings to match arrow-rs (except for compression & statistics) [datafusion]

2024-07-23 Thread via GitHub
alamb commented on PR #11558: URL: https://github.com/apache/datafusion/pull/11558#issuecomment-2245890920 Thanks again @wiedld and @devinjdangelo

Re: [PR] Implement physical plan serialization for csv COPY plans , add `as_any`, `Debug` to `FileFormatFactory` [datafusion]

2024-07-23 Thread via GitHub
alamb merged PR #11588: URL: https://github.com/apache/datafusion/pull/11588
