[PR] Minor: Doc and organize fields in `struct ExternalSorter` [datafusion]

2024-11-15 Thread via GitHub
2010YOUY01 opened a new pull request, #13447: URL: https://github.com/apache/datafusion/pull/13447 ## Which issue does this PR close? Closes #. ## Rationale for this change `ExternalSorter` is a big and important struct for sorting execution, this PR only organiz

[PR] Fix ClickHouse document link from `Russian` to `English` [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
git-hulk opened a new pull request, #1527: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1527 This closes #1523 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [PR] Add `#[recursive]` [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
iffyio commented on PR #1522: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1522#issuecomment-2480450350 @blaginin thanks for looking to fix this! Currently the preference is to avoid a third-party dependency for this issue, ideally fixing up the parser behavior instead

Re: [PR] Deduplicate and standardize deserialization logic for streams [datafusion]

2024-11-15 Thread via GitHub
ozankabak merged PR #13412: URL: https://github.com/apache/datafusion/pull/13412

Re: [I] Cannot parse `select 1 in ()` [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
git-hulk commented on issue #1525: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1525#issuecomment-2480449873 @alamb @iffyio The empty `()` should be disallowed in many dialects, but I'm not sure if it's good to add an option to allow this in the SQL parser. This seems reas

Re: [PR] recursive select calls are parsed with bad trailing_commas parameter [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
iffyio commented on code in PR #1521: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1521#discussion_r1844917951 ## src/parser/mod.rs: ## @@ -3538,15 +3538,35 @@ impl<'a> Parser<'a> { } } +/// Parse the comma of a comma-separated syntax element.

Re: [PR] Deduplicate and standardize deserialization logic for streams [datafusion]

2024-11-15 Thread via GitHub
ozankabak commented on code in PR #13412: URL: https://github.com/apache/datafusion/pull/13412#discussion_r1844911513 ## datafusion/core/src/datasource/file_format/mod.rs: ## @@ -168,6 +172,164 @@ pub enum FilePushdownSupport { Supported, } +/// Possible outputs of a [`B

[I] Support multiple (>2) results comparison in benchmark scripts [datafusion]

2024-11-15 Thread via GitHub
2010YOUY01 opened a new issue, #13446: URL: https://github.com/apache/datafusion/issues/13446 ### Is your feature request related to a problem or challenge? Currently the benchmark scripts only support comparing two benchmark run results, see https://github.com/apache/datafusion/tree/main/benc

Re: [PR] Deduplicate and standardize deserialization logic for streams [datafusion]

2024-11-15 Thread via GitHub
ozankabak commented on PR #13412: URL: https://github.com/apache/datafusion/pull/13412#issuecomment-2480409944 > One thing I noticed is that https://github.com/apache/datafusion/issues/13411 talks about the arrow and avro as well. Do you plan to update them in a follow on PR? Yes, in

Re: [I] support column definition list in table alias for postgres [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
lovasoa commented on issue #1524: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1524#issuecomment-2479950624 From https://www.postgresql.org/docs/17/queries-table-expressions.html#QUERIES-TABLEFUNCTIONS : > In some cases it is useful to define table functions that

Re: [PR] feat: Hook DataFusion Parquet native scan with Comet execution [datafusion-comet]

2024-11-15 Thread via GitHub
parthchandra commented on code in PR #1094: URL: https://github.com/apache/datafusion-comet/pull/1094#discussion_r1844740539 ## spark/src/main/scala/org/apache/spark/sql/comet/CometNativeScanExec.scala: ## @@ -60,415 +50,34 @@ case class CometNativeScanExec( dataFilters: Se

Re: [PR] Deduplicate and standardize deserialization logic for streams [datafusion]

2024-11-15 Thread via GitHub
ozankabak commented on code in PR #13412: URL: https://github.com/apache/datafusion/pull/13412#discussion_r1844912389 ## datafusion/core/src/datasource/file_format/mod.rs: ## @@ -168,6 +172,164 @@ pub enum FilePushdownSupport { Supported, } +/// Possible outputs of a [`B

Re: [I] Error running crypto functions on `Dictionary` arrays such as `md5` [datafusion]

2024-11-15 Thread via GitHub
jayzhan211 commented on issue #13444: URL: https://github.com/apache/datafusion/issues/13444#issuecomment-2480247473 I think adding casting from binary to string in TypeSignature::String and apply it to md5 is able to fix this too

Re: [PR] Minor: Remove MOVED file [datafusion]

2024-11-15 Thread via GitHub
jayzhan211 merged PR #13442: URL: https://github.com/apache/datafusion/pull/13442

Re: [I] add support for utf8view type to Nvl function [datafusion]

2024-11-15 Thread via GitHub
alamb closed issue #13381: add support for utf8view type to Nvl function URL: https://github.com/apache/datafusion/issues/13381

Re: [PR] Refactor signatures for lpad, rpad, left, and right [datafusion]

2024-11-15 Thread via GitHub
jiashenC commented on code in PR #13420: URL: https://github.com/apache/datafusion/pull/13420#discussion_r1844388835 ## datafusion/sqllogictest/test_files/scalar.slt: ## @@ -1864,10 +1864,10 @@ query TT EXPLAIN SELECT letter, letter = LEFT(letter2, 1) FROM simple_string;

[I] Cannot parse `select 1 in ()` [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
adriangb opened a new issue, #1525: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1525 I'm not sure if this should parse, Postgres for example also fails, but if possible it would be nice if this at least parsed if not executed correctly. I tested this in datafusion-cl

Re: [I] Missing ColumnarToRow when using CometSparkToColumnar [datafusion-comet]

2024-11-15 Thread via GitHub
huaxingao commented on issue #1092: URL: https://github.com/apache/datafusion-comet/issues/1092#issuecomment-2479958852 @bmorck Thanks for reporting the bug. When you ran the benchmark, did you use the changes from the upstream Iceberg [PR](https://github.com/apache/iceberg/pull/9841)? -

Re: [PR] [MINOR]: fix min max accumulator nan bug [datafusion]

2024-11-15 Thread via GitHub
akurmustafa commented on code in PR #13432: URL: https://github.com/apache/datafusion/pull/13432#discussion_r1844883502 ## datafusion/functions-aggregate/src/min_max.rs: ## @@ -113,8 +114,12 @@ macro_rules! primitive_max_accumulator { ($DATA_TYPE:ident, $NATIVE:ident, $PRIM

Re: [I] Add SQL examples to window functions: `nth_value`, etc [datafusion]

2024-11-15 Thread via GitHub
spencerscott917 commented on issue #13399: URL: https://github.com/apache/datafusion/issues/13399#issuecomment-2480323177 take

Re: [PR] Enable datafusion.optimizer.filter_null_join_keys by default [datafusion]

2024-11-15 Thread via GitHub
github-actions[bot] closed pull request #12369: Enable datafusion.optimizer.filter_null_join_keys by default URL: https://github.com/apache/datafusion/pull/12369

Re: [PR] feat: Support null safe equals in ExtractEquijoinPredicate [datafusion]

2024-11-15 Thread via GitHub
github-actions[bot] commented on PR #12458: URL: https://github.com/apache/datafusion/pull/12458#issuecomment-2480310745 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] feat: Hook DataFusion Parquet native scan with Comet execution [datafusion-comet]

2024-11-15 Thread via GitHub
viirya commented on code in PR #1094: URL: https://github.com/apache/datafusion-comet/pull/1094#discussion_r1844743112 ## spark/src/main/scala/org/apache/spark/sql/comet/CometNativeScanExec.scala: ## @@ -60,415 +50,34 @@ case class CometNativeScanExec( dataFilters: Seq[Expr

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#issuecomment-2480244408 @milenkovicm @andygrove I am pretty sure this is what we want. We have a way to set config in Python, we are using the new SessionContext to build standalone or remote contexts

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#discussion_r1844733981 ## docs/source/user-guide/python.md: ## @@ -28,9 +28,25 @@ popular file formats files, run it in a distributed environment, and obtain the The following

Re: [PR] feat: Hook DataFusion Parquet native scan with Comet execution [datafusion-comet]

2024-11-15 Thread via GitHub
viirya commented on PR #1094: URL: https://github.com/apache/datafusion-comet/pull/1094#issuecomment-2480008480 This doesn't completely work right now. It still gets a few test failures in CometExecSuite. I'm looking into that.

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#discussion_r1844733777 ## python/src/lib.rs: ## @@ -15,18 +15,189 @@ // specific language governing permissions and limitations // under the License. +use ballista::prelude::*;

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#discussion_r1844733616 ## python/src/lib.rs: ## @@ -15,18 +15,189 @@ // specific language governing permissions and limitations // under the License. +use ballista::prelude::*;

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#discussion_r1844733528 ## docs/source/user-guide/python.md: ## @@ -103,14 +119,15 @@ The `explain` method can be used to show the logical and physical query plans fo The followin

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#discussion_r1844733437 ## python/ballista/context.py: ## @@ -0,0 +1,79 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#discussion_r1844733264 ## python/examples/example.py: ## @@ -0,0 +1,39 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] Refactor signatures for lpad, rpad, left, and right [datafusion]

2024-11-15 Thread via GitHub
jayzhan211 commented on code in PR #13420: URL: https://github.com/apache/datafusion/pull/13420#discussion_r1844728918 ## datafusion/sqllogictest/test_files/scalar.slt: ## @@ -1864,10 +1864,10 @@ query TT EXPLAIN SELECT letter, letter = LEFT(letter2, 1) FROM simple_string; ---

Re: [PR] feat: Support `Utf8View` for `get_wider_type` + `binary_to_string_coercion` functions [datafusion]

2024-11-15 Thread via GitHub
jayzhan211 commented on code in PR #13370: URL: https://github.com/apache/datafusion/pull/13370#discussion_r1844725551 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -1550,6 +1552,62 @@ mod tests { ); } +#[test] +fn test_get_wider_type_with_

Re: [PR] Update root `README.md` and other documentation with latest changes [datafusion-ballista]

2024-11-15 Thread via GitHub
milenkovicm commented on PR #1113: URL: https://github.com/apache/datafusion-ballista/pull/1113#issuecomment-2480156150 thanks for your review @tbar4

Re: [PR] minor: Add hint for finding the GPG key to use when publishing to maven [datafusion-comet]

2024-11-15 Thread via GitHub
andygrove merged PR #1093: URL: https://github.com/apache/datafusion-comet/pull/1093

Re: [PR] Update root `README.md` and other documentation with latest changes [datafusion-ballista]

2024-11-15 Thread via GitHub
milenkovicm commented on code in PR #1113: URL: https://github.com/apache/datafusion-ballista/pull/1113#discussion_r1844690942 ## docs/source/contributors-guide/ballista_architecture.excalidraw.svg: ## Review Comment: yes, it's an optional feature (`rest-api`) of scheduler

Re: [PR] Add sort integration benchmark [datafusion]

2024-11-15 Thread via GitHub
alamb merged PR #13306: URL: https://github.com/apache/datafusion/pull/13306

Re: [PR] feat: Add `stringview` support to `encode` and `decode` and `bit_length` [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13332: URL: https://github.com/apache/datafusion/pull/13332#discussion_r1844306243 ## datafusion/functions/src/encoding/inner.rs: ## @@ -224,6 +224,7 @@ fn encode_process(value: &ColumnarValue, encoding: Encoding) -> Result match a.data_type() {

Re: [I] Missing ColumnarToRow when using CometSparkToColumnar [datafusion-comet]

2024-11-15 Thread via GitHub
bmorck commented on issue #1092: URL: https://github.com/apache/datafusion-comet/issues/1092#issuecomment-2480110799 @viirya It's not clear to me that the issue is related to the issue on Project (19) however, I noticed that the appropriate `ColumnarToRow` operators are injected when the f

Re: [PR] Support unparsing Array plan to SQL string [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13418: URL: https://github.com/apache/datafusion/pull/13418#discussion_r1844320853 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -182,6 +184,11 @@ fn roundtrip_statement() -> Result<()> { SUM(id) OVER (ROWS BETWEEN UNBOUNDED PREC

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
milenkovicm commented on code in PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#discussion_r1844671556 ## python/examples/example.py: ## @@ -0,0 +1,39 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
milenkovicm commented on code in PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#discussion_r1844661189 ## docs/source/user-guide/python.md: ## @@ -103,14 +119,15 @@ The `explain` method can be used to show the logical and physical query plans fo The fo

Re: [I] Missing ColumnarToRow when using CometSparkToColumnar [datafusion-comet]

2024-11-15 Thread via GitHub
viirya commented on issue #1092: URL: https://github.com/apache/datafusion-comet/issues/1092#issuecomment-2480121809 The only place in Comet planner to remove ColumnarToRowExec is when there is a combination ColumnarToRowExec + CometSparkToColumnarExec. As the combination is a no-op actual

Re: [I] Error running crypto functions on `Dictionary` arrays such as `md5` [datafusion]

2024-11-15 Thread via GitHub
Omega359 commented on issue #13444: URL: https://github.com/apache/datafusion/issues/13444#issuecomment-2480120237 take

Re: [I] Error running crypto functions on `Dictionary` arrays such as `md5` [datafusion]

2024-11-15 Thread via GitHub
Omega359 commented on issue #13444: URL: https://github.com/apache/datafusion/issues/13444#issuecomment-2480119872 I think this may have been a pre-existing issue. The core issue I think is that apparently md5 is required to return utf8: https://github.com/apache/datafusion/blob/6d8313ebc86

Re: [PR] [MINOR]: fix min max accumulator nan bug [datafusion]

2024-11-15 Thread via GitHub
akurmustafa commented on code in PR #13432: URL: https://github.com/apache/datafusion/pull/13432#discussion_r1844655863 ## datafusion/functions-aggregate/src/min_max.rs: ## @@ -113,8 +114,12 @@ macro_rules! primitive_max_accumulator { ($DATA_TYPE:ident, $NATIVE:ident, $PRIM

Re: [I] Getting "Endpoints of an Interval should have the same type" during plan analysis [datafusion]

2024-11-15 Thread via GitHub
Blizzara commented on issue #13417: URL: https://github.com/apache/datafusion/issues/13417#issuecomment-2480035814 Duplicate of https://github.com/apache/datafusion/issues/13186, fixed by https://github.com/apache/datafusion/pull/13187 (I still need to confirm that actually fixes it but see

[PR] fix docs of register_table to match implementation [datafusion]

2024-11-15 Thread via GitHub
adriangb opened a new pull request, #13438: URL: https://github.com/apache/datafusion/pull/13438 I'm not sure that changing the implementation is possible at this point. We could call deregister_table but I fear that's not atomic. So we'd have to change the implementation of SchemaProvider,

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
milenkovicm commented on PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#issuecomment-2480056929 @tbar4 I'm waiting for this pr to remove BallistaContext, please don't use it if possible. You can create SessionContext like ```rust let session_con

Re: [PR] Fallback to identifier parsing if expression parsing fails [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
iffyio commented on code in PR #1513: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1513#discussion_r1844243207 ## src/parser/mod.rs: ## @@ -1013,175 +1189,22 @@ impl<'a> Parser<'a> { let next_token = self.next_token(); let expr = match next_to

Re: [PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on PR #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100#issuecomment-2480048712 @milenkovicm I am using the deprecated BallistaContext, but just to define config and then passing the context created from the deprecated BallistaContext to the Python DataFus

Re: [I] Missing ColumnarToRow when using CometSparkToColumnar [datafusion-comet]

2024-11-15 Thread via GitHub
viirya commented on issue #1092: URL: https://github.com/apache/datafusion-comet/issues/1092#issuecomment-2480047198 > After analyzing the query optimization, I've found that the EliminateRedundantTransitions rule, removed a CometSparkToColumnar and subsequent ColumnarToRow following the

Re: [PR] Update root `README.md` and other documentation with latest changes [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1113: URL: https://github.com/apache/datafusion-ballista/pull/1113#discussion_r1844586967 ## docs/source/user-guide/rust.md: ## @@ -17,78 +17,126 @@ under the License. --> -# Ballista Rust Client +# Distributing DataFusion with Ballista -T

Re: [PR] Update root `README.md` and other documentation with latest changes [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1113: URL: https://github.com/apache/datafusion-ballista/pull/1113#discussion_r1844582907 ## docs/source/user-guide/faq.md: ## @@ -25,4 +25,4 @@ DataFusion is a library for executing queries in-process using the Apache Arrow model and computatio

Re: [I] Missing ColumnarToRow when using CometSparkToColumnar [datafusion-comet]

2024-11-15 Thread via GitHub
viirya commented on issue #1092: URL: https://github.com/apache/datafusion-comet/issues/1092#issuecomment-2480022360 I slightly updated the description to make the query plan readable.

[PR] minor: Add hint for finding the GPG key to use when publishing to maven [datafusion-comet]

2024-11-15 Thread via GitHub
andygrove opened a new pull request, #1093: URL: https://github.com/apache/datafusion-comet/pull/1093 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [PR] Update root `README.md` and other documentation with latest changes [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1113: URL: https://github.com/apache/datafusion-ballista/pull/1113#discussion_r1844577946 ## docs/source/user-guide/configs.md: ## @@ -19,46 +19,74 @@ # Configuration -## BallistaContext Configuration Settings +## Ballista Configuration Setti

Re: [PR] Update root `README.md` and other documentation with latest changes [datafusion-ballista]

2024-11-15 Thread via GitHub
tbar4 commented on code in PR #1113: URL: https://github.com/apache/datafusion-ballista/pull/1113#discussion_r1844575890 ## docs/source/contributors-guide/ballista_architecture.excalidraw.svg: ## Review Comment: Do we still have a rest implementation that works? I was test

[PR] Native scan [datafusion-comet]

2024-11-15 Thread via GitHub
viirya opened a new pull request, #1094: URL: https://github.com/apache/datafusion-comet/pull/1094 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes te

[PR] support column type definitions in table aliases [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
lovasoa opened a new pull request, #1526: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1526 ```sql SELECT * FROM some_fun() AS x (a TEXT, b INT) ``` fixes https://github.com/apache/datafusion-sqlparser-rs/issues/1524

Re: [PR] fix docs of register_table to match implementation [datafusion]

2024-11-15 Thread via GitHub
alamb merged PR #13438: URL: https://github.com/apache/datafusion/pull/13438

Re: [I] Missing ColumnarToRow when using CometSparkToColumnar [datafusion-comet]

2024-11-15 Thread via GitHub
andygrove commented on issue #1092: URL: https://github.com/apache/datafusion-comet/issues/1092#issuecomment-2479951952 @bmorck You can download 0.4.0-rc1 jar files from https://repository.apache.org/#nexus-search;quick~org.apache.datafusion

Re: [I] Missing ColumnarToRow when using CometSparkToColumnar [datafusion-comet]

2024-11-15 Thread via GitHub
andygrove commented on issue #1092: URL: https://github.com/apache/datafusion-comet/issues/1092#issuecomment-2479928442 I'm in the process of creating the first 0.4.0 release candidate and am uploading the jar files to a maven staging repository. It may be worth testing with this newer ver

Re: [PR] Fix `concat` simplifier for Utf8View types [datafusion]

2024-11-15 Thread via GitHub
alamb merged PR #13346: URL: https://github.com/apache/datafusion/pull/13346

Re: [PR] Add support for Utf8View to crypto functions [datafusion]

2024-11-15 Thread via GitHub
Omega359 commented on PR #13407: URL: https://github.com/apache/datafusion/pull/13407#issuecomment-2479816395 I'll take a look at this tonight if no one else beats me to it On Fri, Nov 15, 2024, 2:50 p.m. Andrew Lamb ***@***.***> wrote: > I tried to write some more tests for

Re: [PR] [MINOR]: fix min max accumulator nan bug [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13432: URL: https://github.com/apache/datafusion/pull/13432#discussion_r1844423330 ## datafusion/functions-aggregate/src/min_max.rs: ## @@ -113,8 +114,12 @@ macro_rules! primitive_max_accumulator { ($DATA_TYPE:ident, $NATIVE:ident, $PRIMTYPE:i

Re: [PR] Docs: Add Content Library Page to the docs [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13335: URL: https://github.com/apache/datafusion/pull/13335#discussion_r1844421341 ## docs/source/user-guide/concepts-readings-events.md: ## @@ -0,0 +1,140 @@ + + +# Concepts, Readings, Events + +## 🧭 Background Concepts + +- **2024-06-13**: [2024

[PR] Minor: Fix broken links for meetups in content library [datafusion]

2024-11-15 Thread via GitHub
alamb opened a new pull request, #13445: URL: https://github.com/apache/datafusion/pull/13445 ## Which issue does this PR close? Closes #. ## Rationale for this change @jonahgao noticed a bug in the translation of the content library added in - https://githu

Re: [PR] build: Skip installation of spark-integration and fuzz testing modules [datafusion-comet]

2024-11-15 Thread via GitHub
andygrove merged PR #1091: URL: https://github.com/apache/datafusion-comet/pull/1091

Re: [PR] Add sort integration benchmark [datafusion]

2024-11-15 Thread via GitHub
alamb commented on PR #13306: URL: https://github.com/apache/datafusion/pull/13306#issuecomment-2479836269 Thank you very much @2010YOUY01 -- very cool

Re: [PR] Add support for Utf8View to crypto functions [datafusion]

2024-11-15 Thread via GitHub
alamb commented on PR #13407: URL: https://github.com/apache/datafusion/pull/13407#issuecomment-2479798948 🚀

Re: [PR] Add support for Utf8View to crypto functions [datafusion]

2024-11-15 Thread via GitHub
alamb commented on PR #13407: URL: https://github.com/apache/datafusion/pull/13407#issuecomment-2479812462 I tried to write some more tests for this code and I think it may have introduced a bug. See - in https://github.com/apache/datafusion/pull/13443 - https://github.com/apache/dat

[I] Error running crypto functions on `Dictionary` arrays such as `md5` [datafusion]

2024-11-15 Thread via GitHub
alamb opened a new issue, #13444: URL: https://github.com/apache/datafusion/issues/13444 ### Describe the bug A regression appears to have been introduced in - https://github.com/apache/datafusion/pull/13407 T ### To Reproduce This used to work in DataFusion 42

[I] Missing ColumnarToRow when using CometSparkToColumnar [datafusion-comet]

2024-11-15 Thread via GitHub
bmorck opened a new issue, #1092: URL: https://github.com/apache/datafusion-comet/issues/1092 ### Describe the bug I'm working on some internal benchmarks using Comet with Spark 3.3 and Iceberg. To support iceberg, we are including the config, which inserts the `CometSparkToColumnar`

Re: [PR] Fix redundant data copying in unnest [datafusion]

2024-11-15 Thread via GitHub
duongcongtoai commented on PR #13441: URL: https://github.com/apache/datafusion/pull/13441#issuecomment-2479782009 nice, i will take a look

[PR] Add tests for crypto functions [datafusion]

2024-11-15 Thread via GitHub
alamb opened a new pull request, #13443: URL: https://github.com/apache/datafusion/pull/13443 Draft as the tests are failing (I need to file a bug) ## Which issue does this PR close? Follow on to https://github.com/apache/datafusion/pull/13407 ## Rationale for this change

Re: [I] Add support for Utf8View to crypto functions [datafusion]

2024-11-15 Thread via GitHub
alamb closed issue #13406: Add support for Utf8View to crypto functions URL: https://github.com/apache/datafusion/issues/13406

Re: [PR] Add support for Utf8View to crypto functions [datafusion]

2024-11-15 Thread via GitHub
alamb merged PR #13407: URL: https://github.com/apache/datafusion/pull/13407

Re: [PR] Deduplicate and standardize deserialization logic for streams [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13412: URL: https://github.com/apache/datafusion/pull/13412#discussion_r1844359572 ## datafusion/core/src/datasource/physical_plan/csv.rs: ## @@ -651,36 +651,14 @@ impl FileOpener for CsvOpener { Ok(futures::stream::iter(config

Re: [PR] Add `#[recursive]` [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
blaginin commented on PR #1522: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1522#issuecomment-2479773461 FYI, I marked this ready for review. @peter-toth @Eason0729, if you guys want to take a look 🌻

Re: [PR] fix: serialize user-defined window functions to proto [datafusion]

2024-11-15 Thread via GitHub
alamb commented on PR #13421: URL: https://github.com/apache/datafusion/pull/13421#issuecomment-2479765028 Thanks again @jcsherin

Re: [PR] fix: serialize user-defined window functions to proto [datafusion]

2024-11-15 Thread via GitHub
alamb merged PR #13421: URL: https://github.com/apache/datafusion/pull/13421

Re: [I] Error when serializing physical window functions to proto [datafusion]

2024-11-15 Thread via GitHub
alamb closed issue #13401: Error when serializing physical window functions to proto URL: https://github.com/apache/datafusion/issues/13401

Re: [PR] Add support for utf8view to nvl function [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13382: URL: https://github.com/apache/datafusion/pull/13382#discussion_r1844349849 ## datafusion/sqllogictest/test_files/string/string_view.slt: ## @@ -935,12 +935,19 @@ logical_plan 01)Projection: to_timestamp(test.column1_utf8view, Utf8("a,b,c,d

Re: [PR] Add support for utf8view to nvl function [datafusion]

2024-11-15 Thread via GitHub
alamb merged PR #13382: URL: https://github.com/apache/datafusion/pull/13382

Re: [PR] Improve documentation (and ASCII art) about streaming execution, and thread pools [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13423: URL: https://github.com/apache/datafusion/pull/13423#discussion_r1844349133 ## datafusion/core/src/lib.rs: ## @@ -382,13 +382,13 @@ //! //! Calling [`execute`] produces 1 or more partitions of data, //! as a [`SendableRecordBatchStream`],

Re: [PR] Fix Duplicated filters within (filter(TableScan)) plan [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13422: URL: https://github.com/apache/datafusion/pull/13422#discussion_r1844328030 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -1146,6 +1146,33 @@ fn test_join_with_table_scan_filters() -> Result<()> { assert_eq!(sql.to_string(), exp

Re: [PR] Fix `concat` simplifier for Utf8View types [datafusion]

2024-11-15 Thread via GitHub
timsaucer commented on PR #13346: URL: https://github.com/apache/datafusion/pull/13346#issuecomment-2479753274 @alamb this is ready for re-review

Re: [PR] Make DFSchema::datatype_is_semantically_equal public [datafusion]

2024-11-15 Thread via GitHub
alamb merged PR #13429: URL: https://github.com/apache/datafusion/pull/13429

Re: [PR] Add `#[recursive]` [datafusion-sqlparser-rs]

2024-11-15 Thread via GitHub
blaginin commented on PR #1522: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1522#issuecomment-2479741814 Looks like this PR won't slow down anything ``` critcmp main recursive groupmain

Re: [PR] fix docs of register_table to match implementation [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13438: URL: https://github.com/apache/datafusion/pull/13438#discussion_r1844337539 ## datafusion/core/src/execution/context/mod.rs: ## @@ -1312,8 +1312,8 @@ impl SessionContext { /// Registers a [`TableProvider`] as a table that can be //

[PR] Minor: Remove MOVED file [datafusion]

2024-11-15 Thread via GitHub
alamb opened a new pull request, #13442: URL: https://github.com/apache/datafusion/pull/13442 ## Which issue does this PR close? Closes #. ## Rationale for this change This file was left over from when we started migrating sqllogictests to their own crate ; Now t

Re: [PR] Fix Binary & Binary View Unparsing [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13427: URL: https://github.com/apache/datafusion/pull/13427#discussion_r1844331548 ## datafusion/sql/src/unparser/expr.rs: ## @@ -2167,6 +2174,39 @@ mod tests { } } +#[test] +fn test_cast_value_to_binary_expr() { +le

Re: [PR] Fix Duplicated filters within (filter(TableScan)) plan [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13422: URL: https://github.com/apache/datafusion/pull/13422#discussion_r1844328460 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -1146,6 +1146,33 @@ fn test_join_with_table_scan_filters() -> Result<()> { assert_eq!(sql.to_string(), exp

[PR] Remove build-in object store registry [datafusion-ballista]

2024-11-15 Thread via GitHub
milenkovicm opened a new pull request, #1114: URL: https://github.com/apache/datafusion-ballista/pull/1114 ... as users can plug in their own now. # Which issue does this PR close? relates to #1067 # Rationale for this change Ballista provides a way to use

Re: [PR] feat(substrait): replace SessionContext with a trait [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13343: URL: https://github.com/apache/datafusion/pull/13343#discussion_r1844310157 ## datafusion/substrait/src/logical_plan/context.rs: ## @@ -0,0 +1,54 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [PR] chore: Prepare for 0.5.0 development [datafusion-comet]

2024-11-15 Thread via GitHub
andygrove merged PR #1090: URL: https://github.com/apache/datafusion-comet/pull/1090

Re: [PR] Fix Duplicated filters within (filter(TableScan)) plan [datafusion]

2024-11-15 Thread via GitHub
alamb commented on code in PR #13422: URL: https://github.com/apache/datafusion/pull/13422#discussion_r1844325950 ## datafusion/sql/src/unparser/utils.rs: ## @@ -318,7 +318,9 @@ pub(crate) fn try_transform_to_simple_table_scan_with_filters( plan_stack.push(alia

Re: [I] `ParquetExec::statistics()` does not read statistics for many column types (like timstamps, strings, etc) [datafusion]

2024-11-15 Thread via GitHub
alamb closed issue #8295: `ParquetExec::statistics()` does not read statistics for many column types (like timstamps, strings, etc) URL: https://github.com/apache/datafusion/issues/8295
