Re: [I] Provide Arrow Schema Hint to Parquet Reader [arrow-rs]

2024-04-17 Thread via GitHub
liukun4515 commented on issue #5657: URL: https://github.com/apache/arrow-rs/issues/5657#issuecomment-2063038264 > > I think my expectation would be for you to provide the `SchemaRef` for the entire file > Basically agree with your idea In the datafusion, the `ParquetExec` of

Re: [I] Provide Arrow Schema Hint to Parquet Reader [arrow-rs]

2024-04-17 Thread via GitHub
liukun4515 commented on issue #5657: URL: https://github.com/apache/arrow-rs/issues/5657#issuecomment-2063030971 > ```rust > ParquetRecordBatchReaderBuilder > ``` In the datafusion, the `ParquetExec` of `FileScanConfig` contains the schema for the parquet file, but I think the

Re: [I] Provide Arrow Schema Hint to Parquet Reader [arrow-rs]

2024-04-17 Thread via GitHub
liukun4515 commented on issue #5657: URL: https://github.com/apache/arrow-rs/issues/5657#issuecomment-2063031558 > I think my expectation would be for you to provide the `SchemaRef` for the entire file In the datafusion, the `ParquetExec` of `FileScanConfig` contains the schema for

Re: [PR] Fix large futures causing stack overflows [arrow-datafusion]

2024-04-17 Thread via GitHub
sergiimk commented on code in PR #10033: URL: https://github.com/apache/arrow-datafusion/pull/10033#discussion_r1569902481 ## datafusion/core/src/execution/context/mod.rs: ## @@ -471,24 +471,37 @@ impl SessionContext { /// [`SQLOptions::verify_plan`]. pub async fn

Re: [I] DataFusion weekly project plan (Andrew Lamb) - April 8, 2024 [arrow-datafusion]

2024-04-17 Thread via GitHub
liukun4515 commented on issue #10002: URL: https://github.com/apache/arrow-datafusion/issues/10002#issuecomment-2062930505 > 4 BuiltInScalarFunctions left Hi @alamb, I scanned some PRs about https://github.com/apache/arrow-datafusion/issues/9285, and found the item of

Re: [I] Scalar function in DataFusion cannot coerce dictionary type inputs [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya commented on issue #265: URL: https://github.com/apache/arrow-datafusion-comet/issues/265#issuecomment-2062925130 The diff is updated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] Scalar function in DataFusion cannot coerce dictionary type inputs [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya closed issue #265: Scalar function in DataFusion cannot coerce dictionary type inputs URL: https://github.com/apache/arrow-datafusion-comet/issues/265

Re: [PR] GH-39722: [JS] Clean up packaging [arrow]

2024-04-17 Thread via GitHub
domoritz merged PR #39723: URL: https://github.com/apache/arrow/pull/39723

Re: [PR] GH-39722: [JS] Clean up packaging [arrow]

2024-04-17 Thread via GitHub
domoritz commented on PR #39723: URL: https://github.com/apache/arrow/pull/39723#issuecomment-2062921058 Nice. I took it for a spin and source maps etc seem to work well even for the UMD bundles. Looks good.

Re: [I] [JS] Remove unnecessary production dependencies [arrow]

2024-04-17 Thread via GitHub
domoritz commented on issue #40108: URL: https://github.com/apache/arrow/issues/40108#issuecomment-2062915087 arrow2csv is a bin provided by the library so we need to keep it as a dependency. We could probably remove that script, though, but that's for another day. I removed swc in

Re: [PR] fix: incorrect result with aggregate expression with filter [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya commented on PR #284: URL: https://github.com/apache/arrow-datafusion-comet/pull/284#issuecomment-2062914211 Merged. Thanks.

Re: [I] Aggregate expression with filter is incorrectly translated to Comet aggregate native aggregation expression [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya closed issue #283: Aggregate expression with filter is incorrectly translated to Comet aggregate native aggregation expression URL: https://github.com/apache/arrow-datafusion-comet/issues/283

Re: [PR] fix: incorrect result with aggregate expression with filter [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya merged PR #284: URL: https://github.com/apache/arrow-datafusion-comet/pull/284

Re: [PR] GH-41240: [Release][Packaging] Use Debian bookworm for uploading binaries [arrow]

2024-04-17 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #41241: URL: https://github.com/apache/arrow/pull/41241#issuecomment-2062910097 After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit c3fd79f16e4e580ab4277ec70c3e2fe7b922bbd2. There were 3

Re: [PR] GH-41102: [Packaging][Release] Create unique git tags for release candidates (e.g. apache-arrow-{MAJOR}.{MINOR}.{PATCH}-rc{RC_NUM}) [arrow]

2024-04-17 Thread via GitHub
kou commented on PR #41131: URL: https://github.com/apache/arrow/pull/41131#issuecomment-2062909455 Thanks! Let's wait for reviews from others. (We may need to ping them later.)

Re: [PR] GH-40108: [JS] Remove dependencies [arrow]

2024-04-17 Thread via GitHub
github-actions[bot] commented on PR #41274: URL: https://github.com/apache/arrow/pull/41274#issuecomment-2062909330 :warning: GitHub issue #40108 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-40108: [JS] Remove dependencies [arrow]

2024-04-17 Thread via GitHub
domoritz opened a new pull request, #41274: URL: https://github.com/apache/arrow/pull/41274 Remove some dependencies to make the package lighter.

Re: [I] [JS] Compatibility Issue with Apache Arrow Library in Angular Application [arrow]

2024-04-17 Thread via GitHub
domoritz commented on issue #39970: URL: https://github.com/apache/arrow/issues/39970#issuecomment-2062902873 Can you provide a specific (minimal) reproduction? Also, make sure your bundler pulls in `Arrow.dom` not `Arrow.node`.

Re: [PR] feat: Add manual test to calculate spark builtin functions coverage [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
advancedxy commented on code in PR #263: URL: https://github.com/apache/arrow-datafusion-comet/pull/263#discussion_r1569832400 ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software Foundation

Re: [I] HAVING doesn't work with ORDER BY [arrow-datafusion]

2024-04-17 Thread via GitHub
jonahgao commented on issue #10013: URL: https://github.com/apache/arrow-datafusion/issues/10013#issuecomment-2062867665 take

Re: [I] [Python][Parquet] Attempt to encrypt column of type 'list' produces OSError [arrow]

2024-04-17 Thread via GitHub
tritzman commented on issue #41246: URL: https://github.com/apache/arrow/issues/41246#issuecomment-2062856650 In my application code, when I call `write_dataset`, I have a file_visitor that collects metadata as Parquet files are created. Looking at the `pyarrow.dataset.WrittenFile`'s

Re: [PR] feat: `DataFrame` supports unnesting multiple columns [arrow-datafusion]

2024-04-17 Thread via GitHub
jonahgao commented on code in PR #10118: URL: https://github.com/apache/arrow-datafusion/pull/10118#discussion_r1569794436 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1437,6 +1438,91 @@ async fn unnest_analyze_metrics() -> Result<()> { Ok(()) } + +#[tokio::test]

Re: [PR] GH-23221: [Python] python changes for pyodide build [arrow]

2024-04-17 Thread via GitHub
kou commented on PR #37822: URL: https://github.com/apache/arrow/pull/37822#issuecomment-2062848787 Could you run `pre-commit run -a` to fix lint failures?

Re: [PR] fix: incorrect result with aggregate expression with filter [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya commented on PR #284: URL: https://github.com/apache/arrow-datafusion-comet/pull/284#issuecomment-2062836233 Another unsupported case of aggregation but Comet incorrectly takes it. cc @huaxingao @sunchao @andygrove

[PR] fix: incorrect result with aggregate expression with filter [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya opened a new pull request, #284: URL: https://github.com/apache/arrow-datafusion-comet/pull/284 ## Which issue does this PR close? Closes #283. ## Rationale for this change ## What changes are included in this PR? ## How are these

Re: [PR] feat: Port Datafusion Covariance to Comet [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
huaxingao commented on PR #234: URL: https://github.com/apache/arrow-datafusion-comet/pull/234#issuecomment-2062825551 Thanks @viirya

[I] Aggregate expression with filter is incorrectly translated to Comet aggregate native aggregation expression [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya opened a new issue, #283: URL: https://github.com/apache/arrow-datafusion-comet/issues/283 ### Describe the bug Aggregate expression with filter is incorrectly translated to Comet aggregate native aggregation expression without filter now. It causes at least one query failure.

Re: [I] [Python] Expose the device interface through the Arrow PyCapsule protocol [arrow]

2024-04-17 Thread via GitHub
vyasr commented on issue #38325: URL: https://github.com/apache/arrow/issues/38325#issuecomment-2062811403 Hmm my interpretation of various comments above like this one: > Well, for me the question is: how do we later add options to the API without breaking compatibility with

Re: [PR] feat: Port Datafusion Covariance to Comet [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya commented on PR #234: URL: https://github.com/apache/arrow-datafusion-comet/pull/234#issuecomment-2062806816 Merged. Thanks.

Re: [PR] feat: Port Datafusion Covariance to Comet [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya merged PR #234: URL: https://github.com/apache/arrow-datafusion-comet/pull/234

Re: [PR] feat: Add manual test to calculate spark builtin functions coverage [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
comphead commented on PR #263: URL: https://github.com/apache/arrow-datafusion-comet/pull/263#issuecomment-2062801113 Something is wrong with TPC-DS Correctness: it has been running for 5 hours and is stuck on downloading Maven deps

Re: [PR] feat: Add extended explain info to Comet plan [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
parthchandra commented on PR #255: URL: https://github.com/apache/arrow-datafusion-comet/pull/255#issuecomment-2062785797 @andygrove I changed the core of the implementation. Instead of setting information in a CometExplainInfo structure and bubbling it up in the plan, I now set the

Re: [PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
jayzhan211 commented on PR #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115#issuecomment-2062784683 Thanks @peter-toth @alamb @comphead

Re: [PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
jayzhan211 merged PR #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115

Re: [PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
jayzhan211 commented on code in PR #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115#discussion_r1569728570 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -140,140 +139,132 @@ struct UnwrapCastExprRewriter { impl TreeNodeRewriter for

Re: [I] [EPIC] Improve the performance of ListingTable [arrow-datafusion]

2024-04-17 Thread via GitHub
Lordworms commented on issue #9964: URL: https://github.com/apache/arrow-datafusion/issues/9964#issuecomment-2062774430 > @Lordworms thanks for the work on this. > > Just to confirm - what was the improvement in milliseconds we saw from the object meta cache? For context, in low

Re: [PR] GH-40964: [CI][Archery] Archery linking should also check for undefined symbols Linux [arrow]

2024-04-17 Thread via GitHub
vibhatha commented on code in PR #40520: URL: https://github.com/apache/arrow/pull/40520#discussion_r1569712391 ## dev/archery/archery/linking.py: ## @@ -61,9 +63,83 @@ def list_dependency_names(self): names.append(name) return names +def

Re: [PR] feat: `DataFrame` supports unnesting multiple columns [arrow-datafusion]

2024-04-17 Thread via GitHub
jayzhan211 commented on code in PR #10118: URL: https://github.com/apache/arrow-datafusion/pull/10118#discussion_r1569710494 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1437,6 +1438,91 @@ async fn unnest_analyze_metrics() -> Result<()> { Ok(()) } +

Re: [I] fix Sort Merge Join to pass TPCH tests [arrow-datafusion]

2024-04-17 Thread via GitHub
comphead commented on issue #10100: URL: https://github.com/apache/arrow-datafusion/issues/10100#issuecomment-2062762876 Narrowed down the problem to query ``` with t as (select 1 a, 1 b), t1 as (select 1 a, 1 b) select * from t, t1 where t.a = t1.a and exists (select

Re: [PR] GH-40407: [JS] Fix string coercion in MapRowProxyHandler.ownKeys [arrow]

2024-04-17 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #40408: URL: https://github.com/apache/arrow/pull/40408#issuecomment-2062762405 After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 117460b12f5a9f99f961f634e5a2ea85ad445992. There were

Re: [PR] feat: `DataFrame` supports unnesting multiple columns [arrow-datafusion]

2024-04-17 Thread via GitHub
jayzhan211 commented on code in PR #10118: URL: https://github.com/apache/arrow-datafusion/pull/10118#discussion_r1569707146 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1437,6 +1438,91 @@ async fn unnest_analyze_metrics() -> Result<()> { Ok(()) } +

Re: [PR] GH-40964: [CI][Archery] Archery linking should also check for undefined symbols Linux [arrow]

2024-04-17 Thread via GitHub
vibhatha commented on code in PR #40520: URL: https://github.com/apache/arrow/pull/40520#discussion_r1569708081 ## dev/archery/archery/linking.py: ## @@ -61,9 +63,83 @@ def list_dependency_names(self): names.append(name) return names +def

Re: [PR] GH-41263: [C#][Integration] Ensure offset is considered in all branches of the bitmap comparison [arrow]

2024-04-17 Thread via GitHub
paleolimbot commented on PR #41264: URL: https://github.com/apache/arrow/pull/41264#issuecomment-2062751775 I'm not sure what's going on with the zerolength case, but it seems to be failing for Java and C# producing (with C# consuming): ``` # FAILURES

Re: [I] [Java] Remove deprecated code from Arrow Java [arrow]

2024-04-17 Thread via GitHub
vibhatha commented on issue #15167: URL: https://github.com/apache/arrow/issues/15167#issuecomment-2062751633 Thanks @kou

Re: [I] Install instructions for R package fail to install parquet format functionality [arrow]

2024-04-17 Thread via GitHub
amoeba commented on issue #41265: URL: https://github.com/apache/arrow/issues/41265#issuecomment-2062731451 I'm going to close this but feel free to re-open if anything changes or file a new issue if you run into anything else.

Re: [I] Install instructions for R package fail to install parquet format functionality [arrow]

2024-04-17 Thread via GitHub
aaelony-fb commented on issue #41265: URL: https://github.com/apache/arrow/issues/41265#issuecomment-2062662844 That works on my end. Thank-you!

Re: [I] fix Sort Merge Join to pass TPCH tests [arrow-datafusion]

2024-04-17 Thread via GitHub
comphead commented on issue #10100: URL: https://github.com/apache/arrow-datafusion/issues/10100#issuecomment-2062647591 Only Q21 failing

Re: [PR] Add missing JS changes for arrow 16 [arrow]

2024-04-17 Thread via GitHub
domoritz commented on PR #41261: URL: https://github.com/apache/arrow/pull/41261#issuecomment-2062616848 Too late for the release. Will try to get these into 16.1.

Re: [PR] Add missing JS changes for arrow 16 [arrow]

2024-04-17 Thread via GitHub
domoritz closed pull request #41261: Add missing JS changes for arrow 16 URL: https://github.com/apache/arrow/pull/41261

Re: [PR] Fix large futures causing stack overflows [arrow-datafusion]

2024-04-17 Thread via GitHub
andygrove commented on code in PR #10033: URL: https://github.com/apache/arrow-datafusion/pull/10033#discussion_r1569630334 ## datafusion/core/src/execution/context/mod.rs: ## @@ -471,24 +471,37 @@ impl SessionContext { /// [`SQLOptions::verify_plan`]. pub async fn

Re: [PR] Update proc-macro2 requirement from =1.0.80 to =1.0.81 [arrow-rs]

2024-04-17 Thread via GitHub
tustvold merged PR #5659: URL: https://github.com/apache/arrow-rs/pull/5659

Re: [I] arrow failing on mac prerel [arrow]

2024-04-17 Thread via GitHub
assignUser commented on issue #41267: URL: https://github.com/apache/arrow/issues/41267#issuecomment-2062577077 This is caused by gnulibtool being on the path. Brew specifically warns against doing this and it has to be added manually. We have a check for this in 16.0.0 (which

Re: [PR] GH-41229: [C++] FS: Support naive GCS Async Close [arrow]

2024-04-17 Thread via GitHub
felipecrv commented on PR #41232: URL: https://github.com/apache/arrow/pull/41232#issuecomment-2062514844 > Oh, another problem here is bool closed() is weird after this patch @mapleFU that weirdness is related to what I was referring to in the thread above. What `closed_` means

Re: [I] [EPIC] Improve the performance of ListingTable [arrow-datafusion]

2024-04-17 Thread via GitHub
matthewmturner commented on issue #9964: URL: https://github.com/apache/arrow-datafusion/issues/9964#issuecomment-2062512489 @Lordworms thanks for the work on this. Just to confirm - what was the improvement in milliseconds we saw from the object meta cache? For context, in low

Re: [I] Stop copying `LogicalPlan` during OptimizerPasses [arrow-datafusion]

2024-04-17 Thread via GitHub
Lordworms commented on issue #9637: URL: https://github.com/apache/arrow-datafusion/issues/9637#issuecomment-2062455668 I am interested in refactoring some of the OptimizeRule, I'll start with some easy optimization rules

Re: [PR] Minor: only trigger dependency check on changes to Cargo.toml [arrow-datafusion]

2024-04-17 Thread via GitHub
Jefffrey merged PR #10099: URL: https://github.com/apache/arrow-datafusion/pull/10099

Re: [PR] fix: another non-deterministic test in `joins.slt` [arrow-datafusion]

2024-04-17 Thread via GitHub
Jefffrey merged PR #10122: URL: https://github.com/apache/arrow-datafusion/pull/10122

Re: [I] Configure dependabot to auto upgrade dependencies for datafusion-cli [arrow-datafusion]

2024-04-17 Thread via GitHub
Jefffrey closed issue #10106: Configure dependabot to auto upgrade dependencies for datafusion-cli URL: https://github.com/apache/arrow-datafusion/issues/10106

Re: [PR] Update dependabot to consider datafusion-cli [arrow-datafusion]

2024-04-17 Thread via GitHub
Jefffrey merged PR #10108: URL: https://github.com/apache/arrow-datafusion/pull/10108

Re: [PR] WIP: [Release] Verify release-16.0.0-rc0 [arrow]

2024-04-17 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #41235: URL: https://github.com/apache/arrow/pull/41235#issuecomment-2062412339 Thanks for your patience. Conbench analyzed the 7 benchmarking runs that have been run so far on PR commit 6a28035c2b49b432dc63f5ee7524d76b4ed2d762. There was 1

Re: [I] csharp/src/Drivers/Apache: Rename this project and add NuGet creation [arrow-adbc]

2024-04-17 Thread via GitHub
davidhcoe commented on issue #1726: URL: https://github.com/apache/arrow-adbc/issues/1726#issuecomment-2062350106 I also thought about Apache.Arrow.Adbc.Drivers.Thrift since things all build on that, but that seemed less helpful

Re: [I] Provide Arrow Schema Hint to Parquet Reader [arrow-rs]

2024-04-17 Thread via GitHub
Lordworms commented on issue #5657: URL: https://github.com/apache/arrow-rs/issues/5657#issuecomment-2062331038 Let me try the remaining part if it is ok

Re: [I] [EPIC] Improve the performance of ListingTable [arrow-datafusion]

2024-04-17 Thread via GitHub
Lordworms commented on issue #9964: URL: https://github.com/apache/arrow-datafusion/issues/9964#issuecomment-2062327479 I have implemented a basic LRU metadata cache, and I think just caching the metadata would get a slight performance improvement (we call the List_Object API just once but

[PR] LRU DashMap to cache objectMeta [arrow-datafusion]

2024-04-17 Thread via GitHub
Lordworms opened a new pull request, #10125: URL: https://github.com/apache/arrow-datafusion/pull/10125 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [I] Install instructions for R package fail to install parquet format functionality [arrow]

2024-04-17 Thread via GitHub
amoeba commented on issue #41265: URL: https://github.com/apache/arrow/issues/41265#issuecomment-2062301107 Hi @aaelony-fb and @joonlee3, sorry for the difficulty. The Mac build on CRAN is broken at the moment and, while we work on a fix, the best way to install arrow R on a Mac is from

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
aljazerzen commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569527028 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See

Re: [I] CLI docs page is showing 404 [arrow-datafusion]

2024-04-17 Thread via GitHub
samuelcolvin commented on issue #10124: URL: https://github.com/apache/arrow-datafusion/issues/10124#issuecomment-2062263486 Thanks. Redirect would be good.

Re: [I] CLI docs page is showing 404 [arrow-datafusion]

2024-04-17 Thread via GitHub
tinfoil-knight commented on issue #10124: URL: https://github.com/apache/arrow-datafusion/issues/10124#issuecomment-2062256336 @samuelcolvin Docs for CLI were recently moved to https://arrow.apache.org/datafusion/user-guide/cli/index.html in this PR: #10078 @alamb could we set up a

Re: [I] [Java] Remove deprecated code from Arrow Java [arrow]

2024-04-17 Thread via GitHub
kou commented on issue #15167: URL: https://github.com/apache/arrow/issues/15167#issuecomment-2062249910 Added!

Re: [PR] GH-41102: [Packaging][Release] Create unique git tags for release candidates (e.g. apache-arrow-{MAJOR}.{MINOR}.{PATCH}-rc{RC_NUM}) [arrow]

2024-04-17 Thread via GitHub
sgilmore10 commented on PR #41131: URL: https://github.com/apache/arrow/pull/41131#issuecomment-2062154256 > I haven't tried this but I think that it will work. Hi @Kou, I just tested out the most recent changes - with a few minor tweaks - using the mathworks/arrow repo. Here's a

[I] CLI docs page is showing 404 [arrow-datafusion]

2024-04-17 Thread via GitHub
samuelcolvin opened a new issue, #10124: URL: https://github.com/apache/arrow-datafusion/issues/10124 ### Describe the bug The CLI docs are down - https://arrow.apache.org/datafusion/user-guide/cli.html they're just showing 404 "Not Found" ### To Reproduce go to

Re: [PR] GH-41262: [Java][FlightSQL] Implement stateless prepared statements [arrow]

2024-04-17 Thread via GitHub
stevelorddremio commented on code in PR #41237: URL: https://github.com/apache/arrow/pull/41237#discussion_r1569451408 ## java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlStatelessExample.java: ## @@ -0,0 +1,246 @@ +/* + * Licensed to the Apache

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
alexandreyc commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569418062 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See

Re: [PR] GH-41262: [Java][FlightSQL] Implement stateless prepared statements [arrow]

2024-04-17 Thread via GitHub
stevelorddremio commented on code in PR #41237: URL: https://github.com/apache/arrow/pull/41237#discussion_r1569416848 ## java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/test/TestFlightSqlStateless.java: ## @@ -0,0 +1,88 @@ +/* + * Licensed to the Apache

Re: [PR] GH-41262: [Java][FlightSQL] Implement stateless prepared statements [arrow]

2024-04-17 Thread via GitHub
stevelorddremio commented on code in PR #41237: URL: https://github.com/apache/arrow/pull/41237#discussion_r1569414105 ## java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlStatelessExample.java: ## @@ -0,0 +1,246 @@ +/* + * Licensed to the Apache

Re: [I] Install instructions for R package fail to install parquet format functionality [arrow]

2024-04-17 Thread via GitHub
joonlee3 commented on issue #41265: URL: https://github.com/apache/arrow/issues/41265#issuecomment-2062083069 I am experiencing the same issue right now. I am also wondering how to make the **arrow** package compatible with parquet on an M2 MacBook Pro.

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
alexandreyc commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569409474 ## rust2/core/Cargo.toml: ## @@ -0,0 +1,29 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the

Re: [PR] fix: Comet should not translate try_sum to native sum expression [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya merged PR #277: URL: https://github.com/apache/arrow-datafusion-comet/pull/277

Re: [PR] fix: Comet should not translate try_sum to native sum expression [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya commented on PR #277: URL: https://github.com/apache/arrow-datafusion-comet/pull/277#issuecomment-2062042410 Merged. Thanks @huaxingao @sunchao @advancedxy

Re: [I] Comet should not translate try_sum to native sum expression [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya closed issue #276: Comet should not translate try_sum to native sum expression URL: https://github.com/apache/arrow-datafusion-comet/issues/276

Re: [PR] fix: Comet should not translate try_sum to native sum expression [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
viirya commented on PR #277: URL: https://github.com/apache/arrow-datafusion-comet/pull/277#issuecomment-2062041872 Thank you @sunchao

Re: [PR] ci(python): upload nightly python packages [arrow-nanoarrow]

2024-04-17 Thread via GitHub
raulcd commented on PR #429: URL: https://github.com/apache/arrow-nanoarrow/pull/429#issuecomment-2062037046 By default packages uploaded are private. I had to mark the package as public. It should appear now here: https://pypi.fury.io/arrow-nightlies/nanoarrow

Re: [PR] Move coalesce to datafusion-functions and remove BuiltInScalarFunction [arrow-datafusion]

2024-04-17 Thread via GitHub
Omega359 commented on code in PR #10098: URL: https://github.com/apache/arrow-datafusion/pull/10098#discussion_r1569374404 ## datafusion/expr/src/expr.rs: ## @@ -1276,7 +1260,7 @@ impl Expr { pub fn short_circuits() -> bool { match self {

Re: [I] Provide Arrow Schema Hint to Parquet Reader [arrow-rs]

2024-04-17 Thread via GitHub
tustvold commented on issue #5657: URL: https://github.com/apache/arrow-rs/issues/5657#issuecomment-2062024150 I think my expectation would be for you to provide the `SchemaRef` for the entire file

Re: [PR] Fix large futures causing stack overflows [arrow-datafusion]

2024-04-17 Thread via GitHub
sergiimk commented on code in PR #10033: URL: https://github.com/apache/arrow-datafusion/pull/10033#discussion_r1569364304 ## datafusion/core/src/dataframe/mod.rs: ## @@ -156,7 +156,7 @@ impl Default for DataFrameWriteOptions { /// ``` #[derive(Debug, Clone)] pub struct

Re: [PR] Fix large futures causing stack overflows [arrow-datafusion]

2024-04-17 Thread via GitHub
sergiimk commented on PR #10033: URL: https://github.com/apache/arrow-datafusion/pull/10033#issuecomment-2062022937 @alamb thanks, I resolved all your comments. Created backport PR here https://github.com/apache/arrow-datafusion/pull/10123 (fix only, no linting). Please

Re: [PR] Fix intermittent CI test failure in `joins.slt` [arrow-datafusion]

2024-04-17 Thread via GitHub
viirya commented on PR #10120: URL: https://github.com/apache/arrow-datafusion/pull/10120#issuecomment-2062021948 Thank you for fixing the CI! @alamb

[PR] Backport: Lighten DataFrame size and fix large futures warnings [arrow-datafusion]

2024-04-17 Thread via GitHub
sergiimk opened a new pull request, #10123: URL: https://github.com/apache/arrow-datafusion/pull/10123 **Note:** this targets `branch-37`, NOT `main`. ## Which issue does this PR close? Part of 37 maintenance release

Re: [PR] feat(rust): add public abstract API and dummy driver implementation [arrow-adbc]

2024-04-17 Thread via GitHub
alexandreyc commented on code in PR #1725: URL: https://github.com/apache/arrow-adbc/pull/1725#discussion_r1569356502 ## rust2/core/src/lib.rs: ## @@ -0,0 +1,520 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See

Re: [PR] improve monotonicity api [arrow-datafusion]

2024-04-17 Thread via GitHub
tinfoil-knight commented on PR #10117: URL: https://github.com/apache/arrow-datafusion/pull/10117#issuecomment-2062012465 @ozankabak Are you suggesting something like this? ```rust use std::ops::Deref; struct LimitedVector(Vec>); impl LimitedVector { fn

Re: [PR] Move coalesce to datafusion-functions and remove BuiltInScalarFunction [arrow-datafusion]

2024-04-17 Thread via GitHub
comphead commented on code in PR #10098: URL: https://github.com/apache/arrow-datafusion/pull/10098#discussion_r1569320216 ## datafusion/expr/src/expr.rs: ## @@ -1276,7 +1260,7 @@ impl Expr { pub fn short_circuits() -> bool { match self {

Re: [I] Epic: Unified TreeNode rewrite API [arrow-datafusion]

2024-04-17 Thread via GitHub
peter-toth commented on issue #8913: URL: https://github.com/apache/arrow-datafusion/issues/8913#issuecomment-2061980606 > Well, since you asked @peter-toth  migrating other passes listed on to use the TreeNode API would certainly be helpful. I know CSE is a big one. #9873 might also be

Re: [PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
comphead commented on code in PR #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115#discussion_r1569309601 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -140,140 +139,132 @@ struct UnwrapCastExprRewriter { impl TreeNodeRewriter for

Re: [PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
peter-toth commented on code in PR #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115#discussion_r1569303691 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -140,140 +139,132 @@ struct UnwrapCastExprRewriter { impl TreeNodeRewriter for

Re: [PR] GH-41262: [Java][FlightSQL] Implement stateless prepared statements [arrow]

2024-04-17 Thread via GitHub
stevelorddremio closed pull request #41237: GH-41262: [Java][FlightSQL] Implement stateless prepared statements URL: https://github.com/apache/arrow/pull/41237

Re: [PR] Refactor `UnwrapCastInComparison` to remove `Expr` clones [arrow-datafusion]

2024-04-17 Thread via GitHub
comphead commented on code in PR #10115: URL: https://github.com/apache/arrow-datafusion/pull/10115#discussion_r1569283141 ## datafusion/optimizer/src/unwrap_cast_in_comparison.rs: ## @@ -140,140 +139,132 @@ struct UnwrapCastExprRewriter { impl TreeNodeRewriter for

[PR] fix: another non-deterministic test in `joins.slt` [arrow-datafusion]

2024-04-17 Thread via GitHub
korowa opened a new pull request, #10122: URL: https://github.com/apache/arrow-datafusion/pull/10122 ## Which issue does this PR close? Part of #10119. ## Rationale for this change Follow up on #10120 -- there is another test like the fixed one

Re: [PR] feat: Add manual test to calculate spark builtin functions coverage [arrow-datafusion-comet]

2024-04-17 Thread via GitHub
comphead commented on code in PR #263: URL: https://github.com/apache/arrow-datafusion-comet/pull/263#discussion_r1569269530 ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundation
