viirya commented on code in PR #1561:
URL: https://github.com/apache/datafusion-comet/pull/1561#discussion_r2006927110
##
common/src/main/scala/org/apache/comet/CometConf.scala:
##
@@ -236,17 +236,18 @@ object CometConf extends ShimCometConf {
val COMET_MEMORY_OVERHEAD: Optio
viirya commented on code in PR #1561:
URL: https://github.com/apache/datafusion-comet/pull/1561#discussion_r2006927319
##
common/src/main/scala/org/apache/comet/CometConf.scala:
##
@@ -255,8 +256,7 @@ object CometConf extends ShimCometConf {
val COMET_MEMORY_OVERHEAD_MIN_MI
wForget commented on code in PR #1555:
URL: https://github.com/apache/datafusion-comet/pull/1555#discussion_r2006926052
##
native/core/src/parquet/mod.rs:
##
@@ -641,6 +640,8 @@ pub unsafe extern "system" fn
Java_org_apache_comet_parquet_Native_initRecordBat
session_timezo
iffyio commented on code in PR #1772:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1772#discussion_r2006914091
##
src/parser/mod.rs:
##
@@ -11145,17 +11145,16 @@ impl<'a> Parser<'a> {
}
/// Parse a `SET ROLE` statement. Expects SET to be consumed alre
alamb commented on PR #15313:
URL: https://github.com/apache/datafusion/pull/15313#issuecomment-2741938887
> Just FYI, I think Github will close the issue because it has "closes #xxx" automation
Good call -- I have updated the description to say "related to" rather than
closes
2010YOUY01 commented on PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#issuecomment-2742297724
> I ran this against Q23, results look promising! Elapsed 3.173 seconds with
`datafusion.optimizer.enable_dynamic_filter_pushdown = true` vs. 4.696 with
`false`. Both with predica
2010YOUY01 commented on code in PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#discussion_r2006853236
##
datafusion/physical-plan/src/sorts/sort.rs:
##
@@ -1067,35 +1067,53 @@ impl ExecutionPlan for SortExec {
) -> Result {
trace!("Start SortExec::
adriangb commented on issue #15343:
URL: https://github.com/apache/datafusion/issues/15343#issuecomment-2742171861
Plus one for making it the default :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
xudong963 commented on issue #15343:
URL: https://github.com/apache/datafusion/issues/15343#issuecomment-2742191169
+1
0xlearner opened a new issue, #15342:
URL: https://github.com/apache/datafusion/issues/15342
### Describe the bug
### Description
When using `rquest` with `datafusion`, a dependency conflict occurs because
both crates depend on libraries that link to the native `lzma` library.
Jiashu-Hu commented on issue #15267:
URL: https://github.com/apache/datafusion/issues/15267#issuecomment-2742161658
take
2010YOUY01 commented on code in PR #15302:
URL: https://github.com/apache/datafusion/pull/15302#discussion_r2006777564
##
datafusion/physical-plan/src/sorts/sort.rs:
##
@@ -688,15 +707,29 @@ impl ExternalSorter {
let fetch = self.fetch;
let expressions: LexOr
wForget commented on code in PR #1556:
URL: https://github.com/apache/datafusion-comet/pull/1556#discussion_r2006738045
##
spark/src/test/scala/org/apache/spark/sql/benchmark/CometReadBenchmark.scala:
##
@@ -63,6 +65,7 @@ object CometReadBenchmark extends CometBenchmarkBase {
wForget commented on code in PR #1556:
URL: https://github.com/apache/datafusion-comet/pull/1556#discussion_r2006750847
##
pom.xml:
##
@@ -447,6 +448,13 @@ under the License.
5.1.0
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client-minicluster</artifactId>
Review C
kosiew commented on code in PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#discussion_r2006732123
##
datafusion/physical-plan/src/topk/mod.rs:
##
@@ -163,26 +187,32 @@ impl TopK {
// TODO make this algorithmically better?:
// Idea: filter out r
wForget commented on code in PR #1556:
URL: https://github.com/apache/datafusion-comet/pull/1556#discussion_r2006742948
##
pom.xml:
##
@@ -447,6 +448,13 @@ under the License.
5.1.0
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client-minicluster</artifactId>
Review C
jsai28 commented on issue #15276:
URL: https://github.com/apache/datafusion/issues/15276#issuecomment-2742070297
Regarding your first two points, I do think that `Vec` may be the way
to do this. Mainly as it supports handling the case of literal values out of
the box. If sqlparser is eventu
linhr commented on issue #15328:
URL: https://github.com/apache/datafusion/issues/15328#issuecomment-2742022196
Thanks @alamb! Everything looks good now!
adriangb commented on code in PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#discussion_r2006243095
##
datafusion/physical-plan/src/topk/mod.rs:
##
@@ -644,10 +737,72 @@ impl RecordBatchStore {
}
}
+struct TopKDynamicFilterSource {
+/// The TopK heap
alamb closed issue #14914: [EPIC] Complete `SQL EXPLAIN` Tree Rendering
URL: https://github.com/apache/datafusion/issues/14914
alamb commented on issue #15323:
URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2741967802
Do you see too many threads when writing the spill files or when reading?
alamb commented on issue #14914:
URL: https://github.com/apache/datafusion/issues/14914#issuecomment-2741966208
Thank you so much to @irenjj @zebsme @Standing-Man and others I think we are
basically done with this epic...
It is even documented!
https://datafusion.apache.org/use
alamb commented on issue #15343:
URL: https://github.com/apache/datafusion/issues/15343#issuecomment-2741964039
For example, here is the plan from a recent query from
https://github.com/apache/datafusion/issues/15177 (I actually had to trim it to
fit in the 65k limit):
```sql
alamb commented on PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#issuecomment-2741947591
> I ran this against Q23, results look promising! Elapsed 3.173 seconds with
`datafusion.optimizer.enable_dynamic_filter_pushdown = true` vs. 4.696 with
`false`. Both with predicate pu
alamb commented on code in PR #15322:
URL: https://github.com/apache/datafusion/pull/15322#discussion_r2006649688
##
datafusion/functions-aggregate/src/min_max.rs:
##
@@ -264,6 +265,7 @@ impl AggregateUDFImpl for Max {
| Binary
| LargeBinary
waynexia commented on code in PR #15299:
URL: https://github.com/apache/datafusion/pull/15299#discussion_r2006630060
##
datafusion/sqllogictest/test_files/union.slt:
##
@@ -910,8 +910,8 @@ SELECT * FROM (SELECT y FROM u1 UNION ALL SELECT y FROM u2)
ORDER BY y;
query I
SELECT
deanm commented on issue #754:
URL: https://github.com/apache/datafusion-python/issues/754#issuecomment-2741575371
Another potential friendly alias method is to use **kwargs in `select` and
`aggregate`. Here's a select implementation:
```python
def select(self, *exprs: Exp
tv42 commented on issue #241:
URL: https://github.com/apache/datafusion-sqlparser-rs/issues/241#issuecomment-2741880470
Almost 5 years without update. Can this be closed in light of
https://github.com/apache/datafusion-sqlparser-rs/blob/main/tests/sqlparser_custom_dialect.rs
?
waynexia commented on PR #15327:
URL: https://github.com/apache/datafusion/pull/15327#issuecomment-2741876016
>It might also be a good idea to include some documentation in the operators
themselves that DataFusion doesn't have default implementations
Added in
[5828cba](https://github
waynexia commented on PR #15327:
URL: https://github.com/apache/datafusion/pull/15327#issuecomment-2741867405
>I think there should be sql level tests (sqllogictests) that run these
operators
That's a good idea! I think I know them much better after writing some SQLs
(though none of t
comphead commented on PR #1550:
URL: https://github.com/apache/datafusion-comet/pull/1550#issuecomment-2741862047
> Great tests @comphead Do you think we need to add some cases with one more
level of nesting -
>
> ```
> array
> +- struct
> +- array
> ``
blaginin commented on code in PR #15313:
URL: https://github.com/apache/datafusion/pull/15313#discussion_r2006567915
##
datafusion/physical-plan/src/aggregates/mod.rs:
##
@@ -1776,17 +1790,17 @@ mod tests {
assert_eq!(batch.num_columns(), 2);
assert_eq!(batch.n
kevinjqliu commented on code in PR #60:
URL: https://github.com/apache/datafusion-site/pull/60#discussion_r2006136201
##
content/blog/2025-03-20-parquet-pruning.md:
##
@@ -0,0 +1,118 @@
+---
+layout: post
+title: Parquet Pruning in DataFusion: Read Only What Matters
+date: 2025-
milenkovicm commented on PR #1213:
URL: https://github.com/apache/datafusion-ballista/pull/1213#issuecomment-2741763693
Good catch, thanks @nj7
jinwenjie123 commented on issue #1552:
URL: https://github.com/apache/datafusion-comet/issues/1552#issuecomment-2737302767
> [@jinwenjie123](https://github.com/jinwenjie123) I would recommend
starting off by using a pre-built JAR which contains native binaries for
multiple architectures.
andygrove commented on code in PR #63:
URL: https://github.com/apache/datafusion-site/pull/63#discussion_r2006506263
##
content/blog/2025-03-20-datafusion-comet-0.7.0.md:
##
@@ -0,0 +1,130 @@
+---
+layout: post
+title: Apache DataFusion Comet 0.7.0 Release
+date: 2025-03-20
+aut
alamb closed issue #102: [Rust] [datafusion] Allow integration in non libc
environments
URL: https://github.com/apache/datafusion/issues/102
Omega359 opened a new pull request, #15341:
URL: https://github.com/apache/datafusion/pull/15341
## Which issue does this PR close?
- Closes #15340
## Rationale for this change
Better handle type coercion when unsigned numerics are involved
## What changes are included
andygrove commented on code in PR #63:
URL: https://github.com/apache/datafusion-site/pull/63#discussion_r2006442098
##
content/blog/2025-03-20-datafusion-comet-0.7.0.md:
##
@@ -0,0 +1,131 @@
+---
+layout: post
+title: Apache DataFusion Comet 0.7.0 Release
+date: 2025-03-20
+aut
andygrove commented on code in PR #63:
URL: https://github.com/apache/datafusion-site/pull/63#discussion_r2006440492
##
content/blog/2025-03-20-datafusion-comet-0.7.0.md:
##
@@ -0,0 +1,131 @@
+---
+layout: post
+title: Apache DataFusion Comet 0.7.0 Release
+date: 2025-03-20
+aut
kazuyukitanimura commented on code in PR #63:
URL: https://github.com/apache/datafusion-site/pull/63#discussion_r2006435819
##
content/blog/2025-03-20-datafusion-comet-0.7.0.md:
##
@@ -0,0 +1,131 @@
+---
+layout: post
+title: Apache DataFusion Comet 0.7.0 Release
+date: 2025-03-
Omega359 commented on issue #15340:
URL: https://github.com/apache/datafusion/issues/15340#issuecomment-2741609413
take
comphead commented on code in PR #63:
URL: https://github.com/apache/datafusion-site/pull/63#discussion_r2006367866
##
content/blog/2025-03-20-datafusion-comet-0.7.0.md:
##
@@ -0,0 +1,130 @@
+---
+layout: post
+title: Apache DataFusion Comet 0.7.0 Release
+date: 2025-03-20
+auth
tv42 commented on issue #1519:
URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1519#issuecomment-2741566749
This was fixed in sqlparser v0.55.0, likely
https://github.com/apache/datafusion-sqlparser-rs/pull/1669
deanm opened a new pull request, #1076:
URL: https://github.com/apache/datafusion-python/pull/1076
# Which issue does this PR close?
Closes #1075
# Rationale for this change
To improve ergonomics of the API by providing a quicker way of accessing
columns using the __ge
andygrove opened a new pull request, #63:
URL: https://github.com/apache/datafusion-site/pull/63
(no comment)
Dandandan commented on issue #15177:
URL: https://github.com/apache/datafusion/issues/15177#issuecomment-2741469672
I traced this down to an issue in the planner, which uses
`PartitionMode::Auto` iff stats are collected
(`datafusion.execution.collect_statistics`)
We can however still use
Dandandan opened a new pull request, #15339:
URL: https://github.com/apache/datafusion/pull/15339
## Which issue does this PR close?
- Closes #.
## Rationale for this change
## What changes are included in this PR?
## Are these changes teste
kazuyukitanimura commented on code in PR #1556:
URL: https://github.com/apache/datafusion-comet/pull/1556#discussion_r2006238491
##
spark/src/test/scala/org/apache/comet/WithHdfsCluster.scala:
##
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under on
andygrove commented on code in PR #1561:
URL: https://github.com/apache/datafusion-comet/pull/1561#discussion_r2006287257
##
spark/src/main/scala/org/apache/comet/CometExecIterator.scala:
##
@@ -63,9 +64,28 @@ class CometExecIterator(
}.toArray
private val plan = {
va
westhide commented on PR #15335:
URL: https://github.com/apache/datafusion/pull/15335#issuecomment-2740991253
> Thank you @westhide
>
> Should we remove the `batch_size` from JSON source too?
>
>
https://github.com/apache/datafusion/blob/dd9c3a815d7b4af2ef503ea557332ecc700af318
deanm opened a new issue, #1075:
URL: https://github.com/apache/datafusion-python/issues/1075
**Is your feature request related to a problem or challenge? Please describe
what you are trying to do.**
This would allow columns to be referred to as attr methods of col. For
example inste
Omega359 commented on PR #15337:
URL: https://github.com/apache/datafusion/pull/15337#issuecomment-2741362724
LGTM, thanks @christophermcdermott !
deanm commented on issue #1064:
URL: https://github.com/apache/datafusion-python/issues/1064#issuecomment-2741351917
@timsaucer I put in a [draft
PR](https://github.com/apache/datafusion-python/pull/1074) that does all the
one input arg functions.
Is your reluctance to putting *
shruti2522 commented on code in PR #15322:
URL: https://github.com/apache/datafusion/pull/15322#discussion_r2006224462
##
datafusion/functions-aggregate/src/min_max.rs:
##
@@ -264,6 +265,7 @@ impl AggregateUDFImpl for Max {
| Binary
| LargeBinar
Omega359 commented on code in PR #61:
URL: https://github.com/apache/datafusion-site/pull/61#discussion_r2006154315
##
content/blog/2025-03-21-parquet-pushdown.md:
##
@@ -0,0 +1,259 @@
+---
+layout: post
+title: Efficient Filter Pushdown in Parquet
+date: 2025-03-21
+author: Xia
arpity22 commented on issue #102:
URL: https://github.com/apache/datafusion/issues/102#issuecomment-2741307866
Since this issue was opened a while ago, has it been resolved but not
updated here?
XiangpengHao commented on PR #62:
URL: https://github.com/apache/datafusion-site/pull/62#issuecomment-2741287848
Thank you @kevinjqliu
comphead commented on code in PR #61:
URL: https://github.com/apache/datafusion-site/pull/61#discussion_r2006162703
##
content/blog/2025-03-21-parquet-pushdown.md:
##
@@ -0,0 +1,259 @@
+---
+layout: post
+title: Efficient Filter Pushdown in Parquet
+date: 2025-03-21
+author: Xia
christophermcdermott opened a new pull request, #15337:
URL: https://github.com/apache/datafusion/pull/15337
## Which issue does this PR close?
Closes #15336
## Rationale for this change
Support additional types in hive partitions.
## What changes a
kevinjqliu commented on PR #60:
URL: https://github.com/apache/datafusion-site/pull/60#issuecomment-2741250429
#62 should fix it
deanm opened a new pull request, #1074:
URL: https://github.com/apache/datafusion-python/pull/1074
# Which issue does this PR close?
Works towards closing #1064
# Rationale for this change
To improve ergonomics of the API by adding functions to the Expr class so
th
kevinjqliu commented on PR #62:
URL: https://github.com/apache/datafusion-site/pull/62#issuecomment-2741251030
cc @XiangpengHao @alamb
comphead commented on code in PR #61:
URL: https://github.com/apache/datafusion-site/pull/61#discussion_r2006168286
##
content/blog/2025-03-21-parquet-pushdown.md:
##
@@ -0,0 +1,259 @@
+---
+layout: post
+title: Efficient Filter Pushdown in Parquet
+date: 2025-03-21
+author: Xia
comphead commented on PR #61:
URL: https://github.com/apache/datafusion-site/pull/61#issuecomment-2741241689
on content/images/parquet-pushdown/baseline-impl.jpg the flow comes from 3
to 5, I assume it is expected, perhaps it's needed to make a separate comment?
christophermcdermott opened a new issue, #15336:
URL: https://github.com/apache/datafusion/issues/15336
### Is your feature request related to a problem or challenge?
I hit this error:
DataFusion error: This feature is not implemented: it is not yet supported
to write to hive part
milenkovicm commented on PR #1212:
URL: https://github.com/apache/datafusion-ballista/pull/1212#issuecomment-2741214382
apparently you found another bug:
https://github.com/apache/datafusion-ballista/blob/bb10a1bebd52ebb91515efa7a2a977df740c2d7a/ballista/scheduler/src/scheduler_serv
adriangb commented on PR #15261:
URL: https://github.com/apache/datafusion/pull/15261#issuecomment-2741174091
Marking as ready for review. The main TODO is an API for transmitting
statistics information for generated columns before they get generated, but
that can even be a followup PR.
kevinjqliu commented on PR #60:
URL: https://github.com/apache/datafusion-site/pull/60#issuecomment-2741193664
> The diagram below illustrates the [Parquet reading
pipeline](https://docs.rs/datafusion/46.0.0/datafusion/datasource/physical_plan/parquet/source/struct.ParquetSource.html%60%60%6
adriangb commented on issue #7955:
URL: https://github.com/apache/datafusion/issues/7955#issuecomment-2741188852
I have a PR up for doing something similar for TopK sorts (`ORDER BY col
LIMIT 10`) in https://github.com/apache/datafusion/pull/15301. I think we
should be able to re-use that w
comphead commented on code in PR #15329:
URL: https://github.com/apache/datafusion/pull/15329#discussion_r2006103754
##
docs/source/library-user-guide/working-with-exprs.md:
##
@@ -50,6 +50,29 @@ As another example, the SQL expression `a + b * c` would be
represented as an `E
logan-keede commented on PR #15316:
URL: https://github.com/apache/datafusion/pull/15316#issuecomment-2741075889
I thought it mattered because `datasource` has a dependency on `catalog`
but on a second look it is only `Session`. Any plans on pulling `Session` out?
also corresponding `
alamb commented on PR #60:
URL: https://github.com/apache/datafusion-site/pull/60#issuecomment-2741065799
And it is live:
https://datafusion.apache.org/blog/2025/03/20/parquet-pruning/
andygrove commented on code in PR #1561:
URL: https://github.com/apache/datafusion-comet/pull/1561#discussion_r2005998677
##
spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala:
##
@@ -1334,26 +1334,46 @@ object CometSparkSessionExtensions extends Logging {
alamb commented on PR #15313:
URL: https://github.com/apache/datafusion/pull/15313#issuecomment-2740902341
FYI @blaginin
logan-keede commented on PR #15316:
URL: https://github.com/apache/datafusion/pull/15316#issuecomment-2740991793
where does Memtable belong datasource or catalog? it is TableProvider
implementation so I thought it was going to be in catalog, but I'm not so sure
anymore as it has dependency
alamb commented on PR #15316:
URL: https://github.com/apache/datafusion/pull/15316#issuecomment-2741010424
> where does Memtable belong datasource or catalog? it is TableProvider
implementation so I thought it was going to be in catalog, but I'm not so sure
anymore as it has dependency on d
alamb commented on issue #15177:
URL: https://github.com/apache/datafusion/issues/15177#issuecomment-2740980237
> Thanks for checking [@alamb](https://github.com/alamb) !
>
> I think a large portion is spent in the hash join (repartitioning the
right side input) - I think because it r
westhide commented on code in PR #15335:
URL: https://github.com/apache/datafusion/pull/15335#discussion_r2005989730
##
datafusion/proto/proto/datafusion.proto:
##
@@ -997,6 +997,7 @@ message FileScanExecConf {
reserved 10;
datafusion_common.Constraints constraints = 11;
alamb commented on code in PR #15316:
URL: https://github.com/apache/datafusion/pull/15316#discussion_r2005982815
##
datafusion/physical-expr/src/physical_expr.rs:
##
@@ -146,6 +148,38 @@ pub fn create_ordering(
Ok(all_sort_orders)
}
+/// Create a physical sort expressio
alamb commented on code in PR #15327:
URL: https://github.com/apache/datafusion/pull/15327#discussion_r2005971323
##
datafusion/physical-expr/src/expressions/binary.rs:
##
@@ -793,8 +793,10 @@ impl BinaryExpr {
BitwiseShiftRight => bitwise_shift_right_dyn(left, righ
alamb commented on issue #15177:
URL: https://github.com/apache/datafusion/issues/15177#issuecomment-2740900315
I am not really sure where the time is going 🤔
output of explain analyze:
[explain.txt](https://github.com/user-attachments/files/19370532/explain.txt)
alamb commented on issue #15177:
URL: https://github.com/apache/datafusion/issues/15177#issuecomment-2740888007
I tried the rewrite into a Semi join and indeed it is over 2x slower (5.3sec
vs 12sec)
```sql
> SELECT * from 'hits_partitioned' WHERE "URL" LIKE '%google%' ORDER BY
"Ev
alamb closed issue #1209: Unsupported NdJsonExec plan and extension codec
URL: https://github.com/apache/datafusion-ballista/issues/1209
Dandandan commented on issue #15177:
URL: https://github.com/apache/datafusion/issues/15177#issuecomment-2740936826
Thanks for checking @alamb !
I think a large portion is spent in the hash join (repartitioning the right
input) - I think because it runs as `Partitioned` hash join, instea
alamb merged PR #15311:
URL: https://github.com/apache/datafusion/pull/15311
alamb commented on issue #15037:
URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2740932019
Thanks @adriangb -- I will try and review it asap (hopefully tomorrow
afternoon or tomorrow)
alamb commented on code in PR #15335:
URL: https://github.com/apache/datafusion/pull/15335#discussion_r2005949069
##
datafusion/proto/proto/datafusion.proto:
##
@@ -997,6 +997,7 @@ message FileScanExecConf {
reserved 10;
datafusion_common.Constraints constraints = 11;
+
goldmedal opened a new pull request, #15334:
URL: https://github.com/apache/datafusion/pull/15334
## Which issue does this PR close?
- Closes #13486
## Rationale for this change
When working on unparsing the plan optimized by `ScalarSubqueryToJoin`, I
notice
alamb commented on PR #15318:
URL: https://github.com/apache/datafusion/pull/15318#issuecomment-2740909824
Thanks @adriangb and @ozankabak
jonahgao commented on code in PR #15183:
URL: https://github.com/apache/datafusion/pull/15183#discussion_r2005949005
##
datafusion/sql/src/planner.rs:
##
@@ -560,11 +558,11 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
SQLDataType::SmallInt(_) | SQLDataType::Int
vadimpiven commented on issue #15328:
URL: https://github.com/apache/datafusion/issues/15328#issuecomment-2740896190
Hi! I can report that without `datafusion` crate release the issue
https://github.com/apache/datafusion/issues/15122 still reproduces and still
requires hotfix
```
[dep
Shreyaskr1409 commented on code in PR #15313:
URL: https://github.com/apache/datafusion/pull/15313#discussion_r2005883069
##
datafusion/physical-plan/Cargo.toml:
##
@@ -58,6 +58,7 @@ futures = { workspace = true }
half = { workspace = true }
hashbrown = { workspace = true }
i
andygrove closed pull request #1561: [WIP] chore: Fix some inconsistencies in
memory pool configuration
URL: https://github.com/apache/datafusion-comet/pull/1561
westhide opened a new pull request, #15335:
URL: https://github.com/apache/datafusion/pull/15335
## Which issue does this PR close?
- Closes None.
- Reference [Support serde for
batch_size](https://github.com/apache/datafusion/pull/15311#discussion_r2004114426)
## Ra
Dandandan commented on code in PR #15302:
URL: https://github.com/apache/datafusion/pull/15302#discussion_r2005752259
##
datafusion/physical-plan/src/sorts/sort.rs:
##
@@ -688,15 +707,29 @@ impl ExternalSorter {
let fetch = self.fetch;
let expressions: LexOrd