Re: [I] Making Comet Common Module Engine Independent [datafusion-comet]

2024-04-29 Thread via GitHub
advancedxy commented on issue #329: URL: https://github.com/apache/datafusion-comet/issues/329#issuecomment-208203 I'm +1 for this direction in the long term and I can help review the Iceberg integration if needed. In the short term, I think Iceberg could integrate Comet in its

Re: [PR] Validate ScalarUDF output rows and support nulls for `array_has` and `get_field` for `Map` [datafusion]

2024-04-29 Thread via GitHub
duongcongtoai commented on code in PR #10148: URL: https://github.com/apache/datafusion/pull/10148#discussion_r1582634557 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -146,11 +146,22 @@ impl PhysicalExpr for ScalarFunctionExpr { // evaluate the function

Re: [PR] Better Cast name for display [datafusion]

2024-04-29 Thread via GitHub
jayzhan211 commented on PR #10276: URL: https://github.com/apache/datafusion/pull/10276#issuecomment-2082078281 I think we should change the `display_name` not `create_name`, as the comment said, name for casting should preserve -- This is an automated message from the Apache Git Servic

[PR] chore(deps): update substrait requirement from 0.31.0 to 0.32.0 [datafusion]

2024-04-29 Thread via GitHub
dependabot[bot] opened a new pull request, #10279: URL: https://github.com/apache/datafusion/pull/10279 Updates the requirements on [substrait](https://github.com/substrait-io/substrait-rs) to permit the latest version. Release notes Sourced from https://github.com/substrait-io/su

[I] Make `alias_symbol` more human-readable [datafusion]

2024-04-29 Thread via GitHub
JasonLi-cn opened a new issue, #10280: URL: https://github.com/apache/datafusion/issues/10280 ### Is your feature request related to a problem or challenge? ```shell DataFusion CLI v37.1.0 > select * from number; ++++ | c0 | c1 | c2 | ++++ | 1

Re: [I] Move `create_physical_expr` to `physical-expr-common` [datafusion]

2024-04-29 Thread via GitHub
jayzhan211 commented on issue #10074: URL: https://github.com/apache/datafusion/issues/10074#issuecomment-2082240499 How about we introduce physical aggregate function trait, `AggregateUDFPhysicalImpl` I expect that `AggregateUDFPhysicalImpl` handle the physical-expr, for example, `accum

Re: [I] Move `create_physical_expr` to `physical-expr-common` [datafusion]

2024-04-29 Thread via GitHub
jayzhan211 commented on issue #10074: URL: https://github.com/apache/datafusion/issues/10074#issuecomment-2082270314 > Hmm this is a tricky refactor -- it is like a ball knot in a piece of string -- we just need to keep tugging at it and at some point it will unravel. Moving Aggregate

Re: [PR] feat: add a config param to avoid grouping partitions [datafusion]

2024-04-29 Thread via GitHub
mustafasrepo commented on PR #10259: URL: https://github.com/apache/datafusion/pull/10259#issuecomment-2082318558 I think having dedicated config setting is more verbose and clear (as in `prefer_existing_union`). If we were to use `prefer_existing_sort` that might also work. However, if th

Re: [PR] feat: Improve CometBroadcastHashJoin statistics [datafusion-comet]

2024-04-29 Thread via GitHub
planga82 commented on PR #339: URL: https://github.com/apache/datafusion-comet/pull/339#issuecomment-2082356399 It seems that there are problems in tests with Spark 3.3 and Spark 3.2. I'm checking it out.

[I] May 2024 ASF Board Report [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10281: URL: https://github.com/apache/datafusion/issues/10281 ### Is your feature request related to a problem or challenge? Per https://www.apache.org/foundation/board/reporting, for the first three months of a project it should submit monthly board reports

[I] July 2024 ASF Board Report [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10282: URL: https://github.com/apache/datafusion/issues/10282 ### Is your feature request related to a problem or challenge? Per https://www.apache.org/foundation/board/reporting, for the first three months of a project it should submit monthly board reports

[I] DataFusion weekly project plan (Andrew Lamb) - April 29, 2024 [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10283: URL: https://github.com/apache/datafusion/issues/10283 Follow on to https://github.com/apache/datafusion/issues/10172 **It would be great if other contributors to DataFusion who plan non-trivial work could try to make it visible somehow as well** 🙏

Re: [I] DataFusion weekly project plan (Andrew Lamb) - April 22, 2024 [datafusion]

2024-04-29 Thread via GitHub
alamb closed issue #10172: DataFusion weekly project plan (Andrew Lamb) - April 22, 2024 URL: https://github.com/apache/datafusion/issues/10172

Re: [I] DataFusion weekly project plan (Andrew Lamb) - April 22, 2024 [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #10172: URL: https://github.com/apache/datafusion/issues/10172#issuecomment-2082463655 Next week: https://github.com/apache/datafusion/issues/10283

Re: [I] DataFusion weekly project plan (Andrew Lamb) - April 29, 2024 [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #10283: URL: https://github.com/apache/datafusion/issues/10283#issuecomment-2082464041 Review Queue - [ ] https://github.com/apache/datafusion/pull/10221 - [ ] https://github.com/apache/datafusion/pull/9593 - [ ] https://github.com/apache/datafusion/pull/1014

Re: [I] Blog Post about graduating to a new top level project [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #10135: URL: https://github.com/apache/datafusion/issues/10135#issuecomment-2082467760 Now that we have completed most of the work on https://github.com/apache/datafusion/issues/9691 I hope to draft this post sometime this week

Re: [PR] Add mailing list descriptions to documentation [datafusion]

2024-04-29 Thread via GitHub
alamb commented on code in PR #10284: URL: https://github.com/apache/datafusion/pull/10284#discussion_r1582954905 ## docs/source/contributor-guide/communication.md: ## @@ -37,18 +37,35 @@ We use the Slack and Discord platforms for informal discussions and coordination meet oth

[PR] Add mailing list descriptions to documentation [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new pull request, #10284: URL: https://github.com/apache/datafusion/pull/10284 ## Which issue does this PR close? Part of https://github.com/apache/datafusion/issues/9691 ## Rationale for this change We have new mailing lists for datafusion, so let's document w

[PR] Parquet exec visitor [datafusion]

2024-04-29 Thread via GitHub
matthewmturner opened a new pull request, #10286: URL: https://github.com/apache/datafusion/pull/10286 ## Which issue does this PR close? Closes #10012 ## Rationale for this change ## What changes are included in this PR? ## Are these chang

[I] Stop copying LogicalPlan and Exprs in `EliminateCrossJoin` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10287: URL: https://github.com/apache/datafusion/issues/10287 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

Re: [I] Stop copying LogicalPlan and Exprs in `EliminateCrossJoin` [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #10287: URL: https://github.com/apache/datafusion/issues/10287#issuecomment-2082650814 I believe @Lordworms is working on this -- https://github.com/apache/datafusion/issues/9637#issuecomment-2075311002

Re: [I] idea: add another `static_name()` method to `ExecutionPlan` trait [datafusion]

2024-04-29 Thread via GitHub
waynexia closed issue #10246: idea: add another `static_name()` method to `ExecutionPlan` trait URL: https://github.com/apache/datafusion/issues/10246

Re: [PR] feat: add static_name() to ExecutionPlan [datafusion]

2024-04-29 Thread via GitHub
waynexia merged PR #10266: URL: https://github.com/apache/datafusion/pull/10266

[I] Stop copying LogicalPlan and Exprs in `EliminateFilter` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10288: URL: https://github.com/apache/datafusion/issues/10288 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

Re: [PR] feat: add static_name() to ExecutionPlan [datafusion]

2024-04-29 Thread via GitHub
matthewmturner commented on PR #10266: URL: https://github.com/apache/datafusion/pull/10266#issuecomment-2082655628 This is nice @waynexia

Re: [PR] Parquet exec visitor [datafusion]

2024-04-29 Thread via GitHub
matthewmturner commented on PR #10286: URL: https://github.com/apache/datafusion/pull/10286#issuecomment-2082652618 @alamb FYI

[I] Stop copying LogicalPlan and Exprs in `PropagateEmptyRelation` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10290: URL: https://github.com/apache/datafusion/issues/10290 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

[I] Stop copying LogicalPlan and Exprs in `DecorrelatePredicateSubquery ` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10289: URL: https://github.com/apache/datafusion/issues/10289 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

[I] Stop copying LogicalPlan and Exprs in `PushDownFilter` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10291: URL: https://github.com/apache/datafusion/issues/10291 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

[I] Stop copying LogicalPlan and Exprs in `PushDownLimit` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10292: URL: https://github.com/apache/datafusion/issues/10292 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

[I] Stop copying LogicalPlan and Exprs in `ReplaceDistinctWithAggregate` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10293: URL: https://github.com/apache/datafusion/issues/10293 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

[I] Stop copying LogicalPlan and Exprs in `SingleDistinctToGroupBy` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10295: URL: https://github.com/apache/datafusion/issues/10295 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

[I] Stop copying LogicalPlan and Exprs in `ScalarSubqueryToJoin` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10294: URL: https://github.com/apache/datafusion/issues/10294 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

Re: [I] XxHash64 hash function support [datafusion-comet]

2024-04-29 Thread via GitHub
advancedxy commented on issue #344: URL: https://github.com/apache/datafusion-comet/issues/344#issuecomment-2082664013 FYI, I'm working on this.

[I] Stop copying LogicalPlan and Exprs in `EliminateNestedUnion` [datafusion]

2024-04-29 Thread via GitHub
alamb opened a new issue, #10296: URL: https://github.com/apache/datafusion/issues/10296 ### Is your feature request related to a problem or challenge? Part of https://github.com/apache/datafusion/issues/9637 As part of making the planner faster, we are updating the optimizer passes

[PR] docs: Add a plugin overview page to the contributors guide [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove opened a new pull request, #345: URL: https://github.com/apache/datafusion-comet/pull/345 ## Which issue does this PR close? N/A ## Rationale for this change Make it easier for new contributors to understand how Comet works and where to start lo

Re: [I] Implement Spark-compatible CAST from String to Timestamp [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove closed issue #328: Implement Spark-compatible CAST from String to Timestamp URL: https://github.com/apache/datafusion-comet/issues/328

Re: [PR] feat: Disable cast string to timestamp by default [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove merged PR #337: URL: https://github.com/apache/datafusion-comet/pull/337

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove commented on PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#issuecomment-2082781370 @viirya @sunchao @parthchandra @comphead I did quite a bit of refactoring and performance tuning over the weekend. Please take another look when you can.

Re: [I] Finalize SIGMOD 2024 paper ~(if accepted)~ [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #8373: URL: https://github.com/apache/datafusion/issues/8373#issuecomment-2082793044 Needed to tweak the title to have a `A` rather than `a` ``` -- Forwarded message - From: Tim Pollitt Date: Mon, Apr 29, 2024 at 8:47 AM Subject: AC

Re: [I] [EPIC] Tasks for a new Top Level Apache Project [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #9691: URL: https://github.com/apache/datafusion/issues/9691#issuecomment-2082798866 I filed a few more doc tweaks https://github.com/apache/datafusion/pull/10284 and https://github.com/apache/datafusion/pull/10285 I think all that is left for this epic is to

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-04-29 Thread via GitHub
jayzhan211 commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1583125654 ## datafusion/expr/src/signature.rs: ## @@ -92,14 +92,22 @@ pub enum TypeSignature { /// A function such as `concat` is `Variadic(vec![DataType::Utf8, Dat

Re: [PR] feat: support `grouping` aggregate function [datafusion]

2024-04-29 Thread via GitHub
waynexia commented on code in PR #10208: URL: https://github.com/apache/datafusion/pull/10208#discussion_r1583128802 ## datafusion/physical-expr/src/aggregate/grouping.rs: ## @@ -96,8 +113,172 @@ impl PartialEq for Grouping { self.name == x.name

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-04-29 Thread via GitHub
jayzhan211 commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1583129464 ## datafusion/expr/src/type_coercion/binary.rs: ## @@ -289,15 +290,164 @@ fn bitwise_coercion(left_type: &DataType, right_type: &DataType) -> Option TypeCatego

[I] Build conda nightlies jobs are failing on main for aarch64 [datafusion-python]

2024-04-29 Thread via GitHub
raulcd opened a new issue, #659: URL: https://github.com/apache/datafusion-python/issues/659 **Describe the bug** The aarch 64 jobs for conda nightlies are failing with: ``` Conda detected a mismatch between the expected content and downloaded content ``` See: - https://git

Re: [I] [EPIC] Tasks for a new Top Level Apache Project [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #9691: URL: https://github.com/apache/datafusion/issues/9691#issuecomment-2082831457 Actually, we also owe the ASF board a report each month for the first 3 months. I'll begin coordinating the first one shortly (tracked via https://github.com/apache/datafusion/issue

[I] There are 31 open PRs for dependabot [datafusion-python]

2024-04-29 Thread via GitHub
raulcd opened a new issue, #660: URL: https://github.com/apache/datafusion-python/issues/660 **Describe the bug** There are currently 31 open PRs for dependabot: https://github.com/apache/datafusion-python/pulls/app%2Fdependabot We should either merge them, close them or configure de

[PR] Clean-up: Remove AggregateExec::group_by() [datafusion]

2024-04-29 Thread via GitHub
berkaysynnada opened a new pull request, #10297: URL: https://github.com/apache/datafusion/pull/10297 ## Which issue does this PR close? Closes #. ## Rationale for this change `AggregateExec` already has ``` pub fn group_expr(&self) -> &PhysicalGrou

Re: [PR] Clean-up: Remove AggregateExec::group_by() [datafusion]

2024-04-29 Thread via GitHub
berkaysynnada closed pull request #10297: Clean-up: Remove AggregateExec::group_by() URL: https://github.com/apache/datafusion/pull/10297

[PR] Zero-copy conversion from SchemaRef to DfSchema [datafusion]

2024-04-29 Thread via GitHub
tustvold opened a new pull request, #10298: URL: https://github.com/apache/datafusion/pull/10298 Following https://github.com/apache/datafusion/pull/9595 we can avoid needing to potentially clone the underlying arrow schema

Re: [PR] feat: Implement Spark-compatible CAST from string to timestamp types [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove commented on PR #335: URL: https://github.com/apache/datafusion-comet/pull/335#issuecomment-2082886164 > Update - > > Working on this bug, there is 8 hours difference in comet vs spark cast. If anyone has any pointer, please let me know. > > https://private-user-image

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove commented on PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#issuecomment-2082911871 > Thanks for your effort @andygrove, the new code is well crafted. Thank you for the thorough review @advancedxy!

[I] Enable bloom filters by default on read [datafusion]

2024-04-29 Thread via GitHub
hiltontj opened a new issue, #10299: URL: https://github.com/apache/datafusion/issues/10299 ### Is your feature request related to a problem or challenge? When reading from parquet files, bloom filters are **_not_** enabled by default. It is not immediately obvious that they are not b

[PR] feat(CLI): print column headers for empty query results [datafusion]

2024-04-29 Thread via GitHub
jonahgao opened a new pull request, #10300: URL: https://github.com/apache/datafusion/pull/10300 ## Which issue does this PR close? Closes #8904. ## Rationale for this change When the query result does not contain rows, use the newly added `schema` parameter to creat

Re: [PR] feat(7181): provide slicing of CursorValues [datafusion]

2024-04-29 Thread via GitHub
wiedld commented on PR #7912: URL: https://github.com/apache/datafusion/pull/7912#issuecomment-2083006339 I think we should close this @alamb , since it's not a priority at the moment. And whenever we circle back, there will be a very large diff (due to changes). Ok to close?

Re: [PR] Zero-copy conversion from SchemaRef to DfSchema [datafusion]

2024-04-29 Thread via GitHub
tustvold merged PR #10298: URL: https://github.com/apache/datafusion/pull/10298

Re: [I] Enable bloom filters by default on read [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #10299: URL: https://github.com/apache/datafusion/issues/10299#issuecomment-2083019014 Thank you @hiltontj -- this is a great description. I agree it seems like a good idea to enable using bloom filters on read when they are present in the file. The test coverage i

Re: [I] Enable bloom filters by default on read [datafusion]

2024-04-29 Thread via GitHub
alamb commented on issue #10299: URL: https://github.com/apache/datafusion/issues/10299#issuecomment-2083021698 (BTW this is a wonderfully clear ticket in my mind -- thank you @hiltontj )

Re: [I] Enable bloom filters by default on read [datafusion]

2024-04-29 Thread via GitHub
hengfeiyang commented on issue #10299: URL: https://github.com/apache/datafusion/issues/10299#issuecomment-2083032645 @alamb As an experimental feature, we disabled it by default, so it has been disabled by default since we implemented it.

Re: [PR] feat(CLI): print column headers for empty query results [datafusion]

2024-04-29 Thread via GitHub
comphead commented on code in PR #10300: URL: https://github.com/apache/datafusion/pull/10300#discussion_r1583311427 ## datafusion-cli/src/command.rs: ## @@ -58,18 +58,18 @@ impl Command { ctx: &mut SessionContext, print_options: &mut PrintOptions, ) -> Re

Re: [PR] feat: add a config param to avoid grouping partitions [datafusion]

2024-04-29 Thread via GitHub
NGA-TRAN commented on code in PR #10259: URL: https://github.com/apache/datafusion/pull/10259#discussion_r1583311537 ## datafusion/core/src/physical_optimizer/enforce_distribution.rs: ## @@ -1192,7 +1192,11 @@ fn ensure_distribution( .collect::>>()?; let children_pla

Re: [PR] feat: add a config param to avoid grouping partitions [datafusion]

2024-04-29 Thread via GitHub
alamb commented on PR #10259: URL: https://github.com/apache/datafusion/pull/10259#issuecomment-2083079688 > Hence, I think it is better to proceed with current approach in this PR. (as in prefer_existing_union) Sounds good to me. > In the future, if we add support for OrderP

Re: [PR] feat: Determine ordering of file groups [datafusion]

2024-04-29 Thread via GitHub
alamb commented on PR #9593: URL: https://github.com/apache/datafusion/pull/9593#issuecomment-2083081732 @NGA-TRAN do you have time to review this PR as well?

Re: [PR] chore: Update Error for Unnest Rewritter [datafusion]

2024-04-29 Thread via GitHub
comphead merged PR #10263: URL: https://github.com/apache/datafusion/pull/10263

Re: [PR] feat: Determine ordering of file groups [datafusion]

2024-04-29 Thread via GitHub
alamb commented on PR #9593: URL: https://github.com/apache/datafusion/pull/9593#issuecomment-2083102517 No worries @suremarc -- I am very excited about this PR. I plan to review it sometime this week (hopefully later today)

Re: [PR] Coverage: Add a manual test to show what Spark built in expression the DF can support directly [datafusion-comet]

2024-04-29 Thread via GitHub
comphead commented on PR #331: URL: https://github.com/apache/datafusion-comet/pull/331#issuecomment-2083105249 > Hi @comphead is this ready for review? Thanks @advancedxy I think so.

Re: [PR] Signature::Equal for NVL and NullIf [datafusion]

2024-04-29 Thread via GitHub
yyy1000 closed pull request #10272: Signature::Equal for NVL and NullIf URL: https://github.com/apache/datafusion/pull/10272

Re: [PR] fix: CometShuffleExchangeExec logical link should be correct [datafusion-comet]

2024-04-29 Thread via GitHub
viirya commented on PR #324: URL: https://github.com/apache/datafusion-comet/pull/324#issuecomment-2083122048 cc @sunchao @andygrove

Re: [PR] feat: Determine ordering of file groups [datafusion]

2024-04-29 Thread via GitHub
NGA-TRAN commented on PR #9593: URL: https://github.com/apache/datafusion/pull/9593#issuecomment-2083150440 I will review this either today or tomorrow

Re: [PR] feat: add a config param to avoid grouping partitions [datafusion]

2024-04-29 Thread via GitHub
NGA-TRAN commented on code in PR #10259: URL: https://github.com/apache/datafusion/pull/10259#discussion_r1583365106 ## datafusion/core/src/physical_optimizer/enforce_distribution.rs: ## @@ -1723,21 +1727,30 @@ pub(crate) mod tests { /// * `REPARTITION_FILE_MIN_SIZE` (optio

Re: [I] Finalize SIGMOD 2024 paper ~(if accepted)~ [datafusion]

2024-04-29 Thread via GitHub
viirya commented on issue #8373: URL: https://github.com/apache/datafusion/issues/8373#issuecomment-2083163913 > Needed to tweak the title to have a `A` rather than `a` I remember we have discussed this `A`/`a` issue before in emails. We got the answer now. 😄 Thanks for dealin

Re: [PR] docs: Add a plugin overview page to the contributors guide [datafusion-comet]

2024-04-29 Thread via GitHub
viirya commented on code in PR #345: URL: https://github.com/apache/datafusion-comet/pull/345#discussion_r1583377840 ## docs/source/contributor-guide/plugin_overview.md: ## @@ -0,0 +1,50 @@ + + +# Comet Plugin Overview + +The entry point to Comet is the `org.apache.comet.CometSp

Re: [PR] docs: add download page [datafusion]

2024-04-29 Thread via GitHub
comphead commented on code in PR #10271: URL: https://github.com/apache/datafusion/pull/10271#discussion_r1583378356 ## docs/source/download.md: ## @@ -0,0 +1,66 @@ + + +# Download + +The official Apache DataFusion releases are provided as source artifacts. + +## Releases + +The

[PR] Issue 326/cast float string [datafusion-comet]

2024-04-29 Thread via GitHub
mattharder91 opened a new pull request, #346: URL: https://github.com/apache/datafusion-comet/pull/346 ## Which issue does this PR close? Closes #. https://github.com/apache/datafusion-comet/issues/326 ## Rationale for this change Improve compatibility with Apache

[PR] docs: Add more content to the user guide [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove opened a new pull request, #347: URL: https://github.com/apache/datafusion-comet/pull/347 ## Which issue does this PR close? Part of https://github.com/apache/datafusion-comet/issues/230 ## Rationale for this change Provide more documentation in

Re: [PR] docs: Add more content to the user guide [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove commented on code in PR #347: URL: https://github.com/apache/datafusion-comet/pull/347#discussion_r1583382116 ## README.md: ## @@ -69,82 +69,4 @@ Linux, Apple OSX (Intel and M1) ## Getting started -Make sure the requirements above are met and software installed on

Re: [I] Enable bloom filters by default on read [datafusion]

2024-04-29 Thread via GitHub
progval commented on issue #10299: URL: https://github.com/apache/datafusion/issues/10299#issuecomment-2083190189 I don't see any reason not to enable them. Ditto with `datafusion.execution.parquet.pushdown_filters`.

Re: [PR] feat: Implement Spark-compatible CAST between integer types [datafusion-comet]

2024-04-29 Thread via GitHub
ganeshkumar269 commented on PR #340: URL: https://github.com/apache/datafusion-comet/pull/340#issuecomment-2083227842 Hi @viirya @andygrove , firstly please let me know if this PR aligns with the expectations on how to fix the issue, if not kindly provide pointers on how I can move in the r

Re: [I] Implement Spark-compatible CAST float/double to string [datafusion-comet]

2024-04-29 Thread via GitHub
mattharder91 commented on issue #312: URL: https://github.com/apache/datafusion-comet/issues/312#issuecomment-2083225823 Here you go https://github.com/apache/datafusion-comet/pull/346

Re: [I] Making Comet Common Module Engine Independent [datafusion-comet]

2024-04-29 Thread via GitHub
parthchandra commented on issue #329: URL: https://github.com/apache/datafusion-comet/issues/329#issuecomment-2083237844 @advancedxy Good suggestions. I believe this Issue is to address point 3 above while 1 and 2 are in progress.

[PR] feat: Supports Stddev [datafusion-comet]

2024-04-29 Thread via GitHub
huaxingao opened a new pull request, #348: URL: https://github.com/apache/datafusion-comet/pull/348 ## Which issue does this PR close? Closes #. ## Rationale for this change Supports `STDDEV_SAMP` and `STDDEV_POP` The implementation mostly is the same as the Da

Re: [PR] feat: ic for native unhex [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove commented on code in PR #342: URL: https://github.com/apache/datafusion-comet/pull/342#discussion_r1583437769 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -1396,6 +1396,17 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {

Re: [PR] feat: Disable cast string to timestamp by default [datafusion-comet]

2024-04-29 Thread via GitHub
parthchandra commented on code in PR #337: URL: https://github.com/apache/datafusion-comet/pull/337#discussion_r1583442744 ## common/src/main/scala/org/apache/comet/CometConf.scala: ## @@ -365,6 +365,13 @@ object CometConf { .booleanConf .createWithDefault(false) +

Re: [I] Implement Spark-compatible CAST from String to Floating Point [datafusion-comet]

2024-04-29 Thread via GitHub
mattharder91 commented on issue #326: URL: https://github.com/apache/datafusion-comet/issues/326#issuecomment-2083268908 @viirya hi, I created a PR for float to string, not string to float

Re: [I] Implement Spark-compatible CAST from String to Floating Point [datafusion-comet]

2024-04-29 Thread via GitHub
viirya commented on issue #326: URL: https://github.com/apache/datafusion-comet/issues/326#issuecomment-2083272160 @mattharder91 Er? I think you pointed your PR to this issue i.e., #326, no?

Re: [I] Implement Spark-compatible CAST from String to Floating Point [datafusion-comet]

2024-04-29 Thread via GitHub
viirya commented on issue #326: URL: https://github.com/apache/datafusion-comet/issues/326#issuecomment-2083274786 Oh, you pointed to the wrong issue; I corrected it to #312 now.

Re: [PR] docs: Add more content to the user guide [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove merged PR #347: URL: https://github.com/apache/datafusion-comet/pull/347

Re: [I] Implement Spark-compatible CAST from String to Floating Point [datafusion-comet]

2024-04-29 Thread via GitHub
viirya commented on issue #326: URL: https://github.com/apache/datafusion-comet/issues/326#issuecomment-2083265466 FYI @psvri @mattharder91 just created a PR against this.

Re: [I] Implement Spark-compatible CAST from String to Floating Point [datafusion-comet]

2024-04-29 Thread via GitHub
mattharder91 commented on issue #326: URL: https://github.com/apache/datafusion-comet/issues/326#issuecomment-2083274584 @viirya oh sry that was my mistake I pin it to the other one.

Re: [I] Implement Spark-compatible CAST from String to Date [datafusion-comet]

2024-04-29 Thread via GitHub
vidyasankarv commented on issue #327: URL: https://github.com/apache/datafusion-comet/issues/327#issuecomment-2083284878 Hi @andygrove, thank you I am new to rust and open source work in general, but eager to learn and contribute. looking at https://github.com/apache/datafusion-comet/pull/

Re: [I] Implement Spark-compatible CAST from String to Floating Point [datafusion-comet]

2024-04-29 Thread via GitHub
mattharder91 commented on issue #326: URL: https://github.com/apache/datafusion-comet/issues/326#issuecomment-2083291681 thank you and sry again

[PR] docs: Generate configuration guide in mvn build [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove opened a new pull request, #349: URL: https://github.com/apache/datafusion-comet/pull/349 ## Which issue does this PR close? Closes https://github.com/apache/datafusion-comet/issues/315 ## Rationale for this change We should publish our configura

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-04-29 Thread via GitHub
erratic-pattern commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1583508138 ## datafusion/expr/src/type_coercion/binary.rs: ## @@ -289,15 +290,164 @@ fn bitwise_coercion(left_type: &DataType, right_type: &DataType) -> Option TypeC

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-04-29 Thread via GitHub
erratic-pattern commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1583511505 ## datafusion/expr/src/type_coercion/binary.rs: ## @@ -289,15 +290,164 @@ fn bitwise_coercion(left_type: &DataType, right_type: &DataType) -> Option TypeC

Re: [PR] Fix `Coalesce` casting logic to follows what Postgres and DuckDB do. Introduce signature that do non-comparison coercion [datafusion]

2024-04-29 Thread via GitHub
erratic-pattern commented on code in PR #10268: URL: https://github.com/apache/datafusion/pull/10268#discussion_r1583515399 ## datafusion/expr/src/type_coercion/binary.rs: ## @@ -289,15 +290,164 @@ fn bitwise_coercion(left_type: &DataType, right_type: &DataType) -> Option TypeC

Re: [PR] feat: Improve CometBroadcastHashJoin statistics [datafusion-comet]

2024-04-29 Thread via GitHub
planga82 commented on PR #339: URL: https://github.com/apache/datafusion-comet/pull/339#issuecomment-2083388537 Fix tested in my repository with github actions

Re: [PR] feat: add a config param to avoid grouping partitions [datafusion]

2024-04-29 Thread via GitHub
NGA-TRAN commented on PR #10259: URL: https://github.com/apache/datafusion/pull/10259#issuecomment-2083399303 @alamb @phillipleblanc and @mustafasrepo all the tests have passed

Re: [I] Add more developer documentation [datafusion-comet]

2024-04-29 Thread via GitHub
andygrove commented on issue #230: URL: https://github.com/apache/datafusion-comet/issues/230#issuecomment-2083421034 We should also explain how the shims work for different Spark versions
