[jira] [Updated] (SPARK-42776) invalid issue

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller updated SPARK-42776:
---
Affects Version/s: 2.4.8
   (was: 3.3.1)

> invalid issue
> -
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Test
>  Components: Windows
>Affects Versions: 2.4.8
>Reporter: Timothy Miller
>Priority: Trivial
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42776) invalid issue

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller updated SPARK-42776:
---
Issue Type: Test  (was: Bug)

> invalid issue
> -
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Test
>  Components: Windows
>Affects Versions: 3.3.1
>Reporter: Timothy Miller
>Priority: Trivial
>







[jira] [Updated] (SPARK-42776) invalid issue

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller updated SPARK-42776:
---
Priority: Trivial  (was: Major)

> invalid issue
> -
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Windows
>Affects Versions: 3.3.1
>Reporter: Timothy Miller
>Priority: Trivial
>







[jira] [Updated] (SPARK-42776) invalid issue

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller updated SPARK-42776:
---
Component/s: Windows
 (was: Optimizer)

> invalid issue
> -
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Windows
>Affects Versions: 3.3.1
>Reporter: Timothy Miller
>Priority: Major
>







[jira] [Updated] (SPARK-42776) invalid issue

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller updated SPARK-42776:
---
Summary: invalid issue  (was: deleted issue)

> invalid issue
> -
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
>Reporter: Timothy Miller
>Priority: Major
>







[jira] [Closed] (SPARK-42776) deleted issue

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller closed SPARK-42776.
--

> deleted issue
> -
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
>Reporter: Timothy Miller
>Priority: Major
>







[jira] [Resolved] (SPARK-42776) deleted issue

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller resolved SPARK-42776.

Resolution: Invalid

> deleted issue
> -
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
>Reporter: Timothy Miller
>Priority: Major
>







[jira] [Updated] (SPARK-42776) deleted issue

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller updated SPARK-42776:
---
Summary: deleted issue  (was: 
BroadcastHashJoinExec.requiredChildDistribution called before columnar 
replacement rules)

> deleted issue
> -
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
>Reporter: Timothy Miller
>Priority: Major
>







[jira] [Updated] (SPARK-42776) BroadcastHashJoinExec.requiredChildDistribution called before columnar replacement rules

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller updated SPARK-42776:
---
Environment: (was: I'm prototyping on a Mac, but that's not really 
relevant.)

> BroadcastHashJoinExec.requiredChildDistribution called before columnar 
> replacement rules
> 
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
>Reporter: Timothy Miller
>Priority: Major
>







[jira] (SPARK-42776) BroadcastHashJoinExec.requiredChildDistribution called before columnar replacement rules

2023-03-15 Thread Timothy Miller (Jira)


[ https://issues.apache.org/jira/browse/SPARK-42776 ]


Timothy Miller deleted comment on SPARK-42776:


was (Author: JIRAUSER287471):
A little more detail about the sequence of events that causes this bug:
 * org.apache.spark.sql.execution.RemoveRedundantProjects is applied
 * that causes BroadcastHashJoinExec to get created
 * org.apache.spark.sql.execution.exchange.EnsureRequirements is applied
 * BroadcastHashJoinExec.requiredChildDistribution gets called, creating the 
hashmap object that gets broadcast
 * a few more rules are applied, followed by 
org.apache.spark.sql.execution.ApplyColumnarRulesAndInsertTransitions
 * Only after that can I replace BroadcastHashJoinExec with a columnar 
alternative, but by then it's too late.

I can't find a way to inject extra rules into or between 
RemoveRedundantProjects or EnsureRequirements, so there doesn't seem to be a 
workaround either.

> BroadcastHashJoinExec.requiredChildDistribution called before columnar 
> replacement rules
> 
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
> Environment: I'm prototyping on a Mac, but that's not really relevant.
>Reporter: Timothy Miller
>Priority: Major
>







[jira] [Updated] (SPARK-42776) BroadcastHashJoinExec.requiredChildDistribution called before columnar replacement rules

2023-03-15 Thread Timothy Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Miller updated SPARK-42776:
---
Description: (was: I am trying to replace BroadcastHashJoinExec with a 
columnar equivalent. However, I noticed that 
BroadcastHashJoinExec.requiredChildDistribution gets called BEFORE the columnar 
replacement rules. As a result, the object that gets broadcast is the plain old 
hashmap created from row data. By the time the columnar replacement rules are 
applied, it's too late to get Spark to broadcast any other kind of object.)

> BroadcastHashJoinExec.requiredChildDistribution called before columnar 
> replacement rules
> 
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
> Environment: I'm prototyping on a Mac, but that's not really relevant.
>Reporter: Timothy Miller
>Priority: Major
>







[jira] [Comment Edited] (SPARK-42776) BroadcastHashJoinExec.requiredChildDistribution called before columnar replacement rules

2023-03-14 Thread Timothy Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17700286#comment-17700286
 ] 

Timothy Miller edited comment on SPARK-42776 at 3/14/23 4:34 PM:
-

A little more detail about the sequence of events that causes this bug:
 * org.apache.spark.sql.execution.RemoveRedundantProjects is applied
 * that causes BroadcastHashJoinExec to get created
 * org.apache.spark.sql.execution.exchange.EnsureRequirements is applied
 * BroadcastHashJoinExec.requiredChildDistribution gets called, creating the 
hashmap object that gets broadcast
 * a few more rules are applied, followed by 
org.apache.spark.sql.execution.ApplyColumnarRulesAndInsertTransitions
 * Only after that can I replace BroadcastHashJoinExec with a columnar 
alternative, but by then it's too late.

I can't find a way to inject extra rules into or between 
RemoveRedundantProjects or EnsureRequirements, so there doesn't seem to be a 
workaround either.
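
To make the ordering problem concrete, here is a toy model in Python. It is 
only a sketch and does not use any Spark API: ToyJoin, ensure_requirements, 
columnar_replacement and run_rules are hypothetical names that stand in for the 
join operator, the EnsureRequirements step, the columnar replacement step and 
the rule executor. The assumption it encodes is the one described above: 
whatever kind of join exists when the requirements step runs decides what gets 
broadcast, and a later replacement cannot change it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyJoin:
    """Hypothetical stand-in for a physical join operator (not a Spark class)."""
    kind: str = "row"                        # "row" or "columnar"
    broadcast_payload: Optional[str] = None  # fixed once requirements are checked


def ensure_requirements(join: ToyJoin) -> ToyJoin:
    # Models the step where the required child distribution is requested:
    # the broadcast payload is decided from the join that exists right now.
    if join.broadcast_payload is None:
        join.broadcast_payload = join.kind + "-hashmap"
    return join


def columnar_replacement(join: ToyJoin) -> ToyJoin:
    # Models swapping in a columnar join; an already chosen payload is kept.
    return ToyJoin(kind="columnar", broadcast_payload=join.broadcast_payload)


def run_rules(rules: List[Callable[[ToyJoin], ToyJoin]], join: ToyJoin) -> ToyJoin:
    # Apply the rules strictly in the given order, like a fixed rule batch.
    for rule in rules:
        join = rule(join)
    return join


# Order reported in this issue: requirements first, replacement afterwards.
observed = run_rules([ensure_requirements, columnar_replacement], ToyJoin())

# Order the reporter would need: replacement first, requirements afterwards.
desired = run_rules([columnar_replacement, ensure_requirements], ToyJoin())

print(observed)
print(desired)
```

In the first ordering the join ends up columnar but still carries the payload 
chosen for the row-based join; in the second the payload matches the columnar 
join. The model says nothing about how Spark could be changed to allow the 
second ordering.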


was (Author: JIRAUSER287471):
A little more detail about the sequence of events that causes this bug:
 * org.apache.spark.sql.execution.RemoveRedundantProjects is applied
 * that causes BroadcastHashJoinExec to get created
 * org.apache.spark.sql.execution.exchange.EnsureRequirements is applied
 * BroadcastHashJoinExec.requiredChildDistribution gets called, creating the 
hashmap object that gets broadcast
 * a few more rules are applied, followed by 
org.apache.spark.sql.execution.ApplyColumnarRulesAndInsertTransitions

I can't find a way to inject extra rules into or between 
RemoveRedundantProjects or EnsureRequirements, so there doesn't seem to be a 
workaround either.

> BroadcastHashJoinExec.requiredChildDistribution called before columnar 
> replacement rules
> 
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
> Environment: I'm prototyping on a Mac, but that's not really relevant.
>Reporter: Timothy Miller
>Priority: Major
>
> I am trying to replace BroadcastHashJoinExec with a columnar equivalent. 
> However, I noticed that BroadcastHashJoinExec.requiredChildDistribution gets 
> called BEFORE the columnar replacement rules. As a result, the object that 
> gets broadcast is the plain old hashmap created from row data. By the time 
> the columnar replacement rules are applied, it's too late to get Spark to 
> broadcast any other kind of object.






[jira] [Commented] (SPARK-42776) BroadcastHashJoinExec.requiredChildDistribution called before columnar replacement rules

2023-03-14 Thread Timothy Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17700286#comment-17700286
 ] 

Timothy Miller commented on SPARK-42776:


A little more detail about the sequence of events that causes this bug:
 * org.apache.spark.sql.execution.RemoveRedundantProjects is applied
 * that causes BroadcastHashJoinExec to get created
 * org.apache.spark.sql.execution.exchange.EnsureRequirements is applied
 * BroadcastHashJoinExec.requiredChildDistribution gets called, creating the 
hashmap object that gets broadcast
 * a few more rules are applied, followed by 
org.apache.spark.sql.execution.ApplyColumnarRulesAndInsertTransitions

I can't find a way to inject extra rules into or between 
RemoveRedundantProjects or EnsureRequirements, so there doesn't seem to be a 
workaround either.

> BroadcastHashJoinExec.requiredChildDistribution called before columnar 
> replacement rules
> 
>
> Key: SPARK-42776
> URL: https://issues.apache.org/jira/browse/SPARK-42776
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.3.1
> Environment: I'm prototyping on a Mac, but that's not really relevant.
>Reporter: Timothy Miller
>Priority: Major
>
> I am trying to replace BroadcastHashJoinExec with a columnar equivalent. 
> However, I noticed that BroadcastHashJoinExec.requiredChildDistribution gets 
> called BEFORE the columnar replacement rules. As a result, the object that 
> gets broadcast is the plain old hashmap created from row data. By the time 
> the columnar replacement rules are applied, it's too late to get Spark to 
> broadcast any other kind of object.






[jira] [Created] (SPARK-42776) BroadcastHashJoinExec.requiredChildDistribution called before columnar replacement rules

2023-03-13 Thread Timothy Miller (Jira)
Timothy Miller created SPARK-42776:
--

 Summary: BroadcastHashJoinExec.requiredChildDistribution called 
before columnar replacement rules
 Key: SPARK-42776
 URL: https://issues.apache.org/jira/browse/SPARK-42776
 Project: Spark
  Issue Type: Bug
  Components: Optimizer
Affects Versions: 3.3.1
 Environment: I'm prototyping on a Mac, but that's not really relevant.
Reporter: Timothy Miller


I am trying to replace BroadcastHashJoinExec with a columnar equivalent. 
However, I noticed that BroadcastHashJoinExec.requiredChildDistribution gets 
called BEFORE the columnar replacement rules. As a result, the object that gets 
broadcast is the plain old hashmap created from row data. By the time the 
columnar replacement rules are applied, it's too late to get Spark to broadcast 
any other kind of object.


