Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-29 Thread L . C . Hsieh

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-28 Thread Wenchen Fan

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-27 Thread Ryan Blue

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-27 Thread Chao Sun

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-27 Thread L . C . Hsieh
+1 for the SPIP. This is a great improvement and optimization!

On 2021/10/26 19:01:03, Erik Krogen wrote:
> It's great to see this SPIP going live. Once this is complete, it will
> really help Spark to play nicely with a broader data ecosystem (Hive,
> Iceberg, Trino, etc.), and it's great to see that besides just bringing the
> existing bucketed-join support to V2, we are also making the types of
> partitioning that can be accommodated more broad and leaving open pathways
> for future optimizations like partially clustered distributions.
> 
> Big thanks to Ryan and Chao!
> 
> On Tue, Oct 26, 2021 at 10:35 AM Cheng Su  wrote:
> 
> > +1 for this. This is exciting movement to efficiently read bucketed table
> > from other systems (Hive, Trino & Presto)!
> >
> >
> >
> > Still looking at the details but having some early questions:
> >
> >
> >
> > 1. Is migrating Hive table read path to data source v2, being a
> >    prerequisite of this SPIP?
> >
> >
> >
> > Hive table read path is currently a mix of data source v1 (for Parquet &
> > ORC file format only), and legacy Hive code path (HiveTableScanExec). In
> > the SPIP, I am seeing we only make change for data source v2, so wondering
> > how this would work with existing Hive table read path. In addition, just
> > FYI, supporting writing Hive bucketed table is merged in master recently (
> > SPARK-19256 <https://issues.apache.org/jira/browse/SPARK-19256> has
> > details).
> >
> >
> >
> > 2. Would aggregate work automatically after the SPIP?
> >
> >
> >
> > Another major benefit for having bucketed table, is to avoid shuffle
> > before aggregate. Just want to bring to our attention that it would be
> > great to consider aggregate as well when doing this proposal.
> >
> >
> >
> > 3. Any major use cases in mind except Hive bucketed table?
> >
> >
> >
> > Just curious if there’s any other use cases we are targeting as part of
> > SPIP.
> >
> >
> >
> > Thanks,
> >
> > Cheng Su
> >
> >
> >
> >
> >
> >
> >
> > *From: *Ryan Blue 
> > *Date: *Tuesday, October 26, 2021 at 9:39 AM
> > *To: *John Zhuge 
> > *Cc: *Chao Sun , Wenchen Fan ,
> > Cheng Su , DB Tsai , Dongjoon Hyun <
> > dongjoon.h...@gmail.com>, Hyukjin Kwon , Wenchen Fan
> > , angers zhu , dev <
> > dev@spark.apache.org>, huaxin gao 
> > *Subject: *Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2
> >
> > Instead of commenting on the doc, could we keep discussion here on the dev
> > list please? That way more people can follow it and there is more room for
> > discussion. Comment threads have a very small area and easily become hard
> > to follow.
> >
> >
> >
> > Ryan
> >
> >
> >
> > On Tue, Oct 26, 2021 at 9:32 AM John Zhuge  wrote:
> >
> > +1  Nicely done!
> >
> >
> >
> > On Tue, Oct 26, 2021 at 8:08 AM Chao Sun  wrote:
> >
> > Oops, sorry. I just fixed the permission setting.
> >
> >
> >
> > Thanks everyone for the positive support!
> >
> >
> >
> > On Tue, Oct 26, 2021 at 7:30 AM Wenchen Fan  wrote:
> >
> > +1 to this SPIP and nice writeup of the design doc!
> >
> >
> >
> > Can we open comment permission in the doc so that we can discuss details
> > there?
> >
> >
> >
> > On Tue, Oct 26, 2021 at 8:29 PM Hyukjin Kwon  wrote:
> >
> > Seems making sense to me.
> >
> > Would be great to have some feedback from people such as @Wenchen Fan
> >  @Cheng Su  @angers zhu
> > .
> >
> >
> >
> >
> >
> > On Tue, 26 Oct 2021 at 17:25, Dongjoon Hyun 
> > wrote:
> >
> > +1 for this SPIP.
> >
> >
> >
> > On Sun, Oct 24, 2021 at 9:59 AM huaxin gao  wrote:
> >
> > +1. Thanks for lifting the current restrictions on bucket join and making
> > this more generalized.
> >
> >
> >
> > On Sun, Oct 24, 2021 at 9:33 AM Ryan Blue  wrote:
> >
> > +1 from me as well. Thanks Chao for doing so much to get it to this point!
> >
> >
> >
> > On Sat, Oct 23, 2021 at 11:29 PM DB Tsai  wrote:
> >
> > +1 on this SPIP.
> >
> > This is a more generalized version of bucketed tables and bucketed
> > joins which can eliminate very expensive data shuffles when joins, and
> > many users in the Apache Spark community have wanted this feature for
> > a long time!

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-27 Thread Wenchen Fan
>>> > Any major use cases in mind except Hive bucketed table?
>>>
>>> Our first use case is Apache Iceberg. In addition to that we also want
>>> to add the support for Spark's built-in file data sources.
>>>
>>> Thanks,
>>> Chao
>>>
>>> On Tue, Oct 26, 2021 at 10:34 AM Cheng Su  wrote:
>>>
>>>> +1 for this. This is exciting movement to efficiently read bucketed
>>>> table from other systems (Hive, Trino & Presto)!
>>>>
>>>>
>>>>
>>>> Still looking at the details but having some early questions:
>>>>
>>>>
>>>>
>>>> 1. Is migrating Hive table read path to data source v2, being a
>>>>    prerequisite of this SPIP?
>>>>
>>>>
>>>>
>>>> Hive table read path is currently a mix of data source v1 (for Parquet
>>>> & ORC file format only), and legacy Hive code path (HiveTableScanExec). In
>>>> the SPIP, I am seeing we only make change for data source v2, so wondering
>>>> how this would work with existing Hive table read path. In addition, just
>>>> FYI, supporting writing Hive bucketed table is merged in master recently (
>>>> SPARK-19256 <https://issues.apache.org/jira/browse/SPARK-19256> has
>>>> details).
>>>>
>>>>
>>>>
>>>> 2. Would aggregate work automatically after the SPIP?
>>>>
>>>>
>>>>
>>>> Another major benefit for having bucketed table, is to avoid shuffle
>>>> before aggregate. Just want to bring to our attention that it would be
>>>> great to consider aggregate as well when doing this proposal.
>>>>
>>>>
>>>>
>>>> 3. Any major use cases in mind except Hive bucketed table?
>>>>
>>>>
>>>>
>>>> Just curious if there’s any other use cases we are targeting as part of
>>>> SPIP.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Cheng Su
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From: *Ryan Blue 
>>>> *Date: *Tuesday, October 26, 2021 at 9:39 AM
>>>> *To: *John Zhuge 
>>>> *Cc: *Chao Sun , Wenchen Fan ,
>>>> Cheng Su , DB Tsai , Dongjoon Hyun <
>>>> dongjoon.h...@gmail.com>, Hyukjin Kwon , Wenchen
>>>> Fan , angers zhu , dev <
>>>> dev@spark.apache.org>, huaxin gao 
>>>> *Subject: *Re: [DISCUSS] SPIP: Storage Partitioned Join for Data
>>>> Source V2
>>>>
>>>> Instead of commenting on the doc, could we keep discussion here on the
>>>> dev list please? That way more people can follow it and there is more room
>>>> for discussion. Comment threads have a very small area and easily become
>>>> hard to follow.
>>>>
>>>>
>>>>
>>>> Ryan
>>>>
>>>>
>>>>
>>>> On Tue, Oct 26, 2021 at 9:32 AM John Zhuge  wrote:
>>>>
>>>> +1  Nicely done!
>>>>
>>>>
>>>>
>>>> On Tue, Oct 26, 2021 at 8:08 AM Chao Sun  wrote:
>>>>
>>>> Oops, sorry. I just fixed the permission setting.
>>>>
>>>>
>>>>
>>>> Thanks everyone for the positive support!
>>>>
>>>>
>>>>
>>>> On Tue, Oct 26, 2021 at 7:30 AM Wenchen Fan 
>>>> wrote:
>>>>
>>>> +1 to this SPIP and nice writeup of the design doc!
>>>>
>>>>
>>>>
>>>> Can we open comment permission in the doc so that we can discuss
>>>> details there?
>>>>
>>>>
>>>>
>>>> On Tue, Oct 26, 2021 at 8:29 PM Hyukjin Kwon 
>>>> wrote:
>>>>
>>>> Seems making sense to me.
>>>>
>>>> Would be great to have some feedback from people such as @Wenchen Fan
>>>>  @Cheng Su  @angers zhu
>>>> .
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, 26 Oct 2021 at 17:25, Dongjoon Hyun 
>>>> wrote:
>>>>
>>>> +1 for this SPIP.
>>>>
>>>>
>>>>
>>>> On Sun, Oct 24, 2021 at 9:59 AM huaxin gao 
>>>> wrote:
>>>>
>>>> +1. Thanks for lifting the current restrictions on bucket join and
>>>> making this more generalized.
>>>>
>>>>
>>>>
>>>> On Sun, Oct 24, 2021 at 9:33 AM Ryan Blue  wrote:
>>>>
>>>> +1 from me as well. Thanks Chao for doing so much to get it to this
>>>> point!
>>>>
>>>>
>>>>
>>>> On Sat, Oct 23, 2021 at 11:29 PM DB Tsai  wrote:
>>>>
>>>> +1 on this SPIP.
>>>>
>>>> This is a more generalized version of bucketed tables and bucketed
>>>> joins which can eliminate very expensive data shuffles when joins, and
>>>> many users in the Apache Spark community have wanted this feature for
>>>> a long time!
>>>>
>>>> Thank you, Ryan and Chao, for working on this, and I look forward to
>>>> it as a new feature in Spark 3.3
>>>>
>>>> DB Tsai  |  https://www.dbtsai.com/  |  PGP 42E5B25A8F7A82C1
>>>>
>>>> On Fri, Oct 22, 2021 at 12:18 PM Chao Sun  wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > Ryan and I drafted a design doc to support a new type of join:
>>>> > storage partitioned join which covers bucket join support for
>>>> > DataSourceV2 but is more general. The goal is to let Spark leverage
>>>> > distribution properties reported by data sources and eliminate shuffle
>>>> > whenever possible.
>>>> >
>>>> > Design doc:
>>>> > https://docs.google.com/document/d/1foTkDSM91VxKgkEcBMsuAvEjNybjja-uHk-r3vtXWFE
>>>> > (includes a POC link at the end)
>>>> >
>>>> > We'd like to start a discussion on the doc and any feedback is
>>>> > welcome!
>>>> >
>>>> > Thanks,
>>>> > Chao
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Ryan Blue
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> John Zhuge
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Ryan Blue
>>>>
>>>
>
> --
> Ryan Blue
>


Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-27 Thread Ryan Blue

-- 
Ryan Blue


Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-27 Thread Wenchen Fan
IIUC, the general idea is to let each input split report its partition
value, so that Spark can perform the join in two phases:
1. On the driver side, join the input splits of the left and right tables
according to their partition values and the join keys.
2. For each joined pair of input splits (or group of splits), launch a Spark
task to join the rows.
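
A minimal Python sketch of these two phases, for illustration only; it is not taken from the SPIP or its POC. The names `plan_split_pairs` and `join_task`, the in-memory split layout, and the toy partition function (join key modulo 2, assumed to be shared by both tables) are all assumptions.

```python
from collections import defaultdict

# Hypothetical input splits: (partition_value, rows); each row is (join_key, payload).
# Assumption: both tables derive the partition value as join_key % 2.
left_splits = [
    (0, [(2, "b"), (4, "d")]),
    (1, [(1, "a"), (3, "c")]),
]
right_splits = [
    (1, [(3, "x"), (5, "y")]),
    (0, [(4, "z")]),
]


def plan_split_pairs(left, right):
    """Phase 1 (driver side): group splits by partition value and pair them up."""
    by_value = defaultdict(lambda: ([], []))
    for value, rows in left:
        by_value[value][0].append(rows)
    for value, rows in right:
        by_value[value][1].append(rows)
    # Inner join: only partition values present on both sides can produce rows.
    return {v: pair for v, pair in by_value.items() if pair[0] and pair[1]}


def join_task(left_groups, right_groups):
    """Phase 2 (one task per pair): a plain hash join over the paired splits."""
    table = defaultdict(list)
    for rows in left_groups:
        for key, payload in rows:
            table[key].append(payload)
    return [
        (key, l_payload, r_payload)
        for rows in right_groups
        for key, r_payload in rows
        for l_payload in table.get(key, [])
    ]


pairs = plan_split_pairs(left_splits, right_splits)
result = [row for value in sorted(pairs) for row in join_task(*pairs[value])]
print(result)
```

No rows are exchanged between tasks: each task only reads the splits paired for its partition value, which is where the shuffle saving comes from.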

My major concern is how to define "compatible partitions". Transforms
like `days(ts)` are straightforward: the same timestamp value always
results in the same partition value, whichever v2 source produces it.
`bucket(col, num)` is tricky, as Spark doesn't define the bucket hash
function. Two v2 sources may return different bucket IDs for the same
value, and this breaks the phase 1 split-wise join.
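
To illustrate the concern, here is a small Python sketch. The two bucket functions are hypothetical stand-ins, not the hash functions of any real source, and `days` assumes the transform means whole days since the Unix epoch in UTC.

```python
import zlib
from datetime import datetime, timezone


def days(ts):
    """days(ts): whole days since the Unix epoch (UTC assumed)."""
    return int(ts.timestamp() // 86400)


# Two hypothetical sources that both advertise bucket(col, 4)
# but use different hash functions internally.
def bucket_source_a(value, num_buckets):
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets


def bucket_source_b(value, num_buckets):
    return value % num_buckets


ts = datetime(2021, 10, 27, 3, 8, tzinfo=timezone.utc)
# days(ts) depends only on the timestamp, so every source agrees on it.
print(days(ts))

# bucket(col, 4) depends on the hash, so the sources can disagree.
mismatched = [
    k for k in range(100) if bucket_source_a(k, 4) != bucket_source_b(k, 4)
]
print(len(mismatched), "of 100 keys land in different buckets")
```

Any key in `mismatched` would sit in differently numbered splits on the two sides, so pairing splits by bucket ID would silently drop its matches.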

And two questions for further improvements:
1. Can we apply this idea to partitioned file source tables (non-bucketed)
as well?
2. What if the table has many partitions? Shall we apply certain join
algorithms in the phase 1 split-wise join as well? Or even launch a Spark
job to do so?

Thanks,
Wenchen

On Wed, Oct 27, 2021 at 3:08 AM Chao Sun wrote:

> Thanks Cheng for the comments.
>
> > Is migrating Hive table read path to data source v2, being a
> > prerequisite of this SPIP
>
> Yes, this SPIP only aims at DataSourceV2, so obviously it will help if
> Hive eventually moves to the V2 API. That said, I think some of the ideas
> could be useful for V1 Hive support as well. For instance, with the newly
> proposed logic to check whether the output partitionings from both sides
> of a join operator are compatible, we can have HiveTableScanExec report a
> partitioning other than HashPartitioning, and EnsureRequirements could
> potentially recognize that and avoid the shuffle if both sides report
> compatible partitionings. In addition, SPARK-35703, which is part of the
> SPIP, is also useful in that it relaxes the constraint for V1 bucket join
> so that the join keys need not be identical to the bucket keys.
>
> > Would aggregate work automatically after the SPIP?
>
> Yes, it will work as before. This case is already supported by
> DataSourcePartitioning in V2 (see SPARK-22389).
>
> > Any major use cases in mind except Hive bucketed table?
>
> Our first use case is Apache Iceberg. In addition to that, we also want to
> add support for Spark's built-in file data sources.
>
> Thanks,
> Chao
>
> On Tue, Oct 26, 2021 at 10:34 AM Cheng Su  wrote:
>
>> +1 for this. This is exciting movement to efficiently read bucketed table
>> from other systems (Hive, Trino & Presto)!
>>
>>
>>
>> Still looking at the details but having some early questions:
>>
>>
>>
>> 1. Is migrating Hive table read path to data source v2, being a
>>    prerequisite of this SPIP?
>>
>>
>>
>> Hive table read path is currently a mix of data source v1 (for Parquet &
>> ORC file format only), and legacy Hive code path (HiveTableScanExec). In
>> the SPIP, I am seeing we only make change for data source v2, so wondering
>> how this would work with existing Hive table read path. In addition, just
>> FYI, supporting writing Hive bucketed table is merged in master recently (
>> SPARK-19256 <https://issues.apache.org/jira/browse/SPARK-19256> has
>> details).
>>
>>
>>
>> 2. Would aggregate work automatically after the SPIP?
>>
>>
>>
>> Another major benefit for having bucketed table, is to avoid shuffle
>> before aggregate. Just want to bring to our attention that it would be
>> great to consider aggregate as well when doing this proposal.
>>
>>
>>
>> 3. Any major use cases in mind except Hive bucketed table?
>>
>>
>>
>> Just curious if there’s any other use cases we are targeting as part of
>> SPIP.
>>
>>
>>
>> Thanks,
>>
>> Cheng Su
>>
>>
>>
>>
>>
>>
>>
>> *From: *Ryan Blue 
>> *Date: *Tuesday, October 26, 2021 at 9:39 AM
>> *To: *John Zhuge 
>> *Cc: *Chao Sun , Wenchen Fan ,
>> Cheng Su , DB Tsai , Dongjoon Hyun <
>> dongjoon.h...@gmail.com>, Hyukjin Kwon , Wenchen
>> Fan , angers zhu , dev <
>> dev@spark.apache.org>, huaxin gao 
>> *Subject: *Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source
>> V2
>>
>> Instead of commenting on the doc, could we keep discussion here on the
>> dev list please? That way more people can follow it and there is more room
>> for discussion. Comment threads have a very small area and easily become
>> hard to follow.
>>
>>
>>
>> Ryan
>>
>>
>>
>

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Chao Sun
Thanks Cheng for the comments.

> Is migrating Hive table read path to data source v2, being a prerequisite
> of this SPIP

Yes, this SPIP only targets DataSourceV2, so it will obviously help if Hive
eventually moves to the V2 API. That said, I think some of the ideas could be
useful for V1 Hive support as well. For instance, with the newly proposed
logic that compares whether the output partitionings from both sides of a
join operator are compatible, we can have HiveTableScanExec report a
partitioning other than HashPartitioning, and EnsureRequirements could
potentially recognize that and avoid the shuffle if both sides report
compatible partitionings. In addition, SPARK-35703, which is part of the
SPIP, is also useful in that it relaxes the constraint for V1 bucket join so
that the join keys do not necessarily have to be identical to the bucket
keys.
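
To make the compatibility check concrete, here is a minimal, self-contained
Python sketch. It is not Spark code: `Transform`, `Partitioning`,
`compatible`, and `needs_shuffle` are hypothetical names made up for
illustration, and the real logic in EnsureRequirements is more involved. The
sketch assumes that each side of the join reports the transforms its data is
partitioned by, and that two transforms with the same name and parameters
produce the same partition values on both sides (which may not hold for
`bucket` if the sources use different hash functions).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Transform:
    """One partition transform, e.g. bucket(8, id) or days(ts). Hypothetical model."""
    name: str                          # "bucket", "days", "identity", ...
    column: str                        # column the transform is applied to
    num_buckets: Optional[int] = None  # only meaningful for "bucket"


@dataclass(frozen=True)
class Partitioning:
    """Partitioning reported by one side of a join. Hypothetical model."""
    transforms: Tuple[Transform, ...]


def compatible(left: Partitioning, right: Partitioning,
               key_pairs: List[Tuple[str, str]]) -> bool:
    """True if both sides use the same transforms over matching join keys."""
    if len(left.transforms) != len(right.transforms):
        return False
    pairs = set(key_pairs)
    for lt, rt in zip(left.transforms, right.transforms):
        # The transformed columns must be joined to each other.
        if (lt.column, rt.column) not in pairs:
            return False
        # Same transform and same parameters (assumed to imply same values).
        if (lt.name, lt.num_buckets) != (rt.name, rt.num_buckets):
            return False
    return True


def needs_shuffle(left: Partitioning, right: Partitioning,
                  key_pairs: List[Tuple[str, str]]) -> bool:
    """A shuffle is only needed when the reported partitionings are incompatible."""
    return not compatible(left, right, key_pairs)


orders = Partitioning((Transform("bucket", "customer_id", 8),))
customers = Partitioning((Transform("bucket", "id", 8),))
customers_16 = Partitioning((Transform("bucket", "id", 16),))

print(needs_shuffle(orders, customers, [("customer_id", "id")]))     # False
print(needs_shuffle(orders, customers_16, [("customer_id", "id")]))  # True
```

In this toy model a mismatch in bucket count simply forces a shuffle; whether
and how such cases could be handled without one is left to the design doc.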

> Would aggregate work automatically after the SPIP?

Yes, it will work as before. This case is already supported by
DataSourcePartitioning in V2 (see SPARK-22389).
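
As a rough illustration of why a reported partitioning lets an aggregate skip
the shuffle, here is a small Python sketch. It is not how Spark implements
aggregation: `aggregate_without_shuffle` is a hypothetical helper, and it
assumes the source guarantees that every grouping key lives in exactly one
partition. Under that assumption each partition can be aggregated on its own
and the partial results never need to be merged across partitions.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, int]  # (grouping key, value); hypothetical row shape


def aggregate_without_shuffle(partitions: Iterable[List[Row]]) -> Dict[str, int]:
    """Count rows per key, assuming each key appears in only one partition."""
    result: Dict[str, int] = {}
    for part in partitions:
        local = Counter(key for key, _ in part)  # partition-local aggregation
        overlap = result.keys() & local.keys()
        if overlap:
            # The clustering assumption is violated; a shuffle would be needed.
            raise ValueError(f"keys span multiple partitions: {sorted(overlap)}")
        result.update(local)
    return result


clustered = [[("a", 1), ("a", 2)], [("b", 3)]]
print(aggregate_without_shuffle(clustered))  # {'a': 2, 'b': 1}
```

If the same key showed up in two partitions, the per-partition counts would
have to be combined, which is exactly the work a shuffle does.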

> Any major use cases in mind except Hive bucketed table?

Our first use case is Apache Iceberg. In addition to that, we also want to
add support for Spark's built-in file data sources.

Thanks,
Chao


Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Erik Krogen
It's great to see this SPIP going live. Once this is complete, it will
really help Spark play nicely with the broader data ecosystem (Hive,
Iceberg, Trino, etc.). It's also great to see that, besides bringing the
existing bucketed-join support to V2, we are broadening the types of
partitioning that can be accommodated and leaving open pathways for future
optimizations like partially clustered distributions.

Big thanks to Ryan and Chao!



Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Cheng Su
+1 for this. This is an exciting move toward efficiently reading bucketed
tables from other systems (Hive, Trino & Presto)!

I'm still looking at the details, but I have some early questions:

  1.  Is migrating Hive table read path to data source v2, being a prerequisite
of this SPIP?

The Hive table read path is currently a mix of data source v1 (for the Parquet
& ORC file formats only) and the legacy Hive code path (HiveTableScanExec). In
the SPIP, I see that we only make changes for data source v2, so I am wondering
how this would work with the existing Hive table read path. In addition, just
FYI, support for writing Hive bucketed tables was merged into master recently
(SPARK-19256 <https://issues.apache.org/jira/browse/SPARK-19256> has details).

  2.  Would aggregate work automatically after the SPIP?

Another major benefit of bucketed tables is avoiding a shuffle before
aggregation. I just want to bring to our attention that it would be great to
consider aggregate as well in this proposal.

  3.  Any major use cases in mind except Hive bucketed table?

I'm just curious whether there are any other use cases we are targeting as
part of the SPIP.

Thanks,
Cheng Su







Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Ryan Blue
Instead of commenting on the doc, could we keep discussion here on the dev
list please? That way more people can follow it and there is more room for
discussion. Comment threads have a very small area and easily become hard
to follow.

Ryan



-- 
Ryan Blue


Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread John Zhuge
+1  Nicely done!


-- 
John Zhuge


Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Chao Sun
Oops, sorry. I just fixed the permission setting.

Thanks everyone for the positive support!




Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread angers zhu
+1 on this.




Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Wenchen Fan
+1 to this SPIP and nice writeup of the design doc!

Can we open comment permission in the doc so that we can discuss details
there?



Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Hyukjin Kwon
This makes sense to me.

It would be great to have some feedback from people such as @Wenchen Fan,
@Cheng Su, and @angers zhu.




Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Dongjoon Hyun
+1 for this SPIP.



Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-24 Thread huaxin gao
+1. Thanks for lifting the current restrictions on bucket join and making
this more generalized.



Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-24 Thread Ryan Blue
+1 from me as well. Thanks Chao for doing so much to get it to this point!



-- 
Ryan Blue


Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-23 Thread DB Tsai
+1 on this SPIP.

This is a more generalized version of bucketed tables and bucketed
joins, which can eliminate very expensive data shuffles when joining, and
many users in the Apache Spark community have wanted this feature for
a long time!

Thank you, Ryan and Chao, for working on this, and I look forward to
it as a new feature in Spark 3.3.

DB Tsai  |  https://www.dbtsai.com/  |  PGP 42E5B25A8F7A82C1

On Fri, Oct 22, 2021 at 12:18 PM Chao Sun  wrote:
>
> Hi,
>
> Ryan and I drafted a design doc to support a new type of join: storage 
> partitioned join which covers bucket join support for DataSourceV2 but is 
> more general. The goal is to let Spark leverage distribution properties 
> reported by data sources and eliminate shuffle whenever possible.
>
> Design doc: 
> https://docs.google.com/document/d/1foTkDSM91VxKgkEcBMsuAvEjNybjja-uHk-r3vtXWFE
>  (includes a POC link at the end)
>
> We'd like to start a discussion on the doc and any feedback is welcome!
>
> Thanks,
> Chao
