[ https://issues.apache.org/jira/browse/SPARK-29248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938060#comment-16938060 ]

Jungtaek Lim commented on SPARK-29248:
--------------------------------------

Assuming the writer receives the number of partitions, what should the writer 
do if the number of partitions in the upstream plan differs from the writer's 
requirement?

In many cases the writer simply assumes that the upstream plan delivers rows 
with the desired partitioning/sort order - that is what SPARK-23889 is about. 
There may still be cases where this issue holds: the writer may know the 
partitioning function (even without such an interface) and be able to map each 
row to a partition and send it to that specific partition itself. I just 
haven't seen an actual use case for this. It would help if you could elaborate 
on what you're implementing and what you plan to do with the number of 
partitions.
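
For illustration, here is a minimal sketch of that "writer knows the 
partitioning function" case: the writer hashes a key column and routes each 
row to the matching sink partition itself, instead of relying on the upstream 
plan's distribution. The names (RoutingWriter, SinkClient, keyOrdinal) are 
hypothetical and not part of any Spark interface.

{code:scala}
import org.apache.spark.sql.catalyst.InternalRow

// Hypothetical writer-side routing: the writer owns the partitioning function
// (a hash of one string key column) and picks the sink partition per row.
class RoutingWriter(numSinkPartitions: Int, keyOrdinal: Int, sink: SinkClient) {

  private def targetPartition(row: InternalRow): Int = {
    val key = row.getUTF8String(keyOrdinal)
    (key.hashCode & Int.MaxValue) % numSinkPartitions
  }

  def write(row: InternalRow): Unit =
    sink.send(targetPartition(row), row)
}

// Illustrative stand-in for the external system's client (e.g. a Kafka producer).
trait SinkClient {
  def send(partition: Int, row: InternalRow): Unit
}
{code}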

> Pass in number of partitions to WriteBuilder
> --------------------------------------------
>
>                 Key: SPARK-29248
>                 URL: https://issues.apache.org/jira/browse/SPARK-29248
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Ximo Guanter
>            Priority: Major
>
> When implementing a ScanBuilder, we require the implementor to provide the 
> schema of the data and the number of partitions.
> However, when someone is implementing WriteBuilder we only pass them the 
> schema, but not the number of partitions. This is an asymmetrical developer 
> experience. Passing in the number of partitions on the WriteBuilder would 
> enable data sources to provision their write targets before starting to 
> write. For example, it could be used to provision a Kafka topic with a 
> specific number of partitions.
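
To make the Kafka example in the description above concrete: if the number of 
partitions were passed to the WriteBuilder, the sink could provision the topic 
up front with a matching partition count. A minimal sketch using the Kafka 
AdminClient (provisionTopic and numWritePartitions are illustrative names, and 
the replication factor is hard-coded only to keep the sketch short):

{code:scala}
import java.util.{Collections, Properties}

import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}

object TopicProvisioner {
  def provisionTopic(bootstrapServers: String, topic: String, numWritePartitions: Int): Unit = {
    val props = new Properties()
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers)
    val admin = AdminClient.create(props)
    try {
      // Create the topic with the same partition count the write will use,
      // before any task starts writing.
      val topicSpec = new NewTopic(topic, numWritePartitions, 1.toShort)
      admin.createTopics(Collections.singleton(topicSpec)).all().get()
    } finally {
      admin.close()
    }
  }
}
{code}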



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
