I'd like to bump this. I agree with Carlos that very little information is
available at the DataSourceWriter/DataSourceReader level. Ideally, the
DataSourceWriter/Reader should have as much information as possible: not only
the number of partitions, but also the whole execution plan.

This would not only enable things like automatic creation of Kafka topics with
the correct number of partitions (as Carlos mentioned), but it would also
allow advanced DataSources that, for example, analyze the execution plan to
choose the correct parameters to implement differential privacy.
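
To make the request a bit more concrete, here is a minimal sketch of what the
extra information could look like. Everything in it is hypothetical: WriteInfo,
InMemoryTopicAdmin and TopicCreatingWriter are illustrative names, not Spark or
Kafka APIs, and the in-memory admin only stands in for a real topic-creation
call so the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Small driver for the sketch: the writer learns the partition count up front
// and creates the topic before any per-partition write would start.
final class TopicPlanningDemo {
    public static void main(String[] args) {
        InMemoryTopicAdmin admin = new InMemoryTopicAdmin();
        TopicCreatingWriter writer = new TopicCreatingWriter(admin, "events");
        writer.prepare(new WriteInfo() {
            public int numPartitions() { return 8; }
            public String planDescription() { return "hypothetical plan"; }
        });
        System.out.println("events -> " + admin.partitionsOf("events") + " partitions");
    }
}

// Hypothetical: information the writer would receive on the driver side.
// Nothing like this is claimed to exist in the current DataSourceV2 API.
interface WriteInfo {
    int numPartitions();        // number of partitions about to be written
    String planDescription();   // some representation of the execution plan
}

// Hypothetical stand-in for a Kafka admin client, kept in memory.
final class InMemoryTopicAdmin {
    private final Map<String, Integer> topics = new HashMap<>();

    void createTopic(String name, int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        // Keep the first definition if the topic already exists.
        topics.putIfAbsent(name, partitions);
    }

    Integer partitionsOf(String name) {
        return topics.get(name); // null when the topic was never created
    }
}

// Hypothetical writer-level hook that uses the extra information.
final class TopicCreatingWriter {
    private final InMemoryTopicAdmin admin;
    private final String topic;

    TopicCreatingWriter(InMemoryTopicAdmin admin, String topic) {
        this.admin = admin;
        this.topic = topic;
    }

    void prepare(WriteInfo info) {
        admin.createTopic(topic, info.numPartitions());
    }
}
```

The point of the sketch is only that the partition count (and possibly the
plan) arrives once, at the writer level, rather than being discovered task by
task.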

CC'ing Ryan, since he is leading the DataSourceV2 workgroup (sorry I can't
join the sync meetings, but I'm in the CET time zone and the meeting time
doesn't work for Europe).

Ryan, do you think it would be a good idea to provide extra information at the
DataSourceWriter/Reader level to enable more advanced DataSources? Would a PR
contribution with these changes be a welcome addition?

Thanks,
Ximo

-----Original Message-----
From: CARLOS DEL PRADO MOTA <carlos.delpradom...@telefonica.com>
Sent: Thursday, 7 March 2019 10:19
To: dev@spark.apache.org
Subject: Partitions at DataSource API V2

Hello, I’m Carlos del Prado, a developer at Telefonica.

We are working with Spark's DataSource API V2 building a custom Kafka connector 
that creates the topic upon write. In order to do that, we need to know the 
number of partitions before writing data in each partition, at the 
DataSourceWriter level.

Is there any way for us to do that?

Kind regards,
Carlos.

