I'd like to bump this. I agree with Carlos that very little information is available at the DataSourceWriter/DataSourceReader level. Ideally, the DataSourceWriter/Reader should have as much information as possible: not only the number of partitions, but also the whole execution plan.
This would not only enable things like automatic creation of Kafka topics with the correct number of partitions (as Carlos mentioned), but it would also allow advanced DataSources that, for example, analyze the execution plan to choose the correct parameters to implement differential privacy.
CC'ing Ryan, since he is leading the DataSourceV2 workgroup (sorry I can't join the sync meetings, but I'm in CET and the time logistics of that meeting don't work for Europe).
Ryan, do you think it would be a good idea to provide extra information at the DataSourceWriter/Reader level to enable more advanced datasources? Would a PR contribution with these changes be a welcome addition?
Thanks,
Ximo
-----Original Message-----
From: CARLOS DEL PRADO MOTA <carlos.delpradom...@telefonica.com>
Sent: Thursday, March 7, 2019 10:19
To: dev@spark.apache.org
Subject: Partitions at DataSource API V2
Hello,
I'm Carlos del Prado, developer at Telefonica. We are working with Spark's DataSource API V2, building a custom Kafka connector that creates the topic upon write. In order to do that, we need to know the number of partitions before writing data in each partition, at the DataSourceWriter level. Is there any way for us to do that?
Kind regards,
Carlos.