Anyone can attend the v2 sync. You just need to let me know which email address you'd like to have added. Sorry that it is invite-only; that's a limitation of the platform (Hangouts). The Spark community welcomes anyone who wants to participate.
On Mon, Dec 10, 2018 at 1:00 AM JOAQUIN GUANTER GONZALBEZ <joaquin.guantergonzal...@telefonica.com> wrote:

> Ah, yes, you are right. The DataSourceV2 APIs wouldn’t let an implementor
> mark a Dataset as “bucketed”. Is there any documentation about the upcoming
> table support for data source v2 or any way of getting invited to the
> DataSourceV2 community sync?
>
> Thanks!
>
> Ximo.
>
> *From:* Wenchen Fan <cloud0...@gmail.com>
> *Sent:* Wednesday, December 5, 2018 15:51
> *To:* JOAQUIN GUANTER GONZALBEZ <joaquin.guantergonzal...@telefonica.com>
> *CC:* Spark dev list <dev@spark.apache.org>
> *Subject:* Re: [SPARK-26160] Make assertNotBucketed call in
> DataFrameWriter::save optional
>
> The bucket feature is designed to work only with data sources that have
> table support, and currently the table support is not public yet, which
> means no external data sources can access bucketing information right now.
> The bucket feature only works with Spark native file source tables.
>
> We are working on adding table support to data source v2, and we should
> have a good story about bucketing when it's done.
>
> On Tue, Nov 27, 2018 at 1:01 AM JOAQUIN GUANTER GONZALBEZ <
> joaquin.guantergonzal...@telefonica.com> wrote:
>
> > Hello,
> >
> > I have a proposal for a small improvement in the Datasource API and I’d
> > like to know if it sounds like a change the Spark project would accept.
> >
> > Currently, the `.save` method in DataFrameWriter will fail if the
> > dataframe is bucketed and/or sorted. This makes sense, since there is no
> > way of storing metadata in the current file-based data sources to know
> > whether a file was bucketed or not.
> >
> > I have a use case where I would like to implement a new, file-based data
> > source which could keep track of that kind of metadata (without using the
> > HiveMetastore), so I would like to be able to `.save` bucketed dataframes.
> >
> > Would a patch to extend the Datasource API with an indicator of whether
> > that source is able to serialize bucketed dataframes be a welcome addition?
> > I'm happy to work on it if that’s the case.
> >
> > I have opened this as https://issues.apache.org/jira/browse/SPARK-26160
> > in the Spark Jira.
> >
> > Cheers,
> >
> > Ximo.
>
> ------------------------------
>
> The information contained in this transmission is privileged and
> confidential information intended only for the use of the individual or
> entity named above. If the reader of this message is not the intended
> recipient, you are hereby notified that any dissemination, distribution or
> copying of this communication is strictly prohibited. If you have received
> this transmission in error, do not read it. Please immediately reply to the
> sender that you have received this communication in error and then delete
> it.

--
Ryan Blue
Software Engineer
Netflix