On 27/08/2020 at 21:55, Ivo Jimenez wrote:
> Hi Antoine,
>
>>> Our main concern is that this new arrow::dataset::RadosFormat class will be
>>> deriving from the arrow::dataset::FileFormat class, which seems to raise a
>>> conceptual mismatch as there isn’t really a RADOS format but rather a
>>> formatting/serialization deferral that will be taking place, effectively
>>> introducing a new client-server layer in the Dataset API.
>>
>> So, RadosFormat would ultimately redirect to another dataset format
>> (e.g. ParquetFormat) when it comes to actually understanding the data?
>>
>
> Yes, that is our plan. Since this is going to be done on the storage-,
> server-side, this would be transparent to the client. So our main concern
> is whether this be OK from the design perspective, and could this
> eventually be merged upstream?

Arrow datasets have no notion of client and server, so I'm not sure what
you mean here. Do you simply mean contributing RadosFormat to the Arrow
codebase? I would say that depends on the required dependencies, and
ease of testing (and/or CI) for other developers.

Regards

Antoine.