If someone wants to build a plugin model for datasets / file formats (and refactor the existing "built-in" formats to use those plugin APIs), that sounds like a fine idea to me. I don't think the intent was for the API to be restricted to the formats that are implemented inside the Arrow codebase.
On Thu, Jul 29, 2021 at 4:09 PM Weston Pace <weston.p...@gmail.com> wrote:
>
> In reviewing the RADOS PR I ran into another question. I recently
> sent an email on the topic where the author wants their integration to
> be part of the Arrow repo (I believe this is the case for the RADOS
> PR). However, what about the case where the author doesn't want to be
> part of the GitHub repo? (So, to be clear, this email is not relevant
> for the RADOS PR.)
>
> Right now, in order to add a new file format to the dataset API the
> author has to add code to the Arrow codebase to create a new
> FileFormat or Fragment. Do we want to make the datasets API a
> "plugin" architecture to allow new formats to be added dynamically
> in the future?
>
> Of course, now that I'm writing the email, I suppose the answer is
> clear. If someone cares enough about having an external extension
> they can always do the work to add such a plugin system. Does this
> sound right, or is there some other reason against this, or a different
> approach we'd want to take in the future?
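
For illustration only, here is a minimal sketch of the kind of plugin registry being discussed, written in Python rather than Arrow's C++ to keep it short. Every name in it (`FileFormat`, `register_format`, `get_format`, `CsvFormat`, `JsonLinesFormat`) is hypothetical and is not part of the Arrow datasets API. The point it tries to show is that a "built-in" format and an externally supplied one could go through the same registration call.

```python
import csv
import io
import json
from abc import ABC, abstractmethod
from typing import Dict, List


class FileFormat(ABC):
    """Hypothetical plugin interface; NOT the Arrow FileFormat class."""

    name: str  # unique key used to look the format up

    @abstractmethod
    def read(self, text: str) -> List[dict]:
        """Parse the given text into a list of row dictionaries."""


# Hypothetical process-wide registry mapping format name -> implementation.
_REGISTRY: Dict[str, FileFormat] = {}


def register_format(fmt: FileFormat) -> None:
    """Register a format; refuse to silently overwrite an existing one."""
    if fmt.name in _REGISTRY:
        raise ValueError(f"format already registered: {fmt.name!r}")
    _REGISTRY[fmt.name] = fmt


def get_format(name: str) -> FileFormat:
    """Look up a previously registered format by name."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown format: {name!r}") from None


class CsvFormat(FileFormat):
    """Stands in for a 'built-in' format refactored onto the plugin API."""

    name = "csv"

    def read(self, text: str) -> List[dict]:
        return list(csv.DictReader(io.StringIO(text)))


class JsonLinesFormat(FileFormat):
    """Stands in for a format supplied from outside the main codebase."""

    name = "jsonl"

    def read(self, text: str) -> List[dict]:
        return [json.loads(line) for line in text.splitlines() if line.strip()]


# Built-in and external formats use the same registration path.
register_format(CsvFormat())
register_format(JsonLinesFormat())

rows = get_format("jsonl").read('{"a": 1}\n{"a": 2}\n')
```

In a real system the external format would presumably live in a separate package or shared library and be discovered at load time; how that discovery would work for Arrow is left open here.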