tychoish commented on issue #8345: URL: https://github.com/apache/arrow-datafusion/issues/8345#issuecomment-1870448773
Interesting, and I'm glad this is becoming more relevant!

For my part, I managed to use the StreamingTable implementation to add support (in GlareDB) for reading/writing BSON without needing any of these changes,[^bson-mod] and I'm _quite_ pleased with the results. It isn't a 100% replacement, though the gaps sort of feel (to me!) like cases where the FileFormat/FileType infrastructure is a bit overbuilt, although this might just be application-specific concerns:

- the write path goes through a slightly different code path that's disconnected (though frankly, given atomicity requirements and having multiple files feeding a single table, it feels weirder to combine both of these things), and
- schema inference is a bit fiddly, though I think schema inference is something that should itself be pluggable and maybe separately,

[^bson-mod]: See [here](https://github.com/GlareDB/glaredb/tree/main/crates/datasources/src/bson) for the main code; there's some other plumbing and wiring to connect it in, but it works pretty well.
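
As a rough illustration of why a stream-oriented table fits BSON (this is not taken from the GlareDB code, and the function name `iter_bson_documents` is hypothetical): every BSON document starts with a little-endian int32 giving its total byte length, so a file of concatenated documents can be split lazily, one document at a time, without any file-format machinery. A minimal sketch, assuming a blocking binary stream where `read(n)` only returns fewer than `n` bytes at end of input:

```python
import io
import struct
from typing import BinaryIO, Iterator


def iter_bson_documents(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each raw BSON document from a stream of concatenated documents."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise ValueError("truncated BSON length prefix")
        # The prefix is a little-endian int32 counting the whole document,
        # including these 4 bytes and the trailing NUL terminator.
        (total_len,) = struct.unpack("<i", header)
        if total_len < 5:
            raise ValueError(f"invalid BSON document length: {total_len}")
        body = stream.read(total_len - 4)
        if len(body) != total_len - 4 or body[-1] != 0:
            raise ValueError("truncated or unterminated BSON document")
        yield header + body


# The empty BSON document: length prefix (5) plus the NUL terminator.
EMPTY_DOC = struct.pack("<i", 5) + b"\x00"
# {"a": 1} with an int32 value: type byte 0x10, key "a\0", value, terminator.
ONE_FIELD_DOC = struct.pack("<i", 12) + b"\x10a\x00" + struct.pack("<i", 1) + b"\x00"

if __name__ == "__main__":
    data = io.BytesIO(EMPTY_DOC + ONE_FIELD_DOC)
    for doc in iter_bson_documents(data):
        print(len(doc), doc.hex())
```

Decoding each yielded document into rows, and inferring a schema from them, would sit on top of this and is left out here.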
