Hi Eli,

I'm wondering what kind of API you would want if the perfect one existed. If I understand correctly, you are embedding objects in a BYTE_ARRAY column in Parquet and need to do some post-processing as the data goes into and comes out of Parquet?

Thanks,
Wes

On Sat, Jan 6, 2018 at 8:37 AM, Eli <h5r...@protonmail.ch> wrote:
> Hi,
>
> I'm looking to send "regular" columnar binary data to a database, the kind
> that gets created by struct.pack, array.array, numpy.tobytes or str.encode.
>
> The origin is parquet files, which I'm reading ever so comfortably via
> PyArrow.
>
> I do however need to deserialize to Python objects, currently via
> to_pandas(), then re-serialize the columns with one of the above.
>
> I was wondering whether there was a better way to go about it, one which
> would be as fast and efficient as possible.
>
> Ideally I'd like to go through Python, but I can do C or even some C++ if
> necessary.
>
> I posted the question on stackoverflow, and was asked to post here.
> Appreciate any feedback!
>
> Thanks,
> Eli
>
> Sent with [ProtonMail](https://protonmail.com) Secure Email.
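
To make the target format in the quoted question concrete, here is a minimal sketch of the "regular" columnar binary data it mentions, using only the standard library. The column values, the variable names, and the 4-byte length prefix for strings are assumptions made for illustration; none of them come from the thread, and the database in question may expect a different layout.

```python
import array
import struct

# Hypothetical column values, standing in for columns read from a Parquet file.
prices = [1.5, 2.25, 3.0]
names = ["a", "bc", "def"]

# Fixed-width column: one contiguous buffer of native-endian float64 values.
# array.array and struct.pack produce the same bytes for this layout.
packed_with_array = array.array("d", prices).tobytes()
packed_with_struct = struct.pack(f"={len(prices)}d", *prices)

# Variable-width column: each value encoded as UTF-8 via str.encode.
# The 4-byte unsigned length prefix is an assumed framing, not a standard.
encoded_names = b"".join(
    struct.pack("=I", len(raw)) + raw
    for raw in (name.encode("utf-8") for name in names)
)
```

Both fixed-width variants first materialize Python objects (a list of floats) and then serialize them again, which is the round trip the question hopes to avoid.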