Hello,

you can also use pyarrow, the Python wrapper, to create nested/JSON-like structures directly in Python, for example with `pyarrow.array([[1, 2], [1], None, [12, 23, 23]])`.
Cheers
Uwe

On Tue, Jun 12, 2018, at 4:45 PM, Lee, David wrote:
> Python supports tabular structures using pyarrow.
>
> https://arrow.apache.org/docs/python/generated/pyarrow.schema.html
>
> For nested structures like JSON you have to use C++ (parquet-cpp)
>
> https://github.com/apache/parquet-cpp
>
> We need more APIs developed to create nested JSON..
>
> -----Original Message-----
> From: Divya Gehlot [mailto:[email protected]]
> Sent: Tuesday, June 12, 2018 5:25 AM
> To: [email protected]
> Subject: Re: Which perform better JSON or convert JSON to parquet format ?
>
> Hi David,
> How to create the schema first using parquet library ?
> Can you please give an example?
>
> Thanks,
> Divya
>
> On Tue, 12 Jun 2018 at 00:03, Lee, David <[email protected]> wrote:
>
> > Parquet is faster especially if you are only looking for a subset of
> > json objects. Every JSON key / array is treated as a column.
> >
> > With that said creating parquet from JSON is not bullet proof if you
> > have really complex json which may have NULL values or many optional
> > keys (Drill can't figure out what data type a NULL JSON value is and
> > has trouble merging optional keys after sampling the first 20,000?
> > records)
> >
> > If you are creating parquet you should be using the parquet libraries
> > to define a consistent schema first. I've pretty much given up trying
> > to create parquet from json which always ends in index out of bound
> > (server crashing) errors when trying to query parquet.
> >
> > -----Original Message-----
> > From: Ted Dunning [mailto:[email protected]]
> > Sent: Monday, June 11, 2018 4:47 AM
> > To: user <[email protected]>
> > Subject: Re: Which perform better JSON or convert JSON to parquet format ?
> >
> > Yes. Drill is good at JSON.
> >
> > But Parquet will be faster during a scan.
> >
> > Faster may be better. Or other things may be more important.
> >
> > You have to decide what is important to you. The great virtue of drill
> > is that you have the choice.
> >
> > On Mon, Jun 11, 2018 at 11:06 AM Divya Gehlot <[email protected]>
> > wrote:
> >
> > > Thanks to all for your opinions !
> > > As Drill has been popularised as complex JSON reader as compare to
> > > other tools in space .
> > > Was wondering does drill works better for JSON rather than parquet.
