Re: Requesting json file with schema

2020-02-23 Thread Paul Rogers
Hi, Sorry for the delay in responding. Thank you for the helpful background information - very helpful indeed. Here are some thoughts about how we could extend Drill to help with your use case. Your challenge seems rather open-ended: you don't know the format of the incoming data and don't

Re: Re: Requesting json file with schema

2020-02-17 Thread userdrill.mail...@laposte.net.INVALID
Hi, For our particular case here, we just want to prepare a dataset (in Parquet format) which will be reused by different users, and we don't know what they will do with it. So we prefer to define the subparts that have differing definitions simply as text, to let the final user do what he wants with the
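
The "keep the varying subparts as text" idea in the preview above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual pipeline and not Drill code: the field names `id` and `payload` and the helper `stringify_subparts` are hypothetical, and the sketch uses only the Python standard library. It leaves the fields that are known to be stable untouched and serializes every other field to a JSON string, so each output record ends up with the same column types.

```python
import json

def stringify_subparts(record, stable_keys):
    """Keep fields named in stable_keys as-is; turn every other field into JSON text.

    Hypothetical helper for illustration only.
    """
    out = {}
    for key, value in record.items():
        if key in stable_keys:
            out[key] = value
        else:
            # sort_keys gives a deterministic text form for dict values
            out[key] = json.dumps(value, sort_keys=True)
    return out

# Two made-up records whose "payload" subpart differs in shape.
records = [
    {"id": 1, "payload": {"a": 1}},
    {"id": 2, "payload": {"a": "x", "b": [1, 2]}},
]

flat = [stringify_subparts(r, {"id"}) for r in records]
```

A downstream user who needs the structure back can call `json.loads` on the text column; the trade-off is that the varying part is opaque to the query engine until it is parsed again.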

Re: Requesting json file with schema

2020-02-14 Thread Paul Rogers
Thanks for the explanation, very helpful. There are two parts to the problem. On the one hand, you want to read an ever-changing set of JSON files. Your example with "c5" is exactly the kind of "cannot predict the future" issue that can trip up Drill (or, I would argue, any tool that tries to

Re: Requesting json file with schema

2020-02-14 Thread userdrill.mail...@laposte.net.INVALID
Hi, Thanks for all the details. Coming back to one use case: the context is the transformation into Parquet of JSON files containing billions of records, where each record has the same global schema but can have some specificities. Simplified example:

Re: Requesting json file with schema

2020-02-05 Thread Paul Rogers
Hi, Welcome to the Drill mailing list. You are right. Drill is a SQL engine. It works best when the JSON input files represent rows and columns. Of course, JSON itself can represent arbitrary data structures: you can use it to serialize any Java structure you want. Relational tables and