Yes. This is a pain in the butt.
One thing that might work for you is to take a union of queries over
different wildcards. Here is an example where I have a directory with both
csv and json files.

    select * from (
      select columns[0] as a, columns[1] as b
      from dfs.tdunning.`foo1/*.csv`
    ) union (
      select j.a, j.b
      from dfs.tdunning.`foo1/*.json` j
    );

Note that each record in a csv consists of a single value (called columns)
which is an array, while each record from a json is a structure. I have to
extract the components from each so that both sides produce the same
columns and can be union'ed.
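
One caveat worth sketching: values pulled out of the csv columns array come back as strings, while json values arrive already typed, so the two sides may not line up without casts. The variant below is only a sketch under assumptions: the target types (int for a, varchar for b) are guesses about the data, not something I have checked against these files. It also uses union all, which keeps duplicate rows, where plain union removes them.

```sql
-- Sketch only: the INT and VARCHAR target types are assumptions about the data.
-- Cast both sides to the same types so the two result sets line up.
select cast(columns[0] as int) as a,
       cast(columns[1] as varchar(100)) as b
from dfs.tdunning.`foo1/*.csv`
union all
select cast(j.a as int) as a,
       cast(j.b as varchar(100)) as b
from dfs.tdunning.`foo1/*.json` j;
```

If you want duplicates collapsed, swap union all back to union.
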
On Sat, Oct 17, 2015 at 10:33 AM, Stefán Baxter <[email protected]>
wrote:
> Thanks Abhishek,
>
> I think Drill is still quite far from eliminating ETL, and the list of
> obstacles on the way there seems to be growing. (yeah, disappointment got
> me for a bit)
>
> Regards,
> -Stefan
>