I know Avro is the unwanted child of the Drill world. (I know others have
tried to mature the Avro support, and that it is still in an
"experimental" state.)

That said, isn't it time for us to clean it up?

I am sure there are some open JIRAs out there; the Avro doc page (last
updated Nov 21, 2016) points to this:
https://issues.apache.org/jira/browse/DRILL/component/12328941/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel

And I just ran into an issue... I am going to run it by here to see if it's
JIRA-worthy or already known:

I have two directories, one json (brodns) and one avro (brodnsavro)

They both have subdirectories that are YYYY-MM-DD dates.

When I run

select dir0, count(*) from `brodns` group by dir0  - This works great!

when I run

select dir0, count(*) from `brodnsavro` group by dir0 - I get:

VALIDATION ERROR: From line 1, column 58 to line 1, column 61: Column
'dir0' not found in any table


If I run

select count(*) from `brodnsavro/2017-08-17` - this works

If I run

select count(*) from `brodnsavro` - this also works


But dir0 doesn't appear to be applied to Avro.



I really feel this should be consistent (in addition to fixing the
other issues in Avro), and let's make Avro a first-class citizen of the
Drill world.


(If folks are interested, I'd be happy to discuss my use case: it involves
applying a schema to JSON records on Kafka/MapR Streams in StreamSets, and
then outputting to Avro files... from there I hope to convert to Parquet,
but don't want to use MapReduce, hence Drill!)
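
For reference, an Avro-to-Parquet conversion of that kind is typically done in Drill with a CTAS. This is only a minimal sketch: it assumes a writable workspace (dfs.tmp here), and the target table name is a hypothetical placeholder, not something from my setup. The source path is the per-day directory from the queries above.

```sql
-- Write CTAS output as Parquet (parquet is also Drill's default store.format).
ALTER SESSION SET `store.format` = 'parquet';

-- Hypothetical target: dfs.tmp is assumed to be writable, and the table
-- name is a placeholder chosen for illustration.
CREATE TABLE dfs.tmp.`brodns_parquet_2017_08_17` AS
SELECT * FROM `brodnsavro/2017-08-17`;
```

Doing this one date directory at a time sidesteps the missing dir0 column, at the cost of running one statement per day.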
