Hi John,

Love Drill but we no longer use it in production as our main query tool.

I do have a fairly long list of pet peeves but I also have a long list of
features that I love and would not want to be without.

In my opinion it's time for Drill to decide where its commitment lies
regarding evolving schema and ETL elimination, and whether it wants to be
something more than a cog in a Hadoop distribution wheel or an effort
some see as a way to their startup stardom.

There is no denying the great effect it has had and its usefulness (Arrow
is also making waves now). I am, as I have been, just frustrated by
shortcomings I feel are not addressed because they are addressed elsewhere
(where the true loyalties lie).

I can name a few (I have not upgraded to 1.11):
   - Empty values still default to double for partial/segment lists, which
triggers all sorts of problems (no attempt is made to convert values to the
lowest common denominator (string))
   - Two NullableX values both containing nothing (Null) still produce
schema change errors instead of waiting for a type to become apparent
   - Syntax error reporting is terrible
   - Schema change reporting is almost absent
   - Avro schema is fixed/strict even though text formats support
evolving/variable schema (With all sorts of side effects)
   - Avro still does not support dirN

and so many more things (not to mention the politics and the defensive
attitude when trying to address shortcomings).

My only regret here is that I never had proper resources to contribute a
fix to some of these.

All the best,
 -Stefán

On Thu, Aug 17, 2017 at 2:20 PM, Charles Givre <cgi...@gmail.com> wrote:

> I’m not an Avro user, but I’d definitely vote for improving this.
> — C
>
> > On Aug 17, 2017, at 10:17, John Omernik <j...@omernik.com> wrote:
> >
> > I was guessing you would chime in with a response ;)
> >
> > Are you still using Drill w/ Avro? How have things been lately?
> >
> > On Thu, Aug 17, 2017 at 8:00 AM, Stefán Baxter <ste...@activitystream.com>
> > wrote:
> >
> >> woha!!!
> >>
> >>
> >> (sorry, I just had to)
> >>
> >>
> >> Best of luck with that!
> >>
> >> Regards,
> >> -Stefán
> >>
> >> On Thu, Aug 17, 2017 at 12:37 PM, John Omernik <j...@omernik.com> wrote:
> >>
> >>> I know Avro is the unwanted child of the Drill world. (I know others have
> >>> tried to mature the Avro support, and it is still in an "experimental"
> >>> state.)
> >>>
> >>> That said, isn't it time for us to clean it up?
> >>>
> >>> I am sure there are some open JIRAs out there (the last doc update on the
> >>> Avro page, Nov 21, 2016, points to this):
> >>> https://issues.apache.org/jira/browse/DRILL/component/12328941/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel
> >>>
> >>> And I just ran into an issue... I am going to run it by here to see if it's
> >>> JIRA-worthy or known:
> >>>
> >>> I have two directories, one json (brodns) and one avro (brodnsavro)
> >>>
> >>> They both have subdirectories that are YYYY-MM-DD dates.
> >>>
> >>> When I run
> >>>
> >>> select dir0, count(*) from `brodns` group by dir0  - This works great!
> >>>
> >>> when I run
> >>>
> >>> select dir0, count(*) from `brodnsavro` group by dir0 - I get:
> >>>
> >>> VALIDATION ERROR: From line 1, column 58 to line 1, column 61: Column
> >>> 'dir0' not found in any table
> >>>
> >>>
> >>> If I run
> >>>
> >>>
> >>> select count(*) from `brodnsavro/2017-08-17` this works
> >>>
> >>> if I run
> >>>
> >>>
> >>> select count(*) from `brodnsavro` this also works
> >>>
> >>>
> >>> But dir0 doesn't appear to be applied to Avro.
> >>>
> >>>
> >>>
> >>> I really feel this should be consistent (in addition to fixing the
> >>> other issues in Avro), and let's make Avro a first-class citizen of the
> >>> Drill world.
> >>>
> >>>
> >>> (If folks are interested, I'd be happy to discuss my use case. It involves
> >>> applying a schema to json records on kafka/maprstreams in streamsets, and
> >>> then outputting to avro files... from there I hope to convert to parquet,
> >>> but don't want to use mapreduce, hence drill!)
> >>>
> >>
>
>
