This is all great news. Seeing movement, and seeing a higher-level
explanation of what has led to the discrepancies in Avro, is also very
helpful (thanks, Paul).

On Sat, Aug 19, 2017 at 6:29 PM, Stefán Baxter <[email protected]>
wrote:

> Thank you Saurabh,
>
> I was not really expecting a constructive reply to my previous email; it was
> appreciated.
>
> I guess some old frustration got the better of me.
>
> All the best,
>  -Stefan
>
> On Fri, Aug 18, 2017 at 10:55 PM, Saurabh Mahapatra <
> [email protected]> wrote:
>
> > Thank you for this candid feedback, Stefan. The fact that you even
> decided
> > to write an email offering this feedback despite moving away from Drill
> > just suggests to me that you are still a supporter. We need all the help
> > that we can get from every member in this community to make Drill provide
> > value to all users, including you.
> >
> > I am new to the community but I have looked at your emails where your
> past
> > attempts at doing this have not taken you anywhere. We have to change
> that.
> >
> > We cannot undo the past as far as addressing your needs is concerned
> but I
> > want to assure you that we are bringing reform to the community in
> general.
> > The stakeholders who are impacted by Drill have increased beyond the
> small
> > group that existed a couple of years ago. So rest assured that you
> have
> > a voice here.
> >
> > I think the biggest challenge we have in the community is that there are
> > users who could get a lot of value if some work was done to support
> > integrations. I know for sure that there are many developers who would
> love
> > to participate in this community and do the work for a modest fee. It
> helps
> > them get interested in the project, helps them provide support beyond
> just
> > the open source aspect and also helps users such as you to get the value
> > that you need where you need it.
> >
> > Please let me know if you would be willing to pursue that route.
> >
> > On the Avro front, I do hear a lot of users asking for it but I hear a
> lot
> > more requests on Parquet. Plus, there are core issues in Drill that need
> > to be addressed first. The community is definitely trying to prioritize
> > given what we have. But we do not have to feel constrained. We can get
> more
> > developers to participate in this and help out. And I am very positive
> > about that approach: I know that I helped a user here to get help on using
> > Apache Drill inside a commercial setting where their asks were very
> > specific.
> >
> > Those are my thoughts but please do not give up on us. Your critical
> > feedback may not sound nice to the ears but is exactly the kind of
> feedback
> > that will make this project truly successful.
> >
> > Best,
> > Saurabh
> >
> > On Fri, Aug 18, 2017 at 1:42 PM, Stefán Baxter <
> [email protected]>
> > wrote:
> >
> > > Hi John,
> > >
> > > Love Drill but we no longer use it in production as our main query
> tool.
> > >
> > > I do have a fairly long list of pet peeves but I also have a long list
> of
> > > features that I love and would not want to be without.
> > >
> > > In my opinion it's time for Drill to decide where its commitment lies
> > > regarding evolving schema and ETL elimination and if it wants to be
> > > something more than a cog in a Hadoop distribution wheel or an effort
> > > some see as a way to their startup stardom.
> > >
> > > There is no denying the great effect it has had and its usefulness
> (Arrow
> > > also making waves now). I am, as I have been, just frustrated by
> > > shortcomings I feel are not addressed because they are addressed
> > > elsewhere (where the true loyalties lie)
> > >
> > > I can name a few (I have not upgraded to 1.11):
> > >    - Empty values still default to double for partial/segment lists
> which
> > > triggers all sorts of problems  (no attempt is made to convert values
> to
> > > lowest common denominator (string))
> > >    - Two NullableX values both containing nothing (Null) still produce
> > > schema change errors instead of waiting for a type to become apparent
> > >    - Syntax error reporting is terrible
> > >    - Schema change reporting is almost absent
> > >    - Avro schema is fixed/strict even though text formats support
> > > evolving/variable schema (With all sorts of side effects)
> > >    - Avro still does not support dirN
> > >
> > > and so many more things (not to mention the politics and the defensive
> > > attitude when trying to address shortcomings).
> > >
> > > My only regret here is that I never had proper resources to contribute
> a
> > > fix to some of these.
> > >
> > > All the best,
> > >  -Stefán
> > >
> > > On Thu, Aug 17, 2017 at 2:20 PM, Charles Givre <[email protected]>
> wrote:
> > >
> > > > I’m not an Avro user, but I’d definitely vote for improving this.
> > > > — C
> > > >
> > > > > On Aug 17, 2017, at 10:17, John Omernik <[email protected]> wrote:
> > > > >
> > > > > I was guessing you would chime in with a response ;)
> > > > >
> > > > > Are you still using Drill w/ Avro? How have things been lately?
> > > > >
> > > > > On Thu, Aug 17, 2017 at 8:00 AM, Stefán Baxter <
> > > > [email protected]>
> > > > > wrote:
> > > > >
> > > > >> woha!!!
> > > > >>
> > > > >>
> > > > >> (sorry, I just had to)
> > > > >>
> > > > >>
> > > > >> Best of luck with that!
> > > > >>
> > > > >> Regards,
> > > > >> -Stefán
> > > > >>
> > > > >> On Thu, Aug 17, 2017 at 12:37 PM, John Omernik <[email protected]>
> > > > wrote:
> > > > >>
> > > > >>> I know Avro is the unwanted child of the Drill world. (I know
> > others
> > > > have
> > > > >>> tried to mature the Avro support and that has been something that
> > > still
> > > > >> is
> > > > >>> in an "experimental" state.)
> > > > >>>
> > > > >>> That said, isn't it time for us to clean it up?
> > > > >>>
> > > > >>> I am sure there are some open JIRAs out there, (last Doc update
> > on
> > > > the
> > > > >>> Avro Page, Nov 21, 2016) points to this
> > > > >>> https://issues.apache.org/jira/browse/DRILL/component/12328941/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel
> > > > >>>
> > > > >>> And I just ran into an issue... I am going to run it by here to
> see
> > if
> > > > >> it's
> > > > >>> JIRA worthy or known:
> > > > >>>
> > > > >>> I have two directories, one json (brodns) and one avro
> (brodnsavro)
> > > > >>>
> > > > >>> They both have subdirectories that are YYYY-MM-DD dates.
> > > > >>>
> > > > >>> When I run
> > > > >>>
> > > > >>> select dir0, count(*) from `brodns` group by dir0  - This works
> > > great!
> > > > >>>
> > > > >>> when I run
> > > > >>>
> > > > >>> select dir0, count(*) from `brodnsavro` group by dir0 - I get:
> > > > >>>
> > > > >>> VALIDATION ERROR: From line 1, column 58 to line 1, column 61:
> > Column
> > > > >>> 'dir0' not found in any table
> > > > >>>
> > > > >>>
> > > > >>> If I run
> > > > >>>
> > > > >>>
> > > > >>> select count(*) from `brodnsavro/2017-08-17` this works
> > > > >>>
> > > > >>> if I run
> > > > >>>
> > > > >>>
> > > > >>> select count(*) from `brodnsavro` this also works
> > > > >>>
> > > > >>>
> > > > >>> But dir0 doesn't appear to be applied to Avro.
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> I really feel this should be consistent (in addition to fixing
> the
> > > > >>> other issues in Avro) and let's make Avro a
> > > > >>>
> > > > >>> first class citizen of the Drill world.
> > > > >>>
> > > > >>>
> > > > >>> (If folks are interested, I'd be happy to discuss my use case, it
> > > > >> involves
> > > > >>>
> > > > >>> applying a schema to json records on kafka/maprstreams in
> > streamsets,
> > > > and
> > > > >>> then
> > > > >>>
> > > > >>> outputting to avro files... from there I hope to convert to
> > parquet,
> > > > but
> > > > >>>
> > > > >>> don't want to use mapreduce, hence drill!
> > > > >>>
> > > > >>> )
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> >
>
