Re: Drill Hangout tomorrow 08/21

Vitalii Diravka Wed, 12 Sep 2018 09:47:43 -0700

Oleksandr,

You weren't able to join this hangout, but you can share your ideas
in a reply to our latest comment on the Drill Metastore [1].
Could you please take a look?


[1]
https://issues.apache.org/jira/browse/DRILL-6552?focusedCommentId=16612437&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16612437

Kind regards
Vitalii


On Wed, Aug 22, 2018 at 8:28 AM Hanumath Rao Maduri <[email protected]>
wrote:

> Hangout attendees on 08/21:
> Pritesh, Salim, Hanumath, Boaz, Robert, Jyothsna, Karthik, Gautam, Vitalii,
> Vova, Parth, Olek
>
> Vitalii and Vova gave a presentation on the Drill Metadata management
> project.
>
> Some of the points raised during the discussion:
> 1) Gautam suggested using native operators for collecting stats instead of
> aggregation operators.
> 2) The metadata API should be made abstract so that the metastore can use
> DFS, the Hive metastore, etc.
> 3) Schema change exceptions can be minimized by the Hive metastore, but not
> eliminated entirely.
> 4) Discussion on how to refresh the metadata.
> 5) Caching the metadata, and discussion of the problems the earlier caching
> solutions had in Drill.
>
>
> Further metadata discussion will be continued in the next hangout.
>
> -Hanu
>
> On Tue, Aug 21, 2018 at 9:53 AM Vitalii Diravka <[email protected]>
> wrote:
>
> > Hi Alex,
> >
> > The issues you pointed out are real, and whether to use HMS is still an
> > open question.
> >
> > The main goal is to create a Drill Metastore API that can be used for
> > different Drill data sources, and then to adapt the current Parquet
> > metadata cache file mechanism to this API.
> > That will be the first implementation. The second one could be HMS.
> > Although it has limitations, it also has benefits: it is easy to leverage
> > in Drill, and a lot of projects already use HMS (Spark, Presto ...),
> > so for some users it can be a good choice for storing metadata.
> >
> > Other implementations of the Drill Metastore could be discussed (MetaCat,
> > WhereHows, a new implementation of our own based on HBase/MapR-DB).
> >
> >
> > Kind regards
> > Vitalii
> >
> >
> > On Tue, Aug 21, 2018 at 7:04 PM Oleksandr Kalinin <[email protected]>
> > wrote:
> >
> > > Hi Volodymyr,
> > >
> > > Recalling recent discussions on the DEV list, it would be interesting
> > > to see whether the following topics are addressed in the Drill metadata
> > > management initiative:
> > >
> > > 1. Avoiding repetition of Hive mistakes (mainly relying on RDBMS)
> > > To substantiate this point of view from practical experience, and
> > > reflecting on the ambition to integrate and operate Drill in
> > > mission-critical environments, the following aspects could be listed:
> > >   - Need for DBA support if the cluster is subject to service level
> > > objectives/agreements, which is somewhat remote from the Hadoop world.
> > > Need for strong DBA skills if the resulting DB workload is challenging
> > > in terms of performance tuning.
> > >   - Common RDBMS setups offer an active-standby HA model. In secure
> > > environments, e.g. environments which are subject to PCI-DSS
> > > compliance, that implies frequent OS patching and reboots (in reality
> > > every 30 days max), thus causing additional coordination effort and a
> > > service outage for the duration of the failovers.
> > >   - Active-active HA clusters like Galera / Percona are free of the
> > > above disadvantage, but require a specific skill set which is not
> > > widespread in the DBA community. Also they are sensitive to even disk
> > > IO performance across the cluster, which may require additional
> > > hardware adjustment and IO isolation.
> > >   - Need for a backup / restore mechanism, which is probably the lesser
> > > of the concerns
> > > 2. Bottleneck in the foreman when performing initial metadata
> > > collection (and eventually pruning) on a large number of Parquet files
> > >   - From the discussion on the mailing list it was not fully clear
> > > whether the metastore will address it
> > >   - Or, from your point of view, should this discussion be continued
> > > outside of the metastore initiative?
> > >
> > > I hope it would be OK with you and Vitalii to share some thoughts on
> > > this.
> > >
> > > Thanks & Best Regards,
> > > Alex
> > >
> > > On Mon, Aug 20, 2018 at 10:50 PM Volodymyr Vysotskyi <[email protected]>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > Vitalii Diravka and I would like to give a presentation of our ideas
> > > > for the Drill Metadata management project (DRILL-6552
> > > > <https://issues.apache.org/jira/browse/DRILL-6552>).
> > > >
> > > > We will be happy to discuss them and choose the right direction for
> > > > further development.
> > > >
> > > > Kind regards,
> > > > Volodymyr Vysotskyi
> > > >
> > > >
> > > > On Mon, Aug 20, 2018 at 10:35 PM Hanumath Rao Maduri <[email protected]>
> > > > wrote:
> > > >
> > > > > The Apache Drill Hangout will be held tomorrow at 10:00am PST;
> > > > > please let us know should you have a topic for tomorrow's hangout.
> > > > > We will also ask for topics at the beginning of the hangout.
> > > > >
> > > > > Hangout Link -
> > > > > https://hangouts.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc
> > > > >
> > > > > Regards,
> > > > > Hanu
> > > > >
> > > >
> > >
> >
>
