Oleksandr,

You couldn't connect to this hangout meeting, but you can share your ideas in a reply to our last comment regarding Drill Metastore [1]. Could you please take a look?

[1] https://issues.apache.org/jira/browse/DRILL-6552?focusedCommentId=16612437&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16612437

Kind regards
Vitalii

On Wed, Aug 22, 2018 at 8:28 AM Hanumath Rao Maduri <[email protected]> wrote:

> Hangout attendees on 08/21:
> Pritesh, Salim, Hanumath, Boaz, Robert, Jyothsna, Karthik, Gautam, Vitalii,
> Vova, Parth, Olek
>
> Vitalii and Vova gave a presentation on the Drill Metadata management project.
>
> Some of the questions raised during the discussion:
> 1) Gautam suggested using native operators for collecting stats instead of
> aggregation operators.
> 2) The metadata API should be made abstract so that the metastore can use
> DFS, Hive Metastore, etc.
> 3) Schema change exceptions can be minimized by Hive Metastore but not
> totally overcome.
> 4) Discussion on how to refresh the metadata.
> 5) Caching the metadata, and discussion of the problems the earlier caching
> solutions had in Drill.
>
> Further metadata discussion will be continued in the next hangout.
>
> -Hanu
>
> On Tue, Aug 21, 2018 at 9:53 AM Vitalii Diravka <[email protected]> wrote:
>
> > Hi Alex,
> >
> > The issues you pointed out really exist, and the use of HMS is still an
> > open question.
> >
> > The main goal is to make a Drill Metastore API that can be used for
> > different Drill data sources, and then to adapt the current Parquet
> > metadata cache files mechanism to this API.
> > That will be the first implementation. The second one could be HMS.
> > Although it has limitations, it also has benefits: it is easy to leverage
> > in Drill, and a lot of projects already use HMS (Spark, Presto ...),
> > so for some users it can be a good choice for storing metadata.
> >
> > Other implementations for Drill Metastore could be discussed (MetaCat,
> > WhereHows, a new implementation of our own based on HBase/MapR-DB).
> >
> > Kind regards
> > Vitalii
> >
> > On Tue, Aug 21, 2018 at 7:04 PM Oleksandr Kalinin <[email protected]> wrote:
> >
> > > Hi Volodymyr,
> > >
> > > Recalling recent discussions on the DEV list, it would be interesting
> > > to see whether the following topics are addressed in the Drill metadata
> > > management initiative:
> > >
> > > 1. Avoiding repetition of Hive mistakes (mainly relying on RDBMS)
> > > Just to substantiate this point of view from practical experience, and if
> > > we reflect on the ambition to integrate and operate Drill in
> > > mission-critical environments, the following aspects could be listed:
> > > - Need for DBA support if the cluster is subject to service level
> > > objectives/agreements, which is somewhat remote from the Hadoop world.
> > > Need for strong DBA skills if the resulting DB workload is challenging
> > > in terms of performance tuning.
> > > - Common RDBMS setups offer an active-standby HA model. In secure
> > > environments, e.g. environments subject to PCI-DSS compliance, that
> > > implies frequent OS patching and reboots (in reality every 30 days at
> > > most), thus causing additional coordination effort and a service outage
> > > for the duration of the failovers.
> > > - Active-active HA clusters like Galera / Percona are free of the above
> > > disadvantage, but require a specific skill set which is not widespread
> > > in the DBA community. They are also sensitive to even disk IO
> > > performance across the cluster, which may require additional hardware
> > > adjustment and IO isolation.
> > > - Need for a backup / restore mechanism, which is probably the lesser
> > > concern
> > >
> > > 2. Bottleneck in the foreman when performing initial metadata collection
> > > (and eventually pruning) on a large number of Parquet files
> > > - From the discussion on the mailing list it was not fully clear whether
> > > the metastore will address it
> > > - Or should this discussion be continued outside of the metastore
> > > initiative, from your point of view?
> > >
> > > I hope it would be OK with you and Vitalii to share some thoughts on
> > > this.
> > >
> > > Thanks & Best Regards,
> > > Alex
> > >
> > > On Mon, Aug 20, 2018 at 10:50 PM Volodymyr Vysotskyi <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Vitalii Diravka and I would like to give a presentation of our ideas
> > > > for the Drill Metadata management project (DRILL-6552
> > > > <https://issues.apache.org/jira/browse/DRILL-6552>).
> > > >
> > > > We will be happy to discuss it and choose the right way for further
> > > > development.
> > > >
> > > > Kind regards,
> > > > Volodymyr Vysotskyi
> > > >
> > > > On Mon, Aug 20, 2018 at 10:35 PM Hanumath Rao Maduri <[email protected]> wrote:
> > > >
> > > > > The Apache Drill Hangout will be held tomorrow at 10:00am PST; please
> > > > > let us know should you have a topic for tomorrow's hangout. We will
> > > > > also ask for topics at the beginning of the hangout.
> > > > >
> > > > > Hangout Link -
> > > > > https://hangouts.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc
> > > > >
> > > > > Regards,
> > > > > Hanu
