Charles, I'm sure we'll have a link for remote folks to join - will share it closer to the day.
On Thu, Nov 1, 2018 at 1:58 PM hanu mapr <[email protected]> wrote:

> Hello All,
>
> There was typo for the year in the mail. It should be 2018 instead of 2019.
> Thanks Aman for correcting it.
>
> Regards,
> -Hanu
>
> On Thu, Nov 1, 2018 at 6:30 AM Charles Givre <[email protected]> wrote:
>
> > Hi Hanumath,
> > This looks great!! Will you be streaming the event for those of us not in
> > the Bay Area?
> > Thx,
> > — C
> >
> > > On Nov 1, 2018, at 00:10, Hanumath Rao Maduri <[email protected]>
> > > wrote:
> > >
> > > Drill Developers,
> > >
> > > I am quite excited to announce the details of the Drill developers day
> > > 2018. I have consolidated the topics from our earlier discussions and
> > > prioritized them according to the votes.
> > >
> > > MapR has offered to host it on Nov 14th in Training room downstairs.
> > >
> > > Here is the exact location
> > >
> > > Training Room at
> > >
> > > 4555 Great America Pkwy, Suite 201, Santa Clara, CA, 95054.
> > >
> > > Please find the agenda for the meetup.
> > >
> > > *Lunch starts at 12:00PM.*
> > >
> > > *[12:25 - 12:40] Welcome*
> > >
> > > - Recap on last year's activities
> > > - Preview of this year's focus
> > >
> > > *[12:40 - 1:00] Storage plugins*
> > >
> > > - Adding new storage plugins for the following:
> > >   - Netflix Iceberg, Kudu(some code already exists), Cassandra,
> > >     Elasticsearch, Carbondata, ORC/XML file formats, Spark
> > >     RDD/DataFrames/Datasets, Graph databases & more
> > > - Improving documentation related to Storage plugins
> > >
> > > *[1:00 - 1:45] Schema discovery & Evolution*
> > >
> > > - Creation, management of schema
> > > - Handling schema changes in certain common cases
> > > - Handling NULL values elegantly
> > > - Schema learning (similar to MSGpack plugin)
> > > - Query hints
> > >
> > > *[1:45 - 2:30] Metadata Management*
> > >
> > > - Defining an abstraction layer for various types of metadata: views,
> > >   schema, statistics, security
> > > - Underlying storage for metadata: what are the options and their
> > >   trade-offs?
> > >   - Hive metastore
> > >   - Parquet metadata cache (parquet specific for row group metadata)
> > > - Ease of using the parquet files generated by other engines (like
> > >   spark)
> > >
> > > *[2:30 - 2:45] Break*
> > >
> > > *[2:45 - 4:00] Resource management*
> > >
> > > - Resource limits per query
> > > - Optimal memory assignment for blocking operators based on stats
> > > - Enhancing the blocking and exchange operators to live within memory
> > >   limits
> > > - Aligning with admission control/queueing (YARN concepts)
> > > - Query scheduling based on queues using tagging and costing
> > > - Drill on kubernetes
> > >
> > > *[4:00 - 4:20] Apache Arrow*
> > >
> > > - Benefits of integrating Apache Drill with Apache Arrow
> > > - Possible trade-offs & implementation hurdles
> > >
> > > *[4:20 - 4:40] Performance Improvements*
> > >
> > > - Efficient handling of Broadcast/Semi/Anti Semi join
> > > - Drill Statistics handling
> > > - Optimizing complex Parquet reader
> > >
> > > Thanks,
> > > -Hanu
