Thank you, Sharad. So could I use this system for remote sensing data, such as three-dimensional (time, space, and measurement) cubes? Does it support numerical data well?
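To make the question concrete, here is a rough sketch of the kind of data I have in mind. It is plain Python, not Grill or Hive code, and the field names (`day`, `tile`, `brightness_temp_k`), the helper `mean_by`, and the values are all made up for illustration: each observation is a fact with two dimensions (time and space) and one numeric measure, and the queries I care about are roll-ups of that measure along one or both dimensions.

```python
from collections import defaultdict

# Hypothetical remote-sensing observations: each row is one "fact" with
# two dimensions (time, space) and one numeric measure. All names and
# values here are illustrative assumptions, not real data or a Grill API.
observations = [
    {"day": "2014-09-18", "tile": "N34W118", "brightness_temp_k": 290.5},
    {"day": "2014-09-18", "tile": "N34W118", "brightness_temp_k": 291.5},
    {"day": "2014-09-18", "tile": "N35W118", "brightness_temp_k": 288.0},
    {"day": "2014-09-19", "tile": "N34W118", "brightness_temp_k": 292.0},
]


def mean_by(rows, dims, measure):
    """Average `measure` over every combination of the given dimensions."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dims)
        sums[key] += row[measure]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Roll up along space: one mean value per day.
daily_mean = mean_by(observations, ["day"], "brightness_temp_k")

# Keep both dimensions: one mean value per (day, tile) cell of the cube.
cell_mean = mean_by(observations, ["day", "tile"], "brightness_temp_k")
```

In SQL terms I would expect this to look like a GROUP BY over the time and space dimensions with an aggregate on the measurement, which is why I am asking how well numerical measures are supported.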
Sorry for so many questions, just excited :)

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Sharad Agarwal <sha...@apache.org>
Reply-To: "sha...@apache.org" <sha...@apache.org>
Date: Friday, September 19, 2014 4:06 AM
To: Chris Mattmann <chris.a.mattm...@jpl.nasa.gov>
Cc: "general@incubator.apache.org" <general@incubator.apache.org>
Subject: Re: [PROPOSAL] Grill as new Incubator project

>Chris, Thanks for your comments.
>
>The differences that I see are:
>- SciDB exposes an Array Data model and Array Query Language (AQL). Grill's
>data model is based on OLAP Facts and Dimensions. Grill exposes a SQL-like
>language (a subset of Hive QL) that works on *logical* entities (facts,
>dimensions).
>
>- The goal of Grill is not to build a new query execution database, but
>to unify existing ones by having a central metadata catalog, and to
>provide a Cube abstraction layer on top of it.
>
>Thanks,
>Sharad
>
>
>On Fri, Sep 19, 2014 at 9:34 AM, Mattmann, Chris A (3980)
><chris.a.mattm...@jpl.nasa.gov> wrote:
>
>This sounds super cool!
>
>How does this relate to SciDB? Is it trying to do a similar thing?
>
>Cheers,
>Chris
>
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Chief Architect
>Instrument Software and Science Data Systems Section (398)
>NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 168-519, Mailstop: 168-527
>Email: chris.a.mattm...@nasa.gov
>WWW: http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Associate Professor, Computer Science Department
>University of Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>-----Original Message-----
>From: Sharad Agarwal <sha...@apache.org>
>Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>,
>"sha...@apache.org" <sha...@apache.org>
>Date: Thursday, September 18, 2014 8:54 PM
>To: "general@incubator.apache.org" <general@incubator.apache.org>
>Subject: [PROPOSAL] Grill as new Incubator project
>
>>Grill Proposal
>>==========
>>
>># Abstract
>>
>>Grill is a platform that enables multi-dimensional queries in a unified
>>way over datasets stored in multiple warehouses. Grill integrates Apache
>>Hive with other data warehouses by tiering them together to form logical
>>data cubes.
>>
>>
>># Proposal
>>
>>Grill provides a unified Cube abstraction for data stored in different
>>stores. Grill tiers multiple data warehouses for unified representation
>>and efficient access. It provides a SQL-like Cube query language to query
>>and describe data sets organized in data cubes. It enables users to run
>>queries against Facts and Dimensions that can span multiple physical
>>tables stored in different stores.
>>
>>The primary use cases that Grill aims to solve:
>>- Facilitate analytical queries by providing an OLAP-like Cube
>>abstraction
>>- Data discovery by providing a single metadata layer for data stored in
>>different stores
>>- Unified access to data by integrating Hive with other traditional data
>>warehouses
>>
>>
>># Background
>>
>>Apache Hive is a data warehouse that facilitates querying and managing
>>large datasets stored in distributed storage systems like HDFS. It
>>provides a SQL-like language called HiveQL, also known as HQL. Apache Hive
>>is a widely used platform in various organizations for ad hoc analytical
>>queries.
>>In a typical data warehouse scenario, the data is multi-dimensional and
>>organized into Facts and Dimensions to form Data Cubes. Grill provides
>>this logical layer to enable querying and managing data as Cubes.
>>The Grill project is actively being developed at InMobi to provide a
>>higher level of analytical abstraction for seamlessly querying data
>>stored in different stores, including Hive and beyond.
>>
>>
>># Rationale
>>
>>The Grill project aims to ease analytical querying and cut data silos by
>>providing a single view of data across multiple data stores.
>>Conceiving data as a cube with hierarchical dimensions leads to
>>conceptually straightforward operations to facilitate analysis.
>>Integrating Apache Hive with other traditional warehouses provides the
>>opportunity to optimize query execution cost by tiering the data across
>>multiple warehouses. Grill provides:
>>- Access to data Cubes via a Cube Query language similar to HiveQL.
>>- A driver-based architecture that allows plugging in systems like Hive
>>and other warehouses such as columnar RDBMSs.
>>- Cost-based engine selection that provides optimal use of resources by
>>selecting the best execution engine for a given query.
>>
>>In a typical data warehouse, data is organized in Cubes with multiple
>>dimensions and measures.
>>This facilitates analysis by conceiving the data in terms of Facts and
>>Dimensions instead of physical tables. Grill aims to provide this logical
>>Cube abstraction on data warehouses like Hive and other traditional
>>warehouses.
>>
>>
>># Initial Goals
>>
>>- Donate the Grill source code and documentation to the Apache Software
>>Foundation
>>- Build a user and developer community
>>- Support Hive and other columnar data warehouses
>>- Support full query life cycle management
>>- Add authentication for querying cubes
>>- Provide detailed query statistics
>>
>>
>># Long Term Goals
>>
>>Here are some longer-term capabilities that would be added to Grill:
>>- Add authorization for managing and querying Cubes
>>- Provide REST and CLI for full Admin controls
>>- Capability to schedule queries
>>- Query caching
>>- Integrate with Apache Spark: create Spark RDDs from Grill queries
>>- Integrate with Apache Optiq
>>
>>
>># Current Status
>>
>>The project is actively developed at InMobi. The first version was
>>deployed at InMobi four months ago. This version allows querying
>>dimension and fact data stored in Hive over CLI. The source code and
>>documentation are hosted at GitHub.
>>
>>## Meritocracy
>>
>>We intend to build a diverse developer and user community for the project
>>following the Apache meritocracy model. We want to encourage contributors
>>from multiple organizations, provide plenty of support to new developers
>>and welcome them to be committers.
>>
>>## Community
>>
>>Currently the project is being developed at InMobi. We hope to extend our
>>contributor and user base significantly in the future and build a solid
>>open source community around Grill.
>>
>>## Core Developers
>>
>>Grill is currently being developed by Amareshwari Sriramadasu, Sharad
>>Agarwal and Jaideep Dhok from InMobi, and Sreekanth Ramakrishnan, who is
>>currently employed by SoftwareAG. Raghavendra Singh from InMobi has built
>>the QA automation for Grill.
>>
>>## Alignment
>>
>>The ASF is a natural home for Grill, as it is for Apache Hadoop, Apache
>>Hive, Apache Spark and other emerging projects in the Big Data space.
>>We believe that in any enterprise, multiple data warehouses will co-exist,
>>as not all workloads are cost effective to run on a single one. Apache
>>Hive is one of the crucial data warehouses, along with upcoming projects
>>like Apache Spark, in the Hadoop ecosystem. Grill will benefit from
>>working in close proximity with these projects.
>>The traditional columnar data warehouses complement Apache Hive, as
>>certain workloads continue to be cost effective to run in traditional
>>columnar data warehouses. Having multiple data warehouses leads to data
>>silos that Grill aims to cut within the enterprise, providing holistic,
>>unified access to data.
>>
>>
>># Known Risks
>>
>>## Orphaned products & Reliance on Salaried Developers
>>
>>There is little risk of Grill getting orphaned, as Grill is a key part of
>>the Data Platform stack at InMobi. The core Grill developers plan to work
>>on it full-time. We think Grill will bring value in the Big Data space and
>>we plan to grow the community of users and contributors.
>>
>>## Inexperience with Open Source
>>
>>All the core developers have long and significant experience in Apache
>>projects and the Hadoop ecosystem. Amareshwari Sriramadasu has
>>long-standing contributions to Apache Hadoop MapReduce and Apache Hive;
>>she is a PMC member of Hadoop and a committer of Hive. Sharad Agarwal is
>>a PMC member of Hadoop and has contributed to Hadoop YARN and Hadoop
>>MapReduce. Srikanth Sundarrajan is a PMC member of Apache Falcon.
>>Sreekanth Ramakrishnan is a committer of Apache Hadoop. Jaideep Dhok has
>>contributed patches to Apache Hive. Gunther is a PMC member of Apache
>>Hive. Vikram is a committer of Apache Hive.
>>
>>## Homogeneous Developers
>>
>>The initial developers are employed by Hortonworks, InMobi and
>>SoftwareAG.
>>We are committed to recruiting additional committers from other companies
>>based on their contribution to the project.
>>
>>## Reliance on Salaried Developers
>>
>>The majority of initial committers are paid by their employer to
>>contribute to the project and a few are contributing in their spare time.
>>Once the project has built a community, we are committed to recruiting
>>committers and developers from outside the current core developers.
>>
>>## Relationships with Other Apache Products
>>
>>Grill is deeply integrated with other Apache projects. Grill uses and
>>extends Apache Hive HCatalog to store and manage the Data cubes. It uses
>>HDFS and Hive session management libraries. Grill has a driver-based
>>architecture that allows for adding multiple execution drivers. Apart
>>from integrating Apache Hive, it can be integrated with Apache Spark over
>>Spark SQL or Shark, Apache Drill, Apache Tajo and Apache Phoenix.
>>In the future we want to use Apache Optiq in Grill for query optimization
>>and cost-based driver selection.
>>
>>## An Excessive Fascination with the Apache Brand
>>
>>The project was conceived from the beginning to be in line with the
>>Apache philosophy. As the core developers have good experience with
>>Apache, the source code organization, build, review and commit process
>>are highly influenced by Apache. We believe that Apache will be a solid
>>home for Grill to grow and build the open source community. We have also
>>described the reasons in the Rationale and Alignment sections.
>>
>>
>># Documentation
>>
>>http://inmobi.github.io/grill/
>>
>>
>># Initial Source
>>
>>The source is currently in a GitHub repository at:
>>https://github.com/inmobi/grill
>>
>>
>># Source and Intellectual Property Submission Plan
>>
>>The complete Grill code is already under the Apache License, Version 2.0.
>>
>>
>># External Dependencies
>>
>>The dependencies all have Apache compatible licenses.
>>These include Apache 2.0, BSD, MIT, EPL and CDDL licensed dependencies.
>>
>>
>># Cryptography
>>
>>None
>>
>>
>># Required Resources
>>
>>## Mailing lists
>>
>>grill-dev AT incubator DOT apache DOT org
>>grill-commits AT incubator DOT apache DOT org
>>grill-private AT incubator DOT apache DOT org
>>
>>## Subversion Directory
>>
>>Git is the preferred source control system:
>>git://git.apache.org/incubator-grill <http://git.apache.org/incubator-grill>
>>
>>## Issue Tracking
>>
>>JIRA Grill (GRILL)
>>
>>
>># Initial Committers
>>
>>Amareshwari Sriramadasu (amareshwari AT apache DOT org)
>>Gunther Hagleitner (gunther AT apache DOT org)
>>Jaideep Dhok (jaideep.dhok AT Inmobi DOT com)
>>Raghavendra Singh (raghavendra.singh AT Inmobi DOT com)
>>Sharad Agarwal (sharad AT apache DOT org)
>>Sreekanth Ramakrishnan (sreekanth AT apache DOT org)
>>Srikanth Sundarrajan (sriksun AT apache DOT org)
>>Suma Shivaprasad (suma.shivaprasad AT Inmobi DOT com)
>>Vikram Dixit (vikram AT apache DOT org)
>>
>>
>># Affiliations
>>
>>Amareshwari SR (InMobi)
>>Gunther Hagleitner (Hortonworks)
>>Jaideep Dhok (InMobi)
>>Raghavendra Singh (InMobi)
>>Sharad Agarwal (InMobi)
>>Sreekanth Ramakrishnan (SoftwareAG)
>>Srikanth Sundarrajan (InMobi)
>>Suma Shivaprasad (InMobi)
>>Vikram Dixit (Hortonworks)
>>
>>
>># Sponsors
>>
>>## Champion
>>
>>Vinod K <vinodkv AT apache DOT org> (Apache Member)
>>
>>## Nominated Mentors
>>
>>Chris Douglas (Microsoft)
>>Jacob Homan (Microsoft)
>>Jean Baptiste Onofre (Talend)
>>Vinod K (Hortonworks)
>>
>>## Sponsoring Entity
>>
>>Incubator PMC