SQL interface for Pig
---------------------

                 Key: PIG-824
                 URL: https://issues.apache.org/jira/browse/PIG-824
             Project: Pig
          Issue Type: New Feature
            Reporter: Olga Natkovich


In the last 18 months, PigLatin has gained significant popularity within the
open source community. Many users like its data flow model, its rich type
system, and its ability to work with any data available on HDFS or elsewhere.
We have also heard from many users that having Pig speak SQL would bring in
many more users. Having a single system that exposes multiple interfaces is a
big advantage: it guarantees consistent semantics, allows custom code to be
reused, and reduces the amount of maintenance. This matters most for projects
that use both interfaces for different parts of the system. For instance, a
data warehousing system would have an ETL component that brings data into the
warehouse and a component that analyzes the data and produces reports.
PigLatin is uniquely suited for ETL processing, while SQL might be a better
fit for report generation.

To start, it would make sense to implement a subset of the SQL-92 standard and
to be as standard-compliant as possible. This would include all the standard
constructs: select, from, where, group by + having, order by, limit, and join
(inner + outer). Several extensions would also be helpful, such as support for
Pig's UDFs, possibly streaming and multiquery, and support for Pig's complex
types.
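
To make the proposed subset concrete, here is a minimal sketch of one query
that exercises each of the constructs listed above (except outer join). Since
the Pig SQL interface does not exist yet, the query is run against an
in-memory SQLite database from Python purely as a stand-in. The table names,
column names, and data are hypothetical and are not part of this proposal.
In PigLatin the same steps would presumably map onto the existing JOIN,
FILTER, GROUP, ORDER, and LIMIT operators.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER, name TEXT);
    CREATE TABLE visits (user_id INTEGER, url TEXT, secs INTEGER);
    INSERT INTO users  VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO visits VALUES
        (1, '/a', 30), (1, '/b', 50), (2, '/a', 5),
        (2, '/c', 40), (2, '/d', 45), (3, '/a', 20);
""")

# One statement covering select, from, join, where, group by + having,
# order by, and limit.
query = """
    SELECT u.name, COUNT(*) AS n_visits, SUM(v.secs) AS total_secs
    FROM users u
    JOIN visits v ON u.id = v.user_id      -- inner join
    WHERE v.secs >= 10                     -- filter rows before grouping
    GROUP BY u.name                        -- aggregate per user
    HAVING COUNT(*) >= 2                   -- filter groups after aggregation
    ORDER BY total_secs DESC               -- sort the result
    LIMIT 10                               -- cap the number of rows
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that limit is not itself part of SQL-92, so supporting it would already
be one of the extensions beyond the standard.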

This work is dependent on metadata support outlined in 
https://issues.apache.org/jira/browse/PIG-823

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
