SQL interface for Pig
---------------------
Key: PIG-824
URL: https://issues.apache.org/jira/browse/PIG-824
Project: Pig
Issue Type: New Feature
Reporter: Olga Natkovich
In the last 18 months, PigLatin has gained significant popularity within the open
source community. Many users like its data flow model, its rich type system and
its ability to work with any data available on HDFS or outside. We have also
heard from many users that having Pig speak SQL would bring many more users.
Having a single system that exports multiple interfaces is a big advantage as
it guarantees consistent semantics, enables custom code reuse, and reduces the
amount of maintenance. This is especially relevant for projects that use
different interfaces for different parts of the system. For instance, in a
data warehousing system, you would have an ETL component that brings data into
the warehouse and a component that analyzes the data and produces reports.
PigLatin is uniquely suited for ETL processing while SQL might be a better fit
for report generation.
To start, it would make sense to implement a subset of the SQL-92 standard and
to be as standard-compliant as possible. This would include all the standard
constructs: select, from, where, group-by + having, order by, limit, join
(inner + outer). Several extensions, such as support for Pig's UDFs and
possibly streaming, multiquery, and support for Pig's complex types, would be
helpful.
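
To make the intended mapping concrete, here is a rough sketch of how a few of
the constructs listed above (from, where, group-by, order by, limit) could be
lowered to PigLatin statements. It is only an illustration, not a proposed
design: the Query class, the to_pig function, the 'users' input and its field
names are all hypothetical, Python is used purely for brevity, and the schema
is spelled out by hand where the real feature would rely on metadata.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Query:
    """Hypothetical, already-parsed form of a very small SQL subset."""
    table: str
    columns: Tuple[str, ...]          # schema supplied by hand (assumption)
    where: Optional[str] = None       # e.g. "age > 18"
    group_by: Optional[str] = None    # single grouping column; emits COUNT(*)
    order_by: Optional[str] = None
    limit: Optional[int] = None


def to_pig(q: Query) -> str:
    """Emit one PigLatin statement per SQL clause, chaining relation aliases."""
    aliases = iter("abcdefgh")
    cur = next(aliases)
    lines = [f"{cur} = LOAD '{q.table}' AS ({', '.join(q.columns)});"]

    def step(make):
        # Append a statement that consumes the current alias and defines the next.
        nonlocal cur
        nxt = next(aliases)
        lines.append(f"{nxt} = {make(cur)};")
        cur = nxt

    if q.where:
        step(lambda prev: f"FILTER {prev} BY {q.where}")
    if q.group_by:
        source = cur  # COUNT is applied to the bag named after the grouped relation
        step(lambda prev: f"GROUP {prev} BY {q.group_by}")
        step(lambda prev: f"FOREACH {prev} GENERATE group AS {q.group_by}, "
                          f"COUNT({source}) AS n")
    if q.order_by:
        step(lambda prev: f"ORDER {prev} BY {q.order_by}")
    if q.limit is not None:
        step(lambda prev: f"LIMIT {prev} {q.limit}")
    return "\n".join(lines)


# SELECT city, COUNT(*) AS n FROM users WHERE age > 18
# GROUP BY city ORDER BY city LIMIT 10
EXAMPLE = Query(table="users", columns=("city", "age"), where="age > 18",
                group_by="city", order_by="city", limit=10)

if __name__ == "__main__":
    print(to_pig(EXAMPLE))
```

Each SQL clause becomes one step in the data flow, which is why a shared
backend for both interfaces seems plausible. Having, joins, UDFs and complex
types are left out of the sketch.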
This work is dependent on metadata support outlined in
https://issues.apache.org/jira/browse/PIG-823