[ https://issues.apache.org/jira/browse/CALCITE-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16885464#comment-16885464 ]
Khai Tran commented on CALCITE-3122:
------------------------------------

Hi [~zabetak], thanks a lot for your suggestions.

In this feature, I get the Pig logical plan from the Pig parser, traverse that plan, and use RelBuilder to construct a Calcite logical plan; let's call it plan #1. After this step, I need an additional rule to optimize plan #1 into plan #2. For example, a Pig group-by + aggregate is translated literally as a Calcite Aggregate with the COLLECT() aggregate function, followed by a Project that applies Pig aggregate UDFs (which work on a Pig DataBag or SQL multiset, i.e. the result of COLLECT()). The rule then converts COLLECT() + Pig aggregate UDFs into a Calcite built-in aggregate operator.

We also have other use cases that need to optimize a RelNode plan using a given set of rules, and probably using a custom planner that sets costs in a way that enforces certain rules. Coupling RelNode with planners makes that hard to do.

I will look further into HepPlanner, but do you have an example of setting up HepPlanner for RelBuilder?

> Convert Pig Latin scripts into Calcite logical plan
> ---------------------------------------------------
>
>                 Key: CALCITE-3122
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3122
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core, piglet
>            Reporter: Khai Tran
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.21.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We created an internal Calcite repo at LinkedIn and developed APIs to parse any
> Pig Latin script into a Calcite logical plan. The code was tested on nearly
> 1000 Pig scripts written at LinkedIn.
> Changes:
> 1. piglet: the main conversion code lives here, including:
>  * APIs to convert any Pig script into RelNode plans or SQL statements
>  * Use the Pig Grunt parser to parse Pig Latin scripts into Pig logical plans
> (DAGs)
>  * Convert Pig schemas into RelDataType
>  * Traverse the Pig expression plan and convert Pig expressions into
> RexNodes
>  * Map some basic Pig UDFs to Calcite SQL operators
>  * Build Calcite UDFs for any other Pig UDFs, including UDFs written in both
> Java and Python
>  * Traverse (DFS) the Pig logical plan to convert each Pig logical node
> to a RelNode
>  * Have an optimizer rule to optimize Pig group/cogroup into Aggregate
> operators
> 2. core:
>  * Implement other RelNodes in Rel2Sql so that Pig can be translated into SQL
>  * Other minor changes in a few other classes to make the Pig-to-Calcite
> conversion work
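
To make the plan #1 to plan #2 step from the comment concrete, here is a minimal, self-contained sketch of the COLLECT() rewrite. It deliberately does not use the Calcite API: `Node`, `rewrite`, and the `PIG_SUM` / `PIG_COUNT` UDF names are hypothetical stand-ins invented for illustration, and the real rule in the pull request may be structured quite differently.

```java
import java.util.Map;

/** Toy model of the rewrite; these are NOT Calcite classes. */
final class PigAggRewriteSketch {
  private PigAggRewriteSketch() {}

  /** A plan node with an operator name, an optional function, and at most one input. */
  static final class Node {
    final String op;    // e.g. "Scan", "Aggregate", "Project"
    final String func;  // aggregate or UDF name, or null if none
    final Node input;   // null for leaf nodes

    Node(String op, String func, Node input) {
      this.op = op;
      this.func = func;
      this.input = input;
    }

    @Override public String toString() {
      String head = func == null ? op : op + "[" + func + "]";
      return input == null ? head : head + "(" + input + ")";
    }
  }

  /** Hypothetical mapping from Pig aggregate UDF names to built-in aggregates. */
  static final Map<String, String> BUILTINS =
      Map.of("PIG_SUM", "SUM", "PIG_COUNT", "COUNT");

  /**
   * Rewrites the plan bottom-up: a Project applying a known Pig aggregate UDF
   * on top of Aggregate[COLLECT] collapses into a single built-in Aggregate.
   * Every other node is copied unchanged.
   */
  static Node rewrite(Node n) {
    if (n == null) {
      return null;
    }
    Node in = rewrite(n.input);
    boolean matches = n.op.equals("Project")
        && n.func != null
        && BUILTINS.containsKey(n.func)
        && in != null
        && in.op.equals("Aggregate")
        && "COLLECT".equals(in.func);
    if (matches) {
      return new Node("Aggregate", BUILTINS.get(n.func), in.input);
    }
    return new Node(n.op, n.func, in);
  }
}
```

In this toy model, a literal translation such as Project[PIG_SUM] over Aggregate[COLLECT] over Scan collapses into a single Aggregate[SUM] over Scan, while a Project whose function is not in the mapping is left as it was. In Calcite itself the same idea would presumably be expressed as a planner rule matching a Project on top of an Aggregate.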