I will log Calcite JIRA for each easy commit in the fork. Thanks!


On Tue, Nov 17, 2015 at 10:44 AM, Julian Hyde <[email protected]> wrote:
> I hear you.
>
> However, get Jinfeng to log a calcite jira for each "easy" commit and we can 
> start burning through them. As I said earlier a commit Id to cherry pick 
> might be sufficient. That review process involves work on my part (or on the 
> part of other calcite committers reviewing) so we would be contributing to 
> the effort. (Yeah I know it doesn't always feel that a reviewer is doing 
> useful work.)
>
> Julian
>
>> On Nov 17, 2015, at 09:54, Jacques Nadeau <[email protected]> wrote:
>>
>> I'm fine with any approach. My suggestion came from my sense that, so far,
>> our burn down hasn't been super successful. And sometimes I feel like
>> Jinfeng gets stuck doing all the heavy lifting.
>>
>> --
>> Jacques Nadeau
>> CTO and Co-Founder, Dremio
>>
>>> On Tue, Nov 17, 2015 at 9:48 AM, Julian Hyde <[email protected]> wrote:
>>>
>>> I can do a hackathon if it helps. You're approaching it as a "big bang"
>>> but we could approach it as a burn down, counting the number of commits
>>> different between drill's fork and master.  There's a lot of incremental
>>> value each commit we remove.
>>>
>>> Julian
>>>
>>>> On Nov 17, 2015, at 08:58, Jacques Nadeau <[email protected]> wrote:
>>>>
>>>> Hey Jinfeng,
>>>>
>>>> Thanks for pulling this together. It is definitely time to get off the
>>>> fork. I see a number of items with PCM. Should we do a hackathon to do
>>> all
>>>> the PCMs one afternoon? Maybe we could convince Julian to join so we
>>> could
>>>> make short progress of the issues. The original commits that have PCM
>>> were
>>>> done by the following developers: Mehant, Jinfeng, Jacques, Venki & Aman.
>>>> Even if we just get Aman, Jinfeng and myself, we could probably close out
>>>> most of the gaps. (Mehant only has one and it seems pretty small. Venki's
>>>> is larger but is only one as well.)
>>>>
>>>> It seems like an achievable goal that we could get could get on top of
>>>> 1.5.x master + star + 2-3 other patches for the 1.4 release. What are
>>> your
>>>> thoughts?
>>>>
>>>>
>>>>
>>>> --
>>>> Jacques Nadeau
>>>> CTO and Co-Founder, Dremio
>>>>
>>>>> On Tue, Nov 10, 2015 at 2:55 PM, Jinfeng Ni <[email protected]>
>>> wrote:
>>>>>
>>>>> Julian,
>>>>>
>>>>> Thanks a lot for your detailed comments!
>>>>>
>>>>> I totally agree with you on "calcite first" policy. That's why on MapR
>>>>> side, I recommend people get changes in Calcite master first, then
>>>>> cherry-pick back. Several recent patches follow this patten, as we
>>>>> realized the pain to maintain this calcite fork.
>>>>>
>>>>> I also agree that we should get them in ASAP and it is a good idea to
>>>>> start with simple patch firsts.
>>>>>
>>>>>
>>>>> 1. Why do you need "Add new method to ViewExpander interface to allow
>>>>> passing SchemaRoot."? Most ViewExpander implementations have a root
>>>>> schema. Note that the return type of that method has changed from
>>>>> RelNode to RelRoot.
>>>>>
>>>>> [
>>>>> Venki might have better idea. As far as I remember, the patch was
>>>>> merged to handle case where the view is created in a different schema
>>>>> from the source schema. Drill uses Frameworks, not the regular code
>>>>> path that Calcite master uses. Not sure if that's the cause of the
>>>>> problem on Drill side.
>>>>> ]
>>>>>
>>>>> 2. "Match Logical Rel in ReduceExpressionRule." should be easy now we've
>>>>> parameterized the rules, and maybe you can get rid of "Enable
>>>>> inheriting from previously singleton filter and calc constant
>>>>> reduction rules.".
>>>>>
>>>>> [ Agree.]
>>>>>
>>>>> 3. "Disable the nullability cast checking in Filter." is worth
>>> discussing
>>>>> on the Calcite list.
>>>>>
>>>>> [ Will do that once I revise the patch with testcase.]
>>>>>
>>>>> 4. In "Return validated row type and validated SqlNode when call
>>>>> Planner.validate()." you should return a Pair<SqlNode, RelDataType>.
>>>>>
>>>>> [ The patch essentially returns Pair<SqlNode, RelDataType>, since it
>>>>> put SqlNode and RelDataType in a wrapper. With Calcite master, seems
>>>>> RelRoot probably is a better choice; it includes everything that we
>>>>> want.
>>>>> ]
>>>>>
>>>>>
>>>>> 5. To do "Use case-insensitive comparison when creating output row type
>>>>> of a JoinRel." properly we require a new config parameter saying
>>>>> whether internal field names are case-sensitive.
>>>>>
>>>>> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until
>>>>> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute
>>>>> a fix for CALCITE-777! :)
>>>>>
>>>>> [Aman will have better idea]
>>>>>
>>>>> 6. Why do you need "Make public the constructor of
>>>>> SqlSumEmptyIsZeroAggFunction,"?
>>>>>
>>>>> [ It's used in Dril's ReduceAggregateRule. We should revisit this rule
>>>>> to see whether it's still required to keep this rule in stead of using
>>>>> Calcite's ReduceAggregateRule directly.
>>>>> ]
>>>>>
>>>>> 7. You don't need "Make certain push project constructors protected such
>>>>> that derived classes can access them." now we've parameterized
>>>>> ProjectFilterTransposeRule.
>>>>>
>>>>> [Agree.]
>>>>>
>>>>> 8. "DRILL-1455: Add return type-inference strategy for arithmetic
>>>>> operators when one of the arguments is ANY type." needs some
>>>>> discussion. What if both are ANY? How are other implementations of the
>>>>> operators (e.g. Calcite, Phoenix) going to implement the operator now
>>>>> that Drill's lax validation policy has allowed it in?
>>>>>
>>>>> [  Will discuss in Calcite list once we have the PR ready for this
>>> patch.
>>>>> ]
>>>>>
>>>>> 9. I can't see yet how the [StarColumn] changes could make it in. Will
>>>>> require some discussion. Note that CALCITE-546 has broken some of your
>>>>> assumptions.
>>>>>
>>>>> [ I know it would be big headache to resolve the changes related to
>>>>> star column for schema-on-read table.  I have to re-visit those
>>>>> changes, and probably start a discussion on Calcite list once I do a
>>>>> rebase and code cleanup
>>>>> ]
>>>>>
>>>>> Again, thanks for your feedback. I feel it's the right time for the
>>>>> Drill community to try to get off calcite fork!
>>>>>
>>>>> Regards,
>>>>>
>>>>> Jinfeng
>>>>>
>>>>>
>>>>>> On Tue, Nov 10, 2015 at 1:51 PM, Julian Hyde <[email protected]> wrote:
>>>>>> Going forward, I think you should start a "calcite first" policy --
>>>>>> get changes accepted into Calcite, with tests of course, then
>>>>>> cherry-pick into Drill's branch. Recent changes like "Allow a MAP
>>>>>> literal type." should have done that. And especially changes where you
>>>>>> could just parameterize planner rule. In other words, stop digging the
>>>>>> hole deeper.
>>>>>>
>>>>>> I think you should identify and do the easy ones ASAP.
>>>>>>
>>>>>> Rather than creating a branch for each of these contributions, if you
>>>>>> prefer you could create a pull request of the whole branch I think
>>>>>> that individual JIRAs can just reference a commit id that a Calcite
>>>>>> committer can cherry-pick.
>>>>>>
>>>>>> Now some remarks on specific changes.
>>>>>>
>>>>>> Why do you need "Add new method to ViewExpander interface to allow
>>>>>> passing SchemaRoot."? Most ViewExpander implementations have a root
>>>>>> schema. Note that the return type of that method has changed from
>>>>>> RelNode to RelRoot.
>>>>>>
>>>>>> "Match Logical Rel in ReduceExpressionRule." should be easy now we've
>>>>>> parameterized the rules, and maybe you can get rid of "Enable
>>>>>> inheriting from previously singleton filter and calc constant
>>>>>> reduction rules.".
>>>>>>
>>>>>> "Disable the nullability cast checking in Filter." is worth discussing
>>>>>> on the Calcite list.
>>>>>>
>>>>>> In "Return validated row type and validated SqlNode when call
>>>>>> Planner.validate()." you should return a Pair<SqlNode, RelDataType>.
>>>>>>
>>>>>> To do "Use case-insensitive comparison when creating output row type
>>>>>> of a JoinRel." properly we require a new config parameter saying
>>>>>> whether internal field names are case-sensitive.
>>>>>>
>>>>>> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until
>>>>>> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute
>>>>>> a fix for CALCITE-777! :)
>>>>>>
>>>>>> Why do you need "Make public the constructor of
>>>>> SqlSumEmptyIsZeroAggFunction,"?
>>>>>>
>>>>>> You don't need "Make certain push project constructors protected such
>>>>>> that derived classes can access them." now we've parameterized
>>>>>> ProjectFilterTransposeRule.
>>>>>>
>>>>>> "DRILL-1455: Add return type-inference strategy for arithmetic
>>>>>> operators when one of the arguments is ANY type." needs some
>>>>>> discussion. What if both are ANY? How are other implementations of the
>>>>>> operators (e.g. Calcite, Phoenix) going to implement the operator now
>>>>>> that Drill's lax validation policy has allowed it in?
>>>>>>
>>>>>> I can't see yet how the [StarColumn] changes could make it in. Will
>>>>>> require some discussion. Note that CALCITE-546 has broken some of your
>>>>>> assumptions.
>>>>>>
>>>>>> Julian
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Tue, Nov 10, 2015 at 12:27 PM, Jinfeng Ni <[email protected]> wrote:
>>>>>>> Currently, Drill maintains a forked version of Calcite. Because of
>>>>>>> this, it has been a big headache to rebase onto the latest Calcite
>>>>>>> master branch every time when Calcite makes a new release.
>>>>>>>
>>>>>>> During today's Drill hangout, we discuss ideas around how to get Drill
>>>>>>> off the forked Calcite, by either pushing back patches in the forked
>>>>>>> version to Calcite master branch, or trying to see if those patches
>>>>>>> are still required for Drill to work properly.
>>>>>>>
>>>>>>> There are 49 commits in forked Calcite whose base is Calcite 1.4.0
>>>>>>> release [1]. I put a spreadsheet for the current status of those 49
>>>>>>> commits, as far as I know [2]. Half of them are the patches that were
>>>>>>> cherry-picked from Calcite master branch,  9 are just for version
>>>>>>> change, and the rest requires either push back or re-work in Drill
>>>>>>> side.
>>>>>>>
>>>>>>> I think we probably have to spend considerable effort to get this done
>>>>>>> ASAP. Otherwise, it will continue to be an issue for Drill.
>>>>>>>
>>>>>>> There are two separate work to be done:
>>>>>>> 1. Continue to rebase forked version onto latest Calcite master
>>>>>>> (Currently it's 1.5.0)
>>>>>>> 2. Push the patches in fork to Calcite master. Once all patches are
>>>>>>> in, we could finally get rid of fork and rebasing efforts.
>>>>>>>
>>>>>>> The spreadsheet has a column for sign-up.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>> (let me know if you have problem accessing the spreadsheet).
>>>>>>>
>>>>>>>
>>>>>>> [1] https://github.com/dremio/calcite/tree/DrillCalcite1.4.0
>>>>>>> [2]
>>>>>
>>> https://docs.google.com/spreadsheets/d/1Z0J5aMCBfqq_9SEl-LXsi_B8Ahaosygqke4cH6pPwmo/edit#gid=0
>>>>>
>>>

Reply via email to