I will log Calcite JIRA for each easy commit in the fork. Thanks!
On Tue, Nov 17, 2015 at 10:44 AM, Julian Hyde <[email protected]> wrote: > I hear you. > > However, get Jinfeng to log a calcite jira for each "easy" commit and we can > start burning through them. As I said earlier a commit Id to cherry pick > might be sufficient. That review process involves work on my part (or on the > part of other calcite committers reviewing) so we would be contributing to > the effort. (Yeah I know it doesn't always feel that a reviewer is doing > useful work.) > > Julian > >> On Nov 17, 2015, at 09:54, Jacques Nadeau <[email protected]> wrote: >> >> I'm fine with any approach. My suggestion came from my sense that, so far, >> our burn down hasn't been super successful. And sometimes I feel like >> Jinfeng gets stuck doing all the heavy lifting. >> >> -- >> Jacques Nadeau >> CTO and Co-Founder, Dremio >> >>> On Tue, Nov 17, 2015 at 9:48 AM, Julian Hyde <[email protected]> wrote: >>> >>> I can do a hackathon if it helps. You're approaching it as a "big bang" >>> but we could approach it as a burn down, counting the number of commits >>> different between drill's fork and master. There's a lot of incremental >>> value each commit we remove. >>> >>> Julian >>> >>>> On Nov 17, 2015, at 08:58, Jacques Nadeau <[email protected]> wrote: >>>> >>>> Hey Jinfeng, >>>> >>>> Thanks for pulling this together. It is definitely time to get off the >>>> fork. I see a number of items with PCM. Should we do a hackathon to do >>> all >>>> the PCMs one afternoon? Maybe we could convince Julian to join so we >>> could >>>> make short progress of the issues. The original commits that have PCM >>> were >>>> done by the following developers: Mehant, Jinfeng, Jacques, Venki & Aman. >>>> Even if we just get Aman, Jinfeng and myself, we could probably close out >>>> most of the gaps. (Mehant only has one and it seems pretty small. Venki's >>>> is larger but is only one as well.) >>>> >>>> It seems like an achievable goal that we could get could get on top of >>>> 1.5.x master + star + 2-3 other patches for the 1.4 release. What are >>> your >>>> thoughts? >>>> >>>> >>>> >>>> -- >>>> Jacques Nadeau >>>> CTO and Co-Founder, Dremio >>>> >>>>> On Tue, Nov 10, 2015 at 2:55 PM, Jinfeng Ni <[email protected]> >>> wrote: >>>>> >>>>> Julian, >>>>> >>>>> Thanks a lot for your detailed comments! >>>>> >>>>> I totally agree with you on "calcite first" policy. That's why on MapR >>>>> side, I recommend people get changes in Calcite master first, then >>>>> cherry-pick back. Several recent patches follow this patten, as we >>>>> realized the pain to maintain this calcite fork. >>>>> >>>>> I also agree that we should get them in ASAP and it is a good idea to >>>>> start with simple patch firsts. >>>>> >>>>> >>>>> 1. Why do you need "Add new method to ViewExpander interface to allow >>>>> passing SchemaRoot."? Most ViewExpander implementations have a root >>>>> schema. Note that the return type of that method has changed from >>>>> RelNode to RelRoot. >>>>> >>>>> [ >>>>> Venki might have better idea. As far as I remember, the patch was >>>>> merged to handle case where the view is created in a different schema >>>>> from the source schema. Drill uses Frameworks, not the regular code >>>>> path that Calcite master uses. Not sure if that's the cause of the >>>>> problem on Drill side. >>>>> ] >>>>> >>>>> 2. "Match Logical Rel in ReduceExpressionRule." should be easy now we've >>>>> parameterized the rules, and maybe you can get rid of "Enable >>>>> inheriting from previously singleton filter and calc constant >>>>> reduction rules.". >>>>> >>>>> [ Agree.] >>>>> >>>>> 3. "Disable the nullability cast checking in Filter." is worth >>> discussing >>>>> on the Calcite list. >>>>> >>>>> [ Will do that once I revise the patch with testcase.] >>>>> >>>>> 4. In "Return validated row type and validated SqlNode when call >>>>> Planner.validate()." you should return a Pair<SqlNode, RelDataType>. >>>>> >>>>> [ The patch essentially returns Pair<SqlNode, RelDataType>, since it >>>>> put SqlNode and RelDataType in a wrapper. With Calcite master, seems >>>>> RelRoot probably is a better choice; it includes everything that we >>>>> want. >>>>> ] >>>>> >>>>> >>>>> 5. To do "Use case-insensitive comparison when creating output row type >>>>> of a JoinRel." properly we require a new config parameter saying >>>>> whether internal field names are case-sensitive. >>>>> >>>>> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until >>>>> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute >>>>> a fix for CALCITE-777! :) >>>>> >>>>> [Aman will have better idea] >>>>> >>>>> 6. Why do you need "Make public the constructor of >>>>> SqlSumEmptyIsZeroAggFunction,"? >>>>> >>>>> [ It's used in Dril's ReduceAggregateRule. We should revisit this rule >>>>> to see whether it's still required to keep this rule in stead of using >>>>> Calcite's ReduceAggregateRule directly. >>>>> ] >>>>> >>>>> 7. You don't need "Make certain push project constructors protected such >>>>> that derived classes can access them." now we've parameterized >>>>> ProjectFilterTransposeRule. >>>>> >>>>> [Agree.] >>>>> >>>>> 8. "DRILL-1455: Add return type-inference strategy for arithmetic >>>>> operators when one of the arguments is ANY type." needs some >>>>> discussion. What if both are ANY? How are other implementations of the >>>>> operators (e.g. Calcite, Phoenix) going to implement the operator now >>>>> that Drill's lax validation policy has allowed it in? >>>>> >>>>> [ Will discuss in Calcite list once we have the PR ready for this >>> patch. >>>>> ] >>>>> >>>>> 9. I can't see yet how the [StarColumn] changes could make it in. Will >>>>> require some discussion. Note that CALCITE-546 has broken some of your >>>>> assumptions. >>>>> >>>>> [ I know it would be big headache to resolve the changes related to >>>>> star column for schema-on-read table. I have to re-visit those >>>>> changes, and probably start a discussion on Calcite list once I do a >>>>> rebase and code cleanup >>>>> ] >>>>> >>>>> Again, thanks for your feedback. I feel it's the right time for the >>>>> Drill community to try to get off calcite fork! >>>>> >>>>> Regards, >>>>> >>>>> Jinfeng >>>>> >>>>> >>>>>> On Tue, Nov 10, 2015 at 1:51 PM, Julian Hyde <[email protected]> wrote: >>>>>> Going forward, I think you should start a "calcite first" policy -- >>>>>> get changes accepted into Calcite, with tests of course, then >>>>>> cherry-pick into Drill's branch. Recent changes like "Allow a MAP >>>>>> literal type." should have done that. And especially changes where you >>>>>> could just parameterize planner rule. In other words, stop digging the >>>>>> hole deeper. >>>>>> >>>>>> I think you should identify and do the easy ones ASAP. >>>>>> >>>>>> Rather than creating a branch for each of these contributions, if you >>>>>> prefer you could create a pull request of the whole branch I think >>>>>> that individual JIRAs can just reference a commit id that a Calcite >>>>>> committer can cherry-pick. >>>>>> >>>>>> Now some remarks on specific changes. >>>>>> >>>>>> Why do you need "Add new method to ViewExpander interface to allow >>>>>> passing SchemaRoot."? Most ViewExpander implementations have a root >>>>>> schema. Note that the return type of that method has changed from >>>>>> RelNode to RelRoot. >>>>>> >>>>>> "Match Logical Rel in ReduceExpressionRule." should be easy now we've >>>>>> parameterized the rules, and maybe you can get rid of "Enable >>>>>> inheriting from previously singleton filter and calc constant >>>>>> reduction rules.". >>>>>> >>>>>> "Disable the nullability cast checking in Filter." is worth discussing >>>>>> on the Calcite list. >>>>>> >>>>>> In "Return validated row type and validated SqlNode when call >>>>>> Planner.validate()." you should return a Pair<SqlNode, RelDataType>. >>>>>> >>>>>> To do "Use case-insensitive comparison when creating output row type >>>>>> of a JoinRel." properly we require a new config parameter saying >>>>>> whether internal field names are case-sensitive. >>>>>> >>>>>> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until >>>>>> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute >>>>>> a fix for CALCITE-777! :) >>>>>> >>>>>> Why do you need "Make public the constructor of >>>>> SqlSumEmptyIsZeroAggFunction,"? >>>>>> >>>>>> You don't need "Make certain push project constructors protected such >>>>>> that derived classes can access them." now we've parameterized >>>>>> ProjectFilterTransposeRule. >>>>>> >>>>>> "DRILL-1455: Add return type-inference strategy for arithmetic >>>>>> operators when one of the arguments is ANY type." needs some >>>>>> discussion. What if both are ANY? How are other implementations of the >>>>>> operators (e.g. Calcite, Phoenix) going to implement the operator now >>>>>> that Drill's lax validation policy has allowed it in? >>>>>> >>>>>> I can't see yet how the [StarColumn] changes could make it in. Will >>>>>> require some discussion. Note that CALCITE-546 has broken some of your >>>>>> assumptions. >>>>>> >>>>>> Julian >>>>>> >>>>>> >>>>>> >>>>>>> On Tue, Nov 10, 2015 at 12:27 PM, Jinfeng Ni <[email protected]> wrote: >>>>>>> Currently, Drill maintains a forked version of Calcite. Because of >>>>>>> this, it has been a big headache to rebase onto the latest Calcite >>>>>>> master branch every time when Calcite makes a new release. >>>>>>> >>>>>>> During today's Drill hangout, we discuss ideas around how to get Drill >>>>>>> off the forked Calcite, by either pushing back patches in the forked >>>>>>> version to Calcite master branch, or trying to see if those patches >>>>>>> are still required for Drill to work properly. >>>>>>> >>>>>>> There are 49 commits in forked Calcite whose base is Calcite 1.4.0 >>>>>>> release [1]. I put a spreadsheet for the current status of those 49 >>>>>>> commits, as far as I know [2]. Half of them are the patches that were >>>>>>> cherry-picked from Calcite master branch, 9 are just for version >>>>>>> change, and the rest requires either push back or re-work in Drill >>>>>>> side. >>>>>>> >>>>>>> I think we probably have to spend considerable effort to get this done >>>>>>> ASAP. Otherwise, it will continue to be an issue for Drill. >>>>>>> >>>>>>> There are two separate work to be done: >>>>>>> 1. Continue to rebase forked version onto latest Calcite master >>>>>>> (Currently it's 1.5.0) >>>>>>> 2. Push the patches in fork to Calcite master. Once all patches are >>>>>>> in, we could finally get rid of fork and rebasing efforts. >>>>>>> >>>>>>> The spreadsheet has a column for sign-up. >>>>>>> >>>>>>> Thoughts? >>>>>>> >>>>>>> (let me know if you have problem accessing the spreadsheet). >>>>>>> >>>>>>> >>>>>>> [1] https://github.com/dremio/calcite/tree/DrillCalcite1.4.0 >>>>>>> [2] >>>>> >>> https://docs.google.com/spreadsheets/d/1Z0J5aMCBfqq_9SEl-LXsi_B8Ahaosygqke4cH6pPwmo/edit#gid=0 >>>>> >>>
