That sounds great, thanks for the help Julian! -- Jacques Nadeau CTO and Co-Founder, Dremio
On Tue, Nov 17, 2015 at 10:44 AM, Julian Hyde <jhyde.apa...@gmail.com> wrote: > I hear you. > > However, get Jinfeng to log a calcite jira for each "easy" commit and we > can start burning through them. As I said earlier a commit Id to cherry > pick might be sufficient. That review process involves work on my part (or > on the part of other calcite committers reviewing) so we would be > contributing to the effort. (Yeah I know it doesn't always feel that a > reviewer is doing useful work.) > > Julian > > > On Nov 17, 2015, at 09:54, Jacques Nadeau <jacq...@dremio.com> wrote: > > > > I'm fine with any approach. My suggestion came from my sense that, so > far, > > our burn down hasn't been super successful. And sometimes I feel like > > Jinfeng gets stuck doing all the heavy lifting. > > > > -- > > Jacques Nadeau > > CTO and Co-Founder, Dremio > > > >> On Tue, Nov 17, 2015 at 9:48 AM, Julian Hyde <jhyde.apa...@gmail.com> > wrote: > >> > >> I can do a hackathon if it helps. You're approaching it as a "big bang" > >> but we could approach it as a burn down, counting the number of commits > >> different between drill's fork and master. There's a lot of incremental > >> value each commit we remove. > >> > >> Julian > >> > >>> On Nov 17, 2015, at 08:58, Jacques Nadeau <jacq...@dremio.com> wrote: > >>> > >>> Hey Jinfeng, > >>> > >>> Thanks for pulling this together. It is definitely time to get off the > >>> fork. I see a number of items with PCM. Should we do a hackathon to do > >> all > >>> the PCMs one afternoon? Maybe we could convince Julian to join so we > >> could > >>> make short progress of the issues. The original commits that have PCM > >> were > >>> done by the following developers: Mehant, Jinfeng, Jacques, Venki & > Aman. > >>> Even if we just get Aman, Jinfeng and myself, we could probably close > out > >>> most of the gaps. (Mehant only has one and it seems pretty small. > Venki's > >>> is larger but is only one as well.) > >>> > >>> It seems like an achievable goal that we could get could get on top of > >>> 1.5.x master + star + 2-3 other patches for the 1.4 release. What are > >> your > >>> thoughts? > >>> > >>> > >>> > >>> -- > >>> Jacques Nadeau > >>> CTO and Co-Founder, Dremio > >>> > >>>> On Tue, Nov 10, 2015 at 2:55 PM, Jinfeng Ni <jinfengn...@gmail.com> > >> wrote: > >>>> > >>>> Julian, > >>>> > >>>> Thanks a lot for your detailed comments! > >>>> > >>>> I totally agree with you on "calcite first" policy. That's why on MapR > >>>> side, I recommend people get changes in Calcite master first, then > >>>> cherry-pick back. Several recent patches follow this patten, as we > >>>> realized the pain to maintain this calcite fork. > >>>> > >>>> I also agree that we should get them in ASAP and it is a good idea to > >>>> start with simple patch firsts. > >>>> > >>>> > >>>> 1. Why do you need "Add new method to ViewExpander interface to allow > >>>> passing SchemaRoot."? Most ViewExpander implementations have a root > >>>> schema. Note that the return type of that method has changed from > >>>> RelNode to RelRoot. > >>>> > >>>> [ > >>>> Venki might have better idea. As far as I remember, the patch was > >>>> merged to handle case where the view is created in a different schema > >>>> from the source schema. Drill uses Frameworks, not the regular code > >>>> path that Calcite master uses. Not sure if that's the cause of the > >>>> problem on Drill side. > >>>> ] > >>>> > >>>> 2. "Match Logical Rel in ReduceExpressionRule." should be easy now > we've > >>>> parameterized the rules, and maybe you can get rid of "Enable > >>>> inheriting from previously singleton filter and calc constant > >>>> reduction rules.". > >>>> > >>>> [ Agree.] > >>>> > >>>> 3. "Disable the nullability cast checking in Filter." is worth > >> discussing > >>>> on the Calcite list. > >>>> > >>>> [ Will do that once I revise the patch with testcase.] > >>>> > >>>> 4. In "Return validated row type and validated SqlNode when call > >>>> Planner.validate()." you should return a Pair<SqlNode, RelDataType>. > >>>> > >>>> [ The patch essentially returns Pair<SqlNode, RelDataType>, since it > >>>> put SqlNode and RelDataType in a wrapper. With Calcite master, seems > >>>> RelRoot probably is a better choice; it includes everything that we > >>>> want. > >>>> ] > >>>> > >>>> > >>>> 5. To do "Use case-insensitive comparison when creating output row > type > >>>> of a JoinRel." properly we require a new config parameter saying > >>>> whether internal field names are case-sensitive. > >>>> > >>>> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until > >>>> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute > >>>> a fix for CALCITE-777! :) > >>>> > >>>> [Aman will have better idea] > >>>> > >>>> 6. Why do you need "Make public the constructor of > >>>> SqlSumEmptyIsZeroAggFunction,"? > >>>> > >>>> [ It's used in Dril's ReduceAggregateRule. We should revisit this rule > >>>> to see whether it's still required to keep this rule in stead of using > >>>> Calcite's ReduceAggregateRule directly. > >>>> ] > >>>> > >>>> 7. You don't need "Make certain push project constructors protected > such > >>>> that derived classes can access them." now we've parameterized > >>>> ProjectFilterTransposeRule. > >>>> > >>>> [Agree.] > >>>> > >>>> 8. "DRILL-1455: Add return type-inference strategy for arithmetic > >>>> operators when one of the arguments is ANY type." needs some > >>>> discussion. What if both are ANY? How are other implementations of the > >>>> operators (e.g. Calcite, Phoenix) going to implement the operator now > >>>> that Drill's lax validation policy has allowed it in? > >>>> > >>>> [ Will discuss in Calcite list once we have the PR ready for this > >> patch. > >>>> ] > >>>> > >>>> 9. I can't see yet how the [StarColumn] changes could make it in. Will > >>>> require some discussion. Note that CALCITE-546 has broken some of your > >>>> assumptions. > >>>> > >>>> [ I know it would be big headache to resolve the changes related to > >>>> star column for schema-on-read table. I have to re-visit those > >>>> changes, and probably start a discussion on Calcite list once I do a > >>>> rebase and code cleanup > >>>> ] > >>>> > >>>> Again, thanks for your feedback. I feel it's the right time for the > >>>> Drill community to try to get off calcite fork! > >>>> > >>>> Regards, > >>>> > >>>> Jinfeng > >>>> > >>>> > >>>>> On Tue, Nov 10, 2015 at 1:51 PM, Julian Hyde <jh...@apache.org> > wrote: > >>>>> Going forward, I think you should start a "calcite first" policy -- > >>>>> get changes accepted into Calcite, with tests of course, then > >>>>> cherry-pick into Drill's branch. Recent changes like "Allow a MAP > >>>>> literal type." should have done that. And especially changes where > you > >>>>> could just parameterize planner rule. In other words, stop digging > the > >>>>> hole deeper. > >>>>> > >>>>> I think you should identify and do the easy ones ASAP. > >>>>> > >>>>> Rather than creating a branch for each of these contributions, if you > >>>>> prefer you could create a pull request of the whole branch I think > >>>>> that individual JIRAs can just reference a commit id that a Calcite > >>>>> committer can cherry-pick. > >>>>> > >>>>> Now some remarks on specific changes. > >>>>> > >>>>> Why do you need "Add new method to ViewExpander interface to allow > >>>>> passing SchemaRoot."? Most ViewExpander implementations have a root > >>>>> schema. Note that the return type of that method has changed from > >>>>> RelNode to RelRoot. > >>>>> > >>>>> "Match Logical Rel in ReduceExpressionRule." should be easy now we've > >>>>> parameterized the rules, and maybe you can get rid of "Enable > >>>>> inheriting from previously singleton filter and calc constant > >>>>> reduction rules.". > >>>>> > >>>>> "Disable the nullability cast checking in Filter." is worth > discussing > >>>>> on the Calcite list. > >>>>> > >>>>> In "Return validated row type and validated SqlNode when call > >>>>> Planner.validate()." you should return a Pair<SqlNode, RelDataType>. > >>>>> > >>>>> To do "Use case-insensitive comparison when creating output row type > >>>>> of a JoinRel." properly we require a new config parameter saying > >>>>> whether internal field names are case-sensitive. > >>>>> > >>>>> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until > >>>>> CALCITE-777 is fixed." requires a fix to CALCITE-777 please > contribute > >>>>> a fix for CALCITE-777! :) > >>>>> > >>>>> Why do you need "Make public the constructor of > >>>> SqlSumEmptyIsZeroAggFunction,"? > >>>>> > >>>>> You don't need "Make certain push project constructors protected such > >>>>> that derived classes can access them." now we've parameterized > >>>>> ProjectFilterTransposeRule. > >>>>> > >>>>> "DRILL-1455: Add return type-inference strategy for arithmetic > >>>>> operators when one of the arguments is ANY type." needs some > >>>>> discussion. What if both are ANY? How are other implementations of > the > >>>>> operators (e.g. Calcite, Phoenix) going to implement the operator now > >>>>> that Drill's lax validation policy has allowed it in? > >>>>> > >>>>> I can't see yet how the [StarColumn] changes could make it in. Will > >>>>> require some discussion. Note that CALCITE-546 has broken some of > your > >>>>> assumptions. > >>>>> > >>>>> Julian > >>>>> > >>>>> > >>>>> > >>>>>> On Tue, Nov 10, 2015 at 12:27 PM, Jinfeng Ni <j...@apache.org> > wrote: > >>>>>> Currently, Drill maintains a forked version of Calcite. Because of > >>>>>> this, it has been a big headache to rebase onto the latest Calcite > >>>>>> master branch every time when Calcite makes a new release. > >>>>>> > >>>>>> During today's Drill hangout, we discuss ideas around how to get > Drill > >>>>>> off the forked Calcite, by either pushing back patches in the forked > >>>>>> version to Calcite master branch, or trying to see if those patches > >>>>>> are still required for Drill to work properly. > >>>>>> > >>>>>> There are 49 commits in forked Calcite whose base is Calcite 1.4.0 > >>>>>> release [1]. I put a spreadsheet for the current status of those 49 > >>>>>> commits, as far as I know [2]. Half of them are the patches that > were > >>>>>> cherry-picked from Calcite master branch, 9 are just for version > >>>>>> change, and the rest requires either push back or re-work in Drill > >>>>>> side. > >>>>>> > >>>>>> I think we probably have to spend considerable effort to get this > done > >>>>>> ASAP. Otherwise, it will continue to be an issue for Drill. > >>>>>> > >>>>>> There are two separate work to be done: > >>>>>> 1. Continue to rebase forked version onto latest Calcite master > >>>>>> (Currently it's 1.5.0) > >>>>>> 2. Push the patches in fork to Calcite master. Once all patches are > >>>>>> in, we could finally get rid of fork and rebasing efforts. > >>>>>> > >>>>>> The spreadsheet has a column for sign-up. > >>>>>> > >>>>>> Thoughts? > >>>>>> > >>>>>> (let me know if you have problem accessing the spreadsheet). > >>>>>> > >>>>>> > >>>>>> [1] https://github.com/dremio/calcite/tree/DrillCalcite1.4.0 > >>>>>> [2] > >>>> > >> > https://docs.google.com/spreadsheets/d/1Z0J5aMCBfqq_9SEl-LXsi_B8Ahaosygqke4cH6pPwmo/edit#gid=0 > >>>> > >> >