There is no right answer, but I feel that if you go a long way down this path, it will be very difficult to merge back. Given that this is not new functionality but an improvement to existing code (which will also evolve), it will become difficult to maintain and review a big diff in the future.
I haven't thought much about it, but you can start by creating the high-level interfaces first and go from there. For example, create interfaces for operators that take an array of rows instead of a single row - initially the array size can always be 1. Then proceed from there.

What makes you think merging a branch 6 months or a year from now will be easier than working on the current branch?

Having said that, both approaches can be made to work - but I think you are just delaying the merging work instead of taking the hit upfront.

Thanks,
-namit

On 4/4/13 2:40 AM, "Jitendra Pandey" <jiten...@hortonworks.com> wrote:

> We did consider implementing these changes on the trunk. But it would
> take several patches in various parts of the code before a simple
> end-to-end query can be executed on the vectorized path. For example, a
> patch for vectorized expressions will be a significant amount of code,
> but will not be used in a query until a vectorized operator is
> implemented and the query plan is modified to use the vectorized path.
> Vectorization of even basic expressions becomes non-trivial because we
> need to optimize for various cases like chains of expressions, non-null
> columns, or repeating values, and also handle nullable columns,
> short-circuit optimization, etc. Careful handling of these is important
> for performance gains.
>
> Committing those intermediate patches in trunk without stabilizing them
> in a branch first might be a cause of concern.
>
> A separate branch will let us make incremental changes to the system so
> that each patch addresses a single feature or functionality and is
> small enough to review.
>
> We will make sure that the branch is frequently updated with the
> changes in the trunk to avoid conflicts at the time of the merge.
>
> Also, we plan to propose merging the branch as soon as a basic
> end-to-end query begins to work and is sufficiently tested, instead of
> waiting for all operators to get vectorized. Initially our target is to
> make select and filter operators work with vectorized expressions for
> primitive types.
>
> We will have a single global configuration flag that can be used to
> turn off the entire vectorization code path, and we will specifically
> test to make sure that when this flag is off there is no regression on
> the current system. When vectorization is turned on, we will have a
> validation step to make sure the given query is supported on the
> vectorization path; otherwise it will fall back to the current code
> path.
>
> Although we intend to follow a commit-then-review policy on the branch
> for speed of development, each patch will have an associated jira and
> will be available for review and feedback.
>
> thanks
> jitendra
>
> On Tue, Apr 2, 2013 at 8:37 PM, Namit Jain <nj...@fb.com> wrote:
>
>> It will be difficult to merge back the branch.
>> Can you stage your changes incrementally?
>>
>> I mean, start with making the operators vectorized - it can be a for
>> loop to start with? I think it will be very difficult to merge it back
>> if we diverge on this.
>> I would recommend starting with simple interfaces for operators and
>> then plugging them in slowly instead of a new branch, unless this
>> approach is extremely difficult.
>>
>> Thanks,
>> -namit
>>
>> On 4/3/13 1:52 AM, "Jitendra Pandey" <jiten...@hortonworks.com> wrote:
>>
>> >Hi Folks,
>> > I want to propose the creation of a separate branch for the HIVE-4160
>> >work. This is a significant amount of work, and support for very basic
>> >functionality will need big chunks of code. It will also take some
>> >time to stabilize and test. A separate dev branch will allow us to do
>> >this work incrementally and collaboratively. We have already uploaded
>> >a design document on the jira for comments/feedback.
>> >
>> >thanks
>> >jitendra
>> >
>> >--
>> ><http://hortonworks.com/download/>
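
For readers following the thread, the staged approach suggested in the top reply (operator interfaces that take an array of rows, driven by a plain for loop at first, with the array size initially 1) might look roughly like the sketch below. This is only an illustration under stated assumptions: `RowOperator`, `BatchOperator`, `RowLoopAdapter`, and `BatchOperatorDemo` are hypothetical names invented here, not Hive's actual operator classes, and rows are modeled as bare `Object[]` for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// All names below are hypothetical; none of them are Hive's real operator API.

/** Existing style: an operator that handles one row per call. */
interface RowOperator {
    /** Returns the (possibly transformed) row, or null if the row is filtered out. */
    Object[] process(Object[] row);
}

/** New high-level interface: an operator that handles an array of rows per call. */
interface BatchOperator {
    Object[][] processBatch(Object[][] rows);
}

/** Adapter that runs an existing row-at-a-time operator in a for loop over the batch. */
final class RowLoopAdapter implements BatchOperator {
    private final RowOperator delegate;

    RowLoopAdapter(RowOperator delegate) {
        this.delegate = delegate;
    }

    @Override
    public Object[][] processBatch(Object[][] rows) {
        List<Object[]> out = new ArrayList<>(rows.length);
        for (Object[] row : rows) {
            Object[] result = delegate.process(row);
            if (result != null) {
                out.add(result); // keep only rows the delegate did not filter out
            }
        }
        return out.toArray(new Object[0][]);
    }
}

final class BatchOperatorDemo {
    public static void main(String[] args) {
        // A toy filter: keep rows whose first column is a positive integer.
        RowOperator keepPositive = row -> ((Integer) row[0]) > 0 ? row : null;
        BatchOperator op = new RowLoopAdapter(keepPositive);

        // Initially callers can always pass a batch of size 1 ...
        Object[][] single = op.processBatch(new Object[][] {{5}});
        // ... and later pass larger batches without changing the interface.
        Object[][] many = op.processBatch(new Object[][] {{1}, {-2}, {3}});

        System.out.println(single.length + " row(s), then " + many.length + " row(s)");
    }
}
```

The point of the adapter is that callers can move to the batch interface while the per-row logic stays untouched; a truly vectorized implementation could later replace `RowLoopAdapter` behind the same interface. Whether this is practical for the expression-level optimizations described in the quoted reply is exactly what the thread is debating.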