We did consider implementing these changes on trunk, but it would take several patches in various parts of the code before a simple end-to-end query could be executed on the vectorized path. For example, a patch for vectorized expressions will be a significant amount of code, but it will not be used in a query until a vectorized operator is implemented and the query plan is modified to use the vectorized path. Vectorizing even basic expressions is non-trivial because we need to optimize for cases such as chains of expressions, non-null columns, and repeating values, and also handle nullable columns, short-circuit optimization, and so on. Careful handling of these cases is important for the performance gains.
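To make the special cases above concrete, here is a minimal sketch of what a single vectorized expression might look like. The class and field names (`LongColumn`, `AddScalar`, `noNulls`, `isRepeating`) are hypothetical and chosen only for illustration; they are not taken from the HIVE-4160 patches or design document.

```java
/** Hypothetical column batch: a primitive array plus null/repeat metadata. */
final class LongColumn {
    final long[] values;
    final boolean[] isNull;
    boolean noNulls = true;      // when true, isNull can be ignored entirely
    boolean isRepeating = false; // when true, only index 0 is meaningful

    LongColumn(int size) {
        values = new long[size];
        isNull = new boolean[size];
    }
}

/** Hypothetical vectorized expression: out = in + scalar over a batch of n rows. */
final class AddScalar {
    static void evaluate(LongColumn in, long scalar, LongColumn out, int n) {
        out.isRepeating = in.isRepeating;
        out.noNulls = in.noNulls;

        if (in.isRepeating) {
            // Repeating values: one computation stands for the whole batch.
            out.isNull[0] = !in.noNulls && in.isNull[0];
            if (!out.isNull[0]) {
                out.values[0] = in.values[0] + scalar;
            }
            return;
        }
        if (in.noNulls) {
            // Non-null column: tight loop with no per-row null check.
            for (int i = 0; i < n; i++) {
                out.values[i] = in.values[i] + scalar;
            }
            return;
        }
        // Nullable column: propagate nulls and compute only non-null rows.
        for (int i = 0; i < n; i++) {
            out.isNull[i] = in.isNull[i];
            if (!in.isNull[i]) {
                out.values[i] = in.values[i] + scalar;
            }
        }
    }
}
```

Even this one-operand expression needs three code paths. Chaining expressions and short-circuit evaluation would add more on top of these, which is why the expression patches alone are expected to be large.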
Committing those intermediate patches to trunk without stabilizing them in a branch first could be a cause for concern. A separate branch will let us make incremental changes to the system, so that each patch addresses a single feature or piece of functionality and is small enough to review. We will make sure the branch is frequently updated with changes from trunk to avoid conflicts at merge time. Also, we plan to propose merging the branch as soon as a basic end-to-end query works and is sufficiently tested, instead of waiting for all operators to be vectorized.

Initially, our target is to make the select and filter operators work with vectorized expressions for primitive types. We will have a single global configuration flag that can be used to turn off the entire vectorization code path, and we will specifically test that there is no regression on the current system when this flag is off. When vectorization is turned on, a validation step will check that the given query is supported on the vectorized path; otherwise it will fall back to the current code path. Although we intend to follow a commit-then-review policy on the branch for speed of development, each patch will have an associated JIRA and will be available for review and feedback.

thanks
jitendra

On Tue, Apr 2, 2013 at 8:37 PM, Namit Jain <nj...@fb.com> wrote:

> It will be difficult to merge back the branch.
> Can you stage your changes incrementally ?
>
> I mean, start with the making the operators vectorized - it can be a for
> loop to start with ? I think it will be very difficult to merge it back
> if we diverge on this.
> I would recommend starting with simple interfaces for operators and then
> plugging them in slowly instead of a new branch, unless this approach is
> extremely difficult.
>
>
> Thanks,
> -namit
>
> On 4/3/13 1:52 AM, "Jitendra Pandey" <jiten...@hortonworks.com> wrote:
>
> >Hi Folks,
> >    I want to propose for creation of a separate branch for HIVE-4160
> >work. This is a significant amount of work, and support for very basic
> >functionality will need big chunks of code. It will also take some time to
> >stabilize and test. A separate dev branch will allow us to do this work
> >incrementally and collaboratively. We have already uploaded a design
> >document on the jira for comments/feedback.
> >
> >thanks
> >jitendra
> >
> >
> >--
> ><http://hortonworks.com/download/>
>
>

--
<http://hortonworks.com/download/>