I am open to any approach that makes the build and the project more modular.
Maven is no silver bullet, but it does have some big positives. Most
importantly, I have been a little distressed with Ivy when changing library
versions: in HIVE-3632 it caused hundreds of tests to fail for no good
reason.  I have hit this several times on Ivy projects.  As has been
reiterated before, offline and Eclipse support are built in.  IMO alignment
with the other Hadoop ecosystem projects is another win.


On Sat, Jul 27, 2013 at 9:03 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> Or feel free to suggest a different approach. I am used to managing software
> as multi-module maven projects.
> From a development standpoint if I was working on beeline, it would be nice
> to only require some of the sub-projects to be open in my IDE to do that.
> Also managing everything globally is not ideal.
>
> Hive's project layout, build, and test infrastructure is just funky. It has
> to do a few interesting things (shims, testing), but I do not think what we
> are doing justifies the massive ant build system we have. Ant is so ten
> years ago.
>
>
>
> On Sat, Jul 27, 2013 at 12:04 AM, Alan Gates <ga...@hortonworks.com>
> wrote:
>
> > But I assume they'd still be a part of targets like package, tar, and
> > binary?  Making them compile and test separately and explicitly load the
> > core Hive jars from maven/ivy seems reasonable.
> >
> > Alan.
> >
> > On Jul 26, 2013, at 8:40 PM, Brock Noland wrote:
> >
> > > Hi,
> > >
> > > I think that's part of it, but I'd like to decouple the downstream projects
> > > even further so that the only connection is the dependency on the hive jars.
> > >
> > > Brock
> > > On Jul 26, 2013 10:10 PM, "Alan Gates" <ga...@hortonworks.com> wrote:
> > >
> > >> I'm not sure how this is different from what hcat does today.  It needs
> > >> Hive's jars to compile, so it's one of the last things in the compile step.
> > >> Would moving the other modules you note to be in the same category be
> > >> enough?  Did you want to also make it so that the default ant target
> > >> doesn't compile those?
> > >>
> > >> Alan.
> > >>
> > >> On Jul 26, 2013, at 4:09 PM, Edward Capriolo wrote:
> > >>
> > >>> My mistake on saying hcat was a metastore fork. I had a brain fart for a
> > >>> moment.
> > >>>
> > >>> One way we could do this is create a folder called downstream. In our
> > >>> release step we can execute the downstream builds and then copy the files
> > >>> we need back. So nothing downstream will be on the classpath of the main
> > >>> project.
> > >>>
> > >>> This could help us break up ql as well. Things like exotic file formats,
> > >>> and things that are pluggable like zk locking, can go here. That might be
> > >>> overkill.
> > >>>
> > >>> For now we can focus on building downstream, and hivethrift1 might be the
> > >>> first thing to try to move downstream.
> > >>>
> > >>>
> > >>> On Friday, July 26, 2013, Thejas Nair <the...@hortonworks.com> wrote:
> > >>>> +1 to the idea of making the build of core hive and other downstream
> > >>>> components independent.
> > >>>>
> > >>>> bq.  I was under the impression that Hcat and hive-metastore was
> > >>>> supposed to merge up somehow.
> > >>>>
> > >>>> The metastore code was never forked. Hcat was just using
> > >>>> hive-metastore and making the metadata available to rest of hadoop
> > >>>> (pig, java MR..).
> > >>>> A lot of the changes that were driven by hcat goals were being made in
> > >>>> hive-metastore. You can think of hcat as a set of libraries that let pig
> > >>>> and java MR use hive metastore. Since hcat is closely tied to
> > >>>> hive-metastore, it makes sense to have them in the same project.
> > >>>>
> > >>>>
> > >>>> On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > >>>>> Also, I believe hcatalog web can fall into the same designation.
> > >>>>>
> > >>>>> Question: hcatalog was initially a big hive-metastore fork. I was under the
> > >>>>> impression that Hcat and hive-metastore was supposed to merge up somehow.
> > >>>>> What is the status on that? I remember that was one of the core reasons we
> > >>>>> brought it in.
> > >>>>>
> > >>>>> On Friday, July 26, 2013, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > >>>>>> I prefer option 3 as well.
> > >>>>>>
> > >>>>>>
> > >>>>>> On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland <br...@cloudera.com> wrote:
> > >>>>>>>
> > >>>>>>> On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> I have been developing on a dual-core, 2 GB RAM laptop for years
> > >>>>>>>> now. With the addition of hcatalog, hive-thrift2, and some other growth,
> > >>>>>>>> trying to develop hive in Eclipse on this machine crawls, especially if
> > >>>>>>>> 'build automatically' is turned on. As we look to add on more things this
> > >>>>>>>> is only going to get worse.
> > >>>>>>>>
> > >>>>>>>> I am also noticing issues like this:
> > >>>>>>>>
> > >>>>>>>> https://issues.apache.org/jira/browse/HIVE-4849
> > >>>>>>>>
> > >>>>>>>> What I think we should do is strip down/out optional parts of
> > hive.
> > >>>>>>>>
> > >>>>>>>> 1) Hive Hbase
> > >>>>>>>> This should really be its own project. To do this right we really have to
> > >>>>>>>> have multiple branches since hbase is not backwards compatible.
> > >>>>>>>>
> > >>>>>>>> 2) Hive Web Interface
> > >>>>>>>> Now really a big project but not really critical; can just as easily be
> > >>>>>>>> built separately.
> > >>>>>>>>
> > >>>>>>>> 3) hive thrift 1
> > >>>>>>>> We have hive thrift 2 now, it is time for the sun to set on hivethrift1.
> > >>>>>>>>
> > >>>>>>>> 4) odbc
> > >>>>>>>> Not entirely convinced about this one but it is really not critical to
> > >>>>>>>> running hive.
> > >>>>>>>>
> > >>>>>>>> What I think we should do is create sub-projects for the above things or
> > >>>>>>>> simply move them into directories that do not build with hive. Ideally they
> > >>>>>>>> would use maven to pull dependencies.
> > >>>>>>>>
> > >>>>>>>> What does everyone think?
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> I agree that projects like the HBase handler and probably others as well
> > >>>>>>> should somehow be "downstream" projects which simply depend on the hive
> > >>>>>>> jars.  I see a couple alternatives for this:
> > >>>>>>>
> > >>>>>>> * Take the "module" in question to the Apache Incubator
> > >>>>>>> * Move the "module" in question to the Apache Extras
> > >>>>>>> * Break up the projects within our own source tree
> > >>>>>>>
> > >>>>>>> I'd prefer the third option at this point.
> > >>>>>>>
> > >>>>>>> Brock
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Brock
> > >>>>>>
> > >>>>>>
> > >>>>
> > >>
> > >>
> >
> >
>



-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
