Re: [Discuss] project chop up

Brock Noland Wed, 07 Aug 2013 13:02:56 -0700

FYI I am still waiting on Infra for the CMS move:
https://issues.apache.org/jira/browse/INFRA-6593



On Wed, Aug 7, 2013 at 2:57 PM, Edward Capriolo <[email protected]> wrote:

> I think that is a good idea. I have been thinking about it a lot. I
> especially hate how the offline build is now broken.
>
> However I think it is going to take some time. There are some tricks like
> how we build hive-exec jar that are not very clean to do in maven. I am
> very interested
>
> The last initiative we spoke about on list was moving from Forrest. I would
> like to finish/start that before we get onto the project chop up.
>
>
> On Wed, Aug 7, 2013 at 3:06 PM, Brock Noland <[email protected]> wrote:
>
> > Thus far there hasn't been any dissent to managing our modules with maven.
> > In addition there have been several positive comments on a move towards
> > maven. I'd like to add that Ivy seems to have issues managing multiple
> > versions of libraries. For example, in HIVE-3632 the Ivy cache had to be
> > cleared when testing patches that installed the new version of
> > DataNucleus. I have had the same issue on HIVE-4388. Requiring the
> > deletion of the Ivy cache is extremely painful for developers who don't
> > have access to high-bandwidth connections or live in areas far from
> > California, where most of these jars are hosted.
> >
> > I'd like to propose we move towards Maven.
> >
> >
> > On Sat, Jul 27, 2013 at 1:19 PM, Mohammad Islam <[email protected]>
> > wrote:
> >
> > >
> > >
> > > Yes hive build and test cases got convoluted as the project scope
> > > gradually increased. This is the time to take action!
> > >
> > > Based on my other Apache experiences, I prefer option #3, "Breakup the
> > > projects within our own source tree". Make multiple modules or
> > > sub-projects. By default, only key modules will be built.
> > >
> > > Maven could be a possible candidate.
> > >
> > > Regards,
> > > Mohammad
> > >
> > >
> > >
> > > ________________________________
> > >  From: Edward Capriolo <[email protected]>
> > > To: "[email protected]" <[email protected]>
> > > Sent: Saturday, July 27, 2013 7:03 AM
> > > Subject: Re: [Discuss] project chop up
> > >
> > >
> > > Or feel free to suggest a different approach. I am used to managing
> > > software as multi-module maven projects.
> > > From a development standpoint, if I were working on beeline, it would be
> > > nice to only require some of the sub-projects to be open in my IDE to do
> > > that. Also managing everything globally is not ideal.
> > >
> > > Hive's project layout, build, and test infrastructure is just funky. It
> > > has to do a few interesting things (shims, testing), but I do not think
> > > what we are doing justifies the massive ant build system we have. Ant is
> > > so ten years ago.
> > >
> > >
> > >
> > > On Sat, Jul 27, 2013 at 12:04 AM, Alan Gates <[email protected]>
> > > wrote:
> > >
> > > > But I assume they'd still be a part of targets like package, tar, and
> > > > binary?  Making them compile and test separately and explicitly load
> > > > the core Hive jars from maven/ivy seems reasonable.
> > > >
> > > > Alan.
> > > >
> > > > On Jul 26, 2013, at 8:40 PM, Brock Noland wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I think that's part of it but I'd like to decouple the downstream
> > > > > projects even further so that the only connection is the dependency
> > > > > on the hive jars.
> > > > >
> > > > > Brock
> > > > > On Jul 26, 2013 10:10 PM, "Alan Gates" <[email protected]> wrote:
> > > > >
> > > > >> I'm not sure how this is different from what hcat does today.  It
> > > > >> needs Hive's jars to compile, so it's one of the last things in the
> > > > >> compile step. Would moving the other modules you note to be in the
> > > > >> same category be enough?  Did you want to also make it so that the
> > > > >> default ant target doesn't compile those?
> > > > >>
> > > > >> Alan.
> > > > >>
> > > > >> On Jul 26, 2013, at 4:09 PM, Edward Capriolo wrote:
> > > > >>
> > > > >>> My mistake on saying hcat was a metastore fork. I had a brain fart
> > > > >>> for a moment.
> > > > >>>
> > > > >>> One way we could do this is to create a folder called downstream. In
> > > > >>> our release step we can execute the downstream builds and then copy
> > > > >>> the files we need back. So nothing downstream will be on the
> > > > >>> classpath of the main project.
> > > > >>>
> > > > >>> This could help us break up ql as well. Things like exotic file
> > > > >>> formats, and things that are pluggable like zk locking, can go here.
> > > > >>> That might be overkill.
> > > > >>>
> > > > >>> For now we can focus on building downstream, and hivethrift1 might be
> > > > >>> the first thing to try to downstream.
> > > > >>>
> > > > >>>
> > > > >>> On Friday, July 26, 2013, Thejas Nair <[email protected]> wrote:
> > > > >>>> +1 to the idea of making the build of core hive and other downstream
> > > > >>>> components independent.
> > > > >>>>
> > > > >>>> bq.  I was under the impression that Hcat and hive-metastore was
> > > > >>>> supposed to merge up somehow.
> > > > >>>>
> > > > >>>> The metastore code was never forked. Hcat was just using
> > > > >>>> hive-metastore and making the metadata available to the rest of
> > > > >>>> hadoop (pig, java MR..).
> > > > >>>> A lot of the changes that were driven by hcat goals were being made
> > > > >>>> in hive-metastore. You can think of hcat as a set of libraries that
> > > > >>>> let pig and java MR use hive metastore. Since hcat is closely tied to
> > > > >>>> hive-metastore, it makes sense to have them in the same project.
> > > > >>>>
> > > > >>>>
> > > > >>>> On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo <[email protected]> wrote:
> > > > >>>>> Also I believe hcatalog web can fall into the same designation.
> > > > >>>>>
> > > > >>>>> Question: hcatalog was initially a big hive-metastore fork. I was
> > > > >>>>> under the impression that Hcat and hive-metastore was supposed to
> > > > >>>>> merge up somehow. What is the status on that? I remember that was
> > > > >>>>> one of the core reasons we brought it in.
> > > > >>>>>
> > > > >>>>> On Friday, July 26, 2013, Edward Capriolo <[email protected]> wrote:
> > > > >>>>>> I prefer option 3 as well.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland <[email protected]> wrote:
> > > > >>>>>>>
> > > > >>>>>>> On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo <[email protected]> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> I have been developing on a dual-core 2 GB RAM laptop for years
> > > > >>>>>>>> now. With the addition of hcatalog, hive-thrift2, and some other
> > > > >>>>>>>> growth, trying to develop hive in Eclipse on this machine crawls,
> > > > >>>>>>>> especially if 'build automatically' is turned on. As we look to add
> > > > >>>>>>>> on more things this is only going to get worse.
> > > > >>>>>>>>
> > > > >>>>>>>> I am also noticing issues like this:
> > > > >>>>>>>>
> > > > >>>>>>>> https://issues.apache.org/jira/browse/HIVE-4849
> > > > >>>>>>>>
> > > > >>>>>>>> What I think we should do is strip down/out optional parts of hive.
> > > > >>>>>>>>
> > > > >>>>>>>> 1) Hive Hbase
> > > > >>>>>>>> This should really be its own project. To do this right we really
> > > > >>>>>>>> have to have multiple branches, since hbase is not backwards
> > > > >>>>>>>> compatible.
> > > > >>>>>>>>
> > > > >>>>>>>> 2) Hive Web Interface
> > > > >>>>>>>> Now really a big project but not really critical; it can just as
> > > > >>>>>>>> easily be built separately.
> > > > >>>>>>>>
> > > > >>>>>>>> 3) hive thrift 1
> > > > >>>>>>>> We have hive thrift 2 now; it is time for the sun to set on
> > > > >>>>>>>> hivethrift1.
> > > > >>>>>>>>
> > > > >>>>>>>> 4) odbc
> > > > >>>>>>>> Not entirely convinced about this one but it is really not critical
> > > > >>>>>>>> to running hive.
> > > > >>>>>>>>
> > > > >>>>>>>> What I think we should do is create sub-projects for the above
> > > > >>>>>>>> things or simply move them into directories that do not build with
> > > > >>>>>>>> hive. Ideally they would use maven to pull dependencies.
> > > > >>>>>>>>
> > > > >>>>>>>> What does everyone think?
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> I agree that projects like the HBase handler and probably others as
> > > > >>>>>>> well should somehow be "downstream" projects which simply depend on
> > > > >>>>>>> the hive jars.  I see a couple alternatives for this:
> > > > >>>>>>>
> > > > >>>>>>> * Take the "module" in question to the Apache Incubator
> > > > >>>>>>> * Move the "module" in question to the Apache Extras
> > > > >>>>>>> * Breakup the projects within our own source tree
> > > > >>>>>>>
> > > > >>>>>>> I'd prefer the third option at this point.
> > > > >>>>>>>
> > > > >>>>>>> Brock
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Brock
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>
> > > > >>
> > > > >>
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
> >
>



-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
