Hi Ian,

Could you specify the exact class name of the index stress test? I would
like to look at it. Thanks.

Best,
Taewoo

On Tue, Jun 2, 2015 at 9:05 AM, Ian Maxon <[email protected]> wrote:

> I'm in favor of merging them as well. Keeping the git repositories separate
> doesn't enforce any kind of architectural separation, it just makes build +
> test more complex. Nearly every major change is using the topic field hack
> by this point.
> I think the only downside is that the tests will take longer, but that may
> need to be revisited anyway (in Hyracks, the index stress tests- especially
> for inverted indexes- take far too long).
>
> Another .02¢ :)
>
> - Ian
>
> On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <[email protected]> wrote:
>
> > Chris,
> >
> > Thanks for the input!!
> >
> > >>1. If we're serious about Hyracks being a re-usable component of other
> > >>products, it makes sense to dogfood that in Asterixdb. If there are
> > >>problems keeping Hyracks separate from Asterix or keeping Hyracks with
> > >>clean interfaces, this forces us to address them.
> >
> > In my opinion, merging the repositories doesn't break the separation of
> > hyracks and asterixdb, because the dependencies are controlled by mvn pom
> > files. We just make the code physically live together under the root
> > directory: one is hyracks as it is and the other is asterixdb as it is.
> > For example, Spark lives together with all the things on top of it and that
> > doesn't seem to prevent its reusability. Hadoop lived together with
> > Hive/Pig/Zookeeper in the same repo until 2010, by which time it was very
> > stable.
> >
> > Currently almost all my changes span hyracks and asterixdb. I believe
> > many other people suffer from that as well. Merging them will have the
> > following benefits:
> > 1) It forces hyracks-only changes to pass the asterixdb regression
> > tests. Currently hyracks-only changes are not verified by asterixdb tests.
> > 2) On my local machine, I would not have to install hyracks every time
> > before verifying asterixdb. Switching branches is especially painful
> > because the installed hyracks snapshot keeps getting overwritten.
> > 3) I only need to make one code review request and one jenkins job.
> > Currently I need to manually change the topic of my asterixdb gerrit CL
> > every time before I update my hyracks CL, and then manually schedule
> > jenkins to run a new asterixdb job. If I forget to schedule the jenkins
> > job, the asterixdb CL is still shown to be "verified by jenkins".
> >
> > >>2. We only just recently took the initiative to take Pregelix and
> > >>Hivesterix *out* of the same repository, and that was because they were
> > >>specifically causing us problems as components of the same build. (There
> > >>were issues of competing dependency versions with Ian's YARN work, as well
> > >>as several spurious pregelix test failures, as I recall.) At a bare
> > >>minimum, we cannot merge those projects back in without re-researching and
> > >>addressing those problems.
> >
> > Those will definitely be fixed before Pregelix and IMRU are merged
> > back. Hivesterix is dead and will not be merged. I'm not proposing that we
> > bring Pregelix and IMRU in now, but rather later, when they are ready.
> >
> > Best,
> > Yingyi
> >
> >
> >
> >
> > On Mon, Jun 1, 2015 at 5:15 PM, Chris Hillery <[email protected]> wrote:
> >
> > > My $.02 - no, we shouldn't.
> > >
> > > Two main reasons:
> > >
> > > 1. If we're serious about Hyracks being a re-usable component of other
> > > products, it makes sense to dogfood that in Asterixdb. If there are
> > > problems keeping Hyracks separate from Asterix or keeping Hyracks with
> > > clean interfaces, this forces us to address them.
> > >
> > > 2. We only just recently took the initiative to take Pregelix and
> > > Hivesterix *out* of the same repository, and that was because they were
> > > specifically causing us problems as components of the same build. (There
> > > were issues of competing dependency versions with Ian's YARN work, as well
> > > as several spurious pregelix test failures, as I recall.) At a bare
> > > minimum, we cannot merge those projects back in without re-researching and
> > > addressing those problems.
> > >
> > > What benefits would we gain by merging them? I honestly don't agree with
> > > Yingyi's suggestion that it would make building, bug-fixing, and code
> > > review much simpler. At best it would help a bit on those occasions when a
> > > change spans Hyracks and Asterix, and again, IMHO that is something that
> > > *should* require additional thought and oversight. As for build and test,
> > > my feeling is that it will make it considerably harder, or at the very
> > > least slower, simply due to doubling the Maven overhead.
> > >
> > > I do not feel that merging the projects to either fit in better with
> > > Apache, or to game the Apache popularity indexes, is a good trade-off.
> > >
> > > Ceej
> > > aka Chris Hillery
> > >
> > > On Mon, Jun 1, 2015 at 12:02 PM, Yingyi Bu <[email protected]> wrote:
> > >
> > >> Hi folks,
> > >>
> > >>     Should we merge hyracks, asterixdb, and potentially pregelix/imru
> > >> into the same repository?   It would make the build, fix, and code review
> > >> process much simpler.
> > >>     An example is that everything built on top of Spark lives in the same
> > >> repository:  https://github.com/apache/spark.   That's also why Spark is
> > >> the most active Apache project now, due to its commit frequency.
> > >>     Does anyone have concerns about merging the hyracks and asterixdb
> > >> repositories?
> > >>     Thanks!
> > >>
> > >> Best,
> > >> Yingyi
> > >>
> > >>
> > >> On Wed, Apr 22, 2015 at 10:13 PM, Till Westmann <[email protected]> wrote:
> > >>
> > >>> Ok, let’s find out what is the “more work” part before we decide :)
> > >>>
> > >>> We should already have the SGA (as it’s part of the SGA that Mike sent
> > >>> in) and it seemed to me that all we’d need to do “later” (e.g. next
> > >>> week/month) would be to
> > >>> a) vote on bringing it into AsterixDB (that would be an incubator vote, I
> > >>> assume) and
> > >>> b) ask infra for another git repository.
> > >>> So the extra work would be the vote on the incubator list.
> > >>> Is that right or is there something else we’d need to do?
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>> On Apr 22, 2015, at 10:04 PM, Mattmann, Chris A (3980) <
> > >>> [email protected]> wrote:
> > >>>
> > >>> Hey Mike and team,
> > >>>
> > >>> Thanks for bringing this to the list. I think these are precisely
> > >>> the type of conversations that we want to have here at the ASF and
> > >>> as part of our Incubating project. Having these discussions in the
> > >>> community here at the ASF (which is now the Apache AsterixDB community)
> > >>> is great.
> > >>>
> > >>> My opinion - it’s fine either way. I’m happy if you guys want to
> > >>> bring Pregelix into the code base here via AsterixDB. It’s easily
> > >>> reversible and incremental. If you want to spin out Pregelix later
> > >>> as its own TLP and it’s shown to have its own community we can
> > >>> file a board resolution to do that. Heck, nothing stops us from
> > >>> graduating 2 Incubator projects=>TLPs out of this effort even in
> > >>> the Incubator. That’s fine. If you want to wait and bring it in
> > >>> later, it will definitely be more work - so let’s call a spade a
> > >>> spade there. But if you want to do that that’s fine too.
> > >>>
> > >>> My personal recommendation - bring it in - won’t hurt and we can
> > >>> always pivot in the ways above later.
> > >>>
> > >>> Cheers,
> > >>> Chris
> > >>>
> > >>>
> > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >>> Chris Mattmann, Ph.D.
> > >>> Chief Architect
> > >>> Instrument Software and Science Data Systems Section (398)
> > >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > >>> Office: 168-519, Mailstop: 168-527
> > >>> Email: [email protected]
> > >>> WWW:  http://sunset.usc.edu/~mattmann/
> > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >>> Adjunct Associate Professor, Computer Science Department
> > >>> University of Southern California, Los Angeles, CA 90089 USA
> > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> -----Original Message-----
> > >>> From: Michael Carey <[email protected]>
> > >>> Date: Tuesday, April 21, 2015 at 11:49 AM
> > >>> To: Chris Mattmann <[email protected]>, Till Westmann <[email protected]>
> > >>> Cc: Chris Hillery <[email protected]>, Ian Maxon <[email protected]>,
> > >>> Yingyi Bu <[email protected]>, "[email protected]" <[email protected]>
> > >>> Subject: Re: Migration of git repository
> > >>>
> > >>> Sure!  Let me clarify the issue for everyone (and broaden the question).
> > >>>
> > >>> One of the technical by-products of the AsterixDB project is a graph
> > >>> analytics package called Pregelix - as the name suggests, it is a "knock
> > >>> off" of Pregel, as are packages like Giraph.  What's unique about
> > >>> Pregelix is that it actually scales without OOM'ing - under the covers
> > >>> it uses database join processing techniques.  You can find out more
> > >>> about it by visiting http://pregelix.ics.uci.edu/ and/or by skimming
> > >>> the attached paper - check out the experimental results compared to
> > >>> other popular alternatives.  Anyway, we have made it freely available
> > >>> (as we do all of our AsterixDB-related research products) and we were
> > >>> thinking that we should simply include it under the AsterixDB project -
> > >>> kind of like Spark has subprojects for SQL, streams, graphs, etc.  As a
> > >>> result, I listed it on the list of transferred artifacts when I sent in
> > >>> the licensing form the other day.  (So we at least have that step
> > >>> done.)  Its code contributors have been a small subset of the AsterixDB
> > >>> team; it was a small sub-project, basically.  (Mostly just Yingyi Bu!)
> > >>>
> > >>> Pregelix is kind of a sibling of Apache VXQuery in that its runtime is
> > >>> based on Hyracks but it hasn't otherwise been AsterixDB-dependent.
> > >>> However, we have just finished teaching it to read/write directly from
> > >>> AsterixDB native storage - instead of just HDFS - so now it has an
> > >>> AsterixDB dependency, and we are using it as a driving example of how
> > >>> to couple AsterixDB to other analytic engines.
> > >>>
> > >>> Rather than going through another exercise to open-source this
> > >>> separately, it seemed like we could take this approach.
> > >>>
> > >>> Thoughts?
> > >>> Cheers,
> > >>> Mike
> > >>>
> > >>>
> > >>> On 4/21/15 7:45 AM, Mattmann, Chris A (3980) wrote:
> > >>>
> > >>>
> > >>> Yes, in fact, this whole conversation should be happening on
> > >>> the dev list. OK for me to CC them on my reply?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> -----Original Message-----
> > >>> From: "Michael J. Carey" <[email protected]>
> > >>> Date: Tuesday, April 21, 2015 at 3:13 AM
> > >>> To: Till Westmann <[email protected]>
> > >>> Cc: Chris Hillery <[email protected]>, Ian Maxon <[email protected]>,
> > >>> Yingyi Bu <[email protected]>, Chris Mattmann <[email protected]>
> > >>> Subject: Re: Migration of git repository
> > >>>
> > >>> + Yingyi on the Pregelix Q.  Should we also ask Chris M for advice on
> > >>> that?
> > >>> On Apr 20, 2015 4:23 PM, "Till Westmann" <[email protected]> wrote:
> > >>>
> > >>> Hi Ian,
> > >>>
> > >>>
> > >>> That’s a good question - and I don’t know the answer.
> > >>> We’ve got 2 repos so far:
> > >>>
> > >>>
> >
> > >>> https://issues.apache.org/jira/browse/INFRA-9212
> > >>> https://issues.apache.org/jira/browse/INFRA-9306
> > >>> so we should have space for Hyracks and AsterixDB.
> > >>>
> > >>>
> > >>> I think that there’s an open question about Pregelix, but maybe that
> > >>> shouldn’t keep us from going ahead.
> > >>>
> > >>>
> > >>> I further think that it would be great if you could send an e-mail to
> > >>> [email protected] and ask if it’s ok to import
> > >>> our git repo(s) or if something else needs to be done first. (I could
> > >>> send that e-mail as well, but it would be great if there were more
> > >>> non-Till e-mails on the list :) )
> > >>>
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>>
> > >>> On Apr 20, 2015, at 4:07 PM, Ian Maxon <[email protected]> wrote:
> > >>>
> > >>> Hi Mike, Chris and Till,
> > >>>
> > >>>
> > >>> Since (I think?) the paperwork for the software grant is done now, should
> > >>> I copy our GC branches over to the ASF git repositories now (as well as
> > >>> making it a mirror in the Gerrit commit hook script)?
> > >>>
> > >>>
> > >>> Thanks,
> > >>> - Ian
> > >>>
> > >>>
> > >>>
> > >>
> > >
> >
>
