Ian, thank you for the information.

Best,
Taewoo
On Tue, Jun 2, 2015 at 9:33 AM, Ian Maxon <[email protected]> wrote:

> Hi Taewoo,
> It's really anything in
> hyracks-tests/hyracks-storage-am-lsm-invertedindex-test (besides the
> tokenizer test). All of the tests in that package alone take over 20
> minutes. Each one takes about 2 minutes.
>
> Thanks,
> - Ian
>
> On Tue, Jun 2, 2015 at 9:13 AM, Taewoo Kim <[email protected]> wrote:
>
> > Hi Ian,
> >
> > Could you specify the exact class name of the index stress test? I would
> > like to look at it. Thanks.
> >
> > Best,
> > Taewoo
> >
> > On Tue, Jun 2, 2015 at 9:05 AM, Ian Maxon <[email protected]> wrote:
> >
> > > I'm in favor of merging them as well. Keeping the git repositories
> > > separate doesn't enforce any kind of architectural separation, it just
> > > makes build + test more complex. Nearly every major change is using the
> > > topic field hack by this point.
> > > I think the only downside is that the tests will take longer, but that
> > > may need to be revisited anyway (in Hyracks, the index stress tests -
> > > especially for inverted indexes - take far too long).
> > >
> > > Another .02¢ :)
> > >
> > > - Ian
> > >
> > > On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <[email protected]> wrote:
> > >
> > > > Chris,
> > > >
> > > > Thanks for the input!!
> > > >
> > > > >>1. If we're serious about Hyracks being a re-usable component of other
> > > > >>products, it makes sense to dogfood that in Asterixdb. If there are
> > > > >>problems keeping Hyracks separate from Asterix or keeping Hyracks with
> > > > >>clean interfaces, this forces us to address them.
> > > >
> > > > In my opinion, merging the repository doesn't break the separation of
> > > > hyracks and asterixdb, because the dependencies are controlled by mvn pom
> > > > files. We just make the code physically live together under the root
> > > > directory, one is hyracks as it is and the other is asterixdb as it is.
> > > > For example, Spark lives together with all the things on top of it and
> > > > that doesn't seem to prevent its reusability. Hadoop lived together with
> > > > Hive/Pig/Zookeeper in the same repo until 2010, when it was very stable.
> > > >
> > > > Currently almost all my changes span hyracks and asterixdb. I
> > > > believe many people also suffer from that. Merging them together will
> > > > have the following benefits:
> > > > 1) It forces those hyracks-only changes to pass asterixdb regression
> > > > tests. Currently hyracks-only changes are not verified by asterixdb
> > > > tests.
> > > > 2) On my local machine, I don't need to always install hyracks and then
> > > > verify asterixdb from time to time. In particular, switching branches
> > > > seems painful because the installed hyracks snapshot is overwritten from
> > > > time to time.
> > > > 3) I only need to make one code review request and one jenkins job.
> > > > Currently I need to manually change the topic of my asterixdb gerrit CL
> > > > every time before I update my hyracks CL, and then manually schedule
> > > > jenkins to run a new asterixdb job. If I forget to schedule the jenkins
> > > > job, the asterixdb CL is still shown to be "verified by jenkins".
> > > >
> > > > >>2. We only just recently took the initiative to take Pregelix and
> > > > >>Hivesterix *out* of the same repository, and that was because they were
> > > > >>specifically causing us problems as components of the same build.
> > > > >>(There were issues of competing dependency versions with Ian's YARN
> > > > >>work, as well as several spurious pregelix test failures, as I recall.)
> > > > >>At a bare minimum, we cannot merge those projects back in without
> > > > >>re-researching and addressing those problems.
> > > >
> > > > Those will definitely be fixed before Pregelix and IMRU are merged
> > > > back. Hivesterix is dead and will not be merged.
> > > > I'm not proposing that we should bring Pregelix and IMRU in now, but
> > > > to do that later when they are ready.
> > > >
> > > > Best,
> > > > Yingyi
> > > >
> > > > On Mon, Jun 1, 2015 at 5:15 PM, Chris Hillery <[email protected]> wrote:
> > > >
> > > > > My $.02 - no, we shouldn't.
> > > > >
> > > > > Two main reasons:
> > > > >
> > > > > 1. If we're serious about Hyracks being a re-usable component of other
> > > > > products, it makes sense to dogfood that in Asterixdb. If there are
> > > > > problems keeping Hyracks separate from Asterix or keeping Hyracks with
> > > > > clean interfaces, this forces us to address them.
> > > > >
> > > > > 2. We only just recently took the initiative to take Pregelix and
> > > > > Hivesterix *out* of the same repository, and that was because they
> > > > > were specifically causing us problems as components of the same build.
> > > > > (There were issues of competing dependency versions with Ian's YARN
> > > > > work, as well as several spurious pregelix test failures, as I recall.)
> > > > > At a bare minimum, we cannot merge those projects back in without
> > > > > re-researching and addressing those problems.
> > > > >
> > > > > What benefits would we gain by merging them? I honestly don't agree
> > > > > with Yingyi's suggestion that it would make building, bug-fixing, and
> > > > > code review much simpler. At best it would help a bit on those
> > > > > occasions when a change spans Hyracks and Asterix, and again, IMHO that
> > > > > is something that *should* require additional thought and oversight. As
> > > > > for build and test, my feeling is that it will make it considerably
> > > > > harder, or at the very least slower, simply due to doubling the Maven
> > > > > overhead.
> > > > >
> > > > > I do not feel that merging the projects to either fit in better with
> > > > > Apache, or to game the Apache popularity indexes, is a good trade-off.
> > > > >
> > > > > Ceej
> > > > > aka Chris Hillery
> > > > >
> > > > > On Mon, Jun 1, 2015 at 12:02 PM, Yingyi Bu <[email protected]> wrote:
> > > > >
> > > > >> Hi folks,
> > > > >>
> > > > >> Should we merge hyracks, asterixdb, and potentially pregelix/imru
> > > > >> into the same repository? It will make the build, fix, and code review
> > > > >> process much simpler.
> > > > >> An example is that everything built on top of Spark lives in the same
> > > > >> repository: https://github.com/apache/spark. That's also why Spark is
> > > > >> the most active Apache project now, due to its commit frequency.
> > > > >> Does anyone have concerns about merging the hyracks and asterixdb
> > > > >> repositories?
> > > > >> Thanks!
> > > > >>
> > > > >> Best,
> > > > >> Yingyi
> > > > >>
> > > > >> On Wed, Apr 22, 2015 at 10:13 PM, Till Westmann <[email protected]> wrote:
> > > > >>
> > > > >>> Ok, let’s find out what is the “more work” part before we decide :)
> > > > >>>
> > > > >>> We should already have the SGA (as it’s part of the SGA that Mike sent
> > > > >>> in) and it seemed to me that all we’d need to do “later” (e.g. next
> > > > >>> week/month) would be to
> > > > >>> a) vote on bringing it into AsterixDB (that would be an incubator vote
> > > > >>> I assume) and
> > > > >>> b) ask infra for another git repository.
> > > > >>> So the extra work would be the vote on the incubator list.
> > > > >>> Is that right or is there something else we’d need to do?
> > > > >>>
> > > > >>> Cheers,
> > > > >>> Till
> > > > >>>
> > > > >>> On Apr 22, 2015, at 10:04 PM, Mattmann, Chris A (3980) <[email protected]> wrote:
> > > > >>>
> > > > >>> Hey Mike and team,
> > > > >>>
> > > > >>> Thanks for bringing this to the list. I think these are precisely
> > > > >>> the type of conversations that we want to have here at the ASF and
> > > > >>> as part of our Incubating project. Having these discussions in the
> > > > >>> community here at the ASF (which is now the Apache AsterixDB community)
> > > > >>> is great.
> > > > >>>
> > > > >>> My opinion - it’s fine either way. I’m happy if you guys want to
> > > > >>> bring Pregelix into the code base here via AsterixDB. It’s easily
> > > > >>> reversible and incremental. If you want to spin out Pregelix later
> > > > >>> as its own TLP and it’s shown to have its own community we can
> > > > >>> file a board resolution to do that. Heck, nothing stops us from
> > > > >>> graduating 2 Incubator projects=>TLPs out of this effort even in
> > > > >>> the Incubator. That’s fine. If you want to wait and bring it in
> > > > >>> later, it will definitely be more work - so let’s call a spade a
> > > > >>> spade there. But if you want to do that that’s fine too.
> > > > >>>
> > > > >>> My personal recommendation - bring it in - won’t hurt and we can
> > > > >>> always pivot in the ways above later.
> > > > >>>
> > > > >>> Cheers,
> > > > >>> Chris
> > > > >>>
> > > > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >>> Chris Mattmann, Ph.D.
> > > > >>> Chief Architect
> > > > >>> Instrument Software and Science Data Systems Section (398)
> > > > >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > > > >>> Office: 168-519, Mailstop: 168-527
> > > > >>> Email: [email protected]
> > > > >>> WWW: http://sunset.usc.edu/~mattmann/
> > > > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >>> Adjunct Associate Professor, Computer Science Department
> > > > >>> University of Southern California, Los Angeles, CA 90089 USA
> > > > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >>>
> > > > >>> -----Original Message-----
> > > > >>> From: Michael Carey <[email protected]>
> > > > >>> Date: Tuesday, April 21, 2015 at 11:49 AM
> > > > >>> To: Chris Mattmann <[email protected]>, Till Westmann <[email protected]>
> > > > >>> Cc: Chris Hillery <[email protected]>, Ian Maxon <[email protected]>, Yingyi Bu <[email protected]>, "[email protected]" <[email protected]>
> > > > >>> Subject: Re: Migration of git repository
> > > > >>>
> > > > >>> Sure! Let me clarify the issue for everyone (and broaden the question).
> > > > >>>
> > > > >>> One of the technical by-products of the AsterixDB project is a graph
> > > > >>> analytics package called Pregelix - as the name suggests, it is a
> > > > >>> "knock off" of Pregel, as are packages like Giraph. What's unique about
> > > > >>> Pregelix is that it actually scales without OOM'ing - under the covers
> > > > >>> it uses database join processing techniques. You can find out more
> > > > >>> about it by visiting http://pregelix.ics.uci.edu/ and/or by skimming
> > > > >>> the attached paper - check out the experimental results compared to
> > > > >>> other popular alternatives.
> > > > >>> Anyway, we have made it freely available (as we do all of our
> > > > >>> AsterixDB-related research products) and we were thinking that we
> > > > >>> should simply include it under the AsterixDB project - kind of like
> > > > >>> Spark has subprojects for SQL, streams, graphs, etc. As a result, I
> > > > >>> listed it on the list of transferred artifacts when I sent in the
> > > > >>> licensing form the other day. (So we at least have that step done.)
> > > > >>> Its code contributors have been a small subset of the AsterixDB team;
> > > > >>> it was a small sub-project, basically. (Mostly just Yingyi Bu!)
> > > > >>>
> > > > >>> Pregelix is kind of a sibling of Apache VXQuery in that its runtime is
> > > > >>> based on Hyracks but it hasn't otherwise been AsterixDB-dependent.
> > > > >>> However, we have just finished teaching it to read/write directly from
> > > > >>> AsterixDB native storage - instead of just HDFS - so now it has an
> > > > >>> AsterixDB dependency, and we are using it as a driving example of how
> > > > >>> to couple AsterixDB to other analytic engines.
> > > > >>>
> > > > >>> Rather than going through another exercise to open-source this
> > > > >>> separately, it seemed like we could take this approach.
> > > > >>>
> > > > >>> Thoughts?
> > > > >>> Cheers,
> > > > >>> Mike
> > > > >>>
> > > > >>> On 4/21/15 7:45 AM, Mattmann, Chris A (3980) wrote:
> > > > >>>
> > > > >>> Yes, in fact, this whole conversation should be happening on
> > > > >>> the dev list. OK for me to CC them on my reply?
> > > > >>>
> > > > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >>> Chris Mattmann, Ph.D.
> > > > >>> Chief Architect
> > > > >>> Instrument Software and Science Data Systems Section (398)
> > > > >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > > > >>> Office: 168-519, Mailstop: 168-527
> > > > >>> Email: [email protected]
> > > > >>> WWW: http://sunset.usc.edu/~mattmann/
> > > > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >>> Adjunct Associate Professor, Computer Science Department
> > > > >>> University of Southern California, Los Angeles, CA 90089 USA
> > > > >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >>>
> > > > >>> -----Original Message-----
> > > > >>> From: "Michael J. Carey" <[email protected]>
> > > > >>> Date: Tuesday, April 21, 2015 at 3:13 AM
> > > > >>> To: Till Westmann <[email protected]>
> > > > >>> Cc: Chris Hillery <[email protected]>, Ian Maxon <[email protected]>, Yingyi Bu <[email protected]>, Chris Mattmann <[email protected]>
> > > > >>> Subject: Re: Migration of git repository
> > > > >>>
> > > > >>> + Yingyi on the Pregelix Q. Should we also ask Chris M for advice on
> > > > >>> that?
> > > > >>>
> > > > >>> On Apr 20, 2015 4:23 PM, "Till Westmann" <[email protected]> wrote:
> > > > >>>
> > > > >>> Hi Ian,
> > > > >>>
> > > > >>> That’s a good question - and I don’t know the answer.
> > > > >>> We’ve got 2 repos so far:
> > > > >>>
> > > > >>> https://issues.apache.org/jira/browse/INFRA-9212
> > > > >>> https://issues.apache.org/jira/browse/INFRA-9306
> > > > >>>
> > > > >>> so we should have space for Hyracks and AsterixDB.
> > > > >>>
> > > > >>> I think that there’s an open question about Pregelix, but maybe that
> > > > >>> shouldn’t keep us from going ahead.
> > > > >>>
> > > > >>> I further think that it would be great if you could send an e-mail to
> > > > >>> [email protected] and ask if it’s ok to import
> > > > >>> our git repo(s) or if something else needs to be done first. (I could
> > > > >>> send that e-mail as well, but it would be great if there were more
> > > > >>> non-Till e-mails on the list :) )
> > > > >>>
> > > > >>> Cheers,
> > > > >>> Till
> > > > >>>
> > > > >>> On Apr 20, 2015, at 4:07 PM, Ian Maxon <[email protected]> wrote:
> > > > >>>
> > > > >>> Hi Mike, Chris and Till,
> > > > >>>
> > > > >>> Since (I think?) the paperwork for the software grant is done now,
> > > > >>> should I copy our GC branches over to the ASF git repositories now (as
> > > > >>> well as making it a mirror in the Gerrit commit hook script)?
> > > > >>>
> > > > >>> Thanks,
> > > > >>> - Ian
