> On Jun 2, 2015, at 22:45, Yingyi Bu <[email protected]> wrote:
>
> >> I haven't tried working on multiple Hyracks branches at the same time, so I
> >> haven't experienced this. This seems like a working method error, though.
> >> If you're working with two things that are "the same version" (even if
> >> that's a snapshot version), you'll need to use separate Maven repositories
> >> to install them. In fact, merging the two git repositories would do
> >> nothing to fix this problem, will it? If the proposal is to put the two
> >> source repositories in the same git repo but otherwise leave them
> >> untouched, then nothing would change in the build process. It's possible
> >> I'm missing something there, though.
>
> Is there a way to use multiple mvn repositories on the same machine? I used
> to think mvn always installs artifacts to the directory ~/.m2/repository.
> I guess we just need to have a root-level pom and leave hyracks and asterixdb
> untouched. Then, a single root-level "mvn package ..." will build everything
> without requiring installing hyracks first. It's just like what we currently
> do for hyracks and algebricks. Then, builds/tests do not leave side-effects
> in ~/.m2/repository.

Great question! I just looked into this a bit (but I didn't try it) and the
docs seem to suggest that a) you should be able to specify the local
repository in a settings.xml and that b) you should be able to specify the
settings.xml on the maven command line. So it should be possible to do that -
and with some shell magic I think that it should even be possible to do that
in a largely invisible way.

> >> As for manually scheduling Asterix Jenkins jobs, that sounds like it's only
> >> a problem where your Hyracks change breaks an existing public API. That
> >> would be obviated by having true API testing inside of Hyracks, which is
> >> something that we should have regardless of any decisions about source
> >> locations.
>
> I agree that's the right software engineering way. Going forward, we do need
> to add more unit tests in hyracks and asterixdb. But considering the resource
> constraints, I'm not sure whether (or when) we can have a complete API test
> suite for hyracks/algebricks:
> 1) both hyracks and algebricks public APIs allow an arbitrary input DAG (a
> logical plan or a hyracks job). It's hard to enumerate all possibilities in
> hyracks/algebricks tests. My experience is that when we see a broken AQL
> query, we fix it in both hyracks/asterixdb codebases, and verify it with
> the AQL query. In those cases, there might be no need to have yet-another
> verbose hyracks/algebricks test.
> 2) even if we have a comprehensive test suite for hyracks, I'm not sure
> whether it can guarantee to pass asterixdb tests because the current
> asterixdb test suite covers a lot of edge cases in the hyracks runtime, LSM,
> and algebricks.

One way to use existing clients as tests for Hyracks could be to set up a
system that runs the tests of the existing versions of the clients against a
new version of Hyracks - ideally all clients isolated from each other and in
parallel to keep turnaround times low. Does that sound feasible?
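
To make that a bit more concrete, here is a rough, untested sketch in Python of
what such a setup could look like. It installs the candidate Hyracks build once
into a "seed" local repository, gives every client its own private copy of that
repository, and then runs the client test suites in parallel. All function
names and the directory layout are made up for illustration; the only Maven
options used are -Dmaven.repo.local, -f and -DskipTests. It also assumes that
each client's pom depends on the same (snapshot) version that the Hyracks
checkout installs.

```python
import shutil
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def mvn(repo_dir, project_dir, *goals):
    """Build a Maven command line that uses repo_dir as its local repository."""
    return ["mvn", f"-Dmaven.repo.local={repo_dir}",
            "-f", str(Path(project_dir) / "pom.xml"), *goals]


def install_hyracks(hyracks_dir, seed_repo, run=subprocess.call):
    """Install the candidate Hyracks build once into a seed repository."""
    Path(seed_repo).mkdir(parents=True, exist_ok=True)
    return run(mvn(seed_repo, hyracks_dir, "install", "-DskipTests")) == 0


def check_client(client_dir, seed_repo, repo_root, run=subprocess.call):
    """Run one client's tests against a private copy of the seed repository."""
    # Hypothetical layout: one local repository per client, named after it.
    repo_dir = Path(repo_root) / Path(client_dir).name
    shutil.copytree(seed_repo, repo_dir, dirs_exist_ok=True)
    return run(mvn(repo_dir, client_dir, "verify")) == 0


def check_all_clients(hyracks_dir, client_dirs, work_dir, run=subprocess.call):
    """Return {client_dir: passed} for every client, tested in parallel."""
    seed_repo = Path(work_dir) / "seed"
    if not install_hyracks(hyracks_dir, seed_repo, run):
        raise RuntimeError("Hyracks build failed")
    repo_root = Path(work_dir) / "repos"
    with ThreadPoolExecutor() as pool:
        results = pool.map(
            lambda c: check_client(c, seed_repo, repo_root, run), client_dirs)
        return dict(zip(client_dirs, results))
```

Because no run touches ~/.m2/repository, this would also avoid the snapshot
overwriting problem when switching branches. The "run" parameter is only there
so that the command construction can be tried out without invoking Maven.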

Cheers,
Till

> Anyway, if the repositories have to be separated, it would be nice that the
> "change-topic" issue can be fixed.
>
> Best,
> Yingyi
>
>
>> On Tue, Jun 2, 2015 at 10:00 AM, Chris Hillery <[email protected]> wrote:
>>> On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <[email protected]> wrote:
>>> In my opinion, merging the repository doesn't break the separation of
>>> hyracks and asterixdb, because the dependencies are controlled by mvn pom
>>> files.
>>
>> That wasn't the separation I was talking about. I meant API separation. As
>> it is now, when we make a change to both Asterix and Hyracks, we are forced
>> to consider the API implications, or at least they are put out there in a
>> very clear way that we need to look at. If we merge them, people will
>> (rightly) treat the whole thing as one product, and there will be no brakes
>> on making wide-ranging API changes.
>>
>> (As an aside: I don't trust Maven's pom files to do a good job of keeping
>> the dependency management clean. In fact I trust it to do precisely the
>> opposite, by making it both easier to screw up the dependencies and harder
>> to update them in future.)
>>
>> Again, my point is this: If we truly believe that Hyracks is a re-usable
>> component, it should be treated as such from source to build to delivery. By
>> merging in Asterix, we are saying that Asterix is "more equal" than other
>> Hyracks clients, to the point that we're tacitly willing to break those
>> other clients in favor of simplifying Asterix development. If that is a fair
>> and true statement, well, then, sure, let's merge them.
>>
>>> 1) It forces those hyracks-only changes to pass asterixdb regression tests.
>>> Currently hyracks-only changes are not verified by asterixdb tests.
>>
>> This is a good point, I will admit. However, I think this same goal can be
>> met in other ways. My strong preference would be to create a set of true API
>> tests inside of Hyracks, which both document and test the external Hyracks
>> API. That will make API-breaking changes in future much easier to spot, and
>> also make it clear when Asterix is using internal APIs that it should not.
>>
>>> 2) On my local machine, I don't need to always install hyracks and then
>>> verify asterixdb from time to time. Especially, switching branches seems
>>> painful because the installed hyracks snapshot is overwritten from time to
>>> time.
>>
>> I haven't tried working on multiple Hyracks branches at the same time, so I
>> haven't experienced this. This seems like a working method error, though. If
>> you're working with two things that are "the same version" (even if that's a
>> snapshot version), you'll need to use separate Maven repositories to install
>> them. In fact, merging the two git repositories would do nothing to fix this
>> problem, will it? If the proposal is to put the two source repositories in
>> the same git repo but otherwise leave them untouched, then nothing would
>> change in the build process. It's possible I'm missing something there,
>> though.
>>
>>> 3) I only need to make one code review request and one jenkins job.
>>> Currently I need to manually change the topic of my asterixdb gerrit CL
>>> every time before I update my hyracks CL, and then manually schedule
>>> jenkins to run a new asterixdb job. If I forget to schedule the jenkins
>>> job, the asterixdb CL is still shown to be "verified by jenkins".
>>
>> This is a problem, but it's a problem in commit validation, not in the
>> source. Modifying the source to work around these issues is still a bad idea
>> IMHO.
>>
>> The "change-topic" issue could be fixed with a bit of development work (have
>> the topic point to a change, rather than a specific patchset on the change,
>> so you only need to set it once, for instance).
>>
>> As for manually scheduling Asterix Jenkins jobs, that sounds like it's only
>> a problem where your Hyracks change breaks an existing public API. That
>> would be obviated by having true API testing inside of Hyracks, which is
>> something that we should have regardless of any decisions about source
>> locations.
>>
>> In summary / repeating myself again: yes, we have some problems because
>> Hyracks and Asterix are in separate repositories. But those problems are
>> pointing out true issues with our development and processes. Merging the
>> repositories isn't fixing those problems, it's sweeping them under the rug.
>> Long term we would be much better off to identify, isolate, and fix the
>> problems themselves.
>>
>> Ceej
>> aka Chris Hillery
