> On Jun 2, 2015, at 22:45, Yingyi Bu <[email protected]> wrote:
> 
> >>I haven't tried working on multiple Hyracks branches at the same time, so I 
> >>haven't experienced this. This seems like a working method error, though. 
> >>If you're working with two things that are "the same version" (even if 
> >>that's a snapshot version), you'll need to use separate Maven repositories 
> >>to install them. In fact, merging the two git repositories would do 
> >>nothing to fix this problem, would it? If the proposal is to put the two 
> >>source repositories in the same git repo but otherwise leave them 
> >>untouched, then nothing would change in the build process. It's possible 
> >>I'm missing something there, though.
> 
> Is there a way to use multiple mvn repositories on the same machine? I used 
> to think mvn always installs artifacts to the directory ~/.m2/repository.
> I guess we just need to have a root-level pom and leave hyracks and asterixdb 
> untouched. Then, a single root-level "mvn package ..." will build everything 
> without requiring hyracks to be installed first. It's just like what we 
> currently do for hyracks and algebricks. Then, builds/tests do not leave 
> side-effects in ~/.m2/repository.

Great question! I just looked into this a bit (but I didn't try it) and the 
docs seem to suggest that 
a) you should be able to specify the local repository in a settings.xml and that
b) you should be able to specify the settings.xml on the maven command line.
So it should be possible to do that - and with some shell magic I think that it 
should even be possible to do that in a largely invisible way.
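
To make (a) and (b) a bit more concrete, here is a minimal sketch in Python. 
I have not run it against our trees. It relies on two things I am fairly sure 
of: settings.xml has a <localRepository> element, and mvn takes a settings 
file via -s (--settings). The one-directory-per-branch layout and all 
function names are my own assumptions, purely for illustration.

```python
from pathlib import Path
from typing import List
from xml.sax.saxutils import escape


def branch_repo(base: Path, branch: str) -> Path:
    """Hypothetical layout: one local Maven repository per git branch."""
    # Flatten slashes so a branch like "yingyi/fix" maps to a single directory.
    return base / branch.replace("/", "_")


def settings_xml(local_repo: Path) -> str:
    """Return a minimal settings.xml that relocates the local repository."""
    return (
        "<settings>\n"
        f"  <localRepository>{escape(str(local_repo))}</localRepository>\n"
        "</settings>\n"
    )


def mvn_command(settings_file: Path, goals: List[str]) -> List[str]:
    """Build an mvn argument list that points at the given settings file."""
    return ["mvn", "-s", str(settings_file), *goals]
```

The idea would be to write settings_xml(branch_repo(...)) to a file once per 
branch and then always invoke mvn through mvn_command(...), so that installs 
from one branch never overwrite the snapshot of another. If a whole settings 
file is overkill, passing -Dmaven.repo.local=<dir> on the command line should 
have the same effect on the repository location.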

> >>As for manually scheduling Asterix Jenkins jobs, that sounds like it's only 
> >>a problem where your Hyracks change breaks an existing public API. That 
> >>would be obviated by having true API testing inside of Hyracks, which is 
> >>something that we should have regardless of any decisions about source 
> >>locations.
> 
> I agree that's the right software engineering way. Going forward, we do need 
> to add more unit tests in hyracks and asterixdb. But considering the resource 
> constraints, I'm not sure whether (or when) we can have a complete API test 
> suite for hyracks/algebricks:
> 1) Both the hyracks and algebricks public APIs allow an arbitrary input DAG (a 
> logical plan or a hyracks job). It's hard to enumerate all possibilities in 
> hyracks/algebricks tests. My experience is that when we see a broken AQL 
> query, we fix it in both the hyracks and asterixdb codebases and verify it 
> with the AQL query. In those cases, there might be no need to have yet another 
> verbose hyracks/algebricks test.
> 2) Even if we have a comprehensive test suite for hyracks, I'm not sure 
> whether it can guarantee that the asterixdb tests pass, because the current 
> asterixdb test suite covers a lot of edge cases in the hyracks runtime, LSM, 
> and algebricks.

One way to use existing clients as tests for Hyracks could be to set up a 
system that runs the tests of the existing versions of the clients against a 
new version of Hyracks - ideally with all clients isolated from each other and 
in parallel to keep turnaround times low.
Does that sound feasible?
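
A rough sketch of what I have in mind, in Python and under loud assumptions: 
each client has its own checkout and its own local Maven repository below a 
common work directory (that layout and all names are hypothetical), and the 
clients' tests are run with "mvn verify". It deliberately leaves out the hard 
part, namely how the new Hyracks artifacts get into each repository and how 
each client is pointed at them.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List, Tuple

Job = Tuple[Path, List[str]]  # (directory to run in, command to run)


def client_job(client: str, work_dir: Path) -> Job:
    """Build the test job for one client (hypothetical directory layout)."""
    checkout = work_dir / client / "src"
    # A private local repository keeps the clients isolated from each other.
    local_repo = work_dir / client / "m2-repo"
    return checkout, ["mvn", f"-Dmaven.repo.local={local_repo}", "verify"]


def run_job(job: Job) -> int:
    """Run one job as a subprocess and return its exit code."""
    cwd, cmd = job
    return subprocess.run(cmd, cwd=cwd).returncode


def run_all(clients: List[str], work_dir: Path,
            runner: Callable[[Job], int] = run_job) -> Dict[str, bool]:
    """Run all client jobs in parallel; map each client to 'tests passed'."""
    jobs = [client_job(c, work_dir) for c in clients]
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        exit_codes = list(pool.map(runner, jobs))  # map keeps input order
    return {c: code == 0 for c, code in zip(clients, exit_codes)}
```

A call like run_all(["asterixdb"], Path("work")) would then report per client 
whether its existing tests still pass. The runner is a parameter so that the 
scheduling part can be tried out without actually invoking mvn.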

Cheers,
Till

> Anyway, if the repositories have to be separated, it would be nice if the 
> "change-topic" issue could be fixed.
> 
> Best,
> Yingyi
> 
> 
>> On Tue, Jun 2, 2015 at 10:00 AM, Chris Hillery <[email protected]> wrote:
>>> On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <[email protected]> wrote:
>>> In my opinion, merging the repositories doesn't break the separation of 
>>> hyracks and asterixdb, because the dependencies are controlled by mvn pom 
>>> files.
>> 
>> That wasn't the separation I was talking about. I meant API separation. As 
>> it is now, when we make a change to both Asterix and Hyracks, we are forced 
>> to consider the API implications, or at least they are put out there in a 
>> very clear way that we need to look at. If we merge them, people will 
>> (rightly) treat the whole thing as one product, and there will be no brakes 
>> on making wide-ranging API changes.
>> 
>> (As an aside: I don't trust Maven's pom files to do a good job of keeping 
>> the dependency management clean. In fact I trust it to do precisely the 
>> opposite, by making it both easier to screw up the dependencies and harder 
>> to update them in future.)
>> 
>> Again, my point is this: If we truly believe that Hyracks is a re-usable 
>> component, it should be treated as such from source to build to delivery. By 
>> merging in Asterix, we are saying that Asterix is "more equal" than other 
>> Hyracks clients, to the point that we're tacitly willing to break those 
>> other clients in favor of simplifying Asterix development. If that is a fair 
>> and true statement, well, then, sure, let's merge them.
>> 
>>> 1) It forces those hyracks-only changes to pass asterixdb regression tests. 
>>> Currently hyracks-only changes are not verified by asterixdb tests.
>> 
>> This is a good point, I will admit. However, I think this same goal can be 
>> met in other ways. My strong preference would be to create a set of true API 
>> tests inside of Hyracks, which both document and test the external Hyracks 
>> API. That will make API-breaking changes in future much easier to spot, and 
>> also make it clear when Asterix is using internal APIs that it should not.
>>  
>>> 2) On my local machine, I don't need to always install hyracks and then 
>>> verify asterixdb from time to time. Switching branches seems especially 
>>> painful because the installed hyracks snapshot is overwritten from time to 
>>> time.
>> 
>> I haven't tried working on multiple Hyracks branches at the same time, so I 
>> haven't experienced this. This seems like a working method error, though. If 
>> you're working with two things that are "the same version" (even if that's a 
>> snapshot version), you'll need to use separate Maven repositories to install 
>> them. In fact, merging the two git repositories would do nothing to fix this 
>> problem, would it? If the proposal is to put the two source repositories in 
>> the same git repo but otherwise leave them untouched, then nothing would 
>> change in the build process. It's possible I'm missing something there, 
>> though.
>>  
>>> 3) I only need to make one code review request and one jenkins job.  
>>> Currently I need to manually change the topic of my asterixdb gerrit CL 
>>> every time before I update my hyracks CL, and then manually schedule 
>>> jenkins to run a new asterixdb job.  If I forget to schedule the jenkins 
>>> job, the asterixdb CL is still shown to be "verified by jenkins".
>> 
>> This is a problem, but it's a problem in commit validation, not in the 
>> source. Modifying the source to work around these issues is still a bad idea 
>> IMHO.
>> 
>> The "change-topic" issue could be fixed with a bit of development work (have 
>> the topic point to a change, rather than a specific patchset on the change, 
>> so you only need to set it once, for instance).
>> 
>> As for manually scheduling Asterix Jenkins jobs, that sounds like it's only 
>> a problem where your Hyracks change breaks an existing public API. That 
>> would be obviated by having true API testing inside of Hyracks, which is 
>> something that we should have regardless of any decisions about source 
>> locations.
>> 
>> In summary / repeating myself again: yes, we have some problems because 
>> Hyracks and Asterix are in separate repositories. But those problems are 
>> pointing out true issues with our development and processes. Merging the 
>> repositories isn't fixing those problems, it's sweeping them under the rug. 
>> Long term we would be much better off to identify, isolate, and fix the 
>> problems themselves.
>> 
>> Ceej
>> aka Chris Hillery
>> 
>> 
> 
