Re: Migration of git repository

Mike Carey Mon, 08 Jun 2015 14:11:14 -0700

Agreed. Hyracks doesn't have a narrow testable API, I fear, so that'd be Mission Impossible. :-)


On 6/8/15 12:15 PM, Steven Jacobs wrote:

*IMHO, the first one is "easy" [*] to solve via testing. Either we
add proper API testing to Hyracks and ensure Asterix/VXQuery/etc only
use proper APIs, and/or we add Asterix/VXQuery/etc builds and tests to
the testing jobs on Jenkins.*
-We should be going in the direction of the latter here. As discussed,
there are more issues that we see than simple API breakage.


Steven

On Mon, Jun 8, 2015 at 12:01 PM, Chris Hillery <[email protected]>
wrote:

I think maybe part of the reason we're having a tough time figuring this
out is that we're conflating two different problems.

1. We want to ensure that changes to Hyracks don't break Asterix, VXQuery,
etc.

2. We fairly often need to make related changes in Hyracks and Asterix that
"go together", ie, Asterix won't build/work with the new change until it
can see the corresponding Hyracks change.

Those really are completely different problems and may well need different
solutions.

IMHO, the first one is "easy" [*] to solve via testing. Either we add
proper API testing to Hyracks and ensure Asterix/VXQuery/etc only use
proper APIs, and/or we add Asterix/VXQuery/etc builds and tests to the
testing jobs on Jenkins.

The second problem is where we get into the trickiness of Maven releases
vs. Apache releases. This is why I asked about the actual requirements and
audience. My not-totally-thought-out suggestion for problem #2 would be to
not "solve" it at all, and simply state that the tip of Asterix requires
the latest tip of Hyracks to build. That's the way we all develop code on
our local machines anyway, as far as I know. If there are no outside
clients that we have to be concerned about between releases, doesn't this
solve the problem?

Obviously when it comes time to make a real Hyracks (or Asterix) release
we'll need to do a little extra work to ensure those *released* codebases
build together. That might mean that we usually need to make Hyracks and
Asterix releases at the same time, and I don't know whether that's now
harder to achieve in the incubator world.

(As a side note, the original proposal to merge the codebases would "solve"
[sweep under the rug] problem #1 for Asterix, at the cost of quite possibly
making it worse for VXQuery. It would sort of "solve" problem #2 for
Asterix as well, because it would physically enforce the same tip-tip rule
I'm proposing above. I still believe that we can solve both problems in
other strictly superior ways, however.)

Ceej
aka Chris Hillery

[*] - not actually easy.

On Mon, Jun 8, 2015 at 6:39 AM, Mike Carey <[email protected]> wrote:

All,

It feels to me (as one who is completely naive about much of this stuff)
like we need two levels of "releases": one level for the outside world
(the public releases that users might pick up) and a different internal
level for the development process (where we essentially want to have
tagged/extra-tested checkpoints and want to be able to manage in a
careful way the cross-dependencies from/to other related development
processes X - e.g., for X = VXQuery, AsterixDB, and someday Pregelix).
When we do an official signed release of anything, we'd need to do one
for the DAG of things - so there might be sync'ed "multireleases" (for
Hyracks and then for X). Does that make any sense and/or give anyone
more thoughts about how we might achieve that...?

Cheers,
Mike



On 6/8/15 2:08 AM, Chris Hillery wrote:

If not, it may be worth taking a step back and asking what exactly the
problem is. I understand the general rule that "we don't want Asterix to
be broken", but what precisely does that mean? Is it acceptable that the
tip of the Asterix source branch is only guaranteed to build against the
tip of the Hyracks branch, for example? If not, why not? What audience
are we required to keep things working for at the source level, and what
expectations do they have?

Ceej
aka Chris Hillery

On Mon, Jun 8, 2015 at 2:06 AM, Chris Hillery <[email protected]>
wrote:

So, if we pushed these not-releases to the Nexus repo running at UCI, and
devs pulled from there in preference to "official" repos, that would
solve the problem?

Ceej
aka Chris Hillery

On Sun, Jun 7, 2015 at 7:29 PM, Ted Dunning <[email protected]>
wrote:

If it is pushed to any wider audience than roughly the dev@ list, it is
a release. That definitely includes maven central. Artifacts in maven
are convenience binaries and thus not a release, but they should be
traceable to an exact source release.

Sent from my iPhone

  On Jun 7, 2015, at 19:10, Till Westmann <[email protected]> wrote:

Hmm, good point. It doesn’t have to. One question might be if we can
push it to some maven repository, if it’s not an official release.
But I think that should also be fine as long as we don’t push it to a
repository that claims to contain official releases.

Some mentor input might be helpful on this as well :)

Cheers,
Till

On Jun 7, 2015, at 6:53 PM, Ildar Absalyamov <[email protected]> wrote:

Does a version bump always mean a full-fledged Apache release? We need
the former just to resolve compile-time dependencies.

On Jun 7, 2015, at 18:49, Till Westmann <[email protected]> wrote:

In principle I agree with this, but creating a new release will be a
little more involved than just running maven when we do this at the ASF.

To publish a new release we will have to vet and vote on the release.
This takes at least 72 hours in the best case: if we’re a TLP, the first
release candidate is great, and we have enough people to vote. While
we’re still in the incubator, releasing will take a little longer, as we
also have to get enough votes for the release in the incubator.

As I proposed earlier, it would be really good to go through the full
release process once before we decide how to structure our processes
and infrastructure.

Cheers,

Till

  On Jun 4, 2015, at 6:37 PM, Ildar Absalyamov <
[email protected]> wrote:

I am with Chris on repository separation, and I think that the solution
to the issue of Hyracks commits breaking the Asterix build is using
release Hyracks versions instead of snapshot ones. Yes, that will
require frequent Hyracks releases (we will have to release it each time
there is a change which spans both Hyracks & Asterix), and we abandoned
this practice a while ago, but it seems that’s the only way to separate
the projects logically.

Here are a few examples to clear up the picture. In all examples the
Hyracks version is 4.5.6-Snapshot and the Asterix version is
1.2.3-Snapshot (but it depends on the previous release version, Hyracks
4.5.5):

1) The changes span both Asterix & Hyracks. First make sure that Asterix
could depend on Hyracks 4.5.6-Snapshot without API conflicts & switch
the Asterix dependency to 4.5.6-Snapshot. Submit a Gerrit review; once
it is done, as a part of the git-asf script, commit the changes, bump
the Hyracks version to 4.5.6, make Asterix depend on 4.5.6, and bump
Hyracks to 4.5.7-Snapshot right after.

2) The changes are located only in Hyracks. Regular review and commit
(with snapshot version) without any version bump.

3) The changes are located only in Asterix. Regular review and commit
(with snapshot version) without any version bump.

In this scenario a Hyracks commit can never make the Asterix build fail
(since it depends on a stable release), and it’s the responsibility of
the first person whose commits span both repos to make sure that the
changes in the snapshot Hyracks version are properly merged.
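The version sequence in example 1 can be sketched in a few lines of
Python. This is only an illustration of the ordering described above, not
part of any existing script: the function names are hypothetical, and the
"-Snapshot" suffix follows the spelling used in this message (Maven itself
conventionally writes -SNAPSHOT).

```python
SNAPSHOT_SUFFIX = "-Snapshot"  # spelling from this thread; Maven uses -SNAPSHOT


def release_of(snapshot_version):
    """Strip the snapshot suffix: '4.5.6-Snapshot' -> '4.5.6'."""
    if not snapshot_version.endswith(SNAPSHOT_SUFFIX):
        raise ValueError("not a snapshot version: " + snapshot_version)
    return snapshot_version[: -len(SNAPSHOT_SUFFIX)]


def next_snapshot(release_version):
    """Bump the last component: '4.5.6' -> '4.5.7-Snapshot'."""
    major, minor, patch = (int(part) for part in release_version.split("."))
    return "{}.{}.{}{}".format(major, minor, patch + 1, SNAPSHOT_SUFFIX)


def spanning_change_plan(hyracks_snapshot):
    """Ordered (step, Hyracks version) pairs for a change touching both repos."""
    release = release_of(hyracks_snapshot)
    return [
        ("Asterix depends on (during review)", hyracks_snapshot),
        ("Hyracks is released as", release),
        ("Asterix depends on (after commit)", release),
        ("Hyracks development continues as", next_snapshot(release)),
    ]


if __name__ == "__main__":
    for step, version in spanning_change_plan("4.5.6-Snapshot"):
        print(step, version)
```

For examples 2 and 3 no plan is needed, since neither version changes.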

Regarding Yingyi’s issue with Gerrit topics: could we modify the
git-gerrit script so it would submit both Asterix & Hyracks reviews
(granted that the latter is needed) and link them together, setting the
proper topic? Gerrit seems to have an API for changing that, right?

On Jun 4, 2015, at 15:45, Mike Carey <[email protected]> wrote:

Just a quick high-level note from our nearest equivalent of the
pointy-haired Dilbert guy (aka me): What would be nice is to have Hyracks
changes kick off tests of all "supported client projects" - AsterixDB,
VXQuery, maybe also Pregelix, IMRU, and possibly others in the future. I
don't think we'll ever prevent such downstream things from being broken
unless we run their tests - so I would suggest that we need a mechanism
to keep Hyracks changes from being permitted to happen without verifying
the ongoing integrity of all "blessed" (priority 1) affected projects....
We could have an agreed upon list of such projects and tests for each....
It would be nice to have a "quick check" (hello world still works, basics
are working) that was synchronously blocking of such changes, and at
least a daily verification that all's totally well (AFAWK) for them all.
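A rough sketch of the two-tier idea above (a blocking quick check plus a
daily full verification), in Python. Everything here is hypothetical: the
class, the function names, and the stand-in checks are illustrative only,
and a real setup would run each project's build and tests from the CI
system rather than call lambdas.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ClientProject:
    """A downstream project on the agreed-upon list (names are illustrative)."""
    name: str
    quick_check: Callable[[], bool]  # fast smoke test, run on every change
    full_check: Callable[[], bool]   # complete test suite, run daily


def blocking_failures(projects: List[ClientProject]) -> List[str]:
    """Names of projects whose quick check fails; empty means the change may go in."""
    return [p.name for p in projects if not p.quick_check()]


def daily_report(projects: List[ClientProject]) -> Dict[str, bool]:
    """Full-suite result per project, for the daily verification."""
    return {p.name: p.full_check() for p in projects}


if __name__ == "__main__":
    # Stand-in checks; real ones would build and test each project.
    blessed = [
        ClientProject("AsterixDB", lambda: True, lambda: True),
        ClientProject("VXQuery", lambda: False, lambda: True),
    ]
    print(blocking_failures(blessed))
    print(daily_report(blessed))
```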

Not sure how this affects the still two-sided discussion...  :-)

Cheers,
Mike


  On 6/2/15 10:00 AM, Chris Hillery wrote:

On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <[email protected]> wrote:

In my opinion, merging the repository doesn't break the separation of
hyracks and asterixdb, because the dependencies are controlled by mvn pom
files.

That wasn't the separation I was talking about. I meant API separation. As
it is now, when we make a change to both Asterix and Hyracks, we are forced
to consider the API implications, or at least they are put out there in a
very clear way that we need to look at. If we merge them, people will
(rightly) treat the whole thing as one product, and there will be no brakes
on making wide-ranging API changes.

(As an aside: I don't trust Maven's pom files to do a good job of keeping
the dependency management clean. In fact I trust it to do precisely the
opposite, by making it both easier to screw up the dependencies and harder
to update them in future.)

Again, my point is this: If we truly believe that Hyracks is a re-usable
component, it should be treated as such from source to build to delivery.
By merging in Asterix, we are saying that Asterix is "more equal" than
other Hyracks clients, to the point that we're tacitly willing to break
those other clients in favor of simplifying Asterix development. If that is
a fair and true statement, well, then, sure, let's merge them.

1) It forces those hyracks-only changes to pass asterixdb regression
tests. Currently hyracks-only changes are not verified by asterixdb tests.

This is a good point, I will admit. However, I think this same goal can be
met in other ways. My strong preference would be to create a set of true
API tests inside of Hyracks, which both document and test the external
Hyracks API. That will make API-breaking changes in future much easier to
spot, and also make it clear when Asterix is using internal APIs that it
should not.
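One loose way to picture how such a test could flag a breaking change is
to record the set of public signatures and compare it with the current
set. The Python below is only a sketch of that comparison; the signature
strings are made up for illustration and are not real Hyracks APIs, and
real API tests would also exercise behavior, not just names.

```python
def api_changes(baseline, current):
    """Compare two collections of public signatures.

    Returns (removed, added). Anything in `removed` would break a client
    that relied on it; `added` is normally backward compatible.
    """
    baseline, current = set(baseline), set(current)
    return sorted(baseline - current), sorted(current - baseline)


def is_breaking(baseline, current):
    """True if any previously public signature has disappeared."""
    removed, _ = api_changes(baseline, current)
    return bool(removed)


if __name__ == "__main__":
    # Hypothetical signatures, for illustration only.
    recorded = {"Client.startJob(spec)", "Client.waitForCompletion(id)"}
    proposed = {"Client.startJob(spec, flags)", "Client.waitForCompletion(id)"}
    print(api_changes(recorded, proposed))
    print(is_breaking(recorded, proposed))
```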


2) On my local machine, I don't need to always install hyracks and then
verify asterixdb from time to time. Especially, switching branches seems
painful because the installed hyracks snapshot is overwritten from time to
time.

I haven't tried working on multiple Hyracks branches at the same time, so I
haven't experienced this. This seems like a working method error, though.
If you're working with two things that are "the same version" (even if
that's a snapshot version), you'll need to use separate Maven repositories
to install them. In fact, merging the two git repositories would do nothing
to fix this problem, would it? If the proposal is to put the two source
repositories in the same git repo but otherwise leave them untouched, then
nothing would change in the build process. It's possible I'm missing
something there, though.
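As a sketch of the "separate Maven repositories" suggestion: Maven's
maven.repo.local property selects the local repository directory, so each
branch can install into its own. The helper below only builds the command
line; the helper name, the base directory, and the branch-to-directory
naming are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def mvn_install_command(branch, base_dir="/tmp/m2-repos"):
    """Build an `mvn install` command line that uses a per-branch local repo."""
    # Replace slashes so 'feature/x' maps to a single directory name.
    safe_name = branch.replace("/", "_")
    local_repo = PurePosixPath(base_dir) / safe_name
    return ["mvn", "-Dmaven.repo.local={}".format(local_repo), "install"]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) inside a checkout to actually build.
    print(" ".join(mvn_install_command("feature/x")))
```

Building Asterix with the same property value would then pick up the
Hyracks snapshot installed from the matching branch.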


3) I only need to make one code review request and one jenkins job.
Currently I need to manually change the topic of my asterixdb gerrit CL
every time before I update my hyracks CL, and then manually schedule
jenkins to run a new asterixdb job. If I forget to schedule the jenkins
job, the asterixdb CL is still shown to be "verified by jenkins".

This is a problem, but it's a problem in commit validation, not in the
source. Modifying the source to work around these issues is still a bad
idea IMHO.

The "change-topic" issue could be fixed with a bit of development work
(have the topic point to a change, rather than a specific patchset on the
change, so you only need to set it once, for instance).

As for manually scheduling Asterix Jenkins jobs, that sounds like it's only
a problem where your Hyracks change breaks an existing public API. That
would be obviated by having true API testing inside of Hyracks, which is
something that we should have regardless of any decisions about source
locations.

In summary / repeating myself again: yes, we have some problems because
Hyracks and Asterix are in separate repositories. But those problems are
pointing out true issues with our development and processes. Merging the
repositories isn't fixing those problems, it's sweeping them under the rug.
Long term we would be much better off to identify, isolate, and fix the
problems themselves.

Ceej
aka Chris Hillery

  Best regards,

Ildar

