Agreed. Hyracks doesn't have a narrow testable API, I fear, so that'd be Mission Impossible. :-)

On 6/8/15 12:15 PM, Steven Jacobs wrote:
*IMHO, the first one is "easy" [*] to solve via testing. Either we
add proper API testing to Hyracks and ensure Asterix/VXQuery/etc only
use proper APIs, and/or we add Asterix/VXQuery/etc builds and tests to
the testing jobs on Jenkins.*
-We should be going in the direction of the latter here. As discussed,
there are more issues that we see than simple API breakage.

Steven

On Mon, Jun 8, 2015 at 12:01 PM, Chris Hillery <[email protected]>
wrote:

I think maybe part of the reason we're having a tough time figuring this
out is that we're conflating two different problems.

1. We want to ensure that changes to Hyracks don't break Asterix, VXQuery,
etc.

2. We fairly often need to make related changes in Hyracks and Asterix that
"go together", ie, Asterix won't build/work with the new change until it
can see the corresponding Hyracks change.

Those really are completely different problems and may well need different
solutions.

IMHO, the first one is "easy" [*] to solve via testing. Either we add
proper API testing to Hyracks and ensure Asterix/VXQuery/etc only use
proper APIs, and/or we add Asterix/VXQuery/etc builds and tests to the
testing jobs on Jenkins.

The second problem is where we get into the trickiness of Maven releases
vs. Apache releases. This is why I asked about the actual requirements and
audience. My not-totally-thought-out suggestion for problem #2 would be to
not "solve" it at all, and simply state that the tip of Asterix requires
the latest tip of Hyracks to build. That's the way we all develop code on
our local machines anyway, as far as I know. If there are no outside
clients that we have to be concerned about between releases, doesn't this
solve the problem?

Obviously when it comes time to make a real Hyracks (or Asterix) release
we'll need to do a little extra work to ensure those *released* codebases
build together. That might mean that we usually need to make Hyracks and
Asterix releases at the same time, and I don't know whether that's now
harder to achieve in the incubator world.

(As a side note, the original proposal to merge the codebases would "solve"
[sweep under the rug] problem #1 for Asterix, at the cost of quite possibly
making it worse for VXQuery. It would sort of "solve" problem #2 for
Asterix as well, because it would physically enforce the same tip-tip rule
I'm proposing above. I still believe that we can solve both problems in
other strictly superior ways, however.)

Ceej
aka Chris Hillery

[*] - not actually easy.

On Mon, Jun 8, 2015 at 6:39 AM, Mike Carey <[email protected]> wrote:

All,

It feels to me (as one who is completely naive about much of this stuff)
like we need two levels of "releases", one level for the outside world (the
public releases that users might pick up) and a different internal level
for the development process (where we essentially want to have
tagged/extra-tested checkpoints and want to be able to manage in a careful
way the cross-dependencies from/to other related development processes X -
e.g., for X = VXQuery, AsterixDB, and someday Pregelix).  When we do an
official signed release of anything, we'd need to do one for the DAG of
things - so there might be sync'ed "multireleases" (for Hyracks and then for
X).  Does that make any sense and/or give anyone more thoughts about how we
might achieve that...?

Cheers,
Mike



On 6/8/15 2:08 AM, Chris Hillery wrote:

If not, it may be worth taking a step back and asking what exactly the
problem is. I understand the general rule that "we don't want Asterix to be
broken", but what precisely does that mean? Is it acceptable that the tip
of the Asterix source branch is only guaranteed to build against the tip of
the Hyracks branch, for example? If not, why not? What audience are we
required to keep things working for at the source level, and what
expectations do they have?

Ceej
aka Chris Hillery

On Mon, Jun 8, 2015 at 2:06 AM, Chris Hillery <[email protected]>
wrote:

So, if we pushed these not-releases to the Nexus repo running at UCI, and
devs pulled from there in preference to "official" repos, that would solve
the problem?

Ceej
aka Chris Hillery

On Sun, Jun 7, 2015 at 7:29 PM, Ted Dunning <[email protected]>
wrote:

If it is pushed to any wider audience than roughly the dev@ list, it is
a release. That definitely includes maven central.  Artifacts in maven are
convenience binaries and thus not a release, but they should be traceable to
an exact source release.

Sent from my iPhone

  On Jun 7, 2015, at 19:10, Till Westmann <[email protected]> wrote:
Hmm, good point. It doesn’t have to. One question might be if we can
push it to some maven repository, if it’s not an official release.

But I think that should also be fine as long as we don’t push it to a
repository that claims to contain official releases.

Some mentor input might be helpful on this as well :)

Cheers,
Till

  On Jun 7, 2015, at 6:53 PM, Ildar Absalyamov <
[email protected]> wrote:
Does a version bump always mean a full-fledged Apache release? We need the
former just to resolve compile-time dependencies.
On Jun 7, 2015, at 18:49, Till Westmann <[email protected]> wrote:
In principle I agree with this, but creating a new release will be a
little more involved than just running maven, when we do this at the ASF.

To publish a new release we will have to vet and vote on the release.
This takes at least 72 hours in the best case, if we’re a TLP, the first
release candidate is great, and we have enough people to vote. While we’re
still in the incubator, releasing will take a little longer as we also have
to get enough votes for the release in the incubator.

As I proposed earlier, it would be really good to go through the full
release process once, before we decide how to structure our processes and
infrastructure.

Cheers,
Till

  On Jun 4, 2015, at 6:37 PM, Ildar Absalyamov <
[email protected]> wrote:
I am with Chris on repository separation and I think that the
solution to the issue of Hyracks commits breaking the Asterix build is using
release Hyracks versions instead of snapshot ones. Yes, that will create
frequent Hyracks releases (we will have to release it each time there is a
change which spans both Hyracks & Asterix) and we abandoned this
practice a while ago, but it seems that’s the only way to separate the
projects logically.

Here are a few examples to clear up the picture. In all examples the Hyracks
version is 4.5.6-Snapshot and the Asterix version is 1.2.3-Snapshot (but it
depends on the previous release version, Hyracks 4.5.5):

1) The changes span both Asterix & Hyracks.
First make sure that Asterix could depend on Hyracks 4.5.6-Snapshot
without API conflicts & switch the Asterix dependency to 4.5.6-Snapshot.
Submit a Gerrit review; once it is done, as a part of the git-asf script
commit the changes, bump the Hyracks version to 4.5.6, make Asterix depend
on 4.5.6 and bump Hyracks to 4.5.7-Snapshot right after.

2) The changes are located only in Hyracks. Regular review and
commit (with snapshot version) without any version bump.
3) The changes are located only in Asterix. Regular review and
commit (with snapshot version) without any version bump.
In this scenario a Hyracks commit can never make the Asterix build fail
(since it depends on a stable release) and it’s the responsibility of the
first person whose commit spans both repos to make sure that the changes
in the snapshot Hyracks version are properly merged.
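
The version arithmetic in scenario 1 can be sketched as follows. This is only an illustration, assuming simple dotted numeric versions and the "-Snapshot" suffix as written in this thread; the function names are hypothetical and are not part of Maven, Gerrit, or the git-asf script.

```python
# Suffix as spelled in this thread (Maven itself conventionally uses "-SNAPSHOT").
SNAPSHOT_SUFFIX = "-Snapshot"


def release_version(snapshot: str) -> str:
    """Strip the snapshot suffix, e.g. '4.5.6-Snapshot' -> '4.5.6'."""
    if not snapshot.endswith(SNAPSHOT_SUFFIX):
        raise ValueError(f"not a snapshot version: {snapshot}")
    return snapshot[: -len(SNAPSHOT_SUFFIX)]


def next_snapshot(release: str) -> str:
    """Bump the last numeric component, e.g. '4.5.6' -> '4.5.7-Snapshot'."""
    parts = release.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts) + SNAPSHOT_SUFFIX


def cross_repo_commit(hyracks_snapshot: str) -> dict:
    """Versions after a change that spans both Hyracks and Asterix (hypothetical helper)."""
    release = release_version(hyracks_snapshot)
    return {
        "hyracks_release": release,        # Hyracks is released at this version
        "asterix_depends_on": release,     # Asterix pins the stable release
        "hyracks_next": next_snapshot(release),  # Hyracks development continues here
    }


print(cross_repo_commit("4.5.6-Snapshot"))
```

Scenarios 2 and 3 would leave all three values unchanged, since no version bump happens there.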

Regarding Yingyi’s issue with Gerrit topics: could we modify the
git-gerrit script so it would submit both Asterix & Hyracks reviews
(granted that the latter is needed), and link them together, setting the
proper topic? Gerrit seems to have an API for changing that, right?

On Jun 4, 2015, at 15:45, Mike Carey <[email protected]> wrote:
Just a quick high-level note from our nearest equivalent of the
pointy-haired Dilbert guy (aka me):  What would be nice is to have Hyracks
changes kick off tests of all "supported client projects" - AsterixDB,
VXQuery, maybe also Pregelix, IMRU, and possibly others in the future.  I
don't think we'll ever prevent such downstream things from being broken
unless we run their tests - so I would suggest that we need a mechanism to
keep Hyracks changes from being permitted to happen without verifying the
ongoing integrity of all "blessed" (priority 1) affected projects....  We
could have an agreed upon list of such projects and tests for each....  It
would be nice to have a "quick check" (hello world still works, basics are
working) that was synchronously blocking of such changes, and at least a
daily verification that all's totally well (AFAWK) for them all.
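
A minimal sketch of the blocking "quick check" idea, assuming a hypothetical list of blessed projects and a caller-supplied check function. Nothing here is an existing Jenkins or Hyracks API; in practice the check would build and smoke-test each project against the proposed Hyracks change.

```python
from typing import Callable, List

# Hypothetical "blessed" (priority 1) list; the real list would be agreed on by the project.
BLESSED_PROJECTS: List[str] = ["AsterixDB", "VXQuery"]


def failed_quick_checks(projects: List[str],
                        quick_check: Callable[[str], bool]) -> List[str]:
    """Return the projects whose quick check fails against the proposed change."""
    return [name for name in projects if not quick_check(name)]


def may_merge(projects: List[str], quick_check: Callable[[str], bool]) -> bool:
    """A Hyracks change is allowed only if every blessed project still passes."""
    return not failed_quick_checks(projects, quick_check)


# Stand-in check for illustration: pretend VXQuery's hello-world breaks.
def fake_check(name: str) -> bool:
    return name != "VXQuery"


print(failed_quick_checks(BLESSED_PROJECTS, fake_check))
print(may_merge(BLESSED_PROJECTS, fake_check))
```

The daily verification would reuse the same list with the full test suites in place of the quick check.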

Not sure how this affects the still two-sided discussion...  :-)
Cheers,
Mike


  On 6/2/15 10:00 AM, Chris Hillery wrote:
On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <[email protected]> wrote:
In my opinion,  merging the repository doesn't break the separation of
hyracks and asterixdb, because the dependencies are controlled by mvn pom
files.
That wasn't the separation I was talking about. I meant API separation. As
it is now, when we make a change to both Asterix and Hyracks, we are forced
to consider the API implications, or at least they are put out there in a
very clear way that we need to look at. If we merge them, people will
(rightly) treat the whole thing as one product, and there will be no brakes
on making wide-ranging API changes.
(As an aside: I don't trust Maven's pom files to do a good job of keeping
the dependency management clean. In fact I trust it to do precisely the
opposite, by making it both easier to screw up the dependencies and harder
to update them in future.)
Again, my point is this: If we truly believe that Hyracks is a re-usable
component, it should be treated as such from source to build to delivery.
By merging in Asterix, we are saying that Asterix is "more equal" than
other Hyracks clients, to the point that we're tacitly willing to break
those other clients in favor of simplifying Asterix development. If that is
a fair and true statement, well, then, sure, let's merge them.
1) It forces those hyracks-only changes to pass asterixdb regression
tests.  Currently hyracks-only changes are not verified by asterixdb tests.
This is a good point, I will admit. However, I think this same goal can be
met in other ways. My strong preference would be to create a set of true
API tests inside of Hyracks, which both document and test the external
Hyracks API. That will make API-breaking changes in future much easier to
spot, and also make it clear when Asterix is using internal APIs that it
should not.

2) On my local machine,  I don't need to always install hyracks and then
verify asterixdb from time to time.  Especially, switching branches seems
painful because the installed hyracks snapshot is overwritten from time to
time.
I haven't tried working on multiple Hyracks branches at the same time, so I
haven't experienced this. This seems like a working method error, though.
If you're working with two things that are "the same version" (even if
that's a snapshot version), you'll need to use separate Maven repositories
to install them. In fact, merging the two git repositories would do nothing
to fix this problem, would it? If the proposal is to put the two source
repositories in the same git repo but otherwise leave them untouched, then
nothing would change in the build process. It's possible I'm missing
something there, though.

3) I only need to make one code review request and one jenkins job.
Currently I need to manually change the topic of my asterixdb gerrit CL
every time before I update my hyracks CL, and then manually schedule
jenkins to run a new asterixdb job.  If I forget to schedule the jenkins
job, the asterixdb CL is still shown to be "verified by jenkins".
This is a problem, but it's a problem in commit validation, not in the
source. Modifying the source to work around these issues is still a bad
idea IMHO.
The "change-topic" issue could be fixed with a bit of development work
(have the topic point to a change, rather than a specific patchset on the
change, so you only need to set it once, for instance).
As for manually scheduling Asterix Jenkins jobs, that sounds like it's only
a problem where your Hyracks change breaks an existing public API. That
would be obviated by having true API testing inside of Hyracks, which is
something that we should have regardless of any decisions about source
locations.
In summary / repeating myself again: yes, we have some problems because
Hyracks and Asterix are in separate repositories. But those problems are
pointing out true issues with our development and processes. Merging the
repositories isn't fixing those problems, it's sweeping them under the rug.
Long term we would be much better off to identify, isolate, and fix the
problems themselves.
Ceej
aka Chris Hillery

  Best regards,
Ildar

  Best regards,
Ildar


