Ok, cool. I think that we do have a plan :)
Cheers,
Till
On 11 May 2017, at 13:27, Ian Maxon wrote:
I'm fine with changing the ability to override as it's recorded as
part of the commit. I don't see any reason it can't be tied to
committer privs.
On Thu, May 11, 2017 at 1:13 PM, Till Westmann <[email protected]>
wrote:
I think that the point is, that the gerrit/jenkins admin rights are
not
necessarily aligned with the committer rights. And so storing and
changing
binaries in git is something everybody can do, while overriding in
gerrit is
not (as least I think that’s the way it is).
So I think that we should either change the test or the ability to
override.
Cheers,
Till
On 11 May 2017, at 12:30, Ian Maxon wrote:
I'm not sure I follow, you'd still need permission to change the
dataset files. Unless you mean to commit them to git? However I
don't
think thats an option either because even at the trivial size they
are
now (2 pages) that's huge git-wise.
On Thu, May 11, 2017 at 8:41 AM, abdullah alamoudi
<[email protected]>
wrote:
It will not require special access to jenkins...
On May 10, 2017, at 8:06 PM, Ian Maxon <[email protected]> wrote:
I don't really see how that would require any less intervention
than
the current job. As it stands, once it's decided that breaking the
storage for a patch is fine, all we need to do is override the -1
Verified and merge it. What it checks against is HEAD and HEAD~1
of
the patch, not the release version.
On Wed, May 10, 2017 at 7:59 PM, abdullah alamoudi
<[email protected]
<mailto:[email protected]>> wrote:
A follow up on this. So this change has been through a round of
review
and will be merged soon. Currently, it fails the storage check
test. The
storage check test works as follow:
It builds some artifacts using the previous asterixdb version and
then
tries to read them using the build with the change. The test
itself is good
as it catches those changes that modifies the storage files but
not very
convenient as one has to have access to jenkins to override them.
What I propose is to disable this test and have binary data files
with
a test that reads them. If a change changes any storage related
classes,
then it will still fail those tests until the files are updated.
At which
point, the reviewer should catch that and if it is a legitimate
change, then
it should be allowed in.
Eventually, we should have backward compatibility storage and/or
a
migration facility and then maybe we can put those back on
jenkins.
Thoughts? Proposals?
On May 8, 2017, at 10:59 PM, abdullah alamoudi
<[email protected]>
wrote:
Devs,
For some time, out storage code has been suffering from
incremental
design changes through additional use cases that come along
research
projects.
We have done some cleanup but the code base still suffered from
lots
of duplicate code and unneeded work (for both developers and
CPUs).
One thing we used to do is whenever we need to access an index,
we
have to create its "IDataflowHelperFactory". This object will
contain most
things that defines the index. Interestingly, it didn't contain
all that is
needed.
For some obscure reason, typeTraits, comparatorFactories, and
bloomFilterKeyFields were places in
TreeIndexCreateOperatorDescriptor.
Different indexes had different dataflow helpers and so
sometimes,
multiple class definitions are needed for operators per index
type. and this
leads to further bloat of the system's code.
If one thinks about it, the index related objects that are
needed
during index construction are only needed when the index gets
created.
Further needs to access the index for any reason should only
gets the index
key (the path in this case).
In fact, if one follows the execution of the code, they will see
that
that is exactly what is needed and that whenever we try to
search/insert/delete/upsert/bulkload, we recompute many
artifacts that end
up being useless.
So, I proposed a change that fixes this part. Index related
objects
are only provided at index creation time and for index access,
only the
index path is required. This is done by removing the create
method from the
IIndexDataflowHelper interface and moving it to IIndexBuilder.
With this, all implementations of IIndexDataflowHelper are now
in a
single class that basically uses the path to fetch the index on
a Node
Controller.
This change gets rid of more than 3000 lines of code and makes
things
much cleaner. Classes that should go to hyracks are moved to
hyracks.
Asterix related information such as dataset id and partition
number are kept
in asterix through the introduction of DatasetLocalResource. To
show the
effect of this change, you can look at an example in
https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java
<https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java>
<https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java
<https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java>>
Look at the amount of work that was unneeded to access an
inverted
index. The only thing that was actually needed is the index
FileSplitProvider. Take another look at
https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java
<https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java>
<https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java
<https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java>>
and see how much unneeded code gets removed. This is not enough
on the
asterixdb side but it creates a good foundation that would allow
existing
and new code to get cleaner.
Please, take a look at the change
https://asterix-gerrit.ics.uci.edu/#/c/1728/
<https://asterix-gerrit.ics.uci.edu/#/c/1728/>
<https://asterix-gerrit.ics.uci.edu/#/c/1728/
<https://asterix-gerrit.ics.uci.edu/#/c/1728/>> if interested
and let me
know if you have any comments. Note that it fails storage check
test and
that is expected because it changes the persisted resource info
classes.
Cheers,
Abdullah.