I'm not sure I follow, you'd still need permission to change the dataset files. Unless you mean to commit them to git? However I don't think thats an option either because even at the trivial size they are now (2 pages) that's huge git-wise.
On Thu, May 11, 2017 at 8:41 AM, abdullah alamoudi <[email protected]> wrote: > It will not require special access to jenkins... > > >> On May 10, 2017, at 8:06 PM, Ian Maxon <[email protected]> wrote: >> >> I don't really see how that would require any less intervention than >> the current job. As it stands, once it's decided that breaking the >> storage for a patch is fine, all we need to do is override the -1 >> Verified and merge it. What it checks against is HEAD and HEAD~1 of >> the patch, not the release version. >> >> On Wed, May 10, 2017 at 7:59 PM, abdullah alamoudi <[email protected] >> <mailto:[email protected]>> wrote: >>> A follow up on this. So this change has been through a round of review and >>> will be merged soon. Currently, it fails the storage check test. The >>> storage check test works as follow: >>> >>> It builds some artifacts using the previous asterixdb version and then >>> tries to read them using the build with the change. The test itself is good >>> as it catches those changes that modifies the storage files but not very >>> convenient as one has to have access to jenkins to override them. >>> What I propose is to disable this test and have binary data files with a >>> test that reads them. If a change changes any storage related classes, then >>> it will still fail those tests until the files are updated. At which point, >>> the reviewer should catch that and if it is a legitimate change, then it >>> should be allowed in. >>> >>> Eventually, we should have backward compatibility storage and/or a >>> migration facility and then maybe we can put those back on jenkins. >>> >>> Thoughts? Proposals? >>> >>> >>>> On May 8, 2017, at 10:59 PM, abdullah alamoudi <[email protected]> wrote: >>>> >>>> Devs, >>>> For some time, out storage code has been suffering from incremental design >>>> changes through additional use cases that come along research projects. >>>> We have done some cleanup but the code base still suffered from lots of >>>> duplicate code and unneeded work (for both developers and CPUs). >>>> >>>> One thing we used to do is whenever we need to access an index, we have to >>>> create its "IDataflowHelperFactory". This object will contain most things >>>> that defines the index. Interestingly, it didn't contain all that is >>>> needed. >>>> For some obscure reason, typeTraits, comparatorFactories, and >>>> bloomFilterKeyFields were places in TreeIndexCreateOperatorDescriptor. >>>> Different indexes had different dataflow helpers and so sometimes, >>>> multiple class definitions are needed for operators per index type. and >>>> this leads to further bloat of the system's code. >>>> >>>> If one thinks about it, the index related objects that are needed during >>>> index construction are only needed when the index gets created. Further >>>> needs to access the index for any reason should only gets the index key >>>> (the path in this case). >>>> In fact, if one follows the execution of the code, they will see that that >>>> is exactly what is needed and that whenever we try to >>>> search/insert/delete/upsert/bulkload, we recompute many artifacts that end >>>> up being useless. >>>> >>>> So, I proposed a change that fixes this part. Index related objects are >>>> only provided at index creation time and for index access, only the index >>>> path is required. This is done by removing the create method from the >>>> IIndexDataflowHelper interface and moving it to IIndexBuilder. >>>> With this, all implementations of IIndexDataflowHelper are now in a single >>>> class that basically uses the path to fetch the index on a Node Controller. >>>> >>>> This change gets rid of more than 3000 lines of code and makes things much >>>> cleaner. Classes that should go to hyracks are moved to hyracks. Asterix >>>> related information such as dataset id and partition number are kept in >>>> asterix through the introduction of DatasetLocalResource. To show the >>>> effect of this change, you can look at an example in >>>> https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java >>>> >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java> >>>> >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java >>>> >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java>> >>>> >>>> Look at the amount of work that was unneeded to access an inverted index. >>>> The only thing that was actually needed is the index FileSplitProvider. >>>> Take another look at >>>> https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java >>>> >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java> >>>> >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java >>>> >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java>> >>>> and see how much unneeded code gets removed. This is not enough on the >>>> asterixdb side but it creates a good foundation that would allow existing >>>> and new code to get cleaner. >>>> >>>> Please, take a look at the change >>>> https://asterix-gerrit.ics.uci.edu/#/c/1728/ >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/> >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/ >>>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/>> if interested and let me >>>> know if you have any comments. Note that it fails storage check test and >>>> that is expected because it changes the persisted resource info classes. >>>> >>>> Cheers, >>>> Abdullah. >
