It will not require special access to jenkins...
> On May 10, 2017, at 8:06 PM, Ian Maxon <[email protected]> wrote: > > I don't really see how that would require any less intervention than > the current job. As it stands, once it's decided that breaking the > storage for a patch is fine, all we need to do is override the -1 > Verified and merge it. What it checks against is HEAD and HEAD~1 of > the patch, not the release version. > > On Wed, May 10, 2017 at 7:59 PM, abdullah alamoudi <[email protected] > <mailto:[email protected]>> wrote: >> A follow up on this. So this change has been through a round of review and >> will be merged soon. Currently, it fails the storage check test. The storage >> check test works as follow: >> >> It builds some artifacts using the previous asterixdb version and then tries >> to read them using the build with the change. The test itself is good as it >> catches those changes that modifies the storage files but not very >> convenient as one has to have access to jenkins to override them. >> What I propose is to disable this test and have binary data files with a >> test that reads them. If a change changes any storage related classes, then >> it will still fail those tests until the files are updated. At which point, >> the reviewer should catch that and if it is a legitimate change, then it >> should be allowed in. >> >> Eventually, we should have backward compatibility storage and/or a migration >> facility and then maybe we can put those back on jenkins. >> >> Thoughts? Proposals? >> >> >>> On May 8, 2017, at 10:59 PM, abdullah alamoudi <[email protected]> wrote: >>> >>> Devs, >>> For some time, out storage code has been suffering from incremental design >>> changes through additional use cases that come along research projects. >>> We have done some cleanup but the code base still suffered from lots of >>> duplicate code and unneeded work (for both developers and CPUs). >>> >>> One thing we used to do is whenever we need to access an index, we have to >>> create its "IDataflowHelperFactory". This object will contain most things >>> that defines the index. Interestingly, it didn't contain all that is needed. >>> For some obscure reason, typeTraits, comparatorFactories, and >>> bloomFilterKeyFields were places in TreeIndexCreateOperatorDescriptor. >>> Different indexes had different dataflow helpers and so sometimes, multiple >>> class definitions are needed for operators per index type. and this leads >>> to further bloat of the system's code. >>> >>> If one thinks about it, the index related objects that are needed during >>> index construction are only needed when the index gets created. Further >>> needs to access the index for any reason should only gets the index key >>> (the path in this case). >>> In fact, if one follows the execution of the code, they will see that that >>> is exactly what is needed and that whenever we try to >>> search/insert/delete/upsert/bulkload, we recompute many artifacts that end >>> up being useless. >>> >>> So, I proposed a change that fixes this part. Index related objects are >>> only provided at index creation time and for index access, only the index >>> path is required. This is done by removing the create method from the >>> IIndexDataflowHelper interface and moving it to IIndexBuilder. >>> With this, all implementations of IIndexDataflowHelper are now in a single >>> class that basically uses the path to fetch the index on a Node Controller. >>> >>> This change gets rid of more than 3000 lines of code and makes things much >>> cleaner. Classes that should go to hyracks are moved to hyracks. Asterix >>> related information such as dataset id and partition number are kept in >>> asterix through the introduction of DatasetLocalResource. To show the >>> effect of this change, you can look at an example in >>> https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java >>> >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java> >>> >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java >>> >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-algebra/src/main/java/org/apache/asterix/algebra/operators/physical/InvertedIndexPOperator.java>> >>> >>> Look at the amount of work that was unneeded to access an inverted index. >>> The only thing that was actually needed is the index FileSplitProvider. >>> Take another look at >>> https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java >>> >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java> >>> >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java >>> >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/3/asterixdb/asterix-metadata/src/main/java/org/apache/asterix/metadata/declared/MetadataProvider.java>> >>> and see how much unneeded code gets removed. This is not enough on the >>> asterixdb side but it creates a good foundation that would allow existing >>> and new code to get cleaner. >>> >>> Please, take a look at the change >>> https://asterix-gerrit.ics.uci.edu/#/c/1728/ >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/> >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/ >>> <https://asterix-gerrit.ics.uci.edu/#/c/1728/>> if interested and let me >>> know if you have any comments. Note that it fails storage check test and >>> that is expected because it changes the persisted resource info classes. >>> >>> Cheers, >>> Abdullah.
