I vote for including this fix in the next Asterix/Hyracks release, not this one.
Chen

On Fri, Sep 25, 2015 at 4:23 PM, Ildar Absalyamov <[email protected]> wrote:

> It did not really occur to me during today's meeting, but Preston
> pointed out that the secondary index delete fix that I proposed spans
> both the Hyracks and Asterix codebases. Thus we will either have to
> release Hyracks once again, or bite the bullet, sign the RC without
> fixing this issue, and create bug-fix releases for both Hyracks and
> Asterix right after.
>
> > On Sep 22, 2015, at 22:27, Mike Carey <[email protected]> wrote:
> >
> > Ah - that makes sense now. Thx. (And welcome back. :-))
> >
> > On 9/22/15 10:02 PM, Ildar Absalyamov wrote:
> >> Sorry for the confusion, my initial answer was not correct enough; I
> >> probably should have waited some time after I drove 1500 miles from
> >> Seattle :)
> >> The casting in the insert pipeline, which Abdullah mentioned, is
> >> needed only for secondary index insert. The reasoning behind this
> >> casting is to ensure that the record is equivalent, thus it is safe
> >> to create an open index. It is true that we can get <Pk, Sk> pairs
> >> out of the original record using get-field-by-name\index, but the
> >> cast operator is introduced merely to kill the pipeline if the
> >> dataset input is not correct.
> >> Thus the records in primary are never touched or modified, no matter
> >> what indexes were created.
> >> I am not sure, however, what the second cast in Abdullah’s plan is,
> >> and where it comes from.
> >>
> >> @Taewoo, so the scan-delete-btree-secondary-index-open test does not
> >> actually delete data from the secondary index? I have checked the
> >> plan and it has the delete operator. Maybe it is initialized with
> >> wrong parameters, I’ll have a close look.
> >>
> >>> On Sep 22, 2015, at 18:33, Mike Carey <[email protected]> wrote:
> >>>
> >>> Sounds kinda bad! Also, I wonder what happens when the compiler
> >>> encounters records in the dataset - whose type in the catalog
> >>> doesn't claim to have a given (but now indexed) open field - e.g.,
> >>> during a data scan or an access via some other path? Can Bad Things
> >>> Happen due to the compiler not properly anticipating the casted form
> >>> of the records? (Maybe I am misunderstanding something, but we
> >>> should probably take a careful look at the test cases - and make
> >>> sure we do things like add a bunch of records, then add such an
> >>> index, then add some more records, then stress-test type-related
> >>> things that come at the dataset (i) thru the index, (ii) thru a
> >>> primary dataset scan, and (iii) thru some other index.)
> >>>
> >>> On 9/22/15 4:06 PM, Taewoo Kim wrote:
> >>>> I think this issue:
> >>>> https://issues.apache.org/jira/browse/ASTERIXDB-1109 is related.
> >>>> Currently, index entries (SK, PK) are not deleted on an open-type
> >>>> secondary index during a deletion. This issue did not surface
> >>>> because every search after a secondary index search had to go
> >>>> through the primary index lookup.
> >>>>
> >>>> Best,
> >>>> Taewoo
> >>>>
> >>>> On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> Abdullah,
> >>>>>
> >>>>> If I remember correctly, whenever a secondary open index is
> >>>>> created all existing records would be casted to a proper type to
> >>>>> ensure that the index creation is valid.
> >>>>> As for the overall correctness of the casting operation,
> >>>>> semantically creating an open index is the same thing as altering
> >>>>> the dataset type. The current implementation allows only one open
> >>>>> index of a particular type to be created on a single field. If we
> >>>>> had “alter datatype” functionality, open indexing would not be
> >>>>> required at all.
> >>>>>
> >>>>>> On Sep 21, 2015, at 23:25, abdullah alamoudi <[email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>> More thoughts:
> >>>>>> I assume the intention of the cast was just to make sure that, if
> >>>>>> the open field exists, it is of the specified type. Moreover, the
> >>>>>> un-casted record should be inserted into the index.
> >>>>>> If my assumptions are not correct, please let me know ASAP.
> >>>>>>
> >>>>>> I have two thoughts on this:
> >>>>>> 1. Actually, insert plans show that the record being inserted
> >>>>>> into the primary index is the casted record, creating the issue
> >>>>>> described above.
> >>>>>>
> >>>>>> 2. I don't believe this is the right way to ensure that the open
> >>>>>> field, if it exists, is of the right type. Why not extract the
> >>>>>> field using the field-access-by-name function and then verify the
> >>>>>> type using the field tag?
> >>>>>>
> >>>>>> On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <
> >>>>>> [email protected]> wrote:
> >>>>>>
> >>>>>>> Hi Dev, @Ildar,
> >>>>>>>
> >>>>>>> In the insert pipeline for datasets with open indexes, we
> >>>>>>> introduce a cast function before the insert, and so one would
> >>>>>>> expect the records to look like the casted record type, which I
> >>>>>>> assume has {{the closed fields + a nullable field}}.
> >>>>>>>
> >>>>>>> The question is, what happens to the previously existing
> >>>>>>> records, since now the index has both records of the original
> >>>>>>> type and records of the casted type?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Abdullah.
> >>>>>>>
> >>>>> Best regards,
> >>>>> Ildar
> >>>>>
> >> Best regards,
> >> Ildar
> >>
>
> Best regards,
> Ildar
>
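
For readers of the archive, here is a toy sketch of why the missing secondary-index delete that Taewoo describes could stay hidden behind the primary index lookup. It is only an illustration under stated assumptions: the dictionaries, the "age" field, and every function name below are hypothetical and are not AsterixDB or Hyracks APIs, and the index-only search path is an assumed scenario that the thread does not itself describe.

```python
# Toy model only; none of these names come from AsterixDB or Hyracks.
primary = {}        # PK -> record
secondary = set()   # (SK, PK) entries for an index on an open field


def insert(pk, record, field="age"):
    """Insert into the primary index and, if the open field exists, the secondary."""
    primary[pk] = record
    if field in record:
        secondary.add((record[field], pk))


def buggy_delete(pk):
    """Model the reported behaviour: the (SK, PK) entry is left behind."""
    primary.pop(pk, None)


def search_via_primary(sk):
    """Secondary search followed by a primary lookup; stale entries drop out."""
    return [primary[pk] for (s, pk) in sorted(secondary) if s == sk and pk in primary]


def search_index_only(sk):
    """Hypothetical path that trusts the secondary index alone."""
    return [pk for (s, pk) in sorted(secondary) if s == sk]


insert(1, {"id": 1, "age": 30})
insert(2, {"id": 2, "age": 30})
buggy_delete(1)

via_primary = search_via_primary(30)   # only the surviving record
index_only = search_index_only(30)     # still contains the deleted PK
print(via_primary)
print(index_only)
```

In this sketch the stale (30, 1) entry never reaches the caller of search_via_primary, which matches the explanation in the thread for why the problem went unnoticed; any path that skipped the primary lookup would see it.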
