The plan has the delete operator, but the operator doesn't actually delete
entries from the secondary index, due to the following tupleFilter.accept()
code in AsterixLSMInsertDeleteOperatorNodePushable.nextFrame().
accept() returns false because the secondary key fields fed from the
underlying operator arrive as null even when those field values shouldn't
be null. (We also verified that if the same operation is applied to a
closed-type dataset with a non-open index, the delete operator does delete
entries.)

                if (tupleFilter != null) {
                    frameTuple.reset(accessor, i);
                    // When the filter rejects the tuple, the index operation
                    // is silently skipped. With null SK fields, accept()
                    // returns false, so the delete never reaches the index.
                    if (!tupleFilter.accept(frameTuple)) {
                        tmpCnt++;
                        continue;
                    }
                }
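
For illustration, here is a minimal, self-contained sketch (all names
hypothetical; the only AsterixDB convention assumed is that the first byte
of each serialized field value is its type tag) of the kind of filter
behavior that bites us here:

    // Sketch only: a filter that rejects tuples whose secondary-key
    // fields carry the null type tag. Since the first byte of each
    // serialized field is its type tag, this is a one-byte check.
    final class NullRejectingFilterSketch {
        private final byte nullTypeTag; // e.g. what ATypeTag.NULL serializes to
        private final int[] skFields;   // positions of the secondary-key fields

        NullRejectingFilterSketch(byte nullTypeTag, int[] skFields) {
            this.nullTypeTag = nullTypeTag;
            this.skFields = skFields;
        }

        // Mirrors tupleFilter.accept() above: reject the tuple if any SK
        // field is null, which silently skips the index delete.
        boolean accept(byte[] tupleData, int[] fieldStarts) {
            for (int f : skFields) {
                if (tupleData[fieldStarts[f]] == nullTypeTag) {
                    return false;
                }
            }
            return true;
        }
    }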


Young-Seok

On Tue, Sep 22, 2015 at 10:02 PM, Ildar Absalyamov <
[email protected]> wrote:

> Sorry for the confusion, my initial answer was not correct enough;
> probably I should have waited some time after driving 1500 miles from
> Seattle :)
> The casting in the insert pipeline, which Abdullah mentioned, is needed
> only for the secondary index insert. The reasoning behind this casting is
> to ensure that the record conforms to the expected type, so that it is
> safe to create an open index. It is true that we can get <Pk, Sk> pairs
> out of the original record using get-field-by-name/index, but the cast
> operator is introduced merely to kill the pipeline if the dataset input
> is not correct.
> Thus the records in the primary index are never touched or modified, no
> matter what indexes were created.
> I am not sure, however, what the second cast in Abdullah's plan is, and
> where it comes from.
>
> @Taewoo, so the scan-delete-btree-secondary-index-open test does not
> actually delete data from the secondary index? I have checked the plan
> and it has the delete operator. Maybe it is initialized with the wrong
> parameters; I'll have a closer look.
>
> > On Sep 22, 2015, at 18:33, Mike Carey <[email protected]> wrote:
> >
> > Sounds kinda bad!  Also, I wonder what happens when the compiler
> > encounters records in the dataset - whose type in the catalog doesn't
> > claim to have a given (but now indexed) open field - e.g., during a data
> > scan or an access via some other path?  Can Bad Things Happen due to the
> > compiler not properly anticipating the casted form of the records?
> > (Maybe I am misunderstanding something, but we should probably take a
> > careful look at the test cases - and make sure we do things like add a
> > bunch of records, then add such an index, then add some more records,
> > then stress-test type-related things that come at the dataset (i) thru
> > the index, (ii) thru a primary dataset scan, and (iii) thru some other
> > index.)
> >
> > On 9/22/15 4:06 PM, Taewoo Kim wrote:
> >> I think this issue: https://issues.apache.org/jira/browse/ASTERIXDB-1109
> >> is related. Currently, index entries (SK, PK) are not deleted from an
> >> open-type secondary index during a deletion. The issue did not surface
> >> earlier because every secondary index search had to go through a
> >> primary index lookup.
> >>
> >> Best,
> >> Taewoo
> >>
> >> On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
> >> [email protected]> wrote:
> >>
> >>> Abdullah,
> >>>
> >>> If I remember correctly, whenever a secondary open index is created,
> >>> all existing records are cast to the proper type to ensure that the
> >>> index creation is valid.
> >>> As for the overall correctness of the casting operation: semantically,
> >>> creating an open index is the same thing as altering the dataset type.
> >>> The current implementation allows only one open index of a particular
> >>> type to be created on a single field. If we had “alter datatype”
> >>> functionality, open indexing would not be required at all.
> >>>
> >>>> On Sep 21, 2015, at 23:25, abdullah alamoudi <[email protected]> wrote:
> >>>>
> >>>> More thoughts:
> >>>> I assume the intention of the cast was just to make sure that the
> >>>> open field, if it exists, is of the specified type. Moreover, the
> >>>> un-casted record should be inserted into the index.
> >>>> If my assumptions are not correct, please let me know ASAP.
> >>>>
> >>>> I have two thoughts on this:
> >>>> 1. Insert plans show that the record being inserted into the primary
> >>>> index is actually the casted record, creating the issue described
> >>>> above.
> >>>>
> >>>> 2. I don't believe this is the right way to ensure that the open
> >>>> field, if it exists, is of the right type. Why not extract the field
> >>>> using the field-access-by-name function and then verify the type
> >>>> using the type tag? (A sketch follows below.)
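> >>>>
> >>>> Something along these lines (a hypothetical sketch; expectedTag and
> >>>> nullTag would come from ATypeTag.serialize()):
> >>>>
> >>>>     // Sketch only: given the serialized field bytes that
> >>>>     // field-access-by-name would produce, check the leading type
> >>>>     // tag instead of casting the whole record.
> >>>>     static boolean openFieldTypeOk(byte[] fieldBytes, int start,
> >>>>             byte expectedTag, byte nullTag) {
> >>>>         byte tag = fieldBytes[start];
> >>>>         // An absent open field shows up as null, which is acceptable.
> >>>>         return tag == nullTag || tag == expectedTag;
> >>>>     }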
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi
> >>>> <[email protected]> wrote:
> >>>>
> >>>>> Hi Dev, @Ildar,
> >>>>>
> >>>>> In the insert pipeline for datasets with open indexes, we introduce
> >>>>> a cast function before the insert, and so one would expect the
> >>>>> records to look like the casted record type, which I assume has
> >>>>> {{the closed fields + a nullable field}}.
> >>>>>
> >>>>> The question is: what happens to the previously existing records,
> >>>>> since the index now has both records of the original type and
> >>>>> records of the casted type?
> >>>>>
> >>>>> Thanks,
> >>>>> Abdullah.
> >>>>>
> >>> Best regards,
> >>> Ildar
> >>>
> >>>
> >
>
> Best regards,
> Ildar
>
>
