Re: Question about open indexes

abdullah alamoudi Tue, 22 Sep 2015 03:53:18 -0700

P.S.
The cast function should definitely have better output in the plan, one
that includes its input and output types.
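
To make the point concrete: in the plan quoted below, both casts print as a bare asterix:cast-record with no target type, so the plan alone does not show that they convert to different types. The following is a toy Python model of the two casts, not AsterixDB code. The field names (o_orderkey, o_custkey, o_comment), the two type definitions, and both helper functions are assumptions made up for illustration; they only borrow their names from the functions that appear in the plan.

```python
# Toy model only: none of this is AsterixDB code or API.
# A "type" is an ordered list of (field name, Python type, nullable).

def cast_record(record, declared):
    """Re-lay out `record` so that declared fields come first, in type order.

    Declared fields are type-checked; a missing nullable field becomes None.
    Fields the type does not declare are kept after them as open fields.
    """
    out = {}
    for name, py_type, nullable in declared:
        value = record.get(name)
        if value is None and not nullable:
            raise TypeError(f"missing required field {name!r}")
        if value is not None and not isinstance(value, py_type):
            raise TypeError(f"field {name!r} is not of type {py_type.__name__}")
        out[name] = value
    for name, value in record.items():
        if name not in out:
            out[name] = value  # open field: no fixed position in the type
    return out


def field_access_by_index(record, index):
    """Positional access; only meaningful for fields the type declares."""
    return list(record.values())[index]


# Hypothetical types: the dataset type declares only the primary key, while
# the type used for the secondary index also declares the indexed field.
OPEN_TYPE = [("o_orderkey", int, False)]
INDEX_TYPE = OPEN_TYPE + [("o_custkey", int, True)]

order = {"o_orderkey": 1, "o_comment": "x", "o_custkey": 42}

# First cast: input type -> dataset type, before the primary index insert.
stored = cast_record(order, OPEN_TYPE)

# Second cast: dataset type -> index type, so the indexed field gets a
# fixed position and can be read by index for the secondary index insert.
indexed = cast_record(stored, INDEX_TYPE)
secondary_key = field_access_by_index(indexed, 1)
```

In this model the second cast is what gives the indexed field a fixed position and a checked type; printing the target type next to each cast in the plan would make that difference visible.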




Amoudi, Abdullah.

On Tue, Sep 22, 2015 at 1:43 PM, abdullah alamoudi <[email protected]>
wrote:

> Never mind,
> I figured it out.
>
> The cast in red actually changes the record in the primary index into the
> cast record. The cast before the insert operator into the primary index
> casts the input to the open type, since the two types are compatible.
>
> Regards,
> Abdullah.
>
>
> Amoudi, Abdullah.
>
> On Tue, Sep 22, 2015 at 11:19 AM, abdullah alamoudi <[email protected]>
> wrote:
>
>> @Ildar,
>> If that is the case, then why do we cast again after the primary index
>> insert operator? If all the records are already cast, then why is the
>> second cast needed?
>>
>> For example, look at the following plan:
>> Statement:
>> insert into dataset OrdersOpen (
>> for $x in dataset Orders
>> return $x
>> );
>> Plan:
>> commit
>> -- COMMIT  |PARTITIONED|
>>   project ([$$3])
>>   -- STREAM_PROJECT  |PARTITIONED|
>>     exchange
>>     -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>>       insert into idx_Orders_Custkey on tpch:OrdersOpen from [%0->$$7]
>>       -- INDEX_INSERT_DELETE  |PARTITIONED|
>>         exchange
>>         -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>>           project ([$$3, $$7])
>>           -- STREAM_PROJECT  |PARTITIONED|
>>             assign [$$7] <- [function-call: asterix:field-access-by-index, Args:[function-call: asterix:cast-record, Args:[%0->$$4], AInt32: {8}]]
>>             -- ASSIGN  |PARTITIONED|
>>               exchange
>>               -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>>                 insert into tpch:OrdersOpen from %0->$$4 partitioned by [%0->$$3]
>>                 -- INSERT_DELETE  |PARTITIONED|
>>                   exchange
>>                   -- HASH_PARTITION_EXCHANGE [$$3]  |PARTITIONED|
>>                     assign [$$3] <- [function-call: asterix:field-access-by-index, Args:[%0->$$4, AInt32: {0}]]
>>                     -- ASSIGN  |PARTITIONED|
>>                       project ([$$4])
>>                       -- STREAM_PROJECT  |PARTITIONED|
>>                         assign [$$4] <- [function-call: asterix:cast-record, Args:[%0->$$0]]
>>                         -- ASSIGN  |PARTITIONED|
>>                           project ([$$0])
>>                           -- STREAM_PROJECT  |PARTITIONED|
>>                             exchange
>>                             -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>>                               data-scan []<-[$$5, $$0] <- tpch:Orders
>>                               -- DATASOURCE_SCAN  |PARTITIONED|
>>                                 exchange
>>                                 -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>>                                   empty-tuple-source
>>                                   -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
>>
>> What is the point of the cast in red?
>>
>>
>> Amoudi, Abdullah.
>>
>> On Tue, Sep 22, 2015 at 10:18 AM, abdullah alamoudi <[email protected]>
>> wrote:
>>
>>> I see.
>>> Thanks Ildar,
>>>
>>> Abdullah.
>>>
>>> Amoudi, Abdullah.
>>>
>>> On Tue, Sep 22, 2015 at 10:04 AM, Ildar Absalyamov <
>>> [email protected]> wrote:
>>>
>>>> Abdullah,
>>>>
>>>> If I remember correctly, whenever a secondary open index is created, all
>>>> existing records are cast to the proper type to ensure that the index
>>>> creation is valid.
>>>> As for the overall correctness of the casting operation: semantically,
>>>> creating an open index is the same thing as altering the dataset type. The
>>>> current implementation allows only one open index of a particular type
>>>> to be created on a single field. If we had “alter datatype”
>>>> functionality, open indexing would not be required at all.
>>>>
>>>> > On Sep 21, 2015, at 23:25, abdullah alamoudi <[email protected]>
>>>> > wrote:
>>>> >
>>>> > More thoughts:
>>>> > I assume the intention of the cast was just to make sure that, if the
>>>> > open field exists, it is of the specified type. Moreover, the uncast
>>>> > record should be inserted into the index.
>>>> > If my assumptions are not correct, please let me know ASAP.
>>>> >
>>>> > I have two thoughts on this:
>>>> > 1. Insert plans show that the record being inserted into the primary
>>>> > index is actually the cast record, which creates the issue described
>>>> > above.
>>>> >
>>>> > 2. I don't believe this is the right way to ensure that the open field,
>>>> > if it exists, is of the right type. Why not extract the field using the
>>>> > field access by name function and then verify the type using the field
>>>> > tag?
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <[email protected]>
>>>> > wrote:
>>>> >
>>>> >> Hi Dev, @Ildar,
>>>> >>
>>>> >> In the insert pipeline for datasets with open indexes, we introduce a
>>>> >> cast function before the insert, so one would expect the records to
>>>> >> look like the cast record type, which I assume has {{the closed fields
>>>> >> + a nullable field}}.
>>>> >>
>>>> >> The question is: what happens to the previously existing records,
>>>> >> since the index now has both records of the original type and records
>>>> >> of the cast type?
>>>> >>
>>>> >> Thanks,
>>>> >> Abdullah.
>>>> >>
>>>>
>>>> Best regards,
>>>> Ildar
>>>>
>>>>
>>>
>>
>
