Re: Homogeneous lists with nullable items

Wail Alkowaileet Sat, 19 Dec 2015 22:16:25 -0800

I have a small thought on that one: would 0 = null work for numerical sparse
lists? Or would it be better to extend the complex types to have "vectors and
matrices"?
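
To make the trade-off concrete, here is a minimal sketch of the bit mask idea
from Ildar's original mail (quoted below). It is not AsterixDB's actual
serialization format: the layout (int32 item count, then a null bitmap, then
only the non-null int64 values) and the function names are assumptions made
up for illustration. The point is that a bitmap keeps a genuine 0 distinct
from null, which a 0 = null convention cannot do.

```python
import struct
from typing import List, Optional


def encode_nullable_int64_list(items: List[Optional[int]]) -> bytes:
    """Hypothetical layout: int32 count | null bitmap | non-null int64 values."""
    n = len(items)
    bitmap = bytearray((n + 7) // 8)  # one bit per item, rounded up to bytes
    values = []
    for i, item in enumerate(items):
        if item is not None:
            bitmap[i // 8] |= 1 << (i % 8)  # bit set means "item is present"
            values.append(item)
    header = struct.pack(">i", n)
    payload = struct.pack(f">{len(values)}q", *values)  # only non-null items
    return header + bytes(bitmap) + payload


def decode_nullable_int64_list(buf: bytes) -> List[Optional[int]]:
    """Inverse of encode_nullable_int64_list (same assumed layout)."""
    (n,) = struct.unpack_from(">i", buf, 0)
    bitmap_len = (n + 7) // 8
    bitmap = buf[4:4 + bitmap_len]
    offset = 4 + bitmap_len
    out: List[Optional[int]] = []
    for i in range(n):
        if bitmap[i // 8] & (1 << (i % 8)):
            (value,) = struct.unpack_from(">q", buf, offset)
            offset += 8
            out.append(value)
        else:
            out.append(None)  # absent item costs one bit, no payload bytes
    return out


if __name__ == "__main__":
    sparse = [0, None, 7, None, None, -3]
    encoded = encode_nullable_int64_list(sparse)
    # Round trip keeps the real 0 at index 0 distinct from the nulls.
    print(decode_nullable_int64_list(encoded) == sparse)
```

With this layout a null item costs one bit, so a mostly-null list of int64
values shrinks roughly in proportion to its sparsity, while a dense list pays
only the bitmap overhead.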


On Fri, Dec 18, 2015 at 5:45 PM, Mike Carey <[email protected]> wrote:

> Agreed.  We probably need a mini design doc here. The short term urgency
> seems to be a need to represent lists that can include nulls, as this is
> blocking JPL and is also something easily produced by queries (AQL or
> SQL++).  Longer term one can imagine this being something that
> might vary (at the lowest level of detail) by list, e.g., you might
> represent dense and sparse lists quite differently, you might use
> compression for certain kinds of lists, etc.
>
>
> On 12/18/15 1:57 AM, Till Westmann wrote:
>
>> Hi Ildar,
>>
>> it seems that we have 2 separate points here:
>> 1) There are bugs in the way we decide which list representation to use
>> and
>> 2) we could add support for (and an optimized representation for) a list
>> of a fixed but nullable type.
>> It seems that - by fixing 1) - we could get rid of the issues you’ve
>> listed.
>>
>> But I also think that it would be nice to support lists of a nullable
>> type (feels like an omission that we don’t support that in the language) -
>> and potentially provide an efficient representation for them.
>> However, it is not clear to me how we would do this.
>> A few thoughts:
>> - Would we maintain the current representation for homogeneous lists of
>> non-nullable types?
>> - Would we introduce a new type tag for “nullable lists”?
>> - Would we redefine the current representation to mean something else?
>> Do you have thoughts on those?
>>
>> Cheers,
>> Till
>>
>> On 16 Dec 2015, at 8:12, Ildar Absalyamov wrote:
>>
>>> Hi devs,
>>>
>>> Recently I have been playing around with lists and with functions that
>>> receive or return lists as parameters or values. I have noticed one
>>> particular behavior that seems to be incorrect.
>>> As you might know, internally we support 2 types of lists: homogeneous,
>>> where all the items are untagged and the item type is stored in the type
>>> definition, and heterogeneous, where the items, on the contrary, are
>>> tagged and the list item type is effectively ANY.
>>> The decision about which of the two types is used is usually made by the
>>> parser or is altered by IntroduceEnforcedListTypeRule, which effectively
>>> turns a heterogeneous list into a homogeneous one if all the items have
>>> the same type.
>>> Right now we allow only homogeneous lists to be defined as a field in
>>> some type, and we also restrict the item type to be non-nullable:
>>> create type listType as {
>>> "id": int64,
>>> "list": [int64]   // [int64?] is not possible
>>> }
>>>
>>> This constraint spans both the language level and serialization. Under
>>> that restriction the only way to load a list that contains null values
>>> is to make the appropriate field open (open lists are heterogeneous by
>>> definition).
>>>
>>> 1) Seems like we’re missing an optimization opportunity when we are
>>> dealing with large sparse lists. Serialization in this case might use a bit
>>> mask to specify which items in the lists are not null, and later encode
>>> only those items.
>>> 2) I believe if we alter IntroduceEnforcedListTypeRule to turn such a
>>> list into a homogeneous list with a nullable item type, we might resolve
>>> issues
>>> https://issues.apache.org/jira/browse/ASTERIXDB-905,
>>> https://issues.apache.org/jira/browse/ASTERIXDB-867,
>>> https://issues.apache.org/jira/browse/ASTERIXDB-1131 all at once.
>>>
>>> Thoughts?
>>>
>>> Best regards,
>>> Ildar
>>>
>>
>


-- 

*Regards,*
Wail Alkowaileet
