I think Preston's suggestion of looking at the AsterixDB implementation of
its binary data model is a good one, as it shares the requirement of
efficient field access by name, and several VXQuery folks are experts in its
details as well.  I believe it uses a sorted list internally rather than a
hash table, which is perhaps slightly simpler for updates.
On May 9, 2016 7:35 AM, "Riyafa Abdul Hameed" <[email protected]>
wrote:

Hi again,

I have been thinking of Till's suggestion of using a dictionary, and I
think it would be a better alternative because then we wouldn't have to
process the valuetag of the value of a particular key before moving to the
next key. Hence it would be easy to implement the jdm:keys method. Any
suggestions? Shall I update the wiki and the doc based on this?

Thank you.
Riyafa

On 9 May 2016 at 19:21, Riyafa Abdul Hameed <[email protected]> wrote:

> Hi Till,
>
> Currently I have suggested storing each key followed by its value. This
> uses less space and is quite similar to storing the offsets of the values,
> and the access is also linear in the number of keys.
>
> Thanks.
> Riyafa
>
> On 9 May 2016 at 18:54, Till Westmann <[email protected]> wrote:
>
>> All of this looks pretty good!
>>
>> Wrt. the question of the dictionary for the fields, I think that we should
>> consider the 2 ways that we can access an object:
>> 1. Either we get all keys (jdm:keys) or
>> 2. we get a value for a key (jdm:value).
>>
>> To get all the keys efficiently and to be able to skip huge nested values,
>> a simple approach could be to store a dictionary of the keys (in their
>> original order) with pointers (offsets) to the values. That way we could
>> get the keys quickly by scanning the dictionary and each value by scanning
>> the dictionary + 1 hop to find the value. This certainly has the problem
>> that the access is linear in the number of keys. But it is reasonably
>> simple and it would allow us to get a correct + testable implementation
>> relatively soon and to have a baseline for a more optimized representation.
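A minimal sketch of the key dictionary described above, assuming a made-up byte layout: a key count, then one dictionary entry per key (key length, UTF-8 key bytes, absolute offset of the value), then the length-prefixed values. The class name `KeyDictionarySketch`, its methods, and the layout are illustrative assumptions, not the VXQuery format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class KeyDictionarySketch {
    // Hypothetical layout (an assumption for illustration only):
    //   int keyCount
    //   keyCount x { int keyLength, key bytes (UTF-8), int valueOffset }
    //   values, each stored as { int valueLength, value bytes }
    // valueOffset is an absolute position in the encoded array.

    static byte[] encode(LinkedHashMap<String, byte[]> fields) {
        int dictSize = 4;
        int valuesSize = 0;
        List<byte[]> keys = new ArrayList<>();
        for (Map.Entry<String, byte[]> e : fields.entrySet()) {
            byte[] k = e.getKey().getBytes(StandardCharsets.UTF_8);
            keys.add(k);
            dictSize += 4 + k.length + 4;
            valuesSize += 4 + e.getValue().length;
        }
        ByteBuffer buf = ByteBuffer.allocate(dictSize + valuesSize);
        buf.putInt(fields.size());
        int valueOffset = dictSize; // the first value starts right after the dictionary
        int i = 0;
        for (byte[] v : fields.values()) {
            byte[] k = keys.get(i++);
            buf.putInt(k.length).put(k).putInt(valueOffset);
            valueOffset += 4 + v.length;
        }
        for (byte[] v : fields.values()) {
            buf.putInt(v.length).put(v);
        }
        return buf.array();
    }

    // All keys in original order: scans only the dictionary, never the values.
    static List<String> keys(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int count = buf.getInt();
        List<String> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            int len = buf.getInt();
            result.add(new String(data, buf.position(), len, StandardCharsets.UTF_8));
            buf.position(buf.position() + len + 4); // skip key bytes and the offset
        }
        return result;
    }

    // Value for one key: linear scan of the dictionary plus one hop to the value.
    static byte[] value(byte[] data, String key) {
        byte[] wanted = key.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.wrap(data);
        int count = buf.getInt();
        for (int i = 0; i < count; i++) {
            int len = buf.getInt();
            int keyStart = buf.position();
            buf.position(keyStart + len);
            int valueOffset = buf.getInt();
            if (Arrays.equals(Arrays.copyOfRange(data, keyStart, keyStart + len), wanted)) {
                int valueLength = buf.getInt(valueOffset); // absolute read, position unchanged
                return Arrays.copyOfRange(data, valueOffset + 4, valueOffset + 4 + valueLength);
            }
        }
        return null; // key not present
    }
}
```

Because the values sit behind the dictionary, a large nested value is never touched while listing keys, which is the property the proposal is after. The lookup cost stays linear in the number of keys.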
>>
>> Thoughts?
>>
>> Cheers,
>> Till
>>
>> [1] http://jsoniq.org/docs/JSONiqExtensionToXQuery/html-single/index.html#idm139680641300880
>>
>> On 8 May 2016, at 22:19, Riyafa Abdul Hameed wrote:
>>
>> Hi Preston,
>>>
>>> I have edited the wiki[1] and the doc[2] based on the comments. Thank you
>>> for the suggestions provided. I have removed the part that assigns an id
>>> to the keys and instead suggested that the keys be stored in the order
>>> they appear in the JSON object. I am not sure I understand the concept of
>>> the hashcode: how are the hashcodes used for easy lookup generated?
>>>
>>>
>>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
>>> [2] https://drive.google.com/open?id=1-wT0pE8rTTNIzuY4iTgvhqkdHmKGek4CgNthXN6mlm0
>>>
>>> Thank you again.
>>>
>>> Yours sincerely,
>>> Riyafa
>>>
>>> On 9 May 2016 at 01:23, christina pavlopoulou <[email protected]> wrote:
>>>
>>> Hi,
>>>>
>>>> I updated the wiki page according to Preston's comments along with the
>>>> json array example in [1].
>>>>
>>>> [1] https://docs.google.com/document/d/1GOAcvhw_F9cJrNmRq2TwZxI0wYRmvLEV3mywJS4H9Lg/edit
>>>>
>>>> Thank you,
>>>> Christina
>>>>
>>>> On 5/8/2016 9:43 AM, Preston Carman wrote:
>>>>
>>>> Nice job guys. I can see you are picking up how to create a data
>>>>> model. I have limited my comments to the wiki [1] for now. At a high
>>>>> level, I was impressed with your detail and thoughtful layouts. It
>>>>> reminds me of the age-old trade-off: speed vs. space. At this time,
>>>>> let's err on the side of saving space. The data model should be as
>>>>> compact as possible.
>>>>>
>>>>> I also found the AsterixDB serialization [2] we can use as a
>>>>> reference. Even though the AsterixDB data model includes object
>>>>> length, I would leave that out since all the XQuery data models do not
>>>>> include this property.
>>>>>
>>>>> Riyafa, take a look at the method AsterixDB uses for quick lookups (a
>>>>> hash value for the name). Consider the pros and cons between your
>>>>> method and AsterixDB's method: a list of hash values for the names and
>>>>> a sorted list of names.
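To make the hash-based lookup idea concrete, here is a small sketch that keeps a sorted list of name hashes and binary-searches it, checking the actual name on a hash match to handle collisions. It only illustrates the general technique; the class `HashedNameLookupSketch`, the use of `String.hashCode()`, and the packed `long` slots are assumptions and do not reproduce AsterixDB's serialization format.

```java
import java.util.Arrays;

final class HashedNameLookupSketch {
    private final String[] names;     // field names in their original order
    private final long[] sortedSlots; // (hash << 32) | fieldIndex, sorted ascending

    HashedNameLookupSketch(String[] names) {
        this.names = names.clone();
        this.sortedSlots = new long[names.length];
        for (int i = 0; i < names.length; i++) {
            // Pack the 32-bit hash into the high half and the index into the low half,
            // so sorting the longs orders the entries by hash, then by index.
            sortedSlots[i] = ((long) names[i].hashCode() << 32) | i;
        }
        Arrays.sort(sortedSlots);
    }

    // Returns the original position of the field, or -1 if it is absent.
    int indexOf(String name) {
        int hash = name.hashCode();
        long firstPossibleSlot = (long) hash << 32; // same hash, index 0
        int pos = Arrays.binarySearch(sortedSlots, firstPossibleSlot);
        if (pos < 0) {
            pos = -pos - 1; // insertion point: first slot >= firstPossibleSlot
        }
        // Walk over all entries that share this hash and compare the real names.
        for (; pos < sortedSlots.length && (int) (sortedSlots[pos] >> 32) == hash; pos++) {
            int index = (int) sortedSlots[pos];
            if (names[index].equals(name)) {
                return index;
            }
        }
        return -1;
    }
}
```

The lookup is logarithmic in the number of fields, at the cost of extra bytes per field for the hash list. A sorted list of the names themselves gives the same asymptotic lookup without the hash list but compares full strings at each step.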
>>>>>
>>>>> Also, take a look at my wiki comments. It's a great start!
>>>>>
>>>>> Mahalo,
>>>>> Preston
>>>>>
>>>>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
>>>>> [2] https://cwiki.apache.org/confluence/display/ASTERIXDB/AsterixDB+Object+Serialization+Reference
>>>>>
>>>>> On Sat, May 7, 2016 at 6:47 PM, christina pavlopoulou <[email protected]> wrote:
>>>>>
>>>>> Hi,
>>>>>>
>>>>>> I also designed an example for the JSON array [1], given the
>>>>>> description I wrote in the wiki page.
>>>>>>
>>>>>> [1] https://docs.google.com/document/d/1GOAcvhw_F9cJrNmRq2TwZxI0wYRmvLEV3mywJS4H9Lg/edit
>>>>>>
>>>>>> Thank you,
>>>>>> Christina
>>>>>>
>>>>>>
>>>>>> On 5/7/2016 11:22 AM, Riyafa Abdul Hameed wrote:
>>>>>>
>>>>>> Hi,
>>>>>>>
>>>>>>> I am attempting to create a doc on the JSONiq data model for
>>>>>>> objects[1] (it might be full of errors because I am doing the
>>>>>>> calculations manually).
>>>>>>>
>>>>>>> This is what I have come up with for the data model for objects:
>>>>>>>
>>>>>>> The first byte would have the value tag, followed by the id (4 bytes)
>>>>>>> of the object. Then 4 bytes to represent the size of the object. Then
>>>>>>> another four bytes to represent the number of key-value pairs. The
>>>>>>> next few bytes represent the offsets of the keys which follow (each
>>>>>>> offset is represented by 4 bytes). Ids would be assigned to the keys.
>>>>>>> The next few bytes would be a sorted list of ids for keys in
>>>>>>> alphabetical order. The following bytes would represent the keys in
>>>>>>> the object. Each key is a StringPointable followed by the id of the
>>>>>>> key. Each object would have a sequence pointable: the following bytes
>>>>>>> would be the number of items (items are the values for keys) in the
>>>>>>> sequence. The next bytes would be the offset of each item in the
>>>>>>> sequence. The last bytes would be the values for each key, followed
>>>>>>> by the respective id of the key.
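As a rough illustration of the fixed part of the layout described above (value tag, object id, object size, number of key-value pairs, then one 4-byte offset per key), the header could be written and read back as follows. The class `ObjectHeaderSketch`, the tag value used in the usage note, and big-endian byte order are assumptions for the sketch; the sorted id list, the keys, and the value sequence are left out.

```java
import java.nio.ByteBuffer;

final class ObjectHeaderSketch {
    // 1 byte value tag + 4 bytes object id + 4 bytes size + 4 bytes pair count
    static final int FIXED_BYTES = 1 + 4 + 4 + 4;

    final byte valueTag;
    final int objectId;
    final int sizeInBytes;  // size of the whole object as recorded in the header
    final int[] keyOffsets; // one 4-byte offset per key-value pair

    ObjectHeaderSketch(byte valueTag, int objectId, int sizeInBytes, int[] keyOffsets) {
        this.valueTag = valueTag;
        this.objectId = objectId;
        this.sizeInBytes = sizeInBytes;
        this.keyOffsets = keyOffsets.clone();
    }

    // Serializes only the header and the key offsets (ByteBuffer is big-endian by default).
    byte[] write() {
        ByteBuffer buf = ByteBuffer.allocate(FIXED_BYTES + 4 * keyOffsets.length);
        buf.put(valueTag).putInt(objectId).putInt(sizeInBytes).putInt(keyOffsets.length);
        for (int offset : keyOffsets) {
            buf.putInt(offset);
        }
        return buf.array();
    }

    // Reads the header back in the same field order it was written.
    static ObjectHeaderSketch read(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        byte tag = buf.get();
        int id = buf.getInt();
        int size = buf.getInt();
        int pairCount = buf.getInt();
        int[] offsets = new int[pairCount];
        for (int i = 0; i < pairCount; i++) {
            offsets[i] = buf.getInt();
        }
        return new ObjectHeaderSketch(tag, id, size, offsets);
    }
}
```

For example, `new ObjectHeaderSketch((byte) 0x70, 7, 64, new int[]{21, 30}).write()` yields 21 bytes: 13 for the fixed header and 8 for the two offsets. The tag `0x70` is a placeholder, not an assigned value tag.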
>>>>>>>
>>>>>>> Hope it makes sense.
>>>>>>>
>>>>>>> My problem is,
>>>>>>>
>>>>>>> I have not provided for the white spaces in the object. What can I
>>>>>>> use to represent the white spaces? I cannot use a text node because
>>>>>>> an object is not a node.
>>>>>>>
>>>>>>>
>>>>>>> [1] https://drive.google.com/open?id=1-wT0pE8rTTNIzuY4iTgvhqkdHmKGek4CgNthXN6mlm0
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Yours sincerely,
>>>>>>> Riyafa
>>>>>>>
>>>>>>>
>>>>>>> On 26 April 2016 at 10:29, Preston Carman <[email protected]> wrote:
>>>>>>>
>>>>>>>> We have two students working with us this summer through GSOC to
>>>>>>>> complete the JSONiq specification for arrays and objects. I think
>>>>>>>> the first step is to define the data model used by JSONiq. The data
>>>>>>>> model should be defined in our wiki [1] before coding starts this
>>>>>>>> summer. The wiki will allow the community to discuss the JSON data
>>>>>>>> model implementation in VXQuery.
>>>>>>>>
>>>>>>>> I updated the JSONiq wiki to help get the documentation started.
>>>>>>>> Please fill in the JSON data model based on the examples seen on our
>>>>>>>> website (links on the wiki page).
>>>>>>>>
>>>>>>>> Post here if you have any questions.
>>>>>>>>
>>>>>>>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>>
>>> --
>>> Riyafa Abdul Hameed
>>> Undergraduate, University of Moratuwa
>>>
>>> Email: [email protected]
>>> Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
>>> <http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
>>> <http://twitter.com/Riyafa1>
>>>
>>
>
>
>



