Hi,

Is there any documentation I could go through to understand the AsterixDB
hash code implementation for open fields? I am not sure I understand
enough from the AsterixDB serialization reference [1] to define the data
model for objects using it.

Sorry about any confusion.

[1]
https://cwiki.apache.org/confluence/display/ASTERIXDB/AsterixDB+Object+Serialization+Reference

Thank you.
Riyafa

On 9 May 2016 at 20:16, Michael J. Carey <[email protected]> wrote:

> I think Preston's suggestion of looking at the AsterixDB implementation of
> its binary data model is a good one, as it shares the requirement of
> efficient field access by name, and several VXQuery folks are experts in
> its details as well. I believe it uses a sorted list instead of a hash
> table internally, which is perhaps slightly simpler for updates.
>
> On May 9, 2016 7:35 AM, "Riyafa Abdul Hameed" <[email protected]> wrote:
>
> Hi again,
>
> I have been thinking about Till's suggestion of using a dictionary, and I
> think it would be a better alternative because then we wouldn't have to
> process the value tag of the value of a particular key before moving to the
> next key. Hence it would be easy to implement the jdm:keys method. Any
> suggestions? Shall I update the wiki and the doc based on this?
>
> Thank you.
> Riyafa
>
> On 9 May 2016 at 19:21, Riyafa Abdul Hameed <[email protected]>
> wrote:
>
> > Hi Till,
> >
> > Currently I have suggested storing each key followed by its value. This
> > uses less space, is quite similar to storing the offsets of the values,
> > and access is likewise linear in the number of keys.
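
For illustration, the key-followed-by-value layout could look roughly like the sketch below. Everything in it is an assumption made for the example: the class name, the two value tags, and the 4-byte length prefixes are hypothetical and are not taken from the wiki or from VXQuery. It shows why listing the keys has to inspect each value tag in order to skip over the value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pairs are stored back to back as
// [key length][key bytes][value tag][value payload].
final class InlinePairsSketch {
    static final byte TAG_INT = 1;    // assumed tag: a 4-byte integer follows
    static final byte TAG_STRING = 2; // assumed tag: 4-byte length + UTF-8 bytes follow

    static void putString(ByteBuffer buf, String key, String value) {
        putKey(buf, key);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        buf.put(TAG_STRING).putInt(v.length).put(v);
    }

    static void putInt(ByteBuffer buf, String key, int value) {
        putKey(buf, key);
        buf.put(TAG_INT).putInt(value);
    }

    private static void putKey(ByteBuffer buf, String key) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        buf.putInt(k.length).put(k);
    }

    // Lists all keys of a flipped buffer. Each value must be skipped,
    // which requires decoding its tag to learn its size.
    static List<String> keys(ByteBuffer buf) {
        List<String> result = new ArrayList<>();
        ByteBuffer view = buf.duplicate(); // leave the caller's position untouched
        while (view.hasRemaining()) {
            byte[] k = new byte[view.getInt()];
            view.get(k);
            result.add(new String(k, StandardCharsets.UTF_8));
            skipValue(view);
        }
        return result;
    }

    private static void skipValue(ByteBuffer view) {
        byte tag = view.get();
        int size;
        switch (tag) {
            case TAG_INT: size = 4; break;
            case TAG_STRING: size = view.getInt(); break;
            default: throw new IllegalStateException("unknown tag " + tag);
        }
        view.position(view.position() + size);
    }
}
```

In this sketch a nested value would need its own size rule in skipValue, which is the cost the dictionary approach avoids.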
> >
> > Thanks.
> > Riyafa
> >
> > On 9 May 2016 at 18:54, Till Westmann <[email protected]> wrote:
> >
> >> All of this looks pretty good!
> >>
> >> Wrt. the question of the dictionary for the fields, I think that we should
> >> consider the 2 ways that we can access an object:
> >> 1. Either we get all keys (jdm:keys) or
> >> 2. we get a value for a key (jdm:value).
> >>
> >> To get all the keys efficiently and to be able to skip huge nested values, a
> >> simple approach could be to store a dictionary of the keys (in their original
> >> order) with pointers (offsets) to the values. That way we could get the keys
> >> quickly by scanning the dictionary and each value by scanning the dictionary
> >> + 1 hop to find the value. This certainly has the problem that the access
> >> is linear in the number of keys. But it is reasonably simple and it
> >> would allow us to get a correct + testable implementation relatively soon
> >> and to have a baseline for a more optimized representation.
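
A minimal sketch of the dictionary idea described above, under stated assumptions: the class name, the 4-byte length and offset fields, and the restriction to string values are all hypothetical choices for the example, not the layout from the wiki. The keys are listed by scanning only the dictionary, and a value is found by scanning the dictionary and then making one hop to the stored offset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical layout:
// [pair count] ([key length][key bytes][value offset])* ([value length][value bytes])*
final class KeyDictionarySketch {

    // Encodes the pairs in the map's iteration order (use a LinkedHashMap to keep input order).
    static byte[] encode(Map<String, String> object) {
        List<byte[]> keys = new ArrayList<>();
        List<byte[]> values = new ArrayList<>();
        int dictSize = 4; // the pair count
        int valuesSize = 0;
        for (Map.Entry<String, String> e : object.entrySet()) {
            byte[] k = e.getKey().getBytes(StandardCharsets.UTF_8);
            byte[] v = e.getValue().getBytes(StandardCharsets.UTF_8);
            keys.add(k);
            values.add(v);
            dictSize += 4 + k.length + 4;
            valuesSize += 4 + v.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(dictSize + valuesSize);
        buf.putInt(keys.size());
        int valueOffset = dictSize; // values start right after the dictionary
        for (int i = 0; i < keys.size(); i++) {
            buf.putInt(keys.get(i).length).put(keys.get(i)).putInt(valueOffset);
            valueOffset += 4 + values.get(i).length;
        }
        for (byte[] v : values) {
            buf.putInt(v.length).put(v);
        }
        return buf.array();
    }

    // All keys: scans the dictionary only and never touches a value.
    static List<String> keys(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int count = buf.getInt();
        List<String> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(readKey(buf));
            buf.getInt(); // skip the value offset
        }
        return result;
    }

    // One value: linear scan of the dictionary, then one hop to the offset.
    static String value(byte[] data, String key) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int count = buf.getInt();
        for (int i = 0; i < count; i++) {
            String candidate = readKey(buf);
            int offset = buf.getInt();
            if (candidate.equals(key)) {
                int length = buf.getInt(offset); // absolute read, position unchanged
                return new String(data, offset + 4, length, StandardCharsets.UTF_8);
            }
        }
        return null; // key not present
    }

    private static String readKey(ByteBuffer buf) {
        int length = buf.getInt();
        String key = new String(buf.array(), buf.position(), length, StandardCharsets.UTF_8);
        buf.position(buf.position() + length);
        return key;
    }
}
```

Both operations stay linear in the number of keys, as noted above, but a large nested value costs nothing to skip because it is never read.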
> >>
> >> Thoughts?
> >>
> >> Cheers,
> >> Till
> >>
> >> [1] http://jsoniq.org/docs/JSONiqExtensionToXQuery/html-single/index.html#idm139680641300880
> >>
> >> On 8 May 2016, at 22:19, Riyafa Abdul Hameed wrote:
> >>
> >>> Hi Preston,
> >>>
> >>> I have edited the wiki [1] and the doc [2] based on the comments. Thank you
> >>> for the suggestions provided. I have removed the part that assigns an id to
> >>> the keys and instead suggested that the keys be stored in the order they
> >>> appear in the JSON object. I am not sure I understand the concept of the
> >>> hash code: how are the hash codes used for easy lookup generated?
> >>>
> >>>
> >>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
> >>> [2] https://drive.google.com/open?id=1-wT0pE8rTTNIzuY4iTgvhqkdHmKGek4CgNthXN6mlm0
> >>>
> >>> Thank you again.
> >>>
> >>> Yours sincerely,
> >>> Riyafa
> >>>
> >>> On 9 May 2016 at 01:23, christina pavlopoulou <[email protected]> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I updated the wiki page according to Preston's comments, along with the
> >>>> JSON array example in [1].
> >>>>
> >>>> [1] https://docs.google.com/document/d/1GOAcvhw_F9cJrNmRq2TwZxI0wYRmvLEV3mywJS4H9Lg/edit
> >>>>
> >>>> Thank you,
> >>>> Christina
> >>>>
> >>>> On 5/8/2016 9:43 AM, Preston Carman wrote:
> >>>>
> >>>>> Nice job guys. I can see you are picking up how to create a data
> >>>>> model. I have limited my comments to the wiki [1] for now. At a high
> >>>>> level, I was impressed with your detail and thoughtful layouts. It
> >>>>> reminds me of the age-old trade-off: speed vs. space. At this time,
> >>>>> let's err on the side of saving space. The data model should be as
> >>>>> compact as possible.
> >>>>>
> >>>>> I also found the AsterixDB serialization [2], which we can use as a
> >>>>> reference. Even though the AsterixDB data model includes the object
> >>>>> length, I would leave that out since none of the XQuery data models
> >>>>> include this property.
> >>>>>
> >>>>> Riyafa, take a look at the method AsterixDB uses for quick lookups (a
> >>>>> hash value for the name). Consider the pros and cons of your method and
> >>>>> AsterixDB's method: a list of hash values for the names and a sorted
> >>>>> list of names.
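
To make the hash-value lookup concrete, here is a rough sketch of the general technique, not AsterixDB's actual code. The class name is hypothetical, values are plain strings for simplicity, and Java's String.hashCode stands in for whatever hash function a real serializer would use. Field hashes are kept sorted so a lookup can binary-search fixed-width integers and compare names only to resolve collisions.

```java
import java.util.Arrays;

// Hypothetical index: field hashes sorted ascending, with names and values in the same order.
final class HashedFieldIndexSketch {
    private final int[] hashes;
    private final String[] names;
    private final String[] values;

    HashedFieldIndexSketch(String[] fieldNames, String[] fieldValues) {
        int n = fieldNames.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort field positions by the hash of the field name.
        Arrays.sort(order, (Integer a, Integer b) ->
                Integer.compare(fieldNames[a].hashCode(), fieldNames[b].hashCode()));
        hashes = new int[n];
        names = new String[n];
        values = new String[n];
        for (int i = 0; i < n; i++) {
            names[i] = fieldNames[order[i]];
            values[i] = fieldValues[order[i]];
            hashes[i] = names[i].hashCode();
        }
    }

    // Binary search on the hash, then compare names within the run of equal hashes.
    String get(String name) {
        int h = name.hashCode();
        int pos = Arrays.binarySearch(hashes, h);
        if (pos < 0) {
            return null; // no field with this hash
        }
        // binarySearch may land anywhere inside a run of equal hashes; rewind to its start.
        while (pos > 0 && hashes[pos - 1] == h) {
            pos--;
        }
        for (int i = pos; i < hashes.length && hashes[i] == h; i++) {
            if (names[i].equals(name)) {
                return values[i];
            }
        }
        return null; // hash matched but the name did not
    }
}
```

Compared with a sorted list of names, the search compares 4-byte integers instead of variable-length strings, at the cost of storing a hash per field and handling collisions such as "Aa" and "BB" under String.hashCode.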
> >>>>>
> >>>>> Also, take a look at my wiki comments. It's a great start!
> >>>>>
> >>>>> Mahalo,
> >>>>> Preston
> >>>>>
> >>>>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
> >>>>> [2] https://cwiki.apache.org/confluence/display/ASTERIXDB/AsterixDB+Object+Serialization+Reference
> >>>>>
> >>>>> On Sat, May 7, 2016 at 6:47 PM, christina pavlopoulou <[email protected]> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I also designed an example for the JSON array [1], given the description I
> >>>>>> wrote in the wiki page.
> >>>>>>
> >>>>>> [1] https://docs.google.com/document/d/1GOAcvhw_F9cJrNmRq2TwZxI0wYRmvLEV3mywJS4H9Lg/edit
> >>>>>>
> >>>>>> Thank you,
> >>>>>> Christina
> >>>>>>
> >>>>>>
> >>>>>> On 5/7/2016 11:22 AM, Riyafa Abdul Hameed wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I am attempting to create a doc on the JSONiq data model for objects [1]
> >>>>>>> (it might be full of errors because I am doing the calculations manually).
> >>>>>>>
> >>>>>>> This is what I have come up with for the data model for objects:
> >>>>>>>
> >>>>>>> The first byte would hold the value tag, followed by the id (4 bytes) of
> >>>>>>> the object. Then come 4 bytes for the size of the object, and another
> >>>>>>> 4 bytes for the number of key-value pairs. The next few bytes
> >>>>>>> represent the offsets of the keys that follow (each offset takes 4 bytes).
> >>>>>>> Ids would be assigned to the keys. The next few bytes would be a sorted
> >>>>>>> list of the key ids, in alphabetical order of the keys. The following
> >>>>>>> bytes would represent the keys in the object. Each key is a
> >>>>>>> StringPointable followed by the id of the key. Each object would have a
> >>>>>>> sequence pointable: the following bytes would be the number of items
> >>>>>>> (items are the values for keys) in the sequence. The next bytes would be
> >>>>>>> the offset of each item in the sequence. The last bytes would be the
> >>>>>>> values for each key, each followed by the respective id of the key.
> >>>>>>>
> >>>>>>> Hope it makes sense.
> >>>>>>>
> >>>>>>> My problem is this: I have not provided for the whitespace in the
> >>>>>>> object. What can I use to represent the whitespace? I cannot use a
> >>>>>>> text node because an object is not a node.
> >>>>>>>
> >>>>>>>
> >>>>>>> [1] https://drive.google.com/open?id=1-wT0pE8rTTNIzuY4iTgvhqkdHmKGek4CgNthXN6mlm0
> >>>>>>>
> >>>>>>> Thank you.
> >>>>>>>
> >>>>>>> Yours sincerely,
> >>>>>>> Riyafa
> >>>>>>>
> >>>>>>>
> >>>>>>> On 26 April 2016 at 10:29, Preston Carman <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> We have two students working with us this summer through GSoC to complete
> >>>>>>>> the JSONiq specification for arrays and objects. I think the first step is
> >>>>>>>> to define the data model used by JSONiq. The data model should be defined
> >>>>>>>> in our wiki [1] before coding starts this summer. The wiki will allow the
> >>>>>>>> community to discuss the JSON data model implementation in VXQuery.
> >>>>>>>>
> >>>>>>>> I updated the JSONiq wiki to help get the documentation started. Please
> >>>>>>>> fill in the JSON data model based on the examples seen on our website
> >>>>>>>> (links on the wiki page).
> >>>>>>>>
> >>>>>>>> Post here if you have any questions.
> >>>>>>>>
> >>>>>>>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>
> >>>
> >>> --
> >>> Riyafa Abdul Hameed
> >>> Undergraduate, University of Moratuwa
> >>>
> >>> Email: [email protected]
> >>> Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
> >>> <http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
> >>> <http://twitter.com/Riyafa1>
> >>>
> >>
> >
> >
> >
>
>
>
>



