I think Preston's suggestion of looking at the AsterixDB implementation of
its binary data model is a good one, since it shares the requirement of
efficient field access by name, and several VXQuery folks are experts in its
details as well. I believe it uses a sorted list instead of a hash table
internally, which is perhaps slightly simpler for updates.

On May 9, 2016 7:35 AM, "Riyafa Abdul Hameed" <[email protected]> wrote:

Hi again,

I have been thinking about Till's suggestion of using a dictionary, and I
think it would be a better alternative, because then we wouldn't have to
process the valuetag of the value of a particular key before moving on to
the next key. Hence it would be easy to implement the jdm:keys method.

Any suggestions? Shall I update the wiki and the doc based on this?

Thank you.
Riyafa

On 9 May 2016 at 19:21, Riyafa Abdul Hameed <[email protected]> wrote:

> Hi Till,
>
> Currently I have suggested storing each key followed by its value. This
> uses less space and is quite similar to storing the offsets of the values,
> and the access is also linear in the number of keys.
>
> Thanks.
> Riyafa
>
> On 9 May 2016 at 18:54, Till Westmann <[email protected]> wrote:
>
>> All of this looks pretty good!
>>
>> Wrt. the question of the dictionary for the fields, I think that we
>> should consider the 2 ways that we can access an object:
>> 1. Either we get all keys (jdm:keys) or
>> 2. we get a value for a key (jdm:value).
>>
>> To get all the keys efficiently and to be able to skip huge nested
>> values, a simple approach could be to store a dictionary of the keys (in
>> their original order) with pointers (offsets) to the values. That way we
>> could get the keys quickly by scanning the dictionary and each value by
>> scanning the dictionary + 1 hop to find the value. This certainly has the
>> problem that the access is linear in the number of keys. But it is
>> reasonably simple and it would allow us to get a correct + testable
>> implementation relatively soon and to have a baseline for a more
>> optimized representation.
>>
>> Thoughts?
>>
>> Cheers,
>> Till
>>
>> [1] http://jsoniq.org/docs/JSONiqExtensionToXQuery/html-single/index.html#idm139680641300880
>>
>> On 8 May 2016, at 22:19, Riyafa Abdul Hameed wrote:
>>
>>> Hi Preston,
>>>
>>> I have edited the wiki[1] and the doc[2] based on the comments. Thank you
>>> for the suggestions provided.
>>> I have removed the part that assigns an id to the keys and instead
>>> suggested that the keys be stored in the order they appear in the JSON
>>> object. I am not sure I understand the concept of the hashcode: how are
>>> the hashcodes used for easy lookup generated?
>>>
>>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
>>> [2] https://drive.google.com/open?id=1-wT0pE8rTTNIzuY4iTgvhqkdHmKGek4CgNthXN6mlm0
>>>
>>> Thank you again.
>>>
>>> Yours sincerely,
>>> Riyafa
>>>
>>> On 9 May 2016 at 01:23, christina pavlopoulou <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I updated the wiki page according to Preston's comments along with the
>>>> json array example in [1].
>>>>
>>>> [1] https://docs.google.com/document/d/1GOAcvhw_F9cJrNmRq2TwZxI0wYRmvLEV3mywJS4H9Lg/edit
>>>>
>>>> Thank you,
>>>> Christina
>>>>
>>>> On 5/8/2016 9:43 AM, Preston Carman wrote:
>>>>
>>>>> Nice job guys. I can see you are picking up how to create a data
>>>>> model. I have limited my comments to the wiki [1] for now. At a high
>>>>> level, I was impressed with your detail and thoughtful layouts. It
>>>>> reminds me of the age-old trade-off: speed vs space. At this time,
>>>>> let's err on the side of saving space. The data model should be as
>>>>> compact as possible.
>>>>>
>>>>> I also found the AsterixDB serialization [2] we can use as a
>>>>> reference. Even though the AsterixDB data model includes object
>>>>> length, I would leave that out since all the XQuery data models do not
>>>>> include this property.
>>>>>
>>>>> Riyafa, take a look at the method AsterixDB uses for quick lookups (a
>>>>> hash value for the name). Consider the pros and cons between your
>>>>> method and AsterixDB's method: a list of hash values for names and a
>>>>> sorted list of names.
>>>>>
>>>>> Also, take a look at my wiki comments. It's a great start!
>>>>>
>>>>> Mahalo,
>>>>> Preston
>>>>>
>>>>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
>>>>> [2] https://cwiki.apache.org/confluence/display/ASTERIXDB/AsterixDB+Object+Serialization+Reference
>>>>>
>>>>> On Sat, May 7, 2016 at 6:47 PM, christina pavlopoulou <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I also designed an example for the json array [1] given the
>>>>>> description I wrote in the wiki page.
>>>>>>
>>>>>> [1] https://docs.google.com/document/d/1GOAcvhw_F9cJrNmRq2TwZxI0wYRmvLEV3mywJS4H9Lg/edit
>>>>>>
>>>>>> Thank you,
>>>>>> Christina
>>>>>>
>>>>>> On 5/7/2016 11:22 AM, Riyafa Abdul Hameed wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am attempting to create a doc on the JSONiq data model for
>>>>>>> objects[1] (it might be full of errors because I am doing the
>>>>>>> calculations manually).
>>>>>>>
>>>>>>> This is what I have come up with for the data model for objects:
>>>>>>>
>>>>>>> The first byte would have the value tag, followed by the id (4
>>>>>>> bytes) of the object. Then 4 bytes to represent the size of the
>>>>>>> object. Then another four bytes to represent the number of key-value
>>>>>>> pairs. The next few bytes represent the offsets of the keys which
>>>>>>> follow (each offset is represented by 4 bytes). Ids would be
>>>>>>> assigned to the keys. The next few bytes would be a sorted list of
>>>>>>> ids for keys in alphabetical order. The following bytes would
>>>>>>> represent the keys in the object. Each key is a StringPointable
>>>>>>> followed by the id of the key. Each object would have a sequence
>>>>>>> pointable: the following bytes would be the number of items (items
>>>>>>> are the values for keys) in the sequence. The next bytes would be
>>>>>>> the offset of each item in the sequence.
>>>>>>> The last bytes would be the values for each key, each followed by
>>>>>>> the respective id of the key.
>>>>>>>
>>>>>>> Hope it makes sense.
>>>>>>>
>>>>>>> My problem is,
>>>>>>>
>>>>>>> I have not provided for the white spaces in the object. What can I
>>>>>>> use to represent the white spaces? I cannot use a text node because
>>>>>>> an object is not a node.
>>>>>>>
>>>>>>> [1] https://drive.google.com/open?id=1-wT0pE8rTTNIzuY4iTgvhqkdHmKGek4CgNthXN6mlm0
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Yours sincerely,
>>>>>>> Riyafa
>>>>>>>
>>>>>>> On 26 April 2016 at 10:29, Preston Carman <[email protected]> wrote:
>>>>>>>
>>>>>>>> We have two students working with us this summer through GSOC to
>>>>>>>> complete the JSONiq specification for arrays and objects. I think
>>>>>>>> the first step is to define the data model used by JSONiq. The data
>>>>>>>> model should be defined in our wiki [1] before coding starts this
>>>>>>>> summer. The wiki will allow the community to discuss the JSON data
>>>>>>>> model implementation in VXQuery.
>>>>>>>>
>>>>>>>> I updated the JSONiq wiki to help get the documentation started.
>>>>>>>> Please fill in the JSON data model based on the examples seen on
>>>>>>>> our website (links on the wiki page).
>>>>>>>>
>>>>>>>> Post here if you have any questions.
>>>>>>>>
>>>>>>>> [1] https://cwiki.apache.org/confluence/display/VXQUERY/JSONiq
>>>
>>> --
>>> Riyafa Abdul Hameed
>>> Undergraduate, University of Moratuwa
>>>
>>> Email: [email protected]
>>> Website: https://riyafa.wordpress.com/ <http://riyafa.wordpress.com/>
>>> <http://facebook.com/riyafa.ahf> <http://lk.linkedin.com/in/riyafa>
>>> <http://twitter.com/Riyafa1>
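
For anyone following the thread, the dictionary idea discussed above (keys kept in their original order, each with an offset to its value) can be sketched roughly as below. This is only an illustration under assumptions, not the VXQuery or AsterixDB layout: the header fields, the 2-byte key length, the 4-byte big-endian offsets, and the function names `encode_object`, `keys`, and `value` are all hypothetical, and values are treated as opaque byte strings rather than tagged items.

```python
import struct


def encode_object(pairs):
    """Encode (key, value_bytes) pairs, keeping the keys in their original order.

    Hypothetical layout (an assumption, not the VXQuery format):
      4 bytes  number of keys
      4 bytes  size of the dictionary in bytes
      dictionary entries: 2-byte key length, UTF-8 key, 4-byte value offset
      value area: the raw value bytes, concatenated
    """
    dictionary = bytearray()
    values = bytearray()
    for key, value_bytes in pairs:
        key_bytes = key.encode("utf-8")
        dictionary += struct.pack(">H", len(key_bytes)) + key_bytes
        # Offset of this value, relative to the start of the value area.
        dictionary += struct.pack(">I", len(values))
        values += value_bytes
    header = struct.pack(">II", len(pairs), len(dictionary))
    return bytes(header) + bytes(dictionary) + bytes(values)


def _entries(buf):
    """Yield (key, value_offset) by scanning only the dictionary."""
    count, _dict_size = struct.unpack_from(">II", buf, 0)
    pos = 8
    for _ in range(count):
        (key_len,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        key = buf[pos:pos + key_len].decode("utf-8")
        pos += key_len
        (offset,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        yield key, offset


def keys(buf):
    """All keys in original order; no value bytes are touched."""
    return [key for key, _ in _entries(buf)]


def value(buf, wanted):
    """Linear scan of the dictionary, then one hop into the value area."""
    _count, dict_size = struct.unpack_from(">II", buf, 0)
    values_start = 8 + dict_size
    entries = list(_entries(buf))
    for i, (key, offset) in enumerate(entries):
        if key == wanted:
            # A value ends where the next one starts, or at the end of the buffer.
            if i + 1 < len(entries):
                end = entries[i + 1][1]
            else:
                end = len(buf) - values_start
            return buf[values_start + offset:values_start + end]
    return None


obj = encode_object([("name", b"VXQuery"), ("nested", b"x" * 1000), ("year", b"2016")])
print(keys(obj))
print(value(obj, "year"))
```

Listing the keys never reads the large "nested" value, and looking up one key is linear in the number of keys, which matches the trade-off described in the thread. A sorted list or a hash per name would be a later optimization on top of this baseline.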
