[ 
https://issues.apache.org/jira/browse/DRILL-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894571#comment-16894571
 ] 

ASF GitHub Bot commented on DRILL-7096:
---------------------------------------

paul-rogers commented on issue #1829: DRILL-7096: Develop vector for canonical 
Map<K,V>
URL: https://github.com/apache/drill/pull/1829#issuecomment-515716535
 
 
   There seems to be little documentation of the design or implementation: I 
find myself having to reverse engineer a design from the code. Would it be 
helpful for other reviewers if we included a bit of explanation?
   
   In particular, what is the structure of the new vector? The true map vector 
seems to have a user-defined key and value type. Does this mean keys can be 
integers or maps? Do the proposed map functions handle non-Varchar keys? Do 
they handle repeated or map keys? If not, should the implementation restrict 
key types?
   
   How do we handle varying value types? (Say I have a map such as {name: 
"Fred", balance: 123}.) Would the user specify a Union? Have we fixed the many 
existing problems with the Union type? Or is the true map meant to be like a 
Java map, Map<K,V>, and not a Python or JSON map with string keys and values of 
any type?
   
   A bit of documentation (a README file, a package-info file or just a Javadoc 
explanation in the vector class) would help answer these questions.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Develop vector for canonical Map<K,V>
> -------------------------------------
>
>                 Key: DRILL-7096
>                 URL: https://issues.apache.org/jira/browse/DRILL-7096
>             Project: Apache Drill
>          Issue Type: Sub-task
>            Reporter: Igor Guzenko
>            Assignee: Bohdan Kazydub
>            Priority: Major
>             Fix For: 1.17.0
>
>
> The canonical Map<K,V> datatype can be represented using a combination of 
> three value vectors:
> keysVector - vector storing the keys of each map
> valuesVector - vector storing the values of each map
> offsetsVector - vector storing the start index of each successive map
> Creating such a Map vector is not very hard, but this representation has a 
> major issue: it is hard to search a map's values by key in such a vector. We 
> need to investigate more advanced techniques to make that search efficient, 
> or find other, more suitable options for representing the map datatype in the 
> world of vectors.
> When asked about maps, the Apache Arrow developers responded that for Java 
> they don't have a real Map vector; for now they have only a logical Map type 
> definition, in which a Map is defined as List< Struct<key:key_type, 
> value:value_type> >. So an implementation of such a value vector would be 
> useful for Arrow too.
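
The three-vector layout in the description above can be sketched roughly as 
follows. This is only an illustration, not Drill's or Arrow's actual vector 
code: the class name MapVectorSketch, the use of java.util lists in place of 
real value vectors, the fixed String/Integer key and value types, and the 
leading 0 in the offsets are all assumptions made for the example. The lookup 
is the naive linear scan over one row's key range, which is the search cost 
the description is concerned about.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a keys/values/offsets map layout (not Drill code). */
public class MapVectorSketch {
    // Keys of all rows, stored back to back (stand-in for keysVector).
    private final List<String> keys = new ArrayList<>();
    // Values of all rows, parallel to keys (stand-in for valuesVector).
    private final List<Integer> values = new ArrayList<>();
    // offsets.get(i) is where row i starts; offsets.get(i + 1) is where it ends.
    private final List<Integer> offsets = new ArrayList<>();

    public MapVectorSketch() {
        offsets.add(0); // assumption: a leading 0 so every row has a start and an end
    }

    /** Appends one map as a new row. */
    public void addRow(Map<String, Integer> row) {
        for (Map.Entry<String, Integer> e : row.entrySet()) {
            keys.add(e.getKey());
            values.add(e.getValue());
        }
        offsets.add(keys.size());
    }

    public int rowCount() {
        return offsets.size() - 1;
    }

    /** Linear search within one row's key range; returns null if the key is absent. */
    public Integer get(int row, String key) {
        int start = offsets.get(row);
        int end = offsets.get(row + 1);
        for (int i = start; i < end; i++) {
            if (keys.get(i).equals(key)) {
                return values.get(i);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MapVectorSketch vector = new MapVectorSketch();

        Map<String, Integer> first = new LinkedHashMap<>();
        first.put("a", 1);
        first.put("b", 2);
        vector.addRow(first);

        Map<String, Integer> second = new LinkedHashMap<>();
        second.put("b", 20);
        vector.addRow(second);

        System.out.println(vector.get(0, "b"));
        System.out.println(vector.get(1, "b"));
        System.out.println(vector.get(1, "a"));
    }
}
```

Each lookup here costs time proportional to the number of entries in the row. 
The logical Arrow definition quoted above, List< Struct<key, value> >, has the 
same shape: one offsets list over a flat sequence of key/value pairs.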



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
