Hi Rakesh,

Your analysis is correct overall.

The delimiter clauses in the DDL statement (create table ...) were designed when we only allowed a single level of list or map. With multiple levels of nesting, those delimiter specifications won't work as you expect.

For now, please do the following when creating nested types:

1. Don't specify any delimiters when creating the table.
2. When loading the data, format it as follows:
   A. Each level of list takes one level of delimiter, and each level of map takes two levels of delimiters.
   B. For a list of lists, the first list is delimited by ^A and the second by ^B.
   C. For a map of maps, the first map takes ^A and ^B, and the second takes ^C and ^D.
   D. For a list of maps, the list takes ^A and the map takes ^B and ^C.
   E. For a map of lists, the map takes ^A and ^B, and the list takes ^C.

I hope this helps solve your problem. We will allow customizable delimiters in the future (please open a JIRA if you depend on that).

Zheng

On Tue, Jul 7, 2009 at 12:00 PM, Rakesh Setty<[email protected]> wrote:
> I think this solution will not deal with maps within maps and lists within
> lists.
>
> Thanks,
> Rakesh
>
> ________________________________
>
> From: Rakesh Setty
> Sent: Tuesday, July 07, 2009 11:37 AM
> To: '[email protected]'
> Subject: Issue with nested types
>
> Hi,
>
> The issue of nested types addressed recently through JIRA HIVE-603 is very
> useful. But I have an issue with the schema specification.
>
> I have a table page_views with two columns – page_info is a map with key
> delimiter as Ctrl-D and the key-value pair (record) delimiter as Ctrl-C and
> page_links is a list of maps with each list item separated using Ctrl-B, map
> delimiters being Ctrl-D and Ctrl-C as mentioned above.
>
> In the DDL statement, if I do not specify “collection items terminated by”
> and “array items terminated by” clauses, page_links is deserialized
> properly, but page_info is not deserialized properly.
> If I specify the clauses - collection items terminated by ‘\003’ and map
> keys terminated by ‘\004’, page_info is deserialized properly but page_links
> is not deserialized properly. The reason I think is that in page_links it
> considers ‘\003’ or Ctrl-C as delimiter for both array and map record. But I
> have Ctrl-B as array delimiter and Ctrl-D as map record delimiter.
>
> I think we should replace the clause “collection items terminated by” with
> separate clauses like “list items terminated by” and “map items terminated
> by”.
>
> Thanks,
> Rakesh

--
Yours,
Zheng
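
For illustration, rules A through E in the reply above can be sketched in Python. This is a hypothetical helper, not part of Hive: the names `ctrl` and `serialize` are made up for this example. It assumes that, of a map's two delimiters, the lower one separates key-value pairs and the higher one separates a key from its value (which matches the Ctrl-C / Ctrl-D layout described in the original message), and that the starting level is ^A as numbered in the reply.

```python
def ctrl(level):
    """Return the control character for a 1-based level (1 -> ^A, 2 -> ^B, ...)."""
    return chr(level)


def serialize(value, level=1):
    """Serialize nested lists and dicts into a delimited string.

    Each list consumes one delimiter level; each map consumes two.
    Assumption: within a map, the lower level separates key-value pairs
    and the higher level separates a key from its value.
    """
    if isinstance(value, dict):
        entry_sep, key_sep = ctrl(level), ctrl(level + 1)
        return entry_sep.join(
            serialize(k, level + 2) + key_sep + serialize(v, level + 2)
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        return ctrl(level).join(serialize(item, level + 1) for item in value)
    # Scalars are written as plain text.
    return str(value)


# Rule B, list of list: outer list uses ^A, inner list uses ^B.
list_of_list = serialize([[1, 2], [3]])

# Rule D, list of map: list uses ^A, map uses ^B (pairs) and ^C (key/value).
list_of_map = serialize([{"k": "v"}, {"k2": "v2"}])
```

Calling `serialize(value, level=2)` shifts every delimiter by one level, which reproduces the page_links layout from the original message (Ctrl-B between list items, Ctrl-C between key-value pairs, Ctrl-D between key and value).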
