[ https://issues.apache.org/jira/browse/PIG-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171322#comment-14171322 ]
Daniel Dai commented on PIG-3749:
---------------------------------

I tried something similar but was not able to reproduce it. It seems your patch deals with the 0x00 in the bytearray. Is it in the middle of the bytearray or at the end? I checked DataGenerator, and it does not seem that we generate 0x00 in the middle. If it is at the end, shouldn't it also be bounded by b.length? Can you upload your page_views_sample with the offending record?

> PigPerformance - data in the map gets lost during parsing
> ---------------------------------------------------------
>
>                 Key: PIG-3749
>                 URL: https://issues.apache.org/jira/browse/PIG-3749
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Keren Ouaknine
>            Assignee: Keren Ouaknine
>             Fix For: 0.14.0
>
>         Attachments: PIG-3749.patch
>
>
> Create a Pigmix sample dataset which looks as follows:
> keren 1 2 qt 3 4 5.0 aaaabbbb
> mccccddddeeeedmffffgggghhhh
> Launch the following query:
> A = load 'page_views_sample.txt' using
> org.apache.pig.test.pigmix.udf.PigPerformanceLoader()
> as (user, action, timespent, query_term, ip_addr, timestamp,
> estimated_revenue, page_info, page_links);
> store A into 'L1out_A';
> B = foreach A generate user, (int)action as action, (map[])page_info as
> page_info, flatten((bag{tuple(map[])})page_links) as page_links;
> store B into 'L1out_B';
> The result looks like this:
> keren 1 [b#bbb,a#aaa] [d#,e#eee,c#ccc]
> keren 1 [b#bbb,a#aaa] [f#fff,g#ggg,h#hhh
> It is missing the 'ddd' value and a closing bracket.
> Thanks,
> Keren

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)