[ https://issues.apache.org/jira/browse/PIG-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703155#action_12703155 ]
David Ciemiewicz commented on PIG-771:
--------------------------------------

Take a file of UTF-8 Chinese characters (ch.txt). Load it and dump it.

{code}
A = load 'ch.txt' using PigStorage() as (str: chararray);
dump A;
store A into 'ch.out' using PigStorage();
{code}

> PigDump does not properly output Chinese UTF8 characters - they are displayed
> as question marks ??
> --------------------------------------------------------------------------------------------------
>
>                 Key: PIG-771
>                 URL: https://issues.apache.org/jira/browse/PIG-771
>             Project: Pig
>          Issue Type: Bug
>            Reporter: David Ciemiewicz
>
> PigDump does not properly output Chinese UTF8 characters.
> The reason for this is that the function Tuple.toString() is called.
> DefaultTuple implements Tuple.toString() and it calls Object.toString() on
> the opaque object d.
> Instead, I think that the code should be changed to call the new
> DataType.toString() function.
> {code}
>     @Override
>     public String toString() {
>         StringBuilder sb = new StringBuilder();
>         sb.append('(');
>         for (Iterator<Object> it = mFields.iterator(); it.hasNext();) {
>             Object d = it.next();
>             if (d != null) {
>                 if (d instanceof Map) {
>                     sb.append(DataType.mapToString((Map<Object, Object>)d));
>                 } else {
>                     sb.append(DataType.toString(d)); // <<< Change this one line
>                     if (d instanceof Long) {
>                         sb.append("L");
>                     } else if (d instanceof Float) {
>                         sb.append("F");
>                     }
>                 }
>             } else {
>                 sb.append("");
>             }
>             if (it.hasNext())
>                 sb.append(",");
>         }
>         sb.append(')');
>         return sb.toString();
>     }
> {code}
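
For illustration only (this is not Pig code): a minimal Java sketch of one way the question-mark symptom can arise, namely when a string is converted to bytes using a charset that cannot represent its characters. The class name EncodingDemo and the method roundTrip are hypothetical, and whether this is the actual mechanism behind PIG-771 is an assumption; the issue itself only points at the toString() call.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Encode text to bytes with the given charset, then decode the bytes back.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // ('?' for US-ASCII) for every character the charset cannot represent.
    public static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Two CJK characters, written as Unicode escapes so the encoding of
        // this source file does not affect the demonstration.
        String chinese = "\u4e2d\u6587";

        String viaAscii = roundTrip(chinese, StandardCharsets.US_ASCII);
        String viaUtf8 = roundTrip(chinese, StandardCharsets.UTF_8);

        // Report the outcome as booleans so the result does not depend on
        // whether the console itself can display the characters.
        System.out.println("US-ASCII round trip lossy: " + !viaAscii.equals(chinese));
        System.out.println("UTF-8 round trip intact:   " + viaUtf8.equals(chinese));
    }
}
```

If something similar happens on the dump path, the data itself is probably intact and only the text-to-bytes conversion at output time is lossy, which would be consistent with the report above describing the problem for PigDump specifically.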