[ 
https://issues.apache.org/jira/browse/PIG-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703195#action_12703195
 ] 

David Ciemiewicz commented on PIG-771:
--------------------------------------

Very strange.  I can display UTF-8 Chinese characters in my Mac OS Terminal 
window.  Only dump has a problem.

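Here's the transcript of what I did.  If you look, you'll see that cat and 
store both produce the characters correctly, while dump prints (????), both 
when run from a script and from the grunt shell: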

{code}
-bash-3.00$ cat > ch.txt
中文测试

-bash-3.00$ file ch.txt
ch.txt: UTF-8 Unicode text

-bash-3.00$ cat ch.txt
中文测试

-bash-3.00$ cat ch.pig
A = load 'ch.txt' using PigStorage() as (str: chararray);
dump A;
store A into 'ch.out' using PigStorage();

-bash-3.00$ pig -exectype local ch.pig
USING: /grid/0/gs/pig/current
2009-04-27 16:15:16,314 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
2009-04-27 16:15:16,315 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
(????)
2009-04-27 16:15:16,339 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
2009-04-27 16:15:16,339 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!

-bash-3.00$ cat ch.out
中文测试

-bash-3.00$ pig -exectype local 
USING: /grid/0/gs/pig/current
grunt> A = load 'ch.txt' using PigStorage() as (str: chararray);
grunt> dump A;
2009-04-27 16:16:51,786 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
2009-04-27 16:16:51,786 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
(????)
grunt> 
{code}
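
One possible explanation, which nothing in this transcript confirms, is that 
dump writes its text through a stream whose character encoding cannot 
represent these characters, while store writes the bytes out as UTF-8.  The 
sketch below is plain Java, not Pig code; the class name DumpEncodingSketch 
and the helper printWith are hypothetical.  It only shows that a PrintStream 
configured for US-ASCII replaces each unmappable character with '?', which 
would give one question mark per Chinese character.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical illustration only; this is not code from Pig.
final class DumpEncodingSketch {

    // Print `text` through a PrintStream that encodes with `charsetName`,
    // then decode the captured bytes with the same charset so the effect
    // of the encoding step is visible as a String.
    public static String printWith(String text, String charsetName) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(buf, true, charsetName);
            out.print(text);
            out.flush();
            return buf.toString(charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // The same four characters as in ch.txt, written as Unicode escapes
        // so the source file's own encoding does not matter.
        String tuple = "(\u4e2d\u6587\u6d4b\u8bd5)";

        // With US-ASCII each unmappable character becomes '?'.
        System.out.println(printWith(tuple, "US-ASCII"));

        // With UTF-8 the characters survive the round trip; how they look
        // here still depends on the encoding System.out itself uses.
        System.out.println(printWith(tuple, "UTF-8"));
    }
}
```

If this is what is happening, the JVM's default encoding (the file.encoding 
system property) in the environment where pig runs would be worth checking.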

> PigDump does not properly output Chinese UTF8 characters - they are displayed 
> as question marks ??
> --------------------------------------------------------------------------------------------------
>
>                 Key: PIG-771
>                 URL: https://issues.apache.org/jira/browse/PIG-771
>             Project: Pig
>          Issue Type: Bug
>            Reporter: David Ciemiewicz
>
> PigDump does not properly output Chinese UTF8 characters.
> The reason for this is that the function Tuple.toString() is called.
> DefaultTuple implements Tuple.toString() and it calls Object.toString() on 
> the opaque object d.
> Instead, I think that the code should be changed to call the new 
> DataType.toString() function.
> {code}
>     @Override
>     public String toString() {
>         StringBuilder sb = new StringBuilder();
>         sb.append('(');
>         for (Iterator<Object> it = mFields.iterator(); it.hasNext();) {
>             Object d = it.next();
>             if(d != null) {
>                 if(d instanceof Map) {
>                     sb.append(DataType.mapToString((Map<Object, Object>)d));
>                 } else {
>                     sb.append(DataType.toString(d));  // <<< Change this one line
>                     if(d instanceof Long) {
>                         sb.append("L");
>                     } else if(d instanceof Float) {
>                         sb.append("F");
>                     }
>                 }
>             } else {
>                 sb.append("");
>             }
>             if (it.hasNext())
>                 sb.append(",");
>         }
>         sb.append(')');
>         return sb.toString();
>     }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
