Why do the default LOAD and STORE use UTF-8? Why not use bytes?

paradisehit Tue, 26 Aug 2008 01:52:31 -0700

Hello!
    I have run into a problem with character sets. I use Hadoop to store log
data, and my log data is not encoded in UTF-8; for example, it is GBK in China.
If I use PigStorage() to process my data, the data is treated as UTF-8. My
program can still run on the resulting UTF-8 data, but the results are not
correct.
    Can Pig's LOAD and STORE work like Hadoop and leave the original charset
unchanged, storing the data exactly as it was? Can anyone help me, or tell me
why UTF-8 is the default?
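
    To illustrate the kind of damage I mean, here is a minimal sketch in plain
Python (not Pig). The sample string is made up, and I am not claiming this is
exactly what PigStorage() does internally; it only shows that GBK bytes do not
survive being decoded as UTF-8, while a byte-for-byte copy does.

```python
# Hypothetical sample: one log field containing Chinese text, encoded as GBK.
original = "中文日志".encode("gbk")

# Treat the GBK bytes as if they were UTF-8. Invalid sequences are replaced
# with U+FFFD here (an assumption for illustration; a strict decoder would
# raise an error instead).
as_utf8_text = original.decode("utf-8", errors="replace")

# Writing that text back out as UTF-8 does not give back the original bytes.
round_tripped = as_utf8_text.encode("utf-8")
lossy = round_tripped != original

# Passing the bytes through untouched keeps the data exactly as it was.
passthrough = bytes(original)

print("data changed after UTF-8 round trip:", lossy)
print("data preserved by byte passthrough:", passthrough == original)
```

    This is why I would like LOAD and STORE to handle the data as bytes.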
