[ https://issues.apache.org/jira/browse/HADOOP-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HADOOP-1821:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Tests passed (finally - thanks Hudson). Committed.

> [hbase] Replace all String.getBytes() with String.getBytes("UTF-8")
> -------------------------------------------------------------------
>
>                 Key: HADOOP-1821
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1821
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>    Affects Versions: 0.15.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.15.0
>
>         Attachments: patch.txt
>
>
> We cannot rely on the default encoding being UTF-8 so a naked
> String.getBytes() will return the bytes in whatever the default encoding is
> for the platform on which code is running. If it is subsequently read on
> another machine with a different default encoding, converting the bytes back
> to a string will result in garbage.
> Consequently, we should always specify an encoding for getBytes() and new
> String. UTF-8 is the preferred encoding.
> The places where we use unqualified getBytes are:
> HConstants.DELETE_BYTES, HConstants.COMPLETE_CACHEFLUSH
> hbase.io.MapWritable.main (but this will not be an issue once HADOOP-1760 is
> completed)
> TestHMemcache.addRows
> PerformanceEvaluation.generateValue
> TestGet
> TestHRegion
> TestHBaseCluster
> TestTableMapReduce
> TestScanner2
> TestRegExpRowFilter
> TestRowFilterSet
> org.onelab.test.StringKey
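
For illustration only, here is a minimal sketch of the problem the issue describes. It is not taken from patch.txt. The class EncodingSketch and its helper methods are hypothetical names, and it uses java.nio.charset.StandardCharsets (available since Java 7) rather than the String.getBytes("UTF-8") form named in the issue title, which would have been the option on the JVMs of that time. Decoding with ISO-8859-1 stands in for "another machine with a different default encoding".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of HBase.
class EncodingSketch {

    // Encode with an explicit charset instead of the platform default.
    static byte[] toUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset used for encoding.
    static String fromUtf8Bytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Simulate a reader whose default encoding is ISO-8859-1.
    static String misreadAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // contains one non-ASCII character

        byte[] encoded = toUtf8Bytes(original);

        // Same charset on both sides: the string survives the round trip.
        System.out.println(original.equals(fromUtf8Bytes(encoded)));

        // Different charset on the reading side: the string is corrupted.
        System.out.println(original.equals(misreadAsLatin1(encoded)));
    }
}
```

With pure ASCII input the two decodings would agree, which is why this kind of bug tends to go unnoticed until non-ASCII row keys or values appear.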