[ https://issues.apache.org/jira/browse/HADOOP-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261281#comment-13261281 ]
Jim Donofrio commented on HADOOP-7940:
--------------------------------------

This will likely worsen the performance problem described in HADOOP-6109. I was looking at the javadoc in Hadoop 1.0.2. The more recent javadoc adds:

"Please use {@link #copyBytes()} if you need the returned array to be precisely the length of the data."

which makes it even clearer that you must pay attention to the length if you use getBytes().

> method clear() in org.apache.hadoop.io.Text does not work
> ---------------------------------------------------------
>
>                 Key: HADOOP-7940
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7940
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.205.0
>         Environment: Ubuntu, hadoop cloudera CDH3U2, Oracle SUN JDK 6U30
>            Reporter: Aaron,
>            Assignee: Csaba Miklos
>              Labels: patch
>             Fix For: 2.0.0
>
>         Attachments: HADOOP-7940.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> LineReader reader = new LineReader(in, 4096);
> ...
> Text text = new Text();
> while((reader.readLine(text)) > 0) {
> ...
> text.clear();
> }
> }
> Even though the clear() method is called each time, some bytes are still not
> reset to zero.
> So, when reader.readLine(text) is called in a loop, some bytes are dirty,
> left over from the previous call.
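
To illustrate the point about getBytes() and the valid length, here is a minimal sketch. It does not use Hadoop itself: ReusableBuffer is a hypothetical stand-in that assumes the behavior discussed in this thread (a backing array that grows but is never zeroed, and a clear() that only resets the length). The method names mirror those of Text, but the class is a simplified model, not the real implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, simplified model of a reusable byte buffer such as Text. */
public class ReusableBuffer {
    private byte[] bytes = new byte[0]; // backing array, may be longer than the data
    private int length = 0;             // number of valid bytes

    /** Replaces the contents; the backing array grows when needed but never shrinks. */
    public void set(String s) {
        byte[] src = s.getBytes(StandardCharsets.UTF_8);
        if (src.length > bytes.length) {
            bytes = new byte[src.length];
        }
        System.arraycopy(src, 0, bytes, 0, src.length);
        length = src.length;
    }

    /** Resets the valid length only; old bytes stay in the backing array. */
    public void clear() {
        length = 0;
    }

    /** Returns the raw backing array; only the first getLength() bytes are valid. */
    public byte[] getBytes() {
        return bytes;
    }

    public int getLength() {
        return length;
    }

    /** Returns a copy that is exactly getLength() bytes long. */
    public byte[] copyBytes() {
        return Arrays.copyOf(bytes, length);
    }

    public static void main(String[] args) {
        ReusableBuffer buf = new ReusableBuffer();
        buf.set("hello world");
        buf.clear();
        buf.set("hi");

        // Ignores the valid length, so stale bytes from the previous value leak in.
        String wrong = new String(buf.getBytes(), StandardCharsets.UTF_8);
        // Bounds the read by getLength().
        String bounded = new String(buf.getBytes(), 0, buf.getLength(), StandardCharsets.UTF_8);
        // Copies exactly the valid bytes.
        String copied = new String(buf.copyBytes(), StandardCharsets.UTF_8);

        System.out.println(wrong);   // hillo world
        System.out.println(bounded); // hi
        System.out.println(copied);  // hi
    }
}
```

In this model the "dirty" bytes are harmless as long as every read of getBytes() is bounded by getLength(), and leaving them in place avoids the cost of zeroing the array on every clear().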