[ https://issues.apache.org/jira/browse/HADOOP-8655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440166#comment-13440166 ]
Hadoop QA commented on HADOOP-8655:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12542076/HADOOP-8655%20%282%29.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 1 new or modified test files.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common:

                  org.apache.hadoop.ha.TestZKFailoverController

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1348//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1348//console

This message is automatically generated.
> In TextInputFormat, while specifying textinputformat.record.delimiter the
> character/character sequences in data file similar to starting
> character/starting character sequence in delimiter were found missing in
> certain cases in the Map Output
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-8655
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8655
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.20.2
>         Environment: Linux- Ubuntu 10.04
>            Reporter: Arun A K
>              Labels: hadoop, mapreduce, textinputformat,
>                      textinputformat.record.delimiter
>         Attachments: HADOOP-8655 (2).patch, HADOOP-8655.patch,
>                      HADOOP-8655.patch, HADOOP-8655.patch, MAPREDUCE-4519.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Set textinputformat.record.delimiter as "</entity>"
> Suppose the input is a text file with the following content
> <entity><id>1</id><name>User1</name></entity><entity><id>2</id><name>User2</name></entity><entity><id>3</id><name>User3</name></entity><entity><id>4</id><name>User4</name></entity><entity><id>5</id><name>User5</name></entity>
> Mapper was expected to get value as
> Value 1 - <entity><id>1</id><name>User1</name>
> Value 2 - <entity><id>2</id><name>User2</name>
> Value 3 - <entity><id>3</id><name>User3</name>
> Value 4 - <entity><id>4</id><name>User4</name>
> Value 5 - <entity><id>5</id><name>User5</name>
> According to this bug Mapper gets value
> Value 1 - entity><id>1</id><name>User1</name>
> Value 2 - <entity>id>2</id><name>User2</name>
> Value 3 - <entity><id>3id><name>User3</name>
> Value 4 - <entity><id>4</id><name>User4name>
> Value 5 - <entity><id>5</id><name>User5</name>
> The pattern shown above need not occur for value 1,2,3 necessarily. The bug
> occurs at some random positions in the map input.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
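
For reference, the splitting that the issue description expects can be sketched in plain Java. This is only an illustration, not Hadoop's record reader and not the attached patch: the class DelimiterSplitSketch, the method splitRecords, and the bufferSize parameter are hypothetical names introduced here. The sketch keeps every character in the current record until the complete delimiter has been seen, so characters such as "<" or "</" that merely begin like the delimiter stay in the value, whatever the read buffer size.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Hadoop code.
final class DelimiterSplitSketch {

    // Splits the character stream into records separated by a multi-character delimiter.
    // The stream is read in chunks of bufferSize characters to mimic buffered reading.
    static List<String> splitRecords(Reader in, String delimiter, int bufferSize) {
        if (delimiter.isEmpty() || bufferSize < 1) {
            throw new IllegalArgumentException("delimiter must be non-empty and bufferSize positive");
        }
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char[] buffer = new char[bufferSize];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                for (int i = 0; i < n; i++) {
                    // Always keep the character; it is removed again only if it
                    // turns out to complete a full delimiter.
                    current.append(buffer[i]);
                    if (endsWith(current, delimiter)) {
                        current.setLength(current.length() - delimiter.length());
                        records.add(current.toString());
                        current.setLength(0);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Anything left after the last delimiter forms a final record.
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    // Returns true if the builder's contents end with the given suffix.
    private static boolean endsWith(StringBuilder sb, String suffix) {
        int offset = sb.length() - suffix.length();
        if (offset < 0) {
            return false;
        }
        for (int i = 0; i < suffix.length(); i++) {
            if (sb.charAt(offset + i) != suffix.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String input = "<entity><id>1</id><name>User1</name></entity>"
                + "<entity><id>2</id><name>User2</name></entity>"
                + "<entity><id>3</id><name>User3</name></entity>";
        List<String> values = splitRecords(new StringReader(input), "</entity>", 4);
        for (int i = 0; i < values.size(); i++) {
            System.out.println("Value " + (i + 1) + " - " + values.get(i));
        }
    }
}
```

With the delimiter "</entity>", the sketch should yield values of the form listed under "Mapper was expected to get value as" in the description, for any positive bufferSize.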