[ http://issues.apache.org/jira/browse/HADOOP-439?page=all ]
Sameer Paranjpye updated HADOOP-439:
------------------------------------
Fix Version/s: 0.6.0
Affects Version/s: 0.5.0
> Streaming does not work for text data if the records don't fit in a short
> UTF8 [2^16/3 characters]
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-439
> URL: http://issues.apache.org/jira/browse/HADOOP-439
> Project: Hadoop
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Dick King
> Assigned To: Michel Tourn
> Priority: Critical
> Fix For: 0.6.0
>
>
> The streaming code internally reads each input record into a UTF8. As a
> result, truncated data is shipped to the mapper when a record exceeds about
> 21000 characters. The user gets no notice, except possibly in the logs on the
> individual tasks' machines, which people would not normally read for an
> apparently successful job.
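
The "2^16/3 characters" figure in the summary can be checked with a little arithmetic. The sketch below is not Hadoop code. It assumes the short UTF8 stores the encoded byte length in an unsigned 16-bit field and that a character takes at most 3 bytes when encoded, which is what the summary implies. The names `MAX_ENCODED_BYTES`, `safe_char_limit`, and `fits_in_short_utf8` are made up for illustration.

```python
# Illustrative sketch only; these names are hypothetical and not part of Hadoop.

# Assumption: the encoded length is stored in an unsigned 16-bit field,
# so the largest representable byte count is 2^16 - 1.
MAX_ENCODED_BYTES = 2**16 - 1

# Assumption: a single character needs at most 3 bytes once encoded.
WORST_CASE_BYTES_PER_CHAR = 3


def safe_char_limit() -> int:
    """Largest record length, in characters, that always fits in the worst case."""
    return MAX_ENCODED_BYTES // WORST_CASE_BYTES_PER_CHAR


def fits_in_short_utf8(record: str) -> bool:
    """Return True if the record's UTF-8 encoding fits the 16-bit length field."""
    return len(record.encode("utf-8")) <= MAX_ENCODED_BYTES


if __name__ == "__main__":
    print("worst-case safe length:", safe_char_limit())
    # Pure ASCII text uses 1 byte per character, so it fits up to 65535 characters.
    print("65535 ASCII chars fit:", fits_in_short_utf8("a" * 65535))
    # The euro sign uses 3 bytes, so the limit drops to about 21845 characters.
    print("21846 euro signs fit:", fits_in_short_utf8("\u20ac" * 21846))
```

A check like `fits_in_short_utf8` run over the input before submitting a job would flag the records at risk, since the job itself may appear to succeed.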