[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12438547 ]
Doug Cutting commented on HADOOP-550:
-
Two minor nits:
1. Instead of ignoring the CharacterCodingException that should never be
thrown, it would be better to
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12438541 ]
Hairong Kuang commented on HADOOP-550:
--
Thanks for your comments, Addison. Currently Text is the default class for
map/reduce text input files, in which record
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12437958 ]
Addison Phillips commented on HADOOP-550:
-
If you want to have *text*, then you need to know the encoding and have some
assurance that it is correct. A tex
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12437954 ]
Hairong Kuang commented on HADOOP-550:
--
Currently class Text is the default class for text inputs. Because only valid
UTF8 bytes are allowed in Text, if user
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12436728 ]
Addison Phillips commented on HADOOP-550:
-
I had a hand in advising about this code (and wrote some of it). I agree with
Doug Cutting that the current impl
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12436413 ]
Doug Cutting commented on HADOOP-550:
-
I think the default, like new String(), should be not to validate, and to
silently replace bad data. If we want to use
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12436412 ]
Hairong Kuang commented on HADOOP-550:
--
In Java, supplementary characters, i.e., code points that are greater than
U+FFFF, are represented by a pair of char va
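To make the point about supplementary characters concrete, here is a minimal sketch using only standard JDK calls. The class and method names (`SupplementaryCharDemo`, `fromCodePoint`) are made up for illustration and are not part of Hadoop; U+1D11E is just a convenient code point above U+FFFF.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Hadoop.
final class SupplementaryCharDemo {
    // Build a String from a single Unicode code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String clef = fromCodePoint(0x1D11E); // MUSICAL SYMBOL G CLEF

        // One code point, but two Java char values (a surrogate pair).
        System.out.println(clef.length());
        System.out.println(clef.codePointCount(0, clef.length()));
        System.out.println(Character.isHighSurrogate(clef.charAt(0)));
        System.out.println(Character.isLowSurrogate(clef.charAt(1)));

        // In UTF-8 the same code point is a single four-byte sequence.
        System.out.println(clef.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

So a count of `char` values and a count of code points can disagree for the same text, which matters for any code that converts between Java strings and UTF-8 bytes.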
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12436403 ]
Sameer Paranjpye commented on HADOOP-550:
-
I think we should maintain the invariant that a Text object contains valid
UTF-8.
Why not add a constructor to
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12436368 ]
Bryan Pendleton commented on HADOOP-550:
Ah, the bowels of String handling I hadn't uncovered yet
Yes, I would support that, at least as the default. P
[
http://issues.apache.org/jira/browse/HADOOP-550?page=comments#action_12436362 ]
Doug Cutting commented on HADOOP-550:
-
So you're claiming that 'new String(myBytes, "UTF-8")' doesn't throw an
exception, but that 'new Text(myBytes)' does?
Th
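For reference, the difference being discussed can be reproduced with plain JDK classes. This is only a sketch: `Utf8DecodeDemo`, `decodeLenient` and `decodeStrict` are hypothetical names, the strict path is one possible way to validate bytes and is not claimed to be what Text does, and `StandardCharsets` (Java 7 and later) stands in for the `"UTF-8"` charset name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not Hadoop's Text implementation.
final class Utf8DecodeDemo {
    // Lenient: the String constructor replaces malformed input with U+FFFD
    // and never throws for bad bytes.
    static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict: a decoder configured with REPORT throws
    // CharacterCodingException on malformed input.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // 0xFF can never appear in well-formed UTF-8.
        byte[] bad = {'a', (byte) 0xFF, 'b'};

        System.out.println(decodeLenient(bad).indexOf('\uFFFD') >= 0);
        try {
            decodeStrict(bad);
            System.out.println("decoded without error");
        } catch (CharacterCodingException e) {
            System.out.println("rejected: " + e);
        }
    }
}
```

The lenient path matches the "silently replace bad data" behaviour of `new String()`, while the strict path is the kind of validation that surfaces a `CharacterCodingException`.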