[ 
https://issues.apache.org/jira/browse/HADOOP-17141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164325#comment-17164325
 ] 

Hudson commented on HADOOP-17141:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18469 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18469/])
HADOOP-17141. Add Capability To Get Text Length (#2157) (github: rev e60096c377d8a3cb5bed3992352779195be95bb4)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/Text.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/TestText.java


> Add Capability To Get Text Length
> ---------------------------------
>
>                 Key: HADOOP-17141
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17141
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Minor
>             Fix For: 3.4.0
>
>
> The Hadoop {{Text}} class wraps a byte array that holds a UTF-8 encoded 
> string.  However, there is no quick way to get the length of that string.  
> One can get the number of bytes in the array, but finding the length of the 
> String requires decoding it first.  In this simple example, which sorts 
> {{Text}} objects by String length, each String is decoded from its byte 
> array again on every comparison.  This was brought to my attention by 
> [HIVE-23870].
> {code:java}
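>   // Needs java.util.Arrays, java.util.List and org.apache.hadoop.io.Text.
>   public static void main(String[] args) {
>     List<Text> list = Arrays.asList(new Text("1"), new Text("22"), new Text("333"));
>     // Every comparison decodes both byte arrays into fresh Strings.
>     list.sort((Text t1, Text t2) -> t1.toString().length() - t2.toString().length());
>   }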
> {code}
> A quick length lookup would also help when checking the last letter of a 
> {{Text}} object repeatedly:
> {code:java}
>     Text t = new Text("4444");
>     // Text.charAt takes a byte offset and returns the code point as an int.
>     System.out.println(t.charAt(t.toString().length() - 1));
> {code}
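
The idea in the description above can be sketched with a small stand-in class that remembers the decoded length after the first request. This is only an illustration under stated assumptions: `Utf8Text`, `getByteLength`, `getTextLength`, and `sortByTextLength` are hypothetical names chosen for this sketch, and the class is not Hadoop's `Text` or the code committed for this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CachedLengthSketch {

  /** Sorts in place by decoded length; each element is decoded at most once. */
  static List<Utf8Text> sortByTextLength(List<Utf8Text> list) {
    list.sort(Comparator.comparingInt(Utf8Text::getTextLength));
    return list;
  }

  public static void main(String[] args) {
    List<Utf8Text> list =
        Arrays.asList(new Utf8Text("333"), new Utf8Text("1"), new Utf8Text("22"));
    System.out.println(sortByTextLength(list));

    // U+00E9 takes two bytes in UTF-8, so the byte count and the
    // String length differ for this value.
    Utf8Text accented = new Utf8Text("h\u00e9llo");
    System.out.println(accented.getByteLength() + " bytes, "
        + accented.getTextLength() + " chars");
  }
}

/** Hypothetical stand-in for a UTF-8 backed text holder (not Hadoop's Text). */
final class Utf8Text {
  private byte[] bytes;
  private int textLength = -1; // -1 means "not computed yet"

  Utf8Text(String s) {
    set(s);
  }

  void set(String s) {
    bytes = s.getBytes(StandardCharsets.UTF_8);
    textLength = -1; // contents changed, so drop the cached length
  }

  /** Number of bytes in the UTF-8 encoding. */
  int getByteLength() {
    return bytes.length;
  }

  /** Length of the decoded String; decodes once, then reuses the cached value. */
  int getTextLength() {
    if (textLength < 0) {
      textLength = new String(bytes, StandardCharsets.UTF_8).length();
    }
    return textLength;
  }

  @Override
  public String toString() {
    return new String(bytes, StandardCharsets.UTF_8);
  }
}
```

The cache is reset in `set`, so a changed value is decoded again on the next length request. A sort over n elements then decodes each element once instead of twice per comparison.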


