[
https://issues.apache.org/jira/browse/HADOOP-17141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran resolved HADOOP-17141.
-------------------------------------
Fix Version/s: 3.4.0
Resolution: Fixed
> Add Capability To Get Text Length
> ---------------------------------
>
> Key: HADOOP-17141
> URL: https://issues.apache.org/jira/browse/HADOOP-17141
> Project: Hadoop Common
> Issue Type: Improvement
> Components: common
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Minor
> Fix For: 3.4.0
>
>
> The Hadoop {{Text}} class wraps a byte array that contains a UTF-8
> encoded string. However, there is no quick way to get the length of that
> string. One can get the number of bytes in the array, but to find the
> length of the String, the bytes must be decoded first. In this simple
> example, which sorts {{Text}} objects by String length, each String is
> decoded from its byte array again on every comparison. This was brought
> to my attention by [HIVE-23870].
> {code:java}
> public static void main(String[] args) {
>   List<Text> list = Arrays.asList(new Text("1"), new Text("22"), new Text("333"));
>   list.sort((Text t1, Text t2) -> t1.toString().length() - t2.toString().length());
> }
> {code}
> It would also help when repeatedly checking the last character of a
> {{Text}} object:
> {code:java}
> Text t = new Text("4444");
> System.out.println(t.charAt(t.toString().length() - 1));
> {code}
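
One way to avoid the repeated decoding described above is to compute the String length once and cache it. The following is a minimal, self-contained sketch of that idea and is not taken from the Hadoop source: the class CachedLengthText, its getTextLength() method and the decodeCount counter are hypothetical names chosen for illustration, and the class stands in for {{Text}} so that the example runs without Hadoop on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for Text: UTF-8 bytes plus a lazily cached String length. */
final class CachedLengthText {
    private final byte[] bytes;   // UTF-8 encoded content
    private int textLength = -1;  // -1 means "not computed yet"
    int decodeCount = 0;          // demo instrumentation: how often the bytes were decoded

    CachedLengthText(String s) {
        this.bytes = s.getBytes(StandardCharsets.UTF_8);
    }

    /** Number of bytes in the UTF-8 encoding; no decoding required. */
    int getLength() {
        return bytes.length;
    }

    /** Same value as toString().length(), but the bytes are decoded at most once for it. */
    int getTextLength() {
        if (textLength < 0) {
            textLength = toString().length();
        }
        return textLength;
    }

    @Override
    public String toString() {
        decodeCount++;
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<CachedLengthText> list = Arrays.asList(
                new CachedLengthText("333"),
                new CachedLengthText("1"),
                new CachedLengthText("22"));

        // Sort by String length; each element is decoded once, not once per comparison.
        list.sort(Comparator.comparingInt(CachedLengthText::getTextLength));

        for (CachedLengthText t : list) {
            System.out.println(t.getTextLength() + " chars, " + t.getLength() + " bytes");
        }
    }
}
```

The sketch is immutable, so the cached value never goes stale. A mutable class would also have to reset the cached length whenever its bytes change, which is an assumption about the design rather than something stated in this issue.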