[jira] [Work logged] (HADOOP-17905) Modify Text.ensureCapacity() to efficiently max out the backing array size

ASF GitHub Bot (Jira) Sun, 12 Sep 2021 05:21:06 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-17905?focusedWorklogId=649698&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-649698
 ]


ASF GitHub Bot logged work on HADOOP-17905:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Sep/21 12:20
            Start Date: 12/Sep/21 12:20
    Worklog Time Spent: 10m 
      Work Description: pbacsko commented on a change in pull request #3423:
URL: https://github.com/apache/hadoop/pull/3423#discussion_r706827825



##########
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/Text.java
##########
@@ -73,6 +73,10 @@ protected CharsetDecoder initialValue() {
     }
   };
 
+  // max size of the byte array, seems to be a safe choice for multiple VMs

Review comment:
       Maybe I should have written "different kinds of VMs" (OpenJDK, HotSpot, 
etc.). It's more of a practical value that is likely to work under different 
versions. Some details: 
https://programming.guide/java/array-maximum-length.html. If this comment is 
confusing, I can remove it (or perhaps extend it a little?).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 649698)
    Time Spent: 1h 40m  (was: 1.5h)

> Modify Text.ensureCapacity() to efficiently max out the backing array size
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-17905
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17905
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> This is a continuation of HADOOP-17901.
> Right now we use a factor of 1.5x to grow the byte array when it is full. 
> However, once the size reaches a certain point, the new size is only (current 
> size + length). This can cause performance issues if the textual data which 
> we intend to store is beyond this point.
> Instead, let's grow the array straight to the maximum size. Based on different 
> sources, a safe choice seems to be Integer.MAX_VALUE - 8 (see ArrayList, 
> AbstractCollection, Hashtable, etc).
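
The growth policy described above could be sketched roughly as follows. This is only an illustration of the idea, not the code from the pull request: the class name CapacitySketch, the method newCapacity and the constant name ARRAY_MAX_SIZE are assumptions made for this example, and the real Text.ensureCapacity() may differ in its details.

```java
public final class CapacitySketch {

    // Assumed cap on the backing array, taken from the value named in the
    // issue description (Integer.MAX_VALUE - 8).
    static final int ARRAY_MAX_SIZE = Integer.MAX_VALUE - 8;

    private CapacitySketch() {
    }

    /**
     * Returns the capacity to allocate so that at least {@code required}
     * bytes fit, given the current capacity.
     */
    static int newCapacity(int current, int required) {
        if (current < 0 || required < 0 || required > ARRAY_MAX_SIZE) {
            throw new IllegalArgumentException(
                "Invalid capacity request: " + required);
        }
        if (required <= current) {
            // Nothing to do, the array is already large enough.
            return current;
        }
        // Grow by 1.5x; computed as long so the sum cannot overflow int.
        long grown = (long) current + (current >> 1);
        long target = Math.max(grown, (long) required);
        // Once 1.5x growth would exceed the cap, jump straight to the cap
        // instead of creeping up by (current size + length) on each call.
        return (int) Math.min(target, (long) ARRAY_MAX_SIZE);
    }
}
```

With this policy, a request slightly above a 2,000,000,000-byte array is answered with the cap in a single step, so later appends do not trigger further reallocations.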



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
