[ https://issues.apache.org/jira/browse/HADOOP-11506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gera Shegalov updated HADOOP-11506:
-----------------------------------
Attachment: HADOOP-11506.004.patch
Minor fix: I had unintentionally specified an initial capacity for the HashSet:
{code}
diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
index 9380d68..c47e771 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
@@ -963,7 +963,7 @@ private String substituteVars(String expr) {
       final int afterRightBrace = varBounds[SUB_END_IDX] + "}".length();
       final String refVar = eval.substring(dollar, afterRightBrace);
       if (evalSet == null) {
-        evalSet = new HashSet<String>(1);
+        evalSet = new HashSet<String>();
       }
       if (!evalSet.add(refVar)) {
         return expr; // return original expression if there is a loop
{code}
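
For readers without the full patch at hand, the loop check in the hunk above can be illustrated with a small standalone sketch. This is not the Hadoop implementation: the class name SubstitutionLoopSketch, the Map-backed property lookup, and the MAX_SUBST cap are assumptions made purely for illustration. It shows a lazily allocated HashSet recording each ${...} reference, with the original expression returned as soon as a reference repeats.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of loop detection during variable substitution.
// Not the actual org.apache.hadoop.conf.Configuration code.
final class SubstitutionLoopSketch {
  // Assumed cap on substitution rounds, chosen for this sketch only.
  private static final int MAX_SUBST = 20;

  static String substitute(Map<String, String> props, String expr) {
    if (expr == null) {
      return null;
    }
    Set<String> evalSet = null; // allocated only if a reference is found
    String eval = expr;
    for (int i = 0; i < MAX_SUBST; i++) {
      int dollar = eval.indexOf("${");
      if (dollar < 0) {
        return eval; // nothing left to substitute
      }
      int rightBrace = eval.indexOf('}', dollar + 2);
      if (rightBrace < 0) {
        return eval; // unterminated reference, leave as is
      }
      int afterRightBrace = rightBrace + "}".length();
      String refVar = eval.substring(dollar, afterRightBrace);
      if (evalSet == null) {
        evalSet = new HashSet<String>(); // default capacity, as in the fix
      }
      if (!evalSet.add(refVar)) {
        return expr; // reference seen before: treat as a loop
      }
      String name = refVar.substring(2, refVar.length() - 1);
      String val = props.get(name);
      if (val == null) {
        return eval; // unknown variable, stop here
      }
      eval = eval.substring(0, dollar) + val + eval.substring(afterRightBrace);
    }
    return expr; // too many rounds, give up and return the input
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<String, String>();
    props.put("home", "/usr");
    props.put("bin", "${home}/bin");
    props.put("a", "${b}");
    props.put("b", "${a}");
    System.out.println(substitute(props, "${bin}/java"));
    System.out.println(substitute(props, "${a}"));
  }
}
```

Note that in this sketch a variable referenced twice within one expression is also reported as a loop, because the set only records that a reference was seen, not the path that led to it. On the capacity argument itself: the no-argument HashSet constructor uses the default initial capacity of 16 with a load factor of 0.75.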
> Configuration.get() is unnecessarily slow
> -----------------------------------------
>
> Key: HADOOP-11506
> URL: https://issues.apache.org/jira/browse/HADOOP-11506
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Dmitriy V. Ryaboy
> Assignee: Gera Shegalov
> Attachments: HADOOP-11506.001.patch, HADOOP-11506.002.patch,
> HADOOP-11506.003.patch, HADOOP-11506.004.patch
>
>
> Profiling several large Hadoop jobs, we discovered that a surprising amount
> of time was spent inside Configuration.get, more specifically, in regex
> matching caused by the substituteVars call.
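
One generic way to reduce this kind of overhead, not necessarily what the attached patches do, is to skip the regex entirely for values that cannot contain a variable reference. The sketch below is a minimal illustration under that assumption; the class FastPathSketch, its methods, and the pattern are hypothetical and are not taken from Configuration.java.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: guard an expensive regex match with a cheap indexOf.
final class FastPathSketch {
  // Illustrative pattern for a ${name} reference; assumed for this sketch.
  private static final Pattern VAR = Pattern.compile("\\$\\{[^}$\\s]+\\}");

  // Cheap pre-check: a value without "${" cannot contain a reference.
  static boolean mayContainVar(String value) {
    return value != null && value.indexOf("${") >= 0;
  }

  // Returns the first ${...} reference in value, or null if there is none.
  static String firstVar(String value) {
    if (!mayContainVar(value)) {
      return null; // common case: no regex work at all
    }
    Matcher m = VAR.matcher(value);
    return m.find() ? m.group() : null;
  }

  public static void main(String[] args) {
    System.out.println(firstVar("/tmp/data"));
    System.out.println(firstVar("${user.home}/data"));
  }
}
```

Most configuration values are plain strings, so a guard like this means the matcher is only created for the small fraction of values that actually contain "${".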