[ 
https://issues.apache.org/jira/browse/HADOOP-11506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297807#comment-14297807
 ] 

Andrew Wang commented on HADOOP-11506:
--------------------------------------

Happy to, thanks for working on this everyone. Review comments below, only one 
substantial one. Core logic looks good though.

* The new recursion detection only handles a two-var loop, but the old one 
handles bigger loops with the hash set. For instance, {{FOO1=FOO2, FOO2=FOO3, 
FOO3=FOO1}}. The behavior is a little different thus; before, we'd return the 
start of the loop, here, we'd hit the IllegalStateException at the end. I feel 
like the exception is arguably better, but compatibility dictates we preserve 
the existing behavior.

Nits:

* Not caused by your patch, but do you mind adding javadoc to substituteVars 
about the expected input/output? Same for findSubVariable would be nice.
* Also good to document why we go through this dance rather than using a regex 
(can mention this JIRA #)
* Did you use a return parameter for performance reasons? Eden allocation is 
pretty fast, so just returning seems fine. With all the {{substring}} and 
string concat flying around, we're doing a lot of allocation anyway.

Spacing is a bit off here:
{code}
      eval =   eval.substring(0, dollar)
{code}

remove "the"
{code}
    // that can occur in the nested class names
{code}

I think this should be a right brace rather than left for ultimate clarity:
{code}
      int afterRightBrace = varBounds[SUB_END_IDX] + "{".length();
{code}

Similar, I think {{"\{c"}} should be {{"c\}"}} for ultimate clarity:
{code}
         bracePos > 0 && bracePos + "{c".length() < eval.length();
{code}


> Configuration.get() is unnecessarily slow
> -----------------------------------------
>
>                 Key: HADOOP-11506
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11506
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Gera Shegalov
>         Attachments: HADOOP-11506.001.patch, HADOOP-11506.002.patch
>
>
> Profiling several large Hadoop jobs, we discovered that a surprising amount 
> of time was spent inside Configuration.get, more specifically, in regex 
> matching caused by the substituteVars call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to