[ 
https://issues.apache.org/jira/browse/RANGER-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406733#comment-17406733
 ] 

Pradeep Agrawal commented on RANGER-3386:
-----------------------------------------

[~caozhiqiang]: CollectionUtils.isEmpty() and isNotEmpty() do not perform any 
expensive operations. We need more evidence for this claim; if we go ahead 
directly, we may end up with too many changes and lose track of the important 
ones. 

For now I will vote -1 on this. However, we could do the following:
 * Add a test case patch so that other developers can verify the result in 
their environment. 
 * Check with the Apache Commons community about why this could be slow.
 * Give this patch more time: make the change in your company's 
private branch and observe the improvement.
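
As a rough illustration of the first suggestion, a micro-benchmark along the 
following lines could be attached. This is only a sketch under assumptions: the 
class EmptyCheckBenchmark and its methods are hypothetical names, and 
isNotEmpty here is a local stand-in assumed to behave like 
CollectionUtils.isNotEmpty (a null check plus isEmpty()), so that the example 
compiles without commons-collections. To measure the real library, swap in 
CollectionUtils.isNotEmpty. The sketch warms up both paths and times many 
iterations with System.nanoTime(); a harness such as JMH would give more 
trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class EmptyCheckBenchmark {

    // Written to after each timed loop so the JIT is less likely to discard the work.
    static volatile int sink;

    // Hypothetical stand-in, ASSUMED to mirror CollectionUtils.isNotEmpty.
    // Replace with the real library call to benchmark commons-collections itself.
    static boolean isNotEmpty(Collection<?> coll) {
        return coll != null && !coll.isEmpty();
    }

    // Times `iterations` inline null/empty checks; returns elapsed nanoseconds.
    static long timeDirect(List<Integer> list, int iterations) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            if (list != null && !list.isEmpty()) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        sink = hits;
        return elapsed;
    }

    // Times `iterations` calls through the helper; returns elapsed nanoseconds.
    static long timeHelper(List<Integer> list, int iterations) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            if (isNotEmpty(list)) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        sink = hits;
        return elapsed;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(i);
        }

        // Warm-up: lets class loading and JIT compilation happen before measuring.
        timeDirect(list, 100_000);
        timeHelper(list, 100_000);

        int iterations = 10_000_000;
        System.out.println("direct check: " + timeDirect(list, iterations) + " ns");
        System.out.println("helper call:  " + timeHelper(list, iterations) + " ns");
    }
}
```

If both paths report similar totals after warm-up, the 5 ms seen in a one-shot 
measurement is more likely first-call overhead, such as class loading, than the 
cost of the check itself.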

> apache CollectionUtils package reduce ranger agent performance and should be 
> replaced
> -------------------------------------------------------------------------------------
>
>                 Key: RANGER-3386
>                 URL: https://issues.apache.org/jira/browse/RANGER-3386
>             Project: Ranger
>          Issue Type: Improvement
>          Components: plugins
>    Affects Versions: 2.0.1
>            Reporter: caozhiqiang
>            Assignee: caozhiqiang
>            Priority: Major
>         Attachments: CollectionUtils.png, RANGER-3386-branch-2.0.1.001.patch, 
> RANGER-3386-branch-2.0.1.002.patch
>
>
> org.apache.commons.collections.CollectionUtils performs poorly and reduces 
> the performance of Ranger plugins, particularly with HDFS. 
> CollectionUtils.isNotEmpty and CollectionUtils.isEmpty are used in too many 
> places in the agent component, so we should replace them.
> The NameNode benchmark results show that many CollectionUtils calls take too 
> much time.
> In this patch, I replace almost all CollectionUtils calls in agents-common. 
> With this patch applied, the HDFS file-creation benchmark improves from 
> creating *7000* files to *7600* files per second.
> By the way, I wrote a simple test, shown below. collection.isEmpty takes 
> almost 0 milliseconds, but CollectionUtils.isNotEmpty takes 5 milliseconds.
>  
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.commons.collections.CollectionUtils;
>
> List<Integer> list = new ArrayList<Integer>();
> for (int i = 0; i < 1000; i++) {
>     list.add(i);
> }
> // Time a single direct null/empty check.
> long startTime = System.currentTimeMillis();
> if (list != null && !list.isEmpty()) {
> }
> long endTime   = System.currentTimeMillis();
> long totalTime = endTime - startTime;
> System.out.println(totalTime);
> // Time a single call to CollectionUtils.isNotEmpty.
> long startTime2 = System.currentTimeMillis();
> CollectionUtils.isNotEmpty(list);
> long endTime2   = System.currentTimeMillis();
> long totalTime2 = endTime2 - startTime2;
> System.out.println(totalTime2);
> {code}
>  
> !CollectionUtils.png|width=680,height=324!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
