[
https://issues.apache.org/jira/browse/HADOOP-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402836#comment-15402836
]
Chris Nauroth commented on HADOOP-13403:
----------------------------------------
Thank you for sharing patch 003.
If the reason for the unusual executor logic is optimization, then I suggest
adding more comments in the {{executeParallel}} JavaDocs to explain that. I'm
not sure that the memory optimization argument is true for the {{delete}} code
path, where it still does a conversion from {{ArrayList}} to array.
bq. Is there any way to achieve this through futures?
With idiomatic usage, the typical solution would be to call
{{ThreadPoolExecutor#submit}} for each task, track every returned {{Future}} in
a list, and then iterate through the list and call {{Future#get}} on each one.
If any individual task threw an exception, the call to {{Future#get}} would
propagate it. That would give you an opportunity to call
{{ThreadPoolExecutor#shutdownNow}} to cancel or interrupt all remaining tasks.
With the current logic, though, I don't see a way to adapt this pattern.
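For reference, here is a minimal, self-contained sketch of that idiomatic submit/get/shutdownNow pattern; the class and method names are hypothetical, not from the patch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FuturePatternSketch {
  // Submits each task, tracks every returned Future, then fails fast:
  // if any task threw, Future#get propagates the ExecutionException and we
  // cancel/interrupt the remaining tasks via shutdownNow().
  public static int runAll(ExecutorService executor, List<Callable<Integer>> tasks)
      throws Exception {
    List<Future<Integer>> futures = new ArrayList<>();
    for (Callable<Integer> task : tasks) {
      futures.add(executor.submit(task));
    }
    int completed = 0;
    try {
      for (Future<Integer> f : futures) {
        f.get();  // propagates an ExecutionException from a failed task
        completed++;
      }
    } catch (ExecutionException e) {
      executor.shutdownNow();  // cancel or interrupt all remaining tasks
      throw e;
    }
    return completed;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<Callable<Integer>> tasks = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      final int n = i;
      tasks.add(() -> n * n);
    }
    System.out.println("completed=" + runAll(pool, tasks));  // prints "completed=8"
    pool.shutdown();
  }
}
```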
Repeating an earlier comment, I don't see any exceptions thrown from
{{getThreadPool}}, so exception handling around it and tests for it look
unnecessary. If you check the validity of {{deleteThreadCount}} and
{{renameThreadCount}} in {{initialize}} (e.g. check for values <= 0) and fail
fast by throwing an exception during initialization, then even unchecked
exceptions will be impossible during calls to {{getThreadPool}}.
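A minimal sketch of the fail-fast validation suggested above; the class name, method name, and message wording are hypothetical:

```java
public class ThreadCountValidation {
  // Hypothetical helper for initialize(): rejects non-positive thread counts
  // up front so later calls to getThreadPool cannot fail.
  static int validateThreadCount(String configKey, int value) {
    if (value <= 0) {
      throw new IllegalArgumentException(
          "Invalid value " + value + " for " + configKey
          + "; thread count must be > 0");
    }
    return value;
  }

  public static void main(String[] args) {
    // Valid value passes through unchanged.
    System.out.println(validateThreadCount("fs.azure.delete.threads", 8));
  }
}
```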
I still see numerous test failures in {{TestFileSystemOperationsWithThreads}}.
For the next patch revision, would you please ensure all tests pass?
> AzureNativeFileSystem rename/delete performance improvements
> ------------------------------------------------------------
>
> Key: HADOOP-13403
> URL: https://issues.apache.org/jira/browse/HADOOP-13403
> Project: Hadoop Common
> Issue Type: Bug
> Components: azure
> Affects Versions: 2.7.2
> Reporter: Subramanyam Pattipaka
> Assignee: Subramanyam Pattipaka
> Fix For: 2.9.0
>
> Attachments: HADOOP-13403-001.patch, HADOOP-13403-002.patch,
> HADOOP-13403-003.patch
>
>
> WASB Performance Improvements
> Problem
> -----------
> Azure Native File System operations like rename/delete on source directories
> that contain a large number of directories and/or files are experiencing
> performance issues. Here are the possible reasons:
> a) We first list all files under source directory hierarchically. This is
> a serial operation.
> b) After collecting the entire list of files under a folder, we delete or
> rename files one by one serially.
> c) There is no logging information available for these costly operations,
> even in DEBUG mode, making it difficult to understand WASB performance
> issues.
> Proposal
> -------------
> Step 1: Rename and delete operations will generate a list of all files under
> the source folder. We use the Azure flat listing option to get the list with
> a single request to the Azure store. We have introduced the config
> fs.azure.flatlist.enable to enable this option. The default value is 'false',
> which means flat listing is disabled.
> Step 2: Create the thread pool and threads dynamically based on user
> configuration. These thread pools are deleted after the operation is over.
> We are introducing two new configs:
> a) fs.azure.rename.threads: Config to set the number of rename
> threads. The default value is 0, which means no threading.
> b) fs.azure.delete.threads: Config to set the number of delete
> threads. The default value is 0, which means no threading.
> We have provided debug log information on the number of threads not used
> for the operation, which can be useful.
> Failure Scenarios:
> If we fail to create the thread pool for ANY reason (for example, trying to
> create it with a very large thread count such as 1000000), we fall back to
> the serial operation.
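The configs introduced above could be set in a core-site.xml fragment like this; the values shown are illustrative, not the defaults:

```xml
<!-- Illustrative settings for the new WASB configs named in this proposal. -->
<property>
  <name>fs.azure.flatlist.enable</name>
  <value>true</value>
</property>
<property>
  <name>fs.azure.rename.threads</name>
  <value>8</value>
</property>
<property>
  <name>fs.azure.delete.threads</name>
  <value>8</value>
</property>
```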
> Step 3: Blob operations can be done in parallel using multiple threads
> executing the following snippet:
> while ((currentIndex = fileIndex.getAndIncrement()) < files.length) {
>     FileMetadata file = files[currentIndex];
>     Rename/delete(file);
> }
> The above strategy depends on the fact that all files are stored in a
> final array, and each thread synchronizes only on claiming the next index to
> process. The advantage of this strategy is that even if the user configures a
> large number of unusable threads, we always ensure that work doesn't get
> serialized due to lagging threads.
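The index-claiming scheme described above can be sketched as a self-contained example; the class name, per-thread counters, and the stand-in for the rename/delete call are hypothetical, with a shared {{AtomicInteger}} as the only synchronization point:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedIndexSketch {
  // Each worker claims the next unprocessed slot from a shared AtomicInteger,
  // so fast threads absorb the work of slow (or never-started) ones and no
  // file is processed twice.
  public static int[] processAll(String[] files, int threadCount)
      throws InterruptedException {
    AtomicInteger fileIndex = new AtomicInteger(0);
    int[] perThreadCounts = new int[threadCount];  // one slot per worker
    Thread[] workers = new Thread[threadCount];
    for (int t = 0; t < threadCount; t++) {
      final int id = t;
      workers[t] = new Thread(() -> {
        int currentIndex;
        while ((currentIndex = fileIndex.getAndIncrement()) < files.length) {
          // the real rename/delete of files[currentIndex] would go here
          perThreadCounts[id]++;
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) {
      w.join();  // join() gives a happens-before edge for reading the counts
    }
    return perThreadCounts;
  }

  public static void main(String[] args) throws InterruptedException {
    int[] counts = processAll(new String[100], 4);
    int total = 0;
    for (int c : counts) total += c;
    System.out.println("total=" + total);  // prints "total=100": each file claimed once
  }
}
```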
> We are logging the following information, which can be useful for tuning
> the number of threads:
> a) Number of unusable threads
> b) Time taken by each thread
> c) Number of files processed by each thread
> d) Total time taken for the operation
> Failure Scenarios:
> Failure to queue a thread execution request shouldn't be an issue as long
> as we can ensure at least one thread has completed execution successfully.
> If we couldn't schedule even one thread, then we take the serial path.
> Exceptions raised while executing threads are still considered regular
> exceptions and returned to the client as an operation failure. Exceptions
> raised while stopping threads and deleting the thread pool can be ignored
> if the operation completed on all files without any issue.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]