[ 
https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871830#comment-16871830
 ] 

Sean Mackrory commented on HADOOP-15729:
----------------------------------------

Attaching a patch with my proposed change. Unfortunately, it has a bigger impact 
on simple operations like a 1 GB upload than I expected, although it's hard to 
be sure it's not noise. See below for the times to upload a 1 GB file from a 
machine in us-west-2 to a bucket / pre-created DynamoDB table in the same 
region. Maybe it's worth adding fs.s3a.core.threads with a moderate default 
value. Long-running processes (like HiveServer2) might access many buckets, in 
which case the total thread count implied by fs.s3a.max.threads grows absolutely 
out of control - core threads could still do the same unless the value is *much* 
lower, in which case you'd easily hit this performance regression anyway. I 
would suggest we just proceed and consider fs.s3a.core.threads if further 
performance testing reveals an issue. Thoughts?

Without change:
{code}
real    0m27.415s
user    0m25.128s
sys     0m6.377s

real    0m25.360s
user    0m25.081s
sys     0m6.368s

real    0m27.615s
user    0m25.296s
sys     0m6.015s

real    0m25.001s
user    0m25.408s
sys     0m6.717s

real    0m28.083s
user    0m24.764s
sys     0m5.774s

real    0m26.117s
user    0m25.192s
sys     0m5.867s
{code}

With change:
{code}
real    0m28.928s
user    0m24.182s
sys     0m5.699s

real    0m33.359s
user    0m25.508s
sys     0m6.407s

real    0m44.412s
user    0m24.565s
sys     0m6.226s

real    0m27.469s
user    0m25.326s
sys     0m6.142s

real    0m35.660s
user    0m25.206s
sys     0m6.154s

real    0m31.811s
user    0m25.042s
sys     0m6.057s
{code}
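For context, the core-thread timeout behavior the patch turns on can be sketched with a plain java.util.concurrent.ThreadPoolExecutor. This is a standalone illustration, not the actual S3A code: the pool size of 4 stands in for fs.s3a.max.threads, and the 100 ms keep-alive is shortened just so the demo runs quickly.

{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreThreadTimeoutDemo {
    // Pool size observed after the executor has sat idle, with or without
    // core-thread timeout enabled. Sizes and keep-alive are demo values.
    static int idlePoolSize(boolean coreTimeout) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            4, Integer.MAX_VALUE,              // 4 stands in for fs.s3a.max.threads
            100, TimeUnit.MILLISECONDS,        // short keep-alive so the demo is quick
            new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(coreTimeout);  // the change under discussion
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> { });            // spin up all four core threads
        }
        // Give idle threads time to pass the keep-alive deadline.
        try { Thread.sleep(1000); } catch (InterruptedException ignored) { }
        int size = pool.getPoolSize();         // 4 without timeout, 0 with it
        pool.shutdownNow();
        return size;
    }

    public static void main(String[] args) {
        System.out.println("core timeout off: " + idlePoolSize(false));
        System.out.println("core timeout on:  " + idlePoolSize(true));
    }
}
{code}

With allowCoreThreadTimeOut(true), the pool spins all the way down to zero threads when idle, instead of pinning corePoolSize idle threads per bucket indefinitely.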

> [s3a] stop treating fs.s3a.max.threads as the long-term minimum
> ------------------------------------------------------------
>
>                 Key: HADOOP-15729
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15729
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Major
>         Attachments: HADOOP-15729.001.patch
>
>
> A while ago the s3a connector started experiencing deadlocks because the AWS 
> SDK requires an unbounded threadpool. It places monitoring tasks on the work 
> queue before the tasks they wait on, so it's possible (has even happened with 
> larger-than-default threadpools) for the executor to become permanently 
> saturated and deadlock.
> So we started giving an unbounded threadpool executor to the SDK, and using a 
> bounded, blocking threadpool service for everything else S3A needs (although 
> currently that's only in the S3ABlockOutputStream). fs.s3a.max.threads then 
> only limits this threadpool; however, we also specified fs.s3a.max.threads as 
> the number of core threads in the unbounded threadpool, which in hindsight is 
> pretty terrible.
> Currently those core threads do not time out, so this effectively sets a 
> minimum. Once that many tasks have been submitted, the pool may burst beyond 
> that number, but it will never spin down below it. If fs.s3a.max.threads is 
> set reasonably high and someone uses a bunch of S3 buckets, they could easily 
> have thousands of idle threads sitting around constantly.
> We should either stop using fs.s3a.max.threads for the core pool size and 
> introduce a new configuration, or simply allow core threads to time out. I'm 
> reading the OpenJDK source now to see what subtle differences there are 
> between core threads and other threads when core threads can time out.
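The "long-term minimum" effect described in the quoted issue can be reproduced with a small standalone sketch (again, the pool size of 8 stands in for fs.s3a.max.threads and the keep-alive is shortened for the demo): burst the pool past its core size, let it go idle, and observe that it settles back to corePoolSize rather than zero.

{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreMinimumDemo {
    // Burst a pool past its core size, let it go idle, and report the
    // pool size it settles back to. With core-thread timeout disabled
    // (the current behavior), that floor is corePoolSize, never lower.
    static int poolSizeAfterBurst() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            8, Integer.MAX_VALUE,            // 8 stands in for fs.s3a.max.threads
            100, TimeUnit.MILLISECONDS,      // short keep-alive for the demo
            new SynchronousQueue<>());       // forces a new thread per concurrent task
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 12; i++) {       // burst beyond the core size
            pool.submit(() -> {
                try { release.await(); } catch (InterruptedException ignored) { }
            });
        }
        release.countDown();                 // let all 12 tasks finish
        try { Thread.sleep(1000); } catch (InterruptedException ignored) { }
        int size = pool.getPoolSize();       // excess threads expired; core remain
        pool.shutdownNow();
        return size;
    }

    public static void main(String[] args) {
        System.out.println("idle pool size: " + poolSizeAfterBurst());
    }
}
{code}

The four threads beyond the core size expire after the keep-alive, but the eight core threads never do - which is exactly why a large fs.s3a.max.threads across many buckets pins thousands of idle threads.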



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
