[
https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871830#comment-16871830
]
Sean Mackrory commented on HADOOP-15729:
----------------------------------------
Attaching a patch with my proposed change. Unfortunately, it has a bigger impact
on simple things like a 1 GB upload than I thought, although it's hard to be sure
it's not noise. See below for the times to upload a 1 GB file from a machine in
us-west-2 to a bucket / pre-created DynamoDB table in the same region. Maybe
this is worth adding fs.s3a.core.threads with a moderate default value.
Long-running processes (like HiveServer2) might access many buckets, and with
fs.s3a.max.threads core threads per bucket the total thread count grows
absolutely out of control - fs.s3a.core.threads could still do the same unless
it's *much* lower, in which case you'd easily hit this performance regression
anyway. I would suggest we just proceed and consider fs.s3a.core.threads if
further performance testing reveals an issue. Thoughts?
Without change:
{code}
real 0m27.415s
user 0m25.128s
sys 0m6.377s
real 0m25.360s
user 0m25.081s
sys 0m6.368s
real 0m27.615s
user 0m25.296s
sys 0m6.015s
real 0m25.001s
user 0m25.408s
sys 0m6.717s
real 0m28.083s
user 0m24.764s
sys 0m5.774s
real 0m26.117s
user 0m25.192s
sys 0m5.867s
{code}
With change:
{code}
real 0m28.928s
user 0m24.182s
sys 0m5.699s
real 0m33.359s
user 0m25.508s
sys 0m6.407s
real 0m44.412s
user 0m24.565s
sys 0m6.226s
real 0m27.469s
user 0m25.326s
sys 0m6.142s
real 0m35.660s
user 0m25.206s
sys 0m6.154s
real 0m31.811s
user 0m25.042s
sys 0m6.057s
{code}
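For context, here's a minimal sketch of the core-thread-timeout approach the
issue text suggests, on a plain java.util.concurrent.ThreadPoolExecutor with
made-up sizes - illustrative only, not the actual patch or the real S3A wiring:
{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreTimeoutSketch {
  public static void main(String[] args) {
    // Hypothetical stand-ins for fs.s3a.max.threads and the keep-alive time.
    int maxThreads = 10;
    long keepAliveSecs = 60;

    // Roughly the shape of the "unbounded" pool handed to the SDK: an
    // unbounded work queue, with the core size taken from maxThreads.
    ThreadPoolExecutor unboundedPool = new ThreadPoolExecutor(
        maxThreads, Integer.MAX_VALUE,
        keepAliveSecs, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>());

    // The gist of the change: let idle core threads die off after the
    // keep-alive interval instead of lingering forever as a floor.
    unboundedPool.allowCoreThreadTimeOut(true);

    unboundedPool.submit(() -> System.out.println("transfer task"));
    unboundedPool.shutdown();
  }
}
{code}
allowCoreThreadTimeOut() has been on ThreadPoolExecutor since Java 6, so no new
executor type is needed; the keep-alive time simply starts applying to core
threads as well.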
> [s3a] stop treating fs.s3a.max.threads as the long-term minimum
> ------------------------------------------------------------
>
> Key: HADOOP-15729
> URL: https://issues.apache.org/jira/browse/HADOOP-15729
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
> Priority: Major
> Attachments: HADOOP-15729.001.patch
>
>
> A while ago the s3a connector started experiencing deadlocks because the AWS
> SDK requires an unbounded threadpool. It places monitoring tasks on the work
> queue before the tasks they wait on, so it's possible (has even happened with
> larger-than-default threadpools) for the executor to become permanently
> saturated and deadlock.
> So we started giving an unbounded threadpool executor to the SDK, and using a
> bounded, blocking threadpool service for everything else S3A needs (although
> currently that's only in the S3ABlockOutputStream). fs.s3a.max.threads then
> only limits this threadpool; however, we also specified fs.s3a.max.threads as
> the number of core threads in the unbounded threadpool, which in hindsight is
> pretty terrible.
> Currently those core threads do not time out, so this is actually setting a
> sort of minimum. Once that many tasks have been submitted, the threadpool
> stays at that size: it can burst beyond that, but it will only ever spin back
> down that far. If fs.s3a.max.threads is set reasonably high and someone uses
> a bunch of S3 buckets, they could easily have thousands of idle threads
> sitting around constantly.
> We should either stop using fs.s3a.max.threads for the core pool size and
> introduce a new configuration, or simply allow core threads to time out. I'm
> reading the OpenJDK source now to see what subtle differences there are
> between core threads and other threads when core threads can time out.
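As a standalone illustration of the "long-term minimum" behaviour described
above (hypothetical sizes, plain ThreadPoolExecutor rather than the real S3A
wiring):
{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreThreadFloorDemo {
  public static void main(String[] args) throws InterruptedException {
    // Stand-in for one bucket's unbounded pool, with the core size taken
    // from a large fs.s3a.max.threads value (hypothetical number here).
    int maxThreads = 256;
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreads, Integer.MAX_VALUE,
        60L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>());

    // Each submission below maxThreads starts a new core thread, even if
    // existing threads are idle, so a short burst fills the core pool.
    for (int i = 0; i < maxThreads; i++) {
      pool.submit(() -> { });
    }
    Thread.sleep(5000);

    // Core threads never time out by default, so the pool holds maxThreads
    // idle threads indefinitely - the "long-term minimum". Multiply by the
    // number of buckets a long-lived process touches to see how it piles up.
    System.out.println("idle pool size: " + pool.getPoolSize());
    pool.shutdownNow();
  }
}
{code}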