[ 
https://issues.apache.org/jira/browse/HADOOP-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-17313:
------------------------------------
    Release Note: 
The option "fs.creation.parallel.count" sets the size of a semaphore which
throttles the number of FileSystem instances that
can be created simultaneously.

This is designed to reduce the impact of many threads in an application calling
FileSystem.get() on a filesystem which takes time to instantiate, for example
an object store connector which sets up HTTPS connections during initialization.
Many threads trying to do this may create spurious delays by contending
for access to synchronized blocks; simply limiting the parallelism
reduces the contention, and so speeds up all threads trying to access
the store.
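
The throttling idea can be illustrated with a minimal, self-contained sketch. This is not the Hadoop implementation: the class `ThrottledCreator`, its methods and the sleep standing in for slow initialization are all hypothetical names chosen for illustration. It only shows how a semaphore bounds the number of threads that are inside a slow factory at once.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper (not Hadoop code): caps concurrent calls to a slow factory. */
final class ThrottledCreator<T> {
  private final Semaphore permits;
  private final AtomicInteger active = new AtomicInteger();
  private final AtomicInteger peak = new AtomicInteger();

  ThrottledCreator(int parallelCount) {
    permits = new Semaphore(parallelCount);
  }

  /** Runs the factory while holding a permit; excess callers wait here. */
  T create(Supplier<T> factory) {
    permits.acquireUninterruptibly();
    try {
      // Track how many factories are running right now, and the highest value seen.
      peak.accumulateAndGet(active.incrementAndGet(), Math::max);
      return factory.get();
    } finally {
      active.decrementAndGet();
      permits.release();
    }
  }

  /** Highest number of factories observed running at once. */
  int peakConcurrency() {
    return peak.get();
  }

  /** Demo: many threads create through one throttle; returns the observed peak. */
  static int demoPeak(int parallelCount, int threads) {
    ThrottledCreator<String> creator = new ThrottledCreator<>(parallelCount);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> creator.create(() -> {
        sleepQuietly(20); // stand-in for slow initialization
        return "fs";
      }));
      workers[i].start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return creator.peakConcurrency();
  }

  private static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

Calling `ThrottledCreator.demoPeak(2, 8)` starts eight threads, but the observed peak never exceeds the two permits; the other threads queue on the semaphore instead of contending inside the factory.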

The default value, 64, is larger than is likely to deliver any speedup -but
it does mean that there should be no adverse effects from the change.

If a service appears to be blocking on all threads initializing connections to
abfs, s3a or another object store, try a smaller (possibly significantly smaller) value.

     Description: 
A recurrent problem in processes with many worker threads (Hive, Spark etc.) is 
that calling `FileSystem.get(URI-to-object-store)` triggers the creation and 
then discarding of many FS clients -all but one for the same URL. As well as the 
direct performance hit, this can exacerbate locking problems and make 
instantiation a lot slower than it would otherwise be.

This has been observed with the S3A and ABFS connectors.

The ultimate solution here would probably be something more complicated to 
ensure that only one thread was ever creating a connector for a given URL -the 
rest would wait for it to be initialized. This would (a) reduce contention and 
CPU, IO and network load, and (b) reduce the time for all but the first thread to 
resume processing to that of the remaining time in .initialize(). This would 
also benefit the S3A connector.

We'd need something like

# A (per-user) map of filesystems being created <URI, FileSystem>
# split createFileSystem into two: instantiateFileSystem and 
initializeFileSystem
# each thread to instantiate the FS, put() it into the new map
# If there was one already, discard the old one and wait for the new one to be 
ready via a call to Object.wait()
# If there wasn't an entry, call initializeFileSystem() and then, finally, call 
Object.notifyAll(), and move it from the map of filesystems being initialized 
to the map of created filesystems

This sounds too straightforward to be that simple; the troublespots are 
probably related to race conditions moving entries between the two maps and 
making sure that no thread will block on the FS being initialized while it has 
already been initialized (and so wait() will block forever).
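
One way to avoid the lost-notification risk is sketched below, under stated assumptions. It is not the design in the list above and not Hadoop code: it swaps Object.wait()/notifyAll() for a `CompletableFuture` per URI and keeps a single map rather than two. The class `SingleInitCache`, its methods and the example URI are hypothetical. A thread which loses the race waits on the winner's future, and a future which has already completed returns immediately, so a late arrival cannot block forever.

```java
import java.net.URI;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch (not Hadoop code): at most one successful initialization per URI. */
final class SingleInitCache<T> {
  private final ConcurrentMap<URI, CompletableFuture<T>> entries = new ConcurrentHashMap<>();

  T get(URI uri, Function<URI, T> initializer) {
    CompletableFuture<T> mine = new CompletableFuture<>();
    CompletableFuture<T> existing = entries.putIfAbsent(uri, mine);
    if (existing != null) {
      // Another thread owns creation: wait for its result (immediate if already done).
      return existing.join();
    }
    try {
      T created = initializer.apply(uri);
      mine.complete(created);
      return created;
    } catch (RuntimeException | Error e) {
      entries.remove(uri, mine);      // let a later caller retry
      mine.completeExceptionally(e);  // wake current waiters instead of blocking them
      throw e;
    }
  }

  /** Demo: many threads ask for the same URI; returns how often the initializer ran. */
  static int demoInitCount(int threads) {
    SingleInitCache<String> cache = new SingleInitCache<>();
    AtomicInteger inits = new AtomicInteger();
    URI uri = URI.create("s3a://example-bucket/"); // hypothetical store URI
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> cache.get(uri, u -> {
        inits.incrementAndGet();
        return "client-for-" + u;
      }));
      workers[i].start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return inits.get();
  }
}
```

The trade-off is that waiters of a failed initialization see the failure (wrapped in a `CompletionException`) rather than retrying themselves; whether that is acceptable for FileSystem.get() is an open design question.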

Rather than seek perfection, it may be safest to go for a best-effort optimisation 
of the number of FS instances created/initialized. That is: it's better to maybe 
create a few more FS instances than needed than it is to block forever.

Something is doable here, it's just not quick-and-dirty. Testing will be "fun"; 
probably best to isolate this new logic somewhere where we can simulate slow 
starts on one thread with many other threads waiting for it.

A simpler option would be to have a lock on the construction process: only one 
FS can be instantiated per user at a time.
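
A rough sketch of that simpler option follows. It is not Hadoop code: the class `LockedCreationCache` and its methods are hypothetical, one instance is assumed to exist per user, and the re-check of the cache after taking the lock is an added assumption rather than something stated above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Hypothetical sketch (not Hadoop code): one construction at a time, one instance per user. */
final class LockedCreationCache<K, V> {
  private final Map<K, V> created = new ConcurrentHashMap<>();
  private final ReentrantLock creationLock = new ReentrantLock();
  private int creations; // guarded by creationLock

  V get(K key, Function<K, V> factory) {
    V cached = created.get(key);
    if (cached != null) {
      return cached; // fast path: no lock needed once the instance exists
    }
    creationLock.lock(); // only one thread constructs at a time
    try {
      // Assumption: re-check, since another thread may have created it while we waited.
      V again = created.get(key);
      if (again != null) {
        return again;
      }
      V fresh = factory.apply(key);
      created.put(key, fresh);
      creations++;
      return fresh;
    } finally {
      creationLock.unlock();
    }
  }

  /** Number of times a factory has actually been run. */
  int creationCount() {
    creationLock.lock();
    try {
      return creations;
    } finally {
      creationLock.unlock();
    }
  }
}
```

The cost of this approach is that creation of unrelated filesystems is also serialized: a slow store delays threads which are waiting to create a client for a different, fast one.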




> FileSystem.get to support slow-to-instantiate FS clients
> --------------------------------------------------------
>
>                 Key: HADOOP-17313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17313
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/azure, fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> A recurrent problem in processes with many worker threads (Hive, Spark etc.) 
> is that calling `FileSystem.get(URI-to-object-store)` triggers the creation 
> and then discarding of many FS clients -all but one for the same URL. As well as 
> the direct performance hit, this can exacerbate locking problems and make 
> instantiation a lot slower than it would otherwise be.
> This has been observed with the S3A and ABFS connectors.
> The ultimate solution here would probably be something more complicated to 
> ensure that only one thread was ever creating a connector for a given URL 
> -the rest would wait for it to be initialized. This would (a) reduce 
> contention and CPU, IO and network load, and (b) reduce the time for all but the 
> first thread to resume processing to that of the remaining time in 
> .initialize(). This would also benefit the S3A connector.
> We'd need something like
> # A (per-user) map of filesystems being created <URI, FileSystem>
> # split createFileSystem into two: instantiateFileSystem and 
> initializeFileSystem
> # each thread to instantiate the FS, put() it into the new map
> # If there was one already, discard the old one and wait for the new one to 
> be ready via a call to Object.wait()
> # If there wasn't an entry, call initializeFileSystem() and then, finally, 
> call Object.notifyAll(), and move it from the map of filesystems being 
> initialized to the map of created filesystems
> This sounds too straightforward to be that simple; the troublespots are 
> probably related to race conditions moving entries between the two maps and 
> making sure that no thread will block on the FS being initialized while it 
> has already been initialized (and so wait() will block forever).
> Rather than seek perfection, it may be safest to go for a best-effort 
> optimisation of the number of FS instances created/initialized. That is: it's 
> better to maybe create a few more FS instances than needed than it is to block 
> forever.
> Something is doable here, it's just not quick-and-dirty. Testing will be 
> "fun"; probably best to isolate this new logic somewhere where we can 
> simulate slow starts on one thread with many other threads waiting for it.
> A simpler option would be to have a lock on the construction process: only 
> one FS can be instantiated per user at a time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
