[
https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067772#comment-15067772
]
GAO Rui commented on HDFS-9494:
-------------------------------
Hi [~szetszwo], I agree that processing failures sequentially is safer than
trying to confirm {{checkStreamers()}}'s thread safety. I have uploaded a new
patch. Could you review it again?
I use a {{ConcurrentHashMap}} to keep both the streamer and its related
exception inside {{call()}}, so that we can provide more specific information
to {{handleStreamerFailure()}} later. The handling code:
{code}
Iterator<Map.Entry<StripedDataStreamer, Exception>> iterator =
    streamersExceptionMap.entrySet().iterator();
while (iterator.hasNext()) {
  Map.Entry<StripedDataStreamer, Exception> entry = iterator.next();
  StripedDataStreamer s = entry.getKey();
  Exception e = entry.getValue();
  handleStreamerFailure("flushInternal " + s, e, s);
  iterator.remove();
}
{code}
I think we could move this code to the position after the for loop, so the
code would look like:
{code}
for (int i = 0; i < healthyStreamerCount; i++) {
  try {
    executorCompletionService.take().get();
  } catch (InterruptedException ie) {
    throw DFSUtilClient.toInterruptedIOException(
        "Interrupted during waiting all streamer flush, ", ie);
  } catch (ExecutionException ee) {
    LOG.warn(
        "Caught ExecutionException while waiting all streamer flush, ", ee);
  }
}
Iterator<Map.Entry<StripedDataStreamer, Exception>> iterator =
    streamersExceptionMap.entrySet().iterator();
while (iterator.hasNext()) {
  Map.Entry<StripedDataStreamer, Exception> entry = iterator.next();
  StripedDataStreamer s = entry.getKey();
  Exception e = entry.getValue();
  handleStreamerFailure("flushInternal " + s, e, s);
  iterator.remove();
}
{code}
But to be safe, in the new 04 patch I put this handling code inside the for
loop, as a finally sub-case. Could you share your opinion? Thank you.
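For reference, here is a minimal standalone sketch of the pattern described above: one flush task per streamer is submitted to an {{ExecutorCompletionService}}, any per-streamer exception is recorded in a {{ConcurrentHashMap}} inside {{call()}}, and all recorded failures are handled after every task has completed. The {{Streamer}} class and the failure handling in {{main()}} are simplified stand-ins for illustration, not the actual HDFS classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelFlushSketch {
  // Simplified stand-in for StripedDataStreamer.
  static class Streamer {
    final int id;
    final boolean failing;
    Streamer(int id, boolean failing) { this.id = id; this.failing = failing; }
    void flushInternal() throws Exception {
      if (failing) throw new Exception("flush failed on streamer " + id);
    }
    @Override public String toString() { return "Streamer#" + id; }
  }

  // Flush all streamers in parallel; return the map of streamer -> exception
  // for any streamer whose flush failed.
  public static Map<Streamer, Exception> flushAll(Streamer[] streamers)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(streamers.length);
    ExecutorCompletionService<Void> ecs =
        new ExecutorCompletionService<>(pool);
    ConcurrentHashMap<Streamer, Exception> failures = new ConcurrentHashMap<>();
    for (Streamer s : streamers) {
      ecs.submit(() -> {
        try {
          s.flushInternal();
        } catch (Exception e) {
          // Record the streamer together with its exception so the caller
          // can invoke the failure handler with specific information later.
          failures.put(s, e);
        }
        return null;
      });
    }
    // Wait for every task to finish; exceptions were already captured
    // inside call(), so get() is not expected to throw here.
    for (int i = 0; i < streamers.length; i++) {
      try {
        ecs.take().get();
      } catch (ExecutionException ee) {
        // Unexpected: call() swallows its own exceptions.
      }
    }
    pool.shutdown();
    return failures;
  }

  public static void main(String[] args) throws InterruptedException {
    Streamer[] streamers = {
        new Streamer(0, false), new Streamer(1, true), new Streamer(2, false)};
    Map<Streamer, Exception> failures = flushAll(streamers);
    // Handle all recorded failures after every flush has completed.
    for (Map.Entry<Streamer, Exception> entry : failures.entrySet()) {
      System.out.println("handling failure: " + entry.getKey()
          + " -> " + entry.getValue().getMessage());
    }
  }
}
```

The key point of the design is that no task lets its exception propagate through {{Future#get()}}; each failure is associated with its streamer in the map, so the handling loop afterwards knows exactly which streamer failed and why.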
> Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
> --------------------------------------------------------------------
>
> Key: HDFS-9494
> URL: https://issues.apache.org/jira/browse/HDFS-9494
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: GAO Rui
> Assignee: GAO Rui
> Priority: Minor
> Attachments: HDFS-9494-origin-trunk.00.patch,
> HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch,
> HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch
>
>
> Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and
> wait for flushInternal( ) in sequence. So the runtime flow is like:
> {code}
> Streamer0#flushInternal( )
> Streamer0#waitForAckedSeqno( )
> Streamer1#flushInternal( )
> Streamer1#waitForAckedSeqno( )
> …
> Streamer8#flushInternal( )
> Streamer8#waitForAckedSeqno( )
> {code}
> It could be better to trigger flushInternal( ) on all the streamers, wait
> for all of them to return from waitForAckedSeqno( ), and then have
> flushAllInternals( ) return.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)