[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-9079:
------------------------------
    Labels: hdfs-ec-3.0-nice-to-have  (was: )

> Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
>
>                 Key: HDFS-9079
>                 URL: https://issues.apache.org/jira/browse/HDFS-9079
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-9079.01.patch, HDFS-9079.02.patch, HDFS-9079.03.patch, HDFS-9079.04.patch, HDFS-9079.05.patch, HDFS-9079.06.patch, HDFS-9079.07.patch, HDFS-9079.08.patch, HDFS-9079.09.patch, HDFS-9079.10.patch, HDFS-9079.11.patch, HDFS-9079.12.patch, HDFS-9079.13.patch, HDFS-9079.14.patch, HDFS-9079.15.patch, HDFS-9079-HDFS-7285.00.patch
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) Updates block on NN
> {code}
> With multiple streamer threads running in parallel, we need to correctly handle a large number of possible combinations of interleaved thread events. For example, {{streamer_B}} starts step 2 in between events {{streamer_A.2}} and {{streamer_A.3}}.
> HDFS-9040 moves steps 1, 2, 3, 6 from streamer to {{DFSStripedOutputStream}}. This JIRA proposes some further optimizations based on HDFS-9040:
> # We can preallocate GS when NN creates a new striped block group ({{FSN#createNewBlock}}). For each new striped block group we can reserve {{NUM_PARITY_BLOCKS}} GS's. If more than {{NUM_PARITY_BLOCKS}} errors have happened we shouldn't try to further recover anyway.
> # We can use a dedicated event processor to offload the error handling logic from {{DFSStripedOutputStream}}, which is not a long-running daemon.
> # We can limit the lifespan of a streamer to be a single block. A streamer ends either after finishing the current block or when encountering a DN failure.
> With the proposed change, a {{StripedDataStreamer}}'s flow becomes:
> {code}
> 1) Finds DN error => 2) Notify coordinator (async, not waiting for response) => terminates
> 1) Finds external error => 2) Applies new GS to DN (createBlockOutputStream) => 3) Ack from DN => 4) Notify coordinator (async, not waiting for response)
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
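The first two proposals in the description (preallocated generation stamps and a dedicated event processor) can be sketched roughly as follows. This is not code from any of the attached patches: the class {{StripedCoordinatorSketch}}, its {{Event}} type, the method names, and the assumption that the reserved GS's form a contiguous range after the initial one are all hypothetical. Streamers post events without waiting for a response, and a single consumer applies them one at a time, moving to the next reserved GS per DN failure until more than {{NUM_PARITY_BLOCKS}} failures have occurred.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch; names and the contiguous-GS assumption are not from the patch. */
final class StripedCoordinatorSketch {
  enum EventType { DN_FAILURE, STREAMER_UPDATED }

  static final class Event {
    final EventType type;
    final int streamerIndex;

    Event(EventType type, int streamerIndex) {
      this.type = type;
      this.streamerIndex = streamerIndex;
    }
  }

  private final BlockingQueue<Event> events = new LinkedBlockingQueue<>();
  private final long initialGS;   // GS assigned when the block group was created
  private final int reservedGS;   // number of extra GS's reserved up front
  private int failures = 0;       // touched only by the single processing thread

  StripedCoordinatorSketch(long initialGS, int numParityBlocks) {
    this.initialGS = initialGS;
    this.reservedGS = numParityBlocks;
  }

  /** Called by streamers; enqueues the event and returns without waiting. */
  void post(Event e) {
    events.add(e);
  }

  /**
   * Drains queued events one at a time (intended for a single processor thread).
   * Returns the GS now in effect, or -1 if more failures occurred than GS's
   * were reserved, in which case no further recovery is attempted.
   */
  long processPending() {
    Event e;
    while ((e = events.poll()) != null) {
      if (e.type == EventType.DN_FAILURE) {
        failures++;
      }
      // A STREAMER_UPDATED event would be recorded here; omitted in this sketch.
    }
    return failures > reservedGS ? -1L : initialGS + failures;
  }
}
```

Because only one thread consumes the queue, the interleavings between streamers described above collapse into a single ordered sequence of events, and no NameNode round trip is needed to obtain a new GS.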
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Andrew Wang updated HDFS-9079:
------------------------------
    Issue Type: Improvement  (was: Sub-task)
        Parent:     (was: HDFS-8031)
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.15.patch

Updating the patch to harden the handling of expired tokens.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.14.patch

Updating the patch to fix a trivial NPE bug and the block token test failure.

The {{TestDFSStripedOutputStreamWithFailurexxx}} tests sometimes time out during setup and tearDown. We should think of a way to reduce this kind of false alarm. I turned off the random checking in {{TestBase}} and ran all tests locally; the only failures were timeouts in setup and tearDown.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.13.patch

The new patch passes all {{TestDFSStripedOutputStreamWithFailurexxx}} tests locally, even with the randomized factor turned off. Triggering Jenkins again.

Actually trunk is failing some of those tests, like [this one | https://builds.apache.org/job/PreCommit-HDFS-Build/13572/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure050/test0/].
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.12.patch

Updating the patch to address the NPE in some unit tests. The {{TestDFSStripedOutputStreamWithFailurexxx}} tests actually pass consistently on my local machine.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.11.patch

Updating the patch to fix the scenario where a streamer ends its block before having a chance to process the external error caused by another streamer. Now the randomized {{TestDFSStripedOutputStreamWithFailurexxx}} tests pass consistently on my local machine.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.09.patch

Fixed the test failures from the last Jenkins run by addressing the following corner cases:
# In some test cases a streamer doesn't have any bytes to write. The coordinator should properly handle the status of such streamers.
# {{setExternalError}} should wait until the streamer is in the {{DATA_STREAMING}} stage (i.e. {{blockStream}} is not null).
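The second corner case above can be illustrated with a small monitor-based sketch. It only shows the waiting pattern and is not the patch's implementation: {{StreamerStageSketch}}, its two-value {{Stage}} enum, and the methods {{enterDataStreaming}} and {{hasExternalError}} are hypothetical names introduced for illustration.

```java
/** Hypothetical sketch; only the DATA_STREAMING stage name comes from the comment above. */
final class StreamerStageSketch {
  enum Stage { PIPELINE_SETUP, DATA_STREAMING }

  private Stage stage = Stage.PIPELINE_SETUP;
  private boolean externalError = false;

  /** Called by the streamer thread once its block stream has been opened. */
  synchronized void enterDataStreaming() {
    stage = Stage.DATA_STREAMING;
    notifyAll();  // wake up any caller blocked in setExternalError()
  }

  /** Blocks the caller until the streamer is streaming data, then flags the error. */
  synchronized void setExternalError() throws InterruptedException {
    while (stage != Stage.DATA_STREAMING) {
      wait();  // releases the monitor while waiting
    }
    externalError = true;
  }

  synchronized boolean hasExternalError() {
    return externalError;
  }
}
```

The loop around {{wait()}} guards against spurious wakeups, so the flag is never set while the streamer is still setting up its pipeline.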
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.10.patch

Minor fix for test failures.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.08.patch

The new patch fixes a few bugs:
# DN failures can happen in 2 cases: i) while data is streaming; ii) in {{nextBlockOutputStream}}. Previous patches didn't issue a {{DN_FAILURE}} event for the second case.
# {{closeImpl}} should finish flushing empty packets before waiting for {{BlockMetadataCoordinator#getAllStreamersEndedBlock()}}.
# In order to limit the lifespan of all streamers, we need to send an empty packet to a streamer even if its internal block is empty:
{code}
-if (s.getBytesCurBlock() > 0) {
-  setCurrentPacketToEmpty();
-}
+setCurrentPacketToEmpty();
{code}

Identified some existing bugs: HDFS-9342, HDFS-9386

[~szetszwo], [~jingzhao], [~walter.k.su] mind taking another look? Thanks much.
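To illustrate why the third fix matters, here is a minimal sketch under stated assumptions. It models "all streamers ended the block" as a {{CountDownLatch}} that each streamer counts down when it receives the empty end-of-block packet. The class {{EndBlockBarrierSketch}} and its methods are hypothetical and are not the {{BlockMetadataCoordinator}} API. If the empty packet is sent only to streamers that wrote bytes, a streamer with an empty internal block never ends and the waiter times out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of waiting for every streamer to end its block. */
final class EndBlockBarrierSketch {
  private final CountDownLatch ended;

  EndBlockBarrierSketch(int numStreamers) {
    ended = new CountDownLatch(numStreamers);
  }

  /** Called when a streamer processes the empty end-of-block packet. */
  void streamerEndedBlock() {
    ended.countDown();
  }

  /** Returns true if all streamers ended their block within the timeout. */
  boolean awaitAllEnded(long timeoutMs) {
    try {
      return ended.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /**
   * Simulates closing a block group. When onlyIfNonEmpty is true, streamers
   * whose internal block has zero bytes never receive the empty packet.
   */
  static boolean simulate(long[] bytesPerStreamer, boolean onlyIfNonEmpty) {
    EndBlockBarrierSketch barrier = new EndBlockBarrierSketch(bytesPerStreamer.length);
    for (long bytes : bytesPerStreamer) {
      boolean sendEmptyPacket = !onlyIfNonEmpty || bytes > 0;
      if (sendEmptyPacket) {
        barrier.streamerEndedBlock();
      }
    }
    return barrier.awaitAllEnded(100);
  }
}
```

With one streamer holding an empty internal block, the conditional variant times out, while the unconditional variant completes.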
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.07.patch

The v06 patch miscalculates the size of the finally committed block: if a data block fails, we shouldn't deduct it from the size.

This is actually a bug in trunk. For non-EC files, we have:
{code}
  protected ExtendedBlock block; // its length is number of bytes acked
{code}
For EC files, the size of {{DFSStripedOutputStream#currentBlockGroup}} is incremented in {{writeChunk}} without waiting for an ack. I just filed HDFS-9342 to address the above.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.06.patch

Updating the patch with more Javadoc and addressing the test failures. In the 05 patch, {{streamerStatus}} was not properly initialized.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Description:
A non-striped DataStreamer goes through the following steps in error handling:
{code}
1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) Updates block on NN
{code}
With multiple streamer threads run in parallel, we need to correctly handle a large number of possible combinations of interleaved thread events. For example, {{streamer_B}} starts step 2 in between events {{streamer_A.2}} and {{streamer_A.3}}.

HDFS-9040 moves steps 1, 2, 3, 6 from streamer to {{DFSStripedOutputStream}}.

This JIRA proposes some further optimizations based on HDFS-9040:
# We can preallocate GS when NN creates a new striped block group ({{FSN#createNewBlock}}). For each new striped block group we can reserve {{NUM_PARITY_BLOCKS}} GS's. If more than {{NUM_PARITY_BLOCKS}} errors have happened we shouldn't try to further recover anyway.
# We can use a dedicated event processor to offload the error handling logic from {{DFSStripedOutputStream}}, which is not a long running daemon.
# We can limit the lifespan of a streamer to be a single block. A streamer ends either after finishing the current block or when encountering a DN failure.

With the proposed change, a {{StripedDataStreamer}}'s flow becomes:
{code}
1) Finds DN error => 2) Notify coordinator (async, not waiting for response) => terminates
1) Finds external error => 2) Applies new GS to DN (createBlockOutputStream) => 3) Ack from DN => 4) Notify coordinator (async, not waiting for response)
{code}

  was:
A non-striped DataStreamer goes through the following steps in error handling:
{code}
1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) Updates block on NN
{code}
To simplify the above we can preallocate GS when NN creates a new striped block group ({{FSN#createNewBlock}}). For each new striped block group we can reserve {{NUM_PARITY_BLOCKS}} GS's. Then steps 1~3 in the above sequence can be saved. If more than {{NUM_PARITY_BLOCKS}} errors have happened we shouldn't try to further recover anyway.
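The preallocation idea in the updated description can be illustrated with a small sketch. This is not code from any of the attached patches: the class name {{PreallocatedGenStamps}}, the contiguous layout of the reserved stamps, and the value 3 for {{NUM_PARITY_BLOCKS}} are all assumptions made for illustration.

```java
import java.util.NoSuchElementException;

/** Hypothetical sketch: the generation stamps reserved for one striped block group. */
final class PreallocatedGenStamps {
  // Assumption: 3 parity blocks, as in a 6+3 Reed-Solomon layout.
  static final int NUM_PARITY_BLOCKS = 3;

  // Assumption: the reserved stamps directly follow the initial stamp.
  private final long initialGS;
  private int used = 0;

  PreallocatedGenStamps(long initialGS) {
    this.initialGS = initialGS;
  }

  /** Current GS: the initial stamp plus the number of bumps consumed so far. */
  long current() {
    return initialGS + used;
  }

  /** True while at least one reserved stamp is left. */
  boolean hasNext() {
    return used < NUM_PARITY_BLOCKS;
  }

  /** Hands out the next reserved GS locally, without asking the NameNode. */
  long next() {
    if (!hasNext()) {
      // More failures than parity blocks: recovery is not attempted.
      throw new NoSuchElementException("reserved generation stamps exhausted");
    }
    used++;
    return current();
  }

  public static void main(String[] args) {
    PreallocatedGenStamps gs = new PreallocatedGenStamps(1000L);
    while (gs.hasNext()) {
      System.out.println("bumped GS to " + gs.next());
    }
  }
}
```

Under these assumptions, a failure is handled by calling {{next()}} on the client side, which stands in for steps 2 and 3 of the non-striped flow.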
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Attachment: HDFS-9079.05.patch

Many thanks to Walter for the review! Attaching a new patch to address the comments.
# Reverted the change to shouldStop(). I realized the coordinator doesn't need to {{setExternalError}} at all. An external error is defined by the mismatch between the local and coordinator genStamps:
{code}
@Override
protected boolean processDatanodeOrExternalError() throws IOException {
  // A proposed GS newer than the local block's GS signals an external error.
  if (coordinator.getProposedGenStamp() >
      getLocatedBlock().getBlock().getGenerationStamp()) {
    setExternalError();
  }
  return super.processDatanodeOrExternalError();
}
{code}
We can further eliminate the external error altogether. That can be a separate JIRA.
# bq. Thread coordinator leaks
Good point. Addressed.
# bq. The 2 healthy check duplicates,
Good suggestion. It's not actually a duplicate health check. Rather, it avoids sending a failure event if the coordinator already knows about the failure. I refactored the code.
# bq. You insert a empty closed streamer. Why not just use the new concept StreamerStatus.NULL
I tried leaving the element in {{streamers}} as null instead of a closed streamer. The issue is that many places assume the {{DFSOutputStream#streamer}} is long running. We would need to update every caller of {{getStreamer()}} to make it work, so I'm leaving that part as is.
# bq. we can use coordinator to postpone setExternalError() to address #1
See point #1 in my comment above. It's already postponed.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Attachment: HDFS-9079.04.patch

The only locally reproducible failure is {{testDatanodeFailureRandomLength}}. Adding a fix to clear {{currentPackets}} when allocating new blocks.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Attachment: HDFS-9079.03.patch

Updating the patch to fix all reported test failures. Main change is to add logic to get new block token when current one expires. I'm currently working on adding Javadocs and fixing exception handling.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Attachment: HDFS-9079.02.patch

Updating the patch to fix a few bugs:
# {{DataStreamer#shouldStop}} should check internal error instead of all errors. Otherwise there could be an infinite loop if a streamer is assigned an external error before initializing its own {{blockStream}}.
# {{StripedDataStreamer}} should rely on comparing its own GS with the coordinator version to determine whether there's an external error. Otherwise there's a bug when processing multiple external errors: {{setExternalError}} is called multiple times but all the external errors are cleared in one {{setupPipelineInternal}} call.

Still working on the token issue.
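A toy model may help show why the GS comparison in the second fix tolerates several external errors in a row. The class and method names below ({{GenStampCatchUp}}, {{bumpProposedGenStamp}}, {{setupPipeline}}) are hypothetical and only loosely mirror the patch; the sketch is single-threaded and leaves out all real pipeline work.

```java
/** Hypothetical sketch: external errors detected by comparing generation stamps. */
final class GenStampCatchUp {
  private long coordinatorGS; // GS currently proposed by the coordinator
  private long localGS;       // GS this streamer last applied to its DataNode

  GenStampCatchUp(long initialGS) {
    this.coordinatorGS = initialGS;
    this.localGS = initialGS;
  }

  /** Coordinator side: another streamer failed, so the proposed GS moves forward. */
  void bumpProposedGenStamp() {
    coordinatorGS++;
  }

  /** Streamer side: an external error exists while the local GS lags behind. */
  boolean hasExternalError() {
    return coordinatorGS > localGS;
  }

  /** Streamer side: one pipeline setup applies the newest proposed GS. */
  void setupPipeline() {
    localGS = coordinatorGS;
  }

  long getLocalGS() {
    return localGS;
  }

  public static void main(String[] args) {
    GenStampCatchUp streamer = new GenStampCatchUp(1000L);
    // Two failures elsewhere arrive before this streamer reacts.
    streamer.bumpProposedGenStamp();
    streamer.bumpProposedGenStamp();
    System.out.println("external error pending: " + streamer.hasExternalError());
    // A single setup catches up with both bumps at once.
    streamer.setupPipeline();
    System.out.println("external error pending: " + streamer.hasExternalError());
  }
}
```

Because the check is derived from two numbers rather than a flag set once per event, no bookkeeping is needed to match each notification with a later clear.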
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Attachment: HDFS-9079.01.patch

Rebased the patch on top of HDFS-9040 and made it more complete. It's still a WIP. The main logic is:
# As described [above | https://issues.apache.org/jira/browse/HDFS-9079?focusedCommentId=14905503&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14905503], redesigned the coordinator to be an event processing daemon.
# Limiting the lifespan of {{StripedDataStreamer}} to a single block. This is to simplify the logic.
# Preallocating GS by groups, so GS bumping can be processed locally by the coordinator.

I also modified {{TestWriteStripedFileWithFailure}} to be a minimal error handling test: it writes a small file (< 1 block), and there's 1 failure during the write. The patch passes the test and the sequence of events is as expected. I'm now working on more complex tests including {{TestDFSStripedOutputStreamWithFailure}}.
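The combination of points 1 and 3 above (an event-processing coordinator that bumps GS locally) might look roughly like the following. This is a sketch under stated assumptions, not the patch: {{CoordinatorSketch}}, its {{Event}} and {{Type}} types, the {{submit}} method, and the {{SHUTDOWN}} marker are invented for illustration, and the reserved stamps are assumed to be contiguous.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: a coordinator that serializes events from streamers. */
final class CoordinatorSketch implements Runnable {
  enum Type { DN_FAILURE, GS_APPLIED, SHUTDOWN }

  static final class Event {
    final Type type;
    final int streamerIndex; // which streamer reported the event
    Event(Type type, int streamerIndex) {
      this.type = type;
      this.streamerIndex = streamerIndex;
    }
  }

  private final BlockingQueue<Event> events = new LinkedBlockingQueue<>();
  private final long maxGS;         // last preallocated generation stamp
  private volatile long proposedGS; // written only by the coordinator thread

  CoordinatorSketch(long initialGS, int reserved) {
    this.proposedGS = initialGS;
    this.maxGS = initialGS + reserved;
  }

  /** Called by streamers; enqueues and returns without waiting for a response. */
  void submit(Event e) {
    events.add(e);
  }

  long getProposedGenStamp() {
    return proposedGS;
  }

  /** Processes events one at a time, so updates never interleave. */
  @Override
  public void run() {
    try {
      while (true) {
        Event e = events.take();
        if (e.type == Type.SHUTDOWN) {
          return;
        }
        // Bump from the preallocated range; past the range, no recovery is tried.
        if (e.type == Type.DN_FAILURE && proposedGS < maxGS) {
          proposedGS++;
        }
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    CoordinatorSketch coordinator = new CoordinatorSketch(1000L, 3);
    Thread t = new Thread(coordinator, "coordinator");
    t.setDaemon(true);
    t.start();
    coordinator.submit(new Event(Type.DN_FAILURE, 2));
    coordinator.submit(new Event(Type.DN_FAILURE, 5));
    coordinator.submit(new Event(Type.SHUTDOWN, -1));
    t.join();
    System.out.println("proposed GS = " + coordinator.getProposedGenStamp());
  }
}
```

Funnelling every event through one queue and one consumer thread is what removes the interleaving between streamers; the streamers only enqueue and move on.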
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Component/s: erasure-coding
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Attachment:     (was: HDFS-9079.00.patch)
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Attachment: HDFS-9079-HDFS-7285.00.patch

[~walter.k.su] I named the patch wrongly. It is based on HDFS-7285 branch. As mentioned above it contains part of HDFS-9040. The current patch is just to illustrate the overall idea. More details will be added.
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Summary: Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers  (was: Erasure coding: preallocate multiple generation stamps when creating striped blocks)
[jira] [Updated] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9079:
    Attachment: HDFS-9079.00.patch

This is a WIP patch to explore the idea of serializing *different types* of update events from {{StripedDataStreamer}}.

The current implementation creates new tools ({{MultipleBlockingQueue}} and {{ConcurrentPoll}}) to serialize all events of the same type. For example, all {{updateBlockForPipeline}} calls will be synchronized. But as we discussed under HDFS-9040, it is still possible to have interleaved updates from different types of events. [~jingzhao]'s patch there does a great job of simplifying the logic by concentrating most updates at the {{DFSStripedOutputStream}} level.

I'm uploading a patch based on Jing and Walter's work under that JIRA. My earlier [comment | https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741972&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741972] describes part of the algorithm in the patch. It has the following potential benefits:
# I think Walter made a similar argument when creating his first WIP patch on HDFS-9040: {{DFSOutputStream}} is not a long-running daemon, but a one-off thread for specific calls, like {{writeChunk}} and {{close}}. It's not easy to insert the logic to periodically check streamers. {{BlockMetadataCoordinator}} in this patch is a little similar to {{BlockGroupStreamer}} in Walter's patch. It is a daemon to process incoming updates. This avoids the need to wait for the next {{DFSOutputStream}} call to process an update.
# This patch also limits the lifetime of a {{StripedDataStreamer}} to a single block. At {{endBlock}} the streamer will close itself.

A few TODOs that I will add in the next rev:
# I need to add the logic of replacing all streamers at {{writeChunk}}, similar to {{replaceFailedStreamers}} in Jing's patch.
# Need to bump the GS of all {{FINISHED}} streamers to the maximum of the preallocation. When all streamers are either {{FINISHED}} or {{FAILED}}, need to update the NN.
# Need to handle the block token problem Walter pointed out.
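The second TODO above can be read as a simple completion rule, sketched below. Everything here is illustrative: {{BlockGroupCompletion}}, the {{RUNNING}} status, and both helper methods are assumptions, and the maximum preallocated GS is assumed to be the initial GS plus the number of reserved stamps.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: deciding when a block group can be reported to the NameNode. */
final class BlockGroupCompletion {
  // FINISHED and FAILED come from the comment above; RUNNING is an assumed third state.
  enum StreamerStatus { RUNNING, FINISHED, FAILED }

  /** The NameNode update is due once no streamer is still running. */
  static boolean readyToUpdateNameNode(List<StreamerStatus> statuses) {
    for (StreamerStatus s : statuses) {
      if (s == StreamerStatus.RUNNING) {
        return false;
      }
    }
    return true;
  }

  /** GS that FINISHED streamers are bumped to: the end of the reserved range. */
  static long finalGenStamp(long initialGS, int reserved) {
    return initialGS + reserved;
  }

  public static void main(String[] args) {
    List<StreamerStatus> statuses = Arrays.asList(
        StreamerStatus.FINISHED, StreamerStatus.FAILED, StreamerStatus.FINISHED);
    if (readyToUpdateNameNode(statuses)) {
      System.out.println("update NN with GS " + finalGenStamp(1000L, 3));
    }
  }
}
```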