[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921541#comment-16921541 ] Eric Yang commented on HDDS-1554: - [~arp] Closer examination shows that:
{code}
mvn -T 1C clean install -DskipTests=true -Pdist -Dtar -DskipShade -Pit,docker-build -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT
{code}
This does not work because the skipTests flag is set.
{code}
mvn test -Pit -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT
{code}
This also does not work because the tests are bound to the integration-test phase; running only the test phase never triggers them. The proper command looks like either of the following:
{code}
mvn clean install -Pit,docker-build
mvn verify -Pit -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT
{code}
I hope this clarifies the usage of the Maven commands for these integration tests. If the commands are too cumbersome, we can remove the "it" profile. I would prefer to avoid the docker-build and docker.image parameters, but they are mandatory today because the dist module supports three ways of using docker images; hence it is necessary to specify from the top level which image to use.
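The phase binding described above can be sketched as a Maven profile wiring. This is a minimal illustrative fragment, not the actual Ozone pom: the plugin configuration and module layout are assumptions. The key point it shows is that failsafe's goals run in the integration-test/verify phases, so {{mvn test}} stops before they execute while {{mvn verify}} reaches them.

```xml
<!-- Hypothetical sketch of an "it" profile that binds tests named IT*.java
     to the integration-test phase via maven-failsafe-plugin. Because the
     goals run in the integration-test and verify phases, "mvn test" ends
     before they fire; "mvn verify" (or "mvn install") runs them. -->
<profile>
  <id>it</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-failsafe-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>integration-test</goal>
              <goal>verify</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```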
> Create disk tests for fault injection test
> ------------------------------------------
>
>                 Key: HDDS-1554
>                 URL: https://issues.apache.org/jira/browse/HDDS-1554
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: build
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDDS-1554.001.patch, HDDS-1554.002.patch, HDDS-1554.003.patch, HDDS-1554.004.patch, HDDS-1554.005.patch, HDDS-1554.006.patch, HDDS-1554.007.patch, HDDS-1554.008.patch, HDDS-1554.009.patch, HDDS-1554.010.patch, HDDS-1554.011.patch, HDDS-1554.012.patch, HDDS-1554.013.patch, HDDS-1554.014.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The current plan for fault injection disk tests are:
> # Scenario 1 - Read/Write test
> ## Run docker-compose to bring up a cluster
> ## Initialize scm and om
> ## Upload data to Ozone cluster
> ## Verify data is correct
> ## Shutdown cluster
> # Scenario 2 - Read/Only test
> ## Repeat Scenario 1
> ## Mount data disk as read only
> ## Try to write data to Ozone cluster
> ## Validate error message is correct
> ## Shutdown cluster
> # Scenario 3 - Corruption test
> ## Repeat Scenario 2
> ## Shutdown cluster
> ## Modify data disk data
> ## Restart cluster
> ## Validate error message for read from corrupted data
> ## Validate error message for write to corrupted volume

--
This message was sent by Atlassian Jira (v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919733#comment-16919733 ] Arpit Agarwal commented on HDDS-1554: - No luck. I used {{-Pit}} during both compilation and test; the tests were still skipped.
{code:java}
[INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ hadoop-ozone-read-write-tests ---
[INFO] Tests are skipped.
{code}
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918658#comment-16918658 ] Arpit Agarwal commented on HDDS-1554: - Hi [~eyang], I used the -Pit flag when running the test. Let me try adding it to the build command also.
{code:java}
mvn test -Pit -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT
{code}
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918264#comment-16918264 ] Eric Yang commented on HDDS-1554: - [~arp] The test is written to run by specifying the "it" profile:
{code}
mvn -T 1C clean install -DskipTests=true -Pdist -Dtar -DskipShade -Pit,docker-build -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT
{code}
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918037#comment-16918037 ] Arpit Agarwal commented on HDDS-1554: - Thanks for the updated patch [~eyang]. It looks like the tests are still getting skipped.
{code:java}
$ mvn -T 1C clean install -DskipTests=true -Pdist -Dtar -DskipShade -am -pl :hadoop-ozone-dist -Pdocker-build -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT
$ mvn test -Pit -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT
...
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-ozone-read-write-tests ---
[INFO] Compiling 2 source files to /Users/agarwal/src/hadoop/hadoop-ozone/fault-injection-test/disk-tests/read-write-test/target/test-classes
[INFO]
[INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ hadoop-ozone-read-write-tests ---
[INFO] Tests are skipped.
{code}
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904169#comment-16904169 ] Hadoop QA commented on HDDS-1554: -

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 32s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| 0 | shelldocs | 0m 0s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 23 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 38s | Maven dependency ordering for branch |
| +1 | mvninstall | 10m 5s | trunk passed |
| +1 | compile | 6m 23s | trunk passed |
| +1 | checkstyle | 1m 21s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 14m 17s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 59s | trunk passed |
| 0 | spotbugs | 7m 27s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 10m 55s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 | mvninstall | 9m 41s | the patch passed |
| +1 | compile | 6m 31s | the patch passed |
| +1 | javac | 6m 31s | the patch passed |
| +1 | checkstyle | 1m 25s | the patch passed |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | xml | 0m 14s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 17s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 46s | the patch passed |
| +1 | findbugs | 10m 48s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 5m 46s | hadoop-hdds in the patch passed. |
| -1 | unit | 41m 53s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 1m 16s | The patch does not generate ASF License warnings. |
| | | 144m 31s | |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904077#comment-16904077 ] Eric Yang commented on HDDS-1554: - [~arp] Thank you for the review.

{quote}ITDiskReadOnly#testReadOnlyDiskStartup - The following block of code can probably be removed, since it's really testing that the cluster is read-only in safe mode. We have unit tests for that: {quote}
Correct me if I am wrong, but the tests are not exactly the same. This test triggers validation from the Ozone client point of view, while the unit test TestVolumeSet#testFailedVolume is written for the server side. The smoke test covers the positive case, ensuring a volume can be created, but not the case where the disk is in read-only mode. I think there is value in testing the client-side response for better coverage. Thoughts?

{quote}ITDiskReadOnly#testUpload - do we need to wait for safe mode exit after restarting the cluster? Also I think this test is essentially the same as the previous one. {quote}
Safe mode validation is skipped here because Ozone exits on a read-only disk; the extra wait would only add formality. In reality, it would be better to keep the Ozone daemons running but place the file system in a safe or degraded mode that prevents write operations. This would be useful for disaster recovery, where a system admin may want to prevent further damage to the disk while still recovering data from Ozone buckets. This test is designed to pass under both the read-only-mode design and the exit-on-error design; both designs are valid. The test is more useful if the Ozone daemons do not exit on a read-only disk. I intend to add a download test to ITDiskReadOnly as well, if read-only mode can be implemented.

{quote}ITDiskCorruption#addCorruption:72 - looks like we have a hard-coded path. Should we get from configuration instead? {quote}
Thank you for the suggestion. I made an adjustment in patch 014 to ensure the Maven project build directory can be customized. The test uses ${buildDirectory}/data/meta to store metadata, where ${buildDirectory} defaults to Maven's ${project.build.directory}, and corrupts the data files there. Placing the data files in the Maven build directory is a good way to ensure that mvn clean resets their state cleanly. If this were configured externally, an external mechanism would have to be developed to reset the data file state.

{quote}ITDiskCorruption#testUpload - The corruption implementation is bit of a heavy hammer, it is replacing the content of all meta files. Is it possible to make it reflect real-world corruption where a part of the file may be corrupted. Also we should probably restart the cluster after corrupting RocksDB meta files. {quote}
If Ozone is restarted after metadata corruption, it falls into the same code path that fails to open RocksDB and fails to start. That would make the corruption upload test execute the same code path as ITDiskReadOnly#testReadOnlyDiskStartup, and the test would serve no purpose. The test purposefully corrupts the metadata files without a restart, to ensure a safety mechanism is built to protect metadata integrity. One possible design is a background thread that checks RocksDB health. In the test, we could shorten the check interval to near-immediate, to verify that uploads fail when metadata corruption occurs and that Ozone prevents further corruption by entering safe mode or degraded mode.

{quote}ITDiskCorruption#testDownload:161 - should we just remove the assertTrue since it is no-op? {quote}
The intent is to ensure an IOException is thrown for the test assertion to pass. It reads better written as:
{code:java}
Assert.assertTrue("Download File test passed.", e instanceof IOException);
{code}
Patch 014 also includes the improved assertTrue statements.
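The assertion pattern discussed above can be illustrated with a small self-contained sketch. The downloadFile helper and the class name here are hypothetical stand-ins, not the actual test code from the patch; the point is only to show catching the exception and asserting its type with a descriptive message.

```java
import java.io.IOException;

public class DownloadAssertionSketch {
    // Hypothetical stand-in for the real client call that fails when
    // reading from a corrupted volume.
    static void downloadFile() throws IOException {
        throw new IOException("simulated corrupted-volume read failure");
    }

    public static void main(String[] args) {
        Exception caught = null;
        try {
            downloadFile();
        } catch (Exception e) {
            caught = e;
        }
        // Assert the exception type explicitly, mirroring the improved
        // Assert.assertTrue("Download File test passed.", e instanceof IOException)
        // style from patch 014, so a wrong exception type fails loudly.
        if (!(caught instanceof IOException)) {
            throw new AssertionError("Download File test passed.");
        }
        System.out.println("IOException correctly raised: " + caught.getMessage());
    }
}
```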
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903443#comment-16903443 ] Arpit Agarwal commented on HDDS-1554: - Looking at the test case implementations:
# {{ITDiskReadOnly#testReadOnlyDiskStartup}} - The following block of code can be removed, since it's really testing that the cluster is read-only in safe mode. We have unit tests for that:
{code}
try {
  createVolumeAndBucket();
} catch (Exception e) {
  LOG.info("Bucket creation failed for read-only disk: ", e);
  Assert.assertTrue("Cluster is still in safe mode.", safeMode);
}
{code}
# {{ITDiskReadOnly#testUpload}} - do we need to wait for safe mode exit after restarting the cluster? Also I think this test is essentially the same as the previous one. Once we have ensured that a read-only disk forces us to remain in safe mode, the rest of the checks should be covered by safe-mode unit tests.

Still reviewing the rest.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903436#comment-16903436 ] Eric Yang commented on HDDS-1554: - [~arp] The tests are written to run in the integration-test phase; try:
{code}
mvn verify -Pit,docker-build
{code}
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903428#comment-16903428 ] Arpit Agarwal commented on HDDS-1554: - Sorry about the delay in getting back to these [~eyang]. I tried running the tests with the following command:
{code}
$ mvn test -Pit -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT
...
[INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ hadoop-ozone-read-write-tests ---
[INFO] Tests are skipped.
{code}
It looks like the tests were skipped. Any idea what I did wrong?
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884020#comment-16884020 ] Hadoop QA commented on HDDS-1554: -

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 50s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| 0 | shelldocs | 0m 1s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 23 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 3m 42s | Maven dependency ordering for branch |
| +1 | mvninstall | 10m 36s | trunk passed |
| +1 | compile | 4m 40s | trunk passed |
| +1 | checkstyle | 1m 34s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 15m 20s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 4s | trunk passed |
| 0 | spotbugs | 5m 22s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 8m 51s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 34s | Maven dependency ordering for patch |
| +1 | mvninstall | 8m 4s | the patch passed |
| +1 | compile | 4m 57s | the patch passed |
| +1 | javac | 4m 57s | the patch passed |
| +1 | checkstyle | 1m 36s | the patch passed |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 17s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 54s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 58s | the patch passed |
| +1 | findbugs | 9m 12s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 4m 56s | hadoop-hdds in the patch passed. |
| -1 | unit | 25m 58s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 1m 6s | The patch does not generate ASF License warnings. |
| | | 123m 56s | |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883920#comment-16883920 ] Eric Yang commented on HDDS-1554: - Patch 13 fixes check style and white space issues.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883499#comment-16883499 ] Hadoop QA commented on HDDS-1554: - (x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 40s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| 0 | shelldocs | 0m 0s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 23 new or modified test files. |
|| trunk Compile Tests ||
| 0 | mvndep | 3m 38s | Maven dependency ordering for branch |
| +1 | mvninstall | 10m 50s | trunk passed |
| +1 | compile | 4m 37s | trunk passed |
| +1 | checkstyle | 1m 32s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 14m 59s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 1s | trunk passed |
| 0 | spotbugs | 5m 22s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 8m 50s | trunk passed |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 28s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 38s | the patch passed |
| +1 | compile | 4m 53s | the patch passed |
| +1 | javac | 4m 53s | the patch passed |
| -0 | checkstyle | 0m 48s | hadoop-ozone: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| -1 | whitespace | 0m 0s | The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml | 0m 17s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 8s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 44s | the patch passed |
| +1 | findbugs | 8m 50s | the patch passed |
|| Other Tests ||
| +1 | unit | 5m 17s | hadoop-hdds in the patch passed. |
| -1 | unit | 33m 41s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 0m 51s | The patch does not generate ASF License warnings. |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883425#comment-16883425 ] Eric Yang commented on HDDS-1554: - Patch 12 addressed all comments to date. Most of the reusable code has been moved into a ClusterTester class with a set of primitive methods for reuse. [~arp] All yaml files have been cleaned up to use inheritance. [~elek] For H) there is no fix, because there is a 30-second wait between cluster restarts, which I think is sufficient to initialize scm metadata; hence no extra handling has been added. I think I addressed all previous concerns; let me know if I missed anything. Thanks
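The ClusterTester class itself is not shown in this thread; a minimal sketch of what a class of primitive docker-compose helpers might look like follows (class, method, and path names here are assumptions for illustration, not the actual patch code). Command construction is split from execution so it can be checked without a docker daemon:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a ClusterTester-style helper: small primitive
// methods wrapping docker-compose so individual tests stay short.
public class ClusterTester {
  private final String composeFile;

  public ClusterTester(String composeFile) {
    this.composeFile = composeFile;
  }

  // Builds the docker-compose command line; kept separate from execution
  // so the command can be verified without docker installed.
  List<String> composeCommand(String action) {
    return Arrays.asList("docker-compose", "-f", composeFile, action);
  }

  public void startCluster() throws Exception {
    run(composeCommand("up"));
  }

  public void stopCluster() throws Exception {
    run(composeCommand("down"));
  }

  private void run(List<String> cmd) throws Exception {
    ProcessBuilder pb = new ProcessBuilder(cmd);
    pb.redirectErrorStream(true);
    pb.inheritIO();
    pb.start().waitFor();
  }

  public static void main(String[] args) {
    ClusterTester tester = new ClusterTester("target/docker-compose.yaml");
    System.out.println(String.join(" ", tester.composeCommand("up")));
  }
}
```

A test would then only call startCluster()/stopCluster() around its scenario instead of repeating ProcessBuilder boilerplate.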
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882358#comment-16882358 ] Eric Yang commented on HDDS-1554: - [~elek] Thank you for the thorough review; here are my answers: {quote}There is no clear separation between the two areas, and it makes it impossible to run the same tests in any other environment. As an example: existing smoketests can be easily executed in kubernetes.{quote} The StartCluster/StopCluster methods only use docker-compose to start a distributed cluster at this time. They can be extended later to launch an Ozone cluster in a different environment, such as a YARN service, to keep the scope of the current feature set in check. The smoke test only runs in a closed environment with no exposure to the outside network: the Ozone cli client works inside a docker container, which prevents the tests from interacting with the Ozone cluster from an external network. Embedding the robot framework in the docker container poses another risk: test artifacts become inseparable from the container and cannot be removed later. Test artifacts make up 400+ MB of the current docker image, which is twice as large as necessary. Even if the test artifacts are separated later, we have no way to be sure the docker image functions properly without them, because the tests only work inside docker containers. The fault injection tests map the ports manually to the external network, so they can provide better coverage of docker networking with an external environment today. The smoke test could be modified to work with an external network, but that effectively doubles the installation of the test toolchain on the host system and doubles the shell script configuration templating needed to run the Robot framework on the host. Given that maven is already an established dev toolchain for Hadoop, and in the minimalist mindset of making the best use of that toolchain,
I chose to stay in the maven environment to let the existing toolchain do the style and error checking for these tests. I respect your approach of writing the smoke test in shell + robot framework, but the fault injection tests can perform more efficiently with the help of the Java toolchain, imho. {quote}Using `@FixMethodOrder(MethodSorters.NAME_ASCENDING)` is especially error-prone and shows that we use unit tests in a way which is different from the normal usage (usually the @Test methods should be independent){quote} There is nothing wrong with using this feature; it was implemented based on popular request. It is the developer who doesn't know how to use the feature correctly who creates problems for himself. I will group the entire flow into one test case since you don't like this annotation. {quote}B.) The current tests use external mounts to reuse the /data directory. It introduces new problems (see the workarounds with the id/UID). All of them can be avoided by using internal docker volumes, which makes it possible to achieve the same without additional workarounds (ITDiskReadOnly doesn't require direct access to the volume; ITDiskCorruption.addCorruption can be replaced with simple shell executions, `echo "asd" > ../db`){quote} There are two possible cases where data becomes read-only: B.1) the disk is mounted as read-only, or B.2) the data file is read-only. It would be nice to have distinct error messages to tell the system administrator either to adjust how the data disk is mounted, or that he made an error when copying data while servicing the server. The ITReadOnly test focuses on case B.1 and can be expanded to case B.2, if necessary. Using an internal path cannot clearly test case B.1. {quote}C.) The unit test files contain a lot of code duplication inside and outside the cluster. For example, ITDiskReadOnly.startCluster and ITDiskReadOnly.stopCluster are almost the same. The logic of waiting for the safe mode is duplicated in all of the subprojects.{quote} Good point, I will clean this up in the next patch. {quote}D.) I would use JUnit assertions everywhere. For me, combining java and junit assertions is confusing (ITDiskReadOnly.startCluster/stopCluster) E.) I would use Autoclosable try-catch blocks (or at least finally blocks) to close opened resources. It would help to define the boundaries where the resources are used.{quote} Setup procedures are not tests. It is common to throw IOException and InterruptedException from a JUnit setup method; this is the reason they are written this way. You are correct that an autoclosable block is a nicer way to close file system resources, and I will make the change. {quote}F.) We don't need to catch `Exceptions` and `fail` as it's already handled by the JUnit framework. Just remove the try and catch block.{quote} I like to be able to identify the starting point of the log. This is the reason that I use try, catch and additional string
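Point E above (use try-with-resources to bound resource lifetimes) can be illustrated with a plain AutoCloseable; FakeFileSystem below is a hypothetical stand-in for Hadoop's FileSystem, not code from the patch:

```java
// Minimal sketch of point E: try-with-resources guarantees close() runs
// when the block exits, so assertions placed after the block never touch
// a resource that is still open, and code inside never touches a closed one.
public class TryWithResourcesSketch {

  static class FakeFileSystem implements AutoCloseable {
    boolean closed = false;

    boolean exists(String path) {
      if (closed) {
        throw new IllegalStateException("filesystem already closed");
      }
      return true; // pretend the path exists
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Returns whether /test exists; the filesystem is closed automatically.
  static boolean checkExists() {
    try (FakeFileSystem fs = new FakeFileSystem()) {
      // All uses of fs happen inside the block, before close().
      return fs.exists("/test");
    }
  }

  public static void main(String[] args) {
    System.out.println(checkExists());
  }
}
```

The block makes the resource boundary explicit, which is exactly the objection raised against calling fs.exists() after fs.close() in ITDiskReadWrite.testStatFile.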
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882253#comment-16882253 ] Eric Yang commented on HDDS-1554: - Thank you for the review, [~arp]. {quote}I think the docker-compose.yaml files should wrap ${UID}:${GID} in quotes.{quote} Agreed; I found this is required for docker-compose on Mac, but not on Linux. I will make a correction for this. {quote}Another issue I ran into: Since the docker inline build with -Ddocker-build generates the image with ${user.name}/ozone, should we use the same in the docker-compose files instead of apache/ozone?{quote} The compose file uses docker.image to reference the image name, which allows the image name to be inherited from the top level. The only inconsistency is allowing two distinct defaults depending on whether the docker-build profile is activated. The concern was raised in HDDS-1667, but [~elek] does not review my patches unless I follow exactly what he specifies. Marton's argument was that Ozone Kubernetes development requires a distinct repository prefix to pull the docker image for the distributed environment during development. The fix could simply be supplying -Ddocker.image on the command line to customize the value. Instead, he insisted on using -Pdocker-build to activate the ${username}/ozone default value. This is the reason we need to pass -Pdocker-build for the fault injection tests even when we are not building a docker image. We can also run the fault injection tests with: {code}mvn clean verify -Pit -Ddocker.image=${user.name}/ozone:0.5.0-SNAPSHOT{code} I think these double defaults are not intuitive; a single default of apache/ozone:${project.version} would make the user experience much better, because it allows the user to develop without having to specify the docker image name. The SNAPSHOT string in the docker image tag is enough to determine whether it is a local image.
A Kubernetes developer can configure -Ddocker.image in settings.xml to customize the docker image, without making ${user.name}/ozone:0.5.0-SNAPSHOT mandatory and thereby forcing a -Pdocker-build or -Ddocker.image= flag on non-Kubernetes developers. Unless [~elek] agrees this change is required, it will be hard to clean up the messy maven code that was forced in by HDDS-1667. {quote}There is quite a bit of duplication of the configuration files and YAML files. Do you think there is a way to reduce the duplication?{quote} It is possible to clean up the duplication using docker compose inheritance. I will add that to my next patch. Thank you for the review.
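The settings.xml approach mentioned above might look like the following sketch (the profile id and registry name are made up for illustration; only the docker.image property name comes from the thread):

```xml
<!-- Hypothetical ~/.m2/settings.xml fragment: a Kubernetes developer sets a
     personal docker.image default so the build uses their own repository
     without extra command-line flags. -->
<settings>
  <profiles>
    <profile>
      <id>ozone-dev-image</id>
      <properties>
        <docker.image>myregistry.example.com/myuser/ozone:0.5.0-SNAPSHOT</docker.image>
      </properties>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>ozone-dev-image</activeProfile>
  </activeProfiles>
</settings>
```

With something like this in place, a plain `mvn verify -Pit` would pick up the custom image name, and apache/ozone:${project.version} could remain the single default for everyone else.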
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881950#comment-16881950 ] Elek, Marton commented on HDDS-1554: Thank you very much for working on this, Eric Yang. I am very interested in having some real, useful disk failure injection tests. First of all, let me clarify: I am not against this patch. It introduces a new way to execute tests but doesn't modify any of the existing parts of the build, therefore it's a good way to experiment with new approaches. But I would like to share why I think this approach introduces new problems (even if my advice is ignored). There are two main problems which should be solved when we create docker-based acceptance tests: 1. How to start / stop / restart docker-based pseudo-clusters (manage the environment) 2. How to execute the tests and do the assertions. In the existing smoketest framework, (1.) is handled by shell scripts and (2.) is handled by robotframework. In this approach, (1.) is handled by maven plugins and/or java code and (2.) is handled by java code. I have multiple problems with this approach: * There is no clear separation between the two areas, and it makes it impossible to run the same tests in any other environment. As an example: existing smoketests can be easily executed in kubernetes. * Personally, I don't think that java is the most effective way to maintain the execution of external applications.
For example, the following code can be replaced with one or two lines of shell code:

{code:java}
private synchronized void startCluster(String mode)
    throws IOException, InterruptedException {
  String relPath = getClass().getProtectionDomain().getCodeSource()
      .getLocation().getFile();
  File log = new File(relPath + "../test-dir/docker-compose-up.txt");
  String composeFile = relPath + "../../target/docker-compose.yaml";
  if (mode.equals(READ_ONLY)) {
    composeFile = relPath + "../../target/docker-compose-read-only.yaml";
  }
  ProcessBuilder pb = new ProcessBuilder("docker-compose", "-f",
      composeFile, "up");
  pb.redirectErrorStream(true);
  pb.redirectOutput(Redirect.appendTo(log));
  Process p = pb.start();
  assert pb.redirectInput() == Redirect.PIPE;
  assert pb.redirectOutput().file() == log;
  assert p.getInputStream().read() == -1;
  p.waitFor(30L, TimeUnit.SECONDS);
}
{code}

I think all the mentioned goals can be achieved with the existing smoketest framework (see my uploaded PR for an example). External volume management and asserting on exceptions can both be achieved. Using the existing smoketest approach would make maintenance of the tests easier, and a big part of the code introduced here (docker-compose up, waiting for safe mode, etc.) is already implemented there. Using `@FixMethodOrder(MethodSorters.NAME_ASCENDING)` is especially error-prone and shows that we use unit tests in a way which is different from the normal usage (usually the @Test methods should be independent). I know that Junit is a good hammer, but... B.) The current tests use external mounts to reuse the /data directory. It introduces new problems (see the workarounds with the id/UID). All of them can be avoided by using internal docker volumes, which makes it possible to achieve the same without additional workarounds (ITDiskReadOnly doesn't require direct access to the volume; ITDiskCorruption.addCorruption can be replaced with simple shell executions, `echo "asd" > ../db`) C.) The unit test files contain a lot of code duplication inside and outside the cluster. * For example, ITDiskReadOnly.startCluster and ITDiskReadOnly.stopCluster are almost the same. * The logic of waiting for the safe mode is duplicated in all of the subprojects. D.) I would use JUnit assertions everywhere. For me, combining java and junit assertions is confusing (ITDiskReadOnly.startCluster/stopCluster) E.) I would use Autoclosable try-catch blocks (or at least finally blocks) to close opened resources. It would help to define the boundaries where the resources are used. For example in `ITDiskReadWrite.testStatFile`:

{code:java}
FileSystem fs = FileSystem.get(ozoneConf);
...
fs.close();
Assert.assertTrue("Create file failed", fs.exists(dest));
{code}

fs seems to be used after the close. (try catch is used sometimes but not everywhere) F.) We don't need to catch `Exceptions` and `fail` as it's already handled by the JUnit framework. Just remove the try and catch block.

{code:java}
@Test
public void testUpload() {
  LOG.info("Performing upload test.");
  try {
    OzoneConfiguration ozoneConf = new OzoneConfiguration();
    FileSystem fs = FileSystem.get(ozoneConf);
    Path dest = new Path("/test");
    FSDataOutputStream out = fs.create(dest);
    byte[] b = "Hello world\n".getBytes();
    out.write(b);
    out.close();
    fs.close();
{code}
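Elek's claim that the startCluster logic reduces to a line or two of shell could be sketched as below. The paths mirror the Java snippet; since actually running it requires docker-compose, the sketch only builds and echoes the command, with the real invocation shown as a comment:

```shell
# Sketch: the ProcessBuilder-based startCluster collapses to roughly one line.
# COMPOSE_FILE defaults to the path used in the Java snippet (an assumption).
COMPOSE_FILE="${1:-target/docker-compose.yaml}"
CMD="docker-compose -f $COMPOSE_FILE up -d"
echo "$CMD"
# Real script would run: $CMD >> test-dir/docker-compose-up.txt 2>&1
```

The shell version also sidesteps the classpath-relative path computation (`getProtectionDomain().getCodeSource()`) entirely.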
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881619#comment-16881619 ] Arpit Agarwal commented on HDDS-1554: - Hi [~eyang], I tried running the tests with {{mvn verify -Pit}} under _hadoop-ozone/fault-injection-test/disk-tests_. It failed with the following error: {code} [ERROR] services.datanode1.user contains an invalid type, it should be a string [ERROR] services.datanode2.user contains an invalid type, it should be a string [ERROR] services.datanode3.user contains an invalid type, it should be a string [ERROR] services.om.user contains an invalid type, it should be a string [ERROR] services.scm.user contains an invalid type, it should be a string {code} I think the docker-compose.yaml files should wrap {{${UID}:${GID}}} in quotes. Another issue I ran into: Since the docker inline build with {{-Ddocker-build}} generates the image with {{${user.name}/ozone}}, should we use the same in the docker-compose files instead of {{apache/ozone}}? There is quite a bit of duplication of the configuration files and YAML files. Do you think there is a way to reduce the duplication? Still reviewing the java test cases. 
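The ${UID}:${GID} quoting fix Arpit suggests would look roughly like this in a compose file (the service name and other fields are illustrative, not the actual file; docker.image is presumably substituted by the maven build):

```yaml
# Illustrative docker-compose fragment: quoting "${UID}:${GID}" keeps
# docker-compose from rejecting services.*.user as a non-string value,
# which is the validation error reported above.
version: "3"
services:
  datanode1:
    image: "apache/ozone:0.5.0-SNAPSHOT"   # value assumed; normally filled from docker.image
    user: "${UID}:${GID}"                  # quoted, per the review comment
```

Unquoted, `user: ${UID}:${GID}` can be parsed by the YAML reader as something other than a string on some platforms, producing the "contains an invalid type, it should be a string" errors.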
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878239#comment-16878239 ] Hadoop QA commented on HDDS-1554: - (x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 41s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| 0 | shelldocs | 0m 0s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 31 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 16m 41s | trunk passed |
| +1 | compile | 4m 45s | trunk passed |
| +1 | checkstyle | 1m 31s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 15m 6s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 56s | trunk passed |
| 0 | spotbugs | 5m 33s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 9m 13s | trunk passed |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 23s | Maven dependency ordering for patch |
| +1 | mvninstall | 8m 18s | the patch passed |
| +1 | compile | 4m 57s | the patch passed |
| +1 | javac | 4m 57s | the patch passed |
| +1 | checkstyle | 1m 21s | the patch passed |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 14s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 30s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 54s | the patch passed |
| +1 | findbugs | 9m 30s | the patch passed |
|| Other Tests ||
| +1 | unit | 4m 40s | hadoop-hdds in the patch passed. |
| -1 | unit | 26m 10s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 1m 12s | The patch does not generate ASF License warnings. |
| | | 125m 11s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.TestStorageContainerManager |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877385#comment-16877385 ] Hadoop QA commented on HDDS-1554: -

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 51s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| 0 | shelldocs | 0m 1s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 9m 10s | trunk passed |
| +1 | compile | 4m 57s | trunk passed |
| +1 | checkstyle | 1m 26s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 14m 3s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 10s | trunk passed |
| 0 | spotbugs | 6m 24s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 10m 31s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 | mvninstall | 9m 25s | the patch passed |
| +1 | compile | 6m 35s | the patch passed |
| +1 | javac | 6m 35s | the patch passed |
| +1 | checkstyle | 1m 45s | the patch passed |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 17s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 13m 1s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 46s | the patch passed |
| +1 | findbugs | 8m 56s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 5m 47s | hadoop-hdds in the patch failed. |
| -1 | unit | 28m 54s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 1m 7s | The patch does not generate ASF License warnings. |
| | | 127m 32s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdds.scm.block.TestBlockManager |
| | hadoop.ozone.TestMiniOzoneCluster |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877340#comment-16877340 ] Eric Yang commented on HDDS-1554: - Patch 10 rebases to the current trunk. The current usage command is: {code} mvn clean verify -Pit,docker-build {code} It does not work without the docker-build profile because the default docker image apache/ozone:0.5.0-SNAPSHOT does not exist. Users can force a build of apache/ozone:0.5.0-SNAPSHOT with: {code} mvn clean verify -Ddocker.image=apache/ozone:0.5.0-SNAPSHOT -Pit,docker-build {code} [~elek] [~arp] I was unsuccessful in arguing for a better default docker image name for the docker-build profile in HDDS-1667. This is the reason the docker-build profile needs to be passed even when the user is not building a docker image. Patch 10 is meant to discuss whether we are open to eliminating the docker-build flag by defaulting docker.image to apache/ozone:${project.version}, because snapshots are most likely locally built images. In my view there is no point in further distinguishing between user and apache docker image name prefixes when the docker version tag already makes the distinction; I am not sure if you agree. The current test cases are the same as in the issue description, except that the read-only test does not fully initialize metadata. I will post a new version of the read-only test to ensure metadata initialization is done before marking the volume read-only.
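The proposed default could be sketched with a small wrapper script; this is purely illustrative (not part of the patch), and the VERSION variable and the rule "add docker-build only for -SNAPSHOT versions" are assumptions for discussion, not actual pom.xml behavior:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: derive docker.image from the project version so the
# flag need not be spelled out on every invocation. VERSION, the image name
# prefix, and the profile-selection rule are illustrative assumptions.
set -euo pipefail

VERSION="${VERSION:-0.5.0-SNAPSHOT}"
DOCKER_IMAGE="${DOCKER_IMAGE:-apache/ozone:${VERSION}}"

# Snapshots are most likely locally built images, so only then would the
# docker-build profile be added to build the image before the tests run.
PROFILES="it"
case "$VERSION" in
  *-SNAPSHOT) PROFILES="it,docker-build" ;;
esac

echo "mvn clean verify -P${PROFILES} -Ddocker.image=${DOCKER_IMAGE}"
```

With this convention, a plain invocation of the wrapper would be equivalent to the longer command above, and a release version would drop docker-build automatically.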
> Create disk tests for fault injection test > -- > > Key: HDDS-1554 > URL: https://issues.apache.org/jira/browse/HDDS-1554 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: pull-request-available > Attachments: HDDS-1554.001.patch, HDDS-1554.002.patch, > HDDS-1554.003.patch, HDDS-1554.004.patch, HDDS-1554.005.patch, > HDDS-1554.006.patch, HDDS-1554.007.patch, HDDS-1554.008.patch, > HDDS-1554.009.patch, HDDS-1554.010.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > The current plan for fault injection disk tests are: > # Scenario 1 - Read/Write test > ## Run docker-compose to bring up a cluster > ## Initialize scm and om > ## Upload data to Ozone cluster > ## Verify data is correct > ## Shutdown cluster > # Scenario 2 - Read/Only test > ## Repeat Scenario 1 > ## Mount data disk as read only > ## Try to write data to Ozone cluster > ## Validate error message is correct > ## Shutdown cluster > # Scenario 3 - Corruption test > ## Repeat Scenario 2 > ## Shutdown cluster > ## Modify data disk data > ## Restart cluster > ## Validate error message for read from corrupted data > ## Validate error message for write to corrupted volume -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872624#comment-16872624 ] Hadoop QA commented on HDDS-1554: -

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | shelldocs | 0m 0s | Shelldocs was not available. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 49s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 38s | trunk passed |
| +1 | compile | 4m 46s | trunk passed |
| +1 | checkstyle | 1m 23s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 13m 50s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 56s | trunk passed |
| 0 | spotbugs | 5m 59s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 9m 45s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 53s | the patch passed |
| +1 | compile | 4m 44s | the patch passed |
| +1 | javac | 4m 44s | the patch passed |
| +1 | checkstyle | 1m 29s | the patch passed |
| -1 | hadolint | 0m 1s | The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 13s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 50s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 57s | the patch passed |
| +1 | findbugs | 9m 42s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 4m 10s | hadoop-hdds in the patch passed. |
| -1 | unit | 18m 45s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 0m 58s |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872610#comment-16872610 ] Hadoop QA commented on HDDS-1554: -

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 26s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | shelldocs | 0m 0s | Shelldocs was not available. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 51s | Maven dependency ordering for branch |
| +1 | mvninstall | 9m 0s | trunk passed |
| +1 | compile | 4m 25s | trunk passed |
| +1 | checkstyle | 1m 22s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 14m 5s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 50s | trunk passed |
| 0 | spotbugs | 5m 26s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 9m 2s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 26s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 59s | the patch passed |
| +1 | compile | 4m 56s | the patch passed |
| +1 | javac | 4m 56s | the patch passed |
| -0 | checkstyle | 0m 46s | hadoop-ozone: The patch generated 12 new + 0 unchanged - 0 fixed = 12 total (was 0) |
| -1 | hadolint | 0m 1s | The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| -1 | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml | 0m 13s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 43s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 53s | the patch passed |
| +1 | findbugs | 9m 11s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 5m 6s | hadoop-hdds in the patch passed. |
| -1 | unit | 21m 51s |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871844#comment-16871844 ] Hadoop QA commented on HDDS-1554: -

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| 0 | shelldocs | 0m 0s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 8m 42s | trunk passed |
| +1 | compile | 4m 31s | trunk passed |
| +1 | checkstyle | 1m 21s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 14m 50s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 28s | trunk passed |
| 0 | spotbugs | 6m 44s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 11m 3s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 28s | Maven dependency ordering for patch |
| +1 | mvninstall | 9m 19s | the patch passed |
| +1 | compile | 5m 25s | the patch passed |
| +1 | javac | 5m 25s | the patch passed |
| -0 | checkstyle | 0m 46s | hadoop-ozone: The patch generated 12 new + 0 unchanged - 0 fixed = 12 total (was 0) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| -1 | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml | 0m 16s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 13m 28s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 12s | the patch passed |
| +1 | findbugs | 11m 23s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 4m 40s | hadoop-hdds in the patch failed. |
| -1 | unit | 26m 5s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 0m 58s | The patch does not generate ASF License warnings. |
| | | 124m 27s | |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869944#comment-16869944 ] Hadoop QA commented on HDDS-1554: -

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 39s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | shelldocs | 0m 1s | Shelldocs was not available. |
| 0 | yamllint | 0m 1s | yamllint was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 59s | trunk passed |
| +1 | compile | 4m 36s | trunk passed |
| +1 | checkstyle | 1m 26s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 12m 56s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 56s | trunk passed |
| 0 | spotbugs | 5m 18s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 8m 35s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 31s | Maven dependency ordering for patch |
| +1 | mvninstall | 8m 10s | the patch passed |
| +1 | compile | 4m 57s | the patch passed |
| +1 | javac | 4m 57s | the patch passed |
| +1 | checkstyle | 1m 34s | the patch passed |
| -1 | hadolint | 0m 1s | The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 15s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 48s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 35s | the patch passed |
| +1 | findbugs | 8m 54s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 4m 8s | hadoop-hdds in the patch passed. |
| -1 | unit | 18m 17s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 0m 57s |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869897#comment-16869897 ] Eric Yang commented on HDDS-1554: - Patch 6 fixes a logic error that [~elek] pointed out in the wait for safe mode, and reduces the maximum number of retries to 3 because there are now 10 retries built into ipc.Client.
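The bounded safe-mode wait described above could be sketched roughly as follows; this is a hedged illustration rather than the code in the patch, and the probe command is a placeholder supplied by the caller (the actual check used by the test harness is not shown in this comment):

```shell
#!/usr/bin/env bash
# Sketch of a bounded safe-mode wait: retry a probe command a small, fixed
# number of times. The probe is a placeholder; the real harness supplies its
# own check. Max retries stays low (3) because ipc.Client already retries
# 10 times internally.
wait_for_safe_mode_exit() {
  local probe="$1" max_retries="${2:-3}" delay="${3:-5}"
  local attempt
  for (( attempt = 1; attempt <= max_retries; attempt++ )); do
    if $probe; then
      return 0                    # probe succeeded: cluster left safe mode
    fi
    sleep "$delay"
  done
  echo "cluster still in safe mode after ${max_retries} attempts" >&2
  return 1
}
```

A caller would invoke it as, e.g., `wait_for_safe_mode_exit "<probe-command>" 3 5`, where the probe exits 0 once the cluster is out of safe mode.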
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868908#comment-16868908 ] Hadoop QA commented on HDDS-1554: -

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | shelldocs | 0m 0s | Shelldocs was not available. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 52s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 13s | trunk passed |
| +1 | compile | 4m 22s | trunk passed |
| +1 | checkstyle | 1m 20s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 14m 2s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 46s | trunk passed |
| 0 | spotbugs | 5m 23s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 8m 43s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 50s | the patch passed |
| +1 | compile | 4m 43s | the patch passed |
| +1 | javac | 4m 43s | the patch passed |
| +1 | checkstyle | 1m 26s | the patch passed |
| -1 | hadolint | 0m 2s | The patch generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | xml | 0m 13s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 28s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 45s | the patch passed |
| +1 | findbugs | 9m 13s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 4m 28s | hadoop-hdds in the patch passed. |
| -1 | unit | 19m 18s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 0m 53s |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868831#comment-16868831 ] Eric Yang commented on HDDS-1554: - Patch 005 fixes the hard-coded uid:gid issues and uses a read-only mount for /data. The disk tests will supply the -u flag to ensure the mount location does not create filesystem uid/gid inconsistency problems. Other smoke tests are also recommended to use the -u flag, to prevent containers from writing data owned by another user's uid/gid to the host-level filesystem. HDDS-1609 may be a good place to start applying the -u flag to tests outside of the fault-injection tests.
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868728#comment-16868728 ] Eric Yang commented on HDDS-1554: - [~elek] {quote}This code tries to check the safe mode. Actually we are not interested in the safe mode here, as SCM can't be started (or shouldn't be started) with a read-only directory. The other problem with this code fragment is that you assume the safe mode is true in case of any exception. In case of any exception you wait 60 seconds in the tests without checking what exactly the problem is.{quote} If the scmClient.inSafeMode API is written correctly, safeMode should never be set to false. The test case defaults safeMode to true even when SCM is offline. Unless SCM returns an improper value instead of throwing a connection error, safeMode should never be set to false. This is a good test to show that there is a problem with the scm client.
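The distinction being debated here (SCM unreachable vs. SCM genuinely reporting safe mode) can be kept separate in the retry loop itself. The sketch below is an illustration only: `SafeModeWait`, `Outcome`, and the `probe` callback are hypothetical stand-ins for the real `ScmClient.inSafeMode()` call, not Ozone's actual API.

```java
import java.util.concurrent.Callable;

public class SafeModeWait {

  /** Three distinct results instead of a single boolean. */
  public enum Outcome { EXITED_SAFE_MODE, STILL_IN_SAFE_MODE, UNREACHABLE }

  // probe.call() returns true while SCM reports safe mode, and throws when
  // SCM cannot be reached at all -- the two cases the review asks to separate.
  public static Outcome waitForSafeModeExit(Callable<Boolean> probe,
                                            int maxRetries,
                                            long sleepMillis)
      throws InterruptedException {
    boolean everConnected = false;
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
      try {
        boolean safeMode = probe.call();
        everConnected = true;              // we actually reached SCM
        if (!safeMode) {
          return Outcome.EXITED_SAFE_MODE; // cluster is up and writable
        }
      } catch (Exception e) {
        // Connection failure: recorded via everConnected instead of being
        // silently folded into "still in safe mode".
      }
      Thread.sleep(sleepMillis);
    }
    return everConnected ? Outcome.STILL_IN_SAFE_MODE : Outcome.UNREACHABLE;
  }
}
```

With this shape, a read-only test can assert `UNREACHABLE` or `STILL_IN_SAFE_MODE` explicitly, rather than asserting on a boolean that conflates both failure modes.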
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867858#comment-16867858 ] Eric Yang commented on HDDS-1554: - [~elek] Thank you for the review. Disk test development is stuck on a container and filesystem uid issue; until we have closure on HDDS-1609, some of the tests cannot be exercised. {quote}The other problem with this code fragment is that you assume the safe mode is true in case of any exception. In case of any exception you wait 60 seconds in the tests without checking what exactly the problem is.{quote} The current Ozone client throws an error without retrying. I have filed HDDS-1583 to make the Ozone client more robust; then we can refine testWaitForSafeMode in the read-only test. {quote}I think it's better to commit working tests one by one. Let's focus on the corruption-test, for now. As you requested I created a PR to show how is it possible to test it with the existing tools. (With a more simple way).{quote} Thank you for sharing your implementation. # I think it is risky to dump all tests into the dist project; it is a snowball that keeps growing. It would be nice to have the ability to selectively run test cases from the Maven CLI. # It becomes increasingly difficult to identify which compose file is used by a test and which one is meant for release, because all compose files are stored in the dist/src/main subdirectory. # The read-only test requires sudo privileges to change files to read-only. Allowing the hadoop user sudo inside the container is another security risk: it gives the container the ability to break out of its isolation. This is not ideal. # A full disk cannot be simulated because test result output is written inside the container.
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867452#comment-16867452 ] Elek, Marton commented on HDDS-1554: Thanks Eric for the answer. Can you please describe what the partial implementation covers? I can see a docker-compose file (hadoop-ozone/fault-injection-test/disk-tests/read-only-test/) but can't see any volume settings. hadoop-ozone/fault-injection-test/disk-tests/read-only-test/src/test/resources/compose/docker-compose.yaml
{code:java}
+  scm:
+    image: ${user.name}/ozone:${project.version}
+    ports:
+      - 9860:9860
+      - 9861:9861
+      - 9863:9863
+      - 9876:9876
+    env_file:
+      - ./docker-config
+    environment:
+      ENSURE_SCM_INITIALIZED: /data/metadata/scm/current/VERSION
+    command: ["/opt/hadoop/bin/ozone","scm"]{code}
I think you used the basic /data directory, which is writable: hadoop-ozone/fault-injection-test/disk-tests/read-only-test/src/test/resources/compose/docker-config
{code:java}
+OZONE-SITE.XML_ozone.metadata.dirs=/data/metadata{code}
But you try to check the safe mode: hadoop-ozone/fault-injection-test/disk-tests/read-only-test/src/test/java/org/apache/hadoop/ozone/ITDiskReadOnly.java
{code:java}
+  @Test
+  public void testWaitForSafeMode() throws InterruptedException {
+    LOG.info("Wait for cluster to exit safe mode...");
+    int retries = 1;
+    boolean safeMode = true;
+    ScmClient scmClient;
+    while (safeMode && retries <= MAX_RETRIES) {
+      try {
+        LOG.info("Connection attempt {} of 30.", retries);
+        scmClient = new SCMCLI().createScmClient();
+        safeMode = scmClient.inSafeMode();
+      } catch (Exception e) {
+        safeMode = true;
+      }
+      retries++;
+      Thread.sleep(2000);
+    }
+    Assert.assertFalse("Ozone cluster should not exit safe mode.", safeMode);
+  }{code}
This code tries to check the safe mode. Actually, we are not interested in the safe mode here, as SCM can't be started (or shouldn't be started) with a read-only directory. The other problem with this code fragment is that you assume the safe mode is true in case of any exception. In case of any exception you wait 60 seconds in the tests without checking what exactly the problem is. {quote}Some test cases can be refined when error detection is better implemented later. Does this work for you?{quote} I think it's better to commit working tests one by one. Let's focus on the corruption test for now. As you requested, I created a PR to show how it is possible to test it with the existing tools (in a simpler way). Feel free to check it: https://github.com/apache/hadoop/pull/990
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867067#comment-16867067 ] Eric Yang commented on HDDS-1554: - [~elek] A partial implementation is included for 1 and 2. New tests can be added in future tickets. This ticket sets up the basic test cases and build structure. Some test cases can be refined when error detection is better implemented later. Does this work for you?
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866874#comment-16866874 ] Elek, Marton commented on HDDS-1554: Oh, ok, thanks. Which of these are tested in the uploaded patch?
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866844#comment-16866844 ] Eric Yang commented on HDDS-1554: - [~elek] The read-only tests are supposed to cover the following: # If the disk is in a read-only state, the daemons shouldn't start, and should print a correct error message to the console stating that writing to disk is not possible. # If the file permissions are read-only, the daemon shouldn't start, and should print a correct error message to the console stating that metadata cannot be written. # If the disk becomes full while Ozone is running, no new data can be written. Ozone must report an IOException to write clients without crashing. # If the disk becomes full while Ozone is running, Ozone read clients must still work correctly to fetch data.
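The file-permission case above (test 2) can be exercised without a full cluster. The sketch below only illustrates the kind of assertion such a disk test would make, using a plain temporary directory in place of Ozone's metadata path; `ReadOnlyDirCheck` and its method names are hypothetical, not part of the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadOnlyDirCheck {

  // Returns true when a write into 'dir' fails with IOException -- the
  // condition the read-only disk test expects the daemons to surface.
  public static boolean writeIsRejected(Path dir) {
    try {
      Files.write(dir.resolve("probe.tmp"), new byte[]{1});
      return false;              // write unexpectedly succeeded
    } catch (IOException e) {
      return true;               // expected on a read-only directory
    }
  }

  // Simulates the read-only mount by clearing the write bit on a temp dir.
  public static Path makeReadOnlyDir() throws IOException {
    Path dir = Files.createTempDirectory("ro-test");
    dir.toFile().setWritable(false, false);
    return dir;
  }
}
```

Note that this permission-based simulation is bypassed when the process runs as root, which is exactly why the thread keeps coming back to the container -u flag.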
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866599#comment-16866599 ] Elek, Marton commented on HDDS-1554: bq. Feel free to rewrite the read-only and corruption tests in Robot framework, and submit the patch here. I would be happy to review the patches. OK. I am not sure about the read-only test (1), but the corruption test is converted to the Robot test framework in the PR. (1): AFAIK SCM can't be started on a read-only directory. If you can describe the exact scenario for the read-only test, I would be happy to implement it as well.
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861284#comment-16861284 ] Anu Engineer commented on HDDS-1554: At this point, Ozone uses MiniChaosCluster as the basic error path. Let us stick to that, since this provides no extra benefit. The original design discussions did talk about using a file system as an error-generating facade. I don't think this approach buys us anything.
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861267#comment-16861267 ] Eric Yang commented on HDDS-1554: - 3. {quote}You can help me with sharing more details about the future plans{quote} In this patch, I added a check expecting an IOException to be thrown when OM metadata has been corrupted. The code base does not currently throw IOException, but this could be one of the criteria for a fault injection test to pass. MiniOzoneCluster may not be able to produce some complex distributed test cases, for example: one of the datanodes receiving a constant SIGINT, then validating that replication happened correctly; or examining whether high-frequency polling of SCM metrics has any ill effect on Ozone cluster communication. Some scenarios can only be tested in a distributed environment, with Java type checking to ensure that the exception wrapping propagated from the server side to the client side is meaningful. Java tests may produce higher fidelity in the result sets. If you still feel this is redundant, and there are ways to accomplish the same in MiniOzoneCluster or Robot tests, feel free to rewrite the read-only and corruption tests in the Robot framework and submit the patch here. I would be happy to review the patches.
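The pass criterion described here — reads of corrupted data should surface as an IOException rather than silently returning bad bytes — can be sketched with a toy checksum check. The CRC32 pairing below is purely illustrative and is not Ozone's actual on-disk chunk format.

```java
import java.io.IOException;
import java.util.zip.CRC32;

public class CorruptionCheck {

  // Computes the checksum stored alongside the payload, the way a chunk
  // file might pair data with checksum metadata (illustrative layout only).
  public static long checksum(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data);
    return crc.getValue();
  }

  // Read path: verify the stored checksum and raise IOException on a
  // mismatch instead of handing corrupted bytes to the client.
  public static byte[] verifiedRead(byte[] stored, long expectedCrc)
      throws IOException {
    if (checksum(stored) != expectedCrc) {
      throw new IOException("Checksum mismatch: data is corrupted");
    }
    return stored;
  }
}
```

A corruption test in this style flips bytes in the stored data (as in Scenario 3 of the plan) and asserts that the read fails with IOException rather than succeeding.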
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861089#comment-16861089 ] Elek, Marton commented on HDDS-1554: bq. Yeah, if the user knows where to look. The Robot plugin shows the link in the build navigation, which is a much nicer way to access the HTML report than digging through the aggregated artifact link to find the log file. Sorry, I misunderstood. Until now you mentioned the problem about "color coding in the test summary report". This is a different problem and can also be solved: https://wiki.jenkins.io/display/JENKINS/HTML+Publisher+Plugin 2. HDDS-1583 can be caught by Robot framework tests, IMHO. It's not about the exception, it's about the environment (but I may be wrong). 3. My current problem is that I have only this patch as information, and you may have a more detailed plan or vision in mind. (You can help me by sharing more details about the future plans.) This patch executes very simple ozone shell commands and very high-level FileInputStream operations. I think both can be tested with existing Robot tests + MiniOzone tests with less trouble (for example, you can forget your volume permission problem). But I have no problem maintaining multiple types of tests IF the complexity is not increased, or if there are additional benefits to increasing the complexity.
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858957#comment-16858957 ] Hadoop QA commented on HDDS-1554: - (x) *-1 overall*
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 36s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| 0 | shelldocs | 0m 1s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 9m 22s | trunk passed |
| +1 | compile | 5m 31s | trunk passed |
| +1 | checkstyle | 1m 50s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 12m 45s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 52s | trunk passed |
| 0 | spotbugs | 5m 43s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 9m 12s | trunk passed |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 8m 53s | the patch passed |
| +1 | compile | 5m 54s | the patch passed |
| +1 | javac | 5m 54s | the patch passed |
| +1 | checkstyle | 1m 54s | the patch passed |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 14s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 42s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 20s | the patch passed |
| +1 | findbugs | 9m 32s | the patch passed |
|| Other Tests ||
| -1 | unit | 2m 53s | hadoop-hdds in the patch failed. |
| -1 | unit | 27m 57s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 1m 14s | The patch does not generate ASF License warnings. |
| | | 120m 28s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.common.impl.TestHddsDispatcher |
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858871#comment-16858871 ] Eric Yang commented on HDDS-1554: - I made a mistake with patch 003: it contains some Dockerfile modifications specific to my development environment. Revised patch 004 removes the environment-specific changes.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858868#comment-16858868 ] Eric Yang commented on HDDS-1554: - Patch 003 fixes checkstyle and whitespace issues. The failed unit tests do not appear to be related to this patch.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858026#comment-16858026 ] Hadoop QA commented on HDDS-1554: - (x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 16s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | yamllint | 0m 0s | yamllint was not available. |
| 0 | shelldocs | 0m 0s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 30 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 8m 44s | trunk passed |
| +1 | compile | 4m 57s | trunk passed |
| +1 | checkstyle | 1m 37s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 13m 45s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 6s | trunk passed |
| 0 | spotbugs | 5m 36s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 8m 50s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 | mvninstall | 8m 25s | the patch passed |
| +1 | compile | 5m 27s | the patch passed |
| +1 | javac | 5m 27s | the patch passed |
| -0 | checkstyle | 0m 54s | hadoop-ozone: The patch generated 27 new + 0 unchanged - 0 fixed = 27 total (was 0) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| -1 | whitespace | 0m 0s | The patch has 10 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml | 0m 13s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 40s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 22s | the patch passed |
| +1 | findbugs | 11m 3s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 4m 54s | hadoop-hdds in the patch failed. |
| -1 | unit | 25m 59s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 1m 0s | The patch does not generate ASF License warnings. |
| | | 121m 34s | |
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857101#comment-16857101 ] Eric Yang commented on HDDS-1554: - Patch 002 depends on HDDS-1458 patch 17. Will retrigger the precommit build after HDDS-1458 is committed.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857100#comment-16857100 ] Eric Yang commented on HDDS-1554: - Patch 002 sets up the docker-compose cluster as a distributed cluster with three datanodes, and adds corruption to the /data/metadata directory. The test case attempts to upload a file to Ozone after the Ozone Manager database has been corrupted, and captures the resulting IOException to ensure that the file operation fails. At this stage, Ozone does not check whether metadata is corrupted and continues to run with in-memory data. Hence, these test cases are expected to fail until more intelligence is added to Ozone to detect disk faults.
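The corruption step described above can be sketched in shell independently of Ozone. This is a minimal illustration of "modify data on disk, then verify the change is detectable"; the file name and byte offset are illustrative, not the actual test code:

```shell
# Simulate Scenario 3: corrupt an on-disk metadata file in place, then
# show the corruption is detectable by comparing checksums.
workdir=$(mktemp -d)
echo "ozone om metadata" > "$workdir/om.db"
before=$(cksum < "$workdir/om.db")
# Overwrite two bytes at offset 3 without truncating, mimicking silent disk damage
printf 'XX' | dd of="$workdir/om.db" bs=1 seek=3 conv=notrunc 2>/dev/null
after=$(cksum < "$workdir/om.db")
[ "$before" != "$after" ] && echo "corruption detected"
```

The real test does the analogous thing against the container's bind-mounted data directory between cluster shutdown and restart.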
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857099#comment-16857099 ] Hadoop QA commented on HDDS-1554: - (x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 13s | HDDS-1554 does not apply to trunk. Rebase required? Wrong branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDDS-1554 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12970986/HDDS-1554.002.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/2720/console |
| versions | git=2.7.4 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856857#comment-16856857 ] Eric Yang commented on HDDS-1554: - [~elek] {quote}It works.{quote} Yeah, if the user knows where to look. The Robot plugin shows the link in the build's navigation, which is a much nicer way to access the HTML report than digging through the aggregated artifact link to find the log file. {quote}I think we can agree that junit and a cli have exactly the same chance to access the internal state of a remote backend in the running container.{quote} This is only true if the CLI is a wrapper on top of Java programs. If a native program like curl is used to access a web port, the result contains a shortened server-side stack trace, because the server responds to the "Accept: text/html" header from curl and renders the stack trace as HTML. If a developer wants to preserve the full client-side and server-side stack traces, then calling a junit test captures more detail, because the Jersey client or RPC client does not ask the server to render the result in HTML. For example, HDDS-1583 was filed against the Ozone RPC client because a junit test was able to capture the client exception when the server is offline. The Robot Framework smoketest should have caught this, but didn't. Robot Framework is a general dispatcher framework, like the maven surefire plugin, that checks exit codes and stdin/stdout. It doesn't know the concept of Java exceptions, so writing white box tests in Java provides an advantage in exception handling and collects more detailed stack traces. My suggestion is to keep both. Robot Framework testing is good for black box tests, where QA can think outside of the box. White box testing is good for measuring whether the components are built properly according to specification.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856654#comment-16856654 ] Elek, Marton commented on HDDS-1554: - 1. Color coded robot framework tests executed with jenkins: https://ci.anzix.net/job/ozone-nightly/121/artifact/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/smoketest/result/log.html It works. 2. Sorry, I am lost. You wrote that a junit test is better because it can access the internal state: "For example, how to introspect that authentication is verified on the server side instead of Java client. Robot Framework can not tap into the JVM to give us the answer that we seek, but a junit test can." I think we can agree that junit and a cli have exactly the same chance to access the internal state of a remote backend in the running container. 3. Remote debug is an independent question. I agree it's very useful and I use it all the time. It can be used on any kind of docker-compose based pseudo cluster.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855940#comment-16855940 ] Eric Yang commented on HDDS-1554: - {quote}Based on my experience it can be executed in jenkins without any problem. I think it works very well without robot test plugin. It's enough to run robot tests in a dind image. Isn't it?{quote} No, it would be very difficult to look at the console output to determine which test case has failed, because FAIL is a common word in the output. Color coding in the test summary report is very useful for identifying the failed test case at a single glance. The Jenkins Robot Framework plugin also helps to organize build numbers and generated reports. {quote}Can you please explain how the junit test will do it if the backend runs in a separated container?{quote} Junit tests can be written to interact over RPC or HTTP to retrieve information from a backend that runs in a separate container. Exception wrapping can surface the server-side stack trace, providing a seamless experience without additional coding. A seasoned programmer may be interested in remote debugging to capture private variable states; remote debugger parameters can be passed as an environment variable in JAVA_OPTS for the docker container: {code}-Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y{code} This allows an IDE to connect to the containerized server for troubleshooting while the junit tests interact with the container. This arrangement can cover most end-to-end white box testing, with the power of the IDE to assist remote debugging.
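The debugger flags above can be wired into a docker-compose service. This is a hedged sketch only: the service name, the JAVA_OPTS pass-through convention, and the port mapping are assumptions for illustration, not the actual compose file:

```yaml
# docker-compose fragment exposing a JDWP port on the Ozone Manager container
# so an IDE can attach while junit tests run against the cluster.
services:
  om:
    image: apache/ozone:0.5.0-SNAPSHOT
    environment:
      # suspend=y holds the JVM until a debugger attaches on port 8000
      JAVA_OPTS: "-Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y"
    ports:
      - "8000:8000"
```

With this in place, a "Remote JVM Debug" run configuration pointed at localhost:8000 attaches to the container.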
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1687#comment-1687 ] Elek, Marton commented on HDDS-1554: - bq. Robot test framework based test cases doesn't converge toward using Apache Infra's Jenkins server Based on my experience it can be executed in jenkins without any problem. I think it works very well without the robot test plugin. It's enough to run robot tests in a dind image. Isn't it? bq. For example, how to introspect that authentication is verified on the server side instead of Java client. Robots framework can not tap into JVM to give us the answer that we seek, but a junit test can Can you please explain how the junit test will do it if the backend runs in a separated container?
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855130#comment-16855130 ] Eric Yang commented on HDDS-1554: - [~elek] {quote}Still I have mixed feelings. We already have a framework to execute docker-composed based tests (smoketest/acceptance). I am wondering if it would be easier to improve the existing approach to support this use case as well.{quote} Smoke tests exercise Ozone from a black box point of view. They can be developed quickly to cover a wide range of features, but the reporting is coarse-grained. There are language-specific barriers that Robot Framework cannot easily penetrate. For example, how to introspect that authentication is verified on the server side instead of the Java client? Robot Framework can not tap into the JVM to give us the answer that we seek, but a junit test can. Robot test framework based test cases don't converge toward using Apache Infra's Jenkins server. There are 33 outstanding bugs for the Jenkins Robot Framework Plugin, and half of them date back 5 to 10 years with no plan for resolution. It would be a lot of work with Apache Infra to get the Robot Framework plugin working on builds.apache.org; the time may be better spent writing junit tests. Fault injection testing is white box testing because it exercises Ozone libraries in parts and injects faults by making a volume disappear or become read-only, while interrogating the client JVM for answers. However, white box tests take longer to develop into a complete picture. I think both types of tests contribute differently to the collective end goals. There is no need to favor one type of test over another.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16854496#comment-16854496 ] Elek, Marton commented on HDDS-1554: - Thanks a lot, I appreciate your help. It's easier to imagine the vision/goal based on this list. Good feature set. Still, I have mixed feelings. We already have a framework to execute docker-compose based tests (smoketest/acceptance). I am wondering if it would be easier to improve the existing approach to support this use case as well. Introducing a third way to execute integration tests can be useful, but it also introduces additional complexity. And as I see in the patch, most of the time high level command executions (eg. OzoneShell...) are used; it seems to be very similar to what we already have.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850133#comment-16850133 ] Eric Yang commented on HDDS-1554: - [~elek] The approach of using docker for integration tests is heavily influenced by the decision to use docker-compose heavily in the existing blockade tests. It is better than mini cluster in some situations because: # Each process runs with an independent network address # Docker containers can run production code instead of a simulated cluster # Docker containers can isolate visibility of data disks # It will be possible to test SELinux and Hadoop compatibilities # It will be possible to test kernel security capabilities # It is easier to verify multi-user file permission related issues # It can simulate resource constrained environments (out of disk space)
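The disk-isolation point above is what makes the Scenario 2 (Read/Only) setup a one-line change in compose. A sketch, with the service and path names assumed for illustration rather than taken from the actual compose file:

```yaml
# docker-compose fragment mounting the datanode data disk read-only (":ro"),
# so every write to /data fails inside the container, simulating a disk
# that was incorrectly mounted read-only.
services:
  datanode:
    image: apache/ozone:0.5.0-SNAPSHOT
    volumes:
      - ./data:/data:ro
```

The test then attempts a write through the Ozone client and asserts on the resulting error, without touching the host's real mounts.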
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849411#comment-16849411 ] Elek, Marton commented on HDDS-1554: - Thank you [~eyang] for explaining it. To be honest it's not clear for me. I am not *against* using docker for fault tolerance testing; I am just trying to understand the motivation behind the decision. If I understood well, these are the current arguments: A. To use docker for fault injection tests: Pro: * It "provides better opportunity to create errors asynchronously." (?) B. To use java for fault injection tests with MiniOzoneCluster: Pro: * Can be easier to generate bogus read events, can be easier to run Con: * A slightly different approach with aspectj "was not fruitful" I am pretty sure that this is not the right classification, but this is created based on the mentioned *technical* problems. The only thing I suggested is to add more arguments to explain it.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849266#comment-16849266 ] Eric Yang commented on HDDS-1554: - 1 {quote}The new tests are missing from the distribution tar file (hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/tests/). We agreed to support the execution of all the new tests from the final tar.{quote} Yes, I remember that conversation, and I am not discounting that agreement. The code needs to be rewritten in Python and moved so it is built prior to the distribution project to achieve what we agreed on. What we would lose in the process: * The ability to accurately pinpoint where an exception occurs, because the Java stacktrace may not be captured by Python tests. * Alignment with the Maven lifecycle: integration tests are supposed to run after the package phase. We would be shipping test binaries in the release tarball that are irrelevant in production. * Time wasted packaging integration-test binaries into the release tarball. 2 {quote} I am not sure why we need the normal read/write test. All of the smoketests and integration-tests are testing this scenario{quote} The only difference between this version and the smoke test is that the client is not running in the same network as the docker containers. This has actually helped us catch a few bugs, such as SCMCLI client retries and a protobuf versioning problem. It also helps us test the case where the client JDK differs from the cluster JDK, and it provides a better testbed for what data injection into a containerized cluster looks like from external clients. 3 {quote}With the Read/Only test: I don't think that we need to support read-only disks. The only question is if the right exception is thrown. I think it also can be tested from MiniOzoneCluster / real unit tests in a more lightweight way.{quote} The read-only test prevents disk writes to simulate a misconfigured data directory, or a disk incorrectly mounted as read-only.
This injects faults into the normal workflow by changing a few docker parameters, and it is easy to clean up without leaving read-only debris in the build directory. This area needs more expansion: we can add test cases that focus on making the metadata disk or the datanode disk read-only, then measure whether the strained process has negative side effects on the cluster and check that replication proceeds correctly. 4 {quote}Anu Engineer suggested multiple times to do the disk failure injection on the java code level where more sophisticated tests can be added (eg. generate corrupt read with low probability with using specific Input/OutputStream). Can you please explain the design consideration to use docker images? Why is it better than the suggested solution?{quote} We have already done that with AspectJ in HDFS-435. The work was not fruitful and was [proposed for removal|https://issues.apache.org/jira/browse/HDFS-6819?focusedCommentId=15235595=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15235595]. The key point of fault injection is to catch exceptions that may not have been handled correctly. By randomly adding junk to a data file or changing files to read-only, the tests exercise the normal routines to generate exceptions that may not have been tested as fully. By using Docker mounted volumes, we can generate the faults outside of the normal Java code path. This provides a better opportunity to create errors asynchronously.
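The corruption injection described above can be sketched as a few shell commands run from outside the JVM, which is the point of driving faults at the docker-volume level. This is a minimal illustration, not the actual test harness; the chunk-file path below is hypothetical and the real Ozone on-disk layout differs:

```shell
# Sketch of the Scenario 3 corruption step (hypothetical path; the
# real Ozone chunk layout differs). Create a stand-in chunk file.
datadir=$(mktemp -d)
printf 'original chunk data' > "$datadir/chunk1"

# Overwrite the first 8 bytes in place with random junk; conv=notrunc
# keeps the file size unchanged, so only the leading bytes differ.
dd if=/dev/urandom of="$datadir/chunk1" bs=1 count=8 conv=notrunc 2>/dev/null

# The file keeps its original length; subsequent reads by the cluster
# would now exercise the corruption error-handling path.
wc -c < "$datadir/chunk1"
```

The read-only scenario works the same way from outside the Java code: remounting the data volume read-only (for example with docker's `:ro` volume flag) rejects writes at the filesystem level, which is why cleanup leaves no debris in the build directory.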
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848663#comment-16848663 ] Elek, Marton commented on HDDS-1554: Thank you very much for the patch [~eyang]. 1. The new tests are missing from the distribution tar file (hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/tests/). We agreed to support the execution of all the new tests from the final tar. 2. I am not sure why we need the normal read/write test. All of the smoketests and integration-tests already cover this scenario. 3. With the Read/Only test: I don't think that we need to support read-only disks. The only question is whether the right exception is thrown. I think it can also be tested from MiniOzoneCluster / real unit tests in a more lightweight way. 4. [~anu] suggested multiple times to do the disk failure injection at the java code level, where more sophisticated tests can be added (eg. generate a corrupt read with low probability using a specific Input/OutputStream). Can you please explain the design consideration to use docker images? Why is it better than the suggested solution?
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847927#comment-16847927 ] Eric Yang commented on HDDS-1554: - This patch requires HDDS-1458 patch 13 or newer.
[jira] [Commented] (HDDS-1554) Create disk tests for fault injection test
[ https://issues.apache.org/jira/browse/HDDS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847925#comment-16847925 ] Hadoop QA commented on HDDS-1554: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 11s{color} | {color:red} HDDS-1554 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDDS-1554 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969695/HDDS-1554.001.patch | | Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/2709/console | | versions | git=1.9.1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. 