[jira] [Comment Edited] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-08-17 Thread George Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130850#comment-16130850
 ] 

George Huang edited comment on HDFS-11912 at 8/17/17 5:42 PM:
--

The test had a 10-minute timeout. However, it took more than 10 minutes to create 
5000 test files in HDFS:

2017-08-17 03:12:54,076 [Thread-126] INFO  common.Storage 
(Storage.java:tryLock(847)) - Lock on /testptch/hadoop/hadoop-hdf
...[truncated 9653305 chars]...
wed=true   ugi=jenkins (auth:SIMPLE)   ip=/127.0.0.1   cmd=create  
src=/WITNESSDIR/1720/1719/1718/1717/1716/1715/1714/1713/file1720
dst=null   perm=jenkins:supergroup:rw-r--r--   proto=rpc
2017-08-17 03:22:52,582 [IPC Server handler 6 on 40751] INFO  hdfs.StateChange 
(FSDirWriteFileOp.java:logAllocatedBlock(787)) - BLOCK* allocate 
blk_1073745266_4442, replicas=127.0.0.1:43960 for 
/WITNESSDIR/1720/1719/1718/1717/1716/1715/1714/1713/file1720
2017-08-17 03:22:52,583 [DataXceiver for client 
DFSClient_NONMAPREDUCE_1460336802_1 at /127.0.0.1:53834 [Receiving block 
BP-233349655-172.17.0.2-1502939568997:blk_1073745266_4442]] INFO  
datanode.DataNode (DataXceiver.java:writeBlock(742)) - Receiving 
BP-233349655-172.17.0.2-1502939568997:blk_1073745266_4442 src: /127.0.0.1:53834 
dest: /127.0.0.1:43960
2017-08-17 03:22:52,586 [PacketResponder: 
BP-233349655-172.17.0.2-1502939568997:blk_1073745266_4442, 
type=LAST_IN_PIPELINE] INFO  DataNode.clienttrace 
(BlockReceiver.java:finalizeBlock(1523)) - src: /127.0.0.1:53834, dest: 
/127.0.0.1:43960, bytes: 69, op: HDFS_WRITE, cliID: 
DFSClient_NONMAPREDUCE_1460336802_1, offset: 0, srvID: 
ae9a2151-cb92-461b-8d73-cd9641184228, blockid: 
BP-233349655-172.17.0.2-1502939568997:blk_1073745266_4442, duration(ns): 1413179
2017-08-17 03:22:52,586 [PacketResponder: 
BP-233349655-172.17.0.2-1502939568997:blk_1073745266_4442, 
type=LAST_IN_PIPELINE] INFO  datanode.DataNode (BlockReceiver.java:run(1496)) - 
PacketResponder: BP-233349655-172.17.0.2-1502939568997:blk_1073745266_4442, 
type=LAST_IN_PIPELINE terminating
2017-08-17 03:22:52,590 [IPC Server handler 4 on 40751] INFO  hdfs.StateChange 
(FSNamesystem.java:completeFile(2755)) - DIR* completeFile: 
/WITNESSDIR/1720/1719/1718/1717/1716/1715/1714/1713/file1720 is closed by 
DFSClient_NONMAPREDUCE_1460336802_1
2017-08-17 03:22:52,600 [IPC Server handler 5 on 40751] INFO  
FSNamesystem.audit (FSNamesystem.java:logAuditMessage(7512)) - allowed=true 
  ugi=jenkins (auth:SIMPLE)   ip=/127.0.0.1   cmd=getfileinfo 
src=/WITNESSDIR/1720/1719/1718/1717/1716/1715/1714/1713/file1720
dst=null   perm=null   proto=rpc
2017-08-17 03:22:52,601 [Thread-173] INFO  snapshot.TestRandomOpsWithSnapshots 
(TestRandomOpsWithSnapshots.java:createFiles(634)) - createFiles, file:


was (Author: ghuangups):
The test had a 10-minute timeout. However, setup-related operations took almost 
10 minutes and left no time for the test itself to finish. The run starts at 
around 03:12 but only reaches the actual test at around 03:22. :(

2017-08-17 03:12:47,798 [main] INFO  hdfs.MiniDFSCluster 
(MiniDFSCluster.java:<init>(469)) - starting cluster: numNameNodes=1, 
numDataNodes=3
Formatting using clusterid: testClusterID
:
:
2017-08-17 03:12:54,076 [Thread-126] INFO  common.Storage 
(Storage.java:tryLock(847)) - Lock on /testptch/hadoop/hadoop-hdf
...[truncated 9653305 chars]...
wed=true   ugi=jenkins (auth:SIMPLE)   ip=/127.0.0.1   cmd=create  
src=/WITNESSDIR/1720/1719/1718/1717/1716/1715/1714/1713/file1720
dst=null   perm=jenkins:supergroup:rw-r--r--   proto=rpc
2017-08-17 03:22:52,582 [IPC Server handler 6 on 40751] INFO  hdfs.StateChange 
:
:



> Add a snapshot unit test with randomized file IO operations
> ---
>
> Key: HDFS-11912
> URL: https://issues.apache.org/jira/browse/HDFS-11912
> Project: Hadoop HDFS
> Issue Type: Test
> Components: hdfs
> Reporter: George Huang
> Assignee: George Huang
> Priority: Minor
> Labels: TestGap
> Attachments: HDFS-11912.001.patch, HDFS-11912.002.patch, 
> HDFS-11912.003.patch, HDFS-11912.004.patch
>
>
> Adding a snapshot unit test with randomized file IO operations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-08-16 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16129881#comment-16129881
 ] 

Manoj Govindassamy edited comment on HDFS-11912 at 8/17/17 4:18 AM:


[~ghuangups],
Thanks much for working on the patch revision. Checkstyle issues are mostly 
fixed; only the last 4 are pending. The newly added unit test is still failing. 
Can you please take a look?



was (Author: manojg):
[~ghuangups],
  Checkstyle issues are mostly fixed; only the last 4 are pending. The newly 
added unit test is still failing. Can you please take a look?


> Add a snapshot unit test with randomized file IO operations
> ---
>
> Key: HDFS-11912
> URL: https://issues.apache.org/jira/browse/HDFS-11912
> Project: Hadoop HDFS
> Issue Type: Test
> Components: hdfs
> Reporter: George Huang
> Assignee: George Huang
> Priority: Minor
> Labels: TestGap
> Attachments: HDFS-11912.001.patch, HDFS-11912.002.patch, 
> HDFS-11912.003.patch, HDFS-11912.004.patch
>
>
> Adding a snapshot unit test with randomized file IO operations.






[jira] [Comment Edited] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-08-16 Thread George Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16129736#comment-16129736
 ] 

George Huang edited comment on HDFS-11912 at 8/17/17 1:29 AM:
--

Hi Manoj,

Thank you so much for the comment.

The test randomly generates the number of iterations to execute, so it may time 
out if the overall run takes too long. I have reduced the maximum number of 
iterations and run the test locally many times without a timeout.
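
For illustration only, here is a minimal sketch of how a randomized iteration count could be capped and guarded by a time budget so the run ends before an outer test timeout. It is not taken from the patch: the class name, `MAX_NUM_ITERATIONS`, `chooseIterations` and `runWithBudget` are all hypothetical names, and the budget values are assumptions.

```java
import java.util.Random;

/** Sketch: cap a randomized iteration count and stop early when a time budget runs out. */
class BoundedIterations {
  // Hypothetical cap; the real test would pick a value that fits its timeout.
  static final int MAX_NUM_ITERATIONS = 10;

  /** Draws an iteration count in [1, maxIterations] from the given generator. */
  static int chooseIterations(Random generator, int maxIterations) {
    return 1 + generator.nextInt(maxIterations);
  }

  /**
   * Runs op up to 'iterations' times, but starts no new iteration once
   * budgetMillis has elapsed. Returns the number of completed iterations.
   */
  static int runWithBudget(int iterations, long budgetMillis, Runnable op) {
    long start = System.nanoTime();
    long budgetNanos = budgetMillis * 1_000_000L;
    int completed = 0;
    while (completed < iterations && System.nanoTime() - start < budgetNanos) {
      op.run();
      completed++;
    }
    return completed;
  }

  public static void main(String[] args) {
    long seed = System.currentTimeMillis();
    Random generator = new Random(seed);
    int planned = chooseIterations(generator, MAX_NUM_ITERATIONS);
    int done = runWithBudget(planned, 1_000L, () -> { });
    // Printing the seed lets a slow or failing run be replayed with the same draws.
    System.out.println("seed=" + seed + " planned=" + planned + " completed=" + done);
  }
}
```

A lower cap keeps the worst case short. The budget check additionally bounds the run when individual operations are slower than expected, at the cost of sometimes executing fewer iterations than planned.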

I also fixed the checkstyle issues listed.

Many thanks,
George


was (Author: ghuangups):
The test randomly generates the number of iterations for the current run, so it 
may time out if the overall run takes too long. I have reduced the maximum number 
of iterations and run the test locally many times without a timeout.

> Add a snapshot unit test with randomized file IO operations
> ---
>
> Key: HDFS-11912
> URL: https://issues.apache.org/jira/browse/HDFS-11912
> Project: Hadoop HDFS
> Issue Type: Test
> Components: hdfs
> Reporter: George Huang
> Assignee: George Huang
> Priority: Minor
> Labels: TestGap
> Attachments: HDFS-11912.001.patch, HDFS-11912.002.patch, 
> HDFS-11912.003.patch
>
>
> Adding a snapshot unit test with randomized file IO operations.






[jira] [Comment Edited] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-06-01 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033821#comment-16033821
 ] 

Manoj Govindassamy edited comment on HDFS-11912 at 6/1/17 10:27 PM:


Thanks for contributing this patch [~ghuangups]. Looks good overall. A few 
comments from a quick look. Will add more comments later.

In HDFS-9406, snapshot operations were believed to be causing metadata 
inconsistencies in the fsimage. Can you please try running this new test 
without the fix for HDFS-9406 and see if it can recreate the problem? 

1.
{noformat}
if (randomNum > currentWeightSum && randomNum <= (currentWeightSum + 
currentValue.getWeight())) {
  snapshotRandomOp = currentValue;
  break;
}
{noformat}
Shouldn't the check be just (randomNum < (currentWeightSum + 
currentValue.getWeight()))?

2.
{noformat}
  private static MiniDFSCluster cluster;
  private static DistributedFileSystem hdfs;
  private static Random GENERATOR = null;
{noformat}
The above class members need not be static.

3.
{{FileSystemOperations}} and {{SnapshotOperations}} are very similar except for 
enum values and weights. Code duplication here can be avoided if we can merge 
these two enums into one and expose proper methods.

4.
{noformat}
// Set
Random RANDOM = new Random();
long seed = RANDOM.nextLong();
GENERATOR = new Random(seed);
{noformat}
Is there any specific reason why a simple seed like System.currentTimeMillis() 
would not be sufficient here? The seed is generated from a Random that is itself 
not explicitly seeded. Also, RANDOM need not be all caps.

5.
{noformat}
int fileLen = new Random().nextInt(MAX_NUM_FILE_LENGTH);
createFiles(testDirString, fileLen);
{noformat}
The GENERATOR instance can be used here instead of creating a new Random.

6.
{noformat}
// Create files in a directory with random depth, ranging from 0-10.
for (int i = 0; i < TOTAL_BLOCKS; i += fileLength) {
{noformat}
Does TOTAL_BLOCKS here actually mean the total number of files?

7.
{noformat}
private String GetNewPathString(String originalString,
{noformat}
Method name should be in camel case, like getNewPathString().
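
For reference, a minimal sketch of the weighted selection discussed in comment 1, combined with the single seeded generator from comments 4 and 5. It is not taken from the patch. The class name, the enum values and their weights are hypothetical, and it assumes the draw is zero-based, i.e. `nextInt(totalWeight)` returning a value in [0, totalWeight). Under that assumption a single upper-bound check is enough, because the loop visits the weight intervals in order.

```java
import java.util.Random;

/** Sketch: pick an operation with probability proportional to its weight. */
class WeightedOpPicker {

  /** Hypothetical operations; names and weights are illustrative only. */
  enum Op {
    CREATE_SNAPSHOT(5), DELETE_SNAPSHOT(3), RENAME_SNAPSHOT(2);

    private final int weight;

    Op(int weight) { this.weight = weight; }

    int getWeight() { return weight; }
  }

  /** Sum of all weights; the draw must lie in [0, totalWeight()). */
  static int totalWeight() {
    int sum = 0;
    for (Op op : Op.values()) {
      sum += op.getWeight();
    }
    return sum;
  }

  /** Maps a zero-based draw to the op whose cumulative weight interval contains it. */
  static Op pick(int randomNum) {
    if (randomNum < 0) {
      throw new IllegalArgumentException("draw out of range: " + randomNum);
    }
    int currentWeightSum = 0;
    for (Op op : Op.values()) {
      currentWeightSum += op.getWeight();
      // Earlier intervals were already rejected, so only the upper bound is checked.
      if (randomNum < currentWeightSum) {
        return op;
      }
    }
    throw new IllegalArgumentException("draw out of range: " + randomNum);
  }

  /** Draws from the shared generator so the whole run depends on one seed. */
  static Op pick(Random generator) {
    return pick(generator.nextInt(totalWeight()));
  }

  public static void main(String[] args) {
    long seed = System.currentTimeMillis();
    System.out.println("seed = " + seed); // log the seed so a failing run can be replayed
    Random generator = new Random(seed);
    for (int i = 0; i < 5; i++) {
      System.out.println(pick(generator));
    }
  }
}
```

If the draw were one-based instead (1 to totalWeight inclusive), the comparison would need to be `<=`. Which of the two applies depends on how randomNum is generated in the patch.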




was (Author: manojg):
Thanks for contributing this patch [~ghuangups]. A few comments from a quick 
look. Will add more comments later.

In HDFS-9406, snapshot operations were believed to be causing metadata 
inconsistencies in the fsimage. Can you please try running this new test 
without the fix for HDFS-9406 and see if it can recreate the problem? 

1.
{noformat}
if (randomNum > currentWeightSum && randomNum <= (currentWeightSum + 
currentValue.getWeight())) {
  snapshotRandomOp = currentValue;
  break;
}
{noformat}
Shouldn't the check be just (randomNum < (currentWeightSum + 
currentValue.getWeight()))?

2.
{noformat}
  private static MiniDFSCluster cluster;
  private static DistributedFileSystem hdfs;
  private static Random GENERATOR = null;
{noformat}
The above class members need not be static.

3.
{{FileSystemOperations}} and {{SnapshotOperations}} are very similar except for 
enum values and weights. Code duplication here can be avoided if we can merge 
these two enums into one and expose proper methods.

4.
{noformat}
// Set
Random RANDOM = new Random();
long seed = RANDOM.nextLong();
GENERATOR = new Random(seed);
{noformat}
Is there any specific reason why a simple seed like System.currentTimeMillis() 
would not be sufficient here? The seed is generated from a Random that is itself 
not explicitly seeded. Also, RANDOM need not be all caps.

5.
{noformat}
int fileLen = new Random().nextInt(MAX_NUM_FILE_LENGTH);
createFiles(testDirString, fileLen);
{noformat}
The GENERATOR instance can be used here instead of creating a new Random.

6.
{noformat}
// Create files in a directory with random depth, ranging from 0-10.
for (int i = 0; i < TOTAL_BLOCKS; i += fileLength) {
{noformat}
Does TOTAL_BLOCKS here actually mean the total number of files?

7.
{noformat}
private String GetNewPathString(String originalString,
{noformat}
Method name should be in camel case, like getNewPathString().



> Add a snapshot unit test with randomized file IO operations
> ---
>
> Key: HDFS-11912
> URL: https://issues.apache.org/jira/browse/HDFS-11912
> Project: Hadoop HDFS
> Issue Type: Test
> Components: hdfs
> Reporter: George Huang
> Priority: Minor
> Attachments: HDFS-11912.001.patch
>
>
> Adding a snapshot unit test with randomized file IO operations.


