[jira] [Commented] (HDFS-8698) Add -direct flag option for fs copy so that user can choose not to create ._COPYING_ file
[ https://issues.apache.org/jira/browse/HDFS-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611537#comment-14611537 ] Chen He commented on HDFS-8698: --- Sorry for the spam, [~joeandrewkey]. I mean [~andreina]. :) Add -direct flag option for fs copy so that user can choose not to create ._COPYING_ file - Key: HDFS-8698 URL: https://issues.apache.org/jira/browse/HDFS-8698 Project: Hadoop HDFS Issue Type: New Feature Components: fs Affects Versions: 2.7.0 Reporter: Chen He Assignee: J.Andreina Because the CLI uses CommandWithDestination.java, which appends ._COPYING_ to the tail of the file name when it does the copy. For blobstores like S3 and Swift, creating the ._COPYING_ file and renaming it is expensive. A -direct flag can allow the user to avoid the ._COPYING_ file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8698) Add -direct flag option for fs copy so that user can choose not to create ._COPYING_ file
[ https://issues.apache.org/jira/browse/HDFS-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611534#comment-14611534 ] Chen He commented on HDFS-8698: --- Not a problem, [~joeandrewkey], feel free to take it. I haven't had time to work on it recently. Add -direct flag option for fs copy so that user can choose not to create ._COPYING_ file - Key: HDFS-8698 URL: https://issues.apache.org/jira/browse/HDFS-8698 Project: Hadoop HDFS Issue Type: New Feature Components: fs Affects Versions: 2.7.0 Reporter: Chen He Assignee: J.Andreina Because the CLI uses CommandWithDestination.java, which appends ._COPYING_ to the tail of the file name when it does the copy. For blobstores like S3 and Swift, creating the ._COPYING_ file and renaming it is expensive. A -direct flag can allow the user to avoid the ._COPYING_ file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
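The cost difference described above comes from the copy-then-rename protocol: on HDFS a rename is a cheap metadata operation, but on a blobstore it is typically a full object copy plus delete. A minimal sketch of both code paths, using java.nio local files as a stand-in for Hadoop's FileSystem API — this is illustrative only, not CommandWithDestination's actual code, and the -direct path is the proposed behavior, not an existing flag:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyingSketch {
    static final String COPYING_SUFFIX = "._COPYING_";

    // Name of the temporary file for a given destination name.
    static String tempName(String dst) {
        return dst + COPYING_SUFFIX;
    }

    // Current behavior: write to "<dst>._COPYING_", then rename into place.
    // On a blobstore the rename is typically a full server-side copy + delete.
    static void copyViaTemp(Path src, Path dst) throws IOException {
        Path tmp = dst.resolveSibling(tempName(dst.getFileName().toString()));
        Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
        Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    // Proposed -direct behavior: write straight to the destination,
    // skipping the temporary file and the expensive rename.
    static void copyDirect(Path src, Path dst) throws IOException {
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("copying-sketch");
        Path src = Files.writeString(dir.resolve("src.txt"), "hello");
        copyViaTemp(src, dir.resolve("a.txt"));
        copyDirect(src, dir.resolve("b.txt"));
        System.out.println(Files.readString(dir.resolve("a.txt")));
        System.out.println(Files.readString(dir.resolve("b.txt")));
    }
}
```

Note the trade-off the flag would opt into: with -direct, readers can observe a partially written destination file, which the temp-file-plus-rename protocol otherwise hides.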
Auto-Re: [jira] [Updated] (HDFS-8710) Always read DU value from the cached dfsUsed file on datanode startup
Your email has been received. Thank you!
[jira] [Updated] (HDFS-8710) Always read DU value from the cached dfsUsed file on datanode startup
[ https://issues.apache.org/jira/browse/HDFS-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-8710: -- Attachment: HDFS-8710.001.patch Attached the patch for review. Always read DU value from the cached dfsUsed file on datanode startup --- Key: HDFS-8710 URL: https://issues.apache.org/jira/browse/HDFS-8710 Project: Hadoop HDFS Issue Type: Improvement Reporter: Xinwei Qin Assignee: Xinwei Qin Attachments: HDFS-8710.001.patch Currently, the DataNode periodically caches the DU value in the dfsUsed file. When the DataNode starts or restarts, it will read in the cached DU value from the dfsUsed file if the value is less than 600 seconds old; otherwise, it will run the DU command, which is a very time-consuming operation (it may take up to dozens of minutes) when the DataNode has a huge number of blocks. Since slight imprecision of dfsUsed is not critical, and the DU value will be updated every 600 seconds (the default DU interval) after the DataNode has started, we can always read the DU value from the cached file (regardless of whether it is less than 600 seconds old) and skip the DU operation on DataNode startup to significantly shorten the startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
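The startup decision being changed can be sketched as follows. This is a simplified model of the logic described in the issue, not the DataNode's actual code; the 600-second constant is the default DU refresh interval mentioned above:

```java
import java.util.function.LongSupplier;

public class DfsUsedStartup {
    static final long DU_INTERVAL_MS = 600_000L; // default DU refresh interval

    // Current behavior: trust the cached dfsUsed value only while it is
    // fresher than the DU interval; otherwise run the expensive `du` scan.
    static long currentBehavior(long cachedBytes, long cacheAgeMs, LongSupplier runDu) {
        return cacheAgeMs < DU_INTERVAL_MS ? cachedBytes : runDu.getAsLong();
    }

    // Proposed behavior: always use the cached value at startup; the
    // periodic refresh corrects any drift within one interval anyway.
    static long proposedBehavior(long cachedBytes, long cacheAgeMs, LongSupplier runDu) {
        return cachedBytes;
    }

    public static void main(String[] args) {
        LongSupplier slowDu = () -> 12_345L; // stands in for the slow disk scan
        long staleAge = 3_600_000L;          // a one-hour-old cache entry
        // Stale cache: current behavior re-runs du, proposed behavior reuses the cache.
        System.out.println(currentBehavior(1_000L, staleAge, slowDu));  // 12345
        System.out.println(proposedBehavior(1_000L, staleAge, slowDu)); // 1000
    }
}
```

The design point is that a stale cached value is bounded by one refresh interval of drift, which the issue argues is an acceptable price for skipping a scan that can take dozens of minutes.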
[jira] [Updated] (HDFS-8711) setSpaceQuota command should print the available storage type when input storage type is wrong
[ https://issues.apache.org/jira/browse/HDFS-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-8711: - Attachment: HDFS-8711.patch setSpaceQuota command should print the available storage type when input storage type is wrong -- Key: HDFS-8711 URL: https://issues.apache.org/jira/browse/HDFS-8711 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8711.patch If the input storage type is wrong, *setSpaceQuota* currently gives an exception like this:
{code}
./hdfs dfsadmin -setSpaceQuota 1000 -storageType COLD /testDir
setSpaceQuota: No enum constant org.apache.hadoop.fs.StorageType.COLD
{code}
It should be:
{code}
setSpaceQuota: Storage type COLD not available. Available storage type are [SSD, DISK, ARCHIVE]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
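The reported error comes from `Enum.valueOf`, which throws a terse IllegalArgumentException ("No enum constant ..."); the improvement is to catch it and rephrase with the valid constants. The sketch below uses a stand-in enum with the three storage types from the example output; it is not the actual HDFS patch:

```java
import java.util.Arrays;

public class StorageTypeParse {
    // Stand-in for org.apache.hadoop.fs.StorageType.
    enum StorageType { SSD, DISK, ARCHIVE }

    // Wrap the raw valueOf failure in a user-friendly message that lists
    // the valid choices instead of leaking the enum's class name.
    static StorageType parse(String name) {
        try {
            return StorageType.valueOf(name);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Storage type " + name
                + " not available. Available storage types are "
                + Arrays.toString(StorageType.values()));
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("DISK")); // prints: DISK
        try {
            parse("COLD");
        } catch (IllegalArgumentException e) {
            // prints: Storage type COLD not available. Available storage types are [SSD, DISK, ARCHIVE]
            System.out.println(e.getMessage());
        }
    }
}
```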
[jira] [Updated] (HDFS-8686) WebHdfsFileSystem#getXAttr(Path p, final String name) doesn't work if domain name is in upper case
[ https://issues.apache.org/jira/browse/HDFS-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-8686: -- Component/s: webhdfs WebHdfsFileSystem#getXAttr(Path p, final String name) doesn't work if domain name is in upper case --- Key: HDFS-8686 URL: https://issues.apache.org/jira/browse/HDFS-8686 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Jagadesh Kiran N Assignee: kanaka kumar avvaru {code} hadoop fs -getfattr -n USER.attr1 /dir1 {code} => returns value {code} webhdfs.getXAttr(new Path("/dir1"), "USER.attr1") {code} => returns null -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8495) Consolidate append() related implementation into a single class
[ https://issues.apache.org/jira/browse/HDFS-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8495: --- Attachment: HDFS-8495-003.patch Consolidate append() related implementation into a single class --- Key: HDFS-8495 URL: https://issues.apache.org/jira/browse/HDFS-8495 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8495-000.patch, HDFS-8495-001.patch, HDFS-8495-002.patch, HDFS-8495-003.patch This jira proposes to consolidate {{FSNamesystem#append()}} related methods into a single class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8711) setSpaceQuota command should print the available storage type when input storage type is wrong
Surendra Singh Lilhore created HDFS-8711: Summary: setSpaceQuota command should print the available storage type when input storage type is wrong Key: HDFS-8711 URL: https://issues.apache.org/jira/browse/HDFS-8711 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore If the input storage type is wrong, *setSpaceQuota* currently gives an exception like this:
{code}
./hdfs dfsadmin -setSpaceQuota 1000 -storageType COLD /testDir
setSpaceQuota: No enum constant org.apache.hadoop.fs.StorageType.COLD
{code}
It should be:
{code}
setSpaceQuota: Storage type COLD not available. Available storage type are [SSD, DISK, ARCHIVE]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8495) Consolidate append() related implementation into a single class
[ https://issues.apache.org/jira/browse/HDFS-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611677#comment-14611677 ] Rakesh R commented on HDFS-8495: Attached another patch fixing the {{hadoop.hdfs.server.namenode.TestINodeFile}} failure. Consolidate append() related implementation into a single class --- Key: HDFS-8495 URL: https://issues.apache.org/jira/browse/HDFS-8495 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8495-000.patch, HDFS-8495-001.patch, HDFS-8495-002.patch, HDFS-8495-003.patch This jira proposes to consolidate {{FSNamesystem#append()}} related methods into a single class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8703) Merge refactor of DFSInputStream from ErasureCoding branch
[ https://issues.apache.org/jira/browse/HDFS-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611783#comment-14611783 ] Vinayakumar B commented on HDFS-8703: - Test failure is not related. Will commit shortly. Thanks for the review, [~hitliuyi]. Merge refactor of DFSInputStream from ErasureCoding branch -- Key: HDFS-8703 URL: https://issues.apache.org/jira/browse/HDFS-8703 Project: Hadoop HDFS Issue Type: Improvement Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-8703-01.patch, HDFS-8703-02.patch There were some refactors done in DFSInputStream for the support of ErasureCoding in branch HDFS-7285. These refactors are generic and applicable to current trunk. This Jira targets to merge them back to trunk to reduce the size of the final merge patch for the branch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8710) Always read DU value from the cached dfsUsed file on datanode startup
Xinwei Qin created HDFS-8710: - Summary: Always read DU value from the cached dfsUsed file on datanode startup Key: HDFS-8710 URL: https://issues.apache.org/jira/browse/HDFS-8710 Project: Hadoop HDFS Issue Type: Improvement Reporter: Xinwei Qin Assignee: Xinwei Qin Currently, the DataNode periodically caches the DU value in the dfsUsed file. When the DataNode starts or restarts, it will read in the cached DU value from the dfsUsed file if the value is less than 600 seconds old; otherwise, it will run the DU command, which is a very time-consuming operation (it may take up to dozens of minutes) when the DataNode has a huge number of blocks. Since slight imprecision of dfsUsed is not critical, and the DU value will be updated every 600 seconds (the default DU interval) after the DataNode has started, we can always read the DU value from the cached file (regardless of whether it is less than 600 seconds old) and skip the DU operation on DataNode startup to significantly shorten the startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8711) setSpaceQuota command should print the available storage type when input storage type is wrong
[ https://issues.apache.org/jira/browse/HDFS-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611759#comment-14611759 ] Surendra Singh Lilhore commented on HDFS-8711: -- Attached patch, please review. setSpaceQuota command should print the available storage type when input storage type is wrong -- Key: HDFS-8711 URL: https://issues.apache.org/jira/browse/HDFS-8711 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8711.patch If the input storage type is wrong, *setSpaceQuota* currently gives an exception like this:
{code}
./hdfs dfsadmin -setSpaceQuota 1000 -storageType COLD /testDir
setSpaceQuota: No enum constant org.apache.hadoop.fs.StorageType.COLD
{code}
It should be:
{code}
setSpaceQuota: Storage type COLD not available. Available storage type are [SSD, DISK, ARCHIVE]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8703) Merge refactor of DFSInputStream from ErasureCoding branch
[ https://issues.apache.org/jira/browse/HDFS-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-8703: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks for the reviews. Merge refactor of DFSInputStream from ErasureCoding branch -- Key: HDFS-8703 URL: https://issues.apache.org/jira/browse/HDFS-8703 Project: Hadoop HDFS Issue Type: Improvement Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.8.0 Attachments: HDFS-8703-01.patch, HDFS-8703-02.patch There were some refactors done in DFSInputStream for the support of ErasureCoding in branch HDFS-7285. These refactors are generic and applicable to current trunk. This Jira targets to merge them back to trunk to reduce the size of the final merge patch for the branch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7285: Attachment: HDFS-7285-merge-consolidated.trunk.03.patch Rebased patch. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, HDFS-7285-merge-consolidated-01.patch, HDFS-7285-merge-consolidated-trunk-01.patch, HDFS-7285-merge-consolidated.trunk.03.patch, HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with a storage overhead of only 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed in Hadoop 2.0 for maintenance reasons. The drawbacks are: 1) it sits on top of HDFS and depends on MapReduce to do the encoding and decoding tasks; 2) it can only be used for cold files that are not intended to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies and makes it self-contained and independently maintained. 
This design lays the EC feature on top of the storage type support and is designed to be compatible with existing HDFS features like caching, snapshots, encryption, and high availability. This design will also support different EC coding schemes, implementations and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8686) WebHdfsFileSystem#getXAttr(Path p, final String name) doesn't work if domain name is in upper case
[ https://issues.apache.org/jira/browse/HDFS-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-8686: -- Attachment: HDFS-8686-00.patch WebHdfsFileSystem#getXAttr(Path p, final String name) doesn't work if domain name is in upper case --- Key: HDFS-8686 URL: https://issues.apache.org/jira/browse/HDFS-8686 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Jagadesh Kiran N Assignee: kanaka kumar avvaru Attachments: HDFS-8686-00.patch {code} hadoop fs -getfattr -n USER.attr1 /dir1 {code} => returns value {code} webhdfs.getXAttr(new Path("/dir1"), "USER.attr1") {code} => returns null -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8686) WebHdfsFileSystem#getXAttr(Path p, final String name) doesn't work if domain name is in upper case
[ https://issues.apache.org/jira/browse/HDFS-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-8686: -- Status: Patch Available (was: Open) The reason for the issue is that {{WebHdfsFileSystem#getXAttr(Path p, final String name)}} does a second lookup via {{JsonUtilClient}} with the user-passed string "USER.attr1", which is unnecessary as the NN has already returned the filtered xattr. Attached a patch that corrects the code with a different method to pick up the value directly from the response JSON map. WebHdfsFileSystem#getXAttr(Path p, final String name) doesn't work if domain name is in upper case --- Key: HDFS-8686 URL: https://issues.apache.org/jira/browse/HDFS-8686 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Jagadesh Kiran N Assignee: kanaka kumar avvaru Attachments: HDFS-8686-00.patch {code} hadoop fs -getfattr -n USER.attr1 /dir1 {code} => returns value {code} webhdfs.getXAttr(new Path("/dir1"), "USER.attr1") {code} => returns null -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8710) Always read DU value from the cached dfsUsed file on datanode startup
[ https://issues.apache.org/jira/browse/HDFS-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-8710: -- Status: Patch Available (was: Open) Always read DU value from the cached dfsUsed file on datanode startup --- Key: HDFS-8710 URL: https://issues.apache.org/jira/browse/HDFS-8710 Project: Hadoop HDFS Issue Type: Improvement Reporter: Xinwei Qin Assignee: Xinwei Qin Attachments: HDFS-8710.001.patch Currently, the DataNode periodically caches the DU value in the dfsUsed file. When the DataNode starts or restarts, it will read in the cached DU value from the dfsUsed file if the value is less than 600 seconds old; otherwise, it will run the DU command, which is a very time-consuming operation (it may take up to dozens of minutes) when the DataNode has a huge number of blocks. Since slight imprecision of dfsUsed is not critical, and the DU value will be updated every 600 seconds (the default DU interval) after the DataNode has started, we can always read the DU value from the cached file (regardless of whether it is less than 600 seconds old) and skip the DU operation on DataNode startup to significantly shorten the startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8711) setSpaceQuota command should print the available storage type when input storage type is wrong
[ https://issues.apache.org/jira/browse/HDFS-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-8711: - Status: Patch Available (was: Open) setSpaceQuota command should print the available storage type when input storage type is wrong -- Key: HDFS-8711 URL: https://issues.apache.org/jira/browse/HDFS-8711 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8711.patch If the input storage type is wrong, *setSpaceQuota* currently gives an exception like this:
{code}
./hdfs dfsadmin -setSpaceQuota 1000 -storageType COLD /testDir
setSpaceQuota: No enum constant org.apache.hadoop.fs.StorageType.COLD
{code}
It should be:
{code}
setSpaceQuota: Storage type COLD not available. Available storage type are [SSD, DISK, ARCHIVE]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8696) Reduce the variances of latency of WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612013#comment-14612013 ] Bob Hansen commented on HDFS-8696: -- Be sure to set the high water mark before the low water mark. See https://github.com/netty/netty/issues/3806 Reduce the variances of latency of WebHDFS -- Key: HDFS-8696 URL: https://issues.apache.org/jira/browse/HDFS-8696 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.7.0 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Attachments: HDFS-8696.1.patch There is an issue that appears related to the webhdfs server. When making two concurrent requests, the DN will sometimes pause for extended periods (I've seen 1-300 seconds), killing performance and dropping connections. To reproduce: 1. Set up a HDFS cluster. 2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing the times out to /tmp/times.txt:
{noformat}
i=1
while (true); do
  echo $i
  let i++
  /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null "http://namenode:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1"
done
{noformat}
3. Watch for 1-byte requests that take more than one second: tail -F /tmp/times.txt | grep -E "^[^0]" 4. After it has had a chance to warm up, start doing large transfers from another shell:
{noformat}
i=1
while (true); do
  echo $i
  let i++
  (/usr/bin/time -f %e curl -s -L -o /dev/null "http://namenode:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root")
done
{noformat}
It's easy to find after a minute or two that small reads will sometimes pause for 1-300 seconds. In some extreme cases, it appears that the transfers time out and the DN drops the connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
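The ordering advice above matters because each water-mark setter validates its value against the other mark at the moment it is set, so raising the low mark above the current (default) high mark fails before you ever get to raise the high mark. A minimal stand-in for that invariant — illustrative only, not Netty's implementation; the 32 KiB / 64 KiB defaults are assumptions matching Netty's documented defaults:

```java
public class WaterMarks {
    private int low = 32 * 1024;   // assumed default low water mark
    private int high = 64 * 1024;  // assumed default high water mark

    // Each setter checks against the *current* value of the other mark.
    void setLow(int v) {
        if (v > high) throw new IllegalArgumentException("low mark " + v + " > high mark " + high);
        low = v;
    }

    void setHigh(int v) {
        if (v < low) throw new IllegalArgumentException("high mark " + v + " < low mark " + low);
        high = v;
    }

    public static void main(String[] args) {
        WaterMarks ok = new WaterMarks();
        ok.setHigh(1024 * 1024); // high first: succeeds
        ok.setLow(512 * 1024);   // then low: succeeds

        WaterMarks bad = new WaterMarks();
        try {
            bad.setLow(512 * 1024); // low first: exceeds the default high mark
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Setting the high mark first keeps the intermediate state (new high, old low) valid at every step, which is exactly Bob's recommendation.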
[jira] [Commented] (HDFS-8711) setSpaceQuota command should print the available storage type when input storage type is wrong
[ https://issues.apache.org/jira/browse/HDFS-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612024#comment-14612024 ] Xiaoyu Yao commented on HDFS-8711: -- Patch LGTM. A few comments: 1. "Available storage type are" should be "Available storage types are". 2. Can you add a unit test for the change? setSpaceQuota command should print the available storage type when input storage type is wrong -- Key: HDFS-8711 URL: https://issues.apache.org/jira/browse/HDFS-8711 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.0 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Attachments: HDFS-8711.patch If the input storage type is wrong, *setSpaceQuota* currently gives an exception like this:
{code}
./hdfs dfsadmin -setSpaceQuota 1000 -storageType COLD /testDir
setSpaceQuota: No enum constant org.apache.hadoop.fs.StorageType.COLD
{code}
It should be:
{code}
setSpaceQuota: Storage type COLD not available. Available storage type are [SSD, DISK, ARCHIVE]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8714) Folder ModificationTime in Millis Changed When NameNode is restarted
[ https://issues.apache.org/jira/browse/HDFS-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su reassigned HDFS-8714: --- Assignee: Walter Su Folder ModificationTime in Millis Changed When NameNode is restarted Key: HDFS-8714 URL: https://issues.apache.org/jira/browse/HDFS-8714 Project: Hadoop HDFS Issue Type: Bug Reporter: Chandan Biswas Assignee: Walter Su *Steps to Reproduce* # Steps to perform in the program ** Create a folder in HDFS ** Print the folder modificationTime in millis ** Upload or copy a file to this newly created folder ** Print the file and folder modificationTime in millis ** Restart the name node ** Print the file and folder modificationTime in millis # Expected Result ** the folder modification time should be the file modification time before the name node restart ** the folder modification time should not change after the name node restart # Actual Result ** the folder modification time is not the same as the file modification time ** the folder modification time changes after the name node restart, and it changes to the file modification time *Impact of this behavior:* Before a task is launched, distributed cache files/folders are checked for any modification. The checks are done by comparing the file/folder modificationTime in millis. So any job that uses the distributed cache has a potential chance of failure if # the name node restarts and running tasks are resubmitted, or # e.g. among 100 tasks, 50 are in the queue to run. 
Now the name node restarts. Here is the sample code I used for testing:
{code}
// file creation in hdfs
final Path pathToFiles = new Path("/user/vagrant/chandan/test/");
fileSystem.mkdirs(pathToFiles);
System.out.println("HDFS Folder Modification Time in long Before file copy:" + fileSystem.getFileStatus(pathToFiles).getModificationTime());
FileUtil.copy(fileSystem, new Path("/user/cloudera/test"), fileSystem, pathToFiles, false, configuration);
System.out.println("HDFS File Modification Time in long:" + fileSystem.getFileStatus(new Path("/user/vagrant/chandan/test/test")).getModificationTime());
System.out.println("HDFS Folder Modification Time in long After file copy:" + fileSystem.getFileStatus(pathToFiles).getModificationTime());
for (int i = 0; i < 100; i++) {
  System.out.println("Normal HDFS Folder Modification Time in long:" + fileSystem.getFileStatus(pathToFiles).getModificationTime());
  System.out.println("Normal HDFS File Modification Time in long:" + fileSystem.getFileStatus(new Path("/user/vagrant/chandan/test/test")).getModificationTime());
  Thread.sleep(6 * 2);
}
{code}
Here is the output:
{code}
HDFS Folder Modification Time in long Before file copy:1435868217309
HDFS File Modification Time in long:1435868217368
HDFS Folder Modification Time in long After file copy:1435868217353
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217368
Normal HDFS File Modification Time in long:1435868217368
{code}
The last two lines are printed after the name node restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8704) Erasure Coding: client fails to write large file when one datanode fails
[ https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612784#comment-14612784 ] Walter Su commented on HDFS-8704: - Agree. Please go ahead. Erasure Coding: client fails to write large file when one datanode fails Key: HDFS-8704 URL: https://issues.apache.org/jira/browse/HDFS-8704 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: HDFS-8704-000.patch I tested the current code on a 5-node cluster using RS(3,2). When a datanode is corrupt, the client succeeds in writing a file smaller than a block group but fails to write a large one. {{TestDFSStripeOutputStreamWithFailure}} only tests files smaller than a block group; this jira will add more test situations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8680) OzoneHandler : Add Local StorageHandler support for volumes
[ https://issues.apache.org/jira/browse/HDFS-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8680: --- Attachment: hdfs-8680-HDFS-7240.002.patch [~cnauroth] thanks for the review, I have updated the patch based on your comments. * camel-cased test names * replaced {{Time#monotonicNow}} with {{System#currentTimeMillis}} OzoneHandler : Add Local StorageHandler support for volumes --- Key: HDFS-8680 URL: https://issues.apache.org/jira/browse/HDFS-8680 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8680-HDFS-7240.001.patch, hdfs-8680-HDFS-7240.002.patch Add a local StorageHandler that can store data into a local DB. This is useful for running tests against MiniDFSCluster in the stand-alone mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8620) Clean up the checkstyle warnings about ClientProtocol
[ https://issues.apache.org/jira/browse/HDFS-8620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612746#comment-14612746 ] Takanobu Asanuma commented on HDFS-8620: I also have a question. Most methods in ClientProtocol throw IOException and subclasses of IOException. Should we remove the subclasses of IOException from the throws lists of the methods? Clean up the checkstyle warnings about ClientProtocol -- Key: HDFS-8620 URL: https://issues.apache.org/jira/browse/HDFS-8620 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Takanobu Asanuma Assignee: Takanobu Asanuma Attachments: HDFS-8620.1.patch, HDFS-8620.2.patch These warnings were generated in HDFS-8238. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8714) Folder ModificationTime in Millis Changed When NameNode is restarted
[ https://issues.apache.org/jira/browse/HDFS-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612770#comment-14612770 ] Walter Su commented on HDFS-8714: - Sorry [~pela], I can't reproduce this with hdp-2.6.0 or trunk. Which version did you use? Folder ModificationTime in Millis Changed When NameNode is restarted Key: HDFS-8714 URL: https://issues.apache.org/jira/browse/HDFS-8714 Project: Hadoop HDFS Issue Type: Bug Reporter: Chandan Biswas *Steps to Reproduce* # Steps to perform in the program ** Create a folder in HDFS ** Print the folder modificationTime in millis ** Upload or copy a file to this newly created folder ** Print the file and folder modificationTime in millis ** Restart the name node ** Print the file and folder modificationTime in millis # Expected Result ** the folder modification time should be the file modification time before the name node restart ** the folder modification time should not change after the name node restart # Actual Result ** the folder modification time is not the same as the file modification time ** the folder modification time changes after the name node restart, and it changes to the file modification time *Impact of this behavior:* Before a task is launched, distributed cache files/folders are checked for any modification. The checks are done by comparing the file/folder modificationTime in millis. So any job that uses the distributed cache has a potential chance of failure if # the name node restarts and running tasks are resubmitted, or # e.g. among 100 tasks, 50 are in the queue to run. 
Now the name node restarts. Here is the sample code I used for testing:
{code}
// file creation in hdfs
final Path pathToFiles = new Path("/user/vagrant/chandan/test/");
fileSystem.mkdirs(pathToFiles);
System.out.println("HDFS Folder Modification Time in long Before file copy:" + fileSystem.getFileStatus(pathToFiles).getModificationTime());
FileUtil.copy(fileSystem, new Path("/user/cloudera/test"), fileSystem, pathToFiles, false, configuration);
System.out.println("HDFS File Modification Time in long:" + fileSystem.getFileStatus(new Path("/user/vagrant/chandan/test/test")).getModificationTime());
System.out.println("HDFS Folder Modification Time in long After file copy:" + fileSystem.getFileStatus(pathToFiles).getModificationTime());
for (int i = 0; i < 100; i++) {
  System.out.println("Normal HDFS Folder Modification Time in long:" + fileSystem.getFileStatus(pathToFiles).getModificationTime());
  System.out.println("Normal HDFS File Modification Time in long:" + fileSystem.getFileStatus(new Path("/user/vagrant/chandan/test/test")).getModificationTime());
  Thread.sleep(6 * 2);
}
{code}
Here is the output:
{code}
HDFS Folder Modification Time in long Before file copy:1435868217309
HDFS File Modification Time in long:1435868217368
HDFS Folder Modification Time in long After file copy:1435868217353
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217368
Normal HDFS File Modification Time in long:1435868217368
{code}
The last two lines are printed after the name node restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8696) Reduce the variances of latency of WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-8696: Attachment: HDFS-8696.2.patch Patch V2. The boss threads setting is unnecessary; just use the default value of 2. Reduce the variances of latency of WebHDFS -- Key: HDFS-8696 URL: https://issues.apache.org/jira/browse/HDFS-8696 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.7.0 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Attachments: HDFS-8696.1.patch, HDFS-8696.2.patch There is an issue that appears related to the webhdfs server. When making two concurrent requests, the DN will sometimes pause for extended periods (I've seen 1-300 seconds), killing performance and dropping connections. To reproduce: 1. Set up a HDFS cluster. 2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing the times out to /tmp/times.txt:
{noformat}
i=1
while (true); do
  echo $i
  let i++
  /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null "http://namenode:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1"
done
{noformat}
3. Watch for 1-byte requests that take more than one second: tail -F /tmp/times.txt | grep -E "^[^0]" 4. After it has had a chance to warm up, start doing large transfers from another shell:
{noformat}
i=1
while (true); do
  echo $i
  let i++
  (/usr/bin/time -f %e curl -s -L -o /dev/null "http://namenode:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root")
done
{noformat}
It's easy to find after a minute or two that small reads will sometimes pause for 1-300 seconds. In some extreme cases, it appears that the transfers time out and the DN drops the connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8260) Erasure Coding: test of writing EC file
[ https://issues.apache.org/jira/browse/HDFS-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612761#comment-14612761 ] GAO Rui commented on HDFS-8260: --- Hi [~xinwei], is there any progress on the EC file writing test? And the reading test? Erasure Coding: test of writing EC file Key: HDFS-8260 URL: https://issues.apache.org/jira/browse/HDFS-8260 Project: Hadoop HDFS Issue Type: Test Affects Versions: HDFS-7285 Reporter: GAO Rui Assignee: Xinwei Qin 1. Normally writing an EC file (writing without datanode failure) 2. Writing an EC file with a tolerable number of datanodes failing. 3. Writing an EC file with an intolerable number of datanodes failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8680) OzoneHandler : Add Local StorageHandler support for volumes
[ https://issues.apache.org/jira/browse/HDFS-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612655#comment-14612655 ] Chris Nauroth commented on HDFS-8680: - Hi [~anu]. I just noticed that {{OzoneUtils}} constructs {{Date}} instances based on {{Time#monotonicNow}}. I expect this doesn't create correct dates, because the {{Date}} constructor expects milliseconds since 1970, but {{Time#monotonicNow}} is based on {{System#nanoTime}}, which can return any arbitrary number depending on the state of the timer. Switching to {{System#currentTimeMillis}} ought to fix this. OzoneHandler : Add Local StorageHandler support for volumes --- Key: HDFS-8680 URL: https://issues.apache.org/jira/browse/HDFS-8680 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8680-HDFS-7240.001.patch Add a local StorageHandler that can store data into local DB. This is useful for running tests against MiniDFSCluster in the stand-alone mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
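The timestamp pitfall Chris describes can be illustrated with a small, self-contained sketch (plain Java, no Hadoop dependencies; the {{monotonicNow}} method here is a stand-in for Hadoop's {{Time#monotonicNow}}, not the actual OzoneUtils code):

```java
import java.util.Date;

// Hypothetical demo: a java.util.Date built from a monotonic clock is
// meaningless as a calendar date, because System.nanoTime() has an
// arbitrary origin (often system uptime), not milliseconds since 1970.
public class ClockDemo {
    // Mimics Hadoop's Time#monotonicNow: nanoTime scaled to milliseconds.
    static long monotonicNow() {
        return System.nanoTime() / 1_000_000L;
    }

    public static void main(String[] args) {
        Date wrong = new Date(monotonicNow());             // arbitrary epoch offset
        Date right = new Date(System.currentTimeMillis()); // real wall-clock date
        System.out.println("monotonic-based date: " + wrong);
        System.out.println("wall-clock date:      " + right);
    }
}
```

On a typical machine the monotonic-based {{Date}} lands somewhere in January 1970, which is exactly the symptom of feeding a monotonic value into the millis-since-epoch constructor.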
[jira] [Updated] (HDFS-8620) Clean up the checkstyle warnings about ClientProtocol
[ https://issues.apache.org/jira/browse/HDFS-8620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-8620: --- Attachment: HDFS-8620.2.patch Sorry, my initial patch did not decrease the checkstyle warnings. I attached a second patch. Clean up the checkstyle warnings about ClientProtocol -- Key: HDFS-8620 URL: https://issues.apache.org/jira/browse/HDFS-8620 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Takanobu Asanuma Assignee: Takanobu Asanuma Attachments: HDFS-8620.1.patch, HDFS-8620.2.patch These warnings were generated in HDFS-8238. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8714) Folder ModificationTime in Millis Changed When NameNode is restarted
[ https://issues.apache.org/jira/browse/HDFS-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8714: Assignee: (was: Walter Su) Folder ModificationTime in Millis Changed When NameNode is restarted Key: HDFS-8714 URL: https://issues.apache.org/jira/browse/HDFS-8714 Project: Hadoop HDFS Issue Type: Bug Reporter: Chandan Biswas *Steps to Reproduce* # Steps to perform in a program ** Create a folder in HDFS ** Print the folder modificationTime in millis ** Upload or copy a file to this newly created folder ** Print the file and folder modificationTime in millis ** Restart the name node ** Print the file and folder modificationTime in millis # Expected Result ** the folder modification time should be the file modification time before the name node restart ** the folder modification time should not change after the name node restart # Actual result ** the folder modification time is not the same as the file modification time ** the folder modification time changes after the name node restart, to the file modification time *Impact of this behavior:* Before a task is launched, distributed cache files/folders are checked for any modification. The checks are done by comparing the file/folder modificationTime in millis. So any job that uses the distributed cache has a potential chance of failure if # the name node restarts and running tasks are resubmitted, or # e.g., among 100 tasks, 50 are queued to run.
Now name node restarts. Here is the sample code I used for testing:
{code}
// file creation in HDFS
final Path pathToFiles = new Path("/user/vagrant/chandan/test/");
fileSystem.mkdirs(pathToFiles);
System.out.println("HDFS Folder Modification Time in long Before file copy: "
    + fileSystem.getFileStatus(pathToFiles).getModificationTime());
FileUtil.copy(fileSystem, new Path("/user/cloudera/test"), fileSystem, pathToFiles, false, configuration);
System.out.println("HDFS File Modification Time in long: "
    + fileSystem.getFileStatus(new Path("/user/vagrant/chandan/test/test")).getModificationTime());
System.out.println("HDFS Folder Modification Time in long After file copy: "
    + fileSystem.getFileStatus(pathToFiles).getModificationTime());
for (int i = 0; i < 100; i++) {
  System.out.println("Normal HDFS Folder Modification Time in long: "
      + fileSystem.getFileStatus(pathToFiles).getModificationTime());
  System.out.println("Normal HDFS File Modification Time in long: "
      + fileSystem.getFileStatus(new Path("/user/vagrant/chandan/test/test")).getModificationTime());
  Thread.sleep(6 * 2);
}
{code}
Here is the output:
{code}
HDFS Folder Modification Time in long Before file copy:1435868217309
HDFS File Modification Time in long:1435868217368
HDFS Folder Modification Time in long After file copy:1435868217353
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217368
Normal HDFS File Modification Time in long:1435868217368
{code}
The last two lines are printed after name node restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8709) Clarify automatic sync in FSEditLog#logEdit
[ https://issues.apache.org/jira/browse/HDFS-8709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8709: -- Resolution: Fixed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Thanks for reviewing, Arpit. The failed test ran okay for me locally; committed to trunk and branch-2. Clarify automatic sync in FSEditLog#logEdit --- Key: HDFS-8709 URL: https://issues.apache.org/jira/browse/HDFS-8709 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.6-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Fix For: 2.8.0 Attachments: hdfs-8709.001.patch The code flow and comments regarding the logSync() in logEdit() are a little messy and could be improved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8712) Remove public and abstract modifiers in FsVolumeSpi and FsDatasetSpi
[ https://issues.apache.org/jira/browse/HDFS-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-8712: Attachment: HDFS-8712.000.patch Remove {{public}} and {{abstract}} modifier in {{FsVolumeSpi}} and {{FsDatasetSpi}} interfaces. No test is included, since there is no actual code change. Remove public and abstract modifiers in FsVolumeSpi and FsDatasetSpi Key: HDFS-8712 URL: https://issues.apache.org/jira/browse/HDFS-8712 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Trivial Attachments: HDFS-8712.000.patch In [Java Language Specification 9.4|http://docs.oracle.com/javase/specs/jls/se7/html/jls-9.html#jls-9.4]: bq. It is permitted, but discouraged as a matter of style, to redundantly specify the public and/or abstract modifier for a method declared in an interface. {{FsDatasetSpi}} and {{FsVolumeSpi}} mark methods as public, which cause many warnings in IDEs and {{checkstyle}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8694) Expose the stats of IOErrors on each FsVolume through JMX
[ https://issues.apache.org/jira/browse/HDFS-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-8694: Attachment: HDFS-8694.001.patch Updated the patch to fix {{TestDataTransferKeepalive}}; the other failed unit tests passed locally on my machine. Not sure about the timed-out tests, so I am uploading this patch to test again. Expose the stats of IOErrors on each FsVolume through JMX - Key: HDFS-8694 URL: https://issues.apache.org/jira/browse/HDFS-8694 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, HDFS Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-8694.000.patch, HDFS-8694.001.patch Currently, once DataNode hits an {{IOError}} when writing / reading block files, it starts a background {{DiskChecker.checkDirs()}} thread. But even if this thread finishes successfully, the DN does not record this {{IOError}}. We need one measurement to count all {{IOErrors}} for each volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8712) Remove public and abstract modifiers in FsVolumeSpi and FsDatasetSpi
[ https://issues.apache.org/jira/browse/HDFS-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-8712: Status: Patch Available (was: Open) Remove public and abstract modifiers in FsVolumeSpi and FsDatasetSpi Key: HDFS-8712 URL: https://issues.apache.org/jira/browse/HDFS-8712 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Trivial Attachments: HDFS-8712.000.patch In [Java Language Specification 9.4|http://docs.oracle.com/javase/specs/jls/se7/html/jls-9.html#jls-9.4]: bq. It is permitted, but discouraged as a matter of style, to redundantly specify the public and/or abstract modifier for a method declared in an interface. {{FsDatasetSpi}} and {{FsVolumeSpi}} mark methods as public, which cause many warnings in IDEs and {{checkstyle}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8712) Remove public and abstract modifiers in FsVolumeSpi and FsDatasetSpi
[ https://issues.apache.org/jira/browse/HDFS-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612312#comment-14612312 ] Andrew Wang commented on HDFS-8712: --- LGTM +1 pending, will help reduce the # of checkstyle warnings too. Remove public and abstract modifiers in FsVolumeSpi and FsDatasetSpi Key: HDFS-8712 URL: https://issues.apache.org/jira/browse/HDFS-8712 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Trivial Attachments: HDFS-8712.000.patch In [Java Language Specification 9.4|http://docs.oracle.com/javase/specs/jls/se7/html/jls-9.html#jls-9.4]: bq. It is permitted, but discouraged as a matter of style, to redundantly specify the public and/or abstract modifier for a method declared in an interface. {{FsDatasetSpi}} and {{FsVolumeSpi}} mark methods as public, which cause many warnings in IDEs and {{checkstyle}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8712) Remove public and abstract modifiers in FsVolumeSpi and FsDatasetSpi
Lei (Eddy) Xu created HDFS-8712: --- Summary: Remove public and abstract modifiers in FsVolumeSpi and FsDatasetSpi Key: HDFS-8712 URL: https://issues.apache.org/jira/browse/HDFS-8712 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Trivial In [Java Language Specification 9.4|http://docs.oracle.com/javase/specs/jls/se7/html/jls-9.html#jls-9.4]: bq. It is permitted, but discouraged as a matter of style, to redundantly specify the public and/or abstract modifier for a method declared in an interface. {{FsDatasetSpi}} and {{FsVolumeSpi}} mark methods as public, which cause many warnings in IDEs and {{checkstyle}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8713) Convert DatanodeDescriptor to use SLF4J logging
Andrew Wang created HDFS-8713: - Summary: Convert DatanodeDescriptor to use SLF4J logging Key: HDFS-8713 URL: https://issues.apache.org/jira/browse/HDFS-8713 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.6-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Trivial Let's convert this class to use SLF4J -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8713) Convert DatanodeDescriptor to use SLF4J logging
[ https://issues.apache.org/jira/browse/HDFS-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8713: -- Attachment: hdfs-8713.001.patch Patch attached. I also added an if guard to an INFO print I saw spamming the NN log ("Number of failed storages changes from 0 to 0"), I think due to a NN restart. Convert DatanodeDescriptor to use SLF4J logging --- Key: HDFS-8713 URL: https://issues.apache.org/jira/browse/HDFS-8713 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.6-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Trivial Attachments: hdfs-8713.001.patch Let's convert this class to use SLF4J -- This message was sent by Atlassian JIRA (v6.3.4#6332)
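The guard Andrew describes can be sketched as follows. This is a simplified stand-in with hypothetical field and method names (not the actual {{DatanodeDescriptor}} code, and a plain list replaces the SLF4J {{Logger}} so the sketch is dependency-free); it assumes the spam came from logging a "change" even when the old and new counts were equal:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of guarding a state-change INFO log.
public class StorageLogDemo {
    private int failedStorages = 0;
    final List<String> log = new ArrayList<>();  // stands in for an SLF4J Logger

    void updateFailedStorages(int newCount) {
        // Guard: only log when the count actually changes, so a restart
        // that re-reports "0 -> 0" does not spam the log.
        if (newCount != failedStorages) {
            log.add("Number of failed storages changes from "
                    + failedStorages + " to " + newCount);
        }
        failedStorages = newCount;
    }
}
```

With SLF4J the message body would instead use parameterized logging ({{LOG.info("... from {} to {}", old, newCount)}}), which also avoids string concatenation when the level is disabled.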
[jira] [Commented] (HDFS-8713) Convert DatanodeDescriptor to use SLF4J logging
[ https://issues.apache.org/jira/browse/HDFS-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612456#comment-14612456 ] Lei (Eddy) Xu commented on HDFS-8713: - LGTM. +1, pending Jenkins. Thanks [~andrew.wang] Convert DatanodeDescriptor to use SLF4J logging --- Key: HDFS-8713 URL: https://issues.apache.org/jira/browse/HDFS-8713 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.6-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Trivial Attachments: hdfs-8713.001.patch Let's convert this class to use SLF4J -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8714) Folder ModificationTime in Millis Changed When NameNode is restarted
Chandan Biswas created HDFS-8714: Summary: Folder ModificationTime in Millis Changed When NameNode is restarted Key: HDFS-8714 URL: https://issues.apache.org/jira/browse/HDFS-8714 Project: Hadoop HDFS Issue Type: Bug Reporter: Chandan Biswas *Steps to Reproduce* # Steps to perform in a program ** Create a folder in HDFS ** Print the folder modificationTime in millis ** Upload or copy a file to this newly created folder ** Print the file and folder modificationTime in millis ** Restart the name node ** Print the file and folder modificationTime in millis # Expected Result ** the folder modification time should be the file modification time before the name node restart ** the folder modification time should not change after the name node restart # Actual result ** the folder modification time is not the same as the file modification time ** the folder modification time changes after the name node restart, to the file modification time *Impact of this behavior:* Before a task is launched, distributed cache files/folders are checked for any modification. The checks are done by comparing the file/folder modificationTime in millis. So any job that uses the distributed cache has a potential chance of failure if # the name node restarts and running tasks are resubmitted, or # e.g., among 100 tasks, 50 are queued to run.
Now name node restarts. Here is the sample code I used for testing:
{code}
// file creation in HDFS
final Path pathToFiles = new Path("/user/vagrant/chandan/test/");
fileSystem.mkdirs(pathToFiles);
System.out.println("HDFS Folder Modification Time in long Before file copy: "
    + fileSystem.getFileStatus(pathToFiles).getModificationTime());
FileUtil.copy(fileSystem, new Path("/user/cloudera/test"), fileSystem, pathToFiles, false, configuration);
System.out.println("HDFS File Modification Time in long: "
    + fileSystem.getFileStatus(new Path("/user/vagrant/chandan/test/test")).getModificationTime());
System.out.println("HDFS Folder Modification Time in long After file copy: "
    + fileSystem.getFileStatus(pathToFiles).getModificationTime());
for (int i = 0; i < 100; i++) {
  System.out.println("Normal HDFS Folder Modification Time in long: "
      + fileSystem.getFileStatus(pathToFiles).getModificationTime());
  System.out.println("Normal HDFS File Modification Time in long: "
      + fileSystem.getFileStatus(new Path("/user/vagrant/chandan/test/test")).getModificationTime());
  Thread.sleep(6 * 2);
}
{code}
Here is the output:
{code}
HDFS Folder Modification Time in long Before file copy:1435868217309
HDFS File Modification Time in long:1435868217368
HDFS Folder Modification Time in long After file copy:1435868217353
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217368
Normal HDFS File Modification Time in long:1435868217368
{code}
The last two lines are printed after name node restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8713) Convert DatanodeDescriptor to use SLF4J logging
[ https://issues.apache.org/jira/browse/HDFS-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8713: -- Status: Patch Available (was: Open) Convert DatanodeDescriptor to use SLF4J logging --- Key: HDFS-8713 URL: https://issues.apache.org/jira/browse/HDFS-8713 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.6-alpha Reporter: Andrew Wang Assignee: Andrew Wang Priority: Trivial Attachments: hdfs-8713.001.patch Let's convert this class to use SLF4J -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8680) OzoneHandler : Add Local StorageHandler support for volumes
[ https://issues.apache.org/jira/browse/HDFS-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612530#comment-14612530 ] Chris Nauroth commented on HDFS-8680: - Hi [~anu]. This looks good, and thank you for the thorough comments in {{OzoneMetadataManager}}. I just have one minor nitpick. Please rename the methods in {{TestVolumeStructs}} to use camel-case (e.g. {{testVolumeInfoParse}} instead of {{TestVolumeInfoParse}}). OzoneHandler : Add Local StorageHandler support for volumes --- Key: HDFS-8680 URL: https://issues.apache.org/jira/browse/HDFS-8680 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8680-HDFS-7240.001.patch Add a local StorageHandler that can store data into local DB. This is useful for running tests against MiniDFSCluster in the stand-alone mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8654) OzoneHandler : Add ACL support
[ https://issues.apache.org/jira/browse/HDFS-8654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8654: --- Attachment: hdfs-8654-HDFS-7240.002.patch [~arpitagarwal] Thanks for the review. I have updated this patch with the changes you suggested, basically making the enum objects support getTypeFromName calls, since valueOf needs the exact string. OzoneHandler : Add ACL support -- Key: HDFS-8654 URL: https://issues.apache.org/jira/browse/HDFS-8654 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8654-HDFS-7240.001.patch, hdfs-8654-HDFS-7240.002.patch Add ACL support which is needed by Ozone Buckets -- This message was sent by Atlassian JIRA (v6.3.4#6332)
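The difference between {{Enum.valueOf}} and a tolerant lookup can be sketched like this (a hypothetical ACL-type enum with illustrative names; not the actual patch code):

```java
// Hypothetical enum with a name-based lookup. Enum.valueOf("user") would
// throw IllegalArgumentException because valueOf requires the exact
// constant name ("USER"); a getTypeFromName-style helper can be
// case-insensitive instead.
public enum AclType {
    USER, GROUP, WORLD;

    public static AclType getTypeFromName(String name) {
        for (AclType t : values()) {
            if (t.name().equalsIgnoreCase(name)) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown ACL type: " + name);
    }
}
```

This keeps parsing of user-supplied strings (e.g. from HTTP requests) out of {{valueOf}}'s strict exact-match semantics while still failing loudly on genuinely unknown values.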
[jira] [Updated] (HDFS-8694) Expose the stats of IOErrors on each FsVolume through JMX
[ https://issues.apache.org/jira/browse/HDFS-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-8694: Attachment: (was: HDFS-8694.001.patch) Expose the stats of IOErrors on each FsVolume through JMX - Key: HDFS-8694 URL: https://issues.apache.org/jira/browse/HDFS-8694 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, HDFS Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-8694.000.patch Currently, once DataNode hits an {{IOError}} when writing / reading block files, it starts a background {{DiskChecker.checkDirs()}} thread. But if this thread successfully finishes, DN does not record this {{IOError}}. We need one measurement to count all {{IOErrors}} for each volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8694) Expose the stats of IOErrors on each FsVolume through JMX
[ https://issues.apache.org/jira/browse/HDFS-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-8694: Attachment: HDFS-8694.001.patch Re-upload to fix {{TestMover}} failure. Passed all known failures in the previous Jenkins run. Expose the stats of IOErrors on each FsVolume through JMX - Key: HDFS-8694 URL: https://issues.apache.org/jira/browse/HDFS-8694 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, HDFS Affects Versions: 2.7.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-8694.000.patch, HDFS-8694.001.patch Currently, once DataNode hits an {{IOError}} when writing / reading block files, it starts a background {{DiskChecker.checkDirs()}} thread. But if this thread successfully finishes, DN does not record this {{IOError}}. We need one measurement to count all {{IOErrors}} for each volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)