[jira] [Updated] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-30 Thread Xinwei Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinwei Qin  updated HDFS-7859:
--
Attachment: HDFS-7859-HDFS-7285.003.patch

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 
 Attachments: HDFS-7859-HDFS-7285.002.patch, 
 HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
 HDFS-7859.001.patch, HDFS-7859.002.patch


 In a meetup discussion with [~zhz] and [~jingzhao], it was suggested that we 
 persist EC schemas in the NameNode centrally and reliably, so that EC zones 
 can reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command

2015-04-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-8137 started by Uma Maheswara Rao G.
-
 Sends the EC schema to DataNode as well in EC encoding/recovering command
 -

 Key: HDFS-8137
 URL: https://issues.apache.org/jira/browse/HDFS-8137
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Uma Maheswara Rao G
 Attachments: HDFS-8137-0.patch


 Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the 
 EC schema to the DataNode, contained in the EC encoding/recovering command. 
 The target DataNode will use it to guide the execution of the task. Another 
 way would be for the DataNode to request the schema actively through a 
 separate RPC call; as an optimization, the DataNode may cache schemas to 
 avoid repeatedly asking for the same schema.
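
 For illustration of the caching alternative above, a minimal sketch (the RPC 
 call and the ECSchema import path are assumptions, not APIs from this patch):
 {code}
 import java.io.IOException;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;

 import org.apache.hadoop.io.erasurecode.ECSchema;  // assumed location

 // Hypothetical DataNode-side cache; fetchSchemaFromNameNode stands in for
 // the separate RPC call discussed above and is not a real API.
 public class SchemaCacheSketch {
   private final Map<String, ECSchema> cache = new ConcurrentHashMap<>();

   public ECSchema getSchema(String name) throws IOException {
     ECSchema schema = cache.get(name);
     if (schema == null) {
       schema = fetchSchemaFromNameNode(name);  // assumed RPC
       cache.put(name, schema);                 // avoid asking twice
     }
     return schema;
   }

   private ECSchema fetchSchemaFromNameNode(String name) throws IOException {
     throw new UnsupportedOperationException("placeholder for the RPC call");
   }
 }
 {code}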



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8183) Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang resolved HDFS-8183.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285
 Hadoop Flags: Reviewed

The patch LGTM. +1 and I just committed it to the branch (since the change is 
simple we can probably watch Jenkins later). Thanks Rakesh for the contribution!

 Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads
 --

 Key: HDFS-8183
 URL: https://issues.apache.org/jira/browse/HDFS-8183
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Fix For: HDFS-7285

 Attachments: HDFS-8183-001.patch, HDFS-8183-002.patch


 The idea of this task is to improve the closing of all the streamers. 
 Presently, if any of the streamers throws an exception, it returns 
 immediately. This leaves all the other streamer threads running. Instead, it 
 is better to handle the exceptions of each streamer independently.
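
 For illustration only (not the committed patch), a sketch of closing every 
 streamer and deferring the first failure until all of them have been closed:
 {code}
 import java.io.IOException;
 import java.util.List;

 // Illustrative sketch: handle each streamer's failure independently
 // instead of returning on the first exception.
 final class StreamerCloserSketch {
   static void closeAll(List<? extends AutoCloseable> streamers)
       throws IOException {
     IOException first = null;
     for (AutoCloseable streamer : streamers) {
       try {
         streamer.close();
       } catch (Exception e) {
         if (first == null) {
           first = new IOException(e);  // remember the first failure
         }
       }
     }
     if (first != null) {
       throw first;  // surface it only after all streamers are closed
     }
   }
 }
 {code}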



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-7949:
---
Status: Patch Available  (was: In Progress)

 WebImageViewer needs to support file size calculation with striped blocks
 -

 Key: HDFS-7949
 URL: https://issues.apache.org/jira/browse/HDFS-7949
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Hui Zheng
Assignee: Rakesh R
 Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, 
 HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, 
 HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch


 The file size calculation in WebImageViewer should be changed when the blocks 
 of the file are striped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval is configured as zero

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521152#comment-14521152
 ] 

Hadoop QA commented on HDFS-8276:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 40s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   7m 43s | The applied patch generated  1 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  7s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 224m 19s | Tests failed in hadoop-hdfs. |
| | | 272m 48s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such.  At 
DataStreamer.java:[lines 177-201] |
| Failed unit tests | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
| Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729099/HDFS-8276.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / aa22450 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/checkstyle-result-diff.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10468/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10468/console |


This message was automatically generated.

 LazyPersistFileScrubber should be disabled if scrubber interval is configured 
 as zero
 ---

 Key: HDFS-8276
 URL: https://issues.apache.org/jira/browse/HDFS-8276
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8276.patch


 bq. but I think it is simple enough to change the meaning of the value so 
 that zero means 'never scrub'. Let me post an updated patch.
 As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], 
 the scrubber should be disabled if 
 *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero.
 Currently, NameNode startup fails if the interval is configured as zero:
 {code}
 2015-04-27 23:47:31,744 ERROR 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
 initialization failed.
 java.lang.IllegalArgumentException: 
 {code}
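
 A minimal sketch of the intended zero-means-disabled handling (the key string 
 is from this description; the 300-second default and helper shape are 
 assumptions):
 {code}
 import org.apache.hadoop.conf.Configuration;

 // Illustrative sketch: treat a zero interval as "never scrub" instead of
 // failing FSNamesystem initialization.
 final class ScrubberConfigSketch {
   static final String SCRUB_INTERVAL_KEY =
       "dfs.namenode.lazypersist.file.scrub.interval.sec";

   /** Returns 0 when scrubbing should be disabled entirely. */
   static int scrubIntervalSec(Configuration conf) {
     int interval = conf.getInt(SCRUB_INTERVAL_KEY, 300);  // assumed default
     if (interval < 0) {
       throw new IllegalArgumentException(SCRUB_INTERVAL_KEY
           + " must not be negative: " + interval);
     }
     return interval;  // 0 means "never scrub"; caller skips the daemon
   }
 }
 {code}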
 

[jira] [Updated] (HDFS-8178) QJM doesn't purge empty and corrupt inprogress edits files

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8178:

Description: When a QJM crashes, the in-progress edit log file at that time 
remains in the file system. When the node comes back, it will accept new edit 
logs and those stale in-progress files are never cleaned up. QJM treats them as 
regular in-progress edit log files and tries to finalize them, which 
potentially causes high memory usage. This JIRA aims to move aside those stale 
edit log files to avoid this scenario.  (was: HDFS-5919 fixes the issue for 
{{FileJournalManager}}. A similar fix is needed for QJM.)

 QJM doesn't purge empty and corrupt inprogress edits files
 --

 Key: HDFS-8178
 URL: https://issues.apache.org/jira/browse/HDFS-8178
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: qjm
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-8178.000.patch


 When a QJM crashes, the in-progress edit log file at that time remains in the 
 file system. When the node comes back, it will accept new edit logs and those 
 stale in-progress files are never cleaned up. QJM treats them as regular 
 in-progress edit log files and tries to finalize them, which potentially 
 causes high memory usage. This JIRA aims to move aside those stale edit log 
 files to avoid this scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521005#comment-14521005
 ] 

Zhe Zhang commented on HDFS-8178:
-

Oops, the last paragraph was added by mistake; please ignore it.

 QJM doesn't move aside stale inprogress edits files
 ---

 Key: HDFS-8178
 URL: https://issues.apache.org/jira/browse/HDFS-8178
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: qjm
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-8178.000.patch


 When a QJM crashes, the in-progress edit log file at that time remains in the 
 file system. When the node comes back, it will accept new edit logs and those 
 stale in-progress files are never cleaned up. QJM treats them as regular 
 in-progress edit log files and tries to finalize them, which potentially 
 causes high memory usage. This JIRA aims to move aside those stale edit log 
 files to avoid this scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521002#comment-14521002
 ] 

Zhe Zhang commented on HDFS-8178:
-

Thanks ATM for the helpful review! After looking at HDFS-5919 more closely, we 
are actually trying to solve a different problem here. The objective of 
HDFS-5919 is solely to save disk space (since FJM doesn't try to process those 
corrupt/empty files anyway). It's a safe cleanup, making sure the tx IDs of 
empty / corrupt files are old enough before purging. So I think we should do 
the same in QJM.

Our main target here is _stale_ in-progress edit log files, which are not 
necessarily empty/corrupt (so they won't be marked as such). As the updated 
description states, we want to properly take care of those files so QJM doesn't 
try to process them. I like your proposal to rename / move aside those files 
and remove them once they are older than {{minTxIdToKeep}}. I'll update the 
patch based on this idea.

I also propose we do the same for corrupt / empty files, for both FJM and QJM.
 QJM doesn't move aside stale inprogress edits files
 ---

 Key: HDFS-8178
 URL: https://issues.apache.org/jira/browse/HDFS-8178
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: qjm
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-8178.000.patch


 When a QJM crashes, the in-progress edit log file at that time remains in the 
 file system. When the node comes back, it will accept new edit logs and those 
 stale in-progress files are never cleaned up. QJM treats them as regular 
 in-progress edit log files and tries to finalize them, which potentially 
 causes high memory usage. This JIRA aims to move aside those stale edit log 
 files to avoid this scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8294:
---
Attachment: HDFS-8294-HDFS-7285.00.patch

 Erasure Coding: Fix Findbug warnings present in erasure coding
 --

 Key: HDFS-8294
 URL: https://issues.apache.org/jira/browse/HDFS-8294
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8294-HDFS-7285.00.patch


 Following are the Findbugs warnings (sketches of typical fixes follow the 
 list):
 # Possible null pointer dereference of arr$ in 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
 {code}
 Bug type NP_NULL_ON_SOME_PATH (click for details) 
 In class 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction
 In method 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
 Value loaded from arr$
 Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206]
 Known null at BlockInfoStripedUnderConstruction.java:[line 200]
 {code}
 # Found reliance on default encoding in 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
  ECSchema): String.getBytes()
 Found reliance on default encoding in 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):
  new String(byte[])
 {code}
 Bug type DM_DEFAULT_ENCODING (click for details) 
 In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager
 In method 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
  ECSchema)
 Called method String.getBytes()
 At ErasureCodingZoneManager.java:[line 116]
 Bug type DM_DEFAULT_ENCODING (click for details) 
 In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager
 In method 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath)
 Called method new String(byte[])
 At ErasureCodingZoneManager.java:[line 81]
 {code}
 # Inconsistent synchronization of 
 org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time
 {code}
 Bug type IS2_INCONSISTENT_SYNC (click for details) 
 In class org.apache.hadoop.hdfs.DFSOutputStream
 Field org.apache.hadoop.hdfs.DFSOutputStream.streamer
 Synchronized 90% of the time
 Unsynchronized access at DFSOutputStream.java:[line 142]
 Unsynchronized access at DFSOutputStream.java:[line 853]
 Unsynchronized access at DFSOutputStream.java:[line 617]
 Unsynchronized access at DFSOutputStream.java:[line 620]
 Unsynchronized access at DFSOutputStream.java:[line 630]
 Unsynchronized access at DFSOutputStream.java:[line 338]
 Unsynchronized access at DFSOutputStream.java:[line 734]
 Unsynchronized access at DFSOutputStream.java:[line 897]
 {code}
 # Dead store to offSuccess in 
 org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()
 {code}
 Bug type DLS_DEAD_LOCAL_STORE (click for details) 
 In class org.apache.hadoop.hdfs.StripedDataStreamer
 In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()
 Local variable named offSuccess
 At StripedDataStreamer.java:[line 105]
 {code}
 # Result of integer multiplication cast to long in 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()
 {code}
 Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
 In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped
 In method 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()
 At BlockInfoStriped.java:[line 208]
 {code}
 # Result of integer multiplication cast to long in 
 org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
  int, int, int, int)
 {code}
 Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
 In class org.apache.hadoop.hdfs.util.StripedBlockUtil
 In method 
 org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
  int, int, int, int)
 At StripedBlockUtil.java:[line 85]
 {code}
 # Switch statement found in 
 org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, 
 long, byte[], int, Map) where default case is missing
 {code}
 Bug type SF_SWITCH_NO_DEFAULT (click for details) 
 In class org.apache.hadoop.hdfs.DFSStripedInputStream
 In method 
 org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, 
 long, byte[], int, Map)
 At DFSStripedInputStream.java:[lines 468-491]
 {code}
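
 For illustration, hedged sketches of typical fixes for two of the warning 
 classes above (the actual patch may differ):
 {code}
 import java.nio.charset.StandardCharsets;

 final class FindbugsFixSketch {  // illustrative only, not the patch
   // DM_DEFAULT_ENCODING: name an explicit charset instead of relying on
   // the platform default.
   static byte[] encode(String zonePath) {
     return zonePath.getBytes(StandardCharsets.UTF_8);
   }

   static String decode(byte[] raw) {
     return new String(raw, StandardCharsets.UTF_8);
   }

   // ICAST_INTEGER_MULTIPLY_CAST_TO_LONG: widen an operand before the
   // multiply so the arithmetic itself is done in long, not int.
   static long spaceConsumed(int numBlocks, int blockSize) {
     return (long) numBlocks * blockSize;  // not (long)(numBlocks * blockSize)
   }
 }
 {code}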



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement

2015-04-30 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520986#comment-14520986
 ] 

Chris Nauroth commented on HDFS-8283:
-

These test failures might be related too:

https://builds.apache.org/job/PreCommit-HDFS-Build/10455/testReport/

 DataStreamer cleanup and some minor improvement
 ---

 Key: HDFS-8283
 URL: https://issues.apache.org/jira/browse/HDFS-8283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.8.0

 Attachments: h8283_20150428.patch


 - When throwing an exception
 -* always set lastException 
 -* always creating a new exception so that it has the new stack trace
 - Add LOG.
 - Add final to isAppend and favoredNodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8290) WebHDFS calls before namesystem initialization can cause NullPointerException.

2015-04-30 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520990#comment-14520990
 ] 

Chris Nauroth commented on HDFS-8290:
-

The Findbugs warning is in an unrelated part of the codebase.  It's possible 
that both the Findbugs warning and the test failures were introduced by 
HDFS-8283.  I'm waiting for confirmation before I commit this.

 WebHDFS calls before namesystem initialization can cause NullPointerException.
 --

 Key: HDFS-8290
 URL: https://issues.apache.org/jira/browse/HDFS-8290
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.6.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Priority: Minor
 Attachments: HDFS-8290.001.patch


 The NameNode has a brief window of time when the HTTP server has been 
 initialized, but the namesystem has not been initialized.  During this 
 window, a WebHDFS call can cause a {{NullPointerException}}.  We can catch 
 this condition and return a more meaningful error.
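
 A minimal sketch of such a guard (the message and helper are assumptions, not 
 necessarily the attached patch):
 {code}
 import java.io.IOException;

 import org.apache.hadoop.hdfs.server.namenode.NameNode;

 // Illustrative sketch: reject WebHDFS calls with a clear error while the
 // namesystem is still null during startup.
 final class StartupGuardSketch {
   static void checkNamesystem(NameNode namenode) throws IOException {
     if (namenode.getNamesystem() == null) {  // the startup window above
       throw new IOException(
           "Namesystem has not been initialized yet; retry the request later.");
     }
   }
 }
 {code}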



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521014#comment-14521014
 ] 

Zhe Zhang commented on HDFS-7949:
-

Thanks Rakesh! The patch LGTM, +1 pending a Jenkins run. Do you mind clicking 
Submit Patch and renaming the patch to HDFS-7949-HDFS-7285.007.patch?

 WebImageViewer needs to support file size calculation with striped blocks
 -

 Key: HDFS-7949
 URL: https://issues.apache.org/jira/browse/HDFS-7949
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Hui Zheng
Assignee: Rakesh R
Priority: Minor
 Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, 
 HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, 
 HDFS-7949-006.patch, HDFS-7949-007.patch


 The file size calculation in WebImageViewer should be changed when the blocks 
 of the file are striped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command

2015-04-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-8137:
--
Attachment: HDFS-8137-0.patch

I generated an initial patch for review!
We are supposed to get schema values from ECSchemaManager, but right now I 
don't see a better way to get them from ECSchemaManager, so I added an API to 
get the schema from BlockCollection itself, like the isStriped API in it. This 
is because BlockManager communicates with the namesystem via the Namesystem 
interface, and I don't think it's right to add APIs there for every new 
feature. BlockCollection is another such interface, so I added the API there. 
Logically, though, Namesystem may be the correct place to add getECSchema for 
a file path, but I am not too strong on that. I would like to hear suggestions 
on that, if any.
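
For concreteness, a sketch of the interface addition described above (the 
method name comes from this comment; the import path and shape are 
assumptions):
{code}
import org.apache.hadoop.io.erasurecode.ECSchema;  // assumed location

// Illustrative sketch of the BlockCollection change, not the actual diff.
public interface BlockCollection {
  boolean isStriped();     // existing query used as the model
  ECSchema getECSchema();  // new: schema of the file this collection backs
}
{code}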

 Sends the EC schema to DataNode as well in EC encoding/recovering command
 -

 Key: HDFS-8137
 URL: https://issues.apache.org/jira/browse/HDFS-8137
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Uma Maheswara Rao G
 Attachments: HDFS-8137-0.patch


 Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the 
 EC schema to the DataNode, contained in the EC encoding/recovering command. 
 The target DataNode will use it to guide the execution of the task. Another 
 way would be for the DataNode to request the schema actively through a 
 separate RPC call; as an optimization, the DataNode may cache schemas to 
 avoid repeatedly asking for the same schema.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520994#comment-14520994
 ] 

Zhe Zhang commented on HDFS-8282:
-

Thanks Yi for reviewing again! I just committed it to the branch.

 Erasure coding: move striped reading logic to StripedBlockUtil
 --

 Key: HDFS-8282
 URL: https://issues.apache.org/jira/browse/HDFS-8282
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-8282-HDFS-7285.00.patch, 
 HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8282:

   Resolution: Fixed
Fix Version/s: HDFS-7285
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

[~hitliuyi] We need to rebase both HDFS-7678 and HDFS-7348 against this change.

 Erasure coding: move striped reading logic to StripedBlockUtil
 --

 Key: HDFS-8282
 URL: https://issues.apache.org/jira/browse/HDFS-8282
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Fix For: HDFS-7285

 Attachments: HDFS-8282-HDFS-7285.00.patch, 
 HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8183) Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads

2015-04-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521053#comment-14521053
 ] 

Rakesh R commented on HDFS-8183:


Thank you [~zhz] for reviewing and committing the changes.

 Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads
 --

 Key: HDFS-8183
 URL: https://issues.apache.org/jira/browse/HDFS-8183
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Fix For: HDFS-7285

 Attachments: HDFS-8183-001.patch, HDFS-8183-002.patch


 The idea of this task is to improve the closing of all the streamers. 
 Presently, if any of the streamers throws an exception, it returns 
 immediately. This leaves all the other streamer threads running. Instead, it 
 is better to handle the exceptions of each streamer independently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode

2015-04-30 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Status: Open  (was: Patch Available)

 HDFS client gets errors trying to connect to IPv6 DataNode
 -

 Key: HDFS-8078
 URL: https://issues.apache.org/jira/browse/HDFS-8078
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.6.0
Reporter: Nate Edel
Assignee: Nate Edel
  Labels: ipv6
 Attachments: HDFS-8078.7.patch


 1st exception, on put:
 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
 java.lang.IllegalArgumentException: Does not contain a valid host:port 
 authority: 2401:db00:1010:70ba:face:0:8:0:50010
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
 Appears to actually stem from code in DataNodeID which assumes it's safe to 
 append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for 
 IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which 
 requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
 Currently using InetAddress.getByName() to validate IPv6 (guava 
 InetAddresses.forString has been flaky) but could also use our own parsing. 
 (From logging this, it seems like a low-enough frequency call that the extra 
 object creation shouldn't be problematic, and for me the slight risk of 
 passing in bad input that is not actually an IPv4 or IPv6 address and thus 
 calling an external DNS lookup is outweighed by getting the address 
 normalized and avoiding rewriting parsing.)
 Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
 ---
 2nd exception (on datanode)
 15/04/13 13:18:07 ERROR datanode.DataNode: 
 dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
 operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
 /2401:db00:11:d010:face:0:2f:0:50010
 java.io.EOFException
 at java.io.DataInputStream.readShort(DataInputStream.java:315)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
 at java.lang.Thread.run(Thread.java:745)
 Which also comes as client error -get: 2401 is not an IP string literal.
 This one has existing parsing logic which needs to shift to the last colon 
 rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
 rather than split.  Could alternatively use the techniques above.
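
 For illustration, a sketch of the two fixes described here (class and method 
 names are hypothetical):
 {code}
 import java.net.InetAddress;
 import java.net.UnknownHostException;

 // Illustrative sketch: bracket IPv6 literals when composing an authority
 // string, and split host:port on the LAST colon when parsing.
 final class HostPortSketch {
   static String toAuthority(String ipAddr, int port)
       throws UnknownHostException {
     // For a valid address literal, getByName() parses without a DNS lookup.
     boolean v6 = InetAddress.getByName(ipAddr).getAddress().length == 16;
     return (v6 ? "[" + ipAddr + "]" : ipAddr) + ":" + port;
   }

   static int parsePort(String authority) {
     int idx = authority.lastIndexOf(':');  // last colon, not the first
     return Integer.parseInt(authority.substring(idx + 1));
   }
 }
 {code}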



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode

2015-04-30 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Attachment: (was: HDFS-8078.6.patch)

 HDFS client gets errors trying to connect to IPv6 DataNode
 -

 Key: HDFS-8078
 URL: https://issues.apache.org/jira/browse/HDFS-8078
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.6.0
Reporter: Nate Edel
Assignee: Nate Edel
  Labels: ipv6
 Attachments: HDFS-8078.7.patch


 1st exception, on put:
 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
 java.lang.IllegalArgumentException: Does not contain a valid host:port 
 authority: 2401:db00:1010:70ba:face:0:8:0:50010
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
 Appears to actually stem from code in DataNodeID which assumes it's safe to 
 append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for 
 IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which 
 requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
 Currently using InetAddress.getByName() to validate IPv6 (guava 
 InetAddresses.forString has been flaky) but could also use our own parsing. 
 (From logging this, it seems like a low-enough frequency call that the extra 
 object creation shouldn't be problematic, and for me the slight risk of 
 passing in bad input that is not actually an IPv4 or IPv6 address and thus 
 calling an external DNS lookup is outweighed by getting the address 
 normalized and avoiding rewriting parsing.)
 Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
 ---
 2nd exception (on datanode)
 15/04/13 13:18:07 ERROR datanode.DataNode: 
 dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
 operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
 /2401:db00:11:d010:face:0:2f:0:50010
 java.io.EOFException
 at java.io.DataInputStream.readShort(DataInputStream.java:315)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
 at java.lang.Thread.run(Thread.java:745)
 Which also comes as client error -get: 2401 is not an IP string literal.
 This one has existing parsing logic which needs to shift to the last colon 
 rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
 rather than split.  Could alternatively use the techniques above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode

2015-04-30 Thread Nate Edel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Edel updated HDFS-8078:

Status: Patch Available  (was: Open)

 HDFS client gets errors trying to connect to IPv6 DataNode
 -

 Key: HDFS-8078
 URL: https://issues.apache.org/jira/browse/HDFS-8078
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.6.0
Reporter: Nate Edel
Assignee: Nate Edel
  Labels: ipv6
 Attachments: HDFS-8078.7.patch


 1st exception, on put:
 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
 java.lang.IllegalArgumentException: Does not contain a valid host:port 
 authority: 2401:db00:1010:70ba:face:0:8:0:50010
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
 Appears to actually stem from code in DataNodeID which assumes it's safe to 
 append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for 
 IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which 
 requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
 Currently using InetAddress.getByName() to validate IPv6 (guava 
 InetAddresses.forString has been flaky) but could also use our own parsing. 
 (From logging this, it seems like a low-enough frequency call that the extra 
 object creation shouldn't be problematic, and for me the slight risk of 
 passing in bad input that is not actually an IPv4 or IPv6 address and thus 
 calling an external DNS lookup is outweighed by getting the address 
 normalized and avoiding rewriting parsing.)
 Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
 ---
 2nd exception (on datanode)
 15/04/13 13:18:07 ERROR datanode.DataNode: 
 dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown 
 operation  src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
 /2401:db00:11:d010:face:0:2f:0:50010
 java.io.EOFException
 at java.io.DataInputStream.readShort(DataInputStream.java:315)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
 at java.lang.Thread.run(Thread.java:745)
 Which also comes as client error -get: 2401 is not an IP string literal.
 This one has existing parsing logic which needs to shift to the last colon 
 rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
 rather than split.  Could alternatively use the techniques above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8178) QJM doesn't move aside stale inprogress edits files

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8178:

Summary: QJM doesn't move aside stale inprogress edits files  (was: QJM 
doesn't purge empty and corrupt inprogress edits files)

 QJM doesn't move aside stale inprogress edits files
 ---

 Key: HDFS-8178
 URL: https://issues.apache.org/jira/browse/HDFS-8178
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: qjm
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-8178.000.patch


 When a QJM crashes, the in-progress edit log file at that time remains in the 
 file system. When the node comes back, it will accept new edit logs and those 
 stale in-progress files are never cleaned up. QJM treats them as regular 
 in-progress edit log files and tries to finalize them, which potentially 
 causes high memory usage. This JIRA aims to move aside those stale edit log 
 files to avoid this scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8242:
---
Status: Patch Available  (was: In Progress)

 Erasure Coding: XML based end-to-end test for ECCli commands
 

 Key: HDFS-8242
 URL: https://issues.apache.org/jira/browse/HDFS-8242
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, 
 HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch


 This JIRA is to add test cases with the CLI test framework for the commands 
 present in {{ECCli}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7348) Erasure Coding: striped block recovery

2015-04-30 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521026#comment-14521026
 ] 

Yi Liu edited comment on HDFS-7348 at 4/30/15 7:06 AM:
---

Thanks Zhe and Bo for the further discussion.
I will rebase the patch, make the buffer size configurable, and add the decode 
part as Zhe suggested.

For sequential vs. parallel reading, I will file a follow-on JIRA and target it 
for phase 2. For local read (if the source is local) and local write (if the 
target is local), you guys can do them as follow-ons in your JIRAs and target 
them for phase 2.


was (Author: hitliuyi):
Thanks Zhe and Bo for the further discussion.
I will rebase the patch, make the buffer size configurable, and add the decode 
part as Zhe suggested.

For sequential vs. parallel reading, I will file a follow-on JIRA and target it 
for phase 2. For local read and local write, you guys can do them as follow-ons 
in your JIRAs and target them for phase 2.

 Erasure Coding: striped block recovery
 --

 Key: HDFS-7348
 URL: https://issues.apache.org/jira/browse/HDFS-7348
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Reporter: Kai Zheng
Assignee: Yi Liu
 Attachments: ECWorker.java, HDFS-7348.001.patch


 This JIRA is to recover one or more missing striped blocks in a striped block 
 group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations

2015-04-30 Thread Xinwei Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinwei Qin  updated HDFS-8295:
--
Issue Type: Sub-task  (was: Task)
Parent: HDFS-8031

 Add MODIFY and REMOVE ECSchema editlog operations
 -

 Key: HDFS-8295
 URL: https://issues.apache.org/jira/browse/HDFS-8295
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xinwei Qin 
Assignee: Xinwei Qin 

 If MODIFY and REMOVE ECSchema operations are supported, then add these 
 editlog operations to persist them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery

2015-04-30 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521026#comment-14521026
 ] 

Yi Liu commented on HDFS-7348:
--

Thanks Zhe and Bo for the further discussion.
I will rebase the patch, make the buffer size configurable, and add the decode 
part as Zhe suggested.

For sequential vs. parallel reading, I will file a follow-on JIRA and target it 
for phase 2. For local read and local write, you guys can do them as follow-ons 
in your JIRAs and target them for phase 2.

 Erasure Coding: striped block recovery
 --

 Key: HDFS-7348
 URL: https://issues.apache.org/jira/browse/HDFS-7348
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Reporter: Kai Zheng
Assignee: Yi Liu
 Attachments: ECWorker.java, HDFS-7348.001.patch


 This JIRA is to recover one or more missing striped blocks in a striped block 
 group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-7949:
---
Attachment: HDFS-7949-HDFS-7285.08.patch

 WebImageViewer needs to support file size calculation with striped blocks
 -

 Key: HDFS-7949
 URL: https://issues.apache.org/jira/browse/HDFS-7949
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Hui Zheng
Assignee: Rakesh R
Priority: Minor
 Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, 
 HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, 
 HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch


 The file size calculation in WebImageViewer should be changed when the blocks 
 of the file are striped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.

2015-04-30 Thread surendra singh lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

surendra singh lilhore updated HDFS-8229:
-
Attachment: HDFS-8229_2.patch

 LAZY_PERSIST file gets deleted after NameNode restart.
 --

 Key: HDFS-8229
 URL: https://issues.apache.org/jira/browse/HDFS-8229
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch


 {code}
 2015-04-20 10:26:55,180 WARN 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist 
 file /LAZY_PERSIST/smallfile with no replicas.
 {code}
 After a NameNode restart and before the DataNodes register, if 
 {{LazyPersistFileScrubber}} runs, it will delete lazy-persist files.
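
 A sketch of one possible guard (the safe-mode check is an assumption, not 
 necessarily the attached patch):
 {code}
 // Illustrative sketch: have the scrubber skip its pass while the NameNode
 // is in safe mode, i.e. before DataNodes have registered and reported
 // their lazy-persist replicas.
 abstract class ScrubberSketch {
   abstract boolean isInSafeMode();  // stands in for the FSNamesystem check
   abstract void scrubOnePass();     // stands in for the existing scrub logic

   void runScrub() {
     if (isInSafeMode()) {           // the restart window described above
       return;                       // replicas may simply not be reported yet
     }
     scrubOnePass();
   }
 }
 {code}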



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil

2015-04-30 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520989#comment-14520989
 ] 

Yi Liu commented on HDFS-8282:
--

yes, +1

 Erasure coding: move striped reading logic to StripedBlockUtil
 --

 Key: HDFS-8282
 URL: https://issues.apache.org/jira/browse/HDFS-8282
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-8282-HDFS-7285.00.patch, 
 HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8294:
---
Status: Patch Available  (was: Open)

 Erasure Coding: Fix Findbug warnings present in erasure coding
 --

 Key: HDFS-8294
 URL: https://issues.apache.org/jira/browse/HDFS-8294
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8294-HDFS-7285.00.patch


 Following are the Findbugs warnings:
 # Possible null pointer dereference of arr$ in 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
 {code}
 Bug type NP_NULL_ON_SOME_PATH (click for details) 
 In class 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction
 In method 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
 Value loaded from arr$
 Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206]
 Known null at BlockInfoStripedUnderConstruction.java:[line 200]
 {code}
 # Found reliance on default encoding in 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
  ECSchema): String.getBytes()
 Found reliance on default encoding in 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):
  new String(byte[])
 {code}
 Bug type DM_DEFAULT_ENCODING (click for details) 
 In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager
 In method 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
  ECSchema)
 Called method String.getBytes()
 At ErasureCodingZoneManager.java:[line 116]
 Bug type DM_DEFAULT_ENCODING (click for details) 
 In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager
 In method 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath)
 Called method new String(byte[])
 At ErasureCodingZoneManager.java:[line 81]
 {code}
 # Inconsistent synchronization of 
 org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time
 {code}
 Bug type IS2_INCONSISTENT_SYNC (click for details) 
 In class org.apache.hadoop.hdfs.DFSOutputStream
 Field org.apache.hadoop.hdfs.DFSOutputStream.streamer
 Synchronized 90% of the time
 Unsynchronized access at DFSOutputStream.java:[line 142]
 Unsynchronized access at DFSOutputStream.java:[line 853]
 Unsynchronized access at DFSOutputStream.java:[line 617]
 Unsynchronized access at DFSOutputStream.java:[line 620]
 Unsynchronized access at DFSOutputStream.java:[line 630]
 Unsynchronized access at DFSOutputStream.java:[line 338]
 Unsynchronized access at DFSOutputStream.java:[line 734]
 Unsynchronized access at DFSOutputStream.java:[line 897]
 {code}
 # Dead store to offSuccess in 
 org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()
 {code}
 Bug type DLS_DEAD_LOCAL_STORE (click for details) 
 In class org.apache.hadoop.hdfs.StripedDataStreamer
 In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()
 Local variable named offSuccess
 At StripedDataStreamer.java:[line 105]
 {code}
 # Result of integer multiplication cast to long in 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()
 {code}
 Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
 In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped
 In method 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()
 At BlockInfoStriped.java:[line 208]
 {code}
 # Result of integer multiplication cast to long in 
 org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
  int, int, int, int)
 {code}
 Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
 In class org.apache.hadoop.hdfs.util.StripedBlockUtil
 In method 
 org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
  int, int, int, int)
 At StripedBlockUtil.java:[line 85]
 {code}
 # Switch statement found in 
 org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, 
 long, byte[], int, Map) where default case is missing
 {code}
 Bug type SF_SWITCH_NO_DEFAULT (click for details) 
 In class org.apache.hadoop.hdfs.DFSStripedInputStream
 In method 
 org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, 
 long, byte[], int, Map)
 At DFSStripedInputStream.java:[lines 468-491]
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-30 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521029#comment-14521029
 ] 

Xinwei Qin  commented on HDFS-7859:
---

The 003 patch removes the MODIFY and REMOVE ECSchema editlog operations; these 
operations will be added in another JIRA (HDFS-8295) later, when they are 
supported.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 
 Attachments: HDFS-7859-HDFS-7285.002.patch, 
 HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
 HDFS-7859.001.patch, HDFS-7859.002.patch


 In a meetup discussion with [~zhz] and [~jingzhao], it was suggested that we 
 persist EC schemas in the NameNode centrally and reliably, so that EC zones 
 can reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-7949:
---
Priority: Major  (was: Minor)
Target Version/s: HDFS-7285

 WebImageViewer needs to support file size calculation with striped blocks
 -

 Key: HDFS-7949
 URL: https://issues.apache.org/jira/browse/HDFS-7949
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Hui Zheng
Assignee: Rakesh R
 Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, 
 HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, 
 HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch


 The file size calculation in WebImageViewer should be changed when the blocks 
 of the file are striped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations

2015-04-30 Thread Xinwei Qin (JIRA)
Xinwei Qin  created HDFS-8295:
-

 Summary: Add MODIFY and REMOVE ECSchema editlog operations
 Key: HDFS-8295
 URL: https://issues.apache.org/jira/browse/HDFS-8295
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Xinwei Qin 
Assignee: Xinwei Qin 


If MODIFY and REMOVE ECSchema operations are supported, then add these editlog 
operations to persist them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521047#comment-14521047
 ] 

Hadoop QA commented on HDFS-8294:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729419/HDFS-8294-HDFS-7285.00.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / 5a83838 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10472/console |


This message was automatically generated.

 Erasure Coding: Fix Findbug warnings present in erasure coding
 --

 Key: HDFS-8294
 URL: https://issues.apache.org/jira/browse/HDFS-8294
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8294-HDFS-7285.00.patch


 Following are the Findbugs warnings:
 # Possible null pointer dereference of arr$ in 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
 {code}
 Bug type NP_NULL_ON_SOME_PATH (click for details) 
 In class 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction
 In method 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
 Value loaded from arr$
 Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206]
 Known null at BlockInfoStripedUnderConstruction.java:[line 200]
 {code}
 # Found reliance on default encoding in 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
  ECSchema): String.getBytes()
 Found reliance on default encoding in 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):
  new String(byte[])
 {code}
 Bug type DM_DEFAULT_ENCODING (click for details) 
 In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager
 In method 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
  ECSchema)
 Called method String.getBytes()
 At ErasureCodingZoneManager.java:[line 116]
 Bug type DM_DEFAULT_ENCODING (click for details) 
 In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager
 In method 
 org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath)
 Called method new String(byte[])
 At ErasureCodingZoneManager.java:[line 81]
 {code}
 # Inconsistent synchronization of 
 org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time
 {code}
 Bug type IS2_INCONSISTENT_SYNC (click for details) 
 In class org.apache.hadoop.hdfs.DFSOutputStream
 Field org.apache.hadoop.hdfs.DFSOutputStream.streamer
 Synchronized 90% of the time
 Unsynchronized access at DFSOutputStream.java:[line 142]
 Unsynchronized access at DFSOutputStream.java:[line 853]
 Unsynchronized access at DFSOutputStream.java:[line 617]
 Unsynchronized access at DFSOutputStream.java:[line 620]
 Unsynchronized access at DFSOutputStream.java:[line 630]
 Unsynchronized access at DFSOutputStream.java:[line 338]
 Unsynchronized access at DFSOutputStream.java:[line 734]
 Unsynchronized access at DFSOutputStream.java:[line 897]
 {code}
 # Dead store to offSuccess in 
 org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()
 {code}
 Bug type DLS_DEAD_LOCAL_STORE (click for details) 
 In class org.apache.hadoop.hdfs.StripedDataStreamer
 In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()
 Local variable named offSuccess
 At StripedDataStreamer.java:[line 105]
 {code}
 # Result of integer multiplication cast to long in 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()
 {code}
 Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
 In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped
 In method 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()
 At BlockInfoStriped.java:[line 208]
 {code}
 # Result of integer multiplication cast to long in 
 org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
  int, int, int, int)
 {code}
 Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
 In class org.apache.hadoop.hdfs.util.StripedBlockUtil
 In method 
 org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
  int, int, int, int)
 At StripedBlockUtil.java:[line 85]
 {code}
 # Switch statement found in 
 org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, 
 long, byte[], int, Map) where default case is 

[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8242:
---
Attachment: HDFS-8242-HDFS-7285.04.patch

Attached the previous patch again to see the Jenkins report.

 Erasure Coding: XML based end-to-end test for ECCli commands
 

 Key: HDFS-8242
 URL: https://issues.apache.org/jira/browse/HDFS-8242
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, 
 HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch


 This JIRA is to add test cases with the CLI test framework for the commands 
 present in {{ECCli}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command

2015-04-30 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521123#comment-14521123
 ] 

Kai Zheng commented on HDFS-8137:
-

Uma, thanks for the patch and the good comments. I'd like to take a look and 
give my thoughts later today.

 Sends the EC schema to DataNode as well in EC encoding/recovering command
 -

 Key: HDFS-8137
 URL: https://issues.apache.org/jira/browse/HDFS-8137
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Uma Maheswara Rao G
 Attachments: HDFS-8137-0.patch


 Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the 
 EC schema to the DataNode, contained in the EC encoding/recovering 
 command. The target DataNode will use it to guide the execution of the task. 
 Another way would be for the DataNode to request the schema actively through 
 a separate RPC call; as an optimization, the DataNode may cache 
 schemas to avoid repeatedly asking for the same schema.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7897) Shutdown metrics when stopping JournalNode

2015-04-30 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521202#comment-14521202
 ] 

zhouyingchao commented on HDFS-7897:


Any updates regarding this simple patch?

 Shutdown metrics when stopping JournalNode
 --

 Key: HDFS-7897
 URL: https://issues.apache.org/jira/browse/HDFS-7897
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7897-001.patch


 In JournalNode.stop(), the metrics system is not shut down. The issue 
 was found while reading the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery

2015-04-30 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520951#comment-14520951
 ] 

Li Bo commented on HDFS-7348:
-

bq. We can do local writing and local reading logics as follow-on under 
HDFS-8031.
Agreed. We can optimize the write and read logic later.


 Erasure Coding: striped block recovery
 --

 Key: HDFS-7348
 URL: https://issues.apache.org/jira/browse/HDFS-7348
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Reporter: Kai Zheng
Assignee: Yi Liu
 Attachments: ECWorker.java, HDFS-7348.001.patch


 This JIRA is to recover one or more missing striped blocks in a striped 
 block group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip

2015-04-30 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521004#comment-14521004
 ] 

Akira AJISAKA commented on HDFS-5574:
-

Looks like Jenkins ran the tests in the hadoop-hdfs project with 
hadoop-common-3.0.0-date.jar, which does not have 
{{FSInputChecker#readAndDiscard}}. I could reproduce the error with the 
following command:
{code}
$ cd hadoop-hdfs-project/hadoop-hdfs
$ mvn test -Dtest=TestDFSInputStream
{code}

 Remove buffer copy in BlockReader.skip
 --

 Key: HDFS-5574
 URL: https://issues.apache.org/jira/browse/HDFS-5574
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, 
 HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, 
 HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch


 BlockReaderLocal.skip and RemoteBlockReader.skip use a temporary buffer to 
 read data into, which is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8296) BlockManager.getUnderReplicatedBlocksCount() is not giving correct count if namenode in safe mode.

2015-04-30 Thread surendra singh lilhore (JIRA)
surendra singh lilhore created HDFS-8296:


 Summary:  BlockManager.getUnderReplicatedBlocksCount() is not 
giving correct count if namenode in safe mode.
 Key: HDFS-8296
 URL: https://issues.apache.org/jira/browse/HDFS-8296
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore


{{underReplicatedBlocksCount}} is updated by the {{updateState()}} API:

{code}
 void updateState() {
pendingReplicationBlocksCount = pendingReplications.size();
underReplicatedBlocksCount = neededReplications.size();
corruptReplicaBlocksCount = corruptReplicas.size();
  }
 {code}

 But {{updateState()}} is not called when the NN is in safe mode, because 
{{computeDatanodeWork()}} returns 0 early when the NN is in safe mode:

 {code}
 int computeDatanodeWork() {
   // ...
   if (namesystem.isInSafeMode()) {
     return 0;
   }
   // ...
   this.updateState();
   // ...
 }
 {code}
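
A hedged sketch of one possible fix (method names mirror the snippet above; 
whether it is safe to refresh the counters while in safe mode would need to be 
verified against BlockManager's locking):
{code}
int computeDatanodeWork() {
  // Sketch, not the committed change: refresh the counters before the
  // safe-mode early return so getUnderReplicatedBlocksCount() stays accurate.
  this.updateState();

  if (namesystem.isInSafeMode()) {
    return 0; // still schedule no replication work while in safe mode
  }

  int workFound = 0; // placeholder for the existing work computation
  // ... compute and schedule replication/invalidation work as before ...
  return workFound;
}
{code}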



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8161) Both Namenodes are in standby State

2015-04-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521064#comment-14521064
 ] 

Brahma Reddy Battula commented on HDFS-8161:


[~vinayrpet], [~jnp] and [~arpitagarwal], any thoughts on this? 
As there is no checksum verification on the ZK side, and no one seems 
interested in a checksum feature there (I have not seen any comment in 
ZOOKEEPER-2175), can we have some mechanism here?

 Both Namenodes are in standby State
 ---

 Key: HDFS-8161
 URL: https://issues.apache.org/jira/browse/HDFS-8161
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: auto-failover
Affects Versions: 2.6.0
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Attachments: ACTIVEBreadcumb and StandbyElector.txt


 Suspected scenario:
 
 Start the cluster with three nodes.
 Reboot the machine where no ZKFC is running (the active NN's ZKFC should 
 have an open session with this ZK).
 Now the active NN's ZKFC session expires and it tries to re-establish the 
 connection with another ZK. By that time the standby NN's ZKFC will try to 
 fence the old active, create the active breadcrumb, and move the SNN to the 
 active state.
 But it is immediately fenced back to the standby state (here is the doubt).
 Hence both end up in the standby state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations

2015-04-30 Thread Xinwei Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinwei Qin  updated HDFS-8295:
--
Attachment: HDFS-8295.001.patch

An initial patch based on HDFS-7859.

 Add MODIFY and REMOVE ECSchema editlog operations
 -

 Key: HDFS-8295
 URL: https://issues.apache.org/jira/browse/HDFS-8295
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xinwei Qin 
Assignee: Xinwei Qin 
 Attachments: HDFS-8295.001.patch


 If MODIFY and REMOVE ECSchema operations are supported, then add these 
 editlog operations to persist them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.

2015-04-30 Thread surendra singh lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

surendra singh lilhore updated HDFS-8229:
-
Status: Patch Available  (was: Open)

 LAZY_PERSIST file gets deleted after NameNode restart.
 --

 Key: HDFS-8229
 URL: https://issues.apache.org/jira/browse/HDFS-8229
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch


 {code}
 2015-04-20 10:26:55,180 WARN 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist 
 file /LAZY_PERSIST/smallfile with no replicas.
 {code}
 If {{LazyPersistFileScrubber}} runs after a NameNode restart and before the 
 DataNodes register, it will delete the lazy persist file.
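 A hedged sketch of one possible guard (field and method names are 
 assumptions, not the actual patch): skip the scrub pass until the NameNode 
 has left startup safe mode, i.e. until DataNodes have had a chance to send 
 their block reports.
 {code}
// Hypothetical guard inside LazyPersistFileScrubber's run loop; names are assumptions.
private void scrubLoop() throws InterruptedException {
  while (fsRunning) {
    if (namesystem.isInStartupSafeMode()) {
      // Block reports have not all arrived; deleting replicas now is premature.
      Thread.sleep(scrubIntervalSec * 1000L);
      continue;
    }
    clearCorruptLazyPersistFiles();
    Thread.sleep(scrubIntervalSec * 1000L);
  }
}
 {code}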



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.

2015-04-30 Thread surendra singh lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521145#comment-14521145
 ] 

surendra singh lilhore commented on HDFS-8229:
--

Attached new patch, please review.

In the test case I am not using the {{getCorruptReplicaBlocksCount()}} API to 
count corrupt blocks because of 
[HDFS-8296|https://issues.apache.org/jira/browse/HDFS-8296]


 LAZY_PERSIST file gets deleted after NameNode restart.
 --

 Key: HDFS-8229
 URL: https://issues.apache.org/jira/browse/HDFS-8229
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch


 {code}
 2015-04-20 10:26:55,180 WARN 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist 
 file /LAZY_PERSIST/smallfile with no replicas.
 {code}
 If {{LazyPersistFileScrubber}} runs after a NameNode restart and before the 
 DataNodes register, it will delete the lazy persist file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir

2015-04-30 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-7770:

   Resolution: Fixed
Fix Version/s: 2.7.1
   2.8.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed this to trunk, branch-2, and branch-2.7. Thanks [~xyao] for the 
contribution.

 Need document for storage type label of data node storage locations under 
 dfs.data.dir
 --

 Key: HDFS-7770
 URL: https://issues.apache.org/jira/browse/HDFS-7770
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Fix For: 2.8.0, 2.7.1

 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, 
 HDFS-7770.02.patch


 HDFS-2832 enables support for heterogeneous storage in HDFS, which allows a 
 DN to be a collection of storages of different types. However, I can't find 
 documentation on how to label the different storage types in the following 
 two documents; I found the information in the design spec. It would be good 
 to document this so admins and users can use the related archival storage 
 and storage policy features. 
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
 This JIRA is opened to add documentation for the new storage type labels. 
 1. Add an example under the ArchivalStorage.html#Configuration section:
 {code}
   <property>
     <name>dfs.data.dir</name>
     <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
   </property>
 {code}
 2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in 
 hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type 
 when no storage type is labeled in the data node storage location 
 configuration. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8297) Ability to online trigger data dir rescan for blocks

2015-04-30 Thread Hari Sekhon (JIRA)
Hari Sekhon created HDFS-8297:
-

 Summary: Ability to online trigger data dir rescan for blocks
 Key: HDFS-8297
 URL: https://issues.apache.org/jira/browse/HDFS-8297
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon


Feature request to add the ability to trigger an online rescan of the data 
dirs for available blocks without having to restart the datanode.

Motivation: when using HDFS storage tiering with an archive tier on a separate 
hyperscale storage device over the network (Hedvig in this case), the device 
may go away and then return due to, say, a network interruption or other 
temporary error. This leaves HDFS fsck declaring missing blocks that are 
clearly visible on the mount point of the node's archive directory. An online 
trigger for a data dir rescan for available blocks would avoid having to do a 
rolling restart of all datanodes across a cluster. I did try sending a kill 
-HUP to the datanode process (both the SecureDataNodeStarter parent and child) 
while tailing the log, hoping this might do it, but nothing happened in the log.

Hari Sekhon
http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip

2015-04-30 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521253#comment-14521253
 ] 

Akira AJISAKA commented on HDFS-5574:
-

+1, I ran the failed tests locally and all the tests passed. Committing this.

 Remove buffer copy in BlockReader.skip
 --

 Key: HDFS-5574
 URL: https://issues.apache.org/jira/browse/HDFS-5574
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, 
 HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, 
 HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch


 BlockReaderLocal.skip and RemoteBlockReader.skip use a temporary buffer to 
 read data into, which is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521306#comment-14521306
 ] 

Hadoop QA commented on HDFS-8242:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 43s | Pre-patch HDFS-7285 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 5 new or modified test files. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 58s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   5m 41s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 13s | The patch appears to introduce 9 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  88m 18s | Tests failed in hadoop-hdfs. |
| | | 135m 14s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time. Unsynchronized access at DFSOutputStream.java:[line 142] |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such. At DataStreamer.java:[lines 177-201] |
|  |  Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock(). At StripedDataStreamer.java:[line 105] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed(). At BlockInfoStriped.java:[line 208] |
|  |  Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long). Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes(). At ErasureCodingZoneManager.java:[line 116] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]). At ErasureCodingZoneManager.java:[line 81] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int). At StripedBlockUtil.java:[line 85] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int). At StripedBlockUtil.java:[line 167] |
| Failed unit tests | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestMultiThreadedHflush |
| Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.TestSetrepIncreasing |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729426/HDFS-8242-HDFS-7285.04.patch
 |
| Optional Tests 

[jira] [Commented] (HDFS-7687) Change fsck to support EC files

2015-04-30 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521322#comment-14521322
 ] 

Takanobu Asanuma commented on HDFS-7687:


Thank you for the information.

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma

 We need to change fsck so that it can detect under-replicated and corrupted 
 EC files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir

2015-04-30 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521323#comment-14521323
 ] 

Akira AJISAKA commented on HDFS-7770:
-

Thanks [~xyao] for updating the patch. LGTM, +1.
bq. I think we can address that in a separate JIRA.
Agreed, let's create a JIRA for this.

 Need document for storage type label of data node storage locations under 
 dfs.data.dir
 --

 Key: HDFS-7770
 URL: https://issues.apache.org/jira/browse/HDFS-7770
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, 
 HDFS-7770.02.patch


 HDFS-2832 enables support for heterogeneous storage in HDFS, which allows a 
 DN to be a collection of storages of different types. However, I can't find 
 documentation on how to label the different storage types in the following 
 two documents; I found the information in the design spec. It would be good 
 to document this so admins and users can use the related archival storage 
 and storage policy features. 
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
 This JIRA is opened to add documentation for the new storage type labels. 
 1. Add an example under the ArchivalStorage.html#Configuration section:
 {code}
   <property>
     <name>dfs.data.dir</name>
     <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
   </property>
 {code}
 2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in 
 hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type 
 when no storage type is labeled in the data node storage location 
 configuration. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5574) Remove buffer copy in BlockReader.skip

2015-04-30 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-5574:

   Resolution: Fixed
Fix Version/s: 2.8.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed this to trunk and branch-2. Thanks [~decster] for the contribution!

 Remove buffer copy in BlockReader.skip
 --

 Key: HDFS-5574
 URL: https://issues.apache.org/jira/browse/HDFS-5574
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Fix For: 2.8.0

 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, 
 HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, 
 HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch


 BlockReaderLocal.skip and RemoteBlockReader.skip use a temporary buffer to 
 read data into, which is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7687) Change fsck to support EC files

2015-04-30 Thread Takanobu Asanuma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-7687:
---
Attachment: HDFS-7687.1.patch

I created an initial patch. The main changes in this patch are below.

# I separated {{collect\[File|Block\]Summary}} into 
{{collectReplicated\[File|Block\]Summary}} and 
{{collectEC\[File|Block\]Summary}}.
# I named or renamed some variables and outputs. For example, 
{{ReplicatedBlocks}} becomes {{ECBlockGroups}} for EC, and {{Replication}} or 
{{Replicas}} becomes {{ECBlocks}} for EC.
# I added EC summaries to Result#toString.

Would you please review this patch? I'm going to add some tests for this code.

 Change fsck to support EC files
 ---

 Key: HDFS-7687
 URL: https://issues.apache.org/jira/browse/HDFS-7687
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo Nicholas Sze
Assignee: Takanobu Asanuma
 Attachments: HDFS-7687.1.patch


 We need to change fsck so that it can detect under-replicated and corrupted 
 EC files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7578) NFS WRITE and COMMIT responses should always use the channel pipeline

2015-04-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521467#comment-14521467
 ] 

Allen Wittenauer commented on HDFS-7578:


bq. It's also strange that my comment triggered Jenkins.  Is that expected with 
the new test script?

Yup.  That part of the pipeline is before test-patch.sh.  It's always been that 
way.

 NFS WRITE and COMMIT responses should always use the channel pipeline
 -

 Key: HDFS-7578
 URL: https://issues.apache.org/jira/browse/HDFS-7578
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.7.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-7578.001.patch, HDFS-7578.002.patch


 Write and Commit responses directly write data to the channel instead of 
 propagating it to the next immediate handler in the channel pipeline. 
 Not following the Netty channel pipeline model could be problematic: we 
 don't know whether it could cause resource leaks or performance issues, 
 especially since the internal pipeline implementation keeps changing with 
 newer Netty releases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem

2015-04-30 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated HDFS-8299:
--
Description: 
Fsck shows missing blocks when the blocks can be found on a datanode's 
filesystem and the datanode has been restarted to try to get it to recognize 
that the blocks are indeed present and hence report them to the NameNode in a 
block report.

Fsck output showing an example missing block:
{code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT 
blockpool BP-120244285-ip-1417023863606 block blk_1075202330
 MISSING 1 blocks of total size 3260848 B
0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 
MISSING!{code}
The block is definitely present on more than one datanode; here is the output 
from one of them, which I restarted to try to get it to report the block to 
the NameNode:
{code}# ll 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
-rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
-rw-r--r-- 1 hdfs 499   25483 Apr 27 15:02 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code}
It's worth noting that this is on HDFS tiered storage on an archive tier going 
to a networked block device that may have become temporarily unavailable but is 
available now. See also feature request HDFS-8297 for online rescan to not have 
to go around restarting datanodes.

It turns out from the datanode log (which I am attaching) that this is because 
the datanode fails to get a write lock on the filesystem. I think it would be 
better to serve those blocks read-only, however, since the current behavior 
causes client-visible data unavailability when the data could in fact be read.

{code}2015-04-30 14:11:08,235 WARN  datanode.DataNode 
(DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir 
/archive1/dn :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not 
writable: /archive1/dn
at 
org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
at 
org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at 
org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
{code}

Hari Sekhon
http://www.linkedin.com/in/harisekhon

 HDFS reporting missing blocks when they are actually present due to read-only 
 filesystem
 

 Key: HDFS-8299
 URL: https://issues.apache.org/jira/browse/HDFS-8299
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Priority: Critical
 Attachments: datanode.log


 Fsck shows missing blocks when the blocks can be found on a datanode's 
 filesystem and the datanode has been restarted to try to get it to recognize 
 that the blocks are indeed present and hence report them to the NameNode in a 
 block report.
 Fsck output showing an example missing block:
 {code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT 
 blockpool BP-120244285-ip-1417023863606 block blk_1075202330
  MISSING 1 blocks of total size 3260848 B
 0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 
 MISSING!{code}
 The block is definitely present on more than one datanode however, here is 
 the output from one of them that I restarted to try to get it to report the 
 block to the NameNode:
 {code}# ll 
 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
 

[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8242:
---
Attachment: (was: HDFS-8242-HDFS-7285.05.patch)

 Erasure Coding: XML based end-to-end test for ECCli commands
 

 Key: HDFS-8242
 URL: https://issues.apache.org/jira/browse/HDFS-8242
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, 
 HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch


 This JIRA is to add test cases, using the CLI test framework, for the 
 commands present in {{ECCli}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521407#comment-14521407
 ] 

Hudson commented on HDFS-5574:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/])
HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin 
Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java


 Remove buffer copy in BlockReader.skip
 --

 Key: HDFS-5574
 URL: https://issues.apache.org/jira/browse/HDFS-5574
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Fix For: 2.8.0

 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, 
 HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, 
 HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch


 BlockReaderLocal.skip and RemoteBlockReader.skip use a temporary buffer to 
 read data into, which is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8242:
---
Attachment: HDFS-8242-HDFS-7285.05.patch

 Erasure Coding: XML based end-to-end test for ECCli commands
 

 Key: HDFS-8242
 URL: https://issues.apache.org/jira/browse/HDFS-8242
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, 
 HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch


 This JIRA is to add test cases, using the CLI test framework, for the 
 commands present in {{ECCli}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521401#comment-14521401
 ] 

Hudson commented on HDFS-8269:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/])
HDFS-8269. getBlockLocations() does not resolve the .reserved path and 
generates incorrect edit logs when updating the atime. Contributed by Haohui 
Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 getBlockLocations() does not resolve the .reserved path and generates 
 incorrect edit logs when updating the atime
 -

 Key: HDFS-8269
 URL: https://issues.apache.org/jira/browse/HDFS-8269
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Haohui Mai
Priority: Blocker
 Fix For: 2.7.1

 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, 
 HDFS-8269.002.patch, HDFS-8269.003.patch


 When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, 
 it uses the path passed from the client, which generates incorrect edit log 
 entries:
 {noformat}
   <RECORD>
     <OPCODE>OP_TIMES</OPCODE>
     <DATA>
       <TXID>5085</TXID>
       <LENGTH>0</LENGTH>
       <PATH>/.reserved/.inodes/18230</PATH>
       <MTIME>-1</MTIME>
       <ATIME>1429908236392</ATIME>
     </DATA>
   </RECORD>
 {noformat}
 Note that the NN does not resolve the {{/.reserved}} path when processing the 
 edit log, so this eventually leads to an NPE when loading the edit logs.
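 A hedged sketch of the general shape of a fix, under the assumption that the 
 canonical path is resolved before anything reaches the edit log 
 ({{resolveReservedPath}} is a hypothetical helper, not the actual patch):
 {code}
// Hypothetical sketch: log the canonical inode path, not the client-supplied one.
void updateAccessTimeForRead(String clientPath, long atime) throws IOException {
  // Resolve /.reserved/.inodes/<id> (and other reserved forms) first.
  String canonicalPath = resolveReservedPath(clientPath);

  // The OP_TIMES entry now carries a path the NN can re-resolve on replay,
  // instead of a raw /.reserved path that breaks edit log loading.
  getEditLog().logTimes(canonicalPath, -1, atime);
}
 {code}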



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521409#comment-14521409
 ] 

Hudson commented on HDFS-8214:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/])
HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. 
Contributed by Charles Lamb. (wang: rev 
aa22450442ebe39916a6fd460fe97e347945526d)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java


 Secondary NN Web UI shows wrong date for Last Checkpoint
 

 Key: HDFS-8214
 URL: https://issues.apache.org/jira/browse/HDFS-8214
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS, namenode
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Fix For: 2.8.0

 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, 
 HDFS-8214.003.patch


 SecondaryNameNode uses Time.monotonicNow() to display the Last Checkpoint 
 time in the web UI. This causes weird times, generally just after the epoch, 
 to be displayed.
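 The distinction, as a minimal sketch ({{doCheckpointWork()}} is a 
 hypothetical stand-in): {{Time.monotonicNow()}} is a monotonic clock with an 
 arbitrary origin, suitable only for measuring intervals, while a wall-clock 
 timestamp for display should come from {{Time.now()}}:
 {code}
import org.apache.hadoop.util.Time;

// Wrong for display: monotonic clock with an arbitrary origin, so formatting
// it as a date yields times just after the epoch.
long lastCheckpointMono = Time.monotonicNow();

// Right for display: wall-clock milliseconds since the epoch.
long lastCheckpointWall = Time.now();

// Monotonic time remains the right tool for elapsed intervals:
long start = Time.monotonicNow();
doCheckpointWork();
long elapsedMs = Time.monotonicNow() - start;
 {code}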



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521408#comment-14521408
 ] 

Hudson commented on HDFS-8283:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/])
HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz 
Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java


 DataStreamer cleanup and some minor improvement
 ---

 Key: HDFS-8283
 URL: https://issues.apache.org/jira/browse/HDFS-8283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.8.0

 Attachments: h8283_20150428.patch


 - When throwing an exception
 -* always set lastException
 -* always create a new exception so that it has the new stack trace (see the 
 sketch below)
 - Add LOG.
 - Add final to isAppend and favoredNodes
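
 A minimal sketch of the re-wrapping idea from the second bullet (names are 
 illustrative, not the committed code): re-throwing a stored exception as-is 
 would carry the stack trace of where it was originally created, so a new 
 exception is constructed at the throw site with the old one as its cause.
 {code}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

class FailureHolder {
  private final AtomicReference<IOException> lastException = new AtomicReference<>();

  void recordFailure(IOException e) {
    lastException.compareAndSet(null, e);
  }

  void checkClosed() throws IOException {
    IOException recorded = lastException.get();
    if (recorded != null) {
      // Wrap instead of rethrow: the new IOException captures this call site,
      // while the original failure is preserved as the cause.
      throw new IOException("stream has failed", recorded);
    }
  }
}
 {code}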



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands

2015-04-30 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8242:
---
Attachment: HDFS-8242-HDFS-7285.05.patch

 Erasure Coding: XML based end-to-end test for ECCli commands
 

 Key: HDFS-8242
 URL: https://issues.apache.org/jira/browse/HDFS-8242
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, 
 HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, 
 HDFS-8242-HDFS-7285.05.patch


 This JIRA is to add test cases, using the CLI test framework, for the 
 commands present in {{ECCli}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521446#comment-14521446
 ] 

Hudson commented on HDFS-8269:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/913/])
HDFS-8269. getBlockLocations() does not resolve the .reserved path and 
generates incorrect edit logs when updating the atime. Contributed by Haohui 
Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java


 getBlockLocations() does not resolve the .reserved path and generates 
 incorrect edit logs when updating the atime
 -

 Key: HDFS-8269
 URL: https://issues.apache.org/jira/browse/HDFS-8269
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Haohui Mai
Priority: Blocker
 Fix For: 2.7.1

 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, 
 HDFS-8269.002.patch, HDFS-8269.003.patch


 When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, 
 it uses the path passed from the client, which generates incorrect edit log 
 entries:
 {noformat}
   <RECORD>
     <OPCODE>OP_TIMES</OPCODE>
     <DATA>
       <TXID>5085</TXID>
       <LENGTH>0</LENGTH>
       <PATH>/.reserved/.inodes/18230</PATH>
       <MTIME>-1</MTIME>
       <ATIME>1429908236392</ATIME>
     </DATA>
   </RECORD>
 {noformat}
 Note that the NN does not resolve the {{/.reserved}} path when processing the 
 edit log, so this eventually leads to an NPE when loading the edit logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521455#comment-14521455
 ] 

Hudson commented on HDFS-8214:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/913/])
HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. 
Contributed by Charles Lamb. (wang: rev 
aa22450442ebe39916a6fd460fe97e347945526d)
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java


 Secondary NN Web UI shows wrong date for Last Checkpoint
 

 Key: HDFS-8214
 URL: https://issues.apache.org/jira/browse/HDFS-8214
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS, namenode
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Fix For: 2.8.0

 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, 
 HDFS-8214.003.patch


 SecondaryNameNode uses Time.monotonicNow() to display the Last Checkpoint 
 time in the web UI. This causes weird times, generally just after the epoch, 
 to be displayed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521453#comment-14521453
 ] 

Hudson commented on HDFS-5574:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/913/])
HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin 
Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java


 Remove buffer copy in BlockReader.skip
 --

 Key: HDFS-5574
 URL: https://issues.apache.org/jira/browse/HDFS-5574
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Fix For: 2.8.0

 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, 
 HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, 
 HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch


 BlockReaderLocal.skip and RemoteBlockReader.skip use a temporary buffer to 
 read data into, which is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command

2015-04-30 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521460#comment-14521460
 ] 

Kai Zheng commented on HDFS-8137:
-

Hi Uma,
bq. We supposed to get schema values from ECSchemaManager, but right now I 
don't see a better way to get it from ECSchemaManager, so I added an API to 
get it from BlockCollection itself, like the isStriped API in it.
{{ECSchemaManager}} is probably not supposed to return the schema associated 
with a zone, dir, or file, but {{ErasureCodingZoneManager}} may do so: we 
could query the schema info for a zone using ErasureCodingZoneManager. I 
thought it would be good to add the method {{getECSchema}} alongside the 
existing method {{isStriped}} (see the sketch below), as the schema is 
essential to erasure-coded files.
A quick look at the patch suggests it might need to align with some of the 
latest changes regarding how to get the schema from a zone/dir/xAttr; would 
you double-check? Thanks.
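A hedged sketch of the accessor being discussed (the method shape is an 
assumption based on the names mentioned here, not the committed API):
{code}
import org.apache.hadoop.io.erasurecode.ECSchema;

// Hypothetical sketch of exposing the schema alongside isStriped().
public interface BlockCollection {
  /** @return true if this file is erasure-coded (striped). */
  boolean isStriped();

  /** @return the EC schema of this file, or null if it is not striped. */
  ECSchema getECSchema();
}
{code}
A caller building an EC encoding/recovering command could then attach 
{{blockCollection.getECSchema()}} to the command sent to the DataNode.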

 Sends the EC schema to DataNode as well in EC encoding/recovering command
 -

 Key: HDFS-8137
 URL: https://issues.apache.org/jira/browse/HDFS-8137
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Uma Maheswara Rao G
 Attachments: HDFS-8137-0.patch


 Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the 
 EC schema to the DataNode, contained in the EC encoding/recovering 
 command. The target DataNode will use it to guide the execution of the task. 
 Another way would be for the DataNode to request the schema actively through 
 a separate RPC call; as an optimization, the DataNode may cache 
 schemas to avoid repeatedly asking for the same schema.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521357#comment-14521357
 ] 

Hudson commented on HDFS-5574:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7705 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7705/])
HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin 
Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java


 Remove buffer copy in BlockReader.skip
 --

 Key: HDFS-5574
 URL: https://issues.apache.org/jira/browse/HDFS-5574
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Fix For: 2.8.0

 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, 
 HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, 
 HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch


 BlockReaderLocal.skip and RemoteBlockReader.skip use a temporary buffer to 
 read data into, which is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521376#comment-14521376
 ] 

Hadoop QA commented on HDFS-7859:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 37s | Pre-patch HDFS-7285 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   7m 48s | The applied patch generated  
10  additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 11s | The patch appears to introduce 9 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 239m 34s | Tests failed in hadoop-hdfs. |
| | | 288m  5s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time. Unsynchronized access at DFSOutputStream.java:[line 142] |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such. At DataStreamer.java:[lines 177-201] |
|  |  Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock(). At StripedDataStreamer.java:[line 105] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed(). At BlockInfoStriped.java:[line 208] |
|  |  Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long). Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes(). At ErasureCodingZoneManager.java:[line 116] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]). At ErasureCodingZoneManager.java:[line 81] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int). At StripedBlockUtil.java:[line 85] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int). At StripedBlockUtil.java:[line 167] |
| Failed unit tests | hadoop.hdfs.server.namenode.TestMetadataVersionOutput |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.TestDFSRollback |
|   | hadoop.hdfs.server.namenode.TestCreateEditsLog |
|   | hadoop.hdfs.protocol.TestLayoutVersion |
|   | hadoop.hdfs.TestDFSFinalize |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | 

[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem

2015-04-30 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated HDFS-8299:
--
Attachment: datanode.log

 HDFS reporting missing blocks when they are actually present due to read-only 
 filesystem
 

 Key: HDFS-8299
 URL: https://issues.apache.org/jira/browse/HDFS-8299
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
 Environment: Fsck shows missing blocks when the blocks can be found 
 on a datanode's filesystem and the datanode has been restarted to try to get 
 it to recognize that the blocks are indeed present and hence report them to 
 the NameNode in a block report.
 Fsck output showing an example missing block:
 {code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT 
 blockpool BP-120244285-ip-1417023863606 block blk_1075202330
  MISSING 1 blocks of total size 3260848 B
 0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 
 MISSING!{code}
 The block is definitely present on more than one datanode however, here is 
 the output from one of them that I restarted to try to get it to report the 
 block to the NameNode:
 {code}# ll 
 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
 -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 
 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
 -rw-r--r-- 1 hdfs 499   25483 Apr 27 15:02 
 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code}
 It's worth noting that this is on HDFS tiered storage on an archive tier 
 going to a networked block device that may have become temporarily 
 unavailable but is available now. See also feature request HDFS-8297 for 
 online rescan to not have to go around restarting datanodes.
 The datanode log (attached) shows that this is because the datanode fails to 
 get a write lock on the filesystem. I think it would be better to serve those 
 blocks read-only, since the current behavior causes client-visible data 
 unavailability when the data could in fact be read.
 {code}2015-04-30 14:11:08,235 WARN  datanode.DataNode 
 (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir 
 /archive1/dn :
 org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not 
 writable: /archive1/dn
 at 
 org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
 at 
 org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
 at 
 org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
 {code}
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon
Reporter: Hari Sekhon
Priority: Critical
 Attachments: datanode.log






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
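
A minimal sketch of the behavior proposed above (helper names hypothetical, not the actual DataNode code): classify a readable-but-not-writable storage directory as a read-only volume instead of rejecting it outright.
{code:java}
import java.io.File;

// Sketch only: classify a dfs.datanode.data.dir entry instead of failing
// the whole directory when the write check does not pass.
public class VolumeCheck {
  enum Mode { READ_WRITE, READ_ONLY, UNUSABLE }

  static Mode classify(File dir) {
    if (!dir.canRead()) {
      return Mode.UNUSABLE;    // truly inaccessible: skip this volume
    }
    if (!dir.canWrite()) {
      return Mode.READ_ONLY;   // serve existing blocks, accept no new writes
    }
    return Mode.READ_WRITE;
  }
}
{code}
Under a scheme like this, the blocks in the report above would have stayed readable while the archive tier was write-protected.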


[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem

2015-04-30 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated HDFS-8299:
--
Environment: HDP 2.2  (was: Fsck shows missing blocks when the blocks can 
be found on a datanode's filesystem and the datanode has been restarted to try 
to get it to recognize that the blocks are indeed present and hence report them 
to the NameNode in a block report.

Fsck output showing an example missing block:
{code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT 
blockpool BP-120244285-ip-1417023863606 block blk_1075202330
 MISSING 1 blocks of total size 3260848 B
0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 
MISSING!{code}
The block is definitely present on more than one datanode; here is the output 
from one of them that I restarted to try to get it to report the block to the 
NameNode:
{code}# ll 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
-rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
-rw-r--r-- 1 hdfs 499   25483 Apr 27 15:02 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code}
It's worth noting that this is on HDFS tiered storage on an archive tier going 
to a networked block device that may have become temporarily unavailable but is 
available now. See also feature request HDFS-8297 for online rescan to not have 
to go around restarting datanodes.

The datanode log (that I am attaching) shows that this is because the datanode 
fails to get a write lock on the filesystem. I think it would be better to 
serve those blocks read-only, since the current behavior causes client-visible 
data unavailability when the data could in fact be read.

{code}2015-04-30 14:11:08,235 WARN  datanode.DataNode 
(DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir 
/archive1/dn :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not 
writable: /archive1/dn
at 
org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
at 
org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at 
org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
{code}

Hari Sekhon
http://www.linkedin.com/in/harisekhon)

 HDFS reporting missing blocks when they are actually present due to read-only 
 filesystem
 

 Key: HDFS-8299
 URL: https://issues.apache.org/jira/browse/HDFS-8299
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Priority: Critical
 Attachments: datanode.log






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem

2015-04-30 Thread Hari Sekhon (JIRA)
Hari Sekhon created HDFS-8299:
-

 Summary: HDFS reporting missing blocks when they are actually 
present due to read-only filesystem
 Key: HDFS-8299
 URL: https://issues.apache.org/jira/browse/HDFS-8299
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
 Environment: Fsck shows missing blocks when the blocks can be found on 
a datanode's filesystem and the datanode has been restarted to try to get it to 
recognize that the blocks are indeed present and hence report them to the 
NameNode in a block report.

Fsck output showing an example missing block:
{code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT 
blockpool BP-120244285-ip-1417023863606 block blk_1075202330
 MISSING 1 blocks of total size 3260848 B
0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 
MISSING!{code}
The block is definitely present on more than one datanode; here is the output 
from one of them that I restarted to try to get it to report the block to the 
NameNode:
{code}# ll 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
-rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
-rw-r--r-- 1 hdfs 499   25483 Apr 27 15:02 
/archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code}
It's worth noting that this is on HDFS tiered storage on an archive tier going 
to a networked block device that may have become temporarily unavailable but is 
available now. See also feature request HDFS-8297 for online rescan to not have 
to go around restarting datanodes.

The datanode log (that I am attaching) shows that this is because the datanode 
fails to get a write lock on the filesystem. I think it would be better to 
serve those blocks read-only, since the current behavior causes client-visible 
data unavailability when the data could in fact be read.

{code}2015-04-30 14:11:08,235 WARN  datanode.DataNode 
(DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir 
/archive1/dn :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not 
writable: /archive1/dn
at 
org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
at 
org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
at 
org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
{code}

Hari Sekhon
http://www.linkedin.com/in/harisekhon
Reporter: Hari Sekhon
Priority: Critical
 Attachments: datanode.log





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521356#comment-14521356
 ] 

Hudson commented on HDFS-7770:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7705 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7705/])
HDFS-7770. Need document for storage type label of data node storage locations 
under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev 
de9404f02f36bf9a1100c67f41db907d494bb9ed)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Need document for storage type label of data node storage locations under 
 dfs.data.dir
 --

 Key: HDFS-7770
 URL: https://issues.apache.org/jira/browse/HDFS-7770
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Fix For: 2.8.0, 2.7.1

 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, 
 HDFS-7770.02.patch


 HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a 
 DN to manage a collection of storages of different types. However, I can't 
 find documentation on how to label the different storage types in the 
 following two documents; I found the information in the design spec. It would 
 be good to document this so admins and users can use the related Archival 
 storage and storage policy features. 
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
 This JIRA is opened to add documentation for the new storage type labels. 
 1. Add an example under the ArchivalStorage.html#Configuration section:
 {code}
   <property>
     <name>dfs.data.dir</name>
     <value>[DISK]file:///hddata/dn/disk0, [SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
   </property>
 {code}
 2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options under 
 hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type 
 when no storage type is labeled in the datanode storage location 
 configuration. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521397#comment-14521397
 ] 

Hudson commented on HDFS-5574:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/])
HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin 
Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java


 Remove buffer copy in BlockReader.skip
 --

 Key: HDFS-5574
 URL: https://issues.apache.org/jira/browse/HDFS-5574
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Fix For: 2.8.0

 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, 
 HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, 
 HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch


 BlockReaderLocal.skip and RemoteBlockReader.skip read data into a temporary 
 buffer only to discard it, which is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
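
The change above can be pictured as skipping by advancing the reader's position rather than draining bytes through a scratch buffer; a rough sketch under that assumption (field names illustrative, not the actual BlockReader code):
{code:java}
// Sketch: skip by moving the read position instead of repeatedly
// reading into a temporary byte[] and discarding the data.
public class NoCopySkip {
  private long pos;
  private final long blockLength;

  NoCopySkip(long blockLength) { this.blockLength = blockLength; }

  long skip(long n) {
    long skipped = Math.min(n, blockLength - pos);
    pos += skipped;   // just advance the cursor; no buffer copy involved
    return skipped;
  }
}
{code}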


[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521391#comment-14521391
 ] 

Hudson commented on HDFS-8269:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/])
HDFS-8269. getBlockLocations() does not resolve the .reserved path and 
generates incorrect edit logs when updating the atime. Contributed by Haohui 
Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 getBlockLocations() does not resolve the .reserved path and generates 
 incorrect edit logs when updating the atime
 -

 Key: HDFS-8269
 URL: https://issues.apache.org/jira/browse/HDFS-8269
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Haohui Mai
Priority: Blocker
 Fix For: 2.7.1

 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, 
 HDFS-8269.002.patch, HDFS-8269.003.patch


 When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, 
 it uses the path passed from the client, which generates incorrect edit logs 
 entries:
 {noformat}
   <RECORD>
     <OPCODE>OP_TIMES</OPCODE>
     <DATA>
       <TXID>5085</TXID>
       <LENGTH>0</LENGTH>
       <PATH>/.reserved/.inodes/18230</PATH>
       <MTIME>-1</MTIME>
       <ATIME>1429908236392</ATIME>
     </DATA>
   </RECORD>
 {noformat}
 Note that the NN does not resolve the {{/.reserved}} path when processing the 
 edit log, therefore it eventually leads to a NPE when loading the edit logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521399#comment-14521399
 ] 

Hudson commented on HDFS-8214:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/])
HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. 
Contributed by Charles Lamb. (wang: rev 
aa22450442ebe39916a6fd460fe97e347945526d)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java


 Secondary NN Web UI shows wrong date for Last Checkpoint
 

 Key: HDFS-8214
 URL: https://issues.apache.org/jira/browse/HDFS-8214
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS, namenode
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Fix For: 2.8.0

 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, 
 HDFS-8214.003.patch


 SecondaryNameNode uses Time.monotonicNow() to display the Last Checkpoint 
 time in the web UI. This causes odd times, generally just after the epoch, to 
 be displayed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
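
For context, Time.monotonicNow() measures elapsed time from an arbitrary origin, so formatting it as a date yields values near the epoch; a wall-clock timestamp is what a "Last Checkpoint" display needs. A tiny illustration (assuming Hadoop's org.apache.hadoop.util.Time is on the classpath):
{code:java}
import org.apache.hadoop.util.Time;

// Illustration: monotonicNow() is only meaningful for measuring durations;
// Time.now() returns ms since the epoch and is safe to format as a date.
public class CheckpointTime {
  public static void main(String[] args) {
    long wallClock = Time.now();           // suitable for "Last Checkpoint"
    long monotonic = Time.monotonicNow();  // durations only, not a date
    System.out.println("display: " + new java.util.Date(wallClock));
    System.out.println("not a date: " + monotonic);
  }
}
{code}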


[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521398#comment-14521398
 ] 

Hudson commented on HDFS-8283:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/])
HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz 
Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java


 DataStreamer cleanup and some minor improvement
 ---

 Key: HDFS-8283
 URL: https://issues.apache.org/jira/browse/HDFS-8283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.8.0

 Attachments: h8283_20150428.patch


 - When throwing an exception
 -* always set lastException 
 -* always create a new exception so that it has the new stack trace
 - Add LOG.
 - Add final to isAppend and favoredNodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
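
The "new exception for a new stack trace" point can be illustrated as follows (a sketch, not the actual DataStreamer code):
{code:java}
import java.io.IOException;

// Sketch: rethrowing the cached exception as-is would only show where it
// was first created; wrapping it in a new IOException also records the
// current call site while keeping the original as the cause.
public class Rethrow {
  private volatile IOException lastException;

  void checkClosed() throws IOException {
    IOException e = lastException;
    if (e != null) {
      throw new IOException("stream already failed", e); // fresh stack trace
    }
  }
}
{code}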


[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521394#comment-14521394
 ] 

Hudson commented on HDFS-7770:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/])
HDFS-7770. Need document for storage type label of data node storage locations 
under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev 
de9404f02f36bf9a1100c67f41db907d494bb9ed)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md


 Need document for storage type label of data node storage locations under 
 dfs.data.dir
 --

 Key: HDFS-7770
 URL: https://issues.apache.org/jira/browse/HDFS-7770
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Fix For: 2.8.0, 2.7.1

 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, 
 HDFS-7770.02.patch


 HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a 
 DN to manage a collection of storages of different types. However, I can't 
 find documentation on how to label the different storage types in the 
 following two documents; I found the information in the design spec. It would 
 be good to document this so admins and users can use the related Archival 
 storage and storage policy features. 
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
 This JIRA is opened to add documentation for the new storage type labels. 
 1. Add an example under the ArchivalStorage.html#Configuration section:
 {code}
   <property>
     <name>dfs.data.dir</name>
     <value>[DISK]file:///hddata/dn/disk0, [SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
   </property>
 {code}
 2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options under 
 hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type 
 when no storage type is labeled in the datanode storage location 
 configuration. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands

2015-04-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521434#comment-14521434
 ] 

Rakesh R commented on HDFS-8242:


Attaching another patch fixing the whitespace problem reported by Jenkins.

 Erasure Coding: XML based end-to-end test for ECCli commands
 

 Key: HDFS-8242
 URL: https://issues.apache.org/jira/browse/HDFS-8242
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, 
 HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, 
 HDFS-8242-HDFS-7285.05.patch


 This JIRA is to add test cases with the CLI test framework for the commands 
 present in {{ECCli}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521404#comment-14521404
 ] 

Hudson commented on HDFS-7770:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/])
HDFS-7770. Need document for storage type label of data node storage locations 
under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev 
de9404f02f36bf9a1100c67f41db907d494bb9ed)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md


 Need document for storage type label of data node storage locations under 
 dfs.data.dir
 --

 Key: HDFS-7770
 URL: https://issues.apache.org/jira/browse/HDFS-7770
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Fix For: 2.8.0, 2.7.1

 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, 
 HDFS-7770.02.patch


 HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a 
 DN to manage a collection of storages of different types. However, I can't 
 find documentation on how to label the different storage types in the 
 following two documents; I found the information in the design spec. It would 
 be good to document this so admins and users can use the related Archival 
 storage and storage policy features. 
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
 This JIRA is opened to add documentation for the new storage type labels. 
 1. Add an example under the ArchivalStorage.html#Configuration section:
 {code}
   <property>
     <name>dfs.data.dir</name>
     <value>[DISK]file:///hddata/dn/disk0, [SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
   </property>
 {code}
 2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options under 
 hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type 
 when no storage type is labeled in the datanode storage location 
 configuration. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero

2015-04-30 Thread surendra singh lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

surendra singh lilhore updated HDFS-8276:
-
Attachment: HDFS-8276_1.patch

 LazyPersistFileScrubber should be disabled if scrubber interval configured 
 zero
 ---

 Key: HDFS-8276
 URL: https://issues.apache.org/jira/browse/HDFS-8276
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8276.patch, HDFS-8276_1.patch


 bq. but I think it is simple enough to change the meaning of the value so 
 that zero means 'never scrub'. Let me post an updated patch.
 As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], 
 the scrubber should be disabled if 
 *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero.
 Currently, NameNode startup fails if the interval is configured as zero:
 {code}
 2015-04-27 23:47:31,744 ERROR 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
 initialization failed.
 java.lang.IllegalArgumentException: 
 dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
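
A minimal sketch of the proposed semantics (names illustrative, not the actual FSNamesystem code): zero disables the scrubber instead of failing startup, and only negative values are rejected.
{code:java}
// Sketch: interval > 0 schedules the scrubber, 0 means "never scrub",
// and only negative values fail validation at startup.
public class ScrubberConfig {
  static void validate(long intervalSec) {
    if (intervalSec < 0) {
      throw new IllegalArgumentException(
          "dfs.namenode.lazypersist.file.scrub.interval.sec must not be negative");
    }
  }

  static boolean scrubberEnabled(long intervalSec) {
    return intervalSec > 0;  // 0 == disabled: do not start the thread
  }
}
{code}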


[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero

2015-04-30 Thread surendra singh lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521416#comment-14521416
 ] 

surendra singh lilhore commented on HDFS-8276:
--

Thanks [~arpitagarwal] for review.
Attached a new patch with a test case added. Please review.

 LazyPersistFileScrubber should be disabled if scrubber interval configured 
 zero
 ---

 Key: HDFS-8276
 URL: https://issues.apache.org/jira/browse/HDFS-8276
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8276.patch, HDFS-8276_1.patch


 bq. but I think it is simple enough to change the meaning of the value so 
 that zero means 'never scrub'. Let me post an updated patch.
 As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], 
 the scrubber should be disabled if 
 *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero.
 Currently, NameNode startup fails if the interval is configured as zero:
 {code}
 2015-04-27 23:47:31,744 ERROR 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
 initialization failed.
 java.lang.IllegalArgumentException: 
 dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8298) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary failures

2015-04-30 Thread Hari Sekhon (JIRA)
Hari Sekhon created HDFS-8298:
-

 Summary: HA: NameNode should not shut down completely without 
quorum, doesn't recover from temporary failures
 Key: HDFS-8298
 URL: https://issues.apache.org/jira/browse/HDFS-8298
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, HDFS, namenode, qjm
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon


In an HDFS HA setup, if there is a temporary problem contacting the journal 
nodes (e.g. a network interruption), the NameNode shuts down entirely, when it 
should instead go into a standby mode so that it can stay online and retry to 
achieve quorum later.

If both NameNodes shut themselves off like this then even after the temporary 
network outage is resolved, the entire cluster remains offline indefinitely 
until operator intervention, whereas it could have self-repaired after 
re-contacting the journalnodes and re-achieving quorum.

{code}2015-04-15 15:59:26,900 FATAL namenode.FSEditLog 
(JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for 
required journal (JournalAndStre
am(mgr=QJM to [ip:8485, ip:8485, ip:8485], stream=QuorumOutputStream 
starting at txid 54270281))
java.io.IOException: Interrupted waiting 2ms for a quorum of nodes to 
respond.
at 
org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:134)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
at 
org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
at 
org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
at 
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
at 
org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:388)
at java.lang.Thread.run(Thread.java:745)
2015-04-15 15:59:26,901 WARN  client.QuorumJournalManager 
(QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at 
txid 54270281
2015-04-15 15:59:26,904 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
Exiting with status 1
2015-04-15 15:59:27,001 INFO  namenode.NameNode (StringUtils.java:run(659)) - 
SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down NameNode at custom_scrubbed/ip
/{code}

Hari Sekhon
http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
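
The requested behavior amounts to bounded retry instead of immediate termination; a very rough sketch under that assumption (hypothetical, not the actual JournalSet logic):
{code:java}
import java.io.IOException;

// Rough sketch: retry the quorum flush with backoff instead of exiting
// the process on the first timeout; escalate (e.g., drop to standby)
// only after a configurable deadline has passed.
public class QuorumRetry {
  interface Flush { void run() throws IOException; }

  static void flushWithRetry(Flush flush, long deadlineMs)
      throws IOException, InterruptedException {
    long end = System.currentTimeMillis() + deadlineMs;
    long backoffMs = 1000;
    while (true) {
      try {
        flush.run();
        return;
      } catch (IOException e) {
        if (System.currentTimeMillis() >= end) {
          throw e;  // only now give up, not on the first failure
        }
        Thread.sleep(backoffMs);
        backoffMs = Math.min(backoffMs * 2, 30_000);
      }
    }
  }
}
{code}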


[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521381#comment-14521381
 ] 

Hudson commented on HDFS-8269:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/])
HDFS-8269. getBlockLocations() does not resolve the .reserved path and 
generates incorrect edit logs when updating the atime. Contributed by Haohui 
Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java


 getBlockLocations() does not resolve the .reserved path and generates 
 incorrect edit logs when updating the atime
 -

 Key: HDFS-8269
 URL: https://issues.apache.org/jira/browse/HDFS-8269
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Haohui Mai
Priority: Blocker
 Fix For: 2.7.1

 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, 
 HDFS-8269.002.patch, HDFS-8269.003.patch


 When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, 
 it uses the path passed from the client, which generates incorrect edit logs 
 entries:
 {noformat}
   <RECORD>
     <OPCODE>OP_TIMES</OPCODE>
     <DATA>
       <TXID>5085</TXID>
       <LENGTH>0</LENGTH>
       <PATH>/.reserved/.inodes/18230</PATH>
       <MTIME>-1</MTIME>
       <ATIME>1429908236392</ATIME>
     </DATA>
   </RECORD>
 {noformat}
 Note that the NN does not resolve the {{/.reserved}} path when processing the 
 edit log, therefore it eventually leads to a NPE when loading the edit logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521384#comment-14521384
 ] 

Hudson commented on HDFS-7770:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/])
HDFS-7770. Need document for storage type label of data node storage locations 
under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev 
de9404f02f36bf9a1100c67f41db907d494bb9ed)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Need document for storage type label of data node storage locations under 
 dfs.data.dir
 --

 Key: HDFS-7770
 URL: https://issues.apache.org/jira/browse/HDFS-7770
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Fix For: 2.8.0, 2.7.1

 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, 
 HDFS-7770.02.patch


 HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a 
 DN to manage a collection of storages of different types. However, I can't 
 find documentation on how to label the different storage types in the 
 following two documents; I found the information in the design spec. It would 
 be good to document this so admins and users can use the related Archival 
 storage and storage policy features. 
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
 This JIRA is opened to add documentation for the new storage type labels. 
 1. Add an example under the ArchivalStorage.html#Configuration section:
 {code}
   <property>
     <name>dfs.data.dir</name>
     <value>[DISK]file:///hddata/dn/disk0, [SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
   </property>
 {code}
 2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options under 
 hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type 
 when no storage type is labeled in the datanode storage location 
 configuration. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521388#comment-14521388
 ] 

Hudson commented on HDFS-8283:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/])
HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz 
Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java


 DataStreamer cleanup and some minor improvement
 ---

 Key: HDFS-8283
 URL: https://issues.apache.org/jira/browse/HDFS-8283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.8.0

 Attachments: h8283_20150428.patch


 - When throwing an exception
 -* always set lastException 
 -* always create a new exception so that it has the new stack trace
 - Add LOG.
 - Add final to isAppend and favoredNodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521387#comment-14521387
 ] 

Hudson commented on HDFS-5574:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/])
HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin 
Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java


 Remove buffer copy in BlockReader.skip
 --

 Key: HDFS-5574
 URL: https://issues.apache.org/jira/browse/HDFS-5574
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Fix For: 2.8.0

 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, 
 HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, 
 HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch


 BlockReaderLocal.skip and RemoteBlockReader.skip read data into a temporary 
 buffer only to discard it, which is not necessary. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521389#comment-14521389
 ] 

Hudson commented on HDFS-8214:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/])
HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. 
Contributed by Charles Lamb. (wang: rev 
aa22450442ebe39916a6fd460fe97e347945526d)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js


 Secondary NN Web UI shows wrong date for Last Checkpoint
 

 Key: HDFS-8214
 URL: https://issues.apache.org/jira/browse/HDFS-8214
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS, namenode
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Fix For: 2.8.0

 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, 
 HDFS-8214.003.patch


 SecondaryNameNode uses Time.monotonicNow() to display the Last Checkpoint 
 time in the web UI. This causes odd times, generally just after the epoch, to 
 be displayed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero

2015-04-30 Thread surendra singh lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521414#comment-14521414
 ] 

surendra singh lilhore commented on HDFS-8276:
--

The failed test cases and findbugs warnings are not related to this JIRA.



 LazyPersistFileScrubber should be disabled if scrubber interval configured 
 zero
 ---

 Key: HDFS-8276
 URL: https://issues.apache.org/jira/browse/HDFS-8276
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8276.patch, HDFS-8276_1.patch


 bq. but I think it is simple enough to change the meaning of the value so 
 that zero means 'never scrub'. Let me post an updated patch.
 As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], 
 the scrubber should be disabled if 
 *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero.
 Currently, NameNode startup fails if the interval is configured as zero:
 {code}
 2015-04-27 23:47:31,744 ERROR 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
 initialization failed.
 java.lang.IllegalArgumentException: 
 dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8298) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary network outages

2015-04-30 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated HDFS-8298:
--
Summary: HA: NameNode should not shut down completely without quorum, 
doesn't recover from temporary network outages  (was: HA: NameNode should not 
shut down completely without quorum, doesn't recover from temporary failures)

 HA: NameNode should not shut down completely without quorum, doesn't recover 
 from temporary network outages
 ---

 Key: HDFS-8298
 URL: https://issues.apache.org/jira/browse/HDFS-8298
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, HDFS, namenode, qjm
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon

 In an HDFS HA setup, if there is a temporary problem contacting the journal 
 nodes (e.g. a network interruption), the NameNode shuts down entirely, when it 
 should instead go into a standby mode so that it can stay online and retry 
 to achieve quorum later.
 If both NameNodes shut themselves off like this then even after the temporary 
 network outage is resolved, the entire cluster remains offline indefinitely 
 until operator intervention, whereas it could have self-repaired after 
 re-contacting the journalnodes and re-achieving quorum.
 {code}2015-04-15 15:59:26,900 FATAL namenode.FSEditLog 
 (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for 
 required journal (JournalAndStre
 am(mgr=QJM to [ip:8485, ip:8485, ip:8485], stream=QuorumOutputStream 
 starting at txid 54270281))
 java.io.IOException: Interrupted waiting 2ms for a quorum of nodes to 
 respond.
 at 
 org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:134)
 at 
 org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
 at 
 org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
 at 
 org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
 at 
 org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
 at 
 org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
 at 
 org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
 at 
 org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
 at 
 org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:388)
 at java.lang.Thread.run(Thread.java:745)
 2015-04-15 15:59:26,901 WARN  client.QuorumJournalManager 
 (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at 
 txid 54270281
 2015-04-15 15:59:26,904 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
 Exiting with status 1
 2015-04-15 15:59:27,001 INFO  namenode.NameNode (StringUtils.java:run(659)) - 
 SHUTDOWN_MSG:
 /
 SHUTDOWN_MSG: Shutting down NameNode at custom_scrubbed/ip
 /{code}
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521450#comment-14521450
 ] 

Hudson commented on HDFS-7770:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/913/])
HDFS-7770. Need document for storage type label of data node storage locations 
under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev 
de9404f02f36bf9a1100c67f41db907d494bb9ed)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


 Need document for storage type label of data node storage locations under 
 dfs.data.dir
 --

 Key: HDFS-7770
 URL: https://issues.apache.org/jira/browse/HDFS-7770
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Fix For: 2.8.0, 2.7.1

 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, 
 HDFS-7770.02.patch


 HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a 
 DN to manage a collection of storages of different types. However, I can't 
 find documentation on how to label the different storage types in the 
 following two documents; I found the information in the design spec. It would 
 be good to document this so admins and users can use the related Archival 
 storage and storage policy features. 
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
 http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
 This JIRA is opened to add documentation for the new storage type labels. 
 1. Add an example under the ArchivalStorage.html#Configuration section:
 {code}
   <property>
     <name>dfs.data.dir</name>
     <value>[DISK]file:///hddata/dn/disk0, [SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
   </property>
 {code}
 2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options under 
 hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type 
 when no storage type is labeled in the datanode storage location 
 configuration. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521454#comment-14521454
 ] 

Hudson commented on HDFS-8283:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/913/])
HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz 
Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java


 DataStreamer cleanup and some minor improvement
 ---

 Key: HDFS-8283
 URL: https://issues.apache.org/jira/browse/HDFS-8283
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.8.0

 Attachments: h8283_20150428.patch


 - When throwing an exception
 -* always set lastException 
 -* always create a new exception so that it has the new stack trace
 - Add LOG.
 - Add final to isAppend and favoredNodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8277) Safemode enter fails when Standby NameNode is down

2015-04-30 Thread surendra singh lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

surendra singh lilhore updated HDFS-8277:
-
Attachment: HDFS-8277_2.patch

 Safemode enter fails when Standby NameNode is down
 --

 Key: HDFS-8277
 URL: https://issues.apache.org/jira/browse/HDFS-8277
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, HDFS, namenode
Affects Versions: 2.6.0
 Environment: HDP 2.2.0
Reporter: Hari Sekhon
Assignee: surendra singh lilhore
Priority: Minor
 Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch


 HDFS fails to enter safemode when the Standby NameNode is down (eg. due to 
 AMBARI-10536).
 {code}hdfs dfsadmin -safemode enter
 safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: 
 java.net.ConnectException: Connection refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused{code}
 This appears to be a bug: it does not try both NameNodes the way the 
 standard HDFS client code does, and instead stops after getting a connection 
 refused from nn1, which is down. I verified that normal hadoop fs writes and 
 reads via the CLI did work at this time, using nn2. I happened to run this 
 command as the hdfs user on nn2, which was the surviving Active NameNode.
 After I re-bootstrapped the Standby NN to fix it, the command worked as 
 expected again.
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
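
What the report asks for is essentially the failover behavior the HA client already has: try each configured NameNode rather than stopping at the first refused connection. A hedged sketch (interface and helper names hypothetical):
{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;

// Sketch: dfsadmin -safemode could apply the call to every configured NN
// instead of failing after the first connection refusal.
public class SafemodeOnAllNNs {
  interface AdminCall { void run(InetSocketAddress nn) throws IOException; }

  static void onEachNameNode(List<InetSocketAddress> nns, AdminCall call)
      throws IOException {
    IOException first = null;
    int succeeded = 0;
    for (InetSocketAddress nn : nns) {
      try {
        call.run(nn);   // e.g., enter safemode on this NN
        succeeded++;
      } catch (IOException e) {
        if (first == null) first = e;  // remember, but try the other NN too
      }
    }
    if (succeeded == 0 && first != null) {
      throw first;  // only fail when no NameNode could be reached
    }
  }
}
{code}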


[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down

2015-04-30 Thread surendra singh lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521238#comment-14521238
 ] 

surendra singh lilhore commented on HDFS-8277:
--

Attached a new patch with a test case. Please review.

 Safemode enter fails when Standby NameNode is down
 --

 Key: HDFS-8277
 URL: https://issues.apache.org/jira/browse/HDFS-8277
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, HDFS, namenode
Affects Versions: 2.6.0
 Environment: HDP 2.2.0
Reporter: Hari Sekhon
Assignee: surendra singh lilhore
Priority: Minor
 Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch


 HDFS fails to enter safemode when the Standby NameNode is down (eg. due to 
 AMBARI-10536).
 {code}hdfs dfsadmin -safemode enter
 safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: 
 java.net.ConnectException: Connection refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused{code}
 This appears to be a bug: it does not try both NameNodes the way the 
 standard HDFS client code does, and instead stops after getting a connection 
 refused from nn1, which is down. I verified that normal hadoop fs writes and 
 reads via the CLI did work at this time, using nn2. I happened to run this 
 command as the hdfs user on nn2, which was the surviving Active NameNode.
 After I re-bootstrapped the Standby NN to fix it, the command worked as 
 expected again.
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-219) Add md5sum facility in dfsshell

2015-04-30 Thread Kengo Seki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521224#comment-14521224
 ] 

Kengo Seki commented on HDFS-219:
-

Maybe a duplicate of HADOOP-9209?

 Add md5sum facility in dfsshell
 ---

 Key: HDFS-219
 URL: https://issues.apache.org/jira/browse/HDFS-219
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: zhangwei
  Labels: newbie

 I think it would be useful to add md5sum (or another checksum tool) to 
 dfsshell, so that files on HDFS can be verified. It can confirm a file's 
 integrity after copyFromLocal or copyToLocal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
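
For reference, a client-side version of this is straightforward with the FileSystem API plus commons-codec; a sketch of what a hypothetical "hdfs dfs -md5sum <path>" could do (not an existing dfsshell command):
{code:java}
import java.io.InputStream;
import org.apache.commons.codec.digest.DigestUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: stream an HDFS file through an MD5 digest and print it in
// the familiar "md5sum" output format.
public class HdfsMd5 {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (InputStream in = fs.open(new Path(args[0]))) {
      System.out.println(DigestUtils.md5Hex(in) + "  " + args[0]);
    }
  }
}
{code}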


[jira] [Commented] (HDFS-7810) Datanode registration process fails in hadoop 2.6

2015-04-30 Thread Vlad Frolov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521516#comment-14521516
 ] 

Vlad Frolov commented on HDFS-7810:
---

It seems that I hit the same issue here. I have a properly configured DNS 
server (bind9) with reverse DNS lookups; the `nslookup` and `host` utilities 
can resolve the IP into an FQDN, but the NameNode says:

{code:log}
15/04/30 13:45:12 WARN blockmanagement.DatanodeManager: Unresolved datanode 
registration: hostname cannot be resolved (ip=10.250.10.11, 
hostname=10.250.10.11)
15/04/30 13:45:12 INFO ipc.Server: IPC Server handler 3 on 8020, call 
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 
10.250.10.11:35776 Call#68 Retry#0
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode 
denied communication with namenode because hostname cannot be resolved 
(ip=10.250.10.11, hostname=10.250.10.11): DatanodeRegistration(0.0.0.0, 
datanodeUuid=d5fe1cf5-09ac-4644-9f8a-8c4881e3c569, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-56;cid=CID-e74a3224-300e-400b-ae92-bb7ae64cdf01;nsid=1242366503;c=0)
{code}

 Datanode registration process fails in hadoop 2.6 
 --

 Key: HDFS-7810
 URL: https://issues.apache.org/jira/browse/HDFS-7810
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, namenode
Affects Versions: 2.6.0
 Environment: ubuntu 12
Reporter: Biju Nair
  Labels: hadoop

 When a new DN is added to the cluster, the registration process fails. The 
 following are the steps followed.
 - Install and start a new DN
 - Add entry for the DN in the NN {{/etc/hosts}} file
 DN log shows that the registration process failed
 - Tried to restart DN with the same result
 Since all the DNs have multiple network interfaces, we are using the following 
 {{hdfs-site.xml}} property, instead of listing all the 
 {{dfs.datanode.xx.address}} properties.
 {code:xml}
   <property>
     <name>dfs.datanode.dns.interface</name>
     <value>eth2</value>
   </property>
 {code}
 - Restarting the NN resolves the issue with registration which is not 
 desired. 
 - Adding the following {{dfs.datanode.xx.address}} properties seems to resolve 
 the DN registration issue without an NN restart. But this is different 
 behavior compared to *hadoop 2.2*. Is there a reason for the change?
 {code:xml}
   <property>
     <name>dfs.datanode.address</name>
     <value>192.168.0.12:50010</value>
   </property>
   <property>
     <name>dfs.datanode.ipc.address</name>
     <value>192.168.0.12:50020</value>
   </property>
   <property>
     <name>dfs.datanode.http.address</name>
     <value>192.168.0.12:50075</value>
   </property>
 {code}
 *NN Log Error Entry*
 {quote}
 2015-02-17 12:21:53,583 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
 6 on 8020, call 
 org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 
 192.168.100.13:37516 Call#1027 Retry#0 
 org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode 
 denied communication with namenode because hostname cannot be resolved 
 (ip=192.168.100.13, hostname=192.168.100.13): DatanodeRegistration(0.0.0.0, 
 datanodeUuid=bd23eb3c-a5b9-43e4-ad23-1683346564ac, infoPort=50075, 
 ipcPort=50020, 
 storageInfo=lv=-56;cid=CID-02099252-fbca-4bf2-b466-9a0ed67e53a3;nsid=2048643132;c=0)
  
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:887)
  
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5002)
  
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1065)
  
 at 
 org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
  
 at 
 org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378)
  
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
  
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
 at java.security.AccessController.doPrivileged(Native Method) 
 at javax.security.auth.Subject.doAs(Subject.java:415) 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
  
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
 2015-02-17 12:21:58,607 WARN 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved 
 datanode registration: hostname cannot be resolved (ip=192.168.100.13, 
 hostname=192.168.100.13) 
 {quote}
 *DN Log Error Entry*
 {quote}
 2015-02-17 12:21:02,994 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
 Block 

[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery

2015-04-30 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521517#comment-14521517
 ] 

Kai Zheng commented on HDFS-7348:
-

Thanks [~hitliuyi], [~zhz] and [~libo-intel] for the great discussion! Looks 
like we have already come to good plans.
bq.Does it save CPU to decode in big chunks? Kai Zheng Could you advise?
Sorry, I just noticed this. Yes, you're right: as Yi also noted, it's good to 
allocate big native buffers so the ISA-L coders perform much better. Our test 
data indicate the ISA-L coder works best with a chunk size of about 32 MB.
I agree it's good to decouple the sync-and-decode unit from the chunk/cell 
size in a schema and make it configurable. It might not be good to do this at 
the whole-block level, since that could exhaust DN memory and make recovery 
unreliable. We should be able to enforce a memory usage threshold for 
recovery tasks. Since some dedicated DNs have powerful CPU cores, it makes 
sense to distribute recovery work to them, so on such DNs more than one 
recovery task will very likely execute concurrently.
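A minimal sketch of the memory-threshold idea above. All names here (e.g. 
{{RecoveryBufferPolicy}}) are hypothetical illustrations, not actual HDFS 
classes, and it assumes the preferred unit is at least one cell:
{code:java}
// Hypothetical sketch: keep concurrent striped-recovery tasks on one DN
// under a fixed native-buffer budget while still preferring large decode
// units (e.g. ~32 MB) that let the ISA-L coder perform well.
public class RecoveryBufferPolicy {
  private final long budgetBytes;   // total budget across all recovery tasks
  private long inUseBytes;          // currently reserved

  public RecoveryBufferPolicy(long budgetBytes) {
    this.budgetBytes = budgetBytes;
  }

  /** Choose a decode unit: as close to preferredBytes as the budget allows,
   *  but never smaller than one cell. Returns 0 if even a cell won't fit. */
  public synchronized int reserveUnit(int preferredBytes, int cellBytes) {
    long free = budgetBytes - inUseBytes;
    if (free < cellBytes) {
      return 0;                     // caller should queue the task instead
    }
    int unit = (int) Math.min(preferredBytes, free);
    unit -= unit % cellBytes;       // keep the decode unit cell-aligned
    inUseBytes += unit;
    return unit;
  }

  public synchronized void release(int bytes) {
    inUseBytes -= bytes;
  }
}
{code}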

 Erasure Coding: striped block recovery
 --

 Key: HDFS-7348
 URL: https://issues.apache.org/jira/browse/HDFS-7348
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Reporter: Kai Zheng
Assignee: Yi Liu
 Attachments: ECWorker.java, HDFS-7348.001.patch


 This JIRA is to recover one or more missed striped block in the striped block 
 group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521549#comment-14521549
 ] 

Hadoop QA commented on HDFS-8277:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 52s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   8m 15s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m  7s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   4m 57s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | install |   1m 56s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 37s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 44s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 166m 39s | Tests failed in hadoop-hdfs. |
| | | 216m 18s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.tools.TestDFSAdminWithHA |
|   | hadoop.hdfs.TestClose |
| Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729456/HDFS-8277_2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e89fc53 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10476/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10476/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10476/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10476/console |


This message was automatically generated.

 Safemode enter fails when Standby NameNode is down
 --

 Key: HDFS-8277
 URL: https://issues.apache.org/jira/browse/HDFS-8277
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, HDFS, namenode
Affects Versions: 2.6.0
 Environment: HDP 2.2.0
Reporter: Hari Sekhon
Assignee: surendra singh lilhore
Priority: Minor
 Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch


 HDFS fails to enter safemode when the Standby NameNode is down (eg. due to 
 AMBARI-10536).
 {code}hdfs dfsadmin -safemode enter
 safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: 
 java.net.ConnectException: Connection refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused{code}
 This appears to be a bug: the command is not trying both NameNodes the way 
 the standard HDFS client code does, and instead stops after getting a 
 connection refused from nn1, which is down. I verified that normal hadoop fs 
 writes and reads via the CLI did work at this time, using nn2. I happened to 
 run this command as the hdfs user on nn2, which was the surviving active 
 NameNode.
 After I re-bootstrapped the Standby NN to fix it, the command worked as 
 expected again.
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon
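Until the dfsadmin failover handling is fixed, a hedged manual workaround 
sketch is to point the command at the surviving active NameNode explicitly 
via the generic {{-fs}} option (the address below is illustrative):
{code}
# Workaround sketch: bypass the failed nn1 by addressing nn2 directly.
hdfs dfsadmin -fs hdfs://nn2:8020 -safemode enter
{code}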



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521577#comment-14521577
 ] 

Hadoop QA commented on HDFS-8229:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 28s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 25s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   5m 27s | The applied patch generated  1 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  7s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 11s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 225m 54s | Tests failed in hadoop-hdfs. |
| | | 271m 34s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.server.namenode.TestSaveNamespace |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729442/HDFS-8229_2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f5b3847 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/checkstyle-result-diff.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10475/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10475/console |


This message was automatically generated.

 LAZY_PERSIST file gets deleted after NameNode restart.
 --

 Key: HDFS-8229
 URL: https://issues.apache.org/jira/browse/HDFS-8229
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch


 {code}
 2015-04-20 10:26:55,180 WARN 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist 
 file /LAZY_PERSIST/smallfile with no replicas.
 {code}
 If {{LazyPersistFileScrubber}} runs after a NameNode restart but before the 
 DNs have registered, it will delete lazy-persist files.
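Until the fix lands, a hedged mitigation sketch is to widen the scrubber 
interval so DataNodes have time to register after a NameNode restart. The 
property name matches current code, the value is illustrative, and this only 
narrows the race rather than fixing it:
{code:xml}
<!-- hdfs-site.xml: delay lazy-persist scrubbing after startup; a
     mitigation sketch, not a fix for the underlying race. -->
<property>
  <name>dfs.namenode.lazypersist.file.scrub.interval.sec</name>
  <value>3600</value>
</property>
{code}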



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command

2015-04-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521585#comment-14521585
 ] 

Uma Maheswara Rao G commented on HDFS-8137:
---

Thanks a lot for the review, Kai. 
Good catch. You are right, we are storing it in xattrs along with the zone. 

{quote}
ECSchemaManager might not be supposed to get a schema associated with a zone, 
dir/file, but ErasureCodingZoneManager may do.
{quote}
By mistake I said ECSchemaManager. You are right, I should have said 
ErasureCodingZoneManager, as it has the related code I was talking about.

Also, I added the getECSchema API in the namesystem itself, as we have 
already added some ECSchema-related APIs in FSNamesystem. For reusing the 
code from ECZoneManager, keeping this new API in the namesystem gives us 
flexibility; we cannot get the same flexibility from BlockCollection, since 
we cannot access FSDirectory details there.

Please check whether the latest patch makes sense to you.
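A rough, self-contained sketch of the lookup path described above; every name 
and interface here is a hypothetical stand-in, not the actual HDFS-8137 patch:
{code:java}
import java.util.Map;

// Hypothetical sketch (not the actual patch): zone xattrs record a schema
// *name*; the schema body is persisted centrally by the NameNode
// (HDFS-7859) and resolved on lookup. The namesystem can reach both
// pieces, which BlockCollection cannot.
final class SchemaLookupSketch {
  static class ECSchema { }                 // stand-in for the real class

  interface ECZoneResolver {                // stand-in for ErasureCodingZoneManager
    String schemaNameForPath(String src);   // read the enclosing zone's xattr
  }

  private final ECZoneResolver zones;
  private final Map<String, ECSchema> persistedSchemas;

  SchemaLookupSketch(ECZoneResolver zones, Map<String, ECSchema> persisted) {
    this.zones = zones;
    this.persistedSchemas = persisted;
  }

  ECSchema getECSchema(String src) {
    return persistedSchemas.get(zones.schemaNameForPath(src));
  }
}
{code}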


 Sends the EC schema to DataNode as well in EC encoding/recovering command
 -

 Key: HDFS-8137
 URL: https://issues.apache.org/jira/browse/HDFS-8137
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Uma Maheswara Rao G
 Attachments: HDFS-8137-0.patch, HDFS-8137-1.patch


 As discussed with [~umamaheswararao] and [~vinayrpet], we should also send 
 the EC schema to the DataNode, contained in the EC encoding/recovering 
 command. The target DataNode will use it to guide the execution of the task. 
 Another way would be for the DataNode to request the schema actively through 
 a separate RPC call; as an optimization, the DataNode may cache schemas to 
 avoid asking for the same schema twice.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command

2015-04-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-8137:
--
Attachment: HDFS-8137-1.patch

 Sends the EC schema to DataNode as well in EC encoding/recovering command
 -

 Key: HDFS-8137
 URL: https://issues.apache.org/jira/browse/HDFS-8137
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Uma Maheswara Rao G
 Attachments: HDFS-8137-0.patch, HDFS-8137-1.patch


 As discussed with [~umamaheswararao] and [~vinayrpet], we should also send 
 the EC schema to the DataNode, contained in the EC encoding/recovering 
 command. The target DataNode will use it to guide the execution of the task. 
 Another way would be for the DataNode to request the schema actively through 
 a separate RPC call; as an optimization, the DataNode may cache schemas to 
 avoid asking for the same schema twice.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7949:

Attachment: HDFS-7949-HDFS-7285.08.patch

Thanks Rakesh for taking a close look. I'm attaching a dup patch just to be 
extra careful, since the space calculation _could_ affect other tests.

 WebImageViewer need support file size calculation with striped blocks
 -

 Key: HDFS-7949
 URL: https://issues.apache.org/jira/browse/HDFS-7949
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Hui Zheng
Assignee: Rakesh R
 Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, 
 HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, 
 HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch, 
 HDFS-7949-HDFS-7285.08.patch


 The file size calculation in WebImageViewer should be changed when the 
 blocks of the file are striped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.

2015-04-30 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521860#comment-14521860
 ] 

Arpit Agarwal commented on HDFS-8229:
-

+1 for the patch. Thanks for the updates [~surendrasingh].

I kicked off another pre-commit build since previous results look wrong.

 LAZY_PERSIST file gets deleted after NameNode restart.
 --

 Key: HDFS-8229
 URL: https://issues.apache.org/jira/browse/HDFS-8229
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: surendra singh lilhore
Assignee: surendra singh lilhore
 Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch


 {code}
 2015-04-20 10:26:55,180 WARN 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist 
 file /LAZY_PERSIST/smallfile with no replicas.
 {code}
 After namenode restart and before DN's registration if 
 {{LazyPersistFileScrubber}} will run then it will delete Lazy persist file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error

2015-04-30 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah reassigned HDFS-8224:


Assignee: Rushabh S Shah

 Any IOException in DataTransfer#run() will run diskError thread even if it is 
 not disk error
 

 Key: HDFS-8224
 URL: https://issues.apache.org/jira/browse/HDFS-8224
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
 Fix For: 2.8.0


 This happened in our 2.6 cluster.
 One of the blocks and its metadata file were corrupted; the disk itself was 
 healthy in this case, only the block was corrupt.
 The NameNode tried to copy that block to another datanode but failed with 
 the following stack trace:
 2015-04-20 01:04:04,421 
 [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN 
 datanode.DataNode: DatanodeRegistration(a.b.c.d, 
 datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, 
 infoSecurePort=0, ipcPort=8020, 
 storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed
  to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to 
 a1.b1.c1.d1:1004 got 
 java.io.IOException: Could not create DataChecksum of type 0 with 
 bytesPerChecksum 0
 at 
 org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989)
 at java.lang.Thread.run(Thread.java:722)
 The following catch block in the DataTransfer#run method treats every 
 IOException as a disk fault and triggers the disk error check:
 {noformat}
 catch (IOException ie) {
   LOG.warn(bpReg + ":Failed to transfer " + b + " to " +
       targets[0] + " got ", ie);
   // check if there are any disk problem
   checkDiskErrorAsync();
 }
 {noformat}
 This block was never scanned by BlockPoolSliceScanner; otherwise it would 
 have been reported as a corrupt block.
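An illustrative sketch of the narrower handling this report implies; this is 
not the committed fix, and both the helper and its message heuristic are 
assumptions:
{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;

// Illustrative sketch for HDFS-8224: classify the IOException before
// triggering the expensive async disk check, so a corrupt metadata header
// (an invalid DataChecksum, as above) is not treated as a disk fault.
final class DiskErrorHeuristic {
  private DiskErrorHeuristic() {}

  static boolean isLikelyDiskError(IOException ie) {
    // A missing block file or an OS-level I/O error points at the disk;
    // a malformed checksum header points at data corruption instead.
    if (ie instanceof FileNotFoundException) {
      return true;
    }
    String msg = ie.getMessage();
    return msg != null && msg.contains("Input/output error");
  }
}
{code}
The DataTransfer catch block would then call {{checkDiskErrorAsync()}} only 
when {{isLikelyDiskError(ie)}} returns true.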



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

