[jira] [Updated] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-7859: -- Attachment: HDFS-7859-HDFS-7285.003.patch Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, HDFS-7859.001.patch, HDFS-7859.002.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-8137 started by Uma Maheswara Rao G. - Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC schema to the DataNode, contained in the EC encoding/recovering command. The target DataNode will use it to guide the execution of the task. Alternatively, the DataNode could request the schema actively through a separate RPC call; as an optimization, the DataNode may cache schemas to avoid repeatedly asking for the same schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
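The caching alternative discussed above can be sketched as follows. This is a minimal illustration, not HDFS code: the ECSchema type, the fetch callback, and all names are hypothetical stand-ins for the NameNode RPC the comment describes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of the DataNode-side schema cache: look a schema up by
// name, issuing the (stand-in) RPC fetch only on the first miss.
class SchemaCache {
  static final class ECSchema {
    final String name;
    final int dataUnits, parityUnits;
    ECSchema(String name, int d, int p) {
      this.name = name; this.dataUnits = d; this.parityUnits = p;
    }
  }

  private final Map<String, ECSchema> cache = new ConcurrentHashMap<>();
  private final Function<String, ECSchema> rpcFetch; // stand-in for the NameNode call
  int fetches = 0;                                   // exposed so the example can show cache hits

  SchemaCache(Function<String, ECSchema> rpcFetch) { this.rpcFetch = rpcFetch; }

  ECSchema get(String name) {
    // computeIfAbsent runs the fetch at most once per key, even under races
    return cache.computeIfAbsent(name, n -> { fetches++; return rpcFetch.apply(n); });
  }
}
```

With this shape, repeated commands naming the same schema cost one RPC total, which is the "avoid repeatedly asking" optimization the comment mentions.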
[jira] [Resolved] (HDFS-8183) Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads
[ https://issues.apache.org/jira/browse/HDFS-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang resolved HDFS-8183. - Resolution: Fixed Fix Version/s: HDFS-7285 Hadoop Flags: Reviewed The patch LGTM. +1 and I just committed it to the branch (since the change is simple we can probably watch Jenkins later). Thanks Rakesh for the contribution! Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads -- Key: HDFS-8183 URL: https://issues.apache.org/jira/browse/HDFS-8183 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Fix For: HDFS-7285 Attachments: HDFS-8183-001.patch, HDFS-8183-002.patch The idea of this task is to improve the closing of all the streamers. Presently, if any of the streamers throws an exception, it returns immediately, leaving all the other streamer threads running. Instead, it is better to handle the exceptions of each streamer independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
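The close pattern this task describes can be sketched like this. The names are illustrative, not the actual DFSStripedOutputStream code: attempt to close every streamer, remember the first failure, and only rethrow after all of them have been attempted, so no streamer thread is left running.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Sketch: close all streamers independently instead of returning on the
// first exception (hypothetical helper, not the committed patch).
class CloseAll {
  static void closeStreamers(List<? extends Closeable> streamers) throws IOException {
    IOException first = null;
    for (Closeable s : streamers) {
      try {
        s.close();
      } catch (IOException e) {
        if (first == null) first = e;  // keep going; don't leave the rest running
      }
    }
    if (first != null) throw first;    // surface the failure only after all closes
  }
}
```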
[jira] [Updated] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-7949: --- Status: Patch Available (was: In Progress) WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero
[ https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521152#comment-14521152 ] Hadoop QA commented on HDFS-8276: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 29s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 7m 43s | The applied patch generated 1 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 7s | The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 224m 19s | Tests failed in hadoop-hdfs. 
| | | | 272m 48s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | Failed unit tests | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.TestClose | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.server.namenode.TestDeleteRace | | Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol | | | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache | | | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729099/HDFS-8276.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / aa22450 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/checkstyle-result-diff.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10468/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10468/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10468/console | This message was automatically generated. LazyPersistFileScrubber should be disabled if scrubber interval configured zero --- Key: HDFS-8276 URL: https://issues.apache.org/jira/browse/HDFS-8276 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8276.patch bq. but I think it is simple enough to change the meaning of the value so that zero means 'never scrub'. Let me post an updated patch. As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], the scrubber should be disabled if *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero. Currently, NameNode startup fails if the interval is configured as zero {code} 2015-04-27 23:47:31,744 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed. java.lang.IllegalArgumentException:
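The "zero means never scrub" behavior discussed above can be sketched as a guard before scheduling. Class and method names here are hypothetical; a zero or negative period passed straight to a scheduler is what would trigger the IllegalArgumentException at startup.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: interpret 0 as "scrubber disabled" instead of handing
// it to the scheduler (not the actual FSNamesystem code).
class ScrubberConfig {
  static boolean shouldSchedule(int intervalSec) {
    if (intervalSec < 0) {
      throw new IllegalArgumentException("interval must be >= 0: " + intervalSec);
    }
    return intervalSec > 0;  // 0 disables the LazyPersistFileScrubber
  }

  static void maybeSchedule(ScheduledExecutorService exec, Runnable scrubber, int intervalSec) {
    if (shouldSchedule(intervalSec)) {
      exec.scheduleAtFixedRate(scrubber, intervalSec, intervalSec, TimeUnit.SECONDS);
    }
    // else: never scrub; NameNode startup proceeds normally
  }
}
```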
[jira] [Updated] (HDFS-8178) QJM doesn't purge empty and corrupt inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8178: Description: When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. (was: HDFS-5919 fixes the issue for {{FileJournalManager}}. A similar fix is needed for QJM.) QJM doesn't purge empty and corrupt inprogress edits files -- Key: HDFS-8178 URL: https://issues.apache.org/jira/browse/HDFS-8178 Project: Hadoop HDFS Issue Type: Bug Components: qjm Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8178.000.patch When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521005#comment-14521005 ] Zhe Zhang commented on HDFS-8178: - Oops last paragraph was added by mistake, please ignore it. QJM doesn't move aside stale inprogress edits files --- Key: HDFS-8178 URL: https://issues.apache.org/jira/browse/HDFS-8178 Project: Hadoop HDFS Issue Type: Bug Components: qjm Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8178.000.patch When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521002#comment-14521002 ] Zhe Zhang commented on HDFS-8178: - Thanks ATM for the helpful review! After looking at HDFS-5919 more closely, we are actually trying to solve a different problem here. The objective of HDFS-5919 is solely to save disk space (since FJM doesn't try to process those corrupt/empty files anyway). It's a safe cleanup, making sure the tx IDs of empty / corrupt files are old enough before purging. So I think we should do the same in QJM. Our main target here is _stale_ in-progress edit log files, which are not necessarily empty/corrupt (so they won't be marked as such). As the updated description states, we want to properly take care of those files so QJM doesn't try to process them. I like your proposal of renaming / moving aside those files and removing them when they are older than {{minTxIdToKeep}}. I'll update the patch based on this idea. I also propose we do the same for corrupt / empty files, for both FJM and QJM. QJM doesn't move aside stale inprogress edits files --- Key: HDFS-8178 URL: https://issues.apache.org/jira/browse/HDFS-8178 Project: Hadoop HDFS Issue Type: Bug Components: qjm Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8178.000.patch When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
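The rename / move-aside proposal in the comment above can be sketched over file names alone. The "edits_inprogress_" prefix matches the usual segment naming, but the ".stale" suffix and helper names are assumptions for illustration, not QJM's actual scheme.

```java
// Hypothetical sketch: move a stale in-progress segment aside so it is never
// finalized, then purge it once its first txid falls below minTxIdToKeep.
class StaleEdits {
  static final String PREFIX = "edits_inprogress_";
  static final String SUFFIX = ".stale";

  // Step 1: a stale segment is renamed, not treated as a regular in-progress file.
  static String moveAsideName(String name) {
    if (!name.startsWith(PREFIX)) throw new IllegalArgumentException(name);
    return name + SUFFIX;
  }

  // Step 2: purge a moved-aside segment only when it is old enough to be safe.
  static boolean shouldPurge(String staleName, long minTxIdToKeep) {
    if (!staleName.startsWith(PREFIX) || !staleName.endsWith(SUFFIX)) return false;
    String txPart = staleName.substring(PREFIX.length(), staleName.length() - SUFFIX.length());
    long firstTxId = Long.parseLong(txPart);
    return firstTxId < minTxIdToKeep;
  }
}
```

The age check mirrors the HDFS-5919 caution quoted above: nothing is deleted until its transactions are known to be superseded.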
[jira] [Updated] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8294: --- Attachment: HDFS-8294-HDFS-7285.00.patch Erasure Coding: Fix Findbug warnings present in erasure coding -- Key: HDFS-8294 URL: https://issues.apache.org/jira/browse/HDFS-8294 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8294-HDFS-7285.00.patch Following are the findbug warnings :- # Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) {code} Bug type NP_NULL_ON_SOME_PATH (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Value loaded from arr$ Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] Known null at BlockInfoStripedUnderConstruction.java:[line 200] {code} # Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) {code} Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema) Called method String.getBytes() At ErasureCodingZoneManager.java:[line 116] Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath) Called method new String(byte[]) At ErasureCodingZoneManager.java:[line 
81] {code} # Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time {code} Bug type IS2_INCONSISTENT_SYNC (click for details) In class org.apache.hadoop.hdfs.DFSOutputStream Field org.apache.hadoop.hdfs.DFSOutputStream.streamer Synchronized 90% of the time Unsynchronized access at DFSOutputStream.java:[line 142] Unsynchronized access at DFSOutputStream.java:[line 853] Unsynchronized access at DFSOutputStream.java:[line 617] Unsynchronized access at DFSOutputStream.java:[line 620] Unsynchronized access at DFSOutputStream.java:[line 630] Unsynchronized access at DFSOutputStream.java:[line 338] Unsynchronized access at DFSOutputStream.java:[line 734] Unsynchronized access at DFSOutputStream.java:[line 897] {code} # Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() {code} Bug type DLS_DEAD_LOCAL_STORE (click for details) In class org.apache.hadoop.hdfs.StripedDataStreamer In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() Local variable named offSuccess At StripedDataStreamer.java:[line 105] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.util.StripedBlockUtil In method org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] {code} # Switch statement found in 
org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) where default case is missing {code} Bug type SF_SWITCH_NO_DEFAULT (click for details) In class org.apache.hadoop.hdfs.DFSStripedInputStream In method org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) At DFSStripedInputStream.java:[lines 468-491] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
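Two of the warning patterns listed above have standard one-line fixes, sketched below with placeholder method names (the real erasure-coding methods are not reproduced here): DM_DEFAULT_ENCODING is resolved by naming an explicit charset, and ICAST_INTEGER_MULTIPLY_CAST_TO_LONG by widening to long before the multiplication rather than after.

```java
import java.nio.charset.StandardCharsets;

// Illustrative fixes for the DM_DEFAULT_ENCODING and
// ICAST_INTEGER_MULTIPLY_CAST_TO_LONG patterns (names are placeholders).
class FindbugsFixes {
  // DM_DEFAULT_ENCODING: never rely on the platform default charset.
  static byte[] zoneKey(String path) {
    return path.getBytes(StandardCharsets.UTF_8);    // not path.getBytes()
  }
  static String zonePath(byte[] raw) {
    return new String(raw, StandardCharsets.UTF_8);  // not new String(raw)
  }

  // ICAST: cast an operand first so the multiply happens in long arithmetic;
  // casting the int result would preserve the already-overflowed value.
  static long spaceConsumed(int cellSize, int cellsPerBlock, int blocks) {
    return (long) cellSize * cellsPerBlock * blocks;
  }
}
```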
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520986#comment-14520986 ] Chris Nauroth commented on HDFS-8283: - These test failures might be related too: https://builds.apache.org/job/PreCommit-HDFS-Build/10455/testReport/ DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch - When throwing an exception: always set lastException, and always create a new exception so that it has the new stack trace. - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8290) WebHDFS calls before namesystem initialization can cause NullPointerException.
[ https://issues.apache.org/jira/browse/HDFS-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520990#comment-14520990 ] Chris Nauroth commented on HDFS-8290: - The Findbugs warning is in an unrelated part of the codebase. It's possible that both the Findbugs warning and the test failures were introduced by HDFS-8283. I'm waiting for confirmation before I commit this. WebHDFS calls before namesystem initialization can cause NullPointerException. -- Key: HDFS-8290 URL: https://issues.apache.org/jira/browse/HDFS-8290 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: HDFS-8290.001.patch The NameNode has a brief window of time when the HTTP server has been initialized, but the namesystem has not been initialized. During this window, a WebHDFS call can cause a {{NullPointerException}}. We can catch this condition and return a more meaningful error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7949) WebImageViewer needs to support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521014#comment-14521014 ] Zhe Zhang commented on HDFS-7949: - Thanks Rakesh! The patch LGTM, +1 pending a Jenkins run. Do you mind clicking Submit Patch and renaming the patch to HDFS-7949-HDFS-7285.007.patch? WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Priority: Minor Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-8137: -- Attachment: HDFS-8137-0.patch I generated an initial patch for review! We are supposed to get schema values from ECSchemaManager, but right now I don't see a better way to get them from ECSchemaManager, so I added an API to get them from BlockCollection itself, like the isStriped API in it. This is because BlockManager communicates with the namesystem via the Namesystem interface, and I don't think it's right to add APIs there for every new feature. BlockCollection is another interface like that, so I added the API there. Logically, Namesystem may be the correct place to add getECSchema for a file path, but I am not too strong on that. I would like to hear suggestions on that, if any. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC schema to the DataNode, contained in the EC encoding/recovering command. The target DataNode will use it to guide the execution of the task. Alternatively, the DataNode could request the schema actively through a separate RPC call; as an optimization, the DataNode may cache schemas to avoid repeatedly asking for the same schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520994#comment-14520994 ] Zhe Zhang commented on HDFS-8282: - Thanks Yi for reviewing again! I just committed it to the branch. Erasure coding: move striped reading logic to StripedBlockUtil -- Key: HDFS-8282 URL: https://issues.apache.org/jira/browse/HDFS-8282 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8282-HDFS-7285.00.patch, HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8282: Resolution: Fixed Fix Version/s: HDFS-7285 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) [~hitliuyi] We need to rebase both HDFS-7678 and HDFS-7348 against this change. Erasure coding: move striped reading logic to StripedBlockUtil -- Key: HDFS-8282 URL: https://issues.apache.org/jira/browse/HDFS-8282 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: HDFS-7285 Attachments: HDFS-8282-HDFS-7285.00.patch, HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8183) Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads
[ https://issues.apache.org/jira/browse/HDFS-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521053#comment-14521053 ] Rakesh R commented on HDFS-8183: Thank you [~zhz] for reviewing and committing the changes. Erasure Coding: Improve DFSStripedOutputStream closing of datastreamer threads -- Key: HDFS-8183 URL: https://issues.apache.org/jira/browse/HDFS-8183 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Fix For: HDFS-7285 Attachments: HDFS-8183-001.patch, HDFS-8183-002.patch The idea of this task is to improve the closing of all the streamers. Presently, if any of the streamers throws an exception, it returns immediately, leaving all the other streamer threads running. Instead, it is better to handle the exceptions of each streamer independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Open (was: Patch Available) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.7.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
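Both fixes described above can be sketched with hypothetical helpers (these are not the NetUtils or DataNodeID methods themselves): when building an authority string, an IPv6 literal must be wrapped in brackets because it contains colons itself, and when parsing, the host/port split must use the last colon rather than the first.

```java
// Sketch of the two parsing fixes discussed in this issue (helper names are
// illustrative, not the actual Hadoop code).
class HostPort {
  // Building: bracket IPv6 literals so "host:port" stays unambiguous for URIs.
  static String toAuthority(String ip, int port) {
    String host = ip.contains(":") ? "[" + ip + "]" : ip;
    return host + ":" + port;
  }

  // Parsing: split at the LAST colon; split(":")[0] would truncate an IPv6
  // address to its first group (the "2401 is not an IP string literal" error).
  static String[] splitHostPort(String authority) {
    int i = authority.lastIndexOf(':');
    if (i < 0) throw new IllegalArgumentException("no port in: " + authority);
    String host = authority.substring(0, i);
    if (host.startsWith("[") && host.endsWith("]")) {
      host = host.substring(1, host.length() - 1);  // strip IPv6 brackets
    }
    return new String[] { host, authority.substring(i + 1) };
  }
}
```

lastIndexOf also avoids the per-call array allocation of split, matching the minor performance note above.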
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Attachment: (was: HDFS-8078.6.patch) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.7.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Edel updated HDFS-8078: Status: Patch Available (was: Open) HDFS client gets errors trying to to connect to IPv6 DataNode - Key: HDFS-8078 URL: https://issues.apache.org/jira/browse/HDFS-8078 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.6.0 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 Attachments: HDFS-8078.7.patch 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
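The two parsing changes discussed above (splitting host:port at the last colon, and bracketing IPv6 literals so that URI-based parsing accepts them) can be sketched as follows. This is a hypothetical standalone helper, not the actual HDFS-8078 patch; the class and method names are illustrative:

```java
import java.net.URI;

// Hypothetical helper, NOT the actual HDFS-8078 patch: split "host:port"
// at the LAST colon so IPv6 literals survive, and bracket bare IPv6
// literals so URI-based parsing (as in NetUtils.createSocketAddr) works.
public class HostPortUtil {

    /** Split "host:port" at the last colon; IPv4, hostnames, and IPv6 all work. */
    public static String[] splitHostPort(String addr) {
        int i = addr.lastIndexOf(':');
        if (i < 0) {
            throw new IllegalArgumentException("no port in: " + addr);
        }
        return new String[] { addr.substring(0, i), addr.substring(i + 1) };
    }

    /** Re-join host and port, bracketing bare IPv6 literals for URI use. */
    public static String toUriAuthority(String host, int port) {
        // A colon in an unbracketed host marks an IPv6 literal.
        if (host.indexOf(':') >= 0 && !host.startsWith("[")) {
            return "[" + host + "]:" + port;
        }
        return host + ":" + port;
    }

    public static void main(String[] args) {
        String[] hp = splitHostPort("2401:db00:1010:70ba:face:0:8:0:50010");
        String authority = toUriAuthority(hp[0], Integer.parseInt(hp[1]));
        // java.net.URI accepts the bracketed form without complaint.
        URI uri = URI.create("hdfs://" + authority);
        System.out.println(uri.getHost() + " port " + uri.getPort());
    }
}
```

Splitting with lastIndexOf keeps the IPv6 address intact and is a touch cheaper than split(); bracketing the literal matches the proto://[addr]:port form that java.net.URI requires.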
[jira] [Updated] (HDFS-8178) QJM doesn't move aside stale inprogress edits files
[ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8178: Summary: QJM doesn't move aside stale inprogress edits files (was: QJM doesn't purge empty and corrupt inprogress edits files) QJM doesn't move aside stale inprogress edits files --- Key: HDFS-8178 URL: https://issues.apache.org/jira/browse/HDFS-8178 Project: Hadoop HDFS Issue Type: Bug Components: qjm Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8178.000.patch When a QJM crashes, the in-progress edit log file at that time remains in the file system. When the node comes back, it will accept new edit logs and those stale in-progress files are never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize them, which potentially causes high memory usage. This JIRA aims to move aside those stale edit log files to avoid this scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Status: Patch Available (was: In Progress) Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch This JIRA is to add test cases with the CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521026#comment-14521026 ] Yi Liu edited comment on HDFS-7348 at 4/30/15 7:06 AM: --- Thanks Zhe and Bo for further discussion. I will rebase the patch, and make buffer size configurable and add decode part as Zhe's suggestion. For sequential vs. parallel reading, I will file a follow-on and target in phase2. For local read (if the source is local) and local write (if the target is local), you guys can do them as follow-on in your JIRAs and target to Phase2. was (Author: hitliuyi): Thanks Zhe and Bo for further discussion. I will rebase the patch, and make buffer size configurable and add decode part as Zhe's suggestion. For sequential vs. parallel reading, I will file a follow-on and target in phase2. For local read and local write, you guys can do them as follow-on in your JIRAs and target to Phase2. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missing striped blocks in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations
[ https://issues.apache.org/jira/browse/HDFS-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-8295: -- Issue Type: Sub-task (was: Task) Parent: HDFS-8031 Add MODIFY and REMOVE ECSchema editlog operations - Key: HDFS-8295 URL: https://issues.apache.org/jira/browse/HDFS-8295 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xinwei Qin Assignee: Xinwei Qin If MODIFY and REMOVE ECSchema operations are supported, then add these editlog operations to persist them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521026#comment-14521026 ] Yi Liu commented on HDFS-7348: -- Thanks Zhe and Bo for further discussion. I will rebase the patch, and make buffer size configurable and add decode part as Zhe's suggestion. For sequential vs. parallel reading, I will file a follow-on and target in phase2. For local read and local write, you guys can do them as follow-on in your JIRAs and target to Phase2. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missing striped blocks in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-7949: --- Attachment: HDFS-7949-HDFS-7285.08.patch WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Priority: Minor Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8229: - Attachment: HDFS-8229_2.patch LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} After a namenode restart and before the DNs' registration, if {{LazyPersistFileScrubber}} runs it will delete the lazy persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520989#comment-14520989 ] Yi Liu commented on HDFS-8282: -- yes, +1 Erasure coding: move striped reading logic to StripedBlockUtil -- Key: HDFS-8282 URL: https://issues.apache.org/jira/browse/HDFS-8282 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8282-HDFS-7285.00.patch, HDFS-8282-HDFS-7285.01.patch, HDFS-8282-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8294: --- Status: Patch Available (was: Open) Erasure Coding: Fix Findbug warnings present in erasure coding -- Key: HDFS-8294 URL: https://issues.apache.org/jira/browse/HDFS-8294 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8294-HDFS-7285.00.patch Following are the findbug warnings :- # Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) {code} Bug type NP_NULL_ON_SOME_PATH (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Value loaded from arr$ Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] Known null at BlockInfoStripedUnderConstruction.java:[line 200] {code} # Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) {code} Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema) Called method String.getBytes() At ErasureCodingZoneManager.java:[line 116] Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath) Called method new String(byte[]) At ErasureCodingZoneManager.java:[line 81] 
{code} # Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time {code} Bug type IS2_INCONSISTENT_SYNC (click for details) In class org.apache.hadoop.hdfs.DFSOutputStream Field org.apache.hadoop.hdfs.DFSOutputStream.streamer Synchronized 90% of the time Unsynchronized access at DFSOutputStream.java:[line 142] Unsynchronized access at DFSOutputStream.java:[line 853] Unsynchronized access at DFSOutputStream.java:[line 617] Unsynchronized access at DFSOutputStream.java:[line 620] Unsynchronized access at DFSOutputStream.java:[line 630] Unsynchronized access at DFSOutputStream.java:[line 338] Unsynchronized access at DFSOutputStream.java:[line 734] Unsynchronized access at DFSOutputStream.java:[line 897] {code} # Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() {code} Bug type DLS_DEAD_LOCAL_STORE (click for details) In class org.apache.hadoop.hdfs.StripedDataStreamer In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() Local variable named offSuccess At StripedDataStreamer.java:[line 105] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.util.StripedBlockUtil In method org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] {code} # Switch statement found in 
org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) where default case is missing {code} Bug type SF_SWITCH_NO_DEFAULT (click for details) In class org.apache.hadoop.hdfs.DFSStripedInputStream In method org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) At DFSStripedInputStream.java:[lines 468-491] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
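Illustrative only (not the actual HDFS-8294 patch): the generic fix pattern for the two ICAST_INTEGER_MULTIPLY_CAST_TO_LONG warnings above is to cast one operand to long before multiplying, so the product is computed in 64-bit arithmetic rather than overflowing as a 32-bit int first. Method names here are made up for the demo:

```java
// Demo of the flagged pattern vs. the fixed pattern; the method names
// are hypothetical, not the real BlockInfoStriped/StripedBlockUtil code.
public class CastDemo {
    // Flagged pattern: the 32-bit multiply overflows, THEN widens to long.
    static long spaceConsumedWrong(int cellSize, int numCells) {
        return cellSize * numCells;
    }

    // Fixed pattern: widen one operand first, multiply in 64 bits.
    static long spaceConsumedRight(int cellSize, int numCells) {
        return (long) cellSize * numCells;
    }

    public static void main(String[] args) {
        int cellSize = 1 << 20;   // 1 MiB cells
        int numCells = 4096;      // product is 2^32, which overflows int
        System.out.println(spaceConsumedWrong(cellSize, numCells));  // prints 0
        System.out.println(spaceConsumedRight(cellSize, numCells));  // prints 4294967296
    }
}
```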
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521029#comment-14521029 ] Xinwei Qin commented on HDFS-7859: --- The 003 patch removes MODIFY and REMOVE ECSchema editlog operations, these operations will be added by another JIRA(HDFS-8295) later when they are supported. Erasure Coding: Persist EC schemas in NameNode -- Key: HDFS-7859 URL: https://issues.apache.org/jira/browse/HDFS-7859 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, HDFS-7859.001.patch, HDFS-7859.002.patch In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-7949: --- Priority: Major (was: Minor) Target Version/s: HDFS-7285 WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations
Xinwei Qin created HDFS-8295: - Summary: Add MODIFY and REMOVE ECSchema editlog operations Key: HDFS-8295 URL: https://issues.apache.org/jira/browse/HDFS-8295 Project: Hadoop HDFS Issue Type: Task Reporter: Xinwei Qin Assignee: Xinwei Qin If MODIFY and REMOVE ECSchema operations are supported, then add these editlog operations to persist them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521047#comment-14521047 ] Hadoop QA commented on HDFS-8294: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729419/HDFS-8294-HDFS-7285.00.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 5a83838 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10472/console | This message was automatically generated. Erasure Coding: Fix Findbug warnings present in erasure coding -- Key: HDFS-8294 URL: https://issues.apache.org/jira/browse/HDFS-8294 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8294-HDFS-7285.00.patch Following are the findbug warnings :- # Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) {code} Bug type NP_NULL_ON_SOME_PATH (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction In method org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Value loaded from arr$ Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] Known null at BlockInfoStripedUnderConstruction.java:[line 200] {code} # Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) {code} Bug type DM_DEFAULT_ENCODING 
(click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema) Called method String.getBytes() At ErasureCodingZoneManager.java:[line 116] Bug type DM_DEFAULT_ENCODING (click for details) In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager In method org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath) Called method new String(byte[]) At ErasureCodingZoneManager.java:[line 81] {code} # Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time {code} Bug type IS2_INCONSISTENT_SYNC (click for details) In class org.apache.hadoop.hdfs.DFSOutputStream Field org.apache.hadoop.hdfs.DFSOutputStream.streamer Synchronized 90% of the time Unsynchronized access at DFSOutputStream.java:[line 142] Unsynchronized access at DFSOutputStream.java:[line 853] Unsynchronized access at DFSOutputStream.java:[line 617] Unsynchronized access at DFSOutputStream.java:[line 620] Unsynchronized access at DFSOutputStream.java:[line 630] Unsynchronized access at DFSOutputStream.java:[line 338] Unsynchronized access at DFSOutputStream.java:[line 734] Unsynchronized access at DFSOutputStream.java:[line 897] {code} # Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() {code} Bug type DLS_DEAD_LOCAL_STORE (click for details) In class org.apache.hadoop.hdfs.StripedDataStreamer In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() Local variable named offSuccess At StripedDataStreamer.java:[line 105] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped In method 
org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] {code} # Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) {code} Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) In class org.apache.hadoop.hdfs.util.StripedBlockUtil In method org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] {code} # Switch statement found in org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, long, byte[], int, Map) where default case is missing
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Attachment: HDFS-8242-HDFS-7285.04.patch Attached the previous patch again to see the jenkins report. Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch This JIRA is to add test cases with the CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521123#comment-14521123 ] Kai Zheng commented on HDFS-8137: - Uma thanks for the patch and good comments. I'd like to look at this and give my thoughts later today. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch Discussed with [~umamaheswararao] and [~vinayrpet]: we should also send the EC schema to the DataNode, contained in the EC encoding/recovering command. The target DataNode will use it to guide the execution of the task. Another way would be for the DataNode to request the schema actively through a separate RPC call; as an optimization, the DataNode may cache schemas to avoid asking for the same schema twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7897) Shutdown metrics when stopping JournalNode
[ https://issues.apache.org/jira/browse/HDFS-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521202#comment-14521202 ] zhouyingchao commented on HDFS-7897: Any updates regarding this simple patch? Shutdown metrics when stopping JournalNode -- Key: HDFS-7897 URL: https://issues.apache.org/jira/browse/HDFS-7897 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: zhouyingchao Assignee: zhouyingchao Attachments: HDFS-7897-001.patch In JournalNode.stop(), the metrics system is not shut down. The issue was found while reading the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520951#comment-14520951 ] Li Bo commented on HDFS-7348: - bq. We can do local writing and local reading logics as follow-on under HDFS-8031. Agree. We can do optimization of write and read logics later. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missing striped blocks in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521004#comment-14521004 ] Akira AJISAKA commented on HDFS-5574: - Looks like jenkins ran the tests in the hadoop-hdfs project with hadoop-common-3.0.0-date.jar, which does not have {{FSInputChecker#readAndDiscard}}. I could reproduce the error with the following commands:
{code}
$ cd hadoop-hdfs-project/hadoop-hdfs
$ mvn test -Dtest=TestDFSInputStream
{code}
Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip use a temp buffer to read data into, which is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8296) BlockManager.getUnderReplicatedBlocksCount() is not giving correct count if namenode in safe mode.
surendra singh lilhore created HDFS-8296: Summary: BlockManager.getUnderReplicatedBlocksCount() is not giving correct count if namenode in safe mode. Key: HDFS-8296 URL: https://issues.apache.org/jira/browse/HDFS-8296 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore {{underReplicatedBlocksCount}} is updated by the {{updateState()}} API:
{code}
void updateState() {
  pendingReplicationBlocksCount = pendingReplications.size();
  underReplicatedBlocksCount = neededReplications.size();
  corruptReplicaBlocksCount = corruptReplicas.size();
}
{code}
but this is never called while the NN is in safe mode, because {{computeDatanodeWork()}} returns 0 before reaching it:
{code}
int computeDatanodeWork() {
  .
  if (namesystem.isInSafeMode()) {
    return 0;
  }
  this.updateState();
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
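A minimal sketch of the fix direction this report suggests, using a simplified stand-in class (NOT the real BlockManager): call updateState() before the safe-mode early return, so the counters stay current even though no replication work is scheduled:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in, NOT the real BlockManager: demonstrates moving the
// updateState() call ahead of the safe-mode early return so the
// under-replicated count stays correct while the NN is in safe mode.
public class BlockManagerSketch {
    final List<Object> pendingReplications = new ArrayList<>();
    final List<Object> neededReplications = new ArrayList<>();
    final List<Object> corruptReplicas = new ArrayList<>();

    long pendingReplicationBlocksCount;
    long underReplicatedBlocksCount;
    long corruptReplicaBlocksCount;

    boolean inSafeMode;

    void updateState() {
        pendingReplicationBlocksCount = pendingReplications.size();
        underReplicatedBlocksCount = neededReplications.size();
        corruptReplicaBlocksCount = corruptReplicas.size();
    }

    int computeDatanodeWork() {
        updateState();          // moved before the safe-mode check
        if (inSafeMode) {
            return 0;           // still schedule no replication work
        }
        // ... schedule replication/invalidation work here ...
        return (int) underReplicatedBlocksCount;
    }

    long getUnderReplicatedBlocksCount() {
        return underReplicatedBlocksCount;
    }
}
```

With this ordering, getUnderReplicatedBlocksCount() reflects neededReplications even when computeDatanodeWork() exits early in safe mode.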
[jira] [Commented] (HDFS-8161) Both Namenodes are in standby State
[ https://issues.apache.org/jira/browse/HDFS-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521064#comment-14521064 ] Brahma Reddy Battula commented on HDFS-8161: [~vinayrpet], [~jnp] and [~arpitagarwal], any thoughts on this? As there is no checksum verification on the ZK side, and no one seems interested in a checksum feature there (since I have not seen any comment in ZOOKEEPER-2175), can we have some mechanism here? Both Namenodes are in standby State --- Key: HDFS-8161 URL: https://issues.apache.org/jira/browse/HDFS-8161 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover Affects Versions: 2.6.0 Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Attachments: ACTIVEBreadcumb and StandbyElector.txt Suspected scenario: Start a cluster with three nodes. Reboot the machine where ZKFC is not running (here the active NN's ZKFC should open a session with this ZK). Now the active NN's ZKFC session expires and it tries to re-establish a connection with another ZK. By that time the standby NN's ZKFC will try to fence the old active, create the active breadcrumb, and make the SNN active. But immediately it is fenced back to standby state (here is the doubt). Hence both will be in standby state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations
[ https://issues.apache.org/jira/browse/HDFS-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinwei Qin updated HDFS-8295: -- Attachment: HDFS-8295.001.patch An initial patch based on HDFS-7859. Add MODIFY and REMOVE ECSchema editlog operations - Key: HDFS-8295 URL: https://issues.apache.org/jira/browse/HDFS-8295 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xinwei Qin Assignee: Xinwei Qin Attachments: HDFS-8295.001.patch If MODIFY and REMOVE ECSchema operations are supported, then add these editlog operations to persist them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8229: - Status: Patch Available (was: Open) LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} After a namenode restart and before the DNs' registration, if {{LazyPersistFileScrubber}} runs it will delete the lazy persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521145#comment-14521145 ] surendra singh lilhore commented on HDFS-8229: -- Attached a new patch, please review. In the test case I am not using the {{getCorruptReplicaBlocksCount()}} API for counting corrupt blocks because of [HDFS-8296|https://issues.apache.org/jira/browse/HDFS-8296]. LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} After a namenode restart and before the DNs' registration, if {{LazyPersistFileScrubber}} runs it will delete the lazy persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-7770: Resolution: Fixed Fix Version/s: 2.7.1 2.8.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed this to trunk, branch-2, and branch-2.7. Thanks [~xyao] for the contribution. Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to be a collection of storages with different types. However, I can't find documentation on how to label different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir, and document DISK as the storage type used if no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8297) Ability to online trigger data dir rescan for blocks
Hari Sekhon created HDFS-8297: - Summary: Ability to online trigger data dir rescan for blocks Key: HDFS-8297 URL: https://issues.apache.org/jira/browse/HDFS-8297 Project: Hadoop HDFS Issue Type: New Feature Components: datanode Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon Feature request to add functionality to trigger a data dir rescan for available blocks online, without having to restart the datanode. The motivation: when using HDFS storage tiering with an archive tier on a separate hyperscale storage device over the network (Hedvig in this case), the device may go away and then return due to, say, a network interruption or other temporary error. This leaves HDFS fsck declaring missing blocks that are clearly visible on the mount point of the node's archive directory. An online trigger for a data dir rescan for available blocks would avoid having to do a rolling restart of all datanodes across a cluster. I did try sending a kill -HUP to the datanode process (both the SecureDataNodeStarter parent and child) while tailing the log, hoping this might do it, but nothing happened in the log. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521253#comment-14521253 ] Akira AJISAKA commented on HDFS-5574: - +1, I ran the failed tests locally and all the tests passed. Committing this. Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
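Since several messages in this thread revolve around the temp-buffer copy being removed here, a minimal standalone sketch of the idea may help (plain java.io streams, not the actual Hadoop {{BlockReader}} API): the old pattern copies every skipped byte into a throwaway buffer, while a position-based skip simply advances the offset without copying.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipDemo {
    // Old pattern: "skip" by reading the bytes into a temp buffer and
    // discarding them -- every skipped byte is still copied once.
    static long skipWithCopy(InputStream in, long n) throws IOException {
        byte[] tmp = new byte[512];
        long remaining = n;
        while (remaining > 0) {
            int read = in.read(tmp, 0, (int) Math.min(tmp.length, remaining));
            if (read < 0) break; // end of stream
            remaining -= read;
        }
        return n - remaining; // bytes actually skipped
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[4096];
        InputStream copying = new ByteArrayInputStream(data);
        InputStream direct = new ByteArrayInputStream(data);
        long a = skipWithCopy(copying, 1000);
        long b = direct.skip(1000); // position advances, nothing is copied
        System.out.println(a + " " + b);
    }
}
```

Both calls report 1000 bytes skipped; the difference is that the second one never touches the skipped data, which is the kind of copy this patch eliminates.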
[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521306#comment-14521306 ] Hadoop QA commented on HDFS-8242: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 43s | Pre-patch HDFS-7285 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 58s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 5m 41s | There were no new checkstyle issues. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 13s | The patch appears to introduce 9 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 16s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 88m 18s | Tests failed in hadoop-hdfs. 
| | | | 135m 14s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time Unsynchronized access at DFSOutputStream.java:90% of time Unsynchronized access at DFSOutputStream.java:[line 142] | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:[line 105] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] | | | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema):in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() At ErasureCodingZoneManager.java:[line 116] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new 
String(byte[]) At ErasureCodingZoneManager.java:[line 81] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 85] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int) At StripedBlockUtil.java:to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int) At StripedBlockUtil.java:[line 167] | | Failed unit tests | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestClose | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestMultiThreadedHflush | | Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol | | | org.apache.hadoop.hdfs.TestSetrepIncreasing | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729426/HDFS-8242-HDFS-7285.04.patch | | Optional Tests
[jira] [Commented] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521322#comment-14521322 ] Takanobu Asanuma commented on HDFS-7687: Thank you for the information. Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Takanobu Asanuma We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521323#comment-14521323 ] Akira AJISAKA commented on HDFS-7770: - Thanks [~xyao] for updating the patch. LGTM, +1. bq. I think we can address that in a separate JIRA. Agree, let's create a jira for this. Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to manage a collection of storages with different types. However, I can't find documentation on how to label the different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-5574: Resolution: Fixed Fix Version/s: 2.8.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed this to trunk and branch-2. Thanks [~decster] for the contribution! Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated HDFS-7687: --- Attachment: HDFS-7687.1.patch I created an initial patch. The main changes in this patch are below. # I separated {{collect\[File|Block\]Summary}} into {{collectReplicated\[File|Block\]Summary}} and {{collectEC\[File|Block\]Summary}}. # I named or renamed some variables and outputs. For example, {{ReplicatedBlocks}} becomes {{ECBlockGroups}} in EC, and {{Replication}} or {{Replicas}} becomes {{ECBlocks}} in EC. # I added EC summaries to Result#toString. Would you please review this patch? I'm going to add some tests for this code. Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Takanobu Asanuma Attachments: HDFS-7687.1.patch We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7578) NFS WRITE and COMMIT responses should always use the channel pipeline
[ https://issues.apache.org/jira/browse/HDFS-7578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521467#comment-14521467 ] Allen Wittenauer commented on HDFS-7578: bq. It's also strange that my comment triggered Jenkins. Is that expected with the new test script? Yup. That part of the pipeline is before test-patch.sh. It's always been that way. NFS WRITE and COMMIT responses should always use the channel pipeline - Key: HDFS-7578 URL: https://issues.apache.org/jira/browse/HDFS-7578 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-7578.001.patch, HDFS-7578.002.patch Write and Commit responses directly write data to the channel instead of propagating it to the next immediate handler in the channel pipeline. Not following Netty channel pipeline model could be problematic. We don't know whether it could cause any resource leak or performance issue especially the internal pipeline implementation keeps changing with newer Netty releases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem
[ https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8299: -- Description: Fsck shows missing blocks when the blocks can be found on a datanode's filesystem and the datanode has been restarted to try to get it to recognize that the blocks are indeed present and hence report them to the NameNode in a block report. Fsck output showing an example missing block:
{code}
/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT blockpool BP-120244285-ip-1417023863606 block blk_1075202330
 MISSING 1 blocks of total size 3260848 B
0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!
{code}
The block is definitely present on more than one datanode, however; here is the output from one of them that I restarted to try to get it to report the block to the NameNode:
{code}
# ll /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
-rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
-rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta
{code}
It's worth noting that this is on HDFS tiered storage, with an archive tier going to a networked block device that may have become temporarily unavailable but is available now. See also feature request HDFS-8297 for an online rescan, to avoid having to go around restarting datanodes. It turns out from the datanode log (attached) that this is because the datanode fails to get a write lock on the filesystem. I think it would be better to still serve those blocks read-only, however, since the current behavior causes client-visible data unavailability when the data could in fact be read.
{code}
2015-04-30 14:11:08,235 WARN datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn
	at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
	at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
	at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
	at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
{code}
Hari Sekhon http://www.linkedin.com/in/harisekhon HDFS reporting missing blocks when they are actually present due to read-only filesystem Key: HDFS-8299 URL: https://issues.apache.org/jira/browse/HDFS-8299 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon Priority: Critical Attachments: datanode.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
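For readers wondering why a read-only mount fails the DataNode's startup check even though the block files remain readable, here is a minimal, hypothetical sketch of a DiskChecker-style write probe (simplified; the real {{DiskChecker}} also verifies read and execute access, and this is not the actual Hadoop API):

```java
import java.io.File;
import java.io.IOException;

public class DirAccessDemo {
    // Simplified stand-in for a write-access check: try to create and remove
    // a probe file. On a read-only mount the create fails, so the whole data
    // dir is rejected -- even though existing block files could still be read.
    static boolean isWritable(File dir) {
        File probe = new File(dir, ".probe-" + System.nanoTime());
        try {
            boolean created = probe.createNewFile();
            if (created) {
                probe.delete(); // clean up the probe file
            }
            return created;
        } catch (IOException e) {
            return false; // e.g. EROFS on a read-only filesystem
        }
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("writable: " + isWritable(tmp));
    }
}
```

This is why the reporter suggests serving such blocks read-only instead of failing the directory outright: the probe tests writability, not readability.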
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Attachment: (was: HDFS-8242-HDFS-7285.05.patch) Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch This JIRA to add test cases with CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521407#comment-14521407 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Attachment: HDFS-8242-HDFS-7285.05.patch Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch This JIRA to add test cases with CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521401#comment-14521401 ] Hudson commented on HDFS-8269: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-8269. getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime. Contributed by Haohui Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit log entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, therefore it eventually leads to an NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
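A hypothetical sketch of the fix's idea (the map and names below are illustrative stand-ins, not the real {{FSNamesystem}} internals): resolve a {{/.reserved/.inodes/<id>}} alias to the regular path before recording it, so the edit-log loader never sees a {{/.reserved}} path.

```java
import java.util.HashMap;
import java.util.Map;

public class ReservedPathDemo {
    // Illustrative inode-id -> path table; in the NameNode this lookup is
    // done against the in-memory namespace, not a plain map.
    static final Map<Long, String> INODE_TO_PATH = new HashMap<>();

    // Resolve a /.reserved/.inodes/<id> alias to the regular path; any other
    // path is returned unchanged.
    static String resolve(String path) {
        String prefix = "/.reserved/.inodes/";
        if (path.startsWith(prefix)) {
            long id = Long.parseLong(path.substring(prefix.length()));
            String real = INODE_TO_PATH.get(id);
            if (real != null) return real;
        }
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical path for inode 18230 (the id from the bad edit-log
        // record above); the real target path is whatever the inode names.
        INODE_TO_PATH.put(18230L, "/user/alice/data.txt");
        System.out.println(resolve("/.reserved/.inodes/18230"));
        System.out.println(resolve("/user/alice/data.txt"));
    }
}
```

Logging the resolved path keeps OP_TIMES records replayable, which is exactly what the committed patch achieves in {{FSNamesystem#getBlockLocations}}.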
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521409#comment-14521409 ] Hudson commented on HDFS-8214: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. Contributed by Charles Lamb. (wang: rev aa22450442ebe39916a6fd460fe97e347945526d) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.8.0 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
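To see why a monotonic clock value renders as a bogus date, here is a self-contained comparison (standard JDK only; on typical JVMs the monotonic value, whose origin is arbitrary, formats as a date not long after the epoch, matching the "weird times" described in this issue):

```java
import java.util.Date;

public class ClockDemo {
    public static void main(String[] args) {
        // Wall-clock time: milliseconds since the Unix epoch -- valid as a date.
        long wall = System.currentTimeMillis();
        // Monotonic time: measures elapsed intervals from an arbitrary origin
        // (often process/OS start), so it is meaningless as a calendar date.
        long mono = System.nanoTime() / 1_000_000L;
        System.out.println("wall interpreted as a date: " + new Date(wall));
        System.out.println("mono interpreted as a date: " + new Date(mono));
        // Rule of thumb: monotonic time for measuring durations, wall-clock
        // time for timestamps that will be displayed -- as in this fix.
    }
}
```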
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521408#comment-14521408 ] Hudson commented on HDFS-8283: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch - When throwing an exception -* always set lastException -* always creating a new exception so that it has the new stack trace - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8242: --- Attachment: HDFS-8242-HDFS-7285.05.patch Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, HDFS-8242-HDFS-7285.05.patch This JIRA to add test cases with CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521446#comment-14521446 ] Hudson commented on HDFS-8269: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-8269. getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime. Contributed by Haohui Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit log entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, therefore it eventually leads to an NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521455#comment-14521455 ] Hudson commented on HDFS-8214: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. Contributed by Charles Lamb. (wang: rev aa22450442ebe39916a6fd460fe97e347945526d) * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.8.0 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521453#comment-14521453 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521460#comment-14521460 ] Kai Zheng commented on HDFS-8137: - Hi Uma, bq.We supposed to get schema values from ECSchemaManager, but right now I don't see a better way to get from ECScheaManeger, so I added an API to get from BlockCollection itself like isStriped API in it. {{ECSchemaManager}} might not be supposed to get a schema associated with a zone, dir/file, but {{ErasureCodingZoneManager}} may do. We could query the schema info from a zone using ErasureCodingZoneManager. I thought it's good to add the method {{getECSchema}} along with the existing method {{isStriped}}, as it's essential to erasure coded files. A quick look at the patch found it might need to align with some latest changes, regarding how to get schema from a zone/dir/xAttr, would you double check? Thanks. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch Discussed with [~umamaheswararao] and [~vinayrpet], we should also send the EC schema to DataNode as well contained in the EC encoding/recovering command. The target DataNode will use it to guide the executing of the task. Another way would be, DataNode would just request schema actively thru a separate RPC call, and as an optimization consideration, DataNode may cache schemas to avoid repeatedly asking for the same schema twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
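The design point in this thread, and in HDFS-7859 above, is that zones reference centrally persisted schemas by name, so schema lookup for a file goes through the zone covering it. A toy model of that indirection, in which everything except the JIRA-mentioned concepts (schema registry, zone manager) is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class EcZoneLookupSketch {
    // Hypothetical stand-in for the persisted schema registry (cf. ECSchemaManager):
    // schema name -> schema description.
    static final Map<String, String> SCHEMAS = new HashMap<>();
    // Hypothetical stand-in for the zone mapping (cf. ErasureCodingZoneManager):
    // zones store only the schema *name*, not the schema itself.
    static final Map<String, String> ZONES = new HashMap<>();

    static String schemaForPath(String path) {
        // Longest-prefix match: a file inherits the schema of its enclosing zone.
        String best = null;
        for (String zone : ZONES.keySet()) {
            if (path.startsWith(zone) && (best == null || zone.length() > best.length())) {
                best = zone;
            }
        }
        return best == null ? null : SCHEMAS.get(ZONES.get(best));
    }

    public static void main(String[] args) {
        SCHEMAS.put("RS-6-3", "ReedSolomon k=6 m=3");
        ZONES.put("/ec/", "RS-6-3");
        System.out.println(schemaForPath("/ec/warehouse/part-0")); // in the zone
        System.out.println(schemaForPath("/plain/file"));          // no zone -> null
    }
}
```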
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521357#comment-14521357 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-trunk-Commit #7705 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7705/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521376#comment-14521376 ] Hadoop QA commented on HDFS-7859: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 37s | Pre-patch HDFS-7285 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 7m 48s | The applied patch generated 10 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 11s | The patch appears to introduce 9 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 15s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 239m 34s | Tests failed in hadoop-hdfs. 
| | | | 288m 5s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time. Unsynchronized access at DFSOutputStream.java:[line 142] | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such. At DataStreamer.java:[lines 177-201] | | | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock(). At StripedDataStreamer.java:[line 105] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed(). At BlockInfoStriped.java:[line 208] | | | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long). Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes(). At ErasureCodingZoneManager.java:[line 116] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]). At ErasureCodingZoneManager.java:[line 81] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int). At StripedBlockUtil.java:[line 85] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int). At StripedBlockUtil.java:[line 167] | | Failed unit tests | hadoop.hdfs.server.namenode.TestMetadataVersionOutput | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.server.namenode.TestCheckpoint | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.TestDFSRollback | | | hadoop.hdfs.server.namenode.TestCreateEditsLog | | | hadoop.hdfs.protocol.TestLayoutVersion | | | hadoop.hdfs.TestDFSFinalize | | | hadoop.hdfs.server.namenode.TestDeleteRace | | |
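Several of the FindBugs items above are the same pattern: "Result of integer multiplication cast to long". The multiplication happens in 32-bit arithmetic before widening, so it can silently overflow; the fix is to widen an operand first. The variable names below are illustrative, not taken from the patch:

```java
public class IntMulOverflow {
    public static void main(String[] args) {
        int cellSize = 1 << 20; // e.g. a 1 MiB striping cell
        int numCells = 4096;

        long wrong = cellSize * numCells;        // multiplies as int, overflows to 0, then widens
        long right = (long) cellSize * numCells; // widens first, then multiplies: 4294967296

        System.out.println(wrong);
        System.out.println(right);
    }
}
```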
[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem
[ https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8299: -- Attachment: datanode.log HDFS reporting missing blocks when they are actually present due to read-only filesystem Key: HDFS-8299 URL: https://issues.apache.org/jira/browse/HDFS-8299 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Environment: Fsck shows missing blocks when the blocks can be found on a datanode's filesystem and the datanode has been restarted to try to get it to recognize that the blocks are indeed present and hence report them to the NameNode in a block report. Fsck output showing an example missing block: {code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT blockpool BP-120244285-ip-1417023863606 block blk_1075202330 MISSING 1 blocks of total size 3260848 B 0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!{code} The block is definitely present on more than one datanode however, here is the output from one of them that I restarted to try to get it to report the block to the NameNode: {code}# ll /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330* -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330 -rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code} It's worth noting that this is on HDFS tiered storage on an archive tier going to a networked block device that may have become temporarily unavailable but is available now. See also feature request HDFS-8297 for online rescan to not have to go around restarting datanodes. It turns out in the datanode log (that I am attaching) this is because the datanode fails to get a write lock on the filesystem. 
I think it would be better to be able to read-only those blocks however, since this way causes client visible data unavailability when the data could in fact be read. {code}2015-04-30 14:11:08,235 WARN datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn : org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193) at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174) at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202) at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378) at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243) {code} Hari Sekhon http://www.linkedin.com/in/harisekhon Reporter: Hari Sekhon Priority: Critical Attachments: datanode.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
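The rejection in the stack trace above boils down to a directory writability probe: the volume fails the check, so all of its blocks go unreported. A minimal JDK-only analogue of that check (Hadoop's actual {{DiskChecker}} additionally probes by creating files, per {{checkAccessByFileMethods}} in the trace; the class here is illustrative):

```java
import java.io.File;
import java.io.IOException;

public class DirCheckSketch {
    // Mirrors the failure mode reported above: a readable but read-only
    // volume is rejected wholesale, even though its blocks could be served.
    static void checkDir(File dir) throws IOException {
        if (!dir.isDirectory()) throw new IOException("Not a directory: " + dir);
        if (!dir.canRead())     throw new IOException("Directory is not readable: " + dir);
        if (!dir.canWrite())    throw new IOException("Directory is not writable: " + dir);
    }

    public static void main(String[] args) throws IOException {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        checkDir(tmp); // passes on a normally writable filesystem
        System.out.println("OK: " + tmp);
    }
}
```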
[jira] [Updated] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem
[ https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8299: -- Environment: HDP 2.2 (was: Fsck shows missing blocks when the blocks can be found on a datanode's filesystem and the datanode has been restarted to try to get it to recognize that the blocks are indeed present and hence report them to the NameNode in a block report. Fsck output showing an example missing block: {code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT blockpool BP-120244285-ip-1417023863606 block blk_1075202330 MISSING 1 blocks of total size 3260848 B 0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!{code} The block is definitely present on more than one datanode however, here is the output from one of them that I restarted to try to get it to report the block to the NameNode: {code}# ll /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330* -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330 -rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code} It's worth noting that this is on HDFS tiered storage on an archive tier going to a networked block device that may have become temporarily unavailable but is available now. See also feature request HDFS-8297 for online rescan to not have to go around restarting datanodes. It turns out in the datanode log (that I am attaching) this is because the datanode fails to get a write lock on the filesystem. I think it would be better to be able to read-only those blocks however, since this way causes client visible data unavailability when the data could in fact be read. 
{code}2015-04-30 14:11:08,235 WARN datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn : org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193) at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174) at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202) at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378) at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243) {code} Hari Sekhon http://www.linkedin.com/in/harisekhon) HDFS reporting missing blocks when they are actually present due to read-only filesystem Key: HDFS-8299 URL: https://issues.apache.org/jira/browse/HDFS-8299 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon Priority: Critical Attachments: datanode.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem
Hari Sekhon created HDFS-8299: - Summary: HDFS reporting missing blocks when they are actually present due to read-only filesystem Key: HDFS-8299 URL: https://issues.apache.org/jira/browse/HDFS-8299 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Environment: Fsck shows missing blocks when the blocks can be found on a datanode's filesystem and the datanode has been restarted to try to get it to recognize that the blocks are indeed present and hence report them to the NameNode in a block report. Fsck output showing an example missing block: {code}/apps/hive/warehouse/custom_scrubbed.db/someTable/00_0: CORRUPT blockpool BP-120244285-ip-1417023863606 block blk_1075202330 MISSING 1 blocks of total size 3260848 B 0. BP-120244285-ip-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!{code} The block is definitely present on more than one datanode however, here is the output from one of them that I restarted to try to get it to report the block to the NameNode: {code}# ll /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330* -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330 -rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-ip-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code} It's worth noting that this is on HDFS tiered storage on an archive tier going to a networked block device that may have become temporarily unavailable but is available now. See also feature request HDFS-8297 for online rescan to not have to go around restarting datanodes. It turns out in the datanode log (that I am attaching) this is because the datanode fails to get a write lock on the filesystem. I think it would be better to be able to read-only those blocks however, since this way causes client visible data unavailability when the data could in fact be read. 
{code}2015-04-30 14:11:08,235 WARN datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn : org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193) at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174) at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202) at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378) at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243) {code} Hari Sekhon http://www.linkedin.com/in/harisekhon Reporter: Hari Sekhon Priority: Critical Attachments: datanode.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521356#comment-14521356 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-trunk-Commit #7705 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7705/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows DN as a collection of storages with different types. However, I can't find document on how to label different storage types from the following two documents. I found the information from the design spec. It will be good we document this for admins and users to use the related Archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add document for the new storage type labels. 1. Add an example under ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as storage type if no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
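The [DISK]/[SSD]/[ARCHIVE]/[RAM_DISK] prefixes described above are plain labels in front of each storage-location URI, with DISK as the documented default when no label is present. A small illustrative parser for such entries (not the actual Hadoop {{StorageLocation}} code):

```java
public class StorageLocationParser {
    // Returns { storageType, locationUri } for one comma-separated entry.
    static String[] parse(String entry) {
        entry = entry.trim();
        String type = "DISK"; // documented default when no [TYPE] label is given
        if (entry.startsWith("[")) {
            int end = entry.indexOf(']');
            type = entry.substring(1, end).toUpperCase();
            entry = entry.substring(end + 1);
        }
        return new String[] { type, entry };
    }

    public static void main(String[] args) {
        String conf = "[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,"
                    + "[ARCHIVE]file:///hddata/dn/archive0,file:///hddata/dn/plain";
        for (String entry : conf.split(",")) {
            String[] parsed = parse(entry);
            System.out.println(parsed[0] + " -> " + parsed[1]);
        }
    }
}
```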
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521397#comment-14521397 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip uses a temp buffer to read data to this buffer, it is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521391#comment-14521391 ] Hudson commented on HDFS-8269: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-8269. getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime. Contributed by Haohui Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit logs entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, therefore it eventually leads to a NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521399#comment-14521399 ] Hudson commented on HDFS-8214: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. Contributed by Charles Lamb. (wang: rev aa22450442ebe39916a6fd460fe97e347945526d) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.8.0 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally, just after the epoch, to be displayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521398#comment-14521398 ] Hudson commented on HDFS-8283: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch - When throwing an exception -* always set lastException -* always creating a new exception so that it has the new stack trace - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521394#comment-14521394 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows DN as a collection of storages with different types. However, I can't find document on how to label different storage types from the following two documents. I found the information from the design spec. It will be good we document this for admins and users to use the related Archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add document for the new storage type labels. 1. Add an example under ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as storage type if no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8242) Erasure Coding: XML based end-to-end test for ECCli commands
[ https://issues.apache.org/jira/browse/HDFS-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521434#comment-14521434 ] Rakesh R commented on HDFS-8242: Attaching another patch fixing whitespace problem reported by jenkins Erasure Coding: XML based end-to-end test for ECCli commands Key: HDFS-8242 URL: https://issues.apache.org/jira/browse/HDFS-8242 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-8242-001.patch, HDFS-8242-002.patch, HDFS-8242-003.patch, HDFS-8242-HDFS-7285.04.patch, HDFS-8242-HDFS-7285.05.patch This JIRA to add test cases with CLI test f/w for the commands present in {{ECCli}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521404#comment-14521404 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to manage a collection of storages with different types. However, I can't find documentation on how to label the different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related Archival Storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero
[ https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8276: - Attachment: HDFS-8276_1.patch LazyPersistFileScrubber should be disabled if scrubber interval configured zero --- Key: HDFS-8276 URL: https://issues.apache.org/jira/browse/HDFS-8276 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8276.patch, HDFS-8276_1.patch bq. but I think it is simple enough to change the meaning of the value so that zero means 'never scrub'. Let me post an updated patch. As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], the scrubber should be disabled if *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero. Currently, NameNode startup fails if the interval is configured as zero:
{code}
2015-04-27 23:47:31,744 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.lang.IllegalArgumentException: dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
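The proposed semantics — a zero interval disables the scrubber rather than failing NameNode startup — can be sketched as a small validation change. All names here are hypothetical stand-ins, not the actual FSNamesystem code:

```python
# Sketch of the proposed validation: interval > 0 schedules the scrubber,
# interval == 0 disables it, and only a negative value is rejected.
# Hypothetical helper for illustration; not Hadoop's FSNamesystem code.

def plan_scrubber(interval_sec):
    if interval_sec < 0:
        raise ValueError(
            "dfs.namenode.lazypersist.file.scrub.interval.sec must not be negative")
    if interval_sec == 0:
        return None  # scrubber disabled: never scrub lazy-persist files
    return {"interval_sec": interval_sec}  # schedule the scrubber thread

print(plan_scrubber(0))    # disabled, but the NameNode still starts
print(plan_scrubber(300))  # scrubber scheduled every 300 seconds
```

The key difference from the behavior quoted above is that zero no longer reaches the "must be non-zero" IllegalArgumentException path.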
[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero
[ https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521416#comment-14521416 ] surendra singh lilhore commented on HDFS-8276: -- Thanks [~arpitagarwal] for the review. Attached a new patch with a test case added. Please review. LazyPersistFileScrubber should be disabled if scrubber interval configured zero --- Key: HDFS-8276 URL: https://issues.apache.org/jira/browse/HDFS-8276 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8276.patch, HDFS-8276_1.patch bq. but I think it is simple enough to change the meaning of the value so that zero means 'never scrub'. Let me post an updated patch. As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], the scrubber should be disabled if *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero. Currently, NameNode startup fails if the interval is configured as zero:
{code}
2015-04-27 23:47:31,744 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.lang.IllegalArgumentException: dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8298) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary failures
Hari Sekhon created HDFS-8298: - Summary: HA: NameNode should not shut down completely without quorum, doesn't recover from temporary failures Key: HDFS-8298 URL: https://issues.apache.org/jira/browse/HDFS-8298 Project: Hadoop HDFS Issue Type: Improvement Components: ha, HDFS, namenode, qjm Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon In an HDFS HA setup, if there is a temporary problem contacting the journal nodes (e.g. a network interruption), the NameNode shuts down entirely, when it should instead go into a standby mode so that it can stay online and retry achieving quorum later. If both NameNodes shut themselves off like this, then even after the temporary network outage is resolved the entire cluster remains offline indefinitely until operator intervention, whereas it could have self-repaired by re-contacting the journal nodes and re-achieving quorum.
{code}
2015-04-15 15:59:26,900 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [ip:8485, ip:8485, ip:8485], stream=QuorumOutputStream starting at txid 54270281))
java.io.IOException: Interrupted waiting 2ms for a quorum of nodes to respond.
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:134)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
        at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:388)
        at java.lang.Thread.run(Thread.java:745)
2015-04-15 15:59:26,901 WARN client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 54270281
2015-04-15 15:59:26,904 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2015-04-15 15:59:27,001 INFO namenode.NameNode (StringUtils.java:run(659)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at custom_scrubbed/ip
************************************************************/
{code}
Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
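The behavior the report asks for — retrying the quorum write instead of terminating the process — could be sketched as a bounded retry loop with backoff. This is purely illustrative of the idea (hypothetical function names; the real NameNode currently calls ExitUtil.terminate() when a required journal flush fails, as the log above shows):

```python
import time

# Illustrative sketch: retry a quorum flush with exponential backoff instead
# of exiting on the first timeout. Hypothetical helpers, not FSEditLog code.

def flush_with_retry(flush, attempts=3, backoff_sec=0.01):
    for attempt in range(attempts):
        try:
            return flush()
        except IOError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            time.sleep(backoff_sec * (2 ** attempt))  # back off, then retry

# Simulate a journal quorum that recovers after a brief outage.
calls = {"n": 0}
def flaky_flush():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("Interrupted waiting for a quorum of nodes to respond.")
    return "flushed"

print(flush_with_retry(flaky_flush))  # succeeds on the third attempt
```

A production design would also need to fence writes and transition to standby while quorum is lost, which is more than a retry loop; the sketch only shows the "don't exit on the first transient failure" part.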
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521381#comment-14521381 ] Hudson commented on HDFS-8269: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-8269. getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime. Contributed by Haohui Mai. (wheat9: rev 3dd6395bb2448e5b178a51c864e3c9a3d12e8bc9) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGetBlockLocations.java getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit log entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, so this eventually leads to an NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
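The core of the fix is resolving a /.reserved/.inodes/&lt;id&gt; alias to the file's real path before recording the atime update, so the edit log never contains a reserved path. The idea can be sketched as follows (the inode table is a hypothetical stand-in for the NameNode's directory tree, not the FSDirectory API):

```python
# Sketch: resolve a /.reserved/.inodes/<id> path to the real path before
# logging, so the edit log never contains a reserved alias. The INODES map
# is a made-up stand-in for the NameNode's FSDirectory.

INODES = {18230: "/user/alice/data.txt"}  # hypothetical inode-id -> path

RESERVED_PREFIX = "/.reserved/.inodes/"

def resolve_path(path):
    if path.startswith(RESERVED_PREFIX):
        inode_id = int(path[len(RESERVED_PREFIX):].split("/")[0])
        return INODES[inode_id]
    return path

def log_set_times(path, mtime, atime):
    # Always log the resolved path, never the reserved alias.
    return {"op": "OP_TIMES", "path": resolve_path(path),
            "mtime": mtime, "atime": atime}

print(log_set_times("/.reserved/.inodes/18230", -1, 1429908236392))
```

With this resolution step, the OP_TIMES record shown above would carry the real path, which replays cleanly.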
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521384#comment-14521384 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to manage a collection of storages with different types. However, I can't find documentation on how to label the different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related Archival Storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521388#comment-14521388 ] Hudson commented on HDFS-8283: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch
- When throwing an exception
-* always set lastException
-* always create a new exception so that it has the new stack trace
- Add LOG.
- Add final to isAppend and favoredNodes
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521387#comment-14521387 ] Hudson commented on HDFS-5574: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-5574. Remove buffer copy in BlockReader.skip. Contributed by Binglin Chang. (aajisaka: rev e89fc53a1d264fde407dd2c36defab5241cd0b52) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderBase.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java Remove buffer copy in BlockReader.skip -- Key: HDFS-5574 URL: https://issues.apache.org/jira/browse/HDFS-5574 Project: Hadoop HDFS Issue Type: Improvement Reporter: Binglin Chang Assignee: Binglin Chang Priority: Trivial Fix For: 2.8.0 Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch BlockReaderLocal.skip and RemoteBlockReader.skip use a temporary buffer to read data into, which is not necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521389#comment-14521389 ] Hudson commented on HDFS-8214: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/]) HDFS-8214. Secondary NN Web UI shows wrong date for Last Checkpoint. Contributed by Charles Lamb. (wang: rev aa22450442ebe39916a6fd460fe97e347945526d) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNodeInfoMXBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js Secondary NN Web UI shows wrong date for Last Checkpoint Key: HDFS-8214 URL: https://issues.apache.org/jira/browse/HDFS-8214 Project: Hadoop HDFS Issue Type: Bug Components: HDFS, namenode Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Fix For: 2.8.0 Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch SecondaryNameNode uses Time.monotonicNow() to display the Last Checkpoint time in the web UI. This causes weird times to be displayed, generally just after the epoch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
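The bug pattern here is easy to reproduce: a monotonic clock counts from an arbitrary origin (often process or system start), so formatting such a reading as a wall-clock timestamp yields dates just after the Unix epoch. A quick illustration with a made-up monotonic value:

```python
from datetime import datetime, timezone

# A monotonic reading is milliseconds since some arbitrary origin, e.g. boot.
# Treating it as milliseconds since the Unix epoch gives a date in early 1970 -
# the "weird time" the Secondary NN web UI displayed.
fake_monotonic_ms = 4 * 60 * 60 * 1000  # e.g. 4 hours of uptime

wrong = datetime.fromtimestamp(fake_monotonic_ms / 1000, tz=timezone.utc)
print(wrong.isoformat())  # 1970-01-01T04:00:00+00:00

# The fix is to record wall-clock time for display; in Hadoop that means
# Time.now() rather than Time.monotonicNow() for UI timestamps.
```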
[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero
[ https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521414#comment-14521414 ] surendra singh lilhore commented on HDFS-8276: -- The failed test cases and findbugs warnings are not related to this JIRA. LazyPersistFileScrubber should be disabled if scrubber interval configured zero --- Key: HDFS-8276 URL: https://issues.apache.org/jira/browse/HDFS-8276 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8276.patch, HDFS-8276_1.patch bq. but I think it is simple enough to change the meaning of the value so that zero means 'never scrub'. Let me post an updated patch. As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], the scrubber should be disabled if *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero. Currently, NameNode startup fails if the interval is configured as zero:
{code}
2015-04-27 23:47:31,744 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.lang.IllegalArgumentException: dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:828)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8298) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary network outages
[ https://issues.apache.org/jira/browse/HDFS-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8298: -- Summary: HA: NameNode should not shut down completely without quorum, doesn't recover from temporary network outages (was: HA: NameNode should not shut down completely without quorum, doesn't recover from temporary failures) HA: NameNode should not shut down completely without quorum, doesn't recover from temporary network outages --- Key: HDFS-8298 URL: https://issues.apache.org/jira/browse/HDFS-8298 Project: Hadoop HDFS Issue Type: Improvement Components: ha, HDFS, namenode, qjm Affects Versions: 2.6.0 Environment: HDP 2.2 Reporter: Hari Sekhon In an HDFS HA setup, if there is a temporary problem contacting the journal nodes (e.g. a network interruption), the NameNode shuts down entirely, when it should instead go into a standby mode so that it can stay online and retry achieving quorum later. If both NameNodes shut themselves off like this, then even after the temporary network outage is resolved the entire cluster remains offline indefinitely until operator intervention, whereas it could have self-repaired by re-contacting the journal nodes and re-achieving quorum.
{code}
2015-04-15 15:59:26,900 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [ip:8485, ip:8485, ip:8485], stream=QuorumOutputStream starting at txid 54270281))
java.io.IOException: Interrupted waiting 2ms for a quorum of nodes to respond.
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:134)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
        at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:388)
        at java.lang.Thread.run(Thread.java:745)
2015-04-15 15:59:26,901 WARN client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 54270281
2015-04-15 15:59:26,904 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2015-04-15 15:59:27,001 INFO namenode.NameNode (StringUtils.java:run(659)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at custom_scrubbed/ip
************************************************************/
{code}
Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521450#comment-14521450 ] Hudson commented on HDFS-7770: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-7770. Need document for storage type label of data node storage locations under dfs.data.dir. Contributed by Xiaoyu Yao. (aajisaka: rev de9404f02f36bf9a1100c67f41db907d494bb9ed) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.8.0, 2.7.1 Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to manage a collection of storages with different types. However, I can't find documentation on how to label the different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related Archival Storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521454#comment-14521454 ] Hudson commented on HDFS-8283: -- FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/913/]) HDFS-8283. DataStreamer cleanup and some minor improvement. Contributed by Tsz Wo Nicholas Sze. (jing9: rev 7947e5b53b9ac9524b535b0384c1c355b74723ff) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/MultipleIOException.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.8.0 Attachments: h8283_20150428.patch
- When throwing an exception
-* always set lastException
-* always create a new exception so that it has the new stack trace
- Add LOG.
- Add final to isAppend and favoredNodes
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-8277: - Attachment: HDFS-8277_2.patch Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Assignee: surendra singh lilhore Priority: Minor Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521238#comment-14521238 ] surendra singh lilhore commented on HDFS-8277: -- Attached a new patch with a test case. Please review. Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Assignee: surendra singh lilhore Priority: Minor Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch HDFS fails to enter safemode when the Standby NameNode is down (e.g. due to AMBARI-10536).
{code}
hdfs dfsadmin -safemode enter
safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
{code}
This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1, which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2, which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it, the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
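The expected client behavior — try each configured NameNode rather than stopping at the first connection refusal — can be sketched as follows. This is illustrative only (hypothetical RPC callable; real HA clients go through the configured failover proxy provider rather than a hand-rolled loop):

```python
# Sketch: iterate over both configured NameNodes instead of failing on the
# first ConnectionRefused, as dfsadmin -safemode enter did in this report.
# Hypothetical rpc callable; not the actual DFSAdmin/HA proxy code.

def safemode_enter(namenodes, rpc):
    errors = []
    for nn in namenodes:
        try:
            return rpc(nn)  # first reachable NameNode wins
        except ConnectionError as e:
            errors.append((nn, str(e)))
    raise ConnectionError("all NameNodes unreachable: %r" % errors)

def fake_rpc(nn):
    if nn == "nn1:8020":
        raise ConnectionError("Connection refused")  # nn1 is down
    return "Safe mode is ON on " + nn

print(safemode_enter(["nn1:8020", "nn2:8020"], fake_rpc))
```

With nn1 down, the command still succeeds against nn2 instead of aborting, which is the behavior the patch aims for.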
[jira] [Commented] (HDFS-219) Add md5sum facility in dfsshell
[ https://issues.apache.org/jira/browse/HDFS-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521224#comment-14521224 ] Kengo Seki commented on HDFS-219: - Maybe a duplicate of HADOOP-9209? Add md5sum facility in dfsshell --- Key: HDFS-219 URL: https://issues.apache.org/jira/browse/HDFS-219 Project: Hadoop HDFS Issue Type: New Feature Reporter: zhangwei Labels: newbie I think it would be useful to add md5sum (or any other digest) to dfsshell, so that the facility can verify files on HDFS. It can confirm a file's integrity after copyFromLocal or copyToLocal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
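The requested facility amounts to computing a digest on both sides of a copy and comparing the results. A local-side sketch with hashlib (illustrative; note that HDFS's built-in getFileChecksum uses a block-based MD5-of-MD5-of-CRC32 digest, not a plain md5 of the file bytes, so it cannot be compared directly against a local md5sum):

```python
import hashlib
import io

# Compute an md5 over a file-like stream in chunks, as an md5sum-style shell
# command would, so copyFromLocal/copyToLocal round-trips can be verified by
# comparing digests on each side.
def md5_of_stream(stream, chunk_size=8192):
    h = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

print(md5_of_stream(io.BytesIO(b"hello")))  # 5d41402abc4b2a76b9719d911017c592
```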
[jira] [Commented] (HDFS-7810) Datanode registration process fails in hadoop 2.6
[ https://issues.apache.org/jira/browse/HDFS-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521516#comment-14521516 ] Vlad Frolov commented on HDFS-7810: --- It seems that I have hit the same issue here. I have a properly set up DNS server (bind9) with reverse DNS lookups; the `nslookup` and `host` utilities can resolve the IP into an FQDN, but the NameNode says
{code:log}
15/04/30 13:45:12 WARN blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=10.250.10.11, hostname=10.250.10.11)
15/04/30 13:45:12 INFO ipc.Server: IPC Server handler 3 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 10.250.10.11:35776 Call#68 Retry#0
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=10.250.10.11, hostname=10.250.10.11): DatanodeRegistration(0.0.0.0, datanodeUuid=d5fe1cf5-09ac-4644-9f8a-8c4881e3c569, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-e74a3224-300e-400b-ae92-bb7ae64cdf01;nsid=1242366503;c=0)
{code}
Datanode registration process fails in hadoop 2.6 -- Key: HDFS-7810 URL: https://issues.apache.org/jira/browse/HDFS-7810 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Environment: ubuntu 12 Reporter: Biju Nair Labels: hadoop When a new DN is added to the cluster, the registration process fails. The following are the steps followed:
- Install and start a new DN
- Add an entry for the DN in the NN {{/etc/hosts}} file; the DN log shows that the registration process failed
- Tried to restart the DN, with the same result
Since all the DNs have multiple network interfaces, we are using the following {{hdfs-site.xml}} property, instead of listing all the {{dfs.datanode.xx.address}} properties.
{code:xml} property namedfs.datanode.dns.interface/name valueeth2/value /property {code} - Restarting the NN resolves the issue with registration which is not desired. - Adding the following {{dfs.datanode.xx.address}} properties seem to resolve DN registration process without NN restart. But this is a different behavior compared to *hadoop 2.2*. Is there a reason for the change? {code:xml} property namedfs.datanode.address/name value192.168.0.12:50010/value /property property namedfs.datanode.ipc.address/name value192.168.0.12:50020/value /property property namedfs.datanode.http.address/name value192.168.0.12:50075/value /property {code} *NN Log Error Entry* {quote} 2015-02-17 12:21:53,583 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.100.13:37516 Call#1027 Retry#0 org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13): DatanodeRegistration(0.0.0.0, datanodeUuid=bd23eb3c-a5b9-43e4-ad23-1683346564ac, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-02099252-fbca-4bf2-b466-9a0ed67e53a3;nsid=2048643132;c=0) at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:887) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5002) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1065) at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92) at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) at 
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 2015-02-17 12:21:58,607 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.100.13, hostname=192.168.100.13) {quote} *DN Log Error Entry* {quote} 2015-02-17 12:21:02,994 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block
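Both reports above show the same symptom: the NameNode rejects the registration because the datanode's hostname equals its raw IP, which is what Hadoop falls back to when reverse DNS fails. A minimal sketch of that comparison (`looksUnresolved` is a hypothetical helper for illustration, not the actual DatanodeManager code):

```java
// Sketch of the check the NameNode effectively performs on registration.
// When reverse DNS fails, the reported hostname falls back to the IP
// literal, so hostname == ip signals an unresolved datanode (assumption
// based on the "hostname cannot be resolved (ip=..., hostname=...)" log).
public class DnRegistrationCheck {
    static boolean looksUnresolved(String ip, String reportedHostname) {
        // An unresolved registration reports the raw IP as its hostname.
        return reportedHostname == null || reportedHostname.equals(ip);
    }

    public static void main(String[] args) {
        // Matches the log: "(ip=10.250.10.11, hostname=10.250.10.11)"
        assert looksUnresolved("10.250.10.11", "10.250.10.11");
        assert !looksUnresolved("10.250.10.11", "dn1.example.com");
        System.out.println("ok");
    }
}
```

One possible explanation (an assumption, not confirmed in the thread) for "restarting the NN resolves the issue" is that the NN JVM cached a negative DNS result from before the `/etc/hosts` entry was added; a restart clears that cache.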
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521517#comment-14521517 ] Kai Zheng commented on HDFS-7348: - Thanks [~hitliuyi], [~zhz] and [~libo-intel] for the great discussion! It looks like we have already converged on good plans. bq.Does it save CPU to decode in big chunks? Kai Zheng Could you advise? Sorry I just noticed this. Yes, you're right; as Yi also noted, allocating big native buffers lets the ISA-L coders perform much better. We have test data indicating the ISA-L coder works best with a chunk size of about 32MB. I agree it's good to decouple the sync-and-decode unit from the chunk/cell size in a schema and make it configurable. Decoding at the level of an entire block may not be a good idea, though, as it could exhaust DataNode memory and be unreliable. We should be able to enforce a memory-usage threshold for recovery tasks. Since some dedicated DNs have powerful CPU cores, it's good to distribute recovery work to them, so on such DNs there will very likely be more than one recovery task executing concurrently. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missed striped blocks in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
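The memory-threshold idea in the comment above can be made concrete with back-of-envelope arithmetic: if each recovery task holds roughly one decode buffer per internal block of the group, the buffer size and a per-DN memory limit together bound the number of concurrent tasks. A hedged sketch (the names and the "one buffer per unit" model are assumptions for illustration, not the actual ErasureCodingWorker accounting):

```java
public class RecoveryBudget {
    // Rough model: one native decode buffer per internal block
    // (data units + parity units) of the striped group.
    static long perTaskBytes(int dataUnits, int parityUnits, long chunkBytes) {
        return (long) (dataUnits + parityUnits) * chunkBytes;
    }

    // How many recovery tasks fit under a per-DN memory threshold.
    static int maxConcurrentTasks(long memoryLimitBytes, long perTaskBytes) {
        return (int) (memoryLimitBytes / perTaskBytes);
    }

    public static void main(String[] args) {
        long chunk = 32L << 20;  // the ~32MB chunk size from the comment above
        long perTask = perTaskBytes(6, 3, chunk);  // RS(6,3): 9 * 32MB = 288MB
        assert perTask == 288L << 20;
        // Under a hypothetical 1GB recovery budget, 3 tasks can run at once.
        assert maxConcurrentTasks(1024L << 20, perTask) == 3;
        System.out.println("ok");
    }
}
```

This also illustrates why whole-block decoding is risky: with 128MB blocks instead of 32MB chunks, a single RS(6,3) task would already need over 1GB of buffers.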
[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521549#comment-14521549 ] Hadoop QA commented on HDFS-8277: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 52s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 8m 15s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 7s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 4m 57s | There were no new checkstyle issues. | | {color:green}+1{color} | install | 1m 56s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 37s | The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 44s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 166m 39s | Tests failed in hadoop-hdfs. 
| | | | 216m 18s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | Failed unit tests | hadoop.hdfs.server.namenode.snapshot.TestSnapshot | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.hdfs.qjournal.TestSecureNNWithQJM | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.server.namenode.TestDeleteRace | | | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.tools.TestDFSAdminWithHA | | | hadoop.hdfs.TestClose | | Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol | | | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729456/HDFS-8277_2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e89fc53 | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/10476/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10476/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10476/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10476/console | This message was automatically generated. 
Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Assignee: surendra singh lilhore Priority: Minor Attachments: HDFS-8277.patch, HDFS-8277_1.patch, HDFS-8277_2.patch HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This
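The bug report above says `dfsadmin -safemode enter` stops at the first connection refusal instead of trying both NameNodes the way the HA client proxy does. The intended behavior can be sketched generically (the `SafeModeCall` interface and the loop are illustrative, not the actual DFSAdmin/failover-proxy code):

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.List;

public class TryBothNameNodes {
    // Stand-in for "issue the safemode RPC against one NameNode".
    interface SafeModeCall {
        String enterSafeMode() throws IOException;
    }

    // Try each NameNode in turn; fail only once every target has refused.
    static String enterOnAnyNameNode(List<SafeModeCall> targets) throws IOException {
        IOException last = null;
        for (SafeModeCall t : targets) {
            try {
                return t.enterSafeMode();
            } catch (ConnectException e) {
                last = e;  // this NN may be down: fall through to the next
            }
        }
        throw last != null ? last : new IOException("no namenodes configured");
    }

    // Demo: nn1 is down (connection refused), nn2 is the surviving Active NN.
    static String demo() throws IOException {
        SafeModeCall nn1 = () -> { throw new ConnectException("Connection refused"); };
        SafeModeCall nn2 = () -> "Safe mode is ON";
        return enterOnAnyNameNode(List.of(nn1, nn2));
    }

    public static void main(String[] args) throws IOException {
        assert demo().equals("Safe mode is ON");
        System.out.println("ok");
    }
}
```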
[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521577#comment-14521577 ] Hadoop QA commented on HDFS-8229: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 28s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 25s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 5m 27s | The applied patch generated 1 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 7s | The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 11s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 225m 54s | Tests failed in hadoop-hdfs. 
| | | | 271m 34s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | | Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such At DataStreamer.java:from an Exception, even though it is named as such At DataStreamer.java:[lines 177-201] | | Failed unit tests | hadoop.hdfs.server.namenode.TestDeleteRace | | | hadoop.hdfs.TestClose | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.hdfs.TestDFSOutputStream | | | hadoop.hdfs.server.namenode.TestSaveNamespace | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache | | | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery | | | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | org.apache.hadoop.hdfs.TestDataTransferProtocol | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12729442/HDFS-8229_2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f5b3847 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/checkstyle-result-diff.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 
GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10475/console | This message was automatically generated. LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} If {{LazyPersistFileScrubber}} runs after a NameNode restart but before the DNs register, it will delete the lazy-persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
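The race described above suggests a simple gate: right after an NN restart, "zero replicas" only means the DataNodes have not reported in yet, so the scrubber should not treat it as authoritative. A hypothetical predicate illustrating that gate (names are illustrative; the actual fix in the attached patches may differ):

```java
public class LazyPersistScrubberGate {
    // Only delete a lazy-persist file for having no replicas once block
    // reports have been received; before DN registration, zero replicas
    // is expected and must not trigger deletion.
    static boolean safeToDelete(int reportedReplicas, boolean blockReportsReceived) {
        return blockReportsReceived && reportedReplicas == 0;
    }

    public static void main(String[] args) {
        // NN just restarted, DNs not yet registered: must NOT delete.
        assert !safeToDelete(0, false);
        // DNs have reported and the file truly has no replicas: scrub it.
        assert safeToDelete(0, true);
        // File has live replicas: never delete.
        assert !safeToDelete(3, true);
        System.out.println("ok");
    }
}
```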
[jira] [Commented] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521585#comment-14521585 ] Uma Maheswara Rao G commented on HDFS-8137: --- Thanks a lot for the review, Kai. Good catch. You are right, we are storing the schema in xattrs along with the zone. {quote} ECSchemaManager might not be supposed to get a schema associated with a zone, dir/file, but ErasureCodingZoneManager may do. {quote} By mistake I said ECSchemaManager. You're right, I should have said ErasureCodingZoneManager, as it has the related code I was talking about. I also added the getECSchema API in the namesystem itself, as we have already added some ECSchema-related APIs in FSNamesystem. For reusing the code from ErasureCodingZoneManager, keeping this new API in the namesystem gives us the flexibility we need; we cannot get the same flexibility from BlockCollection, since we cannot access FSDirectory details there. Please check whether the latest patch makes sense to you. Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch, HDFS-8137-1.patch Discussed with [~umamaheswararao] and [~vinayrpet], we should also send the EC schema to DataNode as well contained in the EC encoding/recovering command. The target DataNode will use it to guide the executing of the task. Another way would be, DataNode would just request schema actively thru a separate RPC call, and as an optimization consideration, DataNode may cache schemas to avoid repeatedly asking for the same schema twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8137) Sends the EC schema to DataNode as well in EC encoding/recovering command
[ https://issues.apache.org/jira/browse/HDFS-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-8137: -- Attachment: HDFS-8137-1.patch Sends the EC schema to DataNode as well in EC encoding/recovering command - Key: HDFS-8137 URL: https://issues.apache.org/jira/browse/HDFS-8137 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Uma Maheswara Rao G Attachments: HDFS-8137-0.patch, HDFS-8137-1.patch Discussed with [~umamaheswararao] and [~vinayrpet], we should also send the EC schema to DataNode as well contained in the EC encoding/recovering command. The target DataNode will use it to guide the executing of the task. Another way would be, DataNode would just request schema actively thru a separate RPC call, and as an optimization consideration, DataNode may cache schemas to avoid repeatedly asking for the same schema twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
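The caching alternative mentioned in the description — the DataNode fetching a schema once over RPC and reusing it for later commands — can be sketched with a `computeIfAbsent` cache. The `fetchSchemaFromNameNode` call and all names here are hypothetical stand-ins, not actual Hadoop APIs:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class DnSchemaCache {
    final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger rpcCalls = new AtomicInteger();  // counts simulated RPCs

    // Stand-in for the "ask the NameNode for the schema" RPC the
    // description proposes; returns a dummy schema payload.
    String fetchSchemaFromNameNode(String name) {
        rpcCalls.incrementAndGet();
        return name + ":RS(6,3),cell=64k";
    }

    // First lookup pays the RPC; later lookups for the same schema name
    // are served from the cache, avoiding repeated round trips.
    String getSchema(String name) {
        return cache.computeIfAbsent(name, this::fetchSchemaFromNameNode);
    }

    public static void main(String[] args) {
        DnSchemaCache dn = new DnSchemaCache();
        dn.getSchema("RS-6-3");
        dn.getSchema("RS-6-3");  // cached: no second RPC
        assert dn.rpcCalls.get() == 1;
        System.out.println("ok");
    }
}
```

Pushing the schema inside the encoding/recovering command (the approach the attached patch takes) avoids even the first round trip, at the cost of a larger command payload.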
[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks
[ https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7949: Attachment: HDFS-7949-HDFS-7285.08.patch Thanks Rakesh for taking a close look. I'm attaching a dup patch just to be extra careful, since the space calculation _could_ affect other tests. WebImageViewer need support file size calculation with striped blocks - Key: HDFS-7949 URL: https://issues.apache.org/jira/browse/HDFS-7949 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Hui Zheng Assignee: Rakesh R Attachments: HDFS-7949-001.patch, HDFS-7949-002.patch, HDFS-7949-003.patch, HDFS-7949-004.patch, HDFS-7949-005.patch, HDFS-7949-006.patch, HDFS-7949-007.patch, HDFS-7949-HDFS-7285.08.patch, HDFS-7949-HDFS-7285.08.patch The file size calculation should be changed when the blocks of the file are striped in WebImageViewer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
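For context on the size calculation being changed here: with striping, a file's logical length is the sum of the lengths of the data blocks in its block groups, and parity blocks must be excluded — unlike the contiguous layout, where every block contributes. A hedged illustration of the arithmetic (an RS(6,3) group with 64KB cells is assumed; this is not the WebImageViewer patch itself):

```java
public class StripedFileSize {
    // Logical file length contributed by one block group:
    // the sum of its data-block lengths; parity blocks are excluded.
    static long fileLength(long[] dataBlockLengths) {
        long total = 0;
        for (long len : dataBlockLengths) total += len;
        return total;
    }

    public static void main(String[] args) {
        long cell = 64 * 1024;
        // One RS(6,3) block group: two full cells, one half cell, and
        // three empty data blocks. The 3 parity blocks are not counted.
        long[] data = { cell, cell, cell / 2, 0, 0, 0 };
        assert fileLength(data) == 160 * 1024;
        System.out.println("ok");
    }
}
```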
[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521860#comment-14521860 ] Arpit Agarwal commented on HDFS-8229: - +1 for the patch. Thanks for the updates [~surendrasingh]. I kicked off another pre-commit build since the previous results look wrong. LAZY_PERSIST file gets deleted after NameNode restart. -- Key: HDFS-8229 URL: https://issues.apache.org/jira/browse/HDFS-8229 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 2.6.0 Reporter: surendra singh lilhore Assignee: surendra singh lilhore Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch {code} 2015-04-20 10:26:55,180 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist file /LAZY_PERSIST/smallfile with no replicas. {code} If {{LazyPersistFileScrubber}} runs after a NameNode restart but before the DNs register, it will delete the lazy-persist file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
[ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah reassigned HDFS-8224: Assignee: Rushabh S Shah Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error Key: HDFS-8224 URL: https://issues.apache.org/jira/browse/HDFS-8224 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Fix For: 2.8.0 This happened in our 2.6 cluster. One of the blocks and its metadata file were corrupted. The disk was healthy in this case; only the block was corrupt. The NameNode tried to copy that block to another datanode, but the transfer failed with the following stack trace: 2015-04-20 01:04:04,421 [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN datanode.DataNode: DatanodeRegistration(a.b.c.d, datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, infoSecurePort=0, ipcPort=8020, storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to a1.b1.c1.d1:1004 got java.io.IOException: Could not create DataChecksum of type 0 with bytesPerChecksum 0 at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102) at org.apache.hadoop.hdfs.server.datanode.BlockSender.init(BlockSender.java:287) at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989) at java.lang.Thread.run(Thread.java:722) The following catch block in the DataTransfer#run method treats every IOException as a disk-error fault and runs the disk-error check:
{noformat}
catch (IOException ie) {
  LOG.warn(bpReg + ":Failed to transfer " + b + " to " + targets[0] + " got ", ie);
  // check if there are any disk problem
  checkDiskErrorAsync();
}
{noformat}
This block was never scanned by the BlockPoolSliceScanner; otherwise it would have been reported as a corrupt block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
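A sketch of the direction the report implies: only kick off the asynchronous disk check for exceptions that plausibly indicate a disk fault, not for data-level failures such as the corrupt checksum header above. `looksLikeDiskError` is a hypothetical classifier for illustration, not the eventual fix:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class DiskErrorClassifier {
    // Heuristic (assumed, not Hadoop's): missing block files or kernel
    // I/O errors suggest a bad disk; a corrupt metadata header is data
    // corruption on a healthy disk and should not trigger the disk check.
    static boolean looksLikeDiskError(IOException e) {
        if (e instanceof FileNotFoundException) return true;
        String msg = e.getMessage();
        return msg != null && msg.contains("Input/output error");
    }

    public static void main(String[] args) {
        // The exception from this report: corrupt meta file, healthy disk.
        IOException corruptMeta = new IOException(
            "Could not create DataChecksum of type 0 with bytesPerChecksum 0");
        assert !looksLikeDiskError(corruptMeta);
        assert looksLikeDiskError(new FileNotFoundException("blk_123.meta"));
        assert looksLikeDiskError(new IOException("Input/output error"));
        System.out.println("ok");
    }
}
```

With such a classifier, the catch block would call `checkDiskErrorAsync()` only when `looksLikeDiskError(ie)` holds, and simply log (or mark the block corrupt) otherwise.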