[jira] [Updated] (HDFS-8086) Move LeaseRenewer to the hdfs.client.impl package

2015-04-30 Thread Takanobu Asanuma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-8086:
---
Attachment: HDFS-8086.1.patch

I have submitted a first patch. Please note that I used reflection in 
TestDistributedFileSystem, which calls package-private methods of LeaseRenewer. 
Could you review this patch?

> Move LeaseRenewer to the hdfs.client.impl package
> -
>
> Key: HDFS-8086
> URL: https://issues.apache.org/jira/browse/HDFS-8086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: HDFS-8086.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8245) Standby namenode doesn't process DELETED_BLOCK if the add block request is in edit log.

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522838#comment-14522838
 ] 

Hadoop QA commented on HDFS-8245:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 32s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 23s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 22s | The applied patch generated  3 
new checkstyle issues (total was 206, now 208). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 22  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  5s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 226m 48s | Tests failed in hadoop-hdfs. |
| | | 268m 30s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.web.TestWebHDFS |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.server.namenode.TestSaveNamespace |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729637/HDFS-8245.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 87e9978 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/10498/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10498/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10498/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10498/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10498/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10498/console |


This message was automatically generated.

> Standby namenode doesn't process DELETED_BLOCK if the add block request is in 
> edit log.
> ---
>
> Key: HDFS-8245
> URL: https://issues.apache.org/jira/browse/HDFS-8245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-8245.patch
>
>
> The following series of events happened on the Standby Namenode:
> 2015-04-09 07:47:21,735 \[Edit log tailer] INFO ha.EditLogTailer: Triggering 
> log roll on remote NameNode Active Namenode (ANN)
> 2015-04-09 07:58:01,858 \[Edit log tailer] INFO ha.EditLogTailer: Triggering 
> log roll on remote NameNode ANN
> The following series of events happened on the Active Namenode:
> 2015-04-09 07:47:21,747 \[IPC Server handler 99 on 8020] INFO 
> namenode.FSNamesystem: Roll Edit Log from

[jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522829#comment-14522829
 ] 

Hadoop QA commented on HDFS-8178:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 54s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   7m 45s | The applied patch generated  133  
additional warning messages. |
| {color:green}+1{color} | javadoc |  10m  5s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m  1s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m 23s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  8s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 19s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 225m 15s | Tests failed in hadoop-hdfs. |
| | | 268m 32s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
178-202] |
| Failed unit tests | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.cli.TestHDFSCLI |
| Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.server.mover.TestMover |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729632/HDFS-8178.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 87e9978 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10496/artifact/patchprocess/diffJavacWarnings.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10496/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10496/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10496/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10496/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10496/console |


This message was automatically generated.

> QJM doesn't move aside stale inprogress edits files
> ---
>
> Key: HDFS-8178
> URL: https://issues.apache.org/jira/browse/HDFS-8178
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8178.000.patch, HDFS-8178.002.patch, 
> HDFS-8178.003.patch
>
>
> When a QJM crashes, the in-progress edit log file at that time remains in the 
> file system. When the node comes back, it will accept new edit logs and those 
> stale in-progress files are never cleaned up. QJM treats them as regular 
> in-progress edit log files and tries to finalize them, which potentially 
> causes high memory usage. This JIRA aims to move aside those stale edit log 
> files to avoid this scenario.





[jira] [Commented] (HDFS-8193) Add the ability to delay replica deletion for a period of time

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522819#comment-14522819
 ] 

Zhe Zhang commented on HDFS-8193:
-

Thanks [~sureshms] for the helpful comments!

bq. Second use case, NN deleted file and admin wants to restore it (the case of 
NN metadata backup). Going back to an older fsimage is not that straight 
forward and a solution to be used only in desperate situation. It can cause 
corruption for other applications running on HDFS. It also results in loss of 
newly created data across the file system. Snapshots and trash are solutions 
for this.
You are absolutely right that it's always preferable to protect data at the 
file level rather than the block level. This JIRA is indeed aimed at being a 
last resort for desperate situations. It's similar to recovering data directly 
from hard disk drives when the file system is corrupted beyond recovery. It's 
fully controlled by the DN and is the last layer of protection when all layers 
above have failed (trash mistakenly emptied, snapshots not correctly set up, etc.).

bq. First use case, NN deletes blocks without deleting files. Have you seen an 
instance of this? It would be great to get one pager on how one handles this 
condition.
One possible situation (recently fixed by HDFS-7960) is that the NN mistakenly 
considers some blocks over-replicated because of zombie storages. Even though 
HDFS-7960 is already fixed, we should do something to protect against possible 
future NN bugs. This is the crux of why file-level protections, although always 
desirable, are not always sufficient. It could be that the NN gets something 
wrong, and then we're left with irrecoverable data loss.

bq. Does NN keep deleting the blocks until it is hot fixed? 
In the above case, the NN will delete all replicas it considers over-replicated 
until it is hot fixed.

bq. Also completing deletion of blocks in a timely manner is important for a 
running cluster.
Yes, this is a valid concern. Empirically, most customer clusters do not run 
anywhere close to full disk capacity, so adding a reasonable grace period 
shouldn't delay allocating new blocks. The configured delay window should also 
be enforced under the constraint of available space (e.g., don't delay deletion 
when available disk space is < 10%). We will also add Web UI and metrics 
support to clearly show the space consumed by deletion-delayed replicas.
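
The grace-period-plus-disk-pressure rule described above can be sketched as 
follows. This is only an illustration of the proposed policy; the class, 
method, and parameter names here are hypothetical and not part of any patch.

```java
/**
 * Minimal sketch of the proposed delay-deletion policy: a replica marked for
 * deletion is kept for a grace period, but the delay is abandoned whenever
 * free disk space drops below a configured ratio. All names are hypothetical.
 */
class ReplicaDeletionPolicy {
  private final long graceMillis;          // e.g. a few hours
  private final double minFreeSpaceRatio;  // e.g. 0.10 == "don't delay under 10% free"

  ReplicaDeletionPolicy(long graceMillis, double minFreeSpaceRatio) {
    this.graceMillis = graceMillis;
    this.minFreeSpaceRatio = minFreeSpaceRatio;
  }

  /** True when the replica should actually be deleted now. */
  boolean shouldDelete(long markedAtMillis, long nowMillis,
                       long freeBytes, long capacityBytes) {
    boolean graceExpired = nowMillis - markedAtMillis >= graceMillis;
    boolean diskPressure = (double) freeBytes / capacityBytes < minFreeSpaceRatio;
    return graceExpired || diskPressure;
  }
}
```

A real implementation would additionally expose the decision through metrics, 
as noted above.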

bq. All files don't require the same reliability. Intermediate data and tmp 
files need to be deleted immediately to free up cluster storage to avoid the 
risk of running out of storage space. At datanode level, there is no notion of 
whether files are temporary or important ones that need to be preserved. So a 
trash such as this can result in retaining lot of tmp files and deletes not 
being able to free up storage with in the cluster fast enough.
This is a great point. The proposed work (at least in the first phase) is 
intended as a best-effort optimization and will always yield to foreground 
workloads. The goal is to statistically reduce the chance and severity of data 
loss under typical storage consumption conditions. It's certainly still 
possible for a wave of tmp data to flush more important data out of DN trashes. 
We can design smarter eviction algorithms as future work.

As I [commented | 
https://issues.apache.org/jira/browse/HDFS-8193?focusedCommentId=14505336&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14505336]
 above, we are considering a more radical approach as a potential next phase of 
this work, where deletion-delayed replicas would simply be overwritten by 
incoming replicas. In that case we might not even need to count deletion-delayed 
replicas against the space quota, making the feature more transparent to admins.

> Add the ability to delay replica deletion for a period of time
> --
>
> Key: HDFS-8193
> URL: https://issues.apache.org/jira/browse/HDFS-8193
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Aaron T. Myers
>Assignee: Zhe Zhang
>
> When doing maintenance on an HDFS cluster, users may be concerned about the 
> possibility of administrative mistakes or software bugs deleting replicas of 
> blocks that cannot easily be restored. It would be handy if HDFS could be 
> made to optionally not delete any replicas for a configurable period of time.





[jira] [Commented] (HDFS-7281) Missing block is marked as corrupted block

2015-04-30 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522804#comment-14522804
 ] 

Yongjun Zhang commented on HDFS-7281:
-

The test failures are unrelated; I reran all the failed tests successfully on 
my local machine. I do see that the same failed tests appear in multiple 
Jenkins runs for different JIRAs, so this is likely something in the test 
environment.

+1. I will commit by tomorrow. 



> Missing block is marked as corrupted block
> --
>
> Key: HDFS-7281
> URL: https://issues.apache.org/jira/browse/HDFS-7281
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: supportability
> Attachments: HDFS-7281-2.patch, HDFS-7281-3.patch, HDFS-7281-4.patch, 
> HDFS-7281-5.patch, HDFS-7281-6.patch, HDFS-7281.patch
>
>
> In the situation where a block has lost all its replicas, fsck shows the 
> block as both missing and corrupted. Perhaps it is better not to mark the 
> block as corrupted in this case. The reason it is marked as corrupted is that 
> numCorruptNodes == numNodes == 0 in the following code.
> {noformat}
> BlockManager
> final boolean isCorrupt = numCorruptNodes == numNodes;
> {noformat}
> We would like to clarify whether marking a missing block as corrupted is 
> intentional, or whether it is just a bug.
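
A one-line guard along the lines discussed in the description would distinguish 
the two states. This is a sketch only, not the actual fix in the attached 
patches, and the class name is hypothetical:

```java
/** Sketch: a block with zero located replicas is "missing", not "corrupt". */
class BlockStatusSketch {
  static boolean isCorrupt(int numCorruptNodes, int numNodes) {
    // Guards against the numCorruptNodes == numNodes == 0 case from the
    // description: with no replicas at all, the block is missing, not corrupt.
    return numNodes > 0 && numCorruptNodes == numNodes;
  }
}
```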





[jira] [Commented] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522795#comment-14522795
 ] 

Zhe Zhang commented on HDFS-7678:
-

Jenkins on the HDFS-7285 branch has been unstable recently. I will try to 
trigger it again tomorrow.

> Erasure coding: DFSInputStream with decode functionality
> 
>
> Key: HDFS-7678
> URL: https://issues.apache.org/jira/browse/HDFS-7678
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Li Bo
>Assignee: Zhe Zhang
> Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
> HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
> HDFS-7678-HDFS-7285.005.patch, HDFS-7678.000.patch, HDFS-7678.001.patch
>
>
> A block group reader will read data from a BlockGroup in either striped or 
> contiguous layout. Corrupt blocks may be known before reading (reported by 
> the namenode) or discovered during reading. The block group reader needs to 
> do decoding work when some blocks are found to be corrupt.





[jira] [Commented] (HDFS-8281) Erasure Coding: implement parallel stateful reading for striped layout

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522770#comment-14522770
 ] 

Zhe Zhang commented on HDFS-8281:
-

Great work, Jing! I like the elegant logic of reading at stripe granularity 
(though I have one concern, stated below), and the implementation looks good 
overall. 

# {{StripeRange}} can be moved to {{StripedBlockUtil}}. We can also consider 
(maybe as a follow-on) consolidating it with {{ReadPortion}}.
# _Follow-on_: striping brings in a lot of new concepts. We have already 
defined a block group, an internal block, and a cell, and we are formally 
defining a stripe here. I've been thinking about a follow-on to make them 
easier to understand. My current thought is to have the cell as the basic 
concept. Each cell sits on two dimensions: internal block _i_ and stripe _j_. 
Each cell also has a logical index, _k_, marking its position in the file. In 
the follow-on I'll name each cell as {{(i, j, k)}} in all comments and 
variable names. Just wanted to add a heads-up here; feedback is very welcome.
# {{readOneStripe}} looks quite similar to {{fetchBlockByteRange}}, except that 
the {{blockReader}} is already created. If we consolidate them, stateful read 
can basically get decode functionality for free (after HDFS-7678). We can 
either do the consolidation here or I can do it in a separate JIRA.
# "Short read" in my earlier comment is the scenario where read() returns less 
data than requested (instead of application requesting a small amount of data 
at a time). In talking to [~cmccabe] I got the understanding that this is 
intentional because some applications want to overlap computation with I/O -- 
they can however much data is buffered in {{blockReader}} (usually 64K) very 
fast and start processing the data (maybe issue another async I/O to fill 
buffer). The read() request is big because there's no way for the application 
to know how much data is actually buffered. So while {{readOneCell}} is 
logically simpler and increases throughput for sequential I/O, I wonder if we 
should make sequential / parallel stateful read configurable?
# If we use a small buffer size, the stripe buffer will be smaller, but 
{{pos}} enters a new stripe more frequently; whenever that happens, the 
unlucky read request (the first in the stripe) needs to wait for all 6 threads 
to return.

> Erasure Coding: implement parallel stateful reading for striped layout
> --
>
> Key: HDFS-8281
> URL: https://issues.apache.org/jira/browse/HDFS-8281
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8281-HDFS-7285.001.patch, 
> HDFS-8281-HDFS-7285.001.patch, HDFS-8281.000.patch
>
>
> This jira aims to support parallel reading for stateful read in 
> {{DFSStripedInputStream}}.





[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522760#comment-14522760
 ] 

Zhe Zhang commented on HDFS-7348:
-

Please find my full review below. The overall patch looks good to me. Thanks 
again, Yi, for the work!
# _Follow-on_: we should consider consolidating the thread pool initialization 
logic for hedged read, client striped read, and DN striped read.
# As mentioned under HDFS-8282, we should probably get rid of the {{getBlock}} 
method here.
# How about renaming {{DataRecoveryAndTransfer}} to 
{{ReconstructAndTransferBlock}}? It's not a big deal, but we have decided to 
use "reconstruction" for EC to avoid confusion with block recovery.
# Should {{WRITE_PACKET_SIZE}} be linked to 
{{BlockSender#MIN_BUFFER_WITH_TRANSFERTO}}?
# Why do we need {{targetInputStreams}}?
# In the test, the following 2 lines should be flipped:
{code}
cluster.getFileSystem().getClient().createErasureCodingZone("/", null);
fs = cluster.getFileSystem();
{code}
# The test failed on my local machine, reporting NPE when closing file:
{code}
java.io.IOException: java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.DataStreamer$LastException.check(DataStreamer.java:193)
at 
org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:422)
{code}
# {{cluster#stopDataNode}} might be an easier way to kill a DN?
# We could set {{DFS_NAMENODE_REPLICATION_INTERVAL_KEY}} to 1 to speed up the 
test.

> Erasure Coding: striped block recovery
> --
>
> Key: HDFS-7348
> URL: https://issues.apache.org/jira/browse/HDFS-7348
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Yi Liu
> Attachments: ECWorker.java, HDFS-7348.001.patch
>
>
> This JIRA is to recover one or more missing striped blocks in a striped 
> block group.





[jira] [Commented] (HDFS-8294) Erasure Coding: Fix Findbug warnings present in erasure coding

2015-04-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522740#comment-14522740
 ] 

Rakesh R commented on HDFS-8294:


Oops, it got ABORTED again; please see the [jenkins 
build|https://builds.apache.org/job/PreCommit-HDFS-Build/10479/console]. On 
the HDFS-7285 branch, Jenkins unit testing seems to be taking too long 
(>300 minutes) and is getting ABORTED frequently.

> Erasure Coding: Fix Findbug warnings present in erasure coding
> --
>
> Key: HDFS-8294
> URL: https://issues.apache.org/jira/browse/HDFS-8294
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8294-HDFS-7285.00.patch, 
> HDFS-8294-HDFS-7285.01.patch
>
>
> Following are the findbug warnings :-
> # Possible null pointer dereference of arr$ in 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
> {code}
> Bug type NP_NULL_ON_SOME_PATH (click for details) 
> In class 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction
> In method 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
> Value loaded from arr$
> Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206]
> Known null at BlockInfoStripedUnderConstruction.java:[line 200]
> {code}
> # Found reliance on default encoding in 
> org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
>  ECSchema): String.getBytes()
> Found reliance on default encoding in 
> org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):
>  new String(byte[])
> {code}
> Bug type DM_DEFAULT_ENCODING (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager
> In method 
> org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
>  ECSchema)
> Called method String.getBytes()
> At ErasureCodingZoneManager.java:[line 116]
> Bug type DM_DEFAULT_ENCODING (click for details) 
> In class org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager
> In method 
> org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath)
> Called method new String(byte[])
> At ErasureCodingZoneManager.java:[line 81]
> {code}
> # Inconsistent synchronization of 
> org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time
> {code}
> Bug type IS2_INCONSISTENT_SYNC (click for details) 
> In class org.apache.hadoop.hdfs.DFSOutputStream
> Field org.apache.hadoop.hdfs.DFSOutputStream.streamer
> Synchronized 90% of the time
> Unsynchronized access at DFSOutputStream.java:[line 142]
> Unsynchronized access at DFSOutputStream.java:[line 853]
> Unsynchronized access at DFSOutputStream.java:[line 617]
> Unsynchronized access at DFSOutputStream.java:[line 620]
> Unsynchronized access at DFSOutputStream.java:[line 630]
> Unsynchronized access at DFSOutputStream.java:[line 338]
> Unsynchronized access at DFSOutputStream.java:[line 734]
> Unsynchronized access at DFSOutputStream.java:[line 897]
> {code}
> # Dead store to offSuccess in 
> org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()
> {code}
> Bug type DLS_DEAD_LOCAL_STORE (click for details) 
> In class org.apache.hadoop.hdfs.StripedDataStreamer
> In method org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()
> Local variable named offSuccess
> At StripedDataStreamer.java:[line 105]
> {code}
> # Result of integer multiplication cast to long in 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()
> {code}
> Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
> In class org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped
> In method 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()
> At BlockInfoStriped.java:[line 208]
> {code}
> # Result of integer multiplication cast to long in 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
>  int, int, int, int)
> {code}
> Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
> In class org.apache.hadoop.hdfs.util.StripedBlockUtil
> In method 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
>  int, int, int, int)
> At StripedBlockUtil.java:[line 85]
> {code}
> # Switch statement found in 
> org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, 
> long, byte[], int, Map) where default case is missing
> {code}
> Bug type SF_SWITCH_NO_DEFAULT (click for details) 
> In class org.apache.hadoop.hdfs.DFSStripedInputStream
> In method 
> org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRa
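
The DM_DEFAULT_ENCODING warnings listed above are conventionally fixed by 
passing an explicit charset to the String conversions. A minimal, 
self-contained sketch of that remedy follows; it is not the actual patch, and 
the class name is hypothetical:

```java
import java.nio.charset.StandardCharsets;

/** Sketch: explicit-charset conversions, the usual DM_DEFAULT_ENCODING remedy. */
class EncodingFixSketch {
  static byte[] toBytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);    // instead of s.getBytes()
  }

  static String fromBytes(byte[] b) {
    return new String(b, StandardCharsets.UTF_8); // instead of new String(b)
  }
}
```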

[jira] [Commented] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522729#comment-14522729
 ] 

Hadoop QA commented on HDFS-8303:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  9s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 46s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 50s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 24s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 11s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 227m 53s | Tests failed in hadoop-hdfs. |
| | | 271m  9s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSOutputStream |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729620/HDFS-8303.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 5f8112f |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10494/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10494/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10494/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10494/console |


This message was automatically generated.

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch, HDFS-8303.1.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic for purging the current directory is very similar to 
> the FJM purging logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
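For illustration only, the purge patterns above can be exercised in isolation. This is a minimal, self-contained sketch; the class name, the `shouldPurge` helper, and the sample file names are invented here and are not the actual QJM code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PurgeRegexDemo {
  // Same patterns the QJM applies to files in the current directory.
  static final List<Pattern> CURRENT_DIR_PURGE_REGEXES = Arrays.asList(
      Pattern.compile("edits_\\d+-(\\d+)"),
      Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));

  // A file is purgeable when a pattern matches its name and the captured
  // txid is below minTxIdToKeep.
  static boolean shouldPurge(String fileName, long minTxIdToKeep) {
    for (Pattern p : CURRENT_DIR_PURGE_REGEXES) {
      Matcher m = p.matcher(fileName);
      if (m.matches()) {
        return Long.parseLong(m.group(1)) < minTxIdToKeep;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(shouldPurge("edits_0000001-0000100", 200));    // true
    System.out.println(shouldPurge("edits_0000201-0000300", 200));    // false
    System.out.println(shouldPurge("edits_inprogress_0000050", 200)); // true
  }
}
```

The FJM patterns below capture the same transaction ids with slightly different regexes, which is why the two purge paths can be consolidated.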
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);

[jira] [Commented] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522722#comment-14522722
 ] 

Hadoop QA commented on HDFS-8306:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 53s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 56s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 24s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 12s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 20s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 227m 23s | Tests failed in hadoop-hdfs. |
| | | 270m 31s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such.  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.fs.viewfs.TestViewFileSystemWithAcls |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSOutputStream |
| Timed out tests | 
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729618/HDFS-8306.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 5f8112f |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10493/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10493/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10493/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10493/console |



> Generate ACL and Xattr outputs in OIV XML outputs
> -
>
> Key: HDFS-8306
> URL: https://issues.apache.org/jira/browse/HDFS-8306
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-8306.000.patch
>
>
> Currently, in the {{hdfs oiv}} XML output, not all fields of the fsimage are 
> emitted. This makes inspecting an {{fsimage}} through its XML output less 
> practical, and it also prevents recovering an fsimage from the XML file.
> This JIRA adds ACLs and XAttrs to the XML output as the first step toward the 
> goal described in HDFS-8061.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7739) ZKFC - transitionToActive is indefinitely waiting to complete fenceOldActive

2015-04-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522717#comment-14522717
 ] 

Brahma Reddy Battula commented on HDFS-7739:


Thanks a lot for looking into this issue.

AFAIK: 1) passwordless ssh is configured, and it was not prompting for a 
password; 2) the default value of {{dfs.ha.fencing.ssh.connect-timeout}} is 30 
seconds, after which it should give up.

Anyway, let me retry this scenario once.

> ZKFC - transitionToActive is indefinitely waiting to complete fenceOldActive
> 
>
> Key: HDFS-7739
> URL: https://issues.apache.org/jira/browse/HDFS-7739
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.6.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Attachments: zkfctd.out
>
>
>  *Scenario:* 
> One of the cluster disks became full and ZKFC was performing a 
> transitionToActive. To fence the old active node it needs to execute the 
> fencing command and wait for the result; since the disk was full, the 
> StreamPumper thread waits indefinitely (even after the disk is freed, it 
> does not come out).
>  *{color:blue}Please check the attached thread dump of ZKFC{color}*.
>  *{color:green}It would be better to maintain a timeout for the StreamPumper 
> thread{color}*.
> {code}
> protected void pump() throws IOException {
> InputStreamReader inputStreamReader = new InputStreamReader(stream);
> BufferedReader br = new BufferedReader(inputStreamReader);
> String line = null;
> while ((line = br.readLine()) != null) {
>   if (type == StreamType.STDOUT) {
> log.info(logPrefix + ": " + line);
>   } else {
> log.warn(logPrefix + ": " + line);  
>   }
> {code}
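One possible shape for such a timeout, sketched here as a hypothetical standalone example: run the blocking read loop on a worker thread and bound the wait with a Future timeout. The class name, method name, and timeout value below are illustrative, not the actual Hadoop StreamPumper API:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedPump {
  // Counts lines read from the stream, but gives up after timeoutMs instead
  // of blocking forever the way an unbounded readLine() loop would.
  static int pumpWithTimeout(InputStream stream, long timeoutMs) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<Integer> result = executor.submit(() -> {
      int count = 0;
      BufferedReader br = new BufferedReader(
          new InputStreamReader(stream, StandardCharsets.UTF_8));
      while (br.readLine() != null) {
        count++; // the real pumper would log each line here
      }
      return count;
    });
    try {
      return result.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      result.cancel(true); // abandon the stuck read instead of waiting forever
      return -1;
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    byte[] data = "line1\nline2\nline3\n".getBytes(StandardCharsets.UTF_8);
    System.out.println(pumpWithTimeout(new ByteArrayInputStream(data), 5000));
  }
}
```

With this shape, a fencing command whose output never completes would return -1 after the deadline rather than hanging the ZKFC thread.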





[jira] [Commented] (HDFS-7739) ZKFC - transitionToActive is indefinitely waiting to complete fenceOldActive

2015-04-30 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522710#comment-14522710
 ] 

Chris Nauroth commented on HDFS-7739:
-

Hi [~brahmareddy].  From the stack trace, it looks like the process is blocked 
waiting to read output from the ssh connection to run fuser to stop the old 
active.  I can think of 2 possible theories:

# Passwordless ssh is not configured, so the connection is hanging indefinitely 
prompting for a password.  This would require configuration of 
{{dfs.ha.fencing.ssh.private-key-files}} to specify the ssh key file.
# The ssh connection to run fuser is hanging indefinitely.  This could be 
caused by a lot of different kinds of failures at the old active, making it 
unresponsive.  This can be mitigated by configuring a timeout on the ssh 
connection ({{dfs.ha.fencing.ssh.connect-timeout}}).

This documentation page has more details:

http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Configuration_details
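For reference, a minimal {{hdfs-site.xml}} fragment combining both settings; the private-key path and the 30000 ms timeout are illustrative placeholder values, not recommendations:

```xml
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hdfs/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
```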

> ZKFC - transitionToActive is indefinitely waiting to complete fenceOldActive
> 
>
> Key: HDFS-7739
> URL: https://issues.apache.org/jira/browse/HDFS-7739
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.6.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Attachments: zkfctd.out
>
>
>  *Scenario:* 
> One of the cluster disks became full and ZKFC was performing a 
> transitionToActive. To fence the old active node it needs to execute the 
> fencing command and wait for the result; since the disk was full, the 
> StreamPumper thread waits indefinitely (even after the disk is freed, it 
> does not come out).
>  *{color:blue}Please check the attached thread dump of ZKFC{color}*.
>  *{color:green}It would be better to maintain a timeout for the StreamPumper 
> thread{color}*.
> {code}
> protected void pump() throws IOException {
> InputStreamReader inputStreamReader = new InputStreamReader(stream);
> BufferedReader br = new BufferedReader(inputStreamReader);
> String line = null;
> while ((line = br.readLine()) != null) {
>   if (type == StreamType.STDOUT) {
> log.info(logPrefix + ": " + line);
>   } else {
> log.warn(logPrefix + ": " + line);  
>   }
> {code}





[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522705#comment-14522705
 ] 

Hudson commented on HDFS-8283:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7710 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7710/])
HDFS-8300. Fix unit test failures and findbugs warning caused by HDFS-8283. 
Contributed by Jing Zhao. (jing9: rev 98a61766286321468bf801a9f17a843d7eae8d9e)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> DataStreamer cleanup and some minor improvement
> ---
>
> Key: HDFS-8283
> URL: https://issues.apache.org/jira/browse/HDFS-8283
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h8283_20150428.patch
>
>
> - When throwing an exception
> -* always set lastException 
> -* always create a new exception so that it has the new stack trace
> - Add LOG.
> - Add final to isAppend and favoredNodes





[jira] [Commented] (HDFS-8292) Move conditional in fmt_time from dfs-dust.js to status.html

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522704#comment-14522704
 ] 

Hudson commented on HDFS-8292:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7710 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7710/])
HDFS-8292. Move conditional in fmt_time from dfs-dust.js to status.html. 
Contributed by Charles Lamb. (wang: rev 
87e997823581790cce8d82d20e5e82ef9dd80670)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary/status.html
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/dfs-dust.js


> Move conditional in fmt_time from dfs-dust.js to status.html
> 
>
> Key: HDFS-8292
> URL: https://issues.apache.org/jira/browse/HDFS-8292
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.8.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8292.000.patch
>
>
> Per [~wheat9]'s comment in HDFS-8214, move the check for < 0 from dfs-dust.js 
> to status.html.





[jira] [Commented] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522703#comment-14522703
 ] 

Hudson commented on HDFS-8300:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7710 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7710/])
HDFS-8300. Fix unit test failures and findbugs warning caused by HDFS-8283. 
Contributed by Jing Zhao. (jing9: rev 98a61766286321468bf801a9f17a843d7eae8d9e)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.8.0
>
> Attachments: HDFS-8300.000.patch
>
>
> - findbugs warning
> Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from 
> an Exception, even though it is named as such
> - unit test failures
> see https://builds.apache.org/job/PreCommit-HDFS-Build/10455/testReport/
> These bugs somehow were not reported in [the Jenkins 
> run|https://issues.apache.org/jira/browse/HDFS-8283?focusedCommentId=14518736&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14518736]
>  previously.





[jira] [Commented] (HDFS-8290) WebHDFS calls before namesystem initialization can cause NullPointerException.

2015-04-30 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522700#comment-14522700
 ] 

Chris Nauroth commented on HDFS-8290:
-

The test failures and findbugs warning have been fixed in HDFS-8300, so I 
submitted a fresh Jenkins run for this patch.

> WebHDFS calls before namesystem initialization can cause NullPointerException.
> --
>
> Key: HDFS-8290
> URL: https://issues.apache.org/jira/browse/HDFS-8290
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Attachments: HDFS-8290.001.patch
>
>
> The NameNode has a brief window of time when the HTTP server has been 
> initialized, but the namesystem has not been initialized.  During this 
> window, a WebHDFS call can cause a {{NullPointerException}}.  We can catch 
> this condition and return a more meaningful error.





[jira] [Resolved] (HDFS-8308) Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when loading editlog

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-8308.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285

Thanks for the review, Nicholas! I've committed this.

> Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when 
> loading editlog
> 
>
> Key: HDFS-8308
> URL: https://issues.apache.org/jira/browse/HDFS-8308
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: HDFS-7285
>
> Attachments: HDFS-8308.000.patch
>
>
> If the editlog contains a transaction for creating an EC file, the NN will 
> get blocked in {{waitForLoadingFSImage}} because the following call path:
> FSDirectory#addFileForEditLog --> FSDirectory#isInECZone --> 
> FSDirectory#getECSchema --> ECZoneManager#getECSchema --> 
> ECZoneManager#getECZoneInfo --> FSNamesystem#getSchema --> 
> waitForLoadingFSImage
> This jira plans to fix this bug and also do some code cleanup.





[jira] [Updated] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8300:

   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Thanks for the review, Nicholas! I've committed this to trunk and branch-2.

> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.8.0
>
> Attachments: HDFS-8300.000.patch
>
>
> - findbugs warning
> Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from 
> an Exception, even though it is named as such
> - unit test failures
> see https://builds.apache.org/job/PreCommit-HDFS-Build/10455/testReport/
> These bugs somehow were not reported in [the Jenkins 
> run|https://issues.apache.org/jira/browse/HDFS-8283?focusedCommentId=14518736&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14518736]
>  previously.





[jira] [Comment Edited] (HDFS-8213) DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace

2015-04-30 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522663#comment-14522663
 ] 

Colin Patrick McCabe edited comment on HDFS-8213 at 5/1/15 2:12 AM:


The findbugs warning is bogus; the patch doesn't modify 
org.apache.hadoop.hdfs.DataStreamer$LastException. The rest of the results look 
bogus as well (a lot of test timeouts on random things that aren't enabling or 
touching tracing), so I guess it's time to re-run again.


was (Author: cmccabe):
findbugs warning is bogus.  patch doesn't modify 
org.apache.hadoop.hdfs.DataStreamer$LastException.  the rest of the stuff looks 
bogus as well, guess it's time to re-run again

> DFSClient should use hdfs.client.htrace HTrace configuration prefix rather 
> than hadoop.htrace
> -
>
> Key: HDFS-8213
> URL: https://issues.apache.org/jira/browse/HDFS-8213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Billie Rinaldi
>Assignee: Colin Patrick McCabe
>Priority: Critical
> Attachments: HDFS-8213.001.patch, HDFS-8213.002.patch
>
>
> DFSClient initializing SpanReceivers is a problem for Accumulo, which manages 
> SpanReceivers through its own configuration.  This results in the same 
> receivers being registered multiple times and spans being delivered more than 
> once.  The documentation says SpanReceiverHost.getInstance should be issued 
> once per process, so there is no expectation that DFSClient should do this.





[jira] [Commented] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522664#comment-14522664
 ] 

Hadoop QA commented on HDFS-8305:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 31s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 24s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   5m 31s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  6s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 225m 44s | Tests failed in hadoop-hdfs. |
| | | 271m 35s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such.  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.server.namenode.TestSaveNamespace |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729603/HDFS-8305.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f0db797 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10490/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10490/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10490/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10490/console |



> HDFS INotify: the destination field of RenameOp should always end with the 
> file name
> 
>
> Key: HDFS-8305
> URL: https://issues.apache.org/jira/browse/HDFS-8305
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8305.001.patch
>
>
> HDFS INotify: the destination field of RenameOp should always end with the 
> file name rather than sometimes being a directory name.





[jira] [Commented] (HDFS-8213) DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace

2015-04-30 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522663#comment-14522663
 ] 

Colin Patrick McCabe commented on HDFS-8213:


The findbugs warning is bogus; the patch doesn't modify 
org.apache.hadoop.hdfs.DataStreamer$LastException. The rest of the results look 
bogus as well, so I guess it's time to re-run again.

> DFSClient should use hdfs.client.htrace HTrace configuration prefix rather 
> than hadoop.htrace
> -
>
> Key: HDFS-8213
> URL: https://issues.apache.org/jira/browse/HDFS-8213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Billie Rinaldi
>Assignee: Colin Patrick McCabe
>Priority: Critical
> Attachments: HDFS-8213.001.patch, HDFS-8213.002.patch
>
>
> DFSClient initializing SpanReceivers is a problem for Accumulo, which manages 
> SpanReceivers through its own configuration.  This results in the same 
> receivers being registered multiple times and spans being delivered more than 
> once.  The documentation says SpanReceiverHost.getInstance should be issued 
> once per process, so there is no expectation that DFSClient should do this.





[jira] [Commented] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522654#comment-14522654
 ] 

Hadoop QA commented on HDFS-7980:
-

(!) The patch artifact directory has been removed! 
This is a fatal error for test-patch.sh.  Aborting. 
Jenkins (node H4) information at 
https://builds.apache.org/job/PreCommit-HDFS-Build/10495/ may provide some 
hints.

> Incremental BlockReport will dramatically slow down the startup of  a namenode
> --
>
> Key: HDFS-7980
> URL: https://issues.apache.org/jira/browse/HDFS-7980
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hui Zheng
>Assignee: Walter Su
> Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, 
> HDFS-7980.003.patch, HDFS-7980.004.patch
>
>
> In the current implementation the datanode calls the 
> reportReceivedDeletedBlocks() method, which is an incremental block report, 
> before calling the bpNamenode.blockReport() method. So in a large (several 
> thousand datanodes) and busy cluster it will slow down the startup of the 
> namenode by more than one hour. 
> {code}
> List<DatanodeCommand> blockReport() throws IOException {
> // send block report if timer has expired.
> final long startTime = now();
> if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
>   return null;
> }
> final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
> // Flush any block information that precedes the block report. Otherwise
> // we have a chance that we will miss the delHint information
> // or we will report an RBW replica after the BlockReport already reports
> // a FINALIZED one.
> reportReceivedDeletedBlocks();
> lastDeletedReport = startTime;
> .
> // Send the reports to the NN.
> int numReportsSent = 0;
> int numRPCs = 0;
> boolean success = false;
> long brSendStartTime = now();
> try {
>   if (totalBlockCount < dnConf.blockReportSplitThreshold) {
> // Below split threshold, send all reports in a single message.
> DatanodeCommand cmd = bpNamenode.blockReport(
> bpRegistration, bpos.getBlockPoolId(), reports);
> {code}





[jira] [Commented] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522652#comment-14522652
 ] 

Hadoop QA commented on HDFS-8303:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 58s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   5m 36s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 12s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 19s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 214m 18s | Tests failed in hadoop-hdfs. |
| | | 261m 45s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such.  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.cli.TestHDFSCLI |
| Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729620/HDFS-8303.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f0db797 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10491/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10491/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10491/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10491/console |



> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch, HDFS-8303.1.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic for purging the current dir is very similar to the FJM 
> purging logic.
> QJM:
> {code}
>  private static final List CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\

[jira] [Commented] (HDFS-8213) DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522645#comment-14522645
 ] 

Hadoop QA commented on HDFS-8213:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   5m 21s | The applied patch generated  1 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   4m 47s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | common tests |  23m  3s | Tests passed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 227m 12s | Tests failed in hadoop-hdfs. |
| | | 294m 35s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.qjournal.TestNNWithQJM |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12728982/HDFS-8213.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c55d609 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10489/artifact/patchprocess/checkstyle-result-diff.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10489/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10489/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10489/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10489/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10489/console |


This message was automatically generated.

> DFSClient should use hdfs.client.htrace HTrace configuration prefix rather 
> than hadoop.htrace
> -
>
> Key: HDFS-8213
> URL: https://issues.apache.org/jira/browse/HDFS-8213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Billie Rinaldi
>Assignee: Colin Patrick McCabe
>Priority: Critical
> Attachments: HDFS-8213.001.patch, HDFS-8213.002.patch
>
>
> DFSClient initializing SpanReceivers is a problem for Accumulo, which manages 
> SpanReceivers through its own configuration.  This results in the same 
> receivers being registered multiple times and spans being delivered more than 
> once.  The documentation says SpanReceiverHost.getInstance should be issued 
> once per process, so there is no expectation that DFSClient should do this.
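
The once-per-process expectation described above boils down to an idempotent singleton guard. A minimal sketch (the class name ReceiverHostSketch is an illustrative stand-in, not the actual HTrace SpanReceiverHost API):

```java
// Minimal sketch of a once-per-process guard; ReceiverHostSketch is an
// illustrative stand-in, not the actual HTrace SpanReceiverHost API.
public class ReceiverHostSketch {
  private static volatile ReceiverHostSketch instance;

  private ReceiverHostSketch() {
    // a real implementation would register its span receivers here,
    // exactly once per process
  }

  // Double-checked locking: repeated calls return the same instance,
  // so receivers are never registered twice.
  public static ReceiverHostSketch getInstance() {
    if (instance == null) {
      synchronized (ReceiverHostSketch.class) {
        if (instance == null) {
          instance = new ReceiverHostSketch();
        }
      }
    }
    return instance;
  }
}
```

If DFSClient also calls such a guard on its own, the guard keeps registration idempotent, but the cleaner fix (as this jira proposes) is a separate configuration prefix so the two initializations do not collide.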



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8292) Move conditional in fmt_time from dfs-dust.js to status.html

2015-04-30 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522637#comment-14522637
 ] 

Charles Lamb commented on HDFS-8292:


Yes, I tested it manually.

Thanks Andrew.


> Move conditional in fmt_time from dfs-dust.js to status.html
> 
>
> Key: HDFS-8292
> URL: https://issues.apache.org/jira/browse/HDFS-8292
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.8.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8292.000.patch
>
>
> Per [~wheat9]'s comment in HDFS-8214, move the check for < 0 from dfs-dust.js 
> to status.html.





[jira] [Commented] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522627#comment-14522627
 ] 

Hadoop QA commented on HDFS-8300:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 27s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   7m 35s | The applied patch generated  1 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  5s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests | 164m 51s | Tests passed in hadoop-hdfs. 
|
| | | 212m 56s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729607/HDFS-8300.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f0db797 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10492/artifact/patchprocess/checkstyle-result-diff.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10492/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10492/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10492/console |


This message was automatically generated.

> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8300.000.patch
>
>
> - findbugs warning
> Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from 
> an Exception, even though it is named as such
> - unit test failures
> see https://builds.apache.org/job/PreCommit-HDFS-Build/10455/testReport/
> These bugs somehow were not reported in [the Jenkins 
> run|https://issues.apache.org/jira/browse/HDFS-8283?focusedCommentId=14518736&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14518736]
>  previously.





[jira] [Updated] (HDFS-8292) Move conditional in fmt_time from dfs-dust.js to status.html

2015-04-30 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8292:
--
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks Charles for the fixup.

> Move conditional in fmt_time from dfs-dust.js to status.html
> 
>
> Key: HDFS-8292
> URL: https://issues.apache.org/jira/browse/HDFS-8292
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.8.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-8292.000.patch
>
>
> Per [~wheat9]'s comment in HDFS-8214, move the check for < 0 from dfs-dust.js 
> to status.html.





[jira] [Commented] (HDFS-8292) Move conditional in fmt_time from dfs-dust.js to status.html

2015-04-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522612#comment-14522612
 ] 

Andrew Wang commented on HDFS-8292:
---

+1 will commit shortly, assuming you tested manually.

> Move conditional in fmt_time from dfs-dust.js to status.html
> 
>
> Key: HDFS-8292
> URL: https://issues.apache.org/jira/browse/HDFS-8292
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.8.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Minor
> Attachments: HDFS-8292.000.patch
>
>
> Per [~wheat9]'s comment in HDFS-8214, move the check for < 0 from dfs-dust.js 
> to status.html.





[jira] [Commented] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522598#comment-14522598
 ] 

Hadoop QA commented on HDFS-8303:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 34s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   7m  5s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  8s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 227m  6s | Tests failed in hadoop-hdfs. |
| | | 274m 45s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.qjournal.client.TestQuorumJournalManager |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.TestNNStorageRetentionManager |
| Timed out tests | 
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729602/HDFS-8303.0.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7e8639f |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10488/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10488/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10488/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10488/console |


This message was automatically generated.

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch, HDFS-8303.1.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic for purging the current dir is very similar to the FJM 
> purging logic.
> QJM:
> {code}
>  private static final List CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txi
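
The shared logic quoted above amounts to: match each file name against a pattern, extract the highest txid embedded in the name, and purge the file when that txid falls below the retention floor. A standalone sketch of that consolidated loop (the purgeMatching helper is hypothetical, not the patch's actual API; it returns the matching names instead of deleting files):

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class PurgeSketch {
  // Same shape as the quoted QJM patterns: group(1) holds the file's
  // highest transaction id.
  private static final List<Pattern> PURGE_REGEXES = Arrays.asList(
      Pattern.compile("edits_\\d+-(\\d+)"),
      Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));

  // Returns the names whose embedded txid is below minTxIdToKeep; a real
  // implementation would delete the matching files instead.
  public static List<String> purgeMatching(List<String> fileNames,
      long minTxIdToKeep) {
    return fileNames.stream().filter(name -> {
      for (Pattern p : PURGE_REGEXES) {
        Matcher m = p.matcher(name);
        if (m.matches()) {
          return Long.parseLong(m.group(1)) < minTxIdToKeep;
        }
      }
      return false;  // files that match no pattern are never purged
    }).collect(Collectors.toList());
  }
}
```

Since QJM and FJM differ essentially only in the regex list, routing QJM's purge through FJM removes the duplicated loop.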

[jira] [Commented] (HDFS-6757) Simplify lease manager with INodeID

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522566#comment-14522566
 ] 

Hadoop QA commented on HDFS-6757:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 32s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 9 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   7m 46s | The applied patch generated  9 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  8s | The patch appears to introduce 2 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 223m 58s | Tests failed in hadoop-hdfs. |
| | | 272m 12s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
|  |  Dead store to src in 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.prepareFileForTruncate(INodesInPath,
 String, String, long, Block)  At 
FSNamesystem.java:org.apache.hadoop.hdfs.server.namenode.FSNamesystem.prepareFileForTruncate(INodesInPath,
 String, String, long, Block)  At FSNamesystem.java:[line 2086] |
| Failed unit tests | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.TestFileCreationDelete |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
| Timed out tests | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729596/HDFS-6757.012.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7e8639f |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10486/artifact/patchprocess/checkstyle-result-diff.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10486/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10486/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10486/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10486/console |


This message was automatically generated.

> Simplify lease manager with INodeID
> ---
>
> Key: HDFS-6757
> URL: https://issues.apache.org/jira/browse/HDFS-6757
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-6757.000.patch, HDFS-6757.001.patch, 
> HDFS-6757.002.patch, HDFS-6757.003.patch, HDFS-6757.004.patch, 
> HDFS-6757.005.patch, HDFS-6757.006.patch, HDFS-6757.007.patch, 
> HDFS-6757.008.patch, HDFS-6757.009.patch, HDFS-6757.010.patch, 
> HDFS-6757.011.patch, HDFS-6757.012.patch
>
>
> Currently the lease manager records leases based on path instead of inode 
> ids. Therefore, the lease manager needs to carefully keep track of the path 
> of act

[jira] [Commented] (HDFS-7739) ZKFC - transitionToActive is indefinitely waiting to complete fenceOldActive

2015-04-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522565#comment-14522565
 ] 

Brahma Reddy Battula commented on HDFS-7739:


[~cnauroth] and [~vinayrpet], any pointers on this issue?

> ZKFC - transitionToActive is indefinitely waiting to complete fenceOldActive
> 
>
> Key: HDFS-7739
> URL: https://issues.apache.org/jira/browse/HDFS-7739
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.6.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Critical
> Attachments: zkfctd.out
>
>
>  *Scenario:* 
> One of the cluster disks got full while ZKFC was performing 
> transitionToActive. To fence the old active node, ZKFC needs to execute the 
> fencing command and wait for the result; because the disk was full, the 
> StreamPumper thread waits indefinitely (it does not come out of the wait even 
> after the disk is freed up).
>  *{color:blue}Please check the attached thread dump of ZKFC.{color}* 
>  *{color:green}It would be better to enforce a timeout for the StreamPumper 
> thread.{color}* 
> {code}
> protected void pump() throws IOException {
> InputStreamReader inputStreamReader = new InputStreamReader(stream);
> BufferedReader br = new BufferedReader(inputStreamReader);
> String line = null;
> while ((line = br.readLine()) != null) {
>   if (type == StreamType.STDOUT) {
> log.info(logPrefix + ": " + line);
>   } else {
> log.warn(logPrefix + ": " + line);
>   }
> }
> }
> {code}
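
For illustration, a bounded wait around the pump loop could look roughly like the sketch below (the pumpWithTimeout helper, executor usage, and timeout value are assumptions, not the actual ZKFC code; note that a blocked readLine may ignore the interrupt, so a real fix may also need to destroy the child process):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedStreamPumper {
  // Reads lines from the stream on a worker thread, but gives up after
  // timeoutMillis instead of blocking the caller forever.
  public static List<String> pumpWithTimeout(InputStream stream,
      long timeoutMillis) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<List<String>> future = executor.submit(() -> {
      List<String> lines = new ArrayList<>();
      try (BufferedReader br = new BufferedReader(
          new InputStreamReader(stream, StandardCharsets.UTF_8))) {
        String line;
        while ((line = br.readLine()) != null) {
          lines.add(line);  // in ZKFC this is where log.info/log.warn happen
        }
      }
      return lines;
    });
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true);  // best effort; a blocked read may not notice
      throw e;
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    InputStream in = new ByteArrayInputStream(
        "a\nb\n".getBytes(StandardCharsets.UTF_8));
    System.out.println(pumpWithTimeout(in, 1000));  // prints [a, b]
  }
}
```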





[jira] [Updated] (HDFS-8308) Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when loading editlog

2015-04-30 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8308:
--
Hadoop Flags: Reviewed

+1 patch looks good.

> Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when 
> loading editlog
> 
>
> Key: HDFS-8308
> URL: https://issues.apache.org/jira/browse/HDFS-8308
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8308.000.patch
>
>
> If the editlog contains a transaction for creating an EC file, the NN will 
> get blocked in {{waitForLoadingFSImage}} because of the following call path:
> FSDirectory#addFileForEditLog --> FSDirectory#isInECZone --> 
> FSDirectory#getECSchema --> ECZoneManager#getECSchema --> 
> ECZoneManager#getECZoneInfo --> FSNamesystem#getSchema --> 
> waitForLoadingFSImage
> This jira plans to fix this bug and also do some code cleanup.





[jira] [Commented] (HDFS-8161) Both Namenodes are in standby State

2015-04-30 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522555#comment-14522555
 ] 

Brahma Reddy Battula commented on HDFS-8161:


Thanks a lot, [~cnauroth], for looking into this issue.

Yes, we should make the code more resilient here.

{quote}Do you know if there was a particular ZooKeeper status code that you saw 
when this happened?{quote}
I am not sure; the status might have been "OK". I will confirm.
{quote}
Do you have the capability to repro consistently?
{quote}
It's hard to reproduce, but I can. I will try next Monday and post the result.


> Both Namenodes are in standby State
> ---
>
> Key: HDFS-8161
> URL: https://issues.apache.org/jira/browse/HDFS-8161
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover
>Affects Versions: 2.6.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: ACTIVEBreadcumb and StandbyElector.txt
>
>
> Suspected scenario:
> 
> Start a cluster with three nodes.
> Reboot the machine where ZKFC is not running (here the active NameNode's ZKFC 
> should have an open session with this ZooKeeper).
> Now the active NameNode's ZKFC session expires and it tries to re-establish a 
> connection with another ZooKeeper. By that time, the standby NameNode's ZKFC 
> will try to fence the old active, create the active breadcrumb, and 
> transition the standby NameNode to the active state.
> But immediately it is fenced back to the standby state (this is the doubt).
> Hence both NameNodes end up in the standby state.





[jira] [Updated] (HDFS-8308) Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when loading editlog

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8308:

Attachment: HDFS-8308.000.patch

Uploaded a patch to fix the issue. The main fix is actually just one line: 
{code}
-  ECSchema schema = dir.getFSNamesystem().getECSchema(schemaName);
+  ECSchema schema = dir.getFSNamesystem().getSchemaManager()
+  .getSchema(schemaName);
{code}

The patch also updates {{TestAddStripedBlocks}}, which can now be used to 
verify the fix.
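
In miniature, the difference between the two call paths can be modeled like this (all class names are illustrative stand-ins, not the NameNode's actual types; the latch plays the role of waitForLoadingFSImage and is never released while the edit log is still being replayed):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Miniature model of the blocking call path described in this jira.
public class StartupBlockingSketch {
  private final CountDownLatch imageLoaded = new CountDownLatch(1);
  private final SchemaManager schemaManager = new SchemaManager();

  static class SchemaManager {
    String getSchema(String name) {
      return "schema:" + name;
    }
  }

  // The buggy path: a public accessor that first waits for image loading,
  // which can never finish because edit-log replay is itself part of loading.
  public String getSchemaPublic(String name) throws InterruptedException {
    if (!imageLoaded.await(100, TimeUnit.MILLISECONDS)) {
      throw new IllegalStateException("would block: image still loading");
    }
    return schemaManager.getSchema(name);
  }

  // The fixed path: during edit-log replay, ask the manager directly
  // without waiting on the image-loaded condition.
  public String getSchemaDuringReplay(String name) {
    return schemaManager.getSchema(name);
  }
}
```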

> Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when 
> loading editlog
> 
>
> Key: HDFS-8308
> URL: https://issues.apache.org/jira/browse/HDFS-8308
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8308.000.patch
>
>
> If the editlog contains a transaction for creating an EC file, the NN will 
> get blocked in {{waitForLoadingFSImage}} because of the following call path:
> FSDirectory#addFileForEditLog --> FSDirectory#isInECZone --> 
> FSDirectory#getECSchema --> ECZoneManager#getECSchema --> 
> ECZoneManager#getECZoneInfo --> FSNamesystem#getSchema --> 
> waitForLoadingFSImage
> This jira plans to fix this bug and also do some code cleanup.





[jira] [Assigned] (HDFS-8289) DFSStripedOutputStream uses an additional rpc call to getErasureCodingInfo

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao reassigned HDFS-8289:
---

Assignee: Jing Zhao  (was: Tsz Wo Nicholas Sze)

> DFSStripedOutputStream uses an additional rpc call to getErasureCodingInfo
> -
>
> Key: HDFS-8289
> URL: https://issues.apache.org/jira/browse/HDFS-8289
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Jing Zhao
>
> {code}
> // ECInfo is restored from NN just before writing striped files.
> ecInfo = dfsClient.getErasureCodingInfo(src);
> {code}
> The rpc call above can be avoided by adding ECSchema to HdfsFileStatus.





[jira] [Updated] (HDFS-8249) Separate HdfsConstants into the client and the server side class

2015-04-30 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8249:
-
Attachment: HDFS-8249.003.patch

> Separate HdfsConstants into the client and the server side class
> 
>
> Key: HDFS-8249
> URL: https://issues.apache.org/jira/browse/HDFS-8249
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8249.000.patch, HDFS-8249.001.patch, 
> HDFS-8249.002.patch, HDFS-8249.003.patch
>
>
> The constants in {{HdfsConstants}} are used by both the client side and the 
> server side. There are three types of constants in the class:
> 1. Constants that are used internally by the servers and are not part of the 
> APIs. These constants are free to evolve without breaking compatibility. For 
> example, {{MAX_PATH_LENGTH}} is used by the NN to enforce that the length of 
> a path does not grow too long. Developers are free to change the names of 
> these constants and to move them around if necessary.
> 2. Constants that are used by the clients but are not part of the APIs. For 
> example, {{QUOTA_DONT_SET}} represents an unlimited quota. The value is part 
> of the wire protocol but the name is not. Developers are free to rename the 
> constants but are not allowed to change their values.
> 3. Constants that are part of the APIs. For example, {{SafeModeAction}} is 
> used in {{DistributedFileSystem}}. Changing the name / value of the constant 
> will break binary compatibility, but not source code compatibility.
> This jira proposes to separate the above three types of constants into 
> different classes:
> * Creating a new class {{HdfsConstantsServer}} to hold the first type of 
> constants.
> * Moving {{HdfsConstants}} into the {{hdfs-client}} package. The work of 
> separating the second and the third types of constants will be postponed to 
> a separate jira.
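
A rough sketch of the proposed split (the nested class names follow the description, but the concrete constant values shown here are illustrative assumptions, not the real HDFS values):

```java
public class ConstantsSketch {
  // Type 1: server-internal constants, free to rename, move, or re-value.
  public static final class HdfsConstantsServer {
    public static final int MAX_PATH_LENGTH = 8000;  // illustrative value

    private HdfsConstantsServer() {}
  }

  // Types 2 and 3: client-visible constants; names may change, but
  // wire-visible values must stay fixed.
  public static final class HdfsClientConstants {
    public static final long QUOTA_DONT_SET = Long.MAX_VALUE;  // illustrative

    public enum SafeModeAction {
      SAFEMODE_LEAVE, SAFEMODE_ENTER, SAFEMODE_GET
    }

    private HdfsClientConstants() {}
  }
}
```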





[jira] [Commented] (HDFS-7281) Missing block is marked as corrupted block

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522507#comment-14522507
 ] 

Hadoop QA commented on HDFS-7281:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 25s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 28s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 29s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   3m 58s | The applied patch generated  1 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  9s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 225m  9s | Tests failed in hadoop-hdfs. |
| | | 269m 26s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestClose |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729274/HDFS-7281-6.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e2e8f77 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10485/artifact/patchprocess/checkstyle-result-diff.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10485/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10485/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10485/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10485/console |


This message was automatically generated.

> Missing block is marked as corrupted block
> --
>
> Key: HDFS-7281
> URL: https://issues.apache.org/jira/browse/HDFS-7281
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: supportability
> Attachments: HDFS-7281-2.patch, HDFS-7281-3.patch, HDFS-7281-4.patch, 
> HDFS-7281-5.patch, HDFS-7281-6.patch, HDFS-7281.patch
>
>
> In the situation where the block lost all its replicas, fsck shows the block 
> is missing as well as corrupted. Perhaps it is better not to mark the block 
> corrupted in this case. The reason it is marked as corrupted is 
> numCorruptNodes == numNodes == 0 in the following code.
> {noformat}
> BlockManager
> final boolean isCorrupt = numCorruptNodes == numNodes;
> {noformat}
> Would like to clarify if it is the intent to mark missing block as corrupted 
> or it is just a bug.
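
A tiny illustration of the condition quoted above (variable names hypothetical): when a block has lost every replica, numCorruptNodes == numNodes == 0, so the same block ends up flagged corrupt as well as missing.

```java
// Minimal demo of the quoted BlockManager condition. With zero replicas,
// numCorruptNodes == numNodes trivially holds, marking the block corrupt.
public class IsCorruptDemo {
    public static void main(String[] args) {
        int numNodes = 0;          // block lost every replica
        int numCorruptNodes = 0;   // none of the (zero) replicas is corrupt
        boolean isCorrupt = numCorruptNodes == numNodes;
        System.out.println("missing=" + (numNodes == 0) + " corrupt=" + isCorrupt);
    }
}
```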





[jira] [Updated] (HDFS-8134) Using OpenJDK on HDFS

2015-04-30 Thread Yingqi Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingqi Lu updated HDFS-8134:

Attachment: pic2.png
pic1.png

Hi All,

Here are the most recent data on OpenJDK with HDFS. 

Purpose of the study: 
The goal is to show that OpenJDK is enterprise-ready from a performance point of 
view.

Configuration:
•   Intel Xeon® E5-2699 V3 (2 X 18Core 2.3Ghz) CPUs
•   BIOS  Version: SE5C610.86B.01.01.0008.021120151325 (release date 
02/11/2015)
•   All BIOS settings are kept default (HT enabled, Turbo enabled, Power 
features enabled)
•   Memory: 16 X 16GB DDR4 2133MHz, 2 Dimms per channel
•   Storage: OS is installed on a 120GB SSD. The HDFS and tmp directories are 
located on 1 PCIe SSD drive (1 X Intel® SSD DC P3700 Series, 1/2 Height PCIe 
3.0, 20nm, MLC)
•   OS: CentOS 7 kernel version 3.10.0-123.el7.x86_64
•   Hadoop: 3.0.0-SNAPSHOT (commit 
867d5d2675b8fb73c40fac1e581b02b005459d95, dated 04/02/2015), single node cluster
•   Java: Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode) 
vs. OpenJDK 64-Bit Server VM (build 25.40-b25, mixed mode). Two GC methods are 
checked – G1GC and ParallelOldGC.
•   Workload: dfsioe inside HiBench suite. Data size is 128M. Read and 
write operations are included in the performance data study.

Data summary:
1.  With a 128M file size and 1000 files, both read and write operations from 
dfsioe show similar performance (throughput) between HotSpot Java and OpenJDK 
(the 5% performance difference is within the workload's run-to-run variance). 
2.  We also tested two GC methods – G1GC and ParallelOldGC. Both show 
similar performance for this specific workload.
3.  The single-node cluster runs at 95%+ CPU utilization for both read and 
write operations.

Performance charts are attached here. Please let me know if you have any 
questions and comments.

Thanks,
Yingqi Lu


> Using OpenJDK on HDFS
> -
>
> Key: HDFS-8134
> URL: https://issues.apache.org/jira/browse/HDFS-8134
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: benchmarks, performance
> Environment: CentOS7, OpenJDK8 update 40, Oracle JDK8 update 40
>Reporter: Yingqi Lu
>Assignee: Yingqi Lu
> Attachments: pic1.png, pic2.png
>
>
> Dear All,
> We would like to start the effort of certifying OpenJDK with HDFS. The effort 
> includes compiling HDFS source code with OpenJDK and reporting issues if 
> there is any, and completing performance study and comparing all the results 
> with Oracle JDK. The workload we will start with is DFSIOe which is part of 
> the HiBench suite. We can surely add more workloads such as Teragen and etc. 
> into our testing environment if there is any interest from this community. 
> This is our first time to work on this community. Please do let us know your 
> feedback and comments. If you all like the idea and this is the right place 
> to start the effort, we will be sending out the data soon!
> Thanks,
> Yingqi





[jira] [Commented] (HDFS-8249) Separate HdfsConstants into the client and the server side class

2015-04-30 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522486#comment-14522486
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8249:
---

There is already an HdfsServerConstants class.  How about merging the new class 
HdfsConstantsServer into it?

> Separate HdfsConstants into the client and the server side class
> 
>
> Key: HDFS-8249
> URL: https://issues.apache.org/jira/browse/HDFS-8249
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8249.000.patch, HDFS-8249.001.patch, 
> HDFS-8249.002.patch
>
>
> The constants in {{HdfsConstants}} are used by both the client side and the 
> server side. There are two types of constants in the class:
> 1. Constants that are used internally by the servers or not part of the APIs. 
> These constants are free to evolve without breaking compatibilities. For 
> example, {{MAX_PATH_LENGTH}} is used by the NN to enforce the length of the 
> path does not go too long. Developers are free to change the name of the 
> constants and to move it around if necessary.
> 1. Constants that are used by the clients, but not parts of the APIs. For 
> example, {{QUOTA_DONT_SET}} represents an unlimited quota. The value is part 
> of the wire protocol but the name is not. Developers are free to rename the 
> constants but are not allowed to change the value of the constants.
> 1. Constants that are parts of the APIs. For example, {{SafeModeAction}} is 
> used in {{DistributedFileSystem}}. Changing the name / value of the constant 
> will break binary compatibility, but not source code compatibility.
> This jira proposes to separate the above three types of constants into 
> different classes:
> * Creating a new class {{HdfsConstantsServer}} to hold the first type of 
> constants.
> * Move {{HdfsConstants}} into the {{hdfs-client}} package. The work of 
> separating the second and the third types of constants will be postponed in a 
> separate jira.





[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements

2015-04-30 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522480#comment-14522480
 ] 

Colin Patrick McCabe commented on HDFS-7836:


Hi [~xinwei],

The discussion on March 11th was focused on our proposal for off-heaping and 
parallelizing the block manager from February 24th.  We spent a lot of time 
going through the proposal and responding to questions on the proposal.

There was widespread agreement that we needed to reduce the garbage collection 
impact of the millions of BlockInfoContiguous structures.  There was some 
disagreement about how to do that.  Daryn argued that using large primitive 
arrays was the best way to go.  Charles and I argued that using off-heap 
storage was better.

The main advantage of large primitive arrays is that it makes the existing Java 
\-Xmx memory settings work as expected.  The main advantage of off-heap is that 
it allows the use of things like {{Unsafe#compareAndSwap}}, which can often 
lead to more efficient concurrent data structures.  Also, when using off-heap 
memory, we get to re-use malloc rather than essentially writing our own malloc 
for every subsystem.

There was some hand-wringing about off-heap memory being slower, but I do not 
believe that this is valid.  Apache Spark has found that their off-heap hash 
table was actually faster than the on-heap one, due to the ability to better 
control the memory layout.  
https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html
  The key is to avoid using {{DirectByteBuffer}}, which is rather slow, and use 
{{Unsafe}} instead.

However, Daryn has posted some patches using the "large arrays" approach.  
Since they are a nice incremental improvement, we are probably going to pick 
them up if there are no blockers.  We are also looking at incremental 
improvements such as implementing backpressure for full block reports, and 
speeding up edit log replay (if possible).  I would also like to look at 
parallelizing the full block report... if we can do that, we can get a dramatic 
improvement in FBR times by using more than 1 core.
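
For reference, the "large primitive arrays" direction can be sketched in a few lines. This is a hypothetical layout for illustration only, not the actual patch: instead of allocating one BlockInfoContiguous object per block, per-block fields are packed into a single long[], so the GC tracks one large array rather than millions of small objects.

```java
// Hypothetical packed layout: three long fields per block, stored flat.
// The GC sees exactly one array object regardless of the number of blocks.
public class PackedBlockMap {
    private static final int FIELDS = 3;   // blockId, genStamp, numBytes
    private final long[] slots;

    PackedBlockMap(int capacity) { slots = new long[capacity * FIELDS]; }

    void put(int i, long blockId, long genStamp, long numBytes) {
        int base = i * FIELDS;
        slots[base] = blockId;
        slots[base + 1] = genStamp;
        slots[base + 2] = numBytes;
    }

    long blockId(int i)  { return slots[i * FIELDS]; }
    long numBytes(int i) { return slots[i * FIELDS + 2]; }

    public static void main(String[] args) {
        PackedBlockMap map = new PackedBlockMap(1_000_000);
        map.put(0, 1570321882L, 1102029183867L, 3260848L);
        System.out.println(map.blockId(0) + " " + map.numBytes(0));
    }
}
```

The off-heap alternative keeps the same flat layout but allocates it outside the Java heap, which is what enables tricks like {{Unsafe#compareAndSwap}} on raw addresses.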

> BlockManager Scalability Improvements
> -
>
> Key: HDFS-7836
> URL: https://issues.apache.org/jira/browse/HDFS-7836
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: BlockManagerScalabilityImprovementsDesign.pdf
>
>
> Improvements to BlockManager scalability.





[jira] [Assigned] (HDFS-4176) EditLogTailer should call rollEdits with a timeout

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang reassigned HDFS-4176:
---

Assignee: Zhe Zhang

> EditLogTailer should call rollEdits with a timeout
> --
>
> Key: HDFS-4176
> URL: https://issues.apache.org/jira/browse/HDFS-4176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Zhe Zhang
> Attachments: namenode.jstack4
>
>
> When the EditLogTailer thread calls rollEdits() on the active NN via RPC, it 
> currently does so without a timeout. So, if the active NN has frozen (but not 
> actually crashed), this call can hang forever. This can then potentially 
> prevent the standby from becoming active.
> This may actually considered a side effect of HADOOP-6762 -- if the RPC were 
> interruptible, that would also fix the issue.
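
The fix direction described above can be sketched as follows. Names and timeouts here are illustrative, not the actual patch: the blocking rollEdits() RPC is wrapped in a Future and bounded with a timeout so a frozen active NN cannot hang the tailer thread forever.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: run the RPC on a worker thread and give up after a deadline.
public class TimedRollEditsDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService rpcPool = Executors.newSingleThreadExecutor();
        Future<Void> call = rpcPool.submit(() -> {
            Thread.sleep(10_000);   // stand-in for an RPC to a frozen NN
            return null;
        });
        try {
            call.get(200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            call.cancel(true);      // abandon the call instead of blocking
            System.out.println("rollEdits timed out");
        }
        rpcPool.shutdownNow();
    }
}
```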





[jira] [Created] (HDFS-8308) Erasure Coding: NameNode may get blocked in waitForLoadingFSImage() when loading editlog

2015-04-30 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-8308:
---

 Summary: Erasure Coding: NameNode may get blocked in 
waitForLoadingFSImage() when loading editlog
 Key: HDFS-8308
 URL: https://issues.apache.org/jira/browse/HDFS-8308
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao


If the editlog contains a transaction for creating an EC file, the NN will get 
blocked in {{waitForLoadingFSImage}} because of the following call path:

FSDirectory#addFileForEditLog --> FSDirectory#isInECZone --> 
FSDirectory#getECSchema --> ECZoneManager#getECSchema --> 
ECZoneManager#getECZoneInfo --> FSNamesystem#getSchema --> waitForLoadingFSImage

This jira plans to fix this bug and also do some code cleanup.
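
The self-wait can be illustrated in miniature (all names hypothetical): the thread replaying the edit log ends up calling a helper that waits for image loading to finish, but loading only finishes once that same replay returns.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Minimal demo of the deadlock shape. A bounded wait is used here so the
// demo terminates instead of hanging like the real bug would.
public class SelfWaitDemo {
    static final CountDownLatch imageLoaded = new CountDownLatch(1);

    // Stand-in for FSNamesystem#waitForLoadingFSImage.
    static boolean waitForLoadingFSImage() throws InterruptedException {
        return imageLoaded.await(200, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        // The replay thread reaches the schema lookup, which ends in the wait:
        boolean proceeded = waitForLoadingFSImage();
        System.out.println(proceeded ? "proceeded" : "would block forever");
        imageLoaded.countDown();   // only reached after replay returns
    }
}
```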





[jira] [Commented] (HDFS-8299) HDFS reporting missing blocks when they are actually present due to read-only filesystem

2015-04-30 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522450#comment-14522450
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8299:
---

{code}
2015-04-30 14:11:08,235 WARN  datanode.DataNode 
(DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir 
/archive1/dn : 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not 
writable: /archive1/dn
{code}
Since the datanode dir was considered invalid, the datanode did not add the dir 
to its block map, so none of the blocks under that dir will be reported to the NN.

> HDFS reporting missing blocks when they are actually present due to read-only 
> filesystem
> 
>
> Key: HDFS-8299
> URL: https://issues.apache.org/jira/browse/HDFS-8299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Critical
> Attachments: datanode.log
>
>
> Fsck shows missing blocks when the blocks can be found on a datanode's 
> filesystem and the datanode has been restarted to try to get it to recognize 
> that the blocks are indeed present and hence report them to the NameNode in a 
> block report.
> Fsck output showing an example "missing" block:
> {code}/apps/hive/warehouse/.db/someTable/00_0: CORRUPT 
> blockpool BP-120244285--1417023863606 block blk_1075202330
>  MISSING 1 blocks of total size 3260848 B
> 0. BP-120244285--1417023863606:blk_1075202330_1484191 len=3260848 
> MISSING!{code}
> The block is definitely present on more than one datanode however, here is 
> the output from one of them that I restarted to try to get it to report the 
> block to the NameNode:
> {code}# ll 
> /archive1/dn/current/BP-120244285--1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
> -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 
> /archive1/dn/current/BP-120244285--1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
> -rw-r--r-- 1 hdfs 499   25483 Apr 27 15:02 
> /archive1/dn/current/BP-120244285--1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code}
> It's worth noting that this is on HDFS tiered storage on an archive tier 
> going to a networked block device that may have become temporarily 
> unavailable but is available now. See also feature request HDFS-8297 for 
> online rescan to not have to go around restarting datanodes.
> It turns out in the datanode log (that I am attaching) this is because the 
> datanode fails to get a write lock on the filesystem. I think it would be 
> better to serve those blocks read-only, however, since the current behavior 
> causes client-visible data unavailability when the data could in fact be read.
> {code}2015-04-30 14:11:08,235 WARN  datanode.DataNode 
> (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir 
> /archive1/dn :
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not 
> writable: /archive1/dn
> at 
> org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
> at 
> org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
> at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
> at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
> {code}





[jira] [Updated] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8300:
--
 Description: 
- findbugs warning
Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an 
Exception, even though it is named as such

- unit test failures
see https://builds.apache.org/job/PreCommit-HDFS-Build/10455/testReport/

These bugs somehow were not reported in [the Jenkins 
run|https://issues.apache.org/jira/browse/HDFS-8283?focusedCommentId=14518736&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14518736]
 previously.
Hadoop Flags: Reviewed

+1 patch looks good.  Thanks a lot for fixing my bugs!

> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8300.000.patch
>
>
> - findbugs warning
> Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from 
> an Exception, even though it is named as such
> - unit test failures
> see https://builds.apache.org/job/PreCommit-HDFS-Build/10455/testReport/
> These bugs somehow were not reported in [the Jenkins 
> run|https://issues.apache.org/jira/browse/HDFS-8283?focusedCommentId=14518736&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14518736]
>  previously.





[jira] [Updated] (HDFS-8281) Erasure Coding: implement parallel stateful reading for striped layout

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8281:

Attachment: HDFS-8281-HDFS-7285.001.patch

Thanks Zhe! Rebased the patch.

> Erasure Coding: implement parallel stateful reading for striped layout
> --
>
> Key: HDFS-8281
> URL: https://issues.apache.org/jira/browse/HDFS-8281
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8281-HDFS-7285.001.patch, 
> HDFS-8281-HDFS-7285.001.patch, HDFS-8281.000.patch
>
>
> This jira aims to support parallel reading for stateful read in 
> {{DFSStripedInputStream}}.





[jira] [Updated] (HDFS-8245) Standby namenode doesn't process DELETED_BLOCK if the add block request is in edit log.

2015-04-30 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-8245:
-
Status: Patch Available  (was: Open)

> Standby namenode doesn't process DELETED_BLOCK if the add block request is in 
> edit log.
> ---
>
> Key: HDFS-8245
> URL: https://issues.apache.org/jira/browse/HDFS-8245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-8245.patch
>
>
> The following series of events happened on Standby namenode :
> 2015-04-09 07:47:21,735 \[Edit log tailer] INFO ha.EditLogTailer: Triggering 
> log roll on remote NameNode Active Namenode (ANN)
> 2015-04-09 07:58:01,858 \[Edit log tailer] INFO ha.EditLogTailer: Triggering 
> log roll on remote NameNode ANN
> The following series of events happened on Active Namenode:,
> 2015-04-09 07:47:21,747 \[IPC Server handler 99 on 8020] INFO 
> namenode.FSNamesystem: Roll Edit Log from Standby NN (SNN)
> 2015-04-09 07:58:01,868 \[IPC Server handler 18 on 8020] INFO 
> namenode.FSNamesystem: Roll Edit Log from SNN
> The following series of events happened on datanode ( {color:red} datanodeA 
> {color}):
> 2015-04-09 07:52:15,817 \[DataXceiver for client 
> DFSClient_attempt_1428022041757_102831_r_000107_0_1139131345_1 at /:51078 
> \[Receiving block 
> BP-595383232--1360869396230:blk_1570321882_1102029183867]] INFO 
> datanode.DataNode: Receiving 
> BP-595383232--1360869396230:blk_1570321882_1102029183867 src: 
> /client:51078 dest: /{color:red}datanodeA:1004{color}
> 2015-04-09 07:52:15,969 \[PacketResponder: 
> BP-595383232--1360869396230:blk_1570321882_1102029183867, 
> type=HAS_DOWNSTREAM_IN_PIPELINE] INFO DataNode.clienttrace: src: 
> /client:51078, dest: /{color:red}datanodeA:1004{color}, bytes: 20, op: 
> HDFS_WRITE, cliID: 
> DFSClient_attempt_1428022041757_102831_r_000107_0_1139131345_1, offset: 0, 
> srvID: 356a8a98-826f-446d-8f4c-ce288c1f0a75, blockid: 
> BP-595383232--1360869396230:blk_1570321882_1102029183867, duration: 
> 148948385
> 2015-04-09 07:52:15,969 \[PacketResponder: 
> BP-595383232--1360869396230:blk_1570321882_1102029183867, 
> type=HAS_DOWNSTREAM_IN_PIPELINE] INFO datanode.DataNode: PacketResponder: 
> BP-595383232--1360869396230:blk_1570321882_1102029183867, 
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating
> 2015-04-09 07:52:25,970 \[DataXceiver for client /{color:red} 
> {color}:52827 \[Copying block 
> BP-595383232--1360869396230:blk_1570321882_1102029183867]] INFO 
> datanode.DataNode: Copied 
> BP-595383232--1360869396230:blk_1570321882_1102029183867 to 
> <{color:red}datanodeB{color}>:52827
> 2015-04-09 07:52:28,187 \[DataNode:   heartbeating to ANN:8020] INFO 
> impl.FsDatasetAsyncDiskService: Scheduling blk_1570321882_1102029183867 file 
> /blk_1570321882 for deletion
> 2015-04-09 07:52:28,188 \[Async disk worker #1482 for volume ] INFO 
> impl.FsDatasetAsyncDiskService: Deleted BP-595383232--1360869396230 
> blk_1570321882_1102029183867 file /blk_1570321882
> Then we failed over for an upgrade, and the standby became active.
> When we ran an ls command on this file, we got the following exception:
> 15/04/09 22:07:39 WARN hdfs.BlockReaderFactory: I/O error constructing remote 
> block reader.
> java.io.IOException: Got error for OP_READ_BLOCK, self=/client:32947, 
> remote={color:red}datanodeA:1004{color}, for file , for pool 
> BP-595383232--1360869396230 block 1570321882_1102029183867
> at 
> org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:445)
> at 
> org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:410)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:815)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:693)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:351)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> at 
> org.apache.hadoop.fs.shell.CopyCommands$Merge.processArguments(CopyCommands.java:97)
> at 
> org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
>   

[jira] [Updated] (HDFS-8245) Standby namenode doesn't process DELETED_BLOCK if the add block request is in edit log.

2015-04-30 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-8245:
-
Attachment: HDFS-8245.patch

I have just added a check in BlockManager#removeStoredBlock to add the block to 
pendingDNMessages if the namenode is in standby.

The change causes one test case, TestDNFencing#testDnFencing, to fail.
The test case was checking postponedMisreplicatedBlocksCount after the 
failover.
Since the change adds the deleted block to the pending messages queue, and the 
transition to active dequeues the deleted-block request, 
postponedMisreplicatedBlocksCount will be zero after the failover.
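
The behavior can be sketched in a self-contained way (class and method names hypothetical, not the actual patch): while the NN is in standby, a DELETED_BLOCK report is queued rather than applied, and the queue is drained on the transition to active.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Minimal sketch of "defer deletions while standby, replay on failover".
public class StandbyDeleteQueueDemo {
    private boolean standby = true;
    private final Queue<Long> pendingDeletes = new ArrayDeque<>();
    private final Set<Long> blockMap = new HashSet<>();

    void addBlock(long id) { blockMap.add(id); }

    void reportDeleted(long id) {
        if (standby) { pendingDeletes.add(id); }   // defer until failover
        else { blockMap.remove(id); }
    }

    void transitionToActive() {
        standby = false;
        while (!pendingDeletes.isEmpty()) { blockMap.remove(pendingDeletes.poll()); }
    }

    public static void main(String[] args) {
        StandbyDeleteQueueDemo nn = new StandbyDeleteQueueDemo();
        nn.addBlock(1570321882L);
        nn.reportDeleted(1570321882L);   // standby: queued, not applied
        System.out.println(nn.blockMap.contains(1570321882L));
        nn.transitionToActive();
        System.out.println(nn.blockMap.contains(1570321882L));
    }
}
```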

> Standby namenode doesn't process DELETED_BLOCK if the add block request is in 
> edit log.
> ---
>
> Key: HDFS-8245
> URL: https://issues.apache.org/jira/browse/HDFS-8245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-8245.patch
>
>
> The following series of events happened on Standby namenode :
> 2015-04-09 07:47:21,735 \[Edit log tailer] INFO ha.EditLogTailer: Triggering 
> log roll on remote NameNode Active Namenode (ANN)
> 2015-04-09 07:58:01,858 \[Edit log tailer] INFO ha.EditLogTailer: Triggering 
> log roll on remote NameNode ANN
> The following series of events happened on Active Namenode:,
> 2015-04-09 07:47:21,747 \[IPC Server handler 99 on 8020] INFO 
> namenode.FSNamesystem: Roll Edit Log from Standby NN (SNN)
> 2015-04-09 07:58:01,868 \[IPC Server handler 18 on 8020] INFO 
> namenode.FSNamesystem: Roll Edit Log from SNN
> The following series of events happened on datanode ( {color:red} datanodeA 
> {color}):
> 2015-04-09 07:52:15,817 \[DataXceiver for client 
> DFSClient_attempt_1428022041757_102831_r_000107_0_1139131345_1 at /:51078 
> \[Receiving block 
> BP-595383232--1360869396230:blk_1570321882_1102029183867]] INFO 
> datanode.DataNode: Receiving 
> BP-595383232--1360869396230:blk_1570321882_1102029183867 src: 
> /client:51078 dest: /{color:red}datanodeA:1004{color}
> 2015-04-09 07:52:15,969 \[PacketResponder: 
> BP-595383232--1360869396230:blk_1570321882_1102029183867, 
> type=HAS_DOWNSTREAM_IN_PIPELINE] INFO DataNode.clienttrace: src: 
> /client:51078, dest: /{color:red}datanodeA:1004{color}, bytes: 20, op: 
> HDFS_WRITE, cliID: 
> DFSClient_attempt_1428022041757_102831_r_000107_0_1139131345_1, offset: 0, 
> srvID: 356a8a98-826f-446d-8f4c-ce288c1f0a75, blockid: 
> BP-595383232--1360869396230:blk_1570321882_1102029183867, duration: 
> 148948385
> 2015-04-09 07:52:15,969 \[PacketResponder: 
> BP-595383232--1360869396230:blk_1570321882_1102029183867, 
> type=HAS_DOWNSTREAM_IN_PIPELINE] INFO datanode.DataNode: PacketResponder: 
> BP-595383232--1360869396230:blk_1570321882_1102029183867, 
> type=HAS_DOWNSTREAM_IN_PIPELINE terminating
> 2015-04-09 07:52:25,970 \[DataXceiver for client /{color:red} 
> {color}:52827 \[Copying block 
> BP-595383232--1360869396230:blk_1570321882_1102029183867]] INFO 
> datanode.DataNode: Copied 
> BP-595383232--1360869396230:blk_1570321882_1102029183867 to 
> <{color:red}datanodeB{color}>:52827
> 2015-04-09 07:52:28,187 \[DataNode:   heartbeating to ANN:8020] INFO 
> impl.FsDatasetAsyncDiskService: Scheduling blk_1570321882_1102029183867 file 
> /blk_1570321882 for deletion
> 2015-04-09 07:52:28,188 \[Async disk worker #1482 for volume ] INFO 
> impl.FsDatasetAsyncDiskService: Deleted BP-595383232--1360869396230 
> blk_1570321882_1102029183867 file /blk_1570321882
> Then we failed over for an upgrade, and the standby became active.
> When we ran an ls command on this file, we got the following exception:
> 15/04/09 22:07:39 WARN hdfs.BlockReaderFactory: I/O error constructing remote 
> block reader.
> java.io.IOException: Got error for OP_READ_BLOCK, self=/client:32947, 
> remote={color:red}datanodeA:1004{color}, for file , for pool 
> BP-595383232--1360869396230 block 1570321882_1102029183867
> at 
> org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:445)
> at 
> org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:410)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:815)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:693)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:351)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInput

[jira] [Commented] (HDFS-8281) Erasure Coding: implement parallel stateful reading for striped layout

2015-04-30 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522367#comment-14522367
 ] 

Zhe Zhang commented on HDFS-8281:
-

Thanks Jing for the update! I'm reviewing now. A quick note is that it needs a 
rebase -- likely because of the HDFS-8282 refactor.

> Erasure Coding: implement parallel stateful reading for striped layout
> --
>
> Key: HDFS-8281
> URL: https://issues.apache.org/jira/browse/HDFS-8281
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8281-HDFS-7285.001.patch, HDFS-8281.000.patch
>
>
> This jira aims to support parallel reading for stateful read in 
> {{DFSStripedInputStream}}.





[jira] [Updated] (HDFS-8178) QJM doesn't move aside stale inprogress edits files

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8178:

Attachment: HDFS-8178.003.patch

Combined with HDFS-8303 patch.

> QJM doesn't move aside stale inprogress edits files
> ---
>
> Key: HDFS-8178
> URL: https://issues.apache.org/jira/browse/HDFS-8178
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8178.000.patch, HDFS-8178.002.patch, 
> HDFS-8178.003.patch
>
>
> When a QJM crashes, the in-progress edit log file at that time remains in the 
> file system. When the node comes back, it will accept new edit logs and those 
> stale in-progress files are never cleaned up. QJM treats them as regular 
> in-progress edit log files and tries to finalize them, which potentially 
> causes high memory usage. This JIRA aims to move aside those stale edit log 
> files to avoid this scenario.
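
A rough sketch of the "move aside" idea from the description (naming scheme hypothetical): on restart, rename any leftover in-progress segment instead of treating it as a live segment to finalize.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Demo: leftover edits_inprogress_* files are renamed with a .stale suffix
// so later code never tries to finalize them.
public class MoveAsideStaleDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("edits");
        Files.createFile(dir.resolve("edits_inprogress_0000000000000000042"));

        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(dir, "edits_inprogress_*")) {
            for (Path p : stream) {
                Files.move(p, p.resolveSibling(p.getFileName() + ".stale"));
            }
        }
        System.out.println(
            Files.exists(dir.resolve("edits_inprogress_0000000000000000042.stale")));
    }
}
```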



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8178) QJM doesn't move aside stale inprogress edits files

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8178:

Attachment: HDFS-8178.002.patch

We can also combine this patch with HDFS-8303 and close one of them as a 
duplicate.
> QJM doesn't move aside stale inprogress edits files
> ---
>
> Key: HDFS-8178
> URL: https://issues.apache.org/jira/browse/HDFS-8178
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8178.000.patch, HDFS-8178.002.patch
>
>
> When a QJM crashes, the in-progress edit log file at that time remains in the 
> file system. When the node comes back, it will accept new edit logs and those 
> stale in-progress files are never cleaned up. QJM treats them as regular 
> in-progress edit log files and tries to finalize them, which potentially 
> causes high memory usage. This JIRA aims to move aside those stale edit log 
> files to avoid this scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8281) Erasure Coding: implement parallel stateful reading for striped layout

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8281:

Attachment: HDFS-8281-HDFS-7285.001.patch

Thanks for the review, Nicholas! I've updated the patch to address your 
comments. The new patch also fixes several bugs in offset calculation.

> Erasure Coding: implement parallel stateful reading for striped layout
> --
>
> Key: HDFS-8281
> URL: https://issues.apache.org/jira/browse/HDFS-8281
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8281-HDFS-7285.001.patch, HDFS-8281.000.patch
>
>
> This jira aims to support parallel reading for stateful read in 
> {{DFSStripedInputStream}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8200) Refactor FSDirStatAndListingOp

2015-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522324#comment-14522324
 ] 

Hudson commented on HDFS-8200:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7708 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7708/])
HDFS-8200. Refactor FSDirStatAndListingOp. Contributed by Haohui Mai. (wheat9: 
rev c55d609053fe24b3a50fbe17dc1b47717b453ed6)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirStatAndListingOp.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java


> Refactor FSDirStatAndListingOp
> --
>
> Key: HDFS-8200
> URL: https://issues.apache.org/jira/browse/HDFS-8200
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-8200.000.patch, HDFS-8200.001.patch
>
>
> After HDFS-6826 several functions in {{FSDirStatAndListingOp}} are dead. This 
> jira proposes to clean them up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Attachment: HDFS-8303.1.patch

The code below is actually needed to avoid purging the current in-progress file:
{code}
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
{code}
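This guard can be illustrated with a minimal sketch. The class below is a simplified stand-in for FJM's {{EditLogFile}}, not the actual HDFS code; treating an in-progress segment's unknown end txID as {{Long.MAX_VALUE}} is an assumption made here purely for illustration:

```java
public class PurgeCheck {
    // Assumption for this sketch: an in-progress segment has no end txid
    // yet, so we model its last txid as Long.MAX_VALUE.
    static final long IN_PROGRESS = Long.MAX_VALUE;

    // Purge only when BOTH bounds are old. The lastTxId check is what
    // keeps the current in-progress segment from being purged.
    static boolean shouldPurge(long firstTxId, long lastTxId, long minTxIdToKeep) {
        return firstTxId < minTxIdToKeep && lastTxId < minTxIdToKeep;
    }

    public static void main(String[] args) {
        assert shouldPurge(1, 100, 500);            // fully old, finalized: purge
        assert !shouldPurge(400, IN_PROGRESS, 500); // current in-progress: keep
    }
}
```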

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch, HDFS-8303.1.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # Different regexes for matching empty/corrupt in-progress files. The FJM 
> pattern makes more sense to me.
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough. This doesn't make sense because the end txID is never 
> smaller than the start txID.
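The first difference above can be sketched as follows. This mirrors FJM's match order (a plain in-progress name is checked before the stale pattern), but uses standalone patterns rather than the actual {{NameNodeFile}} constants:

```java
import java.util.regex.Pattern;

public class EditLogNameMatch {
    // Patterns copied from the FJM and QJM snippets quoted above
    // ("edits_inprogress" stands in for NameNodeFile.EDITS_INPROGRESS).
    static final Pattern INPROGRESS = Pattern.compile("edits_inprogress_(\\d+)");
    static final Pattern STALE = Pattern.compile("edits_inprogress_(\\d+).*(\\S+)");
    static final Pattern QJM_INPROGRESS =
        Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?");

    // Mirror FJM's order: a name is stale only if it is not a plain
    // in-progress file but still matches the stale pattern.
    static boolean isStale(String name) {
        return !INPROGRESS.matcher(name).matches()
            && STALE.matcher(name).matches();
    }

    public static void main(String[] args) {
        assert !isStale("edits_inprogress_0000042");
        assert isStale("edits_inprogress_0000042.stale");
        // The single QJM pattern matches both forms.
        assert QJM_INPROGRESS.matcher("edits_inprogress_0000042").matches();
        assert QJM_INPROGRESS.matcher("edits_inprogress_0000042.stale").matches();
    }
}
```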



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8229) LAZY_PERSIST file gets deleted after NameNode restart.

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522306#comment-14522306
 ] 

Hadoop QA commented on HDFS-8229:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 25s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   5m 28s | The applied patch generated  1 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  7s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 233m 19s | Tests failed in hadoop-hdfs. |
| | | 279m  6s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.server.namenode.TestSaveNamespace |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729442/HDFS-8229_2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / de9404f |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10482/artifact/patchprocess/checkstyle-result-diff.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10482/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10482/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10482/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10482/console |


This message was automatically generated.

> LAZY_PERSIST file gets deleted after NameNode restart.
> --
>
> Key: HDFS-8229
> URL: https://issues.apache.org/jira/browse/HDFS-8229
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: HDFS
>Affects Versions: 2.6.0
>Reporter: surendra singh lilhore
>Assignee: surendra singh lilhore
> Attachments: HDFS-8229.patch, HDFS-8229_1.patch, HDFS-8229_2.patch
>
>
> {code}
> 2015-04-20 10:26:55,180 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Removing lazyPersist 
> file /LAZY_PERSIST/smallfile with no replicas.
> {code}
> After a namenode restart and before DN registration, if 
> {{LazyPersistFileScrubber}} runs, it will delete the lazy-persist file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522293#comment-14522293
 ] 

Hadoop QA commented on HDFS-7678:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 59s | Pre-patch HDFS-7285 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 39s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 43s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   5m 38s | The applied patch generated  3 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 58s | The patch appears to introduce 
11 new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 19s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  85m 14s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 16s | Tests passed in 
hadoop-hdfs-client. |
| | | 133m 12s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Inconsistent synchronization of 
org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time  
Unsynchronized access at DFSOutputStream.java:90% of time  Unsynchronized 
access at DFSOutputStream.java:[line 142] |
|  |  Switch statement found in 
org.apache.hadoop.hdfs.DFSStripedInputStream.fetchBlockByteRange(long, long, 
long, byte[], int, Map) where default case is missing  At 
DFSStripedInputStream.java:long, long, byte[], int, Map) where default case is 
missing  At DFSStripedInputStream.java:[lines 471-494] |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
|  |  Dead store to offSuccess in 
org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()  At 
StripedDataStreamer.java:org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()  
At StripedDataStreamer.java:[line 105] |
|  |  Result of integer multiplication cast to long in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()  
At BlockInfoStriped.java:to long in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()  
At BlockInfoStriped.java:[line 208] |
|  |  Possible null pointer dereference of arr$ in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
  Dereferenced at BlockInfoStripedUnderConstruction.java:arr$ in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)
  Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
|  |  Found reliance on default encoding in 
org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
 ECSchema):in 
org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String,
 ECSchema): String.getBytes()  At ErasureCodingZoneManager.java:[line 116] |
|  |  Found reliance on default encoding in 
org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):in
 
org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):
 new String(byte[])  At ErasureCodingZoneManager.java:[line 81] |
|  |  Result of integer multiplication cast to long in 
org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
 int, int, int, int)  At StripedBlockUtil.java:to long in 
org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
 int, int, int, int)  At StripedBlockUtil.java:[line 85] |
|  |  Result of integer multiplication cast to long in 
org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, 
int, int)  At StripedBlockUtil.java:to long in 
org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, 
int, int)  At StripedBlockUtil.java:[line 167] |
| FindBugs | module:hadoop-hdfs-client |
|  |  org.apache.hadoop.hdfs.protocol.LocatedStripedBlock.getBlockIndices() may 
expose internal representation by returning LocatedStripedBlock.blockIndices  
At LocatedStripedBlock.java:

[jira] [Updated] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs

2015-04-30 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-8306:

Status: Patch Available  (was: Open)

> Generate ACL and Xattr outputs in OIV XML outputs
> -
>
> Key: HDFS-8306
> URL: https://issues.apache.org/jira/browse/HDFS-8306
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-8306.000.patch
>
>
> Currently, the {{hdfs oiv}} XML output does not include all fields of the 
> fsimage. This makes inspecting the {{fsimage}} via its XML output less 
> practical, and it prevents recovering an fsimage from the XML file.
> This JIRA adds ACLs and XAttrs to the XML output as the first step toward 
> the goal described in HDFS-8061.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs

2015-04-30 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-8306:

Attachment: HDFS-8306.000.patch

This patch generates ACLs and XAttrs in the OIV XML output and verifies that 
output in the test. XAttr values are base64 encoded.
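For reference, the base64 handling can be sketched as follows. This is a minimal illustration using the JDK codec; the class and method names are hypothetical, not the patch's actual code:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class XAttrBase64 {
    // Assumption for this sketch: the XML output carries the raw xattr
    // bytes as plain base64 text.
    static String encode(byte[] value) {
        return Base64.getEncoder().encodeToString(value);
    }

    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        String xml = encode("value".getBytes(StandardCharsets.UTF_8));
        assert xml.equals("dmFsdWU=");
        // Round-trip recovers the original xattr value.
        assert new String(decode(xml), StandardCharsets.UTF_8).equals("value");
    }
}
```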

> Generate ACL and Xattr outputs in OIV XML outputs
> -
>
> Key: HDFS-8306
> URL: https://issues.apache.org/jira/browse/HDFS-8306
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-8306.000.patch
>
>
> Currently, the {{hdfs oiv}} XML output does not include all fields of the 
> fsimage. This makes inspecting the {{fsimage}} via its XML output less 
> practical, and it prevents recovering an fsimage from the XML file.
> This JIRA adds ACLs and XAttrs to the XML output as the first step toward 
> the goal described in HDFS-8061.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8307) Spurious DNS Queries from hdfs shell

2015-04-30 Thread Anu Engineer (JIRA)
Anu Engineer created HDFS-8307:
--

 Summary: Spurious DNS Queries from hdfs shell
 Key: HDFS-8307
 URL: https://issues.apache.org/jira/browse/HDFS-8307
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.7.1
Reporter: Anu Engineer
Priority: Trivial


With HA configured, the hdfs shell (org.apache.hadoop.fs.FsShell) seems to 
issue a DNS query for the cluster name. If fs.defaultFS is set to 
hdfs://mycluster, the shell issues a DNS query for mycluster.FQDN or mycluster.

Since mycluster is not a machine name, the DNS query always fails with 
"DNS 85 Standard query response 0x2aeb No such name"

Repro Steps:

# Set up an HA cluster 
# Log on to any node
# Run wireshark monitoring port 53 - "sudo tshark 'port 53'"
# Run "sudo -u hdfs hdfs dfs -ls /" 
# You should see DNS queries to mycluster.FQDN in wireshark
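The underlying issue can be sketched as follows: with HA, the authority of {{fs.defaultFS}} is a logical nameservice that should be matched against the configured nameservices instead of being resolved through DNS. The helper below is hypothetical, not the actual client code:

```java
import java.net.URI;
import java.util.Set;

public class LogicalUriCheck {
    // Hypothetical helper: in an HA setup the set of logical nameservices
    // comes from configuration (e.g. dfs.nameservices). A client should
    // skip DNS resolution for any authority found in this set.
    static boolean isLogicalUri(URI uri, Set<String> nameservices) {
        return uri.getHost() != null && nameservices.contains(uri.getHost());
    }

    public static void main(String[] args) {
        URI fs = URI.create("hdfs://mycluster/user/data");
        // "mycluster" is a logical nameservice, not a resolvable hostname.
        assert isLogicalUri(fs, Set.of("mycluster"));
        // A real NameNode address should still be resolved normally.
        assert !isLogicalUri(URI.create("hdfs://nn1.example.com:8020/"),
                             Set.of("mycluster"));
    }
}
```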




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8213) DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace

2015-04-30 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522264#comment-14522264
 ] 

stack commented on HDFS-8213:
-

+1 from me.

> DFSClient should use hdfs.client.htrace HTrace configuration prefix rather 
> than hadoop.htrace
> -
>
> Key: HDFS-8213
> URL: https://issues.apache.org/jira/browse/HDFS-8213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Billie Rinaldi
>Assignee: Colin Patrick McCabe
>Priority: Critical
> Attachments: HDFS-8213.001.patch, HDFS-8213.002.patch
>
>
> DFSClient initializing SpanReceivers is a problem for Accumulo, which manages 
> SpanReceivers through its own configuration.  This results in the same 
> receivers being registered multiple times and spans being delivered more than 
> once.  The documentation says SpanReceiverHost.getInstance should be issued 
> once per process, so there is no expectation that DFSClient should do this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Attachment: HDFS-8303.0.patch

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # Different regexes for matching empty/corrupt in-progress files. The FJM 
> pattern makes more sense to me.
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough. This doesn't make sense because the end txID is never 
> smaller than the start txID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Description: 
As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic of purging current dir is very similar to FJM purging 
logic.

QJM:
{code}
 private static final List CURRENT_DIR_PURGE_REGEXES =
  ImmutableList.of(
Pattern.compile("edits_\\d+-(\\d+)"),
Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
  long txid = Long.parseLong(matcher.group(1));
  if (txid < minTxIdToKeep) {
LOG.info("Purging no-longer needed file " + txid);
if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
  NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
List editLogs = matchEditLogs(files, true);
for (EditLogFile log : editLogs) {
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
purger.purgeLog(log);
  }
}
{code}

I can see 2 differences:
# Different regexes for matching empty/corrupt in-progress files. The FJM 
pattern makes more sense to me.
# FJM verifies that both the start and end txIDs of a finalized edit file are 
old enough. This doesn't make sense because the end txID is never smaller 
than the start txID.

  was:
As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic of purging current dir is very similar to FJM purging 
logic.

QJM:
{code}
 private static final List CURRENT_DIR_PURGE_REGEXES =
  ImmutableList.of(
Pattern.compile("edits_\\d+-(\\d+)"),
Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
  long txid = Long.parseLong(matcher.group(1));
  if (txid < minTxIdToKeep) {
LOG.info("Purging no-longer needed file " + txid);
if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
  NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
List editLogs = matchEditLogs(files, true);
for (EditLogFile log : editLogs) {
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
purger.purgeLog(log);
  }
}
{code}

I can see 2 differences:
# When matching for empty/corrupt in-progress files, QJM requires that the 
suffix doesn't have blank spaces. I think we should use the QJM regex and 
consider a file {{edits_inprogress_01.a bc}} as stale.
# FJM verifies that both start and end txID of a finalized edit file to be old 
enough. This doesn't make sense because end txID is always larger than start 
txID


> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>

[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Attachment: (was: HDFS-8303.0.patch)

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # Different regexes for matching empty/corrupt in-progress files. The FJM 
> pattern makes more sense to me.
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough. This doesn't make sense because the end txID is never 
> smaller than the start txID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8300:

Target Version/s: 2.8.0

> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8300.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8300:

Status: Patch Available  (was: Open)

> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8300.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8300:

Attachment: HDFS-8300.000.patch

> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8300.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8300:

Affects Version/s: 2.8.0

> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-8300.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8200) Refactor FSDirStatAndListingOp

2015-04-30 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8200:
-
   Resolution: Fixed
Fix Version/s: 2.8.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks Brandon for the reviews.

> Refactor FSDirStatAndListingOp
> --
>
> Key: HDFS-8200
> URL: https://issues.apache.org/jira/browse/HDFS-8200
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-8200.000.patch, HDFS-8200.001.patch
>
>
> After HDFS-6826 several functions in {{FSDirStatAndListingOp}} are dead. This 
> jira proposes to clean them up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs

2015-04-30 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-8306:
---

 Summary: Generate ACL and Xattr outputs in OIV XML outputs
 Key: HDFS-8306
 URL: https://issues.apache.org/jira/browse/HDFS-8306
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.7.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu
Priority: Minor


Currently, in the {{hdfs oiv}} XML output, not all fields of the fsimage are 
emitted. This makes inspecting an {{fsimage}} through its XML output less 
practical, and it prevents recovering an fsimage from the XML file.

This JIRA adds ACLs and XAttrs to the XML output as the first step toward the 
goal described in HDFS-8061.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name

2015-04-30 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8305:
---
Attachment: HDFS-8305.001.patch

> HDFS INotify: the destination field of RenameOp should always end with the 
> file name
> 
>
> Key: HDFS-8305
> URL: https://issues.apache.org/jira/browse/HDFS-8305
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8305.001.patch
>
>
> HDFS INotify: the destination field of RenameOp should always end with the 
> file name rather than sometimes being a directory name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name

2015-04-30 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8305:
---
Status: Patch Available  (was: Open)

> HDFS INotify: the destination field of RenameOp should always end with the 
> file name
> 
>
> Key: HDFS-8305
> URL: https://issues.apache.org/jira/browse/HDFS-8305
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8305.001.patch
>
>
> HDFS INotify: the destination field of RenameOp should always end with the 
> file name rather than sometimes being a directory name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name

2015-04-30 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8305:
---
Description: HDFS INotify: the destination field of RenameOp should always 
end with the file name rather than sometimes being a directory name.  (was: 
HDFS INotify: the destination argument to RenameOp should always end with the 
file name rather than sometimes being a directory name.)

> HDFS INotify: the destination field of RenameOp should always end with the 
> file name
> 
>
> Key: HDFS-8305
> URL: https://issues.apache.org/jira/browse/HDFS-8305
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>
> HDFS INotify: the destination field of RenameOp should always end with the 
> file name rather than sometimes being a directory name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8305) HDFS INotify: the destination field of RenameOp should always end with the file name

2015-04-30 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8305:
---
Summary: HDFS INotify: the destination field of RenameOp should always end 
with the file name  (was: HDFS INotify: the destination argument to RenameOp 
should always end with the file name)

> HDFS INotify: the destination field of RenameOp should always end with the 
> file name
> 
>
> Key: HDFS-8305
> URL: https://issues.apache.org/jira/browse/HDFS-8305
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>
> HDFS INotify: the destination argument to RenameOp should always end with the 
> file name rather than sometimes being a directory name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8305) HDFS INotify: the destination argument to RenameOp should always end with the file name

2015-04-30 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-8305:
--

 Summary: HDFS INotify: the destination argument to RenameOp should 
always end with the file name
 Key: HDFS-8305
 URL: https://issues.apache.org/jira/browse/HDFS-8305
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


HDFS INotify: the destination argument to RenameOp should always end with the 
file name rather than sometimes being a directory name.
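The invariant described here can be illustrated with a small, hypothetical helper (the method and paths below are made up for illustration and are not part of the actual RenameOp code): when the supplied destination is a directory, the source's file name is appended so the recorded destination always ends with the file name.

```java
public class RenameDestDemo {
    /**
     * Hypothetical illustration of the desired invariant: the destination
     * recorded for a rename always ends with the renamed file's name, even
     * when the caller supplied a directory as the destination.
     */
    static String recordedDestination(String src, String dst, boolean dstIsDirectory) {
        if (!dstIsDirectory) {
            return dst; // already ends with the file name
        }
        String fileName = src.substring(src.lastIndexOf('/') + 1);
        return dst.endsWith("/") ? dst + fileName : dst + "/" + fileName;
    }

    public static void main(String[] args) {
        // Destination is a file path: recorded unchanged.
        System.out.println(recordedDestination("/user/a/data.txt", "/user/b/data.txt", false));
        // Destination is a directory: the file name is appended.
        System.out.println(recordedDestination("/user/a/data.txt", "/user/b", true));
    }
}
```

Both calls print {{/user/b/data.txt}}, i.e. the recorded destination always names the file.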



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8200) Refactor FSDirStatAndListingOp

2015-04-30 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522192#comment-14522192
 ] 

Brandon Li commented on HDFS-8200:
--


+1.

> Refactor FSDirStatAndListingOp
> --
>
> Key: HDFS-8200
> URL: https://issues.apache.org/jira/browse/HDFS-8200
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8200.000.patch, HDFS-8200.001.patch
>
>
> After HDFS-6826 several functions in {{FSDirStatAndListingOp}} are dead. This 
> jira proposes to clean them up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Description: 
As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic of purging current dir is very similar to FJM purging 
logic.

QJM:
{code}
 private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
  ImmutableList.of(
Pattern.compile("edits_\\d+-(\\d+)"),
Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
  long txid = Long.parseLong(matcher.group(1));
  if (txid < minTxIdToKeep) {
LOG.info("Purging no-longer needed file " + txid);
if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
  NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
List<EditLogFile> editLogs = matchEditLogs(files, true);
for (EditLogFile log : editLogs) {
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
purger.purgeLog(log);
  }
}
{code}

I can see 2 differences:
# When matching for empty/corrupt in-progress files, QJM requires that the 
suffix doesn't have blank spaces. I think we should use the QJM regex and 
consider a file {{edits_inprogress_01.a bc}} as stale.
# FJM verifies that both the start and end txIDs of a finalized edit file are 
old enough. This doesn't make sense, because the end txID is always larger 
than the start txID.

  was:
As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic of purging current dir is very similar to FJM purging 
logic.

QJM:
{code}
 private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
  ImmutableList.of(
Pattern.compile("edits_\\d+-(\\d+)"),
Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
  long txid = Long.parseLong(matcher.group(1));
  if (txid < minTxIdToKeep) {
LOG.info("Purging no-longer needed file " + txid);
if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
  NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
List<EditLogFile> editLogs = matchEditLogs(files, true);
for (EditLogFile log : editLogs) {
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
purger.purgeLog(log);
  }
}
{code}

I can see 2 differences:
# When matching for empty/corrupt in-progress files, QJM requires that the 
suffix doesn't have blank spaces
# FJM verifies that both the start and end txIDs of a finalized edit file are 
old enough

Both seem safer than the QJM logic. 
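The behavioral difference between the two in-progress patterns quoted above can be checked directly with {{java.util.regex}}; this standalone sketch compiles the QJM pattern and FJM's stale pattern as quoted and probes them with sample file names (the sample names are made up):

```java
import java.util.regex.Pattern;

public class PurgeRegexDemo {
    // In-progress patterns exactly as quoted from QJM and FJM above.
    static final Pattern QJM_INPROGRESS =
        Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?");
    static final Pattern FJM_INPROGRESS_STALE =
        Pattern.compile("edits_inprogress" + "_(\\d+).*(\\S+)");

    public static void main(String[] args) {
        String[] names = {
            "edits_inprogress_0000000000000000001", // clean in-progress file
            "edits_inprogress_01.empty",            // stale suffix
            "edits_inprogress_01.tmp ",             // trailing blank space
        };
        for (String name : names) {
            System.out.println("[" + name + "] qjm="
                + QJM_INPROGRESS.matcher(name).matches()
                + " fjmStale=" + FJM_INPROGRESS_STALE.matcher(name).matches());
        }
    }
}
```

Of these three samples, only the name with a trailing blank separates the two: QJM's pattern still matches it, while FJM's stale pattern, whose final {{(\S+)}} group must end on a non-space character, does not.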


> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # When matching for empty/corrupt in-progress files, QJM requires that the 
> suffix doesn't have blank spaces. I think we should use the QJM regex and 
> consider a file {{edits_inprogress_01.a bc}} as stale.
> # FJM verifies that both the start and end txIDs of a finalized edit file are 
> old enough. This doesn't make sense, because the end txID is always larger 
> than the start txID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Attachment: HDFS-8303.0.patch

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # When matching for empty/corrupt in-progress files, QJM requires that the 
> suffix doesn't have blank spaces
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough
> Both seem safer than the QJM logic. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Attachment: (was: HDFS-8303.0.patch)

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # When matching for empty/corrupt in-progress files, QJM requires that the 
> suffix doesn't have blank spaces
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough
> Both seem safer than the QJM logic. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Description: 
As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic of purging current dir is very similar to FJM purging 
logic.

QJM:
{code}
 private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
  ImmutableList.of(
Pattern.compile("edits_\\d+-(\\d+)"),
Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
  long txid = Long.parseLong(matcher.group(1));
  if (txid < minTxIdToKeep) {
LOG.info("Purging no-longer needed file " + txid);
if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
  NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
List<EditLogFile> editLogs = matchEditLogs(files, true);
for (EditLogFile log : editLogs) {
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
purger.purgeLog(log);
  }
}
{code}

I can see 2 differences:
# When matching for empty/corrupt in-progress files, QJM requires that the 
suffix doesn't have blank spaces
# FJM verifies that both the start and end txIDs of a finalized edit file are 
old enough

Both seem safer than the QJM logic. 

  was:
As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic of purging current dir is very similar to FJM purging 
logic.

QJM:
{code}
 private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
  ImmutableList.of(
Pattern.compile("edits_\\d+-(\\d+)"),
Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
  long txid = Long.parseLong(matcher.group(1));
  if (txid < minTxIdToKeep) {
LOG.info("Purging no-longer needed file " + txid);
if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
  NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
List<EditLogFile> editLogs = matchEditLogs(files, true);
for (EditLogFile log : editLogs) {
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
purger.purgeLog(log);
  }
}
{code}

I can see 2 differences:
# FJM has a slightly stricter match for empty/corrupt in-progress files: the 
suffix shouldn't have blank space
# FJM verifies that both the start and end txIDs of a finalized edit file are 
old enough

Both seem safer than the QJM logic. 


> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # When matching for empty/corrupt in-progress files, QJM requires that the 
> suffix doesn't have blank spaces
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough
> Both seem safer than the QJM logic. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Attachment: HDFS-8303.0.patch

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # FJM has a slightly stricter match for empty/corrupt in-progress files: the 
> suffix shouldn't have blank space
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough
> Both seem safer than the QJM logic. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Attachment: (was: HDFS-8303.0.patch)

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # FJM has a slightly stricter match for empty/corrupt in-progress files: the 
> suffix shouldn't have blank space
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough
> Both seem safer than the QJM logic. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8288) Refactor DFSStripedOutputStream and StripedDataStreamer

2015-04-30 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8288:
--
Attachment: HDFS-8288-HDFS-7285.20150430.patch

Sure, let's try HDFS-8288-HDFS-7285.20150430.patch.

> Refactor DFSStripedOutputStream and StripedDataStreamer
> ---
>
> Key: HDFS-8288
> URL: https://issues.apache.org/jira/browse/HDFS-8288
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-8288-HDFS-7285.20150430.patch, h8288_20150429.patch
>
>
> - DFSStripedOutputStream has a list of StripedDataStreamer(s).  The streamers 
> share a data structure List<BlockingQueue<LocatedBlock>> stripeBlocks for 
> communicating located-block and end-block information.
> For example,
> {code}
> //StripedDataStreamer.endBlock()
>   // before retrieving a new block, transfer the finished block to
>   // leading streamer
>   LocatedBlock finishedBlock = new LocatedBlock(
>   new ExtendedBlock(block.getBlockPoolId(), block.getBlockId(),
>   block.getNumBytes(), block.getGenerationStamp()), null);
>   try {
> boolean offSuccess = stripedBlocks.get(0).offer(finishedBlock, 30,
> TimeUnit.SECONDS);
> {code}
> It is unnecessary to create a LocatedBlock object for an end block, since the 
> locations argument passed is null.  Also, the return value is ignored (i.e. 
> offSuccess is not used).
> - DFSStripedOutputStream has another data structure cellBuffers for computing 
> parity.  It should be refactored to a class.
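The point about the ignored return value can be seen with a plain {{BlockingQueue}} (illustrative only; the queue capacity, element values, and class name here are made up, with a capacity-1 queue standing in for one entry of stripeBlocks):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class OfferReturnDemo {
    /** Offers two elements to a capacity-1 queue and reports each result. */
    static boolean[] tryOffers() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        try {
            boolean first = queue.offer("block-1", 10, TimeUnit.MILLISECONDS);
            // The queue is now full: the second timed offer waits, times out,
            // and returns false. Ignoring that value (as offSuccess is ignored
            // above) silently drops the element.
            boolean second = queue.offer("block-2", 10, TimeUnit.MILLISECONDS);
            return new boolean[] { first, second };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new boolean[] { false, false };
        }
    }

    public static void main(String[] args) {
        boolean[] r = tryOffers();
        System.out.println("first=" + r[0] + " second=" + r[1]);
    }
}
```

The second offer returns false, which is exactly the case the code above drops on the floor when offSuccess goes unchecked.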



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8249) Separate HdfsConstants into the client and the server side class

2015-04-30 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522163#comment-14522163
 ] 

Haohui Mai commented on HDFS-8249:
--

The test failures look unrelated.

> Separate HdfsConstants into the client and the server side class
> 
>
> Key: HDFS-8249
> URL: https://issues.apache.org/jira/browse/HDFS-8249
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8249.000.patch, HDFS-8249.001.patch, 
> HDFS-8249.002.patch
>
>
> The constants in {{HdfsConstants}} are used by both the client side and the 
> server side. There are three types of constants in the class:
> 1. Constants that are used internally by the servers or not part of the APIs. 
> These constants are free to evolve without breaking compatibilities. For 
> example, {{MAX_PATH_LENGTH}} is used by the NN to enforce the length of the 
> path does not go too long. Developers are free to change the name of the 
> constants and to move it around if necessary.
> 1. Constants that are used by the clients, but are not part of the APIs. For 
> example, {{QUOTA_DONT_SET}} represents an unlimited quota. The value is part 
> of the wire protocol but the name is not. Developers are free to rename the 
> constants but are not allowed to change their values.
> 1. Constants that are parts of the APIs. For example, {{SafeModeAction}} is 
> used in {{DistributedFileSystem}}. Changing the name / value of the constant 
> will break binary compatibility, but not source code compatibility.
> This jira proposes to separate the above three types of constants into 
> different classes:
> * Creating a new class {{HdfsConstantsServer}} to hold the first type of 
> constants.
> * Move {{HdfsConstants}} into the {{hdfs-client}} package. The work of 
> separating the second and the third types of constants will be deferred to a 
> separate jira.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6757) Simplify lease manager with INodeID

2015-04-30 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6757:
-
Attachment: HDFS-6757.012.patch

> Simplify lease manager with INodeID
> ---
>
> Key: HDFS-6757
> URL: https://issues.apache.org/jira/browse/HDFS-6757
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-6757.000.patch, HDFS-6757.001.patch, 
> HDFS-6757.002.patch, HDFS-6757.003.patch, HDFS-6757.004.patch, 
> HDFS-6757.005.patch, HDFS-6757.006.patch, HDFS-6757.007.patch, 
> HDFS-6757.008.patch, HDFS-6757.009.patch, HDFS-6757.010.patch, 
> HDFS-6757.011.patch, HDFS-6757.012.patch
>
>
> Currently the lease manager records leases based on paths instead of inode 
> IDs. Therefore, the lease manager needs to carefully keep track of the paths 
> of active leases during renames and deletes. This can be a non-trivial task.
> This jira proposes to simplify the logic by tracking leases using inode IDs 
> instead of paths.
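The motivation can be sketched with plain maps (the lease values, paths, and inode id below are placeholders, not real HDFS structures): a path-keyed table loses its entry after a rename unless the key is rewritten, while an inode-id-keyed table is unaffected because the id is stable across renames.

```java
import java.util.HashMap;
import java.util.Map;

public class LeaseKeyDemo {
    /** Path-keyed leases: after /user/a/file -> /user/b/file, the lookup misses. */
    static String lookupByPathAfterRename() {
        Map<String, String> leasesByPath = new HashMap<>();
        leasesByPath.put("/user/a/file", "lease-1");
        // The rename happened, but nobody rewrote the key, so this is null.
        return leasesByPath.get("/user/b/file");
    }

    /** Inode-id-keyed leases: the id survives the rename, so the lookup hits. */
    static String lookupByIdAfterRename() {
        Map<Long, String> leasesByInodeId = new HashMap<>();
        leasesByInodeId.put(16386L, "lease-1"); // 16386 is a made-up inode id
        return leasesByInodeId.get(16386L);
    }

    public static void main(String[] args) {
        System.out.println("by path after rename:     " + lookupByPathAfterRename());
        System.out.println("by inode id after rename: " + lookupByIdAfterRename());
    }
}
```

The path-keyed lookup returns null while the id-keyed lookup still finds the lease, which is the bookkeeping this jira removes.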



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Attachment: HDFS-8303.0.patch

Attaching a simple change to check if it passes all unit tests. Will add a new 
test in the next rev.

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8303.0.patch
>
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # FJM has a slightly stricter match for empty/corrupt in-progress files: the 
> suffix shouldn't have blank space
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough
> Both seem safer than the QJM logic. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8303:

Status: Patch Available  (was: Open)

> QJM should purge old logs in the current directory through FJM
> --
>
> Key: HDFS-8303
> URL: https://issues.apache.org/jira/browse/HDFS-8303
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> As the first step of the consolidation effort, QJM should call its FJM to 
> purge the current directory. 
> The current QJM logic of purging current dir is very similar to FJM purging 
> logic.
> QJM:
> {code}
>  private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
>   ImmutableList.of(
> Pattern.compile("edits_\\d+-(\\d+)"),
> Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
> ...
>   long txid = Long.parseLong(matcher.group(1));
>   if (txid < minTxIdToKeep) {
> LOG.info("Purging no-longer needed file " + txid);
> if (!f.delete()) {
> ...
> {code}
> FJM:
> {code}
>   private static final Pattern EDITS_REGEX = Pattern.compile(
> NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
> NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
>   private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
>   NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
> ...
> List<EditLogFile> editLogs = matchEditLogs(files, true);
> for (EditLogFile log : editLogs) {
>   if (log.getFirstTxId() < minTxIdToKeep &&
>   log.getLastTxId() < minTxIdToKeep) {
> purger.purgeLog(log);
>   }
> }
> {code}
> I can see 2 differences:
> # FJM has a slightly stricter match for empty/corrupt in-progress files: the 
> suffix shouldn't have blank space
> # FJM verifies that both the start and end txIDs of a finalized edit file 
> are old enough
> Both seem safer than the QJM logic. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7442) Optimization for decommission-in-progress check

2015-04-30 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-7442:
--
Affects Version/s: 2.6.0

> Optimization for decommission-in-progress check
> ---
>
> Key: HDFS-7442
> URL: https://issues.apache.org/jira/browse/HDFS-7442
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Ming Ma
>
> 1. {{isReplicationInProgress}} currently rescans all blocks of a given node 
> each time the method is called; it becomes less efficient as more of its 
> blocks become fully replicated. Each scan takes the FS lock.
> 2. As discussed in HDFS-7374, if a node becomes dead during decommission, 
> it is useful if the dead node can be marked as decommissioned after all its 
> blocks are fully replicated. Currently there is no way to check the blocks 
> of dead decommission-in-progress nodes, given that the dead node is removed 
> from the blockmap.
> There are mitigations for these limitations: set 
> dfs.namenode.decommission.nodes.per.interval to a small value to reduce the 
> duration of the lock, and HDFS-7409 uses global FS state to tell whether a 
> dead node's blocks are fully replicated.
> To address these scenarios, it would be useful to track the 
> decommission-in-progress blocks separately.
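As a rough illustration of the proposed separate tracking (plain Java, not HDFS code; the class and method names below are invented for the sketch), keeping only the still-under-replicated blocks in a set turns the completeness check into an O(1) lookup instead of a full rescan under the FS lock:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: instead of rescanning every block of a decommissioning
// node on each check, remember only the blocks that are still under-replicated
// and remove them as replication completes.
public class DecommissionTracker {
    private final Set<Long> pendingBlocks = new HashSet<>();

    public void startDecommission(Iterable<Long> underReplicatedBlocks) {
        for (long blockId : underReplicatedBlocks) {
            pendingBlocks.add(blockId);
        }
    }

    // Called when a block reaches its target replication.
    public void onFullyReplicated(long blockId) {
        pendingBlocks.remove(blockId);
    }

    // O(1) emptiness check instead of a full block rescan under the FS lock.
    public boolean isDecommissioned() {
        return pendingBlocks.isEmpty();
    }
}
```

Because the set shrinks as blocks finish replicating, the check also keeps working for a node that dies mid-decommission, even after it is removed from the blockmap.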



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7442) Optimization for decommission-in-progress check

2015-04-30 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-7442:
--
Component/s: namenode

> Optimization for decommission-in-progress check
> ---
>
> Key: HDFS-7442
> URL: https://issues.apache.org/jira/browse/HDFS-7442
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Ming Ma
>
> 1. {{isReplicationInProgress}} currently rescans all blocks of a given node 
> each time the method is called; it becomes less efficient as more of its 
> blocks become fully replicated. Each scan takes the FS lock.
> 2. As discussed in HDFS-7374, if a node becomes dead during decommission, 
> it is useful if the dead node can be marked as decommissioned after all its 
> blocks are fully replicated. Currently there is no way to check the blocks 
> of dead decommission-in-progress nodes, given that the dead node is removed 
> from the blockmap.
> There are mitigations for these limitations: set 
> dfs.namenode.decommission.nodes.per.interval to a small value to reduce the 
> duration of the lock, and HDFS-7409 uses global FS state to tell whether a 
> dead node's blocks are fully replicated.
> To address these scenarios, it would be useful to track the 
> decommission-in-progress blocks separately.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement

2015-04-30 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522080#comment-14522080
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8283:
---

Thanks Chris and Jing for reporting the findbugs and unit test problems and 
filing the bug.

It somehow only reported TestAppendSnapshotTruncate failed in [the Jenkins 
run|https://issues.apache.org/jira/browse/HDFS-8283?focusedCommentId=14518736&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14518736].

> DataStreamer cleanup and some minor improvement
> ---
>
> Key: HDFS-8283
> URL: https://issues.apache.org/jira/browse/HDFS-8283
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: h8283_20150428.patch
>
>
> - When throwing an exception
> -* always set lastException 
> -* always creating a new exception so that it has the new stack trace
> - Add LOG.
> - Add final to isAppend and favoredNodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6757) Simplify lease manager with INodeID

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522075#comment-14522075
 ] 

Hadoop QA commented on HDFS-6757:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 57s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 9 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:red}-1{color} | javac |   1m 34s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729565/HDFS-6757.011.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e2e8f77 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10483/console |


This message was automatically generated.

> Simplify lease manager with INodeID
> ---
>
> Key: HDFS-6757
> URL: https://issues.apache.org/jira/browse/HDFS-6757
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-6757.000.patch, HDFS-6757.001.patch, 
> HDFS-6757.002.patch, HDFS-6757.003.patch, HDFS-6757.004.patch, 
> HDFS-6757.005.patch, HDFS-6757.006.patch, HDFS-6757.007.patch, 
> HDFS-6757.008.patch, HDFS-6757.009.patch, HDFS-6757.010.patch, 
> HDFS-6757.011.patch
>
>
> Currently the lease manager records leases based on path instead of inode 
> ids. Therefore, the lease manager needs to carefully keep track of the path 
> of active leases during renames and deletes. This can be a non-trivial task.
> This jira proposes to simplify the logic by tracking leases using inodeids 
> instead of paths.
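A minimal sketch of the inode-id idea (illustrative Java only, not the actual LeaseManager; the {{InodeLeaseTable}} name and its methods are invented): when leases are keyed by inode id, a rename only updates the path mapping and the lease table itself never needs fixing up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: leases keyed by inode id survive renames untouched,
// which is the simplification this jira proposes.
public class InodeLeaseTable {
    private final Map<Long, String> leaseHolders = new HashMap<>(); // inodeId -> holder
    private final Map<Long, String> inodePaths = new HashMap<>();   // inodeId -> path

    public void addLease(long inodeId, String path, String holder) {
        leaseHolders.put(inodeId, holder);
        inodePaths.put(inodeId, path);
    }

    // A rename touches only the path map; no lease bookkeeping is needed.
    public void rename(long inodeId, String newPath) {
        inodePaths.put(inodeId, newPath);
    }

    public String holderOf(long inodeId) {
        return leaseHolders.get(inodeId);
    }
}
```

With path-keyed leases, by contrast, every rename and delete would have to rewrite lease entries, which is the non-trivial bookkeeping the description refers to.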



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.005.patch

Rebased for HDFS-8282

> Erasure coding: DFSInputStream with decode functionality
> 
>
> Key: HDFS-7678
> URL: https://issues.apache.org/jira/browse/HDFS-7678
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Li Bo
>Assignee: Zhe Zhang
> Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
> HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
> HDFS-7678-HDFS-7285.005.patch, HDFS-7678.000.patch, HDFS-7678.001.patch
>
>
> A block group reader will read data from BlockGroup no matter in striping 
> layout or contiguous layout. The corrupt blocks can be known before 
> reading(told by namenode), or just be found during reading. The block group 
> reader needs to do decoding work when some blocks are found corrupt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6757) Simplify lease manager with INodeID

2015-04-30 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6757:
-
Attachment: HDFS-6757.011.patch

> Simplify lease manager with INodeID
> ---
>
> Key: HDFS-6757
> URL: https://issues.apache.org/jira/browse/HDFS-6757
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-6757.000.patch, HDFS-6757.001.patch, 
> HDFS-6757.002.patch, HDFS-6757.003.patch, HDFS-6757.004.patch, 
> HDFS-6757.005.patch, HDFS-6757.006.patch, HDFS-6757.007.patch, 
> HDFS-6757.008.patch, HDFS-6757.009.patch, HDFS-6757.010.patch, 
> HDFS-6757.011.patch
>
>
> Currently the lease manager records leases based on path instead of inode 
> ids. Therefore, the lease manager needs to carefully keep track of the path 
> of active leases during renames and deletes. This can be a non-trivial task.
> This jira proposes to simplify the logic by tracking leases using inodeids 
> instead of paths.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8304) Separate out shared log purging methods for QJM (Paxos directory) and FJM

2015-04-30 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-8304:
---

 Summary: Separate out shared log purging methods for QJM (Paxos 
directory) and FJM
 Key: HDFS-8304
 URL: https://issues.apache.org/jira/browse/HDFS-8304
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang


With HDFS-8303, QJM will purge its /current dir through FJM. However, its 
Paxos dir needs to be purged separately.

QJM currently uses its own {{JNStorage#purgeMatching}} method while FJM calls 
{{matchEditLogs}} to find all matches first and then uses 
{{DeletionStoragePurger#purgeLog}}.

This JIRA aims to create a unified method for both QJM's Paxos dir and FJM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero

2015-04-30 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522011#comment-14522011
 ] 

Arpit Agarwal commented on HDFS-8276:
-

Yes failed tests are likely due to HDFS-8300. Let's wait till that is resolved 
and resubmit the Jenkins runs. Thanks.

> LazyPersistFileScrubber should be disabled if scrubber interval configured 
> zero
> ---
>
> Key: HDFS-8276
> URL: https://issues.apache.org/jira/browse/HDFS-8276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: surendra singh lilhore
>Assignee: surendra singh lilhore
> Attachments: HDFS-8276.patch, HDFS-8276_1.patch
>
>
> bq. but I think it is simple enough to change the meaning of the value so 
> that zero means 'never scrub'. Let me post an updated patch.
> As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], 
> the scrubber should be disabled if 
> *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero.
> Currently namenode startup fails if the interval is configured as zero:
> {code}
> 2015-04-27 23:47:31,744 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
> initialization failed.
> java.lang.IllegalArgumentException: 
> dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:828)
> {code}
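As a sketch of the proposed semantics (illustrative Java only, not the actual FSNamesystem code; {{ScrubberConfig}} and {{scrubberEnabled}} are invented names), zero would mean "never scrub" rather than a startup failure, while negative values would still be rejected:

```java
// Hypothetical sketch: interpret a zero scrub interval as "disable the
// scrubber" instead of throwing IllegalArgumentException at startup.
public class ScrubberConfig {
    static final String KEY = "dfs.namenode.lazypersist.file.scrub.interval.sec";

    public static boolean scrubberEnabled(int intervalSec) {
        if (intervalSec < 0) {
            // Still reject nonsensical values.
            throw new IllegalArgumentException(KEY + " must not be negative.");
        }
        return intervalSec > 0;  // zero means: never start the scrubber thread
    }
}
```

The namenode would then simply skip starting the LazyPersistFileScrubber thread when this returns false.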



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero

2015-04-30 Thread surendra singh lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521996#comment-14521996
 ] 

surendra singh lilhore commented on HDFS-8276:
--

Thanks [~arpitagarwal] for the review. The failed test cases and findbugs 
warning are not related to this patch.

> LazyPersistFileScrubber should be disabled if scrubber interval configured 
> zero
> ---
>
> Key: HDFS-8276
> URL: https://issues.apache.org/jira/browse/HDFS-8276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: surendra singh lilhore
>Assignee: surendra singh lilhore
> Attachments: HDFS-8276.patch, HDFS-8276_1.patch
>
>
> bq. but I think it is simple enough to change the meaning of the value so 
> that zero means 'never scrub'. Let me post an updated patch.
> As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], 
> the scrubber should be disabled if 
> *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero.
> Currently namenode startup fails if the interval is configured as zero:
> {code}
> 2015-04-27 23:47:31,744 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
> initialization failed.
> java.lang.IllegalArgumentException: 
> dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:828)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8303) QJM should purge old logs in the current directory through FJM

2015-04-30 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-8303:
---

 Summary: QJM should purge old logs in the current directory 
through FJM
 Key: HDFS-8303
 URL: https://issues.apache.org/jira/browse/HDFS-8303
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang


As the first step of the consolidation effort, QJM should call its FJM to purge 
the current directory. 

The current QJM logic of purging current dir is very similar to FJM purging 
logic.

QJM:
{code}
 private static final List<Pattern> CURRENT_DIR_PURGE_REGEXES =
  ImmutableList.of(
Pattern.compile("edits_\\d+-(\\d+)"),
Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?"));
...
  long txid = Long.parseLong(matcher.group(1));
  if (txid < minTxIdToKeep) {
LOG.info("Purging no-longer needed file " + txid);
if (!f.delete()) {
...
{code}

FJM:
{code}
  private static final Pattern EDITS_REGEX = Pattern.compile(
NameNodeFile.EDITS.getName() + "_(\\d+)-(\\d+)");
  private static final Pattern EDITS_INPROGRESS_REGEX = Pattern.compile(
NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+)");
  private static final Pattern EDITS_INPROGRESS_STALE_REGEX = Pattern.compile(
  NameNodeFile.EDITS_INPROGRESS.getName() + "_(\\d+).*(\\S+)");
...
List<EditLogFile> editLogs = matchEditLogs(files, true);
for (EditLogFile log : editLogs) {
  if (log.getFirstTxId() < minTxIdToKeep &&
  log.getLastTxId() < minTxIdToKeep) {
purger.purgeLog(log);
  }
}
{code}

I can see 2 differences:
# FJM has a slightly stricter match for empty/corrupt in-progress files: the 
suffix shouldn't have blank space
# FJM verifies that both the start and end txIDs of a finalized edit file are 
old enough

Both seem safer than the QJM logic. 
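The first difference can be checked directly against the two in-progress patterns quoted above. The sketch below (plain Java, not HDFS code; the {{PurgeRegexDemo}} class is invented, and it assumes {{NameNodeFile.EDITS_INPROGRESS.getName()}} resolves to "edits_inprogress"): a stale-file suffix consisting only of blank space still matches the QJM pattern but never the FJM stale pattern, because {{\S+}} requires a trailing non-space character.

```java
import java.util.regex.Pattern;

public class PurgeRegexDemo {
    // Patterns copied from the QJM and FJM snippets above.
    static final Pattern QJM_INPROGRESS =
        Pattern.compile("edits_inprogress_(\\d+)(?:\\..*)?");
    static final Pattern FJM_INPROGRESS_STALE =
        Pattern.compile("edits_inprogress_(\\d+).*(\\S+)");

    static boolean qjmMatches(String name) {
        return QJM_INPROGRESS.matcher(name).matches();
    }

    static boolean fjmMatches(String name) {
        return FJM_INPROGRESS_STALE.matcher(name).matches();
    }

    public static void main(String[] args) {
        // Both accept a normal stale file with a non-blank suffix.
        System.out.println(qjmMatches("edits_inprogress_42.trash")); // true
        System.out.println(fjmMatches("edits_inprogress_42.trash")); // true
        // A suffix of only blank space: QJM still matches, FJM does not.
        System.out.println(qjmMatches("edits_inprogress_42.   "));   // true
        System.out.println(fjmMatches("edits_inprogress_42.   "));   // false
    }
}
```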



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-8300) Fix unit test failures and findbugs warning caused by HDFS-8283

2015-04-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao reassigned HDFS-8300:
---

Assignee: Jing Zhao

> Fix unit test failures and findbugs warning caused by HDFS-8283
> ---
>
> Key: HDFS-8300
> URL: https://issues.apache.org/jira/browse/HDFS-8300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-8224) Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error

2015-04-30 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah reassigned HDFS-8224:


Assignee: Rushabh S Shah

> Any IOException in DataTransfer#run() will run diskError thread even if it is 
> not disk error
> 
>
> Key: HDFS-8224
> URL: https://issues.apache.org/jira/browse/HDFS-8224
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.8.0
>
>
> This happened in our 2.6 cluster.
> One of the block and its metadata file were corrupted.
> The disk was healthy in this case.
> Only the block was corrupt.
> Namenode tried to copy that block to another datanode but failed with the 
> following stack trace:
> 2015-04-20 01:04:04,421 
> [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN 
> datanode.DataNode: DatanodeRegistration(a.b.c.d, 
> datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, 
> infoSecurePort=0, ipcPort=8020, 
> storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed
>  to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to 
> a1.b1.c1.d1:1004 got 
> java.io.IOException: Could not create DataChecksum of type 0 with 
> bytesPerChecksum 0
> at 
> org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989)
> at java.lang.Thread.run(Thread.java:722)
> The following catch block in the DataTransfer#run method treats every 
> IOException as a disk fault and runs the disk error check:
> {noformat}
> catch (IOException ie) {
> LOG.warn(bpReg + ":Failed to transfer " + b + " to " +
> targets[0] + " got ", ie);
> // check if there are any disk problem
> checkDiskErrorAsync();
>   } 
> {noformat}
> This block was never scanned by BlockPoolSliceScanner; otherwise it would 
> have been reported as a corrupt block.
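One possible direction (an illustrative sketch only, not a proposed patch; {{TransferErrorHandler}} and {{shouldCheckDisk}} are invented names) is to filter out exceptions that clearly describe a corrupt block, such as the DataChecksum header failure in the stack trace above, before triggering the async disk check:

```java
import java.io.IOException;

// Hypothetical sketch: only trigger the async disk check for exceptions that
// plausibly indicate a disk fault, not for a corrupt metadata header.
public class TransferErrorHandler {
    public static boolean shouldCheckDisk(IOException ie) {
        String msg = ie.getMessage();
        // A corrupt checksum header is a block problem, not a disk problem.
        if (msg != null && msg.contains("Could not create DataChecksum")) {
            return false;
        }
        return true;
    }
}
```

The catch block would then call checkDiskErrorAsync() only when shouldCheckDisk returns true; distinguishing the cases robustly (e.g. by exception type rather than message text) is the harder part this jira would need to settle.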



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8302) Consolidate log purging logic in QJM and FJM

2015-04-30 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-8302:
---

 Summary: Consolidate log purging logic in QJM and FJM
 Key: HDFS-8302
 URL: https://issues.apache.org/jira/browse/HDFS-8302
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Zhe Zhang
Assignee: Zhe Zhang


When executing {{purgeLogsOlderThan}}, {{JNStorage}} purges both the current 
directory and the Paxos directory using its own logic:
{code}
  void purgeDataOlderThan(long minTxIdToKeep) throws IOException {
purgeMatching(sd.getCurrentDir(),
CURRENT_DIR_PURGE_REGEXES, minTxIdToKeep);
purgeMatching(getPaxosDir(), PAXOS_DIR_PURGE_REGEXES, minTxIdToKeep);
  }
{code}

Meanwhile, FJM has its own logic of serving {{purgeLogsOlderThan}}, which is 
executed only under the legacy NFS-based journaling configuration.

This JIRA aims to consolidate these 2 separate purging procedures.
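As a rough sketch of what a consolidated path could look like (illustrative Java only, not a patch; {{UnifiedPurger}}, {{firstTxId}} and {{selectPurgeable}} are invented names, and the filename pattern is simplified), both QJM and FJM could share one txid-based filter and leave the actual deletion to the caller:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: one shared selection step for log purging, usable by
// both the QJM directories and FJM instead of two separate code paths.
public class UnifiedPurger {
    private static final Pattern EDITS_NAME =
        Pattern.compile("edits_(?:inprogress_)?(\\d+).*");

    // Parse the first txid a log file covers, or -1 if the name is unknown.
    static long firstTxId(String name) {
        Matcher m = EDITS_NAME.matcher(name);
        return m.matches() ? Long.parseLong(m.group(1)) : -1;
    }

    // Select files old enough to purge; deletion itself stays with the caller.
    public static List<String> selectPurgeable(List<String> names,
                                               long minTxIdToKeep) {
        List<String> out = new ArrayList<>();
        for (String n : names) {
            long txid = firstTxId(n);
            if (txid >= 0 && txid < minTxIdToKeep) {
                out.add(n);
            }
        }
        return out;
    }
}
```

A design in this shape would also make it easy to apply FJM's stricter checks (both start and end txid old enough) uniformly.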



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8276) LazyPersistFileScrubber should be disabled if scrubber interval configured zero

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521950#comment-14521950
 ] 

Hadoop QA commented on HDFS-8276:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   5m 25s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  7s | The patch appears to introduce 1 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 238m 53s | Tests failed in hadoop-hdfs. |
| | | 284m 45s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived 
from an Exception, even though it is named as such  At DataStreamer.java:from 
an Exception, even though it is named as such  At DataStreamer.java:[lines 
177-201] |
| Failed unit tests | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestClose |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12729474/HDFS-8276_1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / de9404f |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10477/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10477/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10477/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10477/console |


This message was automatically generated.

> LazyPersistFileScrubber should be disabled if scrubber interval configured 
> zero
> ---
>
> Key: HDFS-8276
> URL: https://issues.apache.org/jira/browse/HDFS-8276
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: surendra singh lilhore
>Assignee: surendra singh lilhore
> Attachments: HDFS-8276.patch, HDFS-8276_1.patch
>
>
> bq. but I think it is simple enough to change the meaning of the value so 
> that zero means 'never scrub'. Let me post an updated patch.
> As discussed in [HDFS-6929|https://issues.apache.org/jira/browse/HDFS-6929], 
> the scrubber should be disabled if 
> *dfs.namenode.lazypersist.file.scrub.interval.sec* is zero.
> Currently namenode startup fails if the interval is configured as zero:
> {code}
> 2015-04-27 23:47:31,744 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
> initialization failed.
> java.lang.IllegalArgumentException: 
> dfs.namenode.lazypersist.file.scrub.interval.sec must be non-zero.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:828)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

