[jira] [Updated] (HDFS-8722) Optimize datanode writes for small writes and flushes

2016-02-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8722:
-
Fix Version/s: 2.6.4

> Optimize datanode writes for small writes and flushes
> -
>
> Key: HDFS-8722
> URL: https://issues.apache.org/jira/browse/HDFS-8722
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2, 2.6.4
>
> Attachments: HDFS-8722.br26.patch, HDFS-8722.patch, HDFS-8722.v1.patch
>
>
> After the data corruption fix in HDFS-4660, the CRC recalculation for the 
> partial chunk is executed more frequently if the client repeatedly writes a 
> few bytes and calls hflush/hsync.  This is because the generic logic forces 
> CRC recalculation whenever the on-disk data is not CRC chunk aligned. Prior 
> to HDFS-4660, the datanode blindly accepted whatever CRC the client provided 
> if the incoming data was chunk-aligned. This was the source of the corruption.
> We can still optimize for the most common case, where a client repeatedly 
> writes a small number of bytes followed by hflush/hsync with no pipeline 
> recovery or append, by allowing the previous behavior for this specific case. 
>  If the incoming data has a duplicate portion that begins at the last 
> chunk boundary before the partial chunk on disk, the datanode can use the 
> checksum supplied by the client without redoing the checksum on its own.  
> This reduces disk reads as well as CPU load for the checksum calculation.
> If the incoming packet data goes back further than the last on-disk chunk 
> boundary, the datanode will still do a recalculation, but this occurs rarely, 
> only during pipeline recoveries. Thus the optimization for this specific case 
> should be sufficient to speed up the vast majority of cases.
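The boundary arithmetic implied above can be sketched as follows. The class and method names are hypothetical, not the actual BlockReceiver code, and a 512-byte chunk is assumed:

```java
// Hypothetical sketch of the alignment check described in the issue; the
// real logic lives in BlockReceiver and uses different names.
public class CrcReuseCheck {
    static final int CHUNK_SIZE = 512; // bytes covered by one CRC

    /**
     * The datanode can reuse the client-supplied checksum when the incoming
     * packet starts exactly at the last full chunk boundary at or before the
     * current on-disk length, i.e. the duplicate portion lines up with a
     * chunk boundary. Anything reaching further back forces recalculation.
     */
    public static boolean canReuseClientCrc(long onDiskLen,
                                            long packetOffsetInBlock) {
        long lastChunkBoundary = (onDiskLen / CHUNK_SIZE) * CHUNK_SIZE;
        return packetOffsetInBlock == lastChunkBoundary;
    }
}
```

With the numbers from the HDFS-4660 report, an on-disk length of 134028 puts the last chunk boundary at 133632, so a packet starting there qualifies for reuse.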



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes

2016-02-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128736#comment-15128736
 ] 

Junping Du commented on HDFS-8722:
--

I have merged the 2.6 patch to branch-2.6. Thanks [~kihwal] for helping with this!



[jira] [Updated] (HDFS-9669) TcpPeerServer should respect ipc.server.listen.queue.size

2016-02-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9669:
---
  Resolution: Fixed
   Fix Version/s: 2.7.3
Target Version/s: 2.7.3
  Status: Resolved  (was: Patch Available)

> TcpPeerServer should respect ipc.server.listen.queue.size
> -
>
> Key: HDFS-9669
> URL: https://issues.apache.org/jira/browse/HDFS-9669
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 2.7.3
>
> Attachments: HDFS-9669.0.patch, HDFS-9669.1.patch, HDFS-9669.1.patch
>
>
> On periods of high traffic we are seeing:
> {code}
> 16/01/19 23:40:40 WARN hdfs.DFSClient: Connection failure: Failed to connect 
> to /10.138.178.47:50010 for file /MYPATH/MYFILE for block 
> BP-1935559084-10.138.112.27-1449689748174:blk_1080898601_7375294:java.io.IOException:
>  Connection reset by peer
> java.io.IOException: Connection reset by peer
>   at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>   at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
>   at 
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:109)
>   at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
> {code}
> At the time this happens there are far fewer xceivers active than configured.
> On most JDKs this makes 50 the total backlog at any time. This 
> effectively means that any GC pause plus busy time will result in TCP resets.
> http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/net/ServerSocket.java#l370
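The fallback linked above can be sidestepped by passing the configured queue size explicitly. A minimal sketch, assuming the value of ipc.server.listen.queue.size has already been read from the configuration:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Illustrative sketch: a ServerSocket created without a backlog argument
// falls back to an implementation-defined accept-queue length (50 on most
// JDKs). Passing the configured value avoids that.
public class ListenQueue {
    /** Bind an ephemeral port with the given listen(2) backlog. */
    public static boolean canBind(int backlog) {
        try (ServerSocket ss = new ServerSocket(0, backlog)) {
            // The second constructor argument is the accept-queue length
            // handed to the OS; port 0 picks an ephemeral port.
            return ss.isBound();
        } catch (IOException e) {
            return false;
        }
    }
}
```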





[jira] [Created] (HDFS-9741) libhdfs++: GetLastError not returning meaningful messages after some failures

2016-02-02 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-9741:


 Summary: libhdfs++: GetLastError not returning meaningful messages 
after some failures
 Key: HDFS-9741
 URL: https://issues.apache.org/jira/browse/HDFS-9741
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


After failing to open a file, the text for GetLastErrorMessage is not being 
set.  It should be.





[jira] [Commented] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128965#comment-15128965
 ] 

Kihwal Lee commented on HDFS-9739:
--

If the patch works, {{TestDatanodeStartupFixesLegacyStorageIDs}} should fail 
until HDFS-9730 is fixed.  It wasn't failing in trunk and branch-2, despite the 
bug, because of the broken {{isValidStorageId()}}.

> DatanodeStorage.isValidStorageId() is broken
> 
>
> Key: HDFS-9739
> URL: https://issues.apache.org/jira/browse/HDFS-9739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9739.000.patch
>
>
> After HDFS-8979, the check returns true for the old storage ID format, 
> so storage IDs in the old format won't be updated during datanode upgrade. 





[jira] [Commented] (HDFS-4660) Block corruption can happen during pipeline recovery

2016-02-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128690#comment-15128690
 ] 

Junping Du commented on HDFS-4660:
--

I have committed the 2.6 patch to branch-2.6. Thanks [~kihwal] for updating the 
patch.

> Block corruption can happen during pipeline recovery
> 
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Peng Zhang
>Assignee: Kihwal Lee
>Priority: Blocker
> Fix For: 2.7.1, 2.6.4
>
> Attachments: HDFS-4660.br26.patch, HDFS-4660.patch, HDFS-4660.patch, 
> HDFS-4660.v2.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> client send packet after recover pipeline
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be ceil(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking the meta file on DN4, I found the checksum of chunk 262 is 
> duplicated, but the data is not.
> Later, after the block was finalized, DN4's scanner detected the bad block and 
> reported it to the NN. The NN sent a command to delete this block and 
> re-replicate it from another DN in the pipeline to satisfy the replication 
> factor.
> I think this is because BlockReceiver skips data bytes already written, 
> but does not skip checksum bytes already written. And the function 
> adjustCrcFilePosition is only used for the last non-completed chunk, not
> for this situation.
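The meta-file offsets in the log follow directly from the checksum layout: a 7-byte header plus one 4-byte CRC per (possibly partial) 512-byte chunk. A small sketch of that arithmetic, using the default constants:

```java
// Sketch of the checksum-file length arithmetic behind the log lines above.
// A block's meta file holds a 7-byte header plus one 4-byte CRC per
// (possibly partial) 512-byte chunk, so the expected end position is
// 7 + 4 * ceil(blockLen / 512). Constants mirror the HDFS defaults.
public class MetaFileMath {
    static final int CHUNK_SIZE = 512;
    static final int CRC_SIZE = 4;
    static final int HEADER_SIZE = 7;

    public static long expectedMetaEnd(long blockLen) {
        long chunks = (blockLen + CHUNK_SIZE - 1) / CHUNK_SIZE; // ceil
        return HEADER_SIZE + CRC_SIZE * chunks;
    }
}
```

For a block length of 134640 this gives 1059, matching the expected value in the log (the observed 1063 is one CRC too long, i.e. the duplicated chunk-262 checksum).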





[jira] [Commented] (HDFS-9698) Long running Balancer should renew TGT

2016-02-02 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128733#comment-15128733
 ] 

Zhe Zhang commented on HDFS-9698:
-

Thanks Jing and Chris for the helpful discussion. Yes, I believe the current 
logic already handles relogin. 

The production clusters encountering the bug actually have the HADOOP-10786 
fix. Unfortunately we didn't keep complete error logs except for the one 
snippet above. The only reasons I can think of are 1) a keytab issue; 2) retry 
disabled in the config. I'll close this JIRA and open a new one when the issue 
next surfaces.

> Long running Balancer should renew TGT
> --
>
> Key: HDFS-9698
> URL: https://issues.apache.org/jira/browse/HDFS-9698
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, security
>Affects Versions: 2.6.3
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9698.00.patch
>
>
> When the {{Balancer}} runs beyond the configured TGT lifetime, the current 
> logic won't renew TGT.
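One way a long-running daemon can handle this is a time-based relogin check before issuing RPCs. A hedged sketch under the assumption that the real fix would delegate to UGI's keytab relogin (stubbed here as a callback); all names are illustrative:

```java
// Illustrative sketch of a TGT renewal policy for a long-running process.
// The actual Balancer fix would rely on UserGroupInformation's relogin
// methods; here the relogin action is injected so the timing logic is
// testable on its own.
public class TgtRenewalPolicy {
    private final long lifetimeMs;
    private long lastLoginMs;

    public TgtRenewalPolicy(long lifetimeMs, long lastLoginMs) {
        this.lifetimeMs = lifetimeMs;
        this.lastLoginMs = lastLoginMs;
    }

    /** Relogin once 80% of the ticket lifetime has elapsed. */
    public boolean shouldRelogin(long nowMs) {
        return nowMs - lastLoginMs >= lifetimeMs * 8 / 10;
    }

    public void maybeRelogin(long nowMs, Runnable doRelogin) {
        if (shouldRelogin(nowMs)) {
            doRelogin.run();      // e.g. re-login from keytab in real code
            lastLoginMs = nowMs;  // restart the clock after a fresh TGT
        }
    }
}
```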





[jira] [Updated] (HDFS-9669) TcpPeerServer should respect ipc.server.listen.queue.size

2016-02-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9669:
---
Issue Type: Improvement  (was: Bug)



[jira] [Commented] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128939#comment-15128939
 ] 

Mingliang Liu commented on HDFS-9739:
-

Thanks [~kihwal] for reporting this.

The patch in HDFS-8979 simply removed one line of code: 
{{UUID.fromString(storageID.substring(STORAGE_ID_PREFIX.length()));}}. It was 
removed because I thought it wasn't needed, since we create a UUID and then 
ignore it.

Actually, if the {{storageID}} with the prefix stripped does not conform to the 
UUID string representation (the old format?), {{DatanodeStorage#isValidStorageId()}} 
should return false. The old code handled this via the thrown 
{{IllegalArgumentException}}, while the HDFS-8979 patch removed this check.

A simple fix is to add the {{UUID.fromString()}} call back.
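A minimal sketch of the restored check; the real constant and method live in DatanodeStorage, so the names here are illustrative:

```java
import java.util.UUID;

// Sketch of the check being discussed: a storage ID is valid only if it is
// the "DS-" prefix followed by a parseable UUID. The old, non-UUID format
// makes UUID.fromString() throw IllegalArgumentException, which is exactly
// the signal the removed line provided.
public class StorageIdCheck {
    static final String STORAGE_ID_PREFIX = "DS-";

    public static boolean isValidStorageId(String storageID) {
        if (storageID == null || !storageID.startsWith(STORAGE_ID_PREFIX)) {
            return false;
        }
        try {
            UUID.fromString(storageID.substring(STORAGE_ID_PREFIX.length()));
            return true;
        } catch (IllegalArgumentException e) {
            return false; // old format: trigger regeneration during upgrade
        }
    }
}
```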



[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128971#comment-15128971
 ] 

Colin Patrick McCabe commented on HDFS-9700:


Good find, [~ghelmling].

bq. [~iwasakims] wrote: Adding configuration key such as 
HdfsClientConfigKeys.DFS_CLIENT_SOCKET_TCP_NODELAY might be conservative option 
to retain existing behaviour and change the default value later. (You can see 
HDFS-8829 and HDFS-9259 as example for the fix.)

Yeah, it makes sense to have a separate configuration key controlling whether 
{{TCP_NODELAY}} is set on {{DataTransferProtocol}}.

I think we should change the default to be that TCP_NODELAY is "on" for both 
{{DataTransferProtocol}} and Hadoop RPC.  We already try to avoid sending small 
messages over DataTransferProtocol, so Nagle's algorithm doesn't add a lot (and 
may significantly degrade the performance of things like hflush and hsync).

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-v1.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.
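A small sketch of the proposed fix. The configuration lookup is stubbed with a boolean parameter; real code would read ipc.client.tcpnodelay from the Configuration object, and the class name here is hypothetical:

```java
import java.net.Socket;
import java.net.SocketException;

// Sketch: apply the configured TCP_NODELAY value to a freshly created
// socket before it is used for the pipeline. setTcpNoDelay(true) disables
// Nagle's algorithm, which matters for small, latency-sensitive writes.
public class PipelineSocketFactory {
    public static Socket newSocket(boolean tcpNoDelay) {
        Socket sock = new Socket();
        try {
            sock.setTcpNoDelay(tcpNoDelay);
        } catch (SocketException e) {
            throw new RuntimeException(e);
        }
        return sock;
    }

    /** Helper: create a socket and report the option actually set on it. */
    public static boolean noDelayIsSet(boolean tcpNoDelay) {
        try {
            return newSocket(tcpNoDelay).getTcpNoDelay();
        } catch (SocketException e) {
            throw new RuntimeException(e);
        }
    }
}
```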





[jira] [Commented] (HDFS-9732) Remove DelegationTokenIdentifier.toString() —for better logging output

2016-02-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128689#comment-15128689
 ] 

Chris Nauroth commented on HDFS-9732:
-

Yes, agreed.  I would be +1 for an approach that converts {{hdfs fetchdt}} to 
use its own specific string formatting logic.  I think reliance on {{toString}} 
in any public contract is generally a bad idea for the same reasons you 
described.  People expect {{toString}} to be useful primarily for debugging, 
and they want it to be easily evolved to add more information.

BTW, was the current attachment meant to go on a different JIRA?

> Remove DelegationTokenIdentifier.toString() —for better logging output
> --
>
> Key: HDFS-9732
> URL: https://issues.apache.org/jira/browse/HDFS-9732
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Steve Loughran
> Attachments: HADOOP-12752-001.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> HDFS {{DelegationTokenIdentifier.toString()}} adds some diagnostic info: 
> owner, sequence number. But its superclass, 
> {{AbstractDelegationTokenIdentifier}}, contains a lot more information, 
> including token issue and expiry times.
> Because {{DelegationTokenIdentifier.toString()}} doesn't include this data,
> information that is potentially useful for Kerberos diagnostics is lost.
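The direction discussed in the comments, giving {{hdfs fetchdt}} its own formatter so the printed fields become a stable contract independent of {{toString}}, can be sketched roughly as below. The field list is a hypothetical stand-in for the real token identifier fields:

```java
// Sketch of a dedicated formatter for command-line token output, decoupled
// from toString() so toString() stays free to evolve for debugging. The
// parameters stand in for fields of the real delegation token identifier.
public class TokenFormatter {
    public static String describe(String owner, long issueDate, long maxDate,
                                  int sequenceNumber) {
        // Explicit, versionable output format owned by the CLI tool.
        return String.format("owner=%s, issued=%d, expires=%d, seq=%d",
                owner, issueDate, maxDate, sequenceNumber);
    }
}
```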





[jira] [Updated] (HDFS-4660) Block corruption can happen during pipeline recovery

2016-02-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-4660:
-
Fix Version/s: 2.6.4



[jira] [Created] (HDFS-9742) TestAclsEndToEnd#testGoodWithWhitelistWithoutBlacklist occasionally fails in java8 trunk

2016-02-02 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-9742:
-

 Summary: TestAclsEndToEnd#testGoodWithWhitelistWithoutBlacklist 
occasionally fails in java8 trunk
 Key: HDFS-9742
 URL: https://issues.apache.org/jira/browse/HDFS-9742
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Wei-Chiu Chuang


TestAclsEndToEnd#testGoodWithWhitelistWithoutBlacklist was added in HDFS-9295. 
It sometimes fails in the java8 trunk branch with the following log:

https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/838/testReport/junit/org.apache.hadoop.hdfs/TestAclsEndToEnd/testGoodWithWhitelistWithoutBlacklist/

Error Message
{noformat}
Exception during deletion of file /tmp/BLUEZONE/file1 by keyadmin
{noformat}
Stacktrace
{noformat}
java.lang.AssertionError: Exception during deletion of file /tmp/BLUEZONE/file1 
by keyadmin
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.TestAclsEndToEnd.doFullAclTest(TestAclsEndToEnd.java:471)
at 
org.apache.hadoop.hdfs.TestAclsEndToEnd.testGoodWithWhitelistWithoutBlacklist(TestAclsEndToEnd.java:368)
{noformat}





[jira] [Commented] (HDFS-9721) Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference

2016-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128781#comment-15128781
 ] 

Hudson commented on HDFS-9721:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9225 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9225/])
HDFS-9721. Allow Delimited PB OIV tool to run upon fsimage that contains (lei: 
rev 9d494f0c0eaa05417f3a3e88487d878d1731da36)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageTextWriter.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/IgnoreSnapshotException.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageDelimitedTextWriter.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java


> Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference
> 
>
> Key: HDFS-9721
> URL: https://issues.apache.org/jira/browse/HDFS-9721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9721.01.patch, HDFS-9721.02.patch, 
> HDFS-9721.03.patch, HDFS-9721.04.patch, HDFS-9721.05.patch
>
>
> HDFS-6673 added the feature of Delimited format OIV tool on protocol buffer 
> based fsimage.
> However, if the fsimage contains {{INodeReference}}, the tool fails because:
> {code}Preconditions.checkState(e.getRefChildrenCount() == 0);{code}
> This jira proposes allowing the tool to finish, so that users can get the 
> full metadata.
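The relaxation can be sketched as replacing the hard precondition with a count-and-skip path, so one unsupported entry no longer aborts the whole run. The structure below is purely illustrative; the real change operates on the fsimage protobuf entries:

```java
import java.util.List;

// Sketch: instead of Preconditions.checkState(refChildren == 0) failing the
// entire OIV run, skip entries with snapshot references and keep emitting
// the rest of the metadata, reporting how many were skipped.
public class DirEntryProcessor {
    public static int process(List<Integer> refChildrenCounts,
                              List<String> out) {
        int skipped = 0;
        for (int refChildren : refChildrenCounts) {
            if (refChildren > 0) {
                skipped++;        // previously: checkState(refChildren == 0)
                continue;         // skip the INodeReference, keep going
            }
            out.add("inode");     // stand-in for writing the real entry
        }
        return skipped;
    }
}
```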





[jira] [Updated] (HDFS-9260) Improve the performance and GC friendliness of NameNode startup and full block reports

2016-02-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9260:
---
Summary: Improve the performance and GC friendliness of NameNode startup 
and full block reports  (was: Improve performance and GC friendliness of 
startup and FBRs)

> Improve the performance and GC friendliness of NameNode startup and full 
> block reports
> --
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, 
> HDFS-9260.016.patch, HDFS-9260.017.patch, HDFS-9260.018.patch, 
> HDFSBenchmarks.zip, HDFSBenchmarks2.zip
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> I would like to hear people's feedback on this change.
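A rough sketch of the core idea: keeping replica block IDs in a sorted primitive array lets a full block report be matched via binary search over a compact, GC-friendly structure instead of per-block hash lookups on boxed objects. This is illustrative only, not the patch's actual data structure:

```java
import java.util.Arrays;

// Sketch: a sorted long[] of block IDs supports O(log n) membership checks
// with no per-entry object headers, which is gentler on the GC than a
// HashMap of boxed keys when processing millions of blocks.
public class SortedReplicaSet {
    private final long[] blockIds;

    public SortedReplicaSet(long[] ids) {
        blockIds = ids.clone();
        Arrays.sort(blockIds); // sort once; lookups stay logarithmic
    }

    public boolean contains(long blockId) {
        return Arrays.binarySearch(blockIds, blockId) >= 0;
    }
}
```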





[jira] [Commented] (HDFS-9669) TcpPeerServer should respect ipc.server.listen.queue.size

2016-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128838#comment-15128838
 ] 

Hudson commented on HDFS-9669:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9226 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9226/])
HDFS-9669. TcpPeerServer should respect ipc.server.listen.queue.size (cmccabe: 
rev 2da03b48eba53d4dec2a77209ad9835d808171d1)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/net/TcpPeerServer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt




[jira] [Commented] (HDFS-9525) hadoop utilities need to support provided delegation tokens

2016-02-02 Thread HeeSoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128897#comment-15128897
 ] 

HeeSoo Kim commented on HDFS-9525:
--

[~steve_l] Thank you for your feedback.

{quote}
{{"hadoop.token.files"}} is not a core-default file, it is a system property.
{quote}
The {{"hadoop.token.files"}} property can be defined in two places: 
one is the core-default file and the other is a system property. The code is 
intentional, since we considered both use cases.
In general, at runtime, the user sets the system property.
However, if the user periodically obtains the token somehow and stores it in a 
specific directory on their system, they can also put the token 
filename in the core-default file. The code handles the case where the file 
does not exist: even if the file is missing, it won't break the job, which will 
continue to work without the user-specified credential files.

{quote}
Add some more logging too. Print out the files before they are loaded? Please.
{quote}
I think of it as an extension of the HADOOP_TOKEN_FILE_LOCATION feature.

{quote}
Finally, why skip files that aren't there or aren't files? Isn't that a sign of 
an error? 
{quote}
As I explained above, the job won't break even if the token files are not
available.
We don't know whether a credential has expired or whether a token file exists.
This allows the job to keep working even without the right credential for a
service. For instance, if it needs to access a WebHDFS filesystem and the
credential listed in {{hadoop.token.files}} is not available, it will fall back
to SPNEGO to obtain a new token, so the job can continue without stopping.

{quote}
Otherwise, someone —and I fear it shall be me— will end up trying to debug why 
a launched YARN app hasn't picked up credentials from oozie, with the cause 
being a typo in the path which was logged at all
{quote}
When credentials are shipped to the distributed system, the Credentials class
holds multiple tokens, which are stored in the single file named by
HADOOP_TOKEN_FILE_LOCATION. If the initial client application reads the
credential tokens successfully, they can be distributed to other jobs.

{quote}
{{String files = System.getProperty("hadoop.token.files", 
System.getenv("HADOOP_TOKEN_FILE_LOCATION"))}}
the env would get picked up, the sysprop override. Then have one follow on 
codepath with the logging I mentioned earlier.
As it is, there's now the situation that both options can be set. Is that 
really what is wanted?
{quote}
The main intention is to read credentials from as many files as
possible.
It allows multiple token filenames and does not break the previous
configuration.
For instance, YARN uses the HADOOP_TOKEN_FILE_LOCATION property as the default
credential filename, and that credential file holds multiple tokens. I think it
is better to also support multiple token filenames.
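The override order suggested in the review (system property wins over the environment variable, with multiple comma-separated token files allowed) can be sketched as follows; the class and method names are hypothetical, not the actual Hadoop code:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TokenFiles {
    // Hypothetical helper: the system property overrides the environment
    // variable, and a missing value yields an empty list so the job is not
    // broken by absent token files.
    static List<String> resolveTokenFiles(String sysprop, String env) {
        String files = (sysprop != null) ? sysprop : env;
        if (files == null || files.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(files.split(","));
    }

    public static void main(String[] args) {
        System.out.println(resolveTokenFiles("/tmp/t1,/tmp/t2", null));
    }
}
```

Callers would log each resolved path before loading it, which addresses the debuggability concern raised above.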

> hadoop utilities need to support provided delegation tokens
> ---
>
> Key: HDFS-9525
> URL: https://issues.apache.org/jira/browse/HDFS-9525
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-7984.001.patch, HDFS-7984.002.patch, 
> HDFS-7984.003.patch, HDFS-7984.004.patch, HDFS-7984.005.patch, 
> HDFS-7984.006.patch, HDFS-7984.007.patch, HDFS-7984.patch, 
> HDFS-9525.008.patch, HDFS-9525.009.patch, HDFS-9525.009.patch, 
> HDFS-9525.branch-2.008.patch, HDFS-9525.branch-2.009.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its 
> own. This would allow for cross-authentication-zone file system accesses.





[jira] [Comment Edited] (HDFS-9406) FSImage may get corrupted after deleting snapshot

2016-02-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127262#comment-15127262
 ] 

Yongjun Zhang edited comment on HDFS-9406 at 2/2/16 7:59 PM:
-

Committed to trunk, branch-2, branch-2.8, branch-2.7.

{quote}
commit 34ab50ea92370cc7440a8f7649286b148c2fde65
Date:   Mon Feb 1 11:23:44 2016 -0800

HDFS-9406. FSImage may get corrupted after deleting snapshot. (Contributed 
by Jing Zhao, Stanislav Antic, Vinayakumar B, Yongjun Zhang)
{quote}

Many thanks to [~stanislav.an...@gmail.com], [~jingzhao], and [~vinayrpet] for 
the contribution, really nice community work!





was (Author: yzhangal):
Committed to trunk, branch-2, branch-2.8, branch-2.7.

{quote}
commit 34ab50ea92370cc7440a8f7649286b148c2fde65
Author: Yongjun Zhang 
Date:   Mon Feb 1 11:23:44 2016 -0800

HDFS-9406. FSImage may get corrupted after deleting snapshot. (Contributed 
by Jing Zhao, Stanislav Antic, Vinayakumar B, Yongjun Zhang)
{quote}

Many thanks to [~stanislav.an...@gmail.com], [~jingzhao], and [~vinayrpet] for 
the contribution, really nice community work!




> FSImage may get corrupted after deleting snapshot
> -
>
> Key: HDFS-9406
> URL: https://issues.apache.org/jira/browse/HDFS-9406
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
> Environment: CentOS 6 amd64, CDH 5.4.4-1
> 2xCPU: Intel(R) Xeon(R) CPU E5-2640 v3
> Memory: 32GB
> Namenode blocks: ~700_000 blocks, no HA setup
>Reporter: Stanislav Antic
>Assignee: Yongjun Zhang
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HDFS-9406.001.patch, HDFS-9406.002.patch, 
> HDFS-9406.003.patch, HDFS-9406.branch-2.7.patch
>
>
> FSImage corruption happened after HDFS snapshots were taken. The cluster was 
> not in use at that time.
> When the namenode restarted, it reported a NullPointerException:
> {code}
> 15/11/07 10:01:15 INFO namenode.FileJournalManager: Recovering unfinalized 
> segments in /tmp/fsimage_checker_5857/fsimage/current
> 15/11/07 10:01:15 INFO namenode.FSImage: No edit log streams selected.
> 15/11/07 10:01:18 INFO namenode.FSImageFormatPBINode: Loading 1370277 INodes.
> 15/11/07 10:01:27 ERROR namenode.NameNode: Failed to start namenode.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.addChild(INodeDirectory.java:531)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.addToParent(FSImageFormatPBINode.java:252)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectorySection(FSImageFormatPBINode.java:202)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:261)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:180)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:226)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:929)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:913)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:732)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:668)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:281)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1061)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:765)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:584)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:643)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:810)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:794)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1487)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1553)
> 15/11/07 10:01:27 INFO util.ExitUtil: Exiting with status 1
> {code}
> Corruption happened after "07.11.2015 00:15"; after that time ~9300 blocks 
> were invalidated that shouldn't have been.
> After recovering the FSImage I discovered that around ~9300 blocks were missing.
> -I also attached log of namenode before and after corruption happened.-





[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad

2016-02-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128715#comment-15128715
 ] 

Junping Du commented on HDFS-9178:
--

Thanks [~kihwal] for reviewing the patch! I have committed the patch to branch-2.6.

> Slow datanode I/O can cause a wrong node to be marked bad
> -
>
> Key: HDFS-9178
> URL: https://issues.apache.org/jira/browse/HDFS-9178
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2, 2.6.4
>
> Attachments: 002-HDFS-9178.branch-2.6.patch, 
> HDFS-9178.branch-2.6.patch, HDFS-9178.patch
>
>
> When a non-leaf datanode in a pipeline is slow on or stuck at disk I/O, the 
> downstream node can time out reading packets, since even the heartbeat 
> packets will not be relayed down.
> The packet read timeout is set in {{DataXceiver#run()}}:
> {code}
>   peer.setReadTimeout(dnConf.socketTimeout);
> {code}
> When the downstream node times out and closes the connection to the upstream, 
> the upstream node's {{PacketResponder}} gets an {{EOFException}} and sends an 
> ack upstream with the downstream node's status set to {{ERROR}}. This causes 
> the client to exclude the downstream node, even though the upstream node was 
> the one that got stuck.
> The connection to the downstream has a longer timeout, so the downstream will 
> always time out first. The downstream timeout is set in {{writeBlock()}}:
> {code}
>   int timeoutValue = dnConf.socketTimeout +
>   (HdfsConstants.READ_TIMEOUT_EXTENSION * targets.length);
>   int writeTimeout = dnConf.socketWriteTimeout +
>   (HdfsConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
>   NetUtils.connect(mirrorSock, mirrorTarget, timeoutValue);
>   OutputStream unbufMirrorOut = NetUtils.getOutputStream(mirrorSock,
>   writeTimeout);
> {code}
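The extension arithmetic quoted above can be sketched as follows; the 5-second extension value mirrors what HdfsConstants defines, but treat it as an assumption here:

```java
public class PipelineTimeouts {
    // Extension added per remaining downstream target; HdfsConstants defines
    // READ_TIMEOUT_EXTENSION (assumed to be 5 seconds for this sketch).
    static final int READ_TIMEOUT_EXTENSION = 5_000;

    // Mirrors the quoted writeBlock() arithmetic: each node's timeout grows
    // with the number of targets still downstream of it, so an upstream node
    // should always outlast its downstream neighbor.
    static int readTimeout(int socketTimeout, int targetsDownstream) {
        return socketTimeout + READ_TIMEOUT_EXTENSION * targetsDownstream;
    }

    public static void main(String[] args) {
        // 60s base socket timeout with 2 remaining downstream targets.
        System.out.println(readTimeout(60_000, 2));
    }
}
```

This staggering is what makes the downstream node time out first, which is why the wrong node ends up marked bad.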





[jira] [Updated] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad

2016-02-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9178:
-
Fix Version/s: 2.6.4

> Slow datanode I/O can cause a wrong node to be marked bad
> -
>
> Key: HDFS-9178
> URL: https://issues.apache.org/jira/browse/HDFS-9178
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2, 2.6.4
>
> Attachments: 002-HDFS-9178.branch-2.6.patch, 
> HDFS-9178.branch-2.6.patch, HDFS-9178.patch
>
>
> When a non-leaf datanode in a pipeline is slow on or stuck at disk I/O, the 
> downstream node can time out reading packets, since even the heartbeat 
> packets will not be relayed down.
> The packet read timeout is set in {{DataXceiver#run()}}:
> {code}
>   peer.setReadTimeout(dnConf.socketTimeout);
> {code}
> When the downstream node times out and closes the connection to the upstream, 
> the upstream node's {{PacketResponder}} gets an {{EOFException}} and sends an 
> ack upstream with the downstream node's status set to {{ERROR}}. This causes 
> the client to exclude the downstream node, even though the upstream node was 
> the one that got stuck.
> The connection to the downstream has a longer timeout, so the downstream will 
> always time out first. The downstream timeout is set in {{writeBlock()}}:
> {code}
>   int timeoutValue = dnConf.socketTimeout +
>   (HdfsConstants.READ_TIMEOUT_EXTENSION * targets.length);
>   int writeTimeout = dnConf.socketWriteTimeout +
>   (HdfsConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
>   NetUtils.connect(mirrorSock, mirrorTarget, timeoutValue);
>   OutputStream unbufMirrorOut = NetUtils.getOutputStream(mirrorSock,
>   writeTimeout);
> {code}





[jira] [Commented] (HDFS-9395) getContentSummary and other FS operations are audit logged as success even if failed

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128753#comment-15128753
 ] 

Hadoop QA commented on HDFS-9395:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
186 unchanged - 0 fixed = 187 total (was 186) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 9s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 35s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 50m 49s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 127m 51s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Redundant nullcheck of effectiveDirective which is known to be null in 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheDirective(CacheDirectiveInfo,
 EnumSet, 

[jira] [Updated] (HDFS-9721) Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference

2016-02-02 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9721:

  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.8.0
  3.0.0
Target Version/s: 3.0.0, 2.8.0
  Status: Resolved  (was: Patch Available)

+1. Thanks for the great work, [~xiaochen]

Committed to {{trunk}}, {{branch-2}} and {{branch-2.8}}

> Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference
> 
>
> Key: HDFS-9721
> URL: https://issues.apache.org/jira/browse/HDFS-9721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9721.01.patch, HDFS-9721.02.patch, 
> HDFS-9721.03.patch, HDFS-9721.04.patch, HDFS-9721.05.patch
>
>
> HDFS-6673 added the feature of Delimited format OIV tool on protocol buffer 
> based fsimage.
> However, if the fsimage contains {{INodeReference}}, the tool fails because:
> {code}Preconditions.checkState(e.getRefChildrenCount() == 0);{code}
> This jira proposes to allow the tool to finish, so that users can get the 
> full metadata.
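The failure mode quoted above can be illustrated with a minimal stand-in for Guava's {{Preconditions.checkState}}; the class and the sample count are hypothetical, not OIV code:

```java
public class CheckStateDemo {
    // Minimal stand-in for Guava's Preconditions.checkState, showing why the
    // Delimited OIV tool aborted when it met an INodeReference entry.
    static void checkState(boolean expression) {
        if (!expression) {
            throw new IllegalStateException();
        }
    }

    public static void main(String[] args) {
        int refChildrenCount = 1; // hypothetical fsimage entry with a reference child
        try {
            checkState(refChildrenCount == 0); // mirrors the quoted precondition
        } catch (IllegalStateException e) {
            System.out.println("tool aborts here without the fix");
        }
    }
}
```

Any fsimage containing even one INodeReference trips the check, so the whole dump fails rather than degrading gracefully.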





[jira] [Commented] (HDFS-9669) TcpPeerServer should respect ipc.server.listen.queue.size

2016-02-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128789#comment-15128789
 ] 

Colin Patrick McCabe commented on HDFS-9669:


+1.  Thanks, [~eclark].

> TcpPeerServer should respect ipc.server.listen.queue.size
> -
>
> Key: HDFS-9669
> URL: https://issues.apache.org/jira/browse/HDFS-9669
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HDFS-9669.0.patch, HDFS-9669.1.patch, HDFS-9669.1.patch
>
>
> On periods of high traffic we are seeing:
> {code}
> 16/01/19 23:40:40 WARN hdfs.DFSClient: Connection failure: Failed to connect 
> to /10.138.178.47:50010 for file /MYPATH/MYFILE for block 
> BP-1935559084-10.138.112.27-1449689748174:blk_1080898601_7375294:java.io.IOException:
>  Connection reset by peer
> java.io.IOException: Connection reset by peer
>   at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>   at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
>   at 
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:109)
>   at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
> {code}
> At the time this happens there are far fewer xceivers than configured.
> On most JDKs this leaves the total accept backlog capped at 50, which
> effectively means that any GC pause plus busy time will result in TCP resets.
> http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/net/ServerSocket.java#l370





[jira] [Commented] (HDFS-9721) Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference

2016-02-02 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128797#comment-15128797
 ] 

Xiao Chen commented on HDFS-9721:
-

Thanks for the helpful reviews and commit [~eddyxu].

> Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference
> 
>
> Key: HDFS-9721
> URL: https://issues.apache.org/jira/browse/HDFS-9721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9721.01.patch, HDFS-9721.02.patch, 
> HDFS-9721.03.patch, HDFS-9721.04.patch, HDFS-9721.05.patch
>
>
> HDFS-6673 added the feature of Delimited format OIV tool on protocol buffer 
> based fsimage.
> However, if the fsimage contains {{INodeReference}}, the tool fails because:
> {code}Preconditions.checkState(e.getRefChildrenCount() == 0);{code}
> This jira proposes to allow the tool to finish, so that users can get the 
> full metadata.





[jira] [Updated] (HDFS-9731) Erasure Coding: Rename BlockECRecoveryCommand to BlockECReconstructionCommand

2016-02-02 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9731:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks Rakesh. I verified the reported test failures and none could be reproduced 
locally (they are not related to this refactoring either). I just committed 
the patch to trunk.

> Erasure Coding: Rename BlockECRecoveryCommand to BlockECReconstructionCommand
> -
>
> Key: HDFS-9731
> URL: https://issues.apache.org/jira/browse/HDFS-9731
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0
>
> Attachments: HDFS-9731-001.patch
>
>
> This sub-task is to revisit the EC recovery logic and recast it as 
> _reconstruction_, i.e., rename the EC-related block repair logic to "reconstruction".





[jira] [Assigned] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HDFS-9739:
---

Assignee: Mingliang Liu

> DatanodeStorage.isValidStorageId() is broken
> 
>
> Key: HDFS-9739
> URL: https://issues.apache.org/jira/browse/HDFS-9739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Mingliang Liu
>Priority: Critical
>
> After HDFS-8979, the check returns true for the old storage ID format, 
> so storage IDs in the old format won't be updated during datanode upgrade. 





[jira] [Commented] (HDFS-7708) Balancer should delete its pid file when it completes rebalance

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128866#comment-15128866
 ] 

Hadoop QA commented on HDFS-7708:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 21s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 30s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 56s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 195m 30s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | 

[jira] [Commented] (HDFS-9260) Improve the performance and GC friendliness of NameNode startup and full block reports

2016-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128869#comment-15128869
 ] 

Hudson commented on HDFS-9260:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9227 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9227/])
HDFS-9260. Improve the performance and GC friendliness of NameNode (cmccabe: 
rev dd9ebf6eedfd4ff8b3486eae2a446de6b0c7fa8a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAddStripedBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlockReportContext.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeHotSwapVolumes.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoStriped.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDnRespectsBlockReportSplitThreshold.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestNNHandlesCombinedBlockReport.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestTriggerBlockReport.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeStorageInfo.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfo.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/FoldedTreeSet.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/FoldedTreeSetTest.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestNNHandlesBlockReportPerStorage.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java



[jira] [Updated] (HDFS-9260) Improve the performance and GC friendliness of NameNode startup and full block reports

2016-02-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9260:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks, [~sfriberg] and [~jingzhao].

> Improve the performance and GC friendliness of NameNode startup and full 
> block reports
> --
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Fix For: 3.0.0
>
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, 
> HDFS-9260.016.patch, HDFS-9260.017.patch, HDFS-9260.018.patch, 
> HDFSBenchmarks.zip, HDFSBenchmarks2.zip
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change.
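The sorted-structure idea above can be illustrated with a toy merge (this is not the patch's actual FoldedTreeSet; the class and method names here are purely illustrative): once both the stored replicas and an incoming full block report are sorted by block ID, reconciling them is a single linear pass with no per-block hash lookups or temporary allocations.

```java
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

// Toy illustration of the patch's core idea (NOT the actual FoldedTreeSet):
// when the stored replicas and the incoming full block report are both
// sorted by block ID, reconciliation is one linear merge pass instead of a
// hash lookup per reported block.
public class SortedReportMerge {

    // Returns block IDs present in storage but absent from the report,
    // i.e. replicas the datanode no longer claims to have.
    public static TreeSet<Long> missingFromReport(TreeSet<Long> stored,
                                                  List<Long> sortedReport) {
        TreeSet<Long> missing = new TreeSet<>();
        int i = 0;
        for (Iterator<Long> it = stored.iterator(); it.hasNext(); ) {
            long s = it.next();
            // Advance past report entries below the current stored ID.
            while (i < sortedReport.size() && sortedReport.get(i) < s) {
                i++;
            }
            if (i >= sortedReport.size() || sortedReport.get(i) != s) {
                missing.add(s);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        TreeSet<Long> stored = new TreeSet<>(List.of(1L, 3L, 5L, 9L));
        // Blocks 3 and 9 are stored but not reported.
        System.out.println(missingFromReport(stored, List.of(1L, 2L, 5L)));  // [3, 9]
    }
}
```

The single pass also avoids allocating per-block lookup entries, which is where the GC-friendliness claim comes from.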



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9740) Use a reasonable limit in DFSTestUtil.waitForMetric()

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128876#comment-15128876
 ] 

Hadoop QA commented on HDFS-9740:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 25s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 36s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 140m 5s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork |
|   | hadoop.hdfs.TestRenameWhileOpen |
|   | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | 

[jira] [Commented] (HDFS-9744) TestDirectoryScanner#testThrottling occasionally time out after 300 seconds

2016-02-02 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129033#comment-15129033
 ] 

Daniel Templeton commented on HDFS-9744:


Bump the timeout (or just remove it).  If you drop the block count, you'll 
compromise the test.

> TestDirectoryScanner#testThrottling occasionally time out after 300 seconds
> ---
>
> Key: HDFS-9744
> URL: https://issues.apache.org/jira/browse/HDFS-9744
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Priority: Minor
>  Labels: test
>
> I have seen quite a few test failures in TestDirectoryScanner#testThrottling.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2793/testReport/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testThrottling/
> Looking at the log, it does not look like the test got stuck. On my local 
> machine, this test took 219 seconds. It is likely that this test takes more 
> than 300 seconds to complete on a busy Jenkins slave. I think it is 
> reasonable to set a longer timeout value, or reduce the number of blocks to 
> reduce the duration of the test.
> Error Message
> {noformat}
> test timed out after 300000 milliseconds
> {noformat}
> Stacktrace
> {noformat}
> java.lang.Exception: test timed out after 300000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:503)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.waitAndQueuePacket(DataStreamer.java:804)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacket(DFSOutputStream.java:423)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacketFull(DFSOutputStream.java:432)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:418)
>   at 
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
>   at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:125)
>   at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:111)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:418)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.createFile(TestDirectoryScanner.java:108)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:584)
> {noformat}





[jira] [Updated] (HDFS-9395) HDFS operations vary widely in which failures they put in the audit log and which they leave out

2016-02-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9395:
---
Summary: HDFS operations vary widely in which failures they put in the 
audit log and which they leave out  (was: getContentSummary and other FS 
operations are audit logged as success even if failed)

> HDFS operations vary widely in which failures they put in the audit log and 
> which they leave out
> 
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
> Attachments: HDFS-9395.001.patch, HDFS-9395.002.patch
>
>
> Audit logging is in the finally block along with the lock unlocking, so it 
> is always logged as success even in cases where a FileNotFoundException is 
> thrown.





[jira] [Commented] (HDFS-9744) TestDirectoryScanner#testThrottling occasionally time out after 300 seconds

2016-02-02 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129096#comment-15129096
 ] 

Wei-Chiu Chuang commented on HDFS-9744:
---

OK. I just feel it is a bit unfortunate that a single test can take more than 5 
minutes to complete. Is there any way to reduce the test time without 
compromising the test?

> TestDirectoryScanner#testThrottling occasionally time out after 300 seconds
> ---
>
> Key: HDFS-9744
> URL: https://issues.apache.org/jira/browse/HDFS-9744
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Priority: Minor
>  Labels: test
>
> I have seen quite a few test failures in TestDirectoryScanner#testThrottling.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2793/testReport/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testThrottling/
> Looking at the log, it does not look like the test got stuck. On my local 
> machine, this test took 219 seconds. It is likely that this test takes more 
> than 300 seconds to complete on a busy Jenkins slave. I think it is 
> reasonable to set a longer timeout value, or reduce the number of blocks to 
> reduce the duration of the test.
> Error Message
> {noformat}
> test timed out after 300000 milliseconds
> {noformat}
> Stacktrace
> {noformat}
> java.lang.Exception: test timed out after 300000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:503)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.waitAndQueuePacket(DataStreamer.java:804)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacket(DFSOutputStream.java:423)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacketFull(DFSOutputStream.java:432)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:418)
>   at 
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
>   at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:125)
>   at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:111)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:418)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.createFile(TestDirectoryScanner.java:108)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:584)
> {noformat}





[jira] [Updated] (HDFS-9403) Erasure coding: some EC tests are missing timeout

2016-02-02 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9403:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks Rui for the update! +1 on the v02 patch. I just committed it to trunk.

> Erasure coding: some EC tests are missing timeout
> -
>
> Key: HDFS-9403
> URL: https://issues.apache.org/jira/browse/HDFS-9403
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Affects Versions: 3.0.0
>Reporter: Zhe Zhang
>Assignee: GAO Rui
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-9403-origin-trunk.00.patch, 
> HDFS-9403-origin-trunk.01.patch, HDFS-9403-origin-trunk.02.patch
>
>
> The EC data writing pipeline is still being worked on, and bugs could cause 
> the program to hang. We should add a timeout to all tests involving striped 
> writing. I see at least the following:
> * {{TestErasureCodingPolicies}}
> * {{TestFileStatusWithECPolicy}}
> * {{TestDFSStripedOutputStream}}





[jira] [Commented] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129152#comment-15129152
 ] 

Mingliang Liu commented on HDFS-9739:
-

{{TestDatanodeStartupFixesLegacyStorageIDs}} is in the {{hadoop-hdfs}} module, 
and this patch did not trigger its run, as the change is in the 
{{hadoop-hdfs-client}} module.

> DatanodeStorage.isValidStorageId() is broken
> 
>
> Key: HDFS-9739
> URL: https://issues.apache.org/jira/browse/HDFS-9739
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Kihwal Lee
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9739.000.patch
>
>
> After HDFS-8979, the check is returning true for the old storage ID format. 
> So storage IDs in the old format won't be updated during datanode upgrade.
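A minimal sketch of the kind of check this issue concerns (hypothetical reconstruction; the real method is `DatanodeStorage.isValidStorageId()` and its details may differ): a new-format storage ID is "DS-" followed by a UUID, and the validator must reject legacy IDs so the datanode regenerates them during upgrade.

```java
import java.util.UUID;

// Hypothetical reconstruction of the storage-ID check; NOT the real
// DatanodeStorage implementation. New-format IDs are "DS-" plus a UUID;
// legacy IDs embed address/port/timestamp and must be rejected so they are
// regenerated during datanode upgrade.
public class StorageIdCheck {

    public static boolean isValidStorageId(String storageId) {
        final String prefix = "DS-";
        if (storageId == null || !storageId.startsWith(prefix)) {
            return false;
        }
        try {
            UUID.fromString(storageId.substring(prefix.length()));
            return true;  // well-formed UUID => new format
        } catch (IllegalArgumentException e) {
            return false; // e.g. a legacy "DS-<rand>-<ip>-<port>-<ts>" ID
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidStorageId("DS-" + UUID.randomUUID()));            // true
        System.out.println(isValidStorageId("DS-1234-127.0.0.1-50010-1396287189")); // false
    }
}
```

The bug class described here is a validator that accepts both formats, so the upgrade path never sees an "invalid" ID to replace.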





[jira] [Commented] (HDFS-9260) Improve the performance and GC friendliness of NameNode startup and full block reports

2016-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128986#comment-15128986
 ] 

Hudson commented on HDFS-9260:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9230 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9230/])
CHANGES.txt:  Move HDFS-9260 to trunk (cmccabe: rev 
913676dc355f17dc41b75be1b3a27114197ea52c)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Improve the performance and GC friendliness of NameNode startup and full 
> block reports
> --
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Fix For: 3.0.0
>
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, 
> HDFS-9260.016.patch, HDFS-9260.017.patch, HDFS-9260.018.patch, 
> HDFSBenchmarks.zip, HDFSBenchmarks2.zip
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change.





[jira] [Commented] (HDFS-9731) Erasure Coding: Rename BlockECRecoveryCommand to BlockECReconstructionCommand

2016-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128987#comment-15128987
 ] 

Hudson commented on HDFS-9731:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9230 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9230/])
HDFS-9731. Erasure Coding: Rename BlockECRecoveryCommand to (zhezhang: rev 
4ae543fdcd6dcfbe32257b1e72a405df9aa73e17)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestRecoverStripedBlocks.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlockECReconstructionCommand.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ErasureCodingWork.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReconstructStripedFile.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestReconstructStripedBlocks.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeProtocol.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/erasurecoding.proto
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlockECRecoveryCommand.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java


> Erasure Coding: Rename BlockECRecoveryCommand to BlockECReconstructionCommand
> -
>
> Key: HDFS-9731
> URL: https://issues.apache.org/jira/browse/HDFS-9731
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0
>
> Attachments: HDFS-9731-001.patch
>
>
> This sub-task is to revisit the EC recovery logic and rename it to 
> _reconstruction_, i.e., rename the EC-related block repair logic to "reconstruction".





[jira] [Updated] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9743:
-
Attachment: HDFS-9743-branch-2.7.001.patch

> Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7
> -
>
> Key: HDFS-9743
> URL: https://issues.apache.org/jira/browse/HDFS-9743
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9743-branch-2.7.001.patch
>
>
> The corresponding test case has been moved and fixed in trunk by HDFS-9073. 
> We should fix it in branch-2.7 too.





[jira] [Updated] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9743:
-
Status: Patch Available  (was: Open)

> Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7
> -
>
> Key: HDFS-9743
> URL: https://issues.apache.org/jira/browse/HDFS-9743
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9743-branch-2.7.001.patch
>
>
> The corresponding test case has been moved and fixed in trunk by HDFS-9073. 
> We should fix it in branch-2.7 too.





[jira] [Assigned] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee reassigned HDFS-9743:


Assignee: Kihwal Lee

> Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7
> -
>
> Key: HDFS-9743
> URL: https://issues.apache.org/jira/browse/HDFS-9743
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9743-branch-2.7.001.patch
>
>
> The corresponding test case has been moved and fixed in trunk by HDFS-9073. 
> We should fix it in branch-2.7 too.





[jira] [Created] (HDFS-9744) TestDirectoryScanner#testThrottling occasionally time out after 300 seconds

2016-02-02 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-9744:
-

 Summary: TestDirectoryScanner#testThrottling occasionally time out 
after 300 seconds
 Key: HDFS-9744
 URL: https://issues.apache.org/jira/browse/HDFS-9744
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
 Environment: Jenkins
Reporter: Wei-Chiu Chuang
Priority: Minor


I have seen quite a few test failures in TestDirectoryScanner#testThrottling.
https://builds.apache.org/job/Hadoop-Hdfs-trunk/2793/testReport/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testThrottling/

Looking at the log, it does not look like the test got stuck. On my local 
machine, this test took 219 seconds. It is likely that this test takes more 
than 300 seconds to complete on a busy Jenkins slave. I think it is reasonable 
to set a longer timeout value, or reduce the number of blocks to reduce the 
duration of the test.

Error Message
{noformat}
test timed out after 300000 milliseconds
{noformat}
Stacktrace
{noformat}
java.lang.Exception: test timed out after 300000 milliseconds
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at 
org.apache.hadoop.hdfs.DataStreamer.waitAndQueuePacket(DataStreamer.java:804)
at 
org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacket(DFSOutputStream.java:423)
at 
org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacketFull(DFSOutputStream.java:432)
at 
org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:418)
at 
org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:125)
at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:111)
at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:418)
at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:376)
at 
org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.createFile(TestDirectoryScanner.java:108)
at 
org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:584)
{noformat}





[jira] [Updated] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9739:

Component/s: hdfs-client

> DatanodeStorage.isValidStorageId() is broken
> 
>
> Key: HDFS-9739
> URL: https://issues.apache.org/jira/browse/HDFS-9739
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Kihwal Lee
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9739.000.patch
>
>
> After HDFS-8979, the check is returning true for the old storage ID format. 
> So storage IDs in the old format won't be updated during datanode upgrade.





[jira] [Updated] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9739:

Status: Patch Available  (was: Open)

> DatanodeStorage.isValidStorageId() is broken
> 
>
> Key: HDFS-9739
> URL: https://issues.apache.org/jira/browse/HDFS-9739
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Kihwal Lee
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9739.000.patch
>
>
> After HDFS-8979, the check is returning true for the old storage ID format. 
> So storage IDs in the old format won't be updated during datanode upgrade.





[jira] [Updated] (HDFS-9395) HDFS operations vary widely in which failures they put in the audit log and which they leave out

2016-02-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9395:
---
Description: 
So, the big question here is what should go in the audit log? All failures, or 
just "permission denied" failures? Or, to put it a different way, if someone 
attempts to do something and it fails because a file doesn't exist, is that 
worth an audit log entry?

We are currently inconsistent on this point. For example, concat, 
getContentSummary, addCacheDirective, and setErasureCodingPolicy create an 
audit log entry for all failures, but setOwner, delete, and setAclEntries 
attempt to only create an entry for AccessControlException-based failures. 
There are a few operations, like allowSnapshot, disallowSnapshot, and 
startRollingUpgrade that never create audit log failure entries at all. They 
simply log nothing for any failure, and log success for a successful operation.

So to summarize, different HDFS operations currently fall into 3 categories:
1. audit-log all failures
2. audit-log only AccessControlException failures
3. never audit-log failures

Which category is right? And how can we fix the inconsistency?

  was:Audit logging is in the finally block along with the lock unlocking, so 
it is always logged as success even in cases where a FileNotFoundException is 
thrown.


> HDFS operations vary widely in which failures they put in the audit log and 
> which they leave out
> 
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
> Attachments: HDFS-9395.001.patch, HDFS-9395.002.patch
>
>
> So, the big question here is what should go in the audit log? All failures, 
> or just "permission denied" failures? Or, to put it a different way, if 
> someone attempts to do something and it fails because a file doesn't exist, 
> is that worth an audit log entry?
> We are currently inconsistent on this point. For example, concat, 
> getContentSummary, addCacheDirective, and setErasureCodingPolicy create an 
> audit log entry for all failures, but setOwner, delete, and setAclEntries 
> attempt to only create an entry for AccessControlException-based failures. 
> There are a few operations, like allowSnapshot, disallowSnapshot, and 
> startRollingUpgrade that never create audit log failure entries at all. They 
> simply log nothing for any failure, and log success for a successful 
> operation.
> So to summarize, different HDFS operations currently fall into 3 categories:
> 1. audit-log all failures
> 2. audit-log only AccessControlException failures
> 3. never audit-log failures
> Which category is right? And how can we fix the inconsistency?
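The pattern behind the original bug report can be shown in a toy model (method names and log formats are illustrative, not FSNamesystem's real API): writing the audit entry in a finally block with a hard-coded success flag records failed operations as successes, while recording the real outcome first fixes it.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;

// Toy model of the audit-logging inconsistency; names are illustrative, not
// FSNamesystem's real API. Logging in a finally block with a hard-coded
// "success" records failed operations as successes.
public class AuditLogSketch {

    public static final List<String> LOG = new ArrayList<>();

    // Buggy pattern: audit entry written in finally, always as success.
    public static void deleteBuggy(boolean exists) throws FileNotFoundException {
        try {
            if (!exists) {
                throw new FileNotFoundException("no such file");
            }
        } finally {
            LOG.add("cmd=delete success=true"); // wrong on the failure path
        }
    }

    // Fixed pattern: record the real outcome before logging it.
    public static void deleteFixed(boolean exists) throws FileNotFoundException {
        boolean success = false;
        try {
            if (!exists) {
                throw new FileNotFoundException("no such file");
            }
            success = true;
        } finally {
            LOG.add("cmd=delete success=" + success);
        }
    }

    public static void main(String[] args) {
        try { deleteBuggy(false); } catch (FileNotFoundException ignored) { }
        try { deleteFixed(false); } catch (FileNotFoundException ignored) { }
        System.out.println(LOG); // [cmd=delete success=true, cmd=delete success=false]
    }
}
```

Whether an operation lands in category 1, 2, or 3 then comes down to which exceptions the catch/finally plumbing routes into the audit call.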





[jira] [Commented] (HDFS-9744) TestDirectoryScanner#testThrottling occasionally time out after 300 seconds

2016-02-02 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129109#comment-15129109
 ] 

Daniel Templeton commented on HDFS-9744:


The problem is that there have to be enough blocks to make the scanning take 
long enough to be able to measure the throttling.  Since we can't predict where 
the test will run, it's hard to pin that number.  Too high makes the test take 
forever.  Too low and the test breaks.  Feel free to try to tune the block 
count, but understand that you're kinda stuck between a rock and a hard place.

> TestDirectoryScanner#testThrottling occasionally time out after 300 seconds
> ---
>
> Key: HDFS-9744
> URL: https://issues.apache.org/jira/browse/HDFS-9744
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Priority: Minor
>  Labels: test
>
> I have seen quite a few test failures in TestDirectoryScanner#testThrottling.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2793/testReport/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testThrottling/
> Looking at the log, it does not look like the test got stuck. On my local 
> machine, this test took 219 seconds. It is likely that this test takes more 
> than 300 seconds to complete on a busy Jenkins slave. I think it is 
> reasonable to set a longer timeout value, or to reduce the number of blocks 
> and thereby shorten the duration of the test.
> Error Message
> {noformat}
> test timed out after 300000 milliseconds
> {noformat}
> Stacktrace
> {noformat}
> java.lang.Exception: test timed out after 300000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:503)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.waitAndQueuePacket(DataStreamer.java:804)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacket(DFSOutputStream.java:423)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacketFull(DFSOutputStream.java:432)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:418)
>   at 
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
>   at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:125)
>   at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:111)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:418)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.createFile(TestDirectoryScanner.java:108)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:584)
> {noformat}





[jira] [Commented] (HDFS-9726) Refactor IBR code to a new class

2016-02-02 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129110#comment-15129110
 ] 

Masatake Iwasaki commented on HDFS-9726:


Thanks for working on this, [~szetszwo]. I think this is a good improvement 
that makes the datanode's code clearer.

The patch looks good to me overall. Unit tests passed on my environment. Found 
some nits.

* {{IncrementalBlockReportManager#addRDBI}} should be annotated 
{{VisibleForTesting}}, while {{IncrementalBlockReportManager#sendImmediately}} 
should not be.
* {{notifyNamenodeBlock}}: I felt the meanings of {{send}} and {{now}} are not 
clear from the variable names. How about {{immediate}} and {{notify}} 
respectively?

> Refactor IBR code to a new class
> 
>
> Key: HDFS-9726
> URL: https://issues.apache.org/jira/browse/HDFS-9726
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h9726_20160131.patch, h9726_20160201.patch
>
>
> The IBR code currently is mainly in BPServiceActor.  The JIRA is to refactor 
> it to a new class.





[jira] [Commented] (HDFS-9740) Use a reasonable limit in DFSTestUtil.waitForMetric()

2016-02-02 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129003#comment-15129003
 ] 

Chang Li commented on HDFS-9740:


The testRemoveVolumeBeingWritten failure in TestFsDatasetImpl is tracked by 
HDFS-9310. The rest of the test failures are not related to my change and 
cannot be reproduced on my local machine.

> Use a reasonable limit in DFSTestUtil.waitForMetric()
> -
>
> Key: HDFS-9740
> URL: https://issues.apache.org/jira/browse/HDFS-9740
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Chang Li
> Attachments: HDFS-9740.patch
>
>
> If a test is detecting a bug, it will probably hit the long surefire timeout 
> because the current max is {{Integer.MAX_VALUE}}.  Use something more 
> realistic. The default JMX update interval is 10 seconds, so something like 
> 60 seconds should be more than enough.
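A bounded wait of the kind the description asks for might look like the sketch below. The class and method names here are illustrative assumptions, not the actual {{DFSTestUtil.waitForMetric()}} signature:

```java
import java.util.function.LongSupplier;

// Illustrative sketch only -- not the real DFSTestUtil API. It shows a
// polling wait with a realistic deadline instead of Integer.MAX_VALUE.
public class BoundedWait {
  /** Polls the metric until it equals {@code expected} or the deadline
   *  passes; returns false on timeout rather than hanging forever. */
  public static boolean waitForMetric(LongSupplier metric, long expected,
                                      long timeoutMs, long pollMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (metric.getAsLong() != expected) {
      if (System.currentTimeMillis() >= deadline) {
        return false;  // bounded: surefire never hits its own huge timeout
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // A metric already at the target succeeds immediately.
    System.out.println(waitForMetric(() -> 42L, 42L, 60_000, 10));
    // A metric stuck at the wrong value fails after ~50 ms, not forever.
    System.out.println(waitForMetric(() -> 0L, 1L, 50, 5));
  }
}
```

With a 60-second cap, a buggy test fails with a clear timeout instead of tripping the much longer surefire limit.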





[jira] [Updated] (HDFS-9689) Test o.a.h.hdfs.TestRenameWhileOpen fails intermittently

2016-02-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9689:

Assignee: (was: Mingliang Liu)

> Test o.a.h.hdfs.TestRenameWhileOpen fails intermittently 
> -
>
> Key: HDFS-9689
> URL: https://issues.apache.org/jira/browse/HDFS-9689
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Mingliang Liu
> Attachments: HDFS-9689.000.patch
>
>
> The test fails in recent builds, e.g.
> https://builds.apache.org/job/PreCommit-HDFS-Build/14063/testReport/org.apache.hadoop.hdfs/TestRenameWhileOpen/
> and
> https://builds.apache.org/job/PreCommit-HDFS-Build/14212/testReport/org.apache.hadoop.hdfs/TestRenameWhileOpen/testWhileOpenRenameToNonExistentDirectory/
> The *Error Message* is like:
> {code}
> Problem binding to [localhost:60690] java.net.BindException: Address already 
> in use; For more details see:  http://wiki.apache.org/hadoop/BindException
> {code}
> and *Stacktrace* is:
> {code}
> java.net.BindException: Problem binding to [localhost:60690] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:463)
>   at sun.nio.ch.Net.bind(Net.java:455)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:469)
>   at org.apache.hadoop.ipc.Server$Listener.(Server.java:695)
>   at org.apache.hadoop.ipc.Server.(Server.java:2464)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:958)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:392)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:743)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:685)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:884)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:863)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1581)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1247)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1016)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:891)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:823)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:482)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
>   at 
> org.apache.hadoop.hdfs.TestRenameWhileOpen.testWhileOpenRenameToNonExistentDirectory(TestRenameWhileOpen.java:332)
> {code}





[jira] [Updated] (HDFS-9715) Check storage ID uniqueness on datanode startup

2016-02-02 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9715:

Attachment: HDFS-9715.03.patch

Hey, [~vinayrpet] 

Yes, you are right. I updated the patch to address the {{TestDFSRollback}} 
failure. I would much appreciate another review.

> Check storage ID uniqueness on datanode startup
> ---
>
> Key: HDFS-9715
> URL: https://issues.apache.org/jira/browse/HDFS-9715
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9715.00.patch, HDFS-9715.01.patch, 
> HDFS-9715.02.patch, HDFS-9715.03.patch
>
>
> We should fix this to check storage ID uniqueness on datanode startup. If 
> someone has manually edited the storage ID files, or if they have duplicated 
> a directory (or re-added an old disk) they could end up with a duplicate 
> storage ID and not realize it. 
> The HDFS-7575 fix does generate a storage UUID for each storage, but it does 
> not check the uniqueness of these UUIDs.
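The duplicate-detection half of such a startup check is straightforward; a minimal sketch follows, where the class and method names are illustrative, not the actual datanode code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal sketch of the kind of startup check described above.
public class StorageIdCheck {
  /** Returns every storage ID that occurs more than once. */
  public static Set<String> findDuplicates(List<String> storageIds) {
    Set<String> seen = new HashSet<>();
    Set<String> dups = new HashSet<>();
    for (String id : storageIds) {
      if (!seen.add(id)) {  // add() returns false when the ID was seen before
        dups.add(id);
      }
    }
    return dups;
  }

  public static void main(String[] args) {
    // A re-added old disk shows up as a duplicated storage ID.
    List<String> ids = Arrays.asList("DS-aaa", "DS-bbb", "DS-aaa");
    System.out.println(findDuplicates(ids));  // prints [DS-aaa]
  }
}
```

On a duplicate, the datanode could either refuse to start or regenerate the ID for the offending storage directory.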





[jira] [Commented] (HDFS-9395) getContentSummary and other FS operations are audit logged as success even if failed

2016-02-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129094#comment-15129094
 ] 

Colin Patrick McCabe commented on HDFS-9395:


So, the big question here is what should go in the audit log?  All failures, or 
just "permission denied" failures?  Or, to put it a different way, if someone 
attempts to do something and it fails because a file doesn't exist, is that 
worth an audit log entry?

We are currently inconsistent on this point.  For example, {{concat}}, 
{{getContentSummary}}, {{addCacheDirective}}, and {{setErasureEncodingPolicy}} 
create an audit log entry for all failures, but {{setOwner}}, {{delete}}, and 
{{setAclEntries}} attempt to only create an entry for 
{{AccessControlException}}-based failures.  There are a few operations, like 
{{allowSnapshot}}, {{disallowSnapshot}}, and {{startRollingUpgrade}} that never 
create audit log failure entries at all.  They simply log nothing for any 
failure, and log success for a successful operation.

So to summarize, operations fall into 3 categories:
1. audit-log *all* failures
2. audit-log only {{AccessControlException}} failures
3. *never* audit-log failures

Category #3 seems like a clear violation of what people expect out of the audit 
log, since it will leave out all the unsuccessful attempts to do some 
privileged operation.  So perhaps the category #3 operations are clearly buggy. 
 The question then becomes, is the category #1 or #2 interpretation correct?  
One potential issue I see with category #2 is that if there is some failure 
that ultimately is permissions-related, but which fails to generate the 
specific {{AccessControlException}} subclass of exception, we will miss it.  So 
category #1 operations are more robust against changes in the exception 
handling.

> getContentSummary and other FS operations are audit logged as success even if 
> failed
> 
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
> Attachments: HDFS-9395.001.patch, HDFS-9395.002.patch
>
>
> Audit logging is in the finally block along with the lock unlocking, so the 
> operation is always logged as success even for cases where a 
> FileNotFoundException is thrown.





[jira] [Updated] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9743:
-
Affects Version/s: 2.7.2

> Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7
> -
>
> Key: HDFS-9743
> URL: https://issues.apache.org/jira/browse/HDFS-9743
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9743-branch-2.7.001.patch
>
>
> The corresponding test case has been moved and fixed in trunk by HDFS-9073. 
> We should fix it in branch-2.7 too.





[jira] [Commented] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129039#comment-15129039
 ] 

Mingliang Liu commented on HDFS-9739:
-

Thanks for your prompt reply, [~kihwal]. I tested locally and 
{{TestDatanodeStartupFixesLegacyStorageIDs}} failed as expected. Glad to know 
[HDFS-9730] will address this further.

> DatanodeStorage.isValidStorageId() is broken
> 
>
> Key: HDFS-9739
> URL: https://issues.apache.org/jira/browse/HDFS-9739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9739.000.patch
>
>
> After HDFS-8979, the check is returning true for the old storage ID format, 
> so storage IDs in the old format won't be updated during datanode upgrade.
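Assuming the post-upgrade format is {{DS-}} followed by a UUID, a correct validity check has to reject the legacy format so those IDs get regenerated. The sketch below illustrates the intended behavior and is not the actual patched Hadoop code:

```java
import java.util.UUID;

// Hedged sketch of the intended check: accept only "DS-" + UUID.
public class StorageIdFormat {
  public static boolean isValidStorageId(String id) {
    final String prefix = "DS-";
    if (id == null || !id.startsWith(prefix)) {
      return false;
    }
    try {
      // Legacy IDs such as "DS-123-127.0.0.1-50010-..." are not parseable
      // UUIDs, so they correctly fall through to false here.
      UUID.fromString(id.substring(prefix.length()));
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isValidStorageId("DS-" + UUID.randomUUID()));   // true
    System.out.println(isValidStorageId("DS-1-127.0.0.1-50010-1394")); // false
  }
}
```

Any check that returns true for both formats would silently skip the upgrade path, which is the bug being described.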





[jira] [Commented] (HDFS-9658) Erasure Coding: allow to use multiple EC policies in striping related tests

2016-02-02 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129053#comment-15129053
 ] 

Zhe Zhang commented on HDFS-9658:
-

Ahh I see. Agreed to keep {{dnIndexSuite}}. +1 on the patch pending the 
{{maxPerLevel}} change and verifying Jenkins failures.

> Erasure Coding: allow to use multiple EC policies in striping related tests
> ---
>
> Key: HDFS-9658
> URL: https://issues.apache.org/jira/browse/HDFS-9658
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HDFS-9658.1.patch
>
>
> Currently many of the EC-related tests assume we're using the RS-6-3 
> schema/policy. There are lots of hard-coded fields as well as computations 
> based on that. To support multiple EC policies, we need to remove this 
> hard-coded logic and make the tests more flexible.





[jira] [Updated] (HDFS-9395) getContentSummary and other FS operations are audit logged as success even if failed

2016-02-02 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated HDFS-9395:
--
Attachment: HDFS-9395.002.patch

Updated patch that addresses the findbugs and checkstyle issues. The test 
failures are not reproducible on my local machine and are also unrelated to 
this change.

> getContentSummary and other FS operations are audit logged as success even if 
> failed
> 
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
> Attachments: HDFS-9395.001.patch, HDFS-9395.002.patch
>
>
> Audit logging is in the finally block along with the lock unlocking, so the 
> operation is always logged as success even for cases where a 
> FileNotFoundException is thrown.





[jira] [Comment Edited] (HDFS-9395) HDFS operations vary widely in which failures they put in the audit log and which they leave out

2016-02-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129094#comment-15129094
 ] 

Colin Patrick McCabe edited comment on HDFS-9395 at 2/2/16 9:46 PM:


So, the big question here is what should go in the audit log?  All failures, or 
just "permission denied" failures?  Or, to put it a different way, if someone 
attempts to do something and it fails because a file doesn't exist, is that 
worth an audit log entry?

We are currently inconsistent on this point.  For example, {{concat}}, 
{{getContentSummary}}, {{addCacheDirective}}, and {{setErasureEncodingPolicy}} 
create an audit log entry for all failures, but {{setOwner}}, {{delete}}, and 
{{setAclEntries}} attempt to only create an entry for 
{{AccessControlException}}-based failures.  There are a few operations, like 
{{allowSnapshot}}, {{disallowSnapshot}}, and {{startRollingUpgrade}} that never 
create audit log failure entries at all.  They simply log nothing for any 
failure, and log success for a successful operation.

So to summarize, different HDFS operations currently fall into 3 categories:
1. audit-log *all* failures
2. audit-log only {{AccessControlException}} failures
3. *never* audit-log failures

Category #3 seems like a clear violation of what people expect out of the audit 
log, since it will leave out all the unsuccessful attempts to do some 
privileged operation.  So perhaps the category #3 operations are clearly buggy. 
 The question then becomes, is the category #1 or #2 interpretation correct?  
One potential issue I see with category #2 is that if there is some failure 
that ultimately is permissions-related, but which fails to generate the 
specific {{AccessControlException}} subclass of exception, we will miss it.  So 
category #1 operations are more robust against changes in the exception 
handling.


was (Author: cmccabe):
So, the big question here is what should go in the audit log?  All failures, or 
just "permission denied" failures?  Or, to put it a different way, if someone 
attempts to do something and it fails because a file doesn't exist, is that 
worth an audit log entry?

We are currently inconsistent on this point.  For example, {{concat}}, 
{{getContentSummary}}, {{addCacheDirective}}, and {{setErasureEncodingPolicy}} 
create an audit log entry for all failures, but {{setOwner}}, {{delete}}, and 
{{setAclEntries}} attempt to only create an entry for 
{{AccessControlException}}-based failures.  There are a few operations, like 
{{allowSnapshot}}, {{disallowSnapshot}}, and {{startRollingUpgrade}} that never 
create audit log failure entries at all.  They simply log nothing for any 
failure, and log success for a successful operation.

So to summarize, operations fall into 3 categories:
1. audit-log *all* failures
2. audit-log only {{AccessControlException}} failures
3. *never* audit-log failures

Category #3 seems like a clear violation of what people expect out of the audit 
log, since it will leave out all the unsuccessful attempts to do some 
privileged operation.  So perhaps the category #3 operations are clearly buggy. 
 The question then becomes, is the category #1 or #2 interpretation correct?  
One potential issue I see with category #2 is that if there is some failure 
that ultimately is permissions-related, but which fails to generate the 
specific {{AccessControlException}} subclass of exception, we will miss it.  So 
category #1 operations are more robust against changes in the exception 
handling.

> HDFS operations vary widely in which failures they put in the audit log and 
> which they leave out
> 
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
> Attachments: HDFS-9395.001.patch, HDFS-9395.002.patch
>
>
> Audit logging is in the finally block along with the lock unlocking, so the 
> operation is always logged as success even for cases where a 
> FileNotFoundException is thrown.





[jira] [Created] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-9743:


 Summary: Fix TestLazyPersistFiles#testFallbackToDiskFull in 
branch-2.7
 Key: HDFS-9743
 URL: https://issues.apache.org/jira/browse/HDFS-9743
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee


The corresponding test case has been moved and fixed in trunk by HDFS-9073. We 
should fix it in branch-2.7 too.





[jira] [Commented] (HDFS-7866) Erasure coding: NameNode manages multiple erasure coding policies

2016-02-02 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129089#comment-15129089
 ] 

Zhe Zhang commented on HDFS-7866:
-

[~lirui] [~drankye] Thanks for the discussions.

bq. As to policy ID mapping, do you mean we should pre-define the IDs for 
system EC policies, like what we do in HdfsConstants for storage policies?
Yes, I think we should do this for the current phase. As Kai suggested above, 
and as Andrew's quoted comment indicates, full pluggability is still a pretty 
remote target. When we do implement customer-pluggable policies, we should 
revisit the issue together with block storage policies.

bq. Maybe 7 bits for EC policies (up to 128 ones) and 4 bits for repl factor 
(up to 16 sounds enough for striped blocks).
Well, I think this decision can be left open for now. Right now we can just use 
the 11 bits for EC policies, and later decide how many to squeeze out for the 
repl factor. I've seen some special cases where the repl factor of some 
contiguous blocks is tuned high so that all DNs in the cluster will have them. 
But for striped blocks, maybe that's not a requirement anymore. So 16 might be 
OK.
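The 7-bit/4-bit split under discussion can be sketched as a simple packing scheme. The layout, field widths, and names below are illustrative assumptions for the arithmetic, not the layout HDFS actually adopted:

```java
// Purely illustrative bit layout: 7 bits for the EC policy ID (up to 128
// policies) and 4 bits for the replication factor (up to 16), packed into
// the low 11 bits of an int.
public class PolicyBits {
  static final int REPL_BITS = 4;
  static final int POLICY_BITS = 7;
  static final int REPL_MASK = (1 << REPL_BITS) - 1;  // 0b1111

  /** Packs a policy ID and repl factor into the low 11 bits. */
  public static int pack(int policyId, int replFactor) {
    if (policyId < 0 || policyId >= (1 << POLICY_BITS)
        || replFactor < 0 || replFactor >= (1 << REPL_BITS)) {
      throw new IllegalArgumentException("field out of range");
    }
    return (policyId << REPL_BITS) | replFactor;
  }

  public static int policyOf(int packed) {
    return packed >>> REPL_BITS;
  }

  public static int replOf(int packed) {
    return packed & REPL_MASK;
  }

  public static void main(String[] args) {
    int packed = pack(100, 9);
    System.out.println(policyOf(packed)); // 100
    System.out.println(replOf(packed));   // 9
  }
}
```

Whatever split is chosen, the masks and shifts are the only places the decision lives, so moving a bit between the two fields later is a contained change.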

> Erasure coding: NameNode manages multiple erasure coding policies
> -
>
> Key: HDFS-7866
> URL: https://issues.apache.org/jira/browse/HDFS-7866
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rui Li
> Attachments: HDFS-7866-v1.patch, HDFS-7866-v2.patch, 
> HDFS-7866-v3.patch, HDFS-7866.4.patch, HDFS-7866.5.patch, HDFS-7866.6.patch, 
> HDFS-7866.7.patch
>
>
> This is to extend the NameNode to load, list and sync predefined EC schemas 
> in an authorized and controlled approach. The provided facilities will be 
> used to implement DFSAdmin commands so admins can list available EC schemas 
> and choose some of them for target EC zones.





[jira] [Comment Edited] (HDFS-9395) HDFS operations vary widely in which failures they put in the audit log and which they leave out

2016-02-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15094501#comment-15094501
 ] 

Colin Patrick McCabe edited comment on HDFS-9395 at 2/2/16 9:47 PM:


[edit: see my later comments about this.  I was writing this assuming that 
audit log strategy #1 was correct, but without considering the other 
possibilities]

If I understand correctly, the relevant code block is here:
{code}
  ContentSummary getContentSummary(final String src) throws IOException {
checkOperation(OperationCategory.READ);
readLock();
boolean success = true;
try {
  checkOperation(OperationCategory.READ);
  return FSDirStatAndListingOp.getContentSummary(dir, src);
} catch (AccessControlException ace) {
  success = false;
  throw ace;
} finally {
  readUnlock();
  logAuditEvent(success, "contentSummary", src);
}
  }
{code}

The code appears to be making the assumption that the only IOE that can be 
thrown is {{AccessControlException}}.  I don't think this is correct.  It would 
be better to change this to something like this, similar to our other audit log 
use-cases:
{code}
  ContentSummary getContentSummary(final String src) throws IOException {
checkOperation(OperationCategory.READ);
readLock();
boolean success = false;
try {
  checkOperation(OperationCategory.READ);
  ContentSummary csum = FSDirStatAndListingOp.getContentSummary(dir, src);
  success = true;
  return csum;
} catch (AccessControlException ace) {
  throw ace;
} finally {
  readUnlock();
  logAuditEvent(success, "contentSummary", src);
}
  }
{code}

bq. It's by design? HDFS-5163

No, it's a bug.  Also, I looked at the code prior to the HDFS-4949 branch 
merge, and the bug existed prior to HDFS-5163 or any of the other HDFS-4949 
JIRAs.

Hope this helps.


was (Author: cmccabe):
If I understand correctly, the relevant code block is here:
{code}
  ContentSummary getContentSummary(final String src) throws IOException {
checkOperation(OperationCategory.READ);
readLock();
boolean success = true;
try {
  checkOperation(OperationCategory.READ);
  return FSDirStatAndListingOp.getContentSummary(dir, src);
} catch (AccessControlException ace) {
  success = false;
  throw ace;
} finally {
  readUnlock();
  logAuditEvent(success, "contentSummary", src);
}
  }
{code}

The code appears to be making the assumption that the only IOE that can be 
thrown is {{AccessControlException}}.  I don't think this is correct.  It would 
be better to change this to something like this, similar to our other audit log 
use-cases:
{code}
  ContentSummary getContentSummary(final String src) throws IOException {
checkOperation(OperationCategory.READ);
readLock();
boolean success = false;
try {
  checkOperation(OperationCategory.READ);
  ContentSummary csum = FSDirStatAndListingOp.getContentSummary(dir, src);
  success = true;
  return csum;
} catch (AccessControlException ace) {
  throw ace;
} finally {
  readUnlock();
  logAuditEvent(success, "contentSummary", src);
}
  }
{code}

bq. It's by design? HDFS-5163

No, it's a bug.  Also, I looked at the code prior to the HDFS-4949 branch 
merge, and the bug existed prior to HDFS-5163 or any of the other HDFS-4949 
JIRAs.

Hope this helps.

> HDFS operations vary widely in which failures they put in the audit log and 
> which they leave out
> 
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
> Attachments: HDFS-9395.001.patch, HDFS-9395.002.patch
>
>
> So, the big question here is what should go in the audit log? All failures, 
> or just "permission denied" failures? Or, to put it a different way, if 
> someone attempts to do something and it fails because a file doesn't exist, 
> is that worth an audit log entry?
> We are currently inconsistent on this point. For example, concat, 
> getContentSummary, addCacheDirective, and setErasureEncodingPolicy create an 
> audit log entry for all failures, but setOwner, delete, and setAclEntries 
> attempt to only create an entry for AccessControlException-based failures. 
> There are a few operations, like allowSnapshot, disallowSnapshot, and 
> startRollingUpgrade that never create audit log failure entries at all. They 
> simply log nothing for any failure, and log success for a successful 
> operation.
> So to summarize, different HDFS operations currently fall into 3 categories:
> 1. audit-log all failures
> 2. audit-log only AccessControlException failures
> 3. never audit-log 

[jira] [Comment Edited] (HDFS-9395) HDFS operations vary widely in which failures they put in the audit log and which they leave out

2016-02-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15094501#comment-15094501
 ] 

Colin Patrick McCabe edited comment on HDFS-9395 at 2/2/16 9:48 PM:


\[edit: see my later comments about this.  I was writing this assuming that 
audit log strategy #1 was correct, but without considering the other 
possibilities\]

If I understand correctly, the relevant code block is here:
{code}
  ContentSummary getContentSummary(final String src) throws IOException {
checkOperation(OperationCategory.READ);
readLock();
boolean success = true;
try {
  checkOperation(OperationCategory.READ);
  return FSDirStatAndListingOp.getContentSummary(dir, src);
} catch (AccessControlException ace) {
  success = false;
  throw ace;
} finally {
  readUnlock();
  logAuditEvent(success, "contentSummary", src);
}
  }
{code}

The code appears to be making the assumption that the only IOE that can be 
thrown is {{AccessControlException}}.  I don't think this is correct.  It would 
be better to change this to something like this, similar to our other audit log 
use-cases:
{code}
  ContentSummary getContentSummary(final String src) throws IOException {
checkOperation(OperationCategory.READ);
readLock();
boolean success = false;
try {
  checkOperation(OperationCategory.READ);
  ContentSummary csum = FSDirStatAndListingOp.getContentSummary(dir, src);
  success = true;
  return csum;
} catch (AccessControlException ace) {
  throw ace;
} finally {
  readUnlock();
  logAuditEvent(success, "contentSummary", src);
}
  }
{code}

bq. It's by design? HDFS-5163

No, it's a bug.  Also, I looked at the code prior to the HDFS-4949 branch 
merge, and the bug existed prior to HDFS-5163 or any of the other HDFS-4949 
JIRAs.

Hope this helps.


was (Author: cmccabe):
[edit: see my later comments about this.  I was writing this assuming that 
audit log strategy #1 was correct, but without considering the other 
possibilities]

If I understand correctly, the relevant code block is here:
{code}
  ContentSummary getContentSummary(final String src) throws IOException {
checkOperation(OperationCategory.READ);
readLock();
boolean success = true;
try {
  checkOperation(OperationCategory.READ);
  return FSDirStatAndListingOp.getContentSummary(dir, src);
} catch (AccessControlException ace) {
  success = false;
  throw ace;
} finally {
  readUnlock();
  logAuditEvent(success, "contentSummary", src);
}
  }
{code}

The code appears to be making the assumption that the only IOE that can be 
thrown is {{AccessControlException}}.  I don't think this is correct.  It would 
be better to change this to something like this, similar to our other audit log 
use-cases:
{code}
  ContentSummary getContentSummary(final String src) throws IOException {
checkOperation(OperationCategory.READ);
readLock();
boolean success = false;
try {
  checkOperation(OperationCategory.READ);
  ContentSummary csum = FSDirStatAndListingOp.getContentSummary(dir, src);
  success = true;
  return csum;
} catch (AccessControlException ace) {
  throw ace;
} finally {
  readUnlock();
  logAuditEvent(success, "contentSummary", src);
}
  }
{code}

bq. It's by design? HDFS-5163

No, it's a bug.  Also, I looked at the code prior to the HDFS-4949 branch 
merge, and the bug existed prior to HDFS-5163 or any of the other HDFS-4949 
JIRAs.

Hope this helps.

> HDFS operations vary widely in which failures they put in the audit log and 
> which they leave out
> 
>
> Key: HDFS-9395
> URL: https://issues.apache.org/jira/browse/HDFS-9395
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kuhu Shukla
> Attachments: HDFS-9395.001.patch, HDFS-9395.002.patch
>
>
> So, the big question here is what should go in the audit log? All failures, 
> or just "permission denied" failures? Or, to put it a different way, if 
> someone attempts to do something and it fails because a file doesn't exist, 
> is that worth an audit log entry?
> We are currently inconsistent on this point. For example, concat, 
> getContentSummary, addCacheDirective, and setErasureEncodingPolicy create an 
> audit log entry for all failures, but setOwner, delete, and setAclEntries 
> attempt to only create an entry for AccessControlException-based failures. 
> There are a few operations, like allowSnapshot, disallowSnapshot, and 
> startRollingUpgrade that never create audit log failure entries at all. They 
> simply log nothing for any failure, and log success for a successful 
> operation.
> So to 

[jira] [Commented] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129139#comment-15129139
 ] 

Hadoop QA commented on HDFS-9739:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 19s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 12s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 21s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12785848/HDFS-9739.000.patch |
| JIRA Issue | HDFS-9739 |
| Optional Tests |  asflicense  compile  javac  

[jira] [Commented] (HDFS-9172) Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client

2016-02-02 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129159#comment-15129159
 ] 

Zhe Zhang commented on HDFS-9172:
-

Thanks for the follow-up, Rakesh. I think we can close this JIRA as the changes 
are done.

> Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client
> --
>
> Key: HDFS-9172
> URL: https://issues.apache.org/jira/browse/HDFS-9172
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Zhe Zhang
>
> The idea of this jira is to move the striped stream related classes to 
> {{hadoop-hdfs-client}} project. This will help to be in sync with the 
> HDFS-6200 proposal.
> - DFSStripedInputStream
> - DFSStripedOutputStream
> - StripedDataStreamer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-9172) Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client

2016-02-02 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang resolved HDFS-9172.
-
Resolution: Invalid

> Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client
> --
>
> Key: HDFS-9172
> URL: https://issues.apache.org/jira/browse/HDFS-9172
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Zhe Zhang
>
> The idea of this jira is to move the striped stream related classes to 
> {{hadoop-hdfs-client}} project. This will help to be in sync with the 
> HDFS-6200 proposal.
> - DFSStripedInputStream
> - DFSStripedOutputStream
> - StripedDataStreamer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9403) Erasure coding: some EC tests are missing timeout

2016-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129205#comment-15129205
 ] 

Hudson commented on HDFS-9403:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9231 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9231/])
HDFS-9403. Erasure coding: some EC tests are missing timeout. (zhezhang: rev 
6d1213860f448242c21d83ee5c764d79fc4a7801)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestStripedBlockUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfoStriped.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStream.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAddStripedBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestWriteReadStripedFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestStripedINodeFile.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/TestECSchema.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestErasureCodeBenchmarkThroughput.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAddOverReplicatedStripedBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/coder/TestXORCoder.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/cli/TestErasureCodingCLI.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSafeModeWithStripedFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaWithStripedBlocks.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/coder/TestRSErasureCoder.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatusWithECPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithMissingBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedInputStream.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFSStriped.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeECN.java


> Erasure coding: some EC tests are missing timeout
> -
>
> Key: HDFS-9403
> URL: https://issues.apache.org/jira/browse/HDFS-9403
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, test
>Affects Versions: 3.0.0
>Reporter: Zhe Zhang
>Assignee: GAO Rui
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-9403-origin-trunk.00.patch, 
> HDFS-9403-origin-trunk.01.patch, HDFS-9403-origin-trunk.02.patch
>
>
> EC data writing pipeline is still being worked on, and bugs could introduce 
> program hang. We should add a timeout for all tests involving striped 
> writing. I see at least the following:
> * {{TestErasureCodingPolicies}}
> * {{TestFileStatusWithECPolicy}}
> * {{TestDFSStripedOutputStream}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129221#comment-15129221
 ] 

Kihwal Lee commented on HDFS-9743:
--

[~aw], the branch-2.7 specific build failed. Is this expected, or is there 
something wrong with the Docker image?
{noformat}
[WARNING] [protoc, --version] failed: java.io.IOException: Cannot run program 
"protoc": error=2, No such file or directory
[ERROR] stdout: []
{noformat}

We are also not pushing 2.7.3-SNAPSHOT artifacts.

> Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7
> -
>
> Key: HDFS-9743
> URL: https://issues.apache.org/jira/browse/HDFS-9743
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9743-branch-2.7.001.patch
>
>
> The corresponding test case has been moved and fixed in trunk by HDFS-9073 
> and HDFS-9067. We should fix it in branch-2.7 too, but it will need a 
> different patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129294#comment-15129294
 ] 

Allen Wittenauer commented on HDFS-9743:


2.7 doesn't have a Dockerfile.

From https://builds.apache.org/job/PreCommit-HDFS-Build/14348/console (up 
above):
{code}
ERROR: Dockerfile 
'/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/dev-support/docker/Dockerfile'
 not found
{code}

> Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7
> -
>
> Key: HDFS-9743
> URL: https://issues.apache.org/jira/browse/HDFS-9743
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9743-branch-2.7.001.patch
>
>
> The corresponding test case has been moved and fixed in trunk by HDFS-9073 
> and HDFS-9067. We should fix it in branch-2.7 too, but it will need a 
> different patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9745) TestSecureNNWithQJM#testSecureMode sometimes fails with timeouts

2016-02-02 Thread Xiao Chen (JIRA)
Xiao Chen created HDFS-9745:
---

 Summary: TestSecureNNWithQJM#testSecureMode sometimes fails with 
timeouts
 Key: HDFS-9745
 URL: https://issues.apache.org/jira/browse/HDFS-9745
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Xiao Chen
Assignee: Xiao Chen
Priority: Minor


TestSecureNNWithQJM#testSecureMode fails intermittently. In most cases, it 
times out.
With roughly 0.5%~1% probability, it fails with a more sophisticated error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9745) TestSecureNNWithQJM#testSecureMode sometimes fails with timeouts

2016-02-02 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129332#comment-15129332
 ] 

Xiao Chen commented on HDFS-9745:
-

Here is the more sophisticated error. I haven't found the root cause yet, but 
Kerberos seems to be involved. I've also seen this NPE in {{TestKMS#testACLs}} failures.
{noformat}
2016-02-01 06:41:47,944 INFO  namenode.FSImage (FSImage.java:loadEdits(832)) - 
Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@12e8c783 
expecting start txid #1
2016-02-01 06:41:47,944 INFO  namenode.FSImage 
(FSEditLogLoader.java:loadFSEdits(139)) - Start loading edits file 
https://localhost:55326/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID,
 
https://localhost:48398/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID
2016-02-01 06:41:47,945 INFO  namenode.EditLogInputStream 
(RedundantEditLogInputStream.java:nextOp(176)) - Fast-forwarding stream 
'https://localhost:55326/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID,
 
https://localhost:48398/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID'
 to transaction ID 1
2016-02-01 06:41:47,945 INFO  namenode.EditLogInputStream 
(RedundantEditLogInputStream.java:nextOp(176)) - Fast-forwarding stream 
'https://localhost:55326/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID'
 to transaction ID 1
2016-02-01 06:41:48,131 ERROR protocol.KerberosProtocolHandler 
(KerberosProtocolHandler.java:exceptionCaught(157)) - /127.0.0.1:46417 EXCEPTION
org.apache.mina.filter.codec.ProtocolDecoderException: 
java.lang.NullPointerException: message (Hexdump: 00 00 02 45 6C 82 02 41 30 82 
02 3D A1 03 02 01 05 A2 03 02 01 0C A3 82 01 C4 30 82 01 C0 30 82 01 BC A1 03 
02 01 01 A2 82 01 B3 04 82 01 AF 6E 82 01 AB 30 82 01 A7 A0 03 02 01 05 A1 03 
02 01 0E A2 07 03 05 00 00 00 00 00 A3 81 F6 61 81 F3 30 81 F0 A0 03 02 01 05 
A1 0D 1B 0B 45 58 41 4D 50 4C 45 2E 43 4F 4D A2 20 30 1E A0 03 02 01 02 A1 17 
30 15 1B 06 6B 72 62 74 67 74 1B 0B 45 58 41 4D 50 4C 45 2E 43 4F 4D A3 81 B7 
30 81 B4 A0 03 02 01 11 A2 81 AC 04 81 A9 6D 55 CA 9E E9 6A 4D 9F C1 47 9A 2B 
9E 75 07 73 F4 48 4A 36 AE BD 28 4D DA 3D 00 89 0D B1 70 C1 67 E7 44 0B C4 3E 
BF 59 3D 2B F2 EA E7 09 05 44 23 2D B6 D6 76 46 D8 DC 05 32 68 A2 2B D5 58 3D 
EC 5D DD 4D A1 6D F2 80 95 5F 61 2A 40 C5 D3 F7 BD A3 71 73 F2 81 DD CD B1 B3 
1D 8E FA B0 70 9F 88 AD 97 7C 2A 91 DE 4F 69 5A 23 17 AA 21 99 7E 89 36 E6 1A 
01 06 D2 DF 6F C3 76 15 47 81 9A 65 66 F7 CC 23 52 C5 DE 77 09 AD 66 38 94 DE 
93 DC CA 24 6B C4 2B FA 1A BC FC 07 84 9B CE 0D 15 BA C7 00 8A 5C 0E 61 D8 BE 
01 A4 81 98 30 81 95 A0 03 02 01 11 A2 81 8D 04 81 8A 60 DF 82 D5 14 DB 78 8D 
A8 E4 6D F8 FE 3A F4 AB 98 25 9D DB 51 ED 3B CE 53 C8 DC 48 1C CB EB B5 1B 5A 
45 BA CD 68 0A 26 2F 8D 3A FE 75 AE 36 4B 25 B5 B8 5A C1 27 71 E3 B6 03 7D D6 
2D 14 58 CD 6D 19 F0 25 D0 5A 9B 35 A6 7E 36 62 DA 28 56 0B E9 53 03 43 7B 71 
D5 ED 8F 52 CE 6E 8A 23 0C 52 53 EB 42 0B 7A 6B 8C 54 EB 1C 70 FB 21 DD DF 23 
B4 5E AD 42 67 65 42 61 FD DB 2D 28 C2 4D 7A 71 69 D0 74 9A 64 8A 82 A0 EC C8 
A4 69 30 67 A0 07 03 05 00 00 00 00 00 A2 0D 1B 0B 45 58 41 4D 50 4C 45 2E 43 
4F 4D A3 1C 30 1A A0 03 02 01 00 A1 13 30 11 1B 04 48 54 54 50 1B 09 6C 6F 63 
61 6C 68 6F 73 74 A5 11 18 0F 31 39 37 30 30 31 30 31 30 30 30 30 30 30 5A A7 
06 02 04 6B 88 55 A5 A8 14 30 12 02 01 12 02 01 11 02 01 10)
at 
org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:234)
at 
org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:434)
at 
org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:48)
at 
org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:802)
at 
org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:120)
at 
org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:434)
at 
org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:426)
at 
org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:604)
at 
org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:564)
at 
org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:553)
at 
org.apache.mina.core.polling.AbstractPollingIoProcessor.access$400(AbstractPollingIoProcessor.java:57)
at 
org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:892)
at 
org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:65)
at 

[jira] [Comment Edited] (HDFS-9671) DiskBalancer : SubmitPlan implementation

2016-02-02 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129207#comment-15129207
 ] 

Lei (Eddy) Xu edited comment on HDFS-9671 at 2/2/16 10:42 PM:
--

Hey, [~anu] 

Thanks for providing this patch. Would you mind addressing the following 
comments?

* {{WorkItem.java}} is missing the Apache license header.

* Could you rename {{WorkItem}} and {{WorkStatus}} to {{DiskBalancerWorkItem}} 
and {{DiskBalancerWorkStatus}}? They live in the "o.a.h.h.s.datanode" namespace, 
and the current names are too general.

* IMO, {{private long copiedSoFar;}} might be renamed to {{private long 
bytesCopied}}. Putting the unit into the variable name makes it a little easier 
to understand.

* Would you mind fixing the comments to follow Javadoc format? E.g., we do not 
need the {{-}} in {{@param errMsg - msg}}. There are also a few tags like 
{{@return long}} that we can probably remove.

* {{DiskbalancerException}} should be {{DiskBalancerException}}.

* Are {{DiskBalancerException.DISK_BALANCE_NOT_ENABLED, INVALID_PLAN, ...}} the 
only possible values for {{int result}}? Could we use an {{enum}} here?

* {code}
try {
   lock.lock();
{code}
We might want to move {{lock()}} out of the {{try \{..\}}} block.

* {{BlockMover}} should hold a {{FsVolumeReference}} while doing the I/O. In 
that sense, it might implement {{Closeable}}.

* {code}
try {
  synchronized (this.dataset) {
    references = this.dataset.getFsVolumeReferences();

references.close();
{code}
You can probably use the JDK 7 {{try-with-resources}} statement to manage 
{{references}} directly.

* A few logs like {{LOG.info("Disk Balancer - Unable to find destination 
volume. " ...}} should probably use {{LOG.error()}}.

* {{getPathToVolumeMap()}} is misleading: the key is a {{storageID}}, not a 
path.

* In {{LOG.info("Executing Disk balancer plan ");}}, would you mind also 
outputting the plan ID?

* In either {{VolumePair(source, target)}} or {{createWorkPlan}}, you might 
want to check that {{source}} does not equal {{dest}}.

Again, thanks for the work, [~anu]!
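The lock suggestion in the review above is the standard idiom: acquire the lock outside the {{try}} so that {{unlock()}} in {{finally}} only runs when the lock is actually held. A minimal sketch for illustration (the {{LockIdiom}} class and {{bytesCopied}} field are hypothetical, not from the patch):

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockIdiom {
    private final ReentrantLock lock = new ReentrantLock();
    private long bytesCopied; // illustrative state guarded by the lock

    public void addBytes(long n) {
        // Acquire the lock *before* entering try: if lock() itself threw
        // inside the try, finally would call unlock() on a lock that was
        // never acquired, raising IllegalMonitorStateException.
        lock.lock();
        try {
            bytesCopied += n;
        } finally {
            lock.unlock();
        }
    }

    public long getBytesCopied() {
        lock.lock();
        try {
            return bytesCopied;
        } finally {
            lock.unlock();
        }
    }
}
```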


was (Author: eddyxu):
Hey, [~anu] 

Thanks for providing this patch. Would you mind to address the following 
patches?

* {{WorkItem.java}} misses apache license.

* Could you rename {{WorkItem}} and {{WorkStatus}} to {{DiskBalancerWorkItem}} 
and {{DiskBalancerWorkStatus}}. They are in the "o.a.h.h.s.datanode" namespace, 
while the names are too general. 

* IMO, {{private long copiedSoFar;}} might be named to {{private long 
bytesCopied}}. Putting units into the variable makes it little bit easier for 
me to understand. 

* Would you mind to fix comments to follow jdoc format? e.g., we do not need 
{{-}} in {{@param errMsg - msg}}.  Also there are a few places like {{@return 
long}}, that we can probably remove.

* {{DiskbalancerException}} should be {{DiskBalancerException}}.

* Are {{DiskBalancerException.DISK_BALANCE_NOT_ENABLED, INVALID_PLAN, ...}} the 
only possible value for {{int result}}? Could us use a {{enum}} here?j

*  {code}
try {
   lock.lock();
{code}
We might want to move {{lock()}} out of {{try \{..\}}}

* {{BlockMover}} should hold a {{FsVolumeReference}} when doing the IOs. So in 
this sense, it might extend from {{Closable}}.

* {code}
try {
312   synchronized (this.dataset) {
313 references = this.dataset.getFsVolumeReferences();

references.close();
{code}

You can probably directly use JDK7 {{try-ressource}} to manage {{references}}.

* A few logs like {{LOG.info("Disk Balancer - Unable to find destination 
volume. " ...}} should use {{LOG.error()}}?

* {{getPathToVolumeMap()}} is misleading. The key is {{storageID}} but not a 
path. 

* {{LOG.info("Executing Disk balancer plan ");}} would you mind to also output 
plan ID in the log?

* Either in {{VolumePair(source, target)}} or {{createWorkPlan}}, you might 
want to check that {{soruce}} dose not equal to {{dest}}?

Again, thanks for the work, [~anu]!

> DiskBalancer : SubmitPlan implementation 
> -
>
> Key: HDFS-9671
> URL: https://issues.apache.org/jira/browse/HDFS-9671
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9671-HDFS-1312.001.patch, 
> HDFS-9671-HDFS-1312.002.patch
>
>
> Datanode side code for submit plan for diskbalancer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9671) DiskBalancer : SubmitPlan implementation

2016-02-02 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129207#comment-15129207
 ] 

Lei (Eddy) Xu commented on HDFS-9671:
-

Hey, [~anu] 

Thanks for providing this patch. Would you mind addressing the following 
comments?

* {{WorkItem.java}} is missing the Apache license header.

* Could you rename {{WorkItem}} and {{WorkStatus}} to {{DiskBalancerWorkItem}} 
and {{DiskBalancerWorkStatus}}? They live in the "o.a.h.h.s.datanode" namespace, 
and the current names are too general.

* IMO, {{private long copiedSoFar;}} might be renamed to {{private long 
bytesCopied}}. Putting the unit into the variable name makes it a little easier 
to understand.

* Would you mind fixing the comments to follow Javadoc format? E.g., we do not 
need the {{-}} in {{@param errMsg - msg}}. There are also a few tags like 
{{@return long}} that we can probably remove.

* {{DiskbalancerException}} should be {{DiskBalancerException}}.

* Are {{DiskBalancerException.DISK_BALANCE_NOT_ENABLED, INVALID_PLAN, ...}} the 
only possible values for {{int result}}? Could we use an {{enum}} here?

* {code}
try {
   lock.lock();
{code}
We might want to move {{lock()}} out of the {{try \{..\}}} block.

* {{BlockMover}} should hold a {{FsVolumeReference}} while doing the I/O. In 
that sense, it might implement {{Closeable}}.

* {code}
try {
  synchronized (this.dataset) {
    references = this.dataset.getFsVolumeReferences();

references.close();
{code}
You can probably use the JDK 7 {{try-with-resources}} statement to manage 
{{references}} directly.

* A few logs like {{LOG.info("Disk Balancer - Unable to find destination 
volume. " ...}} should probably use {{LOG.error()}}.

* {{getPathToVolumeMap()}} is misleading: the key is a {{storageID}}, not a 
path.

* In {{LOG.info("Executing Disk balancer plan ");}}, would you mind also 
outputting the plan ID?

* In either {{VolumePair(source, target)}} or {{createWorkPlan}}, you might 
want to check that {{source}} does not equal {{dest}}.

Again, thanks for the work, [~anu]!
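The try-with-resources point above can be sketched as follows. {{VolumeReferences}} here is a hypothetical stand-in for {{FsVolumeReferences}} (which is {{AutoCloseable}} in the real code base), so the volume paths and methods are illustrative only:

```java
import java.util.Arrays;
import java.util.List;

public class TryWithResourcesSketch {
    // Hypothetical stand-in for FsVolumeReferences.
    static class VolumeReferences implements AutoCloseable {
        private final List<String> volumes = Arrays.asList("/data/1", "/data/2");
        private boolean closed = false;

        List<String> getVolumes() { return volumes; }

        boolean isClosed() { return closed; }

        @Override
        public void close() { closed = true; } // releases the references
    }

    public static void main(String[] args) {
        // try-with-resources closes the references even if the body throws,
        // replacing the manual references.close() call in the patch.
        try (VolumeReferences references = new VolumeReferences()) {
            System.out.println(references.getVolumes().size()); // prints 2
        }
    }
}
```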

> DiskBalancer : SubmitPlan implementation 
> -
>
> Key: HDFS-9671
> URL: https://issues.apache.org/jira/browse/HDFS-9671
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9671-HDFS-1312.001.patch, 
> HDFS-9671-HDFS-1312.002.patch
>
>
> Datanode side code for submit plan for diskbalancer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9503) Replace -namenode option with -fs for NNThroughputBenchmark

2016-02-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129363#comment-15129363
 ] 

Mingliang Liu commented on HDFS-9503:
-

Hi [~shv], I believe this is an independent issue that was addressed in 
[HDFS-9601], which has not been cherry-picked into {{branch-2.8}}.

> Replace -namenode option with -fs for NNThroughputBenchmark
> ---
>
> Key: HDFS-9503
> URL: https://issues.apache.org/jira/browse/HDFS-9503
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Mingliang Liu
> Attachments: HDFS-9053.000.patch, HDFS-9053.001.patch, 
> HDFS-9053.002.patch, HDFS-9053.003.patch, HDFS-9053.004.patch, 
> HDFS-9053.005.patch
>
>
> HDFS-7847 introduced a new option {{-namenode}}, which is intended to point 
> the benchmark to a remote NameNode. It should use a standard generic option 
> {{-fs}} instead, which is routinely used to specify NameNode URI in shell 
> commands.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9743:
-
Description: The corresponding test case has been moved and fixed in trunk 
by HDFS-9073 and HDFS-9067. We should fix it in branch-2.7 too, but it will 
need a different patch.  (was: The corresponding test case has been moved and 
fixed in trunk by HDFS-9073. We should fix it in branch-2.7 too.)

> Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7
> -
>
> Key: HDFS-9743
> URL: https://issues.apache.org/jira/browse/HDFS-9743
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9743-branch-2.7.001.patch
>
>
> The corresponding test case has been moved and fixed in trunk by HDFS-9073 
> and HDFS-9067. We should fix it in branch-2.7 too, but it will need a 
> different patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9743) Fix TestLazyPersistFiles#testFallbackToDiskFull in branch-2.7

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129197#comment-15129197
 ] 

Hadoop QA commented on HDFS-9743:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvndep {color} | {color:red} 0m 47s 
{color} | {color:red} branch's hadoop-hdfs-project/hadoop-hdfs dependency:list 
failed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 47s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 2m 14s 
{color} | {color:red} root in branch-2.7 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 11s 
{color} | {color:red} hadoop-hdfs in branch-2.7 failed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s 
{color} | {color:red} hadoop-hdfs in branch-2.7 failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
9s {color} | {color:green} branch-2.7 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 10s 
{color} | {color:red} hadoop-hdfs in branch-2.7 failed. {color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-hdfs in branch-2.7 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 8s 
{color} | {color:red} hadoop-hdfs in branch-2.7 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 8s 
{color} | {color:red} hadoop-hdfs in branch-2.7 failed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 8s 
{color} | {color:red} hadoop-hdfs in branch-2.7 failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} mvndep {color} | {color:red} 0m 8s {color} 
| {color:red} patch's hadoop-hdfs-project/hadoop-hdfs dependency:list failed 
{color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 9s 
{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 8s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 8s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 9s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 9s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 8s 
{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 9s 
{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 2s 
{color} | {color:red} The patch has 427 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 13s 
{color} | {color:red} The patch has 41 line(s) with tabs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 9s 
{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 7s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 7s {color} | 
{color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 7s {color} | 
{color:red} hadoop-hdfs 

[jira] [Updated] (HDFS-9741) libhdfs++: GetLastError not returning meaningful messages after some failures

2016-02-02 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9741:
-
Assignee: Bob Hansen
  Status: Patch Available  (was: Open)

> libhdfs++: GetLastError not returning meaningful messages after some failures
> -
>
> Key: HDFS-9741
> URL: https://issues.apache.org/jira/browse/HDFS-9741
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9741.HDFS-8707.000.patch
>
>
> After failing to open a file, the text for GetLastErrorMessage is not being 
> set.  It should be.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9741) libhdfs++: GetLastError not returning meaningful messages after some failures

2016-02-02 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9741:
-
Attachment: HDFS-9741.HDFS-8707.000.patch

Patch: sets the error message for all failures. Returns a null string if there 
is no error message.

> libhdfs++: GetLastError not returning meaningful messages after some failures
> -
>
> Key: HDFS-9741
> URL: https://issues.apache.org/jira/browse/HDFS-9741
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
> Attachments: HDFS-9741.HDFS-8707.000.patch
>
>
> After failing to open a file, the text for GetLastErrorMessage is not being 
> set.  It should be.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9503) Replace -namenode option with -fs for NNThroughputBenchmark

2016-02-02 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129357#comment-15129357
 ] 

Konstantin Shvachko commented on HDFS-9503:
---

Hey [~liuml07], I was trying to commit this to branch-2.8, but the test is 
failing for me with the following error:
{code}
Running org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark
Tests run: 4, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 35.989 sec <<< 
FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark
testNNThroughput(org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark)
  Time elapsed: 1.838 sec  <<< ERROR!
org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not 
replicated yet: 
/nnThroughputBenchmark/blockReport/ThroughputBenchDir0/ThroughputBench8
at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:188)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2405)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:793)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1175)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1162)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:280)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1515)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at 
org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1417)
at 
org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark.testNNThroughput(TestNNThroughputBenchmark.java:56)
{code}
Could you please verify this?

> Replace -namenode option with -fs for NNThroughputBenchmark
> ---
>
> Key: HDFS-9503
> URL: https://issues.apache.org/jira/browse/HDFS-9503
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Mingliang Liu
> Attachments: HDFS-9053.000.patch, HDFS-9053.001.patch, 
> HDFS-9053.002.patch, HDFS-9053.003.patch, HDFS-9053.004.patch, 
> HDFS-9053.005.patch
>
>
> HDFS-7847 introduced a new option {{-namenode}}, which is intended to point 
> the benchmark to a remote NameNode. It should use a standard generic option 
> {{-fs}} instead, which is routinely used to specify NameNode URI in shell 
> commands.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9244) Support nested encryption zones

2016-02-02 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129453#comment-15129453
 ] 

Zhe Zhang commented on HDFS-9244:
-

The patch only takes effect when EZ or Trash is used. None of the above 
reported failures are related.

> Support nested encryption zones
> ---
>
> Key: HDFS-9244
> URL: https://issues.apache.org/jira/browse/HDFS-9244
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: encryption
>Reporter: Xiaoyu Yao
>Assignee: Zhe Zhang
> Attachments: HDFS-9244.00.patch, HDFS-9244.01.patch, 
> HDFS-9244.02.patch, HDFS-9244.03.patch, HDFS-9244.04.patch
>
>
> This JIRA is opened to track adding support of nested encryption zone based 
> on [~andrew.wang]'s [comment 
> |https://issues.apache.org/jira/browse/HDFS-8747?focusedCommentId=14654141&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14654141]
>  for certain use cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9746) Some Kerberos related tests intermittently fails.

2016-02-02 Thread Xiao Chen (JIRA)
Xiao Chen created HDFS-9746:
---

 Summary: Some Kerberos related tests intermittently fails.
 Key: HDFS-9746
 URL: https://issues.apache.org/jira/browse/HDFS-9746
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Xiao Chen
Assignee: Xiao Chen


So far I've seen {{TestSecureNNWithQJM#testSecureMode}} and 
{{TestKMS#testACLs}} failing. More details coming in the 1st comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9746) Some Kerberos related tests intermittently fails.

2016-02-02 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129461#comment-15129461
 ] 

Xiao Chen commented on HDFS-9746:
-

Log excerpt of the failure I saw in {{TestSecureNNWithQJM#testSecureMode}}:
Error Message
{noformat}
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying 
edit log at offset 0.  Expected transaction ID was 1
at 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:194)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:178)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:187)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:140)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:835)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:690)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:281)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1063)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:767)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:609)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:670)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:838)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:817)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1538)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1862)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1827)
at 
org.apache.hadoop.hdfs.qjournal.TestSecureNNWithQJM.restartNameNode(TestSecureNNWithQJM.java:197)
at 
org.apache.hadoop.hdfs.qjournal.TestSecureNNWithQJM.doNNWithQJMTest(TestSecureNNWithQJM.java:179)
at 
org.apache.hadoop.hdfs.qjournal.TestSecureNNWithQJM.testSecureMode(TestSecureNNWithQJM.java:159)
{noformat}
stdout
{noformat}
2016-02-01 06:41:47,944 INFO  namenode.FSImage (FSImage.java:loadEdits(832)) - 
Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@12e8c783 
expecting start txid #1
2016-02-01 06:41:47,944 INFO  namenode.FSImage 
(FSEditLogLoader.java:loadFSEdits(139)) - Start loading edits file 
https://localhost:55326/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID,
 
https://localhost:48398/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID
2016-02-01 06:41:47,945 INFO  namenode.EditLogInputStream 
(RedundantEditLogInputStream.java:nextOp(176)) - Fast-forwarding stream 
'https://localhost:55326/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID,
 
https://localhost:48398/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID'
 to transaction ID 1
2016-02-01 06:41:47,945 INFO  namenode.EditLogInputStream 
(RedundantEditLogInputStream.java:nextOp(176)) - Fast-forwarding stream 
'https://localhost:55326/getJournal?jid=myjournal=1=-60%3A1226055383%3A0%3AtestClusterID'
 to transaction ID 1
2016-02-01 06:41:48,131 ERROR protocol.KerberosProtocolHandler 
(KerberosProtocolHandler.java:exceptionCaught(157)) - /127.0.0.1:46417 EXCEPTION
org.apache.mina.filter.codec.ProtocolDecoderException: 
java.lang.NullPointerException: message (Hexdump: 00 00 02 45 6C 82 02 41 30 82 
02 3D A1 03 02 01 05 A2 03 02 01 0C A3 82 01 C4 30 82 01 C0 30 82 01 BC A1 03 
02 01 01 A2 82 01 B3 04 82 01 AF 6E 82 01 AB 30 82 01 A7 A0 03 02 01 05 A1 03 
02 01 0E A2 07 03 05 00 00 00 00 00 A3 81 F6 61 81 F3 30 81 F0 A0 03 02 01 05 
A1 0D 1B 0B 45 58 41 4D 50 4C 45 2E 43 4F 4D A2 20 30 1E A0 03 02 01 02 A1 17 
30 15 1B 06 6B 72 62 74 67 74 1B 0B 45 58 41 4D 50 4C 45 2E 43 4F 4D A3 81 B7 
30 81 B4 A0 03 02 01 11 A2 81 AC 04 81 A9 6D 55 CA 9E E9 6A 4D 9F C1 47 9A 2B 
9E 75 07 73 F4 48 4A 36 AE BD 28 4D DA 3D 00 89 0D B1 70 C1 67 E7 44 0B C4 3E 
BF 59 3D 2B F2 EA E7 09 05 44 23 2D B6 D6 76 46 D8 DC 05 32 68 A2 2B D5 58 3D 
EC 5D DD 4D A1 6D F2 80 95 5F 61 2A 40 C5 D3 F7 BD A3 71 73 F2 81 DD CD B1 B3 
1D 8E FA B0 70 9F 88 AD 97 7C 2A 91 DE 4F 69 5A 23 17 AA 21 99 7E 89 36 E6 1A 
01 06 D2 DF 6F C3 76 15 47 81 9A 65 66 F7 CC 23 52 C5 DE 77 09 AD 66 38 94 DE 
93 DC CA 24 6B C4 2B FA 1A BC FC 07 84 9B CE 0D 15 BA C7 00 8A 5C 0E 61 D8 BE 
01 A4 81 

[jira] [Updated] (HDFS-9745) TestSecureNNWithQJM#testSecureMode sometimes fails with timeouts

2016-02-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9745:

Status: Patch Available  (was: Open)

> TestSecureNNWithQJM#testSecureMode sometimes fails with timeouts
> 
>
> Key: HDFS-9745
> URL: https://issues.apache.org/jira/browse/HDFS-9745
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Minor
> Attachments: HDFS-9745.01.patch
>
>
> TestSecureNNWithQJM#testSecureMode fails intermittently. In most cases, it 
> times out.
> With a 0.5%~1% probability, it fails with a more complicated error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9746) Some Kerberos related tests intermittently fail.

2016-02-02 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9746:

Summary: Some Kerberos related tests intermittently fail.  (was: Some 
Kerberos related tests intermittently fails.)

> Some Kerberos related tests intermittently fail.
> 
>
> Key: HDFS-9746
> URL: https://issues.apache.org/jira/browse/HDFS-9746
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>
> So far I've seen {{TestSecureNNWithQJM#testSecureMode}} and 
> {{TestKMS#testACLs}} failing. More details coming in the 1st comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9658) Erasure Coding: allow to use multiple EC policies in striping related tests

2016-02-02 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HDFS-9658:
-
Attachment: HDFS-9658.2.patch

Thanks Zhe. Updated the patch to increase {{maxPerLevel}}.

> Erasure Coding: allow to use multiple EC policies in striping related tests
> ---
>
> Key: HDFS-9658
> URL: https://issues.apache.org/jira/browse/HDFS-9658
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HDFS-9658.1.patch, HDFS-9658.2.patch
>
>
> Currently many of the EC-related tests assume we're using the RS-6-3 
> schema/policy. There are lots of hard-coded fields, as well as computations 
> based on that. To support multiple EC policies, we need to remove this 
> hard-coded logic and make the tests more flexible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9395) HDFS operations vary widely in which failures they put in the audit log and which they leave out

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129393#comment-15129393
 ] 

Hadoop QA commented on HDFS-9395:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 8s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 4s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
27s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 156m 51s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.TestReconstructStripedFile |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | 

[jira] [Commented] (HDFS-9741) libhdfs++: GetLastError not returning meaningful messages after some failures

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129428#comment-15129428
 ] 

Hadoop QA commented on HDFS-9741:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
38s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 19s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 15s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} HDFS-8707 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 4m 28s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 4m 40s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 26s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_72 Failed CTEST tests | test_test_libhdfs_threaded_hdfs_static |
|   | test_test_libhdfs_zerocopy_hdfs_static |
|   | test_test_native_mini_dfs |
|   | hdfspp_errors |
|   | memcheck_hdfspp_errors |
|   | test_libhdfs_threaded_hdfspp_test_shim_static |
| JDK v1.7.0_91 Failed CTEST tests | test_test_libhdfs_threaded_hdfs_static |
|   | test_test_libhdfs_zerocopy_hdfs_static |
|   | test_test_native_mini_dfs |
|   | hdfspp_errors |
|   | memcheck_hdfspp_errors |
|   | test_libhdfs_threaded_hdfspp_test_shim_static |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12785877/HDFS-9741.HDFS-8707.000.patch
 |
| JIRA Issue | HDFS-9741 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 19a41c0620b7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / 

[jira] [Updated] (HDFS-8995) Flaw in registration bookeeping can make DN die on reconnect

2016-02-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8995:
-
Target Version/s: 2.7.2, 2.6.5  (was: 2.7.2, 2.6.4)

> Flaw in registration bookeeping can make DN die on reconnect
> 
>
> Key: HDFS-8995
> URL: https://issues.apache.org/jira/browse/HDFS-8995
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8995.patch
>
>
> Normally, datanodes re-register with the namenode when it has been 
> unreachable for longer than the heartbeat expiration and becomes reachable 
> again. A datanode keeps retrying its last RPC call, such as an incremental 
> block report or heartbeat, and when the call finally gets through, the 
> namenode tells it to re-register.
> We have observed that some datanodes stay dead in such scenarios. Further 
> investigation revealed that they were told to shut down by the namenode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9746) Some Kerberos related tests intermittently fails.

2016-02-02 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129455#comment-15129455
 ] 

Xiao Chen commented on HDFS-9746:
-

The {{TestKMS#testACLs}} failure I observed in 
https://builds.apache.org/job/PreCommit-HADOOP-Build/8504:
Error Message
{noformat}
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Connection reset)
{noformat}
Stacktrace
{noformat}
java.lang.AssertionError: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Connection reset)
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.hadoop.crypto.key.kms.server.TestKMS$9$6.run(TestKMS.java:1296)
at 
org.apache.hadoop.crypto.key.kms.server.TestKMS$9$6.run(TestKMS.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1727)
at 
org.apache.hadoop.crypto.key.kms.server.TestKMS.doAs(TestKMS.java:251)
at 
org.apache.hadoop.crypto.key.kms.server.TestKMS.access$100(TestKMS.java:70)
{noformat}
Standard Output
{noformat}
Test KMS running at: http://localhost:50643/kms
2016-02-01 21:09:45,207 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'CREATE_KEY' on 'k']!!
2016-02-01 21:09:45,207 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,251 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'CREATE_KEY' on 'k']!!
2016-02-01 21:09:45,251 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,319 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'ROLL_NEW_VERSION' on 'k']!!
2016-02-01 21:09:45,320 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,359 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'ROLL_NEW_VERSION' on 'k']!!
2016-02-01 21:09:45,359 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,410 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'GET_KEYS']!!
2016-02-01 21:09:45,410 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,443 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'GET_KEYS_METADATA']!!
2016-02-01 21:09:45,443 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,477 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'GET_KEY_VERSION']!!
2016-02-01 21:09:45,478 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,521 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'GET_CURRENT_KEY' on 'k']!!
2016-02-01 21:09:45,521 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,559 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'GET_METADATA' on 'k']!!
2016-02-01 21:09:45,559 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,595 WARN  LoadBalancingKMSClientProvider - KMS provider at 
[http://localhost:50643/kms/v1/] threw an IOException [User:client not allowed 
to do 'GET_KEY_VERSIONS' on 'k']!!
2016-02-01 21:09:45,596 WARN  LoadBalancingKMSClientProvider - Aborting since 
the Request has failed with all KMS providers in the group. !!
2016-02-01 21:09:45,979 ERROR KerberosProtocolHandler - /127.0.0.1:42589 
EXCEPTION
org.apache.mina.filter.codec.ProtocolDecoderException: 
java.lang.NullPointerException: message (Hexdump: 00 00 02 45 6C 82 02 41 30 82 
02 3D A1 03 02 01 05 

[jira] [Commented] (HDFS-9713) DataXceiver#copyBlock should return if block is pinned

2016-02-02 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129580#comment-15129580
 ] 

Lin Yiqun commented on HDFS-9713:
-

There is a problem here. In DataXceiver#copyBlock:
{code}
if (datanode.data.getPinning(block)) {
  String msg = "Not able to copy block " + block.getBlockId() + " " +
  "to " + peer.getRemoteAddressString() + " because it's pinned ";
  LOG.info(msg);
  sendResponse(ERROR, msg);
}

if (!dataXceiverServer.balanceThrottler.acquire()) { // not able to start
  String msg = "Not able to copy block " + block.getBlockId() + " " +
  "to " + peer.getRemoteAddressString() + " because threads " +
  "quota is exceeded.";
  LOG.warn(msg);
  sendResponse(ERROR, msg);
  return;
}
{code}
One branch returns, while the other does not. [~umamaheswararao], this jira 
has not been updated for many days; you can assign it to me if you don't have 
time to work on it.
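For illustration, a minimal, self-contained sketch of the fix being described. The datanode, peer, and throttler dependencies are replaced with booleans, and the names here are hypothetical, not the actual DataXceiver API; the point is that the pinned-block branch must return after sending ERROR, exactly as the throttler branch already does.

```java
// Illustrative sketch only; the real logic lives in
// org.apache.hadoop.hdfs.server.datanode.DataXceiver#copyBlock.
public class CopyBlockGuard {
  static final String ERROR_PINNED = "ERROR: block is pinned";
  static final String ERROR_THROTTLED = "ERROR: threads quota exceeded";
  static final String COPIED = "block copied";

  static String copyBlock(boolean pinned, boolean throttlerAcquired) {
    if (pinned) {
      // Bug: this branch sent the ERROR response but fell through;
      // the fix is to return here, like the throttler branch below.
      return ERROR_PINNED;
    }
    if (!throttlerAcquired) { // not able to start
      return ERROR_THROTTLED;
    }
    return COPIED; // only reached when neither guard fired
  }

  public static void main(String[] args) {
    System.out.println(copyBlock(true, true));   // pinned: error, no copy
    System.out.println(copyBlock(false, false)); // throttled: error
    System.out.println(copyBlock(false, true));  // normal copy path
  }
}
```

With the early return in place, a pinned block can no longer fall through into the copy path after the ERROR response has already been sent.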

> DataXceiver#copyBlock should return if block is pinned
> --
>
> Key: HDFS-9713
> URL: https://issues.apache.org/jira/browse/HDFS-9713
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> in DataXceiver#copyBlock
> {code}
>   if (datanode.data.getPinning(block)) {
>   String msg = "Not able to copy block " + block.getBlockId() + " " +
>   "to " + peer.getRemoteAddressString() + " because it's pinned ";
>   LOG.info(msg);
>   sendResponse(ERROR, msg);
> }
> {code}
> I think we should return instead of proceeding to send the block, as we 
> already sent ERROR here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9715) Check storage ID uniqueness on datanode startup

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129437#comment-15129437
 ] 

Hadoop QA commented on HDFS-9715:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 57s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 43s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
29s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 163m 0s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.qjournal.client.TestQuorumJournalManager |
| JDK v1.8.0_66 Timed out junit tests | 
org.apache.hadoop.hdfs.TestLeaseRecovery2 |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
\\
\\
|| 

[jira] [Updated] (HDFS-9698) Long running Balancer should renew TGT

2016-02-02 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9698:

Resolution: Cannot Reproduce
Status: Resolved  (was: Patch Available)

> Long running Balancer should renew TGT
> --
>
> Key: HDFS-9698
> URL: https://issues.apache.org/jira/browse/HDFS-9698
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, security
>Affects Versions: 2.6.3
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9698.00.patch
>
>
> When the {{Balancer}} runs beyond the configured TGT lifetime, the current 
> logic won't renew TGT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9740) Use a reasonable limit in DFSTestUtil.waitForMetric()

2016-02-02 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129834#comment-15129834
 ] 

Vinayakumar B commented on HDFS-9740:
-

+1
Will commit shortly

> Use a reasonable limit in DFSTestUtil.waitForMetric()
> -
>
> Key: HDFS-9740
> URL: https://issues.apache.org/jira/browse/HDFS-9740
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Kihwal Lee
>Assignee: Chang Li
> Attachments: HDFS-9740.patch
>
>
> If a test is detecting a bug, it will probably hit the long surefire timeout 
> because the max is {{Integer.MAX_VALUE}}.  Use something more realistic. The 
> default jmx update interval is 10 seconds, so something like 60 seconds 
> should be more than enough.
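The bounded wait described above can be sketched generically as follows. This is a hypothetical helper, not the actual DFSTestUtil.waitForMetric signature: it polls a condition at a fixed interval and throws once a finite deadline passes, instead of waiting up to Integer.MAX_VALUE.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating a bounded poll loop; the real
// DFSTestUtil.waitForMetric API may differ.
public class BoundedWait {
  /** Polls check every intervalMs until it passes or timeoutMs elapses. */
  static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!check.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        // Fail fast with a clear message instead of hitting the much
        // longer surefire timeout.
        throw new TimeoutException("condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    long start = System.currentTimeMillis();
    // With a 10 s jmx update interval, a 60 s cap leaves ample slack.
    waitFor(() -> System.currentTimeMillis() - start >= 50, 10, 60_000);
    System.out.println("condition met");
  }
}
```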



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9658) Erasure Coding: allow to use multiple EC policies in striping related tests

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129838#comment-15129838
 ] 

Hadoop QA commented on HDFS-9658:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 108m 12s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 36s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 246m 53s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.tracing.TestTracing |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
|   | 

[jira] [Updated] (HDFS-9740) Use a reasonable limit in DFSTestUtil.waitForMetric()

2016-02-02 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9740:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.3
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2, branch-2.8 and branch-2.7

There were some line differences while cherry-picking to branch-2 and 
branch-2.7. Resolved them. Will attach the committed patches.

Thanks [~lichangleo] and [~kihwal]

> Use a reasonable limit in DFSTestUtil.waitForMetric()
> -
>
> Key: HDFS-9740
> URL: https://issues.apache.org/jira/browse/HDFS-9740
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Kihwal Lee
>Assignee: Chang Li
> Fix For: 2.7.3
>
> Attachments: HDFS-9740.patch
>
>
> If a test is detecting a bug, it will probably hit the long surefire timeout 
> because the max is {{Integer.MAX_VALUE}}.  Use something more realistic. The 
> default jmx update interval is 10 seconds, so something like 60 seconds 
> should be more than enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9740) Use a reasonable limit in DFSTestUtil.waitForMetric()

2016-02-02 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9740:

Attachment: HDFS-9740-branch-2.7.patch
HDFS-9740-branch-2.patch

Branch-2 and branch-2.7 committed patches.

> Use a reasonable limit in DFSTestUtil.waitForMetric()
> -
>
> Key: HDFS-9740
> URL: https://issues.apache.org/jira/browse/HDFS-9740
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Kihwal Lee
>Assignee: Chang Li
> Fix For: 2.7.3
>
> Attachments: HDFS-9740-branch-2.7.patch, HDFS-9740-branch-2.patch, 
> HDFS-9740.patch
>
>
> If a test is detecting a bug, it will probably hit the long surefire timeout 
> because the max is {{Integer.MAX_VALUE}}.  Use something more realistic. The 
> default jmx update interval is 10 seconds, so something like 60 seconds 
> should be more than enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9724) WebHDFS listing is too slow after HDFS-6565

2016-02-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129864#comment-15129864
 ] 

Hadoop QA commented on HDFS-9724:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 20s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 50m 16s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} 

[jira] [Created] (HDFS-9747) Reuse objectMapper instance in MapReduce

2016-02-02 Thread Lin Yiqun (JIRA)
Lin Yiqun created HDFS-9747:
---

 Summary: Reuse objectMapper instance in MapReduce
 Key: HDFS-9747
 URL: https://issues.apache.org/jira/browse/HDFS-9747
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: performance
Affects Versions: 2.7.1
Reporter: Lin Yiqun
Assignee: Lin Yiqun


Now in MapReduce, there are several places that create a new ObjectMapper 
instance every time one is needed. The ObjectMapper wiki suggests:
{code}
Further: it is beneficial to use just one instance (or small number of 
instances) for data binding; many optimizations for reuse (of symbol tables, 
some buffers) depend on ObjectMapper instances being reused. 
{code}
http://webcache.googleusercontent.com/search?q=cache:kybMTIJC6F4J:wiki.fasterxml.com/JacksonFAQ+=4=ja=clnk=jp,
 it's similar to HDFS-9724.
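The suggested reuse pattern can be sketched without pulling in Jackson itself; ExpensiveMapper below is a stand-in for ObjectMapper (which is thread-safe once configured), and the counter makes the saving visible: one construction instead of one per call.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the reuse pattern; ExpensiveMapper stands in for Jackson's
// ObjectMapper so the example stays self-contained.
public class MapperReuse {
  static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

  static class ExpensiveMapper {
    ExpensiveMapper() { CONSTRUCTIONS.incrementAndGet(); } // e.g. symbol tables, buffers
    String write(Object o) { return String.valueOf(o); }
  }

  // One shared instance, initialized once when the class loads.
  private static final ExpensiveMapper MAPPER = new ExpensiveMapper();

  static String toJson(Object o) {
    return MAPPER.write(o); // reuse, rather than `new ExpensiveMapper().write(o)` per call
  }

  public static void main(String[] args) {
    for (int i = 0; i < 1000; i++) {
      toJson(i);
    }
    System.out.println("constructions: " + CONSTRUCTIONS.get()); // prints 1, not 1000
  }
}
```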



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9747) Reuse objectMapper instance in MapReduce

2016-02-02 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9747:

Status: Patch Available  (was: Open)

Attaching an initial patch; kindly review.

> Reuse objectMapper instance in MapReduce
> 
>
> Key: HDFS-9747
> URL: https://issues.apache.org/jira/browse/HDFS-9747
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: MAPREDUCE.001.patch
>
>
> Now in MapReduce, there are several places that create a new ObjectMapper 
> instance every time one is needed. The ObjectMapper wiki suggests:
> {code}
> Further: it is beneficial to use just one instance (or small number of 
> instances) for data binding; many optimizations for reuse (of symbol tables, 
> some buffers) depend on ObjectMapper instances being reused. 
> {code}
> http://webcache.googleusercontent.com/search?q=cache:kybMTIJC6F4J:wiki.fasterxml.com/JacksonFAQ+=4=ja=clnk=jp,
>  it's similar to HDFS-9724.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9747) Reuse objectMapper instance in MapReduce

2016-02-02 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9747:

Attachment: MAPREDUCE.001.patch

> Reuse objectMapper instance in MapReduce
> 
>
> Key: HDFS-9747
> URL: https://issues.apache.org/jira/browse/HDFS-9747
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: MAPREDUCE.001.patch
>
>
> Now in MapReduce, there are several places that create a new ObjectMapper 
> instance every time one is needed. The ObjectMapper wiki suggests:
> {code}
> Further: it is beneficial to use just one instance (or small number of 
> instances) for data binding; many optimizations for reuse (of symbol tables, 
> some buffers) depend on ObjectMapper instances being reused. 
> {code}
> http://webcache.googleusercontent.com/search?q=cache:kybMTIJC6F4J:wiki.fasterxml.com/JacksonFAQ+=4=ja=clnk=jp,
>  it's similar to HDFS-9724.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9747) Reuse objectMapper instance in MapReduce

2016-02-02 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129872#comment-15129872
 ] 

Lin Yiqun commented on HDFS-9747:
-

Could someone help me transfer this jira to the MAPREDUCE project? Thanks.

> Reuse objectMapper instance in MapReduce
> 
>
> Key: HDFS-9747
> URL: https://issues.apache.org/jira/browse/HDFS-9747
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: MAPREDUCE.001.patch
>
>
> Now in MapReduce, there are several places that create a new ObjectMapper 
> instance every time one is needed. The ObjectMapper wiki suggests:
> {code}
> Further: it is beneficial to use just one instance (or small number of 
> instances) for data binding; many optimizations for reuse (of symbol tables, 
> some buffers) depend on ObjectMapper instances being reused. 
> {code}
> http://webcache.googleusercontent.com/search?q=cache:kybMTIJC6F4J:wiki.fasterxml.com/JacksonFAQ+=4=ja=clnk=jp,
>  it's similar to HDFS-9724.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9721) Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference

2016-02-02 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128675#comment-15128675
 ] 

Xiao Chen commented on HDFS-9721:
-

Failed tests seem unrelated.

> Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference
> 
>
> Key: HDFS-9721
> URL: https://issues.apache.org/jira/browse/HDFS-9721
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9721.01.patch, HDFS-9721.02.patch, 
> HDFS-9721.03.patch, HDFS-9721.04.patch, HDFS-9721.05.patch
>
>
> HDFS-6673 added the feature of Delimited format OIV tool on protocol buffer 
> based fsimage.
> However, if the fsimage contains {{INodeReference}}, the tool fails because:
> {code}Preconditions.checkState(e.getRefChildrenCount() == 0);{code}
> This jira is to propose allow the tool to finish, so that user can get full 
> metadata.
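A sketch of the proposed behavior, with hypothetical names (DirEntry, scan) standing in for the PB-based fsimage walker; Guava's Preconditions is omitted. Instead of aborting the dump when a directory has reference children, the tolerant version emits them and continues, so the user still gets full metadata.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch with hypothetical names; old behavior aborted the dump via
// Preconditions.checkState(e.getRefChildrenCount() == 0), while the
// tolerant version emits the reference children and keeps going.
public class OivTolerantScan {
  static class DirEntry {
    final int refChildrenCount;
    DirEntry(int refChildrenCount) { this.refChildrenCount = refChildrenCount; }
  }

  static List<String> scan(List<DirEntry> entries) {
    List<String> out = new ArrayList<>();
    for (DirEntry e : entries) {
      // Previously: checkState(e.refChildrenCount == 0) threw
      // IllegalStateException here and the whole dump failed.
      for (int i = 0; i < e.refChildrenCount; i++) {
        out.add("ref-child"); // emit the referenced child's metadata too
      }
      out.add("dir");
    }
    return out;
  }

  public static void main(String[] args) {
    List<DirEntry> entries = new ArrayList<>();
    entries.add(new DirEntry(0));
    entries.add(new DirEntry(2)); // would previously have aborted the dump
    System.out.println(scan(entries)); // full listing, including ref children
  }
}
```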



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9739) DatanodeStorage.isValidStorageId() is broken

2016-02-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9739:

Attachment: HDFS-9739.000.patch

The v0 patch simply adds the {{UUID.fromString}} back.

I think any unit tests that can cover this would be very helpful. Any 
suggestions, [~kihwal]?

> DatanodeStorage.isValidStorageId() is broken
> 
>
> Key: HDFS-9739
> URL: https://issues.apache.org/jira/browse/HDFS-9739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Mingliang Liu
>Priority: Critical
> Attachments: HDFS-9739.000.patch
>
>
> After HDFS-8979, the check is returning true for the old storage ID format. 
> So storage IDs in the old format  won't be updated during datanode upgrade. 
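For illustration, a simplified version of what restoring {{UUID.fromString}} accomplishes (the real method is DatanodeStorage.isValidStorageId; the old-format example string below is made up): both old and new storage IDs can start with "DS-", so only parsing the suffix as a UUID distinguishes them, and a prefix-only check wrongly accepts the old format.

```java
import java.util.UUID;

// Simplified sketch of the restored validity check. New-format IDs are
// "DS-" + UUID; anything else, including pre-upgrade IDs, must be
// reported invalid so the datanode regenerates it on upgrade.
public class StorageIdCheck {
  static final String STORAGE_ID_PREFIX = "DS-";

  static boolean isValidStorageId(String storageID) {
    try {
      if (storageID != null && storageID.startsWith(STORAGE_ID_PREFIX)) {
        // Parsing verifies the suffix really is a UUID; a bare prefix
        // check would also accept old-format IDs.
        UUID.fromString(storageID.substring(STORAGE_ID_PREFIX.length()));
        return true;
      }
    } catch (IllegalArgumentException ignored) {
      // fall through: suffix is not a UUID
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(isValidStorageId("DS-" + UUID.randomUUID()));             // true
    System.out.println(isValidStorageId("DS-1234-172.16.0.1-50010-1416067123")); // false
    System.out.println(isValidStorageId(null));                                  // false
  }
}
```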



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9425) Expose number of blocks per volume as a metric

2016-02-02 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-9425:
---
Attachment: HDFS-9425-002.patch

[~vinayrpet], thanks a lot for the review.

Uploaded the patch based on your comments; kindly review.

> Expose number of blocks per volume as a metric
> --
>
> Key: HDFS-9425
> URL: https://issues.apache.org/jira/browse/HDFS-9425
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-9425-002.patch, HDFS-9425.patch
>
>
> It will be helpful for users to know usage in terms of the number of blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

