[jira] [Created] (HDFS-7001) Tests in TestTracing depends on the order of execution

2014-09-05 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HDFS-7001:
--

 Summary: Tests in TestTracing depends on the order of execution
 Key: HDFS-7001
 URL: https://issues.apache.org/jira/browse/HDFS-7001
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor


o.a.h.tracing.TestTracing#testSpanReceiverHost is assumed to be executed first. 
That setup should instead be done in a BeforeClass method, so the tests no longer depend on execution order.
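A minimal sketch of the suggested direction (the names below, e.g. {{setUpBeforeClass}} and the boolean flag, are illustrative stand-ins, not the actual TestTracing code): shared setup done once in a @BeforeClass-style hook removes the dependence on which test runs first.

```java
// Hedged sketch: in JUnit 4, setUpBeforeClass() would carry @BeforeClass,
// so the framework runs it once before any test method, in any order.
public class OrderIndependentSetup {
    static boolean receiverHostStarted = false;

    // Stand-in for starting the SpanReceiverHost once, up front.
    public static void setUpBeforeClass() {
        receiverHostStarted = true;
    }

    // A test needing the receiver host no longer assumes that another
    // test method (e.g. testSpanReceiverHost) already ran.
    public boolean testWriteTraceHooks() {
        return receiverHostStarted;
    }
}
```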



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7002) Failed to rolling upgrade hdfs from 2.2.0 to 2.4.1

2014-09-05 Thread sam liu (JIRA)
sam liu created HDFS-7002:
-

 Summary: Failed to rolling upgrade hdfs from 2.2.0 to 2.4.1
 Key: HDFS-7002
 URL: https://issues.apache.org/jira/browse/HDFS-7002
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: journal-node, namenode, qjm
Affects Versions: 2.4.1, 2.2.0
Reporter: sam liu
Priority: Blocker






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7002) Failed to rolling upgrade hdfs from 2.2.0 to 2.4.1

2014-09-05 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA resolved HDFS-7002.
-
Resolution: Invalid

Rolling upgrades are available only when starting from release 2.4.0 or later. Rolling 
upgrade from 2.3.x or earlier to 2.4+ is not supported.

 Failed to rolling upgrade hdfs from 2.2.0 to 2.4.1
 --

 Key: HDFS-7002
 URL: https://issues.apache.org/jira/browse/HDFS-7002
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: journal-node, namenode, qjm
Affects Versions: 2.2.0, 2.4.1
Reporter: sam liu
Priority: Blocker





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: Hadoop-Hdfs-trunk #1862

2014-09-05 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1862/changes

Changes:

[tucu] HADOOP-11054. Add a KeyProvider instantiation based on a URI. (tucu)

[tucu] HADOOP-11015. Http server/client utils to propagate and recreate 
Exceptions from server to client. (tucu)

[tucu] HADOOP-11060. Create a CryptoCodec test that verifies interoperability 
between the JCE and OpenSSL implementations. (hitliuyi via tucu)

[jeagles] YARN-2509. Enable Cross Origin Filter for timeline server only and 
not all Yarn servers (Mit Desai via jeagles)

[tucu] Fixing HDFS CHANGES.txt, missing HDFS-6905 entry

[cnauroth] HADOOP-11063. KMS cannot deploy on Windows, because class names are 
too long. Contributed by Chris Nauroth.

[jlowe] YARN-2431. NM restart: cgroup is not removed for reacquired containers. 
Contributed by Jason Lowe

[zjshen] YARN-2511. Allowed all origins by default when CrossOriginFilter is 
enabled. Contributed by Jonathan Eagles.

[jing] HDFS-6996. SnapshotDiff report can hit IndexOutOfBoundsException when 
there are nested renamed directory/file. Contributed by Jing Zhao.

[jing] HDFS-6886. Use single editlog record for creating file + overwrite. 
Contributed by Yi Liu.

[vinayakumarb] HDFS-6714. TestBlocksScheduledCounter#testBlocksScheduledCounter 
should shutdown cluster (vinayakumarb)

--
[...truncated 5314 lines...]
Running org.apache.hadoop.hdfs.web.TestWebHdfsUrl
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.658 sec - in 
org.apache.hadoop.hdfs.web.TestWebHdfsUrl
Running org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.189 sec - in 
org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
Running org.apache.hadoop.hdfs.web.TestWebHDFSForHA
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.104 sec - in 
org.apache.hadoop.hdfs.web.TestWebHDFSForHA
Running org.apache.hadoop.hdfs.web.TestWebHdfsTokens
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.127 sec - in 
org.apache.hadoop.hdfs.web.TestWebHdfsTokens
Running org.apache.hadoop.hdfs.web.TestOffsetUrlInputStream
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.159 sec - in 
org.apache.hadoop.hdfs.web.TestOffsetUrlInputStream
Running org.apache.hadoop.hdfs.web.TestWebHDFS
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.11 sec - 
in org.apache.hadoop.hdfs.web.TestWebHDFS
Running org.apache.hadoop.hdfs.web.TestHttpsFileSystem
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.084 sec - in 
org.apache.hadoop.hdfs.web.TestHttpsFileSystem
Running org.apache.hadoop.hdfs.web.TestWebHdfsContentLength
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.017 sec - in 
org.apache.hadoop.hdfs.web.TestWebHdfsContentLength
Running org.apache.hadoop.hdfs.web.TestWebHDFSXAttr
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.634 sec - 
in org.apache.hadoop.hdfs.web.TestWebHDFSXAttr
Running org.apache.hadoop.hdfs.web.TestWebHdfsWithAuthenticationFilter
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.349 sec - in 
org.apache.hadoop.hdfs.web.TestWebHdfsWithAuthenticationFilter
Running org.apache.hadoop.hdfs.web.TestByteRangeInputStream
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.351 sec - in 
org.apache.hadoop.hdfs.web.TestByteRangeInputStream
Running org.apache.hadoop.hdfs.web.TestURLConnectionFactory
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.163 sec - in 
org.apache.hadoop.hdfs.web.TestURLConnectionFactory
Running org.apache.hadoop.hdfs.web.TestAuthFilter
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec - in 
org.apache.hadoop.hdfs.web.TestAuthFilter
Running org.apache.hadoop.hdfs.web.TestTokenAspect
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.259 sec - in 
org.apache.hadoop.hdfs.web.TestTokenAspect
Running org.apache.hadoop.hdfs.TestDFSClientFailover
Tests run: 8, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 10.239 sec - in 
org.apache.hadoop.hdfs.TestDFSClientFailover
Running org.apache.hadoop.hdfs.TestAbandonBlock
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.394 sec - in 
org.apache.hadoop.hdfs.TestAbandonBlock
Running org.apache.hadoop.hdfs.tools.TestGetGroups
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.193 sec - in 
org.apache.hadoop.hdfs.tools.TestGetGroups
Running org.apache.hadoop.hdfs.tools.TestDFSAdminWithHA
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.718 sec - 
in org.apache.hadoop.hdfs.tools.TestDFSAdminWithHA
Running 
org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.702 sec - in 
org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl
Running org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
Tests run: 5, 

[jira] [Created] (HDFS-7003) Add NFS Gateway support for reading and writing to encryption zones

2014-09-05 Thread Stephen Chu (JIRA)
Stephen Chu created HDFS-7003:
-

 Summary: Add NFS Gateway support for reading and writing to 
encryption zones
 Key: HDFS-7003
 URL: https://issues.apache.org/jira/browse/HDFS-7003
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption, nfs
Affects Versions: 2.6.0
Reporter: Stephen Chu


Currently, reading and writing within encryption zones do not work through 
the NFS gateway.

For example, we have an encryption zone {{/enc}}. Here's the difference when 
reading the same file via hadoop fs and via the NFS gateway:

{code}
[hdfs@schu-enc2 ~]$ hadoop fs -cat /enc/hi
hi
[hdfs@schu-enc2 ~]$ cat /hdfs_nfs/enc/hi
??
{code}

If we write a file using the NFS gateway, we'll see behavior like this:

{code}
[hdfs@schu-enc2 ~]$ echo hello > /hdfs_nfs/enc/hello
[hdfs@schu-enc2 ~]$ cat /hdfs_nfs/enc/hello
hello
[hdfs@schu-enc2 ~]$ hdfs dfs -cat /enc/hello
???tp[hdfs@schu-enc2 ~]$ 
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7004) Update KeyProvider instantiation to create by URI

2014-09-05 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-7004:
-

 Summary: Update KeyProvider instantiation to create by URI
 Key: HDFS-7004
 URL: https://issues.apache.org/jira/browse/HDFS-7004
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Andrew Wang
Assignee: Andrew Wang


See HADOOP-11054; it would be good to update the NN/DFSClient to fetch the 
KeyProvider via this method rather than depending on the URI path lookup.
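The shape of URI-based creation can be sketched as dispatch on the URI scheme (a simplified, hypothetical lookup; in Hadoop the real KeyProviderFactory iterates registered factories rather than switching inline, and the provider names below are illustrative strings only):

```java
import java.net.URI;

// Hypothetical sketch of scheme-based provider creation, not the actual
// KeyProviderFactory code.
public class UriProviderLookup {
    public static String createProvider(URI uri) {
        switch (uri.getScheme()) {
            case "jceks": return "JavaKeyStoreProvider";
            case "kms":   return "KMSClientProvider";
            default:      return null; // no factory claims this scheme
        }
    }
}
```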



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7005) DFS input streams do not timeout

2014-09-05 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-7005:
-

 Summary: DFS input streams do not timeout
 Key: HDFS-7005
 URL: https://issues.apache.org/jira/browse/HDFS-7005
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.5.0, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical


Input streams have lost their timeout.  The problem appears to be that 
{{DFSClient#newConnectedPeer}} does not set the read timeout.  During a 
temporary network interruption the server will close the socket, unbeknownst to 
the client host, which then blocks on a read forever.

The results are dire.  Services such as the RM, JHS, NMs, Oozie servers, etc. 
all need to be restarted to recover - unless you want to wait many hours for 
the TCP stack keepalive to detect the broken socket.
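A standalone demo of the missing behavior (hypothetical code, not DFSClient itself): a socket read with no SO_TIMEOUT blocks indefinitely on a silent peer, while one with a timeout fails fast with SocketTimeoutException.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Demonstrates why a read timeout matters: the "server" below never
// sends a byte, so without setSoTimeout() the read() would hang forever.
public class ReadTimeoutDemo {
    public static String readWithTimeout(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("localhost", server.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // the missing read timeout
            try {
                client.getInputStream().read(); // peer never sends anything
                return "read returned";
            } catch (SocketTimeoutException e) {
                return "timed out";
            }
        } catch (IOException e) {
            return "io error: " + e.getMessage();
        }
    }
}
```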



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7006) Test encryption zones with KMS

2014-09-05 Thread Alejandro Abdelnur (JIRA)
Alejandro Abdelnur created HDFS-7006:


 Summary: Test encryption zones with KMS
 Key: HDFS-7006
 URL: https://issues.apache.org/jira/browse/HDFS-7006
 Project: Hadoop HDFS
  Issue Type: Test
  Components: security, test
Affects Versions: 2.6.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur


We should test EZs with KMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7007) Interfaces to plugin ConsensusNode.

2014-09-05 Thread Konstantin Shvachko (JIRA)
Konstantin Shvachko created HDFS-7007:
-

 Summary: Interfaces to plugin ConsensusNode.
 Key: HDFS-7007
 URL: https://issues.apache.org/jira/browse/HDFS-7007
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko


This is to introduce the interfaces in the NameNode and namesystem that are needed to 
plug in ConsensusNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7008) xlator should be closed upon exit from DFSAdmin#genericRefresh()

2014-09-05 Thread Ted Yu (JIRA)
Ted Yu created HDFS-7008:


 Summary: xlator should be closed upon exit from 
DFSAdmin#genericRefresh()
 Key: HDFS-7008
 URL: https://issues.apache.org/jira/browse/HDFS-7008
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor


{code}
GenericRefreshProtocol xlator =
  new GenericRefreshProtocolClientSideTranslatorPB(proxy);

// Refresh
Collection<RefreshResponse> responses = xlator.refresh(identifier, args);
{code}
GenericRefreshProtocolClientSideTranslatorPB#close() should be called on xlator 
before return.
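The fix amounts to wrapping the call in try/finally (or try-with-resources). A self-contained sketch, with a stand-in {{Translator}} class in place of the real GenericRefreshProtocolClientSideTranslatorPB:

```java
import java.io.Closeable;

// Sketch of the fix: close the translator on every exit path.
public class CloseOnExit {
    // Stand-in for GenericRefreshProtocolClientSideTranslatorPB.
    static class Translator implements Closeable {
        boolean closed = false;
        String refresh(String identifier) { return "refreshed:" + identifier; }
        @Override public void close() { closed = true; }
    }

    public static String refreshAndClose(String identifier) {
        Translator xlator = new Translator();
        String result;
        try {
            result = xlator.refresh(identifier);
        } finally {
            xlator.close(); // runs even if refresh() throws
        }
        return result + "|closed=" + xlator.closed;
    }
}
```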



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7009) No enough retry during DN's initial handshake with NN

2014-09-05 Thread Ming Ma (JIRA)
Ming Ma created HDFS-7009:
-

 Summary: No enough retry during DN's initial handshake with NN
 Key: HDFS-7009
 URL: https://issues.apache.org/jira/browse/HDFS-7009
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ming Ma


To follow up on https://issues.apache.org/jira/browse/HDFS-6478: in most cases, 
given that the DN sends heartbeats and block reports to the NN regularly, the 
failure of a single RPC call isn't a big deal.

However, there are cases where the DN fails to register with the NN during the 
initial handshake due to exceptions not covered by the RPC client's connection 
retry. When this happens, the DN won't talk to that NN until the DN restarts.

{noformat}
BPServiceActor

  public void run() {
    LOG.info(this + " starting to offer service");

try {
  // init stuff
  try {
// setup storage
connectToNNAndHandshake();
  } catch (IOException ioe) {
// Initial handshake, storage recovery or registration failed
// End BPOfferService thread
    LOG.fatal("Initialization failed for block pool " + this, ioe);
return;
  }

  initialized = true; // bp is initialized;
  
  while (shouldRun()) {
try {
  offerService();
} catch (Exception ex) {
      LOG.error("Exception in BPOfferService for " + this, ex);
      sleepAndLogInterrupts(5000, "offering service");
}
  }
...
{noformat}
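One possible direction, sketched with stand-in methods (none of this is the actual BPServiceActor code; {{connect()}} here simulates a handshake that fails twice before succeeding): bound the retries instead of ending the thread on the first IOException.

```java
import java.io.IOException;

// Hedged sketch of retrying the initial handshake a bounded number of
// times rather than returning from run() on the first failure.
public class HandshakeRetry {
    static int attempts;

    // Stand-in for connectToNNAndHandshake(): fails twice, then succeeds.
    static void connect() throws IOException {
        attempts++;
        if (attempts < 3) throw new IOException("Response is null");
    }

    // Returns the attempt count on success, or -1 once retries are
    // exhausted (the point where today's code would end the thread).
    public static int connectWithRetries(int maxRetries) {
        attempts = 0;
        for (int i = 0; i < maxRetries; i++) {
            try {
                connect();
                return attempts; // handshake succeeded
            } catch (IOException e) {
                // log and retry after a pause, instead of ending the
                // BPOfferService thread immediately
            }
        }
        return -1;
    }
}
```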


Here is an example of the call stack.

{noformat}
java.io.IOException: Failed on local exception: java.io.IOException: Response 
is null.; Host Details : local host is: xxx; destination host is: yyy:8030;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
at org.apache.hadoop.ipc.Client.call(Client.java:1239)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy9.registerDatanode(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy9.registerDatanode(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.registerDatanode(DatanodeProtocolClientSideTranslatorPB.java:146)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.register(BPServiceActor.java:623)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Response is null.
at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:949)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)
{noformat}

This will create a discrepancy between the active NN and the standby NN in 
terms of live nodes.
 
Here is a possible scenario of missing blocks after failover.

1. DN A and B set up handshakes with the active NN, but not with the standby NN.
2. A block is replicated to DN A, B, and C.
3. From the standby NN's point of view, A and B are dead nodes, so the block is 
under-replicated.
4. DN C goes down.
5. Before the active NN detects that DN C is down, a failover occurs.
6. The new active NN considers the block missing, even though there are two 
replicas on DN A and B.






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[VOTE] Release Apache Hadoop 2.5.1 RC0

2014-09-05 Thread Karthik Kambatla
Hi folks,

I have put together a release candidate (RC0) for Hadoop 2.5.1.

The RC is available at: http://people.apache.org/~kasha/hadoop-2.5.1-RC0/
The RC git tag is release-2.5.1-RC0
The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1010/

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the now usual 5
days.

Thanks
Karthik