[jira] [Created] (HADOOP-15321) Reduce the RPC Client max retries on timeouts

2018-03-16 Thread Xiao Chen (JIRA)
Xiao Chen created HADOOP-15321:
--

 Summary: Reduce the RPC Client max retries on timeouts
 Key: HADOOP-15321
 URL: https://issues.apache.org/jira/browse/HADOOP-15321
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Reporter: Xiao Chen
Assignee: Xiao Chen


Currently, the 
[default|https://github.com/apache/hadoop/blob/branch-3.0.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java#L379]
 number of retries when the IPC client catches a {{ConnectTimeoutException}} is 45. 
This seems unreasonably high.

Given that the IPC client timeout is 60 seconds by default, if a DN host is shut 
down, the client will retry for 45 minutes before aborting. (If the host is up 
but the process is down, a connection refused is thrown immediately, which is 
fine.)

Creating this Jira to discuss whether we can reduce that to a reasonable number.
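For anyone who wants relief before a new default lands, a minimal sketch of 
capping the retries on the client side; the key matches the constant linked 
above, and the value 3 is purely illustrative, not a recommendation:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class LowerIpcTimeoutRetries {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Default is 45; with the default 60-second connect timeout, a dead
    // host keeps the client retrying for ~45 minutes before it aborts.
    conf.setInt("ipc.client.connect.max.retries.on.timeouts", 3);
    // ... pass conf to the FileSystem/IPC client as usual ...
  }
}
{code}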





Apache Hadoop qbt Report: trunk+JDK8 on Windows/x64

2018-03-16 Thread Apache Jenkins Server
For more details, see https://builds.apache.org/job/hadoop-trunk-win/408/

[Mar 15, 2018 5:14:35 PM] (inigoiri) HDFS-12723.
[Mar 15, 2018 5:18:44 PM] (xyao) HDFS-13251. Avoid using hard coded datanode 
data dirs in unit
[Mar 15, 2018 5:32:30 PM] (inigoiri) HDFS-13224. RBF: Resolvers to support 
mount points across multiple
[Mar 15, 2018 6:02:27 PM] (xyao) HDFS-13280. WebHDFS: Fix NPE in get 
snapshottable directory list call.
[Mar 15, 2018 6:05:14 PM] (stevel) HADOOP-15209. DistCp to eliminate needless 
deletion of files under
[Mar 15, 2018 8:26:01 PM] (wangda) MAPREDUCE-7047. Make HAR tool support 
IndexedLogAggregationController.
[Mar 15, 2018 8:26:45 PM] (wangda) YARN-7952. RM should be able to recover log 
aggregation status after
[Mar 16, 2018 3:17:16 AM] (xiao) HADOOP-15234. Throw meaningful message on null 
when initializing
[Mar 16, 2018 10:57:31 AM] (wwei) YARN-7636. Re-reservation count may overflow 
when cluster resource




-1 overall


The following subsystems voted -1:
unit


The following subsystems are considered long running:
(runtime bigger than 1h 00m 00s)
unit


Specific tests:

Failed CTEST tests :

   test_test_libhdfs_threaded_hdfs_static 

Failed junit tests :

   hadoop.crypto.TestCryptoStreamsWithOpensslAesCtrCryptoCodec 
   hadoop.fs.contract.rawlocal.TestRawlocalContractAppend 
   hadoop.fs.TestFsShellCopy 
   hadoop.fs.TestFsShellList 
   hadoop.fs.TestLocalFileSystem 
   hadoop.http.TestHttpServer 
   hadoop.http.TestHttpServerLogs 
   hadoop.io.nativeio.TestNativeIO 
   hadoop.ipc.TestIPC 
   hadoop.ipc.TestSocketFactory 
   hadoop.metrics2.impl.TestStatsDMetrics 
   hadoop.metrics2.sink.TestRollingFileSystemSinkWithLocal 
   hadoop.security.TestGroupsCaching 
   hadoop.security.TestSecurityUtil 
   hadoop.security.TestShellBasedUnixGroupsMapping 
   hadoop.security.token.TestDtUtilShell 
   hadoop.util.TestNativeCodeLoader 
   hadoop.util.TestNodeHealthScriptRunner 
   hadoop.util.TestWinUtils 
   hadoop.fs.TestWebHdfsFileContextMainOperations 
   hadoop.hdfs.client.impl.TestBlockReaderLocalLegacy 
   hadoop.hdfs.crypto.TestHdfsCryptoStreams 
   hadoop.hdfs.qjournal.client.TestQuorumJournalManager 
   hadoop.hdfs.qjournal.server.TestJournalNode 
   hadoop.hdfs.qjournal.server.TestJournalNodeSync 
   hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks 
   hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages 
   hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestProvidedImpl 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica 
   hadoop.hdfs.server.datanode.TestBlockPoolSliceStorage 
   hadoop.hdfs.server.datanode.TestBlockRecovery 
   hadoop.hdfs.server.datanode.TestBlockScanner 
   hadoop.hdfs.server.datanode.TestDataNodeFaultInjector 
   hadoop.hdfs.server.datanode.TestDataNodeMetrics 
   hadoop.hdfs.server.datanode.TestDataNodeUUID 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.hdfs.server.datanode.TestHSync 
   hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage 
   hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport 
   hadoop.hdfs.server.datanode.web.TestDatanodeHttpXFrame 
   hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand 
   hadoop.hdfs.server.diskbalancer.TestDiskBalancerRPC 
   hadoop.hdfs.server.federation.router.TestRouterAdminCLI 
   hadoop.hdfs.server.mover.TestMover 
   hadoop.hdfs.server.mover.TestStorageMover 
   hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA 
   hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA 
   hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics 
   hadoop.hdfs.server.namenode.snapshot.TestINodeFileUnderConstructionWithSnapshot 
   hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot 
   hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots 
   hadoop.hdfs.server.namenode.snapshot.TestSnapRootDescendantDiff 
   hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport 
   hadoop.hdfs.server.namenode.TestAddBlock 
   

[jira] [Created] (HADOOP-15320) Remove customized getFileBlockLocations for hadoop-azure and hadoop-azure-datalake

2018-03-16 Thread shanyu zhao (JIRA)
shanyu zhao created HADOOP-15320:


 Summary: Remove customized getFileBlockLocations for hadoop-azure 
and hadoop-azure-datalake
 Key: HADOOP-15320
 URL: https://issues.apache.org/jira/browse/HADOOP-15320
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/adl, fs/azure
Affects Versions: 3.0.0, 2.9.0, 2.7.3
Reporter: shanyu zhao
Assignee: shanyu zhao


hadoop-azure and hadoop-azure-datalake each have their own implementation of 
getFileBlockLocations(), which fakes a list of artificial blocks based on a 
hard-coded block size, with each block reporting a single host named 
"localhost". Take a look at this code:

[https://github.com/apache/hadoop/blob/release-2.9.0-RC3/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L3485]

This is an unnecessary mock-up for a "remote" file system to mimic HDFS. The 
problem with this mock is that for large (~TB) files we generate lots of 
artificial blocks, and FileInputFormat.getSplits() is slow at calculating 
splits based on these blocks.

We can safely remove this customized getFileBlockLocations() implementation and 
fall back to the default FileSystem.getFileBlockLocations() implementation, 
which returns 1 block for any file, with 1 host, "localhost". Note that this 
doesn't mean we will create far fewer splits, because the number of splits is 
still limited by the blockSize in FileInputFormat.computeSplitSize():
{code:java}
return Math.max(minSize, Math.min(goalSize, blockSize));
{code}
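To make the bound concrete, here is a small self-contained sketch of that 
formula; the file size, goal size, and block size below are hypothetical 
numbers, chosen only to show that a ~1TB file yields on the order of 
totalSize/blockSize splits regardless of how many artificial block locations 
the FileSystem reported:

{code:java}
public class SplitSizeDemo {
  // Mirrors FileInputFormat.computeSplitSize(goalSize, minSize, blockSize).
  static long computeSplitSize(long goalSize, long minSize, long blockSize) {
    return Math.max(minSize, Math.min(goalSize, blockSize));
  }

  public static void main(String[] args) {
    long totalSize = 1L << 40;           // hypothetical ~1 TB input file
    long blockSize = 128L * 1024 * 1024; // hypothetical 128 MB block size
    long minSize = 1;                    // default lower bound
    long goalSize = totalSize / 10;      // totalSize / requested num splits

    long splitSize = computeSplitSize(goalSize, minSize, blockSize);
    // splitSize is capped at blockSize (128 MB), so this file produces
    // roughly 2^40 / 2^27 = 8192 splits.
    System.out.println("split size = " + splitSize
        + ", approx splits = " + (totalSize / splitSize));
  }
}
{code}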






Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-16 Thread sanjay Radia

> On Mar 5, 2018, at 4:08 PM, Andrew Wang  wrote:
> 
> - NN on top of HDSL, where the NN uses the new block layer (both Daryn and Owen 
> acknowledge the benefit of the new block layer).  We have two choices here:
>  ** a) Evolve the NN so that it can interact with both the old and new block layer,
>  ** b) Fork and create a new NN that works only with the new block layer; the old 
> NN will continue to work with the old block layer.
> There are trade-offs, but clearly the 2nd option has the least impact on the old 
> HDFS code.
> 
> Are you proposing that we pursue the 2nd option to integrate HDSL with HDFS?


Originally I would have preferred (a), but Owen made a strong case for (b) in my 
discussions with him last week.
Overall we need a broader discussion around the next steps for NN evolution and 
how to chart the course; I am not locked into any particular path or how we 
would do it. 
Let me make a more detailed response in HDFS-10419.

sanjay



[jira] [Resolved] (HADOOP-14699) Impersonation errors with UGI after second principal relogin

2018-03-16 Thread Jeff Storck (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Storck resolved HADOOP-14699.
--
Resolution: Resolved

This issue will be resolved by HADOOP-9747.

> Impersonation errors with UGI after second principal relogin
> 
>
> Key: HADOOP-14699
> URL: https://issues.apache.org/jira/browse/HADOOP-14699
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.2, 2.7.3, 2.8.1
>Reporter: Jeff Storck
>Priority: Major
>
> Multiple principals that are logged in using UGI instances instantiated from 
> a UGI class loaded by the same classloader will encounter problems when the 
> second principal attempts to relogin and perform an action using UGI.doAs(). 
> An implicit impersonation will occur, and the operation attempted by the 
> second principal after relogging in will fail. There should not be an 
> implicit attempt to impersonate the second principal through the first 
> principal that logged in.
> I have created a GitHub project that exhibits the impersonation error, with 
> brief instructions on how to set up the test and run it: 
> https://github.com/jtstorck/ugi-test
> {noformat}
> 18:44:55.687 [pool-2-thread-2] WARN  
> h.u.u.ugirunnable.ugite...@example.com - Unexpected exception while 
> performing task for [ugite...@example.com (auth:KERBEROS)]
> org.apache.hadoop.ipc.RemoteException: User: ugite...@example.com is not 
> allowed to impersonate ugite...@example.com
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1481)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1427)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1337)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:787)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1700)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1436)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1448)
>   at 
> hadoop.ugitest.UgiTestMain$UgiRunnable.lambda$run$2(UgiTestMain.java:194)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at hadoop.ugitest.UgiTestMain$UgiRunnable.run(UgiTestMain.java:194)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-03-16 Thread Apache Jenkins Server
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/

[Mar 15, 2018 3:06:04 AM] (yqlin) HDFS-13261. Fix incorrect null value check. 
Contributed by Jianfei
[Mar 15, 2018 4:59:51 AM] (xiao) HDFS-13246. FileInputStream redundant closes 
in readReplicasFromCache.
[Mar 15, 2018 7:12:07 AM] (aajisaka) HADOOP-15305. Replace 
FileUtils.writeStringToFile(File, String) with
[Mar 15, 2018 5:14:35 PM] (inigoiri) HDFS-12723.
[Mar 15, 2018 5:18:44 PM] (xyao) HDFS-13251. Avoid using hard coded datanode 
data dirs in unit
[Mar 15, 2018 5:32:30 PM] (inigoiri) HDFS-13224. RBF: Resolvers to support 
mount points across multiple
[Mar 15, 2018 6:02:27 PM] (xyao) HDFS-13280. WebHDFS: Fix NPE in get 
snapshottable directory list call.
[Mar 15, 2018 6:05:14 PM] (stevel) HADOOP-15209. DistCp to eliminate needless 
deletion of files under
[Mar 15, 2018 8:26:01 PM] (wangda) MAPREDUCE-7047. Make HAR tool support 
IndexedLogAggregationController.
[Mar 15, 2018 8:26:45 PM] (wangda) YARN-7952. RM should be able to recover log 
aggregation status after




-1 overall


The following subsystems voted -1:
findbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
   org.apache.hadoop.yarn.api.records.Resource.getResources() may expose 
   internal representation by returning Resource.resources At Resource.java:[line 234] 

Failed junit tests :

   hadoop.fs.TestTrash 
   hadoop.util.TestBasicDiskValidator 
   hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean 
   hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure 
   hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy 
   hadoop.hdfs.tools.TestDFSAdminWithHA 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.server.namenode.ha.TestBootstrapStandby 
   hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
  

   cc:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/diff-compile-cc-root.txt [4.0K]

   javac:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/diff-compile-javac-root.txt [288K]

   checkstyle:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/diff-checkstyle-root.txt [17M]

   pylint:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/diff-patch-pylint.txt [24K]

   shellcheck:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/diff-patch-shellcheck.txt [20K]

   shelldocs:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/diff-patch-shelldocs.txt [12K]

   whitespace:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/whitespace-eol.txt [9.2M]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/whitespace-tabs.txt [288K]

   xml:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/xml.txt [4.0K]

   findbugs:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-warnings.html [8.0K]

   javadoc:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/diff-javadoc-javadoc-root.txt [760K]

   unit:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [184K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [440K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt [48K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell.txt [12K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/722/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt [84K]
Powered by Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org


[jira] [Created] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Saurabh Padhy (JIRA)
Saurabh Padhy created HADOOP-15319:
--

 Summary: hadoop fs -rm command misbehaves on recent hadoop version 
2.5.0
 Key: HADOOP-15319
 URL: https://issues.apache.org/jira/browse/HADOOP-15319
 Project: Hadoop Common
  Issue Type: Bug
  Components: bin
Affects Versions: 2.5.0
Reporter: Saurabh Padhy


This issue is regarding the hadoop fs -rm command. 

In hadoop version 2.4.0, when we execute "hadoop fs -rm /a/b/c/*", it removes 
only the files inside the c directory.

But in versions 2.5.0 and higher, when we execute "hadoop fs -rm /a/b/c/*" or 
"hdfs dfs -rm /a/b/c/*", it removes the subdirectories as well as the files. A 
reproduction sketch is shown below.
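
A minimal reproduction sketch, assuming a hypothetical layout where /a/b/c 
contains a file f1 and a subdirectory d1:

{noformat}
$ hadoop fs -mkdir -p /a/b/c/d1
$ hadoop fs -touchz /a/b/c/f1
$ hadoop fs -rm /a/b/c/*
# 2.4.0 behavior (as reported): only the file f1 is removed
# 2.5.0+ behavior (as reported): the directory d1 is removed as well
{noformat}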

Please look into the issue.


