[jira] [Resolved] (HDDS-97) Create Version File in Datanode
[ https://issues.apache.org/jira/browse/HDDS-97?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hanisha Koneru resolved HDDS-97.

Resolution: Duplicate

The new layout of {{*<>/hdds/VERSION*}} will be handled in HDDS-156.
Reverted the commit from branch HDDS-48.

> Create Version File in Datanode
> ---
>
>                 Key: HDDS-97
>                 URL: https://issues.apache.org/jira/browse/HDDS-97
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>             Fix For: 0.2.1
>
>         Attachments: HDDS-97-HDDS-48.00.patch, HDDS-97-HDDS-48.01.patch,
> HDDS-97-HDDS-48.02.patch, HDDS-97-HDDS-48.03.patch, HDDS-97-HDDS-48.04.patch
>
>
> Create a versionFile in dfs.datanode.dir/hdds/ path.
> The content of the versionFile:
> # scmUuid
> # CTime
> # layOutVersion
> When a datanode makes a request for SCMVersion, we send scmUuid in the
> response.
> With this response, we should be able to create the version file on the datanode.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
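The three fields listed in the issue description can be illustrated with a minimal sketch. This is not the HDDS implementation: the helper names `write_version_file` and `read_version_file`, the key=value file format, the temporary-file rename, and the use of epoch milliseconds for CTime are all assumptions made for illustration. The issue itself names only the three fields and the hdds/ directory.

```python
import os
import tempfile
import time


def write_version_file(hdds_dir, scm_uuid, layout_version, ctime_ms=None):
    """Write a VERSION file holding the three fields named in the issue.

    Hypothetical helper: the key=value layout is an assumption, not the
    on-disk format used by HDDS.
    """
    if ctime_ms is None:
        ctime_ms = int(time.time() * 1000)  # assumed unit: epoch milliseconds
    os.makedirs(hdds_dir, exist_ok=True)
    path = os.path.join(hdds_dir, "VERSION")
    lines = [
        f"scmUuid={scm_uuid}",
        f"CTime={ctime_ms}",
        f"layOutVersion={layout_version}",
    ]
    # Write to a temporary name, then rename, so that a crash part-way
    # through cannot leave a half-written VERSION file behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    os.replace(tmp_path, path)
    return path


def read_version_file(path):
    """Parse the key=value file back into a dict of strings."""
    fields = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                key, _, value = line.partition("=")
                fields[key] = value
    return fields


if __name__ == "__main__":
    # Stand-in for one dfs.datanode.dir entry; the scmUuid value is made up.
    data_dir = tempfile.mkdtemp()
    version_path = write_version_file(
        os.path.join(data_dir, "hdds"), "example-scm-uuid", 1)
    print(read_version_file(version_path))
```

In this sketch the scmUuid would come from the SCMVersion response mentioned in the description; with the per-disk layout, the helper would be called once for each configured data directory.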
[jira] [Reopened] (HDDS-97) Create Version File in Datanode
[ https://issues.apache.org/jira/browse/HDDS-97?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hanisha Koneru reopened HDDS-97:

Instead of creating a version file per SCM, we will have a version file per disk.
The new layout is: *<>/hdds/VERSION*

> Create Version File in Datanode
> ---
>
>                 Key: HDDS-97
>                 URL: https://issues.apache.org/jira/browse/HDDS-97
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>             Fix For: 0.2.1
>
>         Attachments: HDDS-97-HDDS-48.00.patch, HDDS-97-HDDS-48.01.patch,
> HDDS-97-HDDS-48.02.patch, HDDS-97-HDDS-48.03.patch, HDDS-97-HDDS-48.04.patch
>
>
> Create a versionFile in dfs.datanode.dir/hdds/ path.
> The content of the versionFile:
> # scmUuid
> # CTime
> # layOutVersion
> When a datanode makes a request for SCMVersion, we send scmUuid in the
> response.
> With this response, we should be able to create the version file on the datanode.
[jira] [Created] (HDDS-160) Refactor KeyManager and KeyManagerImpl
Bharat Viswanadham created HDDS-160:
---

             Summary: Refactor KeyManager and KeyManagerImpl
                 Key: HDDS-160
                 URL: https://issues.apache.org/jira/browse/HDDS-160
             Project: Hadoop Distributed Data Store
          Issue Type: Sub-task
            Reporter: Bharat Viswanadham
            Assignee: Bharat Viswanadham


This Jira is to add a new interface, ContainerManager, to perform key-related operations.

Add a new class, KeyValueManager, to implement ContainerManager.
Re: [VOTE] Release Apache Hadoop 3.0.3 (RC0)
Sorry, Yongjun. My +1 is also binding.

+1 (binding)

-Eric Payne

On Friday, June 1, 2018, 12:25:36 PM CDT, Eric Payne wrote:

Thanks a lot, Yongjun, for your hard work on this release.

+1
- Built from source
- Installed on 6 node pseudo cluster

Tested the following in the Capacity Scheduler:
- Verified that running apps in labelled queues restricts tasks to the labelled nodes.
- Verified that various queue config properties for CS are refreshable
- Verified streaming jobs work as expected
- Verified that user weights work as expected
- Verified that FairOrderingPolicy in a CS queue will evenly assign resources
- Verified running yarn shell application runs as expected

On Friday, June 1, 2018, 12:48:26 AM CDT, Yongjun Zhang wrote:

Greetings all,

I've created the first release candidate (RC0) for Apache Hadoop 3.0.3. This is our next maintenance release to follow up 3.0.2. It includes about 249 important fixes and improvements, among which there are 8 blockers. See
https://issues.apache.org/jira/issues/?filter=12343997

The RC artifacts are available at:
https://dist.apache.org/repos/dist/dev/hadoop/3.0.3-RC0/

The maven artifacts are available via
https://repository.apache.org/content/repositories/orgapachehadoop-1126

Please try the release and vote; the vote will run for the usual 5 working days, ending on 06/07/2018 PST. I would really appreciate your participation here.

I ran into quite a few issues along the way; many thanks to the people who helped, especially Sammi Chen, Andrew Wang, Junping Du, and Eddy Xu.

Thanks,

--Yongjun
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/

No changes

-1 overall

The following subsystems voted -1:
    asflicense findbugs pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    FindBugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
       Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadListener; locked 75% of time Unsynchronized access at AllocationFileLoaderService.java:[line 117]

    Failed junit tests :

       hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
       hadoop.hdfs.web.TestWebHdfsTimeouts
       hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities
       hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
       hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageDomain
       hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
       hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageSchema
       hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
       hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
       hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
       hadoop.mapred.TestMRTimelineEventHandling
       hadoop.yarn.sls.appmaster.TestAMSimulator

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/diff-compile-cc-root.txt [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/diff-compile-javac-root.txt [336K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/diff-checkstyle-root.txt [4.0K]

   pathlen:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/pathlen.txt [12K]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/diff-patch-pylint.txt [24K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/diff-patch-shellcheck.txt [20K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/diff-patch-shelldocs.txt [16K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/whitespace-eol.txt [9.4M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/whitespace-tabs.txt [1.1M]

   xml:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/xml.txt [4.0K]

   findbugs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-warnings.html [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-hdds_client.txt [56K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-hdds_container-service.txt [52K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-hdds_server-scm.txt [60K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-hdds_tools.txt [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-ozone_client.txt [4.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-ozone_common.txt [24K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-ozone_objectstore-service.txt [4.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-ozone_ozone-manager.txt [4.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-ozone_tools.txt [4.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/branch-findbugs-hadoop-tools_hadoop-ozone.txt [8.0K]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/808/artifact/out/diff-javadoc-javadoc-root.txt [760K]

   unit:
[jira] [Created] (HDFS-13669) YARN in HA not failing over to a new resource manager.
Danil Serdyuchenko created HDFS-13669:
---

             Summary: YARN in HA not failing over to a new resource manager.
                 Key: HDFS-13669
                 URL: https://issues.apache.org/jira/browse/HDFS-13669
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.7.1
            Reporter: Danil Serdyuchenko


We are running YARN in HA mode (rm1 and rm2). We hit an issue when recreating one of the RMs.
# Recreated a standby RM (rm2), which gave it a new IP.
# Stopped the active RM (rm1).
# NMs tried to fail over to rm2, but timed out because of the old IP.
# NMs reached the configured 30 failover retries and shut down.

We get the following logs.
{noformat}
18/06/06 15:36:32 WARN ipc.Client: Address change detected. Old: yarnrm2/x.x.x.x:8031 New: yarnrm2/y.y.y.y:8031
18/06/06 15:36:32 INFO retry.RetryInvocationHandler: Exception while invoking nodeHeartbeat of class ResourceTrackerPBClientImpl over rm2 after 25 fail over attempts. Trying to fail over after sleeping for 37191ms.
org.apache.hadoop.net.ConnectTimeoutException: Call From ip-a-a-a-a/a.a.a.a to yarnrm2:8031 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 2 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=yarnrm2/x.x.x.x:8031]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751)
    at org.apache.hadoop.ipc.Client.call(Client.java:1480)
    at org.apache.hadoop.ipc.Client.call(Client.java:1407)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy28.nodeHeartbeat(Unknown Source)
    at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.nodeHeartbeat(ResourceTrackerPBClientImpl.java:80)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy29.nodeHeartbeat(Unknown Source)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:596)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 2 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=yarnrm2.grappler.eu-west-1.prod.aws.skyscanner.local/10.51.104.136:8031]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529)
    at org.apache.hadoop.ipc.Client.call(Client.java:1446)
    ... 12 more{noformat}
We get this and fail over back to rm1 30 times until:
{noformat}
18/06/06 15:39:44 WARN retry.RetryInvocationHandler: Exception while invoking class org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.nodeHeartbeat over rm1. Not retrying because failovers (30) exceeded maximum allowed (30){noformat}
From the logs it appears that the timeouts happen because the client is trying to connect to the old IP (x.x.x.x in the logs). Looking at the code of the Client class, after the updateAddress method call we would expect a retry with the new server IP (the "Retrying connect to server ..." log) up to ipc.client.connect.max.retries.on.timeouts times. However, we never see the retry logs and the call just fails with an exception. The above setting is left at its default of 45 on all of our NMs.
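The failure mode suspected in this report, a client that keeps connecting to an address resolved before the host's IP changed, can be illustrated with a small simulation. This is a sketch and not Hadoop's ipc.Client: the `Connection` class, the `resolver` dictionary standing in for DNS, the `reachable` set, and the example addresses are all hypothetical. It shows only why a retry loop has to pick up the re-resolved address for the retries to have any chance of succeeding.

```python
class Connection:
    """Hypothetical client connection that caches the resolved address."""

    def __init__(self, host, port, resolver, reachable):
        self.host = host
        self.port = port
        self.resolver = resolver        # stand-in for DNS: hostname -> IP
        self.reachable = reachable      # IPs that currently accept connections
        self.address = resolver[host]   # resolved once, then cached

    def update_address(self):
        """Re-resolve the hostname; return True if the address changed."""
        current = self.resolver[self.host]
        if current != self.address:
            self.address = current
            return True
        return False

    def connect(self, max_retries, refresh_on_failure):
        """Return the attempt number that succeeded, or None if all failed."""
        for attempt in range(1, max_retries + 1):
            if self.address in self.reachable:
                return attempt
            if refresh_on_failure:
                # Pick up a changed IP before the next attempt.
                self.update_address()
        return None


def demo():
    dns = {"yarnrm2": "192.0.2.1"}
    stale = Connection("yarnrm2", 8031, dns, reachable={"192.0.2.2"})
    fresh = Connection("yarnrm2", 8031, dns, reachable={"192.0.2.2"})
    # rm2 is recreated and comes back with a new IP.
    dns["yarnrm2"] = "192.0.2.2"
    return (stale.connect(3, refresh_on_failure=False),
            fresh.connect(3, refresh_on_failure=True))


if __name__ == "__main__":
    print(demo())
```

In the simulation, the connection that never refreshes its cached address exhausts its retries, while the one that re-resolves succeeds on the second attempt. Whether the real client behaves like the first case is exactly what the report is asking.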
Apache Hadoop qbt Report: trunk+JDK8 on Windows/x64
For more details, see https://builds.apache.org/job/hadoop-trunk-win/494/

[Jun 9, 2018 2:33:38 PM] (stevel) HADOOP-15520. Add tests for various org.apache.hadoop.util classes.
[Jun 9, 2018 11:43:03 PM] (Bharat) HDFS-13667:Typo: Marking all datandoes as stale. Contributed by Nanda

-1 overall

The following subsystems voted -1:
    compile mvninstall pathlen unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc javac

The following subsystems are considered long running:
(runtime bigger than 1h 00m 00s)
    unit

Specific tests:

    Failed junit tests :

       hadoop.crypto.TestCryptoStreamsWithOpensslAesCtrCryptoCodec
       hadoop.fs.contract.rawlocal.TestRawlocalContractAppend
       hadoop.fs.TestFileUtil
       hadoop.fs.TestFsShellCopy
       hadoop.fs.TestFsShellList
       hadoop.fs.TestLocalFileSystem
       hadoop.http.TestHttpServer
       hadoop.http.TestHttpServerLogs
       hadoop.io.compress.TestCodec
       hadoop.io.nativeio.TestNativeIO
       hadoop.ipc.TestIPC
       hadoop.ipc.TestSocketFactory
       hadoop.metrics2.impl.TestStatsDMetrics
       hadoop.security.TestGroupsCaching
       hadoop.security.TestSecurityUtil
       hadoop.security.TestShellBasedUnixGroupsMapping
       hadoop.security.token.TestDtUtilShell
       hadoop.util.TestDiskCheckerWithDiskIo
       hadoop.util.TestNativeCodeLoader
       hadoop.hdfs.qjournal.server.TestJournalNode
       hadoop.hdfs.qjournal.server.TestJournalNodeSync
       hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages
       hadoop.hdfs.server.datanode.fsdataset.impl.TestProvidedImpl
       hadoop.hdfs.server.datanode.TestBlockPoolSliceStorage
       hadoop.hdfs.server.datanode.TestDataNodeFaultInjector
       hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
       hadoop.hdfs.server.datanode.TestDirectoryScanner
       hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand
       hadoop.hdfs.server.diskbalancer.TestDiskBalancerRPC
       hadoop.hdfs.server.mover.TestMover
       hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
       hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
       hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
       hadoop.hdfs.server.namenode.TestEditLogRace
       hadoop.hdfs.server.namenode.TestStartup
       hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs
       hadoop.hdfs.TestDecommissionWithStriped
       hadoop.hdfs.TestDFSShell
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailure
       hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
       hadoop.hdfs.TestDFSUpgradeFromImage
       hadoop.hdfs.TestFetchImage
       hadoop.hdfs.TestFileConcurrentReader
       hadoop.hdfs.TestHDFSFileSystemContract
       hadoop.hdfs.TestPread
       hadoop.hdfs.TestSecureEncryptionZoneWithKMS
       hadoop.hdfs.TestTrashWithSecureEncryptionZones
       hadoop.hdfs.tools.TestDFSAdmin
       hadoop.hdfs.tools.TestDFSAdminWithHA
       hadoop.hdfs.web.TestWebHDFS
       hadoop.hdfs.web.TestWebHdfsUrl
       hadoop.fs.http.server.TestHttpFSServerWebServer
       hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
       hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl
       hadoop.yarn.server.nodemanager.containermanager.TestAuxServices
       hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
       hadoop.yarn.server.nodemanager.TestContainerExecutor
       hadoop.yarn.server.nodemanager.TestLocalDirsHandlerService
       hadoop.yarn.server.nodemanager.TestNodeManagerResync
       hadoop.yarn.server.webproxy.amfilter.TestAmFilter
       hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer
       hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilterForV1
       hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.TestFSSchedulerConfigurationStore
       hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.TestLeveldbConfigurationStore
       hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
       hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy
       hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementProcessor
       hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
       hadoop.yarn.server.resourcemanager.TestResourceTrackerService
       hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification
       hadoop.yarn.server.timeline.TestEntityGroupFSTimelineStore
       hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
       hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
       hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun