[ https://issues.apache.org/jira/browse/MYRIAD-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Miguel Bernadin updated MYRIAD-191:
-----------------------------------
    Description:
When running teragen for 1 billion rows (teragen 1000000000), the job fails on Myriad but works fine on MapR 5.0. The application master dies 3 to 8 minutes into the job. The issue is reproducible.

We're running 28 large NMs, each with 10 vCPUs and 12288 MB of RAM. The default is two mappers, so only three containers run in total, including the application master, which dies every time.

I've also noticed that NMs fail more often on Myriad than on MapR, as the images below show:

Myriad: https://goo.gl/photos/oGuzcYCNkNSkx9N26
MapR: https://goo.gl/photos/KnYXaSJWMrxF684C7

The logs follow. I scrubbed the hostnames for easier readability.

Application Master logs:
----------------------------------------------------------------
root@foo:/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1458608803637_0005/container_1458608803637_0005_01_000001# cat syslog
2016-03-22 15:03:53,985 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1458608803637_0005_000001 2016-03-22 15:03:54,362 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens: 2016-03-22 15:03:54,362 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 5 cluster_timestamp: 1458608803637 } attemptId: 1 } keyId: 1800690504) 2016-03-22 15:03:54,501 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2016-03-22 15:03:54,615 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null 2016-03-22 15:03:54,673 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1 2016-03-22 15:03:54,675 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter 2016-03-22 15:03:54,710 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler 2016-03-22 15:03:54,711 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher 2016-03-22 15:03:54,712 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher 2016-03-22 15:03:54,713 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher 2016-03-22 15:03:54,714 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler 2016-03-22 15:03:54,721 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher 2016-03-22 15:03:54,721 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class 
org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter 2016-03-22 15:03:54,731 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter 2016-03-22 15:03:54,765 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system is set solely by core-default.xml therefore - ignoring 2016-03-22 15:03:54,790 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system is set solely by core-default.xml therefore - ignoring 2016-03-22 15:03:54,815 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system is set solely by core-default.xml therefore - ignoring 2016-03-22 15:03:54,831 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled 2016-03-22 15:03:54,883 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler 2016-03-22 15:03:55,146 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-03-22 15:03:55,208 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 
2016-03-22 15:03:55,208 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started 2016-03-22 15:03:55,216 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1458608803637_0005 to jobTokenSecretManager 2016-03-22 15:03:55,237 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1458608803637_0005 because: not enabled; 2016-03-22 15:03:55,256 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1458608803637_0005 = 0. Number of splits = 2 2016-03-22 15:03:55,256 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1458608803637_0005 = 0 2016-03-22 15:03:55,256 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1458608803637_0005Job Transitioned from NEW to INITED 2016-03-22 15:03:55,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1458608803637_0005. 
2016-03-22 15:03:55,311 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue 2016-03-22 15:03:55,329 INFO [Socket Reader #1 for port 38420] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 38420 2016-03-22 15:03:55,347 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server 2016-03-22 15:03:55,348 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting 2016-03-22 15:03:55,349 INFO [IPC Server listener on 38420] org.apache.hadoop.ipc.Server: IPC Server listener on 38420: starting 2016-03-22 15:03:55,352 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at foo.local/10.1.194.76:38420 2016-03-22 15:03:55,497 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 2016-03-22 15:03:55,511 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined 2016-03-22 15:03:55,517 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) 2016-03-22 15:03:55,543 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce 2016-03-22 15:03:55,543 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static 2016-03-22 15:03:55,547 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/* 2016-03-22 15:03:55,547 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/* 2016-03-22 15:03:55,557 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 35270 2016-03-22 15:03:55,557 INFO 
[main] org.mortbay.log: jetty-6.1.26 2016-03-22 15:03:55,587 INFO [main] org.mortbay.log: Extract jar:file:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/hadoop-yarn-common-2.7.0-mapr-1506.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_35270_mapreduce____.lnydg0/webapp 2016-03-22 15:03:55,849 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:35270 2016-03-22 15:03:55,849 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 35270 2016-03-22 15:03:56,189 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules 2016-03-22 15:03:56,195 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: JOB_CREATE job_1458608803637_0005 2016-03-22 15:03:56,195 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue 2016-03-22 15:03:56,196 INFO [Socket Reader #1 for port 57491] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 57491 2016-03-22 15:03:56,202 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting 2016-03-22 15:03:56,202 INFO [IPC Server listener on 57491] org.apache.hadoop.ipc.Server: IPC Server listener on 57491: starting 2016-03-22 15:03:56,229 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true 2016-03-22 15:03:56,229 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3 2016-03-22 15:03:56,229 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33 2016-03-22 15:03:56,281 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at rm.marathon.mesos/10.1.194.73:8030 2016-03-22 15:03:56,418 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: maxContainerCapability: <memory:8192, vCores:10, disks:4.0> 2016-03-22 15:03:56,419 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: queue: root.mapr 2016-03-22 15:03:56,423 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500 2016-03-22 15:03:56,423 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: The thread pool initial size is 10 2016-03-22 15:03:56,427 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 2016-03-22 15:03:56,434 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1458608803637_0005Job Transitioned from INITED to SETUP 2016-03-22 15:03:56,449 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP 2016-03-22 15:03:56,461 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1458608803637_0005Job Transitioned from SETUP to RUNNING 2016-03-22 15:03:56,482 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1458608803637_0005_m_000000 Task Transitioned from NEW to SCHEDULED 2016-03-22 15:03:56,482 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1458608803637_0005_m_000001 Task Transitioned from NEW to SCHEDULED 2016-03-22 15:03:56,484 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED 2016-03-22 15:03:56,484 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED 2016-03-22 15:03:56,485 INFO [Thread-48] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1, disks:0.5> 2016-03-22 15:03:56,526 
INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1458608803637_0005, File: maprfs:/var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job_1458608803637_0005_1.jhist 2016-03-22 15:03:57,422 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:2 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0 2016-03-22 15:03:57,461 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1458608803637_0005: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:636928, vCores:519, disks:208.0> knownNMs=52 2016-03-22 15:03:58,475 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 2 2016-03-22 15:03:58,478 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved svdidac017.techlabs.accenture.com to /default-rack 2016-03-22 15:03:58,479 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved svdidac023.techlabs.accenture.com to /default-rack 2016-03-22 15:03:58,480 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1458608803637_0005_01_000002 to attempt_1458608803637_0005_m_000000_0 2016-03-22 15:03:58,482 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1458608803637_0005_01_000003 to attempt_1458608803637_0005_m_000001_0 2016-03-22 15:03:58,482 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 
HostLocal:0 RackLocal:0 2016-03-22 15:03:58,549 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.Task: mapOutputFile class: org.apache.hadoop.mapred.MapRFsOutputFile 2016-03-22 15:03:58,552 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved svdidac017.techlabs.accenture.com to /default-rack 2016-03-22 15:03:58,575 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is maprfs:///var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job.jar 2016-03-22 15:03:58,580 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is maprfs:/var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job.xml 2016-03-22 15:03:58,582 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #0 tokens and #1 secret keys for NM use for launching container 2016-03-22 15:03:58,582 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertokens_dob is 1 2016-03-22 15:03:58,584 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle token in serviceData 2016-03-22 15:03:58,596 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding ShuffleProvider Service: mapr_direct_shuffle to serviceData 2016-03-22 15:03:58,646 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED 2016-03-22 15:03:58,651 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.Task: mapOutputFile class: org.apache.hadoop.mapred.MapRFsOutputFile 2016-03-22 15:03:58,651 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.yarn.util.RackResolver: Resolved svdidac023.techlabs.accenture.com to /default-rack 2016-03-22 15:03:58,652 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED 2016-03-22 15:03:58,654 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1458608803637_0005_01_000002 taskAttempt attempt_1458608803637_0005_m_000000_0 2016-03-22 15:03:58,655 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1458608803637_0005_01_000003 taskAttempt attempt_1458608803637_0005_m_000001_0 2016-03-22 15:03:58,658 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1458608803637_0005_m_000000_0 2016-03-22 15:03:58,658 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1458608803637_0005_m_000001_0 2016-03-22 15:03:58,659 INFO [ContainerLauncher #0] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : svdidac017.techlabs.accenture.com:26802 2016-03-22 15:03:58,685 INFO [ContainerLauncher #1] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : svdidac023.techlabs.accenture.com:42464 2016-03-22 15:03:59,016 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: Shuffle port returned by ContainerManager for attempt_1458608803637_0005_m_000001_0 : 10394 2016-03-22 15:03:59,016 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: Shuffle port returned by ContainerManager for 
attempt_1458608803637_0005_m_000000_0 : 60124 2016-03-22 15:03:59,018 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1458608803637_0005_m_000001_0] using containerId: [container_1458608803637_0005_01_000003 on NM: [svdidac023.techlabs.accenture.com:42464] 2016-03-22 15:03:59,022 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from ASSIGNED to RUNNING 2016-03-22 15:03:59,023 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1458608803637_0005_m_000000_0] using containerId: [container_1458608803637_0005_01_000002 on NM: [svdidac017.techlabs.accenture.com:26802] 2016-03-22 15:03:59,023 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1458608803637_0005_m_000001 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1458608803637_0005_m_000001 Task Transitioned from SCHEDULED to RUNNING 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1458608803637_0005_m_000000 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1458608803637_0005_m_000000 Task Transitioned from SCHEDULED to RUNNING 2016-03-22 15:03:59,484 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1458608803637_0005: ask=1 release= 0 newContainers=0 finishedContainers=0 
resourcelimit=<memory:634880, vCores:517, disks:207.0> knownNMs=52 2016-03-22 15:04:02,438 INFO [Socket Reader #1 for port 57491] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1458608803637_0005 (auth:SIMPLE) 2016-03-22 15:04:02,458 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1458608803637_0005_m_000003 asked for a task 2016-03-22 15:04:02,458 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1458608803637_0005_m_000003 given task: attempt_1458608803637_0005_m_000001_0 2016-03-22 15:04:04,287 INFO [Socket Reader #1 for port 57491] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1458608803637_0005 (auth:SIMPLE) 2016-03-22 15:04:04,301 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1458608803637_0005_m_000002 asked for a task 2016-03-22 15:04:04,301 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1458608803637_0005_m_000002 given task: attempt_1458608803637_0005_m_000000_0 2016-03-22 15:04:08,858 INFO [IPC Server handler 3 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.007079788 2016-03-22 15:04:10,683 INFO [IPC Server handler 3 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000000_0 is : 2.622E-6 2016-03-22 15:04:11,887 INFO [IPC Server handler 5 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.010750526 2016-03-22 15:04:14,910 INFO [IPC Server handler 6 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.014476808 2016-03-22 15:04:17,932 INFO [IPC Server handler 1 on 57491] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.018156094 2016-03-22 15:04:20,956 INFO [IPC Server handler 4 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.021778924 2016-03-22 15:04:23,978 INFO [IPC Server handler 8 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.025545845 2016-03-22 15:04:27,003 INFO [IPC Server handler 7 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.029212331 2016-03-22 15:04:30,024 INFO [IPC Server handler 10 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.032808386 2016-03-22 15:04:33,044 INFO [IPC Server handler 18 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.03653345 2016-03-22 15:04:36,066 INFO [IPC Server handler 11 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.040224314 2016-03-22 15:04:39,089 INFO [IPC Server handler 19 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.043970145 2016-03-22 15:04:42,112 INFO [IPC Server handler 23 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.047755376 2016-03-22 15:04:45,132 INFO [IPC Server handler 9 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.051441472 2016-03-22 15:04:48,151 INFO [IPC Server handler 13 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1458608803637_0005_m_000001_0 is : 0.05514416 2016-03-22 15:04:51,175 INFO [IPC Server handler 12 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.058915816 2016-03-22 15:04:54,193 INFO [IPC Server handler 20 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.06264193 2016-03-22 15:04:57,213 INFO [IPC Server handler 15 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.06635544 2016-03-22 15:05:00,236 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.07005012 2016-03-22 15:05:03,256 INFO [IPC Server handler 2 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.073812515 2016-03-22 15:05:06,277 INFO [IPC Server handler 3 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.07749763 2016-03-22 15:05:09,295 INFO [IPC Server handler 5 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.08121116 2016-03-22 15:05:12,313 INFO [IPC Server handler 6 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.08495909 2016-03-22 15:05:15,333 INFO [IPC Server handler 1 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.08868403 2016-03-22 15:05:18,354 INFO [IPC Server handler 8 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.09239528 2016-03-22 15:05:21,372 INFO [IPC Server handler 4 
on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.09613607 2016-03-22 15:05:24,391 INFO [IPC Server handler 7 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.09985288 2016-03-22 15:05:27,406 INFO [IPC Server handler 10 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.10355474 2016-03-22 15:05:30,422 INFO [IPC Server handler 18 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.107317574 2016-03-22 15:05:33,438 INFO [IPC Server handler 11 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.11103281 2016-03-22 15:05:36,453 INFO [IPC Server handler 19 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.11472398

Hadoop process logs:
-----------------------------------------------------
mapr@foo002:/root$ time hadoop jar /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0-mapr-1506.jar teragen 1000000000 /tmp/myriad-test
16/03/22 15:03:47 INFO client.RMProxy: Connecting to ResourceManager at rm.marathon.mesos/10.1.194.73:8032
16/03/22 15:03:48 INFO terasort.TeraSort: Generating 1000000000 using 2
16/03/22 15:03:48 INFO mapreduce.JobSubmitter: number of splits:2
16/03/22 15:03:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458608803637_0005
16/03/22 15:03:48 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/03/22 15:03:48 INFO impl.YarnClientImpl: Submitted application application_1458608803637_0005
16/03/22 15:03:48 INFO mapreduce.Job: The url to track the job: http://rm.marathon.mesos:8088/proxy/application_1458608803637_0005/
16/03/22 15:03:48 INFO mapreduce.Job: Running job: job_1458608803637_0005
16/03/22 15:03:57 INFO mapreduce.Job: Job job_1458608803637_0005 running in uber mode : false
16/03/22 15:03:57 INFO mapreduce.Job: map 0% reduce 0%
16/03/22 15:04:13 INFO mapreduce.Job: map 1% reduce 0%
16/03/22 15:04:31 INFO mapreduce.Job: map 2% reduce 0%
16/03/22 15:04:46 INFO mapreduce.Job: map 3% reduce 0%
16/03/22 15:05:02 INFO mapreduce.Job: map 4% reduce 0%
16/03/22 15:05:20 INFO mapreduce.Job: map 5% reduce 0%
16/03/22 15:05:35 INFO mapreduce.Job: map 6% reduce 0%
16/03/22 15:07:02 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/03/22 15:07:03 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/03/22 15:07:04 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/03/22 15:07:05 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/03/22 15:07:06 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
1

  was:
When running teragen for 1 billion rows (teragen 1000000000), the job fails on Myriad but works fine on MapR 5.0. The application master dies 3 to 8 minutes into the job. I've also noticed that NMs fail more often on Myriad than on MapR.
As you can see in this image below Myriad: https://goo.gl/photos/oGuzcYCNkNSkx9N26 MapR: https://goo.gl/photos/KnYXaSJWMrxF684C7 Here are the logs below: I scrubbed the hostnames for easier readability. Application Master logs: ---------------------------------------------------------------- root@foo:/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1458608803637_0005/container_1458608803637_0005_01_000001# cat syslog 2016-03-22 15:03:53,985 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1458608803637_0005_000001 2016-03-22 15:03:54,362 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens: 2016-03-22 15:03:54,362 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 5 cluster_timestamp: 1458608803637 } attemptId: 1 } keyId: 1800690504) 2016-03-22 15:03:54,501 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter. 
2016-03-22 15:03:54,615 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2016-03-22 15:03:54,673 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
2016-03-22 15:03:54,675 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2016-03-22 15:03:54,710 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2016-03-22 15:03:54,711 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2016-03-22 15:03:54,712 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2016-03-22 15:03:54,713 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2016-03-22 15:03:54,714 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2016-03-22 15:03:54,721 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2016-03-22 15:03:54,721 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2016-03-22 15:03:54,731 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2016-03-22 15:03:54,765 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system is set solely by core-default.xml therefore - ignoring
2016-03-22 15:03:54,790 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system is set solely by core-default.xml therefore - ignoring
2016-03-22 15:03:54,815 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system is set solely by core-default.xml therefore - ignoring
2016-03-22 15:03:54,831 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled
2016-03-22 15:03:54,883 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2016-03-22 15:03:55,146 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2016-03-22 15:03:55,208 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-03-22 15:03:55,208 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2016-03-22 15:03:55,216 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1458608803637_0005 to jobTokenSecretManager
2016-03-22 15:03:55,237 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1458608803637_0005 because: not enabled;
2016-03-22 15:03:55,256 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1458608803637_0005 = 0. Number of splits = 2
2016-03-22 15:03:55,256 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1458608803637_0005 = 0
2016-03-22 15:03:55,256 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1458608803637_0005Job Transitioned from NEW to INITED
2016-03-22 15:03:55,258 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1458608803637_0005.
2016-03-22 15:03:55,311 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue 2016-03-22 15:03:55,329 INFO [Socket Reader #1 for port 38420] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 38420 2016-03-22 15:03:55,347 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server 2016-03-22 15:03:55,348 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting 2016-03-22 15:03:55,349 INFO [IPC Server listener on 38420] org.apache.hadoop.ipc.Server: IPC Server listener on 38420: starting 2016-03-22 15:03:55,352 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at foo.local/10.1.194.76:38420 2016-03-22 15:03:55,497 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 2016-03-22 15:03:55,511 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined 2016-03-22 15:03:55,517 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) 2016-03-22 15:03:55,543 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce 2016-03-22 15:03:55,543 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static 2016-03-22 15:03:55,547 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/* 2016-03-22 15:03:55,547 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/* 2016-03-22 15:03:55,557 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 35270 2016-03-22 15:03:55,557 INFO 
[main] org.mortbay.log: jetty-6.1.26 2016-03-22 15:03:55,587 INFO [main] org.mortbay.log: Extract jar:file:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/hadoop-yarn-common-2.7.0-mapr-1506.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_35270_mapreduce____.lnydg0/webapp 2016-03-22 15:03:55,849 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:35270 2016-03-22 15:03:55,849 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 35270 2016-03-22 15:03:56,189 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules 2016-03-22 15:03:56,195 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: JOB_CREATE job_1458608803637_0005 2016-03-22 15:03:56,195 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue 2016-03-22 15:03:56,196 INFO [Socket Reader #1 for port 57491] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 57491 2016-03-22 15:03:56,202 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting 2016-03-22 15:03:56,202 INFO [IPC Server listener on 57491] org.apache.hadoop.ipc.Server: IPC Server listener on 57491: starting 2016-03-22 15:03:56,229 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true 2016-03-22 15:03:56,229 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3 2016-03-22 15:03:56,229 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33 2016-03-22 15:03:56,281 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at rm.marathon.mesos/10.1.194.73:8030 2016-03-22 15:03:56,418 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: maxContainerCapability: <memory:8192, vCores:10, disks:4.0> 2016-03-22 15:03:56,419 INFO [main] 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: queue: root.mapr 2016-03-22 15:03:56,423 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500 2016-03-22 15:03:56,423 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: The thread pool initial size is 10 2016-03-22 15:03:56,427 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 2016-03-22 15:03:56,434 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1458608803637_0005Job Transitioned from INITED to SETUP 2016-03-22 15:03:56,449 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP 2016-03-22 15:03:56,461 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1458608803637_0005Job Transitioned from SETUP to RUNNING 2016-03-22 15:03:56,482 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1458608803637_0005_m_000000 Task Transitioned from NEW to SCHEDULED 2016-03-22 15:03:56,482 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1458608803637_0005_m_000001 Task Transitioned from NEW to SCHEDULED 2016-03-22 15:03:56,484 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED 2016-03-22 15:03:56,484 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED 2016-03-22 15:03:56,485 INFO [Thread-48] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1, disks:0.5> 2016-03-22 15:03:56,526 
INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1458608803637_0005, File: maprfs:/var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job_1458608803637_0005_1.jhist 2016-03-22 15:03:57,422 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:2 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0 2016-03-22 15:03:57,461 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1458608803637_0005: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:636928, vCores:519, disks:208.0> knownNMs=52 2016-03-22 15:03:58,475 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 2 2016-03-22 15:03:58,478 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved svdidac017.techlabs.accenture.com to /default-rack 2016-03-22 15:03:58,479 INFO [RMCommunicator Allocator] org.apache.hadoop.yarn.util.RackResolver: Resolved svdidac023.techlabs.accenture.com to /default-rack 2016-03-22 15:03:58,480 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1458608803637_0005_01_000002 to attempt_1458608803637_0005_m_000000_0 2016-03-22 15:03:58,482 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1458608803637_0005_01_000003 to attempt_1458608803637_0005_m_000001_0 2016-03-22 15:03:58,482 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 
HostLocal:0 RackLocal:0 2016-03-22 15:03:58,549 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.Task: mapOutputFile class: org.apache.hadoop.mapred.MapRFsOutputFile 2016-03-22 15:03:58,552 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved svdidac017.techlabs.accenture.com to /default-rack 2016-03-22 15:03:58,575 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is maprfs:///var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job.jar 2016-03-22 15:03:58,580 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is maprfs:/var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job.xml 2016-03-22 15:03:58,582 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #0 tokens and #1 secret keys for NM use for launching container 2016-03-22 15:03:58,582 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertokens_dob is 1 2016-03-22 15:03:58,584 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle token in serviceData 2016-03-22 15:03:58,596 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding ShuffleProvider Service: mapr_direct_shuffle to serviceData 2016-03-22 15:03:58,646 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED 2016-03-22 15:03:58,651 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.Task: mapOutputFile class: org.apache.hadoop.mapred.MapRFsOutputFile 2016-03-22 15:03:58,651 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.yarn.util.RackResolver: Resolved svdidac023.techlabs.accenture.com to /default-rack 2016-03-22 15:03:58,652 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED 2016-03-22 15:03:58,654 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1458608803637_0005_01_000002 taskAttempt attempt_1458608803637_0005_m_000000_0 2016-03-22 15:03:58,655 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1458608803637_0005_01_000003 taskAttempt attempt_1458608803637_0005_m_000001_0 2016-03-22 15:03:58,658 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1458608803637_0005_m_000000_0 2016-03-22 15:03:58,658 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1458608803637_0005_m_000001_0 2016-03-22 15:03:58,659 INFO [ContainerLauncher #0] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : svdidac017.techlabs.accenture.com:26802 2016-03-22 15:03:58,685 INFO [ContainerLauncher #1] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : svdidac023.techlabs.accenture.com:42464 2016-03-22 15:03:59,016 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: Shuffle port returned by ContainerManager for attempt_1458608803637_0005_m_000001_0 : 10394 2016-03-22 15:03:59,016 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: Shuffle port returned by ContainerManager for 
attempt_1458608803637_0005_m_000000_0 : 60124 2016-03-22 15:03:59,018 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1458608803637_0005_m_000001_0] using containerId: [container_1458608803637_0005_01_000003 on NM: [svdidac023.techlabs.accenture.com:42464] 2016-03-22 15:03:59,022 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from ASSIGNED to RUNNING 2016-03-22 15:03:59,023 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1458608803637_0005_m_000000_0] using containerId: [container_1458608803637_0005_01_000002 on NM: [svdidac017.techlabs.accenture.com:26802] 2016-03-22 15:03:59,023 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1458608803637_0005_m_000001 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1458608803637_0005_m_000001 Task Transitioned from SCHEDULED to RUNNING 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1458608803637_0005_m_000000 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1458608803637_0005_m_000000 Task Transitioned from SCHEDULED to RUNNING 2016-03-22 15:03:59,484 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1458608803637_0005: ask=1 release= 0 newContainers=0 finishedContainers=0 
resourcelimit=<memory:634880, vCores:517, disks:207.0> knownNMs=52 2016-03-22 15:04:02,438 INFO [Socket Reader #1 for port 57491] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1458608803637_0005 (auth:SIMPLE) 2016-03-22 15:04:02,458 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1458608803637_0005_m_000003 asked for a task 2016-03-22 15:04:02,458 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1458608803637_0005_m_000003 given task: attempt_1458608803637_0005_m_000001_0 2016-03-22 15:04:04,287 INFO [Socket Reader #1 for port 57491] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1458608803637_0005 (auth:SIMPLE) 2016-03-22 15:04:04,301 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1458608803637_0005_m_000002 asked for a task 2016-03-22 15:04:04,301 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1458608803637_0005_m_000002 given task: attempt_1458608803637_0005_m_000000_0 2016-03-22 15:04:08,858 INFO [IPC Server handler 3 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.007079788 2016-03-22 15:04:10,683 INFO [IPC Server handler 3 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000000_0 is : 2.622E-6 2016-03-22 15:04:11,887 INFO [IPC Server handler 5 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.010750526 2016-03-22 15:04:14,910 INFO [IPC Server handler 6 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.014476808 2016-03-22 15:04:17,932 INFO [IPC Server handler 1 on 57491] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.018156094 2016-03-22 15:04:20,956 INFO [IPC Server handler 4 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.021778924 2016-03-22 15:04:23,978 INFO [IPC Server handler 8 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.025545845 2016-03-22 15:04:27,003 INFO [IPC Server handler 7 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.029212331 2016-03-22 15:04:30,024 INFO [IPC Server handler 10 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.032808386 2016-03-22 15:04:33,044 INFO [IPC Server handler 18 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.03653345 2016-03-22 15:04:36,066 INFO [IPC Server handler 11 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.040224314 2016-03-22 15:04:39,089 INFO [IPC Server handler 19 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.043970145 2016-03-22 15:04:42,112 INFO [IPC Server handler 23 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.047755376 2016-03-22 15:04:45,132 INFO [IPC Server handler 9 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.051441472 2016-03-22 15:04:48,151 INFO [IPC Server handler 13 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1458608803637_0005_m_000001_0 is : 0.05514416 2016-03-22 15:04:51,175 INFO [IPC Server handler 12 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.058915816 2016-03-22 15:04:54,193 INFO [IPC Server handler 20 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.06264193 2016-03-22 15:04:57,213 INFO [IPC Server handler 15 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.06635544 2016-03-22 15:05:00,236 INFO [IPC Server handler 0 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.07005012 2016-03-22 15:05:03,256 INFO [IPC Server handler 2 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.073812515 2016-03-22 15:05:06,277 INFO [IPC Server handler 3 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.07749763 2016-03-22 15:05:09,295 INFO [IPC Server handler 5 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.08121116 2016-03-22 15:05:12,313 INFO [IPC Server handler 6 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.08495909 2016-03-22 15:05:15,333 INFO [IPC Server handler 1 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.08868403 2016-03-22 15:05:18,354 INFO [IPC Server handler 8 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.09239528 2016-03-22 15:05:21,372 INFO [IPC Server handler 4 
on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.09613607 2016-03-22 15:05:24,391 INFO [IPC Server handler 7 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.09985288 2016-03-22 15:05:27,406 INFO [IPC Server handler 10 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.10355474 2016-03-22 15:05:30,422 INFO [IPC Server handler 18 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.107317574 2016-03-22 15:05:33,438 INFO [IPC Server handler 11 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.11103281 2016-03-22 15:05:36,453 INFO [IPC Server handler 19 on 57491] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1458608803637_0005_m_000001_0 is : 0.11472398 Hadoop process logs: ----------------------------------------------------- mapr@foo002:/root$ time hadoop jar /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0-mapr-1506.jar teragen 1000000000 /tmp/myriad-test 16/03/22 15:03:47 INFO client.RMProxy: Connecting to ResourceManager at rm.marathon.mesos/10.1.194.73:8032 16/03/22 15:03:48 INFO terasort.TeraSort: Generating 1000000000 using 2 16/03/22 15:03:48 INFO mapreduce.JobSubmitter: number of splits:2 16/03/22 15:03:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458608803637_0005 16/03/22 15:03:48 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager 16/03/22 15:03:48 INFO impl.YarnClientImpl: Submitted application application_1458608803637_0005 16/03/22 15:03:48 INFO mapreduce.Job: The url to track the job: 
http://rm.marathon.mesos:8088/proxy/application_1458608803637_0005/ 16/03/22 15:03:48 INFO mapreduce.Job: Running job: job_1458608803637_0005 16/03/22 15:03:57 INFO mapreduce.Job: Job job_1458608803637_0005 running in uber mode : false 16/03/22 15:03:57 INFO mapreduce.Job: map 0% reduce 0% 16/03/22 15:04:13 INFO mapreduce.Job: map 1% reduce 0% 16/03/22 15:04:31 INFO mapreduce.Job: map 2% reduce 0% 16/03/22 15:04:46 INFO mapreduce.Job: map 3% reduce 0% 16/03/22 15:05:02 INFO mapreduce.Job: map 4% reduce 0% 16/03/22 15:05:20 INFO mapreduce.Job: map 5% reduce 0% 16/03/22 15:05:35 INFO mapreduce.Job: map 6% reduce 0% 16/03/22 15:07:02 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) 16/03/22 15:07:03 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) 16/03/22 15:07:04 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) 16/03/22 15:07:05 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) 16/03/22 15:07:06 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) 1 > Hadoop Teragen Benchmark Fails for Myriad > ----------------------------------------- > > Key: MYRIAD-191 > URL: https://issues.apache.org/jira/browse/MYRIAD-191 > Project: Myriad > Issue Type: Bug > Components: Executor > Affects Versions: Myriad 0.1.0 > Environment: MapR 5.0. 
Kernel: 3.13.0-66-generic OS: Ubuntu 14.04
> Mesos: 0.26.0 Marathon: 0.14.0
> Reporter: Miguel Bernadin
>
> When running teragen for 1 trillion rows, it fails running on Myriad but
> works fine on MapR 5.0. The application master dies 3 to 8 minutes into
> the job. This is a reproducible issue. We're running 28 large NMs with
> specs: 10 vCPU and 12288 MB of RAM. The default is two mappers, which means
> there are only three tasks in total running, including the application master,
> which dies every time.
> I've also noticed that the NM does fail more often on Myriad than MapR.
> As you can see in the images below:
> Myriad: https://goo.gl/photos/oGuzcYCNkNSkx9N26
> MapR: https://goo.gl/photos/KnYXaSJWMrxF684C7
> Here are the logs below:
> I scrubbed the hostnames for easier readability.
> Application Master logs:
> ----------------------------------------------------------------
> root@foo:/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1458608803637_0005/container_1458608803637_0005_01_000001#
> cat syslog
> 2016-03-22 15:03:53,985 INFO [main]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for
> application appattempt_1458608803637_0005_000001
> 2016-03-22 15:03:54,362 INFO [main]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2016-03-22 15:03:54,362 INFO [main]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN,
> Service: , Ident: (appAttemptId { application_id { id: 5 cluster_timestamp:
> 1458608803637 } attemptId: 1 } keyId: 1800690504)
> 2016-03-22 15:03:54,501 INFO [main]
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
> 2016-03-22 15:03:54,615 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config > null > 2016-03-22 15:03:54,673 INFO [main] > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output > Committer Algorithm version is 1 > 2016-03-22 15:03:54,675 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter > 2016-03-22 15:03:54,710 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.jobhistory.EventType for class > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler > 2016-03-22 15:03:54,711 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher > 2016-03-22 15:03:54,712 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher > 2016-03-22 15:03:54,713 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher > 2016-03-22 15:03:54,714 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler > 2016-03-22 15:03:54,721 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher > 2016-03-22 15:03:54,721 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: 
Registering class > org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter > 2016-03-22 15:03:54,731 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for > class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter > 2016-03-22 15:03:54,765 INFO [main] > org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file > system is set solely by core-default.xml therefore - ignoring > 2016-03-22 15:03:54,790 INFO [main] > org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file > system is set solely by core-default.xml therefore - ignoring > 2016-03-22 15:03:54,815 INFO [main] > org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file > system is set solely by core-default.xml therefore - ignoring > 2016-03-22 15:03:54,831 INFO [main] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job > history data to the timeline server is not enabled > 2016-03-22 15:03:54,883 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler > 2016-03-22 15:03:55,146 INFO [main] > org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from > hadoop-metrics2.properties > 2016-03-22 15:03:55,208 INFO [main] > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period > at 10 second(s). 
> 2016-03-22 15:03:55,208 INFO [main] > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system > started > 2016-03-22 15:03:55,216 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for > job_1458608803637_0005 to jobTokenSecretManager > 2016-03-22 15:03:55,237 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing > job_1458608803637_0005 because: not enabled; > 2016-03-22 15:03:55,256 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job > job_1458608803637_0005 = 0. Number of splits = 2 > 2016-03-22 15:03:55,256 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for > job job_1458608803637_0005 = 0 > 2016-03-22 15:03:55,256 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1458608803637_0005Job Transitioned from NEW to INITED > 2016-03-22 15:03:55,258 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, > non-uberized, multi-container job job_1458608803637_0005. 
> 2016-03-22 15:03:55,311 INFO [main] org.apache.hadoop.ipc.CallQueueManager: > Using callQueue class java.util.concurrent.LinkedBlockingQueue > 2016-03-22 15:03:55,329 INFO [Socket Reader #1 for port 38420] > org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 38420 > 2016-03-22 15:03:55,347 INFO [main] > org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding > protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server > 2016-03-22 15:03:55,348 INFO [IPC Server Responder] > org.apache.hadoop.ipc.Server: IPC Server Responder: starting > 2016-03-22 15:03:55,349 INFO [IPC Server listener on 38420] > org.apache.hadoop.ipc.Server: IPC Server listener on 38420: starting > 2016-03-22 15:03:55,352 INFO [main] > org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated > MRClientService at foo.local/10.1.194.76:38420 > 2016-03-22 15:03:55,497 INFO [main] org.mortbay.log: Logging to > org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via > org.mortbay.log.Slf4jLog > 2016-03-22 15:03:55,511 INFO [main] org.apache.hadoop.http.HttpRequestLog: > Http request log for http.requests.mapreduce is not defined > 2016-03-22 15:03:55,517 INFO [main] org.apache.hadoop.http.HttpServer2: Added > global filter 'safety' > (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) > 2016-03-22 15:03:55,543 INFO [main] org.apache.hadoop.http.HttpServer2: Added > filter AM_PROXY_FILTER > (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context > mapreduce > 2016-03-22 15:03:55,543 INFO [main] org.apache.hadoop.http.HttpServer2: Added > filter AM_PROXY_FILTER > (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context > static > 2016-03-22 15:03:55,547 INFO [main] org.apache.hadoop.http.HttpServer2: > adding path spec: /mapreduce/* > 2016-03-22 15:03:55,547 INFO [main] org.apache.hadoop.http.HttpServer2: > adding path spec: /ws/* > 2016-03-22 15:03:55,557 INFO [main] 
org.apache.hadoop.http.HttpServer2: Jetty > bound to port 35270 > 2016-03-22 15:03:55,557 INFO [main] org.mortbay.log: jetty-6.1.26 > 2016-03-22 15:03:55,587 INFO [main] org.mortbay.log: Extract > jar:file:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/hadoop-yarn-common-2.7.0-mapr-1506.jar!/webapps/mapreduce > to /tmp/Jetty_0_0_0_0_35270_mapreduce____.lnydg0/webapp > 2016-03-22 15:03:55,849 INFO [main] org.mortbay.log: Started > HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:35270 > 2016-03-22 15:03:55,849 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: > Web app /mapreduce started at 35270 > 2016-03-22 15:03:56,189 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: > Registered webapp guice modules > 2016-03-22 15:03:56,195 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: JOB_CREATE > job_1458608803637_0005 > 2016-03-22 15:03:56,195 INFO [main] org.apache.hadoop.ipc.CallQueueManager: > Using callQueue class java.util.concurrent.LinkedBlockingQueue > 2016-03-22 15:03:56,196 INFO [Socket Reader #1 for port 57491] > org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 57491 > 2016-03-22 15:03:56,202 INFO [IPC Server Responder] > org.apache.hadoop.ipc.Server: IPC Server Responder: starting > 2016-03-22 15:03:56,202 INFO [IPC Server listener on 57491] > org.apache.hadoop.ipc.Server: IPC Server listener on 57491: starting > 2016-03-22 15:03:56,229 INFO [main] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: > nodeBlacklistingEnabled:true > 2016-03-22 15:03:56,229 INFO [main] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: > maxTaskFailuresPerNode is 3 > 2016-03-22 15:03:56,229 INFO [main] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: > blacklistDisablePercent is 33 > 2016-03-22 15:03:56,281 INFO [main] org.apache.hadoop.yarn.client.RMProxy: > Connecting to ResourceManager at rm.marathon.mesos/10.1.194.73:8030 > 2016-03-22 15:03:56,418 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: maxContainerCapability: > <memory:8192, vCores:10, disks:4.0> > 2016-03-22 15:03:56,419 INFO [main] > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: queue: root.mapr > 2016-03-22 15:03:56,423 INFO [main] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper > limit on the thread pool size is 500 > 2016-03-22 15:03:56,423 INFO [main] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: The thread > pool initial size is 10 > 2016-03-22 15:03:56,427 INFO [main] > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: > yarn.client.max-cached-nodemanagers-proxies : 0 > 2016-03-22 15:03:56,434 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1458608803637_0005Job Transitioned from INITED to SETUP > 2016-03-22 15:03:56,449 INFO [CommitterEvent Processor #0] > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing > the event EventType: JOB_SETUP > 2016-03-22 15:03:56,461 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1458608803637_0005Job Transitioned from SETUP to RUNNING > 2016-03-22 15:03:56,482 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1458608803637_0005_m_000000 Task Transitioned from NEW to SCHEDULED > 2016-03-22 15:03:56,482 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1458608803637_0005_m_000001 Task Transitioned from NEW to SCHEDULED > 2016-03-22 15:03:56,484 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2016-03-22 15:03:56,484 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000001_0 
TaskAttempt Transitioned from NEW to > UNASSIGNED > 2016-03-22 15:03:56,485 INFO [Thread-48] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: > mapResourceRequest:<memory:1024, vCores:1, disks:0.5> > 2016-03-22 15:03:56,526 INFO [eventHandlingThread] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer > setup for JobId: job_1458608803637_0005, File: > maprfs:/var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job_1458608803637_0005_1.jhist > 2016-03-22 15:03:57,422 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before > Scheduling: PendingReds:0 ScheduledMaps:2 ScheduledReds:0 AssignedMaps:0 > AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 > HostLocal:0 RackLocal:0 > 2016-03-22 15:03:57,461 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() > for application_1458608803637_0005: ask=1 release= 0 newContainers=0 > finishedContainers=0 resourcelimit=<memory:636928, vCores:519, disks:208.0> > knownNMs=52 > 2016-03-22 15:03:58,475 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated > containers 2 > 2016-03-22 15:03:58,478 INFO [RMCommunicator Allocator] > org.apache.hadoop.yarn.util.RackResolver: Resolved > svdidac017.techlabs.accenture.com to /default-rack > 2016-03-22 15:03:58,479 INFO [RMCommunicator Allocator] > org.apache.hadoop.yarn.util.RackResolver: Resolved > svdidac023.techlabs.accenture.com to /default-rack > 2016-03-22 15:03:58,480 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned > container container_1458608803637_0005_01_000002 to > attempt_1458608803637_0005_m_000000_0 > 2016-03-22 15:03:58,482 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned > container container_1458608803637_0005_01_000003 to > 
attempt_1458608803637_0005_m_000001_0 > 2016-03-22 15:03:58,482 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: > PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 > CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0 > 2016-03-22 15:03:58,549 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapred.Task: mapOutputFile class: > org.apache.hadoop.mapred.MapRFsOutputFile > 2016-03-22 15:03:58,552 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved > svdidac017.techlabs.accenture.com to /default-rack > 2016-03-22 15:03:58,575 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file > on the remote FS is > maprfs:///var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job.jar > 2016-03-22 15:03:58,580 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf > file on the remote FS is > maprfs:/var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job.xml > 2016-03-22 15:03:58,582 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #0 tokens > and #1 secret keys for NM use for launching container > 2016-03-22 15:03:58,582 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of > containertokens_dob is 1 > 2016-03-22 15:03:58,584 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle > token in serviceData > 2016-03-22 15:03:58,596 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding > ShuffleProvider Service: mapr_direct_shuffle to serviceData > 2016-03-22 15:03:58,646 INFO [AsyncDispatcher event handler] > 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from > UNASSIGNED to ASSIGNED > 2016-03-22 15:03:58,651 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapred.Task: mapOutputFile class: > org.apache.hadoop.mapred.MapRFsOutputFile > 2016-03-22 15:03:58,651 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved > svdidac023.techlabs.accenture.com to /default-rack > 2016-03-22 15:03:58,652 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from > UNASSIGNED to ASSIGNED > 2016-03-22 15:03:58,654 INFO [ContainerLauncher #0] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing > the event EventType: CONTAINER_REMOTE_LAUNCH for container > container_1458608803637_0005_01_000002 taskAttempt > attempt_1458608803637_0005_m_000000_0 > 2016-03-22 15:03:58,655 INFO [ContainerLauncher #1] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing > the event EventType: CONTAINER_REMOTE_LAUNCH for container > container_1458608803637_0005_01_000003 taskAttempt > attempt_1458608803637_0005_m_000001_0 > 2016-03-22 15:03:58,658 INFO [ContainerLauncher #0] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching > attempt_1458608803637_0005_m_000000_0 > 2016-03-22 15:03:58,658 INFO [ContainerLauncher #1] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching > attempt_1458608803637_0005_m_000001_0 > 2016-03-22 15:03:58,659 INFO [ContainerLauncher #0] > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: > Opening proxy : svdidac017.techlabs.accenture.com:26802 > 2016-03-22 15:03:58,685 INFO [ContainerLauncher #1] > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: > Opening proxy : 
svdidac023.techlabs.accenture.com:42464 > 2016-03-22 15:03:59,016 INFO [ContainerLauncher #1] > org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: > Shuffle port returned by ContainerManager for > attempt_1458608803637_0005_m_000001_0 : 10394 > 2016-03-22 15:03:59,016 INFO [ContainerLauncher #0] > org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: > Shuffle port returned by ContainerManager for > attempt_1458608803637_0005_m_000000_0 : 60124 > 2016-03-22 15:03:59,018 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: > [attempt_1458608803637_0005_m_000001_0] using containerId: > [container_1458608803637_0005_01_000003 on NM: > [svdidac023.techlabs.accenture.com:42464] > 2016-03-22 15:03:59,022 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from ASSIGNED > to RUNNING > 2016-03-22 15:03:59,023 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: > [attempt_1458608803637_0005_m_000000_0] using containerId: > [container_1458608803637_0005_01_000002 on NM: > [svdidac017.techlabs.accenture.com:26802] > 2016-03-22 15:03:59,023 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from ASSIGNED > to RUNNING > 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START > task_1458608803637_0005_m_000001 > 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1458608803637_0005_m_000001 Task Transitioned from SCHEDULED to RUNNING > 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] > 
org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START > task_1458608803637_0005_m_000000 > 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1458608803637_0005_m_000000 Task Transitioned from SCHEDULED to RUNNING > 2016-03-22 15:03:59,484 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() > for application_1458608803637_0005: ask=1 release= 0 newContainers=0 > finishedContainers=0 resourcelimit=<memory:634880, vCores:517, disks:207.0> > knownNMs=52 > 2016-03-22 15:04:02,438 INFO [Socket Reader #1 for port 57491] > SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for > job_1458608803637_0005 (auth:SIMPLE) > 2016-03-22 15:04:02,458 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : > jvm_1458608803637_0005_m_000003 asked for a task > 2016-03-22 15:04:02,458 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: > jvm_1458608803637_0005_m_000003 given task: > attempt_1458608803637_0005_m_000001_0 > 2016-03-22 15:04:04,287 INFO [Socket Reader #1 for port 57491] > SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for > job_1458608803637_0005 (auth:SIMPLE) > 2016-03-22 15:04:04,301 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : > jvm_1458608803637_0005_m_000002 asked for a task > 2016-03-22 15:04:04,301 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: > jvm_1458608803637_0005_m_000002 given task: > attempt_1458608803637_0005_m_000000_0 > 2016-03-22 15:04:08,858 INFO [IPC Server handler 3 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.007079788 > 2016-03-22 15:04:10,683 INFO [IPC Server handler 3 on 57491] > 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000000_0 is : 2.622E-6 > 2016-03-22 15:04:11,887 INFO [IPC Server handler 5 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.010750526 > 2016-03-22 15:04:14,910 INFO [IPC Server handler 6 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.014476808 > 2016-03-22 15:04:17,932 INFO [IPC Server handler 1 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.018156094 > 2016-03-22 15:04:20,956 INFO [IPC Server handler 4 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.021778924 > 2016-03-22 15:04:23,978 INFO [IPC Server handler 8 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.025545845 > 2016-03-22 15:04:27,003 INFO [IPC Server handler 7 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.029212331 > 2016-03-22 15:04:30,024 INFO [IPC Server handler 10 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.032808386 > 2016-03-22 15:04:33,044 INFO [IPC Server handler 18 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.03653345 > 2016-03-22 15:04:36,066 INFO [IPC Server handler 11 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.040224314 > 2016-03-22 15:04:39,089 INFO [IPC Server handler 19 on 57491] > 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.043970145 > 2016-03-22 15:04:42,112 INFO [IPC Server handler 23 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.047755376 > 2016-03-22 15:04:45,132 INFO [IPC Server handler 9 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.051441472 > 2016-03-22 15:04:48,151 INFO [IPC Server handler 13 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.05514416 > 2016-03-22 15:04:51,175 INFO [IPC Server handler 12 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.058915816 > 2016-03-22 15:04:54,193 INFO [IPC Server handler 20 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.06264193 > 2016-03-22 15:04:57,213 INFO [IPC Server handler 15 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.06635544 > 2016-03-22 15:05:00,236 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.07005012 > 2016-03-22 15:05:03,256 INFO [IPC Server handler 2 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.073812515 > 2016-03-22 15:05:06,277 INFO [IPC Server handler 3 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.07749763 > 2016-03-22 15:05:09,295 INFO [IPC Server handler 5 on 57491] > 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.08121116 > 2016-03-22 15:05:12,313 INFO [IPC Server handler 6 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.08495909 > 2016-03-22 15:05:15,333 INFO [IPC Server handler 1 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.08868403 > 2016-03-22 15:05:18,354 INFO [IPC Server handler 8 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.09239528 > 2016-03-22 15:05:21,372 INFO [IPC Server handler 4 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.09613607 > 2016-03-22 15:05:24,391 INFO [IPC Server handler 7 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.09985288 > 2016-03-22 15:05:27,406 INFO [IPC Server handler 10 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.10355474 > 2016-03-22 15:05:30,422 INFO [IPC Server handler 18 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.107317574 > 2016-03-22 15:05:33,438 INFO [IPC Server handler 11 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.11103281 > 2016-03-22 15:05:36,453 INFO [IPC Server handler 19 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.11472398 > Hadoop process logs: > ----------------------------------------------------- > mapr@foo002:/root$ time 
hadoop jar /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0-mapr-1506.jar teragen 1000000000 /tmp/myriad-test
16/03/22 15:03:47 INFO client.RMProxy: Connecting to ResourceManager at rm.marathon.mesos/10.1.194.73:8032
16/03/22 15:03:48 INFO terasort.TeraSort: Generating 1000000000 using 2
16/03/22 15:03:48 INFO mapreduce.JobSubmitter: number of splits:2
16/03/22 15:03:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458608803637_0005
16/03/22 15:03:48 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager
16/03/22 15:03:48 INFO impl.YarnClientImpl: Submitted application application_1458608803637_0005
16/03/22 15:03:48 INFO mapreduce.Job: The url to track the job: http://rm.marathon.mesos:8088/proxy/application_1458608803637_0005/
16/03/22 15:03:48 INFO mapreduce.Job: Running job: job_1458608803637_0005
16/03/22 15:03:57 INFO mapreduce.Job: Job job_1458608803637_0005 running in uber mode : false
16/03/22 15:03:57 INFO mapreduce.Job: map 0% reduce 0%
16/03/22 15:04:13 INFO mapreduce.Job: map 1% reduce 0%
16/03/22 15:04:31 INFO mapreduce.Job: map 2% reduce 0%
16/03/22 15:04:46 INFO mapreduce.Job: map 3% reduce 0%
16/03/22 15:05:02 INFO mapreduce.Job: map 4% reduce 0%
16/03/22 15:05:20 INFO mapreduce.Job: map 5% reduce 0%
16/03/22 15:05:35 INFO mapreduce.Job: map 6% reduce 0%
16/03/22 15:07:02 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/03/22 15:07:03 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/03/22 15:07:04 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/03/22 15:07:05 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/03/22 15:07:06 INFO ipc.Client: Retrying connect to server: appmaster.local/10.1.194.76:38420. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
1
--
This message was sent by Atlassian JIRA (v6.3.4#6332)