[ https://issues.apache.org/jira/browse/MYRIAD-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208487#comment-15208487 ]
DarinJ commented on MYRIAD-191:
-------------------------------

It looks like the job is dying during teragen. I've successfully run that job several times this week, with no failures, pulling off master. In this case I think we need to know what's happening on the node manager running the application master. The first question is whether the app master ever goes into an UNHEALTHY or LOST state in the resource manager. The second question is whether it ever goes into a LOST or FAILED state in Mesos; this information will be in the mesos-slave logs.

> Hadoop Teragen Benchmark Fails for Myriad
> -----------------------------------------
>
> Key: MYRIAD-191
> URL: https://issues.apache.org/jira/browse/MYRIAD-191
> Project: Myriad
> Issue Type: Bug
> Components: Executor
> Affects Versions: Myriad 0.1.0
> Environment: MapR 5.0. Kernel: 3.13.0-66-generic OS: Ubuntu 14.04
> Mesos: 0.26.0 Marathon: 0.14.0
> Reporter: Miguel Bernadin
>
> When running teragen for 1 billion rows, the job fails on Myriad but
> works fine on MapR 5.0. The application master dies 3 to 8 minutes
> into the job. This is a reproducible issue. We're running 28 large NMs,
> each with 10 vCPUs and 12288 MB of RAM. The default is two mappers, which
> means only three tasks are running in total, including the application
> master, which dies every time.
> I've also noticed that the NM fails more often on Myriad than on MapR,
> as you can see in the images below:
> Myriad: https://goo.gl/photos/oGuzcYCNkNSkx9N26
> MapR: https://goo.gl/photos/KnYXaSJWMrxF684C7
> The logs are below.
> I scrubbed the hostnames for easier readability.
> Application Master logs: > ---------------------------------------------------------------- > root@foo:/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1458608803637_0005/container_1458608803637_0005_01_000001# > cat syslog > 2016-03-22 15:03:53,985 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for > application appattempt_1458608803637_0005_000001 > 2016-03-22 15:03:54,362 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens: > 2016-03-22 15:03:54,362 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, > Service: , Ident: (appAttemptId { application_id { id: 5 cluster_timestamp: > 1458608803637 } attemptId: 1 } keyId: 1800690504) > 2016-03-22 15:03:54,501 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter. > 2016-03-22 15:03:54,615 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config > null > 2016-03-22 15:03:54,673 INFO [main] > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output > Committer Algorithm version is 1 > 2016-03-22 15:03:54,675 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter > 2016-03-22 15:03:54,710 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.jobhistory.EventType for class > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler > 2016-03-22 15:03:54,711 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher > 2016-03-22 15:03:54,712 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class > 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher > 2016-03-22 15:03:54,713 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher > 2016-03-22 15:03:54,714 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler > 2016-03-22 15:03:54,721 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher > 2016-03-22 15:03:54,721 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter > 2016-03-22 15:03:54,731 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for > class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter > 2016-03-22 15:03:54,765 INFO [main] > org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file > system is set solely by core-default.xml therefore - ignoring > 2016-03-22 15:03:54,790 INFO [main] > org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file > system is set solely by core-default.xml therefore - ignoring > 2016-03-22 15:03:54,815 INFO [main] > org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file > system is set solely by core-default.xml therefore - ignoring > 2016-03-22 15:03:54,831 INFO [main] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job > history 
data to the timeline server is not enabled > 2016-03-22 15:03:54,883 INFO [main] > org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class > org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler > 2016-03-22 15:03:55,146 INFO [main] > org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from > hadoop-metrics2.properties > 2016-03-22 15:03:55,208 INFO [main] > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period > at 10 second(s). > 2016-03-22 15:03:55,208 INFO [main] > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system > started > 2016-03-22 15:03:55,216 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for > job_1458608803637_0005 to jobTokenSecretManager > 2016-03-22 15:03:55,237 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing > job_1458608803637_0005 because: not enabled; > 2016-03-22 15:03:55,256 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job > job_1458608803637_0005 = 0. Number of splits = 2 > 2016-03-22 15:03:55,256 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for > job job_1458608803637_0005 = 0 > 2016-03-22 15:03:55,256 INFO [main] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1458608803637_0005Job Transitioned from NEW to INITED > 2016-03-22 15:03:55,258 INFO [main] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, > non-uberized, multi-container job job_1458608803637_0005. 
> 2016-03-22 15:03:55,311 INFO [main] org.apache.hadoop.ipc.CallQueueManager: > Using callQueue class java.util.concurrent.LinkedBlockingQueue > 2016-03-22 15:03:55,329 INFO [Socket Reader #1 for port 38420] > org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 38420 > 2016-03-22 15:03:55,347 INFO [main] > org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding > protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server > 2016-03-22 15:03:55,348 INFO [IPC Server Responder] > org.apache.hadoop.ipc.Server: IPC Server Responder: starting > 2016-03-22 15:03:55,349 INFO [IPC Server listener on 38420] > org.apache.hadoop.ipc.Server: IPC Server listener on 38420: starting > 2016-03-22 15:03:55,352 INFO [main] > org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated > MRClientService at foo.local/10.1.194.76:38420 > 2016-03-22 15:03:55,497 INFO [main] org.mortbay.log: Logging to > org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via > org.mortbay.log.Slf4jLog > 2016-03-22 15:03:55,511 INFO [main] org.apache.hadoop.http.HttpRequestLog: > Http request log for http.requests.mapreduce is not defined > 2016-03-22 15:03:55,517 INFO [main] org.apache.hadoop.http.HttpServer2: Added > global filter 'safety' > (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) > 2016-03-22 15:03:55,543 INFO [main] org.apache.hadoop.http.HttpServer2: Added > filter AM_PROXY_FILTER > (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context > mapreduce > 2016-03-22 15:03:55,543 INFO [main] org.apache.hadoop.http.HttpServer2: Added > filter AM_PROXY_FILTER > (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context > static > 2016-03-22 15:03:55,547 INFO [main] org.apache.hadoop.http.HttpServer2: > adding path spec: /mapreduce/* > 2016-03-22 15:03:55,547 INFO [main] org.apache.hadoop.http.HttpServer2: > adding path spec: /ws/* > 2016-03-22 15:03:55,557 INFO [main] 
org.apache.hadoop.http.HttpServer2: Jetty > bound to port 35270 > 2016-03-22 15:03:55,557 INFO [main] org.mortbay.log: jetty-6.1.26 > 2016-03-22 15:03:55,587 INFO [main] org.mortbay.log: Extract > jar:file:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/hadoop-yarn-common-2.7.0-mapr-1506.jar!/webapps/mapreduce > to /tmp/Jetty_0_0_0_0_35270_mapreduce____.lnydg0/webapp > 2016-03-22 15:03:55,849 INFO [main] org.mortbay.log: Started > HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:35270 > 2016-03-22 15:03:55,849 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: > Web app /mapreduce started at 35270 > 2016-03-22 15:03:56,189 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: > Registered webapp guice modules > 2016-03-22 15:03:56,195 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: JOB_CREATE > job_1458608803637_0005 > 2016-03-22 15:03:56,195 INFO [main] org.apache.hadoop.ipc.CallQueueManager: > Using callQueue class java.util.concurrent.LinkedBlockingQueue > 2016-03-22 15:03:56,196 INFO [Socket Reader #1 for port 57491] > org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 57491 > 2016-03-22 15:03:56,202 INFO [IPC Server Responder] > org.apache.hadoop.ipc.Server: IPC Server Responder: starting > 2016-03-22 15:03:56,202 INFO [IPC Server listener on 57491] > org.apache.hadoop.ipc.Server: IPC Server listener on 57491: starting > 2016-03-22 15:03:56,229 INFO [main] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: > nodeBlacklistingEnabled:true > 2016-03-22 15:03:56,229 INFO [main] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: > maxTaskFailuresPerNode is 3 > 2016-03-22 15:03:56,229 INFO [main] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: > blacklistDisablePercent is 33 > 2016-03-22 15:03:56,281 INFO [main] org.apache.hadoop.yarn.client.RMProxy: > Connecting to ResourceManager at rm.marathon.mesos/10.1.194.73:8030 > 2016-03-22 15:03:56,418 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: maxContainerCapability: > <memory:8192, vCores:10, disks:4.0> > 2016-03-22 15:03:56,419 INFO [main] > org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: queue: root.mapr > 2016-03-22 15:03:56,423 INFO [main] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper > limit on the thread pool size is 500 > 2016-03-22 15:03:56,423 INFO [main] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: The thread > pool initial size is 10 > 2016-03-22 15:03:56,427 INFO [main] > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: > yarn.client.max-cached-nodemanagers-proxies : 0 > 2016-03-22 15:03:56,434 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1458608803637_0005Job Transitioned from INITED to SETUP > 2016-03-22 15:03:56,449 INFO [CommitterEvent Processor #0] > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing > the event EventType: JOB_SETUP > 2016-03-22 15:03:56,461 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1458608803637_0005Job Transitioned from SETUP to RUNNING > 2016-03-22 15:03:56,482 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1458608803637_0005_m_000000 Task Transitioned from NEW to SCHEDULED > 2016-03-22 15:03:56,482 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1458608803637_0005_m_000001 Task Transitioned from NEW to SCHEDULED > 2016-03-22 15:03:56,484 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2016-03-22 15:03:56,484 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000001_0 
TaskAttempt Transitioned from NEW to > UNASSIGNED > 2016-03-22 15:03:56,485 INFO [Thread-48] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: > mapResourceRequest:<memory:1024, vCores:1, disks:0.5> > 2016-03-22 15:03:56,526 INFO [eventHandlingThread] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer > setup for JobId: job_1458608803637_0005, File: > maprfs:/var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job_1458608803637_0005_1.jhist > 2016-03-22 15:03:57,422 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before > Scheduling: PendingReds:0 ScheduledMaps:2 ScheduledReds:0 AssignedMaps:0 > AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 > HostLocal:0 RackLocal:0 > 2016-03-22 15:03:57,461 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() > for application_1458608803637_0005: ask=1 release= 0 newContainers=0 > finishedContainers=0 resourcelimit=<memory:636928, vCores:519, disks:208.0> > knownNMs=52 > 2016-03-22 15:03:58,475 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated > containers 2 > 2016-03-22 15:03:58,478 INFO [RMCommunicator Allocator] > org.apache.hadoop.yarn.util.RackResolver: Resolved > svdidac017.techlabs.accenture.com to /default-rack > 2016-03-22 15:03:58,479 INFO [RMCommunicator Allocator] > org.apache.hadoop.yarn.util.RackResolver: Resolved > svdidac023.techlabs.accenture.com to /default-rack > 2016-03-22 15:03:58,480 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned > container container_1458608803637_0005_01_000002 to > attempt_1458608803637_0005_m_000000_0 > 2016-03-22 15:03:58,482 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned > container container_1458608803637_0005_01_000003 to > 
attempt_1458608803637_0005_m_000001_0 > 2016-03-22 15:03:58,482 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: > PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 > CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0 > 2016-03-22 15:03:58,549 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapred.Task: mapOutputFile class: > org.apache.hadoop.mapred.MapRFsOutputFile > 2016-03-22 15:03:58,552 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved > svdidac017.techlabs.accenture.com to /default-rack > 2016-03-22 15:03:58,575 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file > on the remote FS is > maprfs:///var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job.jar > 2016-03-22 15:03:58,580 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf > file on the remote FS is > maprfs:/var/mapr/cluster/yarn/rm/staging/mapr/.staging/job_1458608803637_0005/job.xml > 2016-03-22 15:03:58,582 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #0 tokens > and #1 secret keys for NM use for launching container > 2016-03-22 15:03:58,582 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of > containertokens_dob is 1 > 2016-03-22 15:03:58,584 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle > token in serviceData > 2016-03-22 15:03:58,596 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding > ShuffleProvider Service: mapr_direct_shuffle to serviceData > 2016-03-22 15:03:58,646 INFO [AsyncDispatcher event handler] > 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from > UNASSIGNED to ASSIGNED > 2016-03-22 15:03:58,651 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapred.Task: mapOutputFile class: > org.apache.hadoop.mapred.MapRFsOutputFile > 2016-03-22 15:03:58,651 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved > svdidac023.techlabs.accenture.com to /default-rack > 2016-03-22 15:03:58,652 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from > UNASSIGNED to ASSIGNED > 2016-03-22 15:03:58,654 INFO [ContainerLauncher #0] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing > the event EventType: CONTAINER_REMOTE_LAUNCH for container > container_1458608803637_0005_01_000002 taskAttempt > attempt_1458608803637_0005_m_000000_0 > 2016-03-22 15:03:58,655 INFO [ContainerLauncher #1] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing > the event EventType: CONTAINER_REMOTE_LAUNCH for container > container_1458608803637_0005_01_000003 taskAttempt > attempt_1458608803637_0005_m_000001_0 > 2016-03-22 15:03:58,658 INFO [ContainerLauncher #0] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching > attempt_1458608803637_0005_m_000000_0 > 2016-03-22 15:03:58,658 INFO [ContainerLauncher #1] > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching > attempt_1458608803637_0005_m_000001_0 > 2016-03-22 15:03:58,659 INFO [ContainerLauncher #0] > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: > Opening proxy : svdidac017.techlabs.accenture.com:26802 > 2016-03-22 15:03:58,685 INFO [ContainerLauncher #1] > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: > Opening proxy : 
svdidac023.techlabs.accenture.com:42464 > 2016-03-22 15:03:59,016 INFO [ContainerLauncher #1] > org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: > Shuffle port returned by ContainerManager for > attempt_1458608803637_0005_m_000001_0 : 10394 > 2016-03-22 15:03:59,016 INFO [ContainerLauncher #0] > org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptContainerLaunchedEvent: > Shuffle port returned by ContainerManager for > attempt_1458608803637_0005_m_000000_0 : 60124 > 2016-03-22 15:03:59,018 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: > [attempt_1458608803637_0005_m_000001_0] using containerId: > [container_1458608803637_0005_01_000003 on NM: > [svdidac023.techlabs.accenture.com:42464] > 2016-03-22 15:03:59,022 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000001_0 TaskAttempt Transitioned from ASSIGNED > to RUNNING > 2016-03-22 15:03:59,023 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: > [attempt_1458608803637_0005_m_000000_0] using containerId: > [container_1458608803637_0005_01_000002 on NM: > [svdidac017.techlabs.accenture.com:26802] > 2016-03-22 15:03:59,023 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1458608803637_0005_m_000000_0 TaskAttempt Transitioned from ASSIGNED > to RUNNING > 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START > task_1458608803637_0005_m_000001 > 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1458608803637_0005_m_000001 Task Transitioned from SCHEDULED to RUNNING > 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] > 
org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START > task_1458608803637_0005_m_000000 > 2016-03-22 15:03:59,024 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1458608803637_0005_m_000000 Task Transitioned from SCHEDULED to RUNNING > 2016-03-22 15:03:59,484 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() > for application_1458608803637_0005: ask=1 release= 0 newContainers=0 > finishedContainers=0 resourcelimit=<memory:634880, vCores:517, disks:207.0> > knownNMs=52 > 2016-03-22 15:04:02,438 INFO [Socket Reader #1 for port 57491] > SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for > job_1458608803637_0005 (auth:SIMPLE) > 2016-03-22 15:04:02,458 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : > jvm_1458608803637_0005_m_000003 asked for a task > 2016-03-22 15:04:02,458 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: > jvm_1458608803637_0005_m_000003 given task: > attempt_1458608803637_0005_m_000001_0 > 2016-03-22 15:04:04,287 INFO [Socket Reader #1 for port 57491] > SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for > job_1458608803637_0005 (auth:SIMPLE) > 2016-03-22 15:04:04,301 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : > jvm_1458608803637_0005_m_000002 asked for a task > 2016-03-22 15:04:04,301 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: > jvm_1458608803637_0005_m_000002 given task: > attempt_1458608803637_0005_m_000000_0 > 2016-03-22 15:04:08,858 INFO [IPC Server handler 3 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.007079788 > 2016-03-22 15:04:10,683 INFO [IPC Server handler 3 on 57491] > 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000000_0 is : 2.622E-6 > 2016-03-22 15:04:11,887 INFO [IPC Server handler 5 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.010750526 > 2016-03-22 15:04:14,910 INFO [IPC Server handler 6 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.014476808 > 2016-03-22 15:04:17,932 INFO [IPC Server handler 1 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.018156094 > 2016-03-22 15:04:20,956 INFO [IPC Server handler 4 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.021778924 > 2016-03-22 15:04:23,978 INFO [IPC Server handler 8 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.025545845 > 2016-03-22 15:04:27,003 INFO [IPC Server handler 7 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.029212331 > 2016-03-22 15:04:30,024 INFO [IPC Server handler 10 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.032808386 > 2016-03-22 15:04:33,044 INFO [IPC Server handler 18 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.03653345 > 2016-03-22 15:04:36,066 INFO [IPC Server handler 11 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.040224314 > 2016-03-22 15:04:39,089 INFO [IPC Server handler 19 on 57491] > 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.043970145 > 2016-03-22 15:04:42,112 INFO [IPC Server handler 23 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.047755376 > 2016-03-22 15:04:45,132 INFO [IPC Server handler 9 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.051441472 > 2016-03-22 15:04:48,151 INFO [IPC Server handler 13 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.05514416 > 2016-03-22 15:04:51,175 INFO [IPC Server handler 12 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.058915816 > 2016-03-22 15:04:54,193 INFO [IPC Server handler 20 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.06264193 > 2016-03-22 15:04:57,213 INFO [IPC Server handler 15 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.06635544 > 2016-03-22 15:05:00,236 INFO [IPC Server handler 0 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.07005012 > 2016-03-22 15:05:03,256 INFO [IPC Server handler 2 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.073812515 > 2016-03-22 15:05:06,277 INFO [IPC Server handler 3 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.07749763 > 2016-03-22 15:05:09,295 INFO [IPC Server handler 5 on 57491] > 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.08121116 > 2016-03-22 15:05:12,313 INFO [IPC Server handler 6 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.08495909 > 2016-03-22 15:05:15,333 INFO [IPC Server handler 1 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.08868403 > 2016-03-22 15:05:18,354 INFO [IPC Server handler 8 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.09239528 > 2016-03-22 15:05:21,372 INFO [IPC Server handler 4 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.09613607 > 2016-03-22 15:05:24,391 INFO [IPC Server handler 7 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.09985288 > 2016-03-22 15:05:27,406 INFO [IPC Server handler 10 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.10355474 > 2016-03-22 15:05:30,422 INFO [IPC Server handler 18 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.107317574 > 2016-03-22 15:05:33,438 INFO [IPC Server handler 11 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.11103281 > 2016-03-22 15:05:36,453 INFO [IPC Server handler 19 on 57491] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1458608803637_0005_m_000001_0 is : 0.11472398 > Hadoop process logs: > ----------------------------------------------------- > mapr@foo002:/root$ time 
hadoop jar > /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0-mapr-1506.jar > teragen 1000000000 /tmp/myriad-test > 16/03/22 15:03:47 INFO client.RMProxy: Connecting to ResourceManager at > rm.marathon.mesos/10.1.194.73:8032 > 16/03/22 15:03:48 INFO terasort.TeraSort: Generating 1000000000 using 2 > 16/03/22 15:03:48 INFO mapreduce.JobSubmitter: number of splits:2 > 16/03/22 15:03:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: > job_1458608803637_0005 > 16/03/22 15:03:48 INFO security.ExternalTokenManagerFactory: Initialized > external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager > 16/03/22 15:03:48 INFO impl.YarnClientImpl: Submitted application > application_1458608803637_0005 > 16/03/22 15:03:48 INFO mapreduce.Job: The url to track the job: > http://rm.marathon.mesos:8088/proxy/application_1458608803637_0005/ > 16/03/22 15:03:48 INFO mapreduce.Job: Running job: job_1458608803637_0005 > 16/03/22 15:03:57 INFO mapreduce.Job: Job job_1458608803637_0005 running in > uber mode : false > 16/03/22 15:03:57 INFO mapreduce.Job: map 0% reduce 0% > 16/03/22 15:04:13 INFO mapreduce.Job: map 1% reduce 0% > 16/03/22 15:04:31 INFO mapreduce.Job: map 2% reduce 0% > 16/03/22 15:04:46 INFO mapreduce.Job: map 3% reduce 0% > 16/03/22 15:05:02 INFO mapreduce.Job: map 4% reduce 0% > 16/03/22 15:05:20 INFO mapreduce.Job: map 5% reduce 0% > 16/03/22 15:05:35 INFO mapreduce.Job: map 6% reduce 0% > 16/03/22 15:07:02 INFO ipc.Client: Retrying connect to server: > appmaster.local/10.1.194.76:38420. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) > 16/03/22 15:07:03 INFO ipc.Client: Retrying connect to server: > appmaster.local/10.1.194.76:38420. 
Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) > 16/03/22 15:07:04 INFO ipc.Client: Retrying connect to server: > appmaster.local/10.1.194.76:38420. Already tried 2 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) > 16/03/22 15:07:05 INFO ipc.Client: Retrying connect to server: > appmaster.local/10.1.194.76:38420. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) > 16/03/22 15:07:06 INFO ipc.Client: Retrying connect to server: > appmaster.local/10.1.194.76:38420. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) > 1
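
For anyone triaging from the logs above, it may help to pin down when each task attempt last reported to the application master and compare that with the first client retry at 15:07:02. Below is a minimal sketch, assuming the AM syslog is read from the original file with one entry per line (not the wrapped text in this mail). The function name `last_progress` and the regular expression are illustrative only and are not part of Hadoop or Myriad; the sample lines are copied from the AM log in this issue.

```python
import re
from datetime import datetime

# Matches AM syslog entries of the form:
#   <date> <time>,<ms> INFO [...] ...TaskAttemptListenerImpl:
#   Progress of TaskAttempt <attempt id> is : <fraction>
PROGRESS_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} INFO .*?"
    r"Progress of TaskAttempt (attempt_\S+) is : (\S+)"
)


def last_progress(lines):
    """Return {attempt_id: (timestamp, progress)} for the latest report.

    Assumes the lines are in chronological order, as they are in a syslog
    file, so a later match simply overwrites an earlier one.
    """
    latest = {}
    for line in lines:
        match = PROGRESS_RE.search(line)
        if not match:
            continue  # not a progress report; ignore
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        latest[match.group(2)] = (stamp, float(match.group(3)))
    return latest


# Three entries taken from the AM log in this issue, unwrapped to one line each.
SAMPLE = [
    "2016-03-22 15:04:08,858 INFO [IPC Server handler 3 on 57491] "
    "org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt "
    "attempt_1458608803637_0005_m_000001_0 is : 0.007079788",
    "2016-03-22 15:04:10,683 INFO [IPC Server handler 3 on 57491] "
    "org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt "
    "attempt_1458608803637_0005_m_000000_0 is : 2.622E-6",
    "2016-03-22 15:05:36,453 INFO [IPC Server handler 19 on 57491] "
    "org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt "
    "attempt_1458608803637_0005_m_000001_0 is : 0.11472398",
]

if __name__ == "__main__":
    for attempt, (stamp, progress) in sorted(last_progress(SAMPLE).items()):
        print(attempt, stamp, progress)
```

In the excerpt posted here, attempt_1458608803637_0005_m_000000_0 shows only a single progress report (15:04:10), while attempt_1458608803637_0005_m_000001_0 keeps reporting until 15:05:36. Running the same scan over the full syslog would show whether that gap is real or just an artifact of the excerpt.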