You can certainly reduce the 1-minute delay in mesos-dns, although you might
not want to go much lower than 5-10s, especially on a large cluster, since
it can actually take a while for the master to generate the full
state.json. We're working on the performance of state.json so that more
frequent polling has less of an impact.
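
For reference, the polling interval lives in mesos-dns's config.json. A
minimal sketch, assuming the field names from the mesos-dns docs (check
them against your installed version; the zk and resolver values below are
placeholders): refreshSeconds controls how often mesos-dns re-polls the
master's state.json, and ttl controls how long resolvers may cache the
records.

{
  "zk": "zk://<zookeeper-host>:2181/mesos",
  "domain": "mesos",
  "port": 53,
  "resolvers": ["8.8.8.8"],
  "refreshSeconds": 10,
  "ttl": 10
}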

On Wed, Sep 9, 2015 at 8:56 AM, Santosh Marella <smare...@maprtech.com>
wrote:

> > ready right away (1 minute delay after kicking off Myriad)
> That's true. The reason a medium NM instance is kicked off at the RM's
> startup is to give the cluster non-zero capacity, without which YARN seems
> to reject app submissions. I'll look at YARN's code base more carefully
> and see if this behavior can be disabled via configuration
> (I didn't notice such an option last time I looked).
>
> The other workaround might be to reduce the 1-minute delay mesos-dns takes
> to create DNS entries for Mesos tasks. Not sure if that's recommended
> in production, but sometimes I felt 1 minute is too long to create a DNS
> entry. If the RM fails over, the new RM instance can't be discovered for
> 1 minute, and jobs that were previously running have to wait more than
> 1 minute to resume.
>
> Santosh
>
> On Tue, Sep 8, 2015 at 1:23 PM, John Omernik <j...@omernik.com> wrote:
>
> > Also a side note: flexing up, combined with now having to have at least
> > one node manager specified at startup:
> >
> > nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero
> > profile.
> >
> >   medium: 1 # <profile_name : instances>
> >
> > is going to lead to task failures with mesos-dns, because the name won't
> > be ready right away (1-minute delay after kicking off Myriad). Do we NEED
> > to have a non-zero-profile NodeManager start up with the ResourceManager?
> >
> > On Tue, Sep 8, 2015 at 3:16 PM, John Omernik <j...@omernik.com> wrote:
> >
> > > Cool.  Question about the yarn-site.xml in general.
> > >
> > > I was struggling with some things in the wiki on this page:
> > >
> > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Administrators
> > >
> > > Basically in step 5:
> > > Step 5: Configure YARN to use Myriad
> > >
> > > Modify the */opt/hadoop-2.7.0/etc/hadoop/yarn-site.xml* file as
> > > instructed in Sample: myriad-config-default.yml
> > > <https://cwiki.apache.org/confluence/display/MYRIAD/Sample%3A+myriad-config-default.yml>.
> > >
> > >
> > > (Side issue: it should link to the yarn-site page, not the yml.) It has
> > > us put that information in the yarn-site.xml, which makes sense: the
> > > ResourceManager needs to be aware of the Myriad settings.
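> > >
> > > For anyone following along, the yarn-site.xml additions on that wiki
> > > page basically wire YARN's NodeManager resources and ports to properties
> > > that Myriad injects at launch (you can see them as -D overrides in the
> > > NM command further down this thread). This is a rough sketch from
> > > memory, so treat the exact property list as an approximation and follow
> > > the wiki page for the authoritative set:
> > >
> > > <property>
> > >   <name>yarn.nodemanager.resource.cpu-vcores</name>
> > >   <value>${nodemanager.resource.cpu-vcores}</value>
> > > </property>
> > > <property>
> > >   <name>yarn.nodemanager.resource.memory-mb</name>
> > >   <value>${nodemanager.resource.memory-mb}</value>
> > > </property>
> > > <property>
> > >   <name>yarn.nodemanager.address</name>
> > >   <value>${myriad.yarn.nodemanager.address}</value>
> > > </property>
> > > <property>
> > >   <name>yarn.nodemanager.webapp.address</name>
> > >   <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > </property>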
> > >
> > > Then I go to create a tarball (which I SHOULD be able to use for both
> > > the resource manager and the nodemanager... right?). However, the
> > > instructions state to remove the *.xml files.
> > >
> > > Step 6: Create the Tarball
> > >
> > > The tarball has all of the files needed for the Node Managers and
> > > Resource Managers. The following shows how to create the tarball and
> > place
> > > it in HDFS:
> > > cd ~
> > > sudo cp -rp /opt/hadoop-2.7.0 .
> > > sudo rm hadoop-2.7.0/etc/hadoop/*.xml
> > > sudo tar -zcpf ~/hadoop-2.7.0.tar.gz hadoop-2.7.0
> > > hadoop fs -put ~/hadoop-2.7.0.tar.gz /dist
> > >
> > >
> > > What I ended up doing, since I am running the resourcemanager (myriad)
> > > in marathon, is creating two tarballs. One is hadoop-2.7.0-RM.tar.gz,
> > > which still has all the xml files in the tarball for shipping to
> > > marathon. The other is hadoop-2.7.0-NM.tar.gz, which per the
> > > instructions has the *.xml files removed from the /etc/hadoop/
> > > directory.
> > >
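> > > Roughly, the commands I ran look like this (adapted from the wiki's
> > > step 6; the -RM / -NM tarball names are just my own convention, and
> > > only the NM tarball goes into the distributed filesystem):
> > >
> > > cd ~
> > > sudo cp -rp /opt/hadoop-2.7.0 .
> > > # RM tarball: keep the *.xml configs for the ResourceManager in Marathon
> > > sudo tar -zcpf ~/hadoop-2.7.0-RM.tar.gz hadoop-2.7.0
> > > # NM tarball: strip the *.xml configs, since Myriad ships /conf to the NMs
> > > sudo rm hadoop-2.7.0/etc/hadoop/*.xml
> > > sudo tar -zcpf ~/hadoop-2.7.0-NM.tar.gz hadoop-2.7.0
> > > hadoop fs -put ~/hadoop-2.7.0-NM.tar.gz /dist
> > >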
> > >
> > > I guess... my logic is that myriad creates the conf directory for the
> > > nodemanagers... but then I thought, am I overthinking something? Am I
> > > missing something? Could that be factoring into what I am doing here?
> > >
> > >
> > > Obviously my first step is to add the extra yarn-site.xml entries, but
> > > in this current setup they are only going into the resource manager's
> > > yarn-site, as the node-managers don't have a yarn-site in their
> > > directories.
> > >
> > > On Tue, Sep 8, 2015 at 3:09 PM, yuliya Feldman <
> > > yufeld...@yahoo.com.invalid> wrote:
> > >
> > >> Take a look at: https://github.com/mesos/myriad/pull/128
> > >> for the yarn-site.xml updates
> > >>
> > >>       From: John Omernik <j...@omernik.com>
> > >>  To: dev@myriad.incubator.apache.org
> > >>  Sent: Tuesday, September 8, 2015 12:38 PM
> > >>  Subject: Getting Nodes to be "Running" in Mesos
> > >>
> > >> So I am playing around with a recent build of Myriad, and I am using
> > >> MapR 5.0 (hadoop-2.7.0). I hate to use the dev list as a "help Myriad
> > >> won't run" forum, so please forgive me if I am using the list wrong.
> > >>
> > >> Basically, I seem to be able to get Myriad running and everything up,
> > >> and it tries to start a nodemanager.
> > >>
> > >> In Mesos, the status of the nodemanager task never gets past staging
> > >> and eventually fails. The logs for both the node manager and Myriad
> > >> seem to look healthy, and I am not sure where I should look next to
> > >> troubleshoot what is happening. Basically you can see the registration
> > >> of the nodemanager, and then it fails with no error in the logs... Any
> > >> thoughts on where I can look next for troubleshooting would be
> > >> appreciated.
> > >>
> > >>
> > >> Node Manager Logs (complete)
> > >>
> > >> STARTUP_MSG:  build = g...@github.com:mapr/private-hadoop-common.git
> > >> -r fc95119f587541fb3a9af0dbeeed23c974178115; compiled by 'root' on
> > >> 2015-08-19T20:02Z
> > >> STARTUP_MSG:  java = 1.8.0_45-internal
> > >> ************************************************************/
> > >> 15/09/08 14:35:23 INFO nodemanager.NodeManager: registered UNIX signal
> > >> handlers for [TERM, HUP, INT]
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizationEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServicesEventType
> > >> for class
> > >> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >> org.apache.hadoop.yarn.server.nodemanager.ContainerManagerEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >> org.apache.hadoop.yarn.server.nodemanager.NodeManagerEventType for
> > >> class org.apache.hadoop.yarn.server.nodemanager.NodeManager
> > >> 15/09/08 14:35:24 INFO impl.MetricsConfig: loaded properties from
> > >> hadoop-metrics2.properties
> > >> 15/09/08 14:35:24 INFO impl.MetricsSystemImpl: Scheduled snapshot
> > >> period at 10 second(s).
> > >> 15/09/08 14:35:24 INFO impl.MetricsSystemImpl: NodeManager metrics
> > >> system started
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploadEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploadService
> > >> 15/09/08 14:35:24 INFO localizer.ResourceLocalizationService: per
> > >> directory file limit = 8192
> > >> 15/09/08 14:35:24 INFO localizer.ResourceLocalizationService:
> > >> usercache path :
> > >> file:///tmp/hadoop-mapr/nm-local-dir/usercache_DEL_1441740924753
> > >> 15/09/08 14:35:24 INFO event.AsyncDispatcher: Registering class
> > >>
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizerEventType
> > >> for class
> > >>
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker
> > >> 15/09/08 14:35:24 WARN containermanager.AuxServices: The Auxilurary
> > >> Service named 'mapreduce_shuffle' in the configuration is for class
> > >> org.apache.hadoop.mapred.ShuffleHandler which has a name of
> > >> 'httpshuffle'. Because these are not the same tools trying to send
> > >> ServiceData and read Service Meta Data may have issues unless the
> > >> refer to the name in the config.
> > >> 15/09/08 14:35:24 INFO containermanager.AuxServices: Adding auxiliary
> > >> service httpshuffle, "mapreduce_shuffle"
> > >> 15/09/08 14:35:24 INFO monitor.ContainersMonitorImpl:  Using
> > >> ResourceCalculatorPlugin :
> > >> org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@1a5b6f42
> > >> 15/09/08 14:35:24 INFO monitor.ContainersMonitorImpl:  Using
> > >> ResourceCalculatorProcessTree : null
> > >> 15/09/08 14:35:24 INFO monitor.ContainersMonitorImpl: Physical memory
> > >> check enabled: true
> > >> 15/09/08 14:35:24 INFO monitor.ContainersMonitorImpl: Virtual memory
> > >> check enabled: false
> > >> 15/09/08 14:35:24 INFO nodemanager.NodeStatusUpdaterImpl: Initialized
> > >> nodemanager for null: physical-memory=16384 virtual-memory=34407
> > >> virtual-cores=4 disks=4.0
> > >> 15/09/08 14:35:24 INFO ipc.CallQueueManager: Using callQueue class
> > >> java.util.concurrent.LinkedBlockingQueue
> > >> 15/09/08 14:35:24 INFO ipc.Server: Starting Socket Reader #1 for port
> > >> 55449
> > >> 15/09/08 14:35:24 INFO pb.RpcServerFactoryPBImpl: Adding protocol
> > >> org.apache.hadoop.yarn.api.ContainerManagementProtocolPB to the server
> > >> 15/09/08 14:35:24 INFO containermanager.ContainerManagerImpl: Blocking
> > >> new container-requests as container manager rpc server is still
> > >> starting.
> > >> 15/09/08 14:35:24 INFO ipc.Server: IPC Server Responder: starting
> > >> 15/09/08 14:35:24 INFO ipc.Server: IPC Server listener on 55449:
> > starting
> > >> 15/09/08 14:35:24 INFO security.NMContainerTokenSecretManager:
> > >> Updating node address : hadoopmapr5.brewingintel.com:55449
> > >> 15/09/08 14:35:24 INFO ipc.CallQueueManager: Using callQueue class
> > >> java.util.concurrent.LinkedBlockingQueue
> > >> 15/09/08 14:35:24 INFO ipc.Server: Starting Socket Reader #1 for port
> > 8040
> > >> 15/09/08 14:35:24 INFO pb.RpcServerFactoryPBImpl: Adding protocol
> > >> org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
> > >> to the server
> > >> 15/09/08 14:35:24 INFO ipc.Server: IPC Server Responder: starting
> > >> 15/09/08 14:35:24 INFO ipc.Server: IPC Server listener on 8040:
> starting
> > >> 15/09/08 14:35:24 INFO localizer.ResourceLocalizationService:
> > >> Localizer started on port 8040
> > >> 15/09/08 14:35:24 INFO mapred.IndexCache: IndexCache created with max
> > >> memory = 10485760
> > >> 15/09/08 14:35:24 INFO mapred.ShuffleHandler: httpshuffle listening on
> > >> port 13562
> > >> 15/09/08 14:35:24 INFO containermanager.ContainerManagerImpl:
> > >> ContainerManager started at hadoopmapr5/192.168.0.96:55449
> > >> 15/09/08 14:35:24 INFO containermanager.ContainerManagerImpl:
> > >> ContainerManager bound to 0.0.0.0/0.0.0.0:0
> > >> 15/09/08 14:35:24 INFO webapp.WebServer: Instantiating NMWebApp at
> > >> 0.0.0.0:8042
> > >> 15/09/08 14:35:24 INFO mortbay.log: Logging to
> > >> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> > >> org.mortbay.log.Slf4jLog
> > >> 15/09/08 14:35:24 INFO http.HttpRequestLog: Http request log for
> > >> http.requests.nodemanager is not defined
> > >> 15/09/08 14:35:24 INFO http.HttpServer2: Added global filter 'safety'
> > >> (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
> > >> 15/09/08 14:35:24 INFO http.HttpServer2: Added filter
> > >> static_user_filter
> > >>
> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter)
> > >> to context node
> > >> 15/09/08 14:35:24 INFO http.HttpServer2: Added filter
> > >> static_user_filter
> > >>
> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter)
> > >> to context static
> > >> 15/09/08 14:35:24 INFO http.HttpServer2: Added filter
> > >> static_user_filter
> > >>
> (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter)
> > >> to context logs
> > >> 15/09/08 14:35:24 INFO http.HttpServer2: adding path spec: /node/*
> > >> 15/09/08 14:35:24 INFO http.HttpServer2: adding path spec: /ws/*
> > >> 15/09/08 14:35:24 INFO http.HttpServer2: Jetty bound to port 8042
> > >> 15/09/08 14:35:24 INFO mortbay.log: jetty-6.1.26
> > >> 15/09/08 14:35:24 INFO mortbay.log: Extract
> > >>
> > >>
> >
> jar:file:/tmp/mesos/slaves/20150907-111332-1660987584-5050-8033-S3/frameworks/20150907-111332-1660987584-5050-8033-0003/executors/myriad_executor20150907-111332-1660987584-5050-8033-000320150907-111332-1660987584-5050-8033-O11824820150907-111332-1660987584-5050-8033-S3/runs/67cc8f37-b6d4-4018-a9b4-0071d020c9a5/hadoop-2.7.0/share/hadoop/yarn/hadoop-yarn-common-2.7.0-mapr-1506.jar!/webapps/node
> > >> to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
> > >> 15/09/08 14:35:25 INFO mortbay.log: Started
> > >> HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
> > >> 15/09/08 14:35:25 INFO webapp.WebApps: Web app /node started at 8042
> > >> 15/09/08 14:35:25 INFO webapp.WebApps: Registered webapp guice modules
> > >> 15/09/08 14:35:25 INFO client.RMProxy: Connecting to ResourceManager
> > >> at myriad.marathon.mesos/192.168.0.99:8031
> > >> 15/09/08 14:35:25 INFO nodemanager.NodeStatusUpdaterImpl: Sending out
> > >> 0 NM container statuses: []
> > >> 15/09/08 14:35:25 INFO nodemanager.NodeStatusUpdaterImpl: Registering
> > >> with RM using containers :[]
> > >> 15/09/08 14:35:25 INFO security.NMContainerTokenSecretManager: Rolling
> > >> master-key for container-tokens, got key with id 338249572
> > >> 15/09/08 14:35:25 INFO security.NMTokenSecretManagerInNM: Rolling
> > >> master-key for container-tokens, got key with id -362725484
> > >> 15/09/08 14:35:25 INFO nodemanager.NodeStatusUpdaterImpl: Registered
> > >> with ResourceManager as hadoopmapr5.brewingintel.com:55449 with total
> > >> resource of <memory:16384, vCores:4, disks:4.0>
> > >> 15/09/08 14:35:25 INFO nodemanager.NodeStatusUpdaterImpl: Notifying
> > >> ContainerManager to unblock new container-requests
> > >>
> > >>
> > >> Excerpt of the Myriad logs:
> > >>
> > >> 15/09/08 14:35:12 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:13 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:15 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:35:16 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:17 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:18 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:19 INFO handlers.StatusUpdateEventHandler: Status
> > >> Update for task: value:
> > >> "nm.medium.323f6664-11ca-477b-9e6e-41fb7547eacf"
> > >>  | state: TASK_FAILED
> > >> 15/09/08 14:35:19 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:19 INFO scheduler.DownloadNMExecutorCLGenImpl: Using
> > >> remote distribution
> > >> 15/09/08 14:35:19 INFO scheduler.TaskFactory$NMTaskFactoryImpl:
> > >> Getting Hadoop distribution
> > >> from:maprfs:///mesos/myriad/hadoop-2.7.0.tar.gz
> > >> 15/09/08 14:35:19 INFO scheduler.TaskFactory$NMTaskFactoryImpl:
> > >> Getting config from:http://myriad.marathon.mesos:8088/conf
> > >> 15/09/08 14:35:19 INFO scheduler.TaskFactory$NMTaskFactoryImpl: Slave
> > >> will execute command:sudo tar -zxpf hadoop-2.7.0.tar.gz && sudo chown
> > >> mapr . && cp conf hadoop-2.7.0/etc/hadoop/yarn-site.xml; export
> > >> YARN_HOME=hadoop-2.7.0; sudo -E -u mapr -H env
> > >> YARN_HOME="hadoop-2.7.0"
> > >> YARN_NODEMANAGER_OPTS="-Dnodemanager.resource.io-spindles=4.0
> > >> -Dyarn.resourcemanager.hostname=myriad.marathon.mesos
> > >>
> > >>
> >
> -Dyarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor
> > >> -Dnodemanager.resource.cpu-vcores=4
> > >> -Dnodemanager.resource.memory-mb=16384
> > >> -Dmyriad.yarn.nodemanager.address=0.0.0.0:31000
> > >> -Dmyriad.yarn.nodemanager.localizer.address=0.0.0.0:31001
> > >> -Dmyriad.yarn.nodemanager.webapp.address=0.0.0.0:31002
> > >> -Dmyriad.mapreduce.shuffle.port=0.0.0.0:31003"  $YARN_HOME/bin/yarn
> > >> nodemanager
> > >> 15/09/08 14:35:19 INFO handlers.ResourceOffersEventHandler: Launching
> > >> task: nm.medium.323f6664-11ca-477b-9e6e-41fb7547eacf using offer:
> > >> value: "20150907-111332-1660987584-5050-8033-O118248"
> > >>
> > >> 15/09/08 14:35:20 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 2
> > >> 15/09/08 14:35:21 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:21 INFO util.AbstractLivelinessMonitor:
> > >> Expired:hadoopmapr5.brewingintel.com:52878 Timed out after 2 secs
> > >> 15/09/08 14:35:21 INFO rmnode.RMNodeImpl: Deactivating Node
> > >> hadoopmapr5.brewingintel.com:52878 as it is now LOST
> > >> 15/09/08 14:35:21 INFO rmnode.RMNodeImpl:
> > >> hadoopmapr5.brewingintel.com:52878 Node Transitioned from RUNNING to
> > >> LOST
> > >> 15/09/08 14:35:21 INFO fair.FairScheduler: Removed node
> > >> hadoopmapr5.brewingintel.com:52878 cluster capacity: <memory:0,
> > >> vCores:0, disks:0.0>
> > >> 15/09/08 14:35:22 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:23 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:25 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:35:25 INFO util.RackResolver: Resolved
> > >> hadoopmapr5.brewingintel.com to /default-rack
> > >> 15/09/08 14:35:25 INFO resourcemanager.ResourceTrackerService:
> > >> NodeManager from node hadoopmapr5.brewingintel.com(cmPort: 55449
> > >> httpPort: 8042) registered with capability: <memory:16384, vCores:4,
> > >> disks:4.0>, assigned nodeId hadoopmapr5.brewingintel.com:55449
> > >> 15/09/08 14:35:25 INFO rmnode.RMNodeImpl:
> > >> hadoopmapr5.brewingintel.com:55449 Node Transitioned from NEW to
> > >> RUNNING
> > >> 15/09/08 14:35:25 INFO fair.FairScheduler: Added node
> > >> hadoopmapr5.brewingintel.com:55449 cluster capacity: <memory:16384,
> > >> vCores:4, disks:4.0>
> > >> 15/09/08 14:35:26 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:27 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:28 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:30 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:35:31 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:32 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:33 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:35 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:35:36 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:37 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:38 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:40 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:35:41 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:42 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:43 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:45 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:35:46 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:47 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:48 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:50 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:35:51 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:52 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:53 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:55 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:35:56 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:57 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:35:58 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:00 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:36:01 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:02 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:03 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:05 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:36:06 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:07 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:08 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:10 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:36:11 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:12 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:13 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:15 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 3
> > >> 15/09/08 14:36:16 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:17 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:18 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:19 INFO handlers.StatusUpdateEventHandler: Status
> > >> Update for task: value:
> > >> "nm.medium.323f6664-11ca-477b-9e6e-41fb7547eacf"
> > >>  | state: TASK_FAILED
> > >> 15/09/08 14:36:19 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:19 INFO scheduler.DownloadNMExecutorCLGenImpl: Using
> > >> remote distribution
> > >> 15/09/08 14:36:19 INFO scheduler.TaskFactory$NMTaskFactoryImpl:
> > >> Getting Hadoop distribution
> > >> from:maprfs:///mesos/myriad/hadoop-2.7.0.tar.gz
> > >> 15/09/08 14:36:19 INFO scheduler.TaskFactory$NMTaskFactoryImpl:
> > >> Getting config from:http://myriad.marathon.mesos:8088/conf
> > >> 15/09/08 14:36:19 INFO scheduler.TaskFactory$NMTaskFactoryImpl: Slave
> > >> will execute command:sudo tar -zxpf hadoop-2.7.0.tar.gz && sudo chown
> > >> mapr . && cp conf hadoop-2.7.0/etc/hadoop/yarn-site.xml; export
> > >> YARN_HOME=hadoop-2.7.0; sudo -E -u mapr -H env
> > >> YARN_HOME="hadoop-2.7.0"
> > >> YARN_NODEMANAGER_OPTS="-Dnodemanager.resource.io-spindles=4.0
> > >> -Dyarn.resourcemanager.hostname=myriad.marathon.mesos
> > >>
> > >>
> >
> -Dyarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor
> > >> -Dnodemanager.resource.cpu-vcores=4
> > >> -Dnodemanager.resource.memory-mb=16384
> > >> -Dmyriad.yarn.nodemanager.address=0.0.0.0:31000
> > >> -Dmyriad.yarn.nodemanager.localizer.address=0.0.0.0:31001
> > >> -Dmyriad.yarn.nodemanager.webapp.address=0.0.0.0:31002
> > >> -Dmyriad.mapreduce.shuffle.port=0.0.0.0:31003"  $YARN_HOME/bin/yarn
> > >> nodemanager
> > >> 15/09/08 14:36:19 INFO handlers.ResourceOffersEventHandler: Launching
> > >> task: nm.medium.323f6664-11ca-477b-9e6e-41fb7547eacf using offer:
> > >> value: "20150907-111332-1660987584-5050-8033-O118392"
> > >>
> > >> 15/09/08 14:36:20 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 2
> > >> 15/09/08 14:36:20 INFO util.AbstractLivelinessMonitor:
> > >> Expired:hadoopmapr5.brewingintel.com:55449 Timed out after 2 secs
> > >> 15/09/08 14:36:20 INFO rmnode.RMNodeImpl: Deactivating Node
> > >> hadoopmapr5.brewingintel.com:55449 as it is now LOST
> > >> 15/09/08 14:36:20 INFO rmnode.RMNodeImpl:
> > >> hadoopmapr5.brewingintel.com:55449 Node Transitioned from RUNNING to
> > >> LOST
> > >> 15/09/08 14:36:20 INFO fair.FairScheduler: Removed node
> > >> hadoopmapr5.brewingintel.com:55449 cluster capacity: <memory:0,
> > >> vCores:0, disks:0.0>
> > >> 15/09/08 14:36:22 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 2
> > >> 15/09/08 14:36:23 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:24 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:25 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 2
> > >> 15/09/08 14:36:25 INFO util.RackResolver: Resolved
> > >> hadoopmapr5.brewingintel.com to /default-rack
> > >> 15/09/08 14:36:25 INFO resourcemanager.ResourceTrackerService:
> > >> NodeManager from node hadoopmapr5.brewingintel.com(cmPort: 40378
> > >> httpPort: 8042) registered with capability: <memory:16384, vCores:4,
> > >> disks:4.0>, assigned nodeId hadoopmapr5.brewingintel.com:40378
> > >> 15/09/08 14:36:25 INFO rmnode.RMNodeImpl:
> > >> hadoopmapr5.brewingintel.com:40378 Node Transitioned from NEW to
> > >> RUNNING
> > >> 15/09/08 14:36:25 INFO fair.FairScheduler: Added node
> > >> hadoopmapr5.brewingintel.com:40378 cluster capacity: <memory:16384,
> > >> vCores:4, disks:4.0>
> > >> 15/09/08 14:36:27 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 2
> > >> 15/09/08 14:36:28 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:29 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 1
> > >> 15/09/08 14:36:30 INFO handlers.ResourceOffersEventHandler: Received
> > >> offers 2
> > >>
> > >>
> > >>
> > >
> > >
> >
>
