OK, thanks guys. It continued to fail with a minimum of 256 and a maximum of 512 or 1024. In the end I switched off the virtual/physical memory ratio check. My config is now:

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>256</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>1024</value>
</property>

On Fri, Dec 2, 2016 at 2:06 PM, AJAY GUPTA <ajaygit...@gmail.com> wrote:

> Hi Max/Ashwin
>
> The logs have the following lines
> 2016-12-02 16:53:53,868 INFO com.datatorrent.stram.StreamingAppMasterService: Asking RM for containers: [Capability[<memory:3000, vCores:1>]Priority[2], Capability[<memory:3000, vCores:1>]Priority[1]]
> 2016-12-02 16:53:53,868 INFO com.datatorrent.stram.StreamingAppMasterService: Requested container: Capability[<memory:3000, vCores:1>]Priority[2] on host: [null]
>
> The yarn manager has been requested for 3000MB RAM which the system is
> probably not able to provide.
>
> With the above configuration, you have set a minimum allocation of 256MB
> per operator and maximum of 512MB per operator. You can change this later
> in case you expect your operator to consume more/less memory.
>
> Ashwin, kindly correct me if I am wrong here.
>
>
> Regards,
> Ajay
>
> On Sat, Dec 3, 2016 at 12:28 AM, Ashwin Chandra Putta <
> ashwinchand...@gmail.com> wrote:
>
>> Ajay,
>>
>> Can you specify the reason why it did not work before as well?
>>
>> Regards,
>> Ashwin.
>>
>> On Fri, Dec 2, 2016 at 10:55 AM, AJAY GUPTA <ajaygit...@gmail.com> wrote:
>>
>>> Hi Max,
>>>
>>> Can you try adding the following configurations to yarn-site.xml.
>>> Restart yarn and then try starting wordcount-demo.
>>>
>>>
>>> <property>
>>>   <name>yarn.scheduler.minimum-allocation-mb</name>
>>>   <value>256</value>
>>> </property>
>>> <property>
>>>   <name>yarn.scheduler.maximum-allocation-mb</name>
>>>   <value>512</value>
>>> </property>
>>>
>>>
>>> Regards,
>>> Ajay
>>>
>>>
>>> On Fri, Dec 2, 2016 at 10:26 PM, Max Bridgewater <
>>> max.bridgewa...@gmail.com> wrote:
>>>
>>>> Yeah, application has been running for 20h. But no event is flowing
>>>> through. See dt.log attached.
>>>>
>>>> On Fri, Dec 2, 2016 at 11:48 AM, Ashwin Chandra Putta <
>>>> ashwinchand...@gmail.com> wrote:
>>>>
>>>>> Max,
>>>>>
>>>>> Can you check app master logs? If application status changed to
>>>>> running, it means app master is running. You can find operator deployment
>>>>> related logs from app master dt.log.
>>>>>
>>>>> Regards,
>>>>> Ashwin.
>>>>>
>>>>> On Dec 2, 2016 5:25 AM, "Max Bridgewater" <max.bridgewa...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I deployed the WordCountDemo in DataTorrent RTS Enterprise with
>>>>>> evaluation license. The application is in RUNNING state and resulted into
>>>>>> two processes that are all in ACTIVE state.
>>>>>>
>>>>>> On the other hand, however, the operators themselves are in
>>>>>> PENDING_DEPLOY state. These are wordinput, count, and console. So,
>>>>>> nothing is really running and no words are being counted.
>>>>>>
>>>>>> There seems to be enough resources:
>>>>>>
>>>>>> 2016-12-01 20:36:28,287 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1480549373717_0021_01_000002 of capacity <memory:3000, vCores:1> on host localhost:36079, which has 2 containers, <memory:6000, vCores:2> used and <memory:2192, vCores:6> available after allocation.
>>>>>>
>>>>>>
>>>>>> In /var/log/dtgateway.log, here is what I see. Can this be the cause?
>>>>>> If so, how do I fix this?
>>>>>>
>>>>>> 2016-12-02 13:21:34,113 WARN com.datatorrent.gateway.x: Cannot update license registry for the number of nodes
>>>>>> com.datatorrent.a.E: Filesystem closed
>>>>>>     at com.datatorrent.a.M.b(w:341)
>>>>>>     at com.datatorrent.a.C.b(m:34)
>>>>>>     at com.datatorrent.gateway.x.b(jd:456)
>>>>>>     at com.datatorrent.gateway.x.b(jd:627)
>>>>>>     at com.datatorrent.gateway.x.b(jd:787)
>>>>>>     at com.datatorrent.gateway.x.b(jd:141)
>>>>>>     at com.datatorrent.gateway.U.run(jd:210)
>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>> 2016-12-02 13:21:34,113 WARN com.datatorrent.gateway.x: Detected containers not provisioned for application_1480549373717_0021: # planned containers is 3 and # allocated containers is 1
>>>>>> 2016-12-02 13:21:39,158 WARN com.datatorrent.gateway.x: Cannot update license registry for the number of nodes
>>>>>> com.datatorrent.a.E: Filesystem closed
>>>>>>     at com.datatorrent.a.M.b(w:341)
>>>>>>     at com.datatorrent.a.C.b(m:34)
>>>>>>     at com.datatorrent.gateway.x.b(jd:456)
>>>>>>     at com.datatorrent.gateway.x.b(jd:627)
>>>>>>     at com.datatorrent.gateway.x.b(jd:787)
>>>>>>     at com.datatorrent.gateway.x.b(jd:141)
>>>>>>     at com.datatorrent.gateway.U.run(jd:210)
>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>>
>> --
>>
>> Regards,
>> Ashwin.
>>
>
>
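
Note on the check that was switched off: the thread does not show a NodeManager log line confirming that containers were killed for virtual memory use, so the following is only a rough sketch of the arithmetic under that assumption. With yarn.nodemanager.vmem-check-enabled left on, the NodeManager compares a container's virtual memory against its allocated physical memory multiplied by yarn.nodemanager.vmem-pmem-ratio. The ratio value 2.1 used below is the usual Hadoop default and is an assumption here, since the thread never states it; the function name is hypothetical.

```python
# Assumption: yarn.nodemanager.vmem-pmem-ratio is at its usual default of 2.1.
# The thread does not state the value actually in use.
ASSUMED_VMEM_PMEM_RATIO = 2.1


def vmem_limit_mb(container_mb, ratio=ASSUMED_VMEM_PMEM_RATIO):
    """Return the virtual memory ceiling (in MB) for a container of the given size."""
    return container_mb * ratio


if __name__ == "__main__":
    # Container sizes taken from the allocation values tried in this thread.
    for size_mb in (256, 512, 1024):
        limit = vmem_limit_mb(size_mb)
        print(f"{size_mb} MB container -> {limit:.1f} MB virtual memory limit")
```

Under that assumption a 256 MB container would be allowed only about 538 MB of virtual memory, which a JVM can exceed easily. Raising yarn.nodemanager.vmem-pmem-ratio instead of disabling the check is another option worth trying.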