Cool. I did not see these errors in the AM logs; were they in the NodeManager logs?

Regards,
Ashwin.
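For context, once an application has finished, the container logs from all NodeManagers can usually be pulled in one place with the YARN CLI, provided log aggregation is enabled (the application id below is derived from the container id quoted further down; adjust as needed):

    yarn logs -applicationId application_1479728463466_0001

Without log aggregation, the per-container logs stay on each node under the directories configured by yarn.nodemanager.log-dirs.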
On Mon, Nov 21, 2016 at 4:20 AM, Max Bridgewater <[email protected]> wrote:

> The issue turned out to be memory allocation. Here is the relevant YARN error message:
>
> 2016-11-21 11:44:30,020 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1479728463466_0001_02_000001 transitioned from LOCALIZED to RUNNING
> 2016-11-21 11:44:31,858 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1479728463466_0001_02_000001
> 2016-11-21 11:44:31,858 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1479728463466_0001_01_000001
> 2016-11-21 11:44:31,867 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 26632 for container-id container_1479728463466_0001_02_000001: 194.5 MB of 1 GB physical memory used; 2.5 GB of 2.1 GB virtual memory used
> 2016-11-21 11:44:34,875 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 26632 for container-id container_1479728463466_0001_02_000001: 532.4 MB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used
> 2016-11-21 11:44:34,876 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Process tree for container: container_1479728463466_0001_02_000001 has processes older than 1 iteration running over the configured limit. Limit=2254857728, current usage = 2822131712
> 2016-11-21 11:44:34,876 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=26632,containerID=container_1479728463466_0001_02_000001] is running beyond virtual memory limits. Current usage: 532.4 MB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
>
> I solved it by adding this to yarn-site.xml:
>
> <property>
>   <name>yarn.scheduler.minimum-allocation-mb</name>
>   <value>1000</value>
> </property>
>
> Thanks,
> Max.
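A note on the log excerpt above: the container is killed by the NodeManager's virtual-memory check, and the 2.1 GB limit is simply the default yarn.nodemanager.vmem-pmem-ratio of 2.1 applied to the 1 GB container. If raising the minimum allocation is not enough, the usual knobs in yarn-site.xml look roughly like this (the values shown are illustrative assumptions, not taken from the thread):

    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>4</value>
    </property>
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>

Raising the ratio keeps the check but gives each container more virtual-memory headroom; disabling the check removes it entirely.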
> On Sat, Nov 19, 2016 at 10:30 PM, Ashwin Chandra Putta <[email protected]> wrote:
>
>> Max,
>>
>> The app failure does not depend on the gateway. The gateway is a daemon that launches Apex apps on YARN and collects metrics for them from YARN and from each app's AM, so it does not affect app execution once YARN has accepted the application. For some reason the AM itself is failing, and I cannot figure out the cause from these logs. It is possible that the app packages for these apps have Hadoop dependencies bundled in them, which is one of the most common causes of AM failure.
>>
>> Regards,
>> Ashwin.
>>
>> On Sat, Nov 19, 2016 at 3:08 PM, Max Bridgewater <[email protected]> wrote:
>>
>>> Please find AppMaster.stderr attached, as well as dt.log. AppMaster.stdout is empty. I am still wondering whether another port is needed or whether the UI is using websockets.
>>>
>>> On Sat, Nov 19, 2016 at 5:40 PM, Ashwin Chandra Putta <[email protected]> wrote:
>>>
>>>> Max,
>>>>
>>>> Can you share the app master logs of the failed application?
>>>>
>>>> Regards,
>>>> Ashwin.
>>>>
>>>> On Sat, Nov 19, 2016 at 4:45 AM, Max Bridgewater <[email protected]> wrote:
>>>>
>>>>> Hi Ashwin,
>>>>>
>>>>> Thanks for the feedback. I created a completely new instance, trying to follow the instructions more precisely. I attached the logs again; as you can see, they are very clean. Despite this, PiDemo is still failing without any meaningful error message. The same thing happens with WordCountDemo. After launching, the app stays in ACCEPTED state for 10 to 15 seconds and then switches to FAILED.
>>>>>
>>>>> Max.
>>>>>
>>>>> On Fri, Nov 18, 2016 at 2:30 PM, Ashwin Chandra Putta <[email protected]> wrote:
>>>>>
>>>>>> Also, there are write permission errors on /user/dtadmin/datatorrent in HDFS. Please make the dtadmin user the owner of /user/dtadmin/.
>>>>>>
>>>>>> Permission denied: user=dtadmin, access=WRITE, inode="/user/dtadmin/datatorrent":hduser:supergroup:drwxr-xr-x
>>>>>>
>>>>>> Regards,
>>>>>> Ashwin.
>>>>>>
>>>>>> On Fri, Nov 18, 2016 at 11:27 AM, Ashwin Chandra Putta <[email protected]> wrote:
>>>>>>
>>>>>>> The closing </property> tag is missing between lines 30 and 31. It is for the property dt.attr.DEBUG.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ashwin.
>>>>>>>
>>>>>>> On Fri, Nov 18, 2016 at 10:16 AM, Max Bridgewater <[email protected]> wrote:
>>>>>>>
>>>>>>>> Here is the log folder. Note that it refers to a malformed properties.xml. I am attaching that properties file as well.
>>>>>>>>
>>>>>>>> On Fri, Nov 18, 2016 at 1:08 PM, Ashwin Chandra Putta <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Max,
>>>>>>>>>
>>>>>>>>> Can you share the gateway logs?
>>>>>>>>>
>>>>>>>>> You will find them under /var/log/datatorrent for a global install, or under ~/.dt/logs for a local install.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Ashwin.
>>>>>>>>>
>>>>>>>>> On Nov 18, 2016 9:41 AM, "Max Bridgewater" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Folks,
>>>>>>>>>>
>>>>>>>>>> I am playing with Apex (DataTorrent RTS Enterprise). A local deployment on an Ubuntu 16 box works fine.
>>>>>>>>>>
>>>>>>>>>> However, when I deploy on a remote host, I am not able to launch the demo applications. My suspicion is that this is due to having to open an SSH tunnel to access the gateway. All activities other than launching the apps seem to work fine.
>>>>>>>>>>
>>>>>>>>>> My question: is there another port I need to open? Is anybody aware of issues running or accessing Apex behind a proxy or firewall?
>>>>>>>>>>
>>>>>>>>>> Unfortunately the UI does not provide much information. I am attaching some screenshots.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Max.
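On the permission error quoted above (user=dtadmin denied WRITE on /user/dtadmin/datatorrent, which is owned by hduser), the usual fix is to create the user directory in HDFS and hand it over to dtadmin. A minimal sketch, assuming hduser is the HDFS superuser as the inode owner suggests:

    sudo -u hduser hdfs dfs -mkdir -p /user/dtadmin
    sudo -u hduser hdfs dfs -chown -R dtadmin /user/dtadmin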

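On the malformed properties.xml: the only fix needed is the missing closing tag for the dt.attr.DEBUG property, so the repaired block would look something like the following (the value is a placeholder, since the attachment is not part of this thread):

    <property>
      <name>dt.attr.DEBUG</name>
      <value>true</value>
    </property>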