It seems unlikely to me that it would be a 2.2 issue, though not entirely
impossible.  Are you able to find any of the container logs?  Is the
NodeManager launching containers and reporting some exit code?
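
If the containers were launched at all and log aggregation is enabled on the
cluster, something like the following should pull the aggregated container
logs, exit codes included (the application id is a placeholder; grab the real
one from the RM UI or "yarn application -list"):

  yarn logs -applicationId <application id>

If aggregation is disabled, the same logs should be sitting under the
NodeManager log dirs on the individual nodes.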

-Sandy

On Thu, Feb 12, 2015 at 1:21 PM, Anders Arpteg <arp...@spotify.com> wrote:

> No, not submitting from Windows, but from a Debian distribution. Had a quick
> look at the RM logs, and it seems some containers are allocated but then
> released again for some reason. Not easy to make sense of them, but here is
> a snippet (from a test in our small test cluster) if you'd like to have a
> closer look: http://pastebin.com/8WU9ivqC
>
> Sandy, sounds like it could possibly be a 2.2 issue then, or what do you
> think?
>
> Thanks,
> Anders
>
> On Thu, Feb 12, 2015 at 3:11 PM, Aniket Bhatnagar <
> aniket.bhatna...@gmail.com> wrote:
>
>> This is tricky to debug. Check the logs of the YARN NodeManager and
>> ResourceManager to see if you can trace the error. In the past I have had to
>> look closely at the arguments being passed to the YARN container (they get
>> logged before the container launch is attempted). If that still didn't give
>> me a clue, I had to check the script YARN generates to execute the container,
>> and even run it manually to trace the line where the error occurs.
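>>
>> As a rough pointer (the exact path depends on yarn.nodemanager.local-dirs,
>> and the user/application/container ids below are placeholders), the generated
>> script usually ends up somewhere like:
>>
>>   <yarn.nodemanager.local-dirs>/usercache/<user>/appcache/<application id>/<container id>/launch_container.sh
>>
>> Since these dirs get cleaned up quickly after a container exits, setting
>> yarn.nodemanager.delete.debug-delay-sec to a few minutes can help keep them
>> around long enough to inspect.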
>>
>> BTW, are you submitting the job from Windows?
>>
>> On Thu, Feb 12, 2015, 3:34 PM Anders Arpteg <arp...@spotify.com> wrote:
>>
>>> Interesting to hear that it works for you. Are you using YARN 2.2 as
>>> well? No strange log messages during startup, and I can't see any other log
>>> messages since no executor gets launched. It does not seem to work in
>>> yarn-client mode either, failing with the exception below.
>>>
>>> Exception in thread "main" org.apache.spark.SparkException: Yarn
>>> application has already ended! It might have been killed or unable to
>>> launch application master.
>>>         at
>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:119)
>>>         at
>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
>>>         at
>>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
>>>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:370)
>>>         at
>>> com.spotify.analytics.AnalyticsSparkContext.<init>(AnalyticsSparkContext.scala:8)
>>>         at com.spotify.analytics.DataSampler$.main(DataSampler.scala:42)
>>>         at com.spotify.analytics.DataSampler.main(DataSampler.scala)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at
>>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:551)
>>>         at
>>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:155)
>>>         at
>>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:178)
>>>         at
>>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:99)
>>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>> /Anders
>>>
>>>
>>> On Thu, Feb 12, 2015 at 1:33 AM, Sandy Ryza <sandy.r...@cloudera.com>
>>> wrote:
>>>
>>>> Hi Anders,
>>>>
>>>> I just tried this out and was able to successfully acquire executors.
>>>> Any strange log messages or additional color you can provide on your
>>>> setup?  Does yarn-client mode work?
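>>>>
>>>> If a yarn-client shell comes up at all, a rough way to check whether any
>>>> executors ever register (the count includes the driver itself, so 1 means
>>>> no executors) is something like:
>>>>
>>>>   echo 'println(sc.getExecutorMemoryStatus.size)' | spark-shell --master yarn-client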
>>>>
>>>> -Sandy
>>>>
>>>> On Wed, Feb 11, 2015 at 1:28 PM, Anders Arpteg <arp...@spotify.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I compiled the latest master of Spark yesterday (2015-02-10) for Hadoop
>>>>> 2.2 and failed to execute jobs in yarn-cluster mode with that build. It
>>>>> works successfully with Spark 1.2 (and also with master from 2015-01-16),
>>>>> so something has changed since then that prevents the job from receiving
>>>>> any executors on the cluster.
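>>>>>
>>>>> (For reference, the build was roughly the documented Hadoop 2.2 build,
>>>>> i.e. something like
>>>>>
>>>>>   mvn -Pyarn -Phadoop-2.2 -DskipTests clean package
>>>>>
>>>>> with the exact flags possibly differing slightly.)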
>>>>>
>>>>> The basic symptoms are that the job fires up the AM, but on the
>>>>> "Executors" page in the web UI only the driver is listed, no executors are
>>>>> ever received, and the driver keeps waiting forever. Has anyone seen
>>>>> similar problems?
>>>>>
>>>>> Thanks for any insights,
>>>>> Anders
>>>>>
>>>>
>>>>
>>>
>
