Re: [Analytics] SparkContext stopped and cannot be restarted

2020-02-07 Thread Leila Zia
On Fri, Feb 7, 2020 at 12:45 PM Nuria Ruiz  wrote:

> > and the verdict (supported by you) was that we should use this list or
> the public IRC channel.
> Indeed, eh? I suggest we revisit that and send questions to
> analytics-internal instead, but if others disagree, I am fine with either.
>

my 2 cents: I prefer the public list as the conversation can be relevant to
my team (Research) as well. At the moment, if I see something is not of
immediate interest to me, I mute the thread. That's quite easy/cheap on my
end. If the frequency of this kind of question on the list increases
significantly, I'd suggest adding a tag to the subject line that allows
people to filter appropriately.

Leila
___
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


Re: [Analytics] SparkContext stopped and cannot be restarted

2020-02-07 Thread Nuria Ruiz
> and the verdict (supported by you) was that we should use this list or
the public IRC channel.
Indeed, eh? I suggest we revisit that and send questions to
analytics-internal instead, but if others disagree, I am fine with either.



On Fri, Feb 7, 2020 at 12:17 PM Neil Shah-Quinn 
wrote:

> Good suggestions, Andrew! I'll try those if I encounter this again.
>
> Nuria, we had a discussion about the appropriate places to ask questions
> about internal systems in October 2018, and the verdict (supported by you)
> was that we should use this list or the public IRC channel.
>
> If you want to revisit that decision, I'd suggest you consult that thread
> first (the subject was "Where to ask questions about internal analytics
> tools") because I included a detailed list of pros and cons of different
> channels to start the discussion. In that list, I even mentioned that such
> discussions on this channel could annoy subscribers who don't have access
> to these systems 
>
> If you still want us to use a different list, we can certainly do that. If
> so, please send my team a message and update the docs I added
>  so it stays clear.
>
> On Fri, 7 Feb 2020 at 07:48, Nuria Ruiz  wrote:
>
>> Hello,
>>
>> This discussion is probably not of wide interest to this public list;
>> shall we move it to analytics-internal?
>>
>> Thanks,
>>
>> Nuria
>>
>> On Fri, Feb 7, 2020 at 6:53 AM Andrew Otto  wrote:
>>
>>> Hm, interesting! I don't think many of us have used
>>> SparkSession.builder.getOrCreate repeatedly in the same process. What
>>> happens if you manually stop the Spark session first (session.stop()?),
>>> or maybe try to explicitly create a new session via newSession()?
>>>
>>> On Thu, Feb 6, 2020 at 7:31 PM Neil Shah-Quinn 
>>> wrote:
>>>
 Hi Luca!

 Those were separate Yarn jobs I started later. When I got this error, I
 found that the Yarn job corresponding to the SparkContext was marked as
 "successful", but I still couldn't get SparkSession.builder.getOrCreate to
 open a new one.

 Any idea what might have caused that or how I could recover without
 restarting the notebook, which could mean losing a lot of in-progress work?
 I had already restarted that kernel so I don't know if I'll encounter this
 problem again. If I do, I'll file a task.

 On Wed, 5 Feb 2020 at 23:24, Luca Toscano 
 wrote:

> Hey Neil,
>
> there were two Yarn jobs running related to your notebooks; I just
> killed them. Let's see if that solves the problem (you might need to
> restart your notebook again). If not, let's open a task and investigate :)
>
> Luca
>
> Il giorno gio 6 feb 2020 alle ore 02:08 Neil Shah-Quinn <
> nshahqu...@wikimedia.org> ha scritto:
>
>> Whoa—I just got the same stopped SparkContext error on the query even
>> after restarting the notebook, without an intermediate Java heap space
>> error. That seems very strange to me.
>>
>> On Wed, 5 Feb 2020 at 16:09, Neil Shah-Quinn <
>> nshahqu...@wikimedia.org> wrote:
>>
>>> Hey there!
>>>
>>> I was running SQL queries via PySpark (using the wmfdata package
>>> )
>>> on SWAP when one of my queries failed with "java.lang.OutOfMemoryError:
>>> Java heap space".
>>>
>>> After that, when I tried to call the spark.sql function again (via
>>> wmfdata.hive.run), it failed with "java.lang.IllegalStateException:
>>> Cannot call methods on a stopped SparkContext."
>>>
>>> When I tried to create a new Spark context using
>>> SparkSession.builder.getOrCreate (whether using wmfdata.spark.get_session
>>> or directly), it returned a SparkContext object properly, but calling the
>>> object's sql function still gave the same "stopped SparkContext" error.
>>>
>>> Any idea what's going on? I assume restarting the notebook kernel
>>> would take care of the problem, but it seems like there has to be a 
>>> better
>>> way to recover.
>>>
>>> Thank you!
>>>
>>>

Re: [Analytics] SparkContext stopped and cannot be restarted

2020-02-07 Thread Neil Shah-Quinn
Good suggestions, Andrew! I'll try those if I encounter this again.

Nuria, we had a discussion about the appropriate places to ask questions
about internal systems in October 2018, and the verdict (supported by you)
was that we should use this list or the public IRC channel.

If you want to revisit that decision, I'd suggest you consult that thread
first (the subject was "Where to ask questions about internal analytics
tools") because I included a detailed list of pros and cons of different
channels to start the discussion. In that list, I even mentioned that such
discussions on this channel could annoy subscribers who don't have access
to these systems 

If you still want us to use a different list, we can certainly do that. If
so, please send my team a message and update the docs I added
 so it stays clear.

On Fri, 7 Feb 2020 at 07:48, Nuria Ruiz  wrote:

> Hello,
>
> This discussion is probably not of wide interest to this public list;
> shall we move it to analytics-internal?
>
> Thanks,
>
> Nuria
>
> On Fri, Feb 7, 2020 at 6:53 AM Andrew Otto  wrote:
>
>> Hm, interesting! I don't think many of us have used
>> SparkSession.builder.getOrCreate repeatedly in the same process. What
>> happens if you manually stop the Spark session first (session.stop()?),
>> or maybe try to explicitly create a new session via newSession()?
>>
>> On Thu, Feb 6, 2020 at 7:31 PM Neil Shah-Quinn 
>> wrote:
>>
>>> Hi Luca!
>>>
>>> Those were separate Yarn jobs I started later. When I got this error, I
>>> found that the Yarn job corresponding to the SparkContext was marked as
>>> "successful", but I still couldn't get SparkSession.builder.getOrCreate to
>>> open a new one.
>>>
>>> Any idea what might have caused that or how I could recover without
>>> restarting the notebook, which could mean losing a lot of in-progress work?
>>> I had already restarted that kernel so I don't know if I'll encounter this
>>> problem again. If I do, I'll file a task.
>>>
>>> On Wed, 5 Feb 2020 at 23:24, Luca Toscano 
>>> wrote:
>>>
 Hey Neil,

 there were two Yarn jobs running related to your notebooks; I just
 killed them. Let's see if that solves the problem (you might need to
 restart your notebook again). If not, let's open a task and investigate :)

 Luca

 Il giorno gio 6 feb 2020 alle ore 02:08 Neil Shah-Quinn <
 nshahqu...@wikimedia.org> ha scritto:

> Whoa—I just got the same stopped SparkContext error on the query even
> after restarting the notebook, without an intermediate Java heap space
> error. That seems very strange to me.
>
> On Wed, 5 Feb 2020 at 16:09, Neil Shah-Quinn 
> wrote:
>
>> Hey there!
>>
>> I was running SQL queries via PySpark (using the wmfdata package
>> )
>> on SWAP when one of my queries failed with "java.lang.OutOfMemoryError:
>> Java heap space".
>>
>> After that, when I tried to call the spark.sql function again (via
>> wmfdata.hive.run), it failed with "java.lang.IllegalStateException:
>> Cannot call methods on a stopped SparkContext."
>>
>> When I tried to create a new Spark context using
>> SparkSession.builder.getOrCreate (whether using wmfdata.spark.get_session
>> or directly), it returned a SparkContext object properly, but calling the
>> object's sql function still gave the same "stopped SparkContext" error.
>>
>> Any idea what's going on? I assume restarting the notebook kernel
>> would take care of the problem, but it seems like there has to be a 
>> better
>> way to recover.
>>
>> Thank you!
>>
>>

Re: [Analytics] SparkContext stopped and cannot be restarted

2020-02-07 Thread Nuria Ruiz
Hello,

This discussion is probably not of wide interest to this public list;
shall we move it to analytics-internal?

Thanks,

Nuria

On Fri, Feb 7, 2020 at 6:53 AM Andrew Otto  wrote:

> Hm, interesting! I don't think many of us have used
> SparkSession.builder.getOrCreate repeatedly in the same process. What
> happens if you manually stop the Spark session first (session.stop()?),
> or maybe try to explicitly create a new session via newSession()?
>
> On Thu, Feb 6, 2020 at 7:31 PM Neil Shah-Quinn 
> wrote:
>
>> Hi Luca!
>>
>> Those were separate Yarn jobs I started later. When I got this error, I
>> found that the Yarn job corresponding to the SparkContext was marked as
>> "successful", but I still couldn't get SparkSession.builder.getOrCreate to
>> open a new one.
>>
>> Any idea what might have caused that or how I could recover without
>> restarting the notebook, which could mean losing a lot of in-progress work?
>> I had already restarted that kernel so I don't know if I'll encounter this
>> problem again. If I do, I'll file a task.
>>
>> On Wed, 5 Feb 2020 at 23:24, Luca Toscano  wrote:
>>
>>> Hey Neil,
>>>
>>> there were two Yarn jobs running related to your notebooks; I just
>>> killed them. Let's see if that solves the problem (you might need to
>>> restart your notebook again). If not, let's open a task and investigate :)
>>>
>>> Luca
>>>
>>> Il giorno gio 6 feb 2020 alle ore 02:08 Neil Shah-Quinn <
>>> nshahqu...@wikimedia.org> ha scritto:
>>>
 Whoa—I just got the same stopped SparkContext error on the query even
 after restarting the notebook, without an intermediate Java heap space
 error. That seems very strange to me.

 On Wed, 5 Feb 2020 at 16:09, Neil Shah-Quinn 
 wrote:

> Hey there!
>
> I was running SQL queries via PySpark (using the wmfdata package
> )
> on SWAP when one of my queries failed with "java.lang.OutOfMemoryError:
> Java heap space".
>
> After that, when I tried to call the spark.sql function again (via
> wmfdata.hive.run), it failed with "java.lang.IllegalStateException: Cannot
> call methods on a stopped SparkContext."
>
> When I tried to create a new Spark context using
> SparkSession.builder.getOrCreate (whether using wmfdata.spark.get_session
> or directly), it returned a SparkContext object properly, but calling the
> object's sql function still gave the same "stopped SparkContext" error.
>
> Any idea what's going on? I assume restarting the notebook kernel
> would take care of the problem, but it seems like there has to be a better
> way to recover.
>
> Thank you!
>
>


Re: [Analytics] SparkContext stopped and cannot be restarted

2020-02-07 Thread Andrew Otto
Hm, interesting! I don't think many of us have used
SparkSession.builder.getOrCreate repeatedly in the same process. What
happens if you manually stop the Spark session first (session.stop()?),
or maybe try to explicitly create a new session via newSession()?

On Thu, Feb 6, 2020 at 7:31 PM Neil Shah-Quinn 
wrote:

> Hi Luca!
>
> Those were separate Yarn jobs I started later. When I got this error, I
> found that the Yarn job corresponding to the SparkContext was marked as
> "successful", but I still couldn't get SparkSession.builder.getOrCreate to
> open a new one.
>
> Any idea what might have caused that or how I could recover without
> restarting the notebook, which could mean losing a lot of in-progress work?
> I had already restarted that kernel so I don't know if I'll encounter this
> problem again. If I do, I'll file a task.
>
> On Wed, 5 Feb 2020 at 23:24, Luca Toscano  wrote:
>
>> Hey Neil,
>>
>> there were two Yarn jobs running related to your notebooks; I just
>> killed them. Let's see if that solves the problem (you might need to
>> restart your notebook again). If not, let's open a task and investigate :)
>>
>> Luca
>>
>> Il giorno gio 6 feb 2020 alle ore 02:08 Neil Shah-Quinn <
>> nshahqu...@wikimedia.org> ha scritto:
>>
>>> Whoa—I just got the same stopped SparkContext error on the query even
>>> after restarting the notebook, without an intermediate Java heap space
>>> error. That seems very strange to me.
>>>
>>> On Wed, 5 Feb 2020 at 16:09, Neil Shah-Quinn 
>>> wrote:
>>>
 Hey there!

 I was running SQL queries via PySpark (using the wmfdata package
 )
 on SWAP when one of my queries failed with "java.lang.OutOfMemoryError:
 Java heap space".

 After that, when I tried to call the spark.sql function again (via
 wmfdata.hive.run), it failed with "java.lang.IllegalStateException: Cannot
 call methods on a stopped SparkContext."

 When I tried to create a new Spark context using
 SparkSession.builder.getOrCreate (whether using wmfdata.spark.get_session
 or directly), it returned a SparkContext object properly, but calling the
 object's sql function still gave the same "stopped SparkContext" error.

 Any idea what's going on? I assume restarting the notebook kernel would
 take care of the problem, but it seems like there has to be a better way to
 recover.

 Thank you!

