Re: Stopping a job

2020-06-09 Thread Kostas Kloudas
Hi all,

Just for future reference, there is an ongoing discussion on this topic in
another thread [1].
So please post any relevant comments there :)

Cheers,
Kostas

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Age-old-stop-vs-cancel-debate-td35514.html#a35615

On Tue, Jun 9, 2020 at 7:36 AM M Singh  wrote:

> Thanks Kostas, Arvid, and Senthil for your help.

Re: Stopping a job

2020-06-08 Thread Senthil Kumar
I am just stating this for completeness.

When a job is cancelled, Flink sends an interrupt signal to the thread
running the Source.run method.

For some reason (unknown to me), this does not happen when a Stop command is
issued.

We ran into some minor issues because of said behavior.
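For illustration, the interrupt-on-cancel behavior described above can be sketched in plain Java (no Flink APIs; the class and method names below are invented for the example). A source loop blocked on a queue only ends when its thread is interrupted, which is exactly what the cancel path relies on:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class InterruptibleSource {
    private final BlockingQueue<String> input = new ArrayBlockingQueue<>(16);
    final AtomicInteger emitted = new AtomicInteger();

    // Mimics a Source.run-style loop: blocks waiting for data until interrupted.
    public void run() {
        try {
            while (true) {
                input.take();                    // blocking call; throws when interrupted
                emitted.incrementAndGet();
            }
        } catch (InterruptedException e) {
            // Cancel path: the runtime interrupts this thread, so we restore
            // the flag and fall through to cleanup.
            Thread.currentThread().interrupt();
        }
    }

    public static int demo() throws Exception {
        InterruptibleSource source = new InterruptibleSource();
        Thread t = new Thread(source::run);
        t.start();
        source.input.put("a");
        source.input.put("b");
        TimeUnit.MILLISECONDS.sleep(200);        // let the source drain the queue
        t.interrupt();                           // "cancel": interrupt the source thread
        t.join(1000);
        return source.emitted.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("records emitted before cancel: " + demo());
    }
}
```

A stop path that never interrupts the thread would leave this loop blocked in take() forever, which is the kind of minor issue mentioned above.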


Re: Stopping a job

2020-06-08 Thread Kostas Kloudas
What Arvid said is correct.
The only thing I have to add is that "stop" also allows exactly-once sinks
to push out their buffered data to their final destination (e.g. a
filesystem). In other words, it takes side-effects into account, so it
guarantees exactly-once end-to-end, assuming that you are using
exactly-once sources and sinks. Cancel with savepoint, on the other hand,
did not necessarily do this; committing side-effects followed a
"best-effort" approach.

For more information you can check [1].

Cheers,
Kostas

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212
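As a toy illustration of that difference (plain Java, not Flink's actual sink API; every name below is invented), a buffering sink commits its side-effects on a graceful stop but abandons whatever is still buffered on cancel:

```java
import java.util.ArrayList;
import java.util.List;

public class BufferingSink {
    private final List<String> buffer = new ArrayList<>();       // uncommitted records
    private final List<String> destination = new ArrayList<>();  // e.g. the filesystem

    public void write(String record) { buffer.add(record); }

    // Graceful stop: flush side-effects so end-to-end exactly-once holds.
    public void stop() {
        destination.addAll(buffer);
        buffer.clear();
    }

    // Cancel: tear down immediately; buffered (uncommitted) records are lost.
    public void cancel() { buffer.clear(); }

    public List<String> committed() { return destination; }

    public static void main(String[] args) {
        BufferingSink stopped = new BufferingSink();
        stopped.write("a"); stopped.write("b");
        stopped.stop();

        BufferingSink cancelled = new BufferingSink();
        cancelled.write("a"); cancelled.write("b");
        cancelled.cancel();

        System.out.println("after stop: " + stopped.committed());     // [a, b]
        System.out.println("after cancel: " + cancelled.committed()); // []
    }
}
```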


Re: Stopping a job

2020-06-08 Thread Arvid Heise
It was before I joined the dev team, so the following are kind of
speculative:

The concept of stoppable functions never really took off, as it was a bit
of a clumsy approach. There is no fundamental difference between stopping
and cancelling at the (sub)task level. Indeed, if you look at the Twitter
source in 1.6 [1], cancel() and stop() do exactly the same thing. I'd
assume that this is probably true for all sources.

So what is the difference between cancel and stop then? It's more about how
you terminate the whole DAG. On cancelling, cancel() is called on all tasks
more or less simultaneously. Stop is more of a fine-grained cancel: you
first stop the sources and then let the tasks close themselves once all
their upstream tasks are done. Just before closing the tasks, you also take
a snapshot. Thus, the difference should not be visible in user code, but
only in the Flink code itself (task/checkpoint coordinator).

So for your questions:
1. No; at the task level, stop() and cancel() are the same thing as far as
UDFs are concerned.
2. Yes, stop is more graceful and creates a snapshot. [2]
3. Not that I am aware of. In the whole Flink code base there are no more
stoppable sources (see the javadoc). You could of course check whether
there are some in Bahir, but it shouldn't really matter: there is no huge
difference between stopping and cancelling if you wait for a checkpoint to
finish.
4. Okay, you answered your second question ;) Yes, cancel with savepoint =
stop now, to make it easier for new users.

[1]
https://github.com/apache/flink/blob/release-1.6/flink-connectors/flink-connector-twitter/src/main/java/org/apache/flink/streaming/connectors/twitter/TwitterSource.java#L180-L190
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/cli.html
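The ordering described above — sources finish first, downstream tasks drain what remains, and only then is the snapshot taken — can be sketched like this (plain Java; the names are invented and this is not how Flink's task/checkpoint coordinator is actually implemented):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class GracefulStop {
    public static int demo() throws Exception {
        BlockingQueue<Integer> channel = new LinkedBlockingQueue<>();
        AtomicBoolean sourceRunning = new AtomicBoolean(true);
        List<Integer> sinkState = new ArrayList<>();

        // Source task: emits until asked to stop (or its input is exhausted),
        // then finishes normally instead of being killed.
        Thread source = new Thread(() -> {
            int i = 0;
            while (sourceRunning.get() && i < 5) {
                channel.add(i++);
            }
        });

        source.start();
        source.join();                  // step 1 of stop: the source finishes first

        // Step 2: the downstream task drains everything the source produced...
        Integer record;
        while ((record = channel.poll(100, TimeUnit.MILLISECONDS)) != null) {
            sinkState.add(record);
        }

        // Step 3: ...and only then is the final snapshot taken, so it sees
        // every in-flight record.
        List<Integer> snapshot = new ArrayList<>(sinkState);
        return snapshot.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("records in final snapshot: " + demo()); // 5
    }
}
```

A plain cancel, by contrast, would tear down the source and sink threads simultaneously, with no guarantee that the channel is drained before the job ends.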


Re: Stopping a job

2020-06-06 Thread M Singh
 
Hi Arvid:

Thanks for the links.

A few questions:

1. Is there any particular interface in 1.9+ that identifies the source as
stoppable?
2. Is there any distinction b/w stop and cancel in 1.9+?
3. Is there any list of sources which are documented as stoppable besides
the one listed in your SO link?
4. In 1.9+ there is a flink stop command and a flink cancel command
(https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html#stop).
So it appears that flink stop will take a savepoint and then call cancel,
and cancel will just cancel the job (looks like cancel with savepoint is
deprecated in 1.10).

Thanks again for your help.



Re: Stopping a job

2020-06-06 Thread Arvid Heise
Yes, it seems as if FlinkKinesisConsumer does not implement it.

Here are the links to the respective javadoc [1] and code [2]. Note that in
later releases (1.9+) this interface has been removed; stop is now
implemented through a cancel() at the source level.

In general, I don't think that stop is needed in a Kinesis-to-Kinesis use
case anyway, since no additional consistency is expected over a normal
cancel.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/api/common/functions/StoppableFunction.html
[2]
https://github.com/apache/flink/blob/release-1.6/flink-core/src/main/java/org/apache/flink/api/common/functions/StoppableFunction.java
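For reference, the removed mechanism had roughly this shape. This is a sketch only: the interface below is a stand-in, not Flink's actual StoppableFunction definition, and all other names are invented. It also shows why the separate interface was redundant — stop() just delegates to cancel(), as observed in the 1.6 Twitter source:

```java
// Stand-in for the removed StoppableFunction interface; not Flink's definition.
interface StoppableFunction {
    void stop();
}

public class StoppableSource implements StoppableFunction {
    private volatile boolean running = true;
    int emitted = 0; // package-private so the demo can inspect it

    // Source.run-style loop: exits once the flag is cleared.
    public void run(int maxRecords) {
        while (running && emitted < maxRecords) {
            emitted++;
        }
    }

    // As in the 1.6 Twitter source, stop() and cancel() end up doing the same
    // thing, which is why 1.9+ dropped the extra interface and routes stop
    // through cancel().
    public void cancel() { running = false; }

    @Override
    public void stop() { cancel(); }

    public static void main(String[] args) {
        StoppableSource s = new StoppableSource();
        s.cancel();   // flag cleared before run(): the loop exits immediately
        s.run(10);
        System.out.println("emitted after cancel: " + s.emitted); // 0
    }
}
```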



-- 

Arvid Heise | Senior Java Developer



Follow us @VervericaData

--

Join Flink Forward  - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
(Toni) Cheng


Re: Stopping a job

2020-06-06 Thread M Singh
Hi Arvid:

I checked the link and it indicates that only the Storm SpoutSource,
TwitterSource and NifiSource support stop.

Does this mean that FlinkKinesisConsumer is not stoppable?

Also, can you please point me to the Stoppable interface mentioned in the
link? I found the following but am not sure if TwitterSource implements it:
https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/StartStoppable.java

Thanks





Re: Stopping a job

2020-06-05 Thread Arvid Heise
Hi,

could you check if this SO thread [1] helps you already?

[1]
https://stackoverflow.com/questions/53735318/flink-how-to-solve-error-this-job-is-not-stoppable



Stopping a job

2020-06-04 Thread M Singh
Hi:

I am running a job which consumes data from Kinesis and sends data to
another Kinesis queue. I am using an older version of Flink (1.6), and when
I try to stop the job I get an exception:

Caused by: java.util.concurrent.ExecutionException:
org.apache.flink.runtime.rest.util.RestClientException: [Job termination
(STOP) failed: This job is not stoppable.]

I wanted to find out what a stoppable job is, and whether it is possible to
make a job stoppable if it is reading/writing to Kinesis?

Thanks




Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-16 Thread John Smith
Thanks! This should do the trick...

@Override
public void close() throws Exception {
    CompletableFuture<Void> vertxClosed = new CompletableFuture<>();

    if (jdbc != null)
        jdbc.close();

    if (vertx != null)
        vertx.close(close -> {
            if (close.failed())
                logger.error("Vertx did not close properly!", close.cause());

            vertxClosed.complete(null);
        });

    if (ignite != null)
        ignite.close();

    // Give the async libs a chance to close().
    vertxClosed.get(3000, TimeUnit.MILLISECONDS);
}


On Thu, 16 May 2019 at 15:25, Ken Krugler 
wrote:

>
> On May 16, 2019, at 9:38 AM, John Smith  wrote:
>
> Hi, so Thread.sleep(1000) seems to do the trick. Now, is this a good thing
> or a bad thing?
>
>
> An arbitrary sleep duration has the potential to create random failures if
> the close takes longer than expected.
>
> For async close() calls, often you handle this by querying (in a loop,
> with sleep() and a max duration) the client until its status changes.
>
> — Ken
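Ken's polling approach might look like the following sketch (plain Java; AsyncClient is a made-up stand-in for an async library client such as the vertx instance, not a real API):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CloseAwait {
    // Stand-in for an async library client that closes in the background.
    static class AsyncClient {
        final AtomicBoolean closed = new AtomicBoolean(false);

        void close() {
            // Simulates async shutdown finishing a little later on another thread.
            new Thread(() -> {
                try {
                    TimeUnit.MILLISECONDS.sleep(50);
                } catch (InterruptedException ignored) {
                    Thread.currentThread().interrupt();
                }
                closed.set(true);
            }).start();
        }

        boolean isClosed() { return closed.get(); }
    }

    // Poll the client's status in a loop, with a sleep and a max duration,
    // instead of one arbitrary fixed sleep.
    static boolean awaitClosed(AsyncClient client, long maxWaitMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        while (System.currentTimeMillis() < deadline) {
            if (client.isClosed()) return true;   // returns as soon as it's done
            TimeUnit.MILLISECONDS.sleep(10);
        }
        return client.isClosed();                 // last check at the deadline
    }

    public static void main(String[] args) throws Exception {
        AsyncClient client = new AsyncClient();
        client.close();
        System.out.println("closed in time: " + awaitClosed(client, 2000));
    }
}
```

Unlike a fixed Thread.sleep(1000), this returns as soon as the close completes and only fails if the close genuinely exceeds the maximum duration.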
>
>
>
> On Thu, 16 May 2019 at 11:46, John Smith  wrote:
>
>> Yes, when I say cancel the job, I mean going into the UI and hitting the
>> cancel button at the top right corner.
>>
>> The close is very simple...
>>
>> @Override
>> public void close() throws Exception {
>>if(jdbc != null)
>>   jdbc.close();
>>
>>if(vertx != null)
>>   vertx.close();
>>
>>if(ignite != null)
>>   ignite.close();
>> }
>>
>> I was thinking that jdbc.close() and vertx.close() are actually async
>> underneath. The vertx API offers two methods for close, one with an async
>> interface and one without. But essentially they are both async.
>>
>> So I'm thinking maybe I can introduce a small Thread.sleep(1000) after
>> the last close to give vertx.close() a chance to finish.
>>
>>
>>
>>
>> On Thu, 16 May 2019 at 04:14, Andrey Zagrebin 
>> wrote:
>>
>>> Could you share the source code of your RichAsyncFunction?
>>> It looks like the netty threads of vertx are still shutting down after
>>> the user code class loader has been shut down.
>>> It probably means that the RichAsyncFunction was not closed properly, or
>>> that not all resources were fully freed there (logging your
>>> RichAsyncFunction.close could help).
>>> Do you mean cancellation by stopping the job?
>>>
>>> On Wed, May 15, 2019 at 10:02 PM John Smith 
>>> wrote:
>>>
>>>> So these are the two exceptions I see in the logs...
>>>>
>>>> Exception in thread "vert.x-worker-thread-0" Exception in thread
>>>> "vert.x-internal-blocking-0" java.lang.NoClassDefFoundError:
>>>> io/netty/util/concurrent/FastThreadLocal
>>>> at
>>>> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:32)
>>>> at java.lang.Thread.run(Thread.java:748)
>>>> Caused by: java.lang.ClassNotFoundException:
>>>> io.netty.util.concurrent.FastThreadLocal
>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>> at
>>>> org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:129)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>> ... 2 more
>>>> java.lang.NoClassDefFoundError: io/netty/util/concurrent/FastThreadLocal
>>>> at
>>>> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:32)
>>>> at java.lang.Thread.run(Thread.java:748)
>>>> May 15, 2019 10:42:07 AM io.vertx.core.impl.ContextImpl
>>>> SEVERE: Unhandled exception
>>>> java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError:
>>>> io/vertx/core/impl/VertxImpl$SharedWorkerPool
>>>> at
>>>> io.vertx.core.impl.VertxImpl.lambda$deleteCacheDirAndShutdown$25(VertxImpl.java:830)
>>>> at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:284)
>>>> at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:320)
>>>> at
>>>> io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
>>>> at
>>>> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>>>> at
>>>> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
&

Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-16 Thread Ken Krugler

> On May 16, 2019, at 9:38 AM, John Smith  wrote:
> 
> Hi, so Thread.Sleep(1000) seems to do the trick. Now is this a good thing or 
> bad thing?

An arbitrary sleep duration has the potential to create random failures if the 
close takes longer than expected.

For async close() calls, often you handle this by querying (in a loop, with 
sleep() and a max duration) the client until its status changes.

— Ken
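Ken's loop-with-timeout approach can be sketched as follows. This is a minimal, self-contained illustration, not tied to any particular client API: `isClosed` stands in for whatever status check the client actually exposes, and the spawned thread simulates an asynchronous close.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class CloseAwait {

    // Poll a status check until it reports closed, or give up after maxWaitMillis.
    static boolean awaitClosed(BooleanSupplier isClosed, long maxWaitMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        while (System.currentTimeMillis() < deadline) {
            if (isClosed.getAsBoolean()) {
                return true;              // client finished closing
            }
            Thread.sleep(pollMillis);     // short back-off before re-checking
        }
        return isClosed.getAsBoolean();   // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean closed = new AtomicBoolean(false);
        // Simulate an async close that completes after ~200 ms.
        new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) {}
            closed.set(true);
        }).start();
        System.out.println(awaitClosed(closed::get, 3000, 50));  // prints "true"
    }
}
```

Unlike a fixed `Thread.sleep(1000)`, this returns as soon as the close is observed, and the max duration only bounds the worst case.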




Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-16 Thread John Smith
Hi, so Thread.Sleep(1000) seems to do the trick. Now is this a good thing
or bad thing?



Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-16 Thread John Smith
Yes when I mean cancel the JOB, it's when you go inside the UI and hit the
cancel button at the top right corner.

The close is very simple...

@Override
public void close() throws Exception {
   if(jdbc != null)
  jdbc.close();

   if(vertx != null)
  vertx.close();

   if(ignite != null)
  ignite.close();
}

I was thinking that jdbc.close() and vertx.close() are actually async
underneath. The vertx API offers 2 methods for close, one with an async
interface and one without. But essentially they are both async.

So I'm thinking maybe I can introduce a small Thread.Sleep(1000) after the
last close to give a chance for vertx.close().
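Rather than a fixed sleep, the async close callback itself can signal completion, bounded by a timeout. Below is a sketch with a CountDownLatch; `asyncClose` is a hypothetical stand-in for an async `close(handler)` API like Vert.x's, not the real Vert.x call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class AsyncCloseLatch {

    // Hypothetical stand-in for an async close(handler) API such as Vert.x's.
    static void asyncClose(Consumer<Boolean> handler) {
        new Thread(() -> {
            try { Thread.sleep(100); } catch (InterruptedException ignored) {}
            handler.accept(true);   // report a successful close
        }).start();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch closed = new CountDownLatch(1);
        asyncClose(ok -> closed.countDown());
        // Block until the callback fires, with an upper bound instead of a blind sleep.
        boolean done = closed.await(3, TimeUnit.SECONDS);
        System.out.println(done ? "closed" : "timed out");  // prints "closed"
    }
}
```

This waits exactly as long as the close actually takes, up to the timeout, instead of always paying the full sleep.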






Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-16 Thread Andrey Zagrebin
Could you share the source code of your RichAsyncFunction?
It looks like the netty threads of vertx are still being shut down after the
user-code class loader has already been shut down.
That probably means the RichAsyncFunction was not closed properly, or that not
all resources were fully freed there (logging in your RichAsyncFunction.close
could help).
Do you mean cancellation by stopping the job?



Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-15 Thread John Smith
So these are the two exceptions I see in the logs...

Exception in thread "vert.x-worker-thread-0" Exception in thread
"vert.x-internal-blocking-0" java.lang.NoClassDefFoundError:
io/netty/util/concurrent/FastThreadLocal
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:32)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException:
io.netty.util.concurrent.FastThreadLocal
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at
org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:129)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 2 more
java.lang.NoClassDefFoundError: io/netty/util/concurrent/FastThreadLocal
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:32)
at java.lang.Thread.run(Thread.java:748)
May 15, 2019 10:42:07 AM io.vertx.core.impl.ContextImpl
SEVERE: Unhandled exception
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError:
io/vertx/core/impl/VertxImpl$SharedWorkerPool
at
io.vertx.core.impl.VertxImpl.lambda$deleteCacheDirAndShutdown$25(VertxImpl.java:830)
at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:284)
at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:320)
at
io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
at
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
at
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError:
io/vertx/core/impl/VertxImpl$SharedWorkerPool
... 10 more
Caused by: java.lang.ClassNotFoundException:
io.vertx.core.impl.VertxImpl$SharedWorkerPool
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at
org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:129)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 10 more



Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-15 Thread Andrey Zagrebin
Hi John,

could you share the full stack trace or better logs?
It looks like something is trying to be executed in vertx.io code after the
local task has been stopped and the class loader for the user code has been
unloaded. Maybe from some daemon thread pool.

Best,
Andrey




Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-15 Thread John Smith
Hi,

I'm using vertx.io as an async JDBC client for a RichAsyncFunction. It works
fine, but when I stop the job I get...

java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError:
io/vertx/core/impl/VertxImpl$SharedWorkerPool

Is there a way to avoid/fix this?


Re: Stopping the job with ID XXX failed.

2017-05-08 Thread Stefan Richter
Hi,

what you intend to do is a cancel in Flink terminology, not a stop. So you
should use the cancel command instead of stop. Please take a look here:
https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/cli.html

Best,
Stefan




Stopping the job with ID XXX failed.

2017-05-08 Thread yunfan123
I can't stop the job; every time I get an exception like the following.

Retrieving JobManager.
Using address /10.6.192.141:6123 to connect to JobManager.


 The program finished with the following exception:

java.lang.Exception: Stopping the job with ID
7f5a5f95353a2b486572f4cdefa813b8 failed.
at org.apache.flink.client.CliFrontend.stop(CliFrontend.java:528)
at
org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1087)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1126)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
at
org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1123)
Caused by: java.lang.IllegalStateException: Job with ID
7f5a5f95353a2b486572f4cdefa813b8 is not stoppable.
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:657)
at
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at
org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
at
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at 
org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
at 
org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at
org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at
org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:118)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Stopping-the-job-with-ID-XXX-failed-tp13046.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Gracefully Stopping Streaming Job Programmatically in LocalStreamEnvironment

2016-08-29 Thread Konstantin Knauf
Hi everyone,

I have an integration test for which I use a LocalStreamEnvironment.
Currently, the Flink job is started in a separate thread, which I
interrupt after some time, and then I run some assertions.

In this situation, is there a better way to stop/cancel a running job in a
LocalStreamEnvironment programmatically? Side info: the job reads from a
Kafka cluster, which is started programmatically for each test run.

Cheers,

Konstantin

-- 
Konstantin Knauf * konstantin.kn...@tngtech.com * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082


