[jira] [Created] (FLINK-3010) Add link to powered-by wiki page to project website

2015-11-13 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-3010:


 Summary: Add link to powered-by wiki page to project website
 Key: FLINK-3010
 URL: https://issues.apache.org/jira/browse/FLINK-3010
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Fabian Hueske
Priority: Minor


We recently started a powered-by-Flink page in the Flink wiki:

https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink

We should link to that page from the project website to give it some exposure.





Re: [DISCUSSION] Consistent shutdown of streaming jobs

2015-11-13 Thread Matthias J. Sax
I was thinking about this issue too and wanted to include it in my
current PR (I'm just about to rebase it onto the current master...
https://github.com/apache/flink/pull/750).

Or should we open a new JIRA for it and address it after the Stop signal is
available?


-Matthias

On 11/12/2015 11:47 AM, Maximilian Michels wrote:
> +1 for the proposed changes. But why not always create a snapshot on
> shutdown? Does that break any assumptions in the checkpointing
> interval? I see that if the user has checkpointing disabled, we can
> just create a fake snapshot.
> 
> On Thu, Nov 12, 2015 at 9:56 AM, Gyula Fóra  wrote:
>> Yes, I agree with you.
>>
>> Once we have the graceful shutdown we can make this happen fairly simply
>> with the mechanism you described :)
>>
>> Gyula
>>
>> Stephan Ewen  wrote (on Wed, Nov 11, 2015, at 15:43):
>>
>>> I think you are touching on something important here.
>>>
>>> There is a discussion/pull request about graceful shutdown of streaming
>>> jobs (i.e., stop the sources and let the remainder of the streams run
>>> out).
>>>
>>> With the work in progress to draw external checkpoints, it should be easy
>>> to do checkpoint-and-close.
>>> We may not even need the last ack in the "checkpoint -> ack -> notify ->
>>> ack" sequence if the operators simply wait for the "notifyComplete()"
>>> function to finish. Then the operators finish successfully only when the
>>> "notifyComplete()" method succeeds; otherwise they go to the state
>>> "failed".
>>> That is good, because we need no extra mechanism (no extra message type).
>>>
>>> What we do need in any case is a way to detect when the checkpoint did not
>>> globally succeed, so that the functions where it succeeded do not wait
>>> forever for the "notifySuccessful" message.
>>>
>>> We have two things here now:
>>>
>>> 1) Graceful shutdown should trigger an "internal" checkpoint (which is
>>> immediately discarded), in order to commit
>>> pending data for cases where data is staged between checkpoints.
>>>
>>> 2) An option to shut down with external checkpoint would also be important,
>>> to stop and resume from exactly there.
>>>
>>>
>>> Stephan
>>>
>>>
>>> On Wed, Nov 11, 2015 at 3:19 PM, Gyula Fóra  wrote:
>>>
 Hey guys,

 With recent discussions around being able to shut down and restart
 streaming jobs from specific checkpoints, there is another issue that I
 think needs tackling.

 As far as I understand, when a streaming job finishes, the tasks are not
 notified about the last checkpoints, and jobs also don't take a final
 checkpoint before shutting down.

 In my opinion this might lead to situations where the user cannot tell
 whether the job finished properly (with consistent states/outputs etc.). To
 give you a concrete example, let's say I am using the RollingSink to
 produce exactly-once output files. If the job finishes, I think there will
 be some files that remain in the pending state and are never completed. The
 user then sees some complete files, and some pending files for the
 completed job. The question is then: how do I tell whether the pending
 files were actually completed properly, now that the job is finished?

 Another example would be that I want to manually shut down my job at 12:00
 and make sure that I produce every output up to that point. Is there any
 way to achieve this currently?

 I think we need to do two things to make this work:
 1. Job shutdowns (finish/manual) should trigger a final checkpoint.
 2. These final checkpoints should actually be two-phase checkpoints:
 checkpoint -> ack -> notify -> ack. When the checkpoint coordinator
 gets all the notification acks, it can tell the user that the system shut
 down cleanly.

 Unfortunately it can happen that for some reason the coordinator does not
 receive all the acks for a complete job; in that case it can warn the user
 that the checkpoint might be inconsistent.

 Let me know what you think!

 Cheers,
 Gyula

>>>



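[Editor's illustration: below is a minimal Java sketch of the two-phase final
checkpoint this thread converges on (checkpoint -> ack -> notify -> ack). All
names are hypothetical and the coordinator-to-task messaging is stubbed out;
this only shows the shape of the proposed protocol, not Flink's actual
CheckpointCoordinator API.]

    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical sketch of the proposed shutdown sequence; the messaging
    // calls below are stand-ins for the real coordinator-to-task RPCs.
    public class FinalCheckpointSketch {

        enum Result { CLEAN_SHUTDOWN, POSSIBLY_INCONSISTENT }

        Result shutDownWithFinalCheckpoint(Set<String> tasks) {
            // Phase 1: trigger a final checkpoint on every task, collect acks.
            Set<String> checkpointAcks = new HashSet<>();
            for (String task : tasks) {
                if (triggerCheckpoint(task)) {
                    checkpointAcks.add(task);
                }
            }
            if (!checkpointAcks.equals(tasks)) {
                // Some task never acked: warn that state may be inconsistent.
                return Result.POSSIBLY_INCONSISTENT;
            }

            // Phase 2: notify global completion so sinks like RollingSink can
            // promote pending files, then collect the second round of acks.
            Set<String> notifyAcks = new HashSet<>();
            for (String task : tasks) {
                if (notifyCheckpointComplete(task)) {
                    notifyAcks.add(task);
                }
            }
            // Only if every task confirmed the notification can we tell the
            // user that the job shut down cleanly.
            return notifyAcks.equals(tasks)
                    ? Result.CLEAN_SHUTDOWN
                    : Result.POSSIBLY_INCONSISTENT;
        }

        // Stand-ins for the real messages; they always succeed in this sketch.
        private boolean triggerCheckpoint(String task) { return true; }
        private boolean notifyCheckpointComplete(String task) { return true; }
    }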


[jira] [Created] (FLINK-3008) Failing/restarting streaming jobs are buggy in the webfrontend

2015-11-13 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-3008:

 Summary: Failing/restarting streaming jobs are buggy in the 
webfrontend
 Key: FLINK-3008
 URL: https://issues.apache.org/jira/browse/FLINK-3008
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Affects Versions: 0.10, 1.0
Reporter: Gyula Fora


Failing/restarting jobs are not handled very cleanly by the webfrontend. There 
are multiple issues:

- One cannot cancel a restarting job (there is no cancel button): this is a 
problem if it's a non-recoverable failure and you just want to kill it.
- Sometimes restarting but manually canceled jobs stay in the running jobs 
section in the restarting phase even though they have been canceled.





Re: [VOTE] [RESULT] Release Apache Flink 0.10.0 (release-0.10.0-rc8)

2015-11-13 Thread Chiwan Park
Great. Thanks Max! :)

> On Nov 13, 2015, at 4:06 PM, Vasiliki Kalavri  
> wrote:
> 
> \o/ \o/ \o/
> Thank you Max!
> On Nov 13, 2015 2:23 AM, "Nick Dimiduk"  wrote:
> 
>> Woo hoo!
>> 
>> On Thu, Nov 12, 2015 at 3:01 PM, Maximilian Michels 
>> wrote:
>> 
>>> Thanks for voting! The vote passes.
>>> 
>>> The following votes have been cast:
>>> 
>>> +1 votes: 7
>>> 
>>> Stephan
>>> Aljoscha
>>> Robert
>>> Max
>>> Chiwan*
>>> Henry
>>> Fabian
>>> 
>>> * non-binding
>>> 
>>> -1 votes: none
>>> 
>>> I'll upload the release artifacts and release the Maven artifacts.
>>> Once the changes are effective, the community may announce the
>>> release.
>>> 

Regards,
Chiwan Park





Re: Release notes for 0.10.0

2015-11-13 Thread Ufuk Celebi

> On 12 Nov 2015, at 21:25, Fabian Hueske  wrote:
> 
> Hi everybody,
> 
> with 0.10.0 almost being released, I started writing release notes for the
> Flink blog.
> 
> Please find the current draft here:
> https://docs.google.com/document/d/1ULZAdxwneZAldhJ69tB3UEvjJQhS-ZASN5mdtumtJ48/edit?usp=sharing
> 
> Everybody has permission to comment on the draft. Please let me know if
> you'd like access to directly edit the document.
> 
> I would like to publish the release notes in about 20 hours, when the
> release artifacts have been pushed to Maven central and the Apache download
> mirrors.
> 
> Looking forward to your feedback and comments,
> Fabian

Very good work Fabian! Thanks!

Can we add a section (or a Wiki entry, which is linked) about how to migrate 
from 0.9 DataStream code? The document only mentions keyBy, but it makes sense 
to give more information on this.

Is there a plan to do this? I can try to draft a first version with the most 
important changes if you want.

– Ufuk
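
[Editor's illustration: one concrete example of the kind of change such a
migration guide would cover is that the 0.9 DataStream API grouped streams
with groupBy, which 0.10 renamed to keyBy. A minimal Java sketch; the exact
0.10 signatures may differ slightly.]

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class KeyByMigrationExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<Tuple2<String, Integer>> counts = env.fromElements(
                    new Tuple2<>("a", 1), new Tuple2<>("b", 2), new Tuple2<>("a", 3));

            // Flink 0.9.x grouped a stream by key with groupBy:
            //     counts.groupBy(0).sum(1);

            // In 0.10.x the same grouping is expressed with keyBy:
            counts.keyBy(0).sum(1).print();

            env.execute("keyBy migration example");
        }
    }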



Re: Release notes for 0.10.0

2015-11-13 Thread Fabian Hueske
Yes, that's a good idea. I would go for a wiki page that we link to from
the announcement.
I don't think this level of detail should go directly into the release
announcement.

It would be great if you could draft a migration guide.

2015-11-13 10:27 GMT+01:00 Ufuk Celebi :

>
> > On 12 Nov 2015, at 21:25, Fabian Hueske  wrote:
> >
> > Hi everybody,
> >
> > with 0.10.0 almost being released, I started writing release notes for the
> > Flink blog.
> >
> > Please find the current draft here:
> > https://docs.google.com/document/d/1ULZAdxwneZAldhJ69tB3UEvjJQhS-ZASN5mdtumtJ48/edit?usp=sharing
> >
> > Everybody has permission to comment on the draft. Please let me know if
> > you'd like access to directly edit the document.
> >
> > I would like to publish the release notes in about 20 hours, when the
> > release artifacts have been pushed to Maven central and the Apache
> > download mirrors.
> >
> > Looking forward to your feedback and comments,
> > Fabian
>
> Very good work Fabian! Thanks!
>
> Can we add a section (or a Wiki entry, which is linked) about how to
> migrate from 0.9 DataStream code? The document only mentions keyBy, but it
> makes sense to give more information on this.
>
> Is there a plan to do this? I can try to draft a first version with the
> most important changes if you want.
>
> – Ufuk
>
>


Re: Release notes for 0.10.0

2015-11-13 Thread Maximilian Michels
Thanks Fabian for drafting the release announcement. The release
artifacts have already been synced and I've updated the website for
the 0.10.0 release.

Seems like we are set up to publish the release announcement later today.

On Fri, Nov 13, 2015 at 1:41 PM, Ufuk Celebi  wrote:
> https://cwiki.apache.org/confluence/display/FLINK/Migration+Guide%3A+0.9.x+to+0.10.x
>
>> On 13 Nov 2015, at 10:42, Fabian Hueske  wrote:
>>
>> Yes, that's a good idea. I would go for a wiki page that we link to from
>> the announcement.
>> I don't think this level of detail should go directly into the release
>> announcement.
>>
>> It would be great if you could draft a migration guide.
>>
>> 2015-11-13 10:27 GMT+01:00 Ufuk Celebi :
>>
>>>
 On 12 Nov 2015, at 21:25, Fabian Hueske  wrote:

 Hi everybody,

 with 0.10.0 almost being released, I started writing release notes for the
 Flink blog.

 Please find the current draft here:
 https://docs.google.com/document/d/1ULZAdxwneZAldhJ69tB3UEvjJQhS-ZASN5mdtumtJ48/edit?usp=sharing

 Everybody has permission to comment on the draft. Please let me know if
 you'd like access to directly edit the document.

 I would like to publish the release notes in about 20 hours, when the
 release artifacts have been pushed to Maven central and the Apache download
 mirrors.

 Looking forward to your feedback and comments,
 Fabian
>>>
>>> Very good work Fabian! Thanks!
>>>
>>> Can we add a section (or a Wiki entry, which is linked) about how to
>>> migrate from 0.9 DataStream code? The document only mentions keyBy, but it
>>> makes sense to give more information on this.
>>>
>>> Is there a plan to do this? I can try to draft a first version with the
>>> most important changes if you want.
>>>
>>> – Ufuk
>>>
>>>
>


[jira] [Created] (FLINK-3011) Cannot cancel failing/restarting streaming job from the command line

2015-11-13 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-3011:

 Summary: Cannot cancel failing/restarting streaming job from the 
command line
 Key: FLINK-3011
 URL: https://issues.apache.org/jira/browse/FLINK-3011
 Project: Flink
  Issue Type: Bug
  Components: Command-line client
Affects Versions: 0.10, 1.0
Reporter: Gyula Fora


I cannot seem to cancel a failing/restarting job from the command-line 
client. The job cannot be rescheduled, so it keeps failing:

The exception I get:
13:58:11,240 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 - Status of job 0c895d22c632de5dfe16c42a9ba818d5 (player-id) changed to 
RESTARTING.
13:58:25,234 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 - Trying to cancel job with ID 0c895d22c632de5dfe16c42a9ba818d5.
13:58:25,561 WARN  akka.remote.ReliableDeliverySupervisor   
 - Association with remote system [akka.tcp://flink@127.0.0.1:42012] has 
failed, address is now gated for [5000] ms. Reason is: [Disassociated].





Re: Canceling a failing/restarting job

2015-11-13 Thread Ufuk Celebi
Not that I am aware of. This is most probably a bug.

Looking at the code of the ExecutionGraph:

A job can only be cancelled when the job status is CREATED or RUNNING. If the 
job failed during execution, it is in state FAILED until it is RESTARTING. After 
resetting the ExecutionGraph state, the state is CREATED (now it's cancellable) 
until the job is scheduled for execution, which then fails it again.

It should work if the cancelling happens right before trying to schedule it. :D

– Ufuk
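
[Editor's illustration: a simplified Java sketch of the guard described above.
The real ExecutionGraph logic is more involved; this only shows why the cancel
request is dropped.]

    // Illustrative sketch only -- not the actual ExecutionGraph code.
    public class CancelGuardSketch {

        enum JobStatus { CREATED, RUNNING, FAILED, RESTARTING, CANCELED }

        private JobStatus status = JobStatus.RESTARTING;

        public boolean cancel() {
            // Cancellation is only accepted in CREATED or RUNNING. A job in a
            // fail/restart loop is almost always FAILED or RESTARTING, so the
            // request falls through; it can only succeed in the brief window
            // after the graph is reset to CREATED and before scheduling fails
            // it again.
            if (status == JobStatus.CREATED || status == JobStatus.RUNNING) {
                status = JobStatus.CANCELED;
                return true;
            }
            return false; // request is silently ignored -- the reported bug
        }
    }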

> On 13 Nov 2015, at 15:07, Gyula Fóra  wrote:
> 
> Hey,
> 
> Is there any other way to cancel a job besides ./bin/flink cancel jobId?
> This doesn't seem to work when a job cannot be scheduled and is retrying
> over and over again.
> 
> The exception I get:
> 
> 13:58:11,240 INFO  org.apache.flink.runtime.jobmanager.JobManager
>  - Status of job 0c895d22c632de5dfe16c42a9ba818d5 (player-id)
> changed to RESTARTING.
> 13:58:25,234 INFO  org.apache.flink.runtime.jobmanager.JobManager
>  - Trying to cancel job with ID
> 0c895d22c632de5dfe16c42a9ba818d5.
> 13:58:25,561 WARN  akka.remote.ReliableDeliverySupervisor
>  - Association with remote system
> [akka.tcp://flink@127.0.0.1:42012] has failed, address is now gated
> for [5000] ms. Reason is: [Disassociated].
> 
> 
> I will open a JIRA for this; in the meantime it would still be good to be
> able to kill it somehow.
> 
> 
> Cheers,
> 
> Gyula



Re: neo4j - Flink connector

2015-11-13 Thread Robert Metzger
Have a look at my updated version of your code:
https://github.com/rmetzger/scratch/tree/dependency_problem
It now executes both tests; however, I was not able to get the second test
to pass. It seems that Neo4j's web server is returning a 500 status code
when open()'ing the connection.
I'm not sure how to debug this issue.
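
[Editor's note: one generic way to narrow such conflicts down (a suggestion,
not something from the thread) is to print which jar the suspect class is
actually loaded from at runtime:]

    // Plain-Java dependency-conflict check, nothing Flink- or Neo4j-specific:
    // print the jar a suspect class was actually loaded from.
    public class WhichJar {
        public static void main(String[] args) throws ClassNotFoundException {
            Class<?> c = Class.forName(
                    "com.sun.jersey.core.reflection.ReflectionHelper");
            // getCodeSource() can be null for JDK bootstrap classes, but not
            // for classes loaded from a jar on the application classpath.
            System.out.println(
                    c.getProtectionDomain().getCodeSource().getLocation());
        }
    }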

On Thu, Nov 12, 2015 at 2:19 PM, Martin Junghanns 
wrote:

> Hi Robert,
>
> Thank you for the reply! At the moment we just "play" with Neo4j and Flink
> but the InputFormat shall be available in Flink eventually.
>
> Concerning the license: I did not think of that, but yes, I can make it
> available in maven central. I just need to find out how to do this.
>
> I created a branch that reproduces the dependency problem [1]. There is a
> test case "neo4jOnly" [2] which does not use Flink and works fine in a
> project where only neo4j-harness is included. However, when I add
> flink-java and flink-gelly (excluding flink-clients because of jetty) to
> the project, the neo4jOnly test fails with:
>
> org.neo4j.server.ServerStartupException: Starting Neo4j failed:
> com.sun.jersey.core.reflection.ReflectionHelper.classForNamePA(Ljava/lang/String;)Ljava/security/PrivilegedAction;
>
> I compared the dependencies of the "clean" neo4j-harness project and made
> sure the dependencies and versions are the same. ReflectionHelper is part
> of jersey-core which is included.
>
> This is really weird, because - as I wrote before - the simple neo4jOnly
> test ran a few days ago. Were there any changes concerning dependencies in
> 0.10-SNAPSHOT?
> However, the next thing which would fail is caused by the scala-library
> version conflict.
>
> Again, thanks for your help.
>
> Best,
> Martin
>
> [1] https://github.com/s1ck/flink-neo4j/tree/dependency_problem
> [2]
> https://github.com/s1ck/flink-neo4j/blob/dependency_problem/src/test/java/org/apache/flink/api/java/io/neo4j/Neo4jInputTest.java#L32
>
>
> On 12.11.2015 12:51, Robert Metzger wrote:
>
>> Sorry for the delay.
>> So the plan of this work is to add a neo4j connector into Flink, right?
>>
>> While looking at the pom files of neo4j I found that it's GPLv3-licensed,
>> and Apache projects cannot depend on/link with GPL code [1].
>> So we cannot make the module part of the Flink source.
>> However, it's actually quite easy to publish code to Maven central, so you
>> could release it on your own into Maven.
>> If that is too much work for you, I can also start a github project like
>> "flink-gpl" with access to maven central where we can release stuff like
>> this.
>>
>> Is this repository [2] your current work in progress on the dependency
>> issue?
>> Maybe the neo4j dependency expects scala 2.11 and there is no scala 2.10
>> build of it. In this case, we could require Flink users to use the scala
>> 2.11 build of Flink when they want to use neo4j.
>> I think I can help you much better as soon as I have your current pom file
>> + code.
>>
>> [1] http://www.apache.org/legal/resolved.html#category-a
>> [2] https://github.com/s1ck/flink-neo4j
>>
>>
>> On Wed, Nov 11, 2015 at 7:38 PM, Martin Junghanns <
>> m.jungha...@mailbox.org>
>> wrote:
>>
>> Hi,
>>>
>>> I am a bit stuck with that dependency problem. Any help would be
>>> appreciated as I would like to continue working on the formats. Thanks!
>>>
>>> Best,
>>> Martin
>>>
>>>
>>> On 07.11.2015 17:28, Martin Junghanns wrote:
>>>
>>> Hi Robert,

 Thank you for the hints. I tried to narrow down the error:

 Flink version: 0.10-SNAPSHOT
 Neo4j version: 2.3.0

 I start with two dependencies:
 flink-java
 flink-gelly

 (1) Add neo4j-harness and run basic example from Neo4j [1]
 Leads to:

 java.lang.ClassNotFoundException:
 org.eclipse.jetty.server.ConnectionFactory

 (2) I excluded jetty-server from flink-java and flink-gelly
 It now uses jetty-server:9.2.4.v20141103 (was 8.0.0.M1)
 Leads to:

 java.lang.NoSuchMethodError:
 org.eclipse.jetty.servlet.ServletContextHandler.

 (3) I excluded jetty-servlet from flink-java and flink-gelly
 It now uses jetty-servlet:9.2.4.v20141103 (was 8.0.0.M1)
 Leads to:

 java.lang.NoSuchMethodError: scala.Predef$.$conforms()

 (4) I excluded scala-library from flink-java and flink-gelly
 It now uses scala-library:2.11.7 (was 2.10.4)

 Now, the basic Neo4j example (without Flink) runs.

 Next, I added Flink to the mix and wrote a simple test using
 neo4j-harness features, ExecutionEnvironment and my InputFormat.
 Leads to:

 java.lang.NoSuchMethodError:
 scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;

   at akka.actor.ActorCell$.<init>(ActorCell.scala:336)
   at akka.actor.ActorCell$.<clinit>(ActorCell.scala)
   at akka.actor.RootActorPath.$div(ActorPath.scala:159)
   at
 
