Re: Improve the documentation of the Flink Architecture and internals

2015-03-16 Thread Aljoscha Krettek
Why do you want to split stuff between the docs in the repository and
the wiki? I for one would always be too lazy to check stuff in a wiki
when there is also regular documentation. Plus, this would lead to
additional overhead in deciding what goes where and syncing between
the two places for documentation.

On Mon, Mar 16, 2015 at 7:59 PM, Stephan Ewen se...@apache.org wrote:
 Ah, I totally forgot to add to the internals:

   - Fault tolerance in batch mode

   - Fault tolerance in streaming mode, with state handling

 On Mon, Mar 16, 2015 at 7:51 PM, Stephan Ewen se...@apache.org wrote:

 Hi all!

 I would like to kick off an effort to improve the documentation of the
 Flink architecture and internals. This also means making the streaming
 architecture more prominent in the docs.

 Being quite a sophisticated stack, we need to improve the presentation of
 how Flink works - to the extent necessary to use Flink (and to appreciate
 all the cool stuff that is happening). This should also come in handy for
 new contributors.

 As a general umbrella, we need to first decide where and how to organize
 the documentation.

 I would propose to put the bulk of the documentation into the Wiki: create
 a dedicated section on Flink internals with sub-pages for each component /
 topic. To the docs, we add a general overview from which we link into the
 Wiki.


  == These sections would go into the DOCS in the git repository ==

   - Overview of a program: pre-flight phase (type extraction, optimizer),
 JobManager, TaskManager. Differences between streaming and batch. We can
 convey this through one very nice picture with a few lines of text.

   - High-level architecture stack, different program representations (API
 operators, common API DAG, optimizer DAG, parallel data flow (JobGraph /
 ExecutionGraph)) - see the sketch below.

   - (maybe) Parallelism and scheduling. This seems paramount for users to
 understand.

   - Processes (JobManager, TaskManager, Webserver, WebClient, CLI client)
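
 As a small illustration for that section: a program can be carried through
 these representations without ever executing. A minimal sketch (assuming
 the standard Java DataSet API; the sink path is a placeholder) that only
 triggers the pre-flight phase and prints the optimized plan as JSON:

     import org.apache.flink.api.common.functions.MapFunction;
     import org.apache.flink.api.java.DataSet;
     import org.apache.flink.api.java.ExecutionEnvironment;

     public class PreFlightDemo {
         public static void main(String[] args) throws Exception {
             ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

             // Build a tiny data flow; nothing runs yet (pre-flight only).
             DataSet<Long> doubled = env.generateSequence(1, 100)
                     .map(new MapFunction<Long, Long>() {
                         @Override
                         public Long map(Long value) { return 2 * value; }
                     });
             doubled.writeAsText("/tmp/preflight-demo"); // hypothetical sink path

             // Type extraction and optimization happen here; the JSON plan
             // shows the optimizer DAG that would become the JobGraph.
             System.out.println(env.getExecutionPlan());
         }
     }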



  == These sections would go into the WIKI ==

   - Project structure (maven projects, what is where, dependencies between
 projects)

   - Component overview

 - JobManager (InstanceManager, Scheduler, BLOB server, Library Cache,
 Archiving)

 - TaskManager (MemoryManager, IOManager, BLOB Cache, Library Cache)

 - Involved Actor Systems / Actors / Messages

   - Details about submitting a job (library upload, job graph submission,
 execution graph setup, scheduling trigger)

   - Memory Management

   - Optimizer internals

   - Akka Setup specifics

   - Netty and pluggable data exchange strategies

   - Testing: Flink test clusters and unit test utilities

   - Developer How-To: Setting up Eclipse, IntelliJ, Travis

   - Step-by-step guide to add a new operator


 I will go ahead and stub some sections in the Wiki.

 As we discuss and agree/disagree with the outline, we can evolve the Wiki.

 Greetings,
 Stephan




Introduction of new Gelly developer

2015-03-16 Thread Yi ZHOU

Hello, everyone,

I am ZHOU Yi, a student working on graph algorithms, especially
combinatorial graph algorithms. I am glad to join the Flink development group.


I am working on the Gelly graph library, hoping to contribute improvements
and new features (an algorithms library, graph partitioning improvements,
etc.; see https://github.com/project-flink/flink-graph/issues/20). For now,
I plan to start with the affinity propagation example, FLINK-1707
https://issues.apache.org/jira/browse/FLINK-1707.


My Apache account is joey00! (a little bizarre ;-) ), and my
GitHub account is joey001. I have joined several lab group projects
(AUTOSAR, TTCN-3), but I don't have much open source experience, especially
with a large project like Flink. Your suggestions and comments are welcome.


Best Regards.
ZHOU Yi


Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-16 Thread Maximilian Michels
+1 for enforcing a stricter Java code style. However, let's not
introduce a line length of 100 like in Scala. I think that hurts the
readability of the code.

On Sat, Mar 14, 2015 at 4:41 PM, Ufuk Celebi u...@apache.org wrote:

 On Saturday, March 14, 2015, Aljoscha Krettek aljos...@apache.org wrote:

  I'm in favor of strict coding styles. And I like the Google style.


 +1 I would like that. We essentially all agree that we want more
 homogeneity, and I think strict rules are the only way to go. Since this
 is a very subjective matter, it makes sense to go with something
 (somewhat) well established like the Google code style.



Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-16 Thread Alexander Alexandrov
+1 for not limiting the line length.

2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org:

 +1 for not limiting the line length. Everyone should have a good sense of
 where to break lines. When people violate this in exceptional cases, it is
 usually for a good reason.




Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-16 Thread Hermann Gábor
+1 for stricter Java code styles.

We should not forget about providing code formatter settings for Eclipse
and IntelliJ IDEA (as mentioned above).
That would help a lot.

(Of course, if we use the Google Code Style, they already provide such files:
https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml
)




Re: [DISCUSS] Name of Expression API and DataSet abstraction

2015-03-16 Thread Márton Balassi
+1 for Max's suggestion.

On Mon, Mar 16, 2015 at 10:32 AM, Ufuk Celebi u...@apache.org wrote:

 On Fri, Mar 13, 2015 at 6:08 PM, Maximilian Michels m...@apache.org
 wrote:

 
  Thanks for starting the discussion. We should definitely not keep
  flink-expressions.
 
  I'm in favor of DataTable for the DataSet abstraction equivalent. For
  consistency, the package name should then be flink-table. At first
  sight, the name seems kind of plain, but I think it is quite intuitive
  because the API enables you to work in a SQL-like fashion.
 


 +1

 I think this is a very good suggestion. :-)

 (There is an associated issue we shouldn't forget to close:
 https://issues.apache.org/jira/browse/FLINK-1623)



[jira] [Created] (FLINK-1706) Add disk spilling to BarrierBuffer

2015-03-16 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1706:
-

 Summary: Add disk spilling to BarrierBuffer
 Key: FLINK-1706
 URL: https://issues.apache.org/jira/browse/FLINK-1706
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, Streaming
Reporter: Gyula Fora
Assignee: Gyula Fora
 Fix For: 0.9


The barrier buffers, used by streaming fault tolerance to sync on superstep
barriers, currently don't spill to disk, which can cause out-of-memory
exceptions when some input task is delayed.
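
A minimal sketch of the intended behavior (an illustration only, not the
actual BarrierBuffer code; the memory threshold is an assumed value):
records arriving behind the current barrier are held in memory up to a
budget and appended to a spill file beyond it.

    import java.io.DataOutputStream;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.ArrayDeque;
    import java.util.Deque;

    public class SpillingBarrierBufferSketch {

        // Hypothetical in-memory budget before spilling kicks in.
        private static final int MAX_IN_MEMORY_BYTES = 64 * 1024 * 1024;

        private final Deque<byte[]> inMemory = new ArrayDeque<byte[]>();
        private int bufferedBytes;
        private DataOutputStream spillStream; // created lazily on first spill

        /** Buffers a serialized record that arrived after the current barrier. */
        public void add(byte[] serializedRecord) throws IOException {
            if (bufferedBytes + serializedRecord.length <= MAX_IN_MEMORY_BYTES) {
                inMemory.add(serializedRecord);
                bufferedBytes += serializedRecord.length;
            } else {
                // Memory budget exhausted: append a length-prefixed record to
                // disk instead of failing with an OutOfMemoryError.
                if (spillStream == null) {
                    File spillFile = File.createTempFile("barrier-spill", ".bin");
                    spillStream = new DataOutputStream(new FileOutputStream(spillFile));
                }
                spillStream.writeInt(serializedRecord.length);
                spillStream.write(serializedRecord);
            }
        }
    }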





Re: [DISCUSS] Name of Expression API and DataSet abstraction

2015-03-16 Thread Hermann Gábor
+1 for DataTable

On Mon, Mar 16, 2015 at 4:11 PM Till Rohrmann trohrm...@apache.org wrote:

 +1 for DataTable




[jira] [Created] (FLINK-1708) KafkaSink not working from Usercode classloader

2015-03-16 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1708:
-

 Summary: KafkaSink not working from Usercode classloader
 Key: FLINK-1708
 URL: https://issues.apache.org/jira/browse/FLINK-1708
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger


The KafkaSink cannot be used from a user jar.

I have a fix ready.
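
For illustration, the usual pattern behind this kind of fix (an assumption,
not the actual patch): clients such as the Kafka producer resolve classes
through the thread context classloader, which inside a task must be the
user-code classloader. A self-contained sketch with hypothetical names:

    import java.util.concurrent.Callable;

    public final class UserCodeContext {

        /**
         * Runs an action (e.g. creating a Kafka producer) with the user-code
         * classloader as the thread context classloader, restoring the
         * previous one afterwards.
         */
        public static <T> T withUserClassLoader(ClassLoader userCodeClassLoader,
                                                Callable<T> action) throws Exception {
            ClassLoader previous = Thread.currentThread().getContextClassLoader();
            try {
                Thread.currentThread().setContextClassLoader(userCodeClassLoader);
                return action.call();
            } finally {
                Thread.currentThread().setContextClassLoader(previous);
            }
        }
    }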





Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-16 Thread Till Rohrmann
Do we already enforce the official Scala style guide strictly?

On Mon, Mar 16, 2015 at 4:57 PM, Aljoscha Krettek aljos...@apache.org
wrote:

 I'm already sticking to the official Scala style guide, with the
 exception of the 100-character line length.
 On Mar 16, 2015 3:27 PM, Till Rohrmann trohrm...@apache.org wrote:

  +1 for stricter Java code styles. I haven't looked into the Google Code
  Style, but maybe we make it easier for new contributors if we apply a
  coding style which is already widely known.
 
  +1 for a line length of 100 for Scala code. I think it makes code review
  on GitHub easier.
 
  For the Scala style, we could stick to the official style guidelines [1].
 
  [1] http://docs.scala-lang.org/style/
 



[jira] [Created] (FLINK-1709) Add support for programs with higher-than-slot parallelism

2015-03-16 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1709:
--

 Summary: Add support for programs with higher-than-slot parallelism
 Key: FLINK-1709
 URL: https://issues.apache.org/jira/browse/FLINK-1709
 Project: Flink
  Issue Type: Improvement
Affects Versions: master
Reporter: Ufuk Celebi


Currently, we can't run programs with higher parallelism than the number of
available slots.

For example, if we currently have a map-reduce program and 4 task slots
configured (e.g. 2 task managers with 2 slots per task manager), the map and
reduce tasks will be scheduled with pipelined results and the same parallelism
in shared slots. Setting the parallelism higher than the available slots will
result in a NoResourcesAvailableException.

As a first step towards supporting these kinds of programs, we can add initial
support for this when running in batch mode (after
https://github.com/apache/flink/pull/471 is merged).

This is easier than the original pipelined scenario, because the map tasks can
be deployed one after another to produce the blocking result. The blocking
result can then be consumed after all map tasks have produced their results.
The mechanism in #471 to deploy result receivers can be used for this and
should not need any modifications.
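
For illustration, a minimal job that hits the exception under the setup
above (2 task managers x 2 slots; the sink path is a placeholder, and
setParallelism() is assumed as the current name of the former
setDegreeOfParallelism()):

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class OverParallelJob {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // With 2 task managers x 2 slots = 4 slots, requesting parallelism
            // 8 currently fails at scheduling time with a
            // NoResourcesAvailableException.
            env.setParallelism(8);

            DataSet<Long> doubled = env.generateSequence(1, 1000)
                    .map(new MapFunction<Long, Long>() {
                        @Override
                        public Long map(Long value) { return 2 * value; }
                    });
            doubled.writeAsText("/tmp/over-parallel-out"); // hypothetical path

            env.execute("higher-than-slot parallelism example");
        }
    }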





Re: Could not build up connection to JobManager

2015-03-16 Thread Till Rohrmann
It is really strange. It's right that the CliFrontend now resolves
localhost to the correct local address 10.218.100.122. Moreover, according
to the logs, the JobManager is also started and binds to
akka.tcp://flink@10.218.100.122:6123. According to the logs, this is also
the address the CliFrontend uses to connect to the JobManager. If the
timestamps are correct, then the JobManager was still alive when the job
was sent. I don't really understand why this happens. Can it be that the
CliFrontend, which binds to 127.0.0.1, cannot communicate with
10.218.100.122? Can it be that you have some settings which prevent this?
For the failing 127.0.0.1 case, it would be helpful to have access to the
JobManager log.

I've updated the branch
https://github.com/tillrohrmann/flink/tree/fixJobClient with a new fix for
the localhost scenario. Could you try it out again? Thanks a lot for your
help.

Best regards,

Till
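
To narrow down how the machine resolves localhost, a small standalone
diagnostic (not part of Flink) like the following shows whether an IPv6 or
an unexpected IPv4 address comes back first:

    import java.net.InetAddress;

    public class ResolveCheck {
        public static void main(String[] args) throws Exception {
            // What "localhost" resolves to (the first candidate wins for most callers).
            System.out.println("localhost  -> " + InetAddress.getByName("localhost"));

            // The address of the local host name (what a server may bind to).
            System.out.println("local host -> " + InetAddress.getLocalHost());

            // All candidates, showing IPv4 vs. IPv6 ordering.
            for (InetAddress a : InetAddress.getAllByName("localhost")) {
                System.out.println("candidate  -> " + a);
            }
        }
    }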


Re: [DISCUSS] Name of Expression API and DataSet abstraction

2015-03-16 Thread Fabian Hueske
I am also more in favor of Rel and Relation, but DataTable nicely follows
the terms DataSet and DataStream.
 On Mar 16, 2015 4:58 PM, Aljoscha Krettek aljos...@apache.org wrote:

 I like Relation, or Rel, which is shorter.
 On Mar 16, 2015 4:52 PM, Hermann Gábor reckone...@gmail.com wrote:

  +1 for DataTable
 



Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-16 Thread Aljoscha Krettek
I'm already sticking to the official Scala style guide, with the
exception of the 100-character line length.



Re: Could not build up connection to JobManager

2015-03-16 Thread Till Rohrmann
Could you please upload the logs? They would be really helpful.

On Mon, Mar 16, 2015 at 6:11 PM, Dulaj Viduranga vidura...@icloud.com
wrote:

 Hi,
 I tested the update, but it's still the same. I don't think it is a problem
 with my system because I have an XAMPP server working totally fine (I
 tried with it shut down as well), and I also double-checked the hosts file.
 I had Little Snitch installed, but I also tried uninstalling it.
 Isn't there a way around this without using DNS to resolve localhost?

 

Re: Improve the documentation of the Flink Architecture and internals

2015-03-16 Thread Stephan Ewen
I have put my suggested version of an outline for the docs into the wiki.
Regardless of where the docs end up (wiki or repository), we can use the wiki
to outline the docs.

https://cwiki.apache.org/confluence/display/FLINK/Flink+Internals

Some pages contain a stub or outline, others are completely blank.

This is not a complete list. Additions are welcome.



Re: Improve the documentation of the Flink Architecture and internals

2015-03-16 Thread Stephan Ewen
I think the wiki has a much lower barrier of entry for fixing docs, especially
for external people. The docs, with the Jekyll setup, are rather tricky to
work with. I would very much like all kinds of people to contribute to the
docs about the internals, not just the usual three suspects who have done
this so far.

Having a good landing page in the regular docs is precisely how we avoid
losing the people who do not look into a wiki. The overview pages for the
internals need to be good and accessible, and should link nicely into the
wiki to forward people there.

The overhead of deciding what goes where should not be terribly large, in
my opinion, since there is no really wrong place to put anything.






Re: Could not build up connection to JobManager

2015-03-16 Thread Ufuk Celebi
There was an issue for this:
https://issues.apache.org/jira/browse/FLINK-1634

Can we close it then?

On Sat, Mar 14, 2015 at 9:16 PM, Dulaj Viduranga vidura...@icloud.com
wrote:

 Hey Stephan,
 Great to know you could fix the issue. Thank you for the update.
 Best regards.

  On Mar 14, 2015, at 9:19 PM, Stephan Ewen se...@apache.org wrote:
 
  Hey Dulaj!
 
  Forget what I said in the previous email. The issue with the wrong
  address binding seems to be solved now. There is another issue: the
  embedded taskmanager does not start properly, for whatever reason. My
  gut feeling is that there is something wrong.
 
  There is a patch pending that changes the startup behavior to make
  debugging these situations much easier. I'll ping you as soon as that
  is in...
 
 
  Stephan
 
  On Sat, Mar 14, 2015 at 4:42 PM, Stephan Ewen se...@apache.org wrote:
 
  Hey Dulaj!
 
  One thing you can try is to add the option
  -Djava.net.preferIPv4Stack=true to the JVM startup options (in the
  scripts in the bin folder) and see if that helps.
 
  Stephan
 
 
  On Sat, Mar 14, 2015 at 4:29 AM, Dulaj Viduranga vidura...@icloud.com
  wrote:
 
  Hi,
  Still no luck. I'll upload the logs with the configuration "localhost"
  as well as "127.0.0.1" so you can take a look.
 
  127.0.0.1
  flink-Vidura-flink-client-localhost.log 
 
 https://gist.github.com/viduranga/1d01149eee238158519e#file-flink-vidura-flink-client-localhost-log
 
 
  localhost
  flink-Vidura-flink-client-localhost.log 
 
 https://gist.github.com/viduranga/d866c24c0ba566abab17#file-flink-vidura-flink-client-localhost-log
 
  flink-Vidura-jobmanager-localhost.log 
 
 https://gist.github.com/viduranga/e7549ef818c6a2af73e9#file-flink-vidura-jobmanager-localhost-log
 
 
  On Mar 11, 2015, at 11:32 PM, Till Rohrmann trohrm...@apache.org
  wrote:
 
  Hi Dulaj,
 
  Sorry for my late response. It looks as if the JobClient tries to
  connect to the JobManager using its IPv6 address instead of IPv4. Akka
  is really picky when it comes to remote addresses. If Akka binds to the
  FQDN, then other ActorSystems which try to connect to it using its IP
  address won't be successful. I assume that this might be the problem. I
  tried to fix it; you can find the fix here [1]. Could you please try it
  out by starting a local cluster with the start-local.sh script? If it
  fails, could you please send me all log files (client, jobmanager and
  taskmanager)? Once we figure out why the JobClient does not connect, we
  can try to tackle the BlobServer issue.
 
  Cheers,
 
  Till
 
  [1] https://github.com/tillrohrmann/flink/tree/fixJobClient
 
  On Thu, Mar 5, 2015 at 4:40 PM, Dulaj Viduranga vidura...@icloud.com
 
  wrote:
 
  Hi,
  The error message is,
 
  21:06:01,521 WARN  org.apache.hadoop.util.NativeCodeLoader
    - Unable to load native-hadoop library for your platform... using
  builtin-java classes where applicable
  org.apache.flink.client.program.ProgramInvocationException: Could not
  build up connection to JobManager.
      at org.apache.flink.client.program.Client.run(Client.java:327)
      at org.apache.flink.client.program.Client.run(Client.java:306)
      at org.apache.flink.client.program.Client.run(Client.java:300)
      at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:55)
      at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:82)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:483)
      at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
      at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
      at org.apache.flink.client.program.Client.run(Client.java:250)
      at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:371)
      at org.apache.flink.client.CliFrontend.run(CliFrontend.java:344)
      at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1087)
      at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1114)
  Caused by: java.io.IOException: JobManager at
  akka.tcp://flink@fe80:0:0:0:742b:7f78:fab5:68e2%11:6123/user/jobmanager
  not reachable. Please make sure that the JobManager is running and its
  port is reachable.
      at org.apache.flink.runtime.jobmanager.JobManager$.getJobManagerRemoteReference(JobManager.scala:957)
      at org.apache.flink.runtime.client.JobClient$.createJobClient(JobClient.scala:151)
      at org.apache.flink.runtime.client.JobClient$.createJobClientFromConfig(JobClient.scala:142)
      at

Re: [DISCUSS] Offer Flink with Scala 2.11

2015-03-16 Thread Alexander Alexandrov
Bringing this thread back up. The PR build passes, and we need some sort of
strategy (add a suffix to artifacts or force a manual build for 2.11) before
we merge.

2015-03-11 19:15 GMT+01:00 Robert Metzger rmetz...@apache.org:

 Thank you for opening the pull request. It looks great so far.

 One question came up while discussing Alex's changes in the PR: to make
 this change really usable, we need to provide our users with different
 poms for each artifact, containing either Scala 2.10 or 2.11.
 It's basically the same thing we did for Hadoop's dependencies.
 We would then end up with the versions 0.9.0 and 0.9.0-hadoop1
 and artifacts like flink-core and flink-core_2.11.
 So each module would publish 4 artifacts to Maven central.

 Another option would be to keep everything as it is right now; users
 would then need to set the properties correctly when building their Flink
 projects.
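
 A rough sketch of the profile-based variant (property and profile names
 are assumptions for illustration, not the actual PR):

     <!-- Default Scala version in the parent pom, overridable below. -->
     <properties>
         <scala.version>2.10.4</scala.version>
         <scala.binary.version>2.10</scala.binary.version>
     </properties>

     <profiles>
         <!-- Activate with: mvn clean install -Pscala-2.11 -->
         <profile>
             <id>scala-2.11</id>
             <properties>
                 <scala.version>2.11.4</scala.version>
                 <scala.binary.version>2.11</scala.binary.version>
             </properties>
         </profile>
     </profiles>

     <!-- Scala-dependent modules would then carry the suffix in their
          artifactIds, e.g. flink-scala_${scala.binary.version}, yielding
          the _2.10 and _2.11 artifacts mentioned above. -->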




 On Wed, Mar 11, 2015 at 12:41 AM, Alexander Alexandrov 
 alexander.s.alexand...@gmail.com wrote:

  The PR is here: https://github.com/apache/flink/pull/477
 
  Cheers!
 
  2015-03-10 18:07 GMT+01:00 Alexander Alexandrov 
  alexander.s.alexand...@gmail.com:
 
   Yes, will do.
  
   2015-03-10 16:39 GMT+01:00 Robert Metzger rmetz...@apache.org:
  
    Very nice work.
    The changes are probably easy to merge. Except for the version
    properties in the parent pom, there should not be any bigger issues.
   
    Can you also add additional build profiles to Travis for Scala 2.11?
  
   On Tue, Mar 10, 2015 at 2:50 PM, Alexander Alexandrov 
   alexander.s.alexand...@gmail.com wrote:
  
 We have it almost ready here:

 https://github.com/stratosphere/flink/commits/scala_2.11_rebased

 I wanted to open a PR today.
   
2015-03-10 11:28 GMT+01:00 Robert Metzger rmetz...@apache.org:
   
  Hey Alex,

  I don't know the exact status of the Scala 2.11 integration, but I wanted
  to point you to https://github.com/apache/flink/pull/454, which is
  changing a huge portion of our Maven build infrastructure.
  If you haven't started yet, it might make sense to base your integration
  on that pull request.

  Otherwise, let me know if you have trouble rebasing your changes.

 On Mon, Mar 2, 2015 at 9:13 PM, Chiwan Park chiwanp...@icloud.com wrote:

  +1 for Scala 2.11
 
  Regards.
  Chiwan Park (Sent with iPhone)
 
 
   On Mar 3, 2015, at 2:43 AM, Robert Metzger rmetz...@apache.org wrote:

   I'm +1 if this doesn't affect existing Scala 2.10 users.

   I would also suggest adding a Scala 2.11 build to Travis as well, to
   ensure everything is working with the different Hadoop/JVM versions.
   It shouldn't be a big deal to offer scala_version x hadoop_version
   builds for newer releases.
   You only need to add more builds here:

   https://github.com/apache/flink/blob/master/tools/create_release_files.sh#L131
  
  
  
   On Mon, Mar 2, 2015 at 6:17 PM, Till Rohrmann trohrm...@apache.org wrote:

   +1 for Scala 2.11
  
   On Mon, Mar 2, 2015 at 5:02 PM, Alexander Alexandrov 
   alexander.s.alexand...@gmail.com wrote:

   Spark currently only provides pre-builds for 2.10 and requires a
   custom build for 2.11.

   Not sure whether this is the best idea, but I can see the benefits
   from a project management point of view...

   Would you prefer to have a {scala_version} × {hadoop_version} matrix
   integrated on the website?
  
   2015-03-02 16:57 GMT+01:00 Aljoscha Krettek aljos...@apache.org:

   +1 I also like it. We just have to figure out how we can publish two
   sets of release artifacts.
  
   On Mon, Mar 2, 2015 at 4:48 PM, Stephan Ewen se...@apache.org wrote:

   Big +1 from my side!

   Does it have to be a Maven profile, or does a Maven property work?
   (A profile may be needed for the quasiquotes dependency?)
  
   On Mon, Mar 2, 2015 at 4:36 PM, Alexander Alexandrov 
   alexander.s.alexand...@gmail.com wrote:

   Hi there,

   Since I'm relying on Scala 2.11.4 in a project I've been working on, I
   created a branch which updates the Scala version used by Flink from
   2.10.4 to 2.11.4:

   https://github.com/stratosphere/flink/commits/scala_2.11

   Everything seems to work fine, and the PR contains minor changes
   compared to Spark:

   https://issues.apache.org/jira/browse/SPARK-4466

   If you're interested, I can rewrite this as a Maven profile and open a
   PR so people can build Flink with 2.11 support.

   I suggest doing this sooner rather than later in order to