[jira] [Commented] (FLINK-1436) Command-line interface verbose option (-v)

2015-01-22 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287509#comment-14287509
 ] 

Robert Metzger commented on FLINK-1436:
---

I totally agree with you.

Also, it would be nice if the CLI tool reported what it received as an action. In
your case, for example: invalid action: -v
I had a similar issue with the jar file: the only error message I got was 
"Invalid jar file" or so. It would have been nicer if the output were 
"org.apache.flink.Wordcount is not a valid file location".

In addition, I don't see a reason why there is a -v option at all. 
I'm always using -v because I want to see the exceptions which happen. 
So I vote to a) improve the error messages,
b) remove the verbose flag and make the output always verbose, and
c) not show the full help text on errors.
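A rough sketch of the kind of action check this amounts to (hypothetical class and message wording, not Flink's actual CliFrontend code): validate the first token, and echo it back verbatim when it is not a known action.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Flink's actual CliFrontend: report the offending
// token in a short message instead of dumping the full help text.
class CliErrorSketch {
    static final List<String> ACTIONS = Arrays.asList("run", "info", "list", "cancel");

    // Returns a short, specific error message, or "OK" if the action is valid.
    static String checkAction(String[] args) {
        if (args.length == 0) {
            return "Please specify an action. Valid actions: " + ACTIONS;
        }
        if (!ACTIONS.contains(args[0])) {
            // Echo exactly what was received, as suggested above.
            return "Invalid action: '" + args[0] + "'. Valid actions: " + ACTIONS;
        }
        return "OK";
    }

    public static void main(String[] args) {
        // prints: Invalid action: '-v'. Valid actions: [run, info, list, cancel]
        System.out.println(checkAction(new String[]{"-v", "run"}));
    }
}
```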

 Command-line interface verbose option (-v)
 --

 Key: FLINK-1436
 URL: https://issues.apache.org/jira/browse/FLINK-1436
 Project: Flink
  Issue Type: Improvement
  Components: Start-Stop Scripts
Reporter: Max Michels
Priority: Trivial
  Labels: usability

 Let me run just a basic Flink job and add the verbose flag. It's a general 
 option, so let me add it as a first parameter:
  ./flink -v run ../examples/flink-java-examples-0.8.0-WordCount.jar 
  hdfs:///input hdfs:///output9
 Invalid action!
 ./flink <ACTION> [GENERAL_OPTIONS] [ARGUMENTS]
   general options:
  -h,--help      Show the help for the CLI Frontend.
  -v,--verbose   Print more detailed error messages.
 Action "run" compiles and runs a program.
   Syntax: run [OPTIONS] <jar-file> <arguments>
   "run" action arguments:
  -c,--class <classname>           Class with the program entry point (main
                                   method or getPlan() method). Only needed
                                   if the JAR file does not specify the class
                                   in its manifest.
  -m,--jobmanager <host:port>      Address of the JobManager (master) to
                                   which to connect. Use this flag to connect
                                   to a different JobManager than the one
                                   specified in the configuration.
  -p,--parallelism <parallelism>   The parallelism with which to run the
                                   program. Optional flag to override the
                                   default value specified in the
                                   configuration.
 Action "info" displays information about a program.
   "info" action arguments:
  -c,--class <classname>           Class with the program entry point (main
                                   method or getPlan() method). Only needed
                                   if the JAR file does not specify the class
                                   in its manifest.
  -e,--executionplan               Show optimized execution plan of the
                                   program (JSON)
  -m,--jobmanager <host:port>      Address of the JobManager (master) to
                                   which to connect. Use this flag to connect
                                   to a different JobManager than the one
                                   specified in the configuration.
  -p,--parallelism <parallelism>   The parallelism with which to run the
                                   program. Optional flag to override the
                                   default value specified in the
                                   configuration.
 Action "list" lists running and finished programs.
   "list" action arguments:
  -m,--jobmanager <host:port>   Address of the JobManager (master) to which
                                to connect. Use this flag to connect to a
                                different JobManager than the one specified
                                in the configuration.
  -r,--running                  Show running programs and their JobIDs
  -s,--scheduled                Show scheduled programs and their JobIDs
 Action "cancel" cancels a running program.
   "cancel" action arguments:
  -i,--jobid <jobID>            JobID of program to cancel
  -m,--jobmanager <host:port>   Address of the JobManager (master) to which
                                to connect. Use this flag to connect to a
                                different JobManager than the one specified
                                in the configuration.
 What just happened? This produces a lot of output of the kind usually 
 generated by the --help option of command-line tools. If your terminal 
 window is large enough, you will see a tiny message: Please specify an 
 action. I did 

[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-22 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287404#comment-14287404
 ] 

Robert Metzger commented on FLINK-629:
--

Maybe you can even cherry-pick the commit into the release-0.8 branch.

 Add support for null values to the java api
 ---

 Key: FLINK-629
 URL: https://issues.apache.org/jira/browse/FLINK-629
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Stephan Ewen
Assignee: Gyula Fora
Priority: Critical
  Labels: github-import
 Fix For: pre-apache

 Attachments: Selection_006.png, SimpleTweetInputFormat.java, 
 Tweet.java, model.tar.gz


 Currently, many runtime operations fail when encountering a null value. Tuple 
 serialization should allow null fields.
 I suggest adding a method to the tuples called `getFieldNotNull()` which 
 throws a meaningful exception when the accessed field is null. That way, we 
 simplify the logic of operators that should not deal with null fields, like 
 key grouping or aggregations.
 Even though SQL allows grouping and aggregating null values, I suggest 
 excluding this from the Java API, because the SQL semantics of aggregating 
 null fields are messy.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/629
 Created by: [StephanEwen|https://github.com/StephanEwen]
 Labels: enhancement, java api, 
 Milestone: Release 0.5.1
 Created at: Wed Mar 26 00:27:49 CET 2014
 State: open
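The `getFieldNotNull()` proposal above could look roughly like this (a self-contained illustration built around a simplified stand-in tuple class; everything except the method name `getFieldNotNull()` is hypothetical, not Flink's actual API):

```java
// Simplified stand-in for a Flink tuple; only the getFieldNotNull() idea
// mirrors the proposal, the rest is illustration-only.
class NullFieldException extends RuntimeException {
    NullFieldException(int pos) {
        super("Field at position " + pos + " is null, but null fields are not supported here.");
    }
}

class Tuple2Sketch<T0, T1> {
    T0 f0;
    T1 f1;

    Tuple2Sketch(T0 f0, T1 f1) {
        this.f0 = f0;
        this.f1 = f1;
    }

    Object getField(int pos) {
        switch (pos) {
            case 0: return f0;
            case 1: return f1;
            default: throw new IndexOutOfBoundsException(String.valueOf(pos));
        }
    }

    // The proposed accessor: like getField(), but fails with a meaningful
    // exception instead of letting a null slip into key grouping or aggregation.
    Object getFieldNotNull(int pos) {
        Object field = getField(pos);
        if (field == null) {
            throw new NullFieldException(pos);
        }
        return field;
    }
}

class Tuple2SketchDemo {
    public static void main(String[] args) {
        Tuple2Sketch<String, Integer> t = new Tuple2Sketch<>("key", null);
        System.out.println(t.getFieldNotNull(0)); // prints: key
        try {
            t.getFieldNotNull(1);
        } catch (NullFieldException e) {
            System.out.println(e.getMessage());
        }
    }
}
```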



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-22 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287403#comment-14287403
 ] 

Robert Metzger commented on FLINK-629:
--

I think you can very easily backport the fix to Flink 0.8.0.

 Add support for null values to the java api
 ---

 Key: FLINK-629
 URL: https://issues.apache.org/jira/browse/FLINK-629
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Stephan Ewen
Assignee: Gyula Fora
Priority: Critical
  Labels: github-import
 Fix For: pre-apache

 Attachments: Selection_006.png, SimpleTweetInputFormat.java, 
 Tweet.java, model.tar.gz


 Currently, many runtime operations fail when encountering a null value. Tuple 
 serialization should allow null fields.
 I suggest adding a method to the tuples called `getFieldNotNull()` which 
 throws a meaningful exception when the accessed field is null. That way, we 
 simplify the logic of operators that should not deal with null fields, like 
 key grouping or aggregations.
 Even though SQL allows grouping and aggregating null values, I suggest 
 excluding this from the Java API, because the SQL semantics of aggregating 
 null fields are messy.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/629
 Created by: [StephanEwen|https://github.com/StephanEwen]
 Labels: enhancement, java api, 
 Milestone: Release 0.5.1
 Created at: Wed Mar 26 00:27:49 CET 2014
 State: open





[jira] [Resolved] (FLINK-1385) Add option to YARN client to disable resource availability checks

2015-01-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1385.
---
   Resolution: Fixed
Fix Version/s: 0.9

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/8f8efe2f

 Add option to YARN client to disable resource availability checks
 -

 Key: FLINK-1385
 URL: https://issues.apache.org/jira/browse/FLINK-1385
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.8, 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


 The YARN client checks which resources are available in the cluster at the 
 time of the initial deployment.
 If somebody wants to deploy a Flink YARN session to a full YARN cluster, the 
 Flink Client will reject the session.
 I'm going to add an option which disables the check. The YARN session will 
 then allocate new containers as they become available in the cluster.





[jira] [Resolved] (FLINK-883) Use MiniYARNCluster to test the Flink YARN client

2015-01-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-883.
--
   Resolution: Fixed
Fix Version/s: 0.9

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/8f8efe2f


 Use MiniYARNCluster to test the Flink YARN client
 -

 Key: FLINK-883
 URL: https://issues.apache.org/jira/browse/FLINK-883
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Affects Versions: 0.7.0-incubating
Reporter: Robert Metzger
Assignee: Sebastian Kunert
  Labels: github-import
 Fix For: 0.9


 I would like to have a test that verifies that our YARN client is properly 
 working, using the `MiniYARNCluster` class of YARN.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/883
 Created by: [rmetzger|https://github.com/rmetzger]
 Labels: enhancement, testing, YARN, 
 Created at: Thu May 29 09:13:06 CEST 2014
 State: open





[jira] [Resolved] (FLINK-1295) Add option to Flink client to start a YARN session per job

2015-01-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1295.
---
   Resolution: Fixed
Fix Version/s: 0.9

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/8f8efe2f

 Add option to Flink client to start a YARN session per job
 --

 Key: FLINK-1295
 URL: https://issues.apache.org/jira/browse/FLINK-1295
 Project: Flink
  Issue Type: New Feature
  Components: YARN Client
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


 Currently, Flink users can only launch Flink on YARN as a YARN session 
 (meaning a long-running YARN application that can run multiple Flink jobs).
 Users have requested that the Flink Client be extended to allocate YARN 
 containers for executing only a single job.
 As part of this pull request, I would suggest refactoring the YARN Client to 
 make it more modular and object-oriented.





[jira] [Updated] (FLINK-1424) bin/flink run does not recognize -c parameter anymore

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1424:
--
Assignee: Max Michels

 bin/flink run does not recognize -c parameter anymore
 -

 Key: FLINK-1424
 URL: https://issues.apache.org/jira/browse/FLINK-1424
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: master
Reporter: Carsten Brandt
Assignee: Max Michels

 bin/flink binary does not recognize `-c` parameter anymore which specifies 
 the class to run:
 {noformat}
 $ ./flink run /path/to/target/impro3-ws14-flink-1.0-SNAPSHOT.jar -c de.tu_berlin.impro3.flink.etl.FollowerGraphGenerator /tmp/flink/testgraph.txt 1
 usage: emma-experiments-impro3-ss14-flink [-?]
 emma-experiments-impro3-ss14-flink: error: unrecognized arguments: '-c'
 {noformat}
 Before, this command worked fine and executed the job.
 I tracked it down to the following commit using `git bisect`:
 {noformat}
 93eadca782ee8c77f89609f6d924d73021dcdda9 is the first bad commit
 commit 93eadca782ee8c77f89609f6d924d73021dcdda9
 Author: Alexander Alexandrov alexander.s.alexand...@gmail.com
 Date:   Wed Dec 24 13:49:56 2014 +0200
     [FLINK-1027] [cli] Added support for '--' and '-' prefixed tokens in CLI program arguments.
 
     This closes #278
 :04 04 a1358e6f7fe308b4d51a47069f190a29f87fdeda d6f11bbc9444227d5c6297ec908e44b9644289a9 M	flink-clients
 {noformat}
 https://github.com/apache/flink/commit/93eadca782ee8c77f89609f6d924d73021dcdda9





[jira] [Closed] (FLINK-1153) Yarn container does not terminate if Flink's yarn client is terminated before the application master is completely started

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-1153.
-
Resolution: Won't Fix

 Yarn container does not terminate if Flink's yarn client is terminated before 
 the application master is completely started
 --

 Key: FLINK-1153
 URL: https://issues.apache.org/jira/browse/FLINK-1153
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Till Rohrmann
Assignee: Robert Metzger

 The yarn application master container does not terminate if the yarn client 
 is terminated after the container request is issued but before the 
 application master is completely started.





[jira] [Closed] (FLINK-1016) Add directory for user-contributed programs

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-1016.
-
Resolution: Duplicate

 Add directory for user-contributed programs
 ---

 Key: FLINK-1016
 URL: https://issues.apache.org/jira/browse/FLINK-1016
 Project: Flink
  Issue Type: Task
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Trivial

 As a result of this discussion: 
 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Where-to-host-user-contributed-Flink-programs-td750.html
  
 I'm going to add a directory for user-contributed Flink programs that are 
 reviewed by committers, but not shipped with releases. In addition, we 
 require that these programs provide test cases to improve test diversity and 
 the overall coverage.
 The first algorithm I'm going to add is the contribution made in FLINK-904.





[jira] [Resolved] (FLINK-956) Add a parameter to the YARN command line script that allows to define the amount of MemoryManager memory

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-956.
--
   Resolution: Fixed
Fix Version/s: 0.7.0-incubating

This has been fixed as described in the comments.

 Add a parameter to the YARN command line script that allows to define the 
 amount of MemoryManager memory
 

 Key: FLINK-956
 URL: https://issues.apache.org/jira/browse/FLINK-956
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Affects Versions: 0.6-incubating
Reporter: Stephan Ewen
Assignee: Robert Metzger
 Fix For: 0.7.0-incubating


 The current parameter specifies the YARN container size.
 It would be nice to have parameters for the JVM heap size 
 {{taskmanager.heap.mb}} and the amount of memory that goes to the memory 
 manager {{taskmanager.memory.size}}. Both should be optional.





[jira] [Resolved] (FLINK-1453) Integration tests for YARN failing on OS X

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1453.
---
   Resolution: Fixed
Fix Version/s: 0.9

Issue resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/3bdeab1b.

Thanks to [~uce] for giving me ssh access to his Mac :D

 Integration tests for YARN failing on OS X
 --

 Key: FLINK-1453
 URL: https://issues.apache.org/jira/browse/FLINK-1453
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


 The Flink YARN tests are failing on OS X, most likely due to a port conflict:
 {code}
 11:59:38,870 INFO  org.eclipse.jetty.util.log - jetty-0.9-SNAPSHOT
 11:59:38,885 WARN  org.eclipse.jetty.util.log - FAILED SelectChannelConnector@0.0.0.0:8081: java.net.BindException: Address already in use
 11:59:38,885 WARN  org.eclipse.jetty.util.log - FAILED org.eclipse.jetty.server.Server@281c7736: java.net.BindException: Address already in use
 11:59:38,892 ERROR akka.actor.OneForOneStrategy - Address already in use
 akka.actor.ActorInitializationException: exception during creation
 	at akka.actor.ActorInitializationException$.apply(Actor.scala:164)
 	at akka.actor.ActorCell.create(ActorCell.scala:596)
 	at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456)
 	at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
 	at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
 	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
 	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: java.net.BindException: Address already in use
 	at sun.nio.ch.Net.bind0(Native Method)
 	at sun.nio.ch.Net.bind(Net.java:444)
 	at sun.nio.ch.Net.bind(Net.java:436)
 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
 	at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:208)
 	at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:288)
 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:55)
 	at org.eclipse.jetty.server.Server.doStart(Server.java:254)
 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:55)
 	at org.apache.flink.runtime.jobmanager.web.WebInfoServer.start(WebInfoServer.java:198)
 	at org.apache.flink.runtime.jobmanager.WithWebServer$class.$init$(WithWebServer.scala:28)
 	at org.apache.flink.yarn.ApplicationMaster$$anonfun$startJobManager$2$$anon$1.init(ApplicationMaster.scala:181)
 	at org.apache.flink.yarn.ApplicationMaster$$anonfun$startJobManager$2.apply(ApplicationMaster.scala:181)
 	at org.apache.flink.yarn.ApplicationMaster$$anonfun$startJobManager$2.apply(ApplicationMaster.scala:181)
 	at akka.actor.TypedCreatorFunctionConsumer.produce(Props.scala:343)
 	at akka.actor.Props.newActor(Props.scala:252)
 	at akka.actor.ActorCell.newActor(ActorCell.scala:552)
 	at akka.actor.ActorCell.create(ActorCell.scala:578)
 	... 10 more
 {code}
 The issue does not appear on Travis or on Arch Linux (however, tests are also 
 failing on some Ubuntu versions).





[jira] [Updated] (FLINK-1320) Add an off-heap variant of the managed memory

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1320:
--
Assignee: Max Michels

 Add an off-heap variant of the managed memory
 -

 Key: FLINK-1320
 URL: https://issues.apache.org/jira/browse/FLINK-1320
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Stephan Ewen
Assignee: Max Michels
Priority: Minor

 For (nearly) all memory that Flink accumulates (in the form of sort buffers, 
 hash tables, caching), we use a special way of representing data serialized 
 across a set of memory pages. The big work lies in the way the algorithms are 
 implemented to operate on pages, rather than on objects.
 The core class for the memory is the {{MemorySegment}}, which has all methods 
 to set and get primitives values efficiently. It is a somewhat simpler (and 
 faster) variant of a HeapByteBuffer.
 As such, it should be straightforward to create a version where the memory 
 segment is not backed by a heap byte[], but by memory allocated outside the 
 JVM, in a similar way as the NIO DirectByteBuffers, or the Netty direct 
 buffers do it.
 This may have multiple advantages:
   - We reduce the size of the (garbage-collected) JVM heap and the number and 
 size of long-lived objects. For large JVM sizes, this may improve 
 performance quite a bit. Ultimately, we would in many cases reduce the JVM 
 heap to 1/3 to 1/2 of its size and keep the remaining memory outside the JVM.
   - We save copies when we move memory pages to disk (spilling) or through 
 the network (shuffling / broadcasting / forward piping)
 The changes required to implement this are
   - Add an {{UnmanagedMemorySegment}} that only stores the memory address as a 
 long, and the segment size. It is initialized from a DirectByteBuffer.
   - Allow the MemoryManager to allocate these MemorySegments, instead of the 
 current ones.
   - Make sure that the startup scripts pick up the mode and configure the heap 
 size and the max direct memory properly.
 Since the MemorySegment is probably the most performance critical class in 
 Flink, we must take care that we do this right. The following are critical 
 considerations:
   - If we want both solutions (heap and off-heap) to exist side-by-side 
 (configurable), we must make the base MemorySegment abstract and implement 
 two versions (heap and off-heap).
   - To get the best performance, we need to make sure that only one class 
 gets loaded (or at least ever used), to ensure optimal JIT de-virtualization 
 and inlining.
   - We should carefully measure the performance of both variants. From 
 previous micro benchmarks, I remember that individual byte accesses in 
 DirectByteBuffers (off-heap) were slightly slower than on-heap, any larger 
 accesses were equally good or slightly better.
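The heap/off-heap split sketched above can be illustrated with plain JDK types (hypothetical class names; Flink's real MemorySegment is far more involved and optimized): one abstract segment, one subclass backed by heap memory, one backed by a direct buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the proposed design: an abstract segment with a
// heap variant and an off-heap variant living side by side.
abstract class SegmentSketch {
    final int size;

    SegmentSketch(int size) {
        this.size = size;
    }

    abstract void putInt(int index, int value);

    abstract int getInt(int index);
}

// Backed by a heap byte[] (via a heap ByteBuffer), as MemorySegment is today.
class HeapSegmentSketch extends SegmentSketch {
    private final ByteBuffer memory;

    HeapSegmentSketch(int size) {
        super(size);
        this.memory = ByteBuffer.allocate(size);
    }

    @Override
    void putInt(int index, int value) { memory.putInt(index, value); }

    @Override
    int getInt(int index) { return memory.getInt(index); }
}

// Backed by memory allocated outside the JVM heap, like NIO DirectByteBuffers.
class OffHeapSegmentSketch extends SegmentSketch {
    private final ByteBuffer memory;

    OffHeapSegmentSketch(int size) {
        super(size);
        this.memory = ByteBuffer.allocateDirect(size);
    }

    @Override
    void putInt(int index, int value) { memory.putInt(index, value); }

    @Override
    int getInt(int index) { return memory.getInt(index); }
}
```

Note the JIT caveat from the list above: once both subclasses are loaded and used, calls through the abstract type may no longer be de-virtualized, which is exactly why the issue insists on measuring both variants.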





[jira] [Resolved] (FLINK-1446) Make KryoSerializer.createInstance() return new instances instead of null

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1446.
---
Resolution: Fixed

Resolved for 0.9 (master) in 
http://git-wip-us.apache.org/repos/asf/flink/commit/6bb02353
Resolved for 0.8.1 (release-0.8) in 
http://git-wip-us.apache.org/repos/asf/flink/commit/8ca7cbad

 Make KryoSerializer.createInstance() return new instances instead of null
 -

 Key: FLINK-1446
 URL: https://issues.apache.org/jira/browse/FLINK-1446
 Project: Flink
  Issue Type: Bug
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9, 0.8.1








[jira] [Commented] (FLINK-1454) CliFrontend blocks for 100 seconds when submitting to a non-existent JobManager

2015-01-26 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291951#comment-14291951
 ] 

Robert Metzger commented on FLINK-1454:
---

Yes. It would be nicer if the system told the user that the network connection 
was refused.
Also, 100s is a pretty long time, in particular because many newbies will come 
across this issue.
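One way to fail fast in this situation (a hypothetical sketch, not what the CliFrontend actually does) is to probe the JobManager address with a short TCP connect timeout before issuing the blocking call, so a refused connection is reported immediately:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch: a quick reachability probe so the user sees
// "connection refused" right away instead of a 100-second future timeout.
class JobManagerProbe {
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, host unreachable, or timeout.
            return false;
        }
    }

    public static void main(String[] args) {
        // 6123 is used here only as an example JobManager port.
        if (!isReachable("localhost", 6123, 500)) {
            System.err.println("Could not connect to the JobManager at localhost:6123. "
                    + "Is the JobManager running?");
        }
    }
}
```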

 CliFrontend blocks for 100 seconds when submitting to a non-existent 
 JobManager
 ---

 Key: FLINK-1454
 URL: https://issues.apache.org/jira/browse/FLINK-1454
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.9
Reporter: Robert Metzger

 When a user tries to submit a job to a job manager which doesn't exist at 
 all, the CliFrontend blocks for 100 seconds.
 Ideally, Akka would fail because it can not connect to the given 
 hostname:port.
  
 {code}
 ./bin/flink run ./examples/flink-java-examples-0.9-SNAPSHOT-WordCount.jar -c foo.Baz
 org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
 	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:449)
 	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:350)
 	at org.apache.flink.client.program.Client.run(Client.java:242)
 	at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:389)
 	at org.apache.flink.client.CliFrontend.run(CliFrontend.java:362)
 	at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1078)
 	at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1102)
 Caused by: java.util.concurrent.TimeoutException: Futures timed out after [100 seconds]
 	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
 	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
 	at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
 	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
 	at scala.concurrent.Await$.result(package.scala:107)
 	at org.apache.flink.runtime.akka.AkkaUtils$.ask(AkkaUtils.scala:265)
 	at org.apache.flink.runtime.client.JobClient$.uploadJarFiles(JobClient.scala:169)
 	at org.apache.flink.runtime.client.JobClient.uploadJarFiles(JobClient.scala)
 	at org.apache.flink.client.program.Client.run(Client.java:314)
 	at org.apache.flink.client.program.Client.run(Client.java:296)
 	at org.apache.flink.client.program.Client.run(Client.java:290)
 	at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:55)
 	at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:82)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 	at java.lang.reflect.Method.invoke(Method.java:606)
 	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:434)
 	... 6 more
 The exception above occurred while trying to run your command.
 {code}





[jira] [Resolved] (FLINK-512) Add support for Tachyon File System to Flink

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-512.
--
Resolution: Fixed

Has been resolved by FLINK-1266 (at least within the scope of this issue).

 Add support for Tachyon File System to Flink
 

 Key: FLINK-512
 URL: https://issues.apache.org/jira/browse/FLINK-512
 Project: Flink
  Issue Type: New Feature
Reporter: Chesnay Schepler
Assignee: Robert Metzger
Priority: Minor
  Labels: github-import
 Fix For: pre-apache

 Attachments: pull-request-512-9112930359400451481.patch


 Implementation of the Tachyon file system.
 The code was tested using junit and the wordcount example on a tachyon file 
 system running in local mode.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/pull/512
 Created by: [zentol|https://github.com/zentol]
 Labels: 
 Created at: Wed Feb 26 13:58:24 CET 2014
 State: closed





[jira] [Resolved] (FLINK-1452) Add flink-contrib maven module and README.md with the rules

2015-02-02 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1452.
---
Resolution: Fixed

Resolved in 
https://github.com/apache/flink/commit/3d9267eb3020b4671c67302cbfbf57291eb5a9bb

 Add flink-contrib maven module and README.md with the rules
 -

 Key: FLINK-1452
 URL: https://issues.apache.org/jira/browse/FLINK-1452
 Project: Flink
  Issue Type: New Feature
  Components: flink-contrib
Reporter: Robert Metzger
Assignee: Robert Metzger

 I'll also create a JIRA component





[jira] [Commented] (FLINK-1438) ClassCastException for Custom InputSplit in local mode and invalid type code in distributed mode

2015-02-07 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310891#comment-14310891
 ] 

Robert Metzger commented on FLINK-1438:
---

We should also add this to release-0.8, in my opinion.

 ClassCastException for Custom InputSplit in local mode and invalid type code 
 in distributed mode
 

 Key: FLINK-1438
 URL: https://issues.apache.org/jira/browse/FLINK-1438
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.8, 0.9
Reporter: Fabian Hueske
Assignee: Stephan Ewen
Priority: Minor
 Fix For: 0.9


 Jobs with custom InputSplits fail with a ClassCastException such as 
 {{org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit 
 cannot be cast to 
 org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit}} 
 if executed on a local setup. 
 This issue is probably related to different ClassLoaders used by the 
 JobManager when InputSplits are generated and when they are handed to the 
 InputFormat by the TaskManager. Moving the class of the custom InputSplit 
 into the {{./lib}} folder and removing it from the job's jar file makes the 
 job work.
 To reproduce the bug, run the following job on a local setup. 
 {code}
 public class CustomSplitTestJob {

   public static void main(String[] args) throws Exception {
     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
     DataSet<String> x = env.createInput(new TestFileInputFormat());
     x.print();
     env.execute();
   }

   public static class TestFileInputFormat implements InputFormat<String, TestFileInputSplit> {

     @Override
     public void configure(Configuration parameters) {
     }

     @Override
     public BaseStatistics getStatistics(BaseStatistics cachedStatistics) throws IOException {
       return null;
     }

     @Override
     public TestFileInputSplit[] createInputSplits(int minNumSplits) throws IOException {
       return new TestFileInputSplit[]{new TestFileInputSplit()};
     }

     @Override
     public InputSplitAssigner getInputSplitAssigner(TestFileInputSplit[] inputSplits) {
       return new LocatableInputSplitAssigner(inputSplits);
     }

     @Override
     public void open(TestFileInputSplit split) throws IOException {
     }

     @Override
     public boolean reachedEnd() throws IOException {
       return false;
     }

     @Override
     public String nextRecord(String reuse) throws IOException {
       return null;
     }

     @Override
     public void close() throws IOException {
     }
   }

   public static class TestFileInputSplit extends FileInputSplit {
   }
 }
 {code}
 The same happens in distributed mode, except that Akka terminates the 
 transmission of the input split with a meaningless {{invalid type code: 00}}.





[jira] [Commented] (FLINK-1390) java.lang.ClassCastException: X cannot be cast to X

2015-02-07 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310892#comment-14310892
 ] 

Robert Metzger commented on FLINK-1390:
---

I'll ask the user who reported the issue whether the change in FLINK-1438 resolved it.

  java.lang.ClassCastException: X cannot be cast to X
 

 Key: FLINK-1390
 URL: https://issues.apache.org/jira/browse/FLINK-1390
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.8
Reporter: Robert Metzger

 A user is affected by an issue, which is probably caused by different 
 classloaders being used for loading user classes.
 Current state of investigation:
 - the error happens in yarn sessions (there is only a YARN environment 
 available)
 - the error doesn't happen on the first time the job is being executed. It 
 only happens on subsequent executions.





[jira] [Commented] (FLINK-1314) Update website about #flink chat room in freenode IRC

2015-02-03 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14303569#comment-14303569
 ] 

Robert Metzger commented on FLINK-1314:
---

Sorry, I was too busy today. The change looks good to merge.
Thank you!

 Update website about #flink chat room in freenode IRC
 -

 Key: FLINK-1314
 URL: https://issues.apache.org/jira/browse/FLINK-1314
 Project: Flink
  Issue Type: Task
  Components: Project Website
Reporter: Henry Saputra
Assignee: Henry Saputra
Priority: Minor
 Attachments: FLINK-1314.patch


 Update Flink website to mention the #flink chat room in freenode IRC





[jira] [Resolved] (FLINK-1464) Added ResultTypeQueryable interface to TypeSerializerInputFormat.

2015-02-03 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1464.
---
Resolution: Fixed

Fixed via e3f6c9ba69a3e545fdd8f18b7b652fa111ade93e
Thanks for the patch!

The fix has been merged by Stephan Ewen.


 Added ResultTypeQueryable interface to TypeSerializerInputFormat.
 -

 Key: FLINK-1464
 URL: https://issues.apache.org/jira/browse/FLINK-1464
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, Optimizer
Affects Versions: 0.8, 0.9, 0.8.1
Reporter: Alexander Alexandrov
Assignee: Alexander Alexandrov
Priority: Minor
  Labels: easyfix
 Fix For: 0.9, 0.8.1

   Original Estimate: 6h
  Remaining Estimate: 6h

 It is currently impossible to use the {{TypeSerializerInputFormat}} with 
 generic Tuple types.
 For example, [this example 
 gist|https://gist.github.com/aalexandrov/90bf21f66bf604676f37] fails with a
 {quote}
 Exception in thread "main" 
 org.apache.flink.api.common.InvalidProgramException: The type returned by the 
 input format could not be automatically determined. Please specify the 
 TypeInformation of the produced type explicitly.
 at 
 org.apache.flink.api.java.ExecutionEnvironment.readFile(ExecutionEnvironment.java:341)
 at SerializedFormatExample$.main(SerializedFormatExample.scala:48)
 at SerializedFormatExample.main(SerializedFormatExample.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
 {quote}
 exception. 
 To fix the issue, I changed the constructor to take a {{TypeInformation<T>}} 
 instead of a {{TypeSerializer<T>}} argument. If this is indeed a bug, I think 
 that this is a good solution. 
 Unfortunately the fix breaks the API. Feel free to change it if you find a 
 more elegant solution compatible with the 0.8 branch.
 The suggested fix can be found in the GitHub 
 [PR#349|https://github.com/apache/flink/pull/349].





[jira] [Reopened] (FLINK-1471) Allow KeySelectors to implement ResultTypeQueryable

2015-02-03 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reopened FLINK-1471:
---

The commit Stephan mentions fixes another issue.
This is still unresolved.

 Allow KeySelectors to implement ResultTypeQueryable
 ---

 Key: FLINK-1471
 URL: https://issues.apache.org/jira/browse/FLINK-1471
 Project: Flink
  Issue Type: Bug
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Timo Walther
 Fix For: 0.9


 See https://github.com/apache/flink/pull/354





[jira] [Created] (FLINK-1473) Simplify SplittableIterator interface

2015-02-04 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1473:
-

 Summary: Simplify SplittableIterator interface
 Key: FLINK-1473
 URL: https://issues.apache.org/jira/browse/FLINK-1473
 Project: Flink
  Issue Type: Task
  Components: Java API
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Trivial








[jira] [Resolved] (FLINK-1473) Simplify SplittableIterator interface

2015-02-04 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1473.
---
Resolution: Fixed

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/7ac6447f

 Simplify SplittableIterator interface
 -

 Key: FLINK-1473
 URL: https://issues.apache.org/jira/browse/FLINK-1473
 Project: Flink
  Issue Type: Task
  Components: Java API
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Trivial







[jira] [Commented] (FLINK-1473) Simplify SplittableIterator interface

2015-02-04 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305141#comment-14305141
 ] 

Robert Metzger commented on FLINK-1473:
---

Code from https://github.com/apache/flink/pull/338

 Simplify SplittableIterator interface
 -

 Key: FLINK-1473
 URL: https://issues.apache.org/jira/browse/FLINK-1473
 Project: Flink
  Issue Type: Task
  Components: Java API
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Trivial







[jira] [Resolved] (FLINK-592) Add support for secure YARN clusters with Kerberos Auth

2015-02-04 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-592.
--
Resolution: Fixed

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/48d8dd5a

 Add support for secure YARN clusters with Kerberos Auth
 ---

 Key: FLINK-592
 URL: https://issues.apache.org/jira/browse/FLINK-592
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Reporter: GitHub Import
Assignee: Daniel Warneke
Priority: Minor
  Labels: github-import
 Fix For: pre-apache


 The current YARN client will throw an exception (as of 
 https://github.com/stratosphere/stratosphere/pull/591) if it detects a secure 
 environment.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/592
 Created by: [rmetzger|https://github.com/rmetzger]
 Labels: enhancement, YARN, 
 Created at: Sun Mar 16 11:05:07 CET 2014
 State: open





[jira] [Created] (FLINK-1480) Test Flink against Hadoop 2.6.0

2015-02-05 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1480:
-

 Summary: Test Flink against Hadoop 2.6.0
 Key: FLINK-1480
 URL: https://issues.apache.org/jira/browse/FLINK-1480
 Project: Flink
  Issue Type: Task
  Components: Build System
Reporter: Robert Metzger
Assignee: Robert Metzger


One of our Travis builds always builds against the latest Hadoop release.
Right now, we test against 2.5.1. I would like to bump the version to 2.6.0.





[jira] [Resolved] (FLINK-1477) YARN Client: Use HADOOP_HOME if set

2015-02-05 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1477.
---
Resolution: Fixed

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/5e1cc9e2

 YARN Client: Use HADOOP_HOME if set
 ---

 Key: FLINK-1477
 URL: https://issues.apache.org/jira/browse/FLINK-1477
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Robert Metzger
Assignee: Robert Metzger

 As part of FLINK-592 I removed code to load the Hadoop configuration from the 
 HADOOP_HOME variable.
 But it seems that we still have users for whom this deprecated variable is the 
 only Hadoop configuration variable that is set.





[jira] [Created] (FLINK-1477) YARN Client: Use HADOOP_HOME if set

2015-02-05 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1477:
-

 Summary: YARN Client: Use HADOOP_HOME if set
 Key: FLINK-1477
 URL: https://issues.apache.org/jira/browse/FLINK-1477
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Robert Metzger
Assignee: Robert Metzger


As part of FLINK-592 I removed code to load the Hadoop configuration from the 
HADOOP_HOME variable.
But it seems that we still have users for whom this deprecated variable is the 
only Hadoop configuration variable that is set.





[jira] [Created] (FLINK-1488) JobManager web interface logfile access broken

2015-02-06 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1488:
-

 Summary: JobManager web interface logfile access broken
 Key: FLINK-1488
 URL: https://issues.apache.org/jira/browse/FLINK-1488
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


In 0.9 the logfile access dies with a null pointer exception.





[jira] [Resolved] (FLINK-1488) JobManager web interface logfile access broken

2015-02-06 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1488.
---
Resolution: Fixed

Resolved in https://issues.apache.org/jira/browse/FLINK-1488

 JobManager web interface logfile access broken
 --

 Key: FLINK-1488
 URL: https://issues.apache.org/jira/browse/FLINK-1488
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


 In 0.9 the logfile access dies with a null pointer exception.





[jira] [Updated] (FLINK-456) Optional runtime statistics / metrics collection

2015-02-06 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-456:
-
Summary: Optional runtime statistics / metrics collection  (was: Optional 
runtime statistics collection)

 Optional runtime statistics / metrics collection
 

 Key: FLINK-456
 URL: https://issues.apache.org/jira/browse/FLINK-456
 Project: Flink
  Issue Type: New Feature
  Components: JobManager, TaskManager
Reporter: Fabian Hueske
  Labels: github-import
 Fix For: pre-apache


 The engine should collect job execution statistics (e.g., via accumulators) 
 such as:
 - total number of input / output records per operator
 - histogram of input/output ratio of UDF calls
 - histogram of number of input records per reduce / cogroup UDF call
 - histogram of number of output records per UDF call
 - histogram of time spend in UDF calls
 - number of local and remote bytes read (not via accumulators)
 - ...
 These stats should be made available to the user after execution (via 
 webfrontend). The purpose of this feature is to ease performance debugging of 
 parallel jobs (e.g., to detect data skew).
 It should be possible to deactivate (or activate) the gathering of these 
 statistics.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/456
 Created by: [fhueske|https://github.com/fhueske]
 Labels: enhancement, runtime, user satisfaction, 
 Created at: Tue Feb 04 20:32:49 CET 2014
 State: open
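The kind of per-UDF-call histogram the ticket asks for (e.g. output records per call) can be sketched in plain Java. This is an illustrative sketch only; the class and method names are assumptions for illustration, not Flink's actual accumulator API.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a histogram metric: maps "number of output records
// produced by one UDF call" to "how many calls produced that many records".
public class CallHistogram {
    private final Map<Long, Long> counts = new TreeMap<>();

    /** Record one UDF call that produced the given number of output records. */
    public void add(long outputRecords) {
        counts.merge(outputRecords, 1L, Long::sum);
    }

    /** How many calls produced exactly this many output records. */
    public long countFor(long outputRecords) {
        return counts.getOrDefault(outputRecords, 0L);
    }
}
```

In a real system such a structure would be collected per operator instance and merged at the JobManager, e.g. to spot data skew when a few reduce calls dominate the histogram.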





[jira] [Created] (FLINK-1471) Allow KeySelectors to implement ResultTypeQueryable

2015-02-03 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1471:
-

 Summary: Allow KeySelectors to implement ResultTypeQueryable
 Key: FLINK-1471
 URL: https://issues.apache.org/jira/browse/FLINK-1471
 Project: Flink
  Issue Type: Bug
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Timo Walther


See https://github.com/apache/flink/pull/354





[jira] [Updated] (FLINK-1379) add RSS feed for the blog

2015-01-15 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1379:
--
Assignee: Max Michels

 add RSS feed for the blog
 -

 Key: FLINK-1379
 URL: https://issues.apache.org/jira/browse/FLINK-1379
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Max Michels
Assignee: Max Michels
Priority: Minor
 Attachments: feed.patch


 I couldn't find an RSS feed for the Flink blog. I think that a feed helps a 
 lot of people to stay up to date with the changes in Flink. 
 [FLINK-391] mentions an RSS feed but it does not seem to exist.





[jira] [Updated] (FLINK-1379) add RSS feed for the blog

2015-01-15 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1379:
--
Attachment: feed.patch

 add RSS feed for the blog
 -

 Key: FLINK-1379
 URL: https://issues.apache.org/jira/browse/FLINK-1379
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Max Michels
Priority: Minor
 Attachments: feed.patch


 I couldn't find an RSS feed for the Flink blog. I think that a feed helps a 
 lot of people to stay up to date with the changes in Flink. 
 [FLINK-391] mentions an RSS feed but it does not seem to exist.





[jira] [Commented] (FLINK-1379) add RSS feed for the blog

2015-01-15 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279275#comment-14279275
 ] 

Robert Metzger commented on FLINK-1379:
---

You can actually attach patches to JIRA issues. There are a lot of Apache 
projects which use exactly that mechanism to bring code to their master.
I'll attach the patch to the JIRA.
I'm probably merging this tomorrow. Thank you!

 add RSS feed for the blog
 -

 Key: FLINK-1379
 URL: https://issues.apache.org/jira/browse/FLINK-1379
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Max Michels
Priority: Minor

 I couldn't find an RSS feed for the Flink blog. I think that a feed helps a 
 lot of people to stay up to date with the changes in Flink. 
 [FLINK-391] mentions an RSS feed but it does not seem to exist.





[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283640#comment-14283640
 ] 

Robert Metzger commented on FLINK-629:
--

This screenshot from github is related I guess 
(https://cloud.githubusercontent.com/assets/2375289/5814551/13a7bc88-a08d-11e4-9133-a0b0d0239533.png)
 ?
So your problem is that your input format is getting a null instead of a 
reference to a Tweet object?
Is Flink treating your POJO (Tweet) as a POJO or as a generic type?
What is {{TypeExtractor.createTypeInfo(Tweet.class)}} returning?

 Add support for null values to the java api
 ---

 Key: FLINK-629
 URL: https://issues.apache.org/jira/browse/FLINK-629
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Stephan Ewen
Assignee: Gyula Fora
Priority: Critical
  Labels: github-import
 Fix For: pre-apache

 Attachments: model.tar.gz


 Currently, many runtime operations fail when encountering a null value. Tuple 
 serialization should allow null fields.
 I suggest adding a method to the tuples called `getFieldNotNull()` which 
 throws a meaningful exception when the accessed field is null. That way, we 
 simplify the logic of operators that should not deal with null fields, like 
 key grouping or aggregations.
 Even though SQL allows grouping and aggregating null values, I suggest 
 excluding this from the Java API, because the SQL semantics of aggregating 
 null fields are messy.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/629
 Created by: [StephanEwen|https://github.com/StephanEwen]
 Labels: enhancement, java api, 
 Milestone: Release 0.5.1
 Created at: Wed Mar 26 00:27:49 CET 2014
 State: open
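The proposed accessor can be sketched on a tuple-like class as follows. This is a minimal, hypothetical sketch in plain Java; `Tuple2Sketch` and the exception choice are illustrative assumptions, not Flink's actual `Tuple` API.

```java
// Hypothetical sketch of the proposed getFieldNotNull() accessor.
public class Tuple2Sketch<T0, T1> {
    public T0 f0;
    public T1 f1;

    public Tuple2Sketch(T0 f0, T1 f1) {
        this.f0 = f0;
        this.f1 = f1;
    }

    /**
     * Returns the field at the given position, throwing a meaningful
     * exception up front instead of a later NullPointerException deep
     * inside key grouping or aggregation code.
     */
    @SuppressWarnings("unchecked")
    public <T> T getFieldNotNull(int pos) {
        Object value = (pos == 0) ? f0 : f1;
        if (value == null) {
            throw new IllegalStateException(
                "Field " + pos + " is null; null keys are not supported");
        }
        return (T) value;
    }
}
```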





[jira] [Assigned] (FLINK-1417) Automatically register nested types at Kryo

2015-01-20 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1417:
-

Assignee: Robert Metzger

 Automatically register nested types at Kryo
 ---

 Key: FLINK-1417
 URL: https://issues.apache.org/jira/browse/FLINK-1417
 Project: Flink
  Issue Type: Improvement
  Components: Java API, Scala API
Reporter: Stephan Ewen
Assignee: Robert Metzger
 Fix For: 0.9


 Currently, the {{GenericTypeInfo}} registers the class of the type at Kryo. 
 In order to get the best performance, it should recursively walk the classes 
 and make sure that it registers all contained subtypes.
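The recursive walk could look roughly like this. This is an illustrative plain-Java sketch of the idea only, not Flink's actual {{GenericTypeInfo}} code; a real implementation would also need to handle arrays and generic type parameters, and would then register each collected class at Kryo.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch: recursively collect the field types reachable from a root class,
// so each could be registered at Kryo afterwards.
public class TypeWalker {
    public static Set<Class<?>> collectNestedTypes(Class<?> root) {
        Set<Class<?>> seen = new LinkedHashSet<>();
        walk(root, seen);
        return seen;
    }

    private static void walk(Class<?> clazz, Set<Class<?>> seen) {
        // Stop at primitives, JDK classes, and already-seen types
        // (the seen-set also guards against cycles).
        if (clazz == null || clazz.isPrimitive()
                || clazz.getName().startsWith("java.")
                || !seen.add(clazz)) {
            return;
        }
        for (Field f : clazz.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers())) {
                walk(f.getType(), seen);
            }
        }
    }
}
```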





[jira] [Commented] (FLINK-1410) Integrate Flink version variables into website layout

2015-01-21 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285903#comment-14285903
 ] 

Robert Metzger commented on FLINK-1410:
---

Also, we need to fix the ul, li elements for regular lists created from 
markdown.

 Integrate Flink version variables into website layout
 -

 Key: FLINK-1410
 URL: https://issues.apache.org/jira/browse/FLINK-1410
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Robert Metzger

 The new website layout doesn't use the variables in the website configuration.
 This makes releasing versions extremely hard, because one needs to manually 
 fix all the links for every version change.
 The old layout of the website was respecting all variables which made 
 releasing a new version of the website a matter of minutes (changing one 
 file).
 I would highly recommend to fix FLINK-1387 first.





[jira] [Updated] (FLINK-1410) Integrate Flink version variables into website layout

2015-01-21 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1410:
--
Attachment: ulli-howItShouldLook.png
ulli-howItactuallyLooks.png

 Integrate Flink version variables into website layout
 -

 Key: FLINK-1410
 URL: https://issues.apache.org/jira/browse/FLINK-1410
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Robert Metzger
 Attachments: ulli-howItShouldLook.png, ulli-howItactuallyLooks.png


 The new website layout doesn't use the variables in the website configuration.
 This makes releasing versions extremely hard, because one needs to manually 
 fix all the links for every version change.
 The old layout of the website was respecting all variables which made 
 releasing a new version of the website a matter of minutes (changing one 
 file).
 I would highly recommend to fix FLINK-1387 first.





[jira] [Created] (FLINK-1432) CombineTaskTest.testCancelCombineTaskSorting sometimes fails

2015-01-22 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1432:
-

 Summary: CombineTaskTest.testCancelCombineTaskSorting sometimes 
fails
 Key: FLINK-1432
 URL: https://issues.apache.org/jira/browse/FLINK-1432
 Project: Flink
  Issue Type: Bug
Reporter: Robert Metzger


We have a bunch of tests which fail only in rare cases on travis.


https://s3.amazonaws.com/archive.travis-ci.org/jobs/47783455/log.txt

{code}
Exception in thread "Thread-17" java.lang.AssertionError: Canceling task 
failed: java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
at java.util.ArrayList$Itr.next(ArrayList.java:831)
at 
org.apache.flink.runtime.memorymanager.DefaultMemoryManager.release(DefaultMemoryManager.java:290)
at 
org.apache.flink.runtime.operators.GroupReduceCombineDriver.cancel(GroupReduceCombineDriver.java:221)
at 
org.apache.flink.runtime.operators.testutils.DriverTestBase.cancel(DriverTestBase.java:272)
at 
org.apache.flink.runtime.operators.testutils.TaskCancelThread.run(TaskCancelThread.java:60)

at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.flink.runtime.operators.testutils.TaskCancelThread.run(TaskCancelThread.java:68)
java.lang.NullPointerException
at 
org.apache.flink.runtime.memorymanager.DefaultMemoryManager.release(DefaultMemoryManager.java:291)
at 
org.apache.flink.runtime.operators.GroupReduceCombineDriver.cleanup(GroupReduceCombineDriver.java:213)
at 
org.apache.flink.runtime.operators.testutils.DriverTestBase.testDriverInternal(DriverTestBase.java:245)
at 
org.apache.flink.runtime.operators.testutils.DriverTestBase.testDriver(DriverTestBase.java:175)
at 
org.apache.flink.runtime.operators.CombineTaskTest$1.run(CombineTaskTest.java:143)
Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.172 sec <<< 
FAILURE! - in org.apache.flink.runtime.operators.CombineTaskTest
testCancelCombineTaskSorting[0](org.apache.flink.runtime.operators.CombineTaskTest)
  Time elapsed: 1.023 sec <<< FAILURE!
java.lang.AssertionError: Exception was thrown despite proper canceling.
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.flink.runtime.operators.CombineTaskTest.testCancelCombineTaskSorting(CombineTaskTest.java:162)
{code}





[jira] [Updated] (FLINK-1296) Add support for very large record for sorting

2015-01-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1296:
--
Fix Version/s: (was: 0.8)
   0.9

 Add support for very large record for sorting
 -

 Key: FLINK-1296
 URL: https://issues.apache.org/jira/browse/FLINK-1296
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: 0.8
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9


 Currently, very large records (multiple hundreds of megabytes) can break the 
 sorter if they overflow the sort buffer.
 Furthermore, if a merge of those records is attempted, pulling multiple of 
 them into memory concurrently can exhaust the machine's memory.





[jira] [Updated] (FLINK-1240) We cannot use sortGroup on a global reduce

2015-01-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1240:
--
Fix Version/s: (was: 0.8)
   0.9

 We cannot use sortGroup on a global reduce
 --

 Key: FLINK-1240
 URL: https://issues.apache.org/jira/browse/FLINK-1240
 Project: Flink
  Issue Type: Improvement
Reporter: Aljoscha Krettek
Priority: Minor
 Fix For: 0.9


 This is only an API problem, I hope.
 I also know, that this is potentially a very bad idea because everything must 
 be sorted on one node. In some cases, such as sorted first-n this would make 
 sense, though, since there we use a combiner.





[jira] [Updated] (FLINK-937) Change the YARN Client to allocate all cluster resources, if no argument given

2015-01-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-937:
-
Fix Version/s: (was: 0.8)
   0.9

 Change the YARN Client to allocate all cluster resources, if no argument given
 --

 Key: FLINK-937
 URL: https://issues.apache.org/jira/browse/FLINK-937
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Minor
 Fix For: 0.9


 In order to further improve the user experience, I would like to change the 
 YARN client's behavior to allocate as many cluster resources as possible, if 
 the user does not specify differently.
 The majority of users have exclusive access to the cluster.





[jira] [Updated] (FLINK-987) Extend TypeSerializers and -Comparators to work directly on Memory Segments

2015-01-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-987:
-
Fix Version/s: (was: 0.8)
   0.9

 Extend TypeSerializers and -Comparators to work directly on Memory Segments
 ---

 Key: FLINK-987
 URL: https://issues.apache.org/jira/browse/FLINK-987
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.6-incubating
Reporter: Stephan Ewen
Assignee: Aljoscha Krettek
 Fix For: 0.9


 As per discussion with [~till.rohrmann], [~uce], [~aljoscha], we suggest to 
 change the way that the TypeSerialzers/Comparators and 
 DataInputViews/DataOutputViews work.
 The goal is to allow more flexibility in the construction on the binary 
 representation of data types, and to allow partial deserialization of 
 individual fields. Both is currently prohibited by the fact that the 
 abstraction of the memory (into which the data goes) is a stream abstraction 
 ({{DataInputView}}, {{DataOutputView}}).
 An idea is to offer a random-access buffer like view for construction and 
 random-access deserialization, as well as various methods to copy elements in 
 a binary fashion between such buffers and streams.
 A possible set of methods for the {{TypeSerializer}} could be:
 {code}
 long serialize(T record, TargetBuffer buffer);

 T deserialize(T reuse, SourceBuffer source);

 void ensureBufferSufficientlyFilled(SourceBuffer source);

 <X> X deserializeField(X reuse, int logicalPos, SourceBuffer buffer);

 int getOffsetForField(int logicalPos, int offset, SourceBuffer buffer);

 void copy(DataInputView in, TargetBuffer buffer);

 void copy(SourceBuffer buffer, DataOutputView out);

 void copy(DataInputView source, DataOutputView target);
 {code}





[jira] [Updated] (FLINK-1278) Remove the Record special code paths

2015-01-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1278:
--
Fix Version/s: (was: 0.8)
   0.9

 Remove the Record special code paths
 

 Key: FLINK-1278
 URL: https://issues.apache.org/jira/browse/FLINK-1278
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: 0.8
Reporter: Stephan Ewen
Assignee: Kostas Tzoumas
 Fix For: 0.9


 There are some legacy Record code paths in the runtime, which are often 
 forgotten to be kept in sync and cause errors if people actually use records.





[jira] [Updated] (FLINK-1297) Add support for tracking statistics of intermediate results

2015-01-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1297:
--
Fix Version/s: (was: 0.8)
   0.9

 Add support for tracking statistics of intermediate results
 ---

 Key: FLINK-1297
 URL: https://issues.apache.org/jira/browse/FLINK-1297
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime
Reporter: Alexander Alexandrov
Assignee: Alexander Alexandrov
 Fix For: 0.9

   Original Estimate: 1,008h
  Remaining Estimate: 1,008h

 One of the major problems related to the optimizer at the moment is the lack 
 of proper statistics.
 With the introduction of staged execution, it is possible to instrument the 
 runtime code with a statistics facility that collects the required 
 information for optimizing the next execution stage.
 I would therefore like to contribute code that can be used to gather basic 
 statistics for the (intermediate) result of dataflows (e.g. min, max, count, 
 count distinct) and make them available to the job manager.
 Before I start, I would like to hear some feedback form the other users.
 In particular, to handle skew (e.g. on grouping) it might be good to have 
 some sort of detailed sketch about the key distribution of an intermediate 
 result. I am not sure whether a simple histogram is the most effective way to 
 go. Maybe somebody would propose another lightweight sketch that provides 
 better accuracy.





[jira] [Updated] (FLINK-938) Change start-cluster.sh script so that users don't have to configure the JobManager address

2015-01-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-938:
-
Fix Version/s: (was: 0.8)
   0.9

 Change start-cluster.sh script so that users don't have to configure the 
 JobManager address
 ---

 Key: FLINK-938
 URL: https://issues.apache.org/jira/browse/FLINK-938
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Robert Metzger
Assignee: Mingliang Qi
Priority: Minor
 Fix For: 0.9


 To improve the user experience, Flink should not require users to configure 
 the JobManager's address on a cluster.
 In combination with FLINK-934, this would allow running Flink with decent 
 performance on a cluster without setting a single configuration value.





[jira] [Commented] (FLINK-1372) TaskManager and JobManager do not log startup settings any more

2015-01-16 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279982#comment-14279982
 ] 

Robert Metzger commented on FLINK-1372:
---

Yes, switching the logger sounds good. The information is very helpful when 
debugging issues reported by users.

 TaskManager and JobManager do not log startup settings any more
 ---

 Key: FLINK-1372
 URL: https://issues.apache.org/jira/browse/FLINK-1372
 Project: Flink
  Issue Type: Bug
  Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Till Rohrmann
 Fix For: 0.9


 In prior versions, the jobmanager and taskmanager logged a lot of startup 
 options:
  - Environment
  - ports
  - memory configuration
  - network configuration
 Currently, they log very little. We should add the logging back in.





[jira] [Created] (FLINK-1410) Integrate Flink version variables into website layout

2015-01-16 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1410:
-

 Summary: Integrate Flink version variables into website layout
 Key: FLINK-1410
 URL: https://issues.apache.org/jira/browse/FLINK-1410
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Robert Metzger


The new website layout doesn't use the variables in the website configuration.
This makes releasing versions extremely hard, because one needs to manually fix 
all the links for every version change.

The old layout of the website respected all variables, which made releasing a 
new version of the website a matter of minutes (changing one file).
I would highly recommend fixing FLINK-1387 first.





[jira] [Commented] (FLINK-1317) Add a standalone version of the Flink Plan Visualizer to the Website

2015-01-17 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14281275#comment-14281275
 ] 

Robert Metzger commented on FLINK-1317:
---

I would resolve this issue by adding the plan viz to the docs directory in 
the git source. This way, we can ensure that it works for different versions.

 Add a standalone version of the Flink Plan Visualizer to the Website
 

 Key: FLINK-1317
 URL: https://issues.apache.org/jira/browse/FLINK-1317
 Project: Flink
  Issue Type: New Feature
  Components: Project Website
Reporter: Stephan Ewen
Priority: Minor







[jira] [Commented] (FLINK-1407) Enable log output (error level) for test cases

2015-01-15 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278690#comment-14278690
 ] 

Robert Metzger commented on FLINK-1407:
---

I'm generally a big fan of logging a bit, so I'm in.

BUT: Travis has a limit on the log file size, so we have to be careful.

 Enable log output (error level) for test cases
 --

 Key: FLINK-1407
 URL: https://issues.apache.org/jira/browse/FLINK-1407
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann

 In case of errors (especially flaky test cases), which most often appear on 
 Travis, it would be helpful to log the error messages in order to detect the 
 actual problem. Since test exceptions would be logged the same way and we 
 don't want to clutter the logs, we would have to set up the logging for the 
 individual classes selectively.
 What do you think?
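Assuming the tests use a log4j backend, the selective setup described above could be sketched in a test-scoped log4j.properties like this (the silenced logger name is a placeholder, not a concrete proposal):

```properties
# Default for tests: only ERROR-level messages reach the console
log4j.rootLogger=ERROR, console

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %c - %m%n

# Classes that throw exceptions on purpose in tests are silenced selectively
log4j.logger.org.apache.flink.example.IntentionallyFailingTask=OFF
```

Per-logger level overrides like the last line would keep the deliberately provoked test exceptions out of the Travis logs while still surfacing unexpected errors.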





[jira] [Assigned] (FLINK-1423) No Tags for new release on the github repo

2015-01-21 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1423:
-

Assignee: Robert Metzger

 No Tags for new release on the github repo
 --

 Key: FLINK-1423
 URL: https://issues.apache.org/jira/browse/FLINK-1423
 Project: Flink
  Issue Type: Task
  Components: Project Website
Affects Versions: 0.7.0-incubating
Reporter: Carsten Brandt
Assignee: Robert Metzger
Priority: Minor

 As far as I know there is an official version 0.7 release but there is no tag 
 for that version:
 https://github.com/apache/flink/releases





[jira] [Commented] (FLINK-1372) TaskManager and JobManager do not log startup settings any more

2015-01-21 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285693#comment-14285693
 ] 

Robert Metzger commented on FLINK-1372:
---

The issue has not been resolved.
I just checked with the current master, and I got the following output:

{code}
15:38:55,181 INFO  akka.event.slf4j.Slf4jLogger 
 - Slf4jLogger started
15:38:55,217 INFO  Remoting 
 - Starting remoting
15:38:55,352 INFO  Remoting 
 - Remoting started; listening on addresses :[akka.tcp://flink@localhost:6123]
15:38:55,365 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - Using 0.7 of the free heap space for managed memory.
15:38:55,456 INFO  
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$main$1$$anon$1  - 
Starting job manager at akka://flink/user/jobmanager.
15:38:55,459 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - Starting task manager at akka://flink/user/taskmanager.
15:38:55,459 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - Creating 1 task slot(s).
15:38:55,459 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - TaskManager connection information localhost.localdomain ( dataPort=37035).
15:38:55,460 INFO  org.apache.flink.runtime.blob.BlobServer 
 - Started BLOB server on port 51520
15:38:55,461 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - Temporary file directory '/tmp': total 7 GB,usable 7 GB [100.00% usable])
15:38:55,471 INFO  
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$main$1$$anon$1  - 
Started job manager. Waiting for incoming messages.
15:38:55,474 INFO  org.apache.flink.runtime.jobmanager.web.WebInfoServer
 - Setting up web info server, using web-root 
directoryjar:file:/home/robert/incubator-flink/flink-dist/target/flink-0.9-SNAPSHOT-bin/flink-0.9-SNAPSHOT/lib/flink-runtime-0.9-SNAPSHOT.jar!/web-docs-infoserver.
15:38:55,474 INFO  org.apache.flink.runtime.jobmanager.web.WebInfoServer
 - Web info server will display information about flink job-manager on 
localhost, port 8081.
15:38:56,049 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - Profiling of jobs is disabled.
15:38:56,051 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - Memory usage stats: [HEAP: 346/491/491 MB, NON HEAP: 20/44/304 MB 
(used/committed/max)]
15:38:56,053 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - Try to register at master akka.tcp://flink@localhost:6123/user/jobmanager. 
Attempt #1
15:38:56,067 INFO  org.apache.flink.runtime.jobmanager.web.WebInfoServer
 - Starting web info server for JobManager on port 8081
15:38:56,067 INFO  org.eclipse.jetty.util.log   
 - jetty-8.0.0.M1
15:38:56,086 INFO  org.eclipse.jetty.util.log   
 - Started SelectChannelConnector@0.0.0.0:8081
15:38:56,090 INFO  org.apache.flink.runtime.instance.InstanceManager
 - Registered TaskManager at akka://flink/user/taskmanager as 
1e74cf0dfeabdb4d23f7e27a3776ae3f. Current number of registered hosts is 1.
15:38:56,092 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - TaskManager successfully registered at JobManager 
akka://flink/user/jobmanager.
15:38:56,224 INFO  org.apache.flink.runtime.io.network.buffer.NetworkBufferPool 
 - Allocated 64 MB for network buffer pool (number of memory segments: 2048, 
bytes per segment: 32768).
15:38:56,226 INFO  org.apache.flink.runtime.taskmanager.TaskManager 
 - Determined BLOB server address to be localhost/127.0.0.1:51520.
{code}

The output used to look like that:
{code}
14:43:15,211 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 - ---
14:43:15,213 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 -  Starting JobManager (Version: 0.6-incubating-SNAPSHOT, Rev:3f1e220, 
Date:09.08.2014 @ 14:36:31 CEST)
14:43:15,213 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 -  Current user: dr.who
14:43:15,213 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.7/23.25-b01
14:43:15,213 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 -  Startup Options: -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled 
-XX:MaxPermSize=256m -Xmx25600m 
14:43:15,213 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 -  Maximum heap size: 25450 MiBytes
14:43:15,213 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 -  JAVA_HOME: /usr/java/default
14:43:15,213 INFO  org.apache.flink.runtime.jobmanager.JobManager

[jira] [Commented] (FLINK-1426) JobManager AJAX requests sometimes fail

2015-01-21 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285663#comment-14285663
 ] 

Robert Metzger commented on FLINK-1426:
---

Ah cool. I didn't know that there is still something going on; I hadn't heard 
much about this.
It would be cool if the student shared a preview with us once that's possible. 
Big changes usually have a much better chance of being merged when they are 
properly discussed with the community.

If this issue is being fixed as part of the bachelor thesis, even better.

 JobManager AJAX requests sometimes fail
 ---

 Key: FLINK-1426
 URL: https://issues.apache.org/jira/browse/FLINK-1426
 Project: Flink
  Issue Type: Bug
  Components: JobManager, Webfrontend
Reporter: Robert Metzger

 It seems that the JobManager sometimes (I think when accessing it the first 
 time) does not show the number of TMs / slots.
 A simple workaround is re-loading it, but still, users are complaining about 
 it.





[jira] [Commented] (FLINK-1411) PlanVisualizer is not working

2015-01-18 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14281857#comment-14281857
 ] 

Robert Metzger commented on FLINK-1411:
---

+1 for hosting the tool on the website (version specific)
I think storing the web files in the flink-client.jar and flink-runtime.jar is 
better than having the raw files in our distribution.

 PlanVisualizer is not working
 -

 Key: FLINK-1411
 URL: https://issues.apache.org/jira/browse/FLINK-1411
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Priority: Minor

 In the current master, the PlanVisualizer is no longer working. The reason is 
 that the resources folder containing the web resources has been moved to the 
 flink-runtime and flink-clients jar. 
 Maybe we should pick up FLINK-1317 and make the PlanVisualizer accessible 
 through the flink website.





[jira] [Created] (FLINK-1413) Add option to YARN client to use already deployed uberjar/flink files instead of redeploying on each YARN session

2015-01-18 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1413:
-

 Summary: Add option to YARN client to use already deployed 
uberjar/flink files instead of redeploying on each YARN session
 Key: FLINK-1413
 URL: https://issues.apache.org/jira/browse/FLINK-1413
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Reporter: Robert Metzger


Right now, Flink redeploys all the files it needs for each YARN session or 
per-job YARN cluster.

In particular with the new per-job YARN cluster feature, it's an unnecessary 
waste of resources to redeploy the files for each job.





[jira] [Created] (FLINK-1414) Remove quickstart-*.sh from git source and put them to the website's svn

2015-01-18 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1414:
-

 Summary: Remove quickstart-*.sh from git source and put them to 
the website's svn
 Key: FLINK-1414
 URL: https://issues.apache.org/jira/browse/FLINK-1414
 Project: Flink
  Issue Type: Task
  Components: Project Website
Reporter: Robert Metzger


The quickstart.sh script is currently (for historical reasons) located in the 
main source repo.
It is a better fit for the homepage because it is independent of the versions 
in the pom.xml files.
This also makes release maintenance easier.





[jira] [Updated] (FLINK-1052) Access to stdout and log information of Taskmanagers from the web interface

2015-02-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1052:
--
Component/s: JobManager

 Access to stdout and log information of Taskmanagers from the web interface
 ---

 Key: FLINK-1052
 URL: https://issues.apache.org/jira/browse/FLINK-1052
 Project: Flink
  Issue Type: Improvement
  Components: JobManager
Affects Versions: 0.9
Reporter: Till Rohrmann

 It would be convenient to have access to the stdout and log files of the 
 individual taskmanagers from the web interface. This is especially useful in 
 the case of a yarn setup where the different logs are hidden behind 
 application and container IDs. 





[jira] [Updated] (FLINK-1052) Access to stdout and log information of Taskmanagers from the web interface

2015-02-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1052:
--
Priority: Major  (was: Minor)

 Access to stdout and log information of Taskmanagers from the web interface
 ---

 Key: FLINK-1052
 URL: https://issues.apache.org/jira/browse/FLINK-1052
 Project: Flink
  Issue Type: Improvement
  Components: JobManager
Affects Versions: 0.9
Reporter: Till Rohrmann

 It would be convenient to have access to the stdout and log files of the 
 individual taskmanagers from the web interface. This is especially useful in 
 the case of a yarn setup where the different logs are hidden behind 
 application and container IDs. 





[jira] [Resolved] (FLINK-1504) Add support for accessing secured HDFS clusters in standalone mode

2015-02-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1504.
---
   Resolution: Fixed
Fix Version/s: 0.9
 Assignee: Max Michels

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/82cda12e

 Add support for accessing secured HDFS clusters in standalone mode
 --

 Key: FLINK-1504
 URL: https://issues.apache.org/jira/browse/FLINK-1504
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Max Michels
 Fix For: 0.9


 Only for a single user, i.e. the user who starts Flink has the Kerberos 
 credentials.





[jira] [Commented] (FLINK-1052) Access to stdout and log information of Taskmanagers from the web interface

2015-02-11 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316426#comment-14316426
 ] 

Robert Metzger commented on FLINK-1052:
---

I'm assigning the issue to myself, but I won't start working on it immediately.
It's more like putting it on the to-do list.

 Access to stdout and log information of Taskmanagers from the web interface
 ---

 Key: FLINK-1052
 URL: https://issues.apache.org/jira/browse/FLINK-1052
 Project: Flink
  Issue Type: Improvement
  Components: JobManager
Affects Versions: 0.9
Reporter: Till Rohrmann

 It would be convenient to have access to the stdout and log files of the 
 individual taskmanagers from the web interface. This is especially useful in 
 the case of a yarn setup where the different logs are hidden behind 
 application and container IDs. 





[jira] [Assigned] (FLINK-1052) Access to stdout and log information of Taskmanagers from the web interface

2015-02-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1052:
-

Assignee: Robert Metzger

 Access to stdout and log information of Taskmanagers from the web interface
 ---

 Key: FLINK-1052
 URL: https://issues.apache.org/jira/browse/FLINK-1052
 Project: Flink
  Issue Type: Improvement
  Components: JobManager
Affects Versions: 0.9
Reporter: Till Rohrmann
Assignee: Robert Metzger

 It would be convenient to have access to the stdout and log files of the 
 individual taskmanagers from the web interface. This is especially useful in 
 the case of a yarn setup where the different logs are hidden behind 
 application and container IDs. 





[jira] [Updated] (FLINK-1052) Access to stdout and log information of Taskmanagers from the web interface

2015-02-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1052:
--
Affects Version/s: 0.9

 Access to stdout and log information of Taskmanagers from the web interface
 ---

 Key: FLINK-1052
 URL: https://issues.apache.org/jira/browse/FLINK-1052
 Project: Flink
  Issue Type: Improvement
  Components: JobManager
Affects Versions: 0.9
Reporter: Till Rohrmann

 It would be convenient to have access to the stdout and log files of the 
 individual taskmanagers from the web interface. This is especially useful in 
 the case of a yarn setup where the different logs are hidden behind 
 application and container IDs. 





[jira] [Resolved] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers

2015-02-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1489.
---
   Resolution: Fixed
Fix Version/s: 0.9

Merged in http://git-wip-us.apache.org/repos/asf/flink/commit/aedbacfc. Thank 
you.

 Failing JobManager due to blocking calls in 
 Execution.scheduleOrUpdateConsumers
 ---

 Key: FLINK-1489
 URL: https://issues.apache.org/jira/browse/FLINK-1489
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 0.9


 [~Zentol] reported that the JobManager failed to execute his Python job. The 
 reason is that the JobManager executes blocking calls in the actor thread in 
 the method {{Execution.sendUpdateTaskRpcCall}} as a result of receiving a 
 {{ScheduleOrUpdateConsumers}} message. 
 Every TaskManager possibly sends a {{ScheduleOrUpdateConsumers}} to the 
 JobManager to notify the consumers about available data. The JobManager then 
 sends to each TaskManager the respective update call 
 {{Execution.sendUpdateTaskRpcCall}}. By blocking the actor thread, we 
 effectively execute the update calls sequentially. Due to the 
 ever-accumulating delay, some of the initial timeouts on the TaskManager side 
 in {{IntermediateResultPartition.scheduleOrUpdateConsumers}} fail. As a 
 result, the execution of the respective Tasks fails.
 A solution would be to make the call non-blocking.
 A general caveat for actor programming: we should never block the actor 
 thread; otherwise we seriously jeopardize the scalability of the system, or, 
 even worse, the system simply fails.
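The blocking-versus-non-blocking effect described here can be illustrated outside of Akka with plain Java futures (a sketch, not Flink's actual code; all names below are made up):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustration only: shows why blocking inside a single message-processing
// thread serializes the update calls, while dispatching them as futures
// keeps the thread free and lets the calls overlap.
public class NonBlockingSketch {

    static String slowRpc(int taskId) {
        try {
            TimeUnit.MILLISECONDS.sleep(100); // simulated network round trip
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "updated-" + taskId;
    }

    // Blocking style: the caller waits for each RPC in turn, so the
    // delays accumulate (4 calls -> at least 400 ms).
    static long runBlocking() {
        long start = System.nanoTime();
        for (int i = 0; i < 4; i++) {
            slowRpc(i);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Non-blocking style: every update is dispatched as a future and the
    // calling thread only waits once for all of them together.
    static long runAsync() throws Exception {
        ExecutorService rpcPool = Executors.newFixedThreadPool(4);
        long start = System.nanoTime();
        CompletableFuture<?>[] updates = new CompletableFuture<?>[4];
        for (int i = 0; i < 4; i++) {
            final int id = i;
            updates[i] = CompletableFuture.supplyAsync(() -> slowRpc(id), rpcPool);
        }
        CompletableFuture.allOf(updates).join();
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        rpcPool.shutdown();
        return elapsed;
    }
}
```

In the actor setting, the equivalent fix is to send the eventual result back to the actor as a message (e.g. piping a future) instead of awaiting it on the actor thread.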





[jira] [Updated] (FLINK-1487) Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case

2015-02-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1487:
--
Component/s: JobManager

 Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case
 -

 Key: FLINK-1487
 URL: https://issues.apache.org/jira/browse/FLINK-1487
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.9
Reporter: Till Rohrmann

 I got the following failure on travis:
 {{SchedulerIsolatedTasksTest.testScheduleQueueing:283 expected:107 but 
 was:106}}
 The failure does not occur consistently on travis.





[jira] [Updated] (FLINK-1487) Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case

2015-02-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1487:
--
Affects Version/s: 0.9

 Failing SchedulerIsolatedTasksTest.testScheduleQueueing test case
 -

 Key: FLINK-1487
 URL: https://issues.apache.org/jira/browse/FLINK-1487
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.9
Reporter: Till Rohrmann

 I got the following failure on travis:
 {{SchedulerIsolatedTasksTest.testScheduleQueueing:283 expected:107 but 
 was:106}}
 The failure does not occur consistently on travis.





[jira] [Created] (FLINK-1525) Provide utils to pass -D parameters to UDFs

2015-02-11 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1525:
-

 Summary: Provide utils to pass -D parameters to UDFs 
 Key: FLINK-1525
 URL: https://issues.apache.org/jira/browse/FLINK-1525
 Project: Flink
  Issue Type: Improvement
  Components: flink-contrib
Reporter: Robert Metzger


Hadoop users are used to setting job configuration through -D on the command 
line.

Right now, Flink users have to manually parse command line arguments and pass 
them to the methods.
It would be nice to provide a standard args parser which takes care of this.
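Such a utility could be sketched as follows (the class and method names are hypothetical, not part of Flink): parse Hadoop-style "-Dkey=value" arguments into a serializable object that a program can hand to its UDFs.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the suggested utility: collects "-Dkey=value"
// command-line arguments into a serializable key/value holder.
public class DParams implements Serializable {
    private final Map<String, String> values = new HashMap<>();

    public static DParams fromArgs(String[] args) {
        DParams p = new DParams();
        for (String arg : args) {
            // only pick up well-formed "-Dkey=value" tokens
            if (arg.startsWith("-D") && arg.indexOf('=') > 2) {
                int eq = arg.indexOf('=');
                p.values.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return p;
    }

    public String get(String key, String defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }
}
```

A driver program could then call `DParams.fromArgs(args)` in `main()` and pass the object into its functions via their constructors, since it is serializable.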





[jira] [Created] (FLINK-1527) Incomplete Accumulator documentation (no mention of the shorthand methods)

2015-02-11 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1527:
-

 Summary: Incomplete Accumulator documentation (no mention of the 
shorthand methods)
 Key: FLINK-1527
 URL: https://issues.apache.org/jira/browse/FLINK-1527
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.8, 0.9
Reporter: Robert Metzger


The current documentation implies that users have to manually register every 
counter.
But it should be possible to use counters like this
{code}
getRuntimeContext().getLongCounter("elements").add(1L);
{code}
within the UDF call.


http://flink.apache.org/docs/0.8/programming_guide.html#accumulators--counters







[jira] [Resolved] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup

2015-02-10 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1492.
---
   Resolution: Fixed
Fix Version/s: 0.8.1

Resolved for 0.8.1 in 
https://git1-us-west.apache.org/repos/asf?p=flink.git;a=commit;h=5b420d84

 Exceptions on shutdown concerning BLOB store cleanup
 

 Key: FLINK-1492
 URL: https://issues.apache.org/jira/browse/FLINK-1492
 Project: Flink
  Issue Type: Bug
  Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Ufuk Celebi
 Fix For: 0.9, 0.8.1


 The following stack traces occur not every time, but frequently.
 {code}
 java.lang.IllegalArgumentException: 
 /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
   at 
 org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
   at 
 org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
   at 
 org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
   at 
 org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
   at 
 akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
   at 
 akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
   at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
   at 
 akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
   at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 15:16:15,350 ERROR 
 org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1
   - LibraryCacheManager did not shutdown properly.
 java.io.IOException: Unable to delete file: 
 /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
   at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
   at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
   at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
   at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
   at 
 org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
   at 
 org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
   at 
 org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
   at 
 akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
   at 
 akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
   at 
 

[jira] [Closed] (FLINK-1575) JobManagerConnectionTest.testResolveUnreachableActorRemoteHost times out on travis

2015-02-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-1575.
-
Resolution: Invalid

The issue is caused by one of my changes (dependency conflict).
I'm closing it as invalid.

 JobManagerConnectionTest.testResolveUnreachableActorRemoteHost times out on 
 travis
 --

 Key: FLINK-1575
 URL: https://issues.apache.org/jira/browse/FLINK-1575
 Project: Flink
  Issue Type: Bug
Reporter: Robert Metzger

 This might be related to FLINK-1529.
 I saw this issue now at least twice on travis:
 https://travis-ci.org/rmetzger/flink/jobs/51108554
 {code}
 Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 200.266 sec 
  FAILURE! - in org.apache.flink.runtime.jobmanager.JobManagerConnectionTest
 testResolveUnreachableActorRemoteHost(org.apache.flink.runtime.jobmanager.JobManagerConnectionTest)
   Time elapsed: 100.215 sec   ERROR!
 java.util.concurrent.TimeoutException: Futures timed out after [10 
 milliseconds]
   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
   at 
 scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
   at 
 scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
   at scala.concurrent.Await$.result(package.scala:107)
   at akka.remote.Remoting.start(Remoting.scala:173)
   at 
 akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
   at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
   at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
   at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
   at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
   at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
   at akka.actor.ActorSystem$.create(ActorSystem.scala:66)
   at 
 org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:71)
   at 
 org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:61)
   at 
 org.apache.flink.runtime.jobmanager.JobManagerConnectionTest.testResolveUnreachableActorRemoteHost(JobManagerConnectionTest.scala:88)
 testResolveUnreachableActorLocalHost(org.apache.flink.runtime.jobmanager.JobManagerConnectionTest)
   Time elapsed: 100.031 sec   ERROR!
 java.util.concurrent.TimeoutException: Futures timed out after [10 
 milliseconds]
   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
   at 
 scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
   at 
 scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
   at scala.concurrent.Await$.result(package.scala:107)
   at akka.remote.Remoting.start(Remoting.scala:173)
   at 
 akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
   at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
   at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
   at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
   at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
   at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
   at akka.actor.ActorSystem$.create(ActorSystem.scala:66)
   at 
 org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:71)
   at 
 org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:61)
   at 
 org.apache.flink.runtime.jobmanager.JobManagerConnectionTest.testResolveUnreachableActorLocalHost(JobManagerConnectionTest.scala:45)
 Running org.apache.flink.runtime.operators.hash.MemoryHashTableTest
 [ERROR] [02/17/2015 17:38:04.250] [main] [Remoting] Remoting error: [Startup 
 timed out] [
 akka.remote.RemoteTransportException: Startup timed out
   at 
 akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:129)
   at akka.remote.Remoting.start(Remoting.scala:191)
   at 
 akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
   at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
   at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
   at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
   at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
   at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
   at akka.actor.ActorSystem$.create(ActorSystem.scala:66)
   at 
 org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:71)
   at 
 org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:61)
   

[jira] [Updated] (FLINK-1573) Add per-job metrics to flink.

2015-02-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1573:
--
Fix Version/s: (was: pre-apache)

 Add per-job metrics to flink.
 -

 Key: FLINK-1573
 URL: https://issues.apache.org/jira/browse/FLINK-1573
 Project: Flink
  Issue Type: Sub-task
  Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger

 With FLINK-1501, we have JVM-specific metrics (mainly monitoring the TMs).
 With this task, I would like to add metrics which are job-specific.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1525) Provide utils to pass -D parameters to UDFs

2015-02-12 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14318016#comment-14318016
 ] 

Robert Metzger commented on FLINK-1525:
---

I think it doesn't hurt to have something like this in the {{flink-contrib}} 
package.

The task is nice for somebody who's looking for an easy starter task with 
Flink.
It doesn't have to be the Configuration object we're passing to the UDFs.
The util could also return a serializable object one can pass into the 
functions.
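A minimal sketch of what such a serializable holder could look like (the class name {{DParameters}} and its API are made up here for illustration; this is not Flink code):

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical utility: collects Hadoop-style -Dkey=value arguments into a
// serializable object that can be passed into user functions.
public class DParameters implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Map<String, String> values = new HashMap<String, String>();

    /** Parses all -Dkey=value entries from the command-line arguments. */
    public static DParameters fromArgs(String[] args) {
        DParameters params = new DParameters();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                params.values.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return params;
    }

    public String get(String key, String defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : v;
    }

    public static void main(String[] args) {
        DParameters p = fromArgs(new String[] {"-Dinput.path=hdfs:///in", "-Dretries=3"});
        System.out.println(p.get("input.path", "none"));
        System.out.println(p.get("retries", "0"));
    }
}
```

Since the holder is {{Serializable}}, it can be captured by a UDF's closure without going through the Configuration object.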


 Provide utils to pass -D parameters to UDFs 
 

 Key: FLINK-1525
 URL: https://issues.apache.org/jira/browse/FLINK-1525
 Project: Flink
  Issue Type: Improvement
  Components: flink-contrib
Reporter: Robert Metzger
  Labels: starter

 Hadoop users are used to setting job configuration through -D on the 
 command line.
 Right now, Flink users have to manually parse command line arguments and pass 
 them to the methods.
 It would be nice to provide a standard argument parser which takes care of 
 this.





[jira] [Updated] (FLINK-1483) Temporary channel files are not properly deleted when Flink is terminated

2015-02-13 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1483:
--
Component/s: TaskManager

 Temporary channel files are not properly deleted when Flink is terminated
 -

 Key: FLINK-1483
 URL: https://issues.apache.org/jira/browse/FLINK-1483
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: 0.8, 0.9
Reporter: Till Rohrmann
Assignee: Ufuk Celebi

 The temporary channel files are not properly deleted if the IOManager does 
 not shut down properly. This can be the case when the TaskManagers are 
 terminated by Flink's shell scripts.
 A solution could be to store all channel files of one TaskManager in a 
 uniquely identifiable directory and to register a shutdown hook which deletes 
 this directory upon termination.
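Outside of Flink's own IOManager, the proposed mechanism can be sketched as follows (a standalone illustration, not the actual fix): a uniquely named directory holds all channel files, and a JVM shutdown hook removes the whole directory even if the regular shutdown path is skipped.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class TempChannelDirectory {

    /** Creates a uniquely named temp directory and registers a cleanup hook. */
    public static File create() throws IOException {
        final File dir = Files.createTempDirectory("flink-io-").toFile();
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                deleteRecursively(dir); // runs even on SIGTERM from shell scripts
            }
        });
        return dir;
    }

    static void deleteRecursively(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        f.delete();
    }

    public static void main(String[] args) throws IOException {
        File dir = create();
        System.out.println(dir.isDirectory());
    }
}
```

Note that shutdown hooks do not run on `kill -9`, so this covers regular termination by the shell scripts but not hard kills.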





[jira] [Updated] (FLINK-1483) Temporary channel files are not properly deleted when Flink is terminated

2015-02-13 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1483:
--
Affects Version/s: 0.9
   0.8

 Temporary channel files are not properly deleted when Flink is terminated
 -

 Key: FLINK-1483
 URL: https://issues.apache.org/jira/browse/FLINK-1483
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: 0.8, 0.9
Reporter: Till Rohrmann
Assignee: Ufuk Celebi

 The temporary channel files are not properly deleted if the IOManager does 
 not shut down properly. This can be the case when the TaskManagers are 
 terminated by Flink's shell scripts.
 A solution could be to store all channel files of one TaskManager in a 
 uniquely identifiable directory and to register a shutdown hook which deletes 
 this directory upon termination.





[jira] [Comment Edited] (FLINK-1556) JobClient does not wait until a job failed completely if submission exception

2015-02-17 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324364#comment-14324364
 ] 

Robert Metzger edited comment on FLINK-1556 at 2/17/15 4:15 PM:


Also, it seems that these fail-fast jobs are not properly removed from the 
jobmanager?

http://imgur.com/PyuQEfm



was (Author: rmetzger):
Also, it seems that these failearily jobs are not properly removed from the 
jobmanager?

http://imgur.com/PyuQEfm


 JobClient does not wait until a job failed completely if submission exception
 -

 Key: FLINK-1556
 URL: https://issues.apache.org/jira/browse/FLINK-1556
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Assignee: Till Rohrmann

 If an exception occurs during job submission, the {{JobClient}} receives a 
 {{SubmissionFailure}}. Upon receiving this message, the {{JobClient}} 
 terminates itself and returns the error to the {{Client}}. This indicates to 
 the user that the job has failed completely, which is not necessarily 
 true. 
 If the user submits another job directly after such a failure, it might 
 be the case that not all slots of the formerly failed job have been returned. 
 This can lead to a {{NoRessourceAvailableException}}.
 We can solve this problem by waiting for the completion of the job failure in 
 the {{JobClient}}.





[jira] [Commented] (FLINK-1556) JobClient does not wait until a job failed completely if submission exception

2015-02-17 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324364#comment-14324364
 ] 

Robert Metzger commented on FLINK-1556:
---

Also, it seems that these fail-fast jobs are not properly removed from the 
jobmanager?

http://imgur.com/PyuQEfm


 JobClient does not wait until a job failed completely if submission exception
 -

 Key: FLINK-1556
 URL: https://issues.apache.org/jira/browse/FLINK-1556
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Assignee: Till Rohrmann

 If an exception occurs during job submission, the {{JobClient}} receives a 
 {{SubmissionFailure}}. Upon receiving this message, the {{JobClient}} 
 terminates itself and returns the error to the {{Client}}. This indicates to 
 the user that the job has failed completely, which is not necessarily 
 true. 
 If the user submits another job directly after such a failure, it might 
 be the case that not all slots of the formerly failed job have been returned. 
 This can lead to a {{NoRessourceAvailableException}}.
 We can solve this problem by waiting for the completion of the job failure in 
 the {{JobClient}}.





[jira] [Created] (FLINK-1573) Add per-job metrics to flink.

2015-02-17 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1573:
-

 Summary: Add per-job metrics to flink.
 Key: FLINK-1573
 URL: https://issues.apache.org/jira/browse/FLINK-1573
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger


With FLINK-1501, we have JVM-specific metrics (mainly monitoring the TMs).

With this task, I would like to add metrics which are job-specific.






[jira] [Commented] (FLINK-1556) JobClient does not wait until a job failed completely if submission exception

2015-02-17 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324359#comment-14324359
 ] 

Robert Metzger commented on FLINK-1556:
---

I think showing the whole stacktrace of the exception is helpful to understand 
the deployment issue better.

 JobClient does not wait until a job failed completely if submission exception
 -

 Key: FLINK-1556
 URL: https://issues.apache.org/jira/browse/FLINK-1556
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Assignee: Till Rohrmann

 If an exception occurs during job submission, the {{JobClient}} receives a 
 {{SubmissionFailure}}. Upon receiving this message, the {{JobClient}} 
 terminates itself and returns the error to the {{Client}}. This indicates to 
 the user that the job has failed completely, which is not necessarily 
 true. 
 If the user submits another job directly after such a failure, it might 
 be the case that not all slots of the formerly failed job have been returned. 
 This can lead to a {{NoRessourceAvailableException}}.
 We can solve this problem by waiting for the completion of the job failure in 
 the {{JobClient}}.





[jira] [Created] (FLINK-1572) Output directories are created before input paths are checked

2015-02-17 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1572:
-

 Summary: Output directories are created before input paths are 
checked
 Key: FLINK-1572
 URL: https://issues.apache.org/jira/browse/FLINK-1572
 Project: Flink
  Issue Type: Improvement
  Components: JobManager
Affects Versions: 0.9
Reporter: Robert Metzger
Priority: Minor


Flink first creates the output directories for a job before creating the 
input splits.
If a job's input directories are wrong, the system will have created output 
directories for a job that failed.

It would be much better if the system created the output directories on 
demand, just before data is actually written.
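The on-demand pattern can be sketched without any Flink classes (a toy illustration, not the JobManager code): the writer only creates its output directory when the first record arrives, so a job that fails while checking its inputs never leaves an empty output directory behind.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class LazyDirectoryWriter {
    private final File target;
    private BufferedWriter writer; // null until the first record arrives

    public LazyDirectoryWriter(File target) {
        this.target = target;
    }

    public void write(String record) throws IOException {
        if (writer == null) {
            target.getParentFile().mkdirs(); // directory created on demand
            writer = new BufferedWriter(new FileWriter(target));
        }
        writer.write(record);
        writer.newLine();
    }

    public void close() throws IOException {
        if (writer != null) {
            writer.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File out = new File(System.getProperty("java.io.tmpdir"),
                "lazy-demo-" + System.nanoTime() + "/part-0");
        LazyDirectoryWriter w = new LazyDirectoryWriter(out);
        System.out.println("before: " + out.getParentFile().exists());
        w.write("record-1");
        w.close();
        System.out.println("after: " + out.exists());
    }
}
```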





[jira] [Commented] (FLINK-1388) POJO support for writeAsCsv

2015-02-15 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321961#comment-14321961
 ] 

Robert Metzger commented on FLINK-1388:
---

Cool.
Some feedback:
- I think an integration with {{DataSet.internalWriteAsCsv()}} would be nice.
- From there, you can also pass the TypeInformation into the method (right now, 
the code does the type extraction on every tuple; that won't work on large 
data sets).
- I think it would be much easier for you to extend 
{{FileOutputFormat}}. It will do all the file-related work for you (your code 
does not allow writing files to HDFS).
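The reflection step such a format needs can be shown in isolation (a toy stand-in that deliberately avoids Flink classes like {{FileOutputFormat}}, so it is not the suggested integration itself; the class name and field-order string are made up):

```java
import java.lang.reflect.Field;

public class PojoCsvLine {

    /** Renders one POJO as a CSV line, in the given field order. */
    public static String toCsv(Object pojo, String fieldOrder) throws Exception {
        StringBuilder line = new StringBuilder();
        String[] names = fieldOrder.split(",");
        for (int i = 0; i < names.length; i++) {
            Field f = pojo.getClass().getDeclaredField(names[i].trim());
            f.setAccessible(true); // fields may be non-public
            if (i > 0) {
                line.append(", ");
            }
            line.append(f.get(pojo));
        }
        return line.toString();
    }

    // Matches the MyPojo example from the issue description.
    static class MyPojo {
        String a = "Hello World";
        int b = 42;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toCsv(new MyPojo(), "a,b"));
    }
}
```

Resolving the {{Field}} objects once up front (rather than per record) is exactly the kind of work that passing the TypeInformation into the method avoids.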



 POJO support for writeAsCsv
 ---

 Key: FLINK-1388
 URL: https://issues.apache.org/jira/browse/FLINK-1388
 Project: Flink
  Issue Type: New Feature
  Components: Java API
Reporter: Timo Walther
Assignee: Adnan Khan
Priority: Minor

 It would be great if one could simply write out POJOs in CSV format.
 {code}
 public class MyPojo {
String a;
int b;
 }
 {code}
 to:
 {code}
 # CSV file of org.apache.flink.MyPojo: String a, int b
 Hello World, 42
 Hello World 2, 47
 ...
 {code}





[jira] [Created] (FLINK-1555) Add utility to log the serializers of composite types

2015-02-16 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1555:
-

 Summary: Add utility to log the serializers of composite types
 Key: FLINK-1555
 URL: https://issues.apache.org/jira/browse/FLINK-1555
 Project: Flink
  Issue Type: Improvement
Reporter: Robert Metzger
Priority: Minor


Users affected by poor performance might want to understand how Flink is 
serializing their data.

Therefore, it would be cool to have a utility which logs the serializers 
like this:
{{SerializerUtils.getSerializers(TypeInformation<POJO> t);}}
to get 
{code}
PojoSerializer
TupleSerializer
  IntSer
  DateSer
  GenericTypeSer(java.sql.Date)
PojoSerializer
  GenericTypeSer(HashMap)
{code}
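The indented tree output could be produced along these lines (a toy stand-in: plain reflection approximates the nesting that Flink's TypeInformation would actually provide; the class names here are invented):

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Date;
import java.util.HashMap;

public class SerializerTreeLogger {

    // Example user type with nested composite fields.
    static class Order {
        int id;
        Date created;
        Customer customer;
    }

    static class Customer {
        String name;
        HashMap<String, String> attributes;
    }

    /** Prints one line per type, indented by nesting depth. */
    public static void print(Class<?> type, int depth) {
        char[] pad = new char[depth * 2];
        java.util.Arrays.fill(pad, ' ');
        System.out.println(new String(pad) + type.getSimpleName());
        // Recurse only into the user's own composite types.
        if (type.getEnclosingClass() == SerializerTreeLogger.class) {
            for (Field f : type.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    print(f.getType(), depth + 1);
                }
            }
        }
    }

    public static void main(String[] args) {
        print(Order.class, 0);
    }
}
```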





[jira] [Created] (FLINK-1588) Load flink configuration also from classloader

2015-02-20 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1588:
-

 Summary: Load flink configuration also from classloader
 Key: FLINK-1588
 URL: https://issues.apache.org/jira/browse/FLINK-1588
 Project: Flink
  Issue Type: New Feature
Reporter: Robert Metzger


The GlobalConfiguration object should also check if it finds the 
flink-config.yaml in the classpath and load it from there.

This allows users to inject configuration files in local standalone or 
embedded environments.
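The classpath lookup itself is a one-liner; a sketch of the fallback (the file name is taken from the issue text, and this is a standalone illustration, not GlobalConfiguration itself):

```java
import java.io.InputStream;

public class ClasspathConfigLoader {

    /** Returns the resource stream, or null if the config is not on the classpath. */
    public static InputStream open(String resourceName) {
        // Try the context class loader first, then the loader of this class.
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        InputStream in = cl != null ? cl.getResourceAsStream(resourceName) : null;
        if (in == null) {
            in = ClasspathConfigLoader.class.getResourceAsStream("/" + resourceName);
        }
        return in; // null means: fall back to loading from the filesystem
    }

    public static void main(String[] args) {
        // No flink-config.yaml on this classpath, so the fallback path is taken.
        System.out.println(open("flink-config.yaml") == null
                ? "filesystem fallback" : "classpath hit");
    }
}
```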





[jira] [Created] (FLINK-1589) Add option to pass Configuration to LocalExecutor

2015-02-20 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1589:
-

 Summary: Add option to pass Configuration to LocalExecutor
 Key: FLINK-1589
 URL: https://issues.apache.org/jira/browse/FLINK-1589
 Project: Flink
  Issue Type: New Feature
Reporter: Robert Metzger


Right now it's not possible for users to pass custom configuration values to 
Flink when running it from within an IDE.

It would be very convenient to be able to create a local execution environment 
that allows passing configuration files.





[jira] [Assigned] (FLINK-1589) Add option to pass Configuration to LocalExecutor

2015-02-20 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1589:
-

Assignee: Robert Metzger

 Add option to pass Configuration to LocalExecutor
 -

 Key: FLINK-1589
 URL: https://issues.apache.org/jira/browse/FLINK-1589
 Project: Flink
  Issue Type: New Feature
Reporter: Robert Metzger
Assignee: Robert Metzger

 Right now it's not possible for users to pass custom configuration values to 
 Flink when running it from within an IDE.
 It would be very convenient to be able to create a local execution 
 environment that allows passing configuration files.





[jira] [Assigned] (FLINK-1555) Add utility to log the serializers of composite types

2015-02-20 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1555:
-

Assignee: Robert Metzger

 Add utility to log the serializers of composite types
 -

 Key: FLINK-1555
 URL: https://issues.apache.org/jira/browse/FLINK-1555
 Project: Flink
  Issue Type: Improvement
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Minor

 Users affected by poor performance might want to understand how Flink is 
 serializing their data.
 Therefore, it would be cool to have a tool utility which logs the serializers 
 like this:
 {{SerializerUtils.getSerializers(TypeInformationPOJO t);}}
 to get 
 {code}
 PojoSerializer
 TupleSerializer
   IntSer
   DateSer
   GenericTypeSer(java.sql.Date)
 PojoSerializer
   GenericTypeSer(HashMap)
 {code}





[jira] [Created] (FLINK-1590) Log environment information also in YARN mode

2015-02-20 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1590:
-

 Summary: Log environment information also in YARN mode
 Key: FLINK-1590
 URL: https://issues.apache.org/jira/browse/FLINK-1590
 Project: Flink
  Issue Type: Improvement
Reporter: Robert Metzger
Priority: Minor








[jira] [Commented] (FLINK-1579) Create a Flink History Server

2015-02-18 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325851#comment-14325851
 ] 

Robert Metzger commented on FLINK-1579:
---

We have to investigate on that: 
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/TimelineServer.html


 Create a Flink History Server
 -

 Key: FLINK-1579
 URL: https://issues.apache.org/jira/browse/FLINK-1579
 Project: Flink
  Issue Type: New Feature
Affects Versions: 0.9
Reporter: Robert Metzger

 Right now it's not possible to analyze the job results for jobs that ran on 
 YARN, because we'll lose the information once the JobManager has stopped.
 Therefore, I propose to implement a Flink History Server which serves the 
 results from these jobs.
 I haven't started thinking about the implementation, but I suspect it 
 involves some JSON files stored in HDFS :)





[jira] [Created] (FLINK-1579) Create a Flink History Server

2015-02-18 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1579:
-

 Summary: Create a Flink History Server
 Key: FLINK-1579
 URL: https://issues.apache.org/jira/browse/FLINK-1579
 Project: Flink
  Issue Type: New Feature
Affects Versions: 0.9
Reporter: Robert Metzger


Right now it's not possible to analyze the job results for jobs that ran on 
YARN, because we'll lose the information once the JobManager has stopped.

Therefore, I propose to implement a Flink History Server which serves the 
results from these jobs.

I haven't started thinking about the implementation, but I suspect it involves 
some JSON files stored in HDFS :)





[jira] [Resolved] (FLINK-1395) Add Jodatime support to Kryo

2015-02-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1395.
---
   Resolution: Fixed
Fix Version/s: 0.9

Resolved for master 0.9 in 
http://git-wip-us.apache.org/repos/asf/flink/commit/5015ab49

 Add Jodatime support to Kryo
 

 Key: FLINK-1395
 URL: https://issues.apache.org/jira/browse/FLINK-1395
 Project: Flink
  Issue Type: Sub-task
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9








[jira] [Resolved] (FLINK-1392) Serializing Protobuf - issue 1

2015-02-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1392.
---
   Resolution: Fixed
Fix Version/s: 0.9

Resolved for 0.9 in master with commit: 
http://git-wip-us.apache.org/repos/asf/flink/commit/77c45484

 Serializing Protobuf - issue 1
 --

 Key: FLINK-1392
 URL: https://issues.apache.org/jira/browse/FLINK-1392
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 0.8, 0.9
Reporter: Felix Neutatz
Assignee: Robert Metzger
Priority: Minor
 Fix For: 0.9, 0.8.1


 Hi, I started to experiment with Parquet using Protobuf.
 When I use the standard Protobuf class: 
 com.twitter.data.proto.tutorial.AddressBookProtos
 The code which I run, can be found here: 
 [https://github.com/FelixNeutatz/incubator-flink/blob/ParquetAtFlink/flink-addons/flink-hadoop-compatibility/src/main/java/org/apache/flink/hadoopcompatibility/mapreduce/example/ParquetProtobufOutput.java]
 I get the following exception:
 {code:xml}
 Exception in thread "main" java.lang.Exception: Deserializing the 
 InputFormat (org.apache.flink.api.java.io.CollectionInputFormat) failed: 
 Could not read the user code wrapper: Error while deserializing element from 
 collection
   at 
 org.apache.flink.runtime.jobgraph.InputFormatVertex.initializeOnMaster(InputFormatVertex.java:60)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$5.apply(JobManager.scala:179)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$5.apply(JobManager.scala:172)
   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
   at 
 org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:172)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
   at 
 scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:34)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:27)
   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
   at 
 org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:27)
   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
   at 
 org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:52)
   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
   at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
   at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
   at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: 
 org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could 
 not read the user code wrapper: Error while deserializing element from 
 collection
   at 
 org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:285)
   at 
 org.apache.flink.runtime.jobgraph.InputFormatVertex.initializeOnMaster(InputFormatVertex.java:57)
   ... 25 more
 Caused by: java.io.IOException: Error while deserializing element from 
 collection
   at 
 org.apache.flink.api.java.io.CollectionInputFormat.readObject(CollectionInputFormat.java:108)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
   at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
   at 
 

[jira] [Resolved] (FLINK-1417) Automatically register nested types at Kryo

2015-02-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1417.
---
Resolution: Fixed

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/354efec0

 Automatically register nested types at Kryo
 ---

 Key: FLINK-1417
 URL: https://issues.apache.org/jira/browse/FLINK-1417
 Project: Flink
  Issue Type: Improvement
  Components: Java API, Scala API
Reporter: Stephan Ewen
Assignee: Robert Metzger
 Fix For: 0.9


 Currently, the {{GenericTypeInfo}} registers the class of the type at Kryo. 
 In order to get the best performance, it should recursively walk the classes 
 and make sure that it registered all contained subtypes.





[jira] [Resolved] (FLINK-1391) Kryo fails to properly serialize avro collection types

2015-02-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1391.
---
   Resolution: Fixed
Fix Version/s: 0.9

Resolved for 0.9 into master with 
http://git-wip-us.apache.org/repos/asf/flink/commit/7e39bc67

 Kryo fails to properly serialize avro collection types
 --

 Key: FLINK-1391
 URL: https://issues.apache.org/jira/browse/FLINK-1391
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 0.8, 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9, 0.8.1


 Before FLINK-610, Avro was the default generic serializer.
 Now, special types coming from Avro are handled by Kryo, which seems to 
 cause errors like:
 {code}
 Exception in thread "main" 
 org.apache.flink.runtime.client.JobExecutionException: 
 java.lang.NullPointerException
   at org.apache.avro.generic.GenericData$Array.add(GenericData.java:200)
   at 
 com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:116)
   at 
 com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22)
   at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
   at 
 org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:143)
   at 
 org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:148)
   at 
 org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:244)
   at 
 org.apache.flink.runtime.plugable.DeserializationDelegate.read(DeserializationDelegate.java:56)
   at 
 org.apache.flink.runtime.io.network.serialization.AdaptiveSpanningRecordDeserializer.getNextRecord(AdaptiveSpanningRecordDeserializer.java:71)
   at 
 org.apache.flink.runtime.io.network.channels.InputChannel.readRecord(InputChannel.java:189)
   at 
 org.apache.flink.runtime.io.network.gates.InputGate.readRecord(InputGate.java:176)
   at 
 org.apache.flink.runtime.io.network.api.MutableRecordReader.next(MutableRecordReader.java:51)
   at 
 org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:53)
   at 
 org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:170)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
   at java.lang.Thread.run(Thread.java:744)
 {code}





[jira] [Commented] (FLINK-1388) POJO support for writeAsCsv

2015-02-18 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325971#comment-14325971
 ] 

Robert Metzger commented on FLINK-1388:
---

Yes, adding additional tests is always good.

 POJO support for writeAsCsv
 ---

 Key: FLINK-1388
 URL: https://issues.apache.org/jira/browse/FLINK-1388
 Project: Flink
  Issue Type: New Feature
  Components: Java API
Reporter: Timo Walther
Assignee: Adnan Khan
Priority: Minor

 It would be great if one could simply write out POJOs in CSV format.
 {code}
 public class MyPojo {
String a;
int b;
 }
 {code}
 to:
 {code}
 # CSV file of org.apache.flink.MyPojo: String a, int b
 Hello World, 42
 Hello World 2, 47
 ...
 {code}





[jira] [Created] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes

2015-02-17 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1567:
-

 Summary: Add option to switch between Avro and Kryo serialization 
for GenericTypes
 Key: FLINK-1567
 URL: https://issues.apache.org/jira/browse/FLINK-1567
 Project: Flink
  Issue Type: Improvement
Affects Versions: 0.8, 0.9
Reporter: Robert Metzger


Allow users to switch the underlying serializer for GenericTypes.





[jira] [Created] (FLINK-1569) Object reuse mode is not working with KeySelector functions.

2015-02-17 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1569:
-

 Summary: Object reuse mode is not working with KeySelector 
functions.
 Key: FLINK-1569
 URL: https://issues.apache.org/jira/browse/FLINK-1569
 Project: Flink
  Issue Type: Bug
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger


The following code works correctly when object reuse is switched off.
When switching it on, the results are wrong.
Using a string-based key selection (putting "name") works for both cases.
{code}
@Test
public void testWithAvroGenericSer() throws Exception {
	final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
//	env.getConfig().enableObjectReuse();
	Path in = new Path(inFile.getAbsoluteFile().toURI());

	AvroInputFormat<User> users = new AvroInputFormat<User>(in, User.class);
	DataSet<User> usersDS = env.createInput(users);

	DataSet<Tuple2<String, Integer>> res = usersDS.groupBy(new KeySelector<User, String>() {
		@Override
		public String getKey(User value) throws Exception {
			return String.valueOf(value.getName());
		}
	}).reduceGroup(new GroupReduceFunction<User, Tuple2<String, Integer>>() {
		@Override
		public void reduce(Iterable<User> values, Collector<Tuple2<String, Integer>> out) throws Exception {
			for (User u : values) {
				out.collect(new Tuple2<String, Integer>(u.getName().toString(), 1));
			}
		}
	});

	res.writeAsText(resultPath);
	res.print();
	env.execute("Avro Key selection");

	expected = "(Charlie,1)\n(Alyssa,1)\n";
}
{code}





[jira] [Commented] (FLINK-1452) Add flink-contrib maven module and README.md with the rules

2015-01-27 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293240#comment-14293240
 ] 

Robert Metzger commented on FLINK-1452:
---

In my opinion:
{{flink-addons}}:
- are usually part of {{flink-dist}}
- are maintained by the core committers
- are documented in the main documentation
- are eventually moved out of flink-addons once they are stable (for 
example {{flink-yarn}} recently, and probably {{flink-streaming}} soon)

{{flink-contrib}}:
- Not part of {{flink-dist}}
- documentation should live somewhere in the code, not in the main repo

We could rename {{flink-contrib}} to {{flink-user-contrib}}, but that would 
make the name pretty long.
Also, we could rename {{flink-addons}} to {{flink-unstable}}, but that's 
probably a bad name ;) 

I see the {{flink-addons}} as our internal incubator. Other potential names 
could be {{flink-beta}}, {{flink-incubator}}

 Add flink-contrib maven module and README.md with the rules
 -

 Key: FLINK-1452
 URL: https://issues.apache.org/jira/browse/FLINK-1452
 Project: Flink
  Issue Type: New Feature
  Components: flink-contrib
Reporter: Robert Metzger
Assignee: Robert Metzger

 I'll also create a JIRA component





[jira] [Resolved] (FLINK-1433) Add HADOOP_CLASSPATH to start scripts

2015-01-27 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1433.
---
   Resolution: Fixed
Fix Version/s: 0.9
 Assignee: Robert Metzger

Fixed for 0.9 in http://git-wip-us.apache.org/repos/asf/flink/commit/a5150a90
Fixed for 0.8.1 in http://git-wip-us.apache.org/repos/asf/flink/commit/2387a08e

 Add HADOOP_CLASSPATH to start scripts
 -

 Key: FLINK-1433
 URL: https://issues.apache.org/jira/browse/FLINK-1433
 Project: Flink
  Issue Type: Improvement
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9, 0.8.1


 With the Hadoop file system wrapper, it's important to have access to the 
 Hadoop filesystem classes.
 HADOOP_CLASSPATH seems to be a standard environment variable used by 
 Hadoop for such libraries.
 Deployments like Google Compute Cloud set this variable to contain the 
 Google Cloud Storage Hadoop wrapper. So if users want to use Cloud 
 Storage in a non-YARN environment, we need to address this issue.
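The linked commits contain the actual change; as a rough sketch (the variable names below are illustrative, not the real script's), the start scripts only need to append {{HADOOP_CLASSPATH}}, when set, to the classpath the daemons are launched with:

```shell
# Illustrative sketch only, not the actual start-script code: append
# HADOOP_CLASSPATH (if the environment provides one) to the classpath used
# to launch the Flink daemons. The default value below merely stands in for
# something a deployment like Google Compute Cloud would set, e.g. the
# Google Cloud Storage connector jar.
HADOOP_CLASSPATH="${HADOOP_CLASSPATH:-/tmp/gcs-connector.jar}"
FLINK_CLASSPATH="/opt/flink/lib/*"
if [ -n "$HADOOP_CLASSPATH" ]; then
  FLINK_CLASSPATH="$FLINK_CLASSPATH:$HADOOP_CLASSPATH"
fi
echo "daemon classpath: $FLINK_CLASSPATH"
```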



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1467) Job deployment fails with NPE on JobManager, if TMs did not start properly

2015-01-31 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1467:
-

 Summary: Job deployment fails with NPE on JobManager, if TMs did 
not start properly
 Key: FLINK-1467
 URL: https://issues.apache.org/jira/browse/FLINK-1467
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Reporter: Robert Metzger


I have a Flink cluster where all TaskManagers died (misconfiguration). 
The JobManager needs more than 200 seconds to realize that (on the TaskManagers 
overview, you see timeouts > 200). When submitting a job, you'll get the 
following exception:

{code}
org.apache.flink.client.program.ProgramInvocationException: The program 
execution failed: java.lang.Exception: Failed to deploy the task CHAIN 
DataSource (Generator: class io.airlift.tpch.NationGenerator) -> Map (Map at 
writeAsFormattedText(DataSet.java:1132)) (1/1) - execution #0 to slot SubSlot 0 
(f8d11026ec5a11f0b273184c74ec4f29 (0) - ALLOCATED/ALIVE): 
java.lang.NullPointerException
at 
org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:346)
at 
org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:248)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at 
org.apache.flink.yarn.YarnTaskManager$$anonfun$receiveYarnMessages$1.applyOrElse(YarnTaskManager.scala:32)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:41)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:27)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at 
org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:27)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at 
org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:78)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

at 
org.apache.flink.runtime.executiongraph.Execution$2.onComplete(Execution.java:311)
at akka.dispatch.OnComplete.internal(Future.scala:247)
at akka.dispatch.OnComplete.internal(Future.scala:244)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:174)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:171)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at 
scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

at org.apache.flink.client.program.Client.run(Client.java:345)
at org.apache.flink.client.program.Client.run(Client.java:304)
at org.apache.flink.client.program.Client.run(Client.java:298)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:55)
at flink.generators.programs.TPCHGenerator.main(TPCHGenerator.java:80)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
at org.apache.flink.client.program.Client.run(Client.java:250)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:389)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:358)
at 
org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1068)
{code}

[jira] [Commented] (FLINK-1438) ClassCastException for Custom InputSplit in local mode and invalid type code in distributed mode

2015-01-31 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299802#comment-14299802
 ] 

Robert Metzger commented on FLINK-1438:
---

Do you think this is related? https://issues.apache.org/jira/browse/FLINK-1390

 ClassCastException for Custom InputSplit in local mode and invalid type code 
 in distributed mode
 

 Key: FLINK-1438
 URL: https://issues.apache.org/jira/browse/FLINK-1438
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.8, 0.9
Reporter: Fabian Hueske
Priority: Minor

 Jobs with custom InputSplits fail with a ClassCastException such as 
 {{org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit 
 cannot be cast to 
 org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit}} 
 if executed on a local setup. 
 This issue is probably related to different ClassLoaders used by the 
 JobManager when InputSplits are generated and when they are handed to the 
 InputFormat by the TaskManager. Moving the class of the custom InputSplit 
 into the {{./lib}} folder and removing it from the job's JAR makes the job work.
 To reproduce the bug, run the following job on a local setup. 
 {code}
 public class CustomSplitTestJob {

   public static void main(String[] args) throws Exception {
     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
     DataSet<String> x = env.createInput(new TestFileInputFormat());
     x.print();
     env.execute();
   }

   public static class TestFileInputFormat implements InputFormat<String, TestFileInputSplit> {

     @Override
     public void configure(Configuration parameters) {
     }

     @Override
     public BaseStatistics getStatistics(BaseStatistics cachedStatistics) throws IOException {
       return null;
     }

     @Override
     public TestFileInputSplit[] createInputSplits(int minNumSplits) throws IOException {
       return new TestFileInputSplit[]{new TestFileInputSplit()};
     }

     @Override
     public InputSplitAssigner getInputSplitAssigner(TestFileInputSplit[] inputSplits) {
       return new LocatableInputSplitAssigner(inputSplits);
     }

     @Override
     public void open(TestFileInputSplit split) throws IOException {
     }

     @Override
     public boolean reachedEnd() throws IOException {
       return false;
     }

     @Override
     public String nextRecord(String reuse) throws IOException {
       return null;
     }

     @Override
     public void close() throws IOException {
     }
   }

   public static class TestFileInputSplit extends FileInputSplit {
   }
 }
 {code}
 The same happens in distributed mode, except that Akka aborts the 
 transmission of the input split with a meaningless {{invalid type code: 00}}.
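To illustrate the suspected cause (a standalone sketch, not Flink code): when two different ClassLoaders each define a class from the same bytes, the JVM treats them as two distinct types, so a cast between their instances fails even though the class names are identical.

```java
import java.io.InputStream;

public class ClassLoaderCastDemo {

    // Stand-in for a user-defined InputSplit class.
    public static class Split {
    }

    // A loader that defines Split itself instead of delegating to its parent,
    // so each instance yields a distinct Class object for the same bytes.
    static class IsolatingLoader extends ClassLoader {
        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (name.equals(Split.class.getName())) {
                try (InputStream in = ClassLoaderCastDemo.class
                        .getResourceAsStream("/" + name.replace('.', '/') + ".class")) {
                    byte[] bytes = in.readAllBytes();
                    return defineClass(name, bytes, 0, bytes.length);
                } catch (Exception e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
            return super.loadClass(name, resolve);
        }
    }

    public static void main(String[] args) throws Exception {
        // Instantiate Split via the isolating loader.
        Object foreign = new IsolatingLoader()
                .loadClass(Split.class.getName())
                .getDeclaredConstructor()
                .newInstance();
        System.out.println("same name: "
                + foreign.getClass().getName().equals(Split.class.getName()));
        try {
            // Fails: the foreign class was defined by a different loader.
            Split s = (Split) foreign;
            System.out.println("cast succeeded: " + s);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException despite identical class name");
        }
    }
}
```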



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-25 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291060#comment-14291060
 ] 

Robert Metzger commented on FLINK-629:
--

Cool. I'm glad it's working. I'll push the fix to the master and release-0.8 branches.

Thank you for verifying the fix.

 Add support for null values to the java api
 ---

 Key: FLINK-629
 URL: https://issues.apache.org/jira/browse/FLINK-629
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Stephan Ewen
Assignee: Gyula Fora
Priority: Critical
  Labels: github-import
 Fix For: pre-apache

 Attachments: Selection_006.png, SimpleTweetInputFormat.java, 
 Tweet.java, model.tar.gz


 Currently, many runtime operations fail when encountering a null value. Tuple 
 serialization should allow null fields.
 I suggest adding a method to the tuples called `getFieldNotNull()` which 
 throws a meaningful exception when the accessed field is null. That way, we 
 simplify the logic of operators that should not deal with null fields, like 
 key grouping or aggregations.
 Even though SQL allows grouping and aggregating null values, I suggest 
 excluding this from the Java API, because the SQL semantics of aggregating 
 null fields are messy.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/629
 Created by: [StephanEwen|https://github.com/StephanEwen]
 Labels: enhancement, java api, 
 Milestone: Release 0.5.1
 Created at: Wed Mar 26 00:27:49 CET 2014
 State: open
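A minimal sketch of what the proposed accessor could look like. This is hypothetical: the {{Tuple2}} stand-in and {{NullFieldException}} below only mirror the idea from the description for illustration and are not the real Flink classes.

```java
public class GetFieldNotNullSketch {

    // Illustrative exception carrying a meaningful message, as suggested
    // in the issue description (name is hypothetical).
    static class NullFieldException extends RuntimeException {
        NullFieldException(int pos) {
            super("Field " + pos + " is null, but expected to hold a value");
        }
    }

    // Minimal stand-in for a two-field tuple.
    static class Tuple2<T0, T1> {
        T0 f0;
        T1 f1;

        Tuple2(T0 f0, T1 f1) {
            this.f0 = f0;
            this.f1 = f1;
        }

        @SuppressWarnings("unchecked")
        <T> T getField(int pos) {
            switch (pos) {
                case 0: return (T) f0;
                case 1: return (T) f1;
                default: throw new IndexOutOfBoundsException(String.valueOf(pos));
            }
        }

        // The proposed accessor: fail fast with a descriptive exception
        // instead of a NullPointerException deep inside an operator.
        <T> T getFieldNotNull(int pos) {
            T field = getField(pos);
            if (field == null) {
                throw new NullFieldException(pos);
            }
            return field;
        }
    }

    public static void main(String[] args) {
        Tuple2<String, Integer> t = new Tuple2<>("hello", null);
        System.out.println((String) t.getFieldNotNull(0));
        try {
            t.getFieldNotNull(1);
        } catch (NullFieldException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```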



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

