Diagnosing TaskManager disappearance

2015-10-29 Thread Greg Hogan
I am testing again on a 64 node cluster (the JobManager is running fine having reduced some operator's parallelism and fixed the string conversion performance). I am seeing TaskManagers drop like flies every other job or so. I am not seeing any output in the .out log files corresponding to the

Re: Caching information from a stream

2015-10-29 Thread Andra Lungu
Thanks Max ^^ On Wed, Oct 28, 2015 at 8:41 PM, Maximilian Michels wrote: > Oups, forgot the mapper :) > > static class StatefulMapper extends RichMapFunction Long>, Tuple2> { > >private OperatorState counter; > >@Override >public

Re: Diagnosing TaskManager disappearance

2015-10-29 Thread Maximilian Michels
Hi Greg, Thanks for reporting. You wrote you didn't see any output in the .out files of the task managers. What about the .log files of these instances? Where and when did you produce the thread dump you included? Thanks, Max On Thu, Oct 29, 2015 at 1:46 PM, Greg Hogan

neo4j - Flink connector

2015-10-29 Thread Vasiliki Kalavri
Hello everyone, Martin, Martin, Alex (cc'ed) and myself have started discussing the implementation of a neo4j-Flink connector. I've opened a corresponding JIRA (FLINK-2941) containing an initial document [1], but we'd also like to share our ideas here to engage the community and get your feedback.

New JobManager web frontend

2015-10-29 Thread Matthias J. Sax
Hi, I was just playing with the new JobManager web frontend and am missing a button to cancel a running job. Is there no such button, or is it hidden somewhere? -Matthias

Re: New JobManager web frontend

2015-10-29 Thread Maximilian Michels
Hi Matthias, There is currently no cancel button in the web frontend. Just filed this ticket today: https://issues.apache.org/jira/browse/FLINK-2939 Cheers, Max On Thu, Oct 29, 2015 at 4:49 PM, Matthias J. Sax wrote: > Hi, > > I was just playing with the new JobManager web

Re: Diagnosing TaskManager disappearance

2015-10-29 Thread Greg Hogan
I recently discovered that AWS uses NUMA for its largest nodes. An example c4.8xlarge: $ numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 4 5 6 7 8 18 19 20 21 22 23 24 25 26 node 0 size: 29813 MB node 0 free: 24537 MB node 1 cpus: 9 10 11 12 13 14 15 16 17 27 28 29 30 31 32 33 34

Fwd: neo4j - Flink connector

2015-10-29 Thread Vasiliki Kalavri
Forwarding these here to keep dev@ in the loop :) -- Forwarded message -- From: Martin Junghanns Date: 29 October 2015 at 18:37 Subject: Re: neo4j - Flink connector To: Martin Liesenberg , Vasia Kalavri <

[jira] [Created] (FLINK-2942) Dangling operators in web UI's program visualization

2015-10-29 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2942: Summary: Dangling operators in web UI's program visualization Key: FLINK-2942 URL: https://issues.apache.org/jira/browse/FLINK-2942 Project: Flink Issue

Re: Diagnosing TaskManager disappearance

2015-10-29 Thread Aljoscha Krettek
Could it be a problem that there are two TaskManagers running per machine? > On 29 Oct 2015, at 19:04, Greg Hogan wrote: > > I have memory logging enabled. Tail of TaskManager log on 10.0.88.140: > > 17:35:26,415 INFO > org.apache.flink.runtime.taskmanager.TaskManager

[jira] [Created] (FLINK-2943) Confusing Bytes/Records "read" and "write" labels in WebUI job view

2015-10-29 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2943: Summary: Confusing Bytes/Records "read" and "write" labels in WebUI job view Key: FLINK-2943 URL: https://issues.apache.org/jira/browse/FLINK-2943 Project: Flink

Re: Diagnosing TaskManager disappearance

2015-10-29 Thread Greg Hogan
I removed the use of numactl but left in starting two TaskManagers and am still seeing TaskManagers crash. From the JobManager log: 17:36:06,412 WARN akka.remote.ReliableDeliverySupervisor- Association with remote system [akka.tcp://flink@10.0.88.140:45742] has failed,

[jira] [Created] (FLINK-2940) Deploy multiple Scala versions for Maven artifacts

2015-10-29 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-2940: Summary: Deploy multiple Scala versions for Maven artifacts Key: FLINK-2940 URL: https://issues.apache.org/jira/browse/FLINK-2940 Project: Flink

[jira] [Created] (FLINK-2939) Add button to cancel jobs in new web frontend

2015-10-29 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-2939: Summary: Add button to cancel jobs in new web frontend Key: FLINK-2939 URL: https://issues.apache.org/jira/browse/FLINK-2939 Project: Flink Issue

Re: Scala 2.10/2.11 Maven dependencies

2015-10-29 Thread Maximilian Michels
Seems like we agree that we need artifacts for different versions of Scala on Maven. There also seems to be a preference for including the version in the artifact name. I've created an issue and marked it to be resolved for 1.0. For the 0.10 release, we will have binaries but no Maven artifacts.