Re: Flink 1.3 REST API wrapper for Scala

2017-06-12 Thread Flavio Pompermaier
Nice lib! Is it also available on Maven Central? On 13 Jun 2017 4:36 am, "Michael Reid" wrote: > I'm currently working on a project where I need to manage jobs > programmatically without being tied to Flink, so I wrote a small, > asynchronous Scala wrapper around the

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-12 Thread Kurt Young
Hi, I think the reason is your record is too large to do an in-memory combine. You can try to disable your combiner. Best, Kurt On Mon, Jun 12, 2017 at 9:55 PM, Sebastian Neef < gehax...@mailbox.tu-berlin.de> wrote: > Hi, > > when I'm running my Flink job on a small dataset, it successfully >

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-12 Thread Ted Yu
Sebastian: Are you using JDK 7 or JDK 8? For JDK 7, there was a bug w.r.t. the code cache getting full which affects performance. https://bugs.openjdk.java.net/browse/JDK-8051955 https://bugs.openjdk.java.net/browse/JDK-8074288 http://blog.andresteingress.com/2016/10/19/java-codecache Cheers On

Re: Exception in Flink 1.3.0

2017-06-12 Thread rhashmi
Any update on when 1.3.1 will be available? Our current copy is 1.2.0, but that has a separate issue (invalid type code: 00). http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/invalid-type-code-00-td13326.html#a13332

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-12 Thread Flavio Pompermaier
Try to see if in the output of the dmesg command there is some log about an OOM; the OS logs such info there. I had a similar experience recently... see [1] Best, Flavio [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-and-swapping-question-td13284.html On 12 Jun 2017

Re: At what point do watermarks get injected into the stream?

2017-06-12 Thread Fabian Hueske
Hi, each operator keeps track of the latest (and therefore maximum) watermark received from each of its inputs and sets its own internal time to the minimum of these per-input watermarks. In the example, Window(1) has two inputs (Map(1) and Map(2)). If Window(1) first receives a WM 33 from Map(1) it
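The rule Fabian describes (track the maximum watermark per input, advance the operator's own time to the minimum across inputs) can be modeled in a few lines. This is a plain-Python sketch of the logic, not Flink API; the class and method names are illustrative:

```python
class MultiInputOperator:
    """Models watermark tracking for an operator with several inputs."""

    def __init__(self, num_inputs):
        # Latest (hence maximum) watermark seen on each input; starts at -inf.
        self.input_watermarks = [float("-inf")] * num_inputs

    def on_watermark(self, input_index, wm):
        # Watermarks on one input only ever increase, so keep the maximum.
        self.input_watermarks[input_index] = max(self.input_watermarks[input_index], wm)
        # The operator's own event time is the minimum across all inputs.
        return min(self.input_watermarks)

op = MultiInputOperator(2)
op.on_watermark(0, 33)          # WM 33 from Map(1); other input still at -inf
print(op.on_watermark(1, 17))   # WM 17 from Map(2) -> operator time becomes 17
```

With inputs at 33 and 17 the operator's time is 17; it only advances past 17 once the slower input sends a higher watermark.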

Re: At what point do watermarks get injected into the stream?

2017-06-12 Thread Ray Ruvinskiy
Thanks! I had a couple of follow-up questions about the example in the documentation. Suppose Source 1 sends a watermark of 33, and Source 2 sends a watermark of 17. If I understand correctly, map(1) will forward the watermark of 33 to window(1) and window(2), and map(2) will forward the

Task and Operator Metrics in Flink 1.3

2017-06-12 Thread Dail, Christopher
I’m using the Flink 1.3.0 release and am not seeing all of the metrics that I would expect to see. I have Flink configured to write out metrics via statsd and I am consuming this with telegraf. Initially I thought this was an issue with telegraf parsing the data generated. I dumped all of the

Re: coGroup exception or something else in Gelly job

2017-06-12 Thread Kaepke, Marc
I’m working on an implementation of SemiClustering [1]. I used two graph models (Pregel aka. vertex-centric iteration, and gather-scatter). Short description of the algorithm: * input is a weighted, undirected graph * output is a set of greedy clusters * Each vertex V maintains a list
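For reference, the greedy cluster score from the Pregel SemiClustering algorithm (which this implementation presumably follows) is S_c = (I_c − f_B·B_c) / (V_c(V_c − 1)/2). A small Python sketch of the formula; parameter names and the example values are illustrative:

```python
def semi_cluster_score(internal_w, boundary_w, num_vertices, f_b=0.5):
    """Greedy score of a semi-cluster.

    internal_w   -- I_c: sum of weights of edges inside the cluster
    boundary_w   -- B_c: sum of weights of edges leaving the cluster
    num_vertices -- V_c: number of vertices in the cluster
    f_b          -- boundary-edge penalty factor in [0, 1]
    """
    if num_vertices < 2:
        return 0.0  # the denominator counts vertex pairs, undefined below 2
    max_internal_edges = num_vertices * (num_vertices - 1) / 2
    return (internal_w - f_b * boundary_w) / max_internal_edges

print(semi_cluster_score(internal_w=6.0, boundary_w=2.0, num_vertices=4))
```

Each superstep, a vertex scores the candidate clusters it could join and keeps the highest-scoring ones in its list.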

Re: Use Single Sink For All windows

2017-06-12 Thread rhashmi
Thanks Aljoscha for your response. I will give it a try. 1. Flink calls the *invoke* method of SinkFunction to dispatch aggregated information. My follow-up question here is: while the snapshotState method is in progress, if the sink receives another update then we might have mixed records, however per

ReduceFunction mechanism

2017-06-12 Thread nragon
Hi, Regarding ReduceFunction. Is reduce() called when there is only one record for a given key? Thanks
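Flink's ReduceFunction.reduce(v1, v2) is only invoked to combine two values, so for a key with a single record that record is emitted unchanged and the function is never called. The same contract can be modeled with a plain fold over a list (illustrative Python, not Flink code):

```python
calls = {"n": 0}

def my_reduce(a, b):
    # Combines two records for the same key, like ReduceFunction.reduce(v1, v2).
    calls["n"] += 1
    return a + b

def reduce_key(values, fn):
    # Mirrors the contract: a single value is forwarded without invoking fn.
    acc = values[0]
    for v in values[1:]:
        acc = fn(acc, v)
    return acc

print(reduce_key([42], my_reduce), calls["n"])       # 42 0 -> fn never called
print(reduce_key([1, 2, 3], my_reduce), calls["n"])  # 6 2 -> fn called n-1 times
```

So a downstream operator cannot assume reduce() ran; it only sees a value per key, combined or not.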

Re: coGroup exception or something else in Gelly job

2017-06-12 Thread Greg Hogan
I don't think it is possible for anyone to debug your exception without the source code. Storing the adjacency list within the Vertex value is not scalable. Can you share a basic description of the algorithm you are working to implement? On Mon, Jun 12, 2017 at 5:47 AM, Kaepke, Marc

Re: Use Single Sink For All windows

2017-06-12 Thread Aljoscha Krettek
Ah, I think now I get your problem. You could manually implement batching inside your SinkFunction. The SinkFunction would batch values in memory and periodically (based on the count of values and on a timeout) send these values as a single batch to MySQL. To ensure that data is not lost you
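The count-plus-timeout batching described here could look roughly like the following. This is a standalone Python model of just the buffering logic, not actual SinkFunction code; the flush target is a stub:

```python
import time

class BatchingBuffer:
    """Buffers values and flushes on a size limit or an age limit."""

    def __init__(self, max_count, max_age_s, flush_fn):
        self.max_count = max_count
        self.max_age_s = max_age_s
        self.flush_fn = flush_fn   # e.g. one batched INSERT to MySQL
        self.buf = []
        self.oldest = None

    def add(self, value, now=None):
        now = time.monotonic() if now is None else now
        if not self.buf:
            self.oldest = now
        self.buf.append(value)
        # Flush when the batch is full or the oldest element is too old.
        if len(self.buf) >= self.max_count or now - self.oldest >= self.max_age_s:
            self.flush()

    def flush(self):
        if self.buf:
            self.flush_fn(list(self.buf))
            self.buf.clear()

batches = []
b = BatchingBuffer(max_count=3, max_age_s=5.0, flush_fn=batches.append)
for v in [1, 2, 3, 4]:
    b.add(v, now=0.0)
print(batches)  # [[1, 2, 3]] -- first three flushed by count; 4 still buffered
```

To avoid losing the still-buffered values on failure, a real sink would additionally persist the buffer into checkpointed state in snapshotState and flush or restore it on recovery.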

Re: Cannot write record to fresh sort buffer. Record too large.

2017-06-12 Thread Stefan Richter
Hi, can you please take a look at your TM logs? I would expect that you can see a java.lang.OutOfMemoryError there. If this assumption is correct, you can try to: 1. Further decrease taskmanager.memory.fraction: this will cause the TaskManager to allocate less memory for managed memory
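The knob mentioned here lives in flink-conf.yaml. A sketch with illustrative values (tune both for your dataset and available RAM):

```yaml
# Fraction of free memory the TaskManager reserves as managed memory
# (default 0.7); lowering it leaves more heap room for large records.
taskmanager.memory.fraction: 0.5

# Total JVM heap per TaskManager in MB; raising it also helps when
# individual records are very large.
taskmanager.heap.mb: 4096
```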

Flink Streaming and K-Nearest-Neighbours

2017-06-12 Thread Alexandru Ciobanu
Hello Flink community, I would like to know how you would calculate k-nearest neighbours using the Flink streaming environment - is this even possible? What I currently have is a DataStream which comes from a socket. The messages from the socket are run through a map and a reduce function, thus
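One common pattern for streaming k-NN is to keep a bounded window of recent points and, for each query point, scan it for the k closest. A minimal Python sketch of that idea (not Flink code; in Flink the window would live in keyed state or a sliding window, and the distance metric would match your data):

```python
import heapq
from collections import deque

def knn(window, query, k):
    # The k points in the window nearest to the query (1-D Euclidean here).
    return heapq.nsmallest(k, window, key=lambda p: abs(p - query))

window = deque(maxlen=5)            # bounded "state": the last 5 stream elements
stream = [10.0, 2.0, 8.0, 3.0, 7.0, 11.0]
for x in stream:                    # stand-in for elements arriving from the socket
    window.append(x)

print(knn(window, query=6.0, k=3))  # -> [7.0, 8.0, 3.0]
```

This is O(window size) per query; for large windows a spatial index (e.g. a k-d tree rebuilt periodically) would replace the linear scan.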

Re: dynamic add sink to flink

2017-06-12 Thread Tzu-Li (Gordon) Tai
Hi, The Flink Kafka Producer allows writing to multiple topics besides the default topic. To do this, you can override the configured default topic by implementing the `getTargetTopic` method on the `KeyedSerializationSchema`. That method is invoked for each record, and if a value is returned,

Re: [DISCUSS] Removal of twitter-inputformat

2017-06-12 Thread sblackmon
Hello, Apache Streams (incubating) maintains and publishes json-schemas and jackson-compatible POJOs for Twitter and other popular third-party APIs. http://streams.apache.org/site/0.5.1-incubating-SNAPSHOT/streams-project/streams-contrib/streams-provider-twitter/index.html We also have a

Cannot write record to fresh sort buffer. Record too large.

2017-06-12 Thread Sebastian Neef
Hi, when I'm running my Flink job on a small dataset, it successfully finishes. However, when a bigger dataset is used, I get multiple exceptions: - Caused by: java.io.IOException: Cannot write record to fresh sort buffer. Record too large. - Thread 'SortMerger Reading Thread' terminated due to

Re: Flink-derived operator names cause issues in Graphite metrics

2017-06-12 Thread Carst Tankink
Thanks for the quick response :-) I think the limiting of names might still be good enough for my use-case, because the default case is naming operators properly (it helps in creating dashboards...) but if we forget/miss one, we do not want to start hammering our graphite setup with bad data.

Re: Flink-derived operator names cause issues in Graphite metrics

2017-06-12 Thread Chesnay Schepler
So there are 2 issues here: 1. The default names for windows are horrible. They are too long, full of special characters, and unstable, as reported in FLINK-6464. 2. The reporter doesn't filter out metrics it can't report. For 2) we can do 2
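Until the reporter escapes such names itself, a sanitizing step of the kind described here could look like the following. This is an illustrative sketch, not Flink's actual reporter code:

```python
import re

MAX_SEGMENT = 200  # stay below common filesystem limits for Graphite's whisper files

def sanitize_metric_name(name, max_len=MAX_SEGMENT):
    """Sanitize one metric-path segment (e.g. an operator name) for Graphite.

    Graphite maps each '.'-separated segment to a file on disk, so a segment
    must not contain '.' or other special characters and must stay short.
    """
    cleaned = re.sub(r"[^A-Za-z0-9_\-]", "_", name)
    return cleaned[:max_len]

long_default = "TriggerWindow(TumblingEventTimeWindows(5000), ReducingStateDescriptor{...})"
print(sanitize_metric_name(long_default, max_len=40))
```

Truncation alone can make two different operators collide on the same metric path, which is another argument for always calling .name() on operators explicitly.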

Flink-derived operator names cause issues in Graphite metrics

2017-06-12 Thread Carst Tankink
Hi, We accidentally forgot to give some operators in our flink stream a custom/unique name, and ran into the following exception in Graphite: ‘exceptions.IOError: [Errno 36] File name too long:

Re: [DISCUSS] Removal of twitter-inputformat

2017-06-12 Thread Aljoscha Krettek
Bumpety-bump. I would be in favour of removing this: - It can be implemented as a MapFunction parser after a TextInputFormat - Additions, changes, fixes that happen on TextInputFormat are not reflected in SimpleTweetInputFormat - SimpleTweetInputFormat overrides nextRecord(), which is not

Re: Java 8 lambdas for CEP patterns won't compile

2017-06-12 Thread Kostas Kloudas
Done. > On Jun 12, 2017, at 12:24 PM, Ted Yu wrote: > > Can you add link to this thread in the JIRA ? > > Cheers > > On Mon, Jun 12, 2017 at 3:15 AM, Kostas Kloudas > wrote: > Unfortunately, there was no

Re: Java 8 lambdas for CEP patterns won't compile

2017-06-12 Thread Ted Yu
Can you add link to this thread in the JIRA ? Cheers On Mon, Jun 12, 2017 at 3:15 AM, Kostas Kloudas wrote: > Unfortunately, there was no discussion as this regression came as an > artifact of the addition of the IterativeConditions, but it will be fixed. > > This

Re: Java 8 lambdas for CEP patterns won't compile

2017-06-12 Thread Ted Yu
Do you know which JIRA / discussion thread had the context for this decision? I did a quick search in JIRA but only found FLINK-3681. Cheers On Mon, Jun 12, 2017 at 1:48 AM, Kostas Kloudas wrote: > Hi David and Ted, > > The documentation is outdated. I will update

Re: Guava version conflict

2017-06-12 Thread Tzu-Li (Gordon) Tai
This seems like a shading problem then. I’ve tested this again with Maven 3.0.5, even without building against CDH Hadoop binaries the flink-dist jar contains non-shaded Guava dependencies. Let me investigate a bit and get back to this! Cheers, Gordon On 8 June 2017 at 2:47:02 PM, Flavio