date:20160408

Re: Multiple operations on a WindowedStream

2016-04-08 Thread Aljoscha Krettek

Hi, the sources don't report records consumed. This is a bit confusing but the records sent/records consumed statistics only talk about Flink-internal sending of records, so a Kafka source would only show sent records. To really see each operator in isolation you should disable chaining for these

Re: Access an event's TimeWindow?

2016-04-08 Thread Aljoscha Krettek

Hi, for cases like these there is a family of "apply" methods on WindowedStream that also take an incremental aggregation function. For example, there is .apply(ReduceFunction, WindowFunction) What this will do is incrementally aggregate the contents of the window. When the window result should

Re: WindowedStream sum behavior

2016-04-08 Thread Aljoscha Krettek

Hi, there no guarantees for the fields other than the summed field (and eventual key fields). I think in practice it's either the fields from the first or last record but I wouldn't rely on that. Cheers, Aljoscha On Sat, 9 Apr 2016 at 03:19 Elias Levy wrote: > I am

RE: Multiple operations on a WindowedStream

2016-04-08 Thread Kanak Biscuitwala

It turns out that the problem is deeper than I originally thought. The flink dashboard reports that 0 records are being consumed, which is quite odd. Is there some issue with the 0.9 consumer on YARN? From: aljos...@apache.org Date: Thu, 7 Apr 2016 09:56:42 + Subject: Re: Multiple

Access an event's TimeWindow?

2016-04-08 Thread Elias Levy

Is there an API to access an event's time window? When you are computing aggregates over a time window, you usually want to output the window along with the aggregate. I could compute the Window on my own, but this seems like an API that should exist.

WindowedStream sum behavior

2016-04-08 Thread Elias Levy

I am wondering about the expected behavior of the sum method. Obviously it sums a specific field in a tuple or POJO. But what should one expect in other fields? Does sum keep the first field, last field or there aren't any guarantees?

Re: Flink ML 1.0.0 - Saving and Loading Models to Score a Single Feature Vector

2016-04-08 Thread Trevor Grant

I'm just about to open an issue / PR solution for 'warm-starts' Once this is in, we could just add a setter for the weight vector (and what ever iteration you're on if you're going to do more partial fits). Then all you need to save if your weight vector (and iter number). Trevor Grant Data

Re: Flink ML 1.0.0 - Saving and Loading Models to Score a Single Feature Vector

2016-04-08 Thread Behrouz Derakhshan

Is there a reasons the Predictor or Estimator class don't have read and write methods for saving and retrieving the model? I couldn't find Jira issues for it. Does it make sense to create one ? BR, Behrouz On Wed, Mar 30, 2016 at 4:40 PM, Till Rohrmann wrote: > Yes Suneel

Re: FromIteratorFunction problems

2016-04-08 Thread Andrew Whitaker

Thanks, that example is helpful. It seems like to use `fromCollection` with an iterator it must be an iterator that implements serializable, and Java's built in `Iterator`s don't, unfortunately. On Thu, Apr 7, 2016 at 6:11 PM, Chesnay Schepler wrote: > hmm, maybe i was to

Re: Yammer metrics

2016-04-08 Thread Sanne de Roever

O.k., sounds nice; didn't mean to impose: my train of thought was a bit murky, my apologies for that. Although I had prior experience with JMX, and none with Yammer, at this moment it is all new again. Cleaning up my thinking a bit, the following picture, not directly Flink related, comes up. For

Re: Handling large state (incremental snapshot?)

2016-04-08 Thread Hironori Ogibayashi

Thank you for your suggestion, Regarding throughput, actually, there was a bottleneck at the process which put logs into Kafka. When I added more process, the throughput increased. And, also, HyperLogLog seems a good solution in this case. I will try. Regards, Hironori 2016-04-07 17:45

Re: Yammer metrics

2016-04-08 Thread Chesnay Schepler

note that we still /could /expose the option of using the yammer reporters; there isn't any technical limitation as of now that would prohibit that. On 08.04.2016 13:05, Chesnay Schepler wrote: I'm very much aware of how Yammer works. As the slides you linked show (near the end) is that

Re: Yammer metrics

2016-04-08 Thread Chesnay Schepler

I'm very much aware of how Yammer works. As the slides you linked show (near the end) is that there are several small issues with the general-purpose reporters offered by yammer. Instead of hacking around those issues i would very much prefer creating our own reporters that are, again,

Re: Yammer metrics

2016-04-08 Thread Sanne de Roever

I forgot to add some extra information, still all tentative. Earlier (erm, 14 years ago to be honest), I also worked on a JMX monitoring system, and it turned out to be a pain to identify all the components, write the correct jmx queries and then to plot everything. It was hard. This presentation

Re: Yammer metrics

2016-04-08 Thread Sanne de Roever

Thanks Chesnay. Might I make a tentative case for Yammer? I'm not an expert, but I am currently trying to pull together information on this and was reviewing jmxtrans. This is all tentative, I've just dived in a few days ago. Please find the information below. Using Yammer it is possible to load

Re: Joda DateTimeSerializer

2016-04-08 Thread Robert Metzger

Hi Stefano, your fix is the right way to resolve the issue ;) If you want, give me your Confluence Wiki username and I give you edit permissions in our wiki. Otherwise, I'll quickly add a note to the migration guide. On Fri, Apr 8, 2016 at 11:28 AM, Stefano Bortoli wrote:

Re: Integrate Flink with S3 on EMR cluster

2016-04-08 Thread Robert Metzger

Hi Timur, the Flink optimizer runs on the client, so the exception is thrown from the JVM running the ./bin/flink client. Since the statistics sampling is an optional step, its surrounded by a try / catch block that just logs the error message. More answers inline below On Thu, Apr 7, 2016 at

Joda DateTimeSerializer

2016-04-08 Thread Stefano Bortoli

Hi to all, we've just upgraded to Flink 1.0.0 and we had some problems with joda DateTime serialization. The problem was caused by Flink-3305 that removed the JavaKaffee dependency. We had to re-add such dependency in our application and then register the DateTime serializer in the environment:

Re: Yammer metrics

2016-04-08 Thread Chesnay Schepler

Flink currently doesn't expose any metrics beyond those shown in the Dashboard. I am currently working on integrating a new metrics system that is partly /based/ on Yammer/Codahale/Dropwizard metrics. For a first version it is planned to export metrics only via JMX as this effectively

Yammer metrics

2016-04-08 Thread Sanne de Roever

Hi, I´m looking into setting up monitoring for our (Flink) environment and realized that both Kafka and Cassandra use the yammer metrics library. This library enables the direct export of all metrics to Graphite (and possibly statsd). Does Flink use Yammer metrics? Cheers, Sanne

Re: Multiple operations on a WindowedStream

Re: Access an event's TimeWindow?

Re: WindowedStream sum behavior

RE: Multiple operations on a WindowedStream

Access an event's TimeWindow?

WindowedStream sum behavior

Re: Flink ML 1.0.0 - Saving and Loading Models to Score a Single Feature Vector

Re: Flink ML 1.0.0 - Saving and Loading Models to Score a Single Feature Vector

Re: FromIteratorFunction problems

Re: Yammer metrics

Re: Handling large state (incremental snapshot?)

Re: Yammer metrics

Re: Yammer metrics

Re: Yammer metrics

Re: Yammer metrics

Re: Joda DateTimeSerializer

Re: Integrate Flink with S3 on EMR cluster

Joda DateTimeSerializer

Re: Yammer metrics

Yammer metrics

20 matches

Site Navigation

Mail list logo

Footer information