Hi,
the sources don't report records consumed. This is a bit confusing, but the
records sent/records consumed statistics only count Flink-internal
sending of records, so a Kafka source will only show sent records.
To really see each operator in isolation you should disable chaining for
these operators.
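For illustration, a hedged sketch of that suggestion against the DataStream API (the job name and data are made up): calling disableChaining() on an operator keeps it in its own task, so the dashboard reports its records in/out separately instead of folding them into a chain.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ChainingExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements("a", "b", "c")
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String value) {
                   return value.toUpperCase();
               }
           })
           .disableChaining() // this map becomes its own task in the dashboard
           .print();
        env.execute("chaining-example");
        // Alternatively, env.disableOperatorChaining() turns chaining off globally.
    }
}
```
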
Hi,
for cases like these there is a family of "apply" methods on WindowedStream
that also take an incremental aggregation function. For example, there is
.apply(ReduceFunction, WindowFunction)
What this will do is incrementally aggregate the contents of the window.
When the window result should
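A plain-Java sketch (not the Flink API, class and method names are made up) of what that incremental variant buys you: elements are folded into a single accumulator as they arrive, and the "window function" only ever sees the one pre-aggregated value when the window fires, so window state stays O(1) instead of buffering every element.

```java
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Sketch of the semantics of .apply(ReduceFunction, WindowFunction):
// incremental reduce on arrival, one final call when the window fires.
public class IncrementalWindow<T, R> {
    private final BinaryOperator<T> reducer; // stands in for ReduceFunction<T>
    private T accumulator;                   // at most one buffered element

    public IncrementalWindow(BinaryOperator<T> reducer) {
        this.reducer = reducer;
    }

    public void add(T element) {
        // fold each arriving element into the single accumulator
        accumulator = (accumulator == null) ? element : reducer.apply(accumulator, element);
    }

    // stands in for the WindowFunction: invoked once, with the reduced value
    public R fire(Function<T, R> windowFunction) {
        return windowFunction.apply(accumulator);
    }

    public static void main(String[] args) {
        IncrementalWindow<Integer, String> w = new IncrementalWindow<>(Integer::sum);
        w.add(1);
        w.add(2);
        w.add(3);
        System.out.println(w.fire(sum -> "window sum = " + sum)); // window sum = 6
    }
}
```
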
Hi,
there are no guarantees for the fields other than the summed field (and
any key fields). I think in practice it's either the fields from the
first or the last record, but I wouldn't rely on that.
Cheers,
Aljoscha
On Sat, 9 Apr 2016 at 03:19 Elias Levy wrote:
> I am
It turns out that the problem is deeper than I originally thought. The flink
dashboard reports that 0 records are being consumed, which is quite odd. Is
there some issue with the 0.9 consumer on YARN?
From: aljos...@apache.org
Date: Thu, 7 Apr 2016 09:56:42 +
Subject: Re: Multiple
Is there an API to access an event's time window? When you are computing
aggregates over a time window, you usually want to output the window along
with the aggregate. I could compute the Window on my own, but this seems
like an API that should exist.
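For what it's worth, this API does exist: a WindowFunction's apply(...) receives the window itself as a parameter, and a TimeWindow exposes getStart() and getEnd(). As a plain-Java sketch (assuming non-negative epoch-millisecond timestamps and an un-offset tumbling window), the same bounds can also be recomputed from a timestamp:

```java
// Sketch: recover a tumbling window's bounds from an event timestamp.
// Inside a Flink WindowFunction you get this for free via TimeWindow.
public class WindowBounds {
    public static long windowStart(long timestampMillis, long windowSizeMillis) {
        // align the timestamp down to the window boundary
        return timestampMillis - (timestampMillis % windowSizeMillis);
    }

    public static long windowEnd(long timestampMillis, long windowSizeMillis) {
        return windowStart(timestampMillis, windowSizeMillis) + windowSizeMillis;
    }

    public static void main(String[] args) {
        // An event at t=7500ms with 5s tumbling windows falls in [5000, 10000)
        System.out.println(windowStart(7500L, 5000L) + " .. " + windowEnd(7500L, 5000L));
    }
}
```
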
I am wondering about the expected behavior of the sum method. Obviously it
sums a specific field in a tuple or POJO. But what should one expect in
other fields? Does sum keep the values from the first record, the last
record, or are there no guarantees?
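A plain-Java sketch of the sum() behavior discussed here (illustrative names, not Flink internals): the aggregation keeps one carrier record and recomputes only the summed field, so every other field simply reflects whichever record the implementation happened to keep.

```java
public class SumSemantics {
    public static final class Event {
        public final String name;
        public final int count;

        public Event(String name, int count) {
            this.name = name;
            this.count = count;
        }
    }

    // hypothetical merge: the sum is exact, but 'name' is arbitrarily
    // carried over from the accumulator record
    public static Event sumCount(Event acc, Event next) {
        return new Event(acc.name, acc.count + next.count);
    }

    public static void main(String[] args) {
        Event merged = sumCount(new Event("a", 1),
                       sumCount(new Event("b", 2), new Event("c", 4)));
        // count is exact (7); name depends on merge order, so don't rely on it
        System.out.println(merged.name + " " + merged.count);
    }
}
```
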
I'm just about to open an issue / PR solution for 'warm-starts'
Once this is in, we could just add a setter for the weight vector (and
whatever iteration you're on, if you're going to do more partial fits).
Then all you need to save is your weight vector (and iteration number).
Trevor Grant
Data
Is there a reason the Predictor or Estimator classes don't have read and
write methods for saving and retrieving the model? I couldn't find JIRA
issues for it. Does it make sense to create one?
BR,
Behrouz
On Wed, Mar 30, 2016 at 4:40 PM, Till Rohrmann wrote:
> Yes Suneel
Thanks, that example is helpful. It seems that to use `fromCollection` with
an iterator, the iterator must implement Serializable, and Java's built-in
`Iterator`s don't, unfortunately.
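A minimal sketch of such a serializable iterator (the class name is made up), plus a serialization round trip to show that it survives being shipped, the way `fromCollection(Iterator, Class)` would need to ship it:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Iterator;
import java.util.NoSuchElementException;

// An Iterator that is also Serializable, unlike the JDK's built-in iterators.
public class RangeIterator implements Iterator<Long>, Serializable {
    private static final long serialVersionUID = 1L;
    private long next;
    private final long end;

    public RangeIterator(long start, long end) {
        this.next = start;
        this.end = end;
    }

    @Override
    public boolean hasNext() {
        return next < end;
    }

    @Override
    public Long next() {
        if (!hasNext()) throw new NoSuchElementException();
        return next++;
    }

    // Helper: serialize and deserialize, as a shipping stand-in.
    public static RangeIterator roundTrip(RangeIterator original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RangeIterator) in.readObject();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        RangeIterator copy = roundTrip(new RangeIterator(0L, 3L));
        while (copy.hasNext()) {
            System.out.println(copy.next());
        }
    }
}
```
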
On Thu, Apr 7, 2016 at 6:11 PM, Chesnay Schepler wrote:
> hmm, maybe i was to
O.k., sounds nice; didn't mean to impose: my train of thought was a bit
murky, my apologies for that. Although I had prior experience with JMX, and
none with Yammer, at this moment it is all new again. Cleaning up my
thinking a bit, the following picture, not directly Flink related, comes up.
For
Thank you for your suggestion,
Regarding throughput, actually, there was a bottleneck at the process
which put logs into Kafka.
When I added more processes, the throughput increased.
And, also, HyperLogLog seems a good solution in this case. I will try.
Regards,
Hironori
2016-04-07 17:45
note that we still /could/ expose the option of using the yammer
reporters; there isn't any technical limitation as of now that would
prohibit that.
On 08.04.2016 13:05, Chesnay Schepler wrote:
I'm very much aware of how Yammer works.
As the slides you linked show (near the end), there are several
small issues with the general-purpose reporters offered by Yammer.
Instead of hacking around those issues I would very much prefer creating
our own reporters that are, again,
I forgot to add some extra information, still all tentative.
Earlier (erm, 14 years ago, to be honest), I also worked on a JMX monitoring
system, and it turned out to be a pain to identify all the components,
write the correct JMX queries, and then plot everything. It was hard.
This presentation
Thanks Chesnay.
Might I make a tentative case for Yammer? I'm not an expert, but I am
currently trying to pull together information on this and was reviewing
jmxtrans. This is all tentative, I've just dived in a few days ago. Please
find the information below.
Using Yammer it is possible to load
Hi Stefano,
your fix is the right way to resolve the issue ;)
If you want, give me your Confluence Wiki username and I'll give you edit
permissions in our wiki. Otherwise, I'll quickly add a note to the
migration guide.
On Fri, Apr 8, 2016 at 11:28 AM, Stefano Bortoli
wrote:
Hi Timur,
the Flink optimizer runs on the client, so the exception is thrown from the
JVM running the ./bin/flink client.
Since the statistics sampling is an optional step, it's surrounded by a try
/ catch block that just logs the error message.
More answers inline below
On Thu, Apr 7, 2016 at
Hi to all,
we've just upgraded to Flink 1.0.0 and we had some problems with Joda
DateTime serialization.
The problem was caused by FLINK-3305, which removed the JavaKaffee dependency.
We had to re-add such dependency in our application and then register the
DateTime serializer in the environment:
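The registration snippet appears to have been cut off in the archive; a typical version (assuming the JavaKaffee kryo-serializers dependency has been re-added as described, and using Flink's standard Kryo registration hook) would look roughly like:

```java
import de.javakaffee.kryoserializers.jodatime.JodaDateTimeSerializer;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.joda.time.DateTime;

public class RegisterJoda {
    public static void main(String[] args) {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // Tell Flink's Kryo fallback to use JavaKaffee's Joda-Time serializer
        env.getConfig().registerTypeWithKryoSerializer(DateTime.class, JodaDateTimeSerializer.class);
        // ... build and execute the job as usual ...
    }
}
```
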
Flink currently doesn't expose any metrics beyond those shown in the
Dashboard.
I am currently working on integrating a new metrics system that is
partly /based/ on Yammer/Codahale/Dropwizard metrics.
For a first version it is planned to export metrics only via JMX as this
effectively
Hi,
I'm looking into setting up monitoring for our (Flink) environment and
realized that both Kafka and Cassandra use the yammer metrics library. This
library enables the direct export of all metrics to Graphite (and possibly
statsd). Does Flink use Yammer metrics?
Cheers,
Sanne