I only restarted before running the first test, since all the configurations are
the same in the three tests.

Are you restarting the topology between all of these tests?
On Tue, Jan 28, 2020 at 11:09 AM Gonçalo Pedras wrote:
Hi,
This profiler is really inconsistent, I’m going crazy right now.
I’ve made a further investigation and this is really bugging my mind:
1. I’m not expecting to receive 15-hour-old messages. In fact, I’m the one
who’s picking the messages from the current time and sending them to Kafka,
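A minimal sketch of that producing side, assuming the messages are JSON with an epoch-millisecond `timestamp` field (the field names here are assumptions for illustration, not taken from the thread):

```python
import json
import time

def make_event(source="netflow"):
    # Epoch milliseconds -- the convention the Profiler's "timestampField"
    # option expects when reading event time from the message itself.
    return json.dumps({
        "source.type": source,                 # assumed field name
        "timestamp": int(time.time() * 1000),  # "the current time"
    })

# With a real Kafka producer this would be sent along the lines of:
#   producer.send("indexing", make_event().encode("utf-8"))
```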

Prior to this point in time, the Profiler had received a message indicating
that the current time is Mon Jan 27 2020 17:46:44 GMT. It then received a
message with a timestamp of Tue Jan 28 2020 09:02:52 GMT, about 15 hours in
the future. Since this time gap is significantly larger than your
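The "about 15 hours" can be checked directly from the two GMT timestamps above; a standalone sanity check in plain Python, not Profiler code:

```python
from datetime import datetime, timezone

utc = timezone.utc
seen = datetime(2020, 1, 27, 17, 46, 44, tzinfo=utc)  # last "current time" the Profiler saw
event = datetime(2020, 1, 28, 9, 2, 52, tzinfo=utc)   # timestamp on the new message

gap = event - seen
print(gap)  # 15:16:08, i.e. roughly 15 hours ahead
```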
Hi again,
I found something in the profiler storm logs that proves the delay:
“2020-01-28 09:46:37.061 o.a.m.p.s.FixedFrequencyFlushSignal
watermark-event-generator-0 [WARN] Timestamp out-of-order by -54968000 ms. This
may indicate a problem in the data. timestamp=1580202172000,
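The two numbers in that WARN line are mutually consistent: adding the negative skew to the event timestamp recovers the earlier "current time" from the thread. A quick check outside the Profiler:

```python
from datetime import datetime, timezone

event_ms = 1580202172000   # "timestamp=" value from the WARN line
skew_ms = -54968000        # "out-of-order by" value from the WARN line

event = datetime.fromtimestamp(event_ms / 1000, tz=timezone.utc)
prior = datetime.fromtimestamp((event_ms + skew_ms) / 1000, tz=timezone.utc)
print(event.isoformat())   # 2020-01-28T09:02:52+00:00
print(prior.isoformat())   # 2020-01-27T17:46:44+00:00
```

So the log records a jump of about 15.3 hours between consecutive message timestamps.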
Hi Allen,
Thanks for the reply by the way.
I’ve been checking my profiler, tweaking some options and whatnot. I’ve set the
“timestampField” and solved half the issue.
I ran the spark batch profiler and it rectified the counts. Then I started the
Storm profiler once again. Now the profiler is
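For reference, the event-time switch mentioned above lives in the profiler configuration. A minimal sketch (the profile itself is illustrative; only "timestampField" is the point here):

```json
{
  "timestampField": "timestamp",
  "profiles": [
    {
      "profile": "message-count",
      "foreach": "'global'",
      "init":   { "count": "0" },
      "update": { "count": "count + 1" },
      "result": "count"
    }
  ]
}
```

With "timestampField" set, window boundaries follow the timestamps inside the messages (event time) rather than the wall clock of the Storm workers.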
Hi Gonçalo -
What could be happening is that your Profiler is not tuned to be able to
keep up with the amount of incoming data that you have. I would guess that
the Profiler "keeps counting beyond that period of time" because it is
still processing old data that is queued up.
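That hypothesis is simple rate arithmetic: if events arrive faster than the topology can process them, a backlog builds up, and the Profiler keeps chewing through old events long after the source quiets down. An illustration with made-up rates (not numbers from this thread):

```python
# Hypothetical, illustrative rates -- not measured values.
ingest_rate = 5000    # events/s arriving on the input topic
process_rate = 3000   # events/s the Profiler topology can sustain
run_seconds = 3600    # one hour of sustained traffic

backlog = (ingest_rate - process_rate) * run_seconds   # events left queued
drain_seconds = backlog / process_rate                 # time to catch up after input stops
print(backlog, drain_seconds)  # 7200000 events, 2400.0 s (~40 minutes of "counting" old data)
```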
- How much data