Re: Understanding metrics

2017-03-15 Thread Ankit Malhotra
Hi Xinyu,

Super weird, but I can’t seem to find the commit-ns nor the window-ns metric 
for our jobs. There are window-calls and commit-calls but the timings are 
missing.

task.commit.ms=9
task.window.ms=9

Ankit
On 3/14/17, 8:00 PM, "xinyu liu"  wrote:

Hi, Ankit,

When running your job in multithreading, block-ns here actually includes
the process_ns. This is because after your task.process() is submitted to
the thread pool, the run loop thread will be blocked until the process() is
complete for one of the task. It's interesting that block-ns (0.3 ms) is
much longer than process-ns (0.12 ms). I am wondering whether you also have
window and checkpoint configured for your job. Since window and checkpoint
will also be running inside this thread pool to improve the parallelism,
block-ns will be affected since the run loop will block for
window/checkpoint to complete. If you are using window/commit, please send
us the configs (task.window.ms and task.commit.ms) and the timer metrics
(window-ns and commit-ns). Then we can correlate better with block-ns.

Thanks,
Xinyu

On Tue, Mar 14, 2017 at 3:33 PM, Ankit Malhotra 
wrote:

> Wait, block-ns = 0.3ms (300,000ns). Also, why are we not adding in
> choose-ns?
>
> Thanks
> Ankit
>
> On 3/14/17, 6:26 PM, "Jagadish Venkatraman" 
> wrote:
>
> I would expect (process_ns + block_ns) to be almost the same as 0.15
> which
> makes sense.
>
> process_ns = 0.12 ms
> block_ns = 0.03 ms
> process_ns + block_ns ~ 0.15ms
>
> This corresponds to the number of process calls roughly 1/7000 ~
> 0.15ms per
> process call.
>
> >> Each process call is a separate thread.
> Given that you have one partition in each container, and you want
> in-order
> processing, there will be only one thread that's processing messages.
> The
> two other threads are not doing work, and entail a constant (albeit
> insignificant) synchronization overhead.
>
>
>
>
>
> On Tue, Mar 14, 2017 at 3:03 PM, Ankit Malhotra <
> amalho...@appnexus.com>
> wrote:
>
> > Hi,
> >
> > We are trying to understand metrics that are being populated by our
> samza
> > job and are a little confused what each of these metrics mean
> especially
> > since we’re running the job with a thread pool.
> >
> >
> > · We have 3 input streams
> >
> > · job.container.thread.pool.size=3
> >
> > · 1 container per partition
> >
> > · Using a RocksDB backed store with changelogging
> >
> > · process-ns = 120,000
> >
> > · get-ns ~ 30,000
> >
> > · put-ns ~ 90,000
> >
> > · block-ns ~ 300,000
> >
> > · choose-ns ~ 500,000
> >
> > Metrics are avg(metric) for all containers/partitions.
> >
> > Process-envelopes ~ 7000/sec.
> >
> > If I back the math out, this correlates quite closely to process-ns.
> > (1/7000 ~ 0.15ms).
> >
> > What I don’t understand is that the event loop is single threaded.
> Even
> > though, each process call is a separate thread, the main even loop 
is
> > blocking (block-ns) and choosing (choose-ns) every time and these
> times are
> > quite high. Given these metrics, how is it that we are consistently
> seeing
> > the above metrics?
> >
> > Also, lag (messages behind high watermark) is ~ 0.
> >
> > Thanks
> > Ankit
> >
> >
> >
> >
> >
> >
>
>
> --
> Jagadish V,
> Graduate Student,
> Department of Computer Science,
> Stanford University
>
>
>




Re: Understanding metrics

2017-03-14 Thread xinyu liu
Hi, Ankit,

When running your job in multithreading, block-ns here actually includes
the process_ns. This is because after your task.process() is submitted to
the thread pool, the run loop thread will be blocked until the process() is
complete for one of the task. It's interesting that block-ns (0.3 ms) is
much longer than process-ns (0.12 ms). I am wondering whether you also have
window and checkpoint configured for your job. Since window and checkpoint
will also be running inside this thread pool to improve the parallelism,
block-ns will be affected since the run loop will block for
window/checkpoint to complete. If you are using window/commit, please send
us the configs (task.window.ms and task.commit.ms) and the timer metrics
(window-ns and commit-ns). Then we can correlate better with block-ns.

Thanks,
Xinyu

On Tue, Mar 14, 2017 at 3:33 PM, Ankit Malhotra 
wrote:

> Wait, block-ns = 0.3ms (300,000ns). Also, why are we not adding in
> choose-ns?
>
> Thanks
> Ankit
>
> On 3/14/17, 6:26 PM, "Jagadish Venkatraman" 
> wrote:
>
> I would expect (process_ns + block_ns) to be almost the same as 0.15
> which
> makes sense.
>
> process_ns = 0.12 ms
> block_ns = 0.03 ms
> process_ns + block_ns ~ 0.15ms
>
> This corresponds to the number of process calls roughly 1/7000 ~
> 0.15ms per
> process call.
>
> >> Each process call is a separate thread.
> Given that you have one partition in each container, and you want
> in-order
> processing, there will be only one thread that's processing messages.
> The
> two other threads are not doing work, and entail a constant (albeit
> insignificant) synchronization overhead.
>
>
>
>
>
> On Tue, Mar 14, 2017 at 3:03 PM, Ankit Malhotra <
> amalho...@appnexus.com>
> wrote:
>
> > Hi,
> >
> > We are trying to understand metrics that are being populated by our
> samza
> > job and are a little confused what each of these metrics mean
> especially
> > since we’re running the job with a thread pool.
> >
> >
> > · We have 3 input streams
> >
> > · job.container.thread.pool.size=3
> >
> > · 1 container per partition
> >
> > · Using a RocksDB backed store with changelogging
> >
> > · process-ns = 120,000
> >
> > · get-ns ~ 30,000
> >
> > · put-ns ~ 90,000
> >
> > · block-ns ~ 300,000
> >
> > · choose-ns ~ 500,000
> >
> > Metrics are avg(metric) for all containers/partitions.
> >
> > Process-envelopes ~ 7000/sec.
> >
> > If I back the math out, this correlates quite closely to process-ns.
> > (1/7000 ~ 0.15ms).
> >
> > What I don’t understand is that the event loop is single threaded.
> Even
> > though, each process call is a separate thread, the main even loop is
> > blocking (block-ns) and choosing (choose-ns) every time and these
> times are
> > quite high. Given these metrics, how is it that we are consistently
> seeing
> > the above metrics?
> >
> > Also, lag (messages behind high watermark) is ~ 0.
> >
> > Thanks
> > Ankit
> >
> >
> >
> >
> >
> >
>
>
> --
> Jagadish V,
> Graduate Student,
> Department of Computer Science,
> Stanford University
>
>
>


Re: Understanding metrics

2017-03-14 Thread Ankit Malhotra
Wait, block-ns = 0.3ms (300,000ns). Also, why are we not adding in choose-ns?

Thanks
Ankit

On 3/14/17, 6:26 PM, "Jagadish Venkatraman"  wrote:

I would expect (process_ns + block_ns) to be almost the same as 0.15 which
makes sense.

process_ns = 0.12 ms
block_ns = 0.03 ms
process_ns + block_ns ~ 0.15ms

This corresponds to the number of process calls roughly 1/7000 ~ 0.15ms per
process call.

>> Each process call is a separate thread.
Given that you have one partition in each container, and you want in-order
processing, there will be only one thread that's processing messages. The
two other threads are not doing work, and entail a constant (albeit
insignificant) synchronization overhead.





On Tue, Mar 14, 2017 at 3:03 PM, Ankit Malhotra 
wrote:

> Hi,
>
> We are trying to understand metrics that are being populated by our samza
> job and are a little confused what each of these metrics mean especially
> since we’re running the job with a thread pool.
>
>
> · We have 3 input streams
>
> · job.container.thread.pool.size=3
>
> · 1 container per partition
>
> · Using a RocksDB backed store with changelogging
>
> · process-ns = 120,000
>
> · get-ns ~ 30,000
>
> · put-ns ~ 90,000
>
> · block-ns ~ 300,000
>
> · choose-ns ~ 500,000
>
> Metrics are avg(metric) for all containers/partitions.
>
> Process-envelopes ~ 7000/sec.
>
> If I back the math out, this correlates quite closely to process-ns.
> (1/7000 ~ 0.15ms).
>
> What I don’t understand is that the event loop is single threaded. Even
> though, each process call is a separate thread, the main even loop is
> blocking (block-ns) and choosing (choose-ns) every time and these times 
are
> quite high. Given these metrics, how is it that we are consistently seeing
> the above metrics?
>
> Also, lag (messages behind high watermark) is ~ 0.
>
> Thanks
> Ankit
>
>
>
>
>
>


-- 
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University




Re: Understanding metrics

2017-03-14 Thread Jagadish Venkatraman
I would expect (process_ns + block_ns) to be almost the same as 0.15 which
makes sense.

process_ns = 0.12 ms
block_ns = 0.03 ms
process_ns + block_ns ~ 0.15ms

This corresponds to the number of process calls roughly 1/7000 ~ 0.15ms per
process call.

>> Each process call is a separate thread.
Given that you have one partition in each container, and you want in-order
processing, there will be only one thread that's processing messages. The
two other threads are not doing work, and entail a constant (albeit
insignificant) synchronization overhead.





On Tue, Mar 14, 2017 at 3:03 PM, Ankit Malhotra 
wrote:

> Hi,
>
> We are trying to understand metrics that are being populated by our samza
> job and are a little confused what each of these metrics mean especially
> since we’re running the job with a thread pool.
>
>
> · We have 3 input streams
>
> · job.container.thread.pool.size=3
>
> · 1 container per partition
>
> · Using a RocksDB backed store with changelogging
>
> · process-ns = 120,000
>
> · get-ns ~ 30,000
>
> · put-ns ~ 90,000
>
> · block-ns ~ 300,000
>
> · choose-ns ~ 500,000
>
> Metrics are avg(metric) for all containers/partitions.
>
> Process-envelopes ~ 7000/sec.
>
> If I back the math out, this correlates quite closely to process-ns.
> (1/7000 ~ 0.15ms).
>
> What I don’t understand is that the event loop is single threaded. Even
> though, each process call is a separate thread, the main even loop is
> blocking (block-ns) and choosing (choose-ns) every time and these times are
> quite high. Given these metrics, how is it that we are consistently seeing
> the above metrics?
>
> Also, lag (messages behind high watermark) is ~ 0.
>
> Thanks
> Ankit
>
>
>
>
>
>


-- 
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University


Understanding metrics

2017-03-14 Thread Ankit Malhotra
Hi,

We are trying to understand metrics that are being populated by our samza job 
and are a little confused what each of these metrics mean especially since 
we’re running the job with a thread pool.


· We have 3 input streams

· job.container.thread.pool.size=3

· 1 container per partition

· Using a RocksDB backed store with changelogging

· process-ns = 120,000

· get-ns ~ 30,000

· put-ns ~ 90,000

· block-ns ~ 300,000

· choose-ns ~ 500,000

Metrics are avg(metric) for all containers/partitions.

Process-envelopes ~ 7000/sec.

If I back the math out, this correlates quite closely to process-ns. (1/7000 ~ 
0.15ms).

What I don’t understand is that the event loop is single threaded. Even though, 
each process call is a separate thread, the main even loop is blocking 
(block-ns) and choosing (choose-ns) every time and these times are quite high. 
Given these metrics, how is it that we are consistently seeing the above 
metrics?

Also, lag (messages behind high watermark) is ~ 0.

Thanks
Ankit