Hi, Ankit, When running your job in multithreading, block-ns here actually includes the process_ns. This is because after your task.process() is submitted to the thread pool, the run loop thread will be blocked until the process() is complete for one of the task. It's interesting that block-ns (0.3 ms) is much longer than process-ns (0.12 ms). I am wondering whether you also have window and checkpoint configured for your job. Since window and checkpoint will also be running inside this thread pool to improve the parallelism, block-ns will be affected since the run loop will block for window/checkpoint to complete. If you are using window/commit, please send us the configs (task.window.ms and task.commit.ms) and the timer metrics (window-ns and commit-ns). Then we can correlate better with block-ns.
Thanks, Xinyu On Tue, Mar 14, 2017 at 3:33 PM, Ankit Malhotra <amalho...@appnexus.com> wrote: > Wait, block-ns = 0.3ms (300,000ns). Also, why are we not adding in > choose-ns? > > Thanks > Ankit > > On 3/14/17, 6:26 PM, "Jagadish Venkatraman" <jagadish1...@gmail.com> > wrote: > > I would expect (process_ns + block_ns) to be almost the same as 0.15 > which > makes sense. > > process_ns = 0.12 ms > block_ns = 0.03 ms > process_ns + block_ns ~ 0.15ms > > This corresponds to the number of process calls roughly 1/7000 ~ > 0.15ms per > process call. > > >> Each process call is a separate thread. > Given that you have one partition in each container, and you want > in-order > processing, there will be only one thread that's processing messages. > The > two other threads are not doing work, and entail a constant (albeit > insignificant) synchronization overhead. > > > > > > On Tue, Mar 14, 2017 at 3:03 PM, Ankit Malhotra < > amalho...@appnexus.com> > wrote: > > > Hi, > > > > We are trying to understand metrics that are being populated by our > samza > > job and are a little confused what each of these metrics mean > especially > > since we’re running the job with a thread pool. > > > > > > · We have 3 input streams > > > > · job.container.thread.pool.size=3 > > > > · 1 container per partition > > > > · Using a RocksDB backed store with changelogging > > > > · process-ns = 120,000 > > > > · get-ns ~ 30,000 > > > > · put-ns ~ 90,000 > > > > · block-ns ~ 300,000 > > > > · choose-ns ~ 500,000 > > > > Metrics are avg(metric) for all containers/partitions. > > > > Process-envelopes ~ 7000/sec. > > > > If I back the math out, this correlates quite closely to process-ns. > > (1/7000 ~ 0.15ms). > > > > What I don’t understand is that the event loop is single threaded. > Even > > though, each process call is a separate thread, the main even loop is > > blocking (block-ns) and choosing (choose-ns) every time and these > times are > > quite high. Given these metrics, how is it that we are consistently > seeing > > the above metrics? > > > > Also, lag (messages behind high watermark) is ~ 0. > > > > Thanks > > Ankit > > > > > > > > > > > > > > > -- > Jagadish V, > Graduate Student, > Department of Computer Science, > Stanford University > > >