I ran sar while the bookies were showing high latencies. Pasting the output below.

disk with journal

Average:      DEV      tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz  await  svctm  %util
Average:  dev8-16  2467.77      0.05  27118.78     10.99      0.14   0.06   0.06  13.89
Average:  dev8-17  2467.77      0.05  27118.78     10.99      0.14   0.06   0.06  13.89

Following disks store ledgers.

Average:  dev8-32   319.61  10524.33    462.75     34.38      1.10   3.43   1.14  36.59
Average:  dev8-33   319.61  10524.33    462.75     34.38      1.10   3.43   1.14  36.59
Average:  dev8-48    63.25   1613.32   2022.78     57.49      0.12   1.96   1.25   7.90
Average:  dev8-49    63.25   1613.32   2022.78     57.49      0.12   1.96   1.25   7.90
Average:  dev8-64   142.51   2934.78    480.65     23.97      0.54   3.80   1.31  18.66
Average:  dev8-65   142.51   2934.78    480.65     23.97      0.54   3.80   1.31  18.66
Average:  dev8-80    17.14      4.53    501.03     29.50      0.00   0.13   0.11   0.19
Average:  dev8-81    17.14      4.53    501.03     29.50      0.00   0.13   0.11   0.19

Regards,
Aniruddha.

On Fri, Sep 21, 2012 at 9:39 AM, Aniruddha Laud <[email protected]> wrote:

> Thanks Flavio and Ivan. Please find my replies inline.
>
> Flavio,
> Yes we are using trunk with only a few stats related modifications to
> bookkeeper. Also, I meant < 1ms and not 1 second. It's jmx stat value
> checked using jconsole and I believe that is in ms. Our entries are small
> (Around 1KB) and we have a lot of entries in our ledgers (We see close to
> 20k publishes/second on our hedwig cluster)
>
> We have 15 bookie servers with an ensemble size of 4 and quorum of 3.
>
> On Fri, Sep 21, 2012 at 2:59 AM, Ivan Kelly <[email protected]> wrote:
>
>> What latency are you getting for the reads? Is the latency too high
>> for reading individual entries, or for reading a whole ledger?
>>
> The latency for reading individual entries is really high. In the order of
> seconds. This almost always occurs when a backed up subscriber is catching
> up and there are simultaneous publishes. There are around 15k-20k read
> entry requests per second across the 15 bookies (This is the sum of the
> PCBC requests per second) and around 10k add entry requests per second.
> From what I can tell, if there are only writes, the latencies are not high.
>
>>
>> Also, how many topics do you have in the system? The number of topics
>> will define the amount of interleaving in the ledger storage files
>> which has an affect on how long it will take to read entries from the
>> ledger.
>>
> We have 1000 topics.
>
>>
>> For the disk, check the number of I/O transactions which are
>> occurring. I don't think you can get it with dstat, but sar -b should
>> give it to you. I suspect a lot of seeks are occurring.
>>
> Thanks for the tip. I'll do this the next time we run into such a problem.
>
>
>>
>> There's a couple of shortcomings in how bookie's serve reads which
>> I've wanted to look at for a while, so this could be a good chance.
>>
> Could you give some information on what those shortcomings are? Also, do
> let me know if you need any more information from our end.
>
>>
>> -Ivan
>>
>>
>>
>> On Thu, Sep 20, 2012 at 07:42:29PM -0700, Aniruddha Laud wrote:
>> > I'm logging latencies in the per channel bookie client by overriding the
>> > read completion and add completion callbacks.
>> >
>> > When a hedwig subscriber subscribes to a topic and it hasn't consumed any
>> > messages for a long time, it has to issue requests to bookkeeper to read
>> > entries. We are observing very high latencies during these operations.
>> >
>> > throttle limit for the bookkeeper client is set at 5000. We are using the
>> > Hierarchical ledger manager.
>> >
>> > The bookies have 5 disks. 1 for the journal, 4 for the ledgers. On the
>> > bookie server pageSize=32768, open file limit = 20000. I checked a few
>> > bookies and the number of open ledgers was around 5000 on each. However,
>> > the add and read latencies on the servers were less than 1 second (measured
>> > using jconsole and the exposed jmx stats).
>> >
>> > Also, the bookie disks are standard spin drives and were doing about 40MBps
>> > reads and 10MBps writes when measured with dstat. Any thoughts would be
>> > helpful.
>> >
>
> Regards,
> Aniruddha.
>
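
For anyone comparing the sar averages in this message with the dstat MBps numbers, here is a rough sketch of the conversion. It assumes the 512-byte sector size that sar documents for rd_sec/s and wr_sec/s. The helper names are made up for illustration, and the second row of each identical device pair is left out.

```python
# sar reports rd_sec/s and wr_sec/s in sectors of 512 bytes each.
SECTOR_BYTES = 512

# (device, tps, rd_sec/s, wr_sec/s), copied from the sar averages in this message.
ROWS = [
    ("dev8-16", 2467.77, 0.05, 27118.78),   # journal disk
    ("dev8-32", 319.61, 10524.33, 462.75),  # ledger disks from here on
    ("dev8-48", 63.25, 1613.32, 2022.78),
    ("dev8-64", 142.51, 2934.78, 480.65),
    ("dev8-80", 17.14, 4.53, 501.03),
]


def sectors_to_mb_per_s(sectors_per_s):
    """Convert a sectors-per-second rate to megabytes per second (10^6 bytes)."""
    return sectors_per_s * SECTOR_BYTES / 1e6


def summarize(rows):
    """Return (device, tps, read MB/s, write MB/s) for each sar row."""
    return [
        (dev, tps, sectors_to_mb_per_s(rd), sectors_to_mb_per_s(wr))
        for dev, tps, rd, wr in rows
    ]


if __name__ == "__main__":
    for dev, tps, rd_mb, wr_mb in summarize(ROWS):
        print(f"{dev}: {tps:8.2f} tps, {rd_mb:6.2f} MB/s read, {wr_mb:6.2f} MB/s write")
```

Under that assumption the busiest ledger disk (dev8-32) averages a bit over 5 MB/s of reads and the journal disk close to 14 MB/s of writes over the sampled interval.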
