What latency are you getting for the reads? Is the latency too high for reading individual entries, or for reading a whole ledger?
Also, how many topics do you have in the system? The number of topics determines the amount of interleaving in the ledger storage files, which has an effect on how long it takes to read entries from a ledger.

For the disk, check the number of I/O transactions that are occurring. I don't think you can get that with dstat, but sar -b should give it to you. I suspect a lot of seeks are occurring.

There are a couple of shortcomings in how bookies serve reads which I've wanted to look at for a while, so this could be a good chance.

-Ivan

On Thu, Sep 20, 2012 at 07:42:29PM -0700, Aniruddha Laud wrote:
> I'm logging latencies in the per channel bookie client by overriding the
> read completion and add completion callbacks.
>
> When a hedwig subscriber subscribes to a topic and it hasn't consumed any
> messages for a long time, it has to issue requests to bookkeeper to read
> entries. We are observing very high latencies during these operations.
>
> throttle limit for the bookkeeper client is set at 5000. We are using the
> Hierarchical ledger manager.
>
> The bookies have 5 disks. 1 for the journal, 4 for the ledgers. On the
> bookie server pageSize=32768, open file limit = 20000. I checked a few
> bookies and the number of open ledgers was around 5000 on each. However,
> the add and read latencies on the servers were less than 1 second (measured
> using jconsole and the exposed jmx stats).
>
> Also, the bookie disks are standard spin drives and were doing about 40MBps
> reads and 10MBps writes when measured with dstat. Any thoughts would be
> helpful.
