Hi Stu, I'm not sure I understand your point. If we are already seeing pretty high latency with one thread (the case Aniruddha described), doesn't that mean we have a number of requests queued up? Adding more threads might only make the problem worse by queueing up even more requests. I'm possibly missing your point...
-Flavio

On Sep 25, 2012, at 9:37 PM, Stu Hood wrote:

> Separating by device would help, but will not allow the devices to be fully
> utilized: in order to buffer enough io commands into a disk's queue for the
> elevator algorithms to kick in, you either need to use multiple threads per
> disk, or native async IO (not trivially available within the JVM.)
>
> On Tue, Sep 25, 2012 at 2:23 AM, Flavio Junqueira <[email protected]> wrote:
>
>>
>> On Sep 25, 2012, at 10:55 AM, Aniruddha Laud wrote:
>>
>>> On Tue, Sep 25, 2012 at 1:35 AM, Flavio Junqueira <[email protected]> wrote:
>>>
>>>> Just to add a couple of comments to the discussion, separating reads and
>>>> writes into different threads should only help with queuing latency. It
>>>> wouldn't help with IO latency.
>>>>
>>>
>>> Yes, but with the current implementation, publishes latencies in hedwig
>>> suffer because of lagging subscribers. By separating read and write queues,
>>> we can at least guarantee that the write SLA is maintained (separate
>>> journal disk + separate thread would ensure that writes are not affected by
>>> read related seeks)
>>>
>>
>> Agreed and based on my comment below, I was wondering if it wouldn't be
>> best to separate traffic across threads by device instead of by operation
>> type.
>>
>>
>>>>
>>>> Also, it sounds like a good idea to have at least one thread per ledger
>>>> device. In the case of multiple ledger devices, if we use one single
>>>> thread, then the performance of the bookie will be driven by the slowest
>>>> disk, no?
>>>>
>>> yup, makes sense.
>>>
>>>>
>>>> -Flavio
>>>>
>>>> On Sep 25, 2012, at 10:24 AM, Ivan Kelly wrote:
>>>>
>>>>>> Could you give some information on what those shortcomings are? Also, do
>>>>>> let me know if you need any more information from our end.
>>>>> Off the top of my head:
>>>>> - reads and writes are handled in the same thread (as you have observed)
>>>>> - each entry read requires a single RPC.
>>>>> - entries are read in parallel
>>>>
>>> By parallel, you mean the BufferedChannel wrapper on top of FileChannel,
>>> right?
>>>
>>>>>
>>>>> Not all of these could result in the high latency you see, but if each
>>>>> entry is being read separately, a sync on the ledger disk in between
>>>>> will make a mess of the disk head scheduling.
>>>>
>>> Increasing the time interval between flushing log files might possibly
>>> help in this case then?
>>>
>>>>>
>>>>> -Ivan
>>>>
>>>>
>>> Thanks for the help :)
>>
>>
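
For reference, the threading layout discussed in this thread (a dedicated thread for journal writes, plus one or more threads per ledger device so that a slow disk only delays its own reads) could be sketched roughly as below. This is only an illustration under stated assumptions, not the bookie's actual code: the class name PerDeviceIo, its methods, and the rule that maps a ledger id to a device by modulo are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; these are not BookKeeper classes.
class PerDeviceIo implements AutoCloseable {
    // Single writer thread for the journal device, so writes never wait behind reads.
    private final ExecutorService journalWriter = Executors.newSingleThreadExecutor();
    // One pool per ledger device; more than one thread per pool lets several
    // requests sit in the disk queue at once (Stu's point about the elevator).
    private final List<ExecutorService> ledgerPools = new ArrayList<>();

    PerDeviceIo(int ledgerDevices, int threadsPerDevice) {
        for (int i = 0; i < ledgerDevices; i++) {
            ledgerPools.add(Executors.newFixedThreadPool(threadsPerDevice));
        }
    }

    // Assumption: ledgers are spread over devices by ledger id modulo device count.
    int deviceFor(long ledgerId) {
        return (int) Math.floorMod(ledgerId, (long) ledgerPools.size());
    }

    // Writes go to the journal thread only.
    <T> Future<T> submitWrite(Callable<T> op) {
        return journalWriter.submit(op);
    }

    // Reads go to the pool that owns the ledger's device.
    <T> Future<T> submitRead(long ledgerId, Callable<T> op) {
        return ledgerPools.get(deviceFor(ledgerId)).submit(op);
    }

    @Override
    public void close() {
        journalWriter.shutdown();
        for (ExecutorService pool : ledgerPools) {
            pool.shutdown();
        }
    }
}
```

With threadsPerDevice set to 1 this is the "at least one thread per ledger device" variant; a larger value trades more queued requests per disk for the chance of better head scheduling, which is the trade-off being debated above.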
