>
> If with one thread we are getting pretty high latency (case Aniruddha
> described), doesn't it mean we have a number of requests queued up?

If you have 1 thread in the system and you're doing blocking IO, then the
OS/hardware can only know about 1 IO operation at a time (the thread can't
issue the next operation while it is blocked waiting for the first one to
complete). If instead you have 16 threads issuing IO to one physical
device, then there can be 16 outstanding IO operations at once.
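
As a concrete illustration, here's a rough Java sketch (not BookKeeper
code; the file path and sizes are placeholders) of what that looks like
from the application side: N threads each doing blocking positional reads
against the same file keep up to N operations in flight at the device,
while a single thread never has more than one.

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

public class OutstandingIoDemo {
    // Placeholder path and sizes -- point this at a real file (ideally much
    // larger than RAM, so the page cache doesn't hide the device behaviour).
    static final String PATH = "/data/test/random-read.bin";
    static final long FILE_SIZE = 1L << 30;   // assume the file is 1 GiB
    static final int READ_SIZE = 4096;        // one 4 KiB read per operation
    static final int THREADS = 16;            // concurrent blocking readers
    static final int READS_PER_THREAD = 10_000;

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(Paths.get(PATH), StandardOpenOption.READ)) {
            for (int t = 0; t < THREADS; t++) {
                pool.submit(() -> {
                    ByteBuffer buf = ByteBuffer.allocateDirect(READ_SIZE);
                    for (int i = 0; i < READS_PER_THREAD; i++) {
                        long offset = ThreadLocalRandom.current().nextLong(FILE_SIZE - READ_SIZE);
                        buf.clear();
                        try {
                            // Blocking positional read: this thread is stuck until its IO
                            // completes, but the other threads keep their own operations
                            // outstanding at the device in the meantime.
                            ch.read(buf, offset);
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        }
        System.out.printf("%d threads finished in %d ms%n",
                THREADS, (System.nanoTime() - start) / 1_000_000);
    }
}

Comparing a run with THREADS = 1 against THREADS = 16, while watching
iostat on the device, is a simple way to see the effect.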

An "outstanding" IO operation might be queued a) in the kernel, via an IO
scheduler [0], b) at a raid controller, c) in the device itself [1]. All of
these layers implement reordering of operations: and the more queueing, the
more chance to optimize access order.
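
To make the reordering point concrete, here's a toy sketch (purely my own
illustration, nothing to do with the actual kernel or controller code):
given a batch of pending read offsets, servicing them in sorted order is a
crude stand-in for what the elevator does, and with a queue depth of 1
there is never anything to sort.

import java.util.Arrays;

/** Toy model of elevator-style reordering: a deeper queue allows a shorter seek path. */
public class ElevatorSketch {
    /** Total head travel (in offset units) when requests are serviced in the given order. */
    static long travel(long[] offsets) {
        long distance = 0;
        long head = 0;                       // assume the head starts at offset 0
        for (long o : offsets) {
            distance += Math.abs(o - head);
            head = o;
        }
        return distance;
    }

    public static void main(String[] args) {
        // Pending read offsets in arrival order (made-up numbers).
        long[] arrival = {900, 100, 850, 200, 800, 50, 750};

        long[] sorted = arrival.clone();
        Arrays.sort(sorted);                 // sweep in one direction, elevator-style

        System.out.println("travel, arrival order: " + travel(arrival));
        System.out.println("travel, sorted order:  " + travel(sorted));
        // With only one outstanding request at a time, the scheduler never
        // sees a batch to reorder and pays the arrival-order cost.
    }
}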

One of the simplest metrics to watch is '*avgqu-sz*', reported by Linux
`iostat -x`: it is the average number of operations queued for a device.
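
If you'd rather sample that from inside the JVM than from iostat, something
like this works on Linux (a rough sketch; it assumes the usual
/proc/diskstats layout where field 12 is the number of IOs currently in
progress, and note it is an instantaneous count rather than the average
that iostat reports):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Rough sketch: print per-device in-flight IO counts from /proc/diskstats (Linux only). */
public class InFlightIo {
    public static void main(String[] args) throws IOException {
        for (String line : Files.readAllLines(Paths.get("/proc/diskstats"), StandardCharsets.UTF_8)) {
            String[] f = line.trim().split("\\s+");
            if (f.length < 12) {
                continue;                    // unexpected line, skip it
            }
            // f[2] is the device name, f[11] is "IOs currently in progress"
            System.out.println(f[2] + ": " + f[11] + " in flight");
        }
    }
}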

[0] for example, CFQ: http://en.wikipedia.org/wiki/CFQ
[1] NCQ: http://en.wikipedia.org/wiki/Native_Command_Queuing



On Tue, Sep 25, 2012 at 2:06 PM, Flavio Junqueira <[email protected]> wrote:

> Hi Stu, I'm not sure I understand your point. If with one thread we are
> getting pretty high latency (case Aniruddha described), doesn't it mean we
> have a number of requests queued up? Adding more threads might only make
> the problem worse by queueing up even more requests. I'm possibly missing
> your point...
>
> -Flavio
>
> On Sep 25, 2012, at 9:37 PM, Stu Hood wrote:
>
> > Separating by device would help, but will not allow the devices to be fully
> > utilized: in order to buffer enough io commands into a disk's queue for the
> > elevator algorithms to kick in, you either need to use multiple threads per
> > disk, or native async IO (not trivially available within the JVM.)
> >
> > On Tue, Sep 25, 2012 at 2:23 AM, Flavio Junqueira <[email protected]> wrote:
> >
> >>
> >> On Sep 25, 2012, at 10:55 AM, Aniruddha Laud wrote:
> >>
> >>> On Tue, Sep 25, 2012 at 1:35 AM, Flavio Junqueira <[email protected]> wrote:
> >>>
> >>>> Just to add a couple of comments to the discussion, separating reads and
> >>>> writes into different threads should only help with queuing latency. It
> >>>> wouldn't help with IO latency.
> >>>>
> >>>
> >>> Yes, but with the current implementation, publish latencies in hedwig
> >>> suffer because of lagging subscribers. By separating read and write queues,
> >>> we can at least guarantee that the write SLA is maintained (separate
> >>> journal disk + separate thread would ensure that writes are not affected by
> >>> read related seeks)
> >>>
> >>
> >> Agreed and based on my comment below, I was wondering if it wouldn't be
> >> best to separate traffic across threads by device instead of by operation
> >> type.
> >>
> >>
> >>>>
> >>>> Also, it sounds like a good idea to have at least one thread per ledger
> >>>> device. In the case of multiple ledger devices, if we use one single
> >>>> thread, then the performance of the bookie will be driven by the slowest
> >>>> disk, no?
> >>>>
> >>> yup, makes sense.
> >>>
> >>>>
> >>>> -Flavio
> >>>>
> >>>> On Sep 25, 2012, at 10:24 AM, Ivan Kelly wrote:
> >>>>
> >>>>>> Could you give some information on what those shortcomings are? Also, do
> >>>>>> let me know if you need any more information from our end.
> >>>>> Off the top of my head:
> >>>>> - reads and writes are handled in the same thread (as you have observed)
> >>>>> - each entry read requires a single RPC.
> >>>>> - entries are read in parallel
> >>>>
> >>> By parallel, you mean the BufferedChannel wrapper on top of FileChannel,
> >>> right?
> >>>
> >>>>>
> >>>>> Not all of these could result in the high latency you see, but if each
> >>>>> entry is being read separately, a sync on the ledger disk in between
> >>>>> will make a mess of the disk head scheduling.
> >>>>
> >>> Increasing the time interval between flushing log files might possibly
> >>> help in this case then?
> >>>
> >>>>>
> >>>>> -Ivan
> >>>>
> >>>>
> >>> Thanks for the help :)
> >>
> >>
>
>
