LRU fashion on demand. One block at a time.

On Aug 25, 2015 7:54 PM, "Atri Sharma" <[email protected]> wrote:
> Quick question then.
>
> Just to clarify my understanding: does bufferserver dump all of the data when full, or does it start evicting in LRU fashion on demand?
>
> On 26 Aug 2015 03:53, "Chetan Narsude" <[email protected]> wrote:
>
> > I have a hunch that there may be a problem in terms of added latency. But ultimately we will use a benchmark to rule out the hunches if you strongly believe in it.
> >
> > Here is what happens today: bufferserver tries to hold the data in memory for as long as possible, but no longer than needed. If you do not persist the data to disk, you do not have to load it either, since it is already in memory. This greatly reduces the disk-related latency. Even when we do have to persist the data, we pick the block (the correct implementation of this is pending) which we will not need back in memory immediately.
> >
> > The converse is presumably true as well: if you start persisting the data in anticipation of the buffer being full, you will also need to load this data back when needed. This will result in frequent round-trips to disk, adding to the latency.
> >
> > --
> > Chetan
> >
> > On Tue, Aug 25, 2015 at 12:53 PM, Atri Sharma <[email protected]> wrote:
> >
> > > What are the problems you see around loading? I think it might actually help, since we might end up exploiting locality of reference for similar data in a single window.
> > >
> > > On 25 Aug 2015 22:14, "Chetan Narsude" <[email protected]> wrote:
> > >
> > > > This looks at the store side of the equation; what is the impact on the load side when the time comes to use this data?
> > > > --
> > > > Chetan
> > > >
> > > > On Tue, Aug 25, 2015 at 8:41 AM, Atri Sharma <[email protected]> wrote:
> > > >
> > > > > On 25 Aug 2015 10:34, "Vlad Rozov" <[email protected]> wrote:
> > > > > >
> > > > > > I think that the bufferserver should be allowed to use no more than an application-specified amount of memory; behavior like the Linux file cache will make it difficult to allocate the operator/container cache without reserving too much memory for spikes.
> > > > >
> > > > > Sure, agreed.
> > > > >
> > > > > My idea is to use *less* memory than what is allocated by the application, since I am suggesting some level of control over group commits. So I am thinking of taking the patch you wrote and having it trigger each time the buffer server fills by n units, n being the window size.
> > > > >
> > > > > If n exceeds the allocated memory, we can error out.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > > But I may be wrong, and it will be good to have the suggested behavior implemented in a prototype and benchmark the prototype's performance.
> > > > > >
> > > > > > Vlad
> > > > > >
> > > > > > On 8/24/15 18:24, Atri Sharma wrote:
> > > > > >>
> > > > > >> The idea is that if bufferserver dumps *all* pages once it runs out of memory, then it's a huge I/O spike. If it starts paging out once it runs out of memory, then it behaves like a normal cache and a further level of paging control can be applied.
> > > > > >>
> > > > > >> My idea is that there should be functionality to control the amount of data that is committed together. This also allows me to 1) define the optimal way writes work on my disk and 2) allow my application to define locality of data.
> > > > > >> For example, I might be performing graph analysis in which a time window's data consists of a subgraph.
> > > > > >>
> > > > > >> On 25 Aug 2015 02:46, "Chetan Narsude" <[email protected]> wrote:
> > > > > >>
> > > > > >>> The bufferserver writes pages to disk *only when* it runs out of memory to hold them.
> > > > > >>>
> > > > > >>> Can you elaborate on where you see I/O spikes?
> > > > > >>>
> > > > > >>> --
> > > > > >>> Chetan
> > > > > >>>
> > > > > >>> On Mon, Aug 24, 2015 at 12:39 PM, Atri Sharma <[email protected]> wrote:
> > > > > >>>
> > > > > >>>> Folks,
> > > > > >>>>
> > > > > >>>> I was wondering if it makes sense to have a functionality in which bufferserver writes out data pages to disk in batches defined by timeslice/application window.
> > > > > >>>>
> > > > > >>>> This will allow flexible workloads and reduce I/O spikes (I understand that we have non-blocking I/O, but it would still incur disk head costs).
> > > > > >>>>
> > > > > >>>> Thoughts?
> > > > > >>>>
> > > > > >>>> --
> > > > > >>>> Regards,
> > > > > >>>>
> > > > > >>>> Atri
> > > > > >>>> *l'apprenant*
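For readers following the thread: the current behavior Chetan describes (hold blocks in memory as long as possible; when full, evict in LRU fashion, on demand, one block at a time) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual bufferserver code; `BlockCache` and its methods are hypothetical names.

```python
from collections import OrderedDict

class BlockCache:
    """Hypothetical sketch of LRU eviction on demand: blocks stay in
    memory until the cache is full, then exactly one least-recently-used
    block is spilled to a stand-in "disk" store."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()   # block_id -> data, LRU order (oldest first)
        self.spilled = {}             # stands in for the on-disk store

    def put(self, block_id, data):
        if len(self.blocks) >= self.max_blocks:
            # Evict exactly one LRU block, and only because we must.
            lru_id, lru_data = self.blocks.popitem(last=False)
            self.spilled[lru_id] = lru_data
        self.blocks[block_id] = data

    def get(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # mark as recently used
            return self.blocks[block_id]
        # Round-trip to "disk": the latency cost Chetan warns about.
        data = self.spilled.pop(block_id)
        self.put(block_id, data)
        return data
```

Note that a `get` of a spilled block may itself force another eviction, which is exactly the "frequent round-trips to disk" risk raised for eager persistence.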

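Atri's proposal in the thread (group commits triggered every n units, n being the window size, erroring out if n exceeds the allocated memory) could be sketched as follows. All names here are illustrative assumptions, not Apex APIs, and units are counted abstractly rather than in bytes.

```python
class WindowedWriter:
    """Hypothetical sketch of window-batched group commits: buffer
    incoming units and flush one application window's worth together,
    failing fast if the window cannot fit in the allocated memory."""

    def __init__(self, window_size, allocated_memory):
        # "If n exceeds the allocated memory, we can error out."
        if window_size > allocated_memory:
            raise ValueError("window size exceeds allocated memory")
        self.window_size = window_size
        self.pending = []
        self.flushes = []   # stands in for group commits to disk

    def write(self, unit):
        self.pending.append(unit)
        if len(self.pending) >= self.window_size:
            # Group-commit one window's worth of data together,
            # preserving per-window locality on disk (e.g. a time
            # window's subgraph in the graph-analysis example).
            self.flushes.append(list(self.pending))
            self.pending.clear()
```

Whether this batching helps or hurts latency versus evict-on-demand is precisely what the thread agrees should be settled by a prototype and benchmark.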