>>> I don't really know how Firebird handles the page precedence graph,
>>> but if we could write all the "independent" pages (say, the ones whose
>>> write order doesn't matter), then call fdatasync and continue, that would
>>> be much better than O_DSYNC mode.
>>>
>>> Is that possible? What does the precedence graph generally look like:
>>> does it usually have many "independent" pages in it?
>> 
>>     There are two general places where Firebird writes pages. Both of them,
>> of course, use the precedence graph. See below:
>> 
>> a) flush on commit\rollback\detach\background_gc :
>> 
>>     Pages (bdb's) to be flushed are collected into an array, the array is
>> sorted by page number, and then the pages which have no dependent pages are
>> written (in page number order). Of course, after a page is written, its
>> dependency is cleared, so after the first write pass we have another set of
>> "independent" pages. This process continues until all pages from the array
>> are written.
>> 
>> b) "single" page write when an old dirty page is eliminated from the page cache
>> 
>>     The page cache chooses the oldest dirty page and attempts to write it. But
>> if there are other dirty pages which are dependent on the "victim" page, they
>> will be written before the "victim" page.
>> 
>>     This is a recursive process, as the page currently being written could
>> have its own dependent pages which must be written first.
>> 
>> The same "single" write also occurs when:
>> - a page lock is downgraded and the page is dirty (Classic only)
>> - a new precedence relationship would make a cycle in the precedence graph
>> - a dirty page buffer is marked as system (or must_write) and the page is released
>> - the cache_writer thread (SS only) writes the oldest dirty page
>> 
>>     You see, in case (a) we have more or less grouped writes, while in case
>> (b) we have mostly single writes (though in some unfortunate cases it could
>> affect many pages). Case (a) is optimized, and we can call fdatasync() after
>> each pass of writes of independent pages. In case (b) we have no such
>> possibility, at least in the current code.
>> 
> 
> So I suppose that if fdatasync does not interfere when used from multiple
> threads, it may not make case (b) noticeably slower, but could make case (a)
> much faster.

    If your guess is correct (that O_DSYNC makes a barrier after every single
write), then yes, it should be better to remove O_DSYNC and add fdatasync()
after every flush pass over a group of independent pages.
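
    To make the idea concrete, here is a rough sketch in Python (not Firebird
code). The names flush_in_passes, write_first and PAGE_SIZE are made up for
illustration; write_first[p] is assumed to hold the pages that must reach disk
before page p. Each pass writes every page that is currently "independent" and
then issues a single fdatasync() for the whole pass:

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size, for illustration only

# Fall back to fsync on platforms without fdatasync.
_sync = getattr(os, "fdatasync", os.fsync)


def flush_in_passes(fd, pages, write_first):
    """Write dirty pages in passes, with one sync per pass.

    pages:       {page_number: bytes} of dirty pages to flush
    write_first: {page_number: set of pages that must be written before it}
    Returns the passes performed, each a sorted list of page numbers.
    """
    pending = set(pages)
    passes = []
    while pending:
        # "Independent" here means: nothing still pending must be written first.
        ready = sorted(p for p in pending
                       if not (write_first.get(p, set()) & pending))
        if not ready:
            raise RuntimeError("cycle in precedence graph")
        for p in ready:
            os.pwrite(fd, pages[p], p * PAGE_SIZE)
        _sync(fd)  # one barrier per pass instead of one per page write
        pending.difference_update(ready)
        passes.append(ready)
    return passes


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        dirty = {n: bytes([n]) * PAGE_SIZE for n in (1, 2, 3)}
        # Pages 1 and 2 may only be written after page 3.
        print(flush_in_passes(f.fileno(), dirty, {1: {3}, 2: {3}}))
        # -> [[3], [1, 2]]
```

    With O_DSYNC the same flush would pay for a barrier on each of the three
writes; here it pays for two, one per pass.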
 
> Does (b) happen only when there is no free buffer for another page?

    I enumerated 5 cases above.
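
    Whatever the trigger, the "single" write itself has the same recursive
shape. A rough sketch in Python (again not the actual cache code; write_page,
dirty, write_first and the write callback are hypothetical names, and
write_first[p] is assumed to hold the pages that must be written before p):

```python
def write_page(page, dirty, write_first, write, in_progress=None):
    """Write `page`, first writing any dirty pages that must precede it.

    dirty:       set of dirty page numbers (updated in place)
    write_first: {page_number: set of pages that must be written before it}
    write:       callback that performs the actual page write
    """
    if in_progress is None:
        in_progress = set()
    if page not in dirty:
        return  # clean (or already written) pages need no work
    if page in in_progress:
        raise RuntimeError("cycle in precedence graph")
    in_progress.add(page)
    # Recurse: each prerequisite may have prerequisites of its own.
    for dep in sorted(write_first.get(page, ())):
        write_page(dep, dirty, write_first, write, in_progress)
    write(page)
    dirty.discard(page)
    in_progress.discard(page)


if __name__ == "__main__":
    order = []
    dirty = {1, 2, 3, 4}
    # Evicting page 1 forces page 2 first, which in turn forces page 3.
    write_page(1, dirty, {1: {2}, 2: {3}}, order.append)
    print(order)  # -> [3, 2, 1]
```

    This is the "unfortunate case" mentioned above: one victim page can drag
a chain of other pages with it, and each of them is written one at a time.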

> Could it be changed to eliminate not just one independent dirty page, but
> a number of them in one pass?

    I thought about changing cache_writer in this direction: making it do a
"mini-flush" instead of writing one page at a time. But note that we have no
cache_writer thread in CS\SC, and we can do little (almost nothing) when a page
lock is downgraded (a frequent case in CS, I think).

    I don't think that writing more than one dirty page when a free buffer is
needed is a good idea, as it could delay the user process significantly. I
could be wrong...

Regards,
Vlad

------------------------------------------------------------------------------
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
