Laurynas,

Thank you for your explanation. It helps me a lot. I appreciate your help! Thanks to everyone else for their help as well!

Xiaofei

On Sun, May 10, 2015 at 9:53 PM, Laurynas Biveinis <[email protected]> wrote:
> Undo logs log only a subset of a database instance. And, since their
> purpose is different, by the time of crash recovery the undo logs
> might be purged.
>
> 2015-05-10 2:57 GMT+03:00 Xiaofei Du <[email protected]>:
> > Laurynas,
> >
> > We cannot recover from a torn page using only the redo log. But wouldn't
> > the undo log record enough information for recovery in the case of a torn
> > page? The undo log should have the old values of affected rows. So
> > shouldn't it be enough to recover a torn page using information from the
> > undo log?
> >
> > Xiaofei
> >
> > On Sat, May 9, 2015 at 12:07 AM, Laurynas Biveinis
> > <[email protected]> wrote:
> >>
> >> Xiaofei -
> >>
> >> We can indeed detect the torn page write without the doublewrite
> >> buffer (and WebScaleSQL has a patch utilising this observation). But
> >> we need not only to detect, but to recover the page as well. And
> >> without the doublewrite, if we discard the page, we have nothing: a
> >> half-old half-new page on the disk and the redo log records for that
> >> page are not enough to recover it.
> >>
> >> 2015-05-09 8:44 GMT+03:00 Xiaofei Du <[email protected]>:
> >> > Justin,
> >> >
> >> > I think the fsync I was concerned about and the torn page problem are
> >> > two different things. But now I have a question about the double write
> >> > buffer. If we can detect a torn page by checking the top and bottom of
> >> > a page, why would we still need the double write buffer? If the page
> >> > is consistent, then we use it, otherwise, we just discard it. Maybe
> >> > this is a naive question. But please let me know. Thanks.
> >> >
> >> > Xiaofei
> >> >
> >> > On Fri, May 8, 2015 at 9:24 PM, Justin Swanhart <[email protected]>
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> The log does not have whole pages. Pages must not be torn for the
> >> >> recovery process to work. An fsync is required when a page is written
> >> >> to disk. During recovery all changes since the last checkpoint are
> >> >> replayed, then transactions that do not have a commit marker are
> >> >> rolled back. This is called roll forward/roll back recovery.
> >> >>
> >> >> --Justin
> >> >>
> >> >> On Fri, May 8, 2015 at 6:09 PM, Xiaofei Du <[email protected]>
> >> >> wrote:
> >> >>>
> >> >>> Justin,
> >> >>>
> >> >>> I was wondering whether fsync is needed each time after a write. The
> >> >>> operations are already in the log. So recovery can always be done
> >> >>> from the log. The difference is that during recovery, we need to go
> >> >>> back further in the log and it will take longer. But in that way, I
> >> >>> guess it would be hard to coordinate with the kernel flush thread.
> >> >>>
> >> >>> Xiaofei
> >> >>>
> >> >>> On Fri, May 8, 2015 at 2:06 PM, Justin Swanhart <[email protected]>
> >> >>> wrote:
> >> >>>>
> >> >>>> Hi,
> >> >>>>
> >> >>>> InnoDB recovery cannot handle torn pages. An fsync is required to
> >> >>>> ensure that the page is fully written to disk. This is also why the
> >> >>>> doublewrite buffer is used. Before pages are written down to disk,
> >> >>>> they are first written sequentially into the doublewrite buffer.
> >> >>>> This buffer is synced, then async page writing can proceed. If the
> >> >>>> database crashes, the pages in flight will be rewritten by the
> >> >>>> doublewrite buffer. The detection mechanism for torn pages comes
> >> >>>> from an LSN, which is written into the top and the bottom of the
> >> >>>> page. If the LSNs at the top and bottom do not match, the page is
> >> >>>> torn.
> >> >>>>
> >> >>>> Regards,
> >> >>>>
> >> >>>> --Justin
> >> >>>>
> >> >>>> On Fri, May 8, 2015 at 12:43 PM, Xiaofei Du <[email protected]>
> >> >>>> wrote:
> >> >>>>>
> >> >>>>> Laurynas,
> >> >>>>>
> >> >>>>> This is exactly what I was looking for. I went through these
> >> >>>>> functions before. I disabled the double write buffer, so I didn't
> >> >>>>> pay attention to code under buf_dblwr... The reason I asked this
> >> >>>>> question is that I didn't know how the recovery process works, so
> >> >>>>> I was wondering if it's necessary to fsync after each write. It's
> >> >>>>> a performance concern. Anyway, thank you very much!
> >> >>>>>
> >> >>>>> Jan -- Thank you for your answer too!
> >> >>>>>
> >> >>>>> Xiaofei
> >> >>>>>
> >> >>>>> On Thu, May 7, 2015 at 9:59 PM, Laurynas Biveinis
> >> >>>>> <[email protected]> wrote:
> >> >>>>>>
> >> >>>>>> Xiaofei -
> >> >>>>>>
> >> >>>>>> fsync is performed for all the flush types (LRU, flush, single
> >> >>>>>> page) if it is asked for (innodb_flush_method !=
> >> >>>>>> O_DIRECT_NO_FSYNC). The apparent difference in sync and async is
> >> >>>>>> not because of the sync difference itself, but because of the
> >> >>>>>> flush type difference. The single page flush flushes one page,
> >> >>>>>> and requests an fsync for its file. Other flushes flush in
> >> >>>>>> batches, don't have to fsync for each written page individually
> >> >>>>>> but rather sync once at the end. Then doublewrite complicates
> >> >>>>>> this further. If it is disabled, fsync will happen in
> >> >>>>>> buf_dblwr_sync_datafiles called from
> >> >>>>>> buf_dblwr_flush_buffered_writes called from buf_flush_common
> >> >>>>>> called at the end of either LRU or flush list flush. If
> >> >>>>>> doublewrite is enabled, fsync will happen in buf_dblwr_update
> >> >>>>>> called from buf_flush_write_complete.
> >> >>>>>>
> >> >>>>>> 2015-05-07 9:01 GMT+03:00 Xiaofei Du <[email protected]>:
> >> >>>>>> > Hi Laurynas,
> >> >>>>>> >
> >> >>>>>> > On Wed, May 6, 2015 at 9:14 PM, Laurynas Biveinis
> >> >>>>>> > <[email protected]> wrote:
> >> >>>>>> >>
> >> >>>>>> >> Xiaofei -
> >> >>>>>> >>
> >> >>>>>> >> > Does InnoDB maintain a dirty
> >> >>>>>> >> > page table?
> >> >>>>>> >>
> >> >>>>>> >> You must be referring to the buffer pool flush_list.
> >> >>>>>> >
> >> >>>>>> > You are right. The flush_list can be used for recovery and
> >> >>>>>> > checkpoint.
> >> >>>>>> >
> >> >>>>>> >> > Is fsync called to guarantee the page to be on persistent
> >> >>>>>> >> > storage so that the dirty page table can be updated? If
> >> >>>>>> >> > this is the case, when is the dirty page table updated for
> >> >>>>>> >> > asynchronous IOs?
> >> >>>>>> >>
> >> >>>>>> >> Check buf_flush_write_complete in buf0flu.cc. For async IO it
> >> >>>>>> >> is called from buf_page_io_complete in buf0buf.cc.
> >> >>>>>> >
> >> >>>>>> > You are right that this is the place it updates the dirty page
> >> >>>>>> > information. But I still don't understand why the fsync is
> >> >>>>>> > needed for synchronous IOs, but not for the AIOs. Jan Lindstrom
> >> >>>>>> > said fsync is also called for other AIO operations. But I could
> >> >>>>>> > only find it true in one of many AIO operations. Or maybe I am
> >> >>>>>> > missing something still?
> >> >>>>>> >
> >> >>>>>> >> --
> >> >>>>>> >> Laurynas
> >> >>>>>>
> >> >>>>>> --
> >> >>>>>> Laurynas
> >> >>>>>
> >> >>>>> _______________________________________________
> >> >>>>> Mailing list: https://launchpad.net/~maria-discuss
> >> >>>>> Post to : [email protected]
> >> >>>>> Unsubscribe : https://launchpad.net/~maria-discuss
> >> >>>>> More help : https://help.launchpad.net/ListHelp
> >>
> >> --
> >> Laurynas
>
> --
> Laurynas
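
For anyone reading the thread later, the torn-page check and the doublewrite recovery described above can be sketched as a toy simulation. This is only a minimal sketch, not InnoDB code: the page layout (an 8-byte stamp at both ends of a 64-byte page), the dictionaries standing in for the data file and the doublewrite buffer, and the names make_page, is_torn, torn_write and recover are all assumptions made for illustration.

```python
import struct

PAGE_SIZE = 64      # toy size, chosen for the example only
LSN_FMT = ">Q"      # 8-byte stamp; illustrative, not InnoDB's on-disk format
LSN_LEN = struct.calcsize(LSN_FMT)


def make_page(lsn: int, payload: bytes) -> bytes:
    """Build a page with the same LSN stamped at the top and the bottom."""
    body_len = PAGE_SIZE - 2 * LSN_LEN
    body = payload.ljust(body_len, b"\0")[:body_len]
    stamp = struct.pack(LSN_FMT, lsn)
    return stamp + body + stamp


def is_torn(page: bytes) -> bool:
    """A page is torn if the top and bottom stamps disagree."""
    return page[:LSN_LEN] != page[-LSN_LEN:]


def torn_write(old: bytes, new: bytes, cut: int) -> bytes:
    """Simulate a crash mid-write: the first `cut` bytes are new, the rest old."""
    return new[:cut] + old[cut:]


def recover(datafile: dict, doublewrite: dict) -> list:
    """Restore every torn page from its doublewrite copy; return their ids."""
    restored = []
    for page_id, page in datafile.items():
        if is_torn(page):
            copy = doublewrite.get(page_id)
            if copy is None or is_torn(copy):
                # Detection alone is not enough: without an intact copy
                # there is nothing to rebuild the page from.
                raise RuntimeError(f"page {page_id} is torn and unrecoverable")
            datafile[page_id] = copy
            restored.append(page_id)
    return restored


old = make_page(100, b"old row values")
new = make_page(200, b"new row values")

# Step 1: the full new page goes to the doublewrite area and is synced.
doublewrite = {7: new}
# Step 2: the in-place write is interrupted halfway through.
datafile = {7: torn_write(old, new, PAGE_SIZE // 2)}

print(is_torn(datafile[7]))
print(recover(datafile, doublewrite))
print(datafile[7] == new)
```

Calling recover with an empty doublewrite dictionary raises the error instead, which mirrors the point made in the thread: the mismatch can be detected without the doublewrite buffer, but the half-old half-new page cannot be repaired from the redo log records alone.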

