Bug#569598: disk caches?

2010-02-16 Thread Philipp Weis
On 2010-02-16 23:12, Christoph Hellwig  wrote:
> On Tue, Feb 16, 2010 at 03:42:25PM -0500, Philipp Weis wrote:
> > How could it end up in the page cache for this setup? There is no
> > filesystem layer below the loop device. Any write that reaches the
> > loop device is passed through to lvm, then md and finally the disk.
> > Are you saying that there is extra caching going on at the lvm or md
> > layers?
> 
> Yes, the block device node is a cached access mode, and uses the same
> pagecache as a filesystem uses.

Ugh, thanks for clearing that up. Is there a way to disable the page
cache for lvm and md? Or is it just not possible to have this work
reliably as long as there's no write barrier support in loop? From
what I understand, 2.6.33 will have full barrier support for both md
and lvm.

Thanks,

Philipp






Bug#569598: disk caches?

2010-02-16 Thread Christoph Hellwig
On Tue, Feb 16, 2010 at 03:42:25PM -0500, Philipp Weis wrote:
> How could it end up in the page cache for this setup? There is no
> filesystem layer below the loop device. Any write that reaches the
> loop device is passed through to lvm, then md and finally the disk.
> Are you saying that there is extra caching going on at the lvm or md
> layers?

Yes, the block device node is a cached access mode, and uses the same
pagecache as a filesystem uses.
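
As an illustration of what "cached access mode" means for a userspace writer: a process that opens a file or block device node normally has its writes buffered in the page cache, while opening with O_SYNC makes each write return only after the kernel has pushed the data to the underlying device. The following is a minimal Python sketch under that assumption; the helper name `write_sync` is hypothetical and chosen for illustration. Note that this only changes behaviour for the process that opens the node. It does not change how the loop driver issues its writes from inside the kernel, which is the problem discussed in this thread. (O_DIRECT is another way to bypass the page cache, but it requires suitably aligned buffers and is not shown here.)

```python
import os

def write_sync(path, data, offset=0):
    """Write data at offset through a descriptor opened with O_SYNC.

    Hypothetical helper for illustration. `path` may be a regular
    file or, with sufficient privileges, a block device node.
    Returns the number of bytes written by the single pwrite call.
    """
    # O_SYNC: each write blocks until the kernel reports the data
    # as written to the underlying device, instead of merely
    # copying it into the page cache.
    fd = os.open(path, os.O_WRONLY | os.O_SYNC)
    try:
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)
```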




-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20100216221248.ga8...@lst.de



Bug#569598: disk caches?

2010-02-16 Thread Philipp Weis
Hi,

On 2010-02-16 20:18, Christoph Hellwig  wrote:
> Hmm, it really does look like the typical write cache corruption.
> 
> Looking a bit at the loop driver as a suspect I think I know what the
> problem is:
> 
>  - loop writes data either using the address_space operations, or
>    using ->write, but it never does the O_SYNC processing
>    (just ->fsync in modern kernels, but a little different in 2.6.26)
> 
> So when using loop data simply gets written into the page cache, but
> there is no guarantee it ever goes out to disk.

How could it end up in the page cache for this setup? There is no
filesystem layer below the loop device. Any write that reaches the
loop device is passed through to lvm, then md and finally the disk.
Are you saying that there is extra caching going on at the lvm or md
layers?

> This bug is still present in latest upstream - a patch to introduce
> barrier support calling ->fsync went in a while ago but caused
> mysterious lockups and was reverted, with the patch author never
> coming back to try to figure it out.

That's really unfortunate.

Philipp






Bug#569598: disk caches?

2010-02-16 Thread Christoph Hellwig
On Tue, Feb 16, 2010 at 10:58:06AM -0500, Philipp Weis wrote:
> On 2010-02-16 11:21, Christoph Hellwig  wrote:
> > Do you have disk write caches disabled on the machine?  Neither md, nor
> > dm, nor loop pass through barrier requests.  Without disabling the
> > volatile write cache on the disks you will lose data every time the
> > machine is not shut down cleanly.  The messages you see are typical
> > for that kind of corruption.
> 
> The caches are all off.

Hmm, it really does look like the typical write cache corruption.

Looking a bit at the loop driver as a suspect I think I know what the
problem is:

 - loop writes data either using the address_space operations, or
   using ->write, but it never does the O_SYNC processing
   (just ->fsync in modern kernels, but a little different in 2.6.26)

So when using loop data simply gets written into the page cache, but
there is no guarantee it ever goes out to disk.
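
In userspace terms, the gap described above looks like this: a plain write only copies data into the page cache, and a separate fsync call is what asks the kernel to push it out to the device. The following is a minimal Python sketch of that two-step pattern, assuming a regular file path; the helper name `write_durably` is made up for illustration and is not part of the loop driver or of any kernel interface.

```python
import os

def write_durably(path, data):
    """Write data to path, then flush it out of the page cache.

    Hypothetical helper for illustration only.
    Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop.
        while written < len(data):
            written += os.write(fd, data[written:])
        # At this point the data may exist only in the page cache.
        # fsync asks the kernel to write it to the underlying device.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

Dropping the `os.fsync` line gives the behaviour attributed to loop above: the write succeeds, but nothing forces the data out before a crash. Even with fsync, whether the data survives a power loss still depends on the layers below honouring the flush, which is what the barrier discussion in this thread is about.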

This bug is still present in latest upstream - a patch to introduce
barrier support calling ->fsync went in a while ago but caused
mysterious lockups and was reverted, with the patch author never
coming back to try to figure it out.




-- 
Archive: http://lists.debian.org/20100216191800.ga1...@lst.de



Bug#569598: disk caches?

2010-02-16 Thread Philipp Weis
On 2010-02-16 11:21, Christoph Hellwig  wrote:
> Do you have disk write caches disabled on the machine?  Neither md, nor
> dm, nor loop pass through barrier requests.  Without disabling the
> volatile write cache on the disks you will lose data every time the
> machine is not shut down cleanly.  The messages you see are typical
> for that kind of corruption.

The caches are all off.

| # hdparm -W /dev/hd[abeg]
| 
| /dev/hda:
|  write-caching =  0 (off)
| 
| /dev/hdb:
|  write-caching =  0 (off)
| 
| /dev/hde:
|  write-caching =  0 (off)
| 
| /dev/hdg:
|  write-caching =  0 (off)

I learned about this and deactivated the caches about half a year back, and
luckily didn't run into any problems before then. The machine has been rebooted
since then several times without any issues, so I don't think this could be a
left-over artifact from back then.
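
A check like the one above can be scripted so that it is repeated after every reboot. The following is a rough Python sketch that parses text in the layout shown in the `hdparm -W` output above; the function name `parse_write_caching` is hypothetical, and the parsing assumes that layout (a `/dev/...:` line followed by a `write-caching = N` line), which other hdparm versions may not follow exactly.

```python
import re

def parse_write_caching(hdparm_output):
    """Map each device to True (write cache on) or False (off).

    Hypothetical helper; expects text laid out like the
    `hdparm -W` output quoted in this message.
    """
    states = {}
    device = None
    for line in hdparm_output.splitlines():
        line = line.strip()
        # A line such as "/dev/hda:" starts a new device section.
        if line.startswith("/dev/") and line.endswith(":"):
            device = line[:-1]
            continue
        match = re.match(r"write-caching\s*=\s*(\d+)", line)
        if match and device is not None:
            states[device] = match.group(1) != "0"
    return states

sample = """
/dev/hda:
 write-caching =  0 (off)

/dev/hdb:
 write-caching =  0 (off)
"""

# True only if at least one device was parsed and none has caching on.
states = parse_write_caching(sample)
all_off = bool(states) and not any(states.values())
```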

Thanks,

Philipp





-- 
Archive: http://lists.debian.org/20100216155806.ga11...@zaphod.pweis.com



Bug#569598: disk caches?

2010-02-16 Thread Christoph Hellwig
Do you have disk write caches disabled on the machine?  Neither md, nor
dm, nor loop pass through barrier requests.  Without disabling the
volatile write cache on the disks you will lose data every time the
machine is not shut down cleanly.  The messages you see are typical
for that kind of corruption.



-- 
Archive: http://lists.debian.org/20100216102114.ga8...@lst.de