On 10/17/2016 11:55 AM, Christoph Hellwig wrote:
On Mon, Oct 17, 2016 at 09:01:29AM -0400, Ric Wheeler wrote:
This must go in - without this fix, there is no data integrity for any file 
megaraid always had odd ideas on cache flushing, and this might be
an opportunity to write down all the assumptions and document them.

In effect, this driver by default has been throwing away SYNCHRONIZE_CACHE
commands even when acting in JBOD/non-RAID mode.
That would explain some issues we've seen with megaraid hardware, but
it seems a bit too shocking to be true.

Looking over the patch I disagree with the module option - we must do
the right thing by default, which is sending SYNCHRONIZE_CACHE commands
if the WCE bit is set.  If there are controllers where this is harmful
for RAID mode and we can't fix the firmware in time we'll need to make
special exceptions for this case in the driver based on the PCI ID
and knowing what we talk to instead of leaving it to the user.
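
A rough sketch of the policy described above, in plain C rather than actual driver code. The function names, the quirk table, and the placeholder ID 0xFFFF are all hypothetical and are not taken from megaraid or from the patch under discussion; the point is only that the default forwards the flush whenever WCE is set, and any exception is keyed on a known PCI ID instead of a user-visible option.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk entry: a controller whose RAID firmware is known
 * to misbehave when it receives SYNCHRONIZE_CACHE. */
struct flush_quirk {
	uint16_t pci_device_id;
};

/* Placeholder ID for illustration only; not a real megaraid device ID. */
static const struct flush_quirk no_flush_in_raid_mode[] = {
	{ 0xFFFF },
};

static bool has_raid_flush_quirk(uint16_t pci_device_id)
{
	size_t n = sizeof(no_flush_in_raid_mode) / sizeof(no_flush_in_raid_mode[0]);

	for (size_t i = 0; i < n; i++) {
		if (no_flush_in_raid_mode[i].pci_device_id == pci_device_id)
			return true;
	}
	return false;
}

/* Decide whether a SYNCHRONIZE_CACHE command should be passed on. */
static bool should_send_sync_cache(bool wce, bool is_raid_volume,
				   uint16_t pci_device_id)
{
	if (!wce)
		return false;	/* no write-back cache advertised */
	if (is_raid_volume && has_raid_flush_quirk(pci_device_id))
		return false;	/* listed exception, RAID mode only */
	return true;		/* default: send the flush */
}
```

With this shape, JBOD/non-RAID devices always get the flush when WCE is set, and the exception list stays in the driver where it can be reviewed.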

I do agree - having users be able to disable this easily is asking for trouble. Sending the flushes will be slower, but only because the cache flush is actually effective.

* having T10 & T13 report the existence of a volatile write cache - this is
different than WCE set, some devices have a write cache and are
battery/flash backed.
T10 is pretty clear now that WCE should only be set for a volatile
cache.  For a while they had odd NV bits to allow flushing a
non-volatile cache, but in the latest revisions all that is gone.

If T10 has clarity on this, then the actual fix would be to have the driver advertise WCE enabled only for the pass-through/non-RAID LUNs (assuming the drive's write cache is enabled) and leave WCE disabled for the targets where the firmware handles cache destaging.
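
As a minimal sketch of the approach in the paragraph above, assuming the driver gets to adjust the caching mode page (page code 08h) before it is returned to the host: WCE is bit 2 of byte 2 of that page. The helper names here are hypothetical, and this is plain C for illustration, not megaraid code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHING_MODE_PAGE	0x08
#define WCE_BIT			0x04	/* byte 2, bit 2 of the caching mode page */

/* WCE is reported only for pass-through LUNs whose drive cache is on. */
static bool advertised_wce(bool is_passthrough, bool drive_wce)
{
	return is_passthrough && drive_wce;
}

/* Set or clear WCE in a caching mode page buffer.
 * Returns 0 on success, -1 if the buffer is not a caching mode page. */
static int set_advertised_wce(uint8_t *page, size_t len,
			      bool is_passthrough, bool drive_wce)
{
	/* The low six bits of byte 0 hold the page code. */
	if (page == NULL || len < 3 || (page[0] & 0x3F) != CACHING_MODE_PAGE)
		return -1;

	if (advertised_wce(is_passthrough, drive_wce))
		page[2] |= WCE_BIT;
	else
		page[2] &= (uint8_t)~WCE_BIT;
	return 0;
}
```

Targets whose cache destaging is handled by the firmware would then report WCE clear, so the upper layers would not issue flushes to them in the first place.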
