This message is from the T13 list server.


Harlan,

Not "could" completely destroy data; it will.
The simple proof is a journalling FS where FUA is issued on the down block of
the journal.  This goes beyond a single local host owning a single device.
Apply this to a global filesystem where the only communication is via the
FS and the commit/down blocks.  This gets much messier as SATA moves into
SAS-like deployments.
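
To make the ordering hazard concrete, here is a minimal Python sketch of a
toy drive with a volatile write cache.  Everything in it (ToyDrive,
JOURNAL_LBAS, COMMIT_LBA, the two commit functions) is a hypothetical
illustration, not a model of any real drive or of the ATA command set.  The
assumption is that cached writes are acknowledged but lost on a power
failure, while a FUA write goes straight to media past whatever is still
cached.

```python
JOURNAL_LBAS = [100, 101]      # hypothetical location of the journal records
COMMIT_LBA = 102               # hypothetical location of the commit/down block
RECORDS = ["rec0", "rec1"]


class ToyDrive:
    """Toy model: a volatile write cache in front of persistent media."""

    def __init__(self):
        self.media = {}        # lba -> data that survives power loss
        self.cache = []        # (lba, data) acknowledged but not yet on media

    def write(self, lba, data):
        # Ordinary cached write: acknowledged before it reaches media.
        self.cache.append((lba, data))

    def write_fua(self, lba, data):
        # FUA write: committed to media at once, bypassing the cache.
        self.media[lba] = data

    def flush(self):
        # Flush cache: everything pending goes to media, in arrival order.
        for lba, data in self.cache:
            self.media[lba] = data
        self.cache.clear()

    def power_loss(self):
        # Anything still in the volatile cache is gone.
        self.cache.clear()


def commit_with_fua_only(drive):
    """Journal records are cached; only the commit block uses FUA."""
    for lba, rec in zip(JOURNAL_LBAS, RECORDS):
        drive.write(lba, rec)
    drive.write_fua(COMMIT_LBA, "commit")


def commit_with_flush_first(drive):
    """Journal records are flushed to media before the commit block."""
    for lba, rec in zip(JOURNAL_LBAS, RECORDS):
        drive.write(lba, rec)
    drive.flush()
    drive.write_fua(COMMIT_LBA, "commit")


def recover(drive):
    """Return the journal records if a commit block is on media, else None."""
    if drive.media.get(COMMIT_LBA) != "commit":
        return None
    return [drive.media.get(lba) for lba in JOURNAL_LBAS]
```

In this toy model, a power loss right after commit_with_fua_only() leaves a
valid-looking commit block pointing at journal records that never reached
the platter, so recovery replays nothing useful.  The flush-first variant
avoids that here, but only under the assumption that the drive really
honors the flush.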

Yes I am paranoid, and have reason and experience to be.

Historical winners:

Drives which misreported the state of iCRC as valid.
Drives which randomly puke cache content to platter on a power event.

These are the two freshest in my mind; I am sure Hale has seen more and
worse.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

On Thu, 26 Jun 2003, Harlan E. Andrews wrote:

> This message is from the T13 list server.
> 
> 
> Curtis,
> 
> When you say "state" information, I assume you are referring to 
> information which is vital to prevent damage to system structures due 
> to a power failure.  You are correct in that there is very little 
> warning of a power failure.   In fact, I think there is usually NO 
> guaranteed warning for a power failure.
> 
> The trick is to make the state of the disk consistent at all times.   
> This is usually done by guaranteeing that directory changes are ALWAYS 
> valid.    To do this, the new structure is constructed as an unlinked 
> copy of the old structure.  Then the changes are made to the copy.   
> When the copy is consistent, the LAST step is to link it in to the 
> actual directory in a SINGLE write of a single sector.   Not 
> coincidentally, some data bases use a similar technique.
> 
> Typically, you would WANT a flush cache before the final link write to 
> guarantee that all previous modifications were actually on the media 
> before linking the new structure.
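
A minimal sketch of the same build-a-copy, flush, then link-in-one-step
idea, assuming a file-level analogue rather than the sector-level directory
update described above.  The function name atomic_update is hypothetical.
os.fsync() stands in for the flush cache step and os.replace() for the
single link write; whether the data truly reaches the media still depends
on the OS and on the drive honoring the flush.

```python
import os
import tempfile


def atomic_update(path, data: bytes):
    """Replace `path` with `data` so readers see the old or the new content."""
    dirpath = os.path.dirname(os.path.abspath(path))

    # 1. Build the new structure as an unlinked copy (a temp file alongside).
    fd, tmp = tempfile.mkstemp(dir=dirpath)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # 2. Flush BEFORE the link step, so the copy is on stable
            #    storage (as far as the OS and drive can guarantee).
            os.fsync(f.fileno())
        # 3. Link it in with a single atomic rename.
        os.replace(tmp, path)
    except BaseException:
        # On any failure, remove the copy; the old file is untouched.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

    # 4. On POSIX systems, also sync the directory so the rename is durable.
    if os.name == "posix":
        dfd = os.open(dirpath, os.O_RDONLY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)
```

The ordering is the point: if step 3 could reach the media before step 2
had completed, a power failure could leave the name pointing at incomplete
data.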
> 
> Are you saying that you can completely clean up the directory 
> structures after receiving a power fail warning ?
> 
> The flush cache was designed EXACTLY for that use, but even before that 
> command existed the OS folks used other methods to ensure that the 
> drive's cache was completely written to the media prior to the link 
> write ( or prior to a planned power off ).
> 
> I think the FUA has serious potential for misuse.   If extra complexity 
> is introduced to the hard drives, it could ALSO cause the drives to 
> mishandle normal read/writes.
> 
> 5% savings on a BENCHMARK is a poor excuse to introduce potential data 
> corruption.   The current benchmarks have little relation to real 
> system performance.   A 5% savings in benchmark will NOT result in a 5% 
> improvement to system performance using real applications.
> 
> However, improperly re-ordering the queue could completely destroy data 
> integrity.
> 
> Regards,
> 
> Harlan
> 
> 
> 
> On Monday, Jun 23, 2003, at 19:38 US/Pacific, Curtis Stevens wrote:
> 
> > This message is from the T13 list server.
> >
> >
> > Gary
> >
> >     The wording you propose would defeat the purpose of the command.  I do
> > not know if you were there for the discussion.  The purpose of the FUA was
> > to cause critical data to be committed to the media, regardless of what is
> > in the queue.  If you wait for a flush you are delaying the higher priority
> > data.  Furthermore, if you know that the area you are writing is for this
> > type of critical data you can prevent the issue you are trying to solve in
> > the drive...  Queued FUA is not a function for everyday use, but if you use
> > it for the purpose intended it has great value.  For instance, if power has
> > been removed from the system and you only have a few ms to store the state,
> > you would rather not wait for the queue to execute or flush, nor would you
> > want to go through the very time-consuming process of aborting the queue.
> > You simply want the state information committed in the shortest possible
> > time.  I see this as one of the main uses for queued FUA commands.
> >
> > ---------------------------
> > Curtis E. Stevens
> > 29 Dewey
> > Irvine, Ca 92620
> >
> > Home: (949) 552-4777
> > E-Mail: [EMAIL PROTECTED]
> >
> > The face of a child can say it all, especially the mouth part of the 
> > face...
> > ----- Original Message -----
> > From: "Gary Laatsch" <[EMAIL PROTECTED]>
> > To: "Curtis Stevens" <[EMAIL PROTECTED]>; "T13 List Server"
> > <[EMAIL PROTECTED]>
> > Sent: Monday, June 23, 2003 2:40 PM
> > Subject: Re: [t13] hmmm.. no comments?
> >
> >
> >> Curtis,
> >>
> >> Dropping HDD's is one of those very unsafe things to do   :-)
> >>
> >> My 2 cents on the whole deal is simple.  Mark Vallis brought up a very good
> >> point in his example of 2 read requests and 1 write request in the queue,
> >> then overlapped by a FUA write request.  The current ATAPI-7 spec
> >> indicates that the FUA command shall not be released.  Obviously if you do
> >> this, the read data returned is in question (return the old or the new
> >> data? the command was received before but processed after).  Also, the
> >> queued write would request data that should be older than the data just
> >> written with the FUA, since the command was actually received before.  I
> >> think this is really an exception case and should be handled as such.
> >>
> >> My opinion is to modify the wording of the queued FUA commands to add
> >> something like "if the queued FUA request overlaps a previously queued
> >> command, the queue shall be flushed....blah blah blah" or however we
> >> say it in T13 queue-ish.  The biggest issue is that in this case the FUA
> >> needs to release, and it slows down because of the cleanup needed to ensure
> >> data integrity.  But I know that you guys will figure it out this week up
> >> there in SJ.  Have fun.
> >>
> >> Gary Laatsch
> >> [EMAIL PROTECTED]
> >>
> >> ----- Original Message -----
> >> From: "Curtis Stevens" <[EMAIL PROTECTED]>
> >> To: "T13 List Server" <[EMAIL PROTECTED]>
> >> Sent: Monday, June 23, 2003 10:20 AM
> >> Subject: Re: [t13] hmmm.. no comments?
> >>
> >>
> >>> This message is from the T13 list server.
> >>>
> >>>
> >>> I really don't think the issue is one of implementation...  It looks to me
> >>> like there are some concerns about usage and the possibility of unexpected
> >>> outcomes.  MS clearly stated that they understood several of the unexpected
> >>> outcomes and still needed the capability.  There are many "unsafe" things
> >>> you can do to an HDD, but that has not prevented commands from being
> >>> implemented.
> >>>
> >>> ---------------------------
> >>> Curtis E. Stevens
> >>> 29 Dewey
> >>> Irvine, Ca 92620
> >>>
> >>> Home: (949) 552-4777
> >>> E-Mail: [EMAIL PROTECTED]
> >>>
> >>> The face of a child can say it all, especially the mouth part of the
> >> face...
> >>> ----- Original Message -----
> >>> From: "Eschmann, Michael K" <[EMAIL PROTECTED]>
> >>> To: "T13 List Server" <[EMAIL PROTECTED]>
> >>> Sent: Monday, June 23, 2003 8:23 AM
> >>> Subject: RE: [t13] hmmm.. no comments?
> >>>
> >>>
> >>>> This message is from the T13 list server.
> >>>>
> >>>>
> >>>> My (humble?) opinion is that FUA is necessary and is not dangerous as long
> >>>> as a disk drive properly deals with outstanding requests.
> >>>>
> >>>> First off, a flush is very slow, affecting system benchmark scores by as
> >>>> much as 5%.  The more interesting fact is that not all drives properly
> >>>> support flush: many HDDs will complete the Flush command without
> >>>> writing any cached data to media, just because of the desire to make one's
> >>>> disk synthetically faster than somebody else's.
> >>>>
> >>>> FUA allows the OS to flush critical data without adversely affecting
> >>>> performance.  The drive should be required to test all outstanding writes
> >>>> (in the queued case) and assure that the writes are ordered to guarantee no
> >>>> data loss.  Let's take a look at a specific scenario:
> >>>>
> >>>> - Queued write 256 sectors to LBA 10000
> >>>> - FUA write 1 sector to LBA 10001
> >>>>
> >>>> The 256-sector write must be written to media, or the 1 sector must
> >>>> over-write the same sector written by the 256-sector write in the device's
> >>>> cache.  The drive must also assure that the media results in the same data.
> >>>> I'm sure we could expand this simple case to something much more complex,
> >>>> but the basic idea remains:  The drive must handle ordering such that there
> >>>> is no data loss.  I've asked once before, and I'll ask it again: someone
> >>>> offer up a more complex scenario where you believe FUA will break and we can
> >>>> then have a real conversation about the (de)merits of FUA.
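
A minimal sketch of one way a drive could handle the overlap scenario
above, assuming the first of the two options (commit the overlapping queued
write to media before the FUA write).  ToyQueue, Write, and overlaps are
hypothetical names for illustration; nothing here is taken from the ATA
spec or from any real firmware.  To keep arrival order intact, the sketch
commits every pending write up to and including the last one that overlaps
the FUA range, then commits the FUA write itself.

```python
from dataclasses import dataclass


@dataclass
class Write:
    lba: int          # first sector of the write
    data: list        # one entry per sector

    @property
    def end(self):
        # One past the last sector touched by this write.
        return self.lba + len(self.data)


def overlaps(a, b):
    """True if the two writes touch at least one common sector."""
    return a.lba < b.end and b.lba < a.end


class ToyQueue:
    def __init__(self):
        self.pending = []     # queued writes, in arrival order
        self.media = {}       # lba -> sector data actually on media

    def queue_write(self, w):
        self.pending.append(w)

    def _commit(self, w):
        for i, sector in enumerate(w.data):
            self.media[w.lba + i] = sector

    def fua_write(self, w):
        # Index of the last pending write that overlaps the FUA range.
        last = max(
            (i for i, q in enumerate(self.pending) if overlaps(q, w)),
            default=-1,
        )
        # Commit the pending prefix in arrival order so no older data can
        # later land on top of newer data.
        for q in self.pending[: last + 1]:
            self._commit(q)
        del self.pending[: last + 1]
        # The FUA data goes to media last, so it wins on overlapping sectors.
        self._commit(w)
```

With the scenario above (a queued 256-sector write at LBA 10000, then a
1-sector FUA write at LBA 10001), sector 10001 ends up holding the FUA data
and the rest of the range holds the queued data.  Queued writes that
arrived after the last overlapping one stay pending, so the FUA write pays
only for the cleanup it actually needs.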
> >>>>
> >>>> MKE.
> >>>>
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Harlan Andrews [mailto:[EMAIL PROTECTED]
> >>>> Sent: Friday, June 20, 2003 4:47 PM
> >>>> To: Andre Hedrick; Steve Livaccari
> >>>> Cc: Curtis Stevens; T13 List Server; [EMAIL PROTECTED]; Larry 
> >>>> Barras
> >>>> Subject: Re: [t13] hmmm.. no comments?
> >>>>
> >>>>
> >>>> This message is from the T13 list server.
> >>>>
> >>>>
> >>>> I have not been present at any of the discussions, but Out-Of-Order
> >>>> writes are inherently dangerous to ANY file system - not only to
> >>>> journaling.  Now that we have Flush Cache as a mandatory command, why
> >>>> don't we simply issue the Flush Cache to force unit access?
> >>>>
> >>>> I have not heard any real benefit for such a dangerous operation.  Why
> >>>> would anyone even consider it ?
> >>>>
> >>>> ...Harlan
> >>>>
> >>>> on 6/19/03 10:49 PM, [EMAIL PROTECTED] wrote:
> >>>>
> >>>>> This message is from the T13 list server.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Steve,
> >>>>>
> >>>>> This totally nukes and destroys write ordered operations.
> >>>>> Example is the down/commit block on a journalled operation.
> >>>>>
> >>>>> Taking an FUA command to platter and blasting past the queue cache will
> >>>>> destroy every bit of the security designed into any journaling file
> >>>>> system.
> >>>>>
> >>>>> I still do not get why MicroSoft thinks their journaling NTFS of the
> >>>>> meta data in OS buffer cache will not take a hit.  If I knew the OS my
> >>>>> data was dependent on did such a "FOOLISH" operation I would find another.
> >>>>>
> >>>>> If T13 continues to move towards making it possible for the HOST to do bad
> >>>>> things, then the DEVICE is even worse.
> >>>>>
> >>>>> We can all pack our bags and go home and switch to T10, because nobody
> >>>>> will trust a device coming out of T13 again.
> >>>>>
> >>>>> Comments?
> >>>>>
> >>>>> Tomato Shield UP!!
> >>>>>
> >>>>> Andre Hedrick
> >>>>> LAD Storage Consulting Group
> >>>>>
> >>>>> On Wed, 18 Jun 2003, Steve Livaccari wrote:
> >>>>>
> >>>>>> This message is from the T13 list server.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> All modern HDDs have a buffer for write cache that is used to stack up
> >>>>>> write data from both queued and unqueued write commands.  A write command
> >>>>>> followed by a flush cache command will likely not move the data from the
> >>>>>> last write command to the media until the rest of the data in the write
> >>>>>> cache is written.  If a write FUA command is used, the data from the write
> >>>>>> FUA command will be given priority over the other data in the write cache
> >>>>>> and be written first.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Regards,
> >>>>>> Steve Livaccari
> >>>>>>
> >>>>>> Hard Drive Engineering
> >>>>>> IBM Global Procurement
> >>>>>> Internet:  [EMAIL PROTECTED]
> >>>>>> Phone (919) 543.7393
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> From:     "Curtis Stevens" <[EMAIL PROTECTED]>
> >>>>>> Sent by:  [EMAIL PROTECTED]
> >>>>>> To:       "T13 List Server" <[EMAIL PROTECTED]>
> >>>>>> cc:
> >>>>>> Date:     06/17/2003 11:09 PM
> >>>>>> Subject:  Re: [t13] hmmm.. no comments?
> >>>>>>
> >>>>>> This message is from the T13 list server.
> >>>>>>
> >>>>>>
> >>>>>> Gary
> >>>>>>
> >>>>>>     As I recall, there were some inaccuracies in the proposals as made to
> >>>>>> the committee.  There were many revisions.  The only new FUA commands that
> >>>>>> make sense are the queued ones.  All others could be followed by flush
> >>>>>> cache.
> >>>>>>
> >>>>>> ---------------------------
> >>>>>> Curtis E. Stevens
> >>>>>> 29 Dewey
> >>>>>> Irvine, Ca 92620
> >>>>>>
> >>>>>> Home: (949) 552-4777
> >>>>>> E-Mail: [EMAIL PROTECTED]
> >>>>>>
> >>>>>> The face of a child can say it all, especially the mouth part of
> > the
> >>>>>> face...
> >>>>>> ----- Original Message -----
> >>>>>> From: "Gary Laatsch" <[EMAIL PROTECTED]>
> >>>>>> To: "Curtis Stevens" <[EMAIL PROTECTED]>; "T13 List Server"
> >>>>>> <[EMAIL PROTECTED]>
> >>>>>> Sent: Tuesday, June 17, 2003 4:40 PM
> >>>>>> Subject: Re: [t13] hmmm.. no comments?
> >>>>>>
> >>>>>>
> >>>>>>> Curtis and Hale,
> >>>>>>>
> >>>>>>>     Also, to expand upon this.  I think Hale's point is the proposal put
> >>>>>>> forth by Nita didn't contain the QUEUE FUA or QUEUE FUA EXT commands and
> >>>>>>> he was wondering where they were added or how they were proposed.  My
> >>>>>>> memory was this was discussed and added at the June 2002 meetings.  That
> >>>>>>> is why I was wondering if anyone else remembered these discussions.  I
> >>>>>>> remember discussing all of this stuff (even Andre's comments about the
> >>>>>>> FUA blowing away the queue) but for some reason it just wasn't captured
> >>>>>>> very well in the minutes.
> >>>>>>>
> >>>>>>> Gary Laatsch
> >>>>>>> [EMAIL PROTECTED]
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>> From: "Curtis Stevens" <[EMAIL PROTECTED]>
> >>>>>>> To: "T13 List Server" <[EMAIL PROTECTED]>
> >>>>>>> Sent: Tuesday, June 17, 2003 3:15 PM
> >>>>>>> Subject: Re: [t13] hmmm.. no comments?
> >>>>>>>
> >>>>>>>
> >>>>>>>> This message is from the T13 list server.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Hale
> >>>>>>>>
> >>>>>>>>     I was there during the discussions and there was no secret
> >>>>>>>> committee.  Basically, MS stated that they wanted to force meta data to
> >>>>>>>> the drive without blowing the queue.  This means that although it is
> >>>>>>>> possible to lose data, in their application data loss would not occur...
> >>>>>>>>
> >>>>>>>> ---------------------------
> >>>>>>>> Curtis E. Stevens
> >>>>>>>> 29 Dewey
> >>>>>>>> Irvine, Ca 92620
> >>>>>>>>
> >>>>>>>> Home: (949) 552-4777
> >>>>>>>> E-Mail: [EMAIL PROTECTED]
> >>>>>>>>
> >>>>>>>> The face of a child can say it all, especially the mouth part
> > of
> >>> the
> >>>>>>> face...
> >>>>>>>> ----- Original Message -----
> >>>>>>>> From: "Hale Landis" <[EMAIL PROTECTED]>
> >>>>>>>> To: "T13 List Server" <[EMAIL PROTECTED]>
> >>>>>>>> Sent: Tuesday, June 17, 2003 10:57 AM
> >>>>>>>> Subject: [t13] hmmm.. no comments?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> This message is from the T13 list server.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I'm curious why there are no comments about the question of the
> >>>>>>>>> origin of the WRITE DMA QUEUED FUA command (where is the proposal?).
> >>>>>>>>> And why no comments on QUEUED EXT commands with large sector counts.
> >>>>>>>>>
> >>>>>>>>> Is this because all these discussions must take place via the "secret
> >>>>>>>>> society"?
> >>>>>>>>>
> >>>>>>>>> Hale
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> *** Hale Landis *** www.ata-atapi.com ***
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >
> 
