Re: Unable to shutdown

2011-09-02 Thread Kevin Oberman
On Wed, Aug 31, 2011 at 8:23 PM, Kevin Oberman kob6...@gmail.com wrote:
 On Thu, Sep 1, 2011 at 2:01 AM,  per...@pluto.rain.com wrote:
 Jeremy Chadwick free...@jdc.parodius.com wrote:
 On Tue, Aug 30, 2011 at 11:04:43PM -0700, Kevin Oberman wrote:
  ... the standard does not specify EXACTLY what triggers a
  transition from standby to ready (PM2 to PM0). Only that it is
  something that requires media access. A write does not
  necessarily require media access if you define media as the
  disk platter.

 You're correct -- media access could mean, literally, accessing
 the platter OR it could mean LBA read/write I/O.  Then comes
 into question whether or not the drive returning something from
 its on-board cache would count as media access or not.

 T13 should probably clarify on this point, and this is one I do
 not have an answer for myself.  I strongly believe media access
 means LBA read/write I/O and regardless if it's data that's in
 the on-board cache on the disk or not.  I wonder if this behaviour
 varies per drive model.

 Given a standard which is, shall we say, open to interpretation,
 I think the likelihood approaches 100% that it has been interpreted
 differently by different manufacturers -- or even by different
 firmware authors within a single manufacturer.  I would be amazed
 if the behaviour did _not_ vary among drive models.

 And, if you tell your firmware writers that they should look for any
 technique that reduces power consumption, I don't doubt that keeping the
 disk in standby until there was a reason to move data from write cache to
 disk would look good. I would hope that they would not make a cache flush
 lie, but that used to be common on old ATA drives.

OK. I tried the drive with a UFS file system. I plugged it in and Gnome
mounted it. I then ignored it for a while and the LED went from ON to
pulsing (bright to dim and back) at about 0.5 Hz. The drive was spun down.
I assume it was in STANDBY. (I can see no other state it could be in.)

I requested that the drive be unmounted. It behaved the same way as the
msdosfs system. It appeared to have unmounted, but the device entry was
still open and the drive was non-responsive. Interestingly, although an
msdosfs system was still mounted, the LED went from slow pulsing to OFF.
Attempts to unmount the msdosfs system failed, with the LED staying off and
the unmount not completing. There was still an open connection to the
device with the UFS partition (/dev/da0s3). The system was operating
normally. If the drive was in STANDBY when the LED was pulsing, what state
was it in when the LED was off?

I then tried to ls a directory on the msdosfs system. The LED came ON and,
after several seconds, I got the listing of the directory followed by a
message that the umount of the msdosfs system had failed. When I checked,
there were no open connections to the UFS partition. It was fully
unmounted. I could also unmount the msdosfs system.

So the problem is not unique to msdosfs. I still think the hardware is
doing something weird, especially with the LED going off when I attempted
to unmount the file system.

I may try doing a run with usbdebug and see if that gives any more clues,
but I may not find anything that I understand.
-- 
R. Kevin Oberman, Network Engineer - Retired
E-mail: kob6...@gmail.com
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Unable to shutdown

2011-08-31 Thread Kevin Oberman
Jeremy,

I think we are simply not communicating. You are arguing points with
which I agree.

Comments inline:
On Tue, Aug 30, 2011 at 4:43 PM, Jeremy Chadwick
free...@jdc.parodius.com wrote:
 On Tue, Aug 30, 2011 at 04:10:13PM -0700, Kevin Oberman wrote:
 On Tue, Aug 30, 2011 at 2:48 PM, Jeremy Chadwick
 free...@jdc.parodius.com wrote:
  On Tue, Aug 30, 2011 at 01:29:02PM -0400, David Magda wrote:
  On Tue, August 30, 2011 11:50, Kevin Oberman wrote:
  [...]
   The more I look at this, the more it seems to me that it is an issue
   with the Seagate drive and not a FreeBSD issue. Probably a bug that is
   never triggered on Windows, so is largely unnoticed. I suspect Windows
   probably orders the commands in a subtly different order.
  [...]
 
  Or not the drive per se, but the USB-to-IDE/SATA chipset.
 
  A while back on the OpenSolaris zfs-discuss list there was an issue where
  USB drives would have corrupt ZFS pools if a drive was yanked without a
  'zpool export' being run. Even though ZFS is supposed to always be
  consistent on-disk (because it's transactional), this wasn't happening.
 
  It turned out that the chipset had a list of particular SATA commands that it
  allowed through to the drive, and all others were simply answered with
  OK, regardless of what actual actions needed to be taken. One of the
  SATA commands that was NOT whitelisted was the 'cache flush'
  command--which ZFS needs to make sure that its data structures are
  written in the proper order.
 
  Turns out the drive and its firmware were fine and doing things properly,
  it's just that the necessary commands weren't getting to it because of the
  USB adaptor's chipset.
 
  I don't think that advice is applicable in this situation.  Here's why:
 
  Kevin's original description indicates that when the drive (or enclosure
  translation ASIC for that matter) is in standby, when the system is shut
  down, the drive/ASIC never spins back up on I/O (flushing all I/O
  buffers to disk).
 
  If he issues ls commands or similar userland-induced I/O to the drive
  prior to shutting the system down, the drive/ASIC spins up normally.
 
  Here's Kevin's original quote:
 
  The drive is green and spins down when idle.  If an attempt is made
  to shut down the system while the drive is spun down, the system goes
  through the usual shutdown including flushing all buffers out to disk,
  but at the final disk access to mark the file systems as clean, the
  drive never spins up and the system hangs until it is powered down.
  I've found no way to avoid this other than to remember to access the
  disk and cause it to spin up before shutting down.
 
  If I attempt to unmount the file systems when the drive is shut down,
  the same thing happens, but I can recover as the second file system
  is still mounted and an ls(1) to that file system will cause the disk
  to spin up and everything is fine.
 
  So the question is what's unique about flushing all I/O buffers to
  disk during shutdown compared to issuing standard I/O in userland.  I
  can speculate all day as to what the cause is, but it's highly unlikely
  that the USB-to-SATA controller ASIC is causing the problem.

 You are perhaps assuming a bit too much. Since I know that a disk read
 or write WILL spin up the drive, I can only assume that the msdosfs is
 not finding anything to flush, so is not writing. I see the full
 flushing all buffers countdown and it always runs successfully to
 zero. This, without the drive spinning up. This begs at least the
 question of whether the drive is receiving any writes or whether the
 writes are simply being cached by the drive to save energy. I
 suspect that the drive only spins up when enough of its write cache is
 filled.

 If there's nothing to flush, then why is the kernel indefinitely
 looping (finally giving up, and it usually prints something when it
 encounters that condition) when trying to flush buffers when the drive
 is spun down?  What exactly is it trying to flush if there's nothing to
 flush?

I think you may be focusing on things you believe I meant when I didn't mean
or say them. I don't have any reason to believe that a cache flush is or is
not the command that is hanging. I have absolutely no doubt that a flush is
requested by the OS during the unmount process.  I'm just not sure what other
commands might be issued. And, of course, they are CAM operations that the box
is probably converting to SATA, but I can't even say this for sure as the
Seagate drive in question is a SATA drive in the box. I can only say that the
drive is not a standard 9mm laptop drive. It is longer, thicker and heavier
than a laptop drive. It is the same width as a normal 2.5 in. drive.

As to the issue of nothing to flush, that was my fault as I was entering
text in a stream of consciousness and I realized that, if there was only a
little data being written, it might not spin up the drive (i.e. take it out
of standby) until more data is written or a 

Re: Unable to shutdown

2011-08-31 Thread Jeremy Chadwick
On Tue, Aug 30, 2011 at 11:04:43PM -0700, Kevin Oberman wrote:
  On Tue, Aug 30, 2011 at 2:48 PM, Jeremy Chadwick
  free...@jdc.parodius.com wrote:
  instead use UFS2 and see if the problem disappears?  This is in no way a
  permanent solution.  If this workaround fixes the problem, then I'm
  inclined to believe msdosfs is to blame.  There has been a lot of
  discussion of this driver in the kernel as of late, and the general
  opinion of it is that it's crummy.
 
 Actually, for me it is, as I will shortly be re-partitioning this into
 a GPT disk without any msdosfs partitions. I will give it a try with a
 UFS partition tomorrow and see what happens.
 
 When you say that it is crummy, are you referring to the USB driver,
 the AHCI driver, or the msdosfs support? I have long been concerned
 about the latter due to occasional unstable behavior that is fixed by
 booting Windows. fsck_msdosfs seems to do some questionable things, too.

I was referring to msdosfs support in the FreeBSD kernel.  I'm still not
so sure about the USB stack (some things seem to be better now as a
result of the re-write that happened during the 7.x - 8.x days, but
other things may still be awry); I don't tend to use any USB devices on
FreeBSD.  As for AHCI, I have no complaints at all, although AHCI
shouldn't be involved when it comes to a USB-connected SATA hard disk.

  And here's another thought: what if the issue is limited, somehow, to
  just writes?  Meaning, could the kernel issue a false read to the
  device (for some random LBA, even LBA 0 for all I care) and then proceed
  with its write/flushing?  I wonder if that would cause the drive to spin
  up first.  That would be a quirk in my opinion.
 
 Interesting idea, but I really doubt that it's an issue with the write
 other than that the drive may not leave standby unless the cache is full
 enough that it flushes.

I'm not sure what you mean by the last part of the sentence, but the
former is something I'm in agreement with.  I doubt adding a fake read
prior to issuing writes and flushes during shutdown would make any
difference.  I'm just surprised the writes being made are not causing
the drive to spin up.

  There's also the possibility the USB stack on FreeBSD is doing something
  really stupid... man, I don't even want to go down that road.  Hans
  should be able to help determine if that's the case, but not using
  msdosfs as a test would be a good start.
 
 Yes. I make no claim to understand the USB layer at all, but I do
 understand that it is very tricky. Lots of evidence of that in how
 broken early Microsoft USB stacks were.

FreeBSD has gone through at least two major versions of a USB stack.
The stack in the 4.x days did not impress me -- I tried working on
Logitech USB camera support, but could not get alternative indexes to
work -- ugen(4) returned bizarre error conditions for things that
absolutely should have worked.  I did contact the stack maintainer, but
I would rather not go into the discussion that ensued as a result.

Said USB stack improved slightly from 4.x to 7.x.  An entire re-write
was performed (what was then called USB2, not to be confused with the
USB 2.0 protocol) which is what's in use (in RELENG_8) today.  There
have been at least 3 different maintainers of the FreeBSD USB stack, and
all at different times / completely segregated.

I don't want my comments to make anyone think the problem described here
is in the FreeBSD USB stack.  I'm just stating some history for those
wondering about it, especially given the comments about Microsoft's
early USB stacks (particularly during the original Windows 95 days and
some other issues during the Win98 era).  My opinion/experiences are my
own.

The problem is that I don't know how to rule the USB stack out when it
comes to diagnosing the problem you're having.  There is the USB_DEBUG
option in one's kernel config which may or may not provide some
insights, but I imagine it's quite chatty and would justify the need for
serial or firewire console given the amount of console output.

  So I'm pretty sure the kernel is iterating over whatever cache buffers
  there are for I/O (I don't know what this is called technically) and
  issuing WRITE DMA or -EXT and either waiting for a non-error response
  from the device or issuing it blindly followed by a FLUSH CACHE or -EXT
  (either once per write or at the very end).
 
 Again, I really believe that the kernel fully believes that all writes
 are complete, at least to the disk cache. At that point the FS structures
 can be removed and the FS is no longer mounted as seen from the
 perspective of the system; this MUST be done before the disk cache is
 flushed and the FS is marked clean. I suspect, but don't know for sure,
 that the last two operations performed are to mark the drive clean and
 then do a cache flush. Of possible relevance is that none of the file
 systems is marked clean during a hung shutdown. All need to be FSCKed
 although
 

Re: Unable to shutdown

2011-08-31 Thread perryh
Jeremy Chadwick free...@jdc.parodius.com wrote:
 On Tue, Aug 30, 2011 at 11:04:43PM -0700, Kevin Oberman wrote:
  ... the standard does not specify EXACTLY what triggers a
  transition from standby to ready (PM2 to PM0). Only that it is
  something that requires media access. A write does not
  necessarily require media access if you define media as the
  disk platter.

 You're correct -- media access could mean, literally, accessing
 the platter OR it could mean LBA read/write I/O.  Then comes
 into question whether or not the drive returning something from
 its on-board cache would count as media access or not.

 T13 should probably clarify on this point, and this is one I do
 not have an answer for myself.  I strongly believe media access
 means LBA read/write I/O and regardless if it's data that's in
 the on-board cache on the disk or not.  I wonder if this behaviour
 varies per drive model.

Given a standard which is, shall we say, open to interpretation,
I think the likelihood approaches 100% that it has been interpreted
differently by different manufacturers -- or even by different
firmware authors within a single manufacturer.  I would be amazed
if the behaviour did _not_ vary among drive models.


Re: Unable to shutdown

2011-08-31 Thread Kevin Oberman
On Thu, Sep 1, 2011 at 2:01 AM,  per...@pluto.rain.com wrote:
 Jeremy Chadwick free...@jdc.parodius.com wrote:
 On Tue, Aug 30, 2011 at 11:04:43PM -0700, Kevin Oberman wrote:
  ... the standard does not specify EXACTLY what triggers a
  transition from standby to ready (PM2 to PM0). Only that it is
  something that requires media access. A write does not
  necessarily require media access if you define media as the
  disk platter.

 You're correct -- media access could mean, literally, accessing
 the platter OR it could mean LBA read/write I/O.  Then comes
 into question whether or not the drive returning something from
 its on-board cache would count as media access or not.

 T13 should probably clarify on this point, and this is one I do
 not have an answer for myself.  I strongly believe media access
 means LBA read/write I/O and regardless if it's data that's in
 the on-board cache on the disk or not.  I wonder if this behaviour
 varies per drive model.

 Given a standard which is, shall we say, open to interpretation,
 I think the likelihood approaches 100% that it has been interpreted
 differently by different manufacturers -- or even by different
 firmware authors within a single manufacturer.  I would be amazed
 if the behaviour did _not_ vary among drive models.

And, if you tell your firmware writers that they should look for any
technique that reduces power consumption, I don't doubt that keeping the
disk in standby until there was a reason to move data from write cache to
disk would look good. I would hope that they would not make a cache flush
lie, but that used to be common on old ATA drives.
-- 
R. Kevin Oberman, Network Engineer - Retired
E-mail: kob6...@gmail.com


Re: Unable to shutdown

2011-08-30 Thread Kevin Oberman
On Mon, Aug 29, 2011 at 1:06 PM, Eli Dart d...@es.net wrote:


 On 8/28/11 1:06 PM, Bengt Ahlgren wrote:

 Kevin Oberman kob6...@gmail.com writes:

 I've run into an odd problem with dismounting file systems on a
 Seagate Expansion portable USB drive. Running 8-stable on an amd64
 system and with two FAT32 (msdosfs) file systems on the drive.

 The drive is green and spins down when idle.  If an attempt is made to
 shut down the system while the drive is spun down, the system goes
 through the usual shutdown including flushing all buffers out to disk,
 but at the final disk access to mark the file systems as clean, the
 drive never spins up and the system hangs until it is powered down.
 I've found no way to avoid this other than to remember to access the
 disk and cause it to spin up before shutting down.

 If I attempt to unmount the file systems when the drive is shut down,
 the same thing happens, but I can recover as the second file system is
 still mounted and an ls(1) to that file system will cause the disk to
 spin up and everything is fine.

 This looks like a bug, but I don't see why the unmounting of an msdosfs
 system does not spin up the drive. It's clearly hanging on some
 operation that is not spinning up the drive, but does block.

 Any ideas what is going on? Possible fix?

 Not a solution to your problem, but a data point:

 I have a WD Passport 750GB (2.5") drive with a UFS filesystem on it.  I
 don't think I've tried shutdown with the drive mounted, but I've
 experienced no problems after the drive has spun down, including umount.
 There is just a delay while it spins up.  This is on 8.2-REL/i386, that
 is, with the new USB stack.

 In my experience, the issues don't show up at lower capacities.  I've seen
 problems with 2TB drives, but 1TB and 1.5TB drives seem to work fine.

 Kevin - how big is the disk in question?

Only 750G. It's just a little portable drive and not even a new one. It
was big back when I bought it, but not any more. I think it might be more
of an issue with the particular firmware on the drive. Some CAM operation
seems to never complete when the drive is spun down. Either:
1. The command cannot be completed until the drive is spun up, but a
firmware bug is not triggering a spin-up
or:
2. The command does not need the drive spun up, but a bug in the firmware
is not allowing the completion when the drive is not spinning.

The more I look at this, the more it seems to me that it is an issue with
the Seagate drive and not a FreeBSD issue. Probably a bug that is never
triggered on Windows, so is largely unnoticed. I suspect Windows probably
orders the commands in a subtly different order.

It is probably an issue that FreeBSD fails to ever time out when this
happens, though. That makes me suspect that the command in question is one
that should always return something immediately. I suppose it is also
possible that it is some oddity in the USB stack, too, but I still suspect
that the root issue is a firmware bug in the drive.
-- 
R. Kevin Oberman, Network Engineer - Retired
E-mail: kob6...@gmail.com


Re: Unable to shutdown

2011-08-30 Thread David Magda
On Tue, August 30, 2011 11:50, Kevin Oberman wrote:
[...]
 The more I look at this, the more it seems to me that it is an issue
 with the Seagate drive and not a FreeBSD issue. Probably a bug that is
 never triggered on Windows, so is largely unnoticed. I suspect Windows
 probably orders the commands in a subtly different order.
[...]

Or not the drive per se, but the USB-to-IDE/SATA chipset.

A while back on the OpenSolaris zfs-discuss list there was an issue where
USB drives would have corrupt ZFS pools if a drive was yanked without a
'zpool export' being run. Even though ZFS is supposed to always be
consistent on-disk (because it's transactional), this wasn't happening.

It turned out that the chipset had a list of particular SATA commands that it
allowed through to the drive, and all others were simply answered with
OK, regardless of what actual actions needed to be taken. One of the
SATA commands that was NOT whitelisted was the 'cache flush'
command--which ZFS needs to make sure that its data structures are
written in the proper order.

Turns out the drive and its firmware were fine and doing things properly,
it's just that the necessary commands weren't getting to it because of the
USB adaptor's chipset.
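
The filtering behaviour described above can be pictured with a small toy
model. This is only a sketch: the class names, the command strings and the
whitelist contents are all made up for illustration, and nothing here is
taken from any real bridge chipset's firmware.

```python
class Drive:
    """Records every command that actually reaches the disk."""

    def __init__(self):
        self.received = []

    def execute(self, command):
        self.received.append(command)
        return "OK"


class FilteringBridge:
    """Hypothetical USB-to-SATA bridge with a pass-through whitelist."""

    def __init__(self, drive, whitelist):
        self.drive = drive
        self.whitelist = frozenset(whitelist)

    def send(self, command):
        if command in self.whitelist:
            return self.drive.execute(command)
        # Not whitelisted: report success without forwarding anything.
        return "OK"


drive = Drive()
bridge = FilteringBridge(drive, whitelist={"READ", "WRITE"})
statuses = [bridge.send(c) for c in ("WRITE", "FLUSH CACHE")]
```

From the host's side both commands appear to succeed, yet only the WRITE
is in drive.received, which is why a file system relying on the flush
could not tell that anything was wrong.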




Re: Unable to shutdown

2011-08-30 Thread Jeremy Chadwick
On Tue, Aug 30, 2011 at 01:29:02PM -0400, David Magda wrote:
 On Tue, August 30, 2011 11:50, Kevin Oberman wrote:
 [...]
  The more I look at this, the more it seems to me that it is an issue
  with the Seagate drive and not a FreeBSD issue. Probably a bug that is
  never triggered on Windows, so is largely unnoticed. I suspect Windows
  probably orders the commands in a subtly different order.
 [...]
 
 Or not the drive per se, but the USB-to-IDE/SATA chipset.
 
 A while back on the OpenSolaris zfs-discuss list there was an issue where
 USB drives would have corrupt ZFS pools if a drive was yanked without a
 'zpool export' being run. Even though ZFS is supposed to always be
 consistent on-disk (because it's transactional), this wasn't happening.
 
 It turned out that the chipset had a list of particular SATA commands that it
 allowed through to the drive, and all others were simply answered with
 OK, regardless of what actual actions needed to be taken. One of the
 SATA commands that was NOT whitelisted was the 'cache flush'
 command--which ZFS needs to make sure that its data structures are
 written in the proper order.
 
 Turns out the drive and its firmware were fine and doing things properly,
 it's just that the necessary commands weren't getting to it because of the
 USB adaptor's chipset.

I don't think that advice is applicable in this situation.  Here's why:

Kevin's original description indicates that when the drive (or enclosure
translation ASIC for that matter) is in standby, when the system is shut
down, the drive/ASIC never spins back up on I/O (flushing all I/O
buffers to disk).

If he issues ls commands or similar userland-induced I/O to the drive
prior to shutting the system down, the drive/ASIC spins up normally.

Here's Kevin's original quote:

 The drive is green and spins down when idle.  If an attempt is made
 to shut down the system while the drive is spun down, the system goes
 through the usual shutdown including flushing all buffers out to disk,
 but at the final disk access to mark the file systems as clean, the
 drive never spins up and the system hangs until it is powered down.
 I've found no way to avoid this other than to remember to access the
 disk and cause it to spin up before shutting down.

 If I attempt to unmount the file systems when the drive is shut down,
 the same thing happens, but I can recover as the second file system
 is still mounted and an ls(1) to that file system will cause the disk
 to spin up and everything is fine.

So the question is what's unique about flushing all I/O buffers to
disk during shutdown compared to issuing standard I/O in userland.  I
can speculate all day as to what the cause is, but it's highly unlikely
that the USB-to-SATA controller ASIC is causing the problem.

Furthermore, Windows doesn't have special disk/enclosure drivers for
such drives, so there's nothing unique Windows would be sending across
the wire, ATA-protocol-wise, that would explain why Windows works and
FreeBSD doesn't.  At least that's my opinion.

With ATA/SATA, the FLUSH CACHE (0xe7) and -EXT (0xea) (for 48-bit LBAs)
commands are separate from WRITE DMA (0xca) and -EXT (0x35) (for 48-bit
LBAs).  Both FLUSH CACHE commands do not take LBAs in their input CDB.
To flush buffers to disk I imagine what the kernel should be doing is
issuing WRITE commands followed by FLUSH CACHE.  The WRITEs should be
waking the drive up.
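
To make the command sequence concrete, here is a rough sketch of "writes
followed by a flush" using the opcodes quoted above. The helper names and
the (lba, sector_count) buffer representation are assumptions made for
illustration; this is not how the FreeBSD kernel or CAM actually builds
its requests.

```python
# Opcodes as quoted in the paragraph above.
ATA_WRITE_DMA = 0xCA
ATA_WRITE_DMA_EXT = 0x35
ATA_FLUSH_CACHE = 0xE7
ATA_FLUSH_CACHE_EXT = 0xEA

LBA28_LIMIT = 1 << 28        # 28-bit addressing covers LBAs 0 .. 2**28 - 1
MAX_SECTORS_28BIT = 256      # largest transfer a 28-bit command can describe


def write_opcode(lba, sector_count):
    """Pick WRITE DMA or WRITE DMA EXT for one transfer (hypothetical helper)."""
    needs_ext = (lba + sector_count > LBA28_LIMIT
                 or sector_count > MAX_SECTORS_28BIT)
    return ATA_WRITE_DMA_EXT if needs_ext else ATA_WRITE_DMA


def flush_opcode(supports_lba48):
    """The flush commands carry no LBA, so only drive capability matters."""
    return ATA_FLUSH_CACHE_EXT if supports_lba48 else ATA_FLUSH_CACHE


def shutdown_sequence(dirty_buffers, supports_lba48):
    """Return the opcodes for all pending writes followed by one flush."""
    ops = [write_opcode(lba, count) for lba, count in dirty_buffers]
    ops.append(flush_opcode(supports_lba48))
    return ops


sequence = shutdown_sequence([(0, 8), (1 << 28, 8)], supports_lba48=True)
```

The point of the sketch is only that the flush is a separate command
issued after the writes, so a drive could in principle treat the two
differently while in standby.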

But wait, there's more.

I want to point out to people that sleep and standby are two very
different things (they're separate ATA commands too).  So if you're
using camcontrol sleep you probably should be using camcontrol
standby.  The man page is quite clear about the repercussions of the
former (and in the latter case I can imagine I/O to the drive failing or
simply timing out given that a bus reset is not performed during
shutdown TMK).

-- 
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: Unable to shutdown

2011-08-30 Thread Kevin Oberman
On Tue, Aug 30, 2011 at 2:48 PM, Jeremy Chadwick
free...@jdc.parodius.com wrote:
 On Tue, Aug 30, 2011 at 01:29:02PM -0400, David Magda wrote:
 On Tue, August 30, 2011 11:50, Kevin Oberman wrote:
 [...]
  The more I look at this, the more it seems to me that it is an issue
  with the Seagate drive and not a FreeBSD issue. Probably a bug that is
  never triggered on Windows, so is largely unnoticed. I suspect Windows
  probably orders the commands in a subtly different order.
 [...]

 Or not the drive per se, but the USB-to-IDE/SATA chipset.

 A while back on the OpenSolaris zfs-discuss list there was an issue where
 USB drives would have corrupt ZFS pools if a drive was yanked without a
 'zpool export' being run. Even though ZFS is supposed to always be
 consistent on-disk (because it's transactional), this wasn't happening.

 It turned out that the chipset had a list of particular SATA commands that it
 allowed through to the drive, and all others were simply answered with
 OK, regardless of what actual actions needed to be taken. One of the
 SATA commands that was NOT whitelisted was the 'cache flush'
 command--which ZFS needs to make sure that its data structures are
 written in the proper order.

 Turns out the drive and its firmware were fine and doing things properly,
 it's just that the necessary commands weren't getting to it because of the
 USB adaptor's chipset.

 I don't think that advice is applicable in this situation.  Here's why:

 Kevin's original description indicates that when the drive (or enclosure
 translation ASIC for that matter) is in standby, when the system is shut
 down, the drive/ASIC never spins back up on I/O (flushing all I/O
 buffers to disk).

 If he issues ls commands or similar userland-induced I/O to the drive
 prior to shutting the system down, the drive/ASIC spins up normally.

 Here's Kevin's original quote:

 The drive is green and spins down when idle.  If an attempt is made
 to shut down the system while the drive is spun down, the system goes
 through the usual shutdown including flushing all buffers out to disk,
 but at the final disk access to mark the file systems as clean, the
 drive never spins up and the system hangs until it is powered down.
 I've found no way to avoid this other than to remember to access the
 disk and cause it to spin up before shutting down.

 If I attempt to unmount the file systems when the drive is shut down,
 the same thing happens, but I can recover as the second file system
 is still mounted and an ls(1) to that file system will cause the disk
 to spin up and everything is fine.

 So the question is what's unique about flushing all I/O buffers to
 disk during shutdown compared to issuing standard I/O in userland.  I
 can speculate all day as to what the cause is, but it's highly unlikely
 that the USB-to-SATA controller ASIC is causing the problem.

You are perhaps assuming a bit too much. Since I know that a disk read or
write WILL spin up the drive, I can only assume that the msdosfs is not
finding anything to flush, so is not writing. I see the full flushing all
buffers countdown and it always runs successfully to zero. This, without
the drive spinning up. This begs at least the question of whether the
drive is receiving any writes or whether the writes are simply being cached
by the drive to save energy. I suspect that the drive only spins up when
enough of its write cache is filled.

In that case, the flush cache might actually be what is issued, but I
can't claim any certainty about that. I'm not willing to completely clear
the USB-SATA chip as the culprit.
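
The suspicion above can be put as a toy model. To be clear, this is a
sketch of the hypothesis only: the class, the threshold value and the rule
that every read forces a spin-up are assumptions for illustration, not
observed or documented behaviour of the Seagate firmware.

```python
class GreenDrive:
    """Toy model: small writes are absorbed by the cache while in standby."""

    def __init__(self, spinup_threshold):
        # Number of cached sectors that forces a spin-up (made-up parameter).
        self.spinup_threshold = spinup_threshold
        self.cached = 0
        self.spinning = False

    def write(self, sectors):
        self.cached += sectors
        if self.cached >= self.spinup_threshold:
            self._spin_up_and_drain()
        return "OK"  # the host sees success whether or not the platter moved

    def read(self, sectors):
        # Simplification: every read is assumed to need the platter.
        self._spin_up_and_drain()
        return "OK"

    def _spin_up_and_drain(self):
        self.spinning = True
        self.cached = 0


d = GreenDrive(spinup_threshold=100)
d.write(8)                      # a few dirty sectors written at unmount
state_after_write = (d.spinning, d.cached)
d.read(1)                       # e.g. an ls(1) on the other file system
state_after_read = (d.spinning, d.cached)
```

Under these assumptions the small write completes without a spin-up and
the later read wakes the drive, which matches the symptoms but does not by
itself explain why some later command never completes.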

 Furthermore, Windows doesn't have special disk/enclosure drivers for
 such drives, so there's nothing unique Windows would be sending across
 the wire, ATA-protocol-wise, that would explain why Windows works and
 FreeBSD doesn't.  At least that's my opinion.

This is not always quite true, but it is true for the general case. (I
know some WD enclosures do install a custom driver.)

 With ATA/SATA, the FLUSH CACHE (0xe7) and -EXT (0xea) (for 48-bit LBAs)
 commands are separate from WRITE DMA (0xca) and -EXT (0x35) (for 48-bit
 LBAs).  Both FLUSH CACHE commands do not take LBAs in their input CDB.
 To flush buffers to disk I imagine what the kernel should be doing is
 issuing WRITE commands followed by FLUSH CACHE.  The WRITEs should be
 waking the drive up.

Should they? As I pointed out above, that is not necessarily the case.

 But wait, there's more.

 I want to point out to people that sleep and standby are two very
 different things (they're separate ATA commands too).  So if you're
 using camcontrol sleep you probably should be using camcontrol
 standby.  The man page is quite clear about the repercussions of the
 former (and in the latter case I can imagine I/O to the drive failing or
 simply timing out given that a bus reset is not performed during
 shutdown TMK).

This is a very interesting point. Note that when this happens, whether at
shutdown or when unmounting 

Re: Unable to shutdown

2011-08-30 Thread Jeremy Chadwick
On Tue, Aug 30, 2011 at 04:10:13PM -0700, Kevin Oberman wrote:
 On Tue, Aug 30, 2011 at 2:48 PM, Jeremy Chadwick
 free...@jdc.parodius.com wrote:
  On Tue, Aug 30, 2011 at 01:29:02PM -0400, David Magda wrote:
  On Tue, August 30, 2011 11:50, Kevin Oberman wrote:
  [...]
   The more I look at this, the more it seems to me that it is an issue
   with the Seagate drive and not a FreeBSD issue. Probably a bug that is
   never triggered on Windows, so is largely unnoticed. I suspect Windows
   probably orders the commands in a subtly different order.
  [...]
 
  Or not the drive per se, but the USB-to-IDE/SATA chipset.
 
  A while back on the OpenSolaris zfs-discuss list there was an issue where
  USB drives would have corrupt ZFS pools if a drive was yanked without a
  'zpool export' being run. Even though ZFS is supposed to always be
  consistent on-disk (because it's transactional), this wasn't happening.
 
  It turned out that the chipset had a list of particular SATA commands that it
  allowed through to the drive, and all others were simply answered with
  OK, regardless of what actual actions needed to be taken. One of the
  SATA commands that was NOT whitelisted was the 'cache flush'
  command--which ZFS needs to make sure that its data structures were
  written in the proper order.
 
  Turns out the drive and its firmware were fine and doing things properly,
  it's just that the necessary commands weren't getting to it because of the
  USB adaptor's chipset.
 
  I don't think that advice is applicable in this situation.  Here's why:
 
  Kevin's original description indicates that when the drive (or enclosure
  translation ASIC for that matter) is in standby, when the system is shut
  down, the drive/ASIC never spins back up on I/O (flushing all I/O
  buffers to disk).
 
  If he issues ls commands or similar userland-induced I/O to the drive
  prior to shutting the system down, the drive/ASIC spins up normally.
 
  Here's Kevin's original quote:
 
  The drive is green and spins down when idle.  If an attempt is made
  to shut down the system while the drive is spun down, the system goes
  through the usual shutdown, including flushing all buffers out to disk,
  but when the final disk access is made to mark the file systems as
  clean, the drive never spins up and the system hangs until it is
  powered down. I've found no way to avoid this other than to remember
  to access the disk and cause it to spin up before shutting down.
 
  If I attempt to unmount the file systems when the drive is shut down,
  the same thing happens, but I can recover as the second file system
  is still mounted and an ls(1) to that file system will cause the disk
  to spin up and everything is fine.
 
  So the question is what's unique about flushing all I/O buffers to
  disk during shutdown compared to issuing standard I/O in userland.  I
  can speculate all day as to what the cause is, but it's highly unlikely
  that the USB-to-SATA controller ASIC is causing the problem.
 
 You are perhaps assuming a bit too much. Since I know that a disk read
 or write WILL spin up the drive, I can only assume that the msdosfs is
 not finding anything to flush, so is not writing. I see the full
 flushing all buffers countdown and it always runs successfully to
 zero. This, without the drive spinning up. This begs at least the
 question of whether the drive is receiving any writes or whether the
 writes are simply being cached by the drive to save energy. I
 suspect that the drive only spins up when enough of its write cache is
 filled.

If there's nothing to flush, then why is the kernel indefinitely
looping (finally giving up, and it usually prints something when it
encounters that condition) when trying to flush buffers when the drive
is spun down?  What exactly is it trying to flush if there's nothing to
flush?

Let me ask you this: can you stop using msdosfs on said USB device and
instead use UFS2 and see if the problem disappears?  This is in no way a
permanent solution.  If this workaround fixes the problem, then I'm
inclined to believe msdosfs is to blame.  There has been a lot of
discussion of this driver in the kernel as of late, and the general
opinion of it is that it's crummy.

And here's another thought: what if the issue is limited, somehow, to
just writes?  Meaning, could the kernel issue a false read to the
device (for some random LBA, even LBA 0 for all I care) and then proceed
with its write/flushing?  I wonder if that would cause the drive to spin
up first.  That would be a quirk in my opinion.
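
The read-before-flush idea can at least be tried from userland before anyone touches the kernel. The sketch below is an approximation under stated assumptions: it reads one sector from the raw device and then calls fsync on a file that lives on that device. The names read_sector and read_then_flush are hypothetical, the 512-byte sector size is assumed, and a userland read followed by fsync is not the same thing as the kernel's own buffer flush at shutdown.

```python
import os

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def read_sector(device_path, lba=0, sector_size=SECTOR_SIZE):
    """Read one sector at the given LBA from device_path and return it."""
    fd = os.open(device_path, os.O_RDONLY)
    try:
        return os.pread(fd, sector_size, lba * sector_size)
    finally:
        os.close(fd)


def read_then_flush(device_path, file_on_device):
    """Issue a throwaway read first, then ask for the file to be flushed.

    Returns the number of bytes the throwaway read produced.
    """
    data = read_sector(device_path)      # the "false read" (LBA 0)
    fd = os.open(file_on_device, os.O_WRONLY)
    try:
        os.fsync(fd)                     # then the write/flush path
    finally:
        os.close(fd)
    return len(data)
```

If the drive spins up on the read and the flush then completes, that would point toward the write/flush path alone failing to wake the drive.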

There's also the possibility the USB stack on FreeBSD is doing something
really stupid... man, I don't even want to go down that road.  Hans
should be able to help determine if that's the case, but not using
msdosfs as a test would be a good start.

 In that case, the flush cache might actually be what is issued, but
 I can't claim any certainty about that. I'm not willing to completely
 clear the USB-SATA chip as the culprit.

I'm pretty 

Re: Unable to shutdown

2011-08-29 Thread Eli Dart



On 8/28/11 1:06 PM, Bengt Ahlgren wrote:

Kevin Oberman kob6...@gmail.com writes:


I've run into an odd problem with dismounting file systems on a Seagate
Expansion portable USB drive. Running 8-stable on an amd64 system and
with two FAT32 (msdosfs) file systems on the drive.

The drive is green and spins down when idle.  If an attempt is made to
shut down the system while the drive is spun down, the system goes
through the usual shutdown, including flushing all buffers out to disk,
but when the final disk access is made to mark the file systems as
clean, the drive never spins up and the system hangs until it is powered
down. I've found no way to avoid this other than to remember to access
the disk and cause it to spin up before shutting down.

If I attempt to unmount the file systems when the drive is shut down,
the same thing happens, but I can recover as the second file system is
still mounted and an ls(1) to that file system will cause the disk to
spin up and everything is fine.

This looks like a bug, but I don't see why the unmounting of an msdosfs
file system does not spin up the drive. It's clearly hanging on some
operation that is not spinning up the drive, but does block.

Any ideas what is going on? Possible fix?


Not a solution to your problem, but a data point:

I have a WD Passport 750GB (2.5") drive with a UFS filesystem on it.  I
don't think I've tried shutdown with the drive mounted, but I've
experienced no problems after the drive has spun down, including umount.
There is just a delay while it spins up.  This is on 8.2-REL/i386, that
is, with the new USB stack.


In my experience, the issues don't show up at lower capacities.  I've 
seen problems with 2TB drives, but 1TB and 1.5TB drives seem to work fine.


Kevin - how big is the disk in question?

Thanks,

--eli



Bengt
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


--
Eli Dart                                 NOC: (510) 486-7600
ESnet Network Engineering Group (AS293)  (800) 333-7638
Lawrence Berkeley National Laboratory
PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3


Re: Unable to shutdown

2011-08-28 Thread Bengt Ahlgren
Kevin Oberman kob6...@gmail.com writes:

 I've run into an odd problem with dismounting file systems on a Seagate
 Expansion portable USB drive. Running 8-stable on an amd64 system and
 with two FAT32 (msdosfs) file systems on the drive.

 The drive is green and spins down when idle.  If an attempt is made to
 shut down the system while the drive is spun down, the system goes
 through the usual shutdown, including flushing all buffers out to disk,
 but when the final disk access is made to mark the file systems as
 clean, the drive never spins up and the system hangs until it is powered
 down. I've found no way to avoid this other than to remember to access
 the disk and cause it to spin up before shutting down.

 If I attempt to unmount the file systems when the drive is shut down,
 the same thing happens, but I can recover as the second file system is
 still mounted and an ls(1) to that file system will cause the disk to
 spin up and everything is fine.

 This looks like a bug, but I don't see why the unmounting of an msdosfs
 file system does not spin up the drive. It's clearly hanging on some
 operation that is not spinning up the drive, but does block.

 Any ideas what is going on? Possible fix?

Not a solution to your problem, but a data point:

I have a WD Passport 750GB (2.5") drive with a UFS filesystem on it.  I
don't think I've tried shutdown with the drive mounted, but I've
experienced no problems after the drive has spun down, including umount.
There is just a delay while it spins up.  This is on 8.2-REL/i386, that
is, with the new USB stack.

Bengt


Re: Unable to shutdown

2011-08-27 Thread Ulf Zimmermann
On Fri, Aug 26, 2011 at 09:51:02PM -0700, Kevin Oberman wrote:
 I've run into an odd problem with dismounting file systems on a Seagate
 Expansion portable USB drive. Running 8-stable on an amd64 system and
 with two FAT32 (msdosfs) file systems on the drive.
 
 The drive is green and spins down when idle.  If an attempt is made to
 shut down the system while the drive is spun down, the system goes
 through the usual shutdown, including flushing all buffers out to disk,
 but when the final disk access is made to mark the file systems as
 clean, the drive never spins up and the system hangs until it is powered
 down. I've found no way to avoid this other than to remember to access
 the disk and cause it to spin up before shutting down.
 
 If I attempt to unmount the file systems when the drive is shut down,
 the same thing happens, but I can recover as the second file system is
 still mounted and an ls(1) to that file system will cause the disk to
 spin up and everything is fine.
 
 This looks like a bug, but I don't see why the unmounting of an msdosfs
 file system does not spin up the drive. It's clearly hanging on some
 operation that is not spinning up the drive, but does block.
 
 Any ideas what is going on? Possible fix?
 -- 
 R. Kevin Oberman, Network Engineer - Retired
 E-mail: kob6...@gmail.com

Have a script that runs as one of the first at shutdown and does an ls
on the filesystem to wake the drive up.
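
A minimal sketch of such a wake-up script follows. It assumes the mount points are known in advance; the function name wake_file_systems and any example path are hypothetical, and how the script gets hooked into the shutdown sequence is left out. Note that a listing may be answered from the kernel's cache without touching the disk, so this may not be enough on its own.

```python
import os


def wake_file_systems(mount_points):
    """List each mount point (like ls) and return {mount_point: entry_count}.

    Hypothetical helper: the listing is only useful if it reaches the
    disk and makes the drive spin up before the kernel flushes buffers.
    """
    counts = {}
    for mount_point in mount_points:
        counts[mount_point] = len(os.listdir(mount_point))
    return counts
```

Called with the list of msdosfs mount points on the USB drive, it returns how many entries each listing saw, which is handy for logging.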

-- 
Regards, Ulf.

-
Ulf Zimmermann, 1525 Pacific Ave., Alameda, CA-94501, #: 510-865-0204
You can find my resume at: http://www.Alameda.net/~ulf/resume.html


Re: Unable to shutdown

2011-08-27 Thread Adrian Chadd
This sounds like a "create a PR and make noise until it's fixed" issue. :-)

Green drives are only going to get more prevalent.


Adrian

On 27 August 2011 12:51, Kevin Oberman kob6...@gmail.com wrote:
 I've run into an odd problem with dismounting file systems on a Seagate
 Expansion portable USB drive. Running 8-stable on an amd64 system and
 with two FAT32 (msdosfs) file systems on the drive.

 The drive is green and spins down when idle.  If an attempt is made to
 shut down the system while the drive is spun down, the system goes
 through the usual shutdown, including flushing all buffers out to disk,
 but when the final disk access is made to mark the file systems as
 clean, the drive never spins up and the system hangs until it is powered
 down. I've found no way to avoid this other than to remember to access
 the disk and cause it to spin up before shutting down.

 If I attempt to unmount the file systems when the drive is shut down,
 the same thing happens, but I can recover as the second file system is
 still mounted and an ls(1) to that file system will cause the disk to
 spin up and everything is fine.

 This looks like a bug, but I don't see why the unmounting of an msdosfs
 file system does not spin up the drive. It's clearly hanging on some
 operation that is not spinning up the drive, but does block.

 Any ideas what is going on? Possible fix?
 --
 R. Kevin Oberman, Network Engineer - Retired
 E-mail: kob6...@gmail.com


Re: Unable to shutdown

2011-08-27 Thread Eli Dart
This sounds like it may have the same underlying cause as an issue I've 
been experiencing.


Steps to reproduce:

1) Mount filesystem (Seagate 2TB USB disk)
2) wait a while, so the drive spins down
3) cd to a directory off the root of the mount point (the thing we're 
looking for here is a directory that is already in the filesystem buffer 
cache because of the filesystem mount).  We want a directory that is not 
empty.

4) ls
5) ls will hang for a while as the drive spins up (this is to be expected)
6) ls returns nothing

We now have a problem.  The kernel thinks the directory is empty, even 
when it's not.  The drive is spun up now, and the rest of the filesystem 
will function normally, but that one directory will be considered empty 
by the kernel until it has reason to interact with disk (which means 
writing to the directory).  Once the directory is written, it's now corrupt.


My guess is that there is something in the USB subsystem that doesn't 
deal well with the longer times necessary for bigger drives to spin back 
up (this is not a problem on 1TB drives).


A workaround is to have a little script that does a dd from the raw device 
to /dev/null before attempting to access the drive - this will ensure 
that it's spun up.  Needless to say, this doesn't work at all well for 
some production operations (e.g. rsync backup to USB disk), where disk 
I/O can cease for long enough for the drive to spin down in the middle 
of the job.
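
The dd workaround can also be sketched so that it reports how long the read took, which gives a rough figure for the spin-up delay suspected above. This is a sketch under assumptions: the name timed_read, the default block size, and the block count are arbitrary choices for illustration, and the raw device path has to be supplied by the caller.

```python
import time


def timed_read(device_path, block_size=64 * 1024, count=1):
    """Read up to `count` blocks from device_path and discard them.

    Roughly what a dd from the raw device to /dev/null does, but it
    returns (bytes_read, elapsed_seconds) so the delay can be logged.
    """
    start = time.monotonic()
    total = 0
    # buffering=0 gives an unbuffered raw file object, so each read()
    # maps onto a single read of at most block_size bytes.
    with open(device_path, "rb", buffering=0) as device:
        for _ in range(count):
            chunk = device.read(block_size)
            if not chunk:
                break  # end of device or file
            total += len(chunk)
    return total, time.monotonic() - start
```

Comparing the elapsed time on a 1TB drive against a 2TB drive would show whether the larger drives really do take noticeably longer to come back.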


--eli



On 8/26/11 9:51 PM, Kevin Oberman wrote:

I've run into an odd problem with dismounting file systems on a Seagate
Expansion portable USB drive. Running 8-stable on an amd64 system and
with two FAT32 (msdosfs) file systems on the drive.

The drive is green and spins down when idle.  If an attempt is made to
shut down the system while the drive is spun down, the system goes
through the usual shutdown, including flushing all buffers out to disk,
but when the final disk access is made to mark the file systems as
clean, the drive never spins up and the system hangs until it is powered
down. I've found no way to avoid this other than to remember to access
the disk and cause it to spin up before shutting down.

If I attempt to unmount the file systems when the drive is shut down,
the same thing happens, but I can recover as the second file system is
still mounted and an ls(1) to that file system will cause the disk to
spin up and everything is fine.

This looks like a bug, but I don't see why the unmounting of an msdosfs
file system does not spin up the drive. It's clearly hanging on some
operation that is not spinning up the drive, but does block.

Any ideas what is going on? Possible fix?


--
Eli Dart                                 NOC: (510) 486-7600
ESnet Network Engineering Group (AS293)  (800) 333-7638
Lawrence Berkeley National Laboratory
PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3