Re: The SX4 challenge

2008-01-20 Thread Mikael Pettersson
Jeff Garzik writes:
  
  Promise just gave permission to post the docs for their PDC20621 (i.e. 
  SX4) hardware:
  http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-1.2.pdf.bz2
  
  joining the existing PDC20621 DIMM and PLL docs:
  http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-dimm-1.6.pdf.bz2
  http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-pll-ata-timing-1.2.pdf.bz2
  
  
  So, the SX4 is now open.  Yay :)  I am hoping to talk Mikael into 
  becoming the sata_sx4 maintainer, and finally integrating my 'new-eh' 
  conversion in libata-dev.git.

The best solution would be if some storage driver person would
take on the SX4 challenge and work towards integrating the SX4
into Linux's RAID framework.

If no-one steps forward I'll take over Jeff's SX4 card and just
maintain sata_sx4 as a plain non-RAID driver. Unfortunately I
don't have the time needed to turn it into a decent RAID or
RAID-offload driver myself.

/Mikael

  
  But now is a good time to remind people how lame the sata_sx4 driver 
  software really is -- and I should know, I wrote it.
  
  The SX4 hardware, simplified, is three pieces:  XOR engine (for raid5), 
  host-board memcpy engine, and several ATA engines (and some helpful 
  transaction sequencing features).  Data for each WRITE command is first 
  copied to the board RAM, then the ATA engines DMA to/from the board RAM. 
  Data for each READ command is copied to board RAM via the ATA engines, 
  then DMA'd across PCI to your host memory.
  
  Therefore, while it is not hardware RAID, the SX4 provides all the 
  pieces necessary to offload RAID1 and RAID5, and handle other RAID 
  levels optimally.  RAID1 and 5 copies can be offloaded (provided all 
  copies go to SX4-attached devices of course).  RAID5 XOR gen and 
  checking can be offloaded, allowing the OS to see a single request, 
  while the hardware processes a sequence of low-level requests sent in a 
  batch.
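
  As a rough illustration of the XOR generation and checking mentioned
  above, here is a minimal host-side sketch in Python. It only shows the
  arithmetic an XOR engine would perform on a stripe; the function names
  (xor_blocks, parity_ok, rebuild) and the sample stripe are made up for
  this example and are not taken from the SX4 docs or from sata_sx4.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("blocks must all be the same length")
    # zip(*blocks) walks the blocks column by column, one byte from each.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity_ok(data_blocks, parity):
    """A stripe is consistent when data XOR parity is all zero bytes."""
    return not any(xor_blocks(list(data_blocks) + [parity]))


def rebuild(surviving_blocks, parity):
    """Recover one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical three-disk data stripe, four bytes per block.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]
parity = xor_blocks(stripe)
```

  With an offload engine the same three operations would run on board RAM
  instead of on the host CPU; how that maps onto the SX4's registers is
  not covered here.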
  
  This hardware presents an interesting challenge:  it does not really fit 
  into software RAID (i.e. no RAID) /or/ hardware RAID categories.  The 
  sata_sx4 driver presents the no-RAID configuration, which is terribly 
  inefficient:
  
   WRITE:
     submit host DMA (copy to board)
     host DMA completion via interrupt
     submit ATA command
     ATA command completion via interrupt
   READ:
     submit ATA command
     ATA command completion via interrupt
     submit host DMA (copy from board)
     host DMA completion via interrupt
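
  To make the cost of the sequence listed above concrete, here is a small
  Python model of it. This is only a sketch: the step tables simply
  restate the list, and the names WRITE_STEPS, READ_STEPS and
  interrupts_for are hypothetical, not anything from the driver.

```python
# Each step is (description, raises_interrupt), copied from the list above.
WRITE_STEPS = [
    ("submit host DMA (copy to board)", False),
    ("host DMA completion", True),
    ("submit ATA command", False),
    ("ATA command completion", True),
]

READ_STEPS = [
    ("submit ATA command", False),
    ("ATA command completion", True),
    ("submit host DMA (copy from board)", False),
    ("host DMA completion", True),
]

STEPS = {"WRITE": WRITE_STEPS, "READ": READ_STEPS}


def interrupts_for(commands):
    """Count the interrupts taken for a list of 'READ'/'WRITE' commands."""
    total = 0
    for command in commands:
        # Sum the steps of this command that complete via an interrupt.
        total += sum(1 for _, raises_irq in STEPS[command] if raises_irq)
    return total
```

  In this model every command costs two interrupts and two separate
  submissions, so for example three commands take six interrupts.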
  
  Thus, the SX4 challenge is a challenge to developers to figure out the 
  best configuration for this hardware, given the existing MD and 
  DM work going on.
  
  Now, it must be noted that the SX4 is not current-gen technology.  Most 
  vendors have moved towards an IOP model, where the hw vendor puts most 
  of their hard work into an ARM/MIPS firmware, running on an embedded 
  chip specially tuned for storage purposes.  (ref hptiop and stex 
  drivers, very very small SCSI drivers)
  
  I know Dan Williams @ Intel is working on very similar issues on the IOP 
  -- async memcpy, XOR offload, etc. -- and I am hoping that, due to that 
  current work, some of the good ideas can be reused with the SX4.
  
  Anyway...  it's open, it's interesting, even if it's not current-gen 
  tech anymore.  You can probably find them on Ebay or in an 
  out-of-the-way computer shop somewhere.
  
   Jeff
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Machine hanging on synchronize cache on shutdown 2.6.22-rc4-git[45678]

2007-06-18 Thread Mikael Pettersson
On Mon, 18 Jun 2007 16:09:49 +0900, Tejun Heo wrote:
 Mikael Pettersson wrote:
  On Sat, 16 Jun 2007 15:52:33 +0400, Brad Campbell wrote:
  I've got a box here based on current Debian Stable.
  It's got 15 Maxtor SATA drives in it on 4 Promise TX4 controllers.
 
  Using kernel 2.6.21.x it shuts down, but of course with a huge clack as 15
  drives all do emergency head parks simultaneously. I thought I'd upgrade to
  2.6.22-rc to get around this but the machine just hangs up hard, apparently
  trying to sync cache on a drive.
 
  I've run this process manually, so I know it is being performed properly.
 
  Prior to shutdown, all nfsd processes are stopped, filesystems unmounted 
  and md arrays stopped.
  /proc/mdstat shows
  [EMAIL PROTECTED]:~# cat /proc/mdstat
  Personalities : [raid6] [raid5] [raid4]
  unused devices: <none>
  [EMAIL PROTECTED]:~#
 
  Here is the final hangup.
 
  http://www.fnarfbargle.com/CIMG1029.JPG
  
  Something sent a command to the disk on ata15 after the PHY had been
  offlined and the interface had been put in SLUMBER state (SStatus 614).
  Consequently the command timed out. Libata tried a soft reset, and then
  a hard reset, after which the machine hung.
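
  For reference, the SStatus value can be split into its three 4-bit
  fields as laid out in the SATA specification (DET in bits 3:0, SPD in
  bits 7:4, IPM in bits 11:8). The sketch below assumes the 614 in the
  log is hexadecimal; the helper name decode_sstatus and the wording of
  the table entries are chosen for this example only.

```python
# Field meanings per the SATA SStatus register layout (subset of values).
DET = {
    0x0: "no device detected, PHY communication not established",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}
SPD = {
    0x0: "no negotiated speed",
    0x1: "Gen1 (1.5 Gbps)",
    0x2: "Gen2 (3.0 Gbps)",
}
IPM = {
    0x0: "device not present or communication not established",
    0x1: "active",
    0x2: "partial",
    0x6: "slumber",
}


def decode_sstatus(value):
    """Split a raw SStatus value into (DET, SPD, IPM) descriptions."""
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return (
        DET.get(det, "reserved"),
        SPD.get(spd, "reserved"),
        IPM.get(ipm, "reserved"),
    )
```

  Under that assumption, decode_sstatus(0x614) gives DET = PHY offline,
  SPD = Gen1 and IPM = slumber, which matches the description above.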
 
 Hmm... weird.  Maybe device initiated power saving (DIPS) is active?
 
  I don't think sata_promise is the guilty party here. Looks like some
  layer above sata_promise got confused about the state of the interface.
 
 But locking up hard after hardreset is a problem of sata_promise, no?

Maybe, maybe not. The original report doesn't specify where/how
the machine hung.

Brad: can you enable sysrq and check if the kernel responds to
sysrq when it appears to hang, and if so, where it's executing?

sata_promise just passes sata_std_hardreset to ata_do_eh.
I've certainly seen EH hardresets work before, so I'm assuming
that something in this particular situation (PHY offlined,
kernel close to shutting down) breaks things.

FWIW, I'm seeing scsi layer accesses (cache flushes) after things
like rmmod sata_promise. They error out and don't seem to cause
any harm, but the fact that they occur at all makes me nervous.

/Mikael