George,

> I'm looking for any pointers or advice on what might have happened
> to cause the following problem...

Running Oracle RAC on iSCSI target LUs that are accessed by three or  
more iSCSI initiator nodes requires support for SCSI-3 Persistent  
Reservations. That functionality was added to OpenSolaris at build  
snv_74 and is currently being backported to Solaris 10; it should be  
available in S10u7 next year.
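
A quick way to sanity-check that from either end: the /etc/release  
string on the target host shows whether the build is new enough, and  
sg_persist (from the sg3_utils package, an add-on rather than part of  
the stock install) can ask the LU directly whether any PR keys or  
reservations are registered. The device path below is just the one  
from George's test case; treat the whole thing as a sketch:

  # On the iSCSI target host: confirm the OS build (snv_74 or later,
  # or a Solaris 10 update containing the backport).
  cat /etc/release

  # From an initiator with sg3_utils installed: list the registered
  # Persistent Reservation keys and the current reservation on the LU.
  sg_persist --in --read-keys /dev/rdsk/c2t42d0s6
  sg_persist --in --read-reservation /dev/rdsk/c2t42d0s6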

The weird behavior seen below with 'dd' is likely due to Oracle  
continually repairing one of its many redundant header blocks.

- Jim

FWIW, you don't need a file that contains zeros, as /dev/zero works  
just fine, and it is infinitely large.
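
For example (same device as in the test case below; bs/count are  
whatever region you want to stomp on):

  # Write 1 KB of zeros straight from /dev/zero, no scratch file needed.
  dd if=/dev/zero of=/dev/rdsk/c2t42d0s6 bs=1k count=1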

>
>
> Setup:
> Two X4500 / Solaris 10 U5 iSCSI servers; four T1000 (S10 U4, later
> U5) Oracle RAC DB heads as the iSCSI clients.
>
> iSCSI was set up using zfs volumes with shareiscsi=on.
> (The slightly weird part:) the disks were partitioned to get the
> maximum number of spindles available for "pseudo-RAID 10" performance
> zpools (500 GB disks, 465 usable, partitioned as 115 GB for the
> "fast" db, 345 GB for the "archive" db, and 5 GB for "utility", used
> for the OCR and VOTE partitions in RAC).
> The disks on each server are set up the same way; the active zpool
> disks are in 7 "fast" pools (the "fast" partition on target 1 on each
> SATA controller together in one pool, target 2 on each in a second
> pool, etc.), 7 "archive" pools, and 7 "utility" pools.  "fast" and
> "utility" are zpool pseudo-RAID 10; "archive" is raidz.  Fixed-size
> zfs volumes were built to the full capacity of each pool.
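
(For anyone trying to reproduce the layout above, it would have been
built with commands along these lines; the pool, volume, and slice
names here are made up, not George's actual ones:)

  # One "fast" pseudo-RAID 10 pool built from matching slices across
  # controllers (device/slice names are illustrative).
  zpool create fast1 mirror c0t1d0s3 c1t1d0s3 mirror c4t1d0s3 c5t1d0s3

  # Fixed-size zvol filling the pool, exported as an iSCSI target LU.
  zfs create -V 115g fast1/vol1
  zfs set shareiscsi=on fast1/vol1

  # Confirm the target was created by iscsitgtd.
  iscsitadm list target -v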
>
> The clients were S10U4 when we first spotted this; we upgraded them
> all to S10U5 as soon as we noticed, but the problem happened again
> last week.  The X4500s have been S10U5 since they were installed.
>
>
> Problem:
> Both servers have experienced a failure mode which initially
> manifested as an Oracle RAC crash and proved via testing to be
> ignored iSCSI writes to the "fast" partitions.
>
> Test case:
> (/tmp/zero is a 1 KB file full of zeros)
> # dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1
> nÉçORCLDISK
> FDATA_0008FDATAFDATA_0008ö*Én¨ö*íSô¼>Ú
> ö*5|1+0 records in
> 1+0 records out
> # dd of=/dev/rdsk/c2t42d0s6 if=/tmp/zero bs=1k count=1
> 1+0 records in
> 1+0 records out
> # dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1
> nÉçORCLDISK
> FDATA_0008FDATAFDATA_0008ö*Én¨ö*íSô¼>Ú
> ö*5|1+0 records in
> 1+0 records out
> #
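
(Same check, but less painful on the terminal - piping the readback
through od and comparing before and after, instead of dumping the raw
ASM header as binary:)

  # Snapshot the first 1 KB, attempt the overwrite, read it back.
  dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1 2>/dev/null | od -c | head
  dd if=/dev/zero of=/dev/rdsk/c2t42d0s6 bs=1k count=1
  dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1 2>/dev/null | od -c | head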
>
>
> Once this started happening, the same write behavior appeared
> immediately on all clients, including new ones that had not
> previously been connected to the iSCSI server.
>
> We can write a block of all 0's or A's to any of the iSCSI devices
> other than the problem one and read it back fine.  But the
> misbehaving one consistently refuses to actually commit writes, even
> though it accepts the write and returns success.  All reads get the
> old data.
>
> zpool status, zfs list, /var/adm/messages, and everything else we
> look at on the servers all say they're happy and fine.  But obviously
> there's something very wrong with the particular volume / pool
> which is giving us problems.
>
> A coworker fixed it the first time by running a manual resilver;
> once that was underway, writes did the right thing again.  But that
> was just a shot in the dark - we saw no errors or any clear reason
> to resilver.
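
(If it happens again before anyone has a better idea, a scrub is the
less-random version of that shot in the dark - it forces the pool to
walk and verify all of its data, and zpool status will show any
checksum errors it turns up.  The pool name here is made up:)

  # Scrub the suspect pool and watch the results.
  zpool scrub fast1
  zpool status -v fast1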
>
> We saw it again; it blew up the just-about-to-go-live database, and
> we had to cut over to SAN storage to hit the deploy window.
>
> It's happened on both of the X4500s we were using for iSCSI, so it's
> not a single-point hardware issue.
>
> I have preserved the second failed system in error mode in case
> someone has ideas for more diagnostics.
>
> I have an open support ticket, but so far no hint at a solution.
>
> Anyone on list have ideas?
>
>
> Thanks....
>
> - -george william herbert
> [EMAIL PROTECTED]
