Re: [zfs-discuss] I/O Read starvation

2010-01-09 Thread bank kus
Btw, FWIW, if I redo the dd + 2 cp experiment on /tmp the result is far more 
disastrous: the GUI stops moving and Caps Lock stops responding for long 
intervals. No clue why.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I/O Read starvation

2010-01-09 Thread bank kus
Hi Henrik
I have 16GB of RAM on my system; on a system with less RAM, dd does cause 
problems, as I mentioned above. My __guess__ is that the dd output is probably 
sitting in some in-memory cache, since du -sh doesn't show the full file size 
until I do a sync.

At this point I'm less looking for QA-type repro questions and/or speculation, 
and more looking for the ZFS design expectations.

What is the expected behaviour? If one thread queues 100 reads and another 
thread comes along later with 50 reads, are those 50 reads __guaranteed__ to 
fall behind the first 100, or is timeslicing/fairshare done between the two 
streams?

Btw, this problem is pretty serious: with 3 users on the system, one of them 
initiating a large copy grinds the other 2 to a halt. Linux doesn't have this 
problem, and this is almost a switch-O/S moment for us, unfortunately :-(
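To make the question concrete, here is roughly the kind of test I have in mind 
(file names and sizes are only placeholders):

  # reader 1 queues up a long stream of reads
  dd if=/tank/big1 of=/dev/null bs=1024k &

  # a few seconds later, a shorter reader arrives
  sleep 5
  ptime dd if=/tank/big2 of=/dev/null bs=1024k count=1024

If the second dd's elapsed time is far beyond what 1GB of sequential reads 
should take, the later stream is effectively queued behind the first one rather 
than being timesliced against it.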

Regards
banks
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I/O Read starvation

2010-01-09 Thread Henrik Johansson
On Jan 9, 2010, at 2:02 PM, bank kus wrote:

>> Probably not, but ZFS only runs in userspace on Linux
>> with fuse so it  
>> will be quite different.
> 
> I wasn't clear in my description, I'm referring to ext4 on Linux. In fact on a 
> system with low RAM even the dd command makes the system horribly 
> unresponsive. 
> 
> IMHO not having fairshare or timeslicing between different processes issuing 
> reads is frankly unacceptable given a lame user can bring the system to a 
> halt with 3 large file copies. Are there ZFS settings or Project Resource 
> Control settings one can use to limit abuse from individual processes?
> -- 

Are you sure this problem is related to ZFS? I have no problem with multiple 
threads reading and writing to my pools; it's still responsive. If I however put 
urandom with dd into the mix, I get much more latency. 

Doesn't, for example, $(dd if=/dev/urandom of=/dev/null bs=1048576k count=8) give 
you the same problem? Or what if you use the file you already created from urandom 
as input to dd?
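In other words, something like this (just a sketch; largefile.txt is whatever 
you created earlier):

  # pure /dev/urandom cost, no ZFS reads involved
  dd if=/dev/urandom of=/dev/null bs=1048576k count=8

  # pure ZFS read of the already-written file, no urandom involved
  dd if=largefile.txt of=/dev/null bs=1048576k count=8

If the first one alone makes the desktop unresponsive, the latency is coming 
from generating the random data rather than from the ZFS read path.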

Regards

Henrik
http://sparcv9.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] abusing zfs boot disk for fun and DR

2010-01-09 Thread Mark Bennett
Ben,
I have found that booting from CD-ROM and importing the pool on the new host, 
then booting from the hard disk, will prevent these issues.
That will reconfigure ZFS to use the new disk device.
Once running, zpool detach the missing mirror device and attach a new one.
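Roughly, as a sketch (the pool and device names below are only placeholders for 
your own):

  # from the CD-ROM environment, adopt the pool on the new host
  zpool import -f rpool

  # after booting from the hard disk, drop the now-missing mirror half
  # and attach a fresh device in its place
  zpool detach rpool c0t1d0s0
  zpool attach rpool c0t0d0s0 c0t1d0s0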

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [zones-discuss] Zones on shared storage - a warning

2010-01-09 Thread Frank Batschulat (Home)
On Fri, 08 Jan 2010 18:33:06 +0100, Mike Gerdts  wrote:

> I've written a dtrace script to get the checksums on Solaris 10.
> Here's what I see with NFSv3 on Solaris 10.

jfyi, I've reproduced it as well using a Solaris 10 Update 8 SB2000 sparc client
and NFSv4.

much like you I also get READ errors along with the CKSUM errors, which
is different from my observation on an ONNV client.

unfortunately your dtrace script did not work for me, i.e. it
did not spit out anything :(

cheers
frankB

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-09 Thread Paul B. Henson
On Sat, 9 Jan 2010, Eric Schrock wrote:

> > If ZFS removed the drive from the pool, why does the system keep
> > complaining about it?
>
> It's not failing in the sense that it's returning I/O errors, but it's
> flaky, so it's attaching and detaching.  Most likely it decided to attach
> again and then you got transport errors.

Ok, how do I make it stop logging messages about the drive until it is
replaced? It's still filling up the logs with the same errors about the
drive being offline.

Looks like hdadm isn't it:

r...@cartman ~ # hdadm offline disk c1t2d0
/usr/bin/hdadm[1762]: /dev/rdsk/c1t2d0d0p0: cannot open
/dev/rdsk/c1t2d0d0p0 is not available

Hmm, I was able to unconfigure it with cfgadm:

r...@cartman ~ # cfgadm -c unconfigure sata1/2::dsk/c1t2d0

It went from:

sata1/2::dsk/c1t2d0            disk   connected    configured   failed

to:

sata1/2                        disk   connected    unconfigured failed

Hopefully that will stop the errors until it's replaced and not break
anything else :).
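For my own notes, my understanding (untested, so just a sketch) is that once the 
replacement disk arrives, the reverse sequence should bring it back into service:

  # reconfigure the SATA port, then rebuild onto the new disk
  cfgadm -c configure sata1/2
  zpool replace export c1t2d0

  # once resilvering completes, c5t0d0 should go back to being a spare
  # (zpool detach export c5t0d0 if it doesn't drop out on its own)
  zpool status export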

> No, it's fine.  DEGRADED just means the pool is not operating at the
> ideal state.  By definition a hot spare is always DEGRADED.  As long as
> the spare itself is ONLINE it's fine.

The spare shows as "INUSE", but I'm guessing that's fine too.

> Hope that helps

That was perfect, thank you very much for the review. Now I can not worry
about it until Monday :).

-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-09 Thread Ian Collins

Paul B. Henson wrote:

We just had our first x4500 disk failure (which of course had to happen
late Friday night ), I've opened a ticket on it but don't expect a
response until Monday so was hoping to verify the hot spare took over
correctly and we still have redundancy pending device replacement.

This is an S10U6 box:

Here's the zpool status output:

  pool: export
 state: DEGRADED
[...]
 scrub: scrub completed after 0h6m with 0 errors on Fri Jan  8 23:21:31
2010


        NAME          STATE     READ WRITE CKSUM
        export        DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c0t2d0    ONLINE       0     0     0
            spare     DEGRADED 18.9K     0     0
              c1t2d0  REMOVED      0     0     0
              c5t0d0  ONLINE       0     0 18.9K

        spares
          c5t0d0      INUSE     currently in use

Is the pool/mirror/spare still supposed to show up as degraded after the
hot spare is deployed?

  
Yes, the spare will show as degraded until you replace it. I had a pool 
on a 4500 that lost one drive, then swapped out 3 more due to brain 
farts from that naff Marvell driver. It was a bit of a concern for a 
while seeing two degraded devices in one raidz vdev!



The scrub started at 11pm last night, the disk got booted at 11:15pm,
presumably the scrub came across the failures the os had been reporting.
The last scrub status shows that scrub completing successfully. What
happened to the resilver status? How can I tell if the resilver was
successful? Did the resilver start and complete while the scrub was still
running and its status output was lost? Is there any way to see the status
of past scrubs/resilvers, or is only the most recent one available?

  

You only see the last one, but a resilver is a scrub.


Mostly I'd like to verify my hot spare is working correctly. Given the
spare status is "degraded", the read errors on the spare device, and the
lack of successful resilver status output, it seems like the spare might
not have been added successfully.

  

It has - "scrub completed after 0h6m with 0 errors".

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-09 Thread Eric Schrock

On Jan 9, 2010, at 9:45 AM, Paul B. Henson wrote:
> 
> If ZFS removed the drive from the pool, why does the system keep
> complaining about it?

It's not failing in the sense that it's returning I/O errors, but it's flaky, 
so it's attaching and detaching.  Most likely it decided to attach again and 
then you got transport errors.

> Is fault management stuff still poking at it?

No.

> Is the pool/mirror/spare still supposed to show up as degraded after the
> hot spare is deployed?

Yes.

> There are 18.9K checksum errors on the disk that failed, but there are also
> 18.9K read errors on the hot spare?

This is a bug recently fixed in OpenSolaris.

> The last scrub status shows that scrub completing successfully. What
> happened to the resilver status?

If there was a scrub it will show as the last thing completed.

> How can I tell if the resilver was
> successful?

If the scrub was successful.

> Did the resilver start and complete while the scrub was still
> running and its status output was lost?

No, only one can be active at any time.

> Is there any way to see the status
> of past scrubs/resilvers, or is only the most recent one available?

Only the most recent one.

> None of that results in a fault diagnosis?

When the device is in the process of going away, no.  From the OS perspective 
this disk was physically removed from the system.

> Mostly I'd like to verify my hot spare is working correctly. Given the
> spare status is "degraded", the read errors on the spare device, and the
> lack of successful resilver status output, it seems like the spare might
> not have been added successfully.

No, it's fine.  DEGRADED just means the pool is not operating at the ideal 
state.  By definition a hot spare is always DEGRADED.  As long as the spare 
itself is ONLINE it's fine.

Hope that helps,

- Eric

--
Eric Schrock, Fishworks            http://blogs.sun.com/eschrock



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] x4500 failed disk, not sure if hot spare took over correctly

2010-01-09 Thread Paul B. Henson

We just had our first x4500 disk failure (which of course had to happen
late Friday night ), I've opened a ticket on it but don't expect a
response until Monday so was hoping to verify the hot spare took over
correctly and we still have redundancy pending device replacement.

This is an S10U6 box:

SunOS cartman 5.10 Generic_141445-09 i86pc i386 i86pc

Looks like the first errors started yesterday morning:

Jan  8 07:46:02 cartman marvell88sx: [ID 268337 kern.warning] WARNING: marvell88sx1: device on port 2 failed to reset
Jan  8 07:46:15 cartman marvell88sx: [ID 268337 kern.warning] WARNING: marvell88sx1: device on port 2 failed to reset
Jan  8 07:46:32 cartman sata: [ID 801593 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1:
Jan  8 07:46:32 cartman SATA device at port 2 - device failed
Jan  8 07:46:32 cartman scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1/d...@2,0 (sd26):
Jan  8 07:46:32 cartman Command failed to complete...Device is gone

ZFS failed the drive about 11:15PM:

Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] zpool export status: One or more devices has experienced an unrecoverable error.  An
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] zpool export status: attempt was made to correct the error.  Applications are unaffected.
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] unknown header see
Jan  8 23:15:01 cartman zpool_check[3702]: [ID 702911 daemon.error] warning: pool export health DEGRADED

However, the errors continue still:

Jan  9 03:54:48 cartman scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1/d...@2,0 (sd26):
Jan  9 03:54:48 cartman Command failed to complete...Device is gone
[...]
Jan  9 07:56:12 cartman scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1/d...@2,0 (sd26):
Jan  9 07:56:12 cartman Command failed to complete...Device is gone
Jan  9 07:56:12 cartman scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci1022,7...@2/pci11ab,1...@1/d...@2,0 (sd26):
Jan  9 07:56:12 cartman drive offline

If ZFS removed the drive from the pool, why does the system keep
complaining about it? Is fault management stuff still poking at it?

Here's the zpool status output:

  pool: export
 state: DEGRADED
[...]
 scrub: scrub completed after 0h6m with 0 errors on Fri Jan  8 23:21:31
2010


        NAME          STATE     READ WRITE CKSUM
        export        DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c0t2d0    ONLINE       0     0     0
            spare     DEGRADED 18.9K     0     0
              c1t2d0  REMOVED      0     0     0
              c5t0d0  ONLINE       0     0 18.9K

        spares
          c5t0d0      INUSE     currently in use

Is the pool/mirror/spare still supposed to show up as degraded after the
hot spare is deployed?

There are 18.9K checksum errors on the disk that failed, but there are also
18.9K read errors on the hot spare?

The scrub started at 11pm last night, the disk got booted at 11:15pm,
presumably the scrub came across the failures the os had been reporting.
The last scrub status shows that scrub completing successfully. What
happened to the resilver status? How can I tell if the resilver was
successful? Did the resilver start and complete while the scrub was still
running and its status output was lost? Is there any way to see the status
of past scrubs/resilvers, or is only the most recent one available?

Fault management doesn't report any problems:

r...@cartman ~ # fmdump
TIME UUID SUNW-MSG-ID
fmdump: /var/fm/fmd/fltlog is empty

Shouldn't this show a failed disk?

fmdump -e shows tons of bad stuff:

Jan 08 07:46:32.9467 ereport.fs.zfs.probe_failure
Jan 08 07:46:36.2015 ereport.fs.zfs.io
[...]
Jan 08 07:51:05.1865 ereport.fs.zfs.io

None of that results in a fault diagnosis?

Mostly I'd like to verify my hot spare is working correctly. Given the
spare status is "degraded", the read errors on the spare device, and the
lack of successful resilver status output, it seems like the spare might
not have been added successfully.

Thanks for any input you might provide...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I/O Read starvation

2010-01-09 Thread Jürgen Keil
> > I wasn't clear in my description, I'm referring to ext4 on Linux. In 
> > fact on a system with low RAM even the dd command makes the system 
> > horribly unresponsive.
> >
> > IMHO not having fairshare or timeslicing between different processes 
> > issuing reads is frankly unacceptable given a lame user can bring 
> > the system to a halt with 3 large file copies. Are there ZFS 
> > settings or Project Resource Control settings one can use to limit 
> > abuse from individual processes?
> 
> I am confused.  Are you talking about ZFS under OpenSolaris, or are 
> you talking about ZFS under Linux via Fuse?
> 
> Do you have compression or deduplication enabled on
> the zfs  filesystem?
> 
> What sort of system are you using?

I was able to reproduce the problem running
current (mercurial) opensolaris bits, with the
"dd" command:

  dd if=/dev/urandom of=largefile.txt bs=1048576k count=8

dedup is off, compression is on. System is a 32-bit laptop
with 2GB of memory, single core cpu.  The system was
unusable / unresponsive for about 5 minutes before I was
able to interrupt the dd process.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ssd pool + ssd cache ?

2010-01-09 Thread Richard Elling

On Jan 9, 2010, at 1:32 AM, Lutz Schumann wrote:

Depends.

a) Pool design
5 x SSD as raidZ = 4 SSD space - read I/O performance of one drive
Adding 5 cheap 40 GB L2ARC device (which are pooled) increases the  
read performance for your working window of 200 GB.


An interesting thing happens when an app suddenly has 50-100x more IOPS.
The bottleneck tends to move back to the CPU.  This is a good thing, because
the application running on a CPU is where the most value is gained.  Be aware
of this, because it is not uncommon for people to upgrade the storage and not
see significant improvement when the application becomes CPU-bound.
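A quick way to sanity-check which side of that line you are on (nothing
ZFS-specific, just the stock observability tools):

  # per-CPU utilization: lots of usr/sys time and little idle suggests CPU-bound
  mpstat 5

  # per-device service times and %busy: high asvc_t / %b suggests storage-bound
  iostat -xn 5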


If you have a pool of mirrors - adding L2ARC does not make sense.


I think this is a good rule of thumb.
 -- richard


b) SSD type
If your devices are MLC, adding a ZIL makes sense. Watch out for  
drive qualification! (Honor the cache flush command.)


Robert
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I/O Read starvation

2010-01-09 Thread bank kus
> I am confused.  Are you talking about ZFS under
> OpenSolaris, or are 
> you talking about ZFS under Linux via Fuse?

??? 

> Do you have compression or deduplication enabled on
> the zfs 
> filesystem?

Compression, no. I'm guessing 2009.06 doesn't have dedup.
 
> What sort of system are you using?

OSOL 2009.06 on Intel i7 920. The repro steps are at the top of this thread.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I/O Read starvation

2010-01-09 Thread Bob Friesenhahn

On Sat, 9 Jan 2010, bank kus wrote:


Probably not, but ZFS only runs in userspace on Linux
with fuse so it
will be quite different.


I wasn't clear in my description, I'm referring to ext4 on Linux. In 
fact on a system with low RAM even the dd command makes the system 
horribly unresponsive.


IMHO not having fairshare or timeslicing between different processes 
issuing reads is frankly unacceptable given a lame user can bring 
the system to a halt with 3 large file copies. Are there ZFS 
settings or Project Resource Control settings one can use to limit 
abuse from individual processes?


I am confused.  Are you talking about ZFS under OpenSolaris, or are 
you talking about ZFS under Linux via Fuse?


Do you have compression or deduplication enabled on the zfs 
filesystem?


What sort of system are you using?

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool iostat -v hangs on L2ARC failure (SATA, 160 GB Postville)

2010-01-09 Thread Lutz Schumann
I finally managed to resolve this. I received some useful info from Richard 
Elling (without List CC): 

>> (ME) However I still think the plain IDE driver also needs a timeout to 
>> handle disk failures, because cables etc. can fail.

>(Richard) Yes, this is a little bit odd.  The sd driver should be in the stack 
>above
the IDE driver and the sd driver tends to manage timeouts as well.
Could you send the "prtconf -D" output? 

>> (ME) prtconf -D:
>>
>> System Configuration:  Sun Microsystems  i86pc
>> Memory size: 8191 Megabytes
>> System Peripherals (Software Nodes):
>>
>> i86pc (driver name: rootnex)
>>   scsi_vhci, instance #0 (driver name: scsi_vhci)
>>   isa, instance #0 (driver name: isa)
>>   asy, instance #0 (driver name: asy)
>>   lp, instance #0 (driver name: ecpp)
>>   i8042, instance #0 (driver name: i8042)
>>   keyboard, instance #0 (driver name: kb8042)
>>   motherboard
>>   pit_beep, instance #0 (driver name: pit_beep)
>>   pci, instance #0 (driver name: npe)
>>   pci1002,5957
>>   pci1002,5978, instance #0 (driver name: pcie_pci)
>>   display, instance #1 (driver name: vgatext)
>>   pci1002,597b, instance #1 (driver name: pcie_pci)
>>   pci8086,1083, instance #1 (driver name: e1000g)
>>   pci1002,597c, instance #2 (driver name: pcie_pci)
>>   pci8086,1083, instance #2 (driver name: e1000g)
>>   pci1002,597f, instance #3 (driver name: pcie_pci)
>>   pci1458,e000 (driver name: gani)
>>   pci-ide, instance #3 (driver name: pci-ide)
>>   ide, instance #6 (driver name: ata)
>>   cmdk, instance #1 (driver name: cmdk)

> (Richard) Here is where you see the driver stack. Inverted it looks like:

>cmdk
>ata
>pci-ide
>npe

>I/O from the file system will go directly to the cmdk driver.
>I'm not familiar with that driver, mostly because if you change
>to AHCI, then you will see something more like:
>
>sd
>ahci
>pci-ide
>npe

>The important detail to remember is that ZFS does not have any
>timeouts.  It will patiently wait for a response from cmdk or sd.
>The cmdk and sd drivers manage timeouts and retries farther
>down the stack.

> For sd, I know that disk selection errors are propagated quickly. 
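(Aside, just as a sketch: the sd timeout Richard refers to can be checked on a
live system, assuming the usual sd_io_time tunable is present in your build:)

  # print the sd driver's per-command timeout, in seconds
  echo "sd_io_time/D" | mdb -k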

>> (ME) Is it possible to set AHCI without reinstalling OSol ?

> (Richard) Yes. But you might need to re-import the non-syspool pools 
> manually. 

 
OK, so I wanted to switch from IDE to AHCI while keeping my installation, and 
test again. When setting the mode for my IDE devices to AHCI in the BIOS, the 
machine panicked with "Error could not import root volume: error 19" in GRUB, so 
the machine could not boot. After some googling I found: 


http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6795637 ( 
Implement full root disk portability, boot must not be dependent on install 
hardware config) 

and 

http://defect.opensolaris.org/bz/show_bug.cgi?id=5785 (guide how to change the 
boot device for P2V, which is actually similar)

So I did as described in the guide. Maybe this is of use for someone else 
finding this. 
---
Overview: My storage server's disk mode was set to IDE. (Whether your server is 
set to SATA or IDE can be tested with cfgadm; if the devices are not shown, you 
are in IDE mode.) To enable AHCI / SATA mode for your drives, you have to go into 
the BIOS and set the mode to AHCI. However, after you have done this, your 
machine will (may?) not boot anymore. You will get a panic after GRUB saying 
"cannot mount rootfs" (actually this screen is shown only very briefly; to 
actually see it, add "-k -v" to the GRUB boot options and you will fall into the 
debugger to read the message)

IDE MODE:

* NO hot plug
* The system hangs 100% if a cache or non device is removed (see thread 
above)
* NO NCQ available

AHCI Mode:

* Full support for NCQ (?)
* Full support for Hot Plug (devices shown via cfgadm as sata/X:disk) 

To switch from IDE mode to AHCI for a running installation of NexentaStor I did 
the following:

* Create a checkpoint just to be sure
* note (write down) which checkpoint is the safety checkpoint you just 
created
* note (write down) which checkpoint is currently booted
* export your data volumes
* reboot
* Enter BIOS and set mode to AHCI
* Boot rescue CD (USB CDROM not working, must be IDE, PXE maybe later added)
* In the Rescue CD do (login root / passwd empty): 
  o mkdir /mnt
  o zpool import -f syspool
  o mount -F zfs syspool/rootfs-nms-XXX /mnt (this is the active 
snapshot / clone you are booting normally, not the rescue checkpoint you 
created)
  o mv /mnt/etc/path_to_inst /mnt/etc/path_to_inst.ORG
  o touch /mnt/etc/path_to_inst
  o devfsadm -C -r /mnt
  o devfsadm -c disk -r /mnt
  o devfsadm -i e1000g -r /mnt
  o cp -a /mnt/etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache.ORG
  o cp -a /etc/zfs/zpool.cach

Re: [zfs-discuss] ZFS extremely slow performance

2010-01-09 Thread Emily Grettel


Hello again,
 
I swapped out the PSU and replaced the cables and ran scrubs almost every day 
(after hours) with no reported faults. I also upgraded to SNV_130 thanks to 
Brock & changed cables and PSU after the suggestion from Richard. I owe you two 
both beers!
 
We thought our troubles were resolved, but I'm noticing a lot of the messages 
above in my /var/adm/messages and I'm starting to worry. I tailed the log 
whilst we streamed some MPEG2 captures (it was about 12Gb) and the log went 
crazy!

Jan  9 23:52:04 razor nwamd[34]: [ID 244715 daemon.error] scf_handle_destroy() 
failed: repository server unavailable
Jan  9 23:52:04 razor smbd[13585]: [ID 354691 daemon.error] smb_nicmon_daemon: 
failed to refresh SMF instance svc:/network/smb/server:default
Jan  9 23:52:04 razor last message repeated 11 times
Jan  9 23:52:04 razor nwamd[34]: [ID 244715 daemon.error] scf_handle_destroy() 
failed: repository server unavailable
Jan  9 23:52:04 razor smbd[13585]: [ID 354691 daemon.error] smb_nicmon_daemon: 
failed to refresh SMF instance svc:/network/smb/server:default
Jan  9 23:52:05 razor last message repeated 4 times
Jan  9 23:52:05 razor nwamd[34]: [ID 244715 daemon.error] scf_handle_destroy() 
failed: repository server unavailable
Jan  9 23:52:05 razor smbd[13585]: [ID 354691 daemon.error] smb_nicmon_daemon: 
failed to refresh SMF instance svc:/network/smb/server:default

Any ideas why this may be happening? I'm really starting to worry: is it a ZFS 
issue or SMB again?
 
Cheers,
Emily


> On Dec 31, 2009, at 11:38 PM, Emily Grettel wrote:
> 
> > Hi Richard,
> >
> > This is my zpool status -v
> >
> > pool: tank
> > state: ONLINE
> > status: One or more devices has experienced an unrecoverable error. 
> > An
> > attempt was made to correct the error. Applications are 
> > unaffected.
> > action: Determine if the device needs to be replaced, and clear the 
> > errors
> > using 'zpool clear' or replace the device with 'zpool 
> > replace'.
> > see: http://www.sun.com/msg/ZFS-8000-9P
> > scrub: scrub completed after 5h15m with 0 errors on Fri Jan 1 
> > 17:39:57 2010
> > config:
> > NAME          STATE     READ WRITE CKSUM
> > tank          ONLINE       0     0     0
> >   raidz1-0    ONLINE       0     0     0
> >     c7t1d0    ONLINE       0     0     2  51.5K repaired
> >     c7t4d0    ONLINE       0     0     2  52K repaired
> >     c0t1d0    ONLINE       0     0     3  77.5K repaired
> >     c7t5d0    ONLINE       0     0     0
> >     c7t3d0    ONLINE       0     0     1  26K repaired
> >     c7t2d0    ONLINE       0     0     0
> > errors: No known data errors
> >
> > I might swap the SATA cables to some better quality ones that are 
> > shielded I think (ACRyan has some) and see if its that.
> >
> > Cheers,
> > Em
> >
> > > From: richard.ell...@gmail.com
> > > To: emilygrettelis...@hotmail.com
> > > Subject: Re: [zfs-discuss] ZFS extremely slow performance
> > > Date: Thu, 31 Dec 2009 19:58:24 -0800
> > >
> > > hmmm... might be something other than the disk, like cables or
> > > vibration.
> > > Let's see what happens after the scrub completes.
> > > -- richard
> > >
> > > On Dec 31, 2009, at 5:34 PM, Emily Grettel wrote:
> > >
> > > > Hello!
> > > >
> > > >
> > > > > This could be a broken disk, or it could be some other
> > > > > hardware/software/firmware issue. Check the errors on the
> > > > > device with
> > > > > iostat -En
> > > >
> > > > Heres the output:
> > > >
> > > > c7t1d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> > > > Vendor: ATA Product: WDC WD10EADS-00L Revision: 1A01 Serial No:
> > > > Size: 1000.20GB <1000204886016 bytes>
> > > > Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> > > > Illegal Request: 4 Predictive Failure Analysis: 0
> > > > c7t2d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> > > > Vendor: ATA Product: WDC WD10EADS-00P Revision: 0A01 Serial No:
> > > > Size: 1000.20GB <1000204886016 bytes>
> > > > Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> > > > Illegal Request: 4 Predictive Failure Analysis: 0
> > > > c7t3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> > > > Vendor: ATA Product: WDC WD10EADS-00P Revision: 0A01 Serial No:
> > > > Size: 1000.20GB <1000204886016 bytes>
> > > > Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> > > > Illegal Request: 4 Predictive Failure Analysis: 0
> > > > c7t4d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> > > > Vendor: ATA Product: WDC WD10EADS-00P Revision: 0A01 Serial No:
> > > > Size: 1000.20GB <1000204886016 bytes>
> > > > Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> > > > Illegal Request: 4 Predictive Failure Analysis: 0
> > > > c7t5d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> > > > Vendor: ATA Product: WDC WD10EADS-00P Revision: 0A01 Serial No:
> > > > Size: 1000.20GB <1000204886016 bytes>
> > > > Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> > > > Illegal Request: 4 Predictive Failure Analysis: 0
> > > > c7t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> > > > Vendor: ATA Product: WDC WD740GD-00FL Revision: 8F33 Serial No:
> > > > Size: 74.36GB <74355769344 bytes>
> > 

Re: [zfs-discuss] I/O Read starvation

2010-01-09 Thread bank kus
> Probably not, but ZFS only runs in userspace on Linux
> with fuse so it  
> will be quite different.

I wasn't clear in my description, I'm referring to ext4 on Linux. In fact on a 
system with low RAM even the dd command makes the system horribly unresponsive. 

IMHO not having fairshare or timeslicing between different processes issuing 
reads is frankly unacceptable given a lame user can bring the system to a halt 
with 3 large file copies. Are there ZFS settings or Project Resource Control 
settings one can use to limit abuse from individual processes?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I/O Read starvation

2010-01-09 Thread Henrik Johansson



Henrik
http://sparcv9.blogspot.com

On 9 jan 2010, at 04.49, bank kus  wrote:


dd if=/dev/urandom of=largefile.txt bs=1G count=8

cp largefile.txt ./test/1.txt &
cp largefile.txt ./test/2.txt &

That's it. Now the system is totally unusable after launching the two  
8G copies. Until these copies finish, no other application is able to  
launch completely. Checking prstat shows them to be in the sleep  
state.


Question:
<> I'm guessing this is because ZFS doesn't use CFQ, and one process  
is allowed to queue up all its I/O reads ahead of other processes?




What is CFQ? A scheduler? If you are running OpenSolaris, then you do  
not have CFQ.


<> Is there a concept of priority among I/O reads? I only ask  
because if root were to launch some GUI application, it doesn't start  
up until both copies are done. So there is no concept of priority?  
Needless to say this does not exist on Linux 2.60...

--


Probably not, but ZFS only runs in userspace on Linux with fuse so it  
will be quite different.






This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Activity after LU with ZFS/Zone working

2010-01-09 Thread Cesare
Hi all,

recently I upgraded a T5120 to S10U8 using LU. The system had zones
configured, and at the time of the upgrade the zones were still alive
and working fine. The LU procedure ended successfully. The zones on the
system were installed on a ZFS filesystem. Here is the result at the end
of LU (ABE-from: s10Aug2007, ABE-to: s10Set2009):

# zfs list
NAME                                      USED  AVAIL  REFER  MOUNTPOINT
tank/zones/zone01                        69.8M  52.9G   268M  /opt/zones/zone01
tank/zones/zone01-s10Aug2007             5.74G  52.9G  5.74G  /opt/zones/zone01-s10Aug2007
tank/zones/zone01-s10aug2...@s10set2009  3.75M      -  5.74G  -

I successfully destroyed the clone and snapshot as follows:

# zfs get origin
NAME                                     PROPERTY  VALUE                                    SOURCE
tank/zones/zone01                        origin    tank/zones/zone01-s10aug2...@s10set2009  -
tank/zones/zon...@s10set2009             origin    -                                        -
tank/zones/zone01-s10Aug2007             origin    tank/zones/zon...@s10set2009             -

# zfs promote tank/zones/zone01
# zfs destroy tank/zones/zone01-s10Aug2007
# zfs destroy tank/zones/zon...@s10set2009

At the end:

# zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
tank/zones/zone01   268M  59.7G   268M  /opt/zones/zone01
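For what it's worth, a quick sanity check (just a sketch) that nothing still
depends on the old BE datasets:

  # after the promote/destroy, no dataset under tank/zones should report an origin
  zfs get -r origin tank/zones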

Is it normal that after a LU the ZFS filesystem must be promoted
manually? Does it depend on the fact that, in case of a roll-back (booting the
system to the old ABE), the previous state is needed? I can't verify the
configuration in the old ABE any more: there the zone-path and ZFS filesystem
were "tank/zones/zone01-s10Aug2007" with
mountpoint "/opt/zones/zone01-s10Aug2007", and after two weeks of working
I destroyed the old ABE.

Thanks

Cesare
-- 

Mike Ditka  - "If God had wanted man to play soccer, he wouldn't have
given us arms." -
http://www.brainyquote.com/quotes/authors/m/mike_ditka.html
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ssd pool + ssd cache ?

2010-01-09 Thread Lutz Schumann
Depends. 

a) Pool design
5 x SSD as raidz = 4 SSDs' worth of space, but the read I/O performance of one drive.
Adding 5 cheap 40 GB L2ARC devices (which are pooled) increases the read 
performance for your working window of 200 GB.

If you have a pool of mirrors - adding L2ARC does not make sense.

b) SSD type 
If your devices are MLC, adding a ZIL makes sense. Watch out for drive 
qualification! (Honor the cache flush command.) 
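For reference, attaching cache and log devices is a one-liner each (the pool and
device names below are only placeholders):

  # add pooled L2ARC (cache) devices
  zpool add tank cache c2t0d0 c2t1d0

  # add a separate log (ZIL) device, ideally mirrored
  zpool add tank log mirror c3t0d0 c3t1d0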

Robert
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss