Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
On Tue, Feb 1, 2011 at 11:34 PM, Richard Elling
 wrote:
> There is a failure going on here.  It could be a cable or it could be a bad
> disk or firmware. The actual fault might not be in the disk reporting the 
> errors (!)
> It is not a media error.
>

Errors were as follows:
Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered  0x269213b01d700401
Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered  0x269213b01d700401
Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered  0x269213b01d700401
Feb 01 19:33:04.9969 ereport.io.scsi.cmd.disk.tran 0x269f99ef0b300401
Feb 01 19:33:04.9970 ereport.io.scsi.cmd.disk.tran 0x269f9a165a400401

Verbose of a message:
Feb 01 2011 19:33:04.996932283 ereport.io.scsi.cmd.disk.tran
nvlist version: 0
        class = ereport.io.scsi.cmd.disk.tran
        ena = 0x269f99ef0b300401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0
        (end detector)

        devid = id1,sd@n5000c50010ed6a31
        driver-assessment = fail
        op-code = 0x0
        cdb = 0x0 0x0 0x0 0x0 0x0 0x0
        pkt-reason = 0x18
        pkt-state = 0x1
        pkt-stats = 0x0
        __ttl = 0x1
        __tod = 0x4d48a640 0x3b6bfabb

It was a cable error, but why didn't fault management tell me about
it? And what do you mean by "The actual fault might not be in the disk
reporting the errors (!) It is not a media error."? Do you mean the
fault could be coming from my SATA controller (or somewhere else in the
path) rather than the disk itself?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Richard Elling
On Feb 1, 2011, at 6:49 PM, Krunal Desai wrote:

>> The output of fmdump is explicit. I am interested to know if you saw 
>> aborts and timeouts or some other errors.
> 
> I have the machine off atm while I install new disks (18x ST32000542AS), but 
> IIRC they appeared as transport errors (scsi..transport, I can 
> paste the exact errors in a little bit). A slew of transfer/soft errors 
> followed by the drive disappearing. I assume that my HBA took it offline, and 
> mpt driver reported that to the OS as an admin disconnecting, not as a 
> "failure" per se.

There is a failure going on here.  It could be a cable or it could be a bad
disk or firmware. The actual fault might not be in the disk reporting the 
errors (!)
It is not a media error.

> 
>> The open-source version of smartmontools seems to be slightly out
>> of date and somewhat finicky. Does anyone know of a better SMART
>> implementation?
> 
> That SUNWhd I mentioned seemed interesting, but I assume licensing means I 
> can only get that if I purchase Sun hardware.
> 
>> Nice idea, except that the X4500 was EOL years ago and the replacement,
>> X4540, uses LSI HBAs. I think you will find better Solaris support for the 
>> LSI
>> chipsets because Oracle's Sun products use them from the top (M9000) all
>> the way down the product line.
> 
> Oops, forgot that the X4500s are actually kind of "old". I'll have to look up 
> what LSI controllers the newer models are using (the LSI 2xx8 something IIRC? 
> Will have to Google).

No, they aren't that new.  The LSI 2008 are 6 Gbps HBAs and the older 1064/1068 
series are 3 Gbps.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
> The output of fmdump is explicit. I am interested to know if you saw 
> aborts and timeouts or some other errors.

I have the machine off atm while I install new disks (18x ST32000542AS), but 
IIRC they appeared as transport errors (scsi..transport, I can paste 
the exact errors in a little bit). A slew of transfer/soft errors followed by 
the drive disappearing. I assume that my HBA took it offline, and mpt driver 
reported that to the OS as an admin disconnecting, not as a "failure" per se.

> The open-source version of smartmontools seems to be slightly out
> of date and somewhat finicky. Does anyone know of a better SMART
> implementation?

That SUNWhd I mentioned seemed interesting, but I assume licensing means I can 
only get that if I purchase Sun hardware.

> Nice idea, except that the X4500 was EOL years ago and the replacement,
> X4540, uses LSI HBAs. I think you will find better Solaris support for the LSI
> chipsets because Oracle's Sun products use them from the top (M9000) all
> the way down the product line.

Oops, forgot that the X4500s are actually kind of "old". I'll have to look up 
what LSI controllers the newer models are using (the LSI 2xx8 something IIRC? 
Will have to Google).

--khd

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Richard Elling
On Feb 1, 2011, at 5:52 PM, Krunal Desai wrote:

> On Tue, Feb 1, 2011 at 6:11 PM, Cindy Swearingen
>  wrote:
>> I misspoke and should clarify:
>> 
>> 1. fmdump identifies fault reports that explain system issues
>> 
>> 2. fmdump -eV identifies errors or problem symptoms
> 
> Gotcha; fmdump -eV gives me the information I need. It appears to have
> been a loose cable. I'm now hitting the machine with some heavy I/O load;
> the pool resilvered itself and the drive has not dropped out.

The output of fmdump is explicit. I am interested to know if you saw 
aborts and timeouts or some other errors.

> 
> SMART status was reported healthy as well (got smartctl kind of
> working), but I cannot read the SMART data of my disks behind the
> 1068E due to limitations of smartmontools I guess. (e.g. 'smartctl -d
> scsi -a /dev/rdsk/c10t0d0' gives me serial #, model, and just a
> generic 'SMART Ok'). I assume that SUNWhd is licensed only for use on
> the X4500 Thumper and family? I'd like to see if it works with the
> 1068E.

The open-source version of smartmontools seems to be slightly out
of date and somewhat finicky. Does anyone know of a better SMART
implementation?

> 
> It's getting kind of tempting for me to investigate doing a run of
> boards that run Marvell 88SX6081s behind a PLX PCIe <-> PCI-X bridge.
> They should have beyond excellent support seeing as that is what the
> X4500 uses to run its SATA ports.

Nice idea, except that the X4500 was EOL years ago and the replacement,
X4540, uses LSI HBAs. I think you will find better Solaris support for the LSI
chipsets because Oracle's Sun products use them from the top (M9000) all
the way down the product line.
 -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] multiple disk failure (solved?)

2011-02-01 Thread Richard Elling
On Feb 1, 2011, at 5:56 AM, Mike Tancsa wrote:
> On 1/31/2011 4:19 PM, Mike Tancsa wrote:
>> On 1/31/2011 3:14 PM, Cindy Swearingen wrote:
>>> Hi Mike,
>>> 
>>> Yes, this is looking much better.
>>> 
>>> Some combination of removing corrupted files indicated in the zpool
>>> status -v output, running zpool scrub and then zpool clear should
>>> resolve the corruption, but it depends on how bad the corruption is.
>>> 
>>> First, I would try the least destructive method: try to remove the
>>> files listed below by using the rm command.
>>> 
>>> This entry probably means that the metadata is corrupted or some
>>> other file (like a temp file) no longer exists:
>>> 
>>> tank1/argus-data:<0xc6>
>> 
>> 
>> Hi Cindy,
>>  I removed the files that were listed, and now I am left with
>> 
>> errors: Permanent errors have been detected in the following files:
>> 
>>tank1/argus-data:<0xc5>
>>tank1/argus-data:<0xc6>
>>tank1/argus-data:<0xc7>
>> 
>> I have started a scrub
>> scrub: scrub in progress for 0h48m, 10.90% done, 6h35m to go
> 
> 
> Looks like that was it!  The scrub finished in the time it estimated and
> that was all I needed to do. I did not have to do zpool clear or any
> other commands.  Is there anything beyond scrub to check the integrity
> of the pool ?

That is exactly what scrub does. It validates all data on the disks.
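
For reference, a typical sequence would look something like this (using your
pool name; zdb is not a supported interface and can be slow on a live pool,
so treat it purely as a diagnostic aid):

  # zpool scrub tank1
  # zpool status -v tank1     # watch the "scrub:" line until it completes
  # zdb -c tank1              # traverse the pool and verify metadata checksums
                              # (specify -c twice to also verify data blocks)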


> 
> 0(offsite)# zpool status -v
>  pool: tank1
> state: ONLINE
> scrub: scrub completed after 7h32m with 0 errors on Mon Jan 31 23:00:46
> 2011
> config:
> 
>        NAME        STATE READ WRITE CKSUM
>        tank1       ONLINE   0 0 0
>          raidz1    ONLINE   0 0 0
>            ad0     ONLINE   0 0 0
>            ad1     ONLINE   0 0 0
>            ad4     ONLINE   0 0 0
>            ad6     ONLINE   0 0 0
>          raidz1    ONLINE   0 0 0
>            ada0    ONLINE   0 0 0
>            ada1    ONLINE   0 0 0
>            ada2    ONLINE   0 0 0
>            ada3    ONLINE   0 0 0
>          raidz1    ONLINE   0 0 0
>            ada5    ONLINE   0 0 0
>            ada8    ONLINE   0 0 0
>            ada7    ONLINE   0 0 0
>            ada6    ONLINE   0 0 0
> 
> errors: No known data errors

Congrats!
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
On Tue, Feb 1, 2011 at 6:11 PM, Cindy Swearingen
 wrote:
> I misspoke and should clarify:
>
> 1. fmdump identifies fault reports that explain system issues
>
> 2. fmdump -eV identifies errors or problem symptoms

Gotcha; fmdump -eV gives me the information I need. It appears to have
been a loose cable. I'm now hitting the machine with some heavy I/O load;
the pool resilvered itself and the drive has not dropped out.

SMART status was reported healthy as well (got smartctl kind of
working), but I cannot read the SMART data of my disks behind the
1068E due to limitations of smartmontools I guess. (e.g. 'smartctl -d
scsi -a /dev/rdsk/c10t0d0' gives me serial #, model, and just a
generic 'SMART Ok'). I assume that SUNWhd is licensed only for use on
the X4500 Thumper and family? I'd like to see if it works with the
1068E.
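
If the smartmontools build has SAT pass-through support, something like this
might coax full attributes out of SATA disks behind the HBA (same device path
as above; untested on the mpt stack, so no promises):

  # smartctl -d sat -a /dev/rdsk/c10t0d0
  # smartctl -d sat,12 -a /dev/rdsk/c10t0d0    # 12-byte pass-through, if the
                                               # plain form is rejected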

It's getting kind of tempting for me to investigate doing a run of
boards that run Marvell 88SX6081s behind a PLX PCIe <-> PCI-X bridge.
They should have beyond excellent support seeing as that is what the
X4500 uses to run its SATA ports.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Question regarding "zfs snapshot -r"

2011-02-01 Thread Eric D. Mudama

On Tue, Feb  1 at 10:54, Rahul Deb wrote:

  Hello All,
  I have two questions related to "zfs snapshot -r"
  1. When "zfs snapshot -r tank@today" command is issued, does it creates
  snapshots for all the�descendent file systems at the same moment? I mean
  to say if the command is issued at 10:20:35 PM, does the creation time of
  all the snapshots for descendent file systems are same?
  2. Say, tank has around 5000 descendent file systems and "zfs snapshot -r
  tank@today" takes around 10 seconds to complete. If there is a new file
  systems created under tank within that 10 seconds period, does that
  snapshot process includes the new file system created within that 10
  seconds?
  OR it will exclude that newly created filesystem?
  Thanks,
  -- Rahul


I believe the contract is that the contents of all recursive snapshots
are consistent with the instant in time at which the snapshot command
was executed.

Quoting from the ZFS Administration Guide:

 Recursive ZFS snapshots are created quickly as one atomic
 operation. The snapshots are created together (all at once) or not
 created at all. The benefit of such an operation is that the snapshot
 data is always taken at one consistent time, even across descendent
 file systems.

Therefore, in #2 above, the snapshot wouldn't include the newly created
descendent file system, because it was created after the moment in
time when the snapshot was initiated.

In #1 above, I would guess the snapshot time is the time of the
initial command across all file systems in the tree, even if it takes
10 seconds to actually complete the command.  However, I have no such
system on which I can prove this guess correct or incorrect.
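
If someone wants to check, a quick test along these lines should show whether
the creation times line up (pool and snapshot names are just placeholders;
the creation property only has one-second granularity):

  # zfs snapshot -r tank@today
  # zfs get -r -o name,value creation tank | grep @today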

--eric

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Cindy Swearingen

I misspoke and should clarify:

1. fmdump identifies fault reports that explain system issues

2. fmdump -eV identifies errors or problem symptoms
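
In practice that looks something like this (the -t option narrows the time
window; see fmdump(1M) for the accepted time formats):

  # fmdump                   # fault log: diagnosed problems
  # fmdump -eV               # error log: raw ereports, verbose
  # fmdump -e -t 01Feb2011   # only ereports since February 1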

I'm unclear about your REMOVED status. I don't see it very often.

The ZFS Admin Guide says:

REMOVED

The device was physically removed while the system was running. Device 
removal detection is hardware-dependent and might not be supported on 
all platforms.


I need to check if FMA generally reports on devices that are REMOVED
by the administrator, as ZFS seems to think in this case.
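
If it does turn out to be a dropped cable, reseating it and then running
something like the following should bring the device back and resilver:

  # zpool online tank c10t3d0
  # zpool status tank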

Thanks,

Cindy



On 02/01/11 15:47, Krunal Desai wrote:

On Tue, Feb 1, 2011 at 1:29 PM, Cindy Swearingen
 wrote:

I agree that we need to get email updates for failing devices.


Definitely!


See if fmdump generated an error report using the commands below.


Unfortunately not, see below:

movax@megatron:/root# fmdump
TIME UUID SUNW-MSG-ID EVENT
fmdump: warning: /var/fm/fmd/fltlog is empty

--khd

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
On Tue, Feb 1, 2011 at 1:29 PM, Cindy Swearingen
 wrote:
> I agree that we need to get email updates for failing devices.

Definitely!

> See if fmdump generated an error report using the commands below.

Unfortunately not, see below:

movax@megatron:/root# fmdump
TIME UUID SUNW-MSG-ID EVENT
fmdump: warning: /var/fm/fmd/fltlog is empty

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Question regarding "zfs snapshot -r"

2011-02-01 Thread Rahul Deb
Hello All,

I have two questions related to "zfs snapshot -r"

1. When "zfs snapshot -r tank@today" command is issued, does it creates
snapshots for all the descendent file systems at the same moment? I mean to
say if the command is issued at 10:20:35 PM, does the creation time of all
the snapshots for descendent file systems are same?

2. Say, tank has around 5000 descendent file systems and "zfs snapshot -r
tank@today" takes around 10 seconds to complete. If there is a new file
systems created under tank within that 10 seconds period, does that snapshot
process includes the new file system created within that 10 seconds?

OR it will exclude that newly created filesystem?

Thanks,

-- Rahul
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Cindy Swearingen

Hi Krunal,

It looks to me like FMA thinks that you removed the disk so you'll need
to confirm whether the cable dropped or something else.

I agree that we need to get email updates for failing devices.
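
In the meantime, a cron job is a common workaround; a minimal sketch (the
address is a placeholder, and this only catches pool-level state, not every
FMA event):

  0 * * * * /usr/sbin/zpool status -x | /usr/bin/grep -v 'all pools are healthy' | /usr/bin/mailx -E -s 'zpool alert' you@example.com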

See if fmdump generated an error report using the commands below.

Thanks,

Cindy

# fmdump
TIME UUID SUNW-MSG-ID EVENT

Jan 07 14:01:14.7839 04ee736a-b2cb-612f-ce5e-a0e43d666762 ZFS-8000-GH Diagnosed
Jan 13 10:34:32.2301 04ee736a-b2cb-612f-ce5e-a0e43d666762 FMD-8000-58 Updated


Then, review the contents:

fmdump -u 04ee736a-b2cb-612f-ce5e-a0e43d666762 -v
TIME UUID SUNW-MSG-ID EVENT
Jan 07 14:01:14.7839 04ee736a-b2cb-612f-ce5e-a0e43d666762 ZFS-8000-GH Diagnosed
  100%  fault.fs.zfs.vdev.checksum

        Problem in: zfs://pool=c4538d8607c1e030/vdev=7954b2ff7a8383
           Affects: zfs://pool=c4538d8607c1e030/vdev=7954b2ff7a8383
               FRU: -
          Location: -

Jan 13 10:34:32.2301 04ee736a-b2cb-612f-ce5e-a0e43d666762 FMD-8000-58 Updated
  100%  fault.fs.zfs.vdev.checksum

        Problem in: zfs://pool=c4538d8607c1e030/vdev=7954b2ff7a8383
           Affects: zfs://pool=c4538d8607c1e030/vdev=7954b2ff7a8383
               FRU: -
          Location: -

Thanks,

Cindy



On 02/01/11 09:55, Krunal Desai wrote:

I recently discovered a drive failure (either that or a loose cable, I
need to investigate further) on my home fileserver. 'fmadm faulty'
returns no output, but I can clearly see a failure when I do zpool
status -v:

pool: tank
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub canceled on Tue Feb  1 11:51:58 2011
config:

NAME STATE READ WRITE CKSUM
tank DEGRADED 0 0 0
  raidz2-0   DEGRADED 0 0 0
c10t0d0  ONLINE   0 0 0
c10t1d0  ONLINE   0 0 0
c10t2d0  ONLINE   0 0 0
c10t3d0  REMOVED  0 0 0
c10t4d0  ONLINE   0 0 0
c10t5d0  ONLINE   0 0 0
c10t6d0  ONLINE   0 0 0
c10t7d0  ONLINE   0 0 0

In dmesg, I see:
Feb  1 11:14:33 megatron scsi: [ID 107833 kern.warning] WARNING:
/pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0 (sd8):
Feb  1 11:14:33 megatron        Command failed to complete...Device is gone

I never had any problems with these drives + mpt in snv_134 (on snv_151a
now); the only change was adding a second 1068E-IT that's currently
unpopulated with drives. But more importantly, why can't I see this
failure in fmadm, and how would I go about setting up automatic e-mail
notification when something like this happens? Is a pool going degraded
not considered a failure?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
I recently discovered a drive failure (either that or a loose cable, I
need to investigate further) on my home fileserver. 'fmadm faulty'
returns no output, but I can clearly see a failure when I do zpool
status -v:

pool: tank
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub canceled on Tue Feb  1 11:51:58 2011
config:

NAME STATE READ WRITE CKSUM
tank DEGRADED 0 0 0
  raidz2-0   DEGRADED 0 0 0
c10t0d0  ONLINE   0 0 0
c10t1d0  ONLINE   0 0 0
c10t2d0  ONLINE   0 0 0
c10t3d0  REMOVED  0 0 0
c10t4d0  ONLINE   0 0 0
c10t5d0  ONLINE   0 0 0
c10t6d0  ONLINE   0 0 0
c10t7d0  ONLINE   0 0 0

In dmesg, I see:
Feb  1 11:14:33 megatron scsi: [ID 107833 kern.warning] WARNING:
/pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0 (sd8):
Feb  1 11:14:33 megatron        Command failed to complete...Device is gone

I never had any problems with these drives + mpt in snv_134 (on snv_151a
now); the only change was adding a second 1068E-IT that's currently
unpopulated with drives. But more importantly, why can't I see this
failure in fmadm, and how would I go about setting up automatic e-mail
notification when something like this happens? Is a pool going degraded
not considered a failure?

-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] multiple disk failure (solved?)

2011-02-01 Thread Cindy Swearingen

Excellent.

I think you are good for now as long as your hardware setup is stable.

You survived a severe hardware failure so say a prayer and make sure
this doesn't happen again. Always have good backups.

Thanks,

Cindy

On 02/01/11 06:56, Mike Tancsa wrote:

On 1/31/2011 4:19 PM, Mike Tancsa wrote:

On 1/31/2011 3:14 PM, Cindy Swearingen wrote:

Hi Mike,

Yes, this is looking much better.

Some combination of removing corrupted files indicated in the zpool
status -v output, running zpool scrub and then zpool clear should
resolve the corruption, but it depends on how bad the corruption is.

First, I would try the least destructive method: try to remove the
files listed below by using the rm command.

This entry probably means that the metadata is corrupted or some
other file (like a temp file) no longer exists:

tank1/argus-data:<0xc6>


Hi Cindy,
I removed the files that were listed, and now I am left with

errors: Permanent errors have been detected in the following files:

tank1/argus-data:<0xc5>
tank1/argus-data:<0xc6>
tank1/argus-data:<0xc7>

I have started a scrub
 scrub: scrub in progress for 0h48m, 10.90% done, 6h35m to go



Looks like that was it!  The scrub finished in the time it estimated and
that was all I needed to do. I did not have to do zpool clear or any
other commands.  Is there anything beyond scrub to check the integrity
of the pool ?

0(offsite)# zpool status -v
  pool: tank1
 state: ONLINE
 scrub: scrub completed after 7h32m with 0 errors on Mon Jan 31 23:00:46
2011
config:

        NAME        STATE READ WRITE CKSUM
        tank1       ONLINE   0 0 0
          raidz1    ONLINE   0 0 0
            ad0     ONLINE   0 0 0
            ad1     ONLINE   0 0 0
            ad4     ONLINE   0 0 0
            ad6     ONLINE   0 0 0
          raidz1    ONLINE   0 0 0
            ada0    ONLINE   0 0 0
            ada1    ONLINE   0 0 0
            ada2    ONLINE   0 0 0
            ada3    ONLINE   0 0 0
          raidz1    ONLINE   0 0 0
            ada5    ONLINE   0 0 0
            ada8    ONLINE   0 0 0
            ada7    ONLINE   0 0 0
            ada6    ONLINE   0 0 0

errors: No known data errors
0(offsite)#


---Mike

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup success stories?

2011-02-01 Thread Oyvind Syljuasen
> > Dedup is *hungry* for RAM. 8GB is not enough for your configuration,
> > most likely! First guess: double the RAM and then you might have
> > better luck.
> 
> I know... that's why I use L2ARC

What is zdb -D showing?

Does this give you any clue:
http://blogs.sun.com/roch/entry/dedup_performance_considerations1
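
For example (pool name is a placeholder; -D prints the DDT summary, -DD adds
a histogram, and -S simulates dedup on a pool that does not have it enabled):

  # zdb -D tank
  # zdb -DD tank
  # zdb -S tank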

br,
syljua
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] multiple disk failure (solved?)

2011-02-01 Thread Mike Tancsa
On 1/31/2011 4:19 PM, Mike Tancsa wrote:
> On 1/31/2011 3:14 PM, Cindy Swearingen wrote:
>> Hi Mike,
>>
>> Yes, this is looking much better.
>>
>> Some combination of removing corrupted files indicated in the zpool
>> status -v output, running zpool scrub and then zpool clear should
>> resolve the corruption, but it depends on how bad the corruption is.
>>
>> First, I would try the least destructive method: try to remove the
>> files listed below by using the rm command.
>>
>> This entry probably means that the metadata is corrupted or some
>> other file (like a temp file) no longer exists:
>>
>> tank1/argus-data:<0xc6>
> 
> 
> Hi Cindy,
>   I removed the files that were listed, and now I am left with
> 
> errors: Permanent errors have been detected in the following files:
> 
> tank1/argus-data:<0xc5>
> tank1/argus-data:<0xc6>
> tank1/argus-data:<0xc7>
> 
> I have started a scrub
>  scrub: scrub in progress for 0h48m, 10.90% done, 6h35m to go


Looks like that was it!  The scrub finished in the time it estimated and
that was all I needed to do. I did not have to do zpool clear or any
other commands.  Is there anything beyond scrub to check the integrity
of the pool ?

0(offsite)# zpool status -v
  pool: tank1
 state: ONLINE
 scrub: scrub completed after 7h32m with 0 errors on Mon Jan 31 23:00:46
2011
config:

        NAME        STATE READ WRITE CKSUM
        tank1       ONLINE   0 0 0
          raidz1    ONLINE   0 0 0
            ad0     ONLINE   0 0 0
            ad1     ONLINE   0 0 0
            ad4     ONLINE   0 0 0
            ad6     ONLINE   0 0 0
          raidz1    ONLINE   0 0 0
            ada0    ONLINE   0 0 0
            ada1    ONLINE   0 0 0
            ada2    ONLINE   0 0 0
            ada3    ONLINE   0 0 0
          raidz1    ONLINE   0 0 0
            ada5    ONLINE   0 0 0
            ada8    ONLINE   0 0 0
            ada7    ONLINE   0 0 0
            ada6    ONLINE   0 0 0

errors: No known data errors
0(offsite)#


---Mike
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and L2ARC memory requirements?

2011-02-01 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
> 
> > Even *with* an L2ARC, your memory requirements are *substantial*,
> > because the L2ARC itself needs RAM. 8 GB is simply inadequate for your
> > test.
> 
> With 50TB storage, and 1TB if L2ARC, with  no dedup, what amount of ARC
> would you would you recommeend?

Without dedup and without L2ARC, the amount of ram you require is unrelated to 
the amount of storage you have.  Your ram requirement depends on what applications 
you run.  Any excess ram you have will be used for ARC (that is, the L1 ARC) and 
therefore benefits performance.  So excess ram is always good.

Do not be a cheapskate with ram.  Regardless of whether you use ZFS, or any 
other filesystem, or any other OS, even windows or linux.  Excess ram is always 
a good thing.  It always improves stability and improves performance.

If you are using a laptop and not serving anything and performance is not a 
major concern and you're free to reboot whenever you want, then you can survive 
on 2G of ram.  But a server presumably DOES stuff and you don't want to reboot 
frequently.  I'd recommend 4G minimally, 8G standard, and if you run any 
applications (databases, web servers, symantec products) then add more.  And if 
you use dedup, or l2arc, then add more.


> And then, _with_ dedup, what would you recommemend?

If you have dedup enabled, add slightly under 3G ram for every 1TB unique data 
in your pool on top of whatever you've selected for your base ram 
configuration. 
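
Rough back-of-the-envelope, using the commonly quoted figure of about 320
bytes of in-core DDT per unique block and the default 128K recordsize:

  1 TB / 128 KB per block   = ~8 million unique blocks
  8 million x ~320 bytes    = ~2.5 GB of dedup table

which is where the "slightly under 3G per TB" rule of thumb comes from.
Smaller record sizes (or zvols with a small volblocksize) push that number up
quickly.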

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and TRIM

2011-02-01 Thread Garrett D'Amore

On 01/31/11 01:09 PM, Pasi Kärkkäinen wrote:
> On Mon, Jan 31, 2011 at 03:41:52PM +0100, Joerg Schilling wrote:
>> Brandon High  wrote:
>>> On Sat, Jan 29, 2011 at 8:31 AM, Edward Ned Harvey  wrote:
>>>> What is the status of ZFS support for TRIM?
>>>
>>> I believe it's been supported for a while now.
>>> http://www.c0t0d0s0.org/archives/6792-SATA-TRIM-support-in-Opensolaris.html
>>
>> The command is implemented in the sata driver but there does not seem to be
>> any user of the code.
>
> Btw is the SCSI equivalent also implemented? iirc it was called SCSI UNMAP
> (for SAS).

No.

 - Garrett

> -- Pasi


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup success stories (take two)

2011-02-01 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
> 
> Sorry about the initial post - it was wrong. The hardware configuration was
> right, but for initial tests, I use NFS, meaning sync writes. This obviously
> stresses the ARC/L2ARC more than async writes, but the result remains the
> same.

I'm sorry, that's not correct.  L2ARC is a read cache.  ZIL is used for sync 
writes.  ZIL always exists.   If there is no dedicated ZIL log device, then 
blocks are used for ZIL in the main storage pool.
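
For completeness, the two kinds of auxiliary devices are added separately
(device names are placeholders):

  # zpool add tank log c7t0d0      # dedicated ZIL (slog): helps sync writes
  # zpool add tank cache c7t1d0    # L2ARC: helps reads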

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup success stories?

2011-02-01 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
> 
> > Dedup is *hungry* for RAM. 8GB is not enough for your configuration,
> > most likely! First guess: double the RAM and then you might have
> > better
> > luck.
> 
> I know... that's why I use L2ARC

l2arc is not a substitute for ram.  In some cases it can improve disk 
performance in the absence of ram, but it cannot be used by in-memory 
applications or the kernel.

At best, what you're describing would be swap space on a SSD.  Swap space is a 
substitute for ram.  Be aware that SSD performance is 1/100th the performance 
of ram (or worse.)
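
If you really wanted to try that, on Solaris it would be something along these
lines (size and names are placeholders, and again, it is no substitute for
real ram):

  # zfs create -V 16G rpool/swap2
  # swap -a /dev/zvol/dsk/rpool/swap2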

Garrett is right.  Add more ram, if it is physically possible.  And if it is 
not physically possible, think long and hard about upgrading your server so you 
can add more ram.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-01 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of James
> 
> I’m trying to select the appropriate disk spindle speed for a proposal and
> would welcome any experience and opinions  (e.g. has anyone actively
> chosen 10k/15k drives for a new ZFS build and, if so, why?).

There is nothing special about ZFS in relation to spindle speed.  If you get 
higher rpm's, then you get higher iops, and the same is true for EXT3, NTFS, 
HFS+, ZFS, etc.

One characteristic people often overlook is:  When you get a disk with higher 
capacity (say, 2T versus 600G) then you get more empty space and hence 
typically lower fragmentation in the drive.  Also, the platter density is 
typically higher, so if the two drives have equal RPM's, typically the higher 
capacity drive can perform faster sustained sequential operations.

Even if you use slow drives, assuming you have them in some sort of raid 
configuration, they quickly add up sequential speed to reach the bus speed.  So 
if you expect to do large sequential operations, go for the lower rpm disks.  
But if you expect to do lots of small operations, then twice the rpm's 
literally means twice the performance.  So for small random operations, go for 
the higher rpm disks.
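
Ballpark math, assuming ~8.5 ms average seek for a 7200 rpm drive and ~3.5 ms
for a 15k drive:

  7200 rpm:  60/(2*7200)  = ~4.2 ms rotational + ~8.5 ms seek = ~12.7 ms, or ~80 random IOPS
  15k rpm:   60/(2*15000) = ~2.0 ms rotational + ~3.5 ms seek = ~5.5 ms, or ~180 random IOPS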


> ** My understanding is that ZFS will adjust the amount of data accepted into
> each “transaction” (TXG) to ensure it can be written to disk in 5s. Async data
> will stay in ARC; sync data will also go to the ZIL or, if over threshold, will
> go to disk with a pointer to it in the ZIL (on a low-latency SLOG), i.e. all
> writes apart from sync writes.

ZFS will aggregate small random writes into larger sequential writes.  So you 
don't have to worry too much about rpm's and iops during writes.  But of course 
there's nothing you can do about the random reads.  So if you do random reads, 
you do indeed want higher rpm's.

Your understanding (or terminology) of arc is not correct.  Arc and l2arc are 
read cache.  The terminology for the context you're describing would be the 
write buffer.  Async writes will be stored in the ram write buffer and 
optimized for sequential disk blocks before writing to disk.  Whenever there 
are sync writes, they will be written to the ZIL (hopefully you have a 
dedicated ZIL log device) immediately, and then they will join the write buffer 
with all the other async writes.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss