On Apr 7, 2014, at 10:20 AM, Jakob Borg <[email protected]> wrote:
> 2014-04-07 18:13 GMT+02:00 Evan Rowley <[email protected]>:
> >
> > I'd like to measure the total amount of data written over time to the
> > physical devices that make up zpool zils, l2arcs, and vdevs. Each one of
> > these physical devices has a projected Mean Time Before Failure (MTBF),
> > often measured in writes or data written to the device, which I'd like to
> > compare with the total number of writes to the device.
> >
> > The "zpool iostat" command can do this for a given period of time, but
> > unless I'm mistaken, it can't do this continually for an unpredictable
> > amount of time.
> >
> > I'm imagining that something like this can be done using dtrace, but I'm
> > not completely certain if that is the best way to tackle the problem.
> >
> > Are there any other ways to do this? (before I go re-inventing the wheel)
>
> Many SSDs keep SMART-accessible counters for data read and written;
Wow, that is revisionist history :-)
SCSI devices reported this info in the read/write logs long before SMART (circa
2004) existed.
>
> [root@anto ~]# /opt/smartmontools/sbin/smartctl -d sat,12 -A /dev/rdsk/c3t1d0p0
> ...
> ID# ATTRIBUTE_NAME     FLAG    VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
> ...
> 241 Host_Writes_32MiB  0x0032  100   100   000    Old_age  Always  -           355603
> 242 Host_Reads_32MiB   0x0032  100   100   000    Old_age  Always  -           64535
>
> Haven't seen the same on any spinning disks...
Try something like:
sg_logs -a /dev/rdsk/c0t5000C500476C5B37d0
SEAGATE ST3300657SS 0008
Supported log pages (spc-2) [0x0]:
0x00 Supported log pages
0x02 Error counters (write)
0x03 Error counters (read)
0x05 Error counters (verify)
0x06 Non-medium errors
0x0d Temperature
0x10 Self-test results
0x15 Background scan results (sbc-3)
0x18 Protocol specific port
0x37 Cache (Seagate), Miscellaneous (Hitachi)
0x38 [unknown vendor specific page code]
0x3e Factory (Seagate/Hitachi)
Write error counter page (spc-3) [0x2]
Errors corrected with possible delays = 0
Total rewrites or rereads = 0
Total errors corrected = 0
Total times correction algorithm processed = 0
Total bytes processed = 60158093824
Total uncorrected errors = 0
^^^^ look for zeros in the error counters, especially errors corrected with
possible delays
Read error counter page (spc-3) [0x3]
Errors corrected without substantial delay = 258780
Errors corrected with possible delays = 0
Total rewrites or rereads = 0
Total errors corrected = 258780
Total times correction algorithm processed = 258780
Total bytes processed = 22475677696
Total uncorrected errors = 0
Verify error counter page (spc-3) [0x5]
Errors corrected without substantial delay = 0
Errors corrected with possible delays = 0
Total rewrites or rereads = 0
Total errors corrected = 0
Total times correction algorithm processed = 0
Total bytes processed = 0
Total uncorrected errors = 0
^^^^ rarely do we see verify enabled... too slow
Non-medium error page (spc-2) [0x6]
Non-medium error count = 1
^^^^ rare to discover what these actually reveal... usually need to get access
to private data
Temperature page (spc-3) [0xd]
Current temperature = 39 C
Reference temperature = 68 C
Self-test results page (spc-3) [0x10]
Background scan results page (sbc-3) [0x15]
Status parameters:
Accumulated power on minutes: 542377 [h:m 9039:37]
^^^^ here is the POH, useful for MTBF-related reliability calculations
Status: background scan enabled, none active (waiting for BMS interval
timer to expire)
Number of background scans performed: 139
Background medium scan progress: 0.00%
Number of background medium scans performed: 1203
Protocol Specific port page for SAS SSP (sas-2) [0x18]
relative target port id = 1
generation code = 0
number of phys = 1
phy identifier = 0
attached device type: expander device
attached reason: SMP phy control function
reason: power on
negotiated logical link rate: 6 Gbps
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=1
SAS address = 0x5000c500476c5b35
attached SAS address = 0x50030480009b7a3f
attached phy identifier = 21
Invalid DWORD count = 0
Running disparity error count = 0
Loss of DWORD synchronization = 0
Phy reset problem = 0
^^^^ zeros here are good, too... no running disparity errors means good cabling
Phy event descriptors:
Invalid word count: 0
Running disparity error count: 0
Loss of dword synchronization count: 0
Phy reset problem count: 0
relative target port id = 2
generation code = 0
number of phys = 1
phy identifier = 1
attached device type: no device attached
attached reason: unknown
reason: unknown
negotiated logical link rate: 1.5 Gbps
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=0
SAS address = 0x5000c500476c5b36
attached SAS address = 0x0
attached phy identifier = 0
Invalid DWORD count = 0
Running disparity error count = 0
Loss of DWORD synchronization = 0
Phy reset problem = 0
Phy event descriptors:
Invalid word count: 0
Running disparity error count: 0
Loss of dword synchronization count: 0
Phy reset problem count: 0
Seagate cache page [0x37]
Blocks sent to initiator = 43897808
Blocks received from initiator = 111026739
Blocks read from cache and sent to initiator = 4654807
Number of read and write commands whose size <= segment size = 1133705
Number of read and write commands whose size > segment size = 2
No ascii information for page = 0x38, here is hex:
00 38 00 01 b4 00 00 03 d6 00 00 00 10 13 eb 00 08
10 46 a9 13 ae 00 00 00 00 13 fb 00 00 00 00 13 ee
20 00 08 46 a9 13 e9 00 07 d1 db 13 e9 00 07 d1 db
30 13 f3 00 00 00 00 13 da 00 00 00 00 13 e2 00 00
..... [truncated after 64 of 440 bytes (use '-H' to see the rest)]
Seagate/Hitachi factory page [0x3e]
number of hours powered up = 9039.62
^^^^ another counter for POH
number of minutes until next internal SMART test = 14
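
To pull the interesting counters out of output like the above without reading it by eye, something along these lines might work. This is only a sketch: the function name parse_sg_logs and the SAMPLE text are made up for illustration, the parsing assumes the text layout shown in this message (other sg3_utils versions or drives may print things differently), and whether "Total bytes processed" on the write page matches host data written is drive-dependent.

```python
import re

# Page headers in the output end with the page code, e.g. "[0x2]".
PAGE_HEADER = re.compile(r"\[0x([0-9a-fA-F]+)\]:?\s*$")
BYTES_LINE = re.compile(r"Total bytes processed\s*=\s*(\d+)")
POH_LINE = re.compile(r"Accumulated power on minutes:\s*(\d+)")


def parse_sg_logs(text):
    """Extract byte counters and power-on minutes from `sg_logs -a` text.

    Assumes the layout shown in this message; hypothetical helper.
    """
    counters = {}
    page = None
    for raw in text.splitlines():
        line = raw.strip()
        header = PAGE_HEADER.search(line)
        if header:
            page = int(header.group(1), 16)
            continue
        found = BYTES_LINE.search(line)
        if found and page == 0x02:      # write error counter page
            counters["bytes_written"] = int(found.group(1))
        elif found and page == 0x03:    # read error counter page
            counters["bytes_read"] = int(found.group(1))
        found = POH_LINE.search(line)
        if found:
            counters["power_on_minutes"] = int(found.group(1))
    return counters


# A few lines copied from the output above, used as sample input.
SAMPLE = """\
Write error counter page (spc-3) [0x2]
Total bytes processed = 60158093824
Read error counter page (spc-3) [0x3]
Total bytes processed = 22475677696
Background scan results page (sbc-3) [0x15]
Accumulated power on minutes: 542377 [h:m 9039:37]
"""

if __name__ == "__main__":
    c = parse_sg_logs(SAMPLE)
    hours = c["power_on_minutes"] / 60
    print("GiB written:", c["bytes_written"] / 2**30)
    print("power-on hours:", hours)
    print("bytes written per power-on hour:", c["bytes_written"] / hours)
```

In practice the text would come from running sg_logs against the device rather than from a pasted sample.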
sg3_utils compiles nicely on SmartOS, as do several of the other commonly
used tools for decoding VPD, log, and mode pages. You might find sdparm to
be more user friendly (actually, you might find a rock to be more user
friendly than sg3_utils :). But if you really want user-unfriendly, you can
get VPD pages from format(1m).
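
For the original question (data written over an open-ended period), one approach is to sample a cumulative counter now and then and take the difference. The sketch below assumes the counter only grows between samples (a reset or a wrapped counter would break that), and assumes the SMART raw value for Host_Writes_32MiB is in units of 32 MiB, as the attribute name suggests. The function names and the sample numbers in the usage lines are invented for illustration.

```python
MIB = 1024 * 1024


def smart_32mib_units_to_bytes(raw_value):
    """Convert a raw value counted in 32 MiB units to bytes.

    Assumption: the unit really is 32 MiB, as the attribute name suggests.
    """
    return raw_value * 32 * MIB


def summarize(samples):
    """Summarize cumulative-counter samples.

    samples: list of (unix_time_seconds, cumulative_bytes), in time order.
    Returns (bytes_written, average_bytes_per_second) over the window.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if b1 < b0:
        raise ValueError("counter went backwards; was it reset?")
    written = b1 - b0
    return written, written / (t1 - t0)


if __name__ == "__main__":
    # Made-up samples one hour apart, starting from the counter value above.
    samples = [(0, 60158093824), (3600, 60158093824 + 3600000)]
    written, rate = summarize(samples)
    print("bytes written in window:", written)
    print("average bytes/second:", rate)
    print("SMART raw 355603 in bytes:", smart_32mib_units_to_bytes(355603))
```

Storing the samples somewhere durable (a file, a metrics system) is left out here; the point is only that cumulative counters make the bookkeeping simple.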
-- richard
--
[email protected]
+1-760-896-4422