[zfs-discuss] Unknown Space Gain

2010-10-20 Thread Krunal Desai
Hi all,

I've got an interesting (I think) thing happening with my storage pool
(tank, 8x1.5TB RAID-Z2)...namely that I seem to gain free-space
without deleting files. I noticed this happening awhile ago, so I set
up a cron script that ran every night and does:

pfexec ls -alR /tank > /export/home/movax/fslog/`date +%F`.contents
pfexec /usr/sbin/zfs list > /export/home/movax/fslog/`date +%F`.zfs

so I could see if files are disappearing. Well, between 10/14 and
10/15, I went from:
tank 8.02T  1.64G  44.1M  /tank

to:
tank 8.01T  14.1G  44.1M  /tank

The only file changes that showed up in a diff are two file
modification date display changes (Apr 17 23:13 to Apr 17  2010,
whatever no big deal), both to Thumbs.db. So, there doesn't *seem* to
be any data loss (this is my personal fileserver, all other users have
read-only access)...and I don't recall a ZFS garbage-collect or
similar mechanism.
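For reference, the cron entries behind that logging are nothing fancy; roughly
something like this in root's crontab (the 03:00 schedule is just an example,
and note that % has to be escaped as \% inside crontab entries):

  0 3 * * * ls -alR /tank > /export/home/movax/fslog/`date +\%F`.contents 2>&1
  0 3 * * * /usr/sbin/zfs list > /export/home/movax/fslog/`date +\%F`.zfs 2>&1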

Here are the options set on tank:
NAME  PROPERTY               VALUE                  SOURCE
tank  type                   filesystem             -
tank  creation               Tue Feb 24  4:20 2009  -
tank  used                   8.01T                  -
tank  available              14.1G                  -
tank  referenced             44.1M                  -
tank  compressratio          1.00x                  -
tank  mounted                yes                    -
tank  quota                  none                   default
tank  reservation            none                   default
tank  recordsize             128K                   default
tank  mountpoint             /tank                  default
tank  sharenfs               off                    default
tank  checksum               on                     default
tank  compression            off                    default
tank  atime                  on                     default
tank  devices                on                     default
tank  exec                   on                     default
tank  setuid                 on                     default
tank  readonly               off                    default
tank  zoned                  off                    default
tank  snapdir                hidden                 local
tank  aclmode                groupmask              default
tank  aclinherit             restricted             local
tank  canmount               on                     default
tank  shareiscsi             off                    default
tank  xattr                  on                     default
tank  copies                 1                      default
tank  version                4                      -
tank  utf8only               off                    -
tank  normalization          none                   -
tank  casesensitivity        sensitive              -
tank  vscan                  off                    default
tank  nbmand                 off                    default
tank  sharesmb               off                    local
tank  refquota               none                   default
tank  refreservation         none                   default
tank  primarycache           all                    default
tank  secondarycache         all                    default
tank  usedbysnapshots        87.5K                  -
tank  usedbydataset          44.1M                  -
tank  usedbychildren         8.01T                  -
tank  usedbyrefreservation   0                      -
tank  logbias                latency                default
tank  dedup                  off                    local
tank  mlslabel               none                   default
tank  com.sun:auto-snapshot  true                   local

I don't utilize snapshots (this machine just stores media)...so what
could be up?

-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Unknown Space Gain

2010-10-20 Thread Krunal Desai
Huh, I don't actually ever recall enabling that. Perhaps that is
connected to the message I started getting every minute recently in
the kernel buffer,

Oct 20 12:20:49 megatron pcplusmp: [ID 805372 kern.info] pcplusmp: ide
(ata) instance 3 irq 0xf vector 0x45 ioapic 0x2 intin 0xf is bound to
cpu 0
Oct 20 12:21:49 megatron pcplusmp: [ID 805372 kern.info] pcplusmp: ide
(ata) instance 3 irq 0xf vector 0x45 ioapic 0x2 intin 0xf is bound to
cpu 1

I just disabled it (zfs set com.sun\:auto-snapshot=false tank,
correct?), will see if the log messages disappear. Did the filesystem
kill off some snapshots or something in an effort to free up space?
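(For what it's worth, a quick sanity check that the property change actually
took, using plain zfs get:

  zfs get com.sun:auto-snapshot tank

should now show "false" with a source of "local".)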

On Wed, Oct 20, 2010 at 12:26 PM,  casper@sun.com wrote:


tank  com.sun:auto-snapshot  true                   local

I don't utilize snapshots (this machine just stores media)...so what
could be up?


 You've also disabled the time-slider functionality?  (automatic snapshots)

 Casper





-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Unknown Space Gain

2010-10-20 Thread Krunal Desai
Argh, yes, lots of snapshots sitting around...apparently time-slider
got activated somehow awhile back. Disabled the services and am now
cleaning out the snapshots!
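For the record, this is roughly how I'm cleaning them out (a sketch; I'm
double-checking that the grep only matches the auto-snapshots before
destroying anything):

  zfs list -t snapshot | grep @zfs-auto
  for snap in `zfs list -H -o name -t snapshot | grep @zfs-auto`; do
          pfexec zfs destroy $snap
  done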

On Wed, Oct 20, 2010 at 12:41 PM, Tomas Ögren st...@acc.umu.se wrote:
 On 20 October, 2010 - Krunal Desai sent me these 1,5K bytes:

 Huh, I don't actually ever recall enabling that. Perhaps that is
 connected to the message I started getting every minute recently in
 the kernel buffer,

 Oct 20 12:20:49 megatron pcplusmp: [ID 805372 kern.info] pcplusmp: ide
 (ata) instance 3 irq 0xf vector 0x45 ioapic 0x2 intin 0xf is bound to
 cpu 0
 Oct 20 12:21:49 megatron pcplusmp: [ID 805372 kern.info] pcplusmp: ide
 (ata) instance 3 irq 0xf vector 0x45 ioapic 0x2 intin 0xf is bound to
 cpu 1

 I just disabled it (zfs set com.sun\:auto-snapshot=false tank,
 correct?), will see if the log messages disappear. Did the filesystem
 kill off some snapshots or something in an effort to free up space?

 Probably.

 zfs list -t all   to see all the snapshots as well.

 /Tomas
 --
 Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
 |- Student at Computing Science, University of Umeå
 `- Sysadmin at {cs,acc}.umu.se




-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Unknown Space Gain

2010-10-20 Thread Krunal Desai
Where would that log be located? I tried poking around in /var/svc/log
and /var/adm, but I've only found the snapshot-service logs (while
useful, they don't seem to have logged the auto-deletion of
snapshots).

Also, that 'pcplusmp' message is triggering every minute, on the
minute. It's probably one of my drives (judging from ata instance 3);
how do I tell which drive is which instance?

If I didn't say before, thanks to all that have replied for your
assistance, greatly appreciated.

On Wed, Oct 20, 2010 at 2:23 PM,  casper@sun.com wrote:

Huh, I don't actually ever recall enabling that. Perhaps that is
connected to the message I started getting every minute recently in
the kernel buffer,

 It's on by default.

 You can see if it was ever enabled by using:

        zfs list -t snapshot |grep @zfs-auto

Oct 20 12:20:49 megatron pcplusmp: [ID 805372 kern.info] pcplusmp: ide
(ata) instance 3 irq 0xf vector 0x45 ioapic 0x2 intin 0xf is bound to
cpu 0
Oct 20 12:21:49 megatron pcplusmp: [ID 805372 kern.info] pcplusmp: ide
(ata) instance 3 irq 0xf vector 0x45 ioapic 0x2 intin 0xf is bound to
cpu 1

 This sounds more like a device driver unloaded and later it is reloaded
 because of some other service.

I just disabled it (zfs set com.sun\:auto-snapshot=false tank,
correct?), will see if the log messages disappear. Did the filesystem
kill off some snapshots or something in an effort to free up space?

 Yes, but typically it will log that.

 Casper





-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware going bad

2010-10-27 Thread Krunal Desai
I believe he meant a memory stress test, i.e. booting with a
memtest86+ CD and seeing if it passed. Even if the memory is OK, the
stress from that test may expose defects in the power supply or other
components.

Your CPU temperature is 56C, which is not out-of-line for most modern
CPUs (you didn't state what type of CPU it is). Heck, 56C would be
positively cool for a NetBurst-based Xeon.

On Wed, Oct 27, 2010 at 4:17 PM, Harry Putnam rea...@newsguy.com wrote:
 Toby Thain t...@telegraphics.com.au writes:

 On 27/10/10 3:14 PM, Harry Putnam wrote:
 It seems my hardware is getting bad, and I can't keep the os running
 for more than a few minutes until the machine shuts down.

 It will run 15 or 20 minutes and then shutdown
 I haven't found the exact reason for it.


 One thing to try is a thorough memory test (few hours).


 It does some kind of memory test on bootup.  I recall seeing something
 about high memory.  And shows all of the 3GB installed

 I just now saw last time it came down, that the cpu was at 134
 degrees.

 And that would have been after it cooled a couple minutes.

 I don't think that is astronomical but it may have been a good bit
 higher under load.  But still wouldn't something show in
 /var/adm/messages if that were the problem?

 Are there not a list of standard things to grep for in logs that would
 indicate various troubles?  Surely system admins would have some kind
 of reporting tool to get ahead of serious troubles.

 I've had one or another problem with this machine for a couple of
 months now so thinking of scrapping it out, and putting a new setup in
 that roomy  midtower.

 Where can I find a guide to help me understand how to build up a
 machine and then plug my existing discs and data into the new OS?

 I don't mean the hardware part but that part particularly opensolaris
 and zfs related.

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware going bad

2010-10-27 Thread Krunal Desai
With an A64, I think a thermal shutdown would instantly halt CPU
execution, removing the chance to write any kind of log message.
memtest will report any errors in RAM; perhaps when the ARC expands to
the upper-stick of memory it hits the bad bytes and crashes.

Can you try switching power supplies, removing unnecessary add-on
cards, or swapping motherboards?

On Wed, Oct 27, 2010 at 4:45 PM, Harry Putnam rea...@newsguy.com wrote:
 Toby Thain t...@telegraphics.com.au writes:

 On 27/10/10 4:21 PM, Krunal Desai wrote:
 I believe he meant a memory stress test, i.e. booting with a
 memtest86+ CD and seeing if it passed.

 Correct. The POST tests are not adequate.

 Got it. Thank you.

 Short of doing such a test, I have evidence already that machine will
 predictably shutdown after 15 to 20 minutes of uptime.

 It seems there ought to be something, some kind of evidence and clues
 if I only knew how to look for them, in the logs.

 Is there not some semi standard kind of keywords to grep for that
 would indicate some clue as to the problem?

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ashift and vdevs

2010-11-23 Thread Krunal Desai
Interesting, I didn't realize that Sun/Oracle was working on (or had) a
solution somewhat in place for 4K drives. I wonder which will happen
first for me: Hitachi 7K2000s hitting a reasonable price, or
4K/variable-size sector support hitting so I can use Samsung F4s or
Barracuda LPs.

On Tue, Nov 23, 2010 at 9:40 AM, David Magda dma...@ee.ryerson.ca wrote:
 On Tue, November 23, 2010 08:53, taemun wrote:
 zdb -C shows an ashift value on each vdev in my pool, I was just wondering
 if it is vdev specific, or pool wide. Google didn't seem to know.

 I'm considering a mixed pool with some advanced format (4KB sector)
 drives, and some normal 512B sector drives, and was wondering if the
 ashift can be set per vdev, or only per pool. Theoretically, this would
 save me some size on metadata on the 512B sector drives.

 It's a per-pool property, and currently hard coded to a value of nine
 (i.e., 2^9 = 512). Sun/Oracle are aware of the new, upcoming sector size/s
 and some changes have been made in the code:

 a. PSARC/2008/769: Multiple disk sector size support
        http://arc.opensolaris.org/caselog/PSARC/2008/769/
 b. PSARC/2010/296: Add tunable to control RMW for Flash Devices
        http://arc.opensolaris.org/caselog/PSARC/2010/296/

 (a) appears to have been fixed in snv_118 or so:

        http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6710930

 However, at this time, there is no publicly available code that
 dynamically determines physical sector size and then adjusts ZFS pools
 automatically. Even if there was, most disks don't support the necessary
 ATA/SCSI command extensions to report on physical and logical sizes
 differences. AFAIK, they all simply report 512 when asked.

 If all of your disks will be 4K, you can hack together a solution to take
 advantage of that fact:

 http://tinyurl.com/25gmy7o
 http://www.solarismen.de/archives/5-Solaris-and-the-new-4K-Sector-Disks-e.g.-WDxxEARS-Part-2.html


 Hopefully it'll make it into at least Solaris 11, as during the lifetime
 of that product there will be even more disks with that property. There's
 also the fact that many LUNs from SANs also have alignment issues, though
 they tend to be at 64K. (At least that's what VMware and NetApp best
 practices state.)


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ashift and vdevs

2010-11-23 Thread Krunal Desai
On Tue, Nov 23, 2010 at 9:59 AM, taemun tae...@gmail.com wrote:
 I'm currently populating a pool with a 9-wide raidz vdev of Samsung HD204UI
 2TB (5400rpm, 4KB sector) and a 9-wide raidz vdev of Seagate LP ST32000542AS
 2TB (5900 rpm, 4KB sector) which was created with that binary, and haven't
 seen any of the performance issues I've had in the past with WD EARS drives.
 It would be lovely if Oracle could see fit to implementing correct detection
 of these drives! Or, at the very least, an -o ashift=12 parameter in the
 zpool create function.

What is the upgrade path like from this? For example, currently I
have b134 OpenSolaris with 8x1.5TB drives in a -Z2 storage pool. I
would like to go to OpenIndiana and move that data to a new pool built
of 3 6-drive -Z2s (utilizing 2TB drives). I am going to stagger my
drive purchases to give my wallet a breather, so I would likely start
with 2 6-drive -Z2s at the beginning. If I were to use that
binary/hack to force the ashift for 4K drives, would I be able to
upgrade cleanly to a zpool version down the road that is happy with
and aware of 4K drives?

I know the safest route would be to just go with 512-byte sector
7K2000s, but their prices do not drop nearly as often as the LPs or
F4s do.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ashift and vdevs

2010-11-26 Thread Krunal Desai
 I'd also note that in the future at some point, we won't be able to purchase 
 512B drives any more. In particular, I think that 3TB drives will all be 4KB 
 formatted. So it isn't inadvisable for a pool that you plan on expanding to 
 have ashift=12 (imo).

One new thought occurred to me: I know some of the 4K drives emulate 512-byte 
sectors, so to the host OS they appear no different from other 512B drives. With 
this additional layer of emulation, I would assume that ashift wouldn't be needed, 
though I have read reports of the emulation affecting performance. I think I'll 
need to confirm which drives do what exactly and then decide on an ashift if needed.
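As a starting point, I'll probably just check what the existing pool reports,
using the same zdb -C output mentioned earlier in the thread (tank is my pool
name):

  zdb -C tank | grep ashift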
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ext. UPS-backed SATA SSD ZIL?

2010-11-26 Thread Krunal Desai
 What about powering the X25-E by an external power source, one that is also 
 solid-state and backed by a UPS?  In my experience, smaller power supplies 
 tend to be much more reliable than typical ATX supplies.

I don't think the different PSU would be an issue; the supply you've linked 
doesn't seem to care about linking grounds together.

 or even more reliable would be a PicoPSU w/ a hack to make sure that the 
 power is always on.
 
 Has anyone tried something like this?  Powering ZILs using a second, more 
 reliable PSU?  Thoughts?

I hacked up a PicoPSU for robotics use (running off +24V and providing 
+5/+3.3); your always-on should be as easy as shorting the green-black wires 
(short Pin 14 to ground) with a little solder jumper.

But wouldn't you need some type of reset trigger for when the system is reset? 
Or is that performed by the SATA controller?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ashift and vdevs

2010-11-26 Thread Krunal Desai
On Nov 26, 2010, at 20:09 , taemun wrote:
 If you consider that for a 4KB internal drive, with a 512B external 
 interface, a request for a 512B write will result in the drive reading 4KB, 
 modifying it (putting the new 512B in) and then writing the 4KB out again. 
 This is terrible from a latency perspective. I recall seeing 20 IOPS on a WD 
 EARS 2TB drive (ie, 50ms latency for random 512B writes).

Agreed. However, if you look at this MS KB article: 
http://support.microsoft.com/kb/982018/en-us , Windows 7 (even with SP1) has no 
support for native 4K-sector drives. Obviously, we're dealing with ZFS and 
Solaris/BSD here, but what I'm getting at is: which 4K-sector drives offer a 
jumper or other method to completely disable any form of emulation and appear to 
the host OS as a 4K-sector drive?

I believe the Barracuda LPs (the 5900rpm disks) can do this, but I'm not sure 
about the others like the F4s. I believe you said earlier that you were using F4s 
(the HD204UIs) and the 5900rpm Seagates; if you could elaborate on these drives 
and their emulation (or lack thereof), I'd appreciate it!

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ RevoDrive ZFS support

2010-11-28 Thread Krunal Desai
 There are problems with Sandforce controllers, according to forum posts. 
 Buggy firmware. And in practice, Sandforce is far below its theoretical 
 values. I expect Intel to have fewer problems.

I believe it's more the firmware (and pace of firmware updates) from companies 
making Sandforce-based drives than it is the controller. Enthusiasts can 
tolerate OCZ and others releasing alphas/betas in forum posts.

While the G2 Intel drives may not be the performance kings anymore (or the most 
price-effective), I'd argue they're certainly the most stable when it comes to 
firmware. Have my eye on a G3 Intel drive for my laptop, where I can't really 
afford beta firmware updates biting me on the road.

--khd

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Seagate ST32000542AS and ZFS perf

2010-11-29 Thread Krunal Desai
 I'm using these drives for one of the vdevs in my pool. The pool was created
 with ashift=12 (zpool binary
 from http://digitaldj.net/2010/11/03/zfs-zpool-v28-openindiana-b147-4k-drives-and-you/),
 which limits the minimum block size to 4KB, the same as the physical block
 size on these drives. I haven't noticed any performance issues. These
 obviously aren't 7200rpm drives, so you can't expect them to match those in
 random IOPS.

The Seagate datasheet for those parts reports 512-byte sectors. What is
the deal with the ST32000542AS: native 512-byte sectors, native
4K sectors with selectable emulation, or native 4K sectors
with 512-byte sector emulation always on?

Also, just as a side note, I believe these drives achieve their
low-power status through the reduced spindle speed (5900rpm), not the
head-parking style of power management that WD Green drives use? The
latter, I've read, is rather unsuitable for RAID operation (especially
with HW RAID controllers).
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Seagate ST32000542AS and ZFS perf

2010-11-29 Thread Krunal Desai
On Mon, Nov 29, 2010 at 10:59 AM, Krunal Desai mov...@gmail.com wrote:
 The Seagate datasheet for those parts reports 512-byte sectors. What is
 the deal with the ST32000542AS: native 512-byte sectors, native
 4K sectors with selectable emulation, or native 4K sectors
 with 512-byte sector emulation always on?

Disregard; if I understand correctly, Seagate has proprietary
SmartAlign tech that takes care of 4K sectors (see links below). I
can't seem to find any real whitepaper-style explanation of the
method, but I assume it either:

1. does a really good job of 512-byte emulation that results in little
to no performance degradation
(http://consumer.media.seagate.com/2010/06/the-digital-den/advanced-format-drives-with-smartalign/
references test data), or
2. dynamically looks to see if it even needs to do anything; if the
host OS is sending requests that are all 4K-aware/aligned, all is well.

Newegg has these on sale today for $69.99; sadly the limit is 2. I
think I'll pick two up, use them for some tests, and stock up on
this model of drive. The rated power-on hours count seems rather low
to me, though...8760 hours, or just 1 year of 24/7 operation. I may have to
revisit power management in OpenSolaris (or upgrade to OpenIndiana) to
see if my disks are spinning down when they are supposed to.

Links:
http://www.seagate.com/ww/v/index.jsp?locale=en-USname=advanced-format-migration-to-4k-tpcvgnextoid=746f43fce2489210VgnVCM101a48090aRCRD
http://www.seagate.com/docs/pdf/whitepaper/tp615_smartalign_for_af_4k.pdf
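If I do revisit power management, checking the accumulated hours should be as
simple as reading the SMART attributes once smartctl cooperates (a sketch; the
device path is an example, and -d sat support depends on the controller/driver):

  smartctl -A -d sat /dev/rdsk/c1t0d0 | grep -i power_on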
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Seagate ST32000542AS and ZFS perf

2010-11-30 Thread Krunal Desai
 Not sure where you got this figure from, the Barracuda Green
 (http://www.seagate.com/docs/pdf/datasheet/disc/ds1720_barracuda_green.pdf) is
 a different drive to the one we've been talking about in this thread
 (http://www.seagate.com/docs/pdf/datasheet/disc/ds_barracuda_lp.pdf).
 I would note that the Seagate 2TB LP has a 0.32% Annualised Failure Rate.
 ie, in a given sample (which aren't overheating, etc) 32 from every 10,000
 should fail. I *believe* that the Power On-Hours on the Barra Green is
 simply saying that it is designed for 24/7 usage. It's a per year number. I
 couldn't imagine them specifying the number of hours before failure like
 that, just below an AFR of 0.43.

Whoops, yes, that's what I did: I assumed that LP == Green, but I
guess that is not the case. I got 2 from the Newegg sale; I'll post my
impressions once I get them and add them to a pool...assuming they
survive Newegg's rather subpar hard drive packaging process.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Large Drives

2010-12-10 Thread Krunal Desai
 can I use any 2TB drive?  Even the WD that lie about their sector size?  
 Speed is not really of any importance here.

Yes, you can. The WD drives will lie and claim 512-byte sectors; you'll
get misaligned reads/writes and performance will suffer, but it
will work.

 I also need another vdev to store my media backups.  These are very large 
 files that take up a lot of space.  It would be a bummer if I lost data here, 
 but all of the data is replaceable, it would just take time and effort.  I am 
 thinking of RaidZ1 for this data.  Are there any 2TB drives that will work 
 with ZFS presently?  I am willing to take the risk that if I lost a single 
 disk, that they others wouldn't fail during the stress of a resilver.  Write 
 speed doesn't matter to me.  But I need read speeds to supply at least 
 40mbit/second.  I have 8GB of ram on this machine with usually 1 sometimes 2 
 concurrent reads - so I think prefetch should take care of these read demands 
 regardless if the drive is green.  So my question for this vdev is.. What is 
 the best 2TB drive available for a raidz1 configuration?  Are the samsung F4s 
 a valid option or should I be looking at the seagates?  I have pretty much 
 written off the WD due to the 4k/512 byte sector nonsense.

The Hitachi 7K2000s are THE most foolproof, because they are pure
512-byte sector drives. No ifs, thens or buts. Seagates (at least the
Barracuda LPs) utilize a tech called 'SmartAlign'; apparently it
performs pretty well (taemun on the list has used both those drives
and F4s). The F4s are 4K-sector drives as well, and I'm relatively
certain they do some emulation goofiness.

Safest option: 7K2000. Hopefully by the time you need to expand again,
whether a drive is 4K or not should not matter anymore when it comes
to ZFS.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Looking for 3.5 SSD for ZIL

2010-12-22 Thread Krunal Desai
 As of yet, I have only found 3.5 models with the Sandforce 1200, which was
 not recommended on this list.

I actually bought a SF-1200 based OCZ Agility 2 (60G) for use as a
ZIL/L2ARC (haven't installed it yet, however; definitely jumped the gun
on this purchase...) based on some recommendations from fellow users.
Why are these not recommended? Is it performance related, or more of a
"the workload will degrade and kill this thing in no time" concern?

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Looking for 3.5 SSD for ZIL

2010-12-22 Thread Krunal Desai
 The ZIL accelerator's requirements differ from the L2ARC, as its very
 purpose is to guarantee *all* data written to the log can be replayed
 (on next reboot) in case of host failure.

Ah, so this would be why, say, a super-capacitor-backed SSD can be very
helpful, as it will have some backup power present. Luckily, my use
case is not a high-availability server, but a NAS in my basement. I've
got it attached to a UPS with very conservative shut-down timing. Or
are there other host failures aside from power that a ZIL would be
vulnerable to (system hard-locks?)?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
I recently discovered a drive failure (either that or a loose cable, I
need to investigate further) on my home fileserver. 'fmadm faulty'
returns no output, but I can clearly see a failure when I do zpool
status -v:

pool: tank
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub canceled on Tue Feb  1 11:51:58 2011
config:

NAME STATE READ WRITE CKSUM
tank DEGRADED 0 0 0
  raidz2-0   DEGRADED 0 0 0
c10t0d0  ONLINE   0 0 0
c10t1d0  ONLINE   0 0 0
c10t2d0  ONLINE   0 0 0
c10t3d0  REMOVED  0 0 0
c10t4d0  ONLINE   0 0 0
c10t5d0  ONLINE   0 0 0
c10t6d0  ONLINE   0 0 0
c10t7d0  ONLINE   0 0 0

In dmesg, I see:
Feb  1 11:14:33 megatron scsi: [ID 107833 kern.warning] WARNING:
/pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0 (sd8):
Feb  1 11:14:33 megatron        Command failed to complete...Device is gone

I never had any problems with these drives + mpt under snv_134 (I'm on
snv_151a now); the only change was adding a second 1068E-IT that's
currently unpopulated with drives. But more importantly, I guess: why
can't I see this failure in fmadm (and how would I go about
automatically dispatching an e-mail to myself when something like this
happens)? Does a pool going degraded not count as a failure?
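(The best I've come up with myself is a dumb cron-driven check along these
lines; it's only a sketch, the mail address is a placeholder, and it obviously
isn't an FMA-integrated solution:

  #!/bin/sh
  # mail zpool status whenever any pool reports a problem
  STATUS=`zpool status -x`
  if [ "$STATUS" != "all pools are healthy" ]; then
          echo "$STATUS" | mailx -s "zpool problem on `hostname`" you@example.com
  fi

I'd much rather have FMA do the notification, though.)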

-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
On Tue, Feb 1, 2011 at 1:29 PM, Cindy Swearingen
cindy.swearin...@oracle.com wrote:
 I agree that we need to get email updates for failing devices.

Definitely!

 See if fmdump generated an error report using the commands below.

Unfortunately not, see below:

movax@megatron:/root# fmdump
TIME UUID SUNW-MSG-ID EVENT
fmdump: warning: /var/fm/fmd/fltlog is empty

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
On Tue, Feb 1, 2011 at 6:11 PM, Cindy Swearingen
cindy.swearin...@oracle.com wrote:
 I misspoke and should clarify:

 1. fmdump identifies fault reports that explain system issues

 2. fmdump -eV identifies errors or problem symptoms

Gotcha; fmdump -eV gives me the information I need. It appears to have
been a loose cable: I'm hitting the machine with some heavy I/O load,
the pool resilvered itself, and the drive has not dropped out.

SMART status was reported healthy as well (got smartctl kind of
working), but I cannot read the SMART data of my disks behind the
1068E due to limitations of smartmontools I guess. (e.g. 'smartctl -d
scsi -a /dev/rdsk/c10t0d0' gives me serial #, model, and just a
generic 'SMART Ok'). I assume that SUNWhd is licensed only for use on
the X4500 Thumper and family? I'd like to see if it works with the
1068E.

It's getting kind of tempting for me to investigate doing a run of
boards that run Marvell 88SX6081s behind a PLX PCIe-to-PCI-X bridge.
They should have beyond excellent support, seeing as that is what the
X4500 uses to run its SATA ports.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
 The output of fmdump is explicit. I am interested to know if you saw 
 aborts and timeouts or some other errors.

I have the machine off atm while I install new disks (18x ST32000542AS), but 
IIRC they appeared as transport errors (scsi.something.transport; I can paste 
the exact errors in a little bit): a slew of transfer/soft errors followed by 
the drive disappearing. I assume that my HBA took it offline, and the mpt driver 
reported that to the OS as an admin disconnecting the disk, not as a failure per se.

 The open-source version of smartmontools seems to be slightly out
 of date and somewhat finicky. Does anyone know of a better SMART
 implementation?

That SUNWhd I mentioned seemed interesting, but I assume licensing means I can 
only get it if I purchase Sun hardware.

 Nice idea, except that the X4500 was EOL years ago and the replacement,
 X4540, uses LSI HBAs. I think you will find better Solaris support for the LSI
 chipsets because Oracle's Sun products use them from the top (M9000) all
 the way down the product line.

Oops, forgot that the X4500s are actually kind of old. I'll have to look up 
what LSI controllers the newer models are using (the LSI 2xx8 something IIRC? 
Will have to Google).

--khd

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-01 Thread Krunal Desai
On Tue, Feb 1, 2011 at 11:34 PM, Richard Elling
richard.ell...@gmail.com wrote:
 There is a failure going on here.  It could be a cable or it could be a bad
 disk or firmware. The actual fault might not be in the disk reporting the 
 errors (!)
 It is not a media error.


Errors were as follows:
Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered0x269213b01d700401
Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered0x269213b01d700401
Feb 01 19:33:01.3665 ereport.io.scsi.cmd.disk.recovered0x269213b01d700401
Feb 01 19:33:04.9969 ereport.io.scsi.cmd.disk.tran 0x269f99ef0b300401
Feb 01 19:33:04.9970 ereport.io.scsi.cmd.disk.tran 0x269f9a165a400401

Verbose of a message:
Feb 01 2011 19:33:04.996932283 ereport.io.scsi.cmd.disk.tran
nvlist version: 0
class = ereport.io.scsi.cmd.disk.tran
ena = 0x269f99ef0b300401
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0
(end detector)

devid = id1,sd@n5000c50010ed6a31
driver-assessment = fail
op-code = 0x0
cdb = 0x0 0x0 0x0 0x0 0x0 0x0
pkt-reason = 0x18
pkt-state = 0x1
pkt-stats = 0x0
__ttl = 0x1
__tod = 0x4d48a640 0x3b6bfabb

It was a cable error, but why didn't fault management tell me about
it? And what do you mean by "The actual fault might not be in the disk
reporting the errors (!) It is not a media error."? Could the fault be
coming from my SATA controller or something, possibly?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-02 Thread Krunal Desai
 This error code means the device is gone.
 The command got the bus, but could not access the target.

Thanks for that!

I updated the firmware on both of my USAS-L8i (LSI1068E based), and while
controller numbering has shifted around in Solaris (went from c10/c11
to c11/c12, not a big deal I think), suddenly smartctl is able to
pull temperatures. I can't get a full SMART listing, but temperatures
are coming through now. Oddly enough, my second LSI controller has skipped
c12t0d0 and jumped straight to c12t1d0 and onwards. It's a
good thing that ZFS can figure out what is what, but it will make
configuring power management tricky.

I'll post in pm-discuss about the kernel panics I was getting after
enabling drive power management.

-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-02 Thread Krunal Desai
 # uname -a
 SunOS gandalf.taltos.org 5.11 snv_151a i86pc i386 i86pc

movax@megatron:~# uname -a
SunOS megatron 5.11 snv_151a i86pc i386 i86pc


 # /usr/local/sbin/smartctl -H -i -d sat /dev/rdsk/c7t0d0
 smartctl 5.40 2010-10-16 r3189 [i386-pc-solaris2.11] (local build)
 Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net


Fails for me; my version does not recognize the 'sat' option. I've
been using -d scsi:

movax@megatron:~# smartctl -h
smartctl version 5.36 [i386-pc-solaris2.8] Copyright (C) 2002-6 Bruce Allen

but,

movax@megatron:~# smartctl -a -d scsi /dev/rdsk/c11t0d0
smartctl version 5.36 [i386-pc-solaris2.8] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: ATA  ST31500341AS Version: CC1H
Serial number: 9VS14DJD
Device type: disk
Local Time is: Wed Feb  2 20:45:00 2011 EST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK

Current Drive Temperature: 49 C

Error Counter logging not supported
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-02 Thread Krunal Desai
 So build the current version of smartmontools. As you should have seen in my 
 original response, I'm using 5.40. Bugs in 5.36 are unlikely to be 
 interesting to the maintainers of the package ;-)

Oops, missed that in your log. Will try compiling from source and see what 
happens.

Also, recently it seems like all the links to tools I need are broken. Where 
can I find a lsiutil binary for Solaris?

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-02 Thread Krunal Desai
 If you search for 'lsiutil solaris' on lsi.com, it'll direct you to
 zipfile that includes a solaris binary for x86 solaris.

Yep, that worked, grabbed it off some other adapter's page. Thanks!
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] kernel messages question

2011-02-06 Thread Krunal Desai
On Sat, Feb 5, 2011 at 5:44 PM, Roy Sigurd Karlsbakk r...@karlsbakk.net wrote:
 Hi

 I keep getting these messages on this one box. There are issues with at least 
 one of the drives in it, but since there are some 80 drives in it, that's not 
 really an issue. I just want to know, if anyone knows, what this kernel 
 message mean. Anyone?

 Feb  5 19:35:57 prv-backup scsi: [ID 365881 kern.info] 
 /pci@7a,0/pci8086,340e@7/pci1000,3140@0 (mpt1):
 Feb  5 19:35:57 prv-backup      Log info 0x3108 received for target 13.
 Feb  5 19:35:57 prv-backup      scsi_status=0x0, ioc_status=0x804b, 
 scsi_state=0x0

I think I got those when I had a loose cable on the backplane (aka,
physical medium errors).
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] CPU Limited on Checksums?

2011-02-08 Thread Krunal Desai
Hi all,

My system is powered by an Intel Core 2 Duo (E6600) with 8GB of RAM.
I'm running into some very heavy CPU usage.

First, a copy from one zpool to another (cp -aRv /oldtank/documents*
/tank/documents/*), both in the same system. Load averages are around
~4.8. I think I used lockstat correctly, and found the following:

movax@megatron:/tank# lockstat -kIW -D 20 sleep 30

Profiling interrupt: 2960 events in 30.516 seconds (97 events/sec)

Count indv cuml rcnt     nsec Hottest CPU+PIL  Caller
-------------------------------------------------------------------------------
 1518  51%  51% 0.00     1800 cpu[0]           SHA256TransformBlocks
  334  11%  63% 0.00     2820 cpu[0]           vdev_raidz_generate_parity_pq
  261   9%  71% 0.00     3493 cpu[0]           bcopy_altentry
  119   4%  75% 0.00     3033 cpu[0]           mutex_enter
   73   2%  78% 0.00     2818 cpu[0]           i86_mwait
snip

So, obviously here it seems checksum calculation is, to put it mildly,
eating up CPU cycles like none other. I believe it's bad(TM) to turn
off checksums? (zfs property just has checksum=on, I guess it has
defaulted to SHA256 checksums?)
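For reference, the exact properties I'm looking at are simply these (a plain
zfs get; tank is the pool in question):

  zfs get checksum,dedup tank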

Second, a copy from my desktop PC to my new zpool. (5900rpm drive over
GigE to 2 6-drive RAID-Z2s). Load average are around ~3. Again, with
lockstat:

movax@megatron:/tank# lockstat -kIW -D 20 sleep 30

Profiling interrupt: 2919 events in 30.089 seconds (97 events/sec)

Count indv cuml rcnt     nsec Hottest CPU+PIL  Caller
-------------------------------------------------------------------------------
 1298  44%  44% 0.00     1853 cpu[0]           i86_mwait
  301  10%  55% 0.00     2700 cpu[0]           vdev_raidz_generate_parity_pq
  144   5%  60% 0.00     3569 cpu[0]           bcopy_altentry
  103   4%  63% 0.00     3933 cpu[0]           ddi_getl
   83   3%  66% 0.00     2465 cpu[0]           mutex_enter
snip
Here it seems as if 'i86_mwait' is occupying the top spot (is this
because I have power-management set to poll my CPU?). Is something odd
happening drive buffer wise? (i.e. coming in on NIC, buffered in the
HBA somehow, and then flushed to disks?)

In either case, it seems I'm hitting a ceiling of around 65MB/s. I assume
the CPU is bottlenecking, since bonnie++ benchmarks resulted in much better
performance for the vdev. In the latter case, though, it may just be a
limitation of the source drive (if it can't read data faster than
65MB/s, I can't write faster than that...).

Edit: the E6600 is a first-generation 65nm LGA775 CPU, clocked at 2.40GHz;
dual-core, no Hyper-Threading.

-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Very bad ZFS write performance. Ok Read.

2011-02-14 Thread Krunal Desai
On Sat, Feb 12, 2011 at 3:14 AM, ian W dropbears...@yahoo.com.au wrote:
 Thanks for the responses.. I found the issue. It was due to power management, 
 and a probably bug with event driven power management states,

 changing

 cpupm enable

 to

 cpupm enable poll-mode

 in /etc/power.conf fixed the issue for me. back up to 110MB/sec+ now..

Interesting - I have an E6600 also, and I will give this a try. I left
'cpupm enable' in /etc/power.conf because powertop/prtdiag properly
reported all the available P/C-states of my CPU, so I assumed that
power management was good to go. What do you have cpu-threshold set
to?
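For comparison, here's roughly what I'd end up with in /etc/power.conf on my
box (a sketch; the cpu-threshold value is only an example, and pmconfig has to
be re-run for the changes to take effect):

  # /etc/power.conf (excerpt)
  cpupm enable poll-mode
  cpu-threshold 1s

  # apply the new settings
  pfexec pmconfig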

(This may be a moot point for me, because my CPU is littering fault
management with strings of L2 cache errors, so I might be upgrading to
Nehalem soon.)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CPU Limited on Checksums?

2011-02-16 Thread Krunal Desai
On Wed, Feb 9, 2011 at 12:02 AM, Richard Elling
richard.ell...@gmail.com wrote:
 The data below does not show heavy CPU usage. Do you have data that
 does show heavy CPU usage?  mpstat would be a good start.

Here is mpstat output during a network copy; I think one of the CPUs
disappeared due to an L2 cache error.

movax@megatron:~# mpstat -p
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl set
  1  333   06  4057 3830 19467  140   27  2650  15611  48   0  51   0

 Some ZFS checksums are always SHA-256. By default, data checksums are
 Fletcher4 on most modern ZFS implementations, unless dedup is enabled.
I see, thanks for the info.

 Second, a copy from my desktop PC to my new zpool. (5900rpm drive over
 GigE to 2 6-drive RAID-Z2s). Load average are around ~3.

 Lockstat won't provide direct insight to the run queue (which is used to 
 calculate
 load average). Perhaps you'd be better off starting with prstat.
Ah, gotcha. I ran prstat, which is more of what I wanted:
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  1434 root        0K    0K run      0  -20   0:01:54  23% zpool-tank/136
  1515 root     9804K 3260K cpu1    59    0   0:00:00 0.1% prstat/1
  1438 root       14M 9056K run     59    0   0:00:00 0.0% smbd/16

The zpool thread is near the top of the usage list, which is what I suppose you would expect.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-16 Thread Krunal Desai
On Wed, Feb 2, 2011 at 8:38 PM, Carson Gaspar car...@taltos.org wrote:
 Works For Me (TM).

 c7t0d0 is hanging off an LSI SAS3081E-R (SAS1068E chip) rev B3 MPT rev 105
 Firmware rev 011d (1.29.00.00) (IT FW)

 This is a SATA disk - I don't have any SAS disks behind a LSI1068E to test.

When I try to do a SMART status read (more than just a simple
identify), it looks like the 1068E drops the drive for a little bit. I
bought the Intel-branded LSI SAS3081E:
Current active firmware version is 0120 (1.32.00)
Firmware image's version is MPTFW-01.32.00.00-IT
  LSI Logic
x86 BIOS image's version is MPTBIOS-6.34.00.00 (2010.12.07)

kernel log messages:
Feb 17 00:54:05 megatron scsi: [ID 107833 kern.warning] WARNING:
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:05 megatron        Disconnected command timeout for Target 0
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        Log info 0x3114 received for target 0.
Feb 17 00:54:06 megatron        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        Log info 0x3113 received for target 0.
Feb 17 00:54:06 megatron        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        Log info 0x3113 received for target 0.
Feb 17 00:54:06 megatron        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        Log info 0x3113 received for target 0.
Feb 17 00:54:06 megatron        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        Log info 0x3113 received for target 0.
Feb 17 00:54:06 megatron        scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Feb 17 00:54:06 megatron scsi: [ID 107833 kern.notice]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        mpt_flush_target discovered non-NULL cmd in slot 33, tasktype 0x3
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        Cmd (0xff02dea63a40) dump for Target 0 Lun 0:
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        cdb=[ ]
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        pkt_flags=0x8000 pkt_statistics=0x0 pkt_state=0x0
Feb 17 00:54:06 megatron scsi: [ID 365881 kern.info]
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        pkt_scbp=0x0 cmd_flags=0x2800024
Feb 17 00:54:06 megatron scsi: [ID 107833 kern.warning] WARNING:
/pci@0,0/pci8086,2e29@6/pci1000,3140@0 (mpt4):
Feb 17 00:54:06 megatron        ioc reset abort passthru

Fault management records some transport errors followed by recovery.
Any ideas? Disks are ST32000542AS.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fmadm faulty not showing faulty/offline disks?

2011-02-17 Thread Krunal Desai
On Thu, Feb 17, 2011 at 10:52 AM, Carson Gaspar car...@taltos.org wrote:
 Please give the _exact_ command you are running. I see the same thing, but
 only if I try and retrieve some of the extended info (-x...). I don't see
 it with -a.

Sure, here it is (apologies in advance if GMail applies its forced wrapping):


movax@megatron:~/downloads# smartctl -a -d sat /dev/rdsk/c1t0d0
smartctl 5.40 2010-10-16 r3189 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda LP
Device Model: ST32000542AS
Serial Number:redacted
Firmware Version: CC34
User Capacity:2,000,398,934,016 bytes
Device is:In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:Thu Feb 17 00:52:56 2011 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

[drive drops/resets here]
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] External SATA drive enclosures + ZFS?

2011-02-27 Thread Krunal Desai
On Feb 27, 2011, at 10:48 , taemun wrote:
 
 eSATA has no need for any interposer chips between a modern SATA chipset on 
 the motherboard and a SATA hard drive. You can buy cables with appropriate 
 ends for this. There is no reason why the data side of an eSATA drive should 
 be any more likely to fail than SATA. (within bounds, for cable lengths, etc) 
 At least you can be assured that the drive will receive a flush request at 
 appropriate times.

Intel's platform design guide (at least for its mobile platforms) calls for a 
SATA repeater/redriver chip immediately before the eSATA connector (or docking 
connector). It is, however, passive in the sense that it redrives the signal 
without appearing to the system at all (just a receiver and re-driver 
inside the IC). I'd think that an eSATA drive with a stable power supply and a 
cable length within spec would be reliable enough for basic home use.

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] GPU acceleration of ZFS

2011-05-10 Thread Krunal Desai
On Tue, May 10, 2011 at 11:29 AM, Anatoly legko...@fastmail.fm wrote:
 Good day,

 I think ZFS can take advantage of using GPU for sha256 calculation,
 encryption and maybe compression. A modern video card, like the 5xxx or 6xxx ATI
 HD series, can do sha256 calculation 50-100 times faster than a modern 4-core
 CPU.
Ignoring optimizations from SIMD extensions like SSE and friends, this
is probably true. However, the GPU also has to deal with the overhead
of data transfer to itself before it can even begin crunching data.
Granted, a Gen. 2 x16 link is quite speedy, but is CPU performance
really so poor that a GPU can still out-perform it? My undergrad
thesis dealt with computational acceleration utilizing CUDA, and the
datasets had to scale quite a ways before there was a noticeable
advantage in using a Tesla or similar over a bog-standard i7-920.

 The only problem that there is no AMD/Nvidia drivers for Solaris that
 support hardware-assisted OpenCL.
This, and keep in mind that most of the professional users here will
likely be using professional hardware, where a simple 8MB Rage XL gets
the job done thanks to the magic of out-of-band management cards and
other such facilities. Even as a home user, I have not placed a
high-end video card into my machine; I use a $5 ATI PCI video card that
saw about an hour of use while I installed Solaris 11.

-- 
--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 350TB+ storage solution

2011-05-16 Thread Krunal Desai
On Mon, May 16, 2011 at 1:20 PM, Brandon High bh...@freaks.com wrote:
 The 1TB and 2TB are manufactured in China, and have a very high
 failure and DOA rate according to Newegg.

 The 3TB drives come off the same production line as the Ultrastar
 5K3000 in Thailand and may be more reliable.

Thanks for the heads-up; I was thinking about 5K3000s to finish out my
build (I currently have Barracuda LPs). I do wonder how much of that DOA
rate is due to Newegg's HDD packaging/shipping, however.

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 350TB+ storage solution

2011-05-16 Thread Krunal Desai
On Mon, May 16, 2011 at 2:29 PM, Paul Kraus p...@kraus-haus.org wrote:
 What Newegg was doing is buying drives in the 20-pack from the
 manufacturer and packing them individually WRAPPED IN BUBBLE WRAP and
 then stuffed in a box. No clamshell. I realized *something* was up
 when _every_ drive I looked at had a much higher report of DOA (or
 early failure) at the Newegg reviews than made any sense (and compared
 to other site's reviews).

I picked up a single 5K3000 last week, have not powered it on yet, but
it came in a pseudo-OEM box with clamshells. I remember getting
bubble-wrapped single drives from Newegg, and more than a fair share
of those drives suffered early deaths or never powered on in the first
place. No complaints about Amazon: Seagate drives came in Seagate OEM
boxes with free shipping via Prime. (probably not practical for you
enterprise/professional guys, but nice for home users).

An order of 6 of the 5K3000 drives for work-related purposes shipped in a
Styrofoam holder of sorts that was cut in half for my small number of
drives (is this what 20-packs come in?). No idea what other packaging
was around them (shipping and receiving opened the packages).
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] # disks per vdev

2011-06-15 Thread Krunal Desai
On Wed, Jun 15, 2011 at 8:20 AM, Lanky Doodle lanky_doo...@hotmail.com wrote:
 That's how I understood autoexpand, about not doing so until all disks have 
 been done.

 I do indeed rip from disc rather than grab torrents - to VIDEO_TS folders and 
 not ISO - on my laptop then copy the whole folder up to WHS in one go. So 
 while they're not one large single file, they are lots of small .vob files, 
 but being written in one hit.

I decided on 3x 6-drive RAID-Z2s for my home media server, made up of
2TB drives (mix of Barracuda LP 5900rpm and 5K3000), it's been quite
solid so far. Performance is entirely limited by GigE.

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Finding disks [was: # disks per vdev]

2011-07-05 Thread Krunal Desai
On Tue, Jul 5, 2011 at 7:47 AM, Lanky Doodle lanky_doo...@hotmail.com wrote:
 Thanks.

 I ruled out the SAS2008 controller as my motherboard is only PCIe 1.0 so 
 would not have been able to make the most of the difference in increased 
 bandwidth.

Only PCIe 1.0? What chipset is that based on? It might be worthwhile to
upgrade, as I believe Solaris power management has a fairly recent
cutoff in terms of processor support (AMD Family 16 or better,
Intel Nehalem or newer is what I've been told). PCIe 2.0 has been
around for quite a while; PCIe 3.0 will be making an appearance on Ivy
Bridge CPUs (and has already been announced by FPGA vendors), but I'm
fairly confident that graphics cards will be the first target market
to utilize it.

Another thing to consider is that you could buy the SAS2008-based
cards and move them from motherboard to motherboard for the
foreseeable future (copper PCI Express isn't going anywhere for a long
time). Don't kneecap yourself because of your current mobo.

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] bad seagate drive?

2011-09-11 Thread Krunal Desai
On Sep 11, 2011, at 13:01 , Richard Elling wrote:
 The removed state can be the result of a transport issue. If this is a 
 Solaris-based
 OS, then look at fmadm faulty for a diagnosis leading to a removal. If 
 none, 
 then look at fmdump -eV for errors relating to the disk. Last, check the 
 zpool
 history to make sure one of those little imps didn't issue a zpool remove 
 command.

Definitely check your cabling; a few of my drives disappeared like this as 
'REMOVED', turned out to be some loose SATA cables on my backplane.

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] remove wrongly added device from zpool

2011-09-19 Thread Krunal Desai
On Mon, Sep 19, 2011 at 9:29 AM, Fred Liu fred_...@issi.com wrote:
 Yes. I have connected them back to server. But it does not help.
 I am really sad now...

I cringed a little when I read the thread title. I did this by
accident once as well, but luckily for me, I had enough scratch
storage around in various sizes to cobble together a JBOD (risky) and
use it as a holding area for my data while I remade the pool.

I'm a home user and only have around 21TB or so, so it was feasible
for me. Probably not so feasible for you enterprise guys with 1000s of
users and 100s of filesystems!

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on Dell with FreeBSD

2011-10-19 Thread Krunal Desai
On Wed, Oct 19, 2011 at 10:14 AM, Albert Shih albert.s...@obspm.fr wrote:
 When we buy a MD1200 we need a RAID PERC H800 card on the server so we have
 two options :

        1/ create a LV on the PERC H800 so the server see one volume and put
        the zpool on this unique volume and let the hardware manage the
        raid.

        2/ create 12 LV on the perc H800 (so without raid) and let FreeBSD
        and ZFS manage the raid.

 which one is the best solution ?

 Any advise about the RAM I need on the server (actually one MD1200 so 12x2To 
 disk)

I know the PERC H200 can be flashed with IT firmware, making it in
effect a dumb HBA perfect for ZFS usage. Perhaps the H800 can be
treated the same way? (If not, can you get the machine configured with an H200?)

If that's not an option, I think Option 2 will work. My first ZFS
server ran on a PERC 5/i, and I was forced to make 8 single-drive RAID
0s in the PERC Option ROM, but Solaris did not seem to mind that.

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on Dell with FreeBSD

2011-10-20 Thread Krunal Desai
On Thu, Oct 20, 2011 at 5:49 AM, Albert Shih albert.s...@obspm.fr wrote:
 I'm not sure what you mean when you say «H200 flashed with IT firmware» ?

IT stands for Initiator Target, and many LSI chips have a version of their
firmware available that will put them into this mode, which is
desirable for ZFS. This is as opposed to other LSI firmware modes like
IR (Integrated RAID, I believe), which you do not want. Since the H200
uses an LSI chip, you can download that firmware from LSI and flash it
to the card, turning it into an IT-mode card and a simple HBA.

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS HBA's with No Raid

2011-12-06 Thread Krunal Desai
On Tue, Dec 6, 2011 at 3:57 PM, Karl Rossing ka...@barobinson.com wrote:
 Hi,

 I'm thinking of getting LSI 9212-4i4e(4 internal and 4 external ports) to
 replace a SUN Storagetek raid card.

 Is it possible to disable the raid on an LSI 9212-4i4e and have the drives
 read by a simple sas/sata card? I'm open to other brands but I need internal
 and external ports.

That card looks like it is based on the SAS2008 controller, which
means that you need to locate the latest IT (Initiator Target) mode
firmware for it and use sas2flash (if I recall correctly) from a
bootable USB stick or similar to flash the controller with that
firmware.

Here is a guide that explains how to flash an LSI SAS2008 that is
on board a motherboard; adapt to your situation as needed:
http://www.servethehome.com/howto-flash-supermicro-x8si6f-lsi-sas-2008-controller-lsi-firmware/
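From what I remember of flashing other SAS2008-based boards, the actual flash
step from a bootable DOS/USB environment boils down to something like this (the
file names are examples from LSI's download packages, not specific to the
9212-4i4e; note the adapter's SAS address beforehand in case it needs to be
re-programmed):

  sas2flsh -listall
  sas2flsh -o -f 2118it.bin -b mptsas2.rom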

Regards,
Krunal Desai
Hardware Engineer
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Pool Unavailable

2012-08-01 Thread Krunal Desai
On Aug 1, 2012, at 11:06, Jesse Jamez jesse.jam...@gmail.com wrote:

 Hello,

 I recently rebooted my workstation and the disk names changed causing my ZFS 
 pool to be unavailable.

 I did not make any hardware changes. My first question is the obvious one: did 
 I lose my data? Can I recover it?

 What would cause the names to change? Delay in the order that the HBA brought 
 them up?

 How can I correct this problem going forward?

 Thanks - - - JesseJ

Perhaps some removable drives caused the change in drive names?
Regardless, I believe ZFS stores labels on each disk and is clever
enough to figure out what is what even if the operating system name
has changed.

If I recall correctly, a zpool export (if possible) followed by a
zpool import has always corrected this for me.
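That is, something along these lines (the pool name is an example, and only
export if nothing is actively using the pool):

  pfexec zpool export tank
  pfexec zpool import            # lists pools available for import
  pfexec zpool import tank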

Barring an actual disk failure (which could have failed to enumerate,
therefore throwing off the naming), your data should be safe.

--khd (mobile)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Has anyone used a Dell with a PERC H310?

2013-01-08 Thread Krunal Desai
On Mon, Jan 7, 2013 at 4:16 PM, Sašo Kiselkov skiselkov...@gmail.com wrote:

 PERC H200 are well behaved cards that are easy to reflash and work well
 (even in JBOD mode) on Illumos - they are essentially a LSI SAS 9211. If
 you can get them, they're one heck of a reliable beast, and cheap too!


That method that was linked seemed very specific to Dell servers; from my
experience with reflashing various LSI cards, can't I just USB boot to a
FreeDOS environment in any system, and then run sasflash/sas2flsh with the
appropriate IT-mode firmware?

Seems like the M1015 has spiked in price again on eBay (US) whilst the H200
is still under $100.

--khd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss