Re: [zfs-discuss] reliable, enterprise worthy JBODs?

2011-01-25 Thread Marc Nicholas
Rocky,

Does DataON manufacture these units, or are they LSI OEM?

-marc

Sent from my iPhone
416.414.6271

On 2011-01-25, at 2:53 PM, Rocky Shek roc...@dataonstorage.com wrote:

 Philip,
 
 You can consider DataON DNS-1600 4U 24Bay 6Gb/s SAS JBOD Storage. 
 http://dataonstorage.com/dataon-products/dns-1600-4u-6g-sas-to-sas-sata-jbod
 -storage.html
 
 It is well suited to ZFS storage applications and can be a good replacement
 for the Sun/Oracle J4400 and J4200.
 
 There is also the ultra-density DNS-1660 4U 60-bay 6Gb/s SAS JBOD, as well as
 other form-factor JBODs.
 
 http://dataonstorage.com/dataon-products/6g-sas-jbod/dns-1660-4u-60-bay-6g-3
 5inch-sassata-jbod.html
 
 
 Rocky
 
 -Original Message-
 From: zfs-discuss-boun...@opensolaris.org
 [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Philip Brown
 Sent: Tuesday, January 25, 2011 10:05 AM
 To: zfs-discuss@opensolaris.org
 Subject: [zfs-discuss] reliable, enterprise worthy JBODs?
 
 So, another hardware question :)
 
 ZFS has been touted as taking maximal advantage of disk hardware, to the
 point where it can be used efficiently and cost-effectively on JBODs, rather
 than having to throw more expensive RAID arrays at it.
 
 Only trouble is... JBODs seem to have disappeared :(
 Sun/Oracle has discontinued its J4000 line, with no replacement that I can
 see.
 
 IBM seems to have some nice looking hardware in the form of its EXP3500
 expansion trays... but they only support it connected to an IBM (SAS)
 controller... which is only supported when plugged into IBM server hardware
 :(
 
 Any other suggestions for (large-)enterprise-grade, supported JBOD hardware
 for ZFS these days?
 Either fibre or SAS would be okay.
 -- 
 This message posted from opensolaris.org
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ext. UPS-backed SATA SSD ZIL?

2010-11-27 Thread Marc Nicholas
That's a great deck, Chris.

-marc

Sent from my iPhone

On 2010-11-27, at 10:34 AM, Christopher George cgeo...@ddrdrive.com wrote:

 I haven't had a chance to test a Vertex 2 PRO against my 2 EX, and I'd 
 be interested if anyone else has.
 
 I recently presented at the OpenStorage Summit 2010 and compared
 exactly the three devices you mention in your post (Vertex 2 EX,
 Vertex 2 Pro, and the DDRdrive X1) as ZIL Accelerators.
 
 Jump to slide 37 for the write IOPS benchmarks:
 
 http://www.ddrdrive.com/zil_accelerator.pdf
 
 and you *really* want to make sure you get  the 4k alignment right
 
 Excellent point, starting on slide 66 the performance impact of partition 
 misalignment is illustrated.  Considering the results, longevity might be
 an even greater concern than decreased IOPS performance as ZIL
 acceleration is a worst case scenario for a Flash based SSD.
 
 The DDRdrive is still the way to go for the ultimate ZIL acceleration, 
 but it's pricey as hell.
 
 In addition to product cost, I believe IOPS/$ is a relevant point of 
 comparison.
 
 Google products gives the price range for the OCZ 50GB SSDs:
 Vertex 2 EX (OCZSSD2-2VTXEX50G: $870 - $1,011 USD)
 Vertex 2 Pro (OCZSSD2-2VTXP50G:  $399 - $525 USD)
 
 4KB Sustained and Aligned Mixed Write IOPS results (See pdf above):
 Vertex 2 EX (6325 IOPS)
 Vertex 2 Pro (3252 IOPS)
 DDRdrive X1 (38701 IOPS)
 
 These figures use the lowest online price for both the Vertex 2 EX and Vertex
 2 Pro, and the full list price (SRP) of the DDRdrive X1.
 
 IOPS/Dollar($):
 Vertex 2 EX (6325 IOPS / $870)  =  7.27
 Vertex 2 Pro (3252 IOPS / $399)  =  8.15
 DDRdrive X1 (38701 IOPS / $1,995)  =  19.40
 
 Best regards,
 
 Christopher George
 Founder/CTO
 www.ddrdrive.com
 -- 
 This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ideal SATA/SAS Controllers for ZFS

2010-05-18 Thread Marc Nicholas
Nice write-up, Marc.

Aren't the SuperMicro cards their funny UIO form factor? Wouldn't want
someone buying a card that won't work in a standard chassis.

-marc

On Tue, May 18, 2010 at 2:26 AM, Marc Bevand m.bev...@gmail.com wrote:

 The LSI SAS1064E slipped through the cracks when I built the list.
 This is a 4-port PCIe x8 HBA with very good Solaris (and Linux)
 support. I don't remember having seen it mentioned on zfs-discuss@
 before, even though many were looking for 4-port controllers. Perhaps
 the fact that it is priced so close to 8-port models explains why it is
 relatively unnoticed. That said, the wide x8 PCIe link makes it the
 *cheapest* controller able to feed 300-350MB/s to at least 4 ports
 concurrently. Now added to my list.

 -mrb


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Loss of L2ARC SSD Behaviour

2010-05-06 Thread Marc Nicholas
Hi Michael,

What makes you think striping the SSDs would be faster than round-robin?

-marc

On Thu, May 6, 2010 at 1:09 PM, Michael Sullivan michael.p.sulli...@mac.com
 wrote:

 Everyone,

 Thanks for the help.  I really appreciate it.

 Well, I actually walked through the source code with an associate today and
 we found out how things work by looking at the code.

 It appears that the L2ARC is just assigned in round-robin fashion.  If a device
 goes offline, ZFS marks it as offline and moves on to the next one.  The
 failure to retrieve the requested object is treated like a cache miss and
 everything goes along its merry way, as far as we can tell.
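
 For reference, cache devices are added to the pool individually and ZFS spreads
 reads across them; a minimal sketch, with pool and device names illustrative:

   zpool add tank cache c2t0d0 c2t1d0   # add two L2ARC (cache) devices
   zpool iostat -v tank 5               # per-device activity, including the cache devices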

 I would have hoped it would be different in some way.  For example, if the
 L2ARC were striped for performance reasons, that would be really cool, using
 the device as an extension of the VM model it is based on.  That would mean
 using the L2ARC as an extension of the virtual address space and striping it
 to make it more efficient.  Way cool.  If it took out the bad device and
 reconfigured the stripe, that would be even cooler, and replacing it with a
 hot spare cooler still.  However, it appears from the source code that the
 L2ARC is just a (sort of) jumbled collection of ZFS objects.  Yes, it gives
 you better performance if you have it, but it doesn't really use it in the
 way you might expect something as cool as ZFS would.

 I understand why it is read-only, and that it invalidates its cache when a
 write occurs, as is to be expected for any object written.

 If an object is not there because of a failure or because it has been
 removed from the cache, it is treated as a cache miss, all well and good -
 go fetch from the pool.

 I also understand why the ZIL is important and that it should be mirrored
 if it is to be on a separate device.  Though I'm wondering how it is handled
 internally when there is a failure of one of its default devices; but then
 again, it's on a regular pool and should be redundant enough, with only some
 degradation in speed.

 Breaking these devices out from their default locations is great for
 performance, and I understand.  I just wish the knowledge of how they work
 and their internal mechanisms were not so much of a black box.  Maybe that
 is due to the speed at which ZFS is progressing and the features it adds
 with each subsequent release.

 Overall, I am very impressed with ZFS, its flexibility and, even more so,
 the way it breaks all the rules about how storage should be managed, and I
 really like it.  I have yet to see anything come close in its approach to
 disk data management.  Let's just hope it keeps moving forward; it is truly a
 unique way to view disk storage.

 Anyway, sorry for the ramble, but to everyone, thanks again for the
 answers.

 Mike

 ---
 Michael Sullivan
 michael.p.sulli...@me.com
 http://www.kamiogi.net/
 Japan Mobile: +81-80-3202-2599
 US Phone: +1-561-283-2034

 On 7 May 2010, at 00:00 , Robert Milkowski wrote:

  On 06/05/2010 15:31, Tomas Ögren wrote:
  On 06 May, 2010 - Bob Friesenhahn sent me these 0,6K bytes:
 
 
  On Wed, 5 May 2010, Edward Ned Harvey wrote:
 
  In the L2ARC (cache) there is no ability to mirror, because cache
 device
  removal has always been supported.  You can't mirror a cache device,
 because
  you don't need it.
 
  How do you know that I don't need it?  The ability seems useful to me.
 
  The gain is quite minimal.. If the first device fails (which doesn't
  happen too often I hope), then it will be read from the normal pool once
  and then stored in ARC/L2ARC again. It just behaves like a cache miss
  for that specific block... If this happens often enough to become a
  performance problem, then you should throw away that L2ARC device
  because it's broken beyond usability.
 
 
 
  Well, if an L2ARC device fails there might be an unacceptable drop in
 delivered performance.
  If it were mirrored, then the drop would usually be much smaller, or there
 could be no drop at all if a mirror had an option to read from only one side.
 
  Being able to mirror the L2ARC might be especially useful once a persistent
 L2ARC is implemented, since after a node restart or a resource failover in a
 cluster the L2ARC will be kept warm. Then the only thing which might affect L2
 performance considerably would be an L2ARC device failure...
 
 
  --
  Robert Milkowski
  http://milek.blogspot.com
 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Loss of L2ARC SSD Behaviour

2010-05-04 Thread Marc Nicholas
The L2ARC will continue to function.
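
For reference, a faulted cache device can be dropped and replaced while the pool
stays online; a minimal sketch, with pool and device names illustrative:

  zpool status tank            # a failed cache device shows up under the cache section
  zpool remove tank c3t0d0     # remove the failed cache device
  zpool add tank cache c3t1d0  # add a replacement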

-marc

On 5/4/10, Michael Sullivan michael.p.sulli...@mac.com wrote:
 HI,

 I have a question I cannot seem to find an answer to.

 I know I can set up a stripe of L2ARC SSD's with say, 4 SSD's.

  I know if I set up the ZIL on an SSD and the SSD goes bad, then the ZIL will be
  relocated back to the pool.  I'd probably have it mirrored anyway, just in
  case.  However, you cannot mirror the L2ARC, so...
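
  A minimal sketch of adding a mirrored log vdev, with pool and device names
  illustrative:

    zpool add tank log mirror c4t0d0 c4t1d0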

 What I want to know, is what happens if one of those SSD's goes bad?  What
 happens to the L2ARC?  Is it just taken offline, or will it continue to
 perform even with one drive missing?

 Sorry, if these questions have been asked before, but I cannot seem to find
 an answer.
 Mike

 ---
 Michael Sullivan
 michael.p.sulli...@me.com
 http://www.kamiogi.net/
 Japan Mobile: +81-80-3202-2599
 US Phone: +1-561-283-2034



-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS where to go!

2010-03-26 Thread Marc Nicholas
Richard,

My challenge to you is that at least three vendors that I know of built
their storage platforms on FreeBSD. One of them sells $4bn/year of
product - pretty sure that eclipses all (Open)Solaris-based storage ;)

-marc

On 3/26/10, Richard Elling richard.ell...@gmail.com wrote:
 On Mar 26, 2010, at 4:46 AM, Edward Ned Harvey wrote:
  What does everyone think about that? I bet it is not as mature as on
  OpenSolaris.

 mature is not the right term in this case.  FreeBSD has been around much
 longer than opensolaris, and it's equally if not more mature.

 Bill Joy might take offense to this statement.  Both FreeBSD and Solaris
 trace
 their roots to the work done at Berkeley 30 years ago. Both have evolved in
 different ways at different rates. Since Solaris targets the enterprise
 market,
 I will claim that Solaris is proven in that space. OpenSolaris is just one
 of the
 next steps forward for Solaris.
  -- richard

 ZFS storage and performance consulting at http://www.RichardElling.com
 ZFS training on deduplication, NexentaStor, and NAS performance
 Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com







-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Validating alignment of NTFS/VMDK/ZFS blocks

2010-03-18 Thread Marc Nicholas
On Thu, Mar 18, 2010 at 2:44 PM, Chris Murray chrismurra...@gmail.comwrote:

 Good evening,
  I understand that NTFS and VMDK do not relate to Solaris or ZFS, but I was
  wondering if anyone has any experience of checking the alignment of data
  blocks through that stack?


NetApp has a great little tool called mbrscan/mbralign... it's free, but I'm
not sure if NetApp customers are supposed to distribute it.
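
Absent that tool, a rough manual check is to compare the partition start offset
against the ZFS block size; a minimal sketch, assuming a Linux guest and with
device and dataset names illustrative:

  fdisk -lu /dev/sdb                   # note the partition start sector (512-byte units)
  # aligned if (start_sector * 512) is a multiple of the block size reported below
  zfs get volblocksize tank/vm/lun0    # zvol block size (use recordsize for a filesystem)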

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Can we get some documentation on iSCSI sharing after comstar took over?

2010-03-16 Thread Marc Nicholas
On Tue, Mar 16, 2010 at 2:46 PM, Svein Skogen sv...@stillbilde.net wrote:


  Not quite a one liner. After you create the target once (step 3), you do
 not have to do that again for the next volume. So three lines.


 So ... no way around messing with guid numbers?


I'll write you a Perl script :)

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Can we get some documentation on iSCSI sharing after comstar took over?

2010-03-16 Thread Marc Nicholas
On Tue, Mar 16, 2010 at 3:16 PM, Svein Skogen sv...@stillbilde.net wrote:


  I'll write you a Perl script :)
 

 I think there are ... several people that'd like a script that gave us
 back some of the ease of the old shareiscsi one-off, instead of having
 to spend time on copy-and-pasting GUIDs they have ... no real use for. ;)


I'll try and knock something up in the next few days, then!
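
In the meantime, a minimal sketch of the COMSTAR steps with the GUID captured in
a shell variable (zvol name illustrative; the output parsing is approximate and
may need adjusting for your build):

  zfs create -V 50G tank/vols/lun0
  guid=$(sbdadm create-lu /dev/zvol/rdsk/tank/vols/lun0 | awk 'END { print $1 }')
  stmfadm add-view "$guid"
  itadm create-target     # only needed once; later LUs reuse the same target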

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] [OT] Interesting ranking of MLC SSDs

2010-03-12 Thread Marc Nicholas
Given that quite a few folks ask "which is the best SSD?", I thought some
folks might find the following interesting:

http://www.storagenewsletter.com/news/flash/dramexchange-intel-ssds

-marc

P.S: Apologies if the slightly off-topic post offends anyone.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recommendations for an l2arc device?

2010-02-26 Thread Marc Nicholas
On Fri, Feb 26, 2010 at 2:43 PM, Brandon High bh...@freaks.com wrote:

 snip
 The drives I'm considering are:

 OCZ Vertex 30GB
 Intel X25V 40GB
 Crucial CT64M225 64GB


Personally, I'd go with the Intel product... but save up a few more pennies
and get the X25-M. The extra boost in read and write performance is worth
it.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Freeing unused space in thin provisioned zvols

2010-02-26 Thread Marc Nicholas
On Fri, Feb 26, 2010 at 2:42 PM, Lutz Schumann
presa...@storageconcepts.dewrote:


  Now if a virtual machine writes to the zvol, blocks are allocated on disk.
  Reads are now served partly from disk (for all blocks written) and partly from
  the ZFS layer (all unwritten blocks).

  If the virtual machine (which may be VMware / Xen / Hyper-V) deletes blocks
  / frees space within the zvol, this also means a write - usually in the
  metadata area only. Thus the underlying storage system does not know which
  blocks in a zvol are really used.


You're using VMs and *not* using dedupe?! VMs are almost the perfect
use-case for dedupe :)
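
A minimal sketch of turning it on for a VM-backing dataset (names illustrative;
dedup needs plenty of RAM for the dedup table):

  zfs set dedup=on tank/vmstore
  zpool get dedupratio tank     # pool-wide dedup ratio observed so far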

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [indiana-discuss] future of OpenSolaris

2010-02-24 Thread Marc Nicholas
On Wed, Feb 24, 2010 at 2:02 PM, Troy Campbell troy.campb...@fedex.comwrote:


 http://www.oracle.com/technology/community/sun-oracle-community-continuity.html

 Half way down it says:
 Will Oracle support Java and OpenSolaris User Groups, as Sun has?

 Yes, Oracle will indeed enthusiastically support the Java User Groups,
 OpenSolaris User Groups, and other Sun-related user group communities
 (including the Java Champions), just as Oracle actively supports hundreds of
 product-oriented user groups today. We will be reaching out to these groups
 soon.


Supporting doesn't necessarily mean continuing the Open Source projects!

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris 2010.03 / snv releases

2010-02-23 Thread Marc Nicholas
Isn't the dedupe bug fixed in snv_133?

-marc

On Tue, Feb 23, 2010 at 9:21 AM, Jeffry Molanus jeffry.mola...@proact.nlwrote:

 There is no clustering package for it, the available source seems very old,
 and the de-dup bug is there IIRC. So if you don't need HA clustering and
 dedup...

 BR, Jeffry

  -Original Message-
  From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
  boun...@opensolaris.org] On Behalf Of Bruno Sousa
  Sent: dinsdag 23 februari 2010 8:37
  To: zfs-discuss@opensolaris.org
  Subject: [zfs-discuss] Opensolaris 2010.03 / snv releases
 
  Hi all,
 
  According to what I have been reading, OpenSolaris 2010.03 should be
  released around March this year, but with all the process of the Oracle/Sun
  deal I was wondering if anyone knows whether this schedule still makes sense,
  and if not, whether snv_132/133 looks very similar to the future 2010.03.
  In other words, without waiting for OpenSolaris 2010.03, would anyone
  risk putting into production any box with snv_132/133?
 
  Thanks,
  Bruno


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Import zpool from FreeBSD in OpenSolaris

2010-02-23 Thread Marc Nicholas
send and receive?!
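
A minimal sketch of that approach, with pool, snapshot, and host names
illustrative:

  zfs snapshot -r oldpool@migrate
  zfs send -R oldpool@migrate | ssh newhost zfs receive -d newpool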

-marc

On Tue, Feb 23, 2010 at 9:25 PM, Thomas Burgess wonsl...@gmail.com wrote:

  When I needed to do this, the only way I could get it to work was to do
  this:

  Take some disks, use an OpenSolaris Live CD and label them EFI
 Create a ZPOOL in FreeBSD with these disks
 copy my data from freebsd to the new zpool
 export the pool
 import the pool




 On Tue, Feb 23, 2010 at 9:11 PM, patrik s...@dentarg.net wrote:

 I want to import my zpool's from FreeBSD 8.0 in OpenSolaris 2009.06.

  After reading the few posts (links below) I was able to find on the
  subject, it seems like there is a difference between FreeBSD and
  Solaris. FreeBSD operates directly on the disk, while Solaris creates a
  partition and uses that... is that right? Is it impossible for OpenSolaris to
  use zpools from FreeBSD?

 * http://opensolaris.org/jive/thread.jspa?messageID=445766
 * http://opensolaris.org/jive/thread.jspa?messageID=450755
 * http://mail.opensolaris.org/pipermail/ug-nzosug/2009-June/27.html

  This is zpool import from my machine with OpenSolaris 2009.06 (all
  zpools are fine in FreeBSD). Notice that the zpool named temp can be
  imported. Why not secure then? Is it because it is raidz1?

  pool: secure
id: 15384175022505637073
  state: UNAVAIL
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
 config:

secureUNAVAIL  insufficient replicas
  raidz1  UNAVAIL  insufficient replicas
c8t1d0p0  ONLINE
c8t2d0s2  ONLINE
c8t3d0s8  UNAVAIL  corrupted data
c8t4d0s8  UNAVAIL  corrupted data


  pool: temp
id: 10889808377251842082
  state: ONLINE
 status: The pool is formatted using an older on-disk version.
 action: The pool can be imported using its name or numeric identifier,
 though
some features will not be available without an explicit 'zpool
 upgrade'.
 config:

tempONLINE
  c8t0d0p0  ONLINE
 --
 This message posted from opensolaris.org





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance

2010-02-18 Thread Marc Nicholas
On Thu, Feb 18, 2010 at 10:49 AM, Matt registrat...@flash.shanje.comwrote:


 Here's iostat while doing writes:

     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
     1.0  256.9    3.0 2242.9  0.3  0.1    1.3    0.5  11  12 c0t0d0
     0.0  253.9    0.0 2242.9  0.3  0.1    1.0    0.4  10  11 c0t1d0
     1.0  253.9    2.5 2234.4  0.2  0.1    0.9    0.4   9  11 c1t0d0
     1.0  258.9    2.5 2228.9  0.3  0.1    1.3    0.5  12  13 c1t1d0

 This shows about a 10-12% utilization of my gigabit network, as reported by
 Task Manager in Windows 7.


Unless you are using SSDs (which I believe you're not), you're IOPS-bound on
the drives IMHO. Writes are a better test of this than reads for cache
reasons.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance

2010-02-18 Thread Marc Nicholas
Run Bonnie++. You can install it with the Sun package manager and it'll
appear under /usr/benchmarks/bonnie++

Look for the command line I posted a couple of days back for a decent set of
flags to truly rate performance (using sync writes).
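
For reference, that invocation was along these lines (the -b flag disables write
buffering and fsyncs after each file operation, so the run exercises sync
writes; adjust the target directory for your pool):

  /usr/benchmarks/bonnie++/bonnie++ -u root -d /tank/myfs -f -b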

-marc

On Thu, Feb 18, 2010 at 11:05 AM, Matt registrat...@flash.shanje.comwrote:

 Also - still looking for the best way to test local performance - I'd love
 to make sure that the volume is actually able to perform at a level locally
 to saturate gigabit.  If it can't do it internally, why should I expect it
 to work over GbE?
 --
 This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Bonnie++ stats

2010-02-16 Thread Marc Nicholas
Anyone else got stats to share?

Note: the below is 4*Caviar Black 500GB drives, 1*Intel X25-M set up as both
ZIL and L2ARC, decent ASUS mobo, 2GB of fast RAM.

-marc

r...@opensolaris130:/tank/myfs# /usr/benchmarks/bonnie++/bonnie++ -u root -d
/tank/myfs -f -b
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03c       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
opensolaris130   4G           49503  13 30468   9           67882   6 320.1   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  4225  30 +++++ +++  4709  24  3407  38 +++++ +++  4572  22
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance

2010-02-10 Thread Marc Nicholas
Definitely use Comstar as Tim says.

At home I'm using 4*WD Caviar Blacks on an AMD Phenom X4 @ 1.8GHz and
only 2GB of RAM. I'm running snv_132. No HBA - onboard SB700 SATA
ports.

I can, with IOmeter, saturate GigE from my WinXP laptop via iSCSI.

Can you toss the RAID controller aside and use motherboard SATA ports
with just a few drives? That could help highlight whether it's the RAID
controller or not, and even one drive has better throughput than you're
seeing.

Cache, ZIL, and vdev tweaks are great - but you're not seeing any of
those bottlenecks, I can assure you.

-marc

On 2/10/10, Tim Cook t...@cook.ms wrote:
 On Wed, Feb 10, 2010 at 4:06 PM, Brian E. Imhoff
 beimh...@hotmail.comwrote:

 I am in the proof-of-concept phase of building a large ZFS/Solaris based
 SAN box, and am experiencing absolutely poor / unusable performance.

 Where to begin...

 The Hardware setup:
 Supermicro 4U 24 Drive Bay Chassis
 Supermicro X8DT3 Server Motherboard
 2x Xeon E5520 Nehalem 2.26 Quad Core CPUs
 4GB Memory
  Intel EXPI9404PT 4-port gigabit server network card (used for iSCSI traffic
  only)
 Adaptec 52445 28 Port SATA/SAS Raid Controller connected to
 24x Western Digital WD1002FBYS 1TB Enterprise drives.

  I have configured the 24 drives as single simple volumes in the Adaptec
  RAID BIOS, and am presenting them to the OS as such.

  I then create a zpool using raidz2, using all 24 drives, 1 as a
  hot spare:
  zpool create tank raidz2 c1t0d0 c1t1d0 [...] c1t22d0 spare c1t23d0

 Then create a volume store:
 zfs create -o canmount=off tank/volumes

 Then create a 10 TB volume to be presented to our file server:
 zfs create -V 10TB -o shareiscsi=on tank/volumes/fsrv1data

  From here, I discover the iSCSI target on our Windows Server 2008 R2 file
  server, and see the disk is attached in Disk Management.  I initialize the
  10TB disk fine, and begin to quick format it.  Here is where I begin to see
  the poor performance issue.  The quick format took about 45 minutes, and
  once the disk is fully mounted, I get maybe 2-5 MB/s average to this disk.

  I have no clue what I could be doing wrong.  To my knowledge, I followed
  the documentation for setting this up correctly, though I have not looked at
  any tuning guides beyond the first line saying you shouldn't need to do any
  of this as the people who picked these defaults know more about it than
  you.

  Jumbo frames are enabled on both sides of the iSCSI path, as well as on the
  switch, and rx/tx buffers increased to 2048 on both sides as well.  I know
  this is not a hardware / iSCSI network issue.  As another test, I installed
  Openfiler in a similar configuration (using hardware RAID) on this box, and
  was getting 350-450 MB/s from our file server.

  An iostat -xndz 1 readout of the %b column during a file copy to the
  LUN shows maybe 10-15 seconds of %b at 0 for all disks, then 1-2 seconds of
  100, and repeats.

 Is there anything I need to do to get this usable?  Or any additional
 information I can provide to help solve this problem?  As nice as
 Openfiler
 is, it doesn't have ZFS, which is necessary to achieve our final goal.



 You're extremely light on ram for a system with 24TB of storage and two
 E5520's.  I don't think it's the entire source of your issue, but I'd
 strongly suggest considering doubling what you have as a starting point.

 What version of opensolaris are you using?  Have you considered using
 COMSTAR as your iSCSI target?

 --Tim


-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance

2010-02-10 Thread Marc Nicholas
How does lowering the flush interval help? If he can't ingest data
fast enough, faster flushing is a Bad Thing(tm).

-marc

On 2/10/10, Kjetil Torgrim Homme kjeti...@linpro.no wrote:
 Bob Friesenhahn bfrie...@simple.dallas.tx.us writes:
 On Wed, 10 Feb 2010, Frank Cusack wrote:

 The other three commonly mentioned issues are:

  - Disable the Nagle algorithm on the Windows clients.

 for iSCSI?  shouldn't be necessary.

  - Set the volume block size so that it matches the client filesystem
block size (default is 128K!).

 default for a zvol is 8 KiB.

  - Check for an abnormally slow disk drive using 'iostat -xe'.

 his problem is lazy ZFS, notice how it gathers up data for 15 seconds
 before flushing the data to disk.  tweaking the flush interval down
 might help.

  An iostat -xndz 1 readout of the %b column during a file copy to
  the LUN shows maybe 10-15 seconds of %b at 0 for all disks, then 1-2
 seconds of 100, and repeats.

 what are the other values?  ie., number of ops and actual amount of data
 read/written.
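
 For anyone who wants to experiment, the txg flush interval on these builds is
 the zfs_txg_timeout tunable; a hedged sketch (defaults and behaviour vary by
 build, so use with care):

   echo "zfs_txg_timeout/D" | mdb -k         # show the current value (seconds)
   echo "zfs_txg_timeout/W0t5" | mdb -kw     # set it to 5 seconds until reboot
   # for a persistent change: add 'set zfs:zfs_txg_timeout = 5' to /etc/system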

 --
 Kjetil T. Homme
 Redpill Linpro AS - Changing the game



-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance

2010-02-10 Thread Marc Nicholas
This is a Windows box, not a DB that flushes every write.

The drives are capable of over 2000 IOPS (albeit with high latency, as
it's NCQ that gets you there), which would mean, even with sync flushes,
8-9MB/sec.

-marc

On 2/10/10, Brent Jones br...@servuhome.net wrote:
 On Wed, Feb 10, 2010 at 3:12 PM, Marc Nicholas geekyth...@gmail.com wrote:
  How does lowering the flush interval help? If he can't ingest data
  fast enough, faster flushing is a Bad Thing(tm).

 -marc

 On 2/10/10, Kjetil Torgrim Homme kjeti...@linpro.no wrote:
 Bob Friesenhahn bfrie...@simple.dallas.tx.us writes:
 On Wed, 10 Feb 2010, Frank Cusack wrote:

 The other three commonly mentioned issues are:

  - Disable the Nagle algorithm on the Windows clients.

 for iSCSI?  shouldn't be necessary.

  - Set the volume block size so that it matches the client filesystem
    block size (default is 128K!).

 default for a zvol is 8 KiB.

  - Check for an abnormally slow disk drive using 'iostat -xe'.

 his problem is lazy ZFS, notice how it gathers up data for 15 seconds
 before flushing the data to disk.  tweaking the flush interval down
 might help.

  An iostat -xndz 1 readout of the %b column during a file copy to
  the LUN shows maybe 10-15 seconds of %b at 0 for all disks, then 1-2
 seconds of 100, and repeats.

 what are the other values?  ie., number of ops and actual amount of data
 read/written.

 --
 Kjetil T. Homme
 Redpill Linpro AS - Changing the game



 --
 Sent from my mobile device


 ZIL performance issues? Is writecache enabled on the LUNs?

 --
 Brent Jones
 br...@servuhome.net


-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Marc Nicholas
I think you'll do just fine then. And I think the extra platter will
work to your advantage.

-marc

On 2/3/10, Simon Breden sbre...@gmail.com wrote:
 Probably 6 in a RAID-Z2 vdev.

 Cheers,
 Simon
 --
 This message posted from opensolaris.org


-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Marc Nicholas
I would go with cores (threads) rather than clock speed here. My home system
is a 4-core AMD @ 1.8Ghz and performs well.

I wouldn't use drives that big and you should be aware of the overheads of
RaidZ[x].

-marc



On Thu, Feb 4, 2010 at 6:19 PM, Brian broco...@vt.edu wrote:

 I am Starting to put together a home NAS server that will have the
 following roles:

 (1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or
 5 HD streams at a time.  These will be streamed live to the NAS box during
 recording.
 (2) Playback TV (could be stream being recorded, could be others) to 3 or
 more extenders
 (3) Hold a music repository
 (4) Hold backups from windows machines, mac (time machine), linux.
 (5) Be an iSCSI target for several different Virtual Boxes.

 Function 4 will use compression and deduplication.
 Function 5 will use deduplication.

 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2
 mirrored boot drives.

 I have been reading these forums off and on for about 6 months trying to
 figure out how to best piece together this system.

 I am first trying to select the CPU.  I am leaning towards AMD because of
 ECC support and power consumption.

  For items such as de-duplication, compression, checksums, etc., is it
  better to get a faster clock speed or should I consider more cores?  I know
  certain functions such as compression may run on multiple cores.

 I have so far narrowed it down to:

 AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
 and
 AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core

 As they are roughly the same price.
 --
 This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
Very interesting stats -- thanks for taking the time and trouble to share
them!

One thing I found interesting is that the Gen 2 X25-M has higher write IOPS
than the X25-E according to Intel's documentation (6,600 IOPS for 4K writes
versus 3,300 IOPS for 4K writes on the E). I wonder if it'd perform better
as a ZIL? (The write latency on both drives is the same).

-marc

On Thu, Feb 4, 2010 at 6:43 PM, Peter Radig pe...@radig.de wrote:

 I was interested in the impact the type of an SSD has on the performance of
 the ZIL. So I did some benchmarking and just want to share the results.

  My test case is simply untarring the latest ON source (528 MB, 53k files)
  on a Linux system that has a ZFS file system mounted via NFS over gigabit
  Ethernet.

 I got the following results:
 - locally on the Solaris box: 30 sec
 - remotely with no dedicated ZIL device: 36 min 37 sec (factor 73 compared
 to local)
 - remotely with ZIL disabled: 1 min 54 sec (factor 3.8 compared to local)
 - remotely with a OCZ VERTEX SATA II 120 GB as ZIL device: 14 min 40 sec
 (factor 29.3 compared to local)
 - remotely with an Intel X25-E 32 GB as ZIL device: 3 min 11 sec (factor
 6.4 compared to local)

  So it really makes a difference what type of SSD you use for your ZIL
  device. I was expecting good performance from the X25-E, but was really
  surprised that it is that good (only 1.7 times slower than with the ZIL
  completely disabled). So I will use the X25-E as the ZIL device on my box and
  will not consider disabling the ZIL at all to improve NFS performance.
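
  For reference, a dedicated log device is attached like this; a minimal sketch,
  with pool and device names illustrative (note that removing a log device again
  requires a recent pool version):

    zpool add tank log c5t0d0
    zpool status tank      # the device appears under a separate logs section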

 -- Peter
 --
 This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 7:54 PM, Brian broco...@vt.edu wrote:

  It sounds like the consensus is more cores over clock speed.  Surprising to
  me since the difference in clock speed was over 1GHz.  So, I will go with a
  quad core.


Four cores @ 1.8Ghz = 7.2Ghz of threaded performance ([Open]Solaris is
relatively decent in terms of threading).

Two cores @ 3.1Ghz = 6.2Ghz

:)

Although you may find single-threaded operations slower, as someone pointed
out, even those might wash out as sometimes it's I/O that's the problem.

I was leaning towards 4GB of ram - which hopefully should be enough for
 dedup as I am only planning on dedupping my smaller file systems (backups
 and VMs)


4GB is a good start.


 Was my raidz2 performance comment above correct?  That the write speed is
 that of the slowest disk?  That is what I believe I have read.


You are sort-of correct that it's the write speed of the slowest disk.

Mirrored drives will be faster, especially for random I/O. But you sacrifice
storage for that performance boost. That said, I have a similar setup as far
as number of spindles and can push 200MB/sec+ through it and saturate GigE
for iSCSI so maybe I'm being harsh on raidz2 :)
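
For comparison, a striped-mirror layout looks like this; a minimal sketch, with
pool and device names illustrative:

  zpool create tank mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0 mirror c0t4d0 c0t5d0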


 Now on to the hard part of picking a motherboard that is supported and has
 enough SATA ports!


I used an ASUS board (M4A785-M) which has six (6) SATA2 ports onboard and
pretty decent Hypertransport throughput.

Hope that helps.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 10:18 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Thu, 4 Feb 2010, Marc Nicholas wrote:

  Very interesting stats -- thanks for taking the time and trouble to share
 them!

 One thing I found interesting is that the Gen 2 X25-M has higher write
 IOPS than the
 X25-E according to Intel's documentation (6,600 IOPS for 4K writes versus
 3,300 IOPS for
 4K writes on the E). I wonder if it'd perform better as a ZIL? (The
 write latency on
 both drives is the same).


 The write IOPS between the X25-M and the X25-E are different since with the
 X25-M, much more of your data gets completely lost.  Most of us prefer not
 to lose our data.

 Would you like to qualify your statement further?

While I understand the difference between MLC and SLC parts, I'm pretty sure
Intel didn't design the M version to make data get completely lost. ;)

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 10:35 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Thu, 4 Feb 2010, Marc Nicholas wrote:


 The write IOPS between the X25-M and the X25-E are different since with
 the X25-M, much
 more of your data gets completely lost.  Most of us prefer not to lose our
 data.

 Would you like to qualify your statement further?


 Google is your friend.  And check earlier on this list/forum as well.

  While I understand the difference between MLC and SLC parts, I'm pretty
 sure Intel didn't
 design the M version to make data get completely lost. ;)


 It loses the most recently written data, even after a cache sync request.
  A number of people have verified this for themselves and posted results.
  Even the X25-E has been shown to lose some transactions.

 The devices have some DRAM (16MB) that is used to buffer writes for wear
levelling. A sudden loss of power means that this DRAM doesn't get flushed
to flash. This is the very reason the STEC devices have a supercap.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-03 Thread Marc Nicholas
As I previously mentioned, I'm pretty happy with the 500GB Caviar
Blacks that I have :)

One word of caution: failure and rebuild times with 1TB+ drives can be
a concern. How many spindles were you planning?

-marc

On 2/3/10, Simon Breden sbre...@gmail.com wrote:
 Sounds good.

 I was taking a look at the 1TB Caviar Black drives which are WD1001FALS I
 think.
 They seem to have superb user ratings and good reliability comments from
 many people.

 I consider these full fat drives as opposed to the LITE (green) drives, as
 they spin at 7200 rpm instead of 5400 rpm, have higher performance  and burn
 more juice than the Green models, but they have superb reviews from almost
 everyone regarding behaviour and reliability, and at the end of the day, we
 need good, reliable drives that work well in a RAID system.

 I can get them for around the same price as the cheapest 1.5TB green drives
 from Samsung.
 Somewhere I saw people saying that WDTLER.EXE works to allow reduction of
 the error reporting time like the enterprise RE versions (RAID Edition).
 However I then saw another user saying on the newer revisions WD have
 disabled this. I need to check a bit more to see what's really the case.

 Cheers,
 Simon

 http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/
 --
 This message posted from opensolaris.org


-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] verging OT: how to buy J4500 w/o overpriced drives

2010-02-02 Thread Marc Nicholas
I agree wholeheartedly... you're paying to make the problem go away in an
expedient manner. That said, I see how much we spend on NetApp storage at
work and it makes me shudder ;)

I think someone was wondering if the large storage vendors have their own
microcode on drives? I can tell you that NetApp do...and that's one way they
lock you in (if the drive doesn't report NetApp firmware, the filer will
reject the drive) and also how they do tricks like
soft-failure/re-validation, 520-byte sectors, etc.

-marc


On Tue, Feb 2, 2010 at 11:12 AM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Tue, 2 Feb 2010, David Dyer-Bennet wrote:


 Now, I'm sure not ALL drives offered at Newegg could qualify; but the
 question is, how much do I give up by buying an enterprise-grade drive
 from a major manufacturer, compared to the Sun-certified drive?


 If you have a Sun service contract, you give up quite a lot.  If a Sun
 drive fails every other day, then Sun will replace that Sun drive every
 other day, even if the system warranty has expired.  But if it is a non-Sun
 drive, then you have to deal with a disinterested drive manufacturer, which
 could take weeks or months.

  My experience thus far is that if you pay for a Sun service contract, then
  you should definitely pay extra for Sun-branded parts.

 Hopefully Oracle will do better than Sun at explaining the benefits and
 services provided by a service contract.

 Bob
 --
 Bob Friesenhahn
 bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
 GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-02 Thread Marc Nicholas
On Tue, Feb 2, 2010 at 1:38 PM, Brandon High bh...@freaks.com wrote:

 On Sat, Jan 16, 2010 at 9:47 AM, Simon Breden sbre...@gmail.com wrote:
  Which consumer-priced 1.5TB drives do people currently recommend?

 I happened to be looking at the Hitachi product information, and
 noticed that the Deskstar 7K2000 appears to be supported in RAID
 configurations. One of the applications listed is Video editing
 arrays.

 http://www.hitachigst.com/portal/site/en/products/deskstar/7K2000/


I've been having good success with the Western Digital Caviar Black
drives...which are cousins of their Enterprise RE3 platform. AFAIK, you're
stuck at 1TB or 2TB capacities but I've managed to get some good deals on
them...

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-02 Thread Marc Nicholas
I'm running the 500GB models myself, but I wouldn't say they're overly
noisy... and I've been doing ZFS/iSCSI/IOMeter/Bonnie++ stress testing with
them.

They whine rather than click FYI.

-marc

On Tue, Feb 2, 2010 at 2:58 PM, Simon Breden sbre...@gmail.com wrote:

 IIRC the Black range are meant to be the 'performance' models and so are a
 bit noisy. What's your opinion? And the 2TB models are not cheap either for
 a home user. The 1TB seem a good price. And from what little I read, it
 seems you can control the error reporting time with the WDTLER.EXE utility
 :)

 Cheers,
 Simon
 --
 This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-02 Thread Marc Nicholas
On Tue, Feb 2, 2010 at 3:11 PM, Frank Cusack 
frank+lists/z...@linetwo.netwrote:


 That said, I doubt 2TB drives represent good value for a home user.
 They WILL fail more frequently and as a home user you aren't likely
 to be keeping multiple spares on hand to avoid warranty replacement
 time.


 I'm having a hard time convincing myself to go beyond 500GB... both for
performance (I'm trying to build something with reasonable IOPS) and
reliability reasons.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss