Re: [zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-14 Thread Andrey Kuzmin
On Mon, Dec 14, 2009 at 4:04 AM, Jens Elkner
jel+...@cs.uni-magdeburg.de wrote:
 On Sat, Dec 12, 2009 at 04:23:21PM +, Andrey Kuzmin wrote:
 As to whether it makes sense (as opposed to two distinct physical
 devices), you would have read cache hits competing with log writes for
 bandwidth. I doubt both will be pleased :-)

 Hmm - good point. What I'm trying to accomplish:

 Actually our current prototype thumper setup is:
        root pool (1x 2-way mirror SATA)
        hotspare  (2x SATA shared)
        pool1 (12x 2-way mirror SATA)   ~25% used       user homes
 pool2 (10x 2-way mirror SATA)   ~25% used       multimedia files, archives, ISOs

 So pool2 is not really a problem - it delivers about 600 MB/s uncached,
 about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8 GB ISO)
 and is not continuously stressed. However, sync write is only ~200 MB/s,
 or ~20 MB/s per mirror.

 The problem is pool1 - user homes! GNOME/Firefox/Eclipse/Subversion/SOffice,
 usually via NFS and a little bit via Samba - a lot of more or less small
 files, probably widely spread over the platters. E.g. checking out a
 project from an svn or similar repository into a home directory takes
 hours. Also, having a workspace on NFS isn't fun (compared to a local
 Linux XFS-driven soft 2-way mirror).

Flash-based read cache should help here by minimizing (metadata) read
latency, and a flash-based log would bring down write latency. The only
drawback of using a single F20 is that you're trying to minimize both
with the same device.
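
With two devices (or at least two of the F20's LUNs) you could dedicate
one to each role. A rough sketch of what I mean - device names are made
up, assuming the four F20 LUNs show up as separate disks:

    zpool add pool1 log c3t0d0
    zpool add pool1 cache c3t1d0 c3t2d0 c3t3d0

That way log writes and cache reads would at least hit different flash
modules.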


 So, this seems to be a really interesting thing, and I expect a real
 improvement at least wrt. user homes, no matter how the final
 configuration will look.

 Maybe the experts at the source are able to do some 4x SSD vs. 1x F20
 benchmarks? I guess if the results turn out to be good enough, it
 wouldn't hurt ;-)

Would be interesting indeed.

Regards,
Andrey


  Jens Elkner wrote:
 ...
  whether it is possible/supported/would make sense to use a Sun Flash
  Accelerator F20 PCIe Card in an X4540 instead of 2.5" SSDs?

 Regards,
 jel.
 --
 Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
 Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
 39106 Magdeburg, Germany         Tel: +49 391 67 12768



Re: [zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-14 Thread Jens Elkner
On Mon, Dec 14, 2009 at 01:29:50PM +0300, Andrey Kuzmin wrote:
 On Mon, Dec 14, 2009 at 4:04 AM, Jens Elkner
 jel+...@cs.uni-magdeburg.de wrote:
...
  The problem is pool1 - user homes! GNOME/Firefox/Eclipse/Subversion/SOffice
...
 Flash-based read cache should help here by minimizing (metadata) read
 latency, and a flash-based log would bring down write latency. The only

Hmmm, not yet sure - I think writing via NFS is the biggest problem.
Anyway, I've almost finished the work on a 'generic collector' and data
visualizer, which allows us to correlate the numbers with each other on
the fly (i.e. no RRD pain) and hopefully understand them a little bit
better ;-).

 drawback of using a single F20 is that you're trying to minimize both
 with the same device.

Yepp. But would that scenario change much if one puts 4 SSDs into HDD
slots instead? I guess not really - it might even be worse, because it
disturbs the data path from/to the HDD controllers. Anyway, I'll try that
out next year, when those neat toys are officially supported (and the
budget for this has its final approval, of course).
  
Regards,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768


Re: [zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-13 Thread Jens Elkner
On Sat, Dec 12, 2009 at 03:28:29PM +, Robert Milkowski wrote:
 Jens Elkner wrote:
Hi Robert,
 
 just got a quote from our campus reseller that readzilla and logzilla
 are not available for the X4540 - hmm, strange ... Anyway, I'm wondering
 whether it is possible/supported/would make sense to use a Sun Flash
 Accelerator F20 PCIe Card in an X4540 instead of 2.5" SSDs?
 
 If so, is it possible to partition the F20, e.g. into 36 GB logzilla,
 60 GB readzilla (also interesting for other X servers)?
 
   
 IIRC the card presents 4x LUNs, so you could use each of them for a
 different purpose.
 You could also use different slices.

Oh, coool - IMHO this would be sufficient for our purposes (see next
posting).
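
(Just a sketch with made-up device names - e.g. a single sliced LUN could
even serve both roles, if capacity is the concern:

    zpool add pool1 log c3t0d0s0
    zpool add pool1 cache c3t0d0s1

though log and cache on the same flash module would then again compete
for bandwidth, as Andrey pointed out.)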
 
 Wrt. super capacitors: I would guess, at least wrt. the X4540 it doesn't
 give one more protection, since if power is lost, the HDDs do not respond
 anymore and thus it doesn't matter whether the log cache is protected for
 a short time or not. Is this correct?
 
 It still does. The capacitor is not for flushing data to disk drives!
 The card has a small amount of DRAM on it which is flushed to flash.
 The capacitor is there to make sure that actually happens if the power
 is lost.

Yepp - found the specs. (BTW: it was probably too late to think about the
term 'Flash Accelerator' while having the DRAM-based Prestoserve in mind ;-)).

Thanx,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768


Re: [zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-13 Thread Jens Elkner
On Sat, Dec 12, 2009 at 04:23:21PM +, Andrey Kuzmin wrote:
 As to whether it makes sense (as opposed to two distinct physical
 devices), you would have read cache hits competing with log writes for
 bandwidth. I doubt both will be pleased :-)
  
Hmm - good point. What I'm trying to accomplish:

Actually our current prototype thumper setup is:
root pool (1x 2-way mirror SATA)
hotspare  (2x SATA shared)
pool1 (12x 2-way mirror SATA)   ~25% used   user homes
pool2 (10x 2-way mirror SATA)   ~25% used   multimedia files, archives, ISOs

So pool2 is not really a problem - it delivers about 600 MB/s uncached,
about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8 GB ISO)
and is not continuously stressed. However, sync write is only ~200 MB/s,
or ~20 MB/s per mirror.

The problem is pool1 - user homes! GNOME/Firefox/Eclipse/Subversion/SOffice,
usually via NFS and a little bit via Samba - a lot of more or less small
files, probably widely spread over the platters. E.g. checking out a
project from an svn or similar repository into a home directory takes
hours. Also, having a workspace on NFS isn't fun (compared to a local
Linux XFS-driven soft 2-way mirror).

So data are coming in/going out currently via 1 Gbps aggregated NICs; for
the X4540 we plan to use one (and maybe experiment with two some time later)
10 Gbps NIC. So max. 2 GB/s read and write. This still leaves 2 GB/s in
and out for the last PCIe x8 slot - the F20. Since the IO55 is bound
with 4 GB/s bidirectional HT to Mezzanine Connector 1, in theory those
2 GB/s to and from the F20 should be possible.
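
(Back-of-the-envelope, assuming the slot runs PCIe 1.1: 8 lanes x 250 MB/s
= 2 GB/s per direction, and the 4 GB/s bidirectional HT link gives ~2 GB/s
each way - hence the 2 GB/s figure above.)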

So IMHO wrt. bandwidth it basically makes no real difference whether
one puts 4 SSDs into HDD slots or uses the 4 flash modules on the F20
(even when distributing the SSDs over the IO55(2) and MCP55).

However, having it on a separate HT link than the HDDs might be an
advantage. Also one would be much more flexible/able to scale immediately,
i.e. one doesn't need to re-organize the pools because of now-unavailable
slots and is still able to use all HDD slots with normal HDDs.
(We are certainly going to upgrade our X4500s to X4540s next year ...)
(And if Sun makes an F40 - dropping the SAS ports and putting 4 more
flash modules on it, or getting flash modules with double speed - one
could probably really get ~1.2 GB/s write and ~2 GB/s read.)

So, this seems to be a really interesting thing, and I expect a real
improvement at least wrt. user homes, no matter how the final
configuration will look.

Maybe the experts at the source are able to do some 4x SSD vs. 1x F20
benchmarks? I guess if the results turn out to be good enough, it
wouldn't hurt ;-)

  Jens Elkner wrote:
...
  whether it is possible/supported/would make sense to use a Sun Flash
  Accelerator F20 PCIe Card in an X4540 instead of 2.5" SSDs?

Regards,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768


Re: [zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-13 Thread Richard Elling


On Dec 13, 2009, at 5:04 PM, Jens Elkner wrote:


On Sat, Dec 12, 2009 at 04:23:21PM +, Andrey Kuzmin wrote:

As to whether it makes sense (as opposed to two distinct physical
devices), you would have read cache hits competing with log writes for
bandwidth. I doubt both will be pleased :-)


Hmm - good point. What I'm trying to accomplish:

Actually our current prototype thumper setup is:
root pool (1x 2-way mirror SATA)
hotspare  (2x SATA shared)
pool1 (12x 2-way mirror SATA)   ~25% used   user homes
pool2 (10x 2-way mirror SATA)   ~25% used   multimedia files, archives, ISOs

So pool2 is not really a problem - it delivers about 600 MB/s uncached,
about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8 GB ISO)
and is not continuously stressed. However, sync write is only ~200 MB/s,
or ~20 MB/s per mirror.

The problem is pool1 - user homes! GNOME/Firefox/Eclipse/Subversion/SOffice,
usually via NFS and a little bit via Samba - a lot of more or less small
files, probably widely spread over the platters. E.g. checking out a
project from an svn or similar repository into a home directory takes
hours. Also, having a workspace on NFS isn't fun (compared to a local
Linux XFS-driven soft 2-way mirror).


This is probably a latency problem, not a bandwidth problem. Use zilstat
to see how much ZIL traffic you have and, if the number is significant,
consider using the F20 for a separate log device.
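
A minimal sketch (zilstat is a DTrace-based script, so run it as root;
exact options may differ by version, check the script's usage):

    # sample ZIL traffic every 10 seconds, 6 samples
    ./zilstat.ksh 10 6

If the bytes-per-second column stays near zero, a slog won't buy you much;
if it is consistently high during those svn checkouts, it will.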
 -- richard



So data are coming in/going out currently via 1 Gbps aggregated NICs; for
the X4540 we plan to use one (and maybe experiment with two some time later)
10 Gbps NIC. So max. 2 GB/s read and write. This still leaves 2 GB/s in
and out for the last PCIe x8 slot - the F20. Since the IO55 is bound
with 4 GB/s bidirectional HT to Mezzanine Connector 1, in theory those
2 GB/s to and from the F20 should be possible.

So IMHO wrt. bandwidth it basically makes no real difference whether
one puts 4 SSDs into HDD slots or uses the 4 flash modules on the F20
(even when distributing the SSDs over the IO55(2) and MCP55).

However, having it on a separate HT link than the HDDs might be an
advantage. Also one would be much more flexible/able to scale immediately,
i.e. one doesn't need to re-organize the pools because of now-unavailable
slots and is still able to use all HDD slots with normal HDDs.
(We are certainly going to upgrade our X4500s to X4540s next year ...)
(And if Sun makes an F40 - dropping the SAS ports and putting 4 more
flash modules on it, or getting flash modules with double speed - one
could probably really get ~1.2 GB/s write and ~2 GB/s read.)

So, this seems to be a really interesting thing, and I expect a real
improvement at least wrt. user homes, no matter how the final
configuration will look.

Maybe the experts at the source are able to do some 4x SSD vs. 1x F20
benchmarks? I guess if the results turn out to be good enough, it
wouldn't hurt ;-)


Jens Elkner wrote:

...
whether it is possible/supported/would make sense to use a Sun Flash
Accelerator F20 PCIe Card in an X4540 instead of 2.5" SSDs?


Regards,
jel.
--
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768




Re: [zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-12 Thread Robert Milkowski

Jens Elkner wrote:

Hi,

just got a quote from our campus reseller that readzilla and logzilla
are not available for the X4540 - hmm, strange ... Anyway, I'm wondering
whether it is possible/supported/would make sense to use a Sun Flash
Accelerator F20 PCIe Card in an X4540 instead of 2.5" SSDs?


If so, is it possible to partition the F20, e.g. into 36 GB logzilla,
60 GB readzilla (also interesting for other X servers)?

  
IIRC the card presents 4x LUNs, so you could use each of them for a
different purpose.

You could also use different slices.

Wrt. super capacitors: I would guess, at least wrt. the X4540 it doesn't
give one more protection, since if power is lost, the HDDs do not respond
anymore and thus it doesn't matter whether the log cache is protected for
a short time or not. Is this correct?

  


It still does. The capacitor is not for flushing data to disk drives!
The card has a small amount of DRAM on it which is flushed to flash.
The capacitor is there to make sure that actually happens if the power
is lost.



Re: [zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-12 Thread Andrey Kuzmin
As to whether it makes sense (as opposed to two distinct physical
devices), you would have read cache hits competing with log writes for
bandwidth. I doubt both will be pleased :-)

On 12/12/09, Robert Milkowski mi...@task.gda.pl wrote:
 Jens Elkner wrote:
 Hi,

 just got a quote from our campus reseller that readzilla and logzilla
 are not available for the X4540 - hmm, strange ... Anyway, I'm wondering
 whether it is possible/supported/would make sense to use a Sun Flash
 Accelerator F20 PCIe Card in an X4540 instead of 2.5" SSDs?

 If so, is it possible to partition the F20, e.g. into 36 GB logzilla,
 60 GB readzilla (also interesting for other X servers)?


 IIRC the card presents 4x LUNs, so you could use each of them for a
 different purpose.
 You could also use different slices.
 Wrt. super capacitors: I would guess, at least wrt. the X4540 it doesn't
 give one more protection, since if power is lost, the HDDs do not respond
 anymore and thus it doesn't matter whether the log cache is protected for
 a short time or not. Is this correct?



 It still does. The capacitor is not for flushing data to disk drives!
 The card has a small amount of DRAM on it which is flushed to flash.
 The capacitor is there to make sure that actually happens if the power
 is lost.



-- 
Regards,
Andrey


Re: [zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-12 Thread Robert Milkowski

Andrey Kuzmin wrote:

As to whether it makes sense (as opposed to two distinct physical
devices), you would have read cache hits competing with log writes for
bandwidth. I doubt both will be pleased :-)
As usual, it depends on your workload. In many real-life scenarios the
bandwidth probably won't be an issue.
Then also keep in mind that you can put up to 4 SSD modules on it, and
each module IIRC is presented as a separate device anyway. So in order
to get all the performance you need to make sure to issue I/O to all
modules.
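
E.g. (made-up device names) a log striped across all four modules:

    zpool add pool1 log c3t0d0 c3t1d0 c3t2d0 c3t3d0

ZFS should then spread the log writes across all of them.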


--
Robert Milkowski
http://milek.blogspot.com



[zfs-discuss] X4540 + SFA F20 PCIe?

2009-12-11 Thread Jens Elkner
Hi,

just got a quote from our campus reseller that readzilla and logzilla
are not available for the X4540 - hmm, strange ... Anyway, I'm wondering
whether it is possible/supported/would make sense to use a Sun Flash
Accelerator F20 PCIe Card in an X4540 instead of 2.5" SSDs?

If so, is it possible to partition the F20, e.g. into 36 GB logzilla,
60 GB readzilla (also interesting for other X servers)?

Wrt. super capacitors: I would guess, at least wrt. the X4540 it doesn't
give one more protection, since if power is lost, the HDDs do not respond
anymore and thus it doesn't matter whether the log cache is protected for
a short time or not. Is this correct?

Regards,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768