Re: [zfs-discuss] getting decent NFS performance

2009-12-23 Thread Marion Hakanson
erik.trim...@sun.com said:
> The suggestion was to make the SSD on each machine an iSCSI volume, and add
> the two volumes as a mirrored ZIL into the zpool.

I've mentioned the following before:

For a poor person's slog that gives decent NFS performance, we have had
good results with allocating a slice on (e.g.) an X4150's internal disk,
behind the internal Adaptec RAID controller.  Said controller has only
256MB of NVRAM, but it made a big difference in NFS performance (look
for the "tar unpack" results at the bottom of the page):

http://acc.ohsu.edu/~hakansom/j4400_bench.html
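
In case it helps, a minimal sketch of the setup, with the pool and slice
names made up: carve a small slice on the controller-cached internal
disk with format(1M), then attach it as a separate log device:

  # use a slice of the NVRAM-backed internal disk as the slog
  zpool add mypool log c0t0d0s4

Synchronous NFS writes then land on the slice, where the controller's
256MB of NVRAM absorbs them.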

You can always replace them when funding for your Zeus SSDs comes in (:-).

Regards,

-- 
Marion Hakanson 
OHSU Advanced Computing Center




Re: [zfs-discuss] getting decent NFS performance

2009-12-23 Thread Erik Trimble

Andrey Kuzmin wrote:
> And how do you expect the mirrored iSCSI volume to work after
> failover, with the secondary (ex-primary) unreachable?
>
> Regards,
> Andrey
As a normal degraded mirror.  No problem.

The suggestion was to make the SSD on each machine an iSCSI volume, and 
add the two volumes as a mirrored ZIL into the zpool.



It's a (relatively) simple and ingenious suggestion. 
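
For concreteness, a minimal sketch of the pool-side step, with
hypothetical device names.  Once each head sees its own SSD locally and
the peer's SSD as an iSCSI LUN:

  # add both SSD-backed devices as a mirrored slog
  zpool add tank log mirror c2t1d0 c3t1d0

If the peer becomes unreachable, the log vdev simply runs degraded on
the surviving half, as described above.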


-Erik



> On Wed, Dec 23, 2009 at 9:40 AM, Erik Trimble wrote:
>> Charles Hedrick wrote:
>>> Is iSCSI reliable enough for this?
>>
>> YES.
>>
>> The original idea is a good one, and one that I'd not thought of.  The (old)
>> iSCSI implementation is quite mature, if nowhere near as nice
>> (feature/flexibility-wise) as the new COMSTAR stuff.
>>
>> I'm thinking that just putting in a straight-through cable between the two
>> machines is the best idea here, rather than going through a switch.
>>
>> --
>> Erik Trimble
>> Java System Support
>> Mailstop:  usca22-123
>> Phone:  x17195
>> Santa Clara, CA
>> Timezone: US/Pacific (GMT-0800)





--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)



Re: [zfs-discuss] getting decent NFS performance

2009-12-23 Thread Andrey Kuzmin
And how do you expect the mirrored iSCSI volume to work after
failover, with the secondary (ex-primary) unreachable?

Regards,
Andrey




On Wed, Dec 23, 2009 at 9:40 AM, Erik Trimble wrote:
> Charles Hedrick wrote:
>>
>> Is iSCSI reliable enough for this?
>>
>
> YES.
>
> The original idea is a good one, and one that I'd not thought of.  The (old)
> iSCSI implementation is quite mature, if nowhere near as nice
> (feature/flexibility-wise) as the new COMSTAR stuff.
>
> I'm thinking that just putting in a straight-through cable between the two
> machines is the best idea here, rather than going through a switch.
>
> --
> Erik Trimble
> Java System Support
> Mailstop:  usca22-123
> Phone:  x17195
> Santa Clara, CA
> Timezone: US/Pacific (GMT-0800)


Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Erik Trimble

Charles Hedrick wrote:
> Is iSCSI reliable enough for this?

YES.

The original idea is a good one, and one that I'd not thought of.  The 
(old) iSCSI implementation is quite mature, if nowhere near as nice 
(feature/flexibility-wise) as the new COMSTAR stuff.


I'm thinking that just putting in a straight-through cable between the 
two machines is the best idea here, rather than going through a switch.
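
For reference, a rough sketch of the plumbing with the old (pre-COMSTAR)
target, using made-up names and addresses.  On each head, export a zvol
carved from the local SSD, then point the peer's initiator at it over
the crossover link:

  # on each head: carve a zvol on the local SSD and export it
  zfs create -V 16g ssdpool/slog
  zfs set shareiscsi=on ssdpool/slog

  # on the peer: discover the target and create device nodes
  iscsiadm add discovery-address 192.168.10.1
  iscsiadm modify discovery --sendtargets enable
  devfsadm -i iscsi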


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)



Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Charles Hedrick
Is iSCSI reliable enough for this?


Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Ross Walker
On Dec 22, 2009, at 9:08 PM, Bob Friesenhahn wrote:



> On Tue, 22 Dec 2009, Ross Walker wrote:
>>> I think zil_disable may actually make sense.
>>
>> How about a ZIL composed of two mirrored iSCSI vdevs, formed from an
>> SSD on each box?
>
> I would not have believed that this is a useful idea except that I
> have seen "IOPS offload" to a server on the network work extremely
> well.  Latencies on gigabit ethernet are pretty small these days.


Yes, GbE only adds about 100us to the latency, and when using a raw SSD
as the backing store it should be a lot better than what the OP is doing
now (and he can use one of the less costly models).
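
A quick way to sanity-check that figure on a given link (address
hypothetical; ICMP round-trip time isn't iSCSI latency, but it gives
the right ballpark):

  # 10 probes of 56 data bytes each, with per-packet timings
  ping -s 192.168.10.1 56 10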


-Ross
 


Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Bob Friesenhahn

On Tue, 22 Dec 2009, Ross Walker wrote:
>> I think zil_disable may actually make sense.
>
> How about a ZIL composed of two mirrored iSCSI vdevs, formed from an
> SSD on each box?


I would not have believed that this is a useful idea except that I 
have seen "IOPS offload" to a server on the network work extremely 
well.  Latencies on gigabit ethernet are pretty small these days.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Ross Walker
On Dec 22, 2009, at 8:58 PM, Richard Elling wrote:



> On Dec 22, 2009, at 5:40 PM, Charles Hedrick wrote:
>> It turns out that our storage is currently being used for
>>
>> * backups of various kinds, run daily by cron jobs
>> * saving old log files from our production application
>> * saving old versions of Java files from our production application
>>
>> Most of the usage is write-only, and a fair amount of it involves
>> copying huge directories. There's no actual current user data.
>>
>> I think zil_disable may actually make sense.
>
> Except that with the ZIL disabled, you break the trust that the data was
> written. Kinda defeats the prime objective, no?


But no pre-warp civilizations were influenced!

Oh, you said objective...

-Ross



Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Richard Elling

On Dec 22, 2009, at 5:40 PM, Charles Hedrick wrote:
> It turns out that our storage is currently being used for
>
> * backups of various kinds, run daily by cron jobs
> * saving old log files from our production application
> * saving old versions of Java files from our production application
>
> Most of the usage is write-only, and a fair amount of it involves
> copying huge directories. There's no actual current user data.
>
> I think zil_disable may actually make sense.


Except that with the ZIL disabled, you break the trust that the data was
written. Kinda defeats the prime objective, no?
 -- richard



Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Ross Walker
On Dec 22, 2009, at 8:40 PM, Charles Hedrick wrote:



> It turns out that our storage is currently being used for
>
> * backups of various kinds, run daily by cron jobs
> * saving old log files from our production application
> * saving old versions of Java files from our production application
>
> Most of the usage is write-only, and a fair amount of it involves
> copying huge directories. There's no actual current user data.
>
> I think zil_disable may actually make sense.


How about a ZIL composed of two mirrored iSCSI vdevs, formed from an
SSD on each box?


An idea.

-Ross
 


Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Charles Hedrick
It turns out that our storage is currently being used for

* backups of various kinds, run daily by cron jobs
* saving old log files from our production application
* saving old versions of Java files from our production application

Most of the usage is write-only, and a fair amount of it involves copying huge 
directories. There's no actual current user data.

I think zil_disable may actually make sense.
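
For the record, the knob in question on Solaris 10 of this vintage is a
global tunable, so it affects every pool on the host:

  # /etc/system (takes effect at the next boot)
  set zfs:zil_disable = 1

  # or flip it on the live kernel with mdb
  echo zil_disable/W0t1 | mdb -kw

Mind Richard's caveat above: with the ZIL off, an NFS client can be told
its data is safe before it actually is.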


Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Charles Hedrick
Thanks. That's what I was looking for.

Yikes! I hadn't realized how expensive the Zeus is.

We're using Solaris Cluster, so if the system goes down, the other one takes 
over. That means that if the ZIL is on a local disk, we lose it in a crash. 
Might as well just set zil_disable (something I'm considering doing anyway).


Re: [zfs-discuss] getting decent NFS performance

2009-12-22 Thread Erik Trimble

Charles Hedrick wrote:
> We have a server using Solaris 10. It's a pair of systems with a shared
> J4200, with Solaris Cluster. It works very nicely. Solaris Cluster
> switches over transparently.
>
> However, as an NFS server it is dog-slow. This is the usual
> synchronous-write problem. Setting zil_disable fixes the problem;
> otherwise it can take more than an hour to copy files that take me
> 2 min with our NetApp.
>
> The obvious solution is to use a flash disk for the ZIL. However, I'm
> clueless about what hardware to use. Can anyone suggest either a flash
> drive that will work in the J4200 (SATA), or some way to connect a
> drive to two machines so that Solaris Cluster will work? Sun used to
> claim that they were going to support a flash drive in the J4200. Now
> that statement seems to have disappeared, and their SATA flash drive
> seems to be vapor, despite appearing real on the Sun web site. (I tried
> to order one.)


I don't see a specific option from Sun anywhere for an SSD in the J4xxx 
arrays.

HOWEVER, the Sun Storage 7210 is based on the J4400.  You can try 
ordering option XTA7210-LOGZ18GB.

This is the 18GB Zeus-based SSD with the 3.5" bracket, for use in the 
J4xxx chassis that comes with the 7210.


I have no idea if this would be supported, but it certainly should work 
just fine.



Alternatively, you should be able to get an 18 or 32GB SSD for EACH 
machine you are using as a cluster head, assuming they're reasonably 
late-model machines (i.e., using 2.5" drives, and first offered within 
the last 2 years).  This IS supported.  However, doing it this way 
does run the risk of data loss during a failover - no data corruption, 
mind you, but since the data on the SSD in the failed machine is 
inaccessible, it won't be available to the failover machine.
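
A sketch of that per-head variant, device name hypothetical:

  # on each cluster head, using its own local 2.5" SSD
  zpool add tank log c1t4d0

After a failover, the importing head cannot see the failed head's local
log device, which is exactly the window for the data loss described
above.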


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)



[zfs-discuss] getting decent NFS performance

2009-12-22 Thread Charles Hedrick
We have a server using Solaris 10. It's a pair of systems with a shared J4200, 
with Solaris Cluster. It works very nicely. Solaris Cluster switches over 
transparently.

However, as an NFS server it is dog-slow. This is the usual synchronous-write 
problem. Setting zil_disable fixes the problem; otherwise it can take more than 
an hour to copy files that take me 2 min with our NetApp.

The obvious solution is to use a flash disk for the ZIL. However, I'm clueless 
about what hardware to use. Can anyone suggest either a flash drive that will 
work in the J4200 (SATA), or some way to connect a drive to two machines so 
that Solaris Cluster will work? Sun used to claim that they were going to 
support a flash drive in the J4200. Now that statement seems to have 
disappeared, and their SATA flash drive seems to be vapor, despite appearing 
real on the Sun web site. (I tried to order one.)