Re: Dreadful gmirror performance, though each half works fine

2008-04-22 Thread Andrew Snow

Pete French wrote:

I did some benchmarking, and load gives me a bit better performance than
round-robin so I've elected to use that. Haven't tried prefer as
syncing all the drives backwards and forwards to get the preferences set
seems a bit too much like hard work!


I use this patch for sbin/geom/class/mirror/geom_mirror.c

Change:
  md.md_priority = i - 1;
To:
  md.md_priority = i - 1 + 100;

This gives the first disk a priority of 100 instead of 0, which
makes it much easier to use prefer properly.
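
To illustrate why the offset helps, here is a minimal C sketch. It is not the
gmirror source: it only assumes, as the one-line change above implies, that
the label code numbers components from i = 1 and stores the priority in an
unsigned 8-bit field. The names PRIORITY_BASE, stock_priority,
patched_priority and room_below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the offset added by the one-line patch. */
#define PRIORITY_BASE 100

/* Stock behaviour: component i (counting from 1) gets priority i - 1. */
uint8_t stock_priority(int i)
{
    assert(i >= 1);
    return (uint8_t)(i - 1);
}

/* Patched behaviour: the same numbering, shifted up by PRIORITY_BASE. */
uint8_t patched_priority(int i)
{
    assert(i >= 1);
    return (uint8_t)(i - 1 + PRIORITY_BASE);
}

/* How many distinct priority values lie strictly below the given one. */
int room_below(uint8_t priority)
{
    return (int)priority;
}
```

With the stock numbering the first disk sits at 0, so room_below() is 0 and
nothing can be ranked under it. With the offset there are 100 lower values
left for a disk inserted later.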



It's frustrating, it is *so* close to being workable with iscsi, and the
performance is very good, but if it is going to keep locking up on
me then I just can't use it that way :-(


After failing many times with iSCSI, I use geom_gate with the following 
ggatec options:   -t 30 -q 32768 -R 262144 -S 262144 -o rw


It seems to be very reliable and fast, but you have to use prefer to
get good performance: that way I only write across the network and never
read from it. Load and round-robin lead to slow reads during periods of
heavy writes.
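
A rough C sketch of what the prefer policy amounts to, assuming (as I
understand the gmirror man page) that it reads from the component with the
highest priority. The struct and function names here are hypothetical and
are not taken from the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one mirror component. */
struct component {
    const char *name;
    uint8_t priority;
};

/*
 * Return the index of the component that reads should go to:
 * the one with the highest priority, first one wins on a tie.
 * Returns -1 for an empty mirror.
 */
int prefer_pick(const struct component *c, size_t n)
{
    assert(c != NULL || n == 0);
    if (n == 0)
        return -1;

    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (c[i].priority > c[best].priority)
            best = i;
    }
    return (int)best;
}
```

With a local disk at priority 100 and a ggate disk at 50, every read lands on
the local disk and only writes cross the network.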


Hope that helps,

- Andrew
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Dreadful gmirror performance, though each half works fine

2008-04-22 Thread Zaphod Beeblebrox
On Tue, Apr 22, 2008 at 2:02 AM, Andrew Snow [EMAIL PROTECTED] wrote:

 Pete French wrote:

  I did some benchmarking, and load gives me a bit better performance than
  round-robin so I've elected to use that. Haven't tried prefer as
  syncing all the drives backwards and forwards to get the preferences set
  seems a bit too much like hard work!
 

 I use this patch for sbin/geom/class/mirror/geom_mirror.c

 Change:
  md.md_priority = i - 1;
 To:
  md.md_priority = i - 1 + 100;


I hate to ask for the right solution, but shouldn't we be patching the
gmirror userland to accept a priority argument to label and make the kernel
part listen to that?  This patch does make sense --- but it doesn't go far
enough.

Also, it seems sensible that you should be able to modify the priority
values of a running disk.


Re: Dreadful gmirror performance, though each half works fine

2008-04-22 Thread Andrew Snow

Zaphod Beeblebrox wrote:

I use this patch for sbin/geom/class/mirror/geom_mirror.c

Change:
 md.md_priority = i - 1;
To:
 md.md_priority = i - 1 + 100;


I hate to ask for the right solution, but shouldn't we be patching
the gmirror userland to accept a priority argument to label and make the 
kernel part listen to that?  This patch does make sense --- but it 
doesn't go far enough.


Also, it seems sensible that you should be able to modify the priority 
values of a running disk.


Both of those are good ideas.  But for years no one has bothered to
make a patch.  At least my patch is only one line and solves 90% of
the problem, and still no one has bothered to commit it.


Maybe we should apply my patch for now, until someone works on the rest.

- Andrew


Re: Dreadful gmirror performance, though each half works fine

2008-04-21 Thread Pete French
 I would suppose you might have to sync the mirror and then break off and
 forget the local copy and then sync again.  In our case, I'm not sure --- it
 was a while ago, but a number of them are also in the 'load' state --- as the
 higher latency network drives would normally show a higher load.

I did some benchmarking, and load gives me a bit better performance than
round-robin so I've elected to use that. Haven't tried prefer as
syncing all the drives backwards and forwards to get the preferences set
seems a bit too much like hard work!

When I started writing this post I was going to say that the iscsi initiator
patch has fixed all my problems, and that it has run beautifully for the
entire weekend with no lockups. But as I started typing this I set a large
copy from the remote drive going in another window, and this now appears
to have locked up :-(

It's frustrating, it is *so* close to being workable with iscsi, and the
performance is very good, but if it is going to keep locking up on
me then I just can't use it that way :-(

Thanks for all the advice though, very useful.

-pete.


Re: Dreadful gmirror performance, though each half works fine

2008-04-21 Thread Ivan Voras
Pete French wrote:

 It's frustrating, it is *so* close to being workable with iscsi, and the
 performance is very good, but if it is going to keep locking up on
 me then I just can't use it that way :-(

You should complain about it :) Try to get a backtrace of the situation
on the server (enable kernel debugging, enable keyboard hotkey to kernel
debugger; see
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html),
then start a new, properly named thread with the information.





Re: Dreadful gmirror performance, though each half works fine

2008-04-21 Thread Pete French
 You should complain about it :) Try to get a backtrace of the situation
 on the server (enable kernel debugging, enable keyboard hotkey to kernel
 debugger; see
 http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html),
 then start a new, properly named thread with the information.

I know, and I will if/when I get a chance - right now I just need to get
this mirror working unfortunately. I should be able to re-create all this
on my desk on a non-live database in the next couple of weeks though,
and will try and duplicate it then. In the meantime it's back to ggate :)

-pete.


Re: Dreadful gmirror performance, though each half works fine

2008-04-18 Thread Ivan Voras
Pete French wrote:
 I have experimented with this rather extensively and have operationally
 decided not to use ggated in combination with gmirror --- it doesn't appear
 to work as well as one might expect.
 
 Ah, that's unfortunate :-( I originally started off using the
 iscsi initiator and target, which did work O.K., but when actually
 run live ended up locking up after several hours, and then panicking

Some problems with the iSCSI initiator were found a bit after
7.0-RELEASE was made; they should have been fixed by now in 7-STABLE. If
not, try the patch that appears in this thread:

http://lists.freebsd.org/pipermail/freebsd-scsi/2008-February/003383.html





Re: Dreadful gmirror performance, though each half works fine

2008-04-18 Thread Pete French
 Some problems with the iSCSI initiator were found a bit after
 7.0-RELEASE was made; they should have been fixed by now in 7-STABLE. If
 not, try the patch that appears in this thread:

 http://lists.freebsd.org/pipermail/freebsd-scsi/2008-February/003383.html

Ah, excellent, thank you. This patch is not yet in -STABLE though, but I
will give it a try and see if that helps. It does explain why I didn't
find this in testing - all my test boxes are single core, so don't
show up problems like this (am rectifying that soon).

-pete.


Re: Dreadful gmirror performance, though each half works fine

2008-04-18 Thread Zaphod Beeblebrox
On Thu, Apr 17, 2008 at 4:01 PM, Pete French [EMAIL PROTECTED]
wrote:

  In the end we found that ggate was crashy after a week or two of heavy use,
  too... despite its performance problems (which can be somewhat fixed by
  telling gmirror to only read from the local disk)

 That last part interests me - how did you manage to make it do that?
 I read the man page, and the 'prefer' balancing algorithm should
 let you tell it which disc to read from - but there is no way to
 change the priority on a disc in a mirror that I can see. It can only
 be set when inserting new drives. The default is '0' and hence it's
 not possible to attach a new drive with a priority below that of
 the existing local drive. I tried using '-1' as a priority to fix
 this, but it came up as 255.


I would suppose you might have to sync the mirror and then break off and
forget the local copy and then sync again.  In our case, I'm not sure --- it
was a while ago, but a number of them are also in the 'load' state --- as the
higher latency network drives would normally show a higher load.


  Certainly ZFS needs lots of memory --- most of my systems running ZFS have
  4G of RAM and are running in 64 bit mode.  With the wiki's recommendation of
  a large number of kernel pages, I haven't had a problem with crashing.  I am
  using ZFS RAIDZ as a large data store and ZFS mirroring (separately) on my
  workstation as /usr, /var, and home directories.

 All our machines are 64 bit with between 4 and 16 gig of RAM too, so I could
 try that. So you trust it then? I'd be interested to know exactly which
 options from the wiki page you ended up using for both kernel pages and ZFS
 itself. That would be my ideal solution if it is stable enough.


Hmm.  Trust is a funny thing.  First the options.  On my notebook:

vm.kmem_size_max=1073741824
vm.kmem_size=1073741824

On the 32 bit version, I have options KVA_PAGES=512, but the same
loader.conf settings above.
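
For readers who want the sizes spelled out, here is a small C sketch of the
arithmetic. It assumes a non-PAE i386 kernel, where each KVA_PAGES unit
covers 4 MB of kernel address space; that figure is my assumption and is not
stated in this thread. KVA_UNIT_BYTES and kva_bytes are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one KVA_PAGES unit covers 4 MB on non-PAE i386. */
#define KVA_UNIT_BYTES (4ULL * 1024 * 1024)

/* Kernel virtual address space, in bytes, for a given KVA_PAGES value. */
uint64_t kva_bytes(uint64_t kva_pages)
{
    /* Guard against overflow of the multiplication below. */
    assert(kva_pages <= UINT64_MAX / KVA_UNIT_BYTES);
    return kva_pages * KVA_UNIT_BYTES;
}
```

Under that assumption kva_bytes(512) comes to 2 GiB, so the 1 GiB
(1073741824 byte) vm.kmem_size settings above would fit inside it.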

My large fileserver is used as SMB and NFS filestore for large datasets
generally ending in .avi or .iso.  Not terribly stressed, I don't think.  My
laptop is used as a workstation.

I don't think I'd run mysql or postgresql on zfs yet --- or if I did, I
might run solaris.  It's newer there.  It would make me nervous at any
rate.  My laptop is a core-2-duo Extreme 7900 with 4 gig of ram.  It will
run make -j8 world quite quickly on zfs --- and even quicker the 2nd
time.  I've been running /usr, /var and home directories on zfs for several
months now and I haven't had any problems.

That said, I take regular (daily) snapshots and I zfs send the snapshots
to the other zfs array for backup.  This is a big advantage to zfs ---
snapshots and snapshot backups are fast (and still checksum protected).

Now as to trust:  ZFS is copy-on-write.  From my reading of problems people
have, it seems that new data might be corrupted in some manner (there are
posts even today about zfs and bittorrent) but the snapshots should not be
affected.  I hedge my bets further by keeping backups.


  efficient.  Removing the read load from the ggated drive seems to help quite
  a bit in overall performance.  But even with this change, I still found that
  ggate would crash after several days to a week of heavy use.

 Well, I upped the networking buffers and queue sizes to what I would
 normally consider 'stupid' values, and now it seems to have settled down
 and is performing well (am using the 'load' balancing algorithm). Shall
 see if it stays that way for the next few weeks given what you have just
 said. I should probably try ZFS on it too, just for my own curiosity.


'load' should almost always prefer the local drive.  Large buffers are
required to compensate for network latency.  Sounds about normal.
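
As a rough illustration of why latency drives buffer size, here is a C sketch
of the usual bandwidth-delay calculation. The link speed and round-trip time
in the note below are made-up example figures, not measurements from anyone
in this thread, and bdp_bytes is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes that must be in flight to keep a link busy:
 * bandwidth (bits per second) times round-trip time (microseconds).
 */
uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_usec)
{
    uint64_t bytes_per_sec = bits_per_sec / 8;

    /* Guard against overflow of the multiplication below. */
    assert(bytes_per_sec == 0 || rtt_usec <= UINT64_MAX / bytes_per_sec);
    return bytes_per_sec * rtt_usec / 1000000;
}
```

For an assumed 1 Gbit/s link with a 2 ms round trip this gives 250000 bytes,
which is just under the 262144-byte -R and -S values quoted elsewhere in this
thread.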


Re: Dreadful gmirror performance, though each half works fine

2008-04-17 Thread Pete French
 I have experimented with this rather extensively and have operationally
 decided not to use ggated in combination with gmirror --- it doesn't appear
 to work as well as one might expect.

Ah, that's unfortunate :-( I originally started off using the
iscsi initiator and target, which did work O.K., but when actually
run live ended up locking up after several hours, and then panicking
the kernel. So not ideal - but when it was working it was fine. ggated
seems the opposite - doesn't crash, but performance is not suitable for any
kind of real use.

 I'm somewhat vaguely wondering if zfs with one local and one ggated disk
 will work well.

I tried ZFS for a while myself, and it works O.K., but has a tendency
to panic if it wants memory which it can't get. Despite the many different
guides available, I never managed to get it to the point where I would
be happy to use it on a production system without worrying about it
suddenly becoming memory hungry and dying.

Thanks for the input though - I am doing some more experimentation
with ggate (basically raising some buffers as per a thread I found) and
seeing if that helps.

BTW, I think ggate is the problem and not gmirror here - gmirror on top
of iscsi works fine as I said.
-pete.


Re: Dreadful gmirror performance, though each half works fine

2008-04-17 Thread Zaphod Beeblebrox
On Thu, Apr 17, 2008 at 7:18 AM, Pete French [EMAIL PROTECTED]
wrote:

 I am trying to run a system with a pair of drives mirrored under
 gmirror, one of them being local and the other remote using ggated.


I have experimented with this rather extensively and have operationally
decided not to use ggated in combination with gmirror --- it doesn't appear
to work as well as one might expect.

Some improvements were made to ggated a while ago --- to improve error
recovery.  It should have helped, but didn't.

I'm somewhat vaguely wondering if zfs with one local and one ggated disk
will work well.


Re: Dreadful gmirror performance, though each half works fine

2008-04-17 Thread Zaphod Beeblebrox
On Thu, Apr 17, 2008 at 2:25 PM, Pete French [EMAIL PROTECTED]
wrote:

  I have experimented with this rather extensively and have operationally
  decided not to use ggated in combination with gmirror --- it doesn't appear
  to work as well as one might expect.

 Ah, that's unfortunate :-( I originally started off using the
 iscsi initiator and target, which did work O.K., but when actually
 run live ended up locking up after several hours, and then panicking
 the kernel. So not ideal - but when it was working it was fine. ggated
 seems the opposite - doesn't crash, but performance is not suitable for any
 kind of real use.


In the end we found that ggate was crashy after a week or two of heavy use,
too... despite its performance problems (which can be somewhat fixed by
telling gmirror to only read from the local disk)

 I'm somewhat vaguely wondering if zfs with one local and one ggated disk
  will work well.

 I tried ZFS for a while myself, and it works O.K., but has a tendency
 to panic if it wants memory which it can't get. Despite the many different
 guides available, I never managed to get it to the point where I would
 be happy to use it on a production system without worrying about it
 suddenly becoming memory hungry and dying.


Certainly ZFS needs lots of memory --- most of my systems running ZFS have
4G of RAM and are running in 64 bit mode.  With the wiki's recommendation of
a large number of kernel pages, I haven't had a problem with crashing.  I am
using ZFS RAIDZ as a large data store and ZFS mirroring (separately) on my
workstation as /usr, /var, and home directories.


 Thanks for the input though - I am doing some more experimentation
 with ggate (basically raising some buffers as per a thread I found) and
 seeing if that helps.

 BTW, I think ggate is the problem and not gmirror here - gmirror on top
 of iscsi works fine as I said.


I would agree... save the fact that it may be an interaction between the two
and/or UFS that is causing the problems.  Certainly gmirror on local disks
works fine (I've run gmirror/gstripe combinations for several years now as
RAID 10 store with UFS on top).  This is all going to be latency sensitive
--- ggate needs to allow a larger number of outstanding transactions to be
efficient.  Removing the read load from the ggated drive seems to help quite
a bit in overall performance.  But even with this change, I still found that
ggate would crash after several days to a week of heavy use.


Re: Dreadful gmirror performance, though each half works fine

2008-04-17 Thread Pete French
 In the end we found that ggate was crashy after a week or two of heavy use,
 too... despite its performance problems (which can be somewhat fixed by
 telling gmirror to only read from the local disk)

That last part interests me - how did you manage to make it do that?
I read the man page, and the 'prefer' balancing algorithm should
let you tell it which disc to read from - but there is no way to
change the priority on a disc in a mirror that I can see. It can only
be set when inserting new drives. The default is '0' and hence it's
not possible to attach a new drive with a priority below that of
the existing local drive. I tried using '-1' as a priority to fix
this, but it came up as 255.
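
The 255 is what you would expect if the priority ends up in an unsigned 8-bit
field. That is an assumption on my part, and the C sketch below only
demonstrates the wrap-around itself; store_priority is a hypothetical helper,
not a gmirror function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a priority argument and store it the way an unsigned
 * 8-bit metadata field would: the value is reduced modulo 256.
 */
uint8_t store_priority(const char *arg)
{
    assert(arg != NULL);
    long value = strtol(arg, NULL, 10);
    return (uint8_t)value;
}
```

Here store_priority("-1") yields 255, so a negative value cannot be used to
rank a new disk below one that already sits at 0.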

 Certainly ZFS needs lots of memory --- most of my systems running ZFS have
 4G of RAM and are running in 64 bit mode.  With the wiki's recommendation of
 a large number of kernel pages, I haven't had a problem with crashing.  I am
 using ZFS RAIDZ as a large data store and ZFS mirroring (separately) on my
 workstation as /usr, /var, and home directories.

All our machines are 64 bit with between 4 and 16 gig of RAM too, so I could
try that. So you trust it then? I'd be interested to know exactly which
options from the wiki page you ended up using for both kernel pages and ZFS
itself. That would be my ideal solution if it is stable enough.

 efficient.  Removing the read load from the ggated drive seems to help quite
 a bit in overall performance.  But even with this change, I still found that
 ggate would crash after several days to a week of heavy use.

Well, I upped the networking buffers and queue sizes to what I would
normally consider 'stupid' values, and now it seems to have settled down
and is performing well (am using the 'load' balancing algorithm). Shall
see if it stays that way for the next few weeks given what you have just
said. I should probably try ZFS on it too, just for my own curiosity.

cheers,

-pete.