Re: [zfs-discuss] ZFS monitoring

2013-02-12 Thread Borja Marcos

On Feb 12, 2013, at 11:25 AM, Pawel Jakub Dawidek wrote:

> I made kstat data available on FreeBSD via 'kstat' sysctl tree:

Yes, I am using the data. I wasn't sure how to get something meaningful out of
it, but I've found the arcstats.pl script and I am using it as a model.
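
As a rough illustration of what the collector does with it, here is a minimal
sketch, assuming the usual kstat.zfs.misc.arcstats.* sysctl names on FreeBSD
(a real collector works on per-interval deltas rather than the lifetime
counters):

    #!/bin/sh
    # Read a few raw ARC counters from the kstat sysctl tree.
    hits=$(sysctl -n kstat.zfs.misc.arcstats.hits)
    misses=$(sysctl -n kstat.zfs.misc.arcstats.misses)
    size=$(sysctl -n kstat.zfs.misc.arcstats.size)
    # Lifetime hit ratio; a collector would normally graph per-interval deltas.
    echo "arc_size_bytes    $size"
    echo "arc_hit_ratio_pct $(echo "scale=2; 100 * $hits / ($hits + $misses)" | bc)"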

Suggestions are always welcome, though :)

(The sample pages I put on devilator.frobula.com aren't using the better-organized
graphs yet, though; it's just a crude parameter dump.)





Borja.



[zfs-discuss] ZFS monitoring

2013-02-11 Thread Borja Marcos

Hello,

I'm updating Devilator, the performance data collector for Orca and FreeBSD, to
include ZFS monitoring. So far I am graphing the ARC and L2ARC sizes, L2ARC
writes and reads, and several hit/miss data pairs.
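
(For reference, the L2ARC figures come from counters along these lines; the
sysctl names below are the ones I believe FreeBSD exposes, and the byte
counters are cumulative, so they get graphed as per-interval deltas:)

    # L2ARC size and traffic counters (assumed OIDs under kstat.zfs.misc.arcstats)
    sysctl -n kstat.zfs.misc.arcstats.l2_size
    sysctl -n kstat.zfs.misc.arcstats.l2_hits
    sysctl -n kstat.zfs.misc.arcstats.l2_misses
    sysctl -n kstat.zfs.misc.arcstats.l2_read_bytes
    sysctl -n kstat.zfs.misc.arcstats.l2_write_bytes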

Any suggestions to improve it? What other variables would be interesting?

An example of the current state of the program is here:

http://devilator.frobula.com

Thanks,





Borja.



Re: [zfs-discuss] ZFS monitoring

2013-02-11 Thread Borja Marcos

On Feb 11, 2013, at 4:56 PM, Tim Cook wrote:

> The zpool iostat output has all sorts of statistics I think would be
> useful/interesting to record over time.


Yes, thanks :) I think I will add them; I just started with the esoteric ones.

Anyway, there's still no better way to read them than running zpool iostat and
parsing the output, right?
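
(For the record, by "parsing the output" I mean something as crude as the
sketch below; the pool name is a placeholder, and the bandwidth columns come
back with unit suffixes that a real collector would have to normalize:)

    # Sample a hypothetical pool "tank" for one second and keep the last line;
    # the first data line is the since-boot average, the second is the interval sample.
    zpool iostat tank 1 2 | tail -1 | \
        awk '{ print "read_ops=" $4, "write_ops=" $5, "read_bw=" $6, "write_bw=" $7 }'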





Borja.



[zfs-discuss] Puzzling problem with zfs receive exit status

2012-03-29 Thread Borja Marcos

Hello,

I hope someone has an idea. 

I have a replication program that copies a dataset from one server to another
one. The replication mechanism is the obvious one, of course:

    zfs send -Ri snapshot(n-1) snapshot(n) > file

then scp the file to the remote machine (I do it this way instead of using a
pipeline so that a network error won't interrupt a receive data stream), and on
the remote machine:

    zfs receive -Fd pool < file
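
Put together, the script boils down to roughly this sketch (the snapshot names,
stream file and remote host are placeholders, not the real ones; the exit
status checked at the end is the one giving me trouble):

    #!/bin/sh
    # $PREV and $CURR are the previous and current snapshots, $REMOTE the target host.
    zfs send -Ri "$PREV" "$CURR" > /tmp/repl.zfs || exit 1
    scp /tmp/repl.zfs "$REMOTE":/tmp/repl.zfs || exit 1
    ssh "$REMOTE" 'zfs receive -Fd pool < /tmp/repl.zfs'
    echo "zfs receive exit status: $?"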

It's been working perfectly for months, no issues. However, yesterday we began 
to see something weird: the zfs receive being executed on the remote machine is 
exiting with an exit status of 1, even though the replication is finished, and 
I see the copied snapshots on the remote machine. 

Any ideas? It's really puzzling. It seems that the replication is working (a 
zfs list -t snapshot shows the new snapshots correctly applied to the dataset) 
but I'm afraid there's some kind of corruption.

The OS is Solaris, SunOS 5.10 Generic_141445-09 i86pc i386 i86pc.

Any ideas?



Thanks in advance,





Borja.




Re: [zfs-discuss] Puzzling problem with zfs receive exit status

2012-03-29 Thread Borja Marcos

On Mar 29, 2012, at 11:59 AM, Ian Collins wrote:

> Does zfs receive produce any warnings? Have you tried adding -v?

Thank you very much, Ian and Carsten. Well, adding -v gave me a clue. Turns out
that one of the old snapshots had a clone.

zfs receive -v was complaining that it couldn't destroy an old snapshot, which
wasn't visible but had been cloned (and forgotten) long ago. A truss of the zfs
receive process showed it accessing the clone.
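
(For anyone who hits the same thing, something like the line below should list
any clones and the snapshot each one hangs off; "pool" stands in for the real
pool name:)

    # Datasets whose origin property is set are clones of that snapshot.
    zfs list -H -r -o name,origin pool | awk '$2 != "-"'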

So, zfs receive was doing its job and the new snapshot was applied correctly,
but it was exiting with an exit value of 1 without printing any warnings, which
I think is wrong.

I've destroyed the clone and everything has gone back to normal. Now zfs
receive exits with 0.

Still, I'm not sure whether this is a bug; the snapshot was cloned in November
2011 and had been sitting around for a long time. The pool had less than 20%
free space two days ago, so maybe that triggered something.

Anyway, as I said, with the clone removed everything has gone back to normal.


Thank you very much,






Borja.



Re: [zfs-discuss] Puzzling problem with zfs receive exit status

2012-03-29 Thread Borja Marcos

On Mar 29, 2012, at 5:11 PM, Richard Elling wrote:

>> Thank you very much, Ian and Carsten. Well, adding -v gave me a clue. Turns
>> out that one of the old snapshots had a clone.
>>
>> zfs receive -v was complaining that it couldn't destroy an old snapshot,
>> which wasn't visible but had been cloned (and forgotten) long ago. A truss
>> of the zfs receive process showed it accessing the clone.
>>
>> So, zfs receive was doing its job and the new snapshot was applied
>> correctly, but it was exiting with an exit value of 1 without printing any
>> warnings, which I think is wrong.
>
> You are correct. Both zfs and zpool have a bad case of "exit 1 if something
> isn't right". At Nexenta, I filed a bug against the ambiguity of the return
> code. You should consider filing a similar bug with Oracle. In the
> open-source ZFS implementations there is some other work to get out of the
> way before properly tackling this, but that work is in progress :-)

I understand that either a warning or, at least, a syslog message with
LOG_WARNING is in order.
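
In the meantime I can at least make the failure visible on my side, along these
lines (a rough sketch; the stream file name is made up):

    # Apply the stream and push a warning to syslog if zfs receive exits non-zero.
    zfs receive -Fd pool < /tmp/repl.zfs
    status=$?
    if [ "$status" -ne 0 ]; then
        logger -p daemon.warning "zfs receive on pool exited with status $status"
    fi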

Regarding the open source camp, yes, I'm using ZFS on FreeBSD as well :) 






Borja.
