On 07/21/14 08:58, Sebastian Martinez wrote:
> os Red Hat Enterprise Linux Server release 6.5 (Santiago)
> kernel Linux nbsf000p5ap06.nbsf.com.ar 2.6.32-431.17.1.el6.ppc64 #1 SMP Fri
> Apr 11 17:30:35 EDT 2014 ppc64 ppc64 ppc64 GNU/Linux

This is a very old kernel. I'm not sure about the PPC architecture,
but you may want to try running JFFNMS on something other than PPC,
with a kernel tuned specifically for JFFNMS, since this is a standalone
system dedicated to that job.



> the disk is a lun on dell storage vnx 5500
>
>   sdparm /dev/sdh
>      /dev/sdh: DGC       VRAID             0532
> Read write error recovery mode page:
>    AWRE        1  [cha: n, def:  1, sav:  1]
>    ARRE        1  [cha: n, def:  1, sav:  1]
>    PER         1  [cha: n, def:  1, sav:  1]
> Caching (SBC) mode page:
>    WCE         0  [cha: y, def:  0, sav:  0]
>    RCD         0  [cha: y, def:  1, sav:  1]
> Control mode page:
>    SWP         0  [cha: n, def:  0, sav:  0]
>
> the system is dedicated to jffnms

Might an SSD speed things along?  Certainly new, faster RAM will.

> top - 10:57:31 up 41 days, 23:14,  1 user,  load average: 0.52, 0.36, 0.30
> Tasks: 206 total,   2 running, 204 sleeping,   0 stopped,   0 zombie
> Cpu(s):  1.5%us,  0.5%sy,  0.0%ni, 97.8%id,  0.1%wa,  0.1%hi,  0.0%si,
>   0.0%st
> Mem:   2052736k total,  2032832k used,    19904k free,   145664k buffers
> Swap:  3145600k total,   155264k used,  2990336k free,   988992k cached
>
>    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>   6734 mysql     20   0 2070m  33m 9664 S  3.0  1.7 632:36.61 mysqld
> 23838 apache    20   0  102m  33m  16m R  2.0  1.7   0:22.14 httpd
> 15752 jffcom    20   0  175m  46m  27m S  0.7  2.3   0:01.13 php
>   2330 root      20   0  5632 3840 2432 R  0.3  0.2   0:00.01 top
>   7403 jffcom    20   0  175m  46m  27m S  0.3  2.3   0:04.79 php
> 15859 jffcom    20   0  175m  46m  27m S  0.3  2.3   0:01.35 php
> 15860 jffcom    20   0  175m  47m  27m S  0.3  2.3   0:02.20 php
> 19365 jffcom    20   0  175m  47m  27m S  0.3  2.4   0:07.26 php
> 21821 jffcom    20   0  182m  41m  27m S  0.3  2.1   2:19.43 php
>
>
>
> 2014-07-21 11:51 GMT-03:00 wireless <wirel...@tampabay.rr.com
> <mailto:wirel...@tampabay.rr.com>>:
>
>     On 07/21/14 08:21, Sebastian Martinez wrote:
>      > another example
>      >
>      > this inteface fails randomly but when i run the poller manualy
>     works ok
>      >
>      > 10:01:50 CH:12 (13134):  :  H 255 :  I 1482 :  P   1 :
>      > reachability_start:ping(): 53cd0f3d7ed74 -> buffer(): 1 (time P:
>     3.36 | B:
>      > 0.04)
>      > 10:01:58 CH:12 (13134):  :  H 255 :  I 1482 :  P 123 :
>      > reachability_values:rtt(rtt): 0 -> buffer(): 2 (time P: 0.11 | B:
>     0.01)
>      > 10:01:58 CH:12 (13134):  :  H 255 :  I 1482 :  P 124 :
>      > reachability_values:packetloss(pl): 50 -> buffer(): 3 (time P:
>     0.04 | B:
>      > 0.01)
>      > 10:01:58 CH:12 (13134):  :  H 255 :  I 1482 :  P 125 : status():
>      > unreachable|100% Packet Loss -> alarm(40): Event Added: 1221844
>     (time P:  0.01 | B: 4.02)
>      >

Can transient communication failures (packet loss) cause these problems?
The question is what the root cause of those dropped packets is
(wireshark is your friend when searching for network bottlenecks).
If the cause of the dropped packets is on the local machine, you have
greater troubles than jffnms.....
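
Before reaching for wireshark, it may help to pull every packet-loss
report out of the poller debug output so you know when to look. What
follows is only a rough sketch: it assumes each log entry sits on one
unwrapped line shaped like the excerpt quoted above, and the names
LINE_RE and packet_loss_events are made up for this illustration, not
part of jffnms.

```python
import re

# Assumed line shape (taken from the quoted excerpt, unwrapped):
# "10:01:58 CH:12 (13134):  :  H 255 :  I 1482 :  P 124 : "
# "reachability_values:packetloss(pl): 50 -> buffer(): 3 (time P: 0.04 | B: 0.01)"
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2})\s+CH:\d+\s+\(\d+\):.*?"
    r"I\s+(?P<interface>\d+)\s*:.*?"
    r"packetloss\(pl\):\s*(?P<loss>\d+)"
)


def packet_loss_events(lines, threshold=1):
    """Return (time, interface_id, loss_percent) for lines reporting loss >= threshold."""
    events = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and int(match.group("loss")) >= threshold:
            events.append(
                (match.group("time"),
                 int(match.group("interface")),
                 int(match.group("loss")))
            )
    return events


# Two entries from the quoted log, joined back onto single lines.
sample = [
    "10:01:50 CH:12 (13134):  :  H 255 :  I 1482 :  P   1 : "
    "reachability_start:ping(): 53cd0f3d7ed74 -> buffer(): 1 (time P: 3.36 | B: 0.04)",
    "10:01:58 CH:12 (13134):  :  H 255 :  I 1482 :  P 124 : "
    "reachability_values:packetloss(pl): 50 -> buffer(): 3 (time P: 0.04 | B: 0.01)",
]

print(packet_loss_events(sample))
```

The resulting timestamps tell you which capture windows are worth
opening in wireshark.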

>      >   php -q poller.php -i 1482 -o -F

Suffice it to say, PHP might not be the most robust of languages
to use for this poller. popen is a fascinating idea that should be
explored in Craig's latest thread.


>     What is the Operating system you are running on ?  (version of OS?)

Red Hat Enterprise Linux Server 6.5.

>     If linux, what version of the kernel are you using?

2.6.32-431.17.1.el6.ppc64

>     What is your hardrive)s) specs?  (sdparm /dev/sda ?   man sdparm)

An SSD would help considerably.
>
>     Have you checked (htop, iotop etc) your system resources while the
>     errors occur?
>     What other major software do you have running on your system?

Delete as much as possible.

>     Do these problems correlate to other heavy load events on your system?
>     Do these problems occur when your system is 'lightly loaded' ?

These ideas need to be correlated with the timestamps of your jffnms faults.
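
One way to do that correlation, sketched loosely: record the load
average at regular intervals (from cron, for instance), then look up
the sample nearest each fault time. The function names, the 60 second
window and the load numbers below are all hypothetical; only the
10:01:58 fault time comes from the quoted log.

```python
from datetime import datetime, timedelta


def parse_hms(text):
    """Parse an HH:MM:SS string into a datetime (all times assumed to be the same day)."""
    return datetime.strptime(text, "%H:%M:%S")


def correlate(fault_times, load_samples, window_seconds=60):
    """Map each fault time to the nearest load sample inside the window, or None.

    load_samples is a list of (HH:MM:SS, load_average) pairs.
    """
    window = timedelta(seconds=window_seconds)
    parsed = [(parse_hms(stamp), load) for stamp, load in load_samples]
    result = {}
    for fault in fault_times:
        fault_time = parse_hms(fault)
        # Keep only samples close enough to the fault, ordered by distance.
        candidates = [
            (abs(fault_time - sample_time), load)
            for sample_time, load in parsed
            if abs(fault_time - sample_time) <= window
        ]
        result[fault] = min(candidates)[1] if candidates else None
    return result


# Hypothetical load samples; the first fault time is from the quoted poller log.
faults = ["10:01:58", "11:15:00"]
samples = [("10:01:30", 0.52), ("10:02:10", 4.10), ("10:30:00", 0.30)]

print(correlate(faults, samples))
```

A fault that lands next to a high load sample points at the box; a
fault with nothing unusual nearby points back at the network.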


What file systems are you using on your linux machine? Hopefully at
least ext4, if not something more aggressive?
I suspect the old kernel and the bloatware of the commercial linux
distro are the culprits. Build up a Gentoo box on 64-bit (amd/intel)
hardware with ample RAM and an SSD, and put only what you need on the
machine. I'll bet your problems cannot be reproduced there (I bet they
go away; otherwise other folks would be complaining [loudly]). PPC is a
problematic linux build, for a variety of reasons. x86_64 is your best
bet for a maintainable platform, imho.
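
For the file system question, the kernel already lists the type of
every mount in /proc/mounts (device, mount point, type, options, ...).
A minimal sketch that reads that format follows; the function name and
the sample mount table are invented for illustration and are not taken
from your system.

```python
def filesystem_types(mounts_text):
    """Map mount point -> file system type from /proc/mounts-style text."""
    types = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line is: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 3:
            _device, mount_point, fs_type = fields[:3]
            types[mount_point] = fs_type
    return types


# Hypothetical mount table, used here only to show the format.
sample_mounts = """/dev/mapper/vg0-root / ext4 rw,relatime 0 0
/dev/sdh1 /var/lib/mysql ext3 rw,noatime 0 0
proc /proc proc rw 0 0
"""

for mount_point, fs_type in sorted(filesystem_types(sample_mounts).items()):
    print(mount_point, fs_type)
```

On the real machine, pass in the contents of /proc/mounts instead of
the sample text.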

This one is up to you and Craig....

Take all of this with a grain of salt, as your constraints may not be 
changeable.


hth,
James



_______________________________________________
jffnms-users mailing list
jffnms-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jffnms-users
