OK, I've sorted the patch out. If you want, you can build with this manifest file:
https://gist.github.com/yruss972/d0ac973759485f43e433

I've also made a pull request against couchbase/sigar.
FYI,
Yonah

On Friday, July 4, 2014 2:07:56 PM UTC+3, Yonah Russ wrote:
>
> Hold off on that... I made a mess of the patch and my repo.
> I'll post back when I got it sorted.
> Sorry,
> Yonah
>
> On Friday, July 4, 2014 12:10:25 PM UTC+3, Yonah Russ wrote:
>>
>> Hi,
>>
>> I've found the source of the problem. Sigar isn't zone aware so all the
>> memory calculations are way off.
>> I confirmed it by patching sigar and recompiling.
>>
>> My patch wouldn't have worked on vanilla Solaris so I did some more
>> digging and found that Brendan Gregg actually tried to get this fixed in
>> upstream sigar in 2012.
>> His patch handles vanilla Solaris as well but I don't have a Solaris
>> machine to test on.
>> His patch also fixes something in the disk stats. I'm not sure how it
>> might affect couchbase.
>>
>> In any case, I'm rebuilding couchbase with an adapted version of his
>> patch.
>> If all compiles well, I'll let you all know and make a pull request to
>> couchbase/sigar.
>>
>> Here is the patched version in case someone can test on Solaris:
>> https://github.com/yruss972/sigar
>>
>> Thanks,
>> Yonah
>>
>> On Thursday, July 3, 2014 2:38:20 AM UTC+3, Aliaksey Kandratsenka wrote:
>>>
>>> On Wed, Jul 2, 2014 at 4:02 PM, Yonah Russ <[email protected]> wrote:
>>>
>>>> Well I took a look at the scripting option and assuming you meant
>>>> escript, it doesn't look like something I'll be picking up from
>>>> scratch in an hour or so ;)
>>>>
>>>
>>> No I referred to your favorite scripting language. Like perl, ruby, tcl
>>> or something else.
>>>
>>>>
>>>> I hacked port_sigar as you suggested to print the reply values to
>>>> stderr and ran it using the example.escript which is in the source
>>>> directory (I hope that was a reasonable thing to do).
>>>>
>>>> Here is the output from one of the couchbase servers:
>>>>
>>>> ./portsigar/example.escript
>>>> cpu_total_ms: 67285756633
>>>> cpu_idle_ms: 32306446002
>>>> swap_total: 8589934592
>>>> swap_used: 1717555200
>>>> swap_page_in: 5915867
>>>> swap_page_out: 87010188
>>>> mem_total: 4294967296
>>>> mem_used: 18446744073677623296
>>>> mem_actual_used: 18446744017909834968
>>>> mem_actual_free: 60094683944
>>>> escript: exception error: no match of right hand side value
>>>>
>>>> <<2,0,0,0,40,3,0,0,217,42,139,170,15,0,0,0,178,62,157,133,7,0,
>>>> 0,0,0,0,0,0,2,0,0,0,0,208,95,102,0,0,0,0,219,68,90,0,0,0,0,
>>>> 0,140,...>>
>>>>
>>>> Obviously the mem_used and mem_actual_used numbers are way off.
>>>> I dug into it deeper and it seems the calculations made by sigar are
>>>> wrong when running inside a zone.
>>>> I'll keep looking for a long term solution to that but the thing is
>>>> that these calculations in sigar haven't changed in 3 years so what
>>>> changed in 2.5.1 that caused the numbers in the interface to come out
>>>> screwy?
>>>>
>>>
>>> You can find out by using git log on sigar and sigar_port. Both projects
>>> are low "traffic" so you should be able to spot something that looks
>>> solaris-specific. Most likely it's due to some change in sigar but I could
>>> be wrong. I.e. maybe it's due to sigar_port asking for more stats.
>>>
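
A note on the two out-of-range values in the quoted output: both sit just below 2^64, which is what a negative result looks like when it is printed as an unsigned 64-bit integer. The Python sketch below reinterprets them as signed values and checks whether mem_actual_used equals mem_total minus mem_actual_free wrapped to 64 bits. The 64-bit field width and the subtraction are assumptions inferred from the numbers alone, not taken from the sigar source, and as_signed64 is a hypothetical helper written for this illustration.

```python
MASK64 = (1 << 64) - 1


def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    value &= MASK64
    return value - (1 << 64) if value >= (1 << 63) else value


# Values copied from the example.escript output quoted in this thread.
mem_total = 4294967296
mem_used = 18446744073677623296
mem_actual_used = 18446744017909834968
mem_actual_free = 60094683944

print(as_signed64(mem_used))         # -31928320
print(as_signed64(mem_actual_used))  # -55799716648

# Assumption: "used" is derived as total minus free. Wrapping that
# difference to 64 bits reproduces the reported mem_actual_used exactly.
wrapped = (mem_total - mem_actual_free) & MASK64
print(wrapped == mem_actual_used)    # True
```

If that reading is right, the 4 GiB mem_total is being combined with a free-memory figure of roughly 56 GiB, which presumably comes from the whole host rather than the zone. That would fit the diagnosis that sigar isn't zone aware.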
