Re: kswapd0 causing read timeouts

2012-06-18 Thread Holger Hoffstaette
On Mon, 18 Jun 2012 11:57:17 -0700, Gurpreet Singh wrote:

 Thanks for all the information Holger.
 
 Will do the JVM updates; kernel updates will be slow to come by. I see
 that with disk access mode standard, the performance is stable and better
 than in mmap mode, so I will probably stick to that.

Please let us know how things work out.

 Are you suggesting I try out MongoDB?

Uhm, no. :) I meant that MongoDB also uses mmap exclusively (!), and
consequently can also have pretty bad/irregular performance when the
(active) data set grows much larger than RAM. To be fair, that is a
pretty hard problem in general.

-h




Re: kswapd0 causing read timeouts

2012-06-14 Thread Gurpreet Singh
JNA is installed, swappiness was 0, vfs_cache_pressure was 100. Two questions
on this:
1. Is there a way to find out whether mlockall really worked, other than just
the mlockall successful log message?
2. Does Cassandra mlock only the JVM heap, or also the mmapped memory?
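
For reference, one way to check from outside the JVM, assuming a Linux /proc
filesystem (the pgrep pattern is a guess; adjust it to your process list):

    # A non-zero VmLck value means mlockall actually pinned memory.
    pid=$(pgrep -f CassandraDaemon | head -1)
    grep VmLck /proc/$pid/status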

I disabled mmap completely, and things look so much better.
Latency is, surprisingly, half of what I see when mmap is enabled.
It's funny that I keep reading tall claims about mmap, but in practice a lot
of people have problems with it, especially when it uses up all the memory. We
have tried mmap for different purposes in our company before, and finally
ended up disabling it, because it just doesn't handle things right when
memory is low. Maybe /proc/sys/vm needs to be configured right, but
that's not the easiest of configurations to get right.

Right now, I am handling only 80 GB of data. The kernel version is 2.6.26;
the Java version is 1.6.0_21.
/G

On Wed, Jun 13, 2012 at 8:42 PM, Al Tobey a...@ooyala.com wrote:
 [snip]

Re: kswapd0 causing read timeouts

2012-06-14 Thread ruslan usifov
Upgrade Java to the latest 1.6.0_32 (version 1.6.0_21 has memory leaks). It is
abnormal that with 80 GB of data you have 15 GB of index.

vfs_cache_pressure is used for inodes and dentries.

Also, to check whether you really have memory leaks, use the drop_caches sysctl.
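
For example, as root (a sketch; note this drops clean caches system-wide, so
expect a short performance dip afterwards):

    # Flush pagecache plus dentries/inodes, then watch whether RES comes down.
    sync
    echo 3 > /proc/sys/vm/drop_caches
    # Memory that stays resident afterwards is genuinely referenced
    # (heap, mlocked pages, a leak) rather than reclaimable cache.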





2012/6/14 Gurpreet Singh gurpreet.si...@gmail.com:
 [snip]

Re: kswapd0 causing read timeouts

2012-06-14 Thread ruslan usifov
2012/6/14 Gurpreet Singh gurpreet.si...@gmail.com:
 JNA is installed, swappiness was 0, vfs_cache_pressure was 100. Two questions
 on this:
 1. Is there a way to find out whether mlockall really worked, other than just
 the mlockall successful log message?
Yes, you should see something like this (from our test server):

 INFO [main] 2012-06-14 02:03:14,745 DatabaseDescriptor.java (line
233) Global memtable threshold is enabled at 512MB


 2. Does Cassandra mlock only the JVM heap, or also the mmapped memory?

Cassandra obviously mlocks only the heap; it does not mlock the mmapped SSTables.




Re: kswapd0 causing read timeouts

2012-06-14 Thread ruslan usifov
Sorry, I was mistaken; here is the right string:

 INFO [main] 2012-06-14 02:03:14,520 CLibrary.java (line 109) JNA
mlockall successful




2012/6/15 ruslan usifov ruslan.usi...@gmail.com:
 Yes, you should see something like this (from our test server):

  INFO [main] 2012-06-14 02:03:14,745 DatabaseDescriptor.java (line
 233) Global memtable threshold is enabled at 512MB
 [snip]

Re: kswapd0 causing read timeouts

2012-06-13 Thread Gurpreet Singh
Alright, here it goes again...
Even with mmap_index_only, once the RES memory hit 15 GB, the read
latency went berserk. This happens in 12 hours if disk_access_mode is mmap,
about 48 hours if it is mmap_index_only.

Only reads are happening, at 50 reads/second.
row cache size: 730 MB, row cache hit ratio: 0.75
key cache size: 400 MB, key cache hit ratio: 0.4
heap size (max 8 GB): used 6.1-6.9 GB

No messages about reducing cache sizes in the logs.

stats:
vmstat 1: no swapping here, however high sys CPU utilization
iostat (looks great): avgqu-sz = 8, avg await = 7 ms, svctm = 0.6 ms,
util = 15-30%
top: VIRT 19.8g, SHR 6.1g, RES 15g, high CPU, buffers 2 MB
cfstats: 70-100 ms. This number used to be 20-30 ms.

The value of SHR keeps increasing (owing to mmap, I guess), while at the
same time buffers keep decreasing: buffers starts as high as 50 MB and
goes down to 2 MB.
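
One way to see where the resident growth comes from, assuming a Linux /proc
filesystem (the pgrep pattern is a guess; adjust it to your process list):

    # Sum resident and shared (file-backed/mmapped) pages for the Cassandra JVM.
    pid=$(pgrep -f CassandraDaemon | head -1)
    awk '/^Rss:/ {rss+=$2} /^Shared_Clean:|^Shared_Dirty:/ {shr+=$2}
         END {printf "RSS %d kB, shared %d kB\n", rss, shr}' /proc/$pid/smaps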


This is very easily reproducible for me. Every time the RES memory hits about
15 GB, the client starts getting timeouts from Cassandra and the sys CPU
jumps a lot. All this even though my row cache hit ratio is almost 0.75.

Other than just turning off mmap completely, is there any other solution or
setting to avoid a Cassandra restart every couple of days? Something to keep
the RES memory from hitting such a high number. I have been constantly
monitoring the RES, and was not seeing issues when RES was at 14 GB.
/G

On Fri, Jun 8, 2012 at 10:02 PM, Gurpreet Singh gurpreet.si...@gmail.com wrote:
 [snip]

Re: kswapd0 causing read timeouts

2012-06-13 Thread ruslan usifov
Hm, it's very strange. What is the amount of your data? Your Linux kernel
version? Your Java version?

PS: In your case I can suggest switching disk_access_mode to standard.
PPS: Also upgrade your Linux to the latest, and Java HotSpot to 1.6.0_32
(from the Oracle site).

2012/6/13 Gurpreet Singh gurpreet.si...@gmail.com:
 [snip]

Re: kswapd0 causing read timeouts

2012-06-13 Thread Al Tobey
I would check /etc/sysctl.conf and get the values of
/proc/sys/vm/swappiness and /proc/sys/vm/vfs_cache_pressure.

If you don't have JNA enabled (which Cassandra uses to fadvise) and
swappiness is at its default of 60, the Linux kernel will happily swap out
your heap for cache space. Set swappiness to 1 or run 'swapoff -a', and kswapd
shouldn't be doing much unless you have a too-large heap or some other app
using up memory on the system.
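
For example (a sketch; the value 1 mirrors the suggestion above, and how you
persist it may vary by distro):

    # Inspect the current values.
    cat /proc/sys/vm/swappiness /proc/sys/vm/vfs_cache_pressure
    # Lower swappiness on the running system...
    sysctl -w vm.swappiness=1
    # ...and persist it across reboots.
    echo 'vm.swappiness = 1' >> /etc/sysctl.conf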

On Wed, Jun 13, 2012 at 11:30 AM, ruslan usifov ruslan.usi...@gmail.com wrote:
 [snip]

kswapd0 causing read timeouts

2012-06-08 Thread Gurpreet Singh
Hi,
I am testing Cassandra 1.1 on a 1-node cluster:
8 cores, 16 GB RAM, 6 data disks in RAID0, no swap configured

cassandra 1.1.1
heap size: 8 GB
key cache size in MB: 800 (used only 200 MB till now)
memtable_total_space_in_mb: 2048

I am running a read workload of about 30 reads/second, and no writes at all.
The system runs fine for roughly 12 hours.

jconsole shows that my heap usage has hardly touched 4 GB.
top shows:
  SHR increasing slowly from 100 MB to 6.6 GB over these 12 hrs
  RES increasing slowly from 6 GB all the way to 15 GB
  buffers at a healthy 25 MB at some point, going down to 2 MB over these
12 hrs
  VIRT staying at 85 GB

I understand that SHR goes up because of mmap, and RES goes up because it
includes the SHR value as well.

After around 10-12 hrs, the CPU utilization of the system starts
increasing, and I notice that the kswapd0 process becomes more active.
Gradually the system CPU gets high, almost 70%, and the client starts
getting continuous timeouts. The fact that the buffers went down from 20 MB
to 2 MB suggests that kswapd0 is probably reclaiming the pagecache.
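
One way to confirm the reclaim activity (a sketch; the exact counter names
vary across kernel versions):

    # kswapd's page scan/steal counters; sample twice and compare the deltas.
    grep -E 'pgscan_kswapd|pgsteal' /proc/vmstat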

Is there a way out of this, to stop kswapd0 from doing these things even
when there is no swap configured?
This is very easily reproducible for me, and I would like a way out of this
situation. Do I need to adjust VM memory management settings like pagecache,
vfs_cache_pressure, things like that?

Just some extra information: JNA is installed, mlockall is successful,
and there is no compaction running.
I would appreciate any help on this.
Thanks
Gurpreet


Re: kswapd0 causing read timeouts

2012-06-08 Thread ruslan usifov
disk_access_mode: mmap??

Set disk_access_mode: mmap_index_only in cassandra.yaml.

2012/6/8 Gurpreet Singh gurpreet.si...@gmail.com:
 [snip]



Re: kswapd0 causing read timeouts

2012-06-08 Thread Gurpreet Singh
Thanks Ruslan.
I will try mmap_index_only.
Is there any guideline as to when to leave it at auto and when to use
mmap_index_only?

/G

On Fri, Jun 8, 2012 at 1:21 AM, ruslan usifov ruslan.usi...@gmail.com wrote:
 [snip]


Re: kswapd0 causing read timeouts

2012-06-08 Thread ruslan usifov
2012/6/8 aaron morton aa...@thelastpickle.com:
 Ruslan,
 Why did you suggest changing the disk_access_mode ?

Because it brings problems out of nowhere. In any case, for me mmap
brought a similar problem, and I haven't found any solution to resolve
it other than changing disk_access_mode :-(( I would also be interested
to hear the results from the author of this thread.


 Gurpreet,
 I would leave the disk_access_mode with the default until you have a reason
 to change it.

  8 core, 16 gb ram, 6 data disks raid0, no swap configured

 is swap disabled ?

 Gradually,
  the system cpu becomes high almost 70%, and the client starts getting
  continuous timeouts

 70% of one core or 70% of all cores ?
 Check the server logs, is there GC activity ?
 check nodetool cfstats to see the read latency for the cf.

 Take a look at vmstat to see if you are swapping, and look at iostats to see
 if io is the problem
 http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html

 Cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

On 8/06/2012, at 9:00 PM, Gurpreet Singh wrote:
 [snip]




Re: kswapd0 causing read timeouts

2012-06-08 Thread Gurpreet Singh
Aaron, Ruslan,
I changed the disk access mode to mmap_index_only, and it has been stable
ever since, at least for the past 20 hours. Previously, in about 10-12
hours, as soon as resident memory was full, the client would start
timing out on all its reads. It looks fine for now; I am going to let it
continue, to see how long it lasts and whether the problem comes back.

Aaron,
Yes, I had turned swap off.

The total CPU utilization was at roughly 700%. It looked like kswapd0 was
using just 1 CPU, but the Cassandra (jsvc) CPU utilization increased quite a
bit. top was reporting high system CPU and low user CPU.
vmstat was not showing swapping. The Java heap max is 8 GB, while only 4
GB was in use, so the Java heap was doing great; no GC in the logs. iostat
was doing OK from what I remember; I will have to reproduce the issue for
the exact numbers.

cfstats latency had gone very high, but that is partly due to the high CPU
usage.

One thing was clear: the SHR was inching higher (due to the mmap) while the
buffer cache, which started at about 20-25 MB, was reduced to 2 MB by the end,
which probably means that the pagecache was being evicted by kswapd0. Is
there a way to fix the size of the buffer cache and not let the system evict
it in favour of mmap?

Also, mmapping data files would basically cause not only the data asked
for to be read into main memory, but also a bunch of extra pages
(readahead), which would not be very useful, right? The same thing for the
index would actually be more useful, as there would be more index entries
in the readahead part, and the index files, being small, wouldn't cause
the memory pressure that evicts the page cache. mmapping the data files
would make sense if the data size, or the hot data set, is smaller than
the RAM; otherwise just the index would probably be a better thing to
mmap, no? In my case the data size is 85 GB, while available RAM is
16 GB (only 8 GB after the heap).
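
If the readahead really is pulling in useless pages, it can also be tuned down
per block device; a sketch, assuming the RAID0 array appears as /dev/md0 (a
hypothetical device name):

    # Show the current readahead window (in 512-byte sectors)...
    blockdev --getra /dev/md0
    # ...and shrink it so random reads fault in fewer untouched pages.
    blockdev --setra 64 /dev/md0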

/G


On Fri, Jun 8, 2012 at 11:44 AM, aaron morton aa...@thelastpickle.com wrote:
 [snip]