[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-20 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827682#comment-13827682
 ] 

Michael Shuler commented on CASSANDRA-6275:
---

I will test this out this morning!

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: graham sanderson
 Fix For: 2.0.3

 Attachments: 6275.txt, c_file-descriptors_strace.tbz, 
 cassandra_jstack.txt, leak.log, position_hints.tgz, slog.gz


 Looks like C* is leaking file descriptors when doing lots of CAS operations.
 {noformat}
 $ sudo cat /proc/15455/limits
 Limit                     Soft Limit           Hard Limit           Units
 Max cpu time              unlimited            unlimited            seconds
 Max file size             unlimited            unlimited            bytes
 Max data size             unlimited            unlimited            bytes
 Max stack size            10485760             unlimited            bytes
 Max core file size        0                    0                    bytes
 Max resident set          unlimited            unlimited            bytes
 Max processes             1024                 unlimited            processes
 Max open files            4096                 4096                 files
 Max locked memory         unlimited            unlimited            bytes
 Max address space         unlimited            unlimited            bytes
 Max file locks            unlimited            unlimited            locks
 Max pending signals       14633                14633                signals
 Max msgqueue size         819200               819200               bytes
 Max nice priority         0                    0
 Max realtime priority     0                    0
 Max realtime timeout      unlimited            unlimited            us
 {noformat}
 Looks like the problem is not in limits.
 Before load test:
 {noformat}
 cassandra-test0 ~]$ lsof -n | grep java | wc -l
 166
 cassandra-test1 ~]$ lsof -n | grep java | wc -l
 164
 cassandra-test2 ~]$ lsof -n | grep java | wc -l
 180
 {noformat}
 After load test:
 {noformat}
 cassandra-test0 ~]$ lsof -n | grep java | wc -l
 967
 cassandra-test1 ~]$ lsof -n | grep java | wc -l
 1766
 cassandra-test2 ~]$ lsof -n | grep java | wc -l
 2578
 {noformat}
 Most opened files have names like:
 {noformat}
 java  16890 cassandra 1636r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1637r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1638r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1639r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1640r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1641r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1642r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1643r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1644r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1645r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1646r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1647r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1648r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1649r  REG  

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-20 Thread J. Ryan Earl (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827705#comment-13827705
 ] 

J. Ryan Earl commented on CASSANDRA-6275:
-

So we've been running this all night and have written a few hundred GB of data 
with some products we're developing, all the while OpsCenter 4.0.0 was doing 
TTL'd rollups and whatnot.  The deleted file count remained at 1 the entire 
time, never increasing, and the total file count remained below 1000.

FYI, the single deleted-but-still-open file looks like a temporary file 
randomly generated on Cassandra startup that gets deleted but not closed for 
the process's duration, for example:
{noformat}
java    1925 cassandra   44u   REG  253,4  4096 13 /tmp/ffi441Hpl (deleted)
{noformat}


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-20 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827723#comment-13827723
 ] 

Brandon Williams commented on CASSANDRA-6275:
-

That's probably JNA, or snappy.


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-20 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827793#comment-13827793
 ] 

Sylvain Lebresne commented on CASSANDRA-6275:
-

That patch lgtm, +1.


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-20 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827802#comment-13827802
 ] 

Michael Shuler commented on CASSANDRA-6275:
---

Patch works for me, too :)


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826531#comment-13826531
 ] 

Jonathan Ellis commented on CASSANDRA-6275:
---

[~mshuler] Can you reproduce in 1.2?  How about 2.0.0?

If in the latter but not the former it's going to be a bitch to bisect but I 
don't have any better ideas.

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread Capn Crunch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826633#comment-13826633
 ] 

Capn Crunch commented on CASSANDRA-6275:


Environment: CentOS 6.4 ~ Kernel 3.10 (elrepo) ~ Cassandra 2.0.2 ~ OpsCenter 4.0.

Noted that OpsCenter greatly exacerbates the issue, as the roll-up CFs cause 
open files to grow at an incredible pace. Turning OpsCenter off causes the rate 
to slow and/or stop. After stopping OpsCenter, the OpsCenter open files on all 
C* nodes are never released, although the open files don't continue to grow.

There are tens of thousands of these open:

/data/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-15491-Data.db (deleted)
/data/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-15491-Data.db (deleted)
/data/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-15491-Data.db (deleted)
/data/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-15491-Data.db (deleted)

A restart of Cassandra while keeping OpsCenter off will keep file descriptors 
within a comfortable range. 
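
For anyone wanting to track the "(deleted)" entries described above, here is a 
minimal sketch that tallies deleted-but-still-open paths from lsof output. It 
is not part of Cassandra or OpsCenter; the class name DeletedFileCount is made 
up for illustration, and it assumes each lsof entry is on one line, ends with 
"(deleted)", and that paths contain no spaces.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Cassandra.
class DeletedFileCount {
    private static final String MARKER = "(deleted)";

    /** Counts lines ending in "(deleted)", keyed by the path just before the marker. */
    static Map<String, Integer> countDeleted(List<String> lsofLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lsofLines) {
            String trimmed = line.trim();
            if (!trimmed.endsWith(MARKER)) {
                continue; // file is still linked on disk, not of interest here
            }
            String rest = trimmed.substring(0, trimmed.length() - MARKER.length()).trim();
            if (rest.isEmpty()) {
                continue; // marker with no path in front of it
            }
            // Assumption: the path is the last whitespace-separated field.
            int lastSpace = rest.lastIndexOf(' ');
            String path = lastSpace < 0 ? rest : rest.substring(lastSpace + 1);
            counts.merge(path, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Reads lsof output from standard input and prints "<count> <path>" per file.
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            List<String> lines = in.lines().collect(Collectors.toList());
            countDeleted(lines).forEach((path, n) -> System.out.println(n + " " + path));
        }
    }
}
```

On Java 11 or later this can be run as a single source file, with the 
Cassandra pid substituted for the placeholder: 
lsof -n -p <pid> | java DeletedFileCount.java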



 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Michael Shuler
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826912#comment-13826912
 ] 

graham sanderson commented on CASSANDRA-6275:
-

Yes, I believe we can mitigate the problem in the OpsCenter case; however, it 
is a good test bed since it makes the problem easy to spot. Note that it seems 
to be worse under high read/write activity on tracked keyspaces/CFs, which 
makes sense.

Note I was poking (somewhat blindly) through the (2.0.2) code (partly out of 
interest) looking for what might be leaking these file handles, and I also 
found a heap dump. I discovered what turned out to be 
https://issues.apache.org/jira/browse/CASSANDRA-6358, which leaks 
FileDescriptors, though their refCounts all seemed to be 0. In any case there 
weren't enough (total FileDescriptors in a heap dump) to account for the 
problem. They were also for mem-mapped files (the ifile in SSTableReader), and 
none of the leaked deleted file handles were mem-mapped (since they were 
compressed data files).

That said, CASSANDRA-6358 was pinning the SSTableReaders in memory (since the 
Runnable was an anonymous inner class), so someone with more knowledge of the 
code might have a better idea whether this might be a problem (other than the 
memory leak).

I don't have an environment yet where I can easily build and install code 
changes, though we could downgrade our system test environment to 2.0.0 to see 
if we can reproduce the problem there. I'm unsure whether we can downgrade to 
1.2.X easily given our current testing.

Note while I was looking at the code I came across CASSANDRA-... What 
caught my eye was the interaction between FileCacheService and RAR.deallocate, 
and more specifically the fact that this change added a concurrent structure 
inside another separate concurrent structure. It seemed like there might be a 
case where a RAR was recycled into a concurrent queue which was already 
completely removed and deallocated, in which case it would get GCed without 
close, presumably causing a file handle leak on the native side. However, I 
couldn't come up with any convincing interactions that would cause this to 
happen without some very, very unlucky things happening (and my knowledge of 
the Google cache implementation was even more limited!), so this is unlikely 
to be the cause of this issue (especially if the issue doesn't happen in the 
1.2.7+ branch).
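
To make the suspected interleaving concrete, here is a minimal sketch of a 
reader being recycled into a queue that was already removed from the outer 
cache. It is an illustration of the general pattern only, not Cassandra's 
FileCacheService or RandomAccessReader code: the Reader class, the cache map, 
and the evict method are hypothetical stand-ins, and the two "threads" are 
simulated by running their steps in a fixed order on one thread.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration; none of these names come from the Cassandra source.
class StaleQueueLeakSketch {
    /** Stand-in for a pooled reader that owns a file handle. */
    static final class Reader {
        volatile boolean closed;
        void close() { closed = true; }
    }

    /** Outer concurrent map holding one inner concurrent queue per file path. */
    static final ConcurrentMap<String, Queue<Reader>> cache = new ConcurrentHashMap<>();

    /** Eviction: drop the queue from the map and close what it holds right now. */
    static void evict(String path) {
        Queue<Reader> queue = cache.remove(path);
        if (queue == null) {
            return;
        }
        for (Reader r; (r = queue.poll()) != null; ) {
            r.close();
        }
    }

    /** Replays the unlucky ordering; returns true if the reader ends up unclosed and orphaned. */
    static boolean demonstrateLeak() {
        String path = "example-Data.db";
        cache.put(path, new ConcurrentLinkedQueue<>());
        Reader reader = new Reader();

        // "Thread A" (recycling) looks up the queue for this path...
        Queue<Reader> seenByRecycler = cache.get(path);
        // ...then "thread B" (eviction) removes and drains that same queue...
        evict(path);
        // ...and thread A offers the reader to the now-orphaned queue.
        seenByRecycler.offer(reader);

        // Nothing reachable from the cache will ever close this reader.
        return !reader.closed && !cache.containsKey(path);
    }

    public static void main(String[] args) {
        System.out.println("leaked: " + demonstrateLeak());
    }
}
```

Each individual operation here is thread-safe; the gap is between the lookup 
and the offer, which is why nesting one concurrent structure inside another 
does not by itself make the combined operation atomic.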


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826938#comment-13826938
 ] 

Mikhail Stepura commented on CASSANDRA-6275:


bq. What caught my eye was the interaction between FileCacheService and 
RAR.deallocate, but more specifically related to the fact that this change, 
added a concurrent structure inside another separate concurrent structure, and 
it seemed like there might be a case where a RAR was recycled into a concurrent 
queue that was already removed and drained, in which case it would get GCed 
without close, presumably causing a file handle leak on the native side

I might be wrong, but what you are saying correlates with observations from 
CASSANDRA-6283. [~Andie78] experimented and 
bq. found out that a finalizer fixes the problem. So after GC the files will 
be deleted (not optimal, but working fine). It has now run for 2 days continuously 
without problems. Possible fix/test: I wrote the following finalizer at the end 
of class org.apache.cassandra.io.util.RandomAccessReader: { deallocate(); 
super.finalize(); } }
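
For readers unfamiliar with the idea, the finalizer workaround quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: GuardedReader is a hypothetical stand-in, not the real RandomAccessReader, and the method bodies are guesses at the general shape rather than the code from CASSANDRA-6283.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical stand-in for a reader that owns a native file handle. */
class GuardedReader implements Closeable {
    private final RandomAccessFile file;
    private boolean deallocated = false;

    GuardedReader(String path) throws IOException {
        this.file = new RandomAccessFile(path, "r");
    }

    /** Releases the file handle; safe to call more than once. */
    synchronized void deallocate() throws IOException {
        if (deallocated) return;
        deallocated = true;
        file.close();
    }

    synchronized boolean isDeallocated() { return deallocated; }

    @Override
    public void close() throws IOException { deallocate(); }

    // Safety net only: runs at some unspecified time after the object
    // becomes unreachable, so handles can stay open until a GC happens.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try { deallocate(); } finally { super.finalize(); }
    }
}
```

A finalizer like this masks a missing close() rather than fixing it, which matches the "not optimal, but working fine" remark in the quote.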

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Michael Shuler
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz


 Looks like C* is leaking file descriptors when doing lots of CAS operations.
 {noformat}
 $ sudo cat /proc/15455/limits
 Limit                     Soft Limit   Hard Limit   Units
 Max cpu time              unlimited    unlimited    seconds
 Max file size             unlimited    unlimited    bytes
 Max data size             unlimited    unlimited    bytes
 Max stack size            10485760     unlimited    bytes
 Max core file size        0            0            bytes
 Max resident set          unlimited    unlimited    bytes
 Max processes             1024         unlimited    processes
 Max open files            4096         4096         files
 Max locked memory         unlimited    unlimited    bytes
 Max address space         unlimited    unlimited    bytes
 Max file locks            unlimited    unlimited    locks
 Max pending signals       14633        14633        signals
 Max msgqueue size         819200       819200       bytes
 Max nice priority         0            0
 Max realtime priority     0            0
 Max realtime timeout      unlimited    unlimited    us
 {noformat}
 Looks like the problem is not in limits.
 Before load test:
 {noformat}
 cassandra-test0 ~]$ lsof -n | grep java | wc -l
 166
 cassandra-test1 ~]$ lsof -n | grep java | wc -l
 164
 cassandra-test2 ~]$ lsof -n | grep java | wc -l
 180
 {noformat}
 After load test:
 {noformat}
 cassandra-test0 ~]$ lsof -n | grep java | wc -l
 967
 cassandra-test1 ~]$ lsof -n | grep java | wc -l
 1766
 cassandra-test2 ~]$ lsof -n | grep java | wc -l
 2578
 {noformat}
 Most opened files have names like:
 {noformat}
 java  16890 cassandra 1636r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1637r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1638r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1639r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1640r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1641r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1642r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1643r  REG 202,17 161158485 
 655420 

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826955#comment-13826955
 ] 

Mikhail Stepura commented on CASSANDRA-6275:


I wonder what would happen with {{file_cache_size_in_mb: 0 }}. RAR should be 
explicitly deallocated then.

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827120#comment-13827120
 ] 

graham sanderson commented on CASSANDRA-6275:
-

Trying that now (one node with that setting)

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread J. Ryan Earl (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827162#comment-13827162
 ] 

J. Ryan Earl commented on CASSANDRA-6275:
-

[~mishail] We (Graham Sanderson and I work together) added 
'file_cache_size_in_mb: 0' to cassandra.yaml on one of the nodes, and restarted 
that node plus another with the default (unspecified) file_cache_size_in_mb 
setting to run an A/B test.  Both nodes still leak file handles; however, the 
node with the default setting leaks much faster (about 3-4x the leak rate).

CASSANDRA-6283 appears to be an exact duplicate of this problem: Windows and 
Linux JVMs appear to exhibit the exact same file handle leak behavior.

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827173#comment-13827173
 ] 

graham sanderson commented on CASSANDRA-6275:
-

Note that this would tend to imply that I was wrong (at least about the 
particular code path), and the change in leak rate may be attributable to lower 
throughput without the file cache. The leak rate does seem closely related 
to how hard we are hitting the server, as mentioned before, so a threading bug 
elsewhere might be the cause.

Nominally, buffer in RAR should be volatile, but any code path through 
close where buffer's latest value is stale would end up calling deallocate 
anyway (at least in the case that file_cache_size_in_mb is off; I didn't think 
through the other case).

So the finalizer fix - which we can try to build here to test out 
(unless someone has it pre-built) - seems to imply that it is just someone 
failing to call close() under load conditions.

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827305#comment-13827305
 ] 

graham sanderson commented on CASSANDRA-6275:
-

We were able to confirm the finalizer fix stopped the leak

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827308#comment-13827308
 ] 

graham sanderson commented on CASSANDRA-6275:
-

Note I believe the problem is caused by CASSANDRA-5514 (2.0 beta 1).

I don't have a patch because I don't know the exact fix, for the reasons below.

Here is CollationController.java starting at line 264 (as of the current 2.0 
branch and 2.0.2):

{code}
// Check for row tombstone in the skipped sstables
if (skippedSSTables != null)
{
    for (SSTableReader sstable : skippedSSTables)
    {
        if (sstable.getMaxTimestamp() <= minTimestamp)
            continue;

        sstable.incrementReadCount();
        OnDiskAtomIterator iter = filter.getSSTableColumnIterator(sstable);
        if (iter.getColumnFamily() == null)
            continue;

        ColumnFamily cf = iter.getColumnFamily();
        // we are only interested in row-level tombstones here, and only if markedForDeleteAt is larger than minTimestamp
        if (cf.deletionInfo().getTopLevelDeletion().markedForDeleteAt > minTimestamp)
        {
            includedDueToTombstones++;
            iterators.add(iter);
            returnCF.delete(cf.deletionInfo().getTopLevelDeletion());
            sstablesIterated++;
        }
    }
}
{code}

Note that if the last if test does not succeed, iter is neither closed nor 
added to the iterators list to be closed in the finally section at the end. 
It would have been easy for me to always add it to the iterators list, 
except that iterators is referenced lower in the function:

{code}
if (iterators.isEmpty())
    return null;

Tracing.trace("Merging data from memtables and {} sstables", sstablesIterated);
filter.collateOnDiskAtom(returnCF, iterators, gcBefore);
{code}

Being new to the code, I cannot say whether it should be in iterators at that 
point, or should just have been closed (quietly) above.
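
To make the second option concrete, here is a minimal, self-contained sketch of the "close it right away" variant. It is an assumption-laden illustration, not the attached patch: FakeIterator and collect are hypothetical names that only mimic the shape of the loop quoted above.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

class TombstoneScanSketch {
    /** Hypothetical stand-in for OnDiskAtomIterator: records whether close() ran. */
    static class FakeIterator implements Closeable {
        final long markedForDeleteAt;
        boolean closed = false;

        FakeIterator(long markedForDeleteAt) { this.markedForDeleteAt = markedForDeleteAt; }

        @Override
        public void close() { closed = true; }
    }

    /**
     * Keeps iterators whose tombstone is newer than minTimestamp and closes
     * the rest immediately, so every iterator ends up either in the returned
     * list (for the caller to close later) or already closed.
     */
    static List<FakeIterator> collect(List<FakeIterator> candidates, long minTimestamp) {
        List<FakeIterator> iterators = new ArrayList<>();
        for (FakeIterator iter : candidates) {
            if (iter.markedForDeleteAt > minTimestamp) {
                iterators.add(iter);   // closed later by the caller's finally block
            } else {
                iter.close();          // not needed for the merge: release it now
            }
        }
        return iterators;
    }
}
```

With this shape the later iterators.isEmpty() check still sees only the iterators that contribute data, which is the concern raised above about adding every iterator to the list.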


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827322#comment-13827322
 ] 

graham sanderson commented on CASSANDRA-6275:
-

Also note the stack trace below, which we saw for all leaked files. Someone 
can perhaps use this to help figure out what this actually affects (i.e. some 
of the iterators' RARs may have been owned by someone else, in which case all is OK).

{code}
ERROR [Finalizer] 2013-11-20 03:43:42,129 RandomAccessReader.java (line 399) LEAK finalizer had to clean up
java.lang.Exception: RAR for /data/5/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-6882-Data.db allocated
    at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:66)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
    at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
    at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
    at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1182)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.setToRowStart(IndexedSliceReader.java:108)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:84)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
    at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:273)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1467)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1286)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{code}
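
The trace above looks like an exception created when the reader was constructed and printed later by the finalizer. A rough sketch of that diagnostic technique, assuming that is how it was done, could look like the following. TracedResource and leakReport are hypothetical names, and the real debugging patch is not shown in this thread.

```java
import java.io.Closeable;

class TracedResource implements Closeable {
    // Captured at construction so a later leak report can show who allocated it.
    private final Exception allocationSite;
    private volatile boolean closed = false;

    TracedResource(String name) {
        this.allocationSite = new Exception("RAR for " + name + " allocated");
    }

    @Override
    public void close() { closed = true; }

    /** Returns the allocation-site exception if close() was never called, else null. */
    Exception leakReport() { return closed ? null : allocationSite; }

    // Diagnostic safety net: reports the allocation stack for unclosed instances.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            Exception leak = leakReport();
            if (leak != null) {
                System.err.println("LEAK finalizer had to clean up");
                leak.printStackTrace();
                close();
            }
        } finally {
            super.finalize();
        }
    }
}
```

Creating an exception per allocation has a cost, so this kind of tracing is better suited to a debugging build than to production use.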

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-19 Thread J. Ryan Earl (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827371#comment-13827371
 ] 

J. Ryan Earl commented on CASSANDRA-6275:
-

I accidentally hit the Testing button, and don't see a way to revert.  I've 
built cassandra-2.0.2 with just this patch applied, and we are testing it now, 
but I didn't mean to change the status.

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: graham sanderson
 Fix For: 2.0.3

 Attachments: 6275.txt, c_file-descriptors_strace.tbz, 
 cassandra_jstack.txt, leak.log, position_hints.tgz, slog.gz


 Looks like C* is leaking file descriptors when doing lots of CAS operations.
 {noformat}
 $ sudo cat /proc/15455/limits
 Limit Soft Limit   Hard Limit   Units
 Max cpu time  unlimitedunlimitedseconds  
 Max file size unlimitedunlimitedbytes
 Max data size unlimitedunlimitedbytes
 Max stack size10485760 unlimitedbytes
 Max core file size00bytes
 Max resident set  unlimitedunlimitedbytes
 Max processes 1024 unlimitedprocesses
 Max open files4096 4096 files
 Max locked memory unlimitedunlimitedbytes
 Max address space unlimitedunlimitedbytes
 Max file locksunlimitedunlimitedlocks
 Max pending signals   1463314633signals  
 Max msgqueue size 819200   819200   bytes
 Max nice priority 00   
 Max realtime priority 00   
 Max realtime timeout  unlimitedunlimitedus 
 {noformat}
 Looks like the problem is not in limits.
 Before load test:
 {noformat}
 cassandra-test0 ~]$ lsof -n | grep java | wc -l
 166
 cassandra-test1 ~]$ lsof -n | grep java | wc -l
 164
 cassandra-test2 ~]$ lsof -n | grep java | wc -l
 180
 {noformat}
 After load test:
 {noformat}
 cassandra-test0 ~]$ lsof -n | grep java | wc -l
 967
 cassandra-test1 ~]$ lsof -n | grep java | wc -l
 1766
 cassandra-test2 ~]$ lsof -n | grep java | wc -l
 2578
 {noformat}
 Most opened files have names like:
 {noformat}
 java  16890 cassandra 1636r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1637r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1638r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1639r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1640r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1641r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1642r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1643r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1644r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1645r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1646r  REG 202,17  88724987 
 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1647r  REG 202,17 161158485 
 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1648r  

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-18 Thread J. Ryan Earl (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825856#comment-13825856
 ] 

J. Ryan Earl commented on CASSANDRA-6275:
-

We recently ran into this issue after upgrading to OpsCenter-4.0.0. It is quite 
easy to reproduce:
# Install Cassandra-2.0.2
# Install OpsCenter-4.0.0 on above cluster.

I upgraded OpsCenter on Friday, and by Sunday I had reached 1 million open file 
handles.  I had to kill -9 the Cassandra processes as they wouldn't respond to 
sockets; the DSC20 restart scripts reported successfully killing the processes 
but in fact did not.

{noformat}
[root@cassandra2 ~]# lsof -u cassandra|wc -l
175416
[root@cassandra2 ~]# lsof -u cassandra|grep -c OpsCenter
174474
{noformat}
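
The two counts above can be approximated without lsof. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/fd is readable; the helper names `read_fd_targets` and `count_matching` are hypothetical and not part of Cassandra or OpsCenter. Note that lsof also lists entries that are not file descriptors (cwd, txt, mem), so its totals will usually be higher than what this reports.

```python
import os
from typing import Iterable, List


def read_fd_targets(pid: int) -> List[str]:
    """Return the target of every open descriptor of `pid` (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def count_matching(targets: Iterable[str], needle: str) -> int:
    """Count descriptors whose target contains `needle` (like `grep -c`)."""
    return sum(1 for target in targets if needle in target)
```

Calling `read_fd_targets(pid)` with the Cassandra process ID and passing the result to `count_matching(targets, "OpsCenter")` gives a rough equivalent of the `grep -c OpsCenter` figure.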

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-18 Thread graham sanderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825892#comment-13825892
 ] 

graham sanderson commented on CASSANDRA-6275:
-

Note also that most, if not all, of the deleted files are of the form

{code}
java 14018 cassandra  586r   REG   8,33   8792499   1251 
/data/1/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-4656-Data.db 
(deleted)
java 14018 cassandra  587r   REG   8,33  27303760   1254 
/data/1/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-4655-Data.db 
(deleted)
java 14018 cassandra  588r   REG   8,33   8792499   1251 
/data/1/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-4656-Data.db 
(deleted)
java 14018 cassandra  589r   REG   8,33  27303760   1254 
/data/1/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-4655-Data.db 
(deleted)
java 14018 cassandra  590r   REG   8,33  10507214936 
/data/1/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-4657-Data.db 
(deleted)
{code}
We have 7 data disks (I don't know if this contributes to the problem), and the 
deleted files are very unevenly distributed, with 93% on two of the 7 disks 
(on this particular node). The live data for OpsCenter/rollups60 is also a 
little uneven, and the mounts that hold more deleted (but open) files also hold 
more live data, but the deleted file counts per mount point vary by several 
orders of magnitude whereas the data size itself does not.
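
A per-disk tally like the one described above can be sketched as follows. This is only an illustration: it assumes each data disk is mounted at /data/<n>, as the paths in the listing suggest, and the function name `deleted_per_mount` and its `depth` parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable


def deleted_per_mount(paths: Iterable[str], depth: int = 2) -> Counter:
    """Tally deleted-but-open files by their first `depth` path components."""
    tally = Counter()
    for path in paths:
        if not path.endswith("(deleted)"):
            continue  # still linked on disk, not part of the leak picture
        parts = path.split("/")
        # parts[0] is "" for an absolute path, so keep depth + 1 elements.
        tally["/".join(parts[: depth + 1])] += 1
    return tally
```

The input is expected to be the NAME column of lsof (path plus the trailing "(deleted)" marker), one string per open descriptor.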

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-15 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823796#comment-13823796
 ] 

Marcus Eriksson commented on CASSANDRA-6275:


OK, this is what I have so far: I can also reproduce on an m1.medium in EC2 
running Ubuntu 13.10, which has a 3.11.x kernel.

I cannot reproduce on my laptop (Debian squeeze) or my server (RHEL 6); both 
run kernel 2.6.x. (jdk7u45 on all.)

It happens with trivial tables/data as well, so it seems unrelated to TTL, 
truncate, etc.

Just starting Cassandra up shows ~50 open FDs for the same Data.db file.
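
One way to spot a single Data.db file held open many times is to group the open paths and count repeats. The sketch below assumes the caller already has the list of open paths (for example the NAME column of lsof output); `most_reopened` is a hypothetical helper, not a Cassandra tool.

```python
from collections import Counter
from typing import Iterable, List, Tuple

DELETED_MARK = " (deleted)"


def most_reopened(
    targets: Iterable[str], suffix: str = "-Data.db", top: int = 5
) -> List[Tuple[str, int]]:
    """Return the `top` files ending in `suffix`, ordered by open-descriptor count."""
    counts = Counter()
    for target in targets:
        # Treat a deleted file and its original path as the same file.
        if target.endswith(DELETED_MARK):
            target = target[: -len(DELETED_MARK)]
        if target.endswith(suffix):
            counts[target] += 1
    return counts.most_common(top)
```

A healthy node would be expected to show small counts per file; dozens of descriptors for one SSTable right after startup would match the symptom described here.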

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-15 Thread Duncan Sands (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823805#comment-13823805
 ] 

Duncan Sands commented on CASSANDRA-6275:
-

I originally saw the issue on Ubuntu 10.04 (kernel 2.6.32) and reproduced it on 
Ubuntu 13.10 (kernel 3.11.0).

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-15 Thread Pieter Callewaert (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823806#comment-13823806
 ] 

Pieter Callewaert commented on CASSANDRA-6275:
--

I also have the problem on Ubuntu 12.04 (Linux de-cass00 3.8.0-30-generic 
#44~precise1-Ubuntu SMP Fri Aug 23 18:32:41 UTC 2013 x86_64 x86_64 x86_64 
GNU/Linux)

I posted something on the mailing list because I was not sure if it was a bug 
(you can find more info there; in some cases I had a deleted file open more 
than 50k times):
http://www.mail-archive.com/user@cassandra.apache.org/msg32999.html

The temporary fix was to raise the nofile limit to 1kk (1,000,000)...
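
To judge how close a process is to its nofile limit, the limit can be read programmatically and compared with an open-file count. This is a rough sketch using Python's standard `resource` module (Unix only); `nofile_limits` and `fraction_used` are hypothetical helper names, and it inspects the calling process rather than a running Cassandra JVM.

```python
import resource
from typing import Tuple


def nofile_limits() -> Tuple[int, int]:
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fraction_used(open_count: int, soft_limit: int) -> float:
    """Share of the soft limit in use; 0.0 if the limit is unlimited."""
    if soft_limit == resource.RLIM_INFINITY or soft_limit <= 0:
        return 0.0
    return open_count / soft_limit
```

With the figures from the issue description (2578 entries on cassandra-test2 against a limit of 4096), `fraction_used(2578, 4096)` is roughly 0.63. Raising the limit only postpones the failure while descriptors keep leaking.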

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-15 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823814#comment-13823814
 ] 

Michael Shuler commented on CASSANDRA-6275:
---

My tests were on Ubuntu precise, same kernel as above, with JVM version 1.7_25.

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt, 
 leak.log, position_hints.tgz, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-14 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823106#comment-13823106
 ] 

Michael Shuler commented on CASSANDRA-6275:
---

Reproduced in 2.0.2.
On m1.medium, running C*, the open files were about 910.  My query is still 
running, and I'm at 3500 open files.
(I'll work on cassandra-2.0 branch HEAD, next)

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: cassandra_jstack.txt, leak.log, position_hints.tgz, 
 slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-14 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823171#comment-13823171
 ] 

Michael Shuler commented on CASSANDRA-6275:
---

Same results on cassandra-2.0 HEAD.

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: cassandra_jstack.txt, leak.log, position_hints.tgz, 
 slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-13 Thread Duncan Sands (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821328#comment-13821328
 ] 

Duncan Sands commented on CASSANDRA-6275:
-

OK, here is how you can reproduce.

1) Create this keyspace:

CREATE KEYSPACE all_production
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

2) Create a table as follows:

use all_production;
CREATE TABLE position_hints (
  shard int, date text, when timeuuid, sequence bigint, syd int,
  broker uuid, engine uuid, confirmed bigint, open_buy bigint,
  open_sell bigint,
  PRIMARY KEY ((shard, date), when)
) WITH CLUSTERING ORDER BY (when DESC);

3) Stop Cassandra.  Untar the attached file in 
/var/lib/cassandra/data/all_production/ to populate the position_hints table.

4) Start Cassandra.

5) Prepare a large number of queries as follows:

for (( i = 0 ; i < 100 ; i = i + 1 )) ; do echo "select * from position_hints where shard=1 and date='2013-10-30' and whenba719c52-4182-11e3-a471-003048feded4 limit 1;" ; done > /tmp/queries

6) In cqlsh:

use all_production;
source '/tmp/queries';

7) Enjoy watching the number of fd's used by Cassandra go up and up.
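
Steps 5 and 7 above can be sketched as two small shell helpers. This is only an illustration, not part of the original report: the function names `gen_queries` and `count_fds` are made up here, and the comparison operator between `when` and the timeuuid is not legible in step 5, so it is passed in as a parameter rather than assumed.

```shell
#!/bin/sh
# Hypothetical helpers for the reproduction steps; names are illustrative.

gen_queries() {
    # $1 = number of queries, $2 = comparison operator (assumption: the
    # original operator is unknown), $3 = output file
    n=$1; op=$2; out=$3
    i=0
    while [ "$i" -lt "$n" ]; do
        printf "select * from position_hints where shard=1 and date='2013-10-30' and when%sba719c52-4182-11e3-a471-003048feded4 limit 1;\n" "$op"
        i=$((i + 1))
    done > "$out"
}

count_fds() {
    # Number of open descriptors for PID $1, read from /proc (Linux only).
    # Reading another user's process usually needs root.
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Example use (not run automatically):
#   gen_queries 100 '<' /tmp/queries
#   while sleep 5; do count_fds "$CASSANDRA_PID"; done
```

Counting entries under `/proc/<pid>/fd` gives one line per descriptor, which is a little more precise than `lsof -n | grep java | wc -l`, since `lsof` also lists memory-mapped files and the working directory.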

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: cassandra_jstack.txt, leak.log, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821419#comment-13821419
 ] 

Jonathan Ellis commented on CASSANDRA-6275:
---

Can you reproduce the above on 2.0.2 or 2.0.2 HEAD, [~mshuler]?

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: cassandra_jstack.txt, leak.log, position_hints.tgz, 
 slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-12 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820376#comment-13820376
 ] 

Marcus Eriksson commented on CASSANDRA-6275:


I've been looking at this a bit and can't really reproduce it. I suspected 
CASSANDRA-5228, but that seems to work (or rather, I found a bug, but it is 
unrelated to this one: CASSANDRA-6337).

slog.gz looks a bit like what [~mkjellman] reported in CASSANDRA-5241 (looping 
flushing of system tables), but that was resolved a long time ago.

Does anyone have a way to reproduce? [~ash2k], would it be possible to post your 
load test?


 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: cassandra_jstack.txt, leak.log, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-12 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820839#comment-13820839
 ] 

Robert Coli commented on CASSANDRA-6275:


A brief note to mention that when durable_writes are disabled, handling of 
clean shutdown (via SIGTERM/StorageServiceShutdownHook) has additional blocking 
while waiting for drain. If one is investigating and/or modifying the behavior 
of the shutdown hook, they should be aware that there are two different cases 
to test. See CASSANDRA-2958.

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: cassandra_jstack.txt, leak.log, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-12 Thread Mikhail Mazursky (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820875#comment-13820875
 ] 

Mikhail Mazursky commented on CASSANDRA-6275:
-

[~krummas] sorry, I cannot post that code. The scenario is something like this:
{code}
TRUNCATE table1;
TRUNCATE table2;
TRUNCATE table3;
loop {
SELECT FROM table1 WHERE key='xxx' (quorum);
INSERT INTO table2 (quorum);
INSERT INTO table3 IF NOT EXISTS; (should sometimes fail)
UPDATE table1 WHERE  key = 'someid' IF column='zzz'; (should sometimes fail)
}
{code}
As I said, if I remove the TRUNCATEs it does not leak.
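
When comparing a run with the TRUNCATEs against one without, it may help to see which files hold the descriptors rather than only how many there are. The following is a rough sketch, not something from the original report: the function name `top_fd_targets` is invented for illustration, and it assumes a Linux `/proc/<pid>/fd` directory of symlinks.

```shell
#!/bin/sh
# Hypothetical helper: group open descriptors by the file they point to.

top_fd_targets() {
    # $1 = directory of fd symlinks, e.g. /proc/<pid>/fd (Linux only).
    # Prints "count target" lines, most frequently opened target first.
    # Targets of unlinked files keep their " (deleted)" suffix, which makes
    # descriptors left behind on removed sstables easy to spot.
    ls -l "$1" 2>/dev/null \
        | awk -F' -> ' 'NF > 1 { print $2 }' \
        | sort | uniq -c | sort -rn
}

# Example use (not run automatically; usually needs root):
#   top_fd_targets "/proc/$CASSANDRA_PID/fd" | head
```

If the same `system-paxos-...-Data.db` path shows up with a count that keeps growing between samples, that points at readers on that sstable not being closed, which matches the `lsof` listing in the report.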

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
Assignee: Marcus Eriksson
 Attachments: cassandra_jstack.txt, leak.log, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819357#comment-13819357
 ] 

Jonathan Ellis commented on CASSANDRA-6275:
---

[~ash2k] or others, can you verify if this is also a problem in 1.2.11?

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
 Attachments: cassandra_jstack.txt, leak.log, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-11 Thread Gianluca Borello (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819363#comment-13819363
 ] 

Gianluca Borello commented on CASSANDRA-6275:
-

[~jbellis], FWIW 1.2.11 is working fine for us (I have one week of uptime so 
far since the downgrade).

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
 Attachments: cassandra_jstack.txt, leak.log, slog.gz



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819376#comment-13819376
 ] 

Jonathan Ellis commented on CASSANDRA-6275:
---

Can you have a look [~krummas] ?


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-11 Thread Duncan Sands (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13819884#comment-13819884
 ] 

Duncan Sands commented on CASSANDRA-6275:
-

All our leaked fds were for a large table that uses TTL.  We also don't have 
any problems with 1.2.11 (which we had to downgrade to, just like Gianluca).


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-11 Thread Mikhail Mazursky (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13819890#comment-13819890
 ] 

Mikhail Mazursky commented on CASSANDRA-6275:
-

[~jbellis] my test workload uses LWT, so it cannot be run on 1.2.x. The written 
columns themselves do not use TTL, but AFAIK the Paxos table does. In my case 
I see a lot of open Paxos-related files. So, taking into account what others 
said above, it looks like the problem is somehow connected to TTL.
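
For anyone who wants to check which files dominate the open descriptors on their own node, here is a minimal sketch. It is not part of Cassandra; the class name {{FdTally}} and its methods are made up for illustration. It assumes Linux, where /proc/<pid>/fd holds one symlink per open descriptor, and it prints every target that is open more than once.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not a Cassandra class.
public class FdTally {
    /** Counts how many times each target path appears in the list. */
    static Map<String, Integer> tally(List<String> targets) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String target : targets) {
            counts.merge(target, 1, Integer::sum);
        }
        return counts;
    }

    /** Resolves every descriptor under /proc/<pid>/fd to its target (Linux only). */
    static List<String> openTargets(String pid) throws IOException {
        List<String> targets = new ArrayList<>();
        Path fdDir = Paths.get("/proc", pid, "fd");
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    targets.add(Files.readSymbolicLink(fd).toString());
                } catch (IOException e) {
                    // The descriptor was closed between listing and reading; skip it.
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        // Pass the Cassandra pid as the first argument; "self" inspects this JVM.
        String pid = args.length > 0 ? args[0] : "self";
        tally(openTargets(pid)).forEach((path, n) -> {
            if (n > 1) {
                System.out.println(n + "\t" + path);
            }
        });
    }
}
```

Run it as the same user as the Cassandra process (or as root), otherwise the fd directory is not readable. A -Data.db file that shows up with a large count would match the pattern described in this ticket.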


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-01 Thread Duncan Sands (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1382#comment-1382
 ] 

Duncan Sands commented on CASSANDRA-6275:
-

In my case I didn't use truncate. However, I did do one exotic operation: I used 
ALTER TABLE to drop a no-longer-needed column a few days before I noticed this 
issue.

Before downgrading I made a copy of the entire contents of /var/lib/cassandra/, 
so I could try recreating the 2.0.2 cluster and the problem in some virtual 
machines using these.


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-11-01 Thread Duncan Sands (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1384#comment-1384
 ] 

Duncan Sands commented on CASSANDRA-6275:
-

I also changed sstable_compression from SnappyCompressor to LZ4Compressor.


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-10-31 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810590#comment-13810590
 ] 

Mikhail Stepura commented on CASSANDRA-6275:


bq. Also, when that happens it's not always possible to shutdown server process 
via SIGTERM. Have to use SIGKILL.

As far as I understand, here is *what* is happening:

* {{SIGTERM handler}} waits for {{StorageServiceShutdownHook}}.
* {{StorageServiceShutdownHook}} waits (up to *3600 sec == 1hr*) for 
{{mutationStage}} threads to complete.
* {{MutationStage:2718}} thread performs 
{{ColumnFamilyStore.forceBlockingFlush}}, initiated by 
{{TruncateVerbHandler.doVerb}}, and waits for {{MemtablePostFlusher:1}}.
* {{MemtablePostFlusher:1}} is waiting on {{CountDownLatch.await}} (in the 
{{WrappedRunnable}} returned from {{ColumnFamilyStore.switchMemtable}}).  It 
will wait until the latch is counted down to zero.

There is also another call to {{ColumnFamilyStore.forceBlockingFlush}} from 
{{OptionalTasks:1:BatchlogManager.cleanup()}}.
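
To make the chain above concrete, here is a small standalone sketch of the same pattern. It is not Cassandra code: the class {{ShutdownHangSketch}} and the variable names are made up, a single-thread executor stands in for the mutation stage, and a latch that nobody counts down stands in for the post-flush wait. The timeout is shortened from 3600 seconds to 200 ms so the demo finishes quickly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not a Cassandra class.
public class ShutdownHangSketch {
    /**
     * Submits one task that blocks on a latch nobody counts down, then requests
     * shutdown and waits up to the given timeout for the pool to drain.
     * Returns true if the pool terminated in time, false otherwise.
     */
    static boolean drainWithStuckTask(long timeout, TimeUnit unit) {
        ExecutorService mutationStage = Executors.newSingleThreadExecutor();
        CountDownLatch postFlush = new CountDownLatch(1); // never counted down

        mutationStage.submit(() -> {
            try {
                postFlush.await(); // stands in for the blocked flush
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        mutationStage.shutdown(); // rejects new tasks; the stuck one keeps running
        boolean drained;
        try {
            drained = mutationStage.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            drained = false;
        }
        mutationStage.shutdownNow(); // interrupt the stuck task so this demo can exit
        return drained;
    }

    public static void main(String[] args) {
        boolean drained = drainWithStuckTask(200, TimeUnit.MILLISECONDS);
        System.out.println("drained in time: " + drained); // prints "drained in time: false"
    }
}
```

With a one-hour timeout and no final {{shutdownNow()}}, the waiting thread would sit there for the whole hour, which would be consistent with SIGTERM appearing to do nothing.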



[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-10-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810647#comment-13810647
 ] 

Jonathan Ellis commented on CASSANDRA-6275:
---

Hmm.  It's possible that we shouldn't be running Truncate on the Mutation 
stage.  But I don't think Duncan or Mikhail mentioned running truncate, so 
there is probably something else wrong as well.


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-10-31 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810664#comment-13810664
 ] 

Mikhail Stepura commented on CASSANDRA-6275:


[~jbellis] there is still a {{TruncateVerbHandler.doVerb}} in Mikhail's 
[^cassandra_jstack.txt].


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-10-31 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13810674#comment-13810674
 ] 

Brandon Williams commented on CASSANDRA-6275:
-

[~baldrick] are you using the native protocol?

 2.0.x leaks file handles
 

 Key: CASSANDRA-6275
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
 Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 
 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mikhail Mazursky
 Attachments: cassandra_jstack.txt, slog.gz


 Looks like C* is leaking file descriptors when doing lots of CAS operations.
 {noformat}
 $ sudo cat /proc/15455/limits
 Limit                     Soft Limit           Hard Limit           Units
 Max cpu time              unlimited            unlimited            seconds
 Max file size             unlimited            unlimited            bytes
 Max data size             unlimited            unlimited            bytes
 Max stack size            10485760             unlimited            bytes
 Max core file size        0                    0                    bytes
 Max resident set          unlimited            unlimited            bytes
 Max processes             1024                 unlimited            processes
 Max open files            4096                 4096                 files
 Max locked memory         unlimited            unlimited            bytes
 Max address space         unlimited            unlimited            bytes
 Max file locks            unlimited            unlimited            locks
 Max pending signals       14633                14633                signals
 Max msgqueue size         819200               819200               bytes
 Max nice priority         0                    0
 Max realtime priority     0                    0
 Max realtime timeout      unlimited            unlimited            us
 {noformat}
 Looks like the problem is not in limits.
 Before load test:
 {noformat}
 cassandra-test0 ~]$ lsof -n | grep java | wc -l
 166
 cassandra-test1 ~]$ lsof -n | grep java | wc -l
 164
 cassandra-test2 ~]$ lsof -n | grep java | wc -l
 180
 {noformat}
 After load test:
 {noformat}
 cassandra-test0 ~]$ lsof -n | grep java | wc -l
 967
 cassandra-test1 ~]$ lsof -n | grep java | wc -l
 1766
 cassandra-test2 ~]$ lsof -n | grep java | wc -l
 2578
 {noformat}
 Most opened files have names like:
 {noformat}
 java  16890 cassandra 1636r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1637r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1638r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1639r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1640r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1641r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1642r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1643r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1644r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1645r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1646r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1647r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 cassandra 1648r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
 java  16890 cassandra 1649r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
 java  16890 

[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-10-31 Thread Duncan Sands (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810714#comment-13810714
 ] 

Duncan Sands commented on CASSANDRA-6275:
-

Yes, I'm using the native protocol.  No use of truncate on my part.


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-10-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810735#comment-13810735
 ] 

Jonathan Ellis commented on CASSANDRA-6275:
---

Let's create a new ticket for the truncate hang then.


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-10-31 Thread Mikhail Mazursky (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810995#comment-13810995
 ] 

Mikhail Mazursky commented on CASSANDRA-6275:
-

My load test uses TRUNCATE before it starts. I will check whether the leak 
still happens without it.


[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles

2013-10-31 Thread Mikhail Mazursky (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811002#comment-13811002
 ] 

Mikhail Mazursky commented on CASSANDRA-6275:
-

I tried without TRUNCATE and saw no leak (the test normally TRUNCATEs 3 
tables before it starts).

Before test:
{noformat}
[root@cassandra-test0 ~]$ lsof -n | grep java | wc -l
169

[root@cassandra-test1 ~]$ lsof -n | grep java | wc -l
167

[root@cassandra-test2 ~]# lsof -n | grep java | wc -l
173
{noformat}

After test:
{noformat}
[root@cassandra-test0 ~]$ lsof -n | grep java | wc -l
172

[root@cassandra-test1 ~]$ lsof -n | grep java | wc -l
172

[root@cassandra-test2 ~]# lsof -n | grep java | wc -l
183
{noformat}
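
The raw totals above only show whether the count grew. To see which files account for any growth, the lsof output can be grouped by file name. The following is a minimal sketch, not part of the original report: the helper names count_open_names and growth are hypothetical, and it assumes the default lsof column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME). The sample rows are taken from the issue description.

```python
from collections import Counter


def count_open_names(lsof_text, command="java"):
    """Count how many descriptors point at each NAME in lsof output.

    Assumes the default lsof columns:
    COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
    Rows with fewer than nine fields (e.g. no SIZE/OFF value) are skipped.
    """
    counts = Counter()
    for line in lsof_text.splitlines():
        # Split into at most 9 fields so a NAME containing spaces stays whole.
        fields = line.split(None, 8)
        if len(fields) == 9 and fields[0] == command:
            counts[fields[8].strip()] += 1
    return counts


def growth(before, after):
    """Return names whose descriptor count grew between two Counter snapshots."""
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}


# Sample rows copied from the issue description (hypothetical snapshot).
sample = """\
java  16890 cassandra 1636r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
java  16890 cassandra 1637r  REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
java  16890 cassandra 1638r  REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
"""

after = count_open_names(sample)
for name, n in after.most_common():
    print(n, name)
print(growth(Counter(), after))
```

In practice the two snapshots would come from saving `lsof -n` output before and after the load test. A file that is held open many times, as the paxos Data.db files are in the description, would then stand out at the top of the list.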
