[ https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821328#comment-13821328 ]
Duncan Sands commented on CASSANDRA-6275:
-----------------------------------------

OK, here is how you can reproduce.

1) Create this keyspace:

CREATE KEYSPACE all_production WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

2) Create a table as follows:

use all_production;

CREATE TABLE position_hints (shard int, date text, when timeuuid, sequence bigint, syd int, broker uuid, engine uuid, confirmed bigint, open_buy bigint, open_sell bigint, PRIMARY KEY ((shard, date), when)) with clustering order by (when desc);

3) Stop Cassandra. Untar the attached file in /var/lib/cassandra/data/all_production/ to populate the position_hints table.

4) Start Cassandra.

5) Prepare a large number of queries as follows:

for (( i = 0 ; i < 1000000 ; i = i + 1 )) ; do echo "select * from position_hints where shard=1 and date='2013-10-30' and when>ba719c52-4182-11e3-a471-003048feded4 limit 1;" ; done > /tmp/queries

6) In cqlsh:

use all_production;
source '/tmp/queries';

7) Enjoy watching the number of fd's used by Cassandra go up and up.

> 2.0.x leaks file handles
> ------------------------
>
>                 Key: CASSANDRA-6275
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: java version "1.7.0_25"
> Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
> Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Mikhail Mazursky
>            Assignee: Marcus Eriksson
>         Attachments: cassandra_jstack.txt, leak.log, slog.gz
>
>
> Looks like C* is leaking file descriptors when doing lots of CAS operations.
> {noformat}
> $ sudo cat /proc/15455/limits
> Limit                     Soft Limit    Hard Limit    Units
> Max cpu time              unlimited     unlimited     seconds
> Max file size             unlimited     unlimited     bytes
> Max data size             unlimited     unlimited     bytes
> Max stack size            10485760      unlimited     bytes
> Max core file size        0             0             bytes
> Max resident set          unlimited     unlimited     bytes
> Max processes             1024          unlimited     processes
> Max open files            4096          4096          files
> Max locked memory         unlimited     unlimited     bytes
> Max address space         unlimited     unlimited     bytes
> Max file locks            unlimited     unlimited     locks
> Max pending signals       14633         14633         signals
> Max msgqueue size         819200        819200        bytes
> Max nice priority         0             0
> Max realtime priority     0             0
> Max realtime timeout      unlimited     unlimited     us
> {noformat}
> Looks like the problem is not in limits.
> Before load test:
> {noformat}
> cassandra-test0 ~]$ lsof -n | grep java | wc -l
> 166
> cassandra-test1 ~]$ lsof -n | grep java | wc -l
> 164
> cassandra-test2 ~]$ lsof -n | grep java | wc -l
> 180
> {noformat}
> After load test:
> {noformat}
> cassandra-test0 ~]$ lsof -n | grep java | wc -l
> 967
> cassandra-test1 ~]$ lsof -n | grep java | wc -l
> 1766
> cassandra-test2 ~]$ lsof -n | grep java | wc -l
> 2578
> {noformat}
> Most opened files have names like:
> {noformat}
> java 16890 cassandra 1636r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1637r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1638r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1639r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1640r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1641r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1642r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1643r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1644r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1645r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1646r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1647r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1648r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1649r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1650r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1651r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1652r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1653r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1654r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1655r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1656r REG 202,17  88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> {noformat}
> Also, when that happens it's not always possible to shutdown server process via SIGTERM. Have to use SIGKILL.
> p.s. See mailing thread for more context information
> https://www.mail-archive.com/user@cassandra.apache.org/msg33035.html

--
This message was sent by Atlassian JIRA (v6.1#6144)
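
Step 7 of the reproduction steps above asks you to watch the descriptor count climb while the queries run. The following is a minimal sketch of one way to do that on Linux, assuming /proc is mounted and the script runs as a user allowed to read the target process's fd directory. The function names count_fds and watch_fds and the PROC_ROOT variable are hypothetical helpers for illustration, not part of Cassandra or of this ticket.

```shell
#!/usr/bin/env bash

# count_fds PID
# Print the number of open file descriptors of a process by counting
# the entries in /proc/PID/fd (Linux only). Prints 0 if the directory
# is missing or unreadable.
# PROC_ROOT is a hypothetical override, used only so the function can
# be pointed at a fake directory tree for testing.
count_fds() {
    local pid="$1"
    ls "${PROC_ROOT:-/proc}/${pid}/fd" 2>/dev/null | wc -l
}

# watch_fds PID [INTERVAL_SECONDS] [SAMPLES]
# Print one "HH:MM:SS count" line per sample so growth is easy to see.
# Defaults: one sample every 5 seconds, 12 samples.
watch_fds() {
    local pid="$1" interval="${2:-5}" samples="${3:-12}"
    local i
    for (( i = 0; i < samples; i++ )); do
        printf '%s %s\n' "$(date +%T)" "$(count_fds "$pid")"
        sleep "$interval"
    done
}

# Example call (assumes a single Cassandra JVM on the host, found by
# its main class name):
#   watch_fds "$(pgrep -f CassandraDaemon | head -n 1)" 5 60
```

Counting entries under /proc/PID/fd reports descriptors only, so the numbers will be lower than the `lsof -n | grep java | wc -l` figures quoted in the issue, which also include memory-mapped files and other non-descriptor rows. The trend over time is what matters for spotting the leak.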