[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2014-01-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886716#comment-13886716
 ] 

Jonathan Ellis commented on CASSANDRA-6107:
---

Note: this was reverted in 1.2.14 because of CASSANDRA-6592.

 CQL3 Batch statement memory leak
 

 Key: CASSANDRA-6107
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6107
 Project: Cassandra
  Issue Type: Bug
  Components: API, Core
 Environment: - CASS version: 1.2.8 or 2.0.1, same issue seen in both
 - Running on OSX MacbookPro
 - Sun JVM 1.7
 - Single local cassandra node
 - both CMS and G1 GC used
 - we are using the cass-JDBC driver to submit our batches
Reporter: Constance Eustace
Assignee: Lyuben Todorov
Priority: Minor
 Fix For: 1.2.11

 Attachments: 6107-v4.txt, 6107.patch, 6107_v2.patch, 6107_v3.patch, 
 Screen Shot 2013-10-03 at 17.59.37.png


 We are doing large-volume insert/update tests against Cassandra via CQL3. 
 Using a 4GB heap, after roughly 750,000 updates that create/update 75,000 row 
 keys, we run out of heap, it never dissipates, and we begin getting the 
 infamous error which many people seem to be encountering:
 WARN [ScheduledTasks:1] 2013-09-26 16:17:10,752 GCInspector.java (line 142) 
 Heap is 0.9383457210434385 full.  You may need to reduce memtable and/or 
 cache sizes.  Cassandra will now flush up to the two largest memtables to 
 free up memory.  Adjust flush_largest_memtables_at threshold in 
 cassandra.yaml if you don't want Cassandra to do this automatically
  INFO [ScheduledTasks:1] 2013-09-26 16:17:10,753 StorageService.java (line 
 3614) Unable to reduce heap usage since there are no dirty column families
 8 and 12 GB heaps appear to delay the problem by roughly proportionate 
 amounts of 75,000 - 100,000 row keys per 4GB. Each run of 50,000 row key 
 creations sees the heap grow and never shrink again. 
 We have attempted, to no effect:
 - removing all secondary indexes, to see if that alleviates overuse of bloom 
 filters 
 - adjusting compaction throughput parameters
 - adjusting memtable flush thresholds and other parameters 
 Examining heap dumps makes it apparent that the problem is perpetual 
 retention of CQL3 BATCH statements. We have even tried dropping the keyspaces 
 after the updates, and the CQL3 statements are still visible in the heap dump, 
 even after many, many CMS GC runs. G1 also showed this issue.
 The 750,000 statements are broken into batches of roughly 200 statements.





[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-11-03 Thread Michael Oczkowski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812415#comment-13812415
 ] 

Michael Oczkowski commented on CASSANDRA-6107:
--

This change appears to break code that uses EmbeddedCassandraService and 
PreparedStatements in version 1.2.11.  Please see CASSANDRA-6293 for details.



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-10-07 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787971#comment-13787971
 ] 

Sylvain Lebresne commented on CASSANDRA-6107:
-

CASSANDRA-5981 is indeed about limiting size at the protocol level, but it's a 
global frame limitation. In particular, it is the hard limit for queries 
together with their values, which is why the current hard-coded limit is 
relatively high (256MB). We can bikeshed on the exact default to use, and 
CASSANDRA-5981 will probably allow the user to tune that limit, but in any 
case it will definitely have to be higher than 1MB. The other detail is that 
the limit imposed by CASSANDRA-5981 is on the bytes sent, not the in-memory 
size of the query, but that probably doesn't matter much.

That said, since a prepared statement doesn't include values, it wouldn't be 
absurd to have a specific, lower limit on its size. Though my own preference 
would be to just leave it to a global limit on the preparedStatements cache 
map (it could make sense to reject statements that exceed the entire limit on 
their own, so as to make sure we respect it). Too many hard-coded limitations 
make me nervous.



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-10-06 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787552#comment-13787552
 ] 

Lyuben Todorov commented on CASSANDRA-6107:
---

LGTM. But I was able to build some pretty big batch statements (4MB), so I'm 
not sure about the rejection of large statements at the protocol level.



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-10-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787649#comment-13787649
 ] 

Jonathan Ellis commented on CASSANDRA-6107:
---

bq. I'm not sure about the rejection of large statements at protocol level.

I think that's what CASSANDRA-5981 is open for, actually.



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-10-04 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786522#comment-13786522
 ] 

Jonathan Ellis commented on CASSANDRA-6107:
---

On second thought, rejecting really huge statements should be done at the 
protocol level.  I'll follow up with Sylvain to see if we're already doing that.

v4 attached that just does the weighing as discussed.  WDYT?



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-10-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785251#comment-13785251
 ] 

Jonathan Ellis commented on CASSANDRA-6107:
---

Use the cache weigher/weightedCapacity API instead of re-measuring the entire 
cache each time; the cache will then take care of evicting old entries to make 
room as needed.

Suggest making the capacity 1/256 of the heap size.

We should probably have a separate setting for the maximum single statement 
size: if a single statement is under this threshold but larger than the cache, 
execute it but do not cache it.

Finally, statement id size should be negligible; I'd leave that out.
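
As a rough sketch of that weigher approach, using the ConcurrentLinkedHashMap 
builder API; the Entry type, the two limits, and the offer() helper below are 
illustrative assumptions, not the attached patch:

{noformat}
import com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap;
import com.googlecode.concurrentlinkedhashmap.Weigher;
import java.util.concurrent.ConcurrentMap;

public class WeightedStatementCache
{
    // Hypothetical stand-in for a server-side prepared statement.
    static final class Entry
    {
        final String cql;
        final long sizeInBytes; // as measured, e.g., by jamm's MemoryMeter
        Entry(String cql, long sizeInBytes) { this.cql = cql; this.sizeInBytes = sizeInBytes; }
    }

    // Capacity = 1/256 of the maximum heap, per the suggestion above.
    static final long MAX_CACHE_PREPARED_MEMORY = Runtime.getRuntime().maxMemory() / 256;

    // Separate knob for the largest single statement we will prepare at all (example value).
    static final long MAX_SINGLE_STATEMENT_SIZE = 5 * 1024 * 1024;

    final ConcurrentMap<String, Entry> cache =
        new ConcurrentLinkedHashMap.Builder<String, Entry>()
            .maximumWeightedCapacity(MAX_CACHE_PREPARED_MEMORY)
            // Weigh entries by measured size, so eviction tracks memory rather than entry count.
            .weigher(new Weigher<Entry>()
            {
                public int weightOf(Entry e)
                {
                    return (int) Math.max(1, Math.min(e.sizeInBytes, Integer.MAX_VALUE));
                }
            })
            .build();

    /** Caches the statement unless it is too big; false means execute it but don't cache it. */
    boolean offer(String id, Entry e)
    {
        if (e.sizeInBytes > MAX_SINGLE_STATEMENT_SIZE)
            throw new IllegalArgumentException("single statement too large to prepare");
        if (e.sizeInBytes > MAX_CACHE_PREPARED_MEMORY)
            return false; // under the per-statement cap but would monopolize the cache
        cache.put(id, e);
        return true;
    }
}
{noformat}

With a weigher in place, maximumWeightedCapacity caps the sum of measured 
statement sizes rather than the number of entries, so a handful of huge batch 
statements can no longer occupy gigabytes while staying under the entry limit.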




[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-10-02 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784635#comment-13784635
 ] 

Aleksey Yeschenko commented on CASSANDRA-6107:
--

I don't think the issue here is (just) large individual prepared statements. 
It's the total size that all the prepared statements are occupying. That's what 
should be tracked and limited, not just the individual ones.



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-10-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784658#comment-13784658
 ] 

Jonathan Ellis commented on CASSANDRA-6107:
---

Right.  Use the size you're calculating as the weight in the cache Map.



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780018#comment-13780018
 ] 

Constance Eustace commented on CASSANDRA-6107:
--

- Further examination of several point-in-time heap dumps shows that ALL CQL 
statement batches are retained in the heap. Each statement holds multiple 
collections (ConcurrentHashMaps and other data structures), which obviously 
consume huge amounts of memory.

- We have also done a smaller run that does NOT batch our updates. It is 
obviously much slower, but its heap dumps show objects being garbage collected 
properly over time.





[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780037#comment-13780037
 ] 

Constance Eustace commented on CASSANDRA-6107:
--

It appears that because we are sending preparedStatements (this lets us prep 
the statement and then set the consistency level), the preparedStatements are 
never evicted from the prepared statement cache in 
org.apache.cassandra.cql3.QueryProcessor.

There are no removes ever done on preparedStatements or 
thriftPreparedStatements...

This may technically be our fault for preparing every single batch statement, 
but shouldn't there be a limit on stored prep statements, with LRU eviction?



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780045#comment-13780045
 ] 

Sylvain Lebresne commented on CASSANDRA-6107:
-

bq. but shouldn't there be a limit on stored prep statements with LRU eviction?

There is (from QueryProcessor.java):
{noformat}
public static final int MAX_CACHE_PREPARED = 100000; // Enough to keep buggy clients from OOM'ing us

private static final Map<MD5Digest, CQLStatement> preparedStatements =
    new ConcurrentLinkedHashMap.Builder<MD5Digest, CQLStatement>()
        .maximumWeightedCapacity(MAX_CACHE_PREPARED)
        .build();
{noformat}
but it's possible that this limit was too high to prevent you from OOM'ing in 
that case. And maybe that hard-coded value is too high.

But really, you should not prepare every single statement; that is a client 
error.
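
The client-side corollary is to prepare the parameterized statement once and 
rebind values for every batch, rather than preparing each fully-rendered batch 
as its own statement. A generic JDBC-style sketch (the connection URL, table, 
and batch size are illustrative only, not the reporter's actual code):

{noformat}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PrepareOnceClient
{
    public static void main(String[] args) throws Exception
    {
        // Hypothetical cassandra-jdbc style URL; adjust for the driver in use.
        Connection conn = DriverManager.getConnection("jdbc:cassandra://localhost:9160/testks");

        // Prepared exactly once: a single entry in the server-side cache, reused for every row.
        PreparedStatement ps = conn.prepareStatement("UPDATE items SET value = ? WHERE key = ?");

        for (int i = 0; i < 750000; i++)
        {
            ps.setString(1, "value-" + i);
            ps.setString(2, "key-" + (i % 75000));
            ps.addBatch();
            if (i % 200 == 199)        // flush in batches of ~200, as in the report
                ps.executeBatch();
        }
        ps.executeBatch();             // flush any remainder
        ps.close();
        conn.close();
    }
}
{noformat}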



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780049#comment-13780049
 ] 

Constance Eustace commented on CASSANDRA-6107:
--

It appears you are using a MAX_CACHE_PREPARED of 100,000, and 
ConcurrentLinkedHashMap should be using that as its eviction threshold.

The individual keys for 200-statement batches can be large (say, 100k each; 
based on the heap dump I think they hold 1 map per statement in the batch, so 
that is easily possible). In that case, 10^5 cached statements x 10^5 bytes 
per statement = 10 gigabytes... uhoh.

I think 600,000 updates, which are 3000 batches of 200 statements each, popped 
the heap on a 4GB run. I figure 1 GB of that heap is used for 
filters/sstables/memtables/etc, so 3000 batches popped 3GB of heap, roughly a 
megabyte per batch.

Can we expose MAX_CACHE_PREPARED as a config parameter?



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780053#comment-13780053
 ] 

Constance Eustace commented on CASSANDRA-6107:
--

I agree, there is client dysfunction here... we're going to stop prepping the 
statements, if possible (I think the cass jdbc project may have required 
prepping to set the consistency level, which sucks, but let me verify).




[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780066#comment-13780066
 ] 

Constance Eustace commented on CASSANDRA-6107:
--

We're using SpringJDBC on top of the cass-jdbc driver. Intercepting the update 
to specify a consistency level is only convenient with a 
PreparedStatementCreator...

so we will not use SpringJDBC/PreparedStatementCreator and will instead do a 
more manual JDBC call...

sorry for the critical spam...



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780073#comment-13780073
 ] 

Jonathan Ellis commented on CASSANDRA-6107:
---

Could we fix the OOM by adding a weight to the Map entry instead of assuming 
all entries are equal?



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780083#comment-13780083
 ] 

Sylvain Lebresne commented on CASSANDRA-6107:
-

Given that preparing statements is not performance sensitive, I suppose we 
could even use jamm to get the precise in-memory size, and then cap the 
prepared statement cache at some percentage of the heap.
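
jamm's MemoryMeter (the agent Cassandra already bundles for memtable liveRatio 
estimation) can take that deep measurement. A minimal sketch, assuming the JVM 
was started with -javaagent:jamm.jar and using a placeholder object in lieu of 
a parsed CQLStatement:

{noformat}
import org.github.jamm.MemoryMeter;

public class MeasurePrepared
{
    public static void main(String[] args)
    {
        MemoryMeter meter = new MemoryMeter(); // requires the jamm agent to be loaded

        // Placeholder standing in for a parsed statement and its internal maps.
        Object statement = new java.util.HashMap<String, String>();

        long shallow = meter.measure(statement);     // this object alone
        long deep = meter.measureDeep(statement);    // object plus everything it references

        // The deep size is the number a per-entry cache weight would use.
        System.out.println("shallow=" + shallow + " bytes, deep=" + deep + " bytes");
    }
}
{noformat}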



[jira] [Commented] (CASSANDRA-6107) CQL3 Batch statement memory leak

2013-09-27 Thread Constance Eustace (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780095#comment-13780095
 ] 

Constance Eustace commented on CASSANDRA-6107:
--

Yep, removing statement preparation looks good! Heap is GC'ing, and multiple 
runs can be done.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira