[jira] [Commented] (CASSANDRA-11122) SASI does not find term when indexing non-ascii character

2016-02-05 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134141#comment-15134141
 ] 

Pierre Laporte commented on CASSANDRA-11122:


Following the script Duyhai provided, I got the same outcome: "Object" is not 
returned in the first script.  This was run against a fresh clone+build of 
https://github.com/xedin/cassandra/ (branch {{CASSANDRA-11067}}).

> SASI does not find term when indexing non-ascii character
> -
>
> Key: CASSANDRA-11122
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11122
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Cassandra 3.4 SNAPSHOT
>Reporter: DOAN DuyHai
> Attachments: CASSANDRA-11122.patch
>
>
> I built the snapshot version taken from here: 
> https://github.com/xedin/cassandra/tree/CASSANDRA-11067
> I created a tiny musical dataset with non-ascii characters (*cyrillic* 
> actually) and created a SASI index on the artist name.
> SASI can find rows for the cyrillic name but strangely fails to index the 
> normal ascii name (_'Object'_).
> {code:sql}
> CREATE KEYSPACE music WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '1'}  AND durable_writes = true;
> CREATE TABLE music.albums (
> title text PRIMARY KEY,
> artist text
> );
> INSERT INTO music.albums(artist,title) VALUES('Object','The Reflecting Skin');
> INSERT INTO music.albums(artist,title) VALUES('Hayden','Mild and Hazy');
> INSERT INTO music.albums(artist,title) VALUES('Самое Большое Простое 
> Число','СБПЧ Оркестр');
> CREATE custom INDEX on music.albums(artist) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
> 'case_sensitive': 'false'};
> SELECT * FROM music.albums;
> title   | artist
> -+-
>  The Reflecting Skin |  Object
>Mild and Hazy |  Hayden
> СБПЧ Оркестр | Самое Большое Простое Число
> (3 rows)
> SELECT * FROM music.albums WHERE artist='Самое Большое Простое Число';
> title   | artist
> -+-
> СБПЧ Оркестр | Самое Большое Простое Число
> (1 rows)
> SELECT * FROM music.albums WHERE artist='Hayden';
> title   | artist
> -+-
>Mild and Hazy |  Hayden
> (1 rows)
> SELECT * FROM music.albums WHERE artist='Object';
> title   | artist
> -+-
> (0 rows)
> SELECT * FROM music.albums WHERE artist like 'Ob%';
> title   | artist
> -+-
> (0 rows)
> {code}
> Strangely enough, after cleaning all the data and re-inserting without the 
> Russian artist with the cyrillic name, SASI does find _'Object'_ ...
> {code:sql}
> DROP INDEX albums_artist_idx;
> TRUNCATE TABLE albums;
> INSERT INTO albums(artist,title) VALUES('Object','The Reflecting Skin');
> INSERT INTO albums(artist,title) VALUES('Hayden','Mild and Hazy');
> CREATE custom INDEX on music.albums(artist) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
> 'case_sensitive': 'false'};
> SELECT * FROM music.albums;
> title   | artist
> -+-
>  The Reflecting Skin |  Object
>Mild and Hazy |  Hayden
> (2 rows)
> SELECT * FROM music.albums WHERE artist='Object';
> title   | artist
> -+-
>  The Reflecting Skin |  Object
> (1 rows)
> SELECT * FROM music.albums WHERE artist LIKE 'Ob%';
> title   | artist
> -+-
>  The Reflecting Skin |  Object
> (1 rows)
> {code}
>  The behaviour is quite inconsistent. I could understand SASI refusing to 
> index cyrillic characters or raising an exception when encountering non-ascii 
> characters (because we did not specify the locale), but it is very surprising 
> that the indexing fails for normal ascii characters like _Object_.
>  Could it be that SASI starts indexing the artist name following the albums 
> table token range order (hash of title) and stops indexing after 
> encountering the cyrillic name?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-12-15 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246984#comment-14246984
 ] 

Pierre Laporte commented on CASSANDRA-8285:
---

The ruby duration tests have passed; the issue has not been seen on C* 2.0.12 
(HEAD + 8285-v2.txt).

Note that [~kishkaru]'s previous tests were run using two tester machines, while 
this one used only one, which reduces the load C* had to handle.  Still, the 
issue has not been seen during the 3-day test.

 OOME in Cassandra 2.0.11
 

 Key: CASSANDRA-8285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8285
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.11 + java-driver 2.0.8-SNAPSHOT
 Cassandra 2.0.11 + ruby-driver 1.0-beta
Reporter: Pierre Laporte
Assignee: Aleksey Yeschenko
 Attachments: 8285-v2.txt, 8285.txt, OOME_node_system.log, 
 gc-1416849312.log.gz, gc.log.gz, heap-usage-after-gc-zoom.png, 
 heap-usage-after-gc.png, system.log.gz


 We ran drivers 3-days endurance tests against Cassandra 2.0.11 and C* crashed 
 with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
 2.0.8-snapshot.
 Attached are :
 | OOME_node_system.log | The system.log of one Cassandra node that crashed |
 | gc.log.gz | The GC log on the same node |
 | heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle 
 |
 | heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |
 Workload :
 Our test executes 5 CQL statements (select, insert, select, delete, select) 
 for a given unique id, during 3 days, using multiple threads.  There is not 
 change in the workload during the test.
 Symptoms :
 In the attached log, it seems something starts in Cassandra between 
 2014-11-06 10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that 
 fills the heap.  We eventually get stuck in a Full GC storm and get an OOME 
 in the logs.
 I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The 
 error does not occur.  It seems specific to 2.0.11.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-12-08 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238059#comment-14238059
 ] 

Pierre Laporte commented on CASSANDRA-8285:
---

Sure. Is the patch already applied to the cassandra-2.0 branch or should I apply 
it manually?

 OOME in Cassandra 2.0.11
 

 Key: CASSANDRA-8285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8285
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.11 + java-driver 2.0.8-SNAPSHOT
 Cassandra 2.0.11 + ruby-driver 1.0-beta
Reporter: Pierre Laporte
Assignee: Aleksey Yeschenko
 Attachments: 8285-v2.txt, 8285.txt, OOME_node_system.log, 
 gc-1416849312.log.gz, gc.log.gz, heap-usage-after-gc-zoom.png, 
 heap-usage-after-gc.png, system.log.gz


 We ran drivers 3-days endurance tests against Cassandra 2.0.11 and C* crashed 
 with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
 2.0.8-snapshot.
 Attached are :
 | OOME_node_system.log | The system.log of one Cassandra node that crashed |
 | gc.log.gz | The GC log on the same node |
 | heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle 
 |
 | heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |
 Workload :
 Our test executes 5 CQL statements (select, insert, select, delete, select) 
 for a given unique id, during 3 days, using multiple threads.  There is not 
 change in the workload during the test.
 Symptoms :
 In the attached log, it seems something starts in Cassandra between 
 2014-11-06 10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that 
 fills the heap.  We eventually get stuck in a Full GC storm and get an OOME 
 in the logs.
 I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The 
 error does not occur.  It seems specific to 2.0.11.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8150) Revaluate Default JVM tuning parameters

2014-12-03 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233202#comment-14233202
 ] 

Pierre Laporte commented on CASSANDRA-8150:
---

[~mstump] By any chance, have you collected Cassandra gc logs against various 
scenarios?  That would be really valuable to find the right values.

I ran a java-driver test against a C* instance on a GCE n1-standard-1 server 
(1 vCPU, 3.75 GB RAM).  The young generation size was 100 MB (80 MB for Eden, 
10 MB for each survivor) and the old generation size was 2.4 GB.

I had the following:
* Average allocation rate: 352MB/s (outliers above 600MB/s)
* 4.5 DefNew cycles per second
* 1 CMS cycle every 10 minutes

Therefore, during the test, Cassandra was promoting objects at a rate of 
3.8 MB/s.

I think the size of Eden could be determined mostly by the allocation rate and 
the DefNew/ParNew frequency we want to achieve.  Here, for instance, I would 
rather have had a bigger young generation to have ~1 DefNew cycle/s.
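
As a rough illustration of that sizing arithmetic (a back-of-the-envelope sketch 
using the numbers above, and assuming the default {{-XX:SurvivorRatio=8}}, which 
is not confirmed for this run):

{code}
// Back-of-the-envelope young-gen sizing from the figures above (illustrative only).
public class EdenSizingSketch {
    public static void main(String[] args) {
        double allocationRateMBps = 352.0;     // observed average allocation rate
        double targetSecondsBetweenGCs = 1.0;  // target of ~1 DefNew cycle per second
        double survivorRatio = 8.0;            // assumed default -XX:SurvivorRatio=8

        // Eden has to absorb roughly one collection interval's worth of allocation.
        double edenMB = allocationRateMBps * targetSecondsBetweenGCs;
        // Young gen = Eden + 2 survivors, with Eden:Survivor = survivorRatio:1.
        double youngMB = edenMB * (survivorRatio + 2) / survivorRatio;

        System.out.printf("Eden  ~= %.0f MB%n", edenMB);   // ~352 MB
        System.out.printf("Young ~= %.0f MB%n", youngMB);  // ~440 MB
    }
}
{code}

The same arithmetic applied to the 80 MB Eden used in this test gives 80 / 352 ≈ 0.23 s 
between collections, i.e. ~4.4 DefNew cycles/s, which matches what was observed.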

I did not enable {{-XX:+PrintTenuringDistribution}} so I do not know whether 
the objects were prematurely promoted.  That would have given pointers on 
survivors sizing as well.

Do you have any gc logs with this flag enabled?

 Revaluate Default JVM tuning parameters
 ---

 Key: CASSANDRA-8150
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150
 Project: Cassandra
  Issue Type: Improvement
  Components: Config
Reporter: Matt Stump
Assignee: Brandon Williams
 Attachments: upload.png


 It's been found that the old twitter recommendations of 100m per core up to 
 800m is harmful and should no longer be used.
 Instead the formula used should be 1/3 or 1/4 max heap with a max of 2G. 1/3 
 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess 
 1/3 is probably better for releases greater than 2.1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-11-24 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223083#comment-14223083
 ] 

Pierre Laporte commented on CASSANDRA-8285:
---

I hit the issue after ~1.5 days on the endurance test of java-driver 2.1.3 
against C* 2.0.12.

Please find the associated heap dump 
[here|https://drive.google.com/open?id=0BxvGkaXP3ayeOElqY1ZNQTlBNTg&authuser=1]



 OOME in Cassandra 2.0.11
 

 Key: CASSANDRA-8285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8285
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.11 + java-driver 2.0.8-SNAPSHOT
 Cassandra 2.0.11 + ruby-driver 1.0-beta
Reporter: Pierre Laporte
Assignee: Aleksey Yeschenko
 Attachments: OOME_node_system.log, gc.log.gz, 
 heap-usage-after-gc-zoom.png, heap-usage-after-gc.png


 We ran drivers 3-days endurance tests against Cassandra 2.0.11 and C* crashed 
 with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
 2.0.8-snapshot.
 Attached are :
 | OOME_node_system.log | The system.log of one Cassandra node that crashed |
 | gc.log.gz | The GC log on the same node |
 | heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle 
 |
 | heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |
 Workload :
 Our test executes 5 CQL statements (select, insert, select, delete, select) 
 for a given unique id, during 3 days, using multiple threads.  There is not 
 change in the workload during the test.
 Symptoms :
 In the attached log, it seems something starts in Cassandra between 
 2014-11-06 10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that 
 fills the heap.  We eventually get stuck in a Full GC storm and get an OOME 
 in the logs.
 I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The 
 error does not occur.  It seems specific to 2.0.11.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-11-24 Thread Pierre Laporte (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Laporte updated CASSANDRA-8285:
--
Attachment: system.log.gz
gc-1416849312.log.gz

I just reproduced the issue on my machine against Cassandra 2.1.2.

*Howto*

Create a 3-node C* cluster

{code}ccm create -n 3 -v 2.1.2 -b -s -i 127.0.0. cassandra-2.1{code}

Insert/delete a lot of rows inside a single table.  I was actually trying to 
reproduce the TombstoneOverwhelmingException but got an OOME instead.

{code}
// DataStax java-driver 2.x imports
import java.io.IOException;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.policies.LoadBalancingPolicy;
import com.datastax.driver.core.policies.RoundRobinPolicy;
import com.datastax.driver.core.querybuilder.Batch;
import com.datastax.driver.core.querybuilder.QueryBuilder;

import static com.datastax.driver.core.querybuilder.QueryBuilder.eq;

public class CassandraTest implements AutoCloseable {
    public static final String KEYSPACE = "TombstonesOverwhelming";

    private Cluster cluster;
    protected Session session;

    public CassandraTest() {
        this(new RoundRobinPolicy());
    }

    public CassandraTest(LoadBalancingPolicy loadBalancingPolicy) {
        System.out.println("Creating builder...");
        cluster = Cluster.builder().addContactPoint("127.0.0.1")
                .withLoadBalancingPolicy(loadBalancingPolicy).build();
        for (Host host : cluster.getMetadata().getAllHosts()) {
            System.out.println("Found host " + host.getAddress() + " in DC " + host.getDatacenter());
        }
        session = cluster.connect();
    }

    private void executeQuietly(String query) {
        try {
            execute(query);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private ResultSet execute(String query) {
        return session.execute(query);
    }

    private ResultSet execute(Statement statement) {
        return session.execute(statement);
    }

    @Override
    public void close() throws IOException {
        cluster.close();
    }

    public static void main(String... args) throws Exception {
        try (CassandraTest test = new CassandraTest()) {
            test.executeQuietly("DROP KEYSPACE IF EXISTS " + KEYSPACE);
            test.execute("CREATE KEYSPACE " + KEYSPACE + " " +
                    "WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }");
            test.execute("USE " + KEYSPACE);
            test.execute("CREATE TABLE useful (run int, iteration int, copy int, PRIMARY KEY (run, iteration, copy))");

            System.out.println("Press ENTER to start the test");
            System.in.read();

            for (int run = 0; run < 1_000_000; run++) {
                System.out.printf("Starting run % 7d... ", run);
                System.out.print("Inserting...");
                // Insert 100 rows per QUORUM batch...
                for (int iteration = 0; iteration < 1_000_000; iteration++) {
                    Batch batch = QueryBuilder.batch();
                    batch.setConsistencyLevel(ConsistencyLevel.QUORUM);
                    for (int copy = 0; copy < 100; copy++) {
                        batch.add(QueryBuilder.insertInto("useful")
                                .value("run", run).value("iteration", iteration).value("copy", copy));
                    }
                    test.execute(batch);
                }
                System.out.println("Deleting...");
                // ...then delete the same rows again, generating tombstones.
                for (int iteration = 0; iteration < 1_000_000; iteration++) {
                    Batch batch = QueryBuilder.batch();
                    batch.setConsistencyLevel(ConsistencyLevel.QUORUM);
                    for (int copy = 0; copy < 100; copy++) {
                        batch.add(QueryBuilder.delete().from("useful")
                                .where(eq("run", run)).and(eq("iteration", iteration)).and(eq("copy", copy)));
                    }
                    test.execute(batch);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
{code}

It took ~50 minutes before two instances OOME'd.  Please find attached the gc 
log and the system log.  If needed, I can upload a heap dump too.

Hope that helps

 OOME in Cassandra 2.0.11
 

 Key: CASSANDRA-8285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8285
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.11 + java-driver 2.0.8-SNAPSHOT
 Cassandra 2.0.11 + ruby-driver 1.0-beta
Reporter: Pierre Laporte
Assignee: Aleksey Yeschenko
 Attachments: OOME_node_system.log, gc-1416849312.log.gz, gc.log.gz, 
 heap-usage-after-gc-zoom.png, heap-usage-after-gc.png, system.log.gz


 We ran drivers 3-days endurance tests against Cassandra 2.0.11 and C* crashed 
 with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
 2.0.8-snapshot.
 Attached are :
 | OOME_node_system.log | The system.log of one Cassandra node that crashed |
 | gc.log.gz | The GC log on the same node |
 | heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle 
 |
 | heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |
 Workload :
 Our test executes 5 CQL statements (select, insert, select, delete, 

[jira] [Comment Edited] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-11-24 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223321#comment-14223321
 ] 

Pierre Laporte edited comment on CASSANDRA-8285 at 11/24/14 7:09 PM:
-

I just reproduced the issue on my machine against Cassandra 2.1.2.

*Howto*

Create a 3-node C* cluster

{code}ccm create -n 3 -v 2.1.2 -b -s -i 127.0.0. cassandra-2.1{code}

Insert/delete a lot of rows inside a single table.  I was actually trying to 
reproduce the TombstoneOverwhelmingException but got an OOME instead.

{code}
public class CassandraTest implements AutoCloseable {
    public static final String KEYSPACE = "TombstonesOverwhelming";

    private Cluster cluster;
    protected Session session;

    public CassandraTest() {
        this(new RoundRobinPolicy());
    }

    public CassandraTest(LoadBalancingPolicy loadBalancingPolicy) {
        System.out.println("Creating builder...");
        cluster = Cluster.builder().addContactPoint("127.0.0.1")
                .withLoadBalancingPolicy(loadBalancingPolicy).build();
        for (Host host : cluster.getMetadata().getAllHosts()) {
            System.out.println("Found host " + host.getAddress() + " in DC " + host.getDatacenter());
        }
        session = cluster.connect();
    }

    private void executeQuietly(String query) {
        try {
            execute(query);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private ResultSet execute(String query) {
        return session.execute(query);
    }

    private ResultSet execute(Statement statement) {
        return session.execute(statement);
    }

    @Override
    public void close() throws IOException {
        cluster.close();
    }

    public static void main(String... args) throws Exception {
        try (CassandraTest test = new CassandraTest()) {
            test.executeQuietly("DROP KEYSPACE IF EXISTS " + KEYSPACE);
            test.execute("CREATE KEYSPACE " + KEYSPACE + " " +
                    "WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }");
            test.execute("USE " + KEYSPACE);
            test.execute("CREATE TABLE useful (run int, iteration int, copy int, PRIMARY KEY (run, iteration, copy))");

            System.out.println("Press ENTER to start the test");
            System.in.read();

            for (int run = 0; run < 1_000_000; run++) {
                System.out.printf("Starting run % 7d... ", run);
                System.out.print("Inserting...");
                for (int iteration = 0; iteration < 1_000_000; iteration++) {
                    Batch batch = QueryBuilder.batch();
                    batch.setConsistencyLevel(ConsistencyLevel.QUORUM);
                    for (int copy = 0; copy < 100; copy++) {
                        batch.add(QueryBuilder.insertInto("useful")
                                .value("run", run).value("iteration", iteration).value("copy", copy));
                    }
                    test.execute(batch);
                }
                System.out.println("Deleting...");
                for (int iteration = 0; iteration < 1_000_000; iteration++) {
                    Batch batch = QueryBuilder.batch();
                    batch.setConsistencyLevel(ConsistencyLevel.QUORUM);
                    for (int copy = 0; copy < 100; copy++) {
                        batch.add(QueryBuilder.delete().from("useful")
                                .where(eq("run", run)).and(eq("iteration", iteration)).and(eq("copy", copy)));
                    }
                    test.execute(batch);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
{code}

It took ~50 minutes before two instances OOME'd.  Please find attached the gc 
log (gc-1416849312.log.gz) and the system log (system.log.gz).  If needed, I 
can upload a heap dump too.

Hope that helps


was (Author: pingtimeout):
I just reproduced the issue on my machine against Cassandra 2.1.2.

*Howto*

Create 3-nodes C* cluster

{code}ccm create -n 3 -v 2.1.2 -b -s -i 127.0.0. cassandra-2.1{code}

Insert/delete a lot of rows inside a single table.  I was actually trying to 
reproduce the TombstoneOverwhelmingException but got an OOME instead.

{code}
public class CassandraTest implements AutoCloseable {
    public static final String KEYSPACE = "TombstonesOverwhelming";

    private Cluster cluster;
    protected Session session;

    public CassandraTest() {
        this(new RoundRobinPolicy());
    }

    public CassandraTest(LoadBalancingPolicy loadBalancingPolicy) {
        System.out.println("Creating builder...");
        cluster = Cluster.builder().addContactPoint("127.0.0.1")
                .withLoadBalancingPolicy(loadBalancingPolicy).build();
        for (Host host : cluster.getMetadata().getAllHosts()) {
            System.out.println("Found host " + host.getAddress() + " in DC " + host.getDatacenter());
        }
        session 

[jira] [Commented] (CASSANDRA-8365) CamelCase name is used as index name instead of lowercase

2014-11-24 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223328#comment-14223328
 ] 

Pierre Laporte commented on CASSANDRA-8365:
---

[~philipthompson] I am using 2.1.2

 CamelCase name is used as index name instead of lowercase
 -

 Key: CASSANDRA-8365
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8365
 Project: Cassandra
  Issue Type: Bug
Reporter: Pierre Laporte
Priority: Minor
  Labels: cqlsh

 In cqlsh, when I execute a CREATE INDEX FooBar ... statement, the CamelCase 
 name is used as index name, even though it is unquoted. Trying to quote the 
 index name results in a syntax error.
 However, when I try to delete the index, I have to quote the index name, 
 otherwise I get an invalid-query error telling me that the index (lowercase) 
 does not exist.
 This seems inconsistent.  Shouldn't the index name be lowercased before the 
 index is created ?
 Here is the code to reproduce the issue :
 {code}
 cqlsh:schemabuilderit> CREATE TABLE IndexTest (a int primary key, b int);
 cqlsh:schemabuilderit> CREATE INDEX FooBar on indextest (b);
 cqlsh:schemabuilderit> DESCRIBE TABLE indextest ;
 CREATE TABLE schemabuilderit.indextest (
     a int PRIMARY KEY,
     b int
 ) ;
 CREATE INDEX FooBar ON schemabuilderit.indextest (b);
 cqlsh:schemabuilderit> DROP INDEX FooBar;
 <ErrorMessage code=2200 [Invalid query] message="Index 'foobar' could not be found in any 
 of the tables of keyspace 'schemabuilderit'">
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8355) NPE when passing wrong argument in ALTER TABLE statement

2014-11-21 Thread Pierre Laporte (JIRA)
Pierre Laporte created CASSANDRA-8355:
-

 Summary: NPE when passing wrong argument in ALTER TABLE statement
 Key: CASSANDRA-8355
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8355
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.1.2
Reporter: Pierre Laporte
Priority: Minor


When I tried to change the caching strategy of a table, I provided a wrong 
argument {{'rows_per_partition' : ALL}} with unquoted ALL. Cassandra returned a 
SyntaxError, which is good, but it seems it was because of a 
NullPointerException.

*Howto*
{code}
CREATE TABLE foo (k int primary key);
ALTER TABLE foo WITH caching = {'keys' : 'all', 'rows_per_partition' : ALL};
{code}

*Output*
{code}
<ErrorMessage code=2000 [Syntax error in CQL query] message="Failed parsing 
statement: [ALTER TABLE foo WITH caching = {'keys' : 'all', 
'rows_per_partition' : ALL};] reason: NullPointerException null">
{code}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8365) CamelCase name is used as index name instead of lowercase

2014-11-21 Thread Pierre Laporte (JIRA)
Pierre Laporte created CASSANDRA-8365:
-

 Summary: CamelCase name is used as index name instead of lowercase
 Key: CASSANDRA-8365
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8365
 Project: Cassandra
  Issue Type: Bug
Reporter: Pierre Laporte
Priority: Minor


In cqlsh, when I execute a CREATE INDEX FooBar ... statement, the CamelCase 
name is used as the index name, even though it is unquoted. Trying to quote the 
index name results in a syntax error.

However, when I try to delete the index, I have to quote the index name, 
otherwise I get an invalid-query error telling me that the index (lowercase) 
does not exist.

This seems inconsistent.  Shouldn't the index name be lowercased before the 
index is created?

Here is the code to reproduce the issue:

{code}
cqlsh:schemabuilderit> CREATE TABLE IndexTest (a int primary key, b int);
cqlsh:schemabuilderit> CREATE INDEX FooBar on indextest (b);
cqlsh:schemabuilderit> DESCRIBE TABLE indextest ;

CREATE TABLE schemabuilderit.indextest (
    a int PRIMARY KEY,
    b int
) ;
CREATE INDEX FooBar ON schemabuilderit.indextest (b);

cqlsh:schemabuilderit> DROP INDEX FooBar;
<ErrorMessage code=2200 [Invalid query] message="Index 'foobar' could not be found in any of 
the tables of keyspace 'schemabuilderit'">
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-11-12 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208169#comment-14208169
 ] 

Pierre Laporte commented on CASSANDRA-8285:
---

[~jbellis] Please find a new gc log, system log and heap dump 
[here|https://drive.google.com/a/datastax.com/folderview?id=0BxvGkaXP3ayeNV83Nm1nSUNEcDQ&usp=sharing]
 

Those 3 files come from the same instance that crashed after a couple of hours. 
 The heap dump was triggered by {{-XX:+HeapDumpOnOutOfMemoryError}}

Hope that helps

 OOME in Cassandra 2.0.11
 

 Key: CASSANDRA-8285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8285
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.11 + java-driver 2.0.8-SNAPSHOT
 Cassandra 2.0.11 + ruby-driver 1.0-beta
Reporter: Pierre Laporte
Assignee: Russ Hatch
 Attachments: OOME_node_system.log, gc.log.gz, 
 heap-usage-after-gc-zoom.png, heap-usage-after-gc.png


 We ran drivers 3-days endurance tests against Cassandra 2.0.11 and C* crashed 
 with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
 2.0.8-snapshot.
 Attached are :
 | OOME_node_system.log | The system.log of one Cassandra node that crashed |
 | gc.log.gz | The GC log on the same node |
 | heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle 
 |
 | heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |
 Workload :
 Our test executes 5 CQL statements (select, insert, select, delete, select) 
 for a given unique id, during 3 days, using multiple threads.  There is not 
 change in the workload during the test.
 Symptoms :
 In the attached log, it seems something starts in Cassandra between 
 2014-11-06 10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that 
 fills the heap.  We eventually get stuck in a Full GC storm and get an OOME 
 in the logs.
 I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The 
 error does not occur.  It seems specific to 2.0.11.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-11-10 Thread Pierre Laporte (JIRA)
Pierre Laporte created CASSANDRA-8285:
-

 Summary: OOME in Cassandra 2.0.11
 Key: CASSANDRA-8285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8285
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.11 + java-driver 2.0.8-SNAPSHOT
Cassandra 2.0.11 + ruby-driver 1.0-beta

Reporter: Pierre Laporte
 Attachments: OOME_node_system.log, gc.log.gz, 
heap-usage-after-gc-zoom.png, heap-usage-after-gc.png

We ran our drivers' 3-day endurance tests against Cassandra 2.0.11 and C* crashed 
with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
2.0.8-snapshot.

Attached are :

| OOME_node_system.log | The system.log of one Cassandra node that crashed |
| gc.log.gz | The GC log on the same node |
| heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle |
| heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |

Workload :
Our test executes 5 CQL statements (select, insert, select, delete, select) for 
a given unique id, for 3 days, using multiple threads.  There is no change 
in the workload during the test.

Symptoms :
In the attached log, it seems something starts in Cassandra between 2014-11-06 
10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that fills the 
heap.  We eventually get stuck in a Full GC storm and get an OOME in the logs.

I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The error 
does not occur.  It seems specific to 2.0.10.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8285) OOME in Cassandra 2.0.11

2014-11-10 Thread Pierre Laporte (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Laporte updated CASSANDRA-8285:
--
Description: 
We ran our drivers' 3-day endurance tests against Cassandra 2.0.11 and C* crashed 
with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
2.0.8-snapshot.

Attached are :

| OOME_node_system.log | The system.log of one Cassandra node that crashed |
| gc.log.gz | The GC log on the same node |
| heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle |
| heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |

Workload :
Our test executes 5 CQL statements (select, insert, select, delete, select) for 
a given unique id, for 3 days, using multiple threads.  There is no change 
in the workload during the test.

Symptoms :
In the attached log, it seems something starts in Cassandra between 2014-11-06 
10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that fills the 
heap.  We eventually get stuck in a Full GC storm and get an OOME in the logs.

I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The error 
does not occur.  It seems specific to 2.0.11.



  was:
We ran drivers 3-days endurance tests against Cassandra 2.0.11 and C* crashed 
with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
2.0.8-snapshot.

Attached are :

| OOME_node_system.log | The system.log of one Cassandra node that crashed |
| gc.log.gz | The GC log on the same node |
| heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle |
| heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |

Workload :
Our test executes 5 CQL statements (select, insert, select, delete, select) for 
a given unique id, during 3 days, using multiple threads.  There is not change 
in the workload during the test.

Symptoms :
In the attached log, it seems something starts in Cassandra between 2014-11-06 
10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that fills the 
heap.  We eventually get stuck in a Full GC storm and get an OOME in the logs.

I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The error 
does not occur.  It seems specific to 2.0.10.




 OOME in Cassandra 2.0.11
 

 Key: CASSANDRA-8285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8285
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.11 + java-driver 2.0.8-SNAPSHOT
 Cassandra 2.0.11 + ruby-driver 1.0-beta
Reporter: Pierre Laporte
 Attachments: OOME_node_system.log, gc.log.gz, 
 heap-usage-after-gc-zoom.png, heap-usage-after-gc.png


 We ran drivers 3-days endurance tests against Cassandra 2.0.11 and C* crashed 
 with an OOME.  This happened both with ruby-driver 1.0-beta and java-driver 
 2.0.8-snapshot.
 Attached are :
 | OOME_node_system.log | The system.log of one Cassandra node that crashed |
 | gc.log.gz | The GC log on the same node |
 | heap-usage-after-gc.png | The heap occupancy evolution after every GC cycle 
 |
 | heap-usage-after-gc-zoom.png | A focus on when things start to go wrong |
 Workload :
 Our test executes 5 CQL statements (select, insert, select, delete, select) 
 for a given unique id, during 3 days, using multiple threads.  There is not 
 change in the workload during the test.
 Symptoms :
 In the attached log, it seems something starts in Cassandra between 
 2014-11-06 10:29:22 and 2014-11-06 10:45:32.  This causes an allocation that 
 fills the heap.  We eventually get stuck in a Full GC storm and get an OOME 
 in the logs.
 I have run the java-driver tests against Cassandra 1.2.19 and 2.1.1.  The 
 error does not occur.  It seems specific to 2.0.11.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8276) Unusable prepared statement with 65k parameters

2014-11-07 Thread Pierre Laporte (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Laporte updated CASSANDRA-8276:
--
Summary: Unusable prepared statement with 65k parameters  (was: Prepared 
statement unavailable)

 Unusable prepared statement with 65k parameters
 ---

 Key: CASSANDRA-8276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8276
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.10
 Java driver 2.0.8-SNAPSHOT
Reporter: Pierre Laporte

 We had an issue 
 ([JAVA-515|https://datastax-oss.atlassian.net/browse/JAVA-515]) in the 
 java-driver when the number of parameters in a statement is greater than the 
 supported limit (65k).
 I added a limit-test to verify that prepared statements with 65535 parameters 
 were accepted by the driver, but ran into an issue on the Cassandra side.
 Basically, the test runs forever, because the driver receives an inconsistent 
 answer from Cassandra.  When we prepare the statement, C* answers that it is 
 correctly prepared, however when we try to execute it, we receive a 
 {{UNPREPARED}} answer.
 [Here is the 
 code|https://github.com/datastax/java-driver/blob/JAVA-515/driver-core/src/test/java/com/datastax/driver/core/PreparedStatementTest.java#L448]
  to reproduce the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8276) Prepared statement unavailable

2014-11-07 Thread Pierre Laporte (JIRA)
Pierre Laporte created CASSANDRA-8276:
-

 Summary: Prepared statement unavailable
 Key: CASSANDRA-8276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8276
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.0.10
Java driver 2.0.8-SNAPSHOT
Reporter: Pierre Laporte


We had an issue ([JAVA-515|https://datastax-oss.atlassian.net/browse/JAVA-515]) 
in the java-driver when the number of parameters in a statement is greater than 
the supported limit (65k).

I added a limit-test to verify that prepared statements with 65535 parameters 
were accepted by the driver, but ran into an issue on the Cassandra side.

Basically, the test runs forever, because the driver receives an inconsistent 
answer from Cassandra.  When we prepare the statement, C* answers that it is 
correctly prepared; however, when we try to execute it, we receive an 
{{UNPREPARED}} answer.

[Here is the 
code|https://github.com/datastax/java-driver/blob/JAVA-515/driver-core/src/test/java/com/datastax/driver/core/PreparedStatementTest.java#L448]
 to reproduce the issue.
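
For reference, a minimal sketch of that scenario with the 2.0 java-driver (the 
keyspace, table and column names here are made up; the actual test is the one 
linked above):

{code}
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class ManyParametersSketch {
    public static void main(String[] args) {
        int params = 65535;  // parameter count travels as an unsigned 16-bit short in the protocol
        StringBuilder query = new StringBuilder("SELECT * FROM ks.tbl WHERE k IN (");
        for (int i = 0; i < params; i++) {
            query.append(i == 0 ? "?" : ",?");
        }
        query.append(")");

        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            Session session = cluster.connect();
            // The PREPARE round-trip succeeds...
            PreparedStatement ps = session.prepare(query.toString());
            BoundStatement bs = ps.bind();
            for (int i = 0; i < params; i++) {
                bs.setInt(i, i);
            }
            // ...but the EXECUTE is answered with UNPREPARED, so the driver re-prepares and retries.
            session.execute(bs);
        } finally {
            cluster.close();
        }
    }
}
{code}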




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7743) Possible C* OOM issue during long running test

2014-08-13 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095698#comment-14095698
 ] 

Pierre Laporte commented on CASSANDRA-7743:
---

[~tjake] Sure, I just started a new test with this option

 Possible C* OOM issue during long running test
 --

 Key: CASSANDRA-7743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7743
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Google Compute Engine, n1-standard-1
Reporter: Pierre Laporte
 Fix For: 2.1.0


 During a long running test, we ended up with a lot of 
 java.lang.OutOfMemoryError: Direct buffer memory errors on the Cassandra 
 instances.
 Here is an example of stacktrace from system.log :
 {code}
 ERROR [SharedPool-Worker-1] 2014-08-11 11:09:34,610 ErrorMessage.java:218 - 
 Unexpected exception during request
 java.lang.OutOfMemoryError: Direct buffer memory
 at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_25]
 at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
 ~[na:1.7.0_25]
 at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
 ~[na:1.7.0_25]
 at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:251)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:107)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:112)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:507) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:464)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
 {code}
 The test consisted of a 3-nodes cluster of n1-standard-1 GCE instances (1 
 vCPU, 3.75 GB RAM) running cassandra-2.1.0-rc5, and a n1-standard-2 instance 
 running the test.
 After ~2.5 days, several requests start to fail and we see the previous 
 stacktraces in the system.log file.
 The output from linux ‘free’ and ‘meminfo’ suggest that there is still memory 
 available.
 {code}
 $ free -m
              total       used       free     shared    buffers     cached
 Mem:          3702       3532        169          0        161        854
 -/+ buffers/cache:       2516       1185
 Swap:            0          0          0
 $ head -n 4 /proc/meminfo
 MemTotal:3791292 kB
 MemFree:  173568 kB
 Buffers:  165608 kB
 Cached:   874752 kB
 {code}
 These errors do not affect all the queries we run. The cluster is still 
 responsive but is unable to display tracing information using cqlsh :
 {code}
 $ ./bin/nodetool --host 10.240.137.253 status duration_test
 Datacenter: DC1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
 UN  10.240.98.27    925.17 KB  256     100.0%            41314169-eff5-465f-85ea-d501fd8f9c5e  RAC1
 UN  10.240.137.253  1.1 MB     256     100.0%            c706f5f9-c5f3-4d5e-95e9-a8903823827e  RAC1
 UN  10.240.72.183   896.57 KB  256     100.0%            15735c4d-98d4-4ea4-a305-7ab2d92f65fc  RAC1
 $ echo 'tracing on; 

[jira] [Commented] (CASSANDRA-7743) Possible C* OOM issue during long running test

2014-08-12 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093965#comment-14093965
 ] 

Pierre Laporte commented on CASSANDRA-7743:
---

[~benedict] Actually, the nodes are running with memtable_allocation_type: 
heap_buffers.

[~jbellis] The test failed on the bigger instance too.  I just realized that 
setting -XX:MaxDirectMemorySize=-1 is useless since it is the default value.  
Now I am doubting that -1 really means unlimited...  Restarting a new test with 
-XX:MaxDirectMemorySize=1G to see if things change.
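
To double-check what limit is actually in effect, a small stand-alone probe like 
this (my own sketch, not part of the endurance test) can allocate direct buffers 
until the OOME fires:

{code}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Allocates 1 MB direct buffers until the "Direct buffer memory" OOME, to observe the
// effective -XX:MaxDirectMemorySize. Run it with an explicit limit (e.g. -XX:MaxDirectMemorySize=1G)
// to avoid exhausting the host when the limit really is unlimited.
public class DirectMemoryProbe {
    public static void main(String[] args) {
        List<ByteBuffer> buffers = new ArrayList<ByteBuffer>();  // keep references alive
        long allocatedMB = 0;
        try {
            while (true) {
                buffers.add(ByteBuffer.allocateDirect(1024 * 1024));
                allocatedMB++;
            }
        } catch (OutOfMemoryError e) {
            System.out.println("OOME after ~" + allocatedMB + " MB of direct buffers: " + e);
        }
    }
}
{code}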

 Possible C* OOM issue during long running test
 --

 Key: CASSANDRA-7743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7743
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Google Compute Engine, n1-standard-1
Reporter: Pierre Laporte

 During a long running test, we ended up with a lot of 
 java.lang.OutOfMemoryError: Direct buffer memory errors on the Cassandra 
 instances.
 Here is an example of stacktrace from system.log :
 {code}
 ERROR [SharedPool-Worker-1] 2014-08-11 11:09:34,610 ErrorMessage.java:218 - 
 Unexpected exception during request
 java.lang.OutOfMemoryError: Direct buffer memory
 at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_25]
 at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
 ~[na:1.7.0_25]
 at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
 ~[na:1.7.0_25]
 at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:251)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:107)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:112)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:507) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:464)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
 {code}
 The test consisted of a 3-nodes cluster of n1-standard-1 GCE instances (1 
 vCPU, 3.75 GB RAM) running cassandra-2.1.0-rc5, and a n1-standard-2 instance 
 running the test.
 After ~2.5 days, several requests start to fail and we see the previous 
 stacktraces in the system.log file.
 The output from linux ‘free’ and ‘meminfo’ suggest that there is still memory 
 available.
 {code}
 $ free -m
              total       used       free     shared    buffers     cached
 Mem:          3702       3532        169          0        161        854
 -/+ buffers/cache:       2516       1185
 Swap:            0          0          0
 $ head -n 4 /proc/meminfo
 MemTotal:3791292 kB
 MemFree:  173568 kB
 Buffers:  165608 kB
 Cached:   874752 kB
 {code}
 These errors do not affect all the queries we run. The cluster is still 
 responsive but is unable to display tracing information using cqlsh :
 {code}
 $ ./bin/nodetool --host 10.240.137.253 status duration_test
 Datacenter: DC1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address Load   Tokens  Owns (effective)  Host ID  
  Rack
 UN  10.240.98.27925.17 KB  256 100.0% 

[jira] [Commented] (CASSANDRA-7743) Possible C* OOM issue during long running test

2014-08-12 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094030#comment-14094030
 ] 

Pierre Laporte commented on CASSANDRA-7743:
---

Sure, I have uploaded one here : 
https://drive.google.com/file/d/0BxvGkaXP3ayeMDlRTWJ2MVhvT0E/edit?usp=sharing

 Possible C* OOM issue during long running test
 --

 Key: CASSANDRA-7743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7743
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Google Compute Engine, n1-standard-1
Reporter: Pierre Laporte

 During a long running test, we ended up with a lot of 
 java.lang.OutOfMemoryError: Direct buffer memory errors on the Cassandra 
 instances.
 Here is an example of stacktrace from system.log :
 {code}
 ERROR [SharedPool-Worker-1] 2014-08-11 11:09:34,610 ErrorMessage.java:218 - 
 Unexpected exception during request
 java.lang.OutOfMemoryError: Direct buffer memory
 at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_25]
 at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
 ~[na:1.7.0_25]
 at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
 ~[na:1.7.0_25]
 at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:251)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:107)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:112)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:507) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:464)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
 {code}
 The test consisted of a 3-nodes cluster of n1-standard-1 GCE instances (1 
 vCPU, 3.75 GB RAM) running cassandra-2.1.0-rc5, and a n1-standard-2 instance 
 running the test.
 After ~2.5 days, several requests start to fail and we see the previous 
 stacktraces in the system.log file.
 The output from linux ‘free’ and ‘meminfo’ suggest that there is still memory 
 available.
 {code}
 $ free -m
              total       used       free     shared    buffers     cached
 Mem:          3702       3532        169          0        161        854
 -/+ buffers/cache:       2516       1185
 Swap:            0          0          0
 $ head -n 4 /proc/meminfo
 MemTotal:3791292 kB
 MemFree:  173568 kB
 Buffers:  165608 kB
 Cached:   874752 kB
 {code}
 These errors do not affect all the queries we run. The cluster is still 
 responsive but is unable to display tracing information using cqlsh :
 {code}
 $ ./bin/nodetool --host 10.240.137.253 status duration_test
 Datacenter: DC1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address Load   Tokens  Owns (effective)  Host ID  
  Rack
 UN  10.240.98.27925.17 KB  256 100.0%
 41314169-eff5-465f-85ea-d501fd8f9c5e  RAC1
 UN  10.240.137.253  1.1 MB 256 100.0%
 c706f5f9-c5f3-4d5e-95e9-a8903823827e  RAC1
 UN  10.240.72.183   896.57 KB  256 100.0%
 15735c4d-98d4-4ea4-a305-7ab2d92f65fc  

[jira] [Created] (CASSANDRA-7743) Possible C* OOM issue during long running test

2014-08-11 Thread Pierre Laporte (JIRA)
Pierre Laporte created CASSANDRA-7743:
-

 Summary: Possible C* OOM issue during long running test
 Key: CASSANDRA-7743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7743
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Google Compute Engine, n1-standard-1
Reporter: Pierre Laporte


During a long running test, we ended up with a lot of 
java.lang.OutOfMemoryError: Direct buffer memory errors on the Cassandra 
instances.

Here is an example of stacktrace from system.log :
{code}
ERROR [SharedPool-Worker-1] 2014-08-11 11:09:34,610 ErrorMessage.java:218 - 
Unexpected exception during request
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_25]
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
~[na:1.7.0_25]
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
~[na:1.7.0_25]
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434) 
~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) 
~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:251)
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146)
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:107)
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:112)
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:507) 
~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:464)
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378) 
~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350) 
~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
{code}

The test consisted of a 3-nodes cluster of n1-standard-1 GCE instances (1 vCPU, 
3.75 GB RAM) running cassandra-2.1.0-rc5, and a n1-standard-2 instance running 
the test.

After ~2.5 days, several requests start to fail and we see the previous 
stacktraces in the system.log file.

The output from linux ‘free’ and ‘meminfo’ suggest that there is still memory 
available.

{code}
$ free -m
             total       used       free     shared    buffers     cached
Mem:          3702       3532        169          0        161        854
-/+ buffers/cache:       2516       1185
Swap:            0          0          0

$ head -n 4 /proc/meminfo
MemTotal:3791292 kB
MemFree:  173568 kB
Buffers:  165608 kB
Cached:   874752 kB
{code}

These errors do not affect all the queries we run. The cluster is still 
responsive but is unable to display tracing information using cqlsh :

{code}
$ ./bin/nodetool --host 10.240.137.253 status duration_test
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.240.98.27    925.17 KB  256     100.0%            41314169-eff5-465f-85ea-d501fd8f9c5e  RAC1
UN  10.240.137.253  1.1 MB     256     100.0%            c706f5f9-c5f3-4d5e-95e9-a8903823827e  RAC1
UN  10.240.72.183   896.57 KB  256     100.0%            15735c4d-98d4-4ea4-a305-7ab2d92f65fc  RAC1


$ echo 'tracing on; select count(*) from duration_test.ints;' | ./bin/cqlsh 
10.240.137.253
Now tracing requests.

 count
---
  9486

(1 rows)

Statement trace did not complete within 10 seconds
{code}





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7743) Possible C* OOM issue during long running test

2014-08-11 Thread Pierre Laporte (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093010#comment-14093010
 ] 

Pierre Laporte commented on CASSANDRA-7743:
---

[~enigmacurry] Eclipse MAT shows 300k instances of java.nio.ByteBuffer[] but 
retaining only ~26MB. It only accounts for in-heap data.

[~jbellis] Ok I am going to start two new tests: one on n1-standard-1 with 
-XX:MaxDirectMemorySize=-1 and another one on n1-standard-2 without this setting

 Possible C* OOM issue during long running test
 --

 Key: CASSANDRA-7743
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7743
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Google Compute Engine, n1-standard-1
Reporter: Pierre Laporte

 During a long running test, we ended up with a lot of 
 java.lang.OutOfMemoryError: Direct buffer memory errors on the Cassandra 
 instances.
 Here is an example of stacktrace from system.log :
 {code}
 ERROR [SharedPool-Worker-1] 2014-08-11 11:09:34,610 ErrorMessage.java:218 - 
 Unexpected exception during request
 java.lang.OutOfMemoryError: Direct buffer memory
 at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_25]
 at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
 ~[na:1.7.0_25]
 at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
 ~[na:1.7.0_25]
 at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:251)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:107)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:112)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:507) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:464)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350) 
 ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at 
 io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
  ~[netty-all-4.0.20.Final.jar:4.0.20.Final]
 at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
 {code}
 The test consisted of a 3-nodes cluster of n1-standard-1 GCE instances (1 
 vCPU, 3.75 GB RAM) running cassandra-2.1.0-rc5, and a n1-standard-2 instance 
 running the test.
 After ~2.5 days, several requests start to fail and we see the previous 
 stacktraces in the system.log file.
 The output from linux ‘free’ and ‘meminfo’ suggest that there is still memory 
 available.
 {code}
 $ free -m
              total       used       free     shared    buffers     cached
 Mem:          3702       3532        169          0        161        854
 -/+ buffers/cache:       2516       1185
 Swap:            0          0          0
 $ head -n 4 /proc/meminfo
 MemTotal:3791292 kB
 MemFree:  173568 kB
 Buffers:  165608 kB
 Cached:   874752 kB
 {code}
 These errors do not affect all the queries we run. The cluster is still 
 responsive but is unable to display tracing information using cqlsh :
 {code}
 $ ./bin/nodetool --host 10.240.137.253 status duration_test
 Datacenter: DC1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address Load   Tokens  Owns (effective)  Host ID  
  Rack
 UN  10.240.98.27925.17 KB  256 100.0%
 41314169-eff5-465f-85ea-d501fd8f9c5e  RAC1
 UN  10.240.137.253