[
https://issues.apache.org/jira/browse/CASSANDRA-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alberto Pujante updated CASSANDRA-5721:
---------------------------------------
Description:
I don't know if this is a bug or normal behaviour. After doing some
insertions and deletions (on a new keyspace and a new table) and then flushing
the table, Cassandra gives extremely slow reads (and eventually timeouts).
Compactions are also extremely slow, even with just a few hundred columns.
I've created an example script to reproduce this (see the note on flushing
after the script). The resulting table has 750 live columns and 750 tombstones,
and only one memtable has been flushed.
When I don't do any deletions and read the entire row, Cassandra gives normal
times. At this rate it would be better for performance to mark columns as
deleted manually and, once a row reaches a certain percentage of deletes, copy
the live columns into a new row and delete the old one (a row deletion).
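A minimal sketch of that manual cleanup, assuming the same ks.timelineTable
schema and driver Session as the test script below (rewriteRow, oldKey and
newKey are illustrative names, not driver API):

import java.nio.ByteBuffer;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Copy the live columns of a heavily-deleted row under a new partition key,
// then drop the old row with a single row-level delete, so the data ends up
// behind one partition tombstone instead of one tombstone per column.
public void rewriteRow(Session session, ByteBuffer oldKey, ByteBuffer newKey) {
    PreparedStatement select = session.prepare(
            "SELECT timeline, value FROM ks.timelineTable WHERE key = ?");
    PreparedStatement insert = session.prepare(
            "INSERT INTO ks.timelineTable (key, timeline, value) VALUES (?, ?, ?)");
    for (Row row : session.execute(select.bind(oldKey))) {
        session.execute(insert.bind(newKey, row.getDate("timeline"), row.getBytes("value")));
    }
    session.execute(session.prepare(
            "DELETE FROM ks.timelineTable WHERE key = ?").bind(oldKey));
}

The delete at the end writes a single partition tombstone instead of hundreds
of per-column ones.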
Even in models with only a few column deletions, Cassandra would become
extremely slow after a while, and compactions would be very painful.
Thanks in advance
public void test() throws InterruptedException {
    session.execute("CREATE KEYSPACE ks WITH replication = "
            + "{'class':'SimpleStrategy', 'replication_factor':1};");
    session.execute("use ks;");
    //session.execute("drop table timelineTable;"); // uncomment when re-running
    session.execute("CREATE TABLE ks.timelineTable ("
            + "key blob,"
            + "timeline timestamp,"
            + "value blob,"
            + "PRIMARY KEY (key, timeline)"
            + ") WITH CLUSTERING ORDER BY (timeline DESC) AND gc_grace_seconds = 0;");
    long interval;
    long time = new Date().getTime();
    int j = 0;
    while (j < 15) {
        // Insert 100 columns into the single wide row (key 0x01), timed.
        int i = 0;
        interval = new Date().getTime();
        while (i < 100) {
            session.execute("INSERT INTO timelineTable (key, timeline, value)"
                    + " VALUES (0x01, " + time + ", 0x0" + Integer.toHexString(j) + ")");
            time++;
            i++;
        }
        System.out.println("Insert Interval:" + (new Date().getTime() - interval));

        // Read the 100 newest columns back, timed.
        interval = new Date().getTime();
        ResultSet results = session.execute("SELECT * FROM timelineTable"
                + " WHERE key = 0x01 ORDER BY timeline DESC LIMIT 100");
        System.out.println("Read Interval:" + (new Date().getTime() - interval));

        // Delete the older 50 of the 100 columns just read, so every
        // iteration leaves 50 live columns and 50 tombstones behind.
        i = 0;
        interval = new Date().getTime();
        for (Row row : results) {
            if (i >= 50) {
                session.execute("DELETE FROM timelineTable WHERE key = 0x01"
                        + " AND timeline = " + row.getDate("timeline").getTime());
            }
            i++;
        }
        System.out.println("Delete Interval:" + (new Date().getTime() - interval));
        j++;
        System.out.println("");
    }
}
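Note that the script itself never flushes; the flush mentioned above is done
externally (e.g. "nodetool flush ks timelinetable"), and the slow reads appear
on the first SELECT after it.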
> Extremely slow reads after flushing a table with column deletes
> ---------------------------------------------------------------
>
> Key: CASSANDRA-5721
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5721
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Ubuntu 32-bit, Windows XP
> Reporter: Alberto Pujante
>
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira