date:20101209


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-1083:
---

Attachment: compaction_simulation.rb

Modified the simulation to show what's going on a little more clearly.

 Improvement to CompactionManger's submitMinorIfNeeded
 -

 Key: CASSANDRA-1083
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1083
 Project: Cassandra
  Issue Type: Improvement
Reporter: Ryan King
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 0.7.1

 Attachments: 1083-configurable-compaction-thresholds.patch, 
 compaction_simulation.rb, compaction_simulation.rb


 We've discovered that we are unable to tune compaction the way we want for 
 our production cluster. I think the current algorithm doesn't do this as well 
 as it could, since it doesn't sort the sstables by size before doing the 
 bucketing, which means the tuning parameters have unpredictable results.
 I looked at CASSANDRA-792, but it seems like overkill. Here's an alternative 
 proposal:
 config operations:
  minimumCompactionThreshold
  maximumCompactionThreshold
  targetSSTableCount
 The first two would mean what they currently mean: the bounds on how many 
 sstables to compact in one compaction operation. The 3rd is a target for how 
 many SSTables you'd like to have.
 Pseudo code algorithm for determining whether or not to do a minor compaction:
 {noformat} 
 if sstables.length + minimumCompactionThreshold - 1 > targetSSTableCount
   sort sstables from smallest to largest
   compact up to maximumCompactionThreshold of the smallest tables
 {noformat} 
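
Taken literally, the pseudo code above could be sketched in Ruby roughly as follows. This is only an illustration: the method name tables_to_compact, its keyword arguments, and the name-to-size hash are assumptions made for the example, not Cassandra's API and not the attached simulation.

```ruby
# Hypothetical sketch of the proposed check; not Cassandra code.
# sstables maps an sstable name to its size in bytes.
def tables_to_compact(sstables, min_threshold:, max_threshold:, target_count:)
  # Only compact when the sstable count exceeds the target by the stated margin.
  return [] unless sstables.length + min_threshold - 1 > target_count

  # Sort smallest to largest, then take at most max_threshold names.
  sstables.sort_by { |_name, size| size }.first(max_threshold).map(&:first)
end

sizes = { "a" => 50, "b" => 10, "c" => 30, "d" => 20 }
p tables_to_compact(sizes, min_threshold: 2, max_threshold: 3, target_count: 4)
```

Sorting before selecting is the point of the proposal: the thresholds then always apply to the smallest tables, so the tuning parameters behave predictably.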

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-1838) Add ability to set TTL on columns in cassandra-cli


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969815#action_12969815
 ] 

Pavel Yaskevich commented on CASSANDRA-1838:


I agree with Sylvain, 'with' will be better.

 Add ability to set TTL on columns in cassandra-cli
 --

 Key: CASSANDRA-1838
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1838
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 0.7.0 rc 1
 Environment: Ubuntu 10.04 64bit
Reporter: Eric Tamme
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.7.1


 Currently the cassandra-cli does not have any mechanism to set the ttl 
 attribute of a column.  This would be a useful ability to have when working 
 with the cli tool.

[jira] Issue Comment Edited: (CASSANDRA-1083) Improvement to CompactionManger's submitMinorIfNeeded

[
https://issues.apache.org/jira/browse/CASSANDRA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969817#action_12969817
]

Tyler Hobbs edited comment on CASSANDRA-1083 at 12/9/10 12:11 PM:
--

Modified the compaction simulation to show what's going on a little more
clearly. (This time without marking for inclusion).

was (Author: thobbs):
Modified the compaction simulation to show what's going on a little more
clearly.

[jira] Updated: (CASSANDRA-1083) Improvement to CompactionManger's submitMinorIfNeeded


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-1083:
---

Attachment: compaction_simulation.rb

Modified the compaction simulation to show what's going on a little more 
clearly.

[jira] Issue Comment Edited: (CASSANDRA-1083) Improvement to CompactionManger's submitMinorIfNeeded


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969046#action_12969046
 ] 

Tyler Hobbs edited comment on CASSANDRA-1083 at 12/9/10 12:15 PM:
--

I should note that the Ruby logic for this is correct, but the pseudocode seems 
wrong.

This is the key Ruby logic:

{noformat}
def minor_if_needed
  # Sort largest-first so that pop takes the smallest table.
  tables = SSTABLES.sort { |a, b| b[1] <=> a[1] }
  to_compact = []
  while to_compact.length < MAX_COMPACT && tables.length > TARGET_SSTABLES - 1
    to_compact << tables.pop[0]
  end
  if to_compact.length >= MIN_COMPACT
    compact(to_compact)
  end
end
{noformat}

  was (Author: thobbs):
I should note that the Ruby logic for this is correct, but the pseudocode 
seems wrong.

This is the key Ruby logic:

{noformat}
def minor_if_needed
  tables = SSTABLES.sort { |a, b| b[1] <=> a[1] }
  to_compact = []
  while to_compact.length < MAX_COMPACT && tables.length > TARGET_SSTABLES - 1
    to_compact << tables.pop[0]
  end
end
{noformat}
  
[jira] Created: (CASSANDRA-1839) Keep a tombstone cache

Keep a tombstone cache
--

 Key: CASSANDRA-1839
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1839
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 0.3
Reporter: Brandon Williams
 Fix For: 0.7.1


There is a use case in production where the pattern is read-then-delete, where 
most of the keys read will not exist, but be attempted many times.  If the key 
has never existed, the bloom filter makes this operation cheap, however if the 
key has exist, especially if it has been overwritten many times and thus spans 
multiple SSTables, the merge-on-read just to end up with a tombstone can be 
expensive.  This can be mitigated with keycache and some rowcache currently, 
but this can be further optimized by storing a sentinel value in the keycache 
indicating that it's a tombstone, which we can invalidate on new writes to the 
row.
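
As a rough illustration of the sentinel idea, here is a toy model in Ruby. Everything in it is an assumption made for the example (the class name, the :deleted marker, modelling sstables as plain hashes); it is not how Cassandra's keycache is implemented, and the bloom filter is not modelled.

```ruby
# Hypothetical model, not Cassandra code: each "sstable" is a Hash of
# key => value, where the symbol :deleted stands for a tombstone.
class TombstoneCachingStore
  TOMBSTONE = :tombstone # sentinel kept in the key cache

  attr_reader :merges

  def initialize(sstables)
    @sstables = sstables # ordered oldest to newest
    @key_cache = {}
    @merges = 0
  end

  def read(key)
    # Cache hit on the sentinel: skip the merge entirely.
    return nil if @key_cache[key] == TOMBSTONE

    @merges += 1
    # Stand-in for merge-on-read: the newest table containing the key wins.
    newest = @sstables.reverse.find { |table| table.key?(key) }
    value = newest && newest[key]
    if value == :deleted
      @key_cache[key] = TOMBSTONE
      return nil
    end
    value
  end

  def write(key, value)
    # A new write invalidates any cached tombstone for the row.
    @key_cache.delete(key)
    @sstables.last[key] = value
  end
end

store = TombstoneCachingStore.new([{ "k" => "v1" }, { "k" => "v2" }, { "k" => :deleted }])
2.times { store.read("k") }
puts store.merges
```

In this model, repeated reads of a deleted key pay for the merge only once; the next write to that key drops the sentinel so later reads see the new value.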

[jira] Commented: (CASSANDRA-1824) Schema only fully propagates from seeds

2010-12-09 Thread Gary Dusbabek (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969819#action_12969819
 ] 

Gary Dusbabek commented on CASSANDRA-1824:
--

Can't reproduce.  Tested on 3-node clusters in trunk, 0.7, and 0.7-rc1.

 Schema only fully propagates from seeds
 ---

 Key: CASSANDRA-1824
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1824
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7 beta 1
Reporter: Brandon Williams
Assignee: Gary Dusbabek
 Fix For: 0.7.0


 If you have nodes X, Y, and Z, and Y already has some schema, but X and Z do 
 not, and X is the seed node for the cluster, X will pick up the schema from 
 Y, but it will never propagate to Z.  If X has the schema, it will propagate 
 to both Y and Z.

[jira] Updated: (CASSANDRA-1839) Keep a tombstone cache


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1839:


Description: There is a use case in production where the pattern is 
read-then-delete, where most of the keys read will not exist, but be attempted 
many times.  If the key has never existed, the bloom filter makes this 
operation cheap, however if the key has existed, especially if it has been 
overwritten many times and thus spans multiple SSTables, the merge-on-read just 
to end up with a tombstone can be expensive.  This can be mitigated with 
keycache and some rowcache currently, but this can be further optimized by 
storing a sentinel value in the keycache indicating that it's a tombstone, 
which we can invalidate on new writes to the row.  (was: There is a use case in 
production where the pattern is read-then-delete, where most of the keys read 
will not exist, but be attempted many times.  If the key has never existed, the 
bloom filter makes this operation cheap, however if the key has exist, 
especially if it has been overwritten many times and thus spans multiple 
SSTables, the merge-on-read just to end up with a tombstone can be expensive.  
This can be mitigated with keycache and some rowcache currently, but this can 
be further optimized by storing a sentinel value in the keycache indicating 
that it's a tombstone, which we can invalidate on new writes to the row.)

[jira] Assigned: (CASSANDRA-1837) Deleted columns are resurrected after a flush

2010-12-09 Thread Gary Dusbabek (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Dusbabek reassigned CASSANDRA-1837:


Assignee: Gary Dusbabek

 Deleted columns are resurrected after a flush
 -

 Key: CASSANDRA-1837
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1837
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0 rc 1
Reporter: Brandon Williams
Assignee: Gary Dusbabek
Priority: Blocker
 Fix For: 0.7.0


 Easily reproduced with the cli:
 {noformat}
 [defa...@unknown] create keyspace testks;
 2785d67c-02df-11e0-ac09-e700f669bcfc
 [defa...@unknown] use testks;
 Authenticated to keyspace: testks
 [defa...@testks] create column family testcf;
 2fbad20d-02df-11e0-ac09-e700f669bcfc
 [defa...@testks] set testcf['test']['foo'] = 'foo';
 Value inserted.
 [defa...@testks] set testcf['test']['bar'] = 'bar';
 Value inserted.
 [defa...@testks] list testcf;
 Using default limit of 100
 ---
 RowKey: test
 => (column=626172, value=626172, timestamp=129182186912)
 => (column=666f6f, value=666f6f, timestamp=129182185732)
 1 Row Returned.
 [defa...@testks] del testcf['test'];
 row removed.
 [defa...@testks] list testcf;
 Using default limit of 100
 ---
 RowKey: test
 1 Row Returned.
 {noformat}
 Now flush testks and look again:
 {noformat}
 [defa...@testks] list testcf;
 Using default limit of 100
 ---
 RowKey: test
 => (column=626172, value=626172, timestamp=129182186912)
 => (column=666f6f, value=666f6f, timestamp=129182185732)
 1 Row Returned.
 {noformat}

[jira] Commented: (CASSANDRA-1083) Improvement to CompactionManger's submitMinorIfNeeded

2010-12-09 Thread Ryan King (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969829#action_12969829
 ] 

Ryan King commented on CASSANDRA-1083:
--

To be honest, I'm not sure this is the best approach anymore. I think the 
fundamental problem is that it's driven by the write traffic, not the read 
traffic.

svn commit: r1044109 - in /cassandra/branches/cassandra-0.6: CHANGES.txt src/java/org/apache/cassandra/db/Table.java

2010-12-09 Thread jbellis

Author: jbellis
Date: Thu Dec  9 19:48:55 2010
New Revision: 1044109

URL: http://svn.apache.org/viewvc?rev=1044109&view=rev
Log:
cleanup smallest CFs first to increase free temp space for larger ones
patch by Jon Hermes and jbellis for CASSANDRA-1811

Modified:
cassandra/branches/cassandra-0.6/CHANGES.txt
cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/Table.java

Modified: cassandra/branches/cassandra-0.6/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/CHANGES.txt?rev=1044109&r1=1044108&r2=1044109&view=diff
==
--- cassandra/branches/cassandra-0.6/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.6/CHANGES.txt Thu Dec  9 19:48:55 2010
@@ -14,6 +14,8 @@
  * fix range queries against wrapped range (CASSANDRA-1781)
  * add support for per-CF compaction (CASSANDRA-1812)
  * reduce fat client timeout (CASSANDRA-1730)
+ * cleanup smallest CFs first to increase free temp space for larger ones
+   (CASSANDRA-1811)
 
 
 0.6.8

Modified: cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/Table.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/Table.java?rev=1044109&r1=1044108&r2=1044109&view=diff
==
--- cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/Table.java (original)
+++ cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/Table.java Thu Dec  9 19:48:55 2010
@@ -226,15 +226,28 @@ public class Table 
     public void forceCleanup()
     {
         if (name.equals(SYSTEM_TABLE))
-            throw new RuntimeException("Cleanup of the system table is neither necessary nor wise");
+            throw new UnsupportedOperationException("Cleanup of the system table is neither necessary nor wise");
 
-        Set<String> columnFamilies = tableMetadata.getColumnFamilies();
-        for ( String columnFamily : columnFamilies )
+        // Sort the column families in order of SSTable size, so cleanup of smaller CFs
+        // can free up space for larger ones
+        List<ColumnFamilyStore> sortedColumnFamilies = new ArrayList<ColumnFamilyStore>(columnFamilyStores.values());
+        Collections.sort(sortedColumnFamilies, new Comparator<ColumnFamilyStore>()
         {
-            ColumnFamilyStore cfStore = columnFamilyStores.get( columnFamily );
-            if ( cfStore != null )
-                cfStore.forceCleanup();
-        }   
+            // Compare first on size and, if equal, sort by name (arbitrary & deterministic).
+            public int compare(ColumnFamilyStore cf1, ColumnFamilyStore cf2)
+            {
+                long diff = (cf1.getTotalDiskSpaceUsed() - cf2.getTotalDiskSpaceUsed());
+                if (diff > 0)
+                    return 1;
+                if (diff < 0)
+                    return -1;
+                return cf1.columnFamily_.compareTo(cf2.columnFamily_);
+            }
+        });
+
+        // Cleanup in sorted order to free up space for the larger ones
+        for (ColumnFamilyStore cfs : sortedColumnFamilies)
+            cfs.forceCleanup();
     }
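
For readers following along outside the Java code base, the ordering used by the patch above (size first, name as a deterministic tie-breaker) can be shown with a small Ruby sketch. The ColumnFamily struct and cleanup_order method are hypothetical names used only for this illustration.

```ruby
# Hypothetical stand-in for a column family store: a name plus disk usage in bytes.
ColumnFamily = Struct.new(:name, :disk_space_used)

# Order column families smallest-first, breaking ties by name so the
# result is deterministic.
def cleanup_order(column_families)
  column_families.sort_by { |cf| [cf.disk_space_used, cf.name] }
end

cfs = [ColumnFamily.new("b", 10), ColumnFamily.new("a", 10), ColumnFamily.new("c", 5)]
puts cleanup_order(cfs).map(&:name).join(", ")
```

Cleaning up the smallest column families first means each cleanup needs the least temporary space, and whatever it frees is available by the time the larger ones run.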

[jira] Commented: (CASSANDRA-1791) Return name of snapshot directory after creating it

2010-12-09 Thread Nick Bailey (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969899#action_12969899
 ] 

Nick Bailey commented on CASSANDRA-1791:


Good point, I didn't think about taking a full cluster snapshot, which is 
already available in clustertool.

 Return name of snapshot directory after creating it
 ---

 Key: CASSANDRA-1791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
 Environment: Debian Squeeze
Reporter: paul cannon
Assignee: Nick Bailey
Priority: Minor
 Fix For: 0.7.1

 Attachments: 
 0001-Use-same-timestamp-for-full-snapshots-and-return-sna.patch


 When making a snapshot, the new directory is created with a timestamp and, 
 optionally, a user-supplied tag. For the sake of automated snapshot-creating 
 tools, it would be helpful to know unequivocally what the new snapshot 
 directory was named (otherwise, the tool must search for a directory similar 
 to what it expects the name to be, which could be both error-prone and maybe 
 susceptible to attack).
 Recommend making takeSnapshot and takeAllSnapshot return a string, which is 
 the base component of the new snapshot's directory name.
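
To make the request concrete, here is a minimal Ruby sketch of a snapshot call that returns the name it created. The take_snapshot method, its parameters, and the "millisecond timestamp plus optional tag" naming are assumptions for illustration only, not the actual takeSnapshot implementation.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for takeSnapshot: creates the snapshot directory
# and returns its base name so callers never have to guess it.
def take_snapshot(snapshots_root, tag = nil, now: Time.now)
  name = (now.to_i * 1000 + now.usec / 1000).to_s # assumed: millisecond timestamp
  name += "-#{tag}" if tag && !tag.empty?
  FileUtils.mkdir_p(File.join(snapshots_root, name))
  name
end

Dir.mktmpdir do |root|
  name = take_snapshot(root, "nightly")
  puts "snapshot directory: #{File.join(root, name)}"
end
```

An automated tool can then use the returned string directly instead of searching for a directory whose name looks about right.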

[jira] Updated: (CASSANDRA-1788) reduce copies on read, write paths


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1788:
--

Attachment: 0002-remove-copies-from-network-path.txt
0001-setup.txt

 reduce copies on read, write paths
 --

 Key: CASSANDRA-1788
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.1

 Attachments: 0001-setup.txt, 
 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 
 1788-v4.txt, 1788.txt


 Currently, we do _three_ unnecessary copies (that is, writing to the socket 
 is necessary; any other copies made are overhead) for each message:
 - constructing the Message body byte[] (this is typically a call to a 
 ICompactSerializer[2] serialize method, but sometimes we cheat e.g. in 
 SchemaCheckVerbHandler's reply)
 - which is copied to a buffer containing the entire Message (i.e. including 
 Header) when sendOneWay calls Message.serializer.serialize()
 - which is copied to a newly-allocated ByteBuffer when sendOneWay calls packIt
 - which is what we write to the socket
 For deserialize we perform a similar orgy of copies:
 - IncomingTcpConnection reads the Message length, allocates a byte[], and 
 reads the serialized Message into it
 - ITcpC then calls Message.serializer().deserialize, which allocates a new 
 byte[] for the body and copies that part
 - finally, the verbHandler (determined by the now-deserialized Message 
 header) deserializes the actual object represented by the body
 Most of these are out of scope for 0.7 but I think we can at least elide the 
 last copy on the write path and the first on the read.

[jira] Commented: (CASSANDRA-1788) reduce copies on read, write paths


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969912#action_12969912
 ] 

Jonathan Ellis commented on CASSANDRA-1788:
---

Rebooted v4 by first adding MessageSerializerTest with a bytesToHex string of 
the old-style bytes-on-wire to make sure I'm not breaking it.  Then when I add 
the new code I'm testing that new code can read old bytes, as well as old code 
reading new bytes.  Everything comes up clean.  I think I need another set of 
eyes on this.

(01 is large because I encapsulated MS.instance in a getter to break an 
initialization-cycle problem.)

[jira] Commented: (CASSANDRA-1824) Schema only fully propagates from seeds


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969922#action_12969922
 ] 

Brandon Williams commented on CASSANDRA-1824:
-

It takes me a few tries, but the procedure is:

1) start a non-seed, load schema
2) start other non-seed
3) start seed

Usually I can reproduce in 10 tries or less.

[jira] Commented: (CASSANDRA-1470) use direct io for compaction


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969969#action_12969969
 ] 

Jonathan Ellis commented on CASSANDRA-1470:
---

bq. see http://chbits.blogspot.com/2010/06/lucene-and-fadvisemadvise.html for 
why posix_fadvise won't work [for writes]

 use direct io for compaction
 

 Key: CASSANDRA-1470
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.1

 Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch, 
 CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch, 
 CASSANDRA-1470-v12-0.7.patch, CASSANDRA-1470-v2.patch, 
 CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch, 
 CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch, 
 CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch, 
 CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch, 
 CASSANDRA-1470.patch, 
 use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch


 When compaction scans through a group of sstables, it forces out the data in the 
 OS buffer cache that is being used for hot reads, which can have a dramatic 
 negative effect on performance.

[jira] Commented: (CASSANDRA-1470) use direct io for compaction


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969973#action_12969973
 ] 

Jonathan Ellis commented on CASSANDRA-1470:
---

Also: sounds like we should have an executor in charge of doing the write, so 
the compaction thread can start filling the next buffer immediately
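
The shape of that suggestion can be sketched in Ruby with a writer thread fed through a bounded queue. This is only a toy under stated assumptions: the method name, the buffer_size parameter, and the use of a Ruby Thread in place of a Java executor are all made up for the example, and it says nothing about whether the approach helps with direct I/O in practice.

```ruby
require "stringio"

# Hypothetical sketch: the "compaction" loop fills buffers while a separate
# writer thread drains them, so filling the next buffer can start immediately.
def compact_with_background_writer(chunks, out, buffer_size: 4)
  queue = SizedQueue.new(1) # at most one full buffer waiting to be written
  writer = Thread.new do
    # A nil entry signals that no more buffers will arrive.
    while (buf = queue.pop)
      out.write(buf)
    end
  end

  buffer = String.new
  chunks.each do |chunk|
    buffer << chunk
    next if buffer.bytesize < buffer_size

    queue.push(buffer) # hand the full buffer to the writer
    buffer = String.new # and start filling a fresh one right away
  end
  queue.push(buffer) unless buffer.empty?
  queue.push(nil)
  writer.join
  out
end

out = compact_with_background_writer(%w[ab cd ef g], StringIO.new)
puts out.string
```

The bounded queue keeps memory use fixed: the filling side blocks only when the writer has fallen a full buffer behind.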

[jira] Commented: (CASSANDRA-1470) use direct io for compaction


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969979#action_12969979
 ] 

Pavel Yaskevich commented on CASSANDRA-1470:


I tried buffer sizes from 4 KB to 4 MB, and none of them gave any improvement. 
Using a separate thread for writes at this point won't gain anything, because 
a read mutex lock is held on the file during the write to avoid inconsistency. 

If someone who has read the Linux kernel source thinks I'm wrong about this, 
please correct me...

 use direct io for compaction
 

 Key: CASSANDRA-1470
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.1

 Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch, 
 CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch, 
 CASSANDRA-1470-v12-0.7.patch, CASSANDRA-1470-v2.patch, 
 CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch, 
 CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch, 
 CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch, 
 CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch, 
 CASSANDRA-1470.patch, 
 use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch


 When compaction scans through a group of sstables, it forces out the data in 
 the OS buffer cache that is being used for hot reads, which can have a 
 dramatic negative effect on performance.


[jira] Updated: (CASSANDRA-1408) nodetool drain attempts to delete a deleted file


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1408:


Fix Version/s: 0.6.9

 nodetool drain attempts to delete a deleted file
 

 Key: CASSANDRA-1408
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1408
 Project: Cassandra
  Issue Type: Bug
 Environment: sun-jdk-1.6/Ubuntu 10.04
Reporter: Jon Hermes
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.6.9, 0.7 beta 2

 Attachments: 1408.txt


 Running `nodetool drain` presented me with a pretty stack-trace.
 The drain itself finished successfully and nothing showed up in the 
 system.log.
 {noformat}
 $ bin/nodetool -h 127.0.0.1 -p 8080 drain
 Exception in thread "main" java.lang.AssertionError: attempted to delete non-existing file CommitLog-1282166457787.log
   at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:40)
   at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:178)
   at org.apache.cassandra.service.StorageService.drain(StorageService.java:1653)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
   at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
   at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
   at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
   at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
   at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
   at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
   at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
   at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
   at sun.rmi.transport.Transport$1.run(Transport.java:159)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
   at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
 {noformat}
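
 The trace shows FileUtils.deleteWithConfirm asserting on a commit log segment 
 that was already gone. A minimal sketch of that failure mode, next to an 
 idempotent alternative, might look like the following. It is not the attached 
 1408.txt patch: the class DeleteSketch and both of its methods are 
 hypothetical, and only the java.nio.file.Files calls are real APIs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the failure in the trace above; not Cassandra code.
public class DeleteSketch {
    // Strict variant: fails if the file is already gone, like the assertion in the trace.
    public static void deleteStrict(Path file) throws IOException {
        if (!Files.exists(file)) {
            throw new AssertionError("attempted to delete non-existing file " + file.getFileName());
        }
        Files.delete(file);
    }

    // Tolerant variant: deleting an already-deleted file is a no-op.
    // Returns true only if this call actually removed the file.
    public static boolean deleteIfPresent(Path file) throws IOException {
        return Files.deleteIfExists(file);
    }

    public static void main(String[] args) throws IOException {
        Path segment = Files.createTempFile("CommitLog-", ".log");
        System.out.println(deleteIfPresent(segment)); // first call removes the file
        System.out.println(deleteIfPresent(segment)); // second call finds nothing to do
        try {
            deleteStrict(segment);
        } catch (AssertionError e) {
            System.out.println("strict delete failed: file already gone");
        }
    }
}
```

 Whether a second delete should be tolerated or treated as a bug in the caller 
 is a design choice; the sketch only shows the two behaviours side by side.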


[jira] Updated: (CASSANDRA-1408) nodetool drain attempts to delete a deleted file


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1408:


Attachment: 1408-0.6.txt

 nodetool drain attempts to delete a deleted file
 

 Key: CASSANDRA-1408
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1408
 Project: Cassandra
  Issue Type: Bug
 Environment: sun-jdk-1.6/Ubuntu 10.04
Reporter: Jon Hermes
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.6.9, 0.7 beta 2

 Attachments: 1408-0.6.txt, 1408.txt




[jira] Assigned: (CASSANDRA-1408) nodetool drain attempts to delete a deleted file


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-1408:
---

Assignee: Brandon Williams  (was: Jonathan Ellis)

 nodetool drain attempts to delete a deleted file
 

 Key: CASSANDRA-1408
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1408
 Project: Cassandra
  Issue Type: Bug
 Environment: sun-jdk-1.6/Ubuntu 10.04
Reporter: Jon Hermes
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.6.9, 0.7 beta 2

 Attachments: 1408-0.6.txt, 1408.txt




[jira] Commented: (CASSANDRA-1470) use direct io for compaction


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969998#action_12969998
 ] 

Jonathan Ellis commented on CASSANDRA-1470:
---

bq. Using separate thread for write in this point won't give any thing

Sure it will.

bq. because read mutex lock is set on the file during write to avoid 
inconsistency. 

Locks on the file don't stop us from filling *another* buffer while writing the 
last one. So instead of 

{code}
compaction thread:
fill buffer 1
write 1
fill buffer 2
write 2
{code}

you have

{code}
compaction thread:  write thread:
fill buffer 1
fill buffer 2       write 1
                    write 2
{code}
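
The overlap described above can be sketched in Java. This is only an illustration under stated assumptions, not any of the patches attached to this issue: the class name PipelinedWriter, the queue depth of one buffer, and the OutputStream standing in for the sstable file are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: one thread fills buffers while another writes them.
public class PipelinedWriter {
    private static final byte[] EOF = new byte[0]; // sentinel, compared by identity

    // Hands each filled buffer to a dedicated write thread, so that filling
    // buffer N+1 overlaps with writing buffer N.
    public static void copy(byte[][] chunks, OutputStream out) throws Exception {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(1);
        Throwable[] failure = new Throwable[1];

        Thread writer = new Thread(() -> {
            try {
                for (byte[] buf = queue.take(); buf != EOF; buf = queue.take()) {
                    out.write(buf);
                }
            } catch (Throwable t) {
                failure[0] = t; // read by the caller after the thread has terminated
            }
        }, "write-thread");
        writer.start();

        for (byte[] chunk : chunks) {
            byte[] buf = chunk.clone(); // "fill buffer": stands in for reading and merging rows
            hand(queue, buf, writer);
        }
        hand(queue, EOF, writer);
        writer.join();
        if (failure[0] != null) {
            throw new IOException("write thread failed", failure[0]);
        }
    }

    // Blocks while the writer is one buffer behind, but gives up if the writer has died.
    private static void hand(BlockingQueue<byte[]> queue, byte[] buf, Thread writer) throws Exception {
        while (!queue.offer(buf, 100, TimeUnit.MILLISECONDS)) {
            if (!writer.isAlive()) {
                throw new IOException("write thread terminated early");
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[][] chunks = {
            "buffer 1;".getBytes(StandardCharsets.UTF_8),
            "buffer 2;".getBytes(StandardCharsets.UTF_8)
        };
        copy(chunks, sink);
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

A bounded queue keeps the compaction thread at most one buffer ahead of the write thread, which caps memory use; whether the overlap helps in practice depends on where the time is actually being spent.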


 use direct io for compaction
 

 Key: CASSANDRA-1470
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.1

 Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch, 
 CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch, 
 CASSANDRA-1470-v12-0.7.patch, CASSANDRA-1470-v2.patch, 
 CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch, 
 CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch, 
 CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch, 
 CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch, 
 CASSANDRA-1470.patch, 
 use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch


 When compaction scans through a group of sstables, it forces out the data in 
 the OS buffer cache that is being used for hot reads, which can have a 
 dramatic negative effect on performance.


[jira] Commented: (CASSANDRA-1470) use direct io for compaction


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1296#action_1296
 ] 

Pavel Yaskevich commented on CASSANDRA-1470:


Agreed, it will be possible if the lock is set on a region of the file that 
we won't be reading. Let me try to implement the scheme you suggest here, thanks!

 use direct io for compaction
 

 Key: CASSANDRA-1470
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.1

 Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch, 
 CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch, 
 CASSANDRA-1470-v12-0.7.patch, CASSANDRA-1470-v2.patch, 
 CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch, 
 CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch, 
 CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch, 
 CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch, 
 CASSANDRA-1470.patch, 
 use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch


 When compaction scans through a group of sstables, it forces out the data in 
 the OS buffer cache that is being used for hot reads, which can have a 
 dramatic negative effect on performance.


svn commit: r1044161 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java

2010-12-09 Thread jbellis

Author: jbellis
Date: Thu Dec  9 23:17:16 2010
New Revision: 1044161

URL: http://svn.apache.org/viewvc?rev=1044161&view=rev
Log:
fix regression from CASSANDRA-1829
patch by jbellis

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java?rev=1044161&r1=1044160&r2=1044161&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java
 Thu Dec  9 23:17:16 2010
@@ -398,7 +398,6 @@ public class StorageService implements I
 logger_.info("This node will not auto bootstrap because it is configured to be a seed node.");
 
 Token token;
-boolean bootstrapped = false;
 if (DatabaseDescriptor.isAutoBootstrap()
  && !(DatabaseDescriptor.getSeeds().contains(FBUtilities.getLocalAddress()) || SystemTable.isBootstrapped()))
 {
@@ -418,8 +417,6 @@ public class StorageService implements I
 {
 bootstrap(token);
 assert !isBootstrapMode; // bootstrap will block until finished
-bootstrapped = true;
-SystemTable.setBootstrapped(true); // first startup is only chance to bootstrap
 }
 // else nothing to do, go directly to participating in ring
 }
@@ -446,8 +443,8 @@ public class StorageService implements I
 }
 } 
 
-if(!bootstrapped)
-setToken(token);
+SystemTable.setBootstrapped(true); // first startup is only chance to bootstrap
+setToken(token);
 
 assert tokenMetadata_.sortedTokens().size() > 0;
 }
@@ -580,7 +577,7 @@ public class StorageService implements I
  * STATE_NORMAL,token 
  *   ready to serve reads and writes.
  * STATE_NORMAL,token,REMOVE_TOKEN,token
- *   specialized normal state in which this node acts as a proxy to tell the cluster about a dead node whose 
+ *   specialized normal state in which this node acts as a proxy to tell the cluster about a dead node whose
  *   token is being removed. this value becomes the permanent state of this node (unless it coordinates another
  *   removetoken in the future).
  * STATE_LEAVING,token

[jira] Commented: (CASSANDRA-1829) Nodetool move is broken


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970003#action_12970003
 ] 

Jonathan Ellis commented on CASSANDRA-1829:
---

committed r1044161 to fix regression of setBootstrapped treatment

 Nodetool move is broken
 ---

 Key: CASSANDRA-1829
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1829
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0 rc 1
Reporter: Nick Bailey
Assignee: Nick Bailey
Priority: Blocker
 Fix For: 0.7.0

 Attachments: 0001-Update-token-after-bootstrapping.patch


 The code from finishBootstrapping that finishes a move was removed. This 
 means a move will leave a node stuck in a bootstrapping state forever.


[jira] Commented: (CASSANDRA-1470) use direct io for compaction


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970004#action_12970004
 ] 

Jonathan Ellis commented on CASSANDRA-1470:
---

bq. the region in the file which we won't be reading

Isn't it primarily the write path that we see the slowdown on?  Or am I 
misunderstanding those benchmark numbers above?

 use direct io for compaction
 

 Key: CASSANDRA-1470
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1470
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 0.7.1

 Attachments: 1470-v2.txt, 1470.txt, CASSANDRA-1470-for-0.6.patch, 
 CASSANDRA-1470-v10-for-0.7.patch, CASSANDRA-1470-v11-for-0.7.patch, 
 CASSANDRA-1470-v12-0.7.patch, CASSANDRA-1470-v2.patch, 
 CASSANDRA-1470-v3-0.7-with-LastErrorException-support.patch, 
 CASSANDRA-1470-v4-for-0.7.patch, CASSANDRA-1470-v5-for-0.7.patch, 
 CASSANDRA-1470-v6-for-0.7.patch, CASSANDRA-1470-v7-for-0.7.patch, 
 CASSANDRA-1470-v8-for-0.7.patch, CASSANDRA-1470-v9-for-0.7.patch, 
 CASSANDRA-1470.patch, 
 use.DirectIORandomAccessFile.for.commitlog.against.1022235.patch


 When compaction scans through a group of sstables, it forces out the data in 
 the OS buffer cache that is being used for hot reads, which can have a 
 dramatic negative effect on performance.


[jira] Commented: (CASSANDRA-1470) use direct io for compaction