[jira] [Updated] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree

2013-07-12 Thread Fabien Rousseau (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabien Rousseau updated CASSANDRA-5677:
---

Attachment: 5677-1.2.overlappingfix.txt

 Performance improvements of RangeTombstones/IntervalTree
 

 Key: CASSANDRA-5677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.0
Reporter: Fabien Rousseau
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 1.2.7

 Attachments: 5677-1.2.overlappingfix.txt, 5677-1.2.txt, 
 5677-new-IntervalTree-implementation.patch


 Using massively range tombstones leads to bad response time (ie 100-500 
 ranges tombstones per row).
 After investigation, it seems that the culprit is how the DeletionInfo are 
 merged. Each time a RangeTombstone is added into the DeletionInfo, the whole 
 IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 
 100 instances of IntervalTree are created, the first one having one interval, 
 the second one 2 intervals,... the 100th one : 100 intervals...)
 It seems that once the IntervalTree is built, it is not possible to add a new 
 Interval. Idea is to change the implementation of the IntervalTree by another 
 one which support insert interval.
 Attached is a proposed patch which :
  - renames the IntervalTree implementation to IntervalTreeCentered (the 
 renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
  - adds a new implementation IntervalTreeAvl (which is described here : 
 http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : 
 http://en.wikipedia.org/wiki/AVL_tree )
  - adds a new interface IIntervalTree to abstract the implementation
  - adds a new configuration option (interval_tree_provider) which allows to 
 choose between the two implementations (defaults to previous 
 IntervalTreeCentered)
  - updates IntervalTreeTest unit tests to test both implementations
  - creates a mini benchmark between the two implementations (tree creation, 
 point lookup, interval lookup)
  - creates a mini benchmark between the two implementations when merging 
 DeletionInfo (which shows a big performance improvement when using 500 
 tombstones for a row)
 This patch applies for 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree

2013-07-11 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5677:
--

 Reviewer: frousseau
  Component/s: Core
Fix Version/s: 1.2.7
 Assignee: Sylvain Lebresne

 Performance improvements of RangeTombstones/IntervalTree
 

 Key: CASSANDRA-5677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.0
Reporter: Fabien Rousseau
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 1.2.7

 Attachments: 5677-1.2.txt, 5677-new-IntervalTree-implementation.patch


 Using massively range tombstones leads to bad response time (ie 100-500 
 ranges tombstones per row).
 After investigation, it seems that the culprit is how the DeletionInfo are 
 merged. Each time a RangeTombstone is added into the DeletionInfo, the whole 
 IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 
 100 instances of IntervalTree are created, the first one having one interval, 
 the second one 2 intervals,... the 100th one : 100 intervals...)
 It seems that once the IntervalTree is built, it is not possible to add a new 
 Interval. Idea is to change the implementation of the IntervalTree by another 
 one which support insert interval.
 Attached is a proposed patch which :
  - renames the IntervalTree implementation to IntervalTreeCentered (the 
 renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
  - adds a new implementation IntervalTreeAvl (which is described here : 
 http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : 
 http://en.wikipedia.org/wiki/AVL_tree )
  - adds a new interface IIntervalTree to abstract the implementation
  - adds a new configuration option (interval_tree_provider) which allows to 
 choose between the two implementations (defaults to previous 
 IntervalTreeCentered)
  - updates IntervalTreeTest unit tests to test both implementations
  - creates a mini benchmark between the two implementations (tree creation, 
 point lookup, interval lookup)
  - creates a mini benchmark between the two implementations when merging 
 DeletionInfo (which shows a big performance improvement when using 500 
 tombstones for a row)
 This patch applies for 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree

2013-07-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-5677:


Attachment: (was: 5677-1.2.txt)

 Performance improvements of RangeTombstones/IntervalTree
 

 Key: CASSANDRA-5677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Fabien Rousseau
Priority: Minor
 Attachments: 5677-1.2.txt, 5677-new-IntervalTree-implementation.patch


 Using massively range tombstones leads to bad response time (ie 100-500 
 ranges tombstones per row).
 After investigation, it seems that the culprit is how the DeletionInfo are 
 merged. Each time a RangeTombstone is added into the DeletionInfo, the whole 
 IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 
 100 instances of IntervalTree are created, the first one having one interval, 
 the second one 2 intervals,... the 100th one : 100 intervals...)
 It seems that once the IntervalTree is built, it is not possible to add a new 
 Interval. Idea is to change the implementation of the IntervalTree by another 
 one which support insert interval.
 Attached is a proposed patch which :
  - renames the IntervalTree implementation to IntervalTreeCentered (the 
 renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
  - adds a new implementation IntervalTreeAvl (which is described here : 
 http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : 
 http://en.wikipedia.org/wiki/AVL_tree )
  - adds a new interface IIntervalTree to abstract the implementation
  - adds a new configuration option (interval_tree_provider) which allows to 
 choose between the two implementations (defaults to previous 
 IntervalTreeCentered)
  - updates IntervalTreeTest unit tests to test both implementations
  - creates a mini benchmark between the two implementations (tree creation, 
 point lookup, interval lookup)
  - creates a mini benchmark between the two implementations when merging 
 DeletionInfo (which shows a big performance improvement when using 500 
 tombstones for a row)
 This patch applies for 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree

2013-07-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-5677:


Attachment: 5677-1.2.txt

 Performance improvements of RangeTombstones/IntervalTree
 

 Key: CASSANDRA-5677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Fabien Rousseau
Priority: Minor
 Attachments: 5677-1.2.txt, 5677-new-IntervalTree-implementation.patch


 Using massively range tombstones leads to bad response time (ie 100-500 
 ranges tombstones per row).
 After investigation, it seems that the culprit is how the DeletionInfo are 
 merged. Each time a RangeTombstone is added into the DeletionInfo, the whole 
 IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 
 100 instances of IntervalTree are created, the first one having one interval, 
 the second one 2 intervals,... the 100th one : 100 intervals...)
 It seems that once the IntervalTree is built, it is not possible to add a new 
 Interval. Idea is to change the implementation of the IntervalTree by another 
 one which support insert interval.
 Attached is a proposed patch which :
  - renames the IntervalTree implementation to IntervalTreeCentered (the 
 renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
  - adds a new implementation IntervalTreeAvl (which is described here : 
 http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : 
 http://en.wikipedia.org/wiki/AVL_tree )
  - adds a new interface IIntervalTree to abstract the implementation
  - adds a new configuration option (interval_tree_provider) which allows to 
 choose between the two implementations (defaults to previous 
 IntervalTreeCentered)
  - updates IntervalTreeTest unit tests to test both implementations
  - creates a mini benchmark between the two implementations (tree creation, 
 point lookup, interval lookup)
  - creates a mini benchmark between the two implementations when merging 
 DeletionInfo (which shows a big performance improvement when using 500 
 tombstones for a row)
 This patch applies for 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree

2013-07-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-5677:


Attachment: 5677-1.2.txt

Attaching patch rebased to 1.2. I'll note that the trunk patch contains 2 
commits, but that rebased one is just the first of those commits. The 2nd one 
is a small read optimization, but it's a bit involved so it's not worth 
bothering with in 1.2.


 Performance improvements of RangeTombstones/IntervalTree
 

 Key: CASSANDRA-5677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Fabien Rousseau
Priority: Minor
 Attachments: 5677-1.2.txt, 5677-new-IntervalTree-implementation.patch


 Using massively range tombstones leads to bad response time (ie 100-500 
 ranges tombstones per row).
 After investigation, it seems that the culprit is how the DeletionInfo are 
 merged. Each time a RangeTombstone is added into the DeletionInfo, the whole 
 IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 
 100 instances of IntervalTree are created, the first one having one interval, 
 the second one 2 intervals,... the 100th one : 100 intervals...)
 It seems that once the IntervalTree is built, it is not possible to add a new 
 Interval. Idea is to change the implementation of the IntervalTree by another 
 one which support insert interval.
 Attached is a proposed patch which :
  - renames the IntervalTree implementation to IntervalTreeCentered (the 
 renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
  - adds a new implementation IntervalTreeAvl (which is described here : 
 http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : 
 http://en.wikipedia.org/wiki/AVL_tree )
  - adds a new interface IIntervalTree to abstract the implementation
  - adds a new configuration option (interval_tree_provider) which allows to 
 choose between the two implementations (defaults to previous 
 IntervalTreeCentered)
  - updates IntervalTreeTest unit tests to test both implementations
  - creates a mini benchmark between the two implementations (tree creation, 
 point lookup, interval lookup)
  - creates a mini benchmark between the two implementations when merging 
 DeletionInfo (which shows a big performance improvement when using 500 
 tombstones for a row)
 This patch applies for 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree

2013-06-20 Thread Fabien Rousseau (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabien Rousseau updated CASSANDRA-5677:
---

Attachment: 5677-new-IntervalTree-implementation.patch

 Performance improvements of RangeTombstones/IntervalTree
 

 Key: CASSANDRA-5677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5677
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Fabien Rousseau
Priority: Minor
 Attachments: 5677-new-IntervalTree-implementation.patch


 Using massively range tombstones leads to bad response time (ie 100-500 
 ranges tombstones per row).
 After investigation, it seems that the culprit is how the DeletionInfo are 
 merged. Each time a RangeTombstone is added into the DeletionInfo, the whole 
 IntervalTree is rebuilt (thus, if you have 100 tombstones in one row, then 
 100 instances of IntervalTree are created, the first one having one interval, 
 the second one 2 intervals,... the 100th one : 100 intervals...)
 It seems that once the IntervalTree is built, it is not possible to add a new 
 Interval. Idea is to change the implementation of the IntervalTree by another 
 one which support insert interval.
 Attached is a proposed patch which :
  - renames the IntervalTree implementation to IntervalTreeCentered (the 
 renaming is inspired from : http://en.wikipedia.org/wiki/Interval_tree)
  - adds a new implementation IntervalTreeAvl (which is described here : 
 http://en.wikipedia.org/wiki/Interval_tree#Augmented_tree and here : 
 http://en.wikipedia.org/wiki/AVL_tree )
  - adds a new interface IIntervalTree to abstract the implementation
  - adds a new configuration option (interval_tree_provider) which allows to 
 choose between the two implementations (defaults to previous 
 IntervalTreeCentered)
  - updates IntervalTreeTest unit tests to test both implementations
  - creates a mini benchmark between the two implementations (tree creation, 
 point lookup, interval lookup)
  - creates a mini benchmark between the two implementations when merging 
 DeletionInfo (which shows a big performance improvement when using 500 
 tombstones for a row)
 This patch applies for 1.2 branch...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira