[jira] [Commented] (CASSANDRA-8547) Make RangeTombstone.Tracker.isDeleted() faster

2015-06-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567278#comment-14567278
 ] 

Sylvain Lebresne commented on CASSANDRA-8547:
-

Maybe Tyler meant CASSANDRA-9486?

And I do think the actual problem is the one pointed out in CASSANDRA-9486. The 
idea behind {{RangeTombstone.Tracker}} is that it only tracks tombstones that 
are actually useful, i.e. those that still cover something. As such, the linear 
scan in {{isDeleted}} shouldn't be a problem, since it shouldn't scan anything 
uselessly. However, and that's what CASSANDRA-9486 is about, the tracker is not 
always used properly, and there are cases where its {{update}} method is not 
called, resulting in the unexpectedly high cost of {{isDeleted}}. In practice, 
I'm sure the attached patch does improve things, but that's not really the 
right fix. And as the right fix is already being discussed on CASSANDRA-9486, 
I'm going to mark this as a duplicate.
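
To illustrate the point about {{update}}, here is a minimal sketch of a tracker that drops tombstones once they can no longer cover anything. It is not the Cassandra implementation: {{SimpleTombstone}}, {{SimpleTracker}} and {{trackedCount}} are hypothetical names, and column names are modelled as plain ints.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model: column names are plain ints here,
// not Cassandra's real column name types.
final class SimpleTombstone {
    final int min;        // first name covered (inclusive)
    final int max;        // last name covered (inclusive)
    final long timestamp; // deletion timestamp

    SimpleTombstone(int min, int max, long timestamp) {
        this.min = min;
        this.max = max;
        this.timestamp = timestamp;
    }
}

final class SimpleTracker {
    // Only tombstones that may still cover upcoming names are kept here.
    private final List<SimpleTombstone> open = new ArrayList<>();

    void add(SimpleTombstone tombstone) {
        open.add(tombstone);
    }

    // Meant to be called for every name, in ascending order. Skipping this
    // call is what lets dead tombstones pile up and slows isDeleted down.
    void update(int name) {
        open.removeIf(t -> t.max < name);
    }

    // Linear scan, but only over the tombstones that are still open.
    boolean isDeleted(int name, long timestamp) {
        for (SimpleTombstone t : open) {
            if (t.min <= name && name <= t.max && t.timestamp >= timestamp) {
                return true;
            }
        }
        return false;
    }

    int trackedCount() {
        return open.size();
    }
}
```

As long as {{update}} runs for each name in order, the list scanned by {{isDeleted}} stays as small as the number of tombstones overlapping the current position. If {{update}} is skipped, every tombstone ever added stays in the list and the scan grows with it.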

 Make RangeTombstone.Tracker.isDeleted() faster
 --

 Key: CASSANDRA-8547
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8547
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: 2.0.11
Reporter: Dominic Letz
Assignee: Dominic Letz
  Labels: tombstone
 Fix For: 2.1.x

 Attachments: Selection_044.png, cassandra-2.0.11-8547.txt, 
 cassandra-2.1-8547.txt, rangetombstone.tracker.txt


 During compactions and repairs with many tombstones, an exorbitant amount of 
 time is spent in RangeTombstone.Tracker.isDeleted().
 The time spent there can be so large that compactions and repairs look 
 stalled, with the estimated time remaining frozen at the same value for days.
 Using VisualVM, I sample-profiled the code during execution and found this 
 both during compaction and during repairs (point-in-time backtraces attached).
 Looking at the code, the problem is obviously the linear scanning:
 {code}
 public boolean isDeleted(Column column)
 {
     for (RangeTombstone tombstone : ranges)
     {
         if (comparator.compare(column.name(), tombstone.min) >= 0
             && comparator.compare(column.name(), tombstone.max) <= 0
             && tombstone.maxTimestamp() >= column.timestamp())
         {
             return true;
         }
     }
     return false;
 }
 {code}
 I would like to propose using a sorted list (e.g. RangeTombstoneList) here 
 instead.
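
 As a rough sketch of what a sorted lookup could look like (this is not the attached patch, and not the actual RangeTombstoneList code): if the ranges are kept sorted by start and are assumed not to overlap, a binary search finds the only candidate range in O(log n). The class name {{SortedRanges}}, the int names and the parallel-array layout are assumptions made for the example.

```java
// Hypothetical sketch: ranges are sorted by start and assumed not to overlap.
// Column names are modelled as ints for brevity.
final class SortedRanges {
    private final int[] starts;      // inclusive range starts, ascending
    private final int[] ends;        // inclusive range ends
    private final long[] timestamps; // deletion timestamp of each range

    SortedRanges(int[] starts, int[] ends, long[] timestamps) {
        this.starts = starts.clone();
        this.ends = ends.clone();
        this.timestamps = timestamps.clone();
    }

    // Binary search for the last range whose start is <= name.
    // Returns -1 when every range starts after name.
    private int floorIndex(int name) {
        int lo = 0;
        int hi = starts.length - 1;
        int found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (starts[mid] <= name) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    // Because ranges do not overlap, only the floor range can cover name.
    boolean isDeleted(int name, long timestamp) {
        int idx = floorIndex(name);
        return idx >= 0 && name <= ends[idx] && timestamps[idx] >= timestamp;
    }
}
```

 The non-overlap assumption is what makes a single candidate sufficient; with overlapping ranges the lookup would need an interval tree or a merge step when ranges are added.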



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8547) Make RangeTombstone.Tracker.isDeleted() faster

2015-05-29 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564997#comment-14564997
 ] 

Tyler Hobbs commented on CASSANDRA-8547:


I believe this is obsoleted by CASSANDRA-6446.


[jira] [Commented] (CASSANDRA-8547) Make RangeTombstone.Tracker.isDeleted() faster

2015-05-29 Thread Oleg Anastasyev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565008#comment-14565008
 ] 

Oleg Anastasyev commented on CASSANDRA-8547:


Um, not sure. At least on 2.0, we still had this problem during repairs after 
applying CASSANDRA-6446.


[jira] [Commented] (CASSANDRA-8547) Make RangeTombstone.Tracker.isDeleted() faster

2015-05-29 Thread Randy Fradin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565143#comment-14565143
 ] 

Randy Fradin commented on CASSANDRA-8547:
-

I see this problem on 2.1.5 so I don't think this is resolved. A validation 
compaction is completely stuck; a thread dump shows it inside this loop, and 
top is showing 1 CPU core 100% utilized.


[jira] [Commented] (CASSANDRA-8547) Make RangeTombstone.Tracker.isDeleted() faster

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265891#comment-14265891
 ] 

Benedict commented on CASSANDRA-8547:
-

This looks like something [~slebresne] should take a look at.

 Make RangeTombstone.Tracker.isDeleted() faster
 --

 Key: CASSANDRA-8547
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8547
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: 2.0.11
Reporter: Dominic Letz
Assignee: Dominic Letz
 Fix For: 2.1.3

 Attachments: Selection_044.png, cassandra-2.0.11-8547.txt, 
 cassandra-2.1-8547.txt, rangetombstone.tracker.txt

