[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433288#comment-15433288
 ] 

Marcus Eriksson commented on CASSANDRA-6216:


nice catch, committed, thanks

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Assignee: Dikang Gu
>Priority: Minor
>  Labels: compaction, lcs
> Fix For: 3.0.9, 3.10
>
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-23 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433157#comment-15433157
 ] 

Dikang Gu commented on CASSANDRA-6216:
--

Thanks [~krummas]! It's probably due to merge, I think you need to remove the 
calculatedLastCompactedKeys check as well. See my patch:  
https://github.com/DikangGu/cassandra/commit/2d1f3871b7ac76e0b7d0ff8e47aa69cda013b0dc

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Assignee: Dikang Gu
>Priority: Minor
>  Labels: compaction, lcs
> Fix For: 3.0.9, 3.10
>
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432238#comment-15432238
 ] 

Marcus Eriksson commented on CASSANDRA-6216:


this lgtm with a small change - we should call super.startup() after 
calculating the last compacted keys - otherwise (i think) we could start 
compactions before it has been calculated

I did that and rebased on 3.0 
[here|https://github.com/krummas/cassandra/commits/dikang/6216]

tests:
https://cassci.datastax.com/view/Dev/view/krummas/job/krummas-dikang-6216-testall/
https://cassci.datastax.com/view/Dev/view/krummas/job/krummas-dikang-6216-dtest/

will commit when tests have finished successfully

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Assignee: Dikang Gu
>Priority: Minor
>  Labels: compaction, lcs
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-22 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431816#comment-15431816
 ] 

Dikang Gu commented on CASSANDRA-6216:
--

[~krummas], thanks for the review! Addressed your comments and here are the 
latest patch, 
https://github.com/DikangGu/cassandra/commit/928a9ea5621c1405c5ac07f4dc26c616206277cf.

I also manually tested it, seems fine.

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Assignee: Dikang Gu
>Priority: Minor
>  Labels: compaction, lcs
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-22 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430121#comment-15430121
 ] 

Marcus Eriksson commented on CASSANDRA-6216:


This looks good to me, just one small comment

Instead of calculating it in LeveledManifest#getCompactionCandidates - I think 
we could override AbstractCompactionStrategy#startup in 
LeveledCompactionStrategy - at this point we should know all the sstables and 
can do the calculation

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Assignee: Dikang Gu
>Priority: Minor
>  Labels: compaction, lcs
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-20 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429283#comment-15429283
 ] 

Dikang Gu commented on CASSANDRA-6216:
--

[~krummas] cool, I think it's a more easy-to-understand idea, and maybe could 
save a lot of code. Here is the patch 
https://github.com/DikangGu/cassandra/commit/a63c8583ada93187b177c68dbc4443c29717e44e,
 can you please take a look?

I'm not sure how to write the unit test at this moment, will try some manual 
testing.

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Priority: Minor
>  Labels: compaction, lcs
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-19 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429243#comment-15429243
 ] 

Marcus Eriksson commented on CASSANDRA-6216:


If we want to start compaction in level n, maybe we could find the newest (by 
creation or modification time) file in level n+1 and use its last token when 
finding the start point in level n?

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Priority: Minor
>  Labels: compaction, lcs
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-19 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429086#comment-15429086
 ] 

Dikang Gu commented on CASSANDRA-6216:
--

I'd like to work on this, I'll see if I can store the last compacted keys in 
CFMetaData, sounds a correct direction?

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Priority: Minor
>  Labels: compaction, lcs
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428769#comment-15428769
 ] 

sankalp kohli commented on CASSANDRA-6216:
--

I did not fix it..not sure if someone else has fixed it

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Priority: Minor
>  Labels: compaction, lcs
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2016-08-19 Thread Wei Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428713#comment-15428713
 ] 

Wei Deng commented on CASSANDRA-6216:
-

Isn't this already implemented so this should be closed with a proper fixVer 
number?

> Level Compaction should persist last compacted key per level
> 
>
> Key: CASSANDRA-6216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
>  Labels: compaction, lcs
> Attachments: JIRA-6216.diff
>
>
> Level compaction does not persist the last compacted key per level. This is 
> important for higher levels. 
> The sstables with higher token and in higher levels wont get a chance to 
> compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2013-12-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850971#comment-13850971
 ] 

Jonathan Ellis commented on CASSANDRA-6216:
---

We should be going by size-on-disk, not sstable count.  They do tend to 
correspond but it's common to get leftover sstables smaller than the max, 
which can be meaningful as max gets larger.  We can also get slightly more 
sophisticated and count bytes that would be written after accounting for 
expired tombstones.  (See {{findDroppableSSTable}} for example of using 
tombstone stats.)

Also, this is actually backwards -- we want the *least* overlapping, since that 
means we spend less time rewriting data that doesn't change (less write 
amplification).  See improved compaction here: 
http://hackingdistributed.com/2013/06/17/hyperleveldb/, although I think they 
did overcomplicate the solution (as I mentioned in my comment on that page).

 Level Compaction should persist last compacted key per level
 

 Key: CASSANDRA-6216
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: sankalp kohli
Priority: Minor
 Attachments: JIRA-6216.diff


 Level compaction does not persist the last compacted key per level. This is 
 important for higher levels. 
 The sstables with higher token and in higher levels wont get a chance to 
 compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2013-12-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850982#comment-13850982
 ] 

Jonathan Ellis commented on CASSANDRA-6216:
---

That is, we should prefer smallest write amplification where

wa = (overlapped size-minus-tombstones + candidate size-minus-tombstones) / 
candidate size-minus-tombstones

 Level Compaction should persist last compacted key per level
 

 Key: CASSANDRA-6216
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: sankalp kohli
Priority: Minor
 Attachments: JIRA-6216.diff


 Level compaction does not persist the last compacted key per level. This is 
 important for higher levels. 
 The sstables with higher token and in higher levels wont get a chance to 
 compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2013-10-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804863#comment-13804863
 ] 

Jonathan Ellis commented on CASSANDRA-6216:
---

bq. should we not pick those sstables which have the largest overlapping with 
sstables in the higer level. 

+1.  IIRC HyperDex does this in their leveldb fork.

bq. this might cause starvation to some sstables as they wont get a chance to 
compact.

Well, if they're worse candidates, I'm okay with starving them. :)

(We will force compaction of an sstable that has too many expired tombstones 
already, so tombstones living forever shouldn't be a problem.)

 Level Compaction should persist last compacted key per level
 

 Key: CASSANDRA-6216
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: sankalp kohli
Priority: Minor

 Level compaction does not persist the last compacted key per level. This is 
 important for higher levels. 
 The sstables with higher token and in higher levels wont get a chance to 
 compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6216) Level Compaction should persist last compacted key per level

2013-10-17 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13798675#comment-13798675
 ] 

sankalp kohli commented on CASSANDRA-6216:
--

Instead of sorting all sstables in a level by token and iterating over them, 
should we not pick those sstables which have the largest overlapping with 
sstables in the higer level. Though this might cause starvation to some 
sstables as they wont get a chance to compact. May be a mix of both will work 
better. 

 Level Compaction should persist last compacted key per level
 

 Key: CASSANDRA-6216
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Priority: Minor

 Level compaction does not persist the last compacted key per level. This is 
 important for higher levels. 
 The sstables with higher token and in higher levels wont get a chance to 
 compact as the last compacted key will get reset after a restart.



--
This message was sent by Atlassian JIRA
(v6.1#6144)