[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507627#comment-13507627
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

committed, thanks!

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2.0 beta 1
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.3

 Attachments: 3974-v8.txt, 3974-v9.txt, trunk-3974.txt, 
 trunk-3974v2.txt, trunk-3974v3.txt, trunk-3974v4.txt, trunk-3974v5.txt, 
 trunk-3974v6.txt, trunk-3974v7.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-21 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13501891#comment-13501891
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

Set to open: because patch has been reviewed, no need for it to keep showing up 
in Patch Available until there is a new one.

Target version 1.3: because we missed the window to make 1.2.0rc1.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2.0 beta 1
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.3

 Attachments: 3974-v8.txt, trunk-3974.txt, trunk-3974v2.txt, 
 trunk-3974v3.txt, trunk-3974v4.txt, trunk-3974v5.txt, trunk-3974v6.txt, 
 trunk-3974v7.txt



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-20 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13501412#comment-13501412
 ] 

Jeremy Hanna commented on CASSANDRA-3974:
-

So this isn't going into 1.2 because it didn't apply cleanly to trunk?  I'm 
confused about why the status is set to open and the target version is now set 
to 1.3.


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-19 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13500420#comment-13500420
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

v8 lgtm.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2.0 beta 1
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2.0 rc1

 Attachments: 3974-v8.txt, trunk-3974.txt, trunk-3974v2.txt, 
 trunk-3974v3.txt, trunk-3974v4.txt, trunk-3974v5.txt, trunk-3974v6.txt, 
 trunk-3974v7.txt



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13499413#comment-13499413
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

v8 attached to add Column.create factory and use that from addColumn and 
UpdateStatement


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-15 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13498111#comment-13498111
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

I guess I thought that was implied but we can apply YAGNI here.

What about UpdateStatement/addColumn?

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2.0 beta 1
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2.0 rc1

 Attachments: trunk-3974.txt, trunk-3974v2.txt, trunk-3974v3.txt, 
 trunk-3974v4.txt, trunk-3974v5.txt, trunk-3974v6.txt, trunk-3974v7.txt



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13496334#comment-13496334
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

If we use ttl = 0 as a signal to use the default ttl in CF.addColumn, how do 
we override the default to be no ttl at all?  Should we treat 
Integer.MAX_VALUE as don't use the default, just give me a non-expiring 
Column?

Does UpdateStatement go through addColumn eventually?  If so we are duplicating 
code there.  If not that makes me a bigger fan of centralizing this in a 
factory method.  (Guess we can leave the other constructors alone.)
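
The sentinel question above can be sketched as a small TTL-resolution helper. 
This is only an illustration of the proposal being asked about, not the code 
in the attached patches: the class name TtlResolver, the method resolve, and 
both constants are hypothetical, and it assumes that a stored TTL of 0 means 
"never expires".

```java
// Hypothetical sketch of the "ttl = 0 means use the CF default" proposal.
// None of these names come from the Cassandra code base.
final class TtlResolver {
    /** Requested TTL meaning "fall back to the column family default". */
    static final int USE_DEFAULT = 0;
    /** Proposed sentinel meaning "never expire, even if the CF has a default". */
    static final int NO_EXPIRY = Integer.MAX_VALUE;

    /**
     * Returns the TTL in seconds to store with the column.
     * A result of 0 means the column does not expire (an assumption of this sketch).
     */
    static int resolve(int requestedTtl, int cfDefaultTtl) {
        if (requestedTtl < 0)
            throw new IllegalArgumentException("TTL must be non-negative: " + requestedTtl);
        if (requestedTtl == NO_EXPIRY)
            return 0;             // explicit opt-out of the CF default
        if (requestedTtl == USE_DEFAULT)
            return cfDefaultTtl;  // may itself be 0 when no default is configured
        return requestedTtl;      // an explicit per-column TTL wins
    }

    public static void main(String[] args) {
        System.out.println(resolve(USE_DEFAULT, 3600));
        System.out.println(resolve(60, 3600));
        System.out.println(resolve(NO_EXPIRY, 3600));
    }
}
```

The cost of the sentinel is that Integer.MAX_VALUE stops being usable as a 
real TTL, which is part of why the opt-out may not be worth adding.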


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-13 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13496414#comment-13496414
 ] 

Kirk True commented on CASSANDRA-3974:
--

Jonathan, do we want the ability for clients to explicitly *not* use the column 
family default TTL?


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-12 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13495293#comment-13495293
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

bq. Should we make the constructors private and expose a factory method that 
requires the metadata?

I like that idea.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2.0 beta 1
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2.0 rc1

 Attachments: trunk-3974.txt, trunk-3974v2.txt, trunk-3974v3.txt, 
 trunk-3974v4.txt, trunk-3974v5.txt, trunk-3974v6.txt



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-09 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13494091#comment-13494091
 ] 

Kirk True commented on CASSANDRA-3974:
--

Pinging for feedback.


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13494137#comment-13494137
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

I realize I'm the reviewer on this one. I seem to remember that [~jbellis] 
wanted to have a look at that, but maybe I misunderstood?

In any case, I had a look at the patch and it looks good to me overall. That 
being said, and this is not really a criticism of the patch, I was slightly 
surprised that we only need to modify {{ColumnFamily.addColumn(QueryPath, 
...)}} and {{InsertStatement}} to make that work. Don't get me wrong, I do 
think this is correct, but it does feel a bit fragile that some future internal 
code could too easily add an ExpiringColumn through 
ColumnFamily.addColumn(IColumn) and skip the global cf setting. I don't really 
have any good solution to make it less fragile, however; I'm just thinking out 
loud. But that remark aside, again the patch does lgtm (aside from needing a 
rebase).


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-11-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13494329#comment-13494329
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

bq. it does feel a bit fragile that some future internal code could too easily 
add an ExpiringColumn through ColumnFamily.addColumn(IColumn)

Tracing and unit tests (via Util.expiringColumn) already use this method.

Should we make the constructors private and expose a factory method that 
requires the metadata?
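
A minimal sketch of that idea, assuming simplified stand-in types rather than 
the real ColumnFamily/CFMetaData classes: CfMeta, SketchColumn, and create are 
hypothetical names, and a TTL of 0 is assumed to mean "never expires". With the 
constructor private, callers cannot build a column without going through the 
metadata-aware factory.

```java
// Hypothetical stand-in for column family metadata; only the default TTL matters here.
final class CfMeta {
    final int defaultTtlSeconds; // 0 = no default TTL configured

    CfMeta(int defaultTtlSeconds) {
        this.defaultTtlSeconds = defaultTtlSeconds;
    }
}

// Hypothetical column type whose only entry point is the factory method.
final class SketchColumn {
    final String name;
    final String value;
    final long timestamp;
    final int ttlSeconds; // 0 = never expires

    // Private: no caller can bypass the default-TTL logic in create().
    private SketchColumn(String name, String value, long timestamp, int ttlSeconds) {
        this.name = name;
        this.value = value;
        this.timestamp = timestamp;
        this.ttlSeconds = ttlSeconds;
    }

    /** Builds a column, falling back to the CF default when no TTL is requested. */
    static SketchColumn create(CfMeta meta, String name, String value,
                               long timestamp, int requestedTtlSeconds) {
        int effective = requestedTtlSeconds > 0 ? requestedTtlSeconds
                                                : meta.defaultTtlSeconds;
        return new SketchColumn(name, value, timestamp, effective);
    }

    boolean isExpiring() {
        return ttlSeconds > 0;
    }
}
```

Because the factory requires a CfMeta argument, a future caller that forgets 
about the default TTL fails to compile instead of silently skipping it.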



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-10-02 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13467776#comment-13467776
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

Sorry, I forgot about this one, so the patch needs rebasing. But from a 
cursory inspection, v5 looks ok (except maybe for the M/R support Jeremy 
suggested above, though I'm not sure where the best place to add that is).

Also, CASSANDRA-3442 adds info that should help us optimize things by dropping 
fully expired sstables, but I'm not sure the optimization itself is implemented 
yet. Do we want to do that in this ticket or should we move that to later (I'm 
good either way)?

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2.0 beta 1
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2.0

 Attachments: trunk-3974.txt, trunk-3974v2.txt, trunk-3974v3.txt, 
 trunk-3974v4.txt, trunk-3974v5.txt



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-14 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434497#comment-13434497
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

I think Sylvain is right that that makes more sense...  sorry about the wild 
goose chase!

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt, trunk-3974v2.txt, trunk-3974v3.txt, 
 trunk-3974v4.txt



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-07 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13430612#comment-13430612
 ] 

Kirk True commented on CASSANDRA-3974:
--

I can reintroduce the 'TTL as a default' approach from the first patch if 
that's how we want it to work.


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-02 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427314#comment-13427314
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

I'm sorry I'm a little late to the discussion, but I'm not sure I'm a fan of 
using the metadata TTL to decide expiration because:
# It means we use the column timestamp to decide expiration. However, we 
have been very careful so far not to use the column timestamp as a server-side 
timestamp. And in particular, the patch assumes the timestamp is in 
milliseconds, while most clients and CQL actually use microseconds.
# Altering the default TTL is imo more confusing that way, because we are 
pretending that altering the TTL will apply to all existing CF and columns, 
which itself suggests that if you want to remove everything older than, say, 
1h, you can switch the TTL to 1h and then change it back right away to some 
other much longer value (or 0). But that's not the case, because the new TTL 
will be applied to existing data only when compaction happens. And I really 
don't think that user-visible behavior should depend in any way on the timing 
of internal operations.
# This requires passing the CFMetaData in lots of places in the code, which 
isn't really nice. In particular, we should call isColumnExpiredFromDefaultTTL 
pretty much every time DeletionInfo.isDeleted() is called (after all, having an 
expired column is exactly the same as having a deleted one), and the current 
patch is missing quite a few places.

So I think I do prefer the idea of having the CF TTL just being the default TTL 
applied to columns when inserted if they don't have one. 
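
To illustrate the timestamp-unit concern in the first point, here is a small 
sketch. It assumes the mismatch is between a server-side check that reads the 
column timestamp as milliseconds and a client that writes microseconds; the 
class and method names are hypothetical and are not taken from the patch.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: deriving expiry from a client-supplied timestamp
// only works if the server guesses the client's time unit correctly.
final class TimestampUnits {
    /** Expiry check that treats the column timestamp as milliseconds. */
    static boolean isExpiredAssumingMillis(long columnTimestamp, int ttlSeconds, long nowMillis) {
        return columnTimestamp + TimeUnit.SECONDS.toMillis(ttlSeconds) <= nowMillis;
    }

    /** The same check with the column timestamp treated as microseconds. */
    static boolean isExpiredAssumingMicros(long columnTimestamp, int ttlSeconds, long nowMillis) {
        long writtenMillis = TimeUnit.MICROSECONDS.toMillis(columnTimestamp);
        return writtenMillis + TimeUnit.SECONDS.toMillis(ttlSeconds) <= nowMillis;
    }

    public static void main(String[] args) {
        long writtenMillis = 1_343_900_000_000L;         // arbitrary write time
        long clientTimestamp = writtenMillis * 1000;      // client writes microseconds
        long nowMillis = writtenMillis + 2 * 3_600_000L;  // two hours later
        int ttlSeconds = 3600;                            // one-hour TTL

        // The column is past its TTL, but the millisecond reading places the
        // write far in the future and therefore never reports it as expired.
        System.out.println(isExpiredAssumingMillis(clientTimestamp, ttlSeconds, nowMillis));
        System.out.println(isExpiredAssumingMicros(clientTimestamp, ttlSeconds, nowMillis));
    }
}
```

A TTL stored on the column at insert time avoids the guess, since the server 
does not need to interpret the client timestamp at all.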



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427342#comment-13427342
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

If we're just going to have CF TTL being sugar for clients too lazy to apply 
what they want, then I'm not interested.

But if we use CF TTL to provide an upper bound on how long data can live, then 
we open the door for some interesting optimizations.


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-02 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427343#comment-13427343
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

Well, if the goal is just to be able to drop entire sstables when we know 
everything is expired, we could compute and keep in the metadata the min TTL of 
the sstable. 
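
One way to read that suggestion is sketched below. It is an interpretation 
only: instead of a TTL value it tracks the latest expiration time seen among an 
sstable's columns, and it ignores gc grace and overlapping sstables, which a 
real implementation would have to consider. All names are hypothetical.

```java
// Hypothetical sketch: per-sstable summary that allows dropping the whole
// file once every column in it has expired.
final class SSTableExpirySketch {
    /** Marker for a column that never expires. */
    static final int NEVER = Integer.MAX_VALUE;

    /**
     * Returns the latest expiration time (seconds since epoch) among the
     * given columns, or NEVER if any column is non-expiring or there are none.
     */
    static int maxExpiration(int[] columnExpirations) {
        if (columnExpirations.length == 0)
            return NEVER; // be conservative about empty input
        int max = Integer.MIN_VALUE;
        for (int expiration : columnExpirations)
            max = Math.max(max, expiration);
        return max;
    }

    /** True when every column summarized by maxExpirationSeconds has expired. */
    static boolean fullyExpired(int maxExpirationSeconds, int nowSeconds) {
        return maxExpirationSeconds != NEVER && maxExpirationSeconds <= nowSeconds;
    }
}
```

The summary would be computed once when the sstable is written, so the check 
at compaction time is a single comparison rather than a scan of the columns.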


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427346#comment-13427346
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

Hmm.  Now that you mention it, Yuki already added that in CASSANDRA-3442...


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427397#comment-13427397
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

bq. If we're just going to have CF TTL being sugar for clients too lazy to 
apply what they want, then I'm not interested.

I guess it would be a good thing to have for CQL though by the same reasoning 
as CASSANDRA-4448.


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-02 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427399#comment-13427399
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

bq. I guess it would be a good thing to have for CQL though by the same 
reasoning as CASSANDRA-4448.

Agreed.


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-08-02 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427448#comment-13427448
 ] 

Jeremy Hanna commented on CASSANDRA-3974:
-

bq. If we're just going to have CF TTL being sugar for clients too lazy to 
apply what they want, then I'm not interested.

Also if that client happens to be Pig or Hive, there's not currently a way to 
set TTLs.  So in that case it's not laziness of the client.

A use case: I don't want to MapReduce over my giant archival column family, so 
when ingesting data I'll write both to my archival column family and to a 
second column family with a default TTL (or however it's implemented), so that 
the second one holds just the data from the last 30 days.


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-07-24 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13421558#comment-13421558
 ] 

Kirk True commented on CASSANDRA-3974:
--

Sylvain - the main issue is that the code isn't structured in such a way that a 
CFMetaData object is available.

Neither the code for QueryFilter.isRelevant nor its callers have access to a 
CFMetaData. Can you think of a way to get the CFMetaData in there or a 
different way to structure the code in general?

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt, trunk-3974v2.txt, trunk-3974v3.txt



[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-07-18 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13417710#comment-13417710
 ] 

Robert Coli commented on CASSANDRA-3974:


Sorry for the erroneous attachment, somehow JIRA produced a link on creation of 
https://issues.apache.org/jira/browse/CASSANDRA-4446 which directed me here and 
I attached before I noticed it was the wrong ticket.


[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-06-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402025#comment-13402025
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

About the removeDeleted problem: I think that trying to force calls to 
removeDeleted (which forces an iteration over all columns) so that we can add the 
logic of this ticket is the wrong approach (because it's inefficient for no 
good reason and doesn't make the code easier to follow). I.e. currently the 
code to ignore irrelevant columns is split between QueryFilter.isRelevant() and 
removeDeleted depending on which code path is taken (basically, reads use 
isRelevant and compaction uses removeDeleted). So I see mostly 2 options:
# we find a way to refactor the code so that we only ever ignore irrelevant 
columns in one place. That would be great but again it's unclear how to do that 
correctly.
# we put the logic for this patch in both removeDeleted and isRelevant.

I'm personally fine going with the second solution for the purpose of this ticket 
and keeping the first option in mind for later as a way to improve the code base.
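
As a rough illustration of putting the same check on two paths, the sketch below keeps a single expiry predicate and calls it from a read-style check and a compaction-style pass. Every name in it (CfTtlCheck, Col, isExpiredByCfTtl, isRelevantForRead, removeDeletedForCompaction) is a hypothetical stand-in rather than the real QueryFilter or ColumnFamilyStore signature, and the write timestamp is assumed to be in milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one CF-TTL predicate shared by a read path and a compaction path.
final class CfTtlCheck
{
    // Minimal stand-in for a column: a name and a write timestamp in milliseconds (assumed unit).
    static final class Col
    {
        final String name;
        final long timestampMillis;

        Col(String name, long timestampMillis)
        {
            this.name = name;
            this.timestampMillis = timestampMillis;
        }
    }

    // A CF TTL of 0 or less is taken to mean "no CF-level TTL" (assumption of this sketch).
    static boolean isExpiredByCfTtl(Col c, int cfTtlSeconds, long nowMillis)
    {
        return cfTtlSeconds > 0 && nowMillis >= c.timestampMillis + cfTtlSeconds * 1000L;
    }

    // Read path: decides per column, without iterating the whole row up front.
    static boolean isRelevantForRead(Col c, int cfTtlSeconds, long nowMillis)
    {
        return !isExpiredByCfTtl(c, cfTtlSeconds, nowMillis);
    }

    // Compaction path: walks every column and keeps only the ones that have not expired.
    static List<Col> removeDeletedForCompaction(List<Col> cols, int cfTtlSeconds, long nowMillis)
    {
        List<Col> live = new ArrayList<>();
        for (Col c : cols)
            if (!isExpiredByCfTtl(c, cfTtlSeconds, nowMillis))
                live.add(c);
        return live;
    }
}
```

Keeping the predicate in one method means the two call sites cannot drift apart, even though the filtering itself still happens in two places.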

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt, trunk-3974v2.txt, trunk-3974v3.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-06-20 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13397644#comment-13397644
 ] 

Kirk True commented on CASSANDRA-3974:
--

Pinging to have someone look at patch v2.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt, trunk-3974v2.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-06-04 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288953#comment-13288953
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

bq. when the user explicitly provides a column TTL longer than the default 
column family TTL, I would think we'd either want to a) give an error, or b) 
provide a warning

I'd be in favor of (a).

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-06-04 Thread Leonardo Stern (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288962#comment-13288962
 ] 

Leonardo Stern commented on CASSANDRA-3974:
---

As a user, I prefer option (a).

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-06-04 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289007#comment-13289007
 ] 

Kirk True commented on CASSANDRA-3974:
--

CASSANDRA-4299 ({{removeDeleted}} clean up) blocks this, as reads aren't 
presently calling {{removeDeleted}} and columns aren't being filtered out.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Affects Versions: 1.2
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt, trunk-3974v2.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-30 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286113#comment-13286113
 ] 

Kirk True commented on CASSANDRA-3974:
--

I still need to implement unit tests. Do you have any suggestions as to an 
existing class into which they could be incorporated and/or good examples to 
copy?

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-30 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286115#comment-13286115
 ] 

Kirk True commented on CASSANDRA-3974:
--

Also, given that the logic is {{min(CF TTL, column TTL)}}, when the user 
explicitly provides a column TTL longer than the default column family TTL, I 
would think we'd either want to a) give an error, or b) provide a warning. At 
this point, the larger value provided by the user is simply ignored.

Thoughts?  
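
To make option (a) concrete, here is a minimal sketch of a check that rejects a column TTL longer than the column family default instead of silently clamping it. The class and method names (TtlValidation, resolveTtl) are hypothetical, and treating 0 as "not set" for either value is an assumption of this sketch, not something taken from the patch.

```java
// Hypothetical sketch of option (a): refuse a column TTL that exceeds the CF default.
final class TtlValidation
{
    // Returns the TTL to apply, in seconds; 0 means "not set" for either argument (assumption).
    static int resolveTtl(int cfDefaultTtl, int columnTtl)
    {
        if (cfDefaultTtl < 0 || columnTtl < 0)
            throw new IllegalArgumentException("TTL must not be negative");
        if (cfDefaultTtl == 0)
            return columnTtl;       // no CF default: use the column TTL as given
        if (columnTtl == 0)
            return cfDefaultTtl;    // no column TTL: fall back to the CF default
        if (columnTtl > cfDefaultTtl)
            throw new IllegalArgumentException(
                "column TTL " + columnTtl + " exceeds column family default TTL " + cfDefaultTtl);
        return columnTtl;           // at this point this equals min(CF TTL, column TTL)
    }
}
```

Option (b) would replace the exception with a logged warning and return the CF default instead.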

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-30 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286138#comment-13286138
 ] 

Kirk True commented on CASSANDRA-3974:
--

Filed CASSANDRA-4299 to handle the QF cleanup and 'putting removeDeleted on 
every path'.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13285300#comment-13285300
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

If we're leaving QF cleanup for another ticket, is this done / ready for review 
then?

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-22 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281036#comment-13281036
 ] 

Sylvain Lebresne commented on CASSANDRA-3974:
-

It does indeed seem that removeDeleted is not called on that path. I don't know 
if this shows up as a bug though: columns shadowed by a row tombstone are 
removed by QueryFilter.isRelevant, and column tombstones are removed before 
being returned to the client. Yet, it would probably be cleaner to put 
removeDeleted on every path, especially since I agree with Jonathan that it's 
probably the right place to put the CF-TTL check.

Actually I think that if we make sure to put removeDeleted on every path, we 
could probably make it the only method concerned with tombstones and remove 
QueryFilter.isRelevant for instance, which would clean things up. But we can 
probably leave that to another ticket.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-18 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278616#comment-13278616
 ] 

Kirk True commented on CASSANDRA-3974:
--

In my test case, it does go through {{RowIteratorFactory}}, but it *doesn't* go 
through line 111. In {{getReduced}} {{cached}} is always {{null}} so it calls 
the {{filter.collateColumns}} path.

So I made this naive change:

{noformat}
if (cached == null)
{
    // not cached: collate
    filter.collateColumns(returnCF, colIters, gcBefore);
    returnCF = ColumnFamilyStore.removeDeleted(returnCF, gcBefore);
}
else
{
    QueryFilter keyFilter = new QueryFilter(key, filter.path, filter.filter);
    returnCF = cfs.filterColumnFamily(cached, keyFilter, gcBefore);
}
{noformat}

By manually calling {{removeDeleted}} I was able to get my columns filtered 
out as expected.

I'm pretty sure this is incomplete or just plain wrong, but I wanted to get 
your thoughts.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278939#comment-13278939
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

Hmm. I think you've found a bug...

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-17 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278344#comment-13278344
 ] 

Kirk True commented on CASSANDRA-3974:
--

On IRC Jonathan suggested looking at {{ColumnFamilyStore.removeDeleted}} and 
{{PrecompactedRow.removeDeletedAndOldShards}}. However, it doesn't _appear_ 
that either of these is called during column reads, so I can't rely on those to 
filter out results sent back to the client.

The logic that I see for filtering out results sent to the client is in places 
such as {{CassandraServer.thriftifyColumns}} via the 
{{IColumn.isMarkedForDelete}} call. However, as stated previously, since an 
{{IColumn}} doesn't internally store a {{CFMetaData}} object, we'd have to pass 
one in. {{isMarkedForDelete}} is used in a lot of places, so it has a ripple 
effect that causes a lot of changes.

Please advise.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278360#comment-13278360
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

CFS.removeDeleted is the one that's called during column reads. E.g., 
SliceByNamesReadCommand.getRow -> Table.getRow -> CFS.getColumnFamily -> 
CFS.removeDeleted

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-17 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278414#comment-13278414
 ] 

Kirk True commented on CASSANDRA-3974:
--

In my case (inserting data and then calling {{list my_cf}} from the CLI), it 
goes through the {{RangeSliceCommand}} path, which doesn't end up calling 
{{ColumnFamilyStore.getColumnFamily}}. As such, the 
expired-and-thus-should-be-ignored rows are still showing up.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278489#comment-13278489
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

RSVH still goes through getColumnFamily.  It can either go through the index 
AbstractScanIterator which calls getCF at line 195 in KeysSearcher, or the seq 
scan iterator via CFS.filterColumnFamily (RowIteratorFactory line 111).

Remember that these only remove *expired* tombstones; non-expired ones need to 
be returned to the coordinator for read repair.
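
The distinction between expired and non-expired tombstones can be pictured with the simplified model below. It is only a sketch: TombstoneGc, Cell and purgeExpiredTombstones are hypothetical names, and the rule "purge a tombstone only when its local deletion time is before gcBefore" is a simplification assumed for illustration rather than a copy of the real removeDeleted logic.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of "drop only tombstones older than gcBefore".
final class TombstoneGc
{
    static final class Cell
    {
        final String name;
        final boolean tombstone;
        final int localDeletionTime; // seconds since epoch; only meaningful for tombstones

        Cell(String name, boolean tombstone, int localDeletionTime)
        {
            this.name = name;
            this.tombstone = tombstone;
            this.localDeletionTime = localDeletionTime;
        }
    }

    // Keeps live cells and recent tombstones; drops tombstones deleted before gcBefore.
    static List<Cell> purgeExpiredTombstones(List<Cell> cells, int gcBefore)
    {
        List<Cell> kept = new ArrayList<>();
        for (Cell c : cells)
        {
            boolean purgeable = c.tombstone && c.localDeletionTime < gcBefore;
            if (!purgeable)
                kept.add(c); // recent tombstones survive so they can still be sent to the coordinator
        }
        return kept;
    }
}
```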

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-05-07 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13269956#comment-13269956
 ] 

Kirk True commented on CASSANDRA-3974:
--

My understanding is that in order to reduce potential user confusion when 
updating the column family's default TTL, we need to keep the column family's 
default TTL value separate. That is, we probably *don't* want to make 
{{ExpiringColumn}}s for a column family that has a default TTL (using {{min(CF 
TTL, column TTL)}} as the TTL value). Instead, we keep the logic as is and keep 
the column family's default TTL value in {{CFMetaData}}.

That's all fine and good, but looking at the code I'm not quite sure as to when 
we'd check the column family default TTL. It would seem that we need to pass a 
{{CFMetaData}} instance in to {{Column}}'s {{isMarkedForDelete}} so that it can 
perform logic such as:

{noformat}
public boolean isMarkedForDelete(CFMetaData metadata)
{
    if (metadata.getDefaultTimeToLive() > 0)
    {
        // Check if we're using a CF-based TTL.
        // 1000L keeps the seconds-to-milliseconds conversion in long arithmetic.
        return System.currentTimeMillis() >= (timestamp + (metadata.getDefaultTimeToLive() * 1000L));
    }
    else
    {
        return (int) (System.currentTimeMillis() / 1000) >= getLocalDeletionTime();
    }
}
{noformat}

Is this the correct line of thought? If so, that changes a couple of dozen call 
sites which makes me wonder if I'm doing something wrong :)

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-21 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258846#comment-13258846
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

I mean, a CF ttl of X is useful only if it lets us reason that an sstable 
written more than X seconds ago is entirely expired.  So... min? :)

Is describe caching the schema as in CASSANDRA-4052?

Agreed that if we want to allow altering CF ttl, keeping it separate from 
column ttl until we need to check for expired-ness makes the most sense.
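
The reasoning behind min can be sketched as follows. The names (SSTableExpiry, effectiveTtl, isFullyExpired, maxWriteTimeSeconds) are hypothetical and all times are assumed to be in seconds; the point is only that if no column can outlive the CF TTL, an sstable whose newest write is older than that TTL holds nothing live.

```java
// Hypothetical sketch: with effective TTL = min(CF TTL, column TTL), no column outlives the CF TTL.
final class SSTableExpiry
{
    // Effective TTL in seconds; a value of 0 or less means "not set" (assumption of this sketch).
    static int effectiveTtl(int cfTtl, int columnTtl)
    {
        if (cfTtl <= 0)
            return columnTtl;
        if (columnTtl <= 0)
            return cfTtl;
        return Math.min(cfTtl, columnTtl);
    }

    // True when even the newest write in the sstable is at least cfTtl seconds old.
    static boolean isFullyExpired(long maxWriteTimeSeconds, int cfTtl, long nowSeconds)
    {
        return cfTtl > 0 && nowSeconds - maxWriteTimeSeconds >= cfTtl;
    }
}
```

With max instead of min, a single column carrying a TTL longer than the CF TTL would be enough to invalidate the sstable-level check.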

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-20 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258685#comment-13258685
 ] 

Kirk True commented on CASSANDRA-3974:
--

There are a few pieces missing yet:

# The ability to alter a column family to change the default TTL option. 
Because I made the change to use max(column TTL, CF TTL) at column mutate time, 
a later change to the column family default TTL will not apply to such columns.
# I'm fighting with Python to understand why 'DESCRIBE COLUMNFAMILY FOO' 
doesn't show the new default TTL value. I made the change in the Python and 
Java layers to accept this new option, but the describe fails to display it.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-20 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258702#comment-13258702
 ] 

Kirk True commented on CASSANDRA-3974:
--

What are the semantics of updating the CF TTL? Should updating the CF TTL 
affect existing columns? If so, we would not want to apply max(column TTL, CF 
TTL) _at column mutation time_, but instead keep the two values separate and 
evaluate liveness dynamically at some other event (column retrieval and/or 
compaction).
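
The difference between the two moments of evaluation can be sketched with a column that has no explicit TTL of its own, which sidesteps the question of how column and CF TTLs combine. The names (TtlTiming, expiryFixedAtWrite, isLive) are hypothetical and times are assumed to be in seconds.

```java
// Hypothetical sketch contrasting the two moments at which the CF default TTL can be applied.
final class TtlTiming
{
    // Write-time approach: the CF default in force at insert is baked into the column's expiry.
    static long expiryFixedAtWrite(long writeTimeSeconds, int cfTtlAtWrite)
    {
        return writeTimeSeconds + cfTtlAtWrite;
    }

    // Read-time approach: only the write time is stored; the current CF default is consulted on each check.
    static boolean isLive(long writeTimeSeconds, int currentCfTtl, long nowSeconds)
    {
        return currentCfTtl <= 0 || nowSeconds < writeTimeSeconds + currentCfTtl;
    }

    public static void main(String[] args)
    {
        long written = 1_000;      // column written at t=1000s while the CF TTL was 600s
        long fixedExpiry = expiryFixedAtWrite(written, 600);
        int alteredCfTtl = 60;     // CF TTL later lowered to 60s
        long now = 1_100;
        System.out.println("write-time approach live: " + (now < fixedExpiry));
        System.out.println("read-time approach live:  " + isLive(written, alteredCfTtl, now));
    }
}
```

In this example the write-time approach keeps the column until t=1600s regardless of the later ALTER, while the read-time approach already treats it as expired at t=1100s because it uses the new 60s default.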

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-20 Thread Dave Brosius (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258738#comment-13258738
 ] 

Dave Brosius commented on CASSANDRA-3974:
-

JBellis... you mean min, right?

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-12 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252522#comment-13252522
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

bq. In the initial patch, I had made changes to both 
UpdateStatement.addToMutation and ColumnFamily.addColumn to use the larger of 
the column's TTL or the column family default TTL

Oops, I totally missed the addColumn changes.  That's exactly what I had in 
mind.

It sounds like you have an updated patch, could you post that?

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-11 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251561#comment-13251561
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

bq. Part of the code I changed was in CFMetaData's toThrift and fromThrift 
methods

Let me back up.  I can see two main approaches towards respecting the per-CF 
ttl:

# Set the column TTL to the max(column, CF) ttl on insert; then the rest of the 
code doesn't have to know anything changed
# Take max(column, CF) ttl during operations like compaction, and leave column 
ttl to specify *only* the column TTL

The code in UpdateStatement led me to believe you're going with option 1.  So 
what I meant by my comment was, you need to make a similar change for inserts 
done over Thrift RPC, as well.  (to/from Thrift methods are used for telling 
Thrift clients about the schema, but are not used for insert/update operations.)

Does that help?

bq. Sorry, I'm not sure to which part of the code you're referring

CFMetadata.getTimeToLive.  Sounds like you addressed this anyway.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-11 Thread Kirk True (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251966#comment-13251966
 ] 

Kirk True commented on CASSANDRA-3974:
--

I made changes to both {{UpdateStatement.addToMutation}} and 
{{ColumnFamily.addColumn}} to use the larger of the column's TTL or the column 
family default TTL. I tested against the {{cassandra-cli}} and {{cqlsh}} tools 
and both show the default TTL being used if none is specified.

This is all to say that it _looks_ like both the Thrift and CQL paths are 
working as expected. Perhaps it's high time I found the unit tests and added 
some...

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-10 Thread Kirk True (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251196#comment-13251196
 ] 

Kirk True commented on CASSANDRA-3974:
--

Jonathan, thanks for the feedback.

I need a bit of clarification for a newbie hacking on the code...

bq. Looks like this only updates the CQL path? We'd want to make the Thrift 
path cf-ttl-aware as well. I think this just means updating RowMutation + CF 
addColumn methods.

I actually thought the opposite. Part of the code I changed was in 
{{CFMetaData}}'s {{toThrift}} and {{fromThrift}} methods. Perhaps I'm reading 
too much into the method names?

I also took a look at {{ColumnFamily}}'s {{addColumn}} method, but it already 
performs the conditional based on the TTL value.

bq. Nit: we could simplify getTTL a bit by adding assert ttl > 0.

Sorry, I'm not sure to which part of the code you're referring :( Can you 
elaborate?

bq. I got it backwards: we want max(cf ttl, column ttl) to be able to reason 
about the live-ness of CF data w/o looking at individual rows

I cleaned up the {{CFMetaData.getTimeToLive}} method, which is now simply:

{noformat}
public int getTimeToLive(int timeToLive)
{
    return Math.max(defaultTimeToLive, timeToLive);
}
{noformat}

bq. We can break the compaction optimizations into another ticket. It really 
needs a separate compaction strategy; the idea is if we have an sstable A older 
than CF ttl, then all the data in the file is dead and we can just delete the 
file without looking at it row-by-row. However, there's a lot of tension there 
with the goal of normal compaction, which wants to merge different versions of 
the same row, so we're going to churn a lot with a low chance of ever having an 
sstable last the full TTL without being merged, effectively restarting our 
timer. So, I think we're best served by an ArchivingCompactionStrategy that 
doesn't merge sstables at all, just drops obsolete ones, and let people use 
that for append-only insert workloads. Which is a common enough case that it's 
worth the trouble... probably.

Either way is fine. Would love to contribute.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-04 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13246621#comment-13246621
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

Thanks Kirk!

My comments:

- Looks like this only updates the CQL path?  We'd want to make the Thrift path 
cf-ttl-aware as well.  I *think* this just means updating RowMutation + CF 
addColumn methods.
- Nit: we could simplify getTTL a bit by adding assert ttl > 0.
- I got it backwards: we want max(cf ttl, column ttl) to be able to reason 
about the live-ness of CF data w/o looking at individual rows
- We can break the compaction optimizations into another ticket.  It really 
needs a separate compaction strategy; the idea is if we have an sstable A older 
than CF ttl, then all the data in the file is dead and we can just delete the 
file without looking at it row-by-row.  However, there's a lot of tension there 
with the goal of normal compaction, which wants to merge different versions of 
the same row, so we're going to churn a lot with a low chance of ever having an 
sstable last the full TTL without being merged, effectively restarting our 
timer.  So, I think we're best served by an ArchivingCompactionStrategy that 
doesn't merge sstables at all, just drops obsolete ones, and let people use 
that for append-only insert workloads.  Which is a common enough case that it's 
worth the trouble... probably. :)
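
The "drop obsolete sstables without merging" idea in the last point can be 
sketched roughly as follows. Everything here is hypothetical: SSTableInfo, 
ArchivingSketch, and fullyExpired are names invented for the example, not 
existing Cassandra classes. The sketch also assumes that no column in a file 
outlives its write time plus the CF TTL (for instance, an append-only workload 
that relies only on the CF default), and it ignores tombstones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for per-file metadata; not a real Cassandra class.
final class SSTableInfo
{
    final String name;
    final long maxTimestampSeconds; // time of the newest write in the file

    SSTableInfo(String name, long maxTimestampSeconds)
    {
        this.name = name;
        this.maxTimestampSeconds = maxTimestampSeconds;
    }
}

// Hypothetical sketch of the selection step only; it never merges or rewrites files.
final class ArchivingSketch
{
    // Returns the files whose newest write is already older than the CF TTL.
    static List<SSTableInfo> fullyExpired(List<SSTableInfo> sstables, int cfTtlSeconds, long nowSeconds)
    {
        List<SSTableInfo> expired = new ArrayList<>();
        if (cfTtlSeconds <= 0)
            return expired; // no CF TTL configured: nothing can be dropped wholesale

        for (SSTableInfo sstable : sstables)
        {
            // Newest write plus TTL is in the past, so every older write in the file is too.
            if (sstable.maxTimestampSeconds + (long) cfTtlSeconds <= nowSeconds)
                expired.add(sstable);
        }
        return expired;
    }
}
```

Because only the newest write time per file is consulted, the check never has 
to read the file row by row, which is the point of keeping such files unmerged.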

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2

 Attachments: trunk-3974.txt


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-04-01 Thread Aleksey Vorona (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243831#comment-13243831
 ] 

Aleksey Vorona commented on CASSANDRA-3974:
---

Copying real life use cases which need this feature from the older bug ( 
CASSANDRA-3077 ):

1. I want one of my CFs not to store any data older than two months. It is a 
notifications CF which is of no interest to the user past a certain point in time.
Currently I am setting TTL with each insert in the CF, but since it is a 
constant it makes sense to me to have it configured in CF definition to apply 
automatically to all rows in the CF.

2. Default TTL would be very helpful in Map/Reduce scenarios where you don't 
have direct control of TTL (e.g. Hive)

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Kirk True
Priority: Minor
 Fix For: 1.2


 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-03-27 Thread Leonardo Stern (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239502#comment-13239502
 ] 

Leonardo Stern commented on CASSANDRA-3974:
---

This is related to CASSANDRA-3077. Also very helpful in map/reduce scenarios.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Priority: Minor

 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-03-09 Thread Dave Brosius (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226702#comment-13226702
 ] 

Dave Brosius commented on CASSANDRA-3974:
-

What happens if a column in a ttl'ed column family has a ttl that's longer than 
the cf's ttl? Would that be allowed?

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Priority: Minor

 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.





[jira] [Commented] (CASSANDRA-3974) Per-CF TTL

2012-03-09 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226772#comment-13226772
 ] 

Jonathan Ellis commented on CASSANDRA-3974:
---

It would have to be min(cf ttl, column ttl) to be useful.

 Per-CF TTL
 --

 Key: CASSANDRA-3974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Priority: Minor

 Per-CF TTL would allow compaction optimizations (drop an entire sstable's 
 worth of expired data) that we can't do with per-column.
