[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-29 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-2897:
---

Attachment: 0003-CASSANDRA-2897.txt

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0003-CASSANDRA-2897.txt, 2897-apply-cleanup.txt, 41ec9fc-2897.txt


 Currently, secondary index updates require a read-before-write to maintain 
 index consistency. Keeping the index consistent at all times is not 
 necessary, however. We could let the (secondary) index become inconsistent on 
 writes and repair it on reads. This would be easy because on reads we 
 request the indexed columns anyway, so we can just skip the rows 
 that are not needed and repair the index at the same time.
 This does trade work on writes for work on reads. However, read-before-write 
 is sufficiently costly that it will likely be a win overall.
 There are (at least) two small technical difficulties here, though:
 # If we repair on read, the repair will race with writes, so we'll probably have 
 to synchronize there.
 # We probably shouldn't rely only on reads to repair; we should also have a 
 task to repair the index for things that are rarely read. It's unclear how to 
 make that low impact, though.
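
 As a rough illustration of the repair-on-read idea in the description above, here is a minimal Java sketch. It is a toy in-memory model, not Cassandra code: the class `LazyIndexSketch` and all of its methods are hypothetical names. The coarse `synchronized` methods are only one simple way to avoid the read/write race from the first difficulty, and the background repair task from the second difficulty is not modelled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: one indexed column per row, everything in memory.
// None of these names come from the Cassandra code base.
class LazyIndexSketch {
    // Base data: row key -> current value of the indexed column.
    private final Map<String, String> base = new HashMap<>();
    // Secondary index: indexed value -> row keys that may hold that value.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Write path: the old value is never read, so an old index entry may go stale.
    public synchronized void write(String rowKey, String value) {
        base.put(rowKey, value);
        index.computeIfAbsent(value, v -> new HashSet<>()).add(rowKey);
    }

    // Read path: check each index hit against the base data, skip rows that
    // no longer match, and drop the stale index entry while we are there.
    public synchronized List<String> lookup(String value) {
        List<String> hits = new ArrayList<>();
        Set<String> candidates = index.get(value);
        if (candidates == null) {
            return hits;
        }
        Iterator<String> it = candidates.iterator();
        while (it.hasNext()) {
            String rowKey = it.next();
            if (value.equals(base.get(rowKey))) {
                hits.add(rowKey);
            } else {
                it.remove(); // repair on read
            }
        }
        return hits;
    }

    // Number of index entries stored for a value, stale ones included.
    public synchronized int indexEntries(String value) {
        Set<String> keys = index.get(value);
        return keys == null ? 0 : keys.size();
    }
}
```

 In this model, overwriting a row leaves the old index entry in place until the next lookup for the old value, which is exactly the write-for-read trade the description mentions.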

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2897:
--

Attachment: 2897-v4.txt

v4 pushes *all* index updates into the helper closure, renamed to 
SecondaryIndexManager.Updater.  This cleans up Table.apply even more (no more 
looping to create a redundant Map of updated columns), and allows index 
maintenance during compaction relatively cleanly -- this is added for the first 
time here.
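
To make the "helper closure" idea concrete, here is a small Java sketch of a callback through which both the write path and a compaction-like path report index changes. Everything in it is an assumption for illustration: `IndexUpdaterSketch`, `RecordingUpdater` and `ApplySketch` are hypothetical names, the row is a plain `Map`, and this is not the interface from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical callback shape for illustration; not the real SecondaryIndexManager.Updater.
interface IndexUpdaterSketch {
    void insert(String column, String value);
    void update(String column, String oldValue, String newValue);
    void remove(String column, String value);
}

// Records each callback so the sequence of index operations can be inspected.
class RecordingUpdater implements IndexUpdaterSketch {
    final List<String> log = new ArrayList<>();

    public void insert(String column, String value) {
        log.add("insert " + column + "=" + value);
    }

    public void update(String column, String oldValue, String newValue) {
        log.add("update " + column + ": " + oldValue + " -> " + newValue);
    }

    public void remove(String column, String value) {
        log.add("remove " + column + "=" + value);
    }
}

class ApplySketch {
    // Write path: the in-memory row reports changes as it merges them,
    // so the caller never builds a separate map of updated columns.
    static void apply(Map<String, String> row, String column, String value,
                      IndexUpdaterSketch updater) {
        String old = row.put(column, value);
        if (old == null) {
            updater.insert(column, value);
        } else if (!old.equals(value)) {
            updater.update(column, old, value);
        }
    }

    // Compaction-like path: dropping a column goes through the same callback.
    static void purge(Map<String, String> row, String column, IndexUpdaterSketch updater) {
        String old = row.remove(column);
        if (old != null) {
            updater.remove(column, old);
        }
    }
}
```

The point of the design is that the code merging columns is the only place that knows the old and new values, so handing it a callback avoids collecting them a second time in the caller.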

I note, for the record, that composite indexes make my head hurt 
(CASSANDRA-4586).

I further note that finding the wrong column value being used to create 
dummyColumn in the index-stale block was a *bitch*.  Not sure how your new 
tests passed with that.  Two bugs cancelling out, I guess.  (Similarly, 
dummyColumn needed to be introduced in KeysSearcher since just using the index 
column is wrong even for non-composites, since delete expects a base-data 
column.)

I await news of the new bugs I've introduced. :)

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0003-CASSANDRA-2897.txt, 2897-apply-cleanup.txt, 2897-v4.txt, 41ec9fc-2897.txt




[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2897:
--

Attachment: 2897-apply-cleanup.txt

bq. I prefer Philip's method of pushing down the resolution of indexed values 
from Table.apply

Hmm, I disagree.  The problem is that Phil's method does a lot of extra 
allocation, even when no indexed columns are updated.  (And even when we just 
need the size delta, we move from a no-allocation long to a Pair.)

So I think we need to push the index management down into ACC.

Somewhat orthogonally, attached is a patch to clean out unnecessary code from 
the apply path; we don't need obsolete indexed columns anymore, and deleting an 
indexed range doesn't need special casing either.

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 2897-apply-cleanup.txt, 41ec9fc-2897.txt




[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2897:
--

Reviewer: jbellis
Assignee: Sam Tunnicliffe

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 2897-apply-cleanup.txt, 41ec9fc-2897.txt




[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-17 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-2897:
---

Attachment: 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt

I prefer Philip's method of pushing down the resolution of indexed values from 
Table.apply; IMHO the changes required in his version are less intrusive and 
result in a cleaner and clearer API. I've added a third patch which merges the 
two previous ones. It should apply to trunk (trunk is broken atm so I can't verify 
for sure) and incorporates Philip's pushdown code with my changes to the 
IndexSearcher implementations. It has my changes to SchemaLoader and 
CFMetaDataTest, and merges both versions of ColumnFamilyStoreTest.

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 41ec9fc-2897.txt






[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-16 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2897:
--

Fix Version/s: 1.2

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt






[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-16 Thread Philip Jenvey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Jenvey updated CASSANDRA-2897:
-

Attachment: 41ec9fc-2897.txt

Here's an alternative patch that also tackles just the non-compaction changes 
(it's a little stale, against 41ec9fc).

Briefly looking at Sam's version, I'll note that:

o Mine handles entire row deletions in Memtable

o but it lacks changes to CompositesSearcher/SchemaLoader/CFMetaDataTest 
(though I'm not familiar with these code paths, either)

o in KeysSearcher, I very likely should be using the compare method from 
getValueValidator to check for staleness (instead of naively just calling 
equals)
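
To illustrate the last point, here is a hedged Java sketch of why a type-aware comparison can disagree with a plain `equals` when checking for staleness. The comparator below is an assumption made for the example (values treated as big-endian two's-complement integers of any length); `StalenessCheckSketch` and its methods are hypothetical names and do not reproduce the KeysSearcher code or the validator API.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.Comparator;

class StalenessCheckSketch {
    // Hypothetical validator order: compare values as variable-length integers.
    static final Comparator<ByteBuffer> INTEGER_ORDER =
        (a, b) -> toBigInteger(a).compareTo(toBigInteger(b));

    // Decodes the remaining bytes as a big-endian two's-complement integer.
    // Assumes a non-empty buffer; BigInteger rejects a zero-length array.
    static BigInteger toBigInteger(ByteBuffer buf) {
        byte[] bytes = new byte[buf.remaining()];
        buf.duplicate().get(bytes); // duplicate() leaves the caller's position untouched
        return new BigInteger(bytes);
    }

    // Naive check: any byte-level difference counts as stale.
    static boolean isStaleByEquals(ByteBuffer indexed, ByteBuffer current) {
        return !indexed.equals(current);
    }

    // Type-aware check: stale only if the values differ under the comparator.
    static boolean isStaleByComparator(ByteBuffer indexed, ByteBuffer current) {
        return INTEGER_ORDER.compare(indexed, current) != 0;
    }
}
```

With this comparator, the byte sequences `{1}` and `{0, 1}` both decode to the integer 1, so the naive check would flag a live index entry as stale while the comparator-based check would not.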

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 41ec9fc-2897.txt






[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-15 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-2897:
---

Attachment: 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: secondary_index
 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt

