[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224998#comment-15224998
 ] 

Joshua McKenzie edited comment on CASSANDRA-8844 at 4/7/16 9:40 PM:
--------------------------------------------------------------------

v1 is ready for review.

h5. General outline of changes in the patch
* CQL syntax changes to support CDC:
** CREATE KEYSPACE ks WITH replication... AND cdc_datacenters={'dc1','dc2'...}
** ALTER KEYSPACE ks DROP CDCLOG;
*** Cannot drop keyspaces w/CDC enabled without first disabling CDC.
* Changes to Parser.g to support sets being converted into maps. Reference 
normalizeSetOrMapLiteral, cleanMap, cleanSet
* Statement changes to support new keyspace param for Option.CDC_DATACENTERS
* Refactored {{CommitLogReplayer}} into {{CommitLogReplayer}}, 
{{CommitLogReader}}, and {{ICommitLogReadHandler}} in preparation for having a 
CDC consumer that needs to read commit log segments.
* Refactored commit log versioned deltas from various read* methods into 
{{CommitLogReader.CommitLogFormat}}
* Renamed {{ReplayPosition}} to {{CommitLogSegmentPosition}} (this is 
responsible for quite a bit of noise in the diff - sorry)
* Refactored {{CommitLogSegmentManager}} into:
** {{AbstractCommitLogSegmentManager}}
** {{CommitLogSegmentManagerStandard}}
*** Old logic for alloc (always succeed, block on allocate)
*** discard (delete if true)
*** unusedCapacity check (CL directory only)
** {{CommitLogSegmentManagerCDC}}
*** Fail alloc if atCapacity. We have an extra couple of atomic checks on the 
critical path for CDC-enabled (size + cdc overflow) and fail allocation if 
we're at limit. CommitLog now throws WriteTimeoutException when an allocation 
returns null, which the standard manager should never do as it loops 
indefinitely in {{advanceAllocatingFrom}}.
*** Move files to cdc overflow folder as configured in yaml on discard
*** unusedCapacity includes the lazily calculated size of CDC overflow as well. 
See DirectorySizerBench.java for why I went w/a separate thread to lazily 
calculate size of overflow instead of doing it sync on failed allocation
*** Separate size limit configured in cassandra.yaml for CDC and CommitLog so 
they each have their own unusedCapacity checks. Went with 1/8th disk or 4096 on 
CDC as default, putting it at 1/2 the size of CommitLog.
* Refactored buffer management portions of {{FileDirectSegment}} into 
{{SimpleCachedBufferPool}}, owned by a {{CommitLogSegmentManager}} instance
** There's considerable logical overlap between this and BufferPool in general, 
though this is considerably simpler and purpose-built. I'm personally ok 
leaving it separate for now given its simplicity.
* Some other various changes and movements around the code-base related to this 
patch ({{DirectorySizeCalculator}}, some javadoccing, typos I came across in 
comments or variable names while working on this, etc)
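
A minimal sketch of the fail-at-capacity check described for {{CommitLogSegmentManagerCDC}} above, not the patch's code. The class, field, and method names are all hypothetical; the only ideas taken from the comment are the two atomic reads (segment size plus lazily updated overflow size) and failing the allocation instead of blocking.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names come from the patch. */
final class CdcCapacitySketch {
    private final long limitBytes;                              // assumed CDC size limit from yaml
    private final AtomicLong segmentBytes = new AtomicLong();   // bytes held in live CDC segments
    private final AtomicLong overflowBytes = new AtomicLong();  // last known size of the overflow dir

    CdcCapacitySketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Called from a separate sizing thread, so the write path never walks the directory. */
    void updateOverflowSize(long bytes) {
        overflowBytes.set(bytes);
    }

    /** Reserves space and returns true if under the limit; returns false so the caller can fail the write. */
    boolean tryAllocate(long size) {
        while (true) {
            long used = segmentBytes.get();
            if (used + overflowBytes.get() + size > limitBytes) {
                return false; // at capacity: fail rather than block
            }
            if (segmentBytes.compareAndSet(used, used + size)) {
                return true;
            }
        }
    }
}
```

In this sketch a caller that gets {{false}} back would surface the failure to the client, which is the role WriteTimeoutException plays in the description above.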

h5. What's not yet done:
* Consider running all / relevant CommitLog related unit tests against a 
CDC-based keyspace
* Performance testing (want to confirm that added determination of which 
{{CommitLogSegmentManager}} during write path is negligible impact along w/2 
atomic checks on CDC write-path)
* dtests specific to CDC
* fallout testing on CDC
* Any code-changes to specifically target supporting a consumer following a CDC 
log as it's being written in CommitLogReader / ICommitLogReader. A requester 
should be able to trivially handle that with the 
{{CommitLogReader.readCommitLogSegment}} signature supporting 
{{CommitLogSegmentPosition}} and {{mutationLimit}}, however, so I'm leaning 
towards not further polluting CommitLogReader / C* and keeping that in the 
scope of a consumption daemon
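
As a rough illustration of how a consumer could follow a log using only a start position and a mutation limit, here is a sketch under stated assumptions. It does not use the real {{CommitLogReader.readCommitLogSegment}} signature, which this comment does not spell out; {{Position}}, {{readBatch}}, and the in-memory list standing in for a segment are all hypothetical.

```java
import java.util.List;

/** Hypothetical sketch of resumable, bounded reads; not the CommitLogReader API. */
final class TailingSketch {
    /** Stand-in for a segment position: just an index into the log. */
    static final class Position {
        final int offset;

        Position(int offset) {
            this.offset = offset;
        }
    }

    /**
     * Copies at most mutationLimit entries, starting at start, into out
     * and returns the position to resume from on the next call.
     */
    static Position readBatch(List<String> log, Position start, int mutationLimit, List<String> out) {
        long wanted = (long) start.offset + mutationLimit;
        int end = (int) Math.max(start.offset, Math.min((long) log.size(), wanted));
        for (int i = start.offset; i < end; i++) {
            out.add(log.get(i));
        }
        return new Position(end);
    }
}
```

A consumption daemon would persist the returned position and call {{readBatch}} again later; entries appended in the meantime are picked up from where the previous call stopped.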

h5. Special point of concern:
* This patch changes us from an implicit singleton view of 
{{CommitLogSegmentManager}} to having multiple CommitLogSegmentManagers managed 
under the CommitLog. There have been quite a few places where I've come across 
undocumented assumptions that we only ever have 1 logical object allocating 
segments (the latest being FileDirectSegment uncovered by 
CommitLogSegmentManagerTest). I plan on again checking the code to make sure 
the new "calculate off multiple segment managers" view of some of the things 
exposed in the CommitLog interface doesn't violate their contract now that 
there's no longer single CLSM-atomicity on those results.
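
To make the atomicity concern concrete, a small sketch of a value computed across several managers. The names are hypothetical and the counters merely stand in for whatever each segment manager reports.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; the counters stand in for per-manager state. */
final class MultiManagerViewSketch {
    /**
     * Sums one counter per manager. Each get() is atomic on its own, but the
     * total is not a snapshot: a manager can change after it has been read
     * and before the loop finishes.
     */
    static long totalOnDiskSize(List<AtomicLong> perManagerSizes) {
        long total = 0;
        for (AtomicLong size : perManagerSizes) {
            total += size.get();
        }
        return total;
    }
}
```

With a single manager the read is one atomic value; with two or more, any caller that relied on that guarantee needs to tolerate a total assembled from different instants.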

h5. Known issues:
* dtest is showing a pretty consistent error w/an inability to find a cdc 
CommitLogSegment during recovery that looks to be unique to the dtest env
* a few failures left in testall
* intermittent failure in the new {{CommitLogSegmentManagerCDCTest}} (3/150 
runs - on Windows, so I haven't yet ruled out an env. issue w/the testing)

[~blambov]: while [~carlyeks] is primary reviewer on this and quite familiar 
with the changes as he worked w/me on the design process, I'd also appreciate 
it if you could provide a backup pair of eyes and look over the CommitLog 
changes, CommitLogReplayer refactor, and CommitLogSegmentManagerCDC changes 
since you've done a good bit of work on the CommitLog subsystem and it should 
be familiar to you.

||branch||testall||dtest||
|[8844|https://github.com/josh-mckenzie/cassandra/tree/8844_review]|[testall|http://cassci.datastax.com/view/Dev/view/josh-mckenzie/job/josh-mckenzie-8844_review-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/josh-mckenzie/job/josh-mckenzie-8844_review-dtest]|

Targeting 3.6 for this so we have 4 weeks until freeze.

edit: had a hiccup w/params on the ci jobs, so updating branch name and job 
links



> Change Data Capture (CDC)
> -------------------------
>
>                 Key: CASSANDRA-8844
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Coordination, Local Write-Read Paths
>            Reporter: Tupshin Harper
>            Assignee: Joshua McKenzie
>            Priority: Critical
>             Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden on users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a 
> commitlog, with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters, 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - It should be possible to write a sophisticated daemon that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach outlined above
> - Instead of writing to a logfile, by default, Cassandra could expose a 
> socket for a daemon to connect to, and from which it could pull each row.
> - Cassandra would have a limited buffer for storing rows, should the listener 
> become backlogged, but it would immediately spill to disk in that case, never 
> incurring large in-memory costs.
> h2. Additional consumption possibility
> With all of the above, still relevant:
> - instead of (or in addition to) using the other logging mechanisms, use CQL 
> transport itself as a logger.
> - Extend the CQL protocol slightly so that rows of data can be returned to a 
> listener that didn't explicitly make a query, but instead registered itself 
> with Cassandra as a listener for a particular event type, and in this case, 
> the event type would be anything that would otherwise go to a CDC log.
> - If there is no listener for the event type associated with that log, or if 
> that listener gets backlogged, the rows will again spill to the persistent 
> storage.
> h2. Possible Syntax
> {code:sql}
> CREATE TABLE ... WITH CDC LOG
> {code}
> Pros: No syntax extensions
> Cons: doesn't make it easy to capture the various permutations (I'm happy to 
> be proven wrong) of per-dc logging. Also, the hypothetical multiple logs per 
> table would break this
> {code:sql}
> CREATE CDC_LOG mylog ON mytable WHERE MyUdf(mycol1, mycol2) = 5 with 
> DCs={'dc1','dc3'}
> {code}
> Pros: Expressive and allows for easy DDL management of all aspects of CDC
> Cons: Syntax additions. Added complexity, partly for features that might not 
> be implemented


