[
https://issues.apache.org/jira/browse/CASSANDRA-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249647#comment-13249647
]
Brian ONeill edited comment on CASSANDRA-1311 at 4/9/12 1:56 AM:
-----------------------------------------------------------------
Agreed. I don't think we should include REST in the formal API either; I was
just offering it up as a design pattern for those who need to do more than
fits in a little JavaScript snippet.
We are deep in performance/stress testing right now, and we have two models
working: one where we use synchronous triggers (prior to write), and another
where triggers execute asynchronously after write. Both are useful for
different things: asynchronous where we can't slow down the actual write
(e.g. user interactions), and synchronous when we need to ensure integrity.
Additionally, we see a need for two levels of guarantees. For some of the
triggers, we don't really care if the trigger failed, because we can rely on a
regular map/reduce job to clean up any failed trigger executions. We'd
rather not even have the overhead of a CSCL. The system just needs to execute
the trigger for us (if it can). If it fails, oh well.
For other jobs (synchronous or asynchronous), we need to know when we are in a
bad state, i.e. we need to know if the data is ever out of sync with a
side effect of a trigger. For these scenarios, the overhead of the CSCL is
acceptable. We can see failed trigger executions even in the event of a crash
(e.g. log entries left in a PENDING state for longer than some acceptable time
period are considered failed, and we need to go rectify the situation).
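To make the PENDING check concrete, here is a minimal sketch in Python. The
record layout (TriggerLogEntry and its fields), the state names other than
PENDING, and the find_failed helper are all hypothetical, chosen only for
illustration; nothing here reflects the attached patches.

```python
from dataclasses import dataclass

PENDING = "PENDING"
DONE = "DONE"  # hypothetical terminal state; only PENDING is named above


@dataclass
class TriggerLogEntry:
    # Hypothetical log record: one entry per trigger execution.
    mutation_id: str
    trigger_name: str
    state: str         # PENDING until the trigger finishes, then DONE
    started_at: float  # seconds since the epoch


def find_failed(entries, now, max_pending_seconds):
    """Return entries that have sat in PENDING longer than the allowed period."""
    return [
        e for e in entries
        if e.state == PENDING and now - e.started_at > max_pending_seconds
    ]
```

A periodic job could call find_failed with the current time and hand the
result to whatever repair process applies. Because the check only looks at
state and age, it also catches executions interrupted by a crash.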
Unless there are transactional semantics, I think it suffices to have three
interception points:
# Pre-mutation synchronous (blocking until trigger execution completes)
#* Trigger can add additional mutations
#** (additional columns to a row "in-transaction" seems useful)
#* Trigger can fail the operation
#** (quality/integrity checks)
# Post-mutation synchronous
#* Upon failure, we can signal "trigger failure" to the client suggesting
retry, but it doesn't fail the actual operation
#** (since it's already happened, and we don't want to add rollback)
# Post-mutation asynchronous
#* No influence on the write (obviously), but we need a guarantee that the
trigger executes, or we need to know when it has not.
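The three interception points above can be sketched roughly as follows. This
is an illustration in Python under stated assumptions, not the design in the
attached patches: TriggerDispatcher, TriggerFailure, the dict used as a store,
and the shape of the returned result are all hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


class TriggerFailure(Exception):
    """Raised by a pre-mutation trigger to reject the write (hypothetical)."""


class TriggerDispatcher:
    def __init__(self, store):
        self.store = store    # plain dict standing in for the data store
        self.pre_sync = []    # fn(mutation) -> dict of extra columns or None
        self.post_sync = []   # fn(mutation) -> None; may raise
        self.post_async = []  # fn(mutation) -> None; runs off the write path
        self.pool = ThreadPoolExecutor(max_workers=2)

    def write(self, key, columns):
        mutation = {"key": key, "columns": dict(columns)}

        # 1) Pre-mutation synchronous: a trigger may add columns, or raise
        #    TriggerFailure, in which case nothing is written.
        for trigger in self.pre_sync:
            extra = trigger(mutation)
            mutation["columns"].update(extra or {})

        self.store.setdefault(key, {}).update(mutation["columns"])

        # 2) Post-mutation synchronous: a failure is reported to the caller,
        #    but the write stands (no rollback).
        trigger_failed = False
        for trigger in self.post_sync:
            try:
                trigger(mutation)
            except Exception:
                trigger_failed = True

        # 3) Post-mutation asynchronous: submitted to a pool; the caller gets
        #    futures back and does not wait for them.
        pending = [self.pool.submit(t, mutation) for t in self.post_async]
        return {"applied": True, "trigger_failed": trigger_failed,
                "pending": pending}
```

The sketch leaves out the guarantee side entirely: nothing here records
whether an asynchronous trigger actually ran.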
For each of these, I think there are two levels of guarantees, either:
# You don't necessarily care if ALL executions were successful, you'd rather be
fast
#* (e.g. statistics / analytics that need to be "close-enough")
# You absolutely need to know if data changed and a trigger was unsuccessful in
processing that mutation.
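One way to picture the two levels is a single wrapper that either runs the
trigger with no bookkeeping, or brackets it with a log entry. This is a
minimal Python sketch; run_trigger, the level names, and the dict used as a
log are hypothetical, and a real log would need to be durable.

```python
import time

BEST_EFFORT = "best_effort"  # level 1: fast, failures are tolerated
TRACKED = "tracked"          # level 2: every failure must be discoverable


def run_trigger(trigger, mutation_id, level, log, clock=time.time):
    """Run one trigger and return True if it succeeded.

    `log` is a dict standing in for a durable execution log.
    """
    if level == TRACKED:
        # Record intent first, so a crash mid-trigger leaves a PENDING entry.
        log[mutation_id] = {"state": "PENDING", "started_at": clock()}
    try:
        trigger(mutation_id)
    except Exception:
        # BEST_EFFORT: nothing is recorded; a periodic cleanup job is
        # expected to repair any gap.
        # TRACKED: the entry stays PENDING, which is the failure signal.
        return False
    if level == TRACKED:
        log[mutation_id]["state"] = "DONE"
    return True
```

The best-effort path pays no logging cost at all, which is the point of
keeping the two levels separate.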
random thoughts,
-brian
> Triggers
> --------
>
> Key: CASSANDRA-1311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1311
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Maxim Grinev
> Fix For: 1.2
>
> Attachments: HOWTO-PatchAndRunTriggerExample-update1.txt,
> HOWTO-PatchAndRunTriggerExample.txt, ImplementationDetails-update1.pdf,
> ImplementationDetails.pdf, trunk-967053.txt, trunk-984391-update1.txt,
> trunk-984391-update2.txt
>
>
> Asynchronous triggers are a basic mechanism for implementing various use
> cases of asynchronous execution of application code on the database side,
> for example to support indexes and materialized views, online analytics,
> and push-based data propagation.
> Please find the motivation, triggers description and list of applications:
> http://maxgrinev.com/2010/07/23/extending-cassandra-with-asynchronous-triggers/
> An example of using triggers for indexing:
> http://maxgrinev.com/2010/07/23/managing-indexes-in-cassandra-using-async-triggers/
> Implementation details are attached.