[
https://issues.apache.org/jira/browse/FLUME-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664219#comment-13664219
]
Edward Sargisson commented on FLUME-1995:
-----------------------------------------
I think this is a bad idea. I just finished working on a team for 18 months
heavily using Cassandra and know a little about its internal design. Generally
speaking, the advice is that Cassandra is a bad choice for a queue.
Queueing behaviour means that you have some producers adding items and consumers
deleting items. Cassandra doesn't really delete: it's an append-only system, so
a delete means that it creates a tombstone in the latest SSTable. Then,
sometime later, a repair process is run which ensures that all the records are
actually deleted. In the meantime you run the risk of some of the nodes
replying with a 'latest' record that may have been deleted off some other node
but the update hasn't propagated yet.
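
To make the stale-read risk concrete, here is a minimal toy model of the behaviour described above. It is only a sketch: the Replica class, the TOMBSTONE marker and the demo function are hypothetical names invented for illustration, and none of this is Cassandra's actual API or storage format. It assumes two replicas, a delete that has reached only one of them, and "latest timestamp wins" on read.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical marker appended in place of a deleted value (illustration only).
TOMBSTONE = object()


@dataclass
class Replica:
    """Toy append-only replica: writes and deletes both append log entries."""
    log: List[Tuple[int, str, object]] = field(default_factory=list)

    def write(self, ts: int, key: str, value: object) -> None:
        self.log.append((ts, key, value))

    def delete(self, ts: int, key: str) -> None:
        # Nothing is removed from the log; a tombstone is appended instead.
        self.log.append((ts, key, TOMBSTONE))

    def read(self, key: str):
        # The entry with the highest timestamp wins; a tombstone reads as "gone".
        entries = [(ts, value) for ts, k, value in self.log if k == key]
        if not entries:
            return None
        _, value = max(entries, key=lambda entry: entry[0])
        return None if value is TOMBSTONE else value


def demo():
    a, b = Replica(), Replica()
    # A producer enqueues an event; the write reaches both replicas.
    for replica in (a, b):
        replica.write(1, "event-1", "payload")
    # A consumer dequeues it, but the delete has only reached replica a so far.
    a.delete(2, "event-1")
    # Replica a hides the event, replica b still serves it, and a's log grew.
    return a.read("event-1"), b.read("event-1"), len(a.log)
```

In this model, a second consumer that happens to read from replica b would still see the event and process it again, and the log on replica a grows with every delete instead of shrinking.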
If this is not convincing enough then I'll discuss it on the Cassandra list and
bring the results back here.
If you happen to want large scalable queueing then a common solution I've seen
is to use Redis. However, I don't see why you wouldn't use multiple Flume
agents and file channels to solve the same problem.
> CassandraChannel - A Distributed Channel Backed By Apache Cassandra as a
> Persistent Store for Events
> ----------------------------------------------------------------------------------------------------
>
> Key: FLUME-1995
> URL: https://issues.apache.org/jira/browse/FLUME-1995
> Project: Flume
> Issue Type: New Feature
> Components: Channel
> Affects Versions: v1.4.0
> Reporter: Israel Ekpo
> Assignee: Israel Ekpo
>
> Apache Cassandra Channel
> The events received by this channel are queued up in Cassandra to be picked
> up later when sinks send pickup requests to the channel.
> This type of channel is suitable for use cases where recoverability in the
> event of a hardware failure on the agent machine is important.
> The Cassandra cluster can be located on a remote machine.
> Cassandra also supports replication which could back up and replicate the
> events further to other nodes.