Sounds like you want https://issues.apache.org/jira/browse/CASSANDRA-2045

On Tue, Apr 26, 2011 at 8:38 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> Maybe this idea has been floated before, but I would like to know
> what everyone thinks. We have a huge column family called bigdata,
> let's say 200 GB per node. We use Cassandra as you would expect: we
> never read before writing, and during our bulk loading we get rates
> of about 2000 inserts per second per node. This morning I noticed that
> this CF, on only some nodes, had a lot of reads that went on for hours.
>
> Since our apps should not have been reading, I dove in. A node had
> been down during the bulk load period. As a result, when it came back
> up, the other node holding hints for it started delivering them. The
> problem was that the other node was under high I/O load trying to
> deliver the hints. I see why.
>
> Cassandra does NOT read before write EXCEPT when delivering a hinted handoff.
>
> This is not a good thing. It means the bigger the bigdata CF gets, the
> more intensive delivering hints will be on the sender side. The write
> rate may be 2000 per second, but rows cannot be read back that fast.
>
> I know you can now drop and throttle HH in 0.7.0, but this is not good
> enough: throttling only makes it take longer to get consistent, and
> dropping means you never get consistent. So here is my thinking...
>
> Store hints in separate physical files and/or possibly deliver those
> files by streaming.
>
> Maybe there is already a JIRA out there on this. I just woke up, so to
> me it is an original idea :)
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
