Hi All
I enabled CDC through the yaml. When I insert 100k small rows, I don't see a CDC
file being created or updated unless I restart the Cassandra service after each
update. However, when I insert rows with 1MB columns, I start to see
CDC files added. I looked at the commit log; they are updated in a
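For reference, enabling CDC in Cassandra is a two-step switch: a node-level flag in cassandra.yaml plus a per-table option. A minimal sketch (setting names as documented for Cassandra 3.8+; the directory path is only an example):

```yaml
# cassandra.yaml: enable CDC on this node
cdc_enabled: true
# where flushed CDC commit log segments are made available (example path)
cdc_raw_directory: /var/lib/cassandra/cdc_raw
```

and per table:

```
-- my_ks.my_table is a hypothetical table name
ALTER TABLE my_ks.my_table WITH cdc = true;
```

Note that CDC segments only appear in cdc_raw once a commit log segment is flushed and discarded, which may explain why many small inserts show nothing until a restart forces the flush, while 1MB rows fill segments quickly.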
RF in the Analytics DC can be 2 (or even 1) if storage cost is more important
than availability. There is a storage (and CPU and network latency) cost for a
separate Spark cluster. So, the variables of your specific use case may swing
the decision in different directions.
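As a concrete illustration of weighing RF per datacenter, NetworkTopologyStrategy lets each DC carry its own replication factor; a sketch (keyspace and DC names here are hypothetical):

```
-- 3 replicas in the main DC for availability,
-- 2 (or even 1) in the Analytics DC to save storage
CREATE KEYSPACE app_data WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'Main': 3,
  'Analytics': 2
};
```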
Sean Durity
At this point, I would be talking to DataStax. They already have Spark and
Solr/search fully embedded in their product. You can look at their docs for
some idea of the RAM and CPU required for combined Search/Analytics use cases.
I would expect this to be a much faster route to production.
I see that nodetool compactionstats reports uncompressed byte size, but does
anyone know why? It seems that for all use cases, the true (compressed) size
would be most useful.
Thanks,
Valerie
Hello,
Seeing this mail thread pop up in my search filter, I want to give some insights
as one of the Arrow PMC members.
I have not yet heard of anyone currently working on Cassandra + Arrow. This
would definitely be a great combination to better support performant clients
that send/receive larger