If Spark workers are installed on the same nodes as Cassandra nodes, then they
can take advantage of data locality, greatly reducing the amount of network IO
in Spark jobs. If you use a separate cluster (Cloudera / Hortonworks / EMR),
you won't be able to benefit from this. Other than the
Hi All, My previous Cassandra clusters had moderate loads, and I'd simply
schedule full repairs at different times in the week (but on the same day).
That seemed to work ok, but was redundant. In my current project, I'm going to
need to care about repair times a lot more, and was wondering what
One thing to note is the exception you get: in this case, you'll get a
timeout, not a failure. That is, as far as Cassandra is concerned, the write is
still ongoing; it hasn't failed, but from the client's perspective, it's timed
out. In this case (i.e. timeout), the application would
Hello, I'm playing with DataStax's ODBC connector for Cassandra and have noticed
something...well...broken.
If I have a keyspace with tables that don't have a UDT column (even though the
UDT is created), things work fine. However, the moment I add a table that has a
UDT column, nothing works.
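
To make the two cases concrete, here is a minimal CQL sketch of the schemas being described. The keyspace, type, table and column names are all made up for illustration (none of them come from the original setup), and the replication settings are just a single-node assumption.

```cql
-- Hypothetical names throughout; only the shape of the schema matters.
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Case 1: the UDT is created, but no table has a column of that type.
-- This is the situation reported to work fine through the ODBC connector.
CREATE TYPE IF NOT EXISTS demo.address (street text, city text);

CREATE TABLE IF NOT EXISTS demo.users_plain (
  id uuid PRIMARY KEY,
  name text
);

-- Case 2: a table in the same keyspace with a UDT column.
-- Adding a table like this is the step after which things reportedly break.
CREATE TABLE IF NOT EXISTS demo.users_with_udt (
  id uuid PRIMARY KEY,
  name text,
  home frozen<address>
);
```

Dropping `demo.users_with_udt` again should bring the keyspace back to case 1, which may help confirm whether the UDT column is what the connector trips over.
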
Hi Asit, While I haven't used Stratio's Lucene indexing, a few points on the
DataStax .NET connector:
i) It got a major revamp last year. I'm assuming you're using the latest
one?
ii) Are prepared statements actually being reused many times? If not,
perhaps the overhead isn't worth it. Prepared
What's a good way to load some Cassandra data (perhaps the result of a CQL query)
into Excel / Tableau? I see DSE has support, but that's not always an option.
Simba do an ODBC connector that currently doesn't support UDTs + collections
properly (and it's expensive). Is there a way to use Spark to