RE: Data platform support

2016-05-17 Thread Ashic Mahtab
If Spark workers are installed on the same nodes as Cassandra nodes, then they can take advantage of data locality, greatly reducing the amount of network IO in Spark jobs. If you use a separate / Cloudera / Hortonworks / EMR cluster, you won't be able to benefit from this. Other than the

Repair schedules for new clusters

2016-05-17 Thread Ashic Mahtab
Hi All, My previous Cassandra clusters had moderate loads, and I'd simply schedule full repairs at different times in the week (but on the same day). That seemed to work ok, but was redundant. In my current project, I'm going to need to care about repair times a lot more, and was wondering what

RE: Read Repair

2015-07-08 Thread Ashic Mahtab
One thing to note is that the exception you get... in this case, you'll get a timeout, not a failure. i.e. as far as Cassandra is concerned, the write is still ongoing - it hasn't failed; but from the client's perspective, it's timed out. In this case (i.e. timeout), the application would

ODBC connector, UDTs and Tableau

2015-05-15 Thread Ashic Mahtab
Hello, I'm playing with DataStax's ODBC connector for Cassandra and have noticed something...well...broken. If I have a keyspace with tables that don't have a UDT column (even though the UDT is created), things work fine. However, the moment I add a table that has a UDT column, nothing works.

RE: Efficient .net client for cassandra

2015-02-24 Thread Ashic Mahtab
Hi Asit, While I haven't used Stratio's Lucene indexing, a few points on the DataStax .NET connector: i) It got a major revamp last year. I'm assuming you're using the latest one? ii) Are prepared statements actually being reused many times? If not, perhaps the overhead isn't worth it. Prepared
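
The reuse question in point ii) can be illustrated with a minimal sketch. It is written in Python rather than .NET, and `FakeSession`, `insert_users`, and the `users` table are hypothetical stand-ins invented for illustration, not any real driver's API. The only point is the shape: prepare once, then bind and execute many times, so the preparation cost is paid a single time.

```python
class FakeSession:
    """Hypothetical stand-in for a driver session; it only counts prepare calls."""

    def __init__(self):
        self.prepare_calls = 0
        self.rows = []

    def prepare(self, cql):
        # With a real driver this step involves a round trip to the cluster,
        # which is the overhead that reuse is meant to amortise.
        self.prepare_calls += 1
        return cql

    def execute(self, prepared, params):
        # Record the bound parameters instead of talking to a database.
        self.rows.append(params)


def insert_users(session, users):
    # Prepare once, outside the loop, then reuse the statement for every row.
    stmt = session.prepare("INSERT INTO users (id, name) VALUES (?, ?)")
    for user_id, name in users:
        session.execute(stmt, (user_id, name))


session = FakeSession()
insert_users(session, [(1, "a"), (2, "b"), (3, "c")])
print(session.prepare_calls, len(session.rows))
```

If a statement is prepared inside the loop instead, the counter grows with every row, which is the case where the overhead may not be worth it.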

Accessing Cassandra Data from Excel / Tableau / R

2015-02-17 Thread Ashic Mahtab
What's a good way to load some Cassandra data (perhaps result of a CQL query) into Excel / Tableau? I see DSE has support, but that's not always an option. Simba do an ODBC connector that currently doesn't support UDTs + collections properly (and it's expensive). Is there a way to use Spark to