> We stop at the memtable if we know that’s all we need. This depends on a lot
> of factors (schema, point read vs slice, etc)
The code seems to search SSTables without checking whether the query
is already satisfied by the memtable alone.
Could you point out the related code snippets for what you
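As a toy illustration of the idea being discussed (this is NOT Cassandra's actual read path — every name below is invented), a point read can short-circuit once the memtable is known to hold the newest version of the requested key:

```python
# Toy model of a memtable-first read path. Not Cassandra's code;
# all classes and names here are invented for illustration only.

class ToyStore:
    def __init__(self):
        self.memtable = {}      # key -> (timestamp, value)
        self.sstables = []      # list of dicts: key -> (timestamp, value)
        self.sstable_reads = 0  # counts how often we had to touch "disk"

    def write(self, key, value, ts):
        self.memtable[key] = (ts, value)

    def flush(self):
        self.sstables.append(self.memtable)
        self.memtable = {}

    def read(self, key):
        # Point read: if the memtable has the key, it is by definition
        # the newest version, so we can stop without touching SSTables.
        if key in self.memtable:
            return self.memtable[key][1]
        # Otherwise merge SSTables, newest timestamp wins.
        self.sstable_reads += 1
        best = None
        for table in self.sstables:
            if key in table and (best is None or table[key][0] > best[0]):
                best = table[key]
        return best[1] if best else None

store = ToyStore()
store.write("a", "old", ts=1)
store.flush()
store.write("a", "new", ts=2)
print(store.read("a"))          # served from the memtable
print(store.sstable_reads)      # no SSTable was consulted
```

For a point read this shortcut is safe; for a slice or a multi-column read the memtable may hold only part of the answer, which is why whether Cassandra can stop early "depends on a lot of factors (schema, point read vs slice, etc)" as stated above.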
On Wed, Jan 9, 2019 at 7:28 AM Durity, Sean R
wrote:
> I think you could consider option C: Create a (new) analytics DC in
> Cassandra and run your Spark nodes there. Then you can address the scaling
> just on that DC. You can also use fewer vnodes, only replicate certain
> keyspaces, etc. in
> I’m still not sure if having tombstones vs. empty values / frozen UDTs
will have the same results.
When in doubt, benchmark.
Good luck,
Jon
On Wed, Jan 9, 2019 at 3:02 PM Tomas Bartalos
wrote:
> Losing atomic updates is a good point, but in my use case it's not a
> problem, since I always
Not sure why they put that in there, it's definitely misleading. There's
nothing Arrow-related in Cassandra.
There's an open JIRA, but nothing has been committed yet:
https://issues.apache.org/jira/browse/CASSANDRA-9259
On Wed, Jan 9, 2019 at 3:48 PM Tomas Bartalos
wrote:
> There is a diagram
There is a diagram on the homepage displaying Cassandra (along with other
storage systems) as a source of data.
https://arrow.apache.org/img/shared.png
Which made me think there should be some integration...
On Thu, 10 Jan 2019, 12:38 am Jonathan Haddad wrote:
> Where are you seeing that it works with Cassandra?
Where are you seeing that it works with Cassandra? There's no mention of
it under https://arrow.apache.org/powered_by/, and the homepage only says
that a Cassandra developer worked on it.
We (unfortunately) don't do anything with it at the moment.
On Wed, Jan 9, 2019 at 3:24 PM Tomas
I’ve read a lot of nice things about the Apache Arrow in-memory columnar format. On
their homepage they mention Cassandra as a possible storage which could
interoperate with Arrow. Unfortunately I was not able to find any working
example which would demonstrate their cooperation.
My use case: I’m
Losing atomic updates is a good point, but in my use case it's not a problem,
since I always overwrite the whole record (no partial updates).
I’m still not sure if having tombstones vs. empty values / frozen UDTs will
have the same results.
When I update one row with 10 null columns it will
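A toy way to see the tombstones-vs-frozen-UDT difference being asked about (an invented model, not Cassandra's storage engine): writing a null into each column produces a tombstone cell that reads must later scan past, while overwriting the record as one frozen blob replaces a single cell with no tombstones at all:

```python
# Toy cell model contrasting per-column nulls (tombstones) with a
# single frozen-blob overwrite. Invented for illustration only;
# this is not Cassandra's storage engine.

def update_with_nulls(row, values):
    """Write every column; a None value becomes a tombstone marker."""
    cells = {}
    for col, val in values.items():
        cells[col] = "TOMBSTONE" if val is None else val
    row.update(cells)
    return sum(1 for v in row.values() if v == "TOMBSTONE")

def update_frozen(row, values):
    """Store the whole record as one frozen blob: one cell written,
    no tombstones, but the entire record is rewritten every time."""
    row.clear()
    row["blob"] = {k: v for k, v in values.items() if v is not None}
    return 0

row = {}
print(update_with_nulls(row, {"a": 1, "b": None, "c": None}))  # 2 tombstones
row2 = {}
print(update_frozen(row2, {"a": 1, "b": None, "c": None}))     # 0 tombstones
```

Whether the read-time cost of the two approaches actually ends up the same is exactly the "when in doubt, benchmark" question raised earlier in the thread.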
Thanks Sean. But what if I want to have both Spark and Elasticsearch with
Cassandra, each as a separate data center? Does that cause any overhead?
On Wed, Jan 9, 2019 at 7:28 AM Durity, Sean R
wrote:
> I think you could consider option C: Create a (new) analytics DC in
> Cassandra and run your spark
I think you could consider option C: Create a (new) analytics DC in Cassandra
and run your Spark nodes there. Then you can address the scaling just on that
DC. You can also use fewer vnodes, only replicate certain keyspaces, etc., in
order to perform the analytics more efficiently.
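Per-keyspace replication settings are what make "only replicate certain keyspaces" possible here. A sketch of the idea (the keyspace and data-center names below are made up; replication factors are examples):

```cql
-- Hypothetical names: replicate the analytics-relevant keyspace into
-- the new DC, and simply leave other keyspaces out of that DC.
ALTER KEYSPACE sensor_data
  WITH replication = {'class': 'NetworkTopologyStrategy',
                      'dc_main': 3, 'dc_analytics': 2};
```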
Sean Durity
You’re comparing single machine key/value stores to a distributed db with a
much richer data model (partitions/slices, statics, range reads, range
deletions, etc). They’re going to read very differently. Instead of explaining
why they’re not like rocks/ldb, how about you tell us what you’re
On Tue, 8 Jan 2019 at 18:29, Jeff Jirsa wrote:
> Given Consul's popularity, seems like someone could make an argument that
> we should be shipping a consul-aware seed provider.
>
Elasticsearch has a very handy dedicated file-based discovery system:
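(The snippet referenced above is cut off in the archive. As a generic illustration — not the author's snippet, and exact paths vary by Elasticsearch version — file-based discovery reads seed hosts from a plain text file that is re-read without a restart:)

```text
# unicast_hosts.txt — one host per line, optionally with a port
10.0.0.1
10.0.0.2:9301
seed-node.example.com
```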
On Tue, 8 Jan 2019 at 18:39, Jeff Jirsa wrote:
> On Tue, Jan 8, 2019 at 8:19 AM Jonathan Ballet wrote:
>
>> Hi Jeff,
>>
>> thanks for answering most of my points!
>> From the reloadseeds' ticket, I followed to
>> https://issues.apache.org/jira/browse/CASSANDRA-3829 which was very
>>