[
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622809#comment-14622809
]
Alan Boudreault commented on CASSANDRA-6477:
--------------------------------------------
Okay, I've been working on these comparisons but haven't been able to provide
useful results due to an issue I hit. I am running my benchmarks on EC2 with a
3-node cluster. I can get realistic and useful results with stock C* (no MV)
and C* with a secondary index (between 70000 and 85000 op/s). When testing C*
with 1 MV, I get many WriteTimeoutExceptions, which result in a throughput of
100 operations per second. I have been able to reproduce that 100 op/s locally
using a 3-node cluster. The issue doesn't seem to be present when using a
single-node cluster.
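For scale, the gap between those figures can be worked out directly. The snippet below is only a back-of-the-envelope sketch using the numbers quoted above; the `slowdown` helper and the constant names are made up for illustration and are not part of any Cassandra tooling.

```python
# Throughput figures quoted above, in operations per second.
BASELINE_OPS = (70_000, 85_000)  # stock C* and C* with a secondary index
MV_OPS = 100                     # C* with 1 MV, dominated by write timeouts


def slowdown(baseline_ops: float, observed_ops: float) -> float:
    """Return how many times slower the observed rate is than the baseline."""
    return baseline_ops / observed_ops


# Apply the ratio to both ends of the baseline range.
low, high = (slowdown(b, MV_OPS) for b in BASELINE_OPS)
print(f"MV run is roughly {low:.0f}x to {high:.0f}x slower than the baseline")
```

A drop of that size points at a stall (such as the timeouts reported here) rather than at the ordinary write overhead of maintaining a view.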
I've profiled one of the nodes, and it looks like most of the time is spent in
io.netty.channel.epoll.EpollEventLoop.epollWait() (about 75% of the time).
Here's a YourKit snapshot of the first node of the cluster:
http://dl.alanb.ca/CassandraDaemon-cluster-3-nodes-2015-07-10.snapshot.zip
I've attached my users.yaml profile that I am using for testing: [^users.yaml]
Here's the materialized view creation statement:
{code}
CREATE MATERIALIZED VIEW perftesting.users_by_first_name AS SELECT * FROM
perftesting.users PRIMARY KEY (first_name);
{code}
Here's the stress command I've been using:
{code}
cassandra-stress user profile=/path/to/users.yaml ops\(insert=1\) n=5000000
no-warmup -pop seq=1..200M no-wrap -rate threads=200 -node
127.0.0.1,127.0.0.2,127.0.0.3
{code}
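For anyone scripting repeated runs of this command, one option is to build it as an argument list so the parentheses in the ops spec need no shell escaping and the line wrapping above cannot split an argument. This is a minimal sketch, not part of the original report: `build_stress_command` is a hypothetical helper, and it only reassembles the arguments shown in the command above.

```python
import shlex


def build_stress_command(profile, nodes, n=5_000_000, threads=200):
    """Assemble the cassandra-stress invocation above as an argv list.

    Hypothetical helper: it only rearranges the arguments from the
    command quoted in the report and does not run anything.
    """
    return [
        "cassandra-stress", "user",
        f"profile={profile}",
        "ops(insert=1)",            # no backslashes needed outside a shell
        f"n={n}",
        "no-warmup",
        "-pop", "seq=1..200M", "no-wrap",
        "-rate", f"threads={threads}",
        "-node", ",".join(nodes),   # comma-separated node list
    ]


cmd = build_stress_command(
    "/path/to/users.yaml",
    ["127.0.0.1", "127.0.0.2", "127.0.0.3"],
)
# shlex.join re-quotes the list for pasting into a shell (Python 3.8+).
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd)` as-is, which avoids quoting mistakes when the thread count or node list is varied between runs.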
Let me know if I am doing anything wrong or if I can provide anything else to
help. I'll provide the benchmarks as soon as I have a workaround for this issue.
> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
> Issue Type: New Feature
> Components: API, Core
> Reporter: Jonathan Ellis
> Assignee: Carl Yeksigian
> Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh
>
>
> Local indexes are suitable for low-cardinality data, where spreading the
> index across the cluster is a Good Thing. However, for high-cardinality
> data, local indexes require querying most nodes in the cluster even if only a
> handful of rows is returned.