If this is a standard column family, not a CQL3 table, then using CQL3 will
not give you the results you expect.
From cassandra-cli, let's set up some test data:
[default@unknown] create keyspace test;
[default@unknown] use test;
[default@test] create column family test;
[default@test] set
away the main
feature of the NoSQL store? Or am I missing something obvious here?
Regards,
Shahab
On Tue, Jun 4, 2013 at 2:12 PM, Eric Stevens migh...@gmail.com wrote:
If this is a standard column family, not a CQL3 table, then using CQL3
will not give you the results you expect.
From
 k1 | k2  | k3  | k1 | k2  | m
----+-----+-----+----+-----+------------
  0 | abc | def |  0 | abc | 0xdeadbeef
  1 | xyz | uvw |  1 | xyz | 0x8badf00d

cqlsh:test> SELECT * FROM tbl WHERE k1=0;

 k1 | k2  | k3  | k1 | k2  | m
----+-----+-----+----+-----+------------
  0 | abc | def |  0 | abc | 0xdeadbeef
-Eric Stevens
ProtectWise, Inc.
On Wed, Jun 5, 2013 at 9:29 AM, Sorin
column family.
-Eric Stevens
ProtectWise, Inc.
On Thu, Jun 6, 2013 at 9:49 AM, Francisco Andrades Grassi
bigjoc...@gmail.com wrote:
Hi,
CQL3 does now support dynamic columns. For tags or metadata values you
could use a Collection:
http://www.datastax.com/dev/blog/cql3_collections
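As a sketch of what that could look like (table and column names are hypothetical), tags and metadata as CQL3 collections:

CREATE TABLE items (
  id uuid PRIMARY KEY,
  tags set<text>,
  metadata map<text, text>
);
-- add a tag or a metadata entry without rewriting the row:
UPDATE items SET tags = tags + {'urgent'} WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204;
UPDATE items SET metadata['color'] = 'red' WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204;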
For wide
mutating values are a
problem as the collection gets large, or cases where you need to know only
a subset of the collection at a time.
-Eric Stevens
ProtectWise, Inc.
On Thu, Jun 6, 2013 at 10:59 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
The problem about being careful about how much you
Load is the size of the storage on disk, as I understand it. This can
fluctuate during normal usage even if records are not being added or
removed; a node's load may be reduced during compaction, for example.
During compaction, especially if you use Size Tiered Compaction strategy
(the default),
At the DataStax Cassandra Summit 2013 last week, Al Tobey from Ooyala
recommended sstable_size_in_mb be set to 256 MB unless you have a fairly
small data set. The talk was Extreme Cassandra Optimization, and it was
superbly informative, I highly recommend it once DataStax gets the videos
online.
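For reference, that setting is a subproperty of LeveledCompactionStrategy; a hedged example of applying it, with an illustrative keyspace and table name:

ALTER TABLE ks.events
  WITH compaction = { 'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 256 };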
On its face my answer is not... really? What do you view yourself as
getting with this technique versus using built-in replication? As an
example, you lose the ability to do LOCAL_QUORUM vs. EACH_QUORUM
consistency level operations.
Doing replication manually sounds like a recipe for the
Is there a way to replace a failed server using vnodes? I only had
occasion to do this once, on a relatively small cluster. At the time I
just needed to get the new server online and wasn't concerned about the
performance implications, so I just removed the failed server from the
cluster and
It's my understanding that if the first part of the primary key has low
cardinality, you will struggle with cluster balance as (unless
you use WITH COMPACT STORAGE) the first entry of the primary key equates to
the row key from the traditional interface, thus all entries related to
, Eric Stevens migh...@gmail.com wrote:
Is there a way to replace a failed server using vnodes? I only had
occasion
to do this once, on a relatively small cluster. ...
Of course that caused a bunch of key reassignments,
so I'm sure it would be less work for the cluster if I could bring
I wonder if one particular node is having trouble; when you notice the
missing column, what happens if you execute the read manually from cqlsh or
cassandra-cli independently directly on each node?
On Wed, Jul 3, 2013 at 2:00 AM, Blake Eggleston bl...@grapheffect.com wrote:
Hi All,
We're
The following setting is probably not a good idea:
bloom_filter_fp_chance = 1.0
It would disable the bloom filters altogether, and this setting doesn't
have appreciably greater benefits over a setting of 0.1 (which has the
advantage of saving you from disk I/O 90% of the time for keys which
Adding a new node between other nodes would avoid running move, but
the ring would be unbalanced, right? Would this imply having a node
(with a bigger range, 1/2 of the range while the other 2 nodes have 1/4
each, supposing 3 nodes) overloaded? I'm referring
You should be able to set the key_validation_class on the column family to
use a different data type for the row keys. You may not be able to change
this for a CF with existing data without some troubles due to a mismatch of
data types; if that's a concern you'll have to create a separate CF and
If you're creating dynamic columns via Thrift interface, they will not be
reflected in the CQL3 schema. I would recommend not mixing paradigms like
that; either stick with CQL3 or Thrift / cassandra-cli. WITH COMPACT
STORAGE creates column families which can be interacted with meaningfully
via
My understanding is that it is not possible to change the number of tokens
after the node has been initialized. To do so you would first need to
decommission the node, then start it clean with the appropriate num_tokens
in the yaml.
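That is, after decommissioning and wiping the node, something like this in the yaml (value illustrative):

# cassandra.yaml on the rebuilt node
num_tokens: 256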
On Fri, Jul 12, 2013 at 9:17 PM, Radim Kolar h...@filez.com
vnodes currently do not bring any noticeable benefits to outweigh the trouble
The main advantage of vnodes is that it lets you have flexibility with
respect to adding and removing nodes from your cluster without having to
rebalance your cluster (issuing a lot of moves). A shuffle is a lot of
with the shuffle process and boom, like that, out
of disk space.
David
On Tue, Jul 16, 2013 at 8:35 AM, Eric Stevens migh...@gmail.com wrote:
vnodes currently do not bring any noticeable benefits to outweigh the
trouble
The main advantage of vnodes is that it lets you have flexibility with
respect
We've been commissioning some new nodes on a 2.0.10 community edition
cluster, and we're seeing streams that look like they're shipping way more
data than they ought for individual files during bootstrap.
/var/lib/cassandra/data/x/y/x-y-jb-11748-Data.db 3756423/3715409
CASSANDRA-7878, which is fixed
in 2.0.11 / 2.1.1
Mark
On 1 November 2014 14:08, Eric Stevens migh...@gmail.com wrote:
We've been commissioning some new nodes on a 2.0.10 community edition
cluster, and we're seeing streams that look like they're shipping way more
data than they ought for individual files
If this is just for doing tests to make sure you get back the data you
expect, I would recommend looking at some sort of eventually construct in your
testing. We use Specs2 as our testing framework, and our write-then-read
tests look something like this:
someDAO.write(someObject)
eventually {
  // illustrative assertion; the original body was elided in the archive
  someDAO.read(someObject.id) must beSome(someObject)
}
They do not use RAID 10 on the node, and they don't use dual power either,
because it's not cheap in a cluster of many nodes
I think the point here is that money spent on traditional failure avoidance
models is better spent in a Cassandra cluster by instead having more nodes
of less expensive
Wouldn't it be a better idea to issue removenode on the crashed node, wipe
the whole data directory (including system) and let it bootstrap cleanly so
that it's not part of the cluster while it gets back up to speed?
On Tue, Nov 11, 2014, 12:32 PM Robert Coli rc...@eventbrite.com wrote:
On Tue,
You may be able to do something with conditional updates, however trying to
use Cassandra for this kind of coordination smells to me a lot like typical
antipatterns (eg write then read or read then write). You probably would
do better if you need one writer to consistently win a race condition
I'm not aware of a way to query TTL or writetime on collections from CQL
yet. You can access this information from Thrift though.
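To illustrate the CQL side (schema hypothetical): ttl() and writetime() work on scalar columns but are rejected for collection columns:

SELECT ttl(name), writetime(name) FROM users WHERE id = 1;  -- fine for a scalar column
SELECT writetime(tags) FROM users WHERE id = 1;             -- fails: tags is a collection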
On Sat Nov 15 2014 at 12:51:55 AM DuyHai Doan doanduy...@gmail.com wrote:
Why don't you use a map to store write time as value and data as key?
Le 15 nov. 2014
load averages on DC1 nodes are around 3-5 and on DC2 around 7-10
Anecdotally I can say that loads in the 7-10 range have been dangerously
high. When we had a cluster running in this range, the cluster was falling
behind on important tasks such as compaction, and we really struggled to
If the new node never formally joined the cluster (streaming never
completed, it never entered UN state), shouldn't that node be safe to scrub
and start over again? It shouldn't be taking primary writes while it's
bootstrapping, should it?
On Mon Nov 17 2014 at 6:34:04 PM Michael Shuler
You're right that there's no way to use the counter data type to
materialize a view ordered by the counter. Computing this post hoc is the
way to go if your needs allow for it (if not, something like Summingbird or
vanilla Storm may be necessary).
I might suggest that you make your primary key
.
Thanks for your response.
Robert
On Nov 24, 2014, at 9:40 AM, Eric Stevens migh...@gmail.com wrote:
You're right that there's no way to use the counter data type to
materialize a view ordered by the counter. Computing this post hoc is
the way to go if your needs allow
Consider adding a log_bucket timestamp column, and then indexing that column. Your
data loader can SELECT * FROM logs WHERE log_bucket = ?. The value you
supply there would be the timestamp log bucket you're processing - in your
case logged_at % 5.
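A sketch of that approach (schema details assumed):

ALTER TABLE logs ADD log_bucket timestamp;
CREATE INDEX logs_by_bucket ON logs (log_bucket);
-- the loader then grabs one bucket at a time:
SELECT * FROM logs WHERE log_bucket = '2014-11-25 12:05:00';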
However, I'll caution against writing data to Cassandra
A lot of people do a lot of multi-threaded work with Datastax Java Driver.
It looks like you're using Cassandra Driver 2.0.0-RC2, might I suggest as a
first step, at least upgrade to 2.0.0 final? RC2 wasn't even the final
release candidate for 2.0.0.
On Wed Nov 26 2014 at 8:44:07 AM Brian Tarbox
There's no reason you can't run on multiple cloud providers as long as you
treat them as logically distinct DC's. It should largely work the same way
as running in several AWS regions, but you'll need to use something
like GossipingPropertyFileSnitch
because the EC2 snitches are specific to AWS.
Be careful with creating many dynamically created column families unless
you're cleaning up old ones to keep the total number of CF's reasonable.
Having many column families will increase memory pressure and reduce
overall performance.
On Thu Nov 27 2014 at 8:19:35 AM DuyHai Doan
The underlying write time is still tracked for each value in the collection
- it's part of how conflict resolution is managed - but it's not exposed
through CQL.
On Fri Nov 28 2014 at 4:18:47 AM Batranut Bogdan batra...@yahoo.com wrote:
Hello all,
If one has a table like this:
id text,
ts
@Jens,
will inactive CFs be released from C*'s memory after e.g. a few days
or when under resource pressure?
No, certain memory structures are allocated and will remain resident on
each node for as long as the table exists.
These CFs are used as time buckets, but are to be kept for speedy
How does the difference in load compare to the effective ownership? If you
deleted the system directory as well, you should end up with new ranges, so
I'm wondering if perhaps you just ended up with a really bad shuffle. Did
you run removenode on the old host after you took it down (I assume so
Do you have snapshot_before_compaction enabled?
http://datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html#reference_ds_qfg_n1r_1k__snapshot_before_compaction
On Wed Dec 03 2014 at 10:25:12 AM Robert Wille rwi...@fold3.com wrote:
I built my first
high correlation.
I think the moral of the story is that I shouldn’t delete the system
directory. If I have issues with a node, I should recommission it properly.
Robert
On Dec 3, 2014, at 10:23 AM, Eric Stevens migh...@gmail.com wrote:
How does the difference in load compare
B would work better in the case where you need to do sequential or ranged
style reads on the id, particularly if id has any significant sparseness
(eg, id is a timeuuid). You can compute the buckets and do reads of entire
buckets within your range. However if you're doing random access by id,
The official recommendation is 100k:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html
I wonder if there's an advantage to this over unlimited if you're running
servers which are dedicated to your Cassandra cluster (which you should be
for
It depends on the size of your data, but if your data is reasonably small,
there should be no trouble including thousands of records on the same
partition key. So a data model using PRIMARY KEY ((seq_id), seq_type)
ought to work fine.
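That is, something along these lines (column types assumed):

CREATE TABLE seq (
  seq_id text,
  seq_type text,
  data blob,
  PRIMARY KEY ((seq_id), seq_type)
);
SELECT * FROM seq WHERE seq_id = 'abc';  -- one partition, all seq_types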
If the data size per partition exceeds some threshold that
Based on recent conversations with Datastax engineers, the recommendation
is definitely still to run a finite and reasonable set of column families.
The best way I know of to support multitenancy is to include tenant id in
all of your partition keys.
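A minimal sketch of that pattern (names illustrative):

CREATE TABLE events (
  tenant_id uuid,
  event_id timeuuid,
  payload text,
  PRIMARY KEY ((tenant_id), event_id)
);
-- every read is naturally scoped to a single tenant:
SELECT * FROM events WHERE tenant_id = ?;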
On Fri Dec 05 2014 at 7:39:47 PM Kai Wang
...@gmail.com wrote:
On Sat, Dec 6, 2014 at 11:18 AM, Eric Stevens migh...@gmail.com wrote:
It depends on the size of your data, but if your data is reasonably
small, there should be no trouble including thousands of records on the
same partition key. So a data model using PRIMARY KEY ((seq_id
Hi Joy,
Are you resetting your data after each test run? I wonder if your tests
are actually causing you to fall behind on data grooming tasks such as
compaction, and so performance suffers for your later tests.
There are *so many* factors which can affect performance, without reviewing
test
, you're probably already in insane
and difficult to recover crisis mode).
On Sun Dec 07 2014 at 8:55:47 AM Eric Stevens migh...@gmail.com wrote:
Hi Joy,
Are you resetting your data after each test run? I wonder if your tests
are actually causing you to fall behind on data grooming tasks
with full CQL syntax.) would be very
helpful. I mean, Cassandra has no “subset” concept, nor a “load subset”
command, so what are we really talking about?
Also, I presume we are talking CQL, but some of the references seem more
Thrift/slice oriented.
-- Jack Krupansky
*From:* Eric Stevens migh
calculate.
Could you please describe in detail about your test deployment?
Thank you very much,
Joy
2014-12-07 23:55 GMT+08:00 Eric Stevens migh...@gmail.com:
Hi Joy,
Are you resetting your data after each test run? I wonder if your
tests are actually causing you to fall behind on data
Writing then immediately reading the same data (or reading then immediately
writing) are both antipatterns in any eventually consistent system,
Cassandra included.
You may need to investigate Compare and Set operations and see if they will
work for your needs. Or else look into Serial
We're considering moving to a model where we put each of our tables in a
dedicated keyspace. This is so we can tune replication per table, and
change our mind about that replication on a per-table basis without a major
migration. The biggest driver for this is Solr integration, we want to
tune
at 11:21 AM, Eric Stevens migh...@gmail.com wrote:
We're considering moving to a model where we put each of our tables in a
dedicated keyspace. This is so we can tune replication per table, and
change our mind about that replication on a per-table basis without a major
migration. The biggest
Jon,
The really important thing to really take away from Ryan's original post
is that batches are not there for performance.
tl;dr: you probably don't want batch, you most likely want many async
calls
My own rudimentary testing does not bear this out - at least not if you
mean to say that
You can see what the partition key strategies are for each of the tables,
test5 shows the least improvement. The set (aid, end) should be unique,
and bckt is derived from end. Some of these layouts result in clustering
on the same partition keys, that's actually tunable with the ~15 per
bucket
, Eric Stevens migh...@gmail.com wrote:
You can see what the partition key strategies are for each of the tables,
test5 shows the least improvement. The set (aid, end) should be unique,
and bckt is derived from end. Some of these layouts result in clustering
on the same partition keys, that's
code in a gist or something? I can't really talk
about your benchmark without seeing it and you're basing your stance on the
premise that it is correct, which it may not be.
On Sat Dec 13 2014 at 8:45:21 AM Eric Stevens migh...@gmail.com wrote:
You can see what the partition key strategies
to model my application.
On Sat, Dec 13, 2014 at 10:58 AM, Eric Stevens migh...@gmail.com wrote:
Isn't the net effect of coordination overhead incurred by batches
basically the same as the overhead incurred by RoundRobin or other
non-token-aware request routing? As the cluster size increases
), end, proto) reverse order = 25,163,064,000
traverse test5 ((aid, bckt, end)) = 30,233,744,000
On Sat, Dec 13, 2014 at 11:07 AM, Jonathan Haddad j...@jonhaddad.com wrote:
On Sat Dec 13 2014 at 10:00:16 AM Eric Stevens migh...@gmail.com wrote:
Isn't the net
be reading it wrong.
Sorry I don't have more time to debug the script. Any of the above ideas
apply?
Jon
On Mon Dec 15 2014 at 1:11:43 PM Eric Stevens migh...@gmail.com wrote:
Unfortunately my Scala isn't the best so I'm going to have to take a
little bit to wade through the code.
I
No, deletes are always written as a tombstone no matter the consistency.
This is because data at rest is written to sstables which are immutable
once written. The tombstone marks that a record in another sstable is now
deleted, and so a read of that value should be treated as if it doesn't
exist.
You should be able to use Cassandra's built in tooling for sure. But just
be aware that restoring from a backup of the data will be a lot faster and
won't introduce any stress on the existing cluster. Repair and replace
operations aren't free to the other nodes, so an offline backup and restore
is
If you're just trying to get your feet wet with distributed software, and
your node count is going to be reasonably low and won't grow any time soon,
it's probably easier to just install it yourself rather than trying to also
learn how to use software deployment technologies like puppet or chef.
As Ryan mentioned, CQL is simply a translation layer to the underlying
storage mechanism you're already familiar with from Thrift.
There are definitely corner cases where it's not possible to get a
one-for-one equivalent in CQL vs Thrift, and even when there's equivalents,
the underlying data
Timestamps are timezone independent. This is a property of timestamps, not
a property of Cassandra. A given moment is the same timestamp everywhere in
the world. To display this in a human readable form, you then need to know
what timezone you're attempting to represent the timestamp as, this is
I would suggest enabling tracing in cqlsh and see what it has to say.
There are many things which could cause this, but I'm thinking in
particular you may have a lot of tombstones which get lifted when you read
the whole row, and are missed when you read just one column.
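For example, in cqlsh (names hypothetical):

TRACING ON;
SELECT * FROM ks.tbl WHERE key = 'x';        -- whole row
SELECT one_col FROM ks.tbl WHERE key = 'x';  -- single column
-- compare the traces, particularly the live vs. tombstone cell counts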
On Fri, Dec 26, 2014 at
. Is that anyway we can avoid it and
Cassandra assume the current time of the server?
Thanks
Ajay
On Dec 26, 2014 10:50 PM, Eric Stevens migh...@gmail.com wrote:
Timestamps are timezone independent. This is a property of timestamps,
not a property of Cassandra. A given moment is the same
I think Joanne is talking not about bulk loading, but about general
access as in any standard client driver.
Joanne, this is a pretty broad topic. You would need to have some part of a
website built in some language such as Python or Java or some other
language. Then you would use an
Can you send us your exact data model? Even though you normally use
Thrift, you may also be able to access the data from CQL, and if so, query
tracing is a very powerful feature in CQL which may describe why there is a
performance difference.
Do you do deletes of data? If so, tombstones really
This is a bit difficult. Depending on your access patterns and data
volume, I'd be inclined to keep a separate table with a (count,
foreign_key) clustering key. Then do a client-side join to read the data
back in the order you're looking for. That will at least make the heavily
updated table
If the counters get incorrect, it couldn't be corrected
You'd have to store something that allowed you to correct it. For example,
the TimeUUID approach to keep true counts, which are slow to read but
accurate, and a background process that trues up your counter columns
periodically.
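A sketch of that TimeUUID approach (schema assumed): each increment becomes an insert, and the true count is a COUNT over the partition:

CREATE TABLE count_events (
  key text,
  event_id timeuuid,
  PRIMARY KEY ((key), event_id)
);
INSERT INTO count_events (key, event_id) VALUES ('page-1', now());
SELECT count(*) FROM count_events WHERE key = 'page-1';  -- accurate but slow to read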
On Mon,
So while not exactly the same, this seems like a good analogy for
suggesting a third interface to fix problems with existing interfaces:
http://xkcd.com/927/
Even if the CQL parsing code in Cassandra is subpar (I haven't studied it),
that's not an especially compelling case to suggest replacing
tombstones (say by
default 20 days).
Thanks
Ajay
On Mon, Dec 29, 2014 at 7:47 PM, Eric Stevens migh...@gmail.com wrote:
If the counters get incorrect, it couldn't be corrected
You'd have to store something that allowed you to correct it. For
example, the TimeUUID approach to keep true
And also stored entirely for each UPDATE. Change one element,
re-serialize the whole thing to disk.
Is this true? I thought updates (adds, removes, but not overwrites)
affected just the indicated columns. Isn't it just the reads that involve
reading the entire collection?
DS docs talk about
Can this split and combine be done automatically by Cassandra when
inserting/fetching the file, without the application being bothered about it?
There are client libraries which offer recipes for this, but in general,
no.
You're trying to do something with Cassandra that it's not designed to do.
You
are you really recommending I throw 4 years of work out and completely
rewrite code that works and has been tested?
Our codebase was about 3 years old, and we finished migrating it to CQL not
that long ago. It can definitely be frustrating to have to touch stable
code to modernize it. Our
If you're concerned about impacting production performance, the steps of
compacting and sstable2json will almost certainly also cause performance
problems if performed on the same hardware. You won't get away from a
production performance impact as long as you're using production hardware.
If
If you are doing only writes and no reads, then 'cold_reads_to_omit' is
probably preventing your cluster from crossing a threshold where it decides
it needs to engage in compaction. Setting it to 0.0 should fix this, but
remember that you tuned it, so that you can revert it to the default later.
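For reference, cold_reads_to_omit is a SizeTieredCompactionStrategy subproperty; a hedged example of overriding it (keyspace and table name illustrative):

ALTER TABLE ks.write_only_table
  WITH compaction = { 'class': 'SizeTieredCompactionStrategy', 'cold_reads_to_omit': 0.0 };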
Using Cassandra triggers is a fairly dangerous proposition, and is
generally not recommended. It's probably a better idea to load your
search data with a separate process.
On Mon, Jan 26, 2015 at 11:42 AM, Brian Sam-Bodden bsbod...@integrallis.com
wrote:
I did an little experiment
My understanding is consistent with Alain's, there's no way to force a
tombstone-only compaction, your only option is major compaction. If you're
using size tiered, that comes with its own drawbacks.
I wonder if there's a technical limitation that prevents introducing a
shadowed data cleanup
...@gmail.com wrote:
That's actually GREAT news!! + Solr will give a lot of features to
Cassandra!
But while waiting for this huge feature (and wanted for a lot of users I
guess)
I guess that Prefix search will also be useful for using geohash...
2015-01-26 18:12 GMT+01:00 Eric Stevens migh
I don't have directly relevant advice, especially WRT getting a meaningful
and coherent subset of your production data - that's probably too closely
coupled with your business logic. Perhaps you can run a testing cluster
with a default TTL on all your tables of ~2 weeks, feeding it with real
for a particular rowKey
Thanks, it does.
How about in astyanax?
*From:* Eric Stevens [mailto:migh...@gmail.com]
*Sent:* Tuesday, February 03, 2015 1:49 PM
*To:* user@cassandra.apache.org
*Subject:* Re: Smart column searching for a particular rowKey
WHERE + ORDER DESC + LIMIT
Just a minor observation: those field names are extremely long. You store
a copy of every field name with every value, with only a couple of
exceptions:
http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architecturePlanningUserData_t.html
Your partition key column name
Colin, I'm not familiar with Ceph, but it sounds like it's a more
sophisticated version of a SAN.
Be aware that running Cassandra on absolutely anything other than local
disks is an anti-pattern. It will have a profound negative impact on
performance, scalability, and reliability of your
WHERE + ORDER DESC + LIMIT should be able to accomplish that.
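Concretely, assuming a schema roughly like the one described (names hypothetical):

CREATE TABLE ticks (
  stock text,
  ts timestamp,
  data blob,
  PRIMARY KEY ((stock), ts)
);
SELECT * FROM ticks WHERE stock = 'GOOGLE' ORDER BY ts DESC LIMIT 10;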
On Tue, Feb 3, 2015 at 11:28 AM, Ravi Agrawal ragra...@clearpoolgroup.com
wrote:
Hi Guys,
Need help with this.
My rowKey is stockName like GOOGLE, APPLE.
Columns are sorted as per timestamp and they include some set of data
I'm struggling to think of a model where it makes sense to update a primary
key as a typical operation. It suggests, as Adil said, that you may be
reasoning wrong about your data model. Maybe you can explain your problem
in more detail - what kind of thing has you updating your PK on a regular
policy for old user
names. For example, can they be reused, or are they locked, or... whatever.
-- Jack Krupansky
On Sun, Feb 8, 2015 at 1:48 AM, Ajaya Agrawal ajku@gmail.com wrote:
On Sun, Feb 8, 2015 at 5:03 AM, Eric Stevens migh...@gmail.com wrote:
I'm struggling to think of a model
It seems like you should be able to solve it with two more queries
immediately after your first query:
SELECT * FROM timeseries WHERE tstamp < ${MIN(firstQuery.tstamp)} LIMIT 1
SELECT * FROM timeseries WHERE tstamp > ${MAX(firstQuery.tstamp)} LIMIT 1
On Tue, Jan 13, 2015 at 9:31 AM, Hugo José
@DENIZ, Jon's point is that CQL is the new standard, Thrift is frozen and
being deprecated. Anything you build using the Thrift interface will hurt
you over time, so you ought to just go for CQL. There really is next to no
reason not to use CQL aside from personal preference, and that argument
Yes, many sstables can have a huge negative impact on read performance, and
will also create memory pressure on that node.
There are a lot of things which can produce this effect, and it strongly
also suggests you're falling behind on compaction in general (check
nodetool compactionstats, you should
compactors do not seem to help). Oddly
enough, one node has just 160 SSTables while the rest are at 500-600
tables.
Is size-tiered compaction easier on the CPU than leveled compaction?
Thanks,
William
*From:* Eric Stevens [mailto:migh...@gmail.com]
*Sent:* den 12 januari 2015 14:51
@Rob - he's probably referring to the thread titled Reasons for nodes not
compacting? where Tyler speculates that the tables are falling below the
cold read threshold for compaction. He speculated it may be a bug. At the
same time in a different thread, Roland had a similar problem, and Tyler's
:
Thanks for the reply. The bootstrap of new node put a heavy burden on the
whole cluster and I don't know why. So that' the issue I want to fix
actually.
On Mon, Jan 12, 2015 at 6:08 AM, Eric Stevens migh...@gmail.com wrote:
Yes, but it won't do what I suspect you're hoping for. If you
Ah.. six replicas. At least it's super inexpensive that way (sarcasm!)
Well it's up to you to decide what your data locality and fault tolerance
requirements are.
If you want to run two DC's, costs are going to increase since each DC has
a full set of replicas within itself. But you get the
It depends on your version of Cassandra. I would suggest starting with
this, which describes the differences between 2.0 and 2.1
http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1
In particular:
In previous releases, this cache has required storing the entire
partition in memory,
I've seen removenode hang indefinitely also (per CASSANDRA-6542).
Generally speaking, if a node is in good health and you want to take it out
of the cluster for whatever reason (including the one you mentioned),
nodetool decommission is a better choice. Removenode is for when a node is
see this only on testing cluster).
It looks to me that compactions were not triggered. I tried a nodetool
compact on one node overnight - but that crashed the entire node.
Roland
Am 15.01.2015 um 19:14 schrieb Eric Stevens:
Yes, many sstables can have a huge negative impact on read performance
Note that getAllRows() is deprecated in Astyanax (see here
https://github.com/Netflix/astyanax/wiki/Getting-Started#iterate-through-the-entire-keyspace-deprecated
).
You should prefer to use the AllRowsReader recipe:
https://github.com/Netflix/astyanax/wiki/AllRowsReader-All-rows-query
Note the
If you're getting partial data back, then failing eventually, try setting
.withCheckpointManager() - this will let you keep track of the token ranges
you've successfully processed, and not attempt to reprocess them. This
will also let you set up tasks on bigger data sets that take hours or days
Check out
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_tunable_consistency_c.html
Cassandra 2.0 uses the Paxos consensus protocol, which resembles 2-phase
commit, to support linearizable consistency. All operations are quorum-based
...
This kicks in whenever you do CAS
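For example (table and values illustrative), CAS operations in CQL look like:

INSERT INTO users (username, email) VALUES ('jdoe', 'jdoe@example.com') IF NOT EXISTS;
UPDATE accounts SET balance = 90 WHERE id = 42 IF balance = 100;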
AFAIK yes. If you want just a subset of the metrics, I would suggest
exporting them all, and filtering on the Graphite side.
On Wed, Feb 11, 2015 at 6:54 AM, Erik Forsberg forsb...@opera.com wrote:
Hi!
I was pleased to find out that cassandra 2.0.x has added support for
pluggable metrics