Hi!
In the DataStax
documentation (http://www.datastax.com/docs/1.0/ddl/column_family) there
is an explanation of which CFs are a good fit for compression:
When to Use Compression
Compression is best suited for column families where there are many rows,
with each row having the same columns, or at least
2012/9/20 aaron morton aa...@thelastpickle.com
I would consider:
# User CF
* row_key: user_id
* columns: user properties, key=value
# UserRequests CF
* row_key: user_id : partition_start where partition_start is the start
of a time partition that makes sense in your domain. e.g.
But the only advantage in this solution is to split data among partitions?
You need to split data among partitions or your query won't scale as more and
more data is added to the table. Having the partition means you are querying
far fewer rows.
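A minimal sketch of the `user_id : partition_start` row key idea, assuming a one-day partition width and a `YYYYMMDDTHHMM` key suffix. Both are illustrative choices, not something the thread specifies, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical partition width; pick whatever makes sense in your domain.
PARTITION = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def partition_start(ts, width=PARTITION):
    """Round a timezone-aware timestamp down to the start of its partition."""
    elapsed = ts - EPOCH
    return EPOCH + (elapsed // width) * width


def row_key(user_id, ts, width=PARTITION):
    """Build a 'user_id:partition_start' row key (key format is an assumption)."""
    return f"{user_id}:{partition_start(ts, width).strftime('%Y%m%dT%H%M')}"


def row_keys_for_range(user_id, start, end, width=PARTITION):
    """List the row keys a query over [start, end] has to touch."""
    keys = []
    current = partition_start(start, width)
    while current <= end:
        keys.append(row_key(user_id, current, width))
        current += width
    return keys
```

A query for a two-hour window then reads at most two rows for that user, however much history the user has accumulated.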
What do you mean here by current partition?
He
I have been digging more and more into CQL vs. PlayOrm S-SQL and found a
major difference that is quite interesting (thought you might be interested,
plus I have a question). CQL uses a composite row key with the prefix, so
now any other tables that want to reference that entity have references to
It's a pretty solid standard at this point. The large majority of client
library work from this point on will be based on cql.
On Sun, Sep 23, 2012 at 12:45 AM, Bradford Toney
bradford.to...@gmail.com wrote:
Yeah, I've seen how it's done in CQL3, I just wasn't sure if it was a solid
standard
Due to repetition in the column metadata, you're still likely to get a
reasonable amount of compression. This is especially true if there is some
amount of repetition in the column names, values, or TTLs in wide rows.
Compression will almost always be beneficial unless you're already somehow
CPU
There were no errors in the log (other than the messages dropped exception
pasted below), and the node does recover. We have only a small number of
secondary indexes (3 in the whole system).
However, I went through the cassandra code, and I believe I've worked through
this problem.
Just to
As well as your unlimited column names may all have the same prefix, right?
Like accounts.rowkey56, accounts.rowkey78, etc. etc. so the accounts gets
a ton of compression then.
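A rough way to see the effect of shared prefixes, using Python's `zlib` as a stand-in compressor (an assumption made for illustration; Cassandra's own compressors and on-disk layout differ, so the exact ratios will not match):

```python
import os
import zlib


def compression_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(payload)) / len(payload)


# Column names sharing a prefix, as in accounts.rowkey56, accounts.rowkey78, ...
prefixed_names = "".join(f"accounts.rowkey{i}" for i in range(10_000)).encode()

# Random bytes of the same length stand in for data with no repetition.
random_bytes = os.urandom(len(prefixed_names))

print(f"prefixed names: {compression_ratio(prefixed_names):.2f}")
print(f"random bytes:   {compression_ratio(random_bytes):.2f}")
```

The repeated `accounts.rowkey` prefix is what the compressor exploits; the random payload barely shrinks at all.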
Later,
Dean
From: Tyler Hobbs ty...@datastax.com
Reply-To:
Hello,
We have been noticing an issue where, about 50% of the time in which a node
fails or is restarted, secondary indexes appear to be partially lost or
corrupted. A drop and re-add of the index appears to correct the issue. There
are no errors in the cassandra logs that I see. Part of
/var/log/cassandra$ cat system.log | grep "Compacting large" |
grep -E "[0-9]+ bytes" -o | cut -d " " -f 1 |
awk '{ foo = $1 / 1024 / 1024 ; print foo "MB" }' | sort -nr | head -n 50
Is it a bad sign?
Sorry, I do not know what this is outputting.
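For anyone else unsure what that pipeline prints: as far as I can tell it pulls the byte count out of every "Compacting large" line in system.log, converts it to MB, and lists the 50 largest values. A roughly equivalent Python sketch (the function name is hypothetical, and it takes only the first byte count on each line):

```python
import re

# Matches the byte count that the shell pipeline extracts with grep -o.
BYTES_RE = re.compile(r"([0-9]+) bytes")


def largest_compacted_rows_mb(log_lines, limit=50):
    """Return the largest 'Compacting large' sizes in MB, biggest first."""
    sizes = []
    for line in log_lines:
        if "Compacting large" not in line:
            continue
        match = BYTES_RE.search(line)
        if match:
            # Same arithmetic as the awk step: bytes / 1024 / 1024.
            sizes.append(int(match.group(1)) / 1024 / 1024)
    return sorted(sizes, reverse=True)[:limit]
```

Pass it an open file handle for system.log (or any iterable of lines) and print the returned values.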
As I can see in cfstats, compacted row maximum
If you think about space, use Leveled compaction! This won't only allow you
to fill more space, but will also shrink your data much faster in case of
updates. Size-tiered compaction can use 3x-4x more space than there is
live data. Consider the following (our simplified) scenario:
1) The data
On Sun, Sep 23, 2012 at 8:18 PM, Віталій Тимчишин tiv...@gmail.com wrote:
If you think about space, use Leveled compaction! This won't only allow you
to fill more space, but will also shrink your data much faster in case of
updates. Size-tiered compaction can use 3x-4x more space than there
In CQL3, names are case insensitive by default, while they were case
sensitive in CQL2. You can force whatever case you want in CQL3
however using double quotes. So in other words, in CQL3,
USE "TestKeyspace";
should work as expected.
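A small model of that rule, for illustration only: the helper below is hypothetical, it is not part of any driver, and it ignores escaped quotes inside quoted names.

```python
def normalize_identifier(name: str) -> str:
    """Model CQL3 case handling for a keyspace or column family name."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Double-quoted: case is preserved exactly as written.
        return name[1:-1]
    # Unquoted: case insensitive, modeled here by folding to lower case.
    return name.lower()
```

Under this model, an unquoted TestKeyspace and testkeyspace refer to the same name, while the double-quoted form keeps its capitals.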
--
Sylvain
On Sun, Sep 23, 2012 at 9:22 PM, Oleksandr Petrov
On Fri, Sep 21, 2012 at 2:05 AM, aaron morton aa...@thelastpickle.com wrote:
Would it help if I partitioned the computing resources of my physical
machines into VMs?
No.
Just like cutting a cake into smaller pieces does not mean you can eat more
without getting fat.
In the general case,
If this is intended behavior, could somebody please point me to where this is
documented?
It is intended.
The docs don't make it totally clear though:
<clause> syntax is:
<primary key name> { = | < | > | <= | >= } <key_value>
<primary key name> IN ( <key_value> [,...] )
Yup.
(Multi get is just a convenience method; it explodes into multiple gets on the
server side.)
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 24/09/2012, at 5:01 AM, Hiller, Dean dean.hil...@nrel.gov wrote:
But the only advantage
To put it in other words, Cassandra will lock down all tables until all pending
flush requests fit in the pending queue.
This was the first issue I looked at in my Cassandra SF talk
http://www.datastax.com/events/cassandrasummit2012/presentations
I've seen it occur more often with
Love the Mars lander analogies :)
On Sep 23, 2012, at 5:39 PM, aaron morton wrote:
To put it in other words, Cassandra will lock down all tables until all pending
flush requests fit in the pending queue.
This was the first issue I looked at in my Cassandra SF talk
You might find these two projects useful:
- ccm, which makes it easy to run a cluster on a single machine:
https://github.com/pcmanus/ccm
- Cassanova, which supports a large portion of the Thrift API with a
lightweight python process: https://github.com/riptano/Cassanova
On Sun, Sep 23, 2012 at
On Sun, Sep 23, 2012 at 10:41 PM, aaron morton aa...@thelastpickle.com wrote:
/var/log/cassandra$ cat system.log | grep "Compacting large" |
grep -E "[0-9]+ bytes" -o | cut -d " " -f 1 |
awk '{ foo = $1 / 1024 / 1024 ; print foo "MB" }' | sort -nr | head -n 50
Is it a bad sign?
Sorry, I do not