It obviously won't matter if your columns are fat, but there are several cases
(at least I can think of several) where you need to, for example,
just store an integer column name with an empty column value. Then 12 bytes for
the column, where 8 bytes is just the overhead to store the timestamp, doesn't
Hi, thanks for your answer, but I don't want to add another layer on top of
Cassandra. I have also built all of my application without Countandra and I
would like to continue this way.
Furthermore there is a Cassandra modeling problem that I would like to
solve, and not just hide.
Alain
2012/1/18
On Thu, Jan 19, 2012 at 3:54 AM, Josep Blanquer blanq...@rightscale.com wrote:
On Wed, Jan 18, 2012 at 12:44 PM, Jonathan Ellis jbel...@gmail.com wrote:
On Wed, Jan 18, 2012 at 12:31 PM, Josep Blanquer
blanq...@rightscale.com wrote:
If I do a slice without a start (i.e., get me the first
Each node stores the rows in its token range, plus those in the token
ranges it is a replica for. So it will store roughly rf / num_nodes of the rows.
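Not from the thread itself, just a hedged back-of-the-envelope sketch of that fraction (function and variable names are mine):

```python
def rows_per_node(total_rows, num_nodes, rf):
    """Rough estimate: each row is stored on rf nodes, so with an
    evenly balanced ring each node holds total_rows * rf / num_nodes."""
    return total_rows * rf // num_nodes

# e.g. 6 billion rows across 10 nodes at RF=3 -> 1.8 billion rows per node
print(rows_per_node(6_000_000_000, 10, 3))
```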
If you are approaching a situation where the node may store 2 billion rows, and
so may have 2 billion entries in the secondary index row, you
Did you run a scrub as part of the upgrade process? That will re-write all the
sstables and remove the old ones.
If not, run a scrub now and it will re-write the data with a -hb- format in the
file name.
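For illustration only: sstable file names of that era followed roughly the form `<cf>-<version>-<generation>-Data.db`, so you can check which format versions are on disk with a quick helper (my own sketch, not from the thread):

```python
import re

def sstable_version(filename):
    """Extract the format version string (e.g. 'hb') from an sstable
    file name of the rough form CF-<version>-<generation>-Data.db."""
    m = re.match(r".*-([a-z]+)-\d+-Data\.db$", filename)
    return m.group(1) if m else None

# After a scrub on 1.0.x you would expect only 'hb' files to remain:
print(sstable_version("MyCF-hb-42-Data.db"))  # hb
print(sstable_version("MyCF-f-7-Data.db"))    # f (an older format)
```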
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
Some tips here from Matt Dennis on how to model time series data
http://www.slideshare.net/mattdennis/cassandra-nyc-2011-data-modeling
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 19/01/2012, at 10:30 PM, Alain RODRIGUEZ wrote:
Hi
When I upgraded I did it in 2 stages.
Upgrade from 0.7.6 to 1.0.0
Run scrub on each node.
Run repair on the cluster
Upgrade to 1.0.3
Is it safe to run scrub again? Because it did not seem to help when I
upgraded to 1.0.0.
Was there a bug in the scrub process in 1.0.0?
What is the
Hi,
I've defined a column family 'Vaibhav' in which every row has a few columns and
their values. I've declared two columns as secondary indexes so that I can filter
the rows on the basis of those column values.
Now whenever I execute a CQL query with either only the row key or a column name in 'WHERE'
On 18.01.2012, at 02:19, Maki Watanabe wrote:
Are there any significant difference of number of sstables on each nodes?
No, no significant difference there. Actually, node 8 is among those with more
sstables but with the least load (20GB)
On 17.01.2012, at 20:14, Jeremiah Jordan wrote:
Are you
Thanks for your comments. The application is indeed suffering from a freezing
Cassandra node. Queries are taking longer than 10 seconds at the moment of a
full garbage collect.
Here is an example from the logs. I have a three node cluster. At some point I
see on a node the following log:
2012/1/19 aaron morton aa...@thelastpickle.com:
If you have performed any token moves the data will not be deleted until you
run nodetool cleanup.
We did that after adding nodes to the cluster. And then, the cluster
wasn't balanced either.
Also, does the Load really account for dead data, or is
On Wed, Jan 18, 2012 at 7:58 PM, Rustam Aliyev rus...@code.az wrote:
Hi Andrei,
As you know, we are using Whirr for ElasticInbox (
https://github.com/elasticinbox/whirr-elasticinbox). While testing we
encountered a few minor problems which I think could be improved. Note that
we were using
Great, will try 0.7.1 when it's ready.
(The bug I mentioned was already reported.)
On 19/01/2012 13:15, Andrei Savu wrote:
On Wed, Jan 18, 2012 at 7:58 PM, Rustam Aliyev rus...@code.az wrote:
Hi Andrei,
As you know, we are using Whirr for ElasticInbox
I will have a look very soon and if I find something I'll let you know.
Thank you in advance!
2012/1/19 aaron morton aa...@thelastpickle.com
Michael, Robin
Let us know if the reported live load is increasing and diverging from the
on disk size.
If it is can you check nodetool cfstats and
Thanks aaron, I already paid attention to these slides and I just looked at
them again.
I'm still in the dark about how to efficiently get the number of unique visitors
between 2 dates (random dates, because they are chosen by the user).
I could easily count them per hour, day, week, month... But it's a
What version of Java do you use? Can you try reducing NewSize
and increasing the old generation? If you are on an old version of Java I
also recommend upgrading it.
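As a rough illustration only (these are the knobs cassandra-env.sh exposed in the 1.0.x era; the sizes below are placeholders, not tuned recommendations):

```shell
# cassandra-env.sh: heap and young-generation sizing.
# Values here are illustrative placeholders, not recommendations.
MAX_HEAP_SIZE="8G"     # total heap; the old generation is what remains
HEAP_NEWSIZE="400M"    # young generation; reducing it shrinks ParNew pauses
```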
On Thu, Jan 19, 2012 at 3:27 AM, Rene Kochen
rene.koc...@emea.schange.com wrote:
Thanks for your comments. The application
mmm, if they are not included in the snapshot they are probably not used.
Have you dropped an index called 09partition on AttractionCheckins?
In [52]: ''.join(chr(int(x+y, 16)) for x, y in
zip("3039706172746974696f6e"[0::2], "3039706172746974696f6e"[1::2]))
Out[52]: '09partition'
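(That transcript is Python 2; on Python 3 the same hex decode is a single standard-library call:)

```python
# Decode the hex-encoded column name from the sstable listing (Python 3).
name = bytes.fromhex("3039706172746974696f6e").decode("ascii")
print(name)  # 09partition
```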
The simple thing to do is
It is working as expected.
Because you have specified a KEY, the query returns records that match that
key (or keys), and it ignores the other clauses.
Selecting rows follows one of three paths:
* selects rows by key(s)
* select rows by key range, i.e. rows after this key.
* select rows by
Load reported from nodetool ring is the live load, which means SSTables that
the server has open and will read from during a request. This will include
tombstones, expired and overwritten data.
nodetool cfstats also includes dead load, which is sstables that are no longer
in use but still on disk.
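nodetool cfstats reports both figures per column family; a small sketch (my own parsing, with the output labels assumed from that era's format) to compute dead load as total minus live:

```python
# Toy excerpt of nodetool cfstats output; labels assumed, not verbatim.
cfstats = """\
Space used (live): 21474836480
Space used (total): 32212254720
"""

def dead_load(text):
    """Dead load = space still on disk but no longer in use = total - live."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = int(value)
    return fields["Space used (total)"] - fields["Space used (live)"]

print(dead_load(cfstats))  # bytes of dead data on disk
```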
I believe you need to move the nodes on the ring. What was the load on the
nodes before you added the 5 new nodes? It's just that you are getting more
data in certain token ranges than in others.
-Naren
On Thu, Jan 19, 2012 at 3:22 AM, Marcel Steinbach marcel.steinb...@chors.de
wrote:
On 18.01.2012,
On Thu, Jan 19, 2012 at 8:25 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:
I'm still in the dark about how to efficiently get the number of unique visitors
between 2 dates (random dates, because they are chosen by the user).
I could easily count them per hour, day, week, month... But it's a bit
You might want to look at the code in countandra.org, regardless of whether
you use it. It uses a model of dynamic composite keys (although static
composite keys would have worked as well). For the actual query, only one
row is hit. This of course only works because the data model is attuned for the
Thanks Philippe, I checked their docs. RPMs should be at
http://rpm.datastax.com/community/ now, but 1.0.6 is not there either.
Can someone at datastax please comment on this? Are you guys no longer
packaging cassandra releases?
From: Philippe
On node 172.16.107.46, I see the following:
21:53:27.192+0100: 1335393.834: [GC 1335393.834: [ParNew (promotion failed):
319468K->324959K(345024K), 0.1304456 secs]1335393.964: [CMS:
6000844K->3298251K(8005248K), 10.8526193 secs] 6310427K->3298251K(8350272K),
[CMS Perm :
Ah, that explains part of the problem indeed. The whole situation still
doesn't make a lot of sense to me, unless the answer is that the default
sstable size with level compaction is just no good for large datasets. I
restarted cassandra a few hours ago and it had to open about 32k files
at
We're embarking on a project where we estimate we will need on the order
of 100 cassandra nodes. The data set is perfectly partitionable, meaning
we have no queries that need to have access to all the data at once. We
expect to run with RF=2 or =3. Is there some notion of ideal cluster
size? Or
Dear Aaron,
Thanks for the information.
Actually it's a normal query which works in SQL. I believe there must be some
mechanism to do this in Cassandra, as first retrieving the records based on the key
and then checking against the column index afterwards would be inefficient.
Thanks again.
Regards,
I think that qualifies as a bug. We should either refuse the query if we
don't know how to handle it correctly, or return a sensible result (i.e.,
no result in this case).
Would you mind opening a ticket on
https://issues.apache.org/jira/browse/CASSANDRA?
--
Sylvain
On Fri, Jan 20, 2012 at 6:39 AM,