> Does a huge variation in the number of columns per row, across the column family,
> have *any* impact on performance?
>
> Can I have just 100 columns in some rows and hundreds of
> thousands of columns in another set of rows, without any downsides?
If I interpret your question the way I think you
Why not synchronize on the client side? Make sure that the process that
allocates user ids runs on only a single machine, in a synchronized method,
and uses QUORUM for its reads and writes to Cassandra?
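A minimal sketch of that idea, assuming a single allocator process. The `CounterStore` interface, its `readQuorum`/`writeQuorum` methods, and the `last_user_id` key are hypothetical names for illustration; in a real setup both calls would go to Cassandra at QUORUM through whatever client library is in use. The in-memory store is only there so the sketch runs without a cluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage abstraction: a real implementation would issue both
// calls against Cassandra at consistency level QUORUM.
interface CounterStore {
    long readQuorum(String key);
    void writeQuorum(String key, long value);
}

// In-memory stand-in so the sketch can run without a cluster.
class InMemoryCounterStore implements CounterStore {
    private final Map<String, Long> data = new HashMap<>();

    public long readQuorum(String key) {
        return data.getOrDefault(key, 0L);
    }

    public void writeQuorum(String key, long value) {
        data.put(key, value);
    }
}

// Must live in exactly one process: synchronized only serializes threads
// inside this JVM and gives no protection across machines.
class UserIdAllocator {
    private final CounterStore store;

    UserIdAllocator(CounterStore store) {
        this.store = store;
    }

    synchronized long nextUserId() {
        long next = store.readQuorum("last_user_id") + 1;
        store.writeQuorum("last_user_id", next);
        return next;
    }
}
```

The read and the write happen under the same lock, so two threads in the allocator process cannot hand out the same id. The single-machine requirement is what makes this safe; a second allocator process would reintroduce the race.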
On Sun, Feb 6, 2011 at 11:02 PM, Aaron Morton wrote:
> If you mix mysql and Cassandra you risk
Hi all!
I'm running into an OOM problem during batch_mutate. I have a test cluster
of two servers, 32GB and 16GB RAM, real HW. I have one keyspace and one
CF with 1.4 million rows, 10 columns each. A row is around 5k in size. I
run a Hadoop MR task that reads one column and generates a Mutation that
updates anoth
Thanks for the detailed explanation Peter! Definitely cleared my doubts !
On Mon, Feb 7, 2011 at 1:52 PM, Peter Schuller
wrote:
>> Does a huge variation in the number of columns per row, across the column family,
>> have *any* impact on performance?
>>
>> Can I have like just 100 columns in some rows a
Hi,
Recently I worked on an implementation of a Java JDBC driver for Cassandra using CQL.
Below is example code (with basic features) showing how to use it:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.s
Hi,
On my two-node test setup I repeatedly get the following error:
The 10.0.18.129 server log:
INFO 14:10:37,707 Node /10.0.18.99 has restarted, now UP again
INFO 14:10:37,708 Checking remote schema before delivering hints
INFO 14:10:37,708 Sleeping 45506ms to stagger hint delivery
INFO 14:10:3
I forgot to mention that I use the 0.7.0 stable version.
HTH,
Patrik
Just tried current 0.7.1 from cassandra-0.7 branch and it does the
same. OOM after three runs.
The -Xm* settings are computed by cassandra-env.sh like this: -Xms8022M
-Xmx8022M -Xmn2005M
What am I doing wrong?
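For reference, the flags above are consistent with a simple rule: heap = half of physical memory, young generation = a quarter of the heap. The sketch below only reproduces that arithmetic. Both the rule and the 16044 MB memory figure are assumptions inferred from the numbers in this message, not taken from cassandra-env.sh itself.

```java
class HeapSizeSketch {
    // Assumed rule: max heap is half of physical memory (values in MB).
    static long maxHeapMb(long systemMemoryMb) {
        return systemMemoryMb / 2;
    }

    // Assumed rule: young generation is a quarter of the heap (integer division).
    static long youngGenMb(long heapMb) {
        return heapMb / 4;
    }

    // Builds the JVM flag string in the same shape as the one quoted above.
    static String jvmFlags(long systemMemoryMb) {
        long heap = maxHeapMb(systemMemoryMb);
        return "-Xms" + heap + "M -Xmx" + heap + "M -Xmn" + youngGenMb(heap) + "M";
    }
}
```

With an assumed 16044 MB reported by the OS, `HeapSizeSketch.jvmFlags(16044)` gives `-Xms8022M -Xmx8022M -Xmn2005M`. If that rule holds, the JVM is taking half of the 16GB box.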
Thanks,
Patrik
On Mon, Feb 7, 2011 at 14:18, Patrik Modesto wrote:
> I forgot to mention
INFO 15:30:49,647 Compacted to
/www/foo/cassandra/data/foo/Url-tmp-f-767-Data.db. 4,199,999,762 to
4,162,579,242 (~99% of original) bytes for 379,179 keys. Time:
137,149ms.
ERROR 15:30:49,699 Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.RuntimeException: java.lang.Illeg
I think this is related to a faulty disk.
On Mon, Feb 7, 2011 at 3:35 PM, Patrik Modesto wrote:
> INFO 15:30:49,647 Compacted to
> /www/foo/cassandra/data/foo/Url-tmp-f-767-Data.db. 4,199,999,762 to
> 4,162,579,242 (~99% of original) bytes for 379,179 keys. Time:
> 137,149ms.
> ERROR 15:30:49,
From your GC logs, it looks like you don't have a big enough working set; there
doesn't seem to be a lot being reclaimed in the GC process. The process is
reclaiming a few hundred MB and is running every few seconds. How big are
your caches? The probable reason that it works the first couple times when
On Mon, Feb 7, 2011 at 5:40 AM, Aditya Narayan wrote:
> Thanks for the detailed explanation Peter! Definitely cleared my doubts !
>
>
>
> On Mon, Feb 7, 2011 at 1:52 PM, Peter Schuller
> wrote:
>>> Does a huge variation in the number of columns per row, across the column family,
>>> have *any* impact on the p
On Mon, Feb 7, 2011 at 15:44, sridhar basam wrote:
> Looks like you don't have a big enough working set from your GC logs, there
> doesn't seem to be a lot being reclaimed in the GC process. The process is
> reclaiming a few hundred MB and is running every few seconds. How big are
> your caches? T
On Mon, Feb 7, 2011 at 15:42, Thibaut Britz
wrote:
> I think this is related to a faulty disk.
I'm not sure that's the problem. Cassandra 0.7.0 didn't report any
problem. It started with Cassandra 0.7.1.
Patrik
This sounds like a possible bug since the BRAF was re-written in 0.7.1.
Could you open a ticket?
On Mon, Feb 7, 2011 at 10:32 AM, Patrik Modesto wrote:
> On Mon, Feb 7, 2011 at 15:42, Thibaut Britz
> wrote:
> > I think this is related to a faulty disk.
>
> I'm not sure that's the problem. Cassand
On 02/04/2011 12:43 PM, Jonathan Ellis wrote:
> Can you create a ticket?
I noticed the same thing. CASSANDRA-2123 created.
On Mon, 2011-02-07 at 12:08 +, Vivek Mishra wrote:
> Recently I worked on an implementation of a Java JDBC driver for Cassandra
> using CQL. Below is example code (with basic features)
> showing how to use it:
[ ... ]
Nice!
> I am not sure if there is any JIRA related to this. With suc
On Sat, Feb 5, 2011 at 10:12 PM, Joshua Partogi wrote:
> Hi,
>
> I don't know whether my assumption is right or not. When I tried to insert a
> Time value into a column I am getting this exception:
>
> vendor/ruby/1.8/gems/thrift-0.5.0/lib/thrift/protocol/binary_protocol.rb:106:in
> `write_string'
It's not really the storage of spatial data that's tricky. We use geojson as
a wire-line format at the higher levels of our system (e.g., the HTTP
API). But the hard part is organizing the data for efficient retrieval and
keeping those indices consistent with the data being indexed. Efficient
multi
It depends a little on your write pattern:
- Wide rows tend to get distributed over more sstables, so more disk reads are
necessary. This will become noticeable when you have high I/O load and reads
actually hit the disks.
- If you delete a lot, slice query performance might suffer: extreme example
Hi,
Our application space is such that there is data that might not be read for
a long time. The data is mostly immutable. How should I approach
detecting/solving the bitrot problem? One approach is read data and let read
repair do the detection, but given the size of data, that does not look very
Hello,
I'm trying to access a Cassandra 0.7 cluster in a hadoop map-reduce job (hadoop
0.20.2) and seeing a thrift library conflict. Hadoop uses an older version of
thrift than hector 0.7, and this older version is getting picked up by my job,
causing the following exception:
FATAL org.apache.
> Our application space is such that there is data that might not be read for
> a long time. The data is mostly immutable. How should I approach
> detecting/solving the bitrot problem? One approach is read data and let read
> repair do the detection, but given the size of data, that does not look v
Dan,
Do you have any more information on this issue? Have you been able to
discover anything from exporting your SSTables to JSON?
Thanks,
Ben
On 1/29/11 12:45 PM, Dan Hendry wrote:
I am once again having severe problems with my Cassandra cluster. This
time, I straight up cannot read sectio
Hello,
one node in my 3-machine cluster cannot perform compaction. I tried multiple
times, it ran out of heap space once and I increased it. Now I'm getting the
dump below (after it does run for a few minutes). I hope somebody can shed a
little light on what's going on, because I'm at a loss and th
Some RAID storage might do it, potentially more efficiently!!
Rhetorical question - Does Cassandra's architecture of reconciling reads
over multiple copies of the same data provide an even more interesting
answer? I submit - YES!
All bitrot protection mechanisms involve some element of redundant
Hey,
I have read about the new TTL columns in Cassandra 0.7. In my case I'd
like to expire an entire row automatically after a certain amount of
time. Is this possible as well?
Thanks,
-Kal
Hi,
Does anyone know if anything similar to
https://issues.apache.org/jira/browse/CASSANDRA-1748 or
https://issues.apache.org/jira/browse/CASSANDRA-1837 exists in 0.6.x
releases? Both of those bugs look like they were introduced, found, and
fixed in 0.7, and CASSANDRA-1837 comments indicate that
I don't think this is supported (but I could be completely wrong).
However, I'd love to see this functionality as well.
How would one go about requesting such a feature?
Bill-
On Mon, Feb 7, 2011 at 4:15 PM, Kallin Nagelberg
wrote:
> Hey,
>
> I have read about the new TTL columns in Cassandra 0
> Some RAID storage might do it, potentially more efficiently!!
People keep claiming that but I have yet to confirm that a hardware
raid does actual checksumming as opposed to just healing bad blocks.
But yes, they might :)
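To make the distinction concrete, here is a small sketch of what "actual checksumming" means: store a checksum next to each block and recompute it on read, so a silently flipped bit is detected rather than returned as good data. It uses the JDK's `java.util.zip.CRC32`. The class and method names are made up for illustration, and this is not a description of what any RAID controller or Cassandra does internally.

```java
import java.util.zip.CRC32;

class ChecksumSketch {
    // Computes a CRC32 over the whole block.
    static long checksum(byte[] block) {
        CRC32 crc = new CRC32();
        crc.update(block, 0, block.length);
        return crc.getValue();
    }

    // Recomputes the checksum on read and compares it with the stored value.
    static boolean isIntact(byte[] block, long storedChecksum) {
        return checksum(block) == storedChecksum;
    }
}
```

A caller would keep `checksum(block)` at write time and call `isIntact(block, stored)` on every read or scrub. CRC32 catches any single-bit flip, but detection alone does not repair anything; a redundant copy is still needed for that.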
> Food for thought, or wild imagination ?
That was my intent. Checksummi
Was there a time where nodetool repair was not run frequently? There are some steps listed here to reset issues around tombstones coming back to life:
http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds
Why do you run nodetool
Deleting all the columns in a row via TTL has the same effect as deleting the row; the data will be physically removed during compaction.
Aaron
On 08 Feb, 2011, at 10:24 AM, Bill Speirs wrote:
I don't think this is supported (but I could be completely wrong).
However, I'd love to see this functionalit
I tried that but I still see the row coming back on a list
in the CLI. My concern is that there will be a pointer
to an empty row for all eternity.
-Kal
On Mon, Feb 7, 2011 at 4:38 PM, Aaron Morton wrote:
> Deleting all the columns in a row via TTL has the same effect as deleting the
> row, the
Yes, we failed to run nodetool repair for quite a while and I believe it
might have been our situation that prompted the addition of that info to the
wiki :-)
We've tried/are trying two of the suggested steps there, but haven't done
the process of removing/reinserting the pseudo-failed nodes (all
I also tried forcing a major compaction on the column family using JMX
but the row remains.
On Mon, Feb 7, 2011 at 4:43 PM, Kallin Nagelberg
wrote:
> I tried that but I still see the row coming back on a list
> in the CLI. My concern is that there will be a pointer
> to an empty row for all eter
Hey,
I am developing a session management system using Cassandra and need
to generate unique sessionIDs (cassandra columnfamily keys). Does
anyone know of an elegant/simple way to accomplish this? I am not sure
about using time based uuids on the client as there is a chance that
multiple clients coul
I noticed this as well in a machine that was left running with the current 0.7 branch code. Created https://issues.apache.org/jira/browse/CASSANDRA-2131
aaron
On 08 Feb, 2011, at 04:34 AM, Jake Luciani wrote:
This sounds like a possible bug since the BRAF was re-written in 0.7.1. Could you open a tick
Hello Kallin.
If you use timeUUID, the chance of generating the same uuid twice is the
following:
considering that both clients generate the uuid at the *same millisecond*,
the chance of generating the same uuid is:
1 / (1.84467441 × 10^19)
Which is equal to the probability of winning a national
lotte
Maybe I can just use Java 5's UUID. I need to research how effective
this is across multiple clients.
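A minimal sketch of the `java.util.UUID` route. Note that `UUID.randomUUID()` produces a random (version 4) UUID with 122 random bits, not a time-based one, so uniqueness across clients rests on the random source rather than on timestamps or MAC addresses. The `SessionIds` class name is just for illustration.

```java
import java.util.UUID;

class SessionIds {
    // Returns a version 4 (random) UUID rendered as a 36-character string,
    // suitable as a row key without any coordination between clients.
    static String newSessionId() {
        return UUID.randomUUID().toString();
    }
}
```

Each client can call `SessionIds.newSessionId()` independently. The JDK class does not generate time-based (version 1) UUIDs itself, so a TimeUUID would need a separate library.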
On Mon, Feb 7, 2011 at 4:57 PM, Kallin Nagelberg
wrote:
> Hey,
>
> I am developing a session management system using Cassandra and need
> to generate unique sessionIDs (cassandra columnfamily keys
On Sun, Feb 6, 2011 at 11:03 AM, Shaun Cutts wrote:
> What I think you should be doing is the following: open iterators on the
> matching keys for each of the indexes; the inside loop would pick an iterator
> at random, and pull a match from it. This would assure that the expected
> number of e
On Mon, Feb 7, 2011 at 1:51 AM, TSANG Yiu Wing wrote:
> cassandra version: 0.7
>
> client library: scale7-pelops / 1.0-RC1-0.7.0-SNAPSHOT
>
> cluster: 3 machines (A, B, C)
>
> details:
> it works perfectly when all 3 machines are up and running
>
> but if the seed machine is down, the problems hap
Can you open a ticket for this? And are you using the order-preserving partitioner?
On Mon, Feb 7, 2011 at 7:16 AM, Patrik Modesto wrote:
> Hi,
>
> On my two-node test setup I repeatedly get the following error:
>
> The 10.0.18.129 server log:
>
> INFO 14:10:37,707 Node /10.0.18.99 has restarted, now U
Sounds like the keyspace was created on the 32GB machine, so it
guessed memtable sizes that are too large when run on the 16GB one.
Use "update column family" from the cli to cut the throughput and
operations thresholds in half, or to 1/4 to be cautious.
On Mon, Feb 7, 2011 at 9:00 AM, Patrik Mode
I've patched ColumnSortedMap on the 0.7 branch to not swallow the
IOException it's getting.
On Mon, Feb 7, 2011 at 3:02 PM, buddhasystem wrote:
>
> Hello,
> one node in my 3-machine cluster cannot perform compaction. I tried multiple
> times, it ran out of heap space once and I increased it. Now
Hi,
I've added some comments and questions inline.
Cheers,
Dan
On 8 February 2011 10:00, Jonathan Ellis wrote:
> On Mon, Feb 7, 2011 at 1:51 AM, TSANG Yiu Wing wrote:
> > cassandra version: 0.7
> >
> > client library: scale7-pelops / 1.0-RC1-0.7.0-SNAPSHOT
> >
> > cluster: 3 machines (A, B, C)
Thanks Jonathan -- does it mean that the machine is experiencing IO problems?
If you are a Hector user, TimeUUIDUtils can be used to create Time UUIDs.
https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/utils/TimeUUIDUtils.java
On Mon, Feb 7, 2011 at 2:11 PM, Kallin Nagelberg wrote:
> Maybe I can just use java5's UUID.. Need to rese
Jonathan,
Thanks for your thoughts
> On Sun, Feb 6, 2011 at 11:03 AM, Shaun Cutts wrote:
>> What I think you should be doing is the following: open iterators on the
>> matching keys for each of the indexes; the inside loop would pick an
>> iterator at random, and pull a match from it. This
> Yes, we failed to run nodetool repair for quite a while and I believe it
> might have been our situation that prompted the addition of that info to the
> wiki :-)
So is it correct then that nowadays, you hope to not be violating the
repair frequency requirement but you're still seeing data pop u
unsubscribe
I'm a newbie here, but, with apologies for my presumptuousness, I think you
should deprecate SuperColumns. They are already distracting you, and as the
years go by the cost of supporting them as you add more and more functionality
is only likely to get worse. It would be better to concentrate o
Thanks Ryan.
That makes more sense now. So I should instead find a way to serialize
Ruby objects to strings, and vice versa, when inserting into a Column.
Kind regards,
Joshua
On Tue, Feb 8, 2011 at 4:43 AM, Ryan King wrote:
> On Sat, Feb 5, 2011 at 10:12 PM, Joshua Partogi
> wrote:
> > Hi,
> >
> >
I will continue the issue here:
http://groups.google.com/group/scale7/browse_thread/thread/dd74f1d6265ae2e7
thanks
On Tue, Feb 8, 2011 at 7:44 AM, Dan Washusen wrote:
> Hi,
> I've added some comments and questions inline.
>
> Cheers,
> Dan
> On 8 February 2011 10:00, Jonathan Ellis wrote:
>>
Pretty sure it also uses mac address, so chances are very slim. I'll check
out time uuid too, thanks.
On 7 Feb 2011 17:11, "Victor Kabdebon" wrote:
Hello Kallin.
If you use timeUUID, the chance of generating the same uuid twice is the
following:
considering that both clients generate the uuid at
Dear all,
Sorry to come back again to this point but I am really worried about
Cassandra memory consumption. I have a single machine that runs one
Cassandra server. There is almost no data on it but I see a crazy memory
consumption and it doesn't care at all about the instructions...
Note that I a
Hi,
Let's suppose you are using Cassandra happily in production, but you
have an army of coders, with varying levels of knowledge about
Cassandra.
Currently we have shielded most of our developers from the Cassandra
dependency by using a Fake interface that returns fake data from it,
but this is turnin
On Feb 7, 2011, at 10:28 PM, Paul Querna wrote:
> So, I guess this is coming down to:
> 1) Has anyone built any easy to install packages of Cassandra?
I didn't find it necessary. I implemented a simple embedding wrapper for
Cassandra so that it could be started as part of a web application lif
Hi,
here is the ticket: https://issues.apache.org/jira/browse/CASSANDRA-2134
I'm using the default partitioner, that should be the RandomPartitioner.
HTH,
Patrik
On Tue, Feb 8, 2011 at 00:03, Jonathan Ellis wrote:
> Can you open a ticket for this? And are you using order-preserving
> partit
The expired columns were converted into tombstones, which will live for the
GC timeout. The "empty" row will be cleaned up when those tombstones are
removed.
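As a rough illustration of the timeline: a column written with a TTL expires at write time plus TTL, and the resulting tombstone can only be dropped by a compaction that runs after the GC grace period has also passed. The sketch assumes the grace period is counted from the moment of expiry, which is a simplification. The names are hypothetical, and 864000 seconds (10 days) is the usual default for gc_grace_seconds.

```java
class TombstoneSketch {
    // Usual default for gc_grace_seconds: 10 days.
    static final long DEFAULT_GC_GRACE_SECONDS = 864_000L;

    // Earliest time (epoch seconds) at which a compaction could drop the
    // tombstone left behind by an expired column, under the assumption
    // that the grace period starts when the column expires.
    static long purgeableAtSeconds(long writtenAtSeconds, int ttlSeconds, long gcGraceSeconds) {
        long expiresAt = writtenAtSeconds + ttlSeconds;
        return expiresAt + gcGraceSeconds;
    }
}
```

So a row whose columns all carry a one-hour TTL would, under these assumptions, keep showing up as an "empty" row for about ten days after expiry, until a compaction runs past that point.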
Returning the empty row is unfortunate... we'd love to find a more
appropriate solution that might not involve endless scanning.
See
http:/