should we update the wiki?
On Fri, Sep 14, 2012 at 1:18 PM, aaron morton aa...@thelastpickle.com wrote:
Yes.
If your IDE is starting Cassandra, the settings from cassandra-env.sh will
not be used.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
Out of interest, how out of sync were they?
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 14/09/2012, at 6:53 AM, Ben Frank b...@airlust.com wrote:
Hi Sergey,
That was exactly it, thank you!
-Ben
On Thu, Sep 13, 2012 at
INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java
(line 221) Compacted to
[/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].
78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.
You _could_ use one wide row and do a multiget against the same row for
different column slices. It would be less efficient than a single get against
the row, but you could still do big contiguous column slices.
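A minimal in-memory sketch (plain Python, column names hypothetical) of what contiguous column slices against a single wide row look like; the two slice calls stand in for the separate slice queries discussed here:

```python
from bisect import bisect_left, bisect_right

# Hypothetical stand-in for one wide row: sorted column names -> values.
wide_row = {f"col{i:03d}": i for i in range(100)}
names = sorted(wide_row)

def column_slice(start, finish):
    """Return the contiguous columns in [start, finish], as a slice query would."""
    lo = bisect_left(names, start)
    hi = bisect_right(names, finish)
    return {n: wide_row[n] for n in names[lo:hi]}

# Two separate slice reads against the same row; Thrift's multiget itself
# takes multiple keys but only a single SlicePredicate for all of them.
first = column_slice("col010", "col019")
second = column_slice("col050", "col059")
```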
You may get some benefit from the collections in CQL 3
Consider a course_students col family which gives a list of students for a
course
I would use two CF's:
Course CF:
* Each row is one course
* Columns are the properties and values of the course
CourseEnrolements CF
* Each row is one course
* Column name is the
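A rough in-memory sketch of the two-CF layout above (the property names and the student-id column convention are assumptions for illustration, not from the original message):

```python
# Course CF: one row per course, columns are the course's properties.
course_cf = {
    "math101": {"name": "Calculus I", "credits": "4"},
}
# CourseEnrolements CF: one row per course, one column per enrolled student
# (assuming the column name carries the student id).
enrolements_cf = {
    "math101": {"student.alice": "", "student.bob": ""},
}

def students_for(course):
    """List students for a course by reading the column names of one row."""
    return sorted(enrolements_cf.get(course, {}))
```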
I have a hunch that the SSTable selection based on the Min and Max keys in
ColumnFamilyStore.markReferenced() means that a higher false positive has less
of an impact.
it's just a hunch, i've not tested it.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
Hi Brian, did you see my follow-up questions here:
http://www.mail-archive.com/user@cassandra.apache.org/msg24840.html
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 12/09/2012, at 11:52 PM, Brian Jeltema brian.jelt...@digitalenvoy.net
It's not possible to read just the column names.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 14/09/2012, at 9:05 PM, Robin Verlangen ro...@us2.nl wrote:
Hi there,
Would it be possible to read only the column names, instead of the
Hi Aaron,
Is this something that's worth becoming a feature in the future? Or should
I rework my data model? If so, do you have any suggestions?
Best regards,
Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E ro...@us2.nl
Disclaimer: The information contained in this
Thanks Aaron,
At another production site the exact same problems occur (also after
~6 months). Here I have a very small cluster of three nodes with
replication factor = 3.
One of the three nodes begins to have many long ParNew pauses and high CPU
load. I upgraded to Cassandra 1.0.11, but the GC problem
Hello.
I have a schema that represents a filesystem and one example of a Super CF is:
CF FilesPerDir: (DIRNAME -> (FILENAME -> (attribute1: value1, attribute2:
value2)))
And in cases of directory moves, I have to fetch all files of that directory
and subdirectories. This implies one cassandra
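The directory-move fetch described above could be sketched in plain Python like this (the data and helper are hypothetical stand-ins; each dict entry stands in for one Cassandra row):

```python
# Stand-in for the FilesPerDir super column family:
# dirname -> {filename -> {attribute: value}}.
files_per_dir = {
    "/a": {"f1": {"size": "10"}},
    "/a/b": {"f2": {"size": "20"}},
    "/a/b/c": {"f3": {"size": "30"}},
    "/x": {"f4": {"size": "40"}},
}

def files_under(prefix):
    """Fetch the files of a directory and all its subdirectories
    (each matching dirname corresponds to one row read)."""
    out = {}
    for dirname, files in files_per_dir.items():
        if dirname == prefix or dirname.startswith(prefix + "/"):
            for name, attrs in files.items():
                out[f"{dirname}/{name}"] = attrs
    return out
```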
playOrm uses EXACTLY that pattern where @OneToMany becomes
student.rowkeyStudent1 student.rowkeyStudent2 and the other fields are fixed.
It is a common pattern in NoSQL.
Dean
From: aaron morton aa...@thelastpickle.com
Reply-To:
Hi,
I'm facing a problem in Cassandra cluster deployed on EC2 where the node is
going down under write load.
I have configured a cluster of 4 Large EC2 nodes with RF of 2.
All nodes are instance storage backed. DISK is RAID0 with 800GB
I'm pumping in write requests at about 4000 writes/sec. One
Hi Robbit,
I think it's running out of disk space; please verify that (on Linux: df -h).
Best regards,
Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E ro...@us2.nl
Robbit = Rohit of course, excuse me.
Best regards,
Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E ro...@us2.nl
Hi Robin,
I had checked that. Our disk size is about 800GB, and the total data size
is not more than 40GB. Even if all the data is stored in one node, this
won't happen.
I'll try to see if the disk failed.
Does this have anything to do with VM memory? The logs suggest that:
Heap is
Guys
I am pretty new to Cassandra. I have a script that needs to set up a schema
before starting up the Cassandra node. Is this possible? Can I create the
schema directly in Cassandra storage so that when the node starts up it will
pick up the schema?
Zaili
From: rohit reddy
Cassandra writes to memtables, which get flushed to disk when it's time.
That might happen because it's running out of memory (the log message you
just posted), on a shutdown, or at other times. That's why you're using
memory while writing.
You seem to be running on AWS, are you sure your data
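The memtable behaviour described above, as a toy Python sketch (the threshold and structure are invented for illustration; real memtables flush on configured size and liveness thresholds):

```python
# Toy memtable: writes accumulate in memory and get flushed to "disk"
# once a size threshold is reached (or explicitly, e.g. on shutdown).
class Memtable:
    def __init__(self, flush_threshold=3):
        self.data = {}
        self.flushed = []  # stand-in for SSTables written to disk
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.data[key] = value
        if len(self.data) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.data:
            self.flushed.append(dict(self.data))
            self.data.clear()

mt = Memtable()
for i in range(7):
    mt.write(f"k{i}", i)
```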
Hi,
Another newbie question. My script needs to start up a Cassandra node. However,
Cassandra doesn't close the stdout console and therefore never returns:
cassandra -p c.pid
Is there any way to have Cassandra close stdout?
Zaili
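If the script is (or can call) Python, one option is to start the process with its streams redirected so the call returns immediately; a minimal sketch, with a placeholder command standing in for `cassandra -p c.pid`:

```python
import subprocess

def start_detached(cmd):
    """Start a child process without inheriting the console, so the
    parent script can continue immediately."""
    return subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )

# Placeholder command; the real call would be ["cassandra", "-p", "c.pid"].
proc = start_detached(["true"])
```

The plain-shell equivalent would be redirecting to /dev/null and backgrounding, e.g. `cassandra -p c.pid > /dev/null 2>&1 &`.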
**
I think what you're describing might give me what I'm after, but I don't
see how I can pass different column slices in a multiget call. I may be
missing something, but it looks like you pass multiple keys but only a
singular SlicePredicate. Please let me know if that's not what you meant.
I'm
Michael Theroux mtheroux2 at yahoo.com writes:
Hello,
A number of weeks ago, Amazon announced the availability of EBS Optimized
instances and Provisioned IOPs for Amazon EC2. Historically, I've read EBS is
not recommended for Cassandra due to the network contention that can quickly
result
There is another trick here. On the playOrm open source project, we need to do
a sparse query for a join and so we send out 100 async requests and cache up
the java Future objects and return the first needed result back without
waiting for the others. With the S-SQL in playOrm, we have the IN
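A minimal Python sketch of the pattern described above (names and the fetch function are invented for illustration; playOrm itself is Java and works against real remote reads):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(key):
    # Stand-in for one async remote read.
    return f"value-for-{key}"

with ThreadPoolExecutor(max_workers=8) as pool:
    # Fire many requests at once and cache the Future objects.
    futures = {key: pool.submit(fetch, key) for key in range(100)}
    # Block only on the first needed result; the rest keep running.
    first_needed = futures[0].result()
```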
Hi,
I am new to Java and trying to get the Astyanax client running for Cassandra.
I downloaded Astyanax from https://github.com/Netflix/astyanax. How do I
compile the source code from there in a very simple fashion from the
Linux command line?
Thanks.
I have a hunch that the SSTable selection based on the Min and Max keys in
ColumnFamilyStore.markReferenced() means that a higher false positive has
less of an impact.
it's just a hunch, i've not tested it.
For leveled compaction, yes. For non-leveled, I can't see how it would
since each
On Fri, Sep 14, 2012 at 12:28:08PM -0400, A J wrote:
Hi,
I am new to Java and trying to get the Astyanax client running for Cassandra.
I downloaded Astyanax from https://github.com/Netflix/astyanax. How do I
compile the source code from there in a very simple fashion from the
Linux command line
I didn't need to compile it. It is up in the maven repositories as we
http://mvnrepository.com/artifact/com.netflix.astyanax/astyanax
Or are you trying to see how it works? (We use the same client on playORM
open source project; it works like a charm).
Dean
On 9/14/12 10:28 AM, A J
On Fri, Sep 14, 2012 at 10:49 AM, Hiller, Dean dean.hil...@nrel.gov wrote:
I didn't need to compile it. It is up in the maven repositories as we
http://mvnrepository.com/artifact/com.netflix.astyanax/astyanax
Actually, yeah, that's what I ended up doing with my ghetto set up
too, but I did
Do the row size stats reported by 'nodetool cfstats' include the
effect of compression?
Thanks,
Jim
Hi all,
Does minor compaction delete expired column tombstones when the row is
also present in another SSTable which is not subject to the minor
compaction?
Example:
Say there are 5 SStables:
- Customers_0 (10 MB)
- Customers_1 (10 MB)
- Customers_2 (10 MB)
- Customers_3 (10 MB)
- Customers_4
I'm trying to do a bulk load from a Cassandra/Hadoop job using the
BulkOutputFormat class.
It appears that the reducers are generating the SSTables, but loading them
into the cluster is failing:
12/09/14 14:08:13 INFO mapred.JobClient: Task Id :
attempt_201208201337_0184_r_04_0, Status :
I'm building a new cluster (to replace the broken setup I've written
about in previous posts) that will consist of only two nodes. I understand
that I'll be sacrificing high availability of writes if one of the nodes
goes down, and I'm okay with that. I'm more interested in maintaining high
Thanks for the inputs.
The disk on the EC2 node failed. This led to the problem. Now I have
created a new Cassandra node and added it to the cluster.
Do I need to do anything to delete the old node from the cluster, or will
the cluster balance itself?
Asking this since in Datastax ops center its
You will need to run nodetool removetoken with the old node's token to
permanently remove it from the cluster.
On Fri, Sep 14, 2012 at 3:06 PM, rohit reddy rohit.kommare...@gmail.com wrote:
Thanks for the inputs.
The disk on the EC2 node failed. This led to the problem. Now I have
created a
Hi--
We are iterating rows in a column family two different ways and are
seeing radically different row counts. We are using 1.0.8 and
RandomPartitioner on a 3-node cluster.
In the first case, we have a trivial Hadoop job that counts 29M rows
using the standard MR pattern for counting
Are there any deletions in your data? The Hadoop support doesn't filter out
tombstones, though you may not be filtering them out in your code either. I've
used the hadoop support for doing a lot of data validation in the past and as
long as you're sure that the code is sound, I'm pretty
A couple of guesses:
- are you mixing versions of Cassandra? Streaming differences between versions
might throw this error. That is, are you bulk loading with one version of
Cassandra into a cluster that's a different version?
- (shot in the dark) is your cluster overwhelmed for some reason?