database design

2011-04-13 Thread Jean-Yves LEBLEU
Hi all,

Just some thoughts and a question about Cassandra data modeling.

If I understand correctly, Cassandra is better at writing than at reading,
so you have to think about your queries when designing a Cassandra schema.
We are doing incremental design: our system is already in production and
we have to develop new queries.
What do you usually do when you have new queries? Do you write a
specific job to update the data in the database to match the new query
you are writing?

Thanks for your help.

Jean-Yves


Renaming keyspace

2011-04-05 Thread Jean-Yves LEBLEU
Hi all,

We are running 0.6.12. Is there any particular precaution to take when
renaming a keyspace? Is it enough to shut down Cassandra, update
storage-conf.xml, rename the data directory, and start Cassandra again?
Thanks for your help.
Jean-Yves


Re: Calculate memory used for keycache

2011-03-15 Thread Jean-Yves LEBLEU
One additional question: I don't really understand what is in the key
cache. I have a column family with only one key, and the key cache size
is 118...?
Any idea?
Thanks.
Jean-Yves


Out of Memory every 2 weeks

2011-03-14 Thread Jean-Yves LEBLEU
Sorry to create a new thread about an Out of Memory problem, but I
checked all the other threads and did not find the answer.

We have a running cluster of 2 Cassandra nodes (replication factor = 2)
on Red Hat 4.8, 32-bit, with 4 GB of memory. We periodically run out of
memory (about every 2 weeks) and both nodes crash (trace at the end of
this message).

We are running 0.6.12 (we have not had the time to upgrade to 0.7.3) and we
followed the memory configuration from the wiki:

memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal caches

memtable_throughput_in_mb = 16
We have 11 column families

So if my calculation is right, I need 16*3*11 MB + 1G + internal caches
= 528 MB + 1G + internal caches, so more than 1.5 GB of heap memory.

We start with 2G.
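
The rule of thumb above can be checked with a few lines of arithmetic. This is only a sketch: the function name `estimated_heap_mb` is hypothetical, the "1G" term is assumed to mean 1024 MB, and the internal caches are left as a separate input because their size is not given in this message.

```python
def estimated_heap_mb(memtable_throughput_in_mb, hot_cfs, internal_caches_mb=0):
    """Wiki rule of thumb: throughput * 3 * hot CFs + 1G + internal caches."""
    memtables_mb = memtable_throughput_in_mb * 3 * hot_cfs
    base_mb = 1024  # the fixed "1G" term, assumed here to be 1024 MB
    return memtables_mb + base_mb + internal_caches_mb


# Values from this message: 16 MB throughput and 11 column families.
# The cache size is unknown, so it is left at 0.
print(estimated_heap_mb(16, 11))  # 1552 (MB), before internal caches
```

With these inputs the estimate stays below the 2000 MB heap configured below, as long as the internal caches take less than about 450 MB.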

Here are the cassandra.in.sh parameters:

JVM_OPTS= \
-ea \
-Xms2000M \
-Xmx2000M \
-XX:+UseParNewGC \
-XX:+UseConcMarkSweepGC \
-XX:+CMSParallelRemarkEnabled \
-XX:SurvivorRatio=8 \
-XX:MaxTenuringThreshold=1 \
-XX:CMSInitiatingOccupancyFraction=75 \
-XX:+UseCMSInitiatingOccupancyOnly \
-XX:+HeapDumpOnOutOfMemoryError \


I don't really understand what the configuration problem is. If
somebody has any clue about what we could do, or what we should
monitor to avoid the problem, it would help a lot: the problem is very
difficult to reproduce because it does not happen very often.

Thanks for any help.
Jean-Yves


-- stack trace below --


 INFO [FLUSH-WRITER-POOL:1] 2011-03-04 17:59:45,952 Memtable.java (line 152) Writing Memtable-HintsColumnFamily@15830055(2280 bytes, 240 operations)
 INFO [FLUSH-WRITER-POOL:1] 2011-03-04 17:59:45,972 Memtable.java (line 166) Completed flushing /opt/database/data/system/HintsColumnFamily-164-Data.db (255 bytes)
 INFO [FLUSH-TIMER] 2011-03-04 18:17:47,802 ColumnFamilyStore.java (line 478) VoiceMail has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/opt/database/commitlog/CommitLog-1297775500410.log', position=10407973)
 INFO [FLUSH-TIMER] 2011-03-04 18:17:47,803 ColumnFamilyStore.java (line 748) Enqueuing flush of Memtable-VoiceMail@29468188(3000 bytes, 120 operations)
 INFO [FLUSH-WRITER-POOL:1] 2011-03-04 18:17:47,803 Memtable.java (line 152) Writing Memtable-VoiceMail@29468188(3000 bytes, 120 operations)
ERROR [FLUSH-WRITER-POOL:1] 2011-03-04 18:17:47,824 CassandraDaemon.java (line 87) Uncaught exception in thread Thread[FLUSH-WRITER-POOL:1,5,main]
java.lang.OutOfMemoryError
    at java.io.RandomAccessFile.readBytes(Native Method)
    at java.io.RandomAccessFile.read(RandomAccessFile.java:322)
    at org.apache.cassandra.io.util.BufferedRandomAccessFile.fillBuffer(BufferedRandomAccessFile.java:209)
    at org.apache.cassandra.io.util.BufferedRandomAccessFile.seek(BufferedRandomAccessFile.java:246)
    at org.apache.cassandra.io.util.BufferedRandomAccessFile.writeAtMost(BufferedRandomAccessFile.java:389)
    at org.apache.cassandra.io.util.BufferedRandomAccessFile.write(BufferedRandomAccessFile.java:365)
    at java.io.DataOutputStream.writeUTF(DataOutputStream.java:384)
    at java.io.RandomAccessFile.writeUTF(RandomAccessFile.java:1064)
    at org.apache.cassandra.io.SSTableWriter.append(SSTableWriter.java:94)
    at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:162)
    at org.apache.cassandra.db.Memtable.access$000(Memtable.java:46)
    at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:178)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
 INFO [main] 2011-03-07 17:46:09,454 CLibrary.java (line 47) JNA not found. Native methods will be disabled.
 INFO [main] 2011-03-07 17:46:09,768 DatabaseDescriptor.java (line 277) DiskAccessMode 'auto' determined to be standard, indexAccessMode is standard


Re: Out of Memory every 2 weeks

2011-03-14 Thread Jean-Yves LEBLEU
Thank you,
I am going to try that.


cassandra 0.6.11 binary package problem

2011-02-03 Thread Jean-Yves LEBLEU
Hi all,

Just for info, apache-cassandra-0.6.11-bin.tar.gz contains both
apache-cassandra-0.6.10.jar and apache-cassandra-0.6.11.jar in the
lib directory.

This causes trouble for my upgrade scripts, which use this file to get the
installed version and check whether an upgrade is needed. :(
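
One way such a version check could tolerate a stray older jar is to collect every matching jar name and take the highest version. The sketch below is an illustration only, not the actual upgrade script: the names `JAR_PATTERN`, `installed_versions`, and `newest_version` are hypothetical, and the file list is passed in directly (in practice it would come from listing the lib directory).

```python
import re

# Matches names such as "apache-cassandra-0.6.11.jar" and captures "0.6.11".
JAR_PATTERN = re.compile(r"^apache-cassandra-(\d+(?:\.\d+)*)\.jar$")


def installed_versions(jar_names):
    """Return every Cassandra version found in jar_names, sorted ascending."""
    versions = []
    for name in jar_names:
        match = JAR_PATTERN.match(name)
        if match:
            # Compare as integer tuples so that 0.6.10 sorts after 0.6.9.
            versions.append(tuple(int(part) for part in match.group(1).split(".")))
    return sorted(versions)


def newest_version(jar_names):
    """Return the highest version as a string, or None if no jar matches."""
    versions = installed_versions(jar_names)
    if not versions:
        return None
    return ".".join(str(part) for part in versions[-1])


lib = ["apache-cassandra-0.6.10.jar", "apache-cassandra-0.6.11.jar", "log4j-1.2.14.jar"]
print(newest_version(lib))  # 0.6.11
```

Comparing integer tuples rather than strings matters here, because a plain string comparison would rank 0.6.9 above 0.6.10.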

Thanks for the good job.
Jean-Yves


Re: cassandra 0.6.11 binary package problem

2011-02-03 Thread Jean-Yves LEBLEU
I don't know, I only checked
http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.6.11/apache-cassandra-0.6.11-bin.tar.gz
Rgds.
JY

On Thu, Feb 3, 2011 at 3:36 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Well, that's odd. :)

 Do any of the other tar.gz balls contain multiple jars?

 On Thu, Feb 3, 2011 at 6:06 AM, Jean-Yves LEBLEU jleb...@gmail.com wrote:
 Hi all,

 Just for info, in apache-cassandra-0.6.11-bin.tar.gz there are both
 apache-cassandra-0.6.10.jar  and apache-cassandra-0.6.11.jar in the
 lib directory.

 Causing troubles to my upgrade scripts which use this file to get
 installed version and check if upgrade needed . :(

 Thanks for the good job.
 Jean-Yves




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-20 Thread Jean-Yves LEBLEU
Java + Pelops
Cassandra 0.6.8


Re: Data management on a ring

2010-11-10 Thread Jean-Yves LEBLEU
Thanks for the answer.

That was not exactly my point. I would like to know whether, in a 10-node
ring, it is possible to restrict replication of some data to only 2 nodes
and replicate other data to all nodes.
Regards.
Jean-Yves

On Wed, Nov 10, 2010 at 11:17 AM, aaron morton aa...@thelastpickle.com wrote:

 If I understand you correctly, you just want to add 8 nodes to a ring that
 already has 2?

 You could add the nodes and manually assign them tokens following the
 guidelines here http://wiki.apache.org/cassandra/Operations

 I'm not sure how to ensure the minimum amount of data transfer though.
 Adding all 8 at once is probably a bad idea.

 How about you make a new cluster of 8 nodes, manually assign tokens and
 then copy the data from the 2 node ring to the 8 node. Then move the 2
 original nodes into the new cluster?

 Hope that helps.
 Aaron

 On 10 Nov 2010, at 20:56, Jean-Yves LEBLEU wrote:

  Hello all,
 
  We have an installation of 10 nodes, and we choose to deploy 5 rings of 2
  nodes.
 
  We would like to change to a ring of 10 nodes.
 
  Some data have to be replicated on the 10 nodes, some should stay on 2
  nodes. Do you have any idea or documentation pointer in order to have a ring
  of 10 nodes with such data repartition ?
 
  Thanks for any answer.
 
  Jean-Yves
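
For the manual token assignment mentioned in the quoted reply, evenly spaced tokens can be computed as below. This sketch assumes the cluster uses RandomPartitioner, whose token range is 0 to 2**127. The function name `balanced_tokens` is hypothetical, and nothing here addresses the separate question of keeping some data on only 2 nodes.

```python
def balanced_tokens(node_count):
    """Evenly spaced initial tokens over the RandomPartitioner range [0, 2**127)."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    step = 2 ** 127 // node_count
    return [step * i for i in range(node_count)]


# One token per node for a 10-node ring.
for token in balanced_tokens(10):
    print(token)
```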




Data management on a ring

2010-11-09 Thread Jean-Yves LEBLEU
Hello all,

We have an installation of 10 nodes, and we chose to deploy 5 rings of 2
nodes.

We would like to change to a single ring of 10 nodes.

Some data has to be replicated on all 10 nodes; some should stay on 2
nodes. Do you have any ideas or documentation pointers on how to set up a
10-node ring with such a data distribution?

Thanks for any answer.

Jean-Yves


Changing column families and drain

2010-10-13 Thread Jean-Yves LEBLEU
Hi all,

According to the wiki, the procedure to change column families is:


   1. Empty the commitlog with nodetool drain.
   2. Shutdown Cassandra and verify that there is no remaining data in the
   commitlog.
   3. Delete the sstable files (-Data.db, -Index.db, and -Filter.db) for any
   CFs removed, and rename the files for any CFs that were renamed.
   4. Make necessary changes to your storage-conf.xml.
   5. Start Cassandra back up and your edits should take effect.
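
For step 3, the files belonging to one column family can be picked out by name. The sketch below assumes sstable files are named `<ColumnFamily>-<number>-<Component>.db`, which matches names like HintsColumnFamily-164-Data.db seen in Cassandra 0.6 logs. The names `SSTABLE_COMPONENTS` and `sstable_files_for` are hypothetical, and the code only lists candidates: it deletes and renames nothing.

```python
import re

# The three sstable components named in step 3 of the procedure.
SSTABLE_COMPONENTS = ("Data", "Index", "Filter")


def sstable_files_for(cf_name, file_names):
    """Return the sstable file names that belong to column family cf_name."""
    pattern = re.compile(
        r"^%s-\d+-(?:%s)\.db$" % (re.escape(cf_name), "|".join(SSTABLE_COMPONENTS))
    )
    return sorted(name for name in file_names if pattern.match(name))


# Example listing of a keyspace data directory (made-up file names).
files = [
    "VoiceMail-12-Data.db",
    "VoiceMail-12-Index.db",
    "VoiceMail-12-Filter.db",
    "VoiceMailBox-3-Data.db",
]
print(sstable_files_for("VoiceMail", files))
```

Anchoring the pattern on the full column family name followed by `-<number>-` keeps a column family such as the made-up VoiceMailBox from being matched when only VoiceMail is being removed.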



How do we check that the commitlog is empty? There still seem to be
files in the commitlog directory after a drain.

Is it necessary to shut down all nodes in a ring before changing the
storage-conf.xml files?

On Linux, is a kill -9 acceptable as a Cassandra shutdown?

Thanks for your answer.

Jean-Yves


Server takes a long time to answer

2010-08-02 Thread Jean-Yves LEBLEU
Hi all,

We have a Cassandra installation with two nodes in a ring, replication
factor = 2. Sometimes Cassandra becomes unresponsive: it takes
about three minutes to answer a get.
Do you have any idea what we should check when this happens, or what
could cause the problem?
We are using Cassandra release 0.6.3.
Thanks and regards.
Jean-Yves