Re: problems while TimeUUIDType-index-querying with two expressions

2011-03-15 Thread aaron morton
Perfectly reasonable, created 
https://issues.apache.org/jira/browse/CASSANDRA-2328

Aaron
On 15 Mar 2011, at 16:52, Jonathan Ellis wrote:

 Sounds like we should send an InvalidRequestException then.
 
 On Mon, Mar 14, 2011 at 8:06 PM, aaron morton aa...@thelastpickle.com wrote:
 It's failing when comparing two TimeUUID values because one of them is not
 properly formatted. In this case it's comparing a stored value with the
 value passed in the get_indexed_slice() query expression.
 I'm going to assume it's the value passed for the expression.
 When you create the IndexedSlicesQuery, this is incorrect:
 IndexedSlicesQuery<String, byte[], byte[]> indexQuery = HFactory
 .createIndexedSlicesQuery(keyspace,
 stringSerializer, bytesSerializer, bytesSerializer);
 Use a UUIDSerializer for the last param, and then pass the UUID you want to
 build the expression with, rather than the string/byte value you are passing.
 Hope that helps.
 Aaron
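To make aaron's formatting point concrete: TimeUUIDType expects the 16 raw bytes of the UUID, not its 36-character string form. A minimal, self-contained sketch of that conversion (this is roughly what a UUID serializer does; the class name and the UUID literal below are illustrative, not Hector's API):

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class TimeUuidBytes {
    // Serialize a UUID into the 16 raw big-endian bytes that
    // TimeUUIDType expects on the wire.
    static byte[] toBytes(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return buf.array();
    }

    public static void main(String[] args) {
        // Illustrative version-1 (time-based) UUID, not one from the thread.
        UUID uuid = UUID.fromString("00000000-0000-1000-8000-000000000000");
        byte[] raw = toBytes(uuid);
        // Passing uuid.toString().getBytes() instead would yield 36 bytes,
        // which TimeUUIDType rejects as improperly formatted.
        assert raw.length == 16;
        System.out.println(raw.length);  // prints 16
    }
}
```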
 On 15 Mar 2011, at 04:17, Johannes Hoerle wrote:
 
 Hi all,
 
 in order to improve our queries, we started to use IndexedSliceQueries from
 the hector project (https://github.com/zznate/hector-examples). I followed
 the instructions for creating IndexedSlicesQuery with
 GetIndexedSlices.java.
 I created the corresponding CF in a keyspace called “Keyspace1”
 (“create keyspace Keyspace1;”) with:
 create column family Indexed1 with column_type='Standard' and
 comparator='UTF8Type' and keys_cached=20 and read_repair_chance=1.0 and
 rows_cached=2 and column_metadata=[{column_name: birthdate,
 validation_class: LongType, index_name: dateIndex, index_type:
 KEYS},{column_name: birthmonth, validation_class: LongType, index_name:
 monthIndex, index_type: KEYS}];
 and the example GetIndexedSlices.java worked fine.
 
 Output of CF Indexed1:
 ---
 [default@Keyspace1] list Indexed1;
 Using default limit of 100
 ---
 RowKey: fake_key_12
 = (column=birthdate, value=1974, timestamp=1300110485826059)
 = (column=birthmonth, value=0, timestamp=1300110485826060)
 = (column=fake_column_0, value=66616b655f76616c75655f305f3132,
 timestamp=1300110485826056)
 = (column=fake_column_1, value=66616b655f76616c75655f315f3132,
 timestamp=1300110485826057)
 = (column=fake_column_2, value=66616b655f76616c75655f325f3132,
 timestamp=1300110485826058)
 ---
 RowKey: fake_key_8
 = (column=birthdate, value=1974, timestamp=1300110485826039)
 = (column=birthmonth, value=8, timestamp=1300110485826040)
 = (column=fake_column_0, value=66616b655f76616c75655f305f38,
 timestamp=1300110485826036)
 = (column=fake_column_1, value=66616b655f76616c75655f315f38,
 timestamp=1300110485826037)
 = (column=fake_column_2, value=66616b655f76616c75655f325f38,
 timestamp=1300110485826038)
 ---
 
 
 
 Now to the problem:
 As we have another column format in our cluster (using TimeUUIDType as
 comparator in CF definition) I adapted the application to our schema on a
 cassandra-0.7.3 cluster.
 We use a manually defined UUID for a mandator id index
 (--1000--) and another one for a userid index
 (0001--1000--). It can be created with:
 create column family ByUser with column_type='Standard' and
 comparator='TimeUUIDType' and keys_cached=20 and read_repair_chance=1.0
 and rows_cached=2 and column_metadata=[{column_name:
 --1000--, validation_class: BytesType,
 index_name: mandatorIndex, index_type: KEYS}, {column_name:
 0001--1000--, validation_class: BytesType,
 index_name: useridIndex, index_type: KEYS}];
 
 
 which looks in the cluster using cassandra-cli like this:
 
 [default@Keyspace1] describe keyspace;
 Keyspace: Keyspace1:
   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
 Replication Factor: 1
   Column Families:
 ColumnFamily: ByUser
   Columns sorted by: org.apache.cassandra.db.marshal.TimeUUIDType
   Row cache size / save period: 2.0/0
   Key cache size / save period: 20.0/14400
   Memtable thresholds: 0.2953125/63/1440
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 0.01
   Built indexes: [ByUser.mandatorIndex, ByUser.useridIndex]
   Column Metadata:
 Column Name: 0001--1000--
   Validation Class: org.apache.cassandra.db.marshal.BytesType
   Index Name: useridIndex
   Index Type: KEYS
 Column Name: --1000--
   Validation Class: org.apache.cassandra.db.marshal.BytesType
   Index Name: mandatorIndex
   Index Type: KEYS
 ColumnFamily: Indexed1
   Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
   Row cache size / save period: 2.0/0
   Key cache size / save period: 20.0/14400
   Memtable thresholds: 0.2953125/63/1440
   GC grace seconds: 864000
   

Re: Linux HugePages and mmap

2011-03-15 Thread Oleg Anastasyev
mcasandra mohitanchlia at gmail.com writes:

 
 Thanks! I think it still is a good idea to enable HugePages and use the
 UseLargePages option in the JVM. What do you think?

I experimented with it. It was about 10% performance improvement. But this was
on 100% row cache hit. On smaller cache hit ratios the performance gain will be
smaller, I believe.
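For anyone wanting to try this, large pages need both a kernel reservation and a JVM flag. A rough sketch (the page count and the jar name are illustrative; size the reservation to your heap):

```shell
# Reserve huge pages in the kernel (2 MB pages on most x86 Linux;
# the count here is illustrative and must cover the heap)
echo 3072 > /proc/sys/vm/nr_hugepages

# Start the JVM with large pages enabled
java -Xms6G -Xmx6G -XX:+UseLargePages -jar app.jar
```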



Re: running all unit tests

2011-03-15 Thread aaron morton
There is a test target in the build script. 

Aaron

On 15 Mar 2011, at 17:29, Jeffrey Wang wrote:

 Hey all,
  
 We’re applying some patches to our own branch of Cassandra, and we are 
 wondering if there is a good way to run all the unit tests. Just having JUnit 
 run all the test classes seems to result in a lot of errors that are hard to 
 fix, so I’m hoping there’s an easy way to do this. Thanks!
  
 -Jeffrey
  



Re: reducing disk usage advice

2011-03-15 Thread Sylvain Lebresne
On Mon, Mar 14, 2011 at 8:17 PM, Karl Hiramoto k...@hiramoto.org wrote:
 On 03/14/11 15:33, Sylvain Lebresne wrote:

 CASSANDRA-1537 is probably also a partial but possibly sufficient
 solution. That's also probably easier than CASSANDRA-1610, and I'll try
 to give it a shot asap; it has been on my todo list for way too long.

 Thanks, eager to see CASSANDRA-1610 someday. What I've been doing for the
 last day has been multiple restarts across the cluster when one node's
 data/ dir gets to 150GB. Restarting cassandra brings the node's data/
 directory down to around 60GB; I see cassandra deleting a lot of
 SSTables on startup.

This is because cassandra lazily removes the compacted files. You don't have to
restart a node though; you can just trigger a full Java GC through jconsole and
this should remove the files.


 One question: since I use a TTL, is it safe to set GCGraceSeconds to 0? I
 don't manually delete ever, I just rely on the TTL for deletion, so are
 forgotten deletes an issue?


 The rule is this. Say you think that m is a reasonable value for
 GCGraceSeconds. That is, you make sure that you'll always put back up
 failing nodes and run repair within m seconds. Then, if you always use
 a TTL of n (in your case 24 hours), the actual GCGraceSeconds that you
 should set is m - n.

 So putting a GCGrace of 0 in your case would be roughly equivalent to
 setting a GCGrace of 24h on a normal CF. That's probably a bit low.

 What do you mean by normal? If I were to set GCGrace to 0, would I risk
 data corruption? Wouldn't setting GCGrace to 0 help reduce disk space
 pressure?

Actually, if you really only use TTL on that column family and you always set
the same TTL, it's ok to set a GCGrace of 0.
If you don't always put the same TTL, the kind of scenario that could happen
is this:
  - you insert a column with ttl=24h.
  - after 3h, you overwrite the column with a ttl of 2h.
At that point, you expect that you have updated the column's ttl so that it
will only live 2 more hours. However, if you have GCGrace=0 and you are
unlucky enough that a node got the first insert but not the second one, and
stays dead for more than 2h, then when you bring it back up, it will not
receive anything related to the second insert, because the column has expired
and no tombstone has been created for it (since GCGrace=0), and thus the
initial column will reappear (for a few hours, but still). If you have a
bigger value for GCGrace, then when the failing node is back up, it will
receive a tombstone for the second insert and thus will not make the first
insert reappear.

So the rule is: if you never lower the TTL of a column (extending it is
fine), you can safely set GCGrace to 0.
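Sylvain's earlier rule (choose m, the repair window you can guarantee, and set GCGraceSeconds to m - n when every column carries TTL n) is simple arithmetic. A sketch, using the ten-day default grace and the 24h TTL from this thread as example numbers:

```java
public class GcGrace {
    // m = seconds within which failed nodes are guaranteed to be
    // repaired; n = TTL applied to every column. The grace period to
    // configure is m - n (never negative).
    static long gcGraceForTtl(long repairWindowSeconds, long ttlSeconds) {
        return Math.max(0, repairWindowSeconds - ttlSeconds);
    }

    public static void main(String[] args) {
        long m = 10 * 24 * 3600; // default gc_grace: 864000s (ten days)
        long n = 24 * 3600;      // the 24h TTL from this thread
        assert gcGraceForTtl(m, n) == 777600;
        System.out.println(gcGraceForTtl(m, n)); // prints 777600
    }
}
```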


OldNetworkTopologyStrategy with one data center

2011-03-15 Thread Jonathan Colby
Hi -

I have a question. Obviously there is no purpose in running
OldNetworkTopologyStrategy in one data center.  However,  we want to
share the same configuration in our production (multiple data centers)
and pre-production (one data center) environments.

My question is will
org.apache.cassandra.locator.OldNetworkTopologyStrategy function with
one data center and RackInferringSnitch?

Jon


AW: problems while TimeUUIDType-index-querying with two expressions

2011-03-15 Thread Roland Gude
Actually it's not the column values that should be UUIDs in our case, but the
column names. The CF uses TimeUUID ordering and the values are just some
byte arrays. Even after changing the code to use UUIDSerializer instead of
serializing the UUIDs manually, the issue still exists.

As far as I can see, there is nothing wrong with the IndexExpression:
using two index expressions with name=TimeUUID and value=anything does not work;
using one index expression (either one of the two) alone works fine.

I refactored Johannes' code into a JUnit test case. It needs the cluster
configured as described in Johannes' mail.
There are three cases: two with one of the index expressions, and one with
both index expressions. The one with both IndexExpressions will never finish,
and you will see the exception in the Cassandra logs.

Bye,
roland

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Tuesday, 15 March 2011 07:54
To: user@cassandra.apache.org
Cc: Juergen Link; Roland Gude; her...@datastax.com
Subject: Re: problems while TimeUUIDType-index-querying with two expressions

[snip: quoted text identical to the original thread above]

Re: Calculate memory used for keycache

2011-03-15 Thread Jean-Yves LEBLEU
One additional question: I don't really understand what is in the key
cache. I have a column family with only one key, and the keycache size
is 118 ... ?
Any idea?
Thanks.
Jean-Yves


Move token to another node

2011-03-15 Thread ruslan usifov
Hello

I have the following task: I want to move a token from one node to another.
How can I do that?


A way to break the cluster

2011-03-15 Thread Patrik Modesto
Hi,

I managed to break my test cluster again. It's really strange. I use
cassandra 0.7.3. This is what I did:

- install node1
- install node2, auto_bootstrap: true
- install node3, auto_bootstrap: true

- created a keyspace with RF 1, populate with data
- create a keyspace with RF 3, populate with data

- nodetool -h node1 loadbalance - after an hour of doing nothing I
killed the nodetool, nodetool ring shows node1 as Up/Leaving
- restarted node1
- nodetool ring shows node1 as Up/Normal

From now on, nodetool loadbalance or decommission fail with:

nodetool -h skhd1.dev move 0
Exception in thread "main" java.lang.IllegalStateException:
replication factor (3) exceeds number of endpoints (2)
at 
org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60)
at 
org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:929)
at 
org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:895)
at 
org.apache.cassandra.service.StorageService.startLeaving(StorageService.java:1595)
at 
org.apache.cassandra.service.StorageService.move(StorageService.java:1733)
at 
org.apache.cassandra.service.StorageService.move(StorageService.java:1708)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at 
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

nodetool repair passes, but nothing changed.

Regards,
Patrik


Re: Move token to another node

2011-03-15 Thread Sasha Dolgy
Hi Ruslan,

nodetool -h <target node> move <newtoken>

-sd


On Tue, Mar 15, 2011 at 11:23 AM, ruslan usifov ruslan.usi...@gmail.com wrote:
 Hello

 I have follow task. I want to move token from one node to another how can i
 do that?


jna and swapping

2011-03-15 Thread Daniel Doubleday
Hi all

strange things here: we are using JNA. The log file says mlockall was
successful. We start with -Xms2000M -Xmx2000M and run cassandra as a root
process, so the RLIMIT_MEMLOCK limit should have no relevance. Still,
cassandra is swapping ...

Used swap varies between 100MB - 800MB.

We removed the swap partition altogether now, but I still don't understand
why this happens.

We see this on nodes with a longer uptime (> 2 weeks).

Here's some process info: 

top - 14:27:35 up 146 days,  3:02,  1 user,  load average: 0.89, 0.97, 0.93
Tasks: 122 total,   1 running, 121 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.4%us,  0.6%sy,  0.0%ni, 85.5%id, 12.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   6128360k total,  5852408k used,   275952k free, 4472k buffers
Swap:  1951892k total,   231008k used,  1720884k free,  1576720k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
29757 root  18  -2  251g 3.7g 298m S    6 63.8 1590:17 java


blnrzh019:/var/log/cassandra# ps axxx|grep 29757
29757 ?SLl 1589:56 /usr/bin/java -ea -Xms2000M -Xmx2000M 
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled 
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly 
-Dcassandra.compaction.priority=1 -Dcassandra.dynamic_snitch=true 
-Dcom.sun.management.jmxremote.port=8080 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dstorage-config=/opt/smeet-cassandra/bin/../conf -cp 
/opt/smeet-cassandra/bin/../conf:/opt/smeet-cassandra/bin/../build/classes:/opt/smeet-cassandra/bin/../lib/antlr-3.1.3.jar:/opt/smeet-cassandra/bin/../lib/apache-cassandra-0.6.12-patched.jar:/opt/smeet-cassandra/bin/../lib/clhm-production.jar:/opt/smeet-cassandra/bin/../lib/commons-cli-1.1.jar:/opt/smeet-cassandra/bin/../lib/commons-codec-1.2.jar:/opt/smeet-cassandra/bin/../lib/commons-collections-3.2.1.jar:/opt/smeet-cassandra/bin/../lib/commons-lang-2.4.jar:/opt/smeet-cassandra/bin/../lib/google-collections-1.0.jar:/opt/smeet-cassandra/bin/../lib/hadoop-core-0.20.1.jar:/opt/smeet-cassandra/bin/../lib/high-scale-lib.jar:/opt/smeet-cassandra/bin/../lib/ivy-2.1.0.jar:/opt/smeet-cassandra/bin/../lib/jackson-core-asl-1.4.0.jar:/opt/smeet-cassandra/bin/../lib/jackson-mapper-asl-1.4.0.jar:/opt/smeet-cassandra/bin/../lib/jline-0.9.94.jar:/opt/smeet-cassandra/bin/../lib/jna-3.2.7.jar:/opt/smeet-cassandra/bin/../lib/jna.jar:/opt/smeet-cassandra/bin/../lib/json-simple-1.1.jar:/opt/smeet-cassandra/bin/../lib/libthrift-r917130.jar:/opt/smeet-cassandra/bin/../lib/log4j-1.2.14.jar:/opt/smeet-cassandra/bin/../lib/slf4j-api-1.5.8.jar:/opt/smeet-cassandra/bin/../lib/slf4j-log4j12-1.5.8.jar:/opt/smeet-cassandra/bin/../lib/smeet-cassandra-contrib.jar
 org.apache.cassandra.thrift.CassandraDaemon

blnrzh019:/var/log/cassandra# cat /proc/29757/smaps | grep -i swap | awk '{SUM +=
$2} END {print "SUM: " SUM " kB (" SUM/1024 " MB)"}'
SUM: 207844 kB (202.973 MB)

blnrzh019:/var/log/cassandra# grep JNA /var/log/cassandra/system.log*
/var/log/cassandra/system.log.1: INFO [main] 2011-01-27 17:38:11,201 
CLibrary.java (line 86) JNA mlockall successful
/var/log/cassandra/system.log.1: INFO [main] 2011-02-16 07:47:24,788 
CLibrary.java (line 86) JNA mlockall successful
/var/log/cassandra/system.log.1: INFO [main] 2011-02-18 12:29:39,958 
CLibrary.java (line 86) JNA mlockall successful
/var/log/cassandra/system.log.1: INFO [main] 2011-02-25 11:59:42,318 
CLibrary.java (line 86) JNA mlockall successful



Re: OldNetworkTopologyStrategy with one data center

2011-03-15 Thread Jonathan Ellis
Yes.

On Tue, Mar 15, 2011 at 4:29 AM, Jonathan Colby
jonathan.co...@gmail.com wrote:
 Hi -

 I have a question. Obviously there is no purpose in running
 OldNetworkTopologyStrategy in one data center.  However,  we want to
 share the same configuration in our production (multiple data centers)
 and pre-production (one data center) environments.

 My question is will
 org.apache.cassandra.locator.OldNetworkTopologyStrategy function with
 one data center and RackInferringSnitch?

 Jon




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


[RELEASE] 0.7.4

2011-03-15 Thread Eric Evans

Hot on the heels of 0.7.3, I'm pleased to announce 0.7.4, with bugs
fixed, optimizations made, and features added[1].

Upgrading from 0.7.3 is a snap, but if you're upgrading from an earlier
version, do pay special attention to the release notes[2].

If you spot any problems, let us know[3], and if you have any questions,
don't hesitate to ask.

Enjoy!


[1]: http://goo.gl/ZwACq (CHANGES.txt)
[2]: http://goo.gl/Ib28x (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA
[4]: http://cassandra.apache.org/download
[5]: http://wiki.apache.org/cassandra/DebianPackaging

-- 
Eric Evans
eev...@rackspace.com



where to find the stress testing programs?

2011-03-15 Thread Jonathan Colby
According to the Cassandra wiki and the O'Reilly book, there is supposedly a
contrib directory within the cassandra download containing the Python
stress test script, stress.py. It's not in the binary tarball of 0.7.3.

Anyone know where to find it?

Anyone know of other, maybe better stress testing scripts?

Jon


Re: where to find the stress testing programs?

2011-03-15 Thread Sasha Dolgy
the contrib folder is in the source tarball ...

On Tue, Mar 15, 2011 at 5:23 PM, Jonathan Colby
jonathan.co...@gmail.com wrote:
[snip: quoted text identical to the original message]


Re: where to find the stress testing programs?

2011-03-15 Thread Jeremy Hanna
contrib is only in the source download of cassandra

On Mar 15, 2011, at 11:23 AM, Jonathan Colby wrote:

[snip: quoted text identical to the original message]



nodetool repair on cluster

2011-03-15 Thread Huy Le
Hi,

We have a cluster with 12 servers and use RF=3.  When running nodetool
repair, do we have to run it on all nodes on the cluster or can we run on
every 3rd node?  Thanks!

Huy

-- 
Huy Le
Spring Partners, Inc.
http://springpadit.com


Re: nodetool repair on cluster

2011-03-15 Thread Daniel Doubleday
At least if you are using RackUnawareStrategy

Cheers,
Daniel

On Mar 15, 2011, at 6:44 PM, Huy Le wrote:

[snip: quoted text identical to the original message]



Re: Calculate memory used for keycache

2011-03-15 Thread Peter Schuller
 One additional question: I don't really understand what is in the key
 cache. I have a column family with only one key, and the keycache size
 is 118 ... ?

The key cache is basically a hash table mapping row keys to sstable
offsets. It avoids the need to read from the index portion of the
sstable for specific keys that have recently been accessed. (Normally
the index portion is seeked into, a bit of data is streamed from disk,
de-serialized, and used to find the offset for that key).
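A rough mental model of that mapping, as a toy sketch (this is not Cassandra's actual implementation; the row keys reused from the earlier cli listing are just examples):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyCacheSketch {
    // Toy model: the key cache maps a row key (per sstable) to the byte
    // offset of that row's data, skipping the index-file read on a hit.
    private final Map<String, Long> offsets = new HashMap<>();

    void cache(String rowKey, long sstableOffset) {
        offsets.put(rowKey, sstableOffset);
    }

    // Returns the cached offset, or null meaning "go read the index file".
    Long lookup(String rowKey) {
        return offsets.get(rowKey);
    }

    public static void main(String[] args) {
        KeyCacheSketch cache = new KeyCacheSketch();
        cache.cache("fake_key_12", 4096L);
        assert cache.lookup("fake_key_12") == 4096L;
        assert cache.lookup("fake_key_8") == null;  // miss: hit the index
        System.out.println(cache.lookup("fake_key_12")); // prints 4096
    }
}
```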

-- 
/ Peter Schuller


RE: running all unit tests

2011-03-15 Thread Jeffrey Wang
Awesome, thanks. I'm seeing some weird errors due to deleting commit logs, 
though (I'm running on Windows, which might have something to do with it):

[junit] java.io.IOException: Failed to delete C:\Documents and 
Settings\jwang\workspace-cass\Cassandra\Cassandra-0.7.0\build\test\cassandra\commitlog\CommitLog-1300214497376.log
[junit]   at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:54)
[junit]   at 
org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:201)
[junit]   at 
org.apache.cassandra.io.util.FileUtils.deleteRecursive(FileUtils.java:197)
[junit]   at 
org.apache.cassandra.CleanupHelper.cleanup(CleanupHelper.java:55)
[junit]   at 
org.apache.cassandra.CleanupHelper.cleanupAndLeaveDirs(CleanupHelper.java:41)

Does anyone know how to get these to work?

-Jeffrey

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Tuesday, March 15, 2011 1:26 AM
To: user@cassandra.apache.org
Subject: Re: running all unit tests

There is a test target in the build script.

Aaron

On 15 Mar 2011, at 17:29, Jeffrey Wang wrote:


Hey all,

We're applying some patches to our own branch of Cassandra, and we are 
wondering if there is a good way to run all the unit tests. Just having JUnit 
run all the test classes seems to result in a lot of errors that are hard to 
fix, so I'm hoping there's an easy way to do this. Thanks!

-Jeffrey




Re: Nodes frozen in GC

2011-03-15 Thread Peter Schuller
Sorry about the delay,

 I do believe there is a fundamental issue with compactions allocating too 
 much memory and incurring too many garbage collections (at least with 0.6.12).

[snip a lot of good info]

You certainly seem to have a real issue, though I don't get the feel
it's the same as the OP.

I don't think I can offer a silver bullet. I was going to suggest that
you're seeing rows that are large enough that you're taking young-gen
GCs prior to the completion of individual rows, so that the per-row
working set is promoted to old-gen, yet small enough (the rows) to be below
in_memory_compaction_limit_in_mb. But this seems inconsistent with the
fact that you report problems even with a huge new-gen (10 gig).

With the large new-gen, were you actually seeing fallbacks to full GC?
You weren't just still experiencing problems because, at 10 gig, the
new-gen will be so slow to collect as to effectively be similar to a full
gc in terms of affecting latency?

If there is any suspicion that the above is happening, maybe try
decreasing in_memory_compaction_limit_in_mb (preparing to see lots of
stuff logged to the console, assuming that's still happening in the 0.6
version you're running).

Also, you did mention taking into account tenuring into old-gen, so
maybe your observations there are inconsistent with the above
hypothesis too. But just one correction/note regarding this: You said
that:

   However, when the young generation is being collected (which
happens VERY often during compactions b/c allocation rate is so high),
objects are allocated directly into the tenured generation.

I'm not sure what you're basing that on, but unless I have fatally
failed to grok something fundamental about the interaction between
new-gen and old-gen with CMS, objects aren't being allocated *period*
while the young generation is being collected, as that is a
stop-the-world pause. (This is also why I said before that at a 10 gig
new-gen size, the observed behavior on young-gen collections may be
similar to fallback-to-full-gc cases, but not quite, since it would be
parallel rather than serial.)

Anyways, I sympathize with your issues and the fact that you don't
have time to start attaching with profilers etc. Unfortunately I don't
know what to suggest that is simpler than that.

-- 
/ Peter Schuller


Re: Move token to another node

2011-03-15 Thread ruslan usifov
2011/3/15 Sasha Dolgy sdo...@gmail.com

 Hi Ruslan,

 nodetool -h <target node> move <newtoken>


And how do I add a node to the cluster without a token?


Re: Move token to another node

2011-03-15 Thread Peter Schuller
 And how add node to cluster without token?

You must always have a token. You can bootstrap a node into the
cluster and have it auto-select a token by leaving the initial_token
setting empty, but that only makes sense if you want the new node to
evenly split the largest current range.

I think the general recommendation is to set the token manually to
avoid surprises.
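For N nodes under the 0.7-era RandomPartitioner, the usual manual assignment spaces tokens evenly over the 0..2^127 range. A sketch of that arithmetic:

```java
import java.math.BigInteger;

public class TokenCalc {
    // Evenly spaced initial_token values for RandomPartitioner:
    // token_i = i * 2^127 / nodeCount
    static BigInteger token(int i, int nodeCount) {
        return BigInteger.valueOf(2).pow(127)
                .multiply(BigInteger.valueOf(i))
                .divide(BigInteger.valueOf(nodeCount));
    }

    public static void main(String[] args) {
        // For a 2-node ring, tokens are 0 and 2^126
        assert token(0, 2).equals(BigInteger.ZERO);
        assert token(1, 2).equals(BigInteger.valueOf(2).pow(126));
        System.out.println(token(1, 2));
    }
}
```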

-- 
/ Peter Schuller


Re: running all unit tests

2011-03-15 Thread Jonathan Ellis
Are you still trying to run tests manually?  You need to enable
junit's flag for running each test class in a separate JVM.
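In Ant terms (which is what the build script's test target uses), per-class forking looks roughly like this. A sketch only: the classpath refid and the test directory path are assumptions, not the exact contents of Cassandra's build.xml:

```xml
<junit fork="yes" forkmode="perTest" failureproperty="testfailed">
  <classpath refid="cassandra.classpath"/>
  <batchtest>
    <fileset dir="test/unit" includes="**/*Test.java"/>
  </batchtest>
</junit>
```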

On Tue, Mar 15, 2011 at 1:44 PM, Jeffrey Wang jw...@palantir.com wrote:
[snip: quoted text identical to the earlier messages in this thread]

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: nodetool repair on cluster

2011-03-15 Thread aaron morton
AFAIK you should run it on every node. 
http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data

Aaron

On 16 Mar 2011, at 06:58, Daniel Doubleday wrote:

 At least if you are using RackUnawareStrategy
 
 Cheers,
 Daniel
 
 On Mar 15, 2011, at 6:44 PM, Huy Le wrote:
 
 Hi,
 
 We have a cluster with 12 servers and use RF=3.  When running nodetool 
 repair, do we have to run it on all nodes on the cluster or can we run on 
 every 3rd node?  Thanks!
 
 Huy
 
 -- 
 Huy Le 
 Spring Partners, Inc.
 http://springpadit.com 
 



Cassandra Crash

2011-03-15 Thread Sanjeev Kulkarni
Hey guys,
Have started facing a crash in my cassandra while reading. Here are the
details.
1. single node. replication factor of 1
2. Cassandra version 0.7.3
3. Single keyspace. 5 column families.
4. No super columns
5. My data model is a little bit skewed. It results in having several small
rows and one really big row(lots of columns). cfstats say that on one column
family the Compacted row maximum size is 5960319812. Not sure if this is the
problem.
6. Starting cassandra has no issues. I give a max heap size of 6G.
7. I then start reading a bunch of rows including the really long row. After
some point cassandra starts crashing.
The log is
ERROR [ReadStage:30] 2011-03-15 16:52:27,598 AbstractCassandraDaemon.java
(line 114) Fatal exception in thread Thread[ReadStage:30,5,main]
java.lang.AssertionError
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:72)
at
org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
at
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
at org.apache.cassandra.db.Table.getRow(Table.java:333)
at
org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
at
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
ERROR [ReadStage:13] 2011-03-15 16:52:27,600 AbstractCassandraDaemon.java
(line 114) Fatal exception in thread Thread[ReadStage:13,5,main]
java.lang.AssertionError
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:72)
at
org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
at
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
at org.apache.cassandra.db.Table.getRow(Table.java:333)
at
org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
at
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

and several more of these errors

Thanks for the help. Let me know if you need more info.


Re: [RELEASE] 0.7.4

2011-03-15 Thread Mark

Still not seeing 0.7.4 as a download option on the main site?

On 3/15/11 9:20 AM, Eric Evans wrote:

Hot on the heels of 0.7.3, I'm pleased to announce 0.7.4, with bugs
fixed, optimizations made, and features added[1].

Upgrading from 0.7.3 is a snap, but if you're upgrading from an earlier
version, do pay special attention to the release notes[2].

If you spot any problems, let us know[3], and if you have any questions,
don't hesitate to ask.

Enjoy!


[1]: http://goo.gl/ZwACq (CHANGES.txt)
[2]: http://goo.gl/Ib28x (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA
[4]: http://cassandra.apache.org/download
[5]: http://wiki.apache.org/cassandra/DebianPackaging



Re: nodetool repair on cluster

2011-03-15 Thread Jonathan Ellis
Right, every 3rd node is adequate with RackUnawareStrategy/SimpleStrategy,
since each node repairs all the ranges replicated to it.
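Jonathan's point — every 3rd node suffices with RF=3 — can be checked with a toy coverage model. SimpleStrategy placement is modeled here as "node i holds the primary ranges of nodes i, i-1, ..., i-RF+1 (mod N)", a simplification of real token ranges:

```python
def ranges_repaired(node: int, n: int, rf: int) -> set[int]:
    # A repair on a node covers every range that node holds a replica of:
    # its own primary range plus the rf-1 preceding nodes' primary ranges.
    return {(node - k) % n for k in range(rf)}

def covered(repair_nodes, n, rf):
    return set().union(*(ranges_repaired(j, n, rf) for j in repair_nodes))

# 12 nodes, RF=3: repairing every 3rd node covers all 12 primary ranges.
print(covered([0, 3, 6, 9], 12, 3) == set(range(12)))  # -> True
```

Note this only works out evenly when the node count is a multiple of RF; otherwise some ranges are missed or double-covered.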

On Tue, Mar 15, 2011 at 12:58 PM, Daniel Doubleday
daniel.double...@gmx.net wrote:
 At least if you are using RackUnawareStrategy
 Cheers,
 Daniel
 On Mar 15, 2011, at 6:44 PM, Huy Le wrote:

 Hi,

 We have a cluster with 12 servers and use RF=3.  When running nodetool
 repair, do we have to run it on all nodes on the cluster or can we run on
 every 3rd node?  Thanks!

 Huy

 --
 Huy Le
 Spring Partners, Inc.
 http://springpadit.com





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Cassandra Crash

2011-03-15 Thread Jonathan Ellis
Did you upgrade from an earlier version?  Did you read NEWS.txt?

On Tue, Mar 15, 2011 at 4:21 PM, Sanjeev Kulkarni sanj...@locomatix.com wrote:
 Hey guys,
 Have started facing a crash in my cassandra while reading. Here are the
 details.
 1. single node. replication factor of 1
 2. Cassandra version 0.7.3
 3. Single keyspace. 5 column families.
 4. No super columns
 5. My data model is a little bit skewed. It results in having several small
 rows and one really big row(lots of columns). cfstats say that on one column
 family the Compacted row maximum size is 5960319812. Not sure if this is the
 problem.
 6. Starting cassandra has no issues. I give a max heap size of 6G.
 7. I then start reading a bunch of rows including the really long row. After
 some point cassandra starts crashing.
 The log is
 ERROR [ReadStage:30] 2011-03-15 16:52:27,598 AbstractCassandraDaemon.java
 (line 114) Fatal exception in thread Thread[ReadStage:30,5,main]
 java.lang.AssertionError
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:72)
         at
 org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
         at
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
         at org.apache.cassandra.db.Table.getRow(Table.java:333)
         at
 org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
         at
 org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
         at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
         at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
 ERROR [ReadStage:13] 2011-03-15 16:52:27,600 AbstractCassandraDaemon.java
 (line 114) Fatal exception in thread Thread[ReadStage:13,5,main]
 java.lang.AssertionError
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:72)
         at
 org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
         at
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
         at org.apache.cassandra.db.Table.getRow(Table.java:333)
         at
 org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
         at
 org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
         at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
         at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
 and several more of these errors

 Thanks for the help. Let me know if you need more info.



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: [RELEASE] 0.7.4

2011-03-15 Thread Brandon Williams
On Tue, Mar 15, 2011 at 4:26 PM, Mark static.void@gmail.com wrote:

 Still not seeing 0.7.4 as a download option on the main site?


Artifacts are probably still syncing, try
http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.7.4/apache-cassandra-0.7.4-bin.tar.gz
until
one of them works.

-Brandon


Is column update column-atomic or row atomic?

2011-03-15 Thread buddhasystem
Sorry for the rather primitive question, but it's not clear to me if I need
to fetch the whole row, add a column as a dictionary entry and re-insert it
if I want to expand the row by one column. Help will be appreciated.


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Is-column-update-column-atomic-or-row-atomic-tp6174445p6174445.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Seed

2011-03-15 Thread mcasandra
That is from the wiki: http://wiki.apache.org/cassandra/StorageConfiguration



Re: Is column update column-atomic or row atomic?

2011-03-15 Thread Edward Capriolo
On Tue, Mar 15, 2011 at 5:46 PM, buddhasystem potek...@bnl.gov wrote:
 Sorry for the rather primitive question, but it's not clear to me if I need
 to fetch the whole row, add a column as a dictionary entry and re-insert it
 if I want to expand the row by one column. Help will be appreciated.




No. In Cassandra you do not need to read in order to write; updating a
single column is just a write. You should avoid read-before-write patterns
where possible.
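The semantics can be sketched with a toy store: a write merges the given columns into the row, so adding one column needs no prior read. (This is a model, not a real client; with pycassa the equivalent call is roughly cf.insert(key, {column: value}).)

```python
# Toy model of Cassandra's per-column write semantics (not a real client):
# an insert merges the supplied columns into the row rather than replacing
# the whole row, so no read is required before writing.
def insert(store: dict, row_key: str, columns: dict) -> None:
    store.setdefault(row_key, {}).update(columns)

rows = {}
insert(rows, "user1", {"name": "alice"})
insert(rows, "user1", {"email": "alice@example.com"})  # existing column kept
print(rows["user1"])
```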


Re: Is column update column-atomic or row atomic?

2011-03-15 Thread buddhasystem
Thanks. Can you give me a pycassa example, if possible?

Thanks!




Re: Cassandra Crash

2011-03-15 Thread Sanjeev Kulkarni
Hey Jonathan,
Thanks for the reply.
I was earlier running 0.7.2 and upgraded to 0.7.3. It looks like I had to
run nodetool scrub to sanitize the sstables because of the bloom filter
bug. I did that and the assertion error went away, but now I'm getting a
Java heap space OutOfMemoryError. I upgraded again to 0.7.4, which was just
released, but the OOM crash remains.
The log is attached below. Row caching is disabled and key caching is set
to the default (20). The max heap I'm giving is pretty large (6G). Do
you think reducing the key cache size will help?
Thanks again!

java.lang.OutOfMemoryError: Java heap space
at
org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:269)
at
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:315)
at
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:272)
at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:72)
at
org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:58)
at
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1353)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
at org.apache.cassandra.db.Table.getRow(Table.java:333)
at
org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
at
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)



On Tue, Mar 15, 2011 at 2:27 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Did you upgrade from an earlier version?  Did you read NEWS.txt?

 On Tue, Mar 15, 2011 at 4:21 PM, Sanjeev Kulkarni sanj...@locomatix.com
 wrote:
  Hey guys,
  Have started facing a crash in my cassandra while reading. Here are the
  details.
  1. single node. replication factor of 1
  2. Cassandra version 0.7.3
  3. Single keyspace. 5 column families.
  4. No super columns
  5. My data model is a little bit skewed. It results in having several
 small
  rows and one really big row(lots of columns). cfstats say that on one
 column
  family the Compacted row maximum size is 5960319812. Not sure if this is
 the
  problem.
  6. Starting cassandra has no issues. I give a max heap size of 6G.
  7. I then start reading a bunch of rows including the really long row.
 After
  some point cassandra starts crashing.
  The log is
  ERROR [ReadStage:30] 2011-03-15 16:52:27,598 AbstractCassandraDaemon.java
  (line 114) Fatal exception in thread Thread[ReadStage:30,5,main]
  java.lang.AssertionError
  at
 
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
  at
 
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
  at
 
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:72)
  at
 
 org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
  at
 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
  at org.apache.cassandra.db.Table.getRow(Table.java:333)
  at
 
 org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
  at
 
 org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
  at
  org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
  at
 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at
 
 

How does one node communicate with the other node?

2011-03-15 Thread Joshua Partogi
Hi there,

I am trying to understand the underlying architecture of Cassandra.
How does one node communicate with another? Does Cassandra use
Thrift or JMX to talk to other nodes?

Kind regards,
Joshua.
-- 
http://twitter.com/jpartogi


Re: How does one node communicate with the other node?

2011-03-15 Thread aaron morton
The internode messages use a custom binary format. In the code this lives
in the o.a.c.net package: the Message class is what gets sent around, and
MessagingService is the main class handling incoming and outgoing messages.

Each node listens for internode traffic on the storage_port and
listen_address set in conf/cassandra.yaml.
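The two settings mentioned above appear in conf/cassandra.yaml roughly like this (the address is an example value; 7000 is the stock default port):

```yaml
# conf/cassandra.yaml (example values)
listen_address: 192.168.1.10   # interface other nodes connect to
storage_port: 7000             # internode message port
```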

Hope that helps. 
Aaron

On 16 Mar 2011, at 12:44, Joshua Partogi wrote:

 Hi there,
 
 I am trying to understand the underlying architecture of Cassandra.
 How does one node communicate with another? Does Cassandra use
 Thrift or JMX to talk to other nodes?
 
 Kind regards,
 Joshua.
 -- 
 http://twitter.com/jpartogi



How does new node know about other hosts and joins the cluster

2011-03-15 Thread mcasandra
I am assuming it is the seed node that tells a new node who the other
members in the cluster are. Does the joining node then send a join message
(or something like that) to the other nodes, or is there a master
coordinator (as in a JBoss cluster) that tells the other nodes a new node
has joined?



Re: get_string_property in 0.6

2011-03-15 Thread Jonathan Ellis
[moving to user list]

The things we used to expose as strings now have proper data types and
their own methods, e.g. describe_cluster_name, describe_version, etc.
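A rough, partial mapping from the old 0.6 get_string_property keys to their 0.7 replacements; treat the key names and the describe_ring entry as best-effort assumptions, not an official table:

```python
# Hypothetical helper: maps 0.6 get_string_property() keys to the 0.7
# Thrift methods that replace them (partial, best-effort list).
REPLACEMENTS = {
    "cluster name": "describe_cluster_name",
    "version": "describe_version",
    "token map": "describe_ring",  # now per-keyspace, returns TokenRanges
}

def replacement_for(old_key: str) -> str:
    return REPLACEMENTS.get(old_key, "no direct 0.7 equivalent")

print(replacement_for("cluster name"))
```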

On Tue, Mar 15, 2011 at 8:40 PM, Anurag Gujral anurag.guj...@gmail.com wrote:
 Hi All,
           I am working on porting my cassandra application to work with
 cassandra version 0.7 from cassandra version 0.6.

 Is there any function corresponding to function get_string_property(which is
 available in 0.6) in 0.7.

 Thanks a ton,
 Anurag







-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: How does new node know about other hosts and joins the cluster

2011-03-15 Thread Robert Coli
On Tue, Mar 15, 2011 at 5:45 PM, mcasandra mohitanch...@gmail.com wrote:
 I am assuming it is the seed node that tells a new node who the other
 members in the cluster are. Does the joining node then send a join message
 (or something like that) to the other nodes, or is there a master
 coordinator (as in a JBoss cluster) that tells the other nodes a new node
 has joined?

When you have these sorts of questions about cassandra internals,
there are two strategies likely to be more effective than mailing the
-user mailing list.

a) join #cassandra on IRC and ask there

This alternative is superior because you can get into a discussion
with someone who can interactively help you find the information you
need.

b) read the source code

This alternative is superior because it is more certain to be correct
than any doc or mailing list poster, and you don't have to wait around
for an answer.

Example :

./src/java/org/apache/cassandra/gms/Gossiper.java

private class GossipTask implements Runnable
...
public void run()
...
    /* Gossip to some random live member */
    boolean gossipedToSeed = doGossipToLiveMember(prod);
...
    /* Gossip to a seed if we did not do so above, or we have seen less nodes
       than there are seeds.  ... */
...
    if (!gossipedToSeed || liveEndpoints.size() < seeds.size())
        doGossipToSeed(prod);


=Rob
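Stripped of messaging details, the excerpt's decision logic looks like this (a heavily simplified sketch; the real code builds and sends GossipDigestSyn messages rather than calling a send callback, and picks peers with its own randomness):

```python
import random

def gossip_round(live: set, seeds: set, send) -> None:
    # Sketch of Gossiper.GossipTask.run(): gossip to a random live node,
    # then also to a seed unless that pick already was a seed, or unless
    # we still know of fewer live nodes than there are seeds.
    gossiped_to_seed = False
    if live:
        peer = random.choice(sorted(live))
        send(peer)
        gossiped_to_seed = peer in seeds
    if not gossiped_to_seed or len(live) < len(seeds):
        send(random.choice(sorted(seeds)))
```

This is how a brand-new node bootstraps its view: with an empty live set, the first branch is skipped and it gossips straight to a seed, which answers with cluster state.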


Re: Cassandra Crash

2011-03-15 Thread Jonathan Ellis
I would scrub again w/ 0.7.4.

On Tue, Mar 15, 2011 at 6:38 PM, Sanjeev Kulkarni sanj...@locomatix.com wrote:
 Hey Jonathan,
 Thanks for the reply.
 I was earlier running 0.7.2 and upgraded it to 0.7.3. Looks like I had to
 run the nodetool scrub command to sanitize the sstables because of the
 bloomfilter bug. I did that and the Assert error went away but I'm getting
 Java Heap Space Out of Memory error. I again upgraded to 0.7.4 which is just
 released but the OOM crash remains.
 The log is attached below. Row caching is disabled and key caching is set to
 default(20). The max heap space that I'm giving is pretty large(6G). Do
 you think reducing the key caching will help?
 Thanks again!
 java.lang.OutOfMemoryError: Java heap space
         at
 org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:269)
         at
 org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:315)
         at
 org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:272)
         at
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
         at
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
         at
 org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:72)
         at
 org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:58)
         at
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1353)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
         at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
         at org.apache.cassandra.db.Table.getRow(Table.java:333)
         at
 org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
         at
 org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
         at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
         at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)


 On Tue, Mar 15, 2011 at 2:27 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Did you upgrade from an earlier version?  Did you read NEWS.txt?

 On Tue, Mar 15, 2011 at 4:21 PM, Sanjeev Kulkarni sanj...@locomatix.com
 wrote:
  Hey guys,
  Have started facing a crash in my cassandra while reading. Here are the
  details.
  1. single node. replication factor of 1
  2. Cassandra version 0.7.3
  3. Single keyspace. 5 column families.
  4. No super columns
  5. My data model is a little bit skewed. It results in having several
  small
  rows and one really big row(lots of columns). cfstats say that on one
  column
  family the Compacted row maximum size is 5960319812. Not sure if this is
  the
  problem.
  6. Starting cassandra has no issues. I give a max heap size of 6G.
  7. I then start reading a bunch of rows including the really long row.
  After
  some point cassandra starts crashing.
  The log is
  ERROR [ReadStage:30] 2011-03-15 16:52:27,598
  AbstractCassandraDaemon.java
  (line 114) Fatal exception in thread Thread[ReadStage:30,5,main]
  java.lang.AssertionError
          at
 
  org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
          at
 
  org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
          at
 
  org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:72)
          at
 
  org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
          at
 
  org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
          at
 
  org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
          at
 
  org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
          at
 
  org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
          at org.apache.cassandra.db.Table.getRow(Table.java:333)
          at
 
  org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
          at
 
  org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
          at
  

Re: [RELEASE] 0.7.4

2011-03-15 Thread Eric Evans
On Tue, 2011-03-15 at 14:26 -0700, Mark wrote:
 Still not seeing 0.7.4 as a download option on the main site?

Something about the site's pubsub isn't working; I'll contact INFRA.

-- 
Eric Evans
eev...@rackspace.com



Re: [RELEASE] 0.7.4

2011-03-15 Thread Eric Evans
On Tue, 2011-03-15 at 22:19 -0500, Eric Evans wrote:
 On Tue, 2011-03-15 at 14:26 -0700, Mark wrote:
  Still not seeing 0.7.4 as a download option on the main site?
 
 Something about the site's pubsub isn't working; I'll contact INFRA.

https://issues.apache.org/jira/browse/INFRA-3520

-- 
Eric Evans
eev...@rackspace.com