Re: column family names

2010-08-30 Thread Aaron Morton
Moving to the user list.

The new restrictions were added as part of CASSANDRA-1377 for 0.6.5 and 0.7;
AFAIK it's to ensure the file names created for the CFs can be correctly
parsed, so it's probably not going to change.

The names have to match the \w regex class, which includes the underscore
character.
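
For reference, a name can be checked against the same \w restriction client-side before trying to create a CF. A minimal sketch (illustrative class and method names, not Cassandra's actual validator):

```java
import java.util.regex.Pattern;

public class CfNameCheck {
    // \w+ matches letters, digits, and underscore -- mirrors the 0.6.5/0.7 restriction
    private static final Pattern VALID = Pattern.compile("\\w+");

    public static boolean isValidCfName(String name) {
        return VALID.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidCfName("blog_entries")); // true
        System.out.println(isValidCfName("blog-entries")); // false: '-' is not in \w
    }
}
```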


Aaron  

On 30 Aug 2010, at 21:01, Terje Marthinussen tmarthinus...@gmail.com wrote:

 Hi,
 
 Now that we can make column families on the fly, it gets interesting to use
 column families more as part of the data model (it can reduce disk space quite
 a bit vs. super columns in some cases).
 
 However, currently, the column family name validator is pretty strict,
 allowing only word characters, and in some cases it is pretty darned nice to
 be able to put something like a '-' in-between-all-the-words.
 
 Any reason to be this strict or could it be loosened up a little bit?
 
 Terje


Re: column family names

2010-08-30 Thread Terje Marthinussen
Ah, sorry, I forgot that underscore was part of \w.
That will do the trick for now.

I do not see the big issue with file names though. Why not expand the
allowed characters a bit and escape the file names? Maybe some sort of
URL-like escaping.

Terje

On Mon, Aug 30, 2010 at 6:29 PM, Aaron Morton aa...@thelastpickle.com wrote:

 Moving to the user list.

 The new restrictions were added as part of CASSANDRA-1377 for 0.6.5 and
 0.7; AFAIK it's to ensure the file names created for the CFs can be
 correctly parsed, so it's probably not going to change.

 The names have to match the \w regex class, which includes the underscore
 character.


 Aaron


 On 30 Aug 2010, at 21:01, Terje Marthinussen tmarthinus...@gmail.com
 wrote:

 Hi,

 Now that we can make column families on the fly, it gets interesting to use
 column families more as part of the data model (it can reduce disk space
 quite a bit vs. super columns in some cases).

 However, currently, the column family name validator is pretty strict,
 allowing only word characters, and in some cases it is pretty darned nice to
 be able to put something like a '-' in-between-all-the-words.

 Any reason to be this strict or could it be loosened up a little bit?

 Terje




Re: Thrift + PHP: help!

2010-08-30 Thread Mike Peters

Juho, do you mind sharing your implementation with the group? 

We'd love to help as well with rewriting the thrift interface, specifically
TSocket.php, which seems to be where the majority of the problems are
lurking. 

Has anyone tried compiling native thrift support as described here:
https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP
-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Thrift-PHP-help-tp5437314p5478057.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: cassandra for a inbox search with high reading qps

2010-08-30 Thread Mike Peters

Chen, 

Have you considered using Lucandra (http://www.slideshare.net/otisg/lucandra)
for Inbox search? 

We have a similar setup and are currently looking into using Lucandra over
implementing the searching ourselves with pure Cassandra.
-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/cassandra-for-a-inbox-search-with-high-reading-qps-tp5434900p5478060.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160

2010-08-30 Thread Mike Peters

Hi guys, 

There are several patches you need to apply to Thrift to completely resolve
all timeout errors. 

Here's a list of them along with a link to download a patched thrift
library: 
http://www.softwareprojects.com/resources/programming/t-php-thrift-library-for-cassandra-1982.html
 
-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/TException-Error-TSocket-timed-out-reading-1024-bytes-from-10-1-1-27-9160-tp4905687p5478055.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: RowMutationVerbHandler.java (line 78) Error in row mutation

2010-08-30 Thread Gary Dusbabek
Is it possible this was a new node with a manual token and
autobootstrap turned off?

If not, could you give more details about the node?

Gary.


On Fri, Aug 27, 2010 at 17:58, B. Todd Burruss bburr...@real.com wrote:
 i got the latest code this morning.  i'm testing with 0.7


 ERROR [ROW-MUTATION-STAGE:388] 2010-08-27 15:54:58,053
 RowMutationVerbHandler.java (line 78) Error in row mutation
 org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1002
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:113)
    at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:372)
    at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:382)
    at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:340)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:46)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:50)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)




get_slice sometimes returns previous result on php

2010-08-30 Thread Juho Mäkinen
I've run into a strange bug where get_slice returns the result from the
previous query. My application iterates over a set of columns inside a
supercolumn, and for some reason (quite rarely, but often enough that it
shows up) the results get shifted around so that the application gets the
previous result. The application is using the same cassandra thrift
connection (it doesn't close it in between) and everything is happening
inside the same php process.

Here's a cleaned up example from logs where this happens:

14:40 suomirock php-fi: [MISC] WARNING /blog.php: Cassandra stored
blog content for blog id 47528165 differs from database content.
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from cassandra: 
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from database : B

14:40 suomirock php-fi: [MISC] WARNING /blog.php: Cassandra stored
blog content for blog id 47523032 differs from database content.
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from cassandra: B
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from database : CC

The data model is that I have a Super Column family which stores blog
entries. Each user has a single row. Inside this row there are super
columns, where each super column contains a single blog entry. The name of
the super column is the blog id number and one of the columns inside it
contains the blog content.

The data in cassandra is correctly there, and it's the same as what's inside
our old storage tier (PostgreSQL), so I'm able to compare the data returned
from cassandra with the data returned from the old database.
Here's part of the output from cassandra-cli where I queried the row
for this user. As you can see, the blog id matches the super_column
inside cassandra.

= (super_column=47540671, (column=content, value=,
timestamp=1282940401925456) )
= (super_column=47528165, (column=content, value=B,
timestamp=1282940401925456) )
= (super_column=47523032, (column=content, value=CC,
timestamp=1282940401925456) )

I'm in the middle of writing a bunch of debugging code to get better data on
what's really going on, but I'd be very happy if someone has any clue or
helpful ideas on how to debug this.

 - Juho Mäkinen


Re: Calls block when using Thrift API

2010-08-30 Thread Gary Dusbabek
If you're only interested in accessing data natively, I suggest you
try the fat client.  It brings up a node that participates in
gossip, exposes the StorageProxy API, but does not receive a token and
so does not have storage responsibilities.

StorageService.instance.initClient();

In 0.7 you will want to loop until the node receives its storage
definitions via gossip.

Calling SS.instance.initServer() directly bypasses some of the routine
startup operations.  I don't recommend it unless you really know
what's going on (it might work now, but it's not guaranteed to in the
future).

Gary.


On Sun, Aug 29, 2010 at 10:28, Ruben de Laat ru...@logic-labs.nl wrote:
 Just for the people looking to run Cassandra embedded and access
 directly (not via Thrift/Avro).

 This works:

 StorageService.instance.initServer();

 And then just use the StorageProxy for data access.

 I have no idea if this is the right way, but it works.

 Kind regards,
 Ruben




Re: cassandra disk usage

2010-08-30 Thread Jonathan Ellis
column names are stored per cell

(moving to user@)

On Mon, Aug 30, 2010 at 6:58 AM, Terje Marthinussen
tmarthinus...@gmail.com wrote:
 Hi,

 Was just looking at an SSTable file after loading a dataset. The data load
 has no updates of data, but:
 - Columns can in some rare cases be added to existing super columns
 - SuperColumns will be added to the same key (but not overwriting existing
 data). I batch these, but it is quite likely that there will be 2-3 updates
 to a key.

 This is a randomly selected SSTable file from a much bigger dataset.

 The data is stored as date(super)/type(column)/value.
 Date is a simple 20100811-type string.
 Value is a small integer, 2 digits on average.

 If I run a simple strings on the SSTable and look for the data:
 value: 692 KB of data
 type: 4.01 MB of data
 date: 4.6 MB of data

 In total: 9.4 MB

 The size of the .db file, however, is 36.4 MB...

 The expansion from the column headers is bad enough, but I can somehow
 accept that.
 The almost 4x expansion on top of that is a bit harder to justify...

 Anyone know where this expansion comes from? Or do I need to take a
 careful look at the source (probably useful anyway :))

 Terje




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Thrift + PHP: help!

2010-08-30 Thread Mike Peters

 Interesting! Thanks for sharing

Have you considered, instead of retrying the failing node, iterating 
through other nodes in your cluster?


If one node is failing (let's assume it's overloaded for a minute), 
you're probably going to be better off having the client send the insert 
to the next node in line.


Thoughts?

On 8/30/2010 9:17 AM, Juho Mäkinen wrote:


Yes, I'm already planning to do so. The class still has some
dependencies on our other functions which I need to clear out first.

Basically each api call is wrapped inside a retry loop as we can
assume that each operation can be retried as many times as needed:
$tries = 0;
$this->last_exception = null;
$delay = 1000; // start with a 1 ms retry delay (usleep takes microseconds)
do {
    try {
        $this->client->insert($this->keyspace, $key, $column_path, $value,
                              $timestamp, $consistency_level);
        return;
    } catch (cassandra_InvalidRequestException $e) {
        Logger::error('InvalidRequestException: ' . $e->why .
                      ', stacktrace: ' . $e->getMessage());
        throw $e;
    } catch (Exception $e) {
        $this->last_exception = $e;
        $tries++;

        // sleep for some time and try again
        usleep($delay);
        $delay = $delay * 3;
        // Drop the current server and reopen a connection to another server
        $this->connect();
    }
} while ($tries < 4);
// Give up and throw the last exception
throw $this->last_exception;


  - Juho Mäkinen

On Mon, Aug 30, 2010 at 3:48 PM, Mike Peters
cassan...@softwareprojects.com  wrote:

Juho, do you mind sharing your implementation with the group?

We'd love to help as well with rewriting the thrift interface, specifically
TSocket.php, which seems to be where the majority of the problems are
lurking.

Has anyone tried compiling native thrift support as described here
https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP
--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Thrift-PHP-help-tp5437314p5478057.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.







Re: Thrift + PHP: help!

2010-08-30 Thread Juho Mäkinen
On Mon, Aug 30, 2010 at 4:24 PM, Mike Peters
cassan...@softwareprojects.com wrote:
 Have you considered, instead of retrying the failing node, iterating through
 other nodes in your cluster?


Yes, the $this->connect() does just that: it removes the previous node
from the node list and gives the list back to the thrift connection
function. In case all nodes have been tried (and thus removed) it
refills the node list and starts looping over it again. In practice this
will never happen, but the code is there just to be sure :)
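
The drop-and-refill behaviour described above can be sketched like this (illustrative class and method names, not the actual PHP implementation):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Sketch of the node-rotation idea: drop a failing node from the working
// list, hand back the next candidate, and refill the list once every node
// has been tried.
public class NodeRotation {
    private final List<String> allNodes;      // full cluster node list
    private final Deque<String> remaining;    // nodes not yet tried

    public NodeRotation(List<String> nodes) {
        this.allNodes = List.copyOf(nodes);
        this.remaining = new ArrayDeque<>(allNodes);
    }

    // Called after a connection failure: discard the dead node, pick the next.
    public String nextNode() {
        if (remaining.isEmpty()) {
            remaining.addAll(allNodes); // all nodes tried -- start over
        }
        return remaining.poll();
    }
}
```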

 - Juho Mäkinen




 If one node is failing (let's assume it's overloaded for a minute), you're
 probably going to be better off having the client send the insert to the
 next node in line.

 Thoughts?

 On 8/30/2010 9:17 AM, Juho Mäkinen wrote:

 Yes, I'm already planning to do so. The class still has some
 dependencies on our other functions which I need to clear out first.

 Basically each api call is wrapped inside a retry loop as we can
 assume that each operation can be retried as many times as needed:
 $tries = 0;
 $this->last_exception = null;
 $delay = 1000; // start with a 1 ms retry delay (usleep takes microseconds)
 do {
     try {
         $this->client->insert($this->keyspace, $key, $column_path, $value,
                               $timestamp, $consistency_level);
         return;
     } catch (cassandra_InvalidRequestException $e) {
         Logger::error('InvalidRequestException: ' . $e->why .
                       ', stacktrace: ' . $e->getMessage());
         throw $e;
     } catch (Exception $e) {
         $this->last_exception = $e;
         $tries++;

         // sleep for some time and try again
         usleep($delay);
         $delay = $delay * 3;
         // Drop the current server and reopen a connection to another server
         $this->connect();
     }
 } while ($tries < 4);
 // Give up and throw the last exception
 throw $this->last_exception;


  - Juho Mäkinen

 On Mon, Aug 30, 2010 at 3:48 PM, Mike Peters
 cassan...@softwareprojects.com  wrote:

 Juho, do you mind sharing your implementation with the group?

 We'd love to help as well with rewriting the thrift interface, specifically
 TSocket.php, which seems to be where the majority of the problems are
 lurking.

 Has anyone tried compiling native thrift support as described here
 https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP
 --
 View this message in context:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Thrift-PHP-help-tp5437314p5478057.html
 Sent from the cassandra-u...@incubator.apache.org mailing list archive at
 Nabble.com.






NodeTool won't connect remotely

2010-08-30 Thread Allan Carroll
Hi, 

I'm trying to manage my cassandra cluster from a remote box and having issues 
getting nodetool to connect. All the machines I'm using are running on AWS.

Here's what happens when I try:

/opt/apache-cassandra-0.6.4/bin/nodetool -h xxx.xxx.xxx.143 -p 10036 ring
Error connecting to remote JMX agent!
java.rmi.ConnectException: Connection refused to host: xxx.xxx.xxx.143; nested 
exception is: 
java.net.ConnectException: Connection timed out


When I'm local to a box (Ubuntu 10.04) running Cassandra, I can connect fine 
via both 127.0.0.1 and external ip (xxx.xxx.xxx.143). I can telnet into the jmx 
port from an external machine fine:

telnet xxx.xxx.xxx.143 10036
Trying xxx.xxx.xxx.143...
Connected to xxx.xxx.xxx.143.
Escape character is '^]'.

I already added the -Djava.rmi.server.hostname parameter to the java runtime, 
but it didn't seem to affect anything.

/usr/bin/jsvc -home /usr/lib/jvm/java-6-openjdk/jre -pidfile 
/var/run/cassandra.pid -errfile 1 -outfile /var/log/cassandra/output.log -cp 
/usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-0.6.3.jar:/usr/share/cassandra/avro-1.2.0-dev.jar:/usr/share/cassandra/clhm-production.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-codec-1.2.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/google-collections-1.0.jar:/usr/share/cassandra/hadoop-core-0.20.1.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/ivy-2.1.0.jar:/usr/share/cassandra/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json-simple-1.1.jar:/usr/share/cassandra/libthrift-r917130.jar:/usr/share/cassandra/log4j-1.2.14.jar:/usr/share/cassandra/slf4j-api-1.5.8.jar:/usr/share/cassandra/slf4j-log4j12-1.5.8.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar
 -Xmx4G -Xms128M -Djava.rmi.server.hostname=xxx.xxx.xxx.143 -Dcassandra 
-Dstorage-config=/etc/cassandra -Dcom.sun.management.jmxremote.port=10036 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
org.apache.cassandra.thrift.CassandraDaemon


netstat shows that I'm still bound to IPv6:

netstat -nap|grep 10036
tcp6       0      0 :::10036                :::*                    LISTEN      29277/jsvc


And, now I'm at an impasse. Any help would be greatly appreciated.

Thanks
-Allan



Re: NodeTool won't connect remotely

2010-08-30 Thread Juho Mäkinen
I think that JMX needs additional ports to function correctly. Try
disabling all firewalls between the client and the server, so that the
client can connect to any port on the server, and try again.

 - Juho Mäkinen

On Mon, Aug 30, 2010 at 7:07 PM, Allan Carroll alla...@gmail.com wrote:
 Hi,

 I'm trying to manage my cassandra cluster from a remote box and having issues 
 getting nodetool to connect. All the machines I'm using are running on AWS.

 Here's what happens when I try:

 /opt/apache-cassandra-0.6.4/bin/nodetool -h xxx.xxx.xxx.143 -p 10036 ring
 Error connecting to remote JMX agent!
 java.rmi.ConnectException: Connection refused to host: xxx.xxx.xxx.143; 
 nested exception is:
        java.net.ConnectException: Connection timed out


 When I'm local to a box (Ubuntu 10.04) running Cassandra, I can connect fine 
 via both 127.0.0.1 and external ip (xxx.xxx.xxx.143). I can telnet into the 
 jmx port from an external machine fine:

 telnet xxx.xxx.xxx.143 10036
 Trying xxx.xxx.xxx.143...
 Connected to xxx.xxx.xxx.143.
 Escape character is '^]'.

 I already added the -Djava.rmi.server.hostname parameter to the java runtime, 
 but it didn't seem to affect anything.

 /usr/bin/jsvc -home /usr/lib/jvm/java-6-openjdk/jre -pidfile 
 /var/run/cassandra.pid -errfile 1 -outfile /var/log/cassandra/output.log -cp 
 /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-0.6.3.jar:/usr/share/cassandra/avro-1.2.0-dev.jar:/usr/share/cassandra/clhm-production.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-codec-1.2.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/google-collections-1.0.jar:/usr/share/cassandra/hadoop-core-0.20.1.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/ivy-2.1.0.jar:/usr/share/cassandra/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json-simple-1.1.jar:/usr/share/cassandra/libthrift-r917130.jar:/usr/share/cassandra/log4j-1.2.14.jar:/usr/share/cassandra/slf4j-api-1.5.8.jar:/usr/share/cassandra/slf4j-log4j12-1.5.8.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar
  -Xmx4G -Xms128M -Djava.rmi.server.hostname=xxx.xxx.xxx.143 -Dcassandra 
 -Dstorage-config=/etc/cassandra -Dcom.sun.management.jmxremote.port=10036 
 -Dcom.sun.management.jmxremote.ssl=false 
 -Dcom.sun.management.jmxremote.authenticate=false 
 org.apache.cassandra.thrift.CassandraDaemon


 netstat shows that I'm still bound to IPv6:

 netstat -nap|grep 10036
 tcp6       0      0 :::10036                :::*                    LISTEN      29277/jsvc


 And, now I'm at an impasse. Any help would be greatly appreciated.

 Thanks
 -Allan




Re: Cassandra HAProxy

2010-08-30 Thread Dave Viner
FWIW - we've been using HAProxy in front of a cassandra cluster in
production and haven't run into any problems yet.  It sounds like our
cluster is tiny in comparison to Anthony M's cluster.  But I just wanted to
mention that others out there are doing the same.

One thing in this thread that I thought was interesting is Ben's initial
comment that "the presence of the proxy precludes clients properly backing
off from nodes returning errors."  I think it would be very cool if someone
implemented a mechanism for haproxy to detect the error nodes and then
drop those nodes from the rotation.  I'd be happy to help with this, as I
know how it works with haproxy and standard web servers or other tcp
servers.  But I'm not sure how to make it work with Cassandra, since, as
Ben points out, it can return valid tcp responses (that say
error-condition) on the standard port.

Dave Viner


On Sun, Aug 29, 2010 at 4:48 PM, Anthony Molinaro 
antho...@alumni.caltech.edu wrote:


 On Sun, Aug 29, 2010 at 12:20:10PM -0700, Benjamin Black wrote:
  On Sun, Aug 29, 2010 at 11:04 AM, Anthony Molinaro
  antho...@alumni.caltech.edu wrote:
  
  
    I don't know, it seems to tax our setup of 39 extra-large ec2 nodes; it's
    also closer to 24000 reqs/sec at peak, since there are different tables
    (2 tables for each read and 2 for each write)
  
 
  Could you clarify what you mean here?  On the face of it, this
  performance seems really poor given the number and size of nodes.

 As you say I would expect to achieve much better performance given the node
 size, but if you go back and look through some of the issues we've seen
 over time, you'll find we've been hit with nodes being too small, having
 too few nodes to deal with request volume, having OOMs, having bad
 sstables,
 having the ring appear different to different nodes, and several other
 problems.

 Many of the i/o problems presented themselves as MessageDeserializer pool
 backups (although we stopped having these since Jonathan was by and
 suggested a row cache of about 1Gb, thanks Riptano!).  We currently have
 mystery OOMs
 which are probably caused by GC storms during compactions (although usually
 the nodes restart and compact fine, so who knows).  I also regularly watch
 nodes go away for 30 seconds or so (logs show node goes dead, then comes
 back to life a few seconds later).

 I've sort of given up worrying about these, as we are in the process of
 moving this cluster to our own machines in a colo, so I figure I should
 wait until they are moved, and see how the new machines do before I worry
 more about performance.

 -Anthony

 --
 
 Anthony Molinaro   antho...@alumni.caltech.edu



Re: cassandra disk usage

2010-08-30 Thread Terje Marthinussen
On Mon, Aug 30, 2010 at 10:10 PM, Jonathan Ellis jbel...@gmail.com wrote:

 column names are stored per cell

 (moving to user@)



I think that is already accounted for in my numbers?

What I listed was measured from the actual SSTable file (using the output
from strings sstable.db), so multiple copies of the supercolumn and column
names are already part of the strings output.

Typically, you get something like this as output from strings:
20100629
20100629
20100629
string matching the type
java.util.BitSetn
bitst
[Jxpur
[Jx

repeating.

I am not entirely sure why I get those repeating supercolumn names there
(there are more supercolumn names in this file than column names, which is
not logical; it should be the other way around!), but I will have a closer
look at that one.

These strings make up about 1/2 of the total data, the remainder being
binary and tons of null bytes.

The strings command (which will of course give me some binary noise) returns
14,943,928 bytes (or rather characters) of data.
If we ignore the binary noise for a second and also count the number of null
bytes in this file, we get:

Text: 14,943,928 bytes (as mentioned in my previous posting, 9.4 MB of this
is column headers)
Null bytes: 14,634,412 bytes
Other (binary): 8,580,188 bytes
Total size: 38,158,528 bytes

Yes yes yes, doing this is ugly, and lots of null bytes would occur for many
reasons (no reason to tell me that), but chew on that number for a second
and take a look at an SSTable near you; there is a heck of a lot of nothing
there.

Should be noted that this is 0.7 beta 1.

I realize that this code will change dramatically by 0.8, so this is probably
not too interesting to spend too much time on, but the expansion of data is
pretty excessive in many scenarios, so I just looked briefly at an actual
file trying to understand it a bit better.

Terje


Re: Cassandra HAProxy

2010-08-30 Thread Edward Capriolo
On Mon, Aug 30, 2010 at 12:40 PM, Dave Viner davevi...@pobox.com wrote:
 FWIW - we've been using HAProxy in front of a cassandra cluster in
 production and haven't run into any problems yet.  It sounds like our
 cluster is tiny in comparison to Anthony M's cluster.  But I just wanted to
 mention that others out there are doing the same.
 One thing in this thread that I thought was interesting is Ben's initial
 comment the presence of the proxy precludes clients properly backing off
 from nodes returning errors.  I think it would be very cool if someone
 implemented a mechanism for haproxy to detect the error nodes and then
 enable it to drop those nodes from the rotation.  I'd be happy to help with
 this, as I know how it works with haproxy and standard web servers or other
 tcp servers.  But, I'm not sure how to make it work with Cassandra, since,
 as Ben points out, it can return valid tcp responses (that say
 error-condition) on the standard port.
 Dave Viner

 On Sun, Aug 29, 2010 at 4:48 PM, Anthony Molinaro
 antho...@alumni.caltech.edu wrote:

 On Sun, Aug 29, 2010 at 12:20:10PM -0700, Benjamin Black wrote:
  On Sun, Aug 29, 2010 at 11:04 AM, Anthony Molinaro
  antho...@alumni.caltech.edu wrote:
  
  
   I don't know it seems to tax our setup of 39 extra large ec2 nodes,
   its
   also closer to 24000 reqs/sec at peak since there are different tables
   (2 tables for each read and 2 for each write)
  
 
  Could you clarify what you mean here?  On the face of it, this
  performance seems really poor given the number and size of nodes.

 As you say I would expect to achieve much better performance given the
 node
 size, but if you go back and look through some of the issues we've seen
 over time, you'll find we've been hit with nodes being too small, having
 too few nodes to deal with request volume, having OOMs, having bad
 sstables,
 having the ring appear different to different nodes, and several other
 problems.

 Many of the i/o problems presented themselves as MessageDeserializer pool
 backups
 (although we stopped having these since Jonathan was by and suggested row
 cache of about 1Gb, thanks Riptano!).  We currently have mystery OOMs
 which are probably caused by GC storms during compactions (although
 usually
 the nodes restart and compact fine, so who knows).  I also regularly watch
 nodes go away for 30 seconds or so (logs show node goes dead, then comes
 back to life a few seconds later).

 I've sort of given up worrying about these, as we are in the process of
 moving this cluster to our own machines in a colo, so I figure I should
 wait until they are moved, and see how the new machines do before I worry
 more about performance.

 -Anthony

 --
 
 Anthony Molinaro                           antho...@alumni.caltech.edu



Any proxy with a TCP health check should be able to determine if the
Cassandra service is down hard. The problem for tools that are not
cassandra-protocol-aware is detecting slowness or other anomalies
like TimedOut exceptions.

If you are seeing GC storms during compactions, you might have rows
that are too big; when the compaction hits these, memory spikes. I
lowered the compaction priority (and added more nodes), which has
helped compaction back off, leaving some IO for requests.


Re: NodeTool won't connect remotely

2010-08-30 Thread Allan Carroll
Thanks! That did it. Looks like the connection happens on 10036 and then the 
server negotiates a separate port for continued communication. 

Found this article once I knew what to look for. It also describes how to get 
more consistency on port numbers to allow for ssh tunneling and firewalls.

From http://jared.ottleys.net/alfresco/tunneling-debug-and-jmx-for-alfresco

The -Djava.rmi.server.hostname=dummyhost option is needed to help RMI know 
where to connect.  RMI connects in a two-part process: first by connecting to 
the RMI server registry, which pushes your request to the JMX service, which is 
dynamically allocated on the first open port available to it at start-up time.
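
For what it's worth, that dynamically allocated second port is exactly what breaks firewalled or tunnelled setups. On newer JVMs (Java 7u4 and later, so after this thread) the RMI data port can be pinned alongside the registry port. A sketch of the relevant flags (the rmi.port value is an assumption; any fixed open port works):

```
-Dcom.sun.management.jmxremote.port=10036
-Dcom.sun.management.jmxremote.rmi.port=10036
-Djava.rmi.server.hostname=xxx.xxx.xxx.143
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
```

With both ports pinned to the same value, a single firewall rule or ssh tunnel covers the whole JMX exchange.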


On Aug 30, 2010, at 10:30 AM, Juho Mäkinen wrote:

 I think that JMX needs additional ports to function correctly. Try
 disabling all firewalls between the client and the server, so that the
 client can connect to any port on the server, and try again.
 
 - Juho Mäkinen
 
 On Mon, Aug 30, 2010 at 7:07 PM, Allan Carroll alla...@gmail.com wrote:
 Hi,
 
 I'm trying to manage my cassandra cluster from a remote box and having 
 issues getting nodetool to connect. All the machines I'm using are running 
 on AWS.
 
 Here's what happens when I try:
 
 /opt/apache-cassandra-0.6.4/bin/nodetool -h xxx.xxx.xxx.143 -p 10036 ring
 Error connecting to remote JMX agent!
 java.rmi.ConnectException: Connection refused to host: xxx.xxx.xxx.143; 
 nested exception is:
java.net.ConnectException: Connection timed out
 
 
 When I'm local to a box (Ubuntu 10.04) running Cassandra, I can connect fine 
 via both 127.0.0.1 and external ip (xxx.xxx.xxx.143). I can telnet into the 
 jmx port from an external machine fine:
 
 telnet xxx.xxx.xxx.143 10036
 Trying xxx.xxx.xxx.143...
 Connected to xxx.xxx.xxx.143.
 Escape character is '^]'.
 
 I already added the -Djava.rmi.server.hostname parameter to the java 
 runtime, but it didn't seem to affect anything.
 
 /usr/bin/jsvc -home /usr/lib/jvm/java-6-openjdk/jre -pidfile 
 /var/run/cassandra.pid -errfile 1 -outfile /var/log/cassandra/output.log 
 -cp 
 /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-0.6.3.jar:/usr/share/cassandra/avro-1.2.0-dev.jar:/usr/share/cassandra/clhm-production.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-codec-1.2.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/google-collections-1.0.jar:/usr/share/cassandra/hadoop-core-0.20.1.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/ivy-2.1.0.jar:/usr/share/cassandra/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json-simple-1.1.jar:/usr/share/cassandra/libthrift-r917130.jar:/usr/share/cassandra/log4j-1.2.14.jar:/usr/share/cassandra/slf4j-api-1.5.8.jar:/usr/share/cassandra/slf4j-log4j12-1.5.8.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar
  -Xmx4G -Xms128M -Djava.rmi.server.hostname=xxx.xxx.xxx.143 -Dcassandra 
 -Dstorage-config=/etc/cassandra -Dcom.sun.management.jmxremote.port=10036 
 -Dcom.sun.management.jmxremote.ssl=false 
 -Dcom.sun.management.jmxremote.authenticate=false 
 org.apache.cassandra.thrift.CassandraDaemon
 
 
 netstat shows that I'm still bound to IPv6:
 
 netstat -nap|grep 10036
 tcp6       0      0 :::10036            :::*            LISTEN      29277/jsvc
 
 
 And, now I'm at an impasse. Any help would be greatly appreciated.
 
 Thanks
 -Allan
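
One likely culprit, given the tcp6-only netstat output above, is the JVM binding the JMX/RMI listener on the IPv6 stack only. A minimal sketch of a fix to try, assuming a Sun JVM (the flag itself is a standard JVM option, but whether it resolves this particular setup is untested):

```
# Add alongside the existing -Dcom.sun.management.jmxremote.* flags in the
# jsvc invocation; forces java.net (and thus the RMI listener) onto IPv4.
-Djava.net.preferIPv4Stack=true
```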
 
 



Dumping

2010-08-30 Thread Mark
 Is there an easy way to retrieve all values from a CF, similar to a dump?


How about retrieving all columns for a particular key?

In the second use case a simple iteration would work using a start and 
finish but how would this be accomplished across all keys for a 
particular CF when you don't know the keys in advance?


Thanks


Re: Dumping

2010-08-30 Thread aaron morton
sstable2json discussed here http://wiki.apache.org/cassandra/Operations may be 
what you are after, or the snapshot feature. Not sure what you want to use the 
dump for. 

If you do not know the keys in the CF in advance, take a look at 
get_range_slices (http://wiki.apache.org/cassandra/API); it allows you to slice 
through the keys in a similar way to slicing through the columns. 


Aaron

On 31 Aug 2010, at 05:15, Mark wrote:

 Is there an easy way to retrieve all values from a CF.. similar to a dump?
 
 How about retrieving all columns for a particular key?
 
 In the second use case a simple iteration would work using a start and finish 
 but how would this be accomplished across all keys for a particular CF when 
 you don't know the keys in advance?
 
 Thanks
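
The key-paging pattern Aaron describes can be sketched as follows. This is illustrative Python only: `fetch_page` is a hypothetical stand-in for the real get_range_slices Thrift call, and an order-preserving partitioner is assumed so that iterating by key range is meaningful. The essential trick is that the last key of one page becomes the start key of the next, and the duplicated first row is skipped.

```python
def fetch_page(store, start_key, count):
    """Hypothetical stand-in for get_range_slices: up to `count`
    (key, columns) pairs with key >= start_key, in key order."""
    keys = sorted(k for k in store if k >= start_key)
    return [(k, store[k]) for k in keys[:count]]

def iterate_all(store, page_size=2):
    """Yield every row exactly once by paging through the key range."""
    start, first = "", True
    while True:
        page = fetch_page(store, start, page_size)
        if not first and page:
            page = page[1:]  # drop the row already yielded last page
        if not page:
            return
        for row in page:
            yield row
        start, first = page[-1][0], False

rows = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
print(list(iterate_all(rows)))  # every row exactly once, in key order
```

In a real client the page size would be much larger, and `start_key` / `count` map onto the KeyRange arguments of the Thrift call.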



Client developer mailing list

2010-08-30 Thread Jeremy Hanna
There has been a new mailing list created for those who are working on 
Cassandra clients above thrift and/or avro.  You can subscribe by sending an 
email to client-dev-subscr...@cassandra.apache.org or using the link at the 
bottom of http://cassandra.apache.org

The list is meant to give client authors a discussion forum as well as a place 
to interact with core cassandra developers about the roadmap and upcoming 
features.

Thanks to Cliff Moon (@moonpolysoft) for starting a discussion about client 
quality at the Cassandra Summit.

Re: Job opening cassandra Barcelona, Spain

2010-08-30 Thread Dimitry Lvovsky
Thanks for the suggestion.

On Aug 30, 2010, at 8:01 PM, Norman Maurer wrote:

 I think you should try jobs at apache.org too ;)
 
 Bye,
 Norman
 
 2010/8/25 Dimitry Lvovsky dimi...@reviewpro.com:
 Hi All,
 Please forgive the job offer spam.
 
 We're looking to add a developer with experience using Cassandra to join 
 our team in Barcelona. An ideal candidate will have a strong CS background 
 (academic or otherwise), strong Java skills, and experience 
 programming in Scala. Knowing your way around CSS/Javascript would be a 
 definite plus.
 
 We can only accept potential candidates with permission to work in the EU at 
 this time.
 
 Please send inquires to  jobs at reviewpro dot com.
 
 
 
 
 
 
 
 



Re: Client developer mailing list

2010-08-30 Thread Ran Tavory
awesome, thanks, I'm subscribed :)

On Mon, Aug 30, 2010 at 10:05 PM, Jeremy Hanna
jeremy.hanna1...@gmail.comwrote:

 There has been a new mailing list created for those who are working on
 Cassandra clients above thrift and/or avro.  You can subscribe by sending an
 email to client-dev-subscr...@cassandra.apache.org or using the link at
 the bottom of http://cassandra.apache.org

 The list is meant to give client authors a discussion forum as well as a
 place to interact with core cassandra developers about the roadmap and
 upcoming features.

 Thanks to Cliff Moon (@moonpolysoft) for starting a discussion about client
 quality at the Cassandra Summit.


Re: Client developer mailing list

2010-08-30 Thread Mike Peters

  I'm in!

We really need a better PHP Thrift client.


Re: get_slice sometimes returns previous result on php

2010-08-30 Thread Benjamin Black
On Mon, Aug 30, 2010 at 6:05 AM, Juho Mäkinen juho.maki...@gmail.com wrote:
 The application is using the
 same cassandra thrift connection (it doesn't close it in between) and
 everything is happening inside same php process.


This is why you are seeing this problem (and is specific to connection
reuse in certain languages, not a general problem with connection
reuse).


b
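
Benjamin's point can be illustrated without Thrift at all. The sketch below (illustrative Python, not real client code) models a reused connection as a FIFO queue of pending responses: if a call is abandoned before its response is read, say after a client-side timeout, the next call on the same connection receives the previous call's response.

```python
from collections import deque

class FakeWire:
    """Toy model of a reused RPC connection: responses queue up in FIFO
    order and recv() returns the oldest unread one."""
    def __init__(self):
        self.pending = deque()

    def send(self, request):
        self.pending.append("result-for-" + request)

    def recv(self):
        return self.pending.popleft()

wire = FakeWire()
wire.send("get_slice:A")   # client gives up and never calls recv()
wire.send("get_slice:B")   # next request reuses the same connection...
print(wire.recv())         # result-for-get_slice:A -- the stale response
```

The usual defenses are to close and reopen the connection whenever a call fails or times out, or to use a transport that frames request/response pairs so a stray response can be detected and discarded.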


Re: get_slice sometimes returns previous result on php

2010-08-30 Thread Juho Mäkinen
I'm not using connection pooling, where the same TCP socket is reused
between different PHP requests. I open a new Thrift connection with a
new socket to the node, use it for the duration of the request, and
close it afterwards. The get_slice requests are all happening in the
same request, so something odd is happening in between.

Tomorrow I'm going to implement a history buffer which logs all
Cassandra operations within the PHP request and writes them out if I
detect this anomaly again. Hopefully that sheds some light on the
problem.

 - Juho Mäkinen

On Mon, Aug 30, 2010 at 10:50 PM, Benjamin Black b...@b3k.us wrote:
 On Mon, Aug 30, 2010 at 6:05 AM, Juho Mäkinen juho.maki...@gmail.com wrote:
 The application is using the
 same cassandra thrift connection (it doesn't close it in between) and
 everything is happening inside same php process.


 This is why you are seeing this problem (and is specific to connection
 reuse in certain languages, not a general problem with connection
 reuse).


 b



Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-30 Thread Peter Schuller
 collection runs for the cases tested. In most cases, I prefer having low
 pauses due to any garbage collection runs and don't care too much about the
 shape of the memory usage; I guess that's the reason why the low-pause
 collector is used by default for running Cassandra. For myself, I have mixed
 feelings regarding the low-pause collector, because I found it difficult to
 find good young generation sizings suitable for different load
 patterns. Therefore I mostly prefer the throughput collector, which
 adaptively sizes the young generation and does a good job of avoiding too
 much data going to the tenured generation.

Well, if you care about pause times, usually the best bet would be to
have the young gen be as large as possible to yield what you consider
to be a pause time within an acceptable range. I.e., as large as
acceptable but no larger.

 I would be interested in what the
 differences in stop times are between the different GC variants
 when running Cassandra. Is it really much better to use the low-pause
 collector for getting stable response times, even if I use the
 -XX:+UseParallelOldGC and -XX:MaxGCPauseMillis=nnn flags? Any experiences with
 this?

If you use the default (for the JVM, not for Cassandra) throughput
collector, you *will* take full stop-the-world collections, period.
You can enable parallel GC, but with that collector there's no way
around the fact that full collections will pause the application for
the full duration of such full GCs. In general, the larger the heap
(relative to the speed of collection), the more of a problem this will
be. If you deem the pause times acceptable for your particular
use-case, I don't see an obvious reason to prefer the CMS collector.

MaxGCPauseMillis won't help; the throughput collector just doesn't
have any way to adhere to it. A full GC is a full GC.

For CMS, I'm not sure what, if any, effect the MaxGCPauseMillis has.
In my very limited testing I didn't see any obvious effect on e.g.
sizing choice for the young generation (but I have not checked the
code to see if CMS uses it).

It is definitely used by the G1 collector; typically MaxGCPauseMillis
and GCPauseIntervalMillis are the two most important settings to
tweak. They are directly used to decide the young generation size, as
well as to limit the number of non-young regions that are picked for GC
during a partial (not young-only) GC.
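
For concreteness, the flag combinations under discussion look roughly like this on a 2010-era Sun JVM; the numbers are illustrative assumptions, not recommendations, and G1 still required unlocking experimental options at the time:

```
# Throughput collector with a best-effort pause goal (full GCs still
# stop the world for their entire duration):
-XX:+UseParallelGC -XX:+UseParallelOldGC -XX:MaxGCPauseMillis=200

# G1, where the pause/interval goals directly drive young-gen sizing:
-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC \
-XX:MaxGCPauseMillis=200 -XX:GCPauseIntervalMillis=1000
```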

Has anyone run Cassandra with G1 in production for prolonged periods
of time? One thing that concerns me is the reliance on GC to remove
obsolete sstables. That relies on certain GC behavior that is true
for CMS and the throughput collector, but not with G1. With CMS, an
unreachable sstable will be detected when concurrent mark/sweep
triggers; but with G1, there is not necessarily any expectation at all
that some particular region that happens to contain the reference in
question will be collected - *ever* - since G1 always picks the best
regions first (best in terms of bang for the buck - the most memory
reclaimed at the lowest cost).

-- 
/ Peter Schuller


Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-30 Thread Jonathan Ellis
On Mon, Aug 30, 2010 at 5:18 PM, Peter Schuller
peter.schul...@infidyne.com wrote:
 Has anyone run Cassandra with G1 in production for prolonged periods
 of time?

Not AFAIK.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: cassandra for an inbox search with high reading qps

2010-08-30 Thread Todd Nine
We use Lucandra too, for user search as well as for
geo-encoding.  It really works well except for numeric fields.

https://issues.apache.org/jira/browse/CASSANDRA-1235

That bug may be a bit of an issue, but after they release 0.6.5 all the
Lucene functionality will be available to you.

Todd






On Mon, 2010-08-30 at 05:49 -0700, Mike Peters wrote:

 Chen, 
 
 Have you considered using Lucandra (http://www.slideshare.net/otisg/lucandra) 
 for Inbox search? 
 
 We have a similar setup and are currently looking into using Lucandra over
 implementing the searching ourselves with pure Cassandra.


Re: cassandra for an inbox search with high reading qps

2010-08-30 Thread Chen Xinli
What's the average size of a user?

As I understand it, Lucandra will first pull the data from Cassandra, then do
the computation on the client. That's OK for small rows.
But our rows are 1M on average, and some scale to 100M; at the same
time, we expect a high read qps.
Pulling that much data to the client machine over the network is unacceptable.

I have set up a demo system which does the searching locally in Cassandra; it
seems to work with a read qps of 1000+ per node.

2010/8/31 Todd Nine t...@spidertracks.co.nz

  We use Lucandra as well for searching for users, as well as geo-encoding.
 It really works well except for numeric fields.

 https://issues.apache.org/jira/browse/CASSANDRA-1235

 That bug may be a bit of an issue, but after they release 0.6.5 all the
 Lucene functionality will be available to you.

 Todd







   On Mon, 2010-08-30 at 05:49 -0700, Mike Peters wrote:

 Chen,

 Have you considered using Lucandra (http://www.slideshare.net/otisg/lucandra)
 for Inbox search?

 We have a similar setup and are currently looking into using Lucandra over
 implementing the searching ourselves with pure Cassandra.




-- 
Best Regards,
Chen Xinli


Re: column family names

2010-08-30 Thread Benjamin Black
URL encoding.

On Mon, Aug 30, 2010 at 5:55 PM, Aaron Morton aa...@thelastpickle.com wrote:
 Underscores or URL encoding?
 Aaron
 On 31 Aug 2010, at 12:27 PM, Benjamin Black b...@b3k.us wrote:

 Please don't do this.

 On Mon, Aug 30, 2010 at 5:22 AM, Terje Marthinussen
 tmarthinus...@gmail.com wrote:
 Ah, sorry, I forgot that underscore was part of \w.
 That will do the trick for now.

 I do not see the big issue with file names though. Why not expand the
 allowed characters a bit and escape the file names? Maybe some sort of URL
 like escaping.

 Terje

 On Mon, Aug 30, 2010 at 6:29 PM, Aaron Morton aa...@thelastpickle.com
 wrote:

 Moving to the user list.
 The new restrictions were added as part of  CASSANDRA-1377 for 0.6.5 and
 0.7, AFAIK it's to ensure the file names created for the CFs can be
 correctly parsed. So it's probably not going to change.
 The names have to match the \w reg ex class, which includes the
 underscore
 character.

 Aaron

 On 30 Aug 2010, at 21:01, Terje Marthinussen tmarthinus...@gmail.com
 wrote:

 Hi,

 Now that we can make column families on the fly, it gets interesting to
 use column families more as part of the data model (it can reduce
 diskspace quite a bit vs. super columns in some cases).

 However, currently, the column family name validator is pretty strict,
 allowing only word characters, and in some cases it is pretty darned nice
 to be able to put something like a - inbetweenallthewords.

 Any reason to be this strict or could it be loosened up a little bit?

 Terje





Re: column family names

2010-08-30 Thread Terje Marthinussen
Beyond aesthetics, specific reasons?

Terje

On Tue, Aug 31, 2010 at 11:54 AM, Benjamin Black b...@b3k.us wrote:

 URL encoding.
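
The two options in the thread (the current \w-only validator, and the escaping Terje proposes) can be sketched in Python. This is illustrative only, not Cassandra's actual validation code; escape_name is a hypothetical percent-escape showing how arbitrary names could be made filesystem-safe and reversible.

```python
import re

def is_valid_cf_name(name):
    """The 0.6.5/0.7 rule discussed above: word characters only."""
    return re.fullmatch(r"\w+", name) is not None

def escape_name(name):
    """Hypothetical URL-style escape: every non-word character becomes %XX,
    yielding a filesystem-safe, reversible file name for any CF name."""
    return "".join(c if re.fullmatch(r"\w", c) else "%%%02X" % ord(c)
                   for c in name)

print(is_valid_cf_name("user_index"))  # True  (underscore is in \w)
print(is_valid_cf_name("user-index"))  # False (hyphen is not)
print(escape_name("user-index"))       # user%2Dindex
```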