SV: how to retrieve data from supercolumns by phpcassa ?

2010-08-13 Thread Thorvaldsson Justus
I don't use php so I don't know the method but http://wiki.apache.org/cassandra/API get ColumnOrSuperColumn get(string keyspace, string key, ColumnPath column_path, ConsistencyLevel consistency_level) Get the Column or SuperColumn at the given column_path. If no value is present,

Re: SV: how to retrieve data from supercolumns by phpcassa ?

2010-08-13 Thread lisek
Thanks Justus, I'll check it.

Index feature in 0.7

2010-08-13 Thread Carlos Sanchez
All, I was wondering if I could get some information (link / pdf) about the new [column] indices in Cassandra for version 0.7. Thanks a lot, Carlos

Re: how to retrieve data from supercolumns by phpcassa ?

2010-08-13 Thread lisek
OK, I dealt with it. There was a bug in phpcassa, and now I can do it this way: get-('client', UUID::convert('2a3909c0-a612-11df-b27e-346336336631', UUID::FMT_STRING, UUID::FMT_BINARY)) If anyone is using phpcassa and wants to make it work, please let me know and I'll post the solution. --

Re: How does cfstats calculate Row Size?

2010-08-13 Thread Julie
Jonathan Ellis jbellis at gmail.com writes: Right, row stats in 0.6 are just what I've seen during the compactions that happened to run since this node restarted last. 0.7 has persistent (and more fine-grained) statistics. I'm guessing (haven't read this part of the source) that the

Re: 0.7 CLI w/TSocket

2010-08-13 Thread Jonathan Ellis
if you turn off framed mode (by setting the transport size to 0) then you need to use the unframed option with cli On Thu, Aug 12, 2010 at 10:20 PM, Mark static.void@gmail.com wrote: On 8/12/10 9:14 PM, Jonathan Ellis wrote: Works fine here. bin/cassandra-cli --host localhost --port

RE: error using get_range_slice with random partitioner

2010-08-13 Thread Adam Crain
David, This is much like the behavior I saw... I thought that I might be doing something wrong, but I haven't had the time to check out other clients' iteration implementations. What client are you using? -Adam -Original Message- From: David McIntosh [mailto:da...@radiotime.com] Sent:

Re: 0.7 CLI w/TSocket

2010-08-13 Thread Mark
On 8/13/10 7:09 AM, Jonathan Ellis wrote: if you turn off framed mode (by setting the transport size to 0) then you need to use the unframed option with cli On Thu, Aug 12, 2010 at 10:20 PM, Mark static.void@gmail.com wrote: On 8/12/10 9:14 PM, Jonathan Ellis wrote: Works

Cassandra and Pig

2010-08-13 Thread Christian Decker
Hi all, I'm trying to get Pig to read data from a Cassandra cluster, which I thought trivial since Cassandra already provides me with the CassandraStorage class. Problem is that once I try executing a simple script like this: register /path/to/pig-0.7.0-core.jar;register

TimeUUID vs Epoch

2010-08-13 Thread Mark
I'm a little confused about when I should be using TimeUUID vs Epoch/Long when I want columns ordered by time. I know it sounds strange and the obvious choice should be TimeUUID, but I'm not sure why that would be preferred over just using the Epoch stamp? They pretty much seem to accomplish the

Re: TimeUUID vs Epoch

2010-08-13 Thread Sylvain Lebresne
As long as time sorting is involved, you'll get the same ordering whether you use Epoch/Long or TimeUUID. The difference is in how ties are handled. If, when you insert two values at the exact same time, you want only one to stay, then you want LongType. If however you don't want to merge such inserts, then
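A small sketch of the tie behaviour described above may help. It models a row as a plain Python dict keyed by column name, which is only an assumption made for illustration and not Cassandra client code; `insert`, `long_row`, and `uuid_row` are hypothetical names.

```python
import uuid


def insert(row, column_name, value):
    # Model of a column write: writing an existing column name replaces its value.
    row[column_name] = value


# Illustrative millisecond timestamp shared by both writes.
ts = 1281704400000

# Long column names: two writes at the same timestamp collapse into one column.
long_row = {}
insert(long_row, ts, "first")
insert(long_row, ts, "second")

# Version 1 (time-based) UUID column names: each call to uuid1() yields a
# distinct name, so both writes are kept even when made back to back.
uuid_row = {}
insert(uuid_row, uuid.uuid1(), "first")
insert(uuid_row, uuid.uuid1(), "second")

print(len(long_row), len(uuid_row))
```

With Long names the second write replaces the first; with time-based UUID names both survive.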

Key Index/Key Slices

2010-08-13 Thread Mark
Keys are indexed in Cassandra but are they ordered? If so, how? Do Key Slices work like Range Slices for columns.. ie I can give a start and end range? It seems like if they are not ordered (which I think is true) then performing KeyRanges would be somewhat inefficient or at least not as

Re: Data Distribution / Replication

2010-08-13 Thread Oleg Anastasjev
Benjamin Black b at b3k.us writes: 3. I waited for the data to replicate, which didn't happen. Correct, you need to run nodetool repair because the nodes were not present when the writes came in. You can also use a higher consistency level to force read repair before returning data, which

Re: Data Distribution / Replication

2010-08-13 Thread Benjamin Black
On Fri, Aug 13, 2010 at 9:48 AM, Oleg Anastasjev olega...@gmail.com wrote: Benjamin Black b at b3k.us writes: 3. I waited for the data to replicate, which didn't happen. Correct, you need to run nodetool repair because the nodes were not present when the writes came in.  You can also use a

Re: TimeUUID vs Epoch

2010-08-13 Thread Mark
So long story short you can give a start/end range when using TimeUUID? For example I am storing a bunch of records keyed by the current date 20100813. Each column is a TimeUUID. If I wanted to get all the columns that fall between some arbitrary time.. say 6am - 9am I can get that? Using Long
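One way to approach the range question is to build a version 1 UUID whose embedded timestamp equals each boundary and use the pair as the slice start and finish. The sketch below only constructs such boundary UUIDs. `min_time_uuid` is a hypothetical helper, the clock sequence and node are zeroed by assumption, and how columns sharing a boundary timestamp are ordered depends on the comparator, which is not modelled here.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def min_time_uuid(dt):
    """Build a version 1 UUID for dt with zeroed clock sequence and node (hypothetical helper)."""
    delta = dt - UNIX_EPOCH
    # Integer arithmetic only, to avoid float rounding of the timestamp.
    ticks = ((delta.days * 86400 + delta.seconds) * 10_000_000
             + delta.microseconds * 10
             + UUID_EPOCH_OFFSET)
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # set version 1
    # 0x80 sets the RFC 4122 variant bits; the rest of the clock sequence and the node are zero.
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, 0))


# Boundaries for 6am to 9am on 2010-08-13, taken as UTC for illustration.
start = min_time_uuid(datetime(2010, 8, 13, 6, 0, tzinfo=timezone.utc))
finish = min_time_uuid(datetime(2010, 8, 13, 9, 0, tzinfo=timezone.utc))
print(start, finish)
```

The two values would then be passed as the start and finish column names of a slice on the `20100813` row.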

RE: Cassandra and Pig

2010-08-13 Thread Stu Hood
That error is coming from the frontend: the jars must also be on the local classpath. Take a look at how contrib/pig/bin/pig_cassandra sets up $PIG_CLASSPATH. -Original Message- From: Christian Decker decker.christ...@gmail.com Sent: Friday, August 13, 2010 11:30am To:

Re: TimeUUID vs Epoch

2010-08-13 Thread Mark
range when using TimeUUID? For example I am storing a bunch of records keyed by the current date 20100813. Each column is a TimeUUID. If I wanted to get all the columns that fall between some arbitrary time.. say 6am - 9am I can get that? Using Long I can just use a start of 12817044 and a finish

Re: Cassandra and Pig

2010-08-13 Thread Christian Decker
Wow, that was extremely quick, thanks Stu :-) I'm still a bit unclear on what the pig_cassandra script does. It sets some variables (PIG_CLASSPATH for one) and then starts the original pig binary but injects some libraries in it (libthrift and pig-core) but strangely not the cassandra loadfunc,

Monitoring cassandra using Munin...how to get a hourly graph?

2010-08-13 Thread Simon Reavely
Hi, For those of you using Munin to monitor Cassandra's JMX stats I wondered if anyone had figured out how to get hourly graphs. By default we are getting daily, weekly, monthly, yearly but for our performance testing we really need hourly and looking on the message boards we can't figure out how

Migration from .6 to.7

2010-08-13 Thread Claire Chang
I was wondering if there will be a document on how to do it?

Re: Migration from .6 to.7

2010-08-13 Thread Jonathan Ellis
yes. NEWS.txt On Fri, Aug 13, 2010 at 10:31 AM, Claire Chang cla...@merchantcircle.com wrote: I was wondering if there will be a document on how to do it? Sent from my iPhone -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra

Count rows

2010-08-13 Thread Mark
Is there some way I can count the number of rows in a CF.. CLI, MBean? Gracias

Re: a plea not to remove rowsize warning

2010-08-13 Thread Jonathan Ellis
added key to in_memory_compaction_limit threshold log: logger.info(String.format("Compacting large row %s (%d bytes) incrementally", FBUtilities.bytesToHex(rows.get(0).getKey().key), rowSize)); On Wed, Aug 11, 2010 at 4:11 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Hello all,

Re: Count rows

2010-08-13 Thread Jonathan Ellis
not without fetching all of them with get_range_slices On Fri, Aug 13, 2010 at 10:37 AM, Mark static.void@gmail.com wrote: Is there some way I can count the number of rows in a CF.. CLI, MBean? Gracias -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source
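The "fetch all of them" approach amounts to paging through the key range and counting as you go. The sketch below shows only the paging logic, under stated assumptions: `fetch_page(start_key, count)` is a hypothetical stand-in for a client's range-slice call that returns keys in store order starting at `start_key` inclusive, and `make_fake_fetch` is an in-memory substitute used so the example runs on its own. A real cluster orders keys by the partitioner, which the sorted strings here merely imitate.

```python
def count_rows(fetch_page, page_size=100):
    """Count rows by paging; fetch_page(start_key, count) is a hypothetical range call."""
    if page_size < 2:
        # Each later page re-reads the last key of the previous page,
        # so a page of one key would never make progress.
        raise ValueError("page_size must be at least 2")
    total = 0
    start = ""
    first = True
    while True:
        page = fetch_page(start, page_size)
        # The start key is inclusive, so skip it on every page after the first.
        total += len(page) if first else len(page[1:])
        if len(page) < page_size:
            return total
        start = page[-1]
        first = False


def make_fake_fetch(keys):
    """In-memory stand-in for the store (assumption for illustration only)."""
    ordered = sorted(keys)

    def fetch_page(start_key, count):
        return [k for k in ordered if k >= start_key][:count]

    return fetch_page


fetch = make_fake_fetch(["row%03d" % i for i in range(250)])
print(count_rows(fetch, page_size=100))
```

Every row is still read once, so the cost grows with the size of the column family.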

Re: Count rows

2010-08-13 Thread Mark
On 8/13/10 10:44 AM, Jonathan Ellis wrote: not without fetching all of them with get_range_slices On Fri, Aug 13, 2010 at 10:37 AM, Mark static.void@gmail.com wrote: Is there some way I can count the number of rows in a CF.. CLI, MBean? Gracias I'm guessing you would

Re: Count rows

2010-08-13 Thread Jonathan Ellis
because it would work amazingly poorly w/ billions of rows. it's an antipattern. On Fri, Aug 13, 2010 at 10:50 AM, Mark static.void@gmail.com wrote: On 8/13/10 10:44 AM, Jonathan Ellis wrote: not without fetching all of them with get_range_slices On Fri, Aug 13, 2010 at 10:37 AM,

Re: Count rows

2010-08-13 Thread Mark
On 8/13/10 10:52 AM, Jonathan Ellis wrote: because it would work amazingly poorly w/ billions of rows. it's an antipattern. On Fri, Aug 13, 2010 at 10:50 AM, Mark static.void@gmail.com wrote: On 8/13/10 10:44 AM, Jonathan Ellis wrote: not without fetching all of them with

Re: Cassandra and Pig

2010-08-13 Thread Stu Hood
Hmm, the example code there may not have been run in distributed mode recently, or perhaps Pig performs some magic to automatically register Jars containing classes directly referenced as UDFs. -Original Message- From: Christian Decker decker.christ...@gmail.com Sent: Friday, August 13,

Re: Cassandra and Pig

2010-08-13 Thread Stu Hood
Still I get an exception which I cannot explain where it comes from (http://pastebin.com/JYfSSfny) Which version of Cassandra are you using? The 0.6 series requires that a valid storage-conf.xml is distributed with the job to specify connection/partitioner/etc information, but trunk/0.7-beta2

Re: Count rows

2010-08-13 Thread Gary Dusbabek
Should we close https://issues.apache.org/jira/browse/CASSANDRA-653 then? Fetching a count of all rows is just a specific instance of fetching the count of a range of rows. I spoke to a programmer at the summit who was working on this ticket mainly as a way of getting familiar with the codebase.

[RELEASE] 0.7.0 beta1

2010-08-13 Thread Eric Evans
Happy Friday the 13th. Are you feeling lucky? I know I am. Ok, first off, a disclaimer. As the suffix on the version indicates this is *beta* software. If you run off and upgrade a production server with this there is a very good chance that you are going to be

Re: Data Distribution / Replication

2010-08-13 Thread Bill de hÓra
On Fri, 2010-08-13 at 09:51 -0700, Benjamin Black wrote: My recommendation is to leave Autobootstrap disabled, copy the datafiles over, and then run cleanup. It is faster and more reliable than streaming, in my experience. What is less reliable about streaming? Bill

RE: error using get_range_slice with random partitioner

2010-08-13 Thread David McIntosh
Adam, I'm using my own code to iterate that is similar to what Dave Viner posted except in C#. Given that it works in 0.6.3 I'd like to think that the code is ok unless this type of iteration isn't supported. I was going to try iterating using tokens today but it turns out it's not so easy to

Re: Count rows

2010-08-13 Thread Jonathan Ellis
Well, it's a bad idea, except when it isn't. I think I'm okay with our api evolving to handle more corner cases. It's true that it runs the risk of encouraging bad design from new users though. On Fri, Aug 13, 2010 at 1:07 PM, Gary Dusbabek gdusba...@gmail.com wrote: Should we close

Re: Data Distribution / Replication

2010-08-13 Thread Benjamin Black
Number of bugs I've hit doing this with scp: 0 Number of bugs I've hit with streaming: 2 (and others found more) Also easier to monitor progress, manage bandwidth, etc. I just prefer using specialized tools that are really good at specific things. This is such a case. b On Fri, Aug 13, 2010

Re: Data Distribution / Replication

2010-08-13 Thread Stefan Kaufmann
My recommendation is to leave Autobootstrap disabled, copy the datafiles over, and then run cleanup.  It is faster and more reliable than streaming, in my experience. I thought about copying the data manually. However if I have a running environment and add a node (or replace a broken one), how