Re: Inter node communication over UDP

2011-05-20 Thread pankajsoni0126
I am working on version 0.7.6 of cassandra. I have been looking into the code
to identify communication between nodes.

It seems to me that both inter-node and server-to-client communication
happen over the Thrift protocol. Is my understanding correct?

And does the gossiper communicate over TCP using a message queue?



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Inter-node-communication-over-UDP-tp6358459p6384978.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


How to reduce the Read Latency.

2011-05-20 Thread Dikang Gu
Hi All,

I'm running three Cassandra 0.7.4 nodes in a cluster, with 2 GB of memory
given to each node.

Here is the cfstats output:

Keyspace: UserMap
  Read Count: 38411
  Read Latency: 123.54214613001484 ms.
  Write Count: 44155
  Write Latency: 0.02341093873853471 ms.
  Pending Tasks: 0
    Column Family: Map
    SSTable count: 3
    Space used (live): 32704387
    Space used (total): 32704387
    Memtable Columns Count: 49
    Memtable Data Size: 3348
    Memtable Switch Count: 56
    Read Count: 38411
    Read Latency: 123.542 ms.
    Write Count: 44155
    Write Latency: 0.023 ms.
    Pending Tasks: 0
    Key cache capacity: 20
    Key cache size: 611
    Key cache hit rate: 0.9294361241314483
    Row cache: disabled
    Compacted row minimum size: 125
    Compacted row maximum size: 17436917
    Compacted row mean size: 147647

As you can see, the read latency is really high here, so what can I do to
reduce it? Give more memory to the three nodes? Any other options?

Thanks. 
-- 
Dikang Gu
0086 - 18611140205


CQL: Select for multiple ranges

2011-05-20 Thread David Boxenhorn
In order to fully implement the functionality of super columns using
compound columns I need to be able to select multiple column ranges - this
would be functionally equivalent to selecting multiple super columns (and
more!).

I would like to request the following CQL syntax:

SELECT [FIRST N] [REVERSED] name1..nameN1, name2..nameN2... FROM ...

I am heading into my weekend here. If no one has created a JIRA ticket for
this by Sunday, and I am not talked out of it, I will create one myself.


[RELEASE] Apache Cassandra 0.7.6-2 released

2011-05-20 Thread Sylvain Lebresne
Apache Cassandra 0.7.6 is just 2 days old, but it shipped with a bug in the
debian packaging code that prevents the package from being successfully
installed. Since debian packaging is an integral part of the project, 0.7.6-2
comes to fix this issue.

This is the only change[1] that separates 0.7.6-2 from 0.7.6, so if you don't
use debian and have already upgraded to 0.7.6, you can safely upgrade to
0.7.6-2 or simply skip that version.

For the debian users, this will fix the error you may have seen during the
upgrade to 0.7.6.

As always, all relevant details are in the release notes[2] and please let us
know if you encounter any problem[3].

And sorry for the inconvenience.
The Cassandra Team.

[1]: http://goo.gl/BfuBh (CHANGES.txt)
[2]: http://goo.gl/y8gGF (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA


Re: Cassandra Vs. Oracle Coherence

2011-05-20 Thread Jeffrey Kesselman
I believe Coherence is their name for the TimesTen technology they bought.

TT is an in-memory SQL database that can run as a cache for Oracle.

It's totally different from Cassandra. On the one hand it supports
traditional SQL, whereas Cassandra does not. On the other hand Cassandra is
truly distributed and fault tolerant, whereas TT is not.

I suggest getting and reading the O'Reilly Cassandra book.

JK

On Tue, May 17, 2011 at 10:44 PM, Karamel, Raghu
raghu_kara...@intuit.com wrote:
 Hi,



 I am new to Cassandra and very excited about the technology. I am evaluating
 it and trying to understand the difference between Cassandra and Oracle
 Coherence. Specifically, I am looking for reasons why someone would choose
 Cassandra over Oracle Coherence. Has anyone done the exercise of comparing
 them? I would appreciate it if you could share some information on that.



 Regards

 -RK



-- 
It's always darkest just before you are eaten by a grue.


Re: How to reduce the Read Latency.

2011-05-20 Thread Jeffrey Kesselman
What consistency are you asking for?


-- 
It's always darkest just before you are eaten by a grue.


Re: Inter node communication over UDP

2011-05-20 Thread Jeffrey Kesselman
TCP/IP byte overhead vs. UDP really isn't that much if your packets are
of any significant size (it's 30 bytes).

And as others have pointed out, you can easily get more overhead with
worse results trying to reinvent reliable transport on top of UDP.
Remember that TCP/IP has had 30 years of development and tuning.
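
To put rough numbers on the overhead argument, here is a small sketch. The exact per-packet figure depends on which headers and options are counted; this one assumes the standard minimum header sizes without options (20 bytes for IPv4, 20 for TCP, 8 for UDP). The class and method names are made up for illustration.

```java
// Sketch only: compares header overhead for TCP and UDP over IPv4,
// assuming minimum header sizes with no IP or TCP options.
final class HeaderOverhead {
    static final int IPV4_HEADER = 20;
    static final int TCP_HEADER = 20;
    static final int UDP_HEADER = 8;

    /** Share of the bytes on the wire that are headers, as a percentage. */
    static double overheadPercent(int headerBytes, int payloadBytes) {
        return 100.0 * headerBytes / (headerBytes + payloadBytes);
    }

    public static void main(String[] args) {
        int tcp = IPV4_HEADER + TCP_HEADER; // 40 bytes per packet
        int udp = IPV4_HEADER + UDP_HEADER; // 28 bytes per packet
        for (int payload : new int[] {64, 512, 1400}) {
            System.out.printf("payload %4d B: TCP %.1f%%, UDP %.1f%%%n",
                    payload,
                    overheadPercent(tcp, payload),
                    overheadPercent(udp, payload));
        }
    }
}
```

Under these assumptions the gap between the two shrinks quickly as the payload grows, which is the point being made above.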


-- 
It's always darkest just before you are eaten by a grue.


Re: How to reduce the Read Latency.

2011-05-20 Thread Dikang Gu
I use the default consistency level in the hector client, so it should be 
QUORUM.
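
For reference, a small sketch of how many replicas a QUORUM read has to wait for. The replication factor of the UserMap keyspace is not stated in this thread, so the values printed are only illustrative, and the class and method names are hypothetical.

```java
// Sketch only: QUORUM means floor(RF / 2) + 1 replicas must respond.
final class Quorum {
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1; // integer division gives the floor
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 3; rf++) {
            System.out.println("RF=" + rf + " -> QUORUM waits for " + quorum(rf) + " replica(s)");
        }
    }
}
```

With a replication factor above 1, a QUORUM read involves more than one node, so the reported latency includes waiting for a second replica to answer.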

-- 
Dikang Gu
0086 - 18611140205
On Friday, May 20, 2011 at 4:25 PM, Jeffrey Kesselman wrote: 
 What consistency are you asking for?
 


Re: Inter node communication over UDP

2011-05-20 Thread Sylvain Lebresne
On Fri, May 20, 2011 at 9:39 AM, pankajsoni0126
pankajsoni0...@gmail.com wrote:
 I am working on version 0.7.6 of cassandra. I have been looking into the code
 to identify communication between nodes.

 it seems to me that both inter-node and servernode-client communication
 happens using thrift protocol, is my understanding correct?

No. Inter-node communication uses MessagingService, which in turn uses
OutboundTcpConnection, which just writes serialized Messages to a TCP socket
(using a message queue). None of this uses Thrift.

 and the gossiper communication takes place using tcp and message queue?

Yes, but actually gossip uses MessagingService too. So gossip communication
is really just one type of inter-node communication.
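
The general pattern described above (callers enqueue serialized messages, one thread drains the queue and writes to the connection) can be sketched as follows. This is not Cassandra's code: the class name, the length-prefixed framing and the close sentinel are all assumptions made for illustration, and a ByteArrayOutputStream stands in for the socket's output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: a queue-backed writer in the spirit of one outbound connection.
final class QueuedWriter implements Runnable {
    // Sentinel compared by identity; tells the writer thread to stop.
    private static final byte[] CLOSE = new byte[0];

    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<byte[]>();
    private final DataOutputStream out;

    QueuedWriter(OutputStream out) {
        this.out = new DataOutputStream(out);
    }

    /** Called by any thread that wants to send; never blocks on the network. */
    void enqueue(byte[] serializedMessage) {
        queue.add(serializedMessage);
    }

    void close() {
        queue.add(CLOSE);
    }

    /** Drains the queue in order and writes each message as length + bytes. */
    public void run() {
        try {
            while (true) {
                byte[] message = queue.take();
                if (message == CLOSE) {
                    break;
                }
                out.writeInt(message.length); // assumed framing, 4-byte length prefix
                out.write(message);
            }
            out.flush();
        } catch (IOException e) {
            throw new RuntimeException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Sends two messages through the writer and returns the bytes written. */
    static byte[] demo() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        QueuedWriter writer = new QueuedWriter(sink);
        Thread thread = new Thread(writer);
        thread.start();
        writer.enqueue("gossip".getBytes(StandardCharsets.UTF_8));
        writer.enqueue("read".getBytes(StandardCharsets.UTF_8));
        writer.close();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sink.toByteArray();
    }
}
```

In this sketch gossip and read traffic go through the same queue and the same writer, which mirrors the point that gossip is just one kind of inter-node message.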

--
Sylvain







simple implementation of counters

2011-05-20 Thread Sasha Dolgy
Hi,

I'm trying to play around with 0.8.0-rc1 and counters, and I'm a
little confused.

First question I have is about the definition.  A column within a
standard column family cannot be a counter column type?

I had tried the following, with no success.

create column family urlcounts
with comparator = UTF8Type
and default_validation_class = LongType
and column_metadata=[
{ column_name:count,
  validation_class:CounterColumnType
}
];


The only successful way was for me to create a new column family with
a default validation class of CounterColumnType:

create column family counters with default_validation_class = CounterColumnType;

Now I have a column family with counters:

Keyspace: sdo:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
Options: [datacenter1:1]
  Column Families:
ColumnFamily: counters
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator:
org.apache.cassandra.db.marshal.CounterColumnType
  Columns sorted by: org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 0.2953125/63/1440 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: false
  Built indexes: []

From the CLI, I insert / increment a [key][column] in this column family:

incr counters[ascii('foo')][ascii('c1')];

Second question: if I want to implement counters, do they have to exist
in a separate column family from the rest of my data?

-- 
Sasha Dolgy
sasha.do...@gmail.com


Re : selecting data

2011-05-20 Thread karim abbouh
Is there a way to store a set of values for a single column under the same key?





From: Watanabe Maki watanabe.m...@gmail.com
To: user@cassandra.apache.org user@cassandra.apache.org
Sent: Thu, 19 May 2011, 17:38:39
Subject: Re: selecting data


Cassandra is not an RDBMS. All you can do is look up by key; otherwise you
need a full scan.
You need to design your schema carefully around your application's needs.



On 2011/05/20, at 1:11, karim abbouh karim_...@yahoo.fr wrote:


I'm new to the Cassandra database.
I want to query data as in a relational database:
select * from table where field=value;
Using the CLI, I see only the following commands:
get ksp.cf['key']                    Get a slice of columns.
get ksp.cf['key']['super']           Get a slice of sub columns.
get ksp.cf['key']['col']             Get a column value.
get ksp.cf['key']['super']['col']    Get a sub column value.

Is there a way to do that?
I think it is possible using the Java API.
Cassandra version: 0.6.12


thanks for help





Re: Cassandra Vs. Oracle Coherence

2011-05-20 Thread Peter Lin
That's completely wrong.

TimesTen and Coherence are 2 separate products sold by Oracle.
Coherence is a data grid that takes a key/value approach. It is
massively scalable across LAN and WAN.

In terms of comparing Cassandra and Coherence, I wouldn't. Coherence
is a data grid and most often used as a fault tolerant distributed
cache, though it does a lot more than that. Oracle bought Tangosol a
few years back.




DatabaseMetadata

2011-05-20 Thread Vivek Mishra
Any thoughts on building something like a separate DatabaseMetadata API for CQL?




how to use indexed column for this case

2011-05-20 Thread Monkey me
Hi,
 I have an SCF: the key is a string, the super column is a TimeUUID, and there
are several columns, one of which is named type. I created a secondary index
on this column. I want a query like the following to fetch all super columns
along with all their columns:
 1. given a specific key
 2. given a range of super columns (start time to end time)
 3. given a specific type value.

 Is such a query possible? I could not figure out how to use
getIndexedSlice to achieve this. Any ideas? Thanks.


Hou


cannot parse as hex bytes

2011-05-20 Thread Patrick Julien
The following sample:

http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes

No longer works with 0.8.0-rc1

I get "cannot parse as hex bytes" when running the step in the section
"Next we add some users":

[default@demo] set users[bsanderson][full_name] = 'Brandon Sanderson';

I can confirm that CASSANDRA-2497 does work


Re : Re : selecting data

2011-05-20 Thread karim abbouh
Is storage-conf.xml read only when Cassandra starts?
Is there a way to add a column family dynamically?

BR










Re: Cassandra Vs. Oracle Coherence

2011-05-20 Thread Jeffrey Kesselman
AH, I stand corrected.  Hard to follow Larry's acquisitions without a scorecard.


-- 
It's always darkest just before you are eaten by a grue.


RE: Re : Re : selecting data

2011-05-20 Thread Bahadur, Kamal
This is how you create it dynamically:

 

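import java.util.ArrayList;
import java.util.List;
import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.KsDef;

// Define a new keyspace together with its first column family.
KsDef ksdef = new KsDef();
ksdef.name = "ProgKS";
ksdef.replication_factor = 1;
ksdef.strategy_class = "org.apache.cassandra.locator.RackUnawareStrategy";

List<CfDef> cfdefs = new ArrayList<CfDef>();
CfDef cfdef1 = new CfDef();
cfdef1.name = "ProgCF1";
cfdef1.keyspace = ksdef.name;
cfdefs.add(cfdef1);
ksdef.cf_defs = cfdefs;

// client is an already-connected Thrift Cassandra.Client.
client.system_add_keyspace(ksdef);

// Add a second column family to the keyspace that now exists.
CfDef cfdef2 = new CfDef();
cfdef2.keyspace = ksdef.name;
cfdef2.column_type = "Standard";
cfdef2.name = "ProgCF";
client.system_add_column_family(cfdef2);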

 







Documentation of Known Issues

2011-05-20 Thread Daniel Doubleday
Hi all

I was wondering if there might be some way to better communicate known issues. 

We do try to track jira issues but at times some slip through or we miss 
implications.

Things like the broken repair of specific CFs. 
(https://issues.apache.org/jira/browse/CASSANDRA-2670). I know that this 
potentially dups jira but maybe tasks could get tagged and some magic filter 
could show a list. Or something like a simple wiki page that lists the known 
issues that might not be critical such as the mentioned bug but still are a 
pain if it happens to you.

Cheers,
Daniel



Re: Exception when starting

2011-05-20 Thread mcasandra
Whenever I hear someone say data is corrupted I panic :) I have seen a few
people report that, but have not seen the real reason for it. Is it a
manual error, a config error, a bug, etc.? It would be good to identify why
these things happen so that they can be fixed before they happen in PROD :(

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-Exception-when-starting-tp6383464p6386809.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: how to use indexed column for this case

2011-05-20 Thread Anand Somani
From what I know, you cannot create secondary indexes on an SCF. You should
have hit https://issues.apache.org/jira/browse/CASSANDRA-1813 on index
creation.




Re: How to reduce the Read Latency.

2011-05-20 Thread mcasandra
What's your average column size and row size? Your read latency in most cases
will be directly related to how much you are trying to read. In my experience
you will see high read latency if you have large columns.
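
The cfstats output posted at the start of this thread already hints at the answer. Assuming the "Compacted row" sizes there are reported in bytes, the mean row is roughly 144 KiB and the largest row is about 16.6 MiB. A throwaway conversion, with a hypothetical class name:

```java
// Sketch only: converts the row sizes from the posted cfstats output,
// assuming they are byte counts.
final class RowSizes {
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long meanRowBytes = 147647L;   // "Compacted row mean size"
        long maxRowBytes = 17436917L;  // "Compacted row maximum size"
        System.out.printf("mean row: %.2f MiB%n", toMiB(meanRowBytes));
        System.out.printf("max row:  %.2f MiB%n", toMiB(maxRowBytes));
    }
}
```

Rows of that size would fit the explanation above if the reads pull in whole rows or large slices of them.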

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-to-reduce-the-Read-Latency-tp6385107p6386817.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Can I use secondary index with any partitioner

2011-05-20 Thread Jonathan Ellis
Yes.

On Thu, May 19, 2011 at 7:24 PM, Dave Rav daver...@yahoo.com wrote:
 Can I use secondary index with any partitioner



 1) RandomPartitioner
 2) ByteOrderedPartitioner





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: simple implementation of counters

2011-05-20 Thread Jonathan Ellis
On Fri, May 20, 2011 at 5:55 AM, Sasha Dolgy sdo...@gmail.com wrote:
 First question I have is about the definition.  A column within a
 standard column family cannot be a counter column type?

Right. https://issues.apache.org/jira/browse/CASSANDRA-2614 is open to
address this but that's going to be post-0.8.0.

 Second question:  If I want to implement counters, they have to exist
 in a separate column family from the rest of my data?

Isn't that the same as the first question? :)

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: cannot parse as hex bytes

2011-05-20 Thread Jonathan Ellis
You'd need to add a key validation type for 0.8.
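
A plausible reading of the error, sketched in Java with hypothetical names (this is not the CLI's code): if a column family has no key validation type, the row key is treated as raw bytes and has to be given as hex, so a key like bsanderson is rejected, while a UTF-8 key type would simply encode the text.

```java
import java.nio.charset.StandardCharsets;

// Sketch only: contrasts hex parsing of a row key with UTF-8 encoding.
final class KeyParsing {
    /** Parses a hex string into bytes; rejects anything that is not valid hex. */
    static byte[] fromHex(String s) {
        if (s.length() % 2 != 0) {
            throw new NumberFormatException("odd number of hex digits: " + s);
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new NumberFormatException("cannot parse as hex bytes: " + s);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    /** What a UTF-8 key type would do with the same text. */
    static byte[] fromUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static boolean isHex(String s) {
        try {
            fromHex(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("bsanderson parses as hex: " + isHex("bsanderson"));
        System.out.println("bsanderson as UTF-8 bytes: " + fromUtf8("bsanderson").length);
    }
}
```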



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Cassandra Vs. Oracle Coherence

2011-05-20 Thread mcasandra
Coherence is similar to memcached (free). It's an in-memory cache layer on
top of the DB. You as a user need to keep that cache in sync with the DB.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-Vs-Oracle-Coherence-tp6375561p6386847.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Exception when starting

2011-05-20 Thread Brandon Williams
There was a bug, it is fixed.  It's just a cache, chill.


Re: Cassandra Vs. Oracle Coherence

2011-05-20 Thread Peter Lin
Although one can use Coherence as a memory cache layer on top of a DB,
many customers in the financial sector use it as an in-memory key/value
store.

Think of transient data that doesn't need to be saved, but needs to
scale out across hundreds or thousands of nodes.

memcached is making progress, but it is not comparable to coherence
today. Eventually it will probably get there.




Re: Exception when starting

2011-05-20 Thread Eranda Sooriyabandara
In my case I started Cassandra after some time with a newer version.
I think this occurred because some files belonging to the previous version
of Cassandra remained and were overwritten by the new one.
This is just my guess.
Thanks
Eranda


Re: simple implementation of counters

2011-05-20 Thread Sasha Dolgy
uh.  yeah ... was a bad morning ... cheers



Re: Cassandra Vs. Oracle Coherence

2011-05-20 Thread Milind Parikh
Other interesting flavors of distributed cache: Terracotta, GemFire.
Together with a complex event processing engine like OCEP, they drive a lot
of low-latency, high-frequency trading where nanoseconds matter.

/***
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
/



Re: Cassandra Vs. Oracle Coherence

2011-05-20 Thread Peter Lin
another product in the same area is gigaspaces.




Re: Exception when starting

2011-05-20 Thread mcasandra

Brandon Williams wrote:
 
 There was a bug, it is fixed.  It's just a cache, chill.
 

There is no time to chill when fighting it in production :) It's good to
know it's fixed.

Another question, when this happens are we able to restore data from replica
nodes?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-Exception-when-starting-tp6383464p6386925.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Exception when starting

2011-05-20 Thread Krish Pan
No data gets lost. The *only* thing corrupted is the key cache.

On Fri, May 20, 2011 at 10:17 AM, mcasandra mohitanch...@gmail.com wrote:


 Brandon Williams wrote:
 
  There was a bug, it is fixed.  It's just a cache, chill.
 

 There is no time to chill when fighting it in production :) It's good to
 know it's fixed.

 Another question, when this happens are we able to restore data from
 replica
 nodes?




Re: Can I use secondary index with any partitioner

2011-05-20 Thread Dave Rav
If I use 'RandomPartitioner' and call 'get_indexed_slices',
what do I do with 'start_key'?

struct IndexClause {
    1: required list<IndexExpression> expressions
    2: required binary start_key,
    3: required i32 count=100,
}




 Yes.

  On Thu, May 19, 2011 at 7:24 PM, Dave Rav daver...@yahoo.com wrote:
   Can I use secondary index with any partitioner
  
  
  
   1) RandomPartitioner
   2) ByteOrderedPartitioner
  
  


Re: Exception when starting

2011-05-20 Thread mcasandra
In this case, yes. I was asking about the cases where commit log corruption was
reported.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-Exception-when-starting-tp6383464p6387101.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: cannot parse as hex bytes

2011-05-20 Thread Patrick Julien
thanks

On Fri, May 20, 2011 at 12:58 PM, Jonathan Ellis jbel...@gmail.com wrote:
 You'd need to add a key validation type for 0.8.
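
A small sketch of why the error shows up, under the assumption that a CLI without a key validation type reads row keys as raw bytes written in hex. The helper names `cli_hex_key` and `parses_as_hex` are made up for illustration and are not part of Cassandra or its CLI:

```python
import binascii


def cli_hex_key(text: str) -> str:
    """Return the hex string that encodes `text` as UTF-8 bytes."""
    return binascii.hexlify(text.encode("utf-8")).decode("ascii")


def parses_as_hex(text: str) -> bool:
    """True if `text` is itself a valid hex string (hex digit pairs only)."""
    try:
        bytes.fromhex(text)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    key = "bsanderson"
    # The plain key contains non-hex letters, so a hex parser rejects it.
    print(key, "parses as hex:", parses_as_hex(key))
    # The hex form of the same key is what a bytes-typed key would expect.
    print(key, "as hex:", cli_hex_key(key))
```

With a key validation type set to a text type, the key could presumably be typed as 'bsanderson' again instead of its hex form.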

 On Fri, May 20, 2011 at 9:28 AM, Patrick Julien pjul...@gmail.com wrote:
 The following sample:

 http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes

 No longer works with 0.8.0-rc1

 I get "cannot parse as hex bytes" when doing the

 Next we add some users:
 [default@demo] set users[bsanderson][full_name] = 'Brandon Sanderson';

 section

 I can confirm that CASSANDRA-2497 does work




 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



Re: Inconsistent results using secondary indexes between two DC

2011-05-20 Thread Jonathan Ellis
Has this cluster always been on 0.7.5 or was it upgraded from an
earlier version?

On Thu, May 19, 2011 at 3:26 AM, Wojciech Pietrzok kosci...@gmail.com wrote:
 Just checked. The rows seem to be present in the CF on all nodes (in both
 datacenters), but they are not indexed correctly.

 On each node I used sstablekeys on all CF_NAME-f-XX-Data.db files.
 In cassandra-cli, on a node that behaves correctly, I ran the query
 get CF_NAME where foo = bar and got the correct number of results. I then
 checked with grep whether all the keys are present in the lists returned by
 sstablekeys - none was missing, so it seems that the rows are present
 on all nodes.
 When doing the same query on the nodes in the second DC (using
 ConsistencyLevel.ONE) the results are invalid. Sometimes I get 15 rows
 (the expected, correct number), sometimes 3 rows or 10 rows. What's
 interesting is that every time I get only 3 rows, it's the same list of
 3 rows on both affected nodes.
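
The grep check described above could be sketched roughly as follows. This assumes the sstablekeys output of each node has been saved as text with one key per line; the function name `missing_keys` and the node labels are hypothetical:

```python
from typing import Dict, Iterable, List


def missing_keys(expected: Iterable[str],
                 listings: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """For each node, list the expected keys absent from its key listing.

    `expected` holds the keys returned by the index query on a healthy node;
    `listings` maps a node label to the lines of its saved sstablekeys output.
    """
    expected_set = {k.strip() for k in expected if k.strip()}
    report: Dict[str, List[str]] = {}
    for node, lines in listings.items():
        present = {line.strip() for line in lines if line.strip()}
        report[node] = sorted(expected_set - present)
    return report


if __name__ == "__main__":
    expected = ["6b31", "6b32", "6b33"]
    listings = {
        "dc1-node1": ["6b31\n", "6b32\n", "6b33\n"],
        "dc2-node1": ["6b31\n", "6b33\n"],
    }
    for node, missing in missing_keys(expected, listings).items():
        print(node, "missing:", missing)
```

An empty list for every node matches the situation reported here: the rows exist everywhere, so the problem is more likely in the index than in the data.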


 2011/5/17 Jonathan Ellis jbel...@gmail.com:
 Nothing comes to mind.

 I'd start by using sstable2json to see if the missing rows are in the
 main data CF -- i.e., are they just unindexed, or are they missing
 completely?

 On Sun, May 15, 2011 at 4:33 PM, Wojciech Pietrzok kosci...@gmail.com 
 wrote:
 Hello,

 I've noticed strange behaviour of Cassandra when using secondary indexes.
 There are 2 Data Centers, each with 2 nodes, RF=4, on all nodes
 Cassandra 0.7.5 is installed.
 When I connect to one of the nodes in DC1 and perform query using
 secondary indexes (get ColumnFamily where column = 'foo' in
 cassandra-cli) I always get correct number of rows returned, no matter
 which ConsistencyLevel is set.
 When I connect to one of the nodes in DC2 and perform same query using
 ConsistencyLevel LOCAL_QUORUM the results are correct. But using
 ConsistencyLevel ONE Cassandra doesn't return the correct number of rows
 (it seems that most of the time some of the rows are missing).
 I tried running nodetool repair and nodetool scrub, but this doesn't seem
 to help.

 What might be the cause of such behaviour?

 --
 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  KosciaK     mail: kosci...@gmail.com
                    www : http://kosciak.net/
 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Can I use secondary index with any partitioner

2011-05-20 Thread Jonathan Ellis
Use an empty byte array for the first call, then the last key from the
previous result set for each call afterwards.
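
A minimal sketch of that paging loop, not tied to any particular client library. `fetch_page(start_key, count)` is a hypothetical wrapper around the client's get_indexed_slices call, the row shape is invented for illustration, and the loop assumes start_key is inclusive, so each later page begins with the row already seen:

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[bytes, Dict]  # (row key, columns); shape assumed for illustration


def iter_indexed_rows(fetch_page: Callable[[bytes, int], List[Row]],
                      page_size: int = 100) -> Iterator[Row]:
    """Yield every matching row, paging with start_key as described above."""
    if page_size < 2:
        # With an inclusive start_key, a page of 1 would only repeat itself.
        raise ValueError("page_size must be at least 2")
    start_key = b""       # empty byte array for the first call
    first_page = True
    while True:
        rows = fetch_page(start_key, page_size)
        # Drop the repeated row that opens every page after the first.
        if not first_page and rows and rows[0][0] == start_key:
            rows = rows[1:]
        if not rows:
            return
        yield from rows
        start_key = rows[-1][0]   # last key from the previous result set
        first_page = False


if __name__ == "__main__":
    # Stand-in for a cluster: keys in an arbitrary (token-like) order.
    stored = [b"k3", b"k1", b"k5", b"k2", b"k4"]

    def fake_fetch(start_key: bytes, count: int) -> List[Row]:
        begin = stored.index(start_key) if start_key else 0
        return [(key, {}) for key in stored[begin:begin + count]]

    for key, _columns in iter_indexed_rows(fake_fetch, page_size=2):
        print(key)
```

The loop never compares keys for order, only for equality, which is why it does not care that RandomPartitioner returns rows in token order rather than key order.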

On Fri, May 20, 2011 at 12:53 PM, Dave Rav daver...@yahoo.com wrote:
 if I use 'RandomPartitioner' and call 'get_indexed_slices'

 what  do I do with 'start_key'

 struct IndexClause {
 1: required list<IndexExpression> expressions
 2: required binary start_key,
 3: required i32 count=100,
 }


  Yes.

   On Thu, May 19, 2011 at 7:24 PM, Dave Rav daver...@yahoo.com wrote:
    Can I use secondary index with any partitioner
   
   
   
    1) RandomPartitioner
    2) ByteOrderedPartitioner
   
   




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


CounterColumn

2011-05-20 Thread Mark Emerson
CASSANDRA-2614 - create Column and CounterColumn in the same column family.
When will this be in Cassandra?
Will it be in Cassandra 0.8?

Re: CounterColumn

2011-05-20 Thread Jonathan Ellis
Look for "Fix Version" on the ticket:
https://issues.apache.org/jira/browse/CASSANDRA-2614.

On Fri, May 20, 2011 at 4:35 PM, Mark Emerson a202...@yahoo.com wrote:
 CASSANDRA-2614 - create Column and CounterColumn in the same column family
 when this will be in cassandra
 will this be in cassandra 0.8



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: RTG/MRTG/Cricket replacement using Cassandra?

2011-05-20 Thread yangyangyyy
hi Ryan:


Thanks for the link.
I read the slides. Could you please provide some more details on how the
temporal aggregation is implemented?
Do you use time+granularity as the key, or as column names?

Thanks
Yang

In reply to this post by Aaron Turner
We have a solution for time series data on cassandra at Twitter that 
we'd like to open source, but it requires 0.8/trunk so we're not going 
to release it until that's stable. 

See
http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011

-ryan 

On Thu, Mar 31, 2011 at 3:56 PM, Aaron Turner [hidden email] wrote:

 I've been looking at replacing our PostgreSQL backend for RTG (a SNMP 
 based polling and graphing solution for network traffic/ports) with 
 something using Cassandra in order to solve our scalability and 
 redundancy requirements.  Based on a lot of what I've read, Cassandra 
 is an ideal data store for this  time series data.  In fact, Eric 
 Evans in his presentation on the Cassandra home page suggests that 
 this kind of use case is perfect for Cassandra. 
 
 So this got me wondering if someone had already come up with a CF 
 model for this kind of data, including daily/weekly/monthly/yearly 
 rollups.  Perhaps there's even an open source project or two 
 implementing this sorta thing?  I've found flewton 
 (https://github.com/flewton/flewton), which is possibly relevant, but 
 my Java skills are pretty non-existent so I'm having a hard time 
 figuring it out. 
 
 Thanks, 
 Aaron 
 
 -- 
 Aaron Turner 
  http://synfin.net/  Twitter: @synfinatic 
 http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows 
 Those who would give up essential Liberty, to purchase a little temporary 
 Safety, deserve neither Liberty nor Safety. 
 -- Benjamin Franklin 
 carpe diem quam minimum credula postero 
 

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/RTG-MRTG-Cricket-replacement-using-Cassandra-tp6229322p6388192.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: RTG/MRTG/Cricket replacement using Cassandra?

2011-05-20 Thread Edward Capriolo
The first love of my open life was cacti. I am going to discuss with
them porting some of the system to cassandra.

On Friday, May 20, 2011, yangyangyyy tedd...@gmail.com wrote:
 hi Ryan:


 Thanks for the link.
 I read the slides, could you please provide some more details on how the
 temporal aggregation is implemented?
 do you use time+granularity as the key ? or as column names ?

 Thanks
 Yang

 In reply to this post by Aaron Turner
 We have a solution for time series data on cassandra at Twitter that
 we'd like to open source, but it requires 0.8/trunk so we're not going
 to release it until that's stable.

 See
 http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011

 -ryan

 On Thu, Mar 31, 2011 at 3:56 PM, Aaron Turner [hidden email] wrote:

 I've been looking at replacing our PostgreSQL backend for RTG (a SNMP
 based polling and graphing solution for network traffic/ports) with
 something using Cassandra in order to solve our scalability and
 redundancy requirements.  Based on a lot of what I've read, Cassandra
 is an ideal data store for this  time series data.  In fact, Eric
 Evans in his presentation on the Cassandra home page suggests that
 this kind of use case is perfect for Cassandra.

 So this got me wondering if someone had already come up with a CF
 model for this kind of data, including daily/weekly/monthly/yearly
 rollups.  Perhaps there's even an open source project or two
 implementing this sorta thing?  I've found flewton
 (https://github.com/flewton/flewton), which is possibly relevant, but
 my Java skills are pretty non-existent so I'm having a hard time
 figuring it out.

 Thanks,
 Aaron

 --
 Aaron Turner
  http://synfin.net/         Twitter: @synfinatic
 http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
 Those who would give up essential Liberty, to purchase a little temporary
 Safety, deserve neither Liberty nor Safety.
     -- Benjamin Franklin
 carpe diem quam minimum credula postero

