record not available on created system when other system (Node\Seed) is shutdown

2011-09-12 Thread RAJASHEKAR REDDY
Hi,
 
I installed Cassandra 0.8.4 on two systems, configured as below:
 
System 1: IP 10.1.1.1, which is acting as the seed

  - seeds: 10.1.1.1
  listen_address: 10.1.1.1
  rpc_address: 10.1.1.1
 
  
System 2: IP 10.1.1.2, which is acting as a node
 
 - seeds: 10.1.1.1
 listen_address: 10.1.1.2
 rpc_address: 10.1.1.2
 
I followed the steps below:
 
1. I started System 1 (the seed) and created a keyspace called aspace.
2. I started System 2 (the node) and used the keyspace aspace successfully.
3. On System 1 (the seed) I created a column family and inserted a record.
4. From System 2 (the node) I read the record created on the seed, successfully.
5. From System 1 (the seed) I read the record locally, also successfully.
6. I shut down Cassandra on System 2 (the node).
7. With System 2 (the node) down, I tried to read the record on System 1 (the seed),
where it was created, but the read failed and returned null.
8. I brought System 2 (the node) back up, then tested reading the record on System 1
(the seed) again, and it succeeded.
 
 
My understanding of Cassandra is that, irrespective of whether another system is up
or down, the record should still be available on the remaining systems (node\seed).

So when we stop the node, the records created on the seed are not available even on
the seed itself, and vice versa?
 
Did I miss any configuration?
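
(For reference, the replication factor is specified when the keyspace is created, and
with replication_factor 1 each row is stored on only one node; a minimal cassandra-cli
sketch of a keyspace that keeps a replica on both systems — the column family name is
only illustrative:)

create keyspace aspace
    with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
    and strategy_options = [{replication_factor:2}];
use aspace;
create column family mycf with comparator = UTF8Type;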
 
Thanks in advance for your help.
 
 
Observation:
 The strange part is that the record is not visible on the system where it was created
when the other system (node) is down, but the same record is accessible when the
creating system (seed) is down; I tested the reverse situation and it behaves the same.
 
 
Regards
P. Rajashekar Reddy 



memtable flush thresholds

2011-09-12 Thread Sorin Julean
Hi,

 I've checked the memtable flush behaviour (Cassandra 0.8.4) and it seems to me it
happens sooner than the thresholds are reached.

Here are the thresholds (the defaults, calculated for a heap size of
-Xmx1980M):
ColumnFamily: idx_graphable (Super)
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator:
org.apache.cassandra.db.marshal.BytesType
  Columns sorted by:
org.apache.cassandra.db.marshal.UTF8Type/org.apache.cassandra.db.marshal.UTF8Type
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: true

From the logs it seems to me that none of the thresholds is reached (the minutes
threshold definitely is not).

9-08 20:12:30,136 MeteredFlusher.java (line 62) flushing high-traffic column
family ColumnFamilyStore(table='graph', columnFamily='idx_graphable')
 INFO [NonPeriodicTasks:1] 2011-09-08 20:12:30,144 ColumnFamilyStore.java
(line 1036) Enqueuing flush of
Memtable-idx_graphable@915643571(4671498/96780112
serialized/live bytes, 59891 ops)
 INFO [FlushWriter:111] 2011-09-08 20:12:30,145 Memtable.java (line 237)
Writing Memtable-idx_graphable@915643571(4671498/96780112 serialized/live
bytes, 59891 ops)
 INFO [FlushWriter:111] 2011-09-08 20:12:30,348 Memtable.java (line 254)
Completed flushing [...]/cassandra/data/graph/idx_graphable-g-23-Data.db
(4673905 bytes)


Could someone clarify this for me?
Does "high-traffic column family" have a special meaning?

Many thanks,
Sorin


Index search in provided list of rows (list of rowKeys).

2011-09-12 Thread Evgeniy Ryabitskiy
Hi,

We need to search over Cassandra, and we are using Sphinx for indexing.
Because of Sphinx's architecture we can't use range queries over all the fields
that we need.
So we have to run the Sphinx query first to get a list of rowKeys, and then perform
additional range filtering over column values.

The first, simple solution is to do the filtering on the client side, but that
increases network traffic and memory usage on the client.

Now I'm wondering if it is possible to perform such filtering on the Cassandra
side.
I would like to use an IndexExpression for range filtering within a list of records
(the list of rowKeys returned from the external indexing search engine).

Looking at get_indexed_slices, I found that IndexClause has no way to set a list
of rowKeys (as multiget_slice does), only a start_key.

So 2 questions:

1) Am I missing something, and is my idea possible via some other API?
2) If not, can I open a JIRA for this feature?

Evgeny.


Re: memtable flush thresholds

2011-09-12 Thread Jonathan Ellis
see memtable_total_space_in_mb at
http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
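
(A minimal cassandra.yaml sketch of that setting — the value shown is only illustrative:)

# Total memory permitted for all memtables; the MeteredFlusher flushes memtables
# as this limit is approached, regardless of the per-CF thresholds.
memtable_total_space_in_mb: 512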

On Mon, Sep 12, 2011 at 6:55 AM, Sorin Julean sorin.jul...@gmail.com wrote:
 Hi,

  I've checked the memtable flush behaviour (Cassandra 0.8.4) and it seems to me it
 happens sooner than the thresholds are reached.

 Here are the thresholds (the defaults, calculated for a heap size of
 -Xmx1980M):
     ColumnFamily: idx_graphable (Super)
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator:
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by:
 org.apache.cassandra.db.marshal.UTF8Type/org.apache.cassandra.db.marshal.UTF8Type
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: true

 From the logs it seems to me that none of the thresholds is reached (the
 minutes threshold definitely is not).

 9-08 20:12:30,136 MeteredFlusher.java (line 62) flushing high-traffic column
 family ColumnFamilyStore(table='graph', columnFamily='idx_graphable')
  INFO [NonPeriodicTasks:1] 2011-09-08 20:12:30,144 ColumnFamilyStore.java
 (line 1036) Enqueuing flush of
 Memtable-idx_graphable@915643571(4671498/96780112 serialized/live bytes,
 59891 ops)
  INFO [FlushWriter:111] 2011-09-08 20:12:30,145 Memtable.java (line 237)
 Writing Memtable-idx_graphable@915643571(4671498/96780112 serialized/live
 bytes, 59891 ops)
  INFO [FlushWriter:111] 2011-09-08 20:12:30,348 Memtable.java (line 254)
 Completed flushing [...]/cassandra/data/graph/idx_graphable-g-23-Data.db
 (4673905 bytes)


 Could someone clarify it for me ?
 high-traffic column family has a special meaning ?

 Many thanks,
 Sorin




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: memtable flush thresholds

2011-09-12 Thread Sorin Julean
Thanks Jonathan !

 memtable_total_space_in_mb is the threshold that is reached.

Kind regards,
Sorin

On Mon, Sep 12, 2011 at 3:16 PM, Jonathan Ellis jbel...@gmail.com wrote:

 see memtable_total_space_in_mb at
 http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/

 On Mon, Sep 12, 2011 at 6:55 AM, Sorin Julean sorin.jul...@gmail.com
 wrote:
  Hi,
 
   I've checked the memtable flush behaviour (Cassandra 0.8.4) and it seems to me it
  happens sooner than the thresholds are reached.
 
  Here are the thresholds (the defaults, calculated for a heap size of
  -Xmx1980M):
  ColumnFamily: idx_graphable (Super)
Key Validation Class: org.apache.cassandra.db.marshal.BytesType
Default column value validator:
  org.apache.cassandra.db.marshal.BytesType
Columns sorted by:
 
 org.apache.cassandra.db.marshal.UTF8Type/org.apache.cassandra.db.marshal.UTF8Type
Row cache size / save period in seconds: 0.0/0
Key cache size / save period in seconds: 20.0/14400
Memtable thresholds: 0.5671875/1440/121 (millions of
 ops/minutes/MB)
GC grace seconds: 864000
Compaction min/max thresholds: 4/32
Read repair chance: 1.0
Replicate on write: true
 
  From the logs it seems to me that none of the thresholds is reached (the
  minutes threshold definitely is not).
 
  9-08 20:12:30,136 MeteredFlusher.java (line 62) flushing high-traffic
 column
  family ColumnFamilyStore(table='graph', columnFamily='idx_graphable')
   INFO [NonPeriodicTasks:1] 2011-09-08 20:12:30,144 ColumnFamilyStore.java
  (line 1036) Enqueuing flush of
  Memtable-idx_graphable@915643571(4671498/96780112 serialized/live bytes,
  59891 ops)
   INFO [FlushWriter:111] 2011-09-08 20:12:30,145 Memtable.java (line 237)
  Writing Memtable-idx_graphable@915643571(4671498/96780112
 serialized/live
  bytes, 59891 ops)
   INFO [FlushWriter:111] 2011-09-08 20:12:30,348 Memtable.java (line 254)
  Completed flushing [...]/cassandra/data/graph/idx_graphable-g-23-Data.db
  (4673905 bytes)
 
 
  Could someone clarify it for me ?
  high-traffic column family has a special meaning ?
 
  Many thanks,
  Sorin
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



Replace Live Node

2011-09-12 Thread Kyle Gibson
Version=0.7.8

I have a 3 node cluster with RF=3, how would I move data from a live
node to a replacement node?

I tried an autobootstrap + decommission, but I got this error on the live node:

Exception in thread "main" java.lang.IllegalStateException:
replication factor (3) exceeds number of endpoints (2)

And I got this error on the new node:

Bootstraping to existing token 113427455640312821154458202477256070484
is not allowed (decommission/removetoken the old node first).

-

Do I really need to do the token - 1 manual selection for this?

Thanks


Re: Not all data structures need timestamps (and don't require wasted memory).

2011-09-12 Thread David Jeske
On Sat, Sep 3, 2011 at 8:26 PM, Kevin Burton bur...@spinn3r.com wrote:

 The point is that replication in Cassandra only needs timestamps to handle
 out of order writes … for values that are idempotent, this isn't necessary.
  The order doesn't matter.


I believe this is a misunderstanding of how idempotency applies to
Cassandra replication. If there were no timestamps stored, how would
read-repair work? There would be two different values with no way to tell
which was written second.


Re: Replace Live Node

2011-09-12 Thread Jeremy Hanna
Yeah - I would bootstrap the new node at an initial_token of the current token minus 1.  Then 
once that has bootstrapped, decommission the old one.  Avoid using removetoken on anything 
before 0.8.3.  Use decommission if you can if you're dealing with a live node.

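A minimal sketch of that sequence, assuming the old node still holds token
113427455640312821154458202477256070484 (host names here are only placeholders):

# cassandra.yaml on the replacement node: the old token minus one
auto_bootstrap: true
initial_token: 113427455640312821154458202477256070483

# once the new node has finished bootstrapping, on the old node:
nodetool -h old-node decommission
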
On Sep 12, 2011, at 10:42 AM, Kyle Gibson wrote:

 Version=0.7.8
 
 I have a 3 node cluster with RF=3, how would I move data from a live
 node to a replacement node?
 
 I tried an autobootstrap + decomission, but I got this error on the live node:
 
 Exception in thread main java.lang.IllegalStateException:
 replication factor (3) exceeds number of endpoints (2)
 
 And I got this error on the new node:
 
 Bootstraping to existing token 113427455640312821154458202477256070484
 is not allowed (decommission/removetoken the old node first).
 
 -
 
 Do I really need to do the token - 1 manual selection for this?
 
 Thanks



Re: what's the difference between repair CF separately and repair the entire node?

2011-09-12 Thread Peter Schuller
 I am using 0.7.4.  so it is always okay to do the routine repair on
 Column Family basis? thanks!

It's okay but won't do what you want; due to a bug you'll see
streaming of data for other column families than the one you're trying
to repair. This will be fixed in 1.0.
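
(For reference, a per-column-family repair is invoked like the following — the
keyspace and column family names are only illustrative:)

nodetool -h localhost repair MyKeyspace MyColumnFamily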

-- 
/ Peter Schuller (@scode on twitter)


Re: Not all data structures need timestamps (and don't require wasted memory).

2011-09-12 Thread David Jeske
After writing my message, I recognized a scenario you might be referring to,
Kevin.

If I understand correctly, you're not referring to set-membership in the
general sense, where one could add and remove entries. General
set-membership, in the context of eventual-consistency, requires timestamps.
The timestamps distinguish between the two states, present and
not-present (not-present being represented by timestamped tombstones in
the case of deletion/removal).

So I suppose you're referring to additive-only set membership, where there
is no need to distinguish between two different states (such as present or
not present in a set), because items can only be added, never changed or
removed. If entries are not allowed to be deleted or modified, then
cassandra-style eventual consistency replication could occur without any
timestamp, because you're simply replicating the existence of keys to all
replicas.

To me this seems like a particularly narrow use case. Any inadvertent write (even
one from a bug or data corruption) would require very frustrating manual
intervention to remove (you'd have to manually shut down all nodes, manually
purge the bad values out of the dataset, then bring the nodes back online). I'm
not a Cassandra developer, but this seems like a very specialized path that is not
very in line with Cassandra's design.

You might have better luck with a distributed store that is not based on
timestamped eventual consistency. I don't know if you can explicitly turn off
timestamps in HBase, but AFAIK the client is allowed to supply them, so you
can just supply zero and they should be compressed out quite well.


AntiEntropyService.getNeighbors pulls information from where?

2011-09-12 Thread Sasha Dolgy
This relates to the issue I opened the other day:
https://issues.apache.org/jira/browse/CASSANDRA-3175 ..  Basically,
'nodetool ring' throws an exception on two of the four nodes.

In my fancy little world, the problems appear to be related to one of
the nodes thinking that someone is their neighbor ... and that someone
moved away a long time ago.

/mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:5]
2011-09-10 21:20:02,182 AntiEntropyService.java (line 658) Could not
proceed on repair because a neighbor (/10.130.185.136) is dead:
manual-repair-d8cdb59a-04a4-4596-b73f-cba3bd2b9eab failed.
/mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:7]
2011-09-11 21:20:02,258 AntiEntropyService.java (line 658) Could not
proceed on repair because a neighbor (/10.130.185.136) is dead:
manual-repair-ad17e938-f474-469c-9180-d88a9007b6b9 failed.
/mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:9]
2011-09-12 21:20:02,256 AntiEntropyService.java (line 658) Could not
proceed on repair because a neighbor (/10.130.185.136) is dead:
manual-repair-636150a5-4f0e-45b7-b400-24d8471a1c88 failed.

This appears only in the logs of the one node that is generating the issue, 172.16.12.10.

Where can I find out where AntiEntropyService.getNeighbors(tablename,
range) is pulling its information from?

On the two nodes that work:

[default@system] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
1b871300-dbdc-11e0--564008fe649f: [172.16.12.10, 172.16.12.11,
172.16.14.12, 172.16.14.10]
[default@system]

From the two nodes that don't work:

[default@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.Ec2Snitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
1b871300-dbdc-11e0--564008fe649f: [172.16.12.10, 172.16.12.11,
172.16.14.12, 172.16.14.10]
UNREACHABLE: [10.130.185.136] -- which is really 172.16.14.10
[default@unknown]

Really now.  Where does 10.130.185.136 exist?  It's in none of the
configurations I have AND the full ring has been shut down and started
up ... not trying to give Vijay a hard time by posting here btw!

Just thinking it could be something super silly ... that a wider
audience has come across.

-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: Replace Live Node

2011-09-12 Thread Kyle Gibson
What could you do if the initial_token is 0?

On Mon, Sep 12, 2011 at 1:09 PM, Jeremy Hanna
jeremy.hanna1...@gmail.com wrote:
 Yeah - I would bootstrap at initial_token of -1 the current one.  Then once 
 that has bootstrapped, then decommission the old one.  Avoid trying to use 
 removetoken on anything before 0.8.3.  Use decommission if you can if you're 
 dealing with a live node.

 On Sep 12, 2011, at 10:42 AM, Kyle Gibson wrote:

 Version=0.7.8

 I have a 3 node cluster with RF=3, how would I move data from a live
 node to a replacement node?

 I tried an autobootstrap + decomission, but I got this error on the live 
 node:

 Exception in thread main java.lang.IllegalStateException:
 replication factor (3) exceeds number of endpoints (2)

 And I got this error on the new node:

 Bootstraping to existing token 113427455640312821154458202477256070484
 is not allowed (decommission/removetoken the old node first).

 -

 Do I really need to do the token - 1 manual selection for this?

 Thanks




Re: Replace Live Node

2011-09-12 Thread Konstantin Naryshkin
The ring wraps around, so the value before 0 is the maximum possible token. I 
believe that it is 2**127 - 1.

- Original Message -
From: Kyle Gibson kyle.gib...@frozenonline.com
To: user@cassandra.apache.org
Sent: Monday, September 12, 2011 3:30:20 PM
Subject: Re: Replace Live Node

What could you do if the initial_token is 0?

On Mon, Sep 12, 2011 at 1:09 PM, Jeremy Hanna
jeremy.hanna1...@gmail.com wrote:
 Yeah - I would bootstrap at initial_token of -1 the current one.  Then once 
 that has bootstrapped, then decommission the old one.  Avoid trying to use 
 removetoken on anything before 0.8.3.  Use decommission if you can if you're 
 dealing with a live node.

 On Sep 12, 2011, at 10:42 AM, Kyle Gibson wrote:

 Version=0.7.8

 I have a 3 node cluster with RF=3, how would I move data from a live
 node to a replacement node?

 I tried an autobootstrap + decomission, but I got this error on the live 
 node:

 Exception in thread main java.lang.IllegalStateException:
 replication factor (3) exceeds number of endpoints (2)

 And I got this error on the new node:

 Bootstraping to existing token 113427455640312821154458202477256070484
 is not allowed (decommission/removetoken the old node first).

 -

 Do I really need to do the token - 1 manual selection for this?

 Thanks




Re: Replace Live Node

2011-09-12 Thread Jeremy Hanna
I believe you'd need 2^127 - 1, which is 170141183460469231731687303715884105727
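
(A quick sanity check of that figure, as a minimal Java sketch:)

import java.math.BigInteger;

public class MaxToken {
    public static void main(String[] args) {
        // 2^127 - 1: the largest token in the RandomPartitioner ring
        System.out.println(BigInteger.ONE.shiftLeft(127).subtract(BigInteger.ONE));
        // prints 170141183460469231731687303715884105727
    }
}
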
On Sep 12, 2011, at 2:30 PM, Kyle Gibson wrote:

 What could you do if the initial_token is 0?
 
 On Mon, Sep 12, 2011 at 1:09 PM, Jeremy Hanna
 jeremy.hanna1...@gmail.com wrote:
 Yeah - I would bootstrap at initial_token of -1 the current one.  Then once 
 that has bootstrapped, then decommission the old one.  Avoid trying to use 
 removetoken on anything before 0.8.3.  Use decommission if you can if you're 
 dealing with a live node.
 
 On Sep 12, 2011, at 10:42 AM, Kyle Gibson wrote:
 
 Version=0.7.8
 
 I have a 3 node cluster with RF=3, how would I move data from a live
 node to a replacement node?
 
 I tried an autobootstrap + decomission, but I got this error on the live 
 node:
 
 Exception in thread main java.lang.IllegalStateException:
 replication factor (3) exceeds number of endpoints (2)
 
 And I got this error on the new node:
 
 Bootstraping to existing token 113427455640312821154458202477256070484
 is not allowed (decommission/removetoken the old node first).
 
 -
 
 Do I really need to do the token - 1 manual selection for this?
 
 Thanks
 
 



cleanup / move

2011-09-12 Thread David McNelis
While it would certainly be preferable not to run a cleanup and a move at
the same time on the same node, is there a technical problem with running a
nodetool move on a node while a cleanup is running?  Or is it possible to
gracefully kill a cleanup, so that a move can be run and then cleanup run
afterwards?

We have a node that is almost full and we need to move it so that we can shift
its load, but it already has a cleanup process running which, instead of
reducing data usage as expected, is actually growing the amount of
space taken at a pretty fast rate.

-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*


Re: Replace Live Node

2011-09-12 Thread Kyle Gibson
So to move data from the node with token 0, the new node needs to have its
initial token set to 170141183460469231731687303715884105727 ?

Another idea: could I move the token to 1, and then use token 0 on the new node?

On Mon, Sep 12, 2011 at 3:38 PM, Jeremy Hanna
jeremy.hanna1...@gmail.com wrote:
 I believe you'd need 2^127 - 1, which is 
 170141183460469231731687303715884105727
 On Sep 12, 2011, at 2:30 PM, Kyle Gibson wrote:

 What could you do if the initial_token is 0?

 On Mon, Sep 12, 2011 at 1:09 PM, Jeremy Hanna
 jeremy.hanna1...@gmail.com wrote:
 Yeah - I would bootstrap at initial_token of -1 the current one.  Then once 
 that has bootstrapped, then decommission the old one.  Avoid trying to use 
 removetoken on anything before 0.8.3.  Use decommission if you can if 
 you're dealing with a live node.

 On Sep 12, 2011, at 10:42 AM, Kyle Gibson wrote:

 Version=0.7.8

 I have a 3 node cluster with RF=3, how would I move data from a live
 node to a replacement node?

 I tried an autobootstrap + decomission, but I got this error on the live 
 node:

 Exception in thread main java.lang.IllegalStateException:
 replication factor (3) exceeds number of endpoints (2)

 And I got this error on the new node:

 Bootstraping to existing token 113427455640312821154458202477256070484
 is not allowed (decommission/removetoken the old node first).

 -

 Do I really need to do the token - 1 manual selection for this?

 Thanks






Re: Replace Live Node

2011-09-12 Thread Jeremy Hanna
 So to move data from node with token 0, the new node needs to have
 initial token set to 170141183460469231731687303715884105727 ?

I would do this route.

 Another idea: could I move token to 1, and then use token 0 on the new node?

nodetool move prior to 0.8 is a very heavy operation.

balancing issue with Random partitioner

2011-09-12 Thread David McNelis
We are running the DataStax 0.8 rpm distro.  We have a situation where we
have 4 nodes and each owns 25% of the keys.  However, the last node in the
ring does not seem to be getting much load at all.

We are using the random partitioner, and we have a total of about 20k keys that
are sequential...

Our nodetool ring output is currently:

Address         DC          Rack    Status  State   Load       Owns    Token
                                                                        127605887595351923798765477786913079296
10.181.138.167  datacenter1 rack1   Up      Normal  99.37 GB   25.00%  0
192.168.100.6   datacenter1 rack1   Up      Normal  106.25 GB  25.00%  42535295865117307932921825928971026432
10.181.137.37   datacenter1 rack1   Up      Normal  77.7 GB    25.00%  85070591730234615865843651857942052863
192.168.100.5   datacenter1 rack1   Up      Normal  494.67 KB  25.00%  127605887595351923798765477786913079296


Nothing is running on netstats on .37 or .5.

I understand that the nature of the beast would cause the load to differ
between the nodes...but I wouldn't expect it to be so drastic.  We had the
token for .37 set to 85070591730234615865843651857942052864, and I
decremented and moved it to try to kickstart some streaming on the thought
that something may have failed, but that didn't yield any appreciable
results.

Are we seeing completely abnormal behavior?  Should I consider making the
token for the fourth node considerably smaller?  We calculated the nodes'
tokens using the standard python script.

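(For reference, the usual calculation is token_i = i * 2**127 / N; a minimal Java
sketch that reproduces the evenly spaced tokens for a 4-node ring:)

import java.math.BigInteger;

public class BalancedTokens {
    public static void main(String[] args) {
        int nodes = 4;
        BigInteger ringSize = BigInteger.ONE.shiftLeft(127);   // 2^127
        for (int i = 0; i < nodes; i++) {
            // i-th evenly spaced RandomPartitioner token: i * 2^127 / nodes
            System.out.println(ringSize.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(nodes)));
        }
    }
}
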
-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*


Re: Index search in provided list of rows (list of rowKeys).

2011-09-12 Thread aaron morton
Just checking, you want an API call like this ? 


multiget_filtered_slice(keys, column_parent, predicate, filter_clause, 
consistency_level)

Where filter_clause is an IndexClause. 

It's a bit messy.

Is there no way to express this as a single get_indexed_slice() call? With a == 
index expression to get the row keys and the other expressions to do the range 
filtering? 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 13/09/2011, at 1:55 AM, Evgeniy Ryabitskiy wrote:

 Hi,
 
 We have an issue to search over Cassandra and we are using Sphinx for 
 indexing.
 Because of Sphinx architecture we can't use range queries over all fields 
 that we need to.
 So we have to run Sphinx Query first to get List of rowKeys and perform 
 additional range filtering over column values.
 
 First simple solution is to do it on Client side. That will increase network 
 traffic and memory usage on client.
 
 Now I'm wondering if it possible to perform such filtering on Cassandra side.
 I wish to use some IndexExpression for range filtering in list of records 
 (list of rowKeys returned from external Indexing Search Engine).
 
 Looking at get_indexed_slices I found out that in IndexClause is no 
 possibility set List of rowKeys (like for multiget_slice), only start_key.
 
 So 2 questions:
 
 1) Am I missing something and my idea is possible via some another API?
 2) If not possible, can I add JIRA for this feature? 
 
 Evgeny.
 
 
 
 
 



Re: balancing issue with Random partitioner

2011-09-12 Thread Jonathan Ellis
Looks kind of like the 4th node was added to the cluster w/o bootstrapping.

On Mon, Sep 12, 2011 at 3:59 PM, David McNelis
dmcne...@agentisenergy.com wrote:
 We are running the datastax .8 rpm distro.  We have a situation where we
 have 4 nodes and each owns 25% of the keys.  However the last node in the
 ring does not seem to be  getting much of a load at all.
 We are using the random partitioner, we have a total of about 20k keys that
 are sequential...
 Our nodetool ring  output is currently:
 Address         DC          Rack        Status State   Load            Owns
    Token

    127605887595351923798765477786913079296
 10.181.138.167  datacenter1 rack1       Up     Normal  99.37 GB
  25.00%  0
 192.168.100.6   datacenter1 rack1       Up     Normal  106.25 GB
 25.00%  42535295865117307932921825928971026432
 10.181.137.37   datacenter1 rack1       Up     Normal  77.7 GB
 25.00%  85070591730234615865843651857942052863
 192.168.100.5   datacenter1 rack1       Up     Normal  494.67 KB
 25.00%  127605887595351923798765477786913079296

 Nothing is running on netstats on .37 or .5.
 I understand that the nature of the beast would cause the load to differ
 between the nodes...but I wouldn't expect it to be so drastic.  We had the
 token for .37 set to 85070591730234615865843651857942052864, and I
 decremented and moved it to try to kickstart some streaming on the thought
 that something may have failed, but that didn't yield any appreciable
 results.
 Are we seeing completely abnormal behavior?  Should I consider making the
 token for the fourth node considerably smaller?  We calculated the node's
 tokens using the standard python script.
 --
 David McNelis
 Lead Software Engineer
 Agentis Energy
 www.agentisenergy.com
 o: 630.359.6395
 c: 219.384.5143
 A Smart Grid technology company focused on helping consumers of energy
 control an often under-managed resource.





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: balancing issue with Random partitioner

2011-09-12 Thread David McNelis
Auto-bootstrapping is turned on and the node had been started several hours
ago.  Since the node already shows up as part of the ring, I would imagine
that nodetool join wouldn't do anything.  Is there a command to jumpstart
bootstrapping?

On Mon, Sep 12, 2011 at 4:22 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Looks kind of like the 4th node was added to the cluster w/o bootstrapping.

 On Mon, Sep 12, 2011 at 3:59 PM, David McNelis
 dmcne...@agentisenergy.com wrote:
  We are running the datastax .8 rpm distro.  We have a situation where we
  have 4 nodes and each owns 25% of the keys.  However the last node in the
  ring does not seem to be  getting much of a load at all.
  We are using the random partitioner, we have a total of about 20k keys
 that
  are sequential...
  Our nodetool ring  output is currently:
  Address DC  RackStatus State   Load
  Owns
 Token
 
 127605887595351923798765477786913079296
  10.181.138.167  datacenter1 rack1   Up Normal  99.37 GB
   25.00%  0
  192.168.100.6   datacenter1 rack1   Up Normal  106.25 GB
  25.00%  42535295865117307932921825928971026432
  10.181.137.37   datacenter1 rack1   Up Normal  77.7 GB
  25.00%  85070591730234615865843651857942052863
  192.168.100.5   datacenter1 rack1   Up Normal  494.67 KB
  25.00%  127605887595351923798765477786913079296
 
  Nothing is running on netstats on .37 or .5.
  I understand that the nature of the beast would cause the load to differ
  between the nodes...but I wouldn't expect it to be so drastic.  We had
 the
  token for .37 set to 85070591730234615865843651857942052864, and I
  decremented and moved it to try to kickstart some streaming on the
 thought
  that something may have failed, but that didn't yield any appreciable
  results.
  Are we seeing completely abnormal behavior?  Should I consider making the
  token for the fourth node considerably smaller?  We calculated the node's
  tokens using the standard python script.
  --
  David McNelis
  Lead Software Engineer
  Agentis Energy
  www.agentisenergy.com
  o: 630.359.6395
  c: 219.384.5143
  A Smart Grid technology company focused on helping consumers of energy
  control an often under-managed resource.
 
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*


Re: what's the difference between repair CF separately and repair the entire node?

2011-09-12 Thread Jim Ancona
On Mon, Sep 12, 2011 at 1:44 PM, Peter Schuller
peter.schul...@infidyne.com wrote:
 I am using 0.7.4.  so it is always okay to do the routine repair on
 Column Family basis? thanks!

 It's okay but won't do what you want; due to a bug you'll see
 streaming of data for other column families than the one you're trying
 to repair. This will be fixed in 1.0.

I think we might be running into this. Is CASSANDRA-2280 the issue
you're referring to?

Jim


Re: Index search in provided list of rows (list of rowKeys).

2011-09-12 Thread Evgeniy Ryabitskiy
Something like that, yes.

Actually, I think it's better to extend the get_indexed_slices() API instead of
creating a new Thrift method.
I would like to have something like this:

//here we run the query against the external search engine
List<byte[]> keys = performSphinxQuery(someFullTextSearchQuery);
IndexClause indexClause = new IndexClause();

//proposed API: set an explicit list of keys instead of just a start_key
indexClause.setKeys(keys);
indexClause.setExpressions(someFilteringExpressions);
List<KeySlice> finalResult = get_indexed_slices(colParent, indexClause, colPredicate,
cLevel);



I can't solve my issue with a single get_indexed_slice() call.
Here is the issue in more detail:
1) We have ~6 million records; in future there could be many more.
2) We have 10k different properties (stored as column values in Cassandra);
in future there could be many more.
3) The properties are text descriptions, int/float values, and string values.
4) We need to implement search over all properties: full-text search for text
descriptions, range search for int/float properties.
5) A search query could use any combination of properties, e.g. a full-text
search on a description plus a range expression on an int/float field.
6) We have an external search engine (Sphinx) that indexes all string and text
properties.
7) We still need to perform range searches on int/float fields.

So now I split my query expressions into 2 groups:
1) expressions that can be handled by the search engine
2) the others (additional filters)

For example, I run the first query against Sphinx and get a list of rowKeys 100k
long (call it RESULT1).
Now I need to filter it by the second group of expressions, for example a simple
range expression on the age field.
If I ran get_indexed_slice() with that expression alone, I could get back half of
my records in the result (call it RESULT2).
Then I would need to compute the intersection of RESULT1 and RESULT2 on the client
side, which could take a lot of time and memory.
That is why I can't use a single get_indexed_slice here.

For me it is better to iterate over RESULT1 (the 100k records) on the client side,
filter by age, and end up with 10-50k records as the final result. The disadvantage
is that I have to fetch all 100k records.

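(For illustration, a minimal Thrift-client sketch of that client-side step, assuming
the client is already bound to the keyspace, the age column holds a UTF-8 integer
string, and the column family / column names are only placeholders:)

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.cassandra.thrift.*;

public class SphinxResultFilter {
    // Fetch only the "age" column for the Sphinx-matched rows in one multiget_slice,
    // then keep the keys whose age passes the range filter (client-side filtering).
    // (In practice the 100k keys would be fetched in smaller batches.)
    public static List<ByteBuffer> filterByAge(Cassandra.Client client,
                                               List<ByteBuffer> sphinxKeys,
                                               int minAge) throws Exception {
        ColumnParent parent = new ColumnParent("properties");      // placeholder CF name
        SlicePredicate predicate = new SlicePredicate();
        List<ByteBuffer> columns = new ArrayList<ByteBuffer>();
        columns.add(ByteBuffer.wrap("age".getBytes("UTF-8")));
        predicate.setColumn_names(columns);

        Map<ByteBuffer, List<ColumnOrSuperColumn>> rows =
            client.multiget_slice(sphinxKeys, parent, predicate, ConsistencyLevel.QUORUM);

        List<ByteBuffer> filtered = new ArrayList<ByteBuffer>();
        for (Map.Entry<ByteBuffer, List<ColumnOrSuperColumn>> row : rows.entrySet()) {
            for (ColumnOrSuperColumn cosc : row.getValue()) {
                // copy the column value out of its ByteBuffer and parse it as an int
                ByteBuffer buf = cosc.column.value.duplicate();
                byte[] bytes = new byte[buf.remaining()];
                buf.get(bytes);
                if (Integer.parseInt(new String(bytes, "UTF-8")) > minAge) {
                    filtered.add(row.getKey());
                }
            }
        }
        return filtered;
    }
}
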
Evgeny.


Cassandra performance on a virtual network....

2011-09-12 Thread Chris Marino
Hello everyone, I wanted to tell you about some performance
benchmarking we have done with Cassandra running in EC2 on a virtual
network.

The purpose of the experiment was to see how running Cassandra on a
virtual network could simplify operational complexity and to determine
the performance impact, relative to native interfaces.

The summary results for running a 4 node cluster are:

Cassandra Performance on vCider Virtual Network

Replication Factor 1        32 B    64 B   128 B   192 B   256 B cols.
v. Unencrypted:            -8.2%    0.8%   -2.3%   -2.3%   -6.7%
v. Encrypted:              63.8%   55.4%   60.0%   53.9%   61.7%
v. Node Only Encryption:   -0.7%   -5.0%    1.9%    5.4%    4.7%

Replication Factor 3        32 B    64 B   128 B   192 B   256 B cols.
v. Unencrypted:            -4.5%   -4.7%   -5.8%   -4.5%   -1.5%
v. Encrypted:              31.5%   29.6%   31.4%   27.3%   29.9%
v. Node Only Encryption:    3.8%    3.9%    6.1%    8.3%    4.0%

There is tremendous EC2 performance variability and our experiments
tried to adjust for that by running 10 trials for each column size and
averaging them. Averaged across all column widths, the performance
was:

Replication Factor 1
v. Unencrypted:             -3.7%
v. Encrypted:               +59%
v. Node Only Encryption:    +1.3%

Replication Factor 3
v. Unencrypted:             -4.2%
v. Encrypted:               +30%
v. Node Only Encryption:    +5.2%

As you might expect, the performance while running on a virtual
network was slower than running on the native interfaces.

However, when you encrypt communications (both node and client) the
performance of the virtual network was faster by nearly 60% (30% with
R3). Since this measurement is primarily an indication of the client
encryption performance, we also measured performance of the somewhat
unrealistic configuration when only node communications were
encrypted.  Here the virtual network performed better as well.

The overall performance loss for un-encrypted traffic (-3.7% for R1 v. -4.2% for R3)
is understandable, since R3 is more network intensive than R1.
However, since the virtual network performs encryption in the kernel
(which seems to be faster than what Cassandra can do natively), when
encryption is turned on the performance gains are greater with R3,
since more data needs to be encrypted.

We ran the tests using the Cassandra stress test tool across a range
of column widths, replication strategies and consistency levels (One,
Quorum).  We used OpenVPN for client encryption. The complete test
results are attached.

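(For a rough idea of the workload shape, an invocation of the bundled stress tool
might look like the line below; the option names and values here are illustrative
assumptions rather than the exact command used:)

tools/stress/bin/stress -d node1,node2,node3,node4 -o insert -n 1000000 -S 256 -e QUORUM
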
I’m going to write up a more complete analysis of these results, but
wanted to share them with you to see if there was anything obvious
that we overlooked.  We are currently running experiments against
clusters running in multiple EC2 regions.

We expect similar performance characteristics across regions, but with
the added benefit of not needing to fuss with the EC2 snitch. The
virtual network lets you assign your own private IPs for all Cassandra
interfaces so the standard Snitch can be used everywhere.

If you're running Cassandra in EC2 (or any other public cloud) and
want encrypted communications, running on a virtual network is a clear
winner.  Here, not only is it 30-60% faster, but you don't have to
bother with the point-to-point configuration of setting up a third-party
encryption technique. Since those run in user space, it's not
surprising that dramatic performance gains can be achieved with the
kernel-based approach of the virtual network.

When we're done we will put everything in a public repo that includes all
Puppet configuration modules as well as a collection of scripts that
automate nearly all of the testing. I hope to have that in the next
week or so, but wanted to get some of these single-region results out
there in advance.

If you are interested, you can learn more about the vCider virtual
network at www.vcider.com

Let me know if you have any questions.
CM


vCider.Cassandra.benchmarks.pdf
Description: Adobe PDF document


Re: AntiEntropyService.getNeighbors pulls information from where?

2011-09-12 Thread aaron morton
I'm pretty sure I'm behind on how to deal with this problem. 

Best I know is to start the node with -Dcassandra.load_ring_state=false as a 
JVM option. But if the ghost IP address is in gossip it will not work, and it 
should be in gossip.
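
(For illustration, that option can go into conf/cassandra-env.sh — a minimal sketch:)

# make the node ignore its locally saved ring state on the next start
JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"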

Does the ghost IP show up in nodetool ring ? 

Anyone know a way to remove a ghost IP from gossip that does not have a token 
associated with it ?

Cheers
  
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 13/09/2011, at 6:39 AM, Sasha Dolgy wrote:

 This relates to the issue i opened the other day:
 https://issues.apache.org/jira/browse/CASSANDRA-3175 ..  basically,
 'nodetool ring' throws an exception on two of the four nodes.
 
 In my fancy little world, the problems appear to be related to one of
 the nodes thinking that someone is their neighbor ... and that someone
 moved away a long time ago
 
 /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:5]
 2011-09-10 21:20:02,182 AntiEntropyService.java (line 658) Could not
 proceed on repair because a neighbor (/10.130.185.136) is dead:
 manual-repair-d8cdb59a-04a4-4596-b73f-cba3bd2b9eab failed.
 /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:7]
 2011-09-11 21:20:02,258 AntiEntropyService.java (line 658) Could not
 proceed on repair because a neighbor (/10.130.185.136) is dead:
 manual-repair-ad17e938-f474-469c-9180-d88a9007b6b9 failed.
 /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:9]
 2011-09-12 21:20:02,256 AntiEntropyService.java (line 658) Could not
 proceed on repair because a neighbor (/10.130.185.136) is dead:
 manual-repair-636150a5-4f0e-45b7-b400-24d8471a1c88 failed.
 
 Appears only in the logs for one node that is generating the issue. 
 172.16.12.10
 
 Where do I find where the AntiEntropyService.getNeighbors(tablename,
 range) is pulling it's information from?
 
 On the two nodes that work:
 
 [default@system] describe cluster;
 Cluster Information:
 Snitch: org.apache.cassandra.locator.Ec2Snitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
 1b871300-dbdc-11e0--564008fe649f: [172.16.12.10, 172.16.12.11,
 172.16.14.12, 172.16.14.10]
 [default@system]
 
 From the two nodes that don't work:
 
 [default@unknown] describe cluster;
 Cluster Information:
 Snitch: org.apache.cassandra.locator.Ec2Snitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
 1b871300-dbdc-11e0--564008fe649f: [172.16.12.10, 172.16.12.11,
 172.16.14.12, 172.16.14.10]
 UNREACHABLE: [10.130.185.136] -- which is really 172.16.14.10
 [default@unknown]
 
 Really now.  Where does 10.130.185.136 exist?  It's in none of the
 configurations I have AND the full ring has been shut down and started
 up ... not trying to give Vijay a hard time by posting here btw!
 
 Just thinking it could be something super silly ... that a wider
 audience has come across.
 
 -- 
 Sasha Dolgy
 sasha.do...@gmail.com



Re: cleanup / move

2011-09-12 Thread aaron morton
  is there a techincal problem with running a nodetool move  on a node while a 
 cleanup is running?  
Cleanup is removing data that the node is no longer responsible for while move 
is first removing *all* data from the node and then streaming new data to it. 

I'd put that in the crossing the streams category 
(http://www.youtube.com/watch?v=jyaLZHiJJnE). i.e. best avoided. 

To kill the cleanup kill the node. Operations such as that create new data, and 
then delete old data. They do not mutate existing data. 

Cleanup will write new SSTables, and then mark the old ones as compacted. When 
the old SSTables are marked as compacted you will see a zero length 
Compacted marker file. Cassandra will delete the compacted data files when it needs 
to. 

If you want the deletion to happen sooner rather than later, force a Java GC 
through JConsole. 

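(For illustration, the markers look something like this — the data directory and file
name are only examples based on the keyspace in this thread:)

ls /var/lib/cassandra/data/graph/*-Compacted   # zero-length markers next to superseded SSTables
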
Hope that helps. 
 
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 13/09/2011, at 7:41 AM, David McNelis wrote:

 While it would certainly be preferable to not run a cleanup and a move  at 
 the same time on the same node, is there a techincal problem with running a 
 nodetool move  on a node while a cleanup is running?  Or if its possible to 
 gracefully kill a cleanup, so that a move can  be run and then cleanup run 
 after?
 
 We have a node that is almost full and need to move it so that we can shift 
 its  loadbut it already has a cleanup process running which, instead of 
 causing less data usage as expected, is  actually growing the amount of space 
 taken at a pretty fast rate.
 
 -- 
 David McNelis
 Lead Software Engineer
 Agentis Energy
 www.agentisenergy.com
 o: 630.359.6395
 c: 219.384.5143
 
 A Smart Grid technology company focused on helping consumers of energy 
 control an often under-managed resource.
 
 



Re: balancing issue with Random partitioner

2011-09-12 Thread aaron morton
Try a reapir on 100.5 , it will then request the data from the existing nodes. 

You will then need to clean on the existing three nodes once the repair has 
completed. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 13/09/2011, at 9:32 AM, David McNelis wrote:

 Auto-bootstrapping is turned on and the node had  been started several hours 
 ago.   Since the node already shows up as part of the ring I would imagine 
 that nodetool join wouldn't do anything.Is there a command to jumpstart 
 bootstrapping?
 
 On Mon, Sep 12, 2011 at 4:22 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Looks kind of like the 4th node was added to the cluster w/o bootstrapping.
 
 On Mon, Sep 12, 2011 at 3:59 PM, David McNelis
 dmcne...@agentisenergy.com wrote:
  We are running the datastax .8 rpm distro.  We have a situation where we
  have 4 nodes and each owns 25% of the keys.  However the last node in the
  ring does not seem to be  getting much of a load at all.
  We are using the random partitioner, we have a total of about 20k keys that
  are sequential...
  Our nodetool ring  output is currently:
  Address DC  RackStatus State   LoadOwns
 Token
 
 127605887595351923798765477786913079296
  10.181.138.167  datacenter1 rack1   Up Normal  99.37 GB
   25.00%  0
  192.168.100.6   datacenter1 rack1   Up Normal  106.25 GB
  25.00%  42535295865117307932921825928971026432
  10.181.137.37   datacenter1 rack1   Up Normal  77.7 GB
  25.00%  85070591730234615865843651857942052863
  192.168.100.5   datacenter1 rack1   Up Normal  494.67 KB
  25.00%  127605887595351923798765477786913079296
 
  Nothing is running on netstats on .37 or .5.
  I understand that the nature of the beast would cause the load to differ
  between the nodes...but I wouldn't expect it to be so drastic.  We had the
  token for .37 set to 85070591730234615865843651857942052864, and I
  decremented and moved it to try to kickstart some streaming on the thought
  that something may have failed, but that didn't yield any appreciable
  results.
  Are we seeing completely abnormal behavior?  Should I consider making the
  token for the fourth node considerably smaller?  We calculated the node's
  tokens using the standard python script.
  --
  David McNelis
  Lead Software Engineer
  Agentis Energy
  www.agentisenergy.com
  o: 630.359.6395
  c: 219.384.5143
  A Smart Grid technology company focused on helping consumers of energy
  control an often under-managed resource.
 
 
 
 
 
 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com
 
 
 
 -- 
 David McNelis
 Lead Software Engineer
 Agentis Energy
 www.agentisenergy.com
 o: 630.359.6395
 c: 219.384.5143
 
 A Smart Grid technology company focused on helping consumers of energy 
 control an often under-managed resource.
 
 



Re: AntiEntropyService.getNeighbors pulls information from where?

2011-09-12 Thread Sasha Dolgy
use system;
del LocationInfo[52696e67];

I ran this on the nodes that had the problems, stopped and started the
nodes, and it re-did its job.  Job done: all fixed, with a new bug filed!
https://issues.apache.org/jira/browse/CASSANDRA-3186

On Tue, Sep 13, 2011 at 2:09 AM, aaron morton aa...@thelastpickle.com wrote:
 I'm pretty sure I'm behind on how to deal with this problem.

 Best I know is to start the node with -Dcassandra.load_ring_state=false as 
 a JVM option. But if the ghost IP address is in gossip it will not work, and 
 it should be in gossip.

 Does the ghost IP show up in nodetool ring ?

 Anyone know a way to remove a ghost IP from gossip that does not have a token 
 associated with it ?

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 13/09/2011, at 6:39 AM, Sasha Dolgy wrote:

 This relates to the issue i opened the other day:
 https://issues.apache.org/jira/browse/CASSANDRA-3175 ..  basically,
 'nodetool ring' throws an exception on two of the four nodes.

 In my fancy little world, the problems appear to be related to one of
 the nodes thinking that someone is their neighbor ... and that someone
 moved away a long time ago

 /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:5]
 2011-09-10 21:20:02,182 AntiEntropyService.java (line 658) Could not
 proceed on repair because a neighbor (/10.130.185.136) is dead:
 manual-repair-d8cdb59a-04a4-4596-b73f-cba3bd2b9eab failed.
 /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:7]
 2011-09-11 21:20:02,258 AntiEntropyService.java (line 658) Could not
 proceed on repair because a neighbor (/10.130.185.136) is dead:
 manual-repair-ad17e938-f474-469c-9180-d88a9007b6b9 failed.
 /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:9]
 2011-09-12 21:20:02,256 AntiEntropyService.java (line 658) Could not
 proceed on repair because a neighbor (/10.130.185.136) is dead:
 manual-repair-636150a5-4f0e-45b7-b400-24d8471a1c88 failed.

 Appears only in the logs for one node that is generating the issue. 
 172.16.12.10

 Where do I find where the AntiEntropyService.getNeighbors(tablename,
 range) is pulling it's information from?

 On the two nodes that work:

 [default@system] describe cluster;
 Cluster Information:
 Snitch: org.apache.cassandra.locator.Ec2Snitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
 1b871300-dbdc-11e0--564008fe649f: [172.16.12.10, 172.16.12.11,
 172.16.14.12, 172.16.14.10]
 [default@system]

 From the two nodes that don't work:

 [default@unknown] describe cluster;
 Cluster Information:
 Snitch: org.apache.cassandra.locator.Ec2Snitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
 1b871300-dbdc-11e0--564008fe649f: [172.16.12.10, 172.16.12.11,
 172.16.14.12, 172.16.14.10]
 UNREACHABLE: [10.130.185.136] -- which is really 172.16.14.10
 [default@unknown]

 Really now.  Where does 10.130.185.136 exist?  It's in none of the
 configurations I have AND the full ring has been shut down and started
 up ... not trying to give Vijay a hard time by posting here btw!

 Just thinking it could be something super silly ... that a wider
 audience has come across.

 --
 Sasha Dolgy
 sasha.do...@gmail.com





-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: what's the difference between repair CF separately and repair the entire node?

2011-09-12 Thread Yan Chunlu
I think it is a serious problem since I cannot repair.  I am
using Cassandra on production servers. Is there some way to fix it
without upgrading?  I have heard that 0.8.x is still not quite ready for
production environments.

thanks!

On Tue, Sep 13, 2011 at 1:44 AM, Peter Schuller
peter.schul...@infidyne.com wrote:
 I am using 0.7.4.  so it is always okay to do the routine repair on
 Column Family basis? thanks!

 It's okay but won't do what you want; due to a bug you'll see
 streaming of data for other column families than the one you're trying
 to repair. This will be fixed in 1.0.

 --
 / Peter Schuller (@scode on twitter)



Re: Cassandra -f problem

2011-09-12 Thread Hernán Quevedo
Hi, Roshan. This is great support, amazing support; not used to it :)

Thanks for the reply.

Well, I think Java is installed correctly; I mean, the java -version command
works in a terminal, so the PATH env variable is correctly set, right?
I downloaded JDK 7, put it under opt/java/, and then set the path.


But, the eclipse icon says it can't find any JRE or JDK, which is weird
because of what I said above... but... but... what else could it be?

Thanks!

On Sun, Sep 11, 2011 at 10:05 PM, Roshan Dawrani roshandawr...@gmail.comwrote:

 Hi,

 Cassandra starts JVM as $JAVA -ea -cp $CLASSPATH

 Looks like $JAVA is coming up empty in your case, hence the error exec -ea
 not found.

 Do you not have java installed? Please install it and set JAVA_HOME
 appropriately and retry.

 Cheers.


 On Mon, Sep 12, 2011 at 8:23 AM, Hernán Quevedo 
 alexandros.c@gmail.com wrote:

 Hi, all.

 I'm new at this and haven't been able to install Cassandra on Debian
 6. After uncompressing the tar and creating the var/log and var/lib
 directories, the command bin/cassandra -f results in the message exec:
 357 -ea not found, preventing Cassandra from running the process the README
 file says it is supposed to start.

 Any help would be very appreciated.

 Thnx!




 --
 Roshan
 Blog: http://roshandawrani.wordpress.com/
 Twitter: @roshandawrani http://twitter.com/roshandawrani
 Skype: roshandawrani




-- 
Είναι η θέληση των Θεών.


Re: Cassandra -f problem

2011-09-12 Thread Roshan Dawrani
Hi,

Do you have JAVA_HOME exported? If not, can you export it and retry?

Cheers.
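
(A minimal sketch, assuming the JDK really is under opt/java as described — the exact
directory name is only an example:)

export JAVA_HOME=/opt/java/jdk1.7.0
export PATH=$JAVA_HOME/bin:$PATH
bin/cassandra -f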

On Tue, Sep 13, 2011 at 8:59 AM, Hernán Quevedo
alexandros.c@gmail.comwrote:

 Hi, Roshan. This is great support, amazing support; not used to it :)

 Thanks for the reply.

 Well I think java is installed correctly, I mean, the java -version command
 works on a terminal, so the PATH env variable is correctly set, right?
 I downloaded the JDK7 and put it on opt/java/ and then set the path.


 But, the eclipse icon says it can't find any JRE or JDK, which is weird
 because of what I said above... but... but... what else could it be?

 Thanks!


 On Sun, Sep 11, 2011 at 10:05 PM, Roshan Dawrani 
 roshandawr...@gmail.comwrote:

 Hi,

 Cassandra starts JVM as $JAVA -ea -cp $CLASSPATH

 Looks like $JAVA is coming is empty in your case, hence the error exec
 -ea not found.

 Do you not have java installed? Please install it and set JAVA_HOME
 appropriately and retry.

 Cheers.


 On Mon, Sep 12, 2011 at 8:23 AM, Hernán Quevedo 
 alexandros.c@gmail.com wrote:

 Hi, all.

 I´m new at this and haven´t been able to install cassandra in debian
 6. After uncompressing the tar and creating var/log and var/lib
 directories, the command bin/cassandra -f results in message exec:
 357 -ea not found preventing cassandra from run the process README
 file says it is suppose to start.

 Any help would be very appreciated.

 Thnx!




 --
 Roshan
 Blog: http://roshandawrani.wordpress.com/
 Twitter: @roshandawrani http://twitter.com/roshandawrani
 Skype: roshandawrani




 --
 Είναι η θέληση των Θεών.




-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani http://twitter.com/roshandawrani
Skype: roshandawrani