Unsuccessful attempt to add a second node to a ring.
Hi Everybody! I'm trying to add a second node to an already operating one-node cluster. Some specs:
- cassandra 1.0.7
- both nodes have a routable listen_address and rpc_address
- ports are open: (from node2) telnet node1 7000 is successful
- the seeds parameter on node2 points to node1

[node1] nodetool -h localhost ring
Address    DC           Rack   Status  State   Load      Owns     Token
node1.ip   datacenter1  rack1  Up      Normal  74.33 KB  100.00%  0

- the initial token on node2 was specified

I see something like this in the logs on node2:

DEBUG [main] 2012-07-31 13:50:38,640 CollationController.java (line 76) collectTimeOrderedData
INFO [main] 2012-07-31 13:50:38,641 StorageService.java (line 667) JOINING: waiting for ring and schema information
DEBUG [WRITE-NODE1/node1.ip] 2012-07-31 13:50:39,642 OutboundTcpConnection.java (line 206) attempting to connect to NODE1/node1.ip
DEBUG [ScheduledTasks:1] 2012-07-31 13:50:40,639 LoadBroadcaster.java (line 86) Disseminating load info ...
INFO [main] 2012-07-31 13:51:08,641 StorageService.java (line 667) JOINING: schema complete, ready to bootstrap
DEBUG [main] 2012-07-31 13:51:08,642 StorageService.java (line 554) ... got ring + schema info
INFO [main] 2012-07-31 13:51:08,642 StorageService.java (line 667) JOINING: getting bootstrap token
DEBUG [main] 2012-07-31 13:51:08,644 BootStrapper.java (line 138) token manually specified as 85070591730234615865843651857942052864
DEBUG [main] 2012-07-31 13:51:08,645 Table.java (line 387) applying mutation of row 4c

but it doesn't join the ring:

[node2] nodetool -h localhost ring
Address    DC           Rack   Status  State   Load      Owns     Token
node2.ip   datacenter1  rack1  Up      Normal  13.49 KB  100.00%  85070591730234615865843651857942052864

I'm attaching the full log from node2's startup in debug mode.

PS. When I didn't specify the initial token on node2, I ended up with an exception like this:

Exception encountered during startup: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list.

I'm not sure how to proceed now. I found a couple of posts with problems like this, but they weren't very useful.

-- regards, Jakub Glapa
Re: Unsuccessful attempt to add a second node to a ring.
Jakub,

Have you set the data, commitlog, and saved-caches directories to different ones in each yaml file for each node?

Regards,
Roshni

From: Jakub Glapa <jakub.gl...@gmail.com>
Reply-To: user@cassandra.apache.org
To: user@cassandra.apache.org
Subject: Unsuccessful attempt to add a second node to a ring.
Re: Unsuccessful attempt to add a second node to a ring.
Hi Roshni,
no, they are the same. My changes in cassandra.yaml were only to the listen_address, rpc_address, seeds and initial_token fields. The rest is exactly the same as on node1. That's how the file looks on node2:

cluster_name: 'Test Cluster'
initial_token: 85070591730234615865843651857942052864
hinted_handoff_enabled: true
hinted_handoff_throttle_delay_in_ms: 1
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
authority: org.apache.cassandra.auth.AllowAllAuthority
partitioner: org.apache.cassandra.dht.RandomPartitioner
data_file_directories:
    - /data/servers/cassandra_sbe_edtool/cassandra_data/data
commitlog_directory: /data/servers/cassandra_sbe_edtool/cassandra_data/commitlog
saved_caches_directory: /data/servers/cassandra_sbe_edtool/cassandra_data/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 1
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: NODE1
flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
concurrent_reads: 32
concurrent_writes: 32
memtable_flush_queue_size: 4
sliced_buffer_size_in_kb: 64
storage_port: 7000
ssl_storage_port: 7001
listen_address: NODE2
rpc_address: NODE2
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16
incremental_backups: false
snapshot_before_compaction: false
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 64
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
compaction_preheat_key_cache: true
rpc_timeout_in_ms: 1
endpoint_snitch: org.apache.cassandra.locator.SimpleSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 60
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
encryption_options:
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra

-- regards, pozdrawiam, Jakub Glapa

On Wed, Aug 1, 2012 at 10:29 AM, Roshni Rajagopal <roshni.rajago...@wal-mart.com> wrote:
Jakub, Have you set the Data, commitlog, saved cache directories to different ones in each yaml file for each node? Regards, Roshni
Re: Unsuccessful attempt to add a second node to a ring.
Ok, sorry, it may not be required. I was thinking of a configuration I had done on my local laptop, where I had aliased my IP address; in that case the directories and JMX port needed to be different. The cluster name is the same, right?

From: Jakub Glapa <jakub.gl...@gmail.com>
Reply-To: user@cassandra.apache.org
To: user@cassandra.apache.org
Subject: Re: Unsuccessful attempt to add a second node to a ring.
Re: Unsuccessful attempt to add a second node to a ring.
Yes, it's the same.

-- regards, pozdrawiam, Jakub Glapa

On Wed, Aug 1, 2012 at 11:24 AM, Roshni Rajagopal <roshni.rajago...@wal-mart.com> wrote:
Ok, sorry, it may not be required. I was thinking of a configuration I had done on my local laptop, where I had aliased my IP address; in that case the directories and JMX port needed to be different. The cluster name is the same, right?
Restore snapshot
Hi, is it possible to restore a snapshot of a keyspace on a live Cassandra cluster (I mean, without restarting)?
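For what it's worth, a rough sketch of one common approach. The paths, keyspace and column-family names below are placeholders, and `nodetool refresh` (which loads newly placed sstables without a restart) may not exist in older releases — check your version; otherwise a rolling restart is needed:

```shell
# Hypothetical layout -- adjust the data directory, keyspace (KS),
# column family (CF) and snapshot tag (SNAP) to your setup.
DATA=/var/lib/cassandra/data
KS=MyKeyspace
CF=MyColumnFamily
SNAP=my_snapshot_tag

# Copy the snapshot's sstable files back into the live data directory
cp "$DATA/$KS/snapshots/$SNAP/"*.db "$DATA/$KS/"

# Ask the running node to pick up the newly placed sstables (no restart);
# only works if your nodetool has the "refresh" command
nodetool -h localhost refresh "$KS" "$CF"
```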
Re: Does Cassandra support operations in a transaction?
Hi Ivan,

No, Cassandra does not support transactions. I believe each operation is atomic: if an operation returns a successful result, then it worked. You can't do things like binding two operations together and guaranteeing that if either fails, they both fail. You will find that Cassandra doesn't do a lot of things compared to a SQL db :-) But it does write a lot of data quickly.

-g

On Wed, Aug 1, 2012 at 5:21 AM, Ivan Jiang <wiwi1...@gmail.com> wrote:
Hi, I am new to Cassandra. I wonder if it is possible to call Cassandra in one transaction, such as in a relational DB. Thanks in advance. Best Regards, Ivan Jiang
Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
Just for information: we are running on 1.1.2. JNA or not had no difference; manually calling a full GC had no difference; but in my case the reduction of commitlog_total_space_in_mb to 2048 (from the default 4096) makes the difference.

On 07/26/2012 04:27 PM, Mina Naguib wrote:

Hi Thomas

On a modern 64-bit server, I recommend you pay little attention to the virtual size. It's made up of almost everything within the process's address space, including on-disk files mmap()ed in for zero-copy access. It's not unreasonable for a machine with N amount of RAM to have a process whose virtual size is several times the value of N. That in and of itself is not problematic.

In a default cassandra 1.1.x setup, the bulk of that will be your sstables' data and index files. On linux you can invoke the pmap tool on the cassandra process's PID to see what's in there. Much of it will be anonymous memory allocations (the JVM heap itself, off-heap data structures, etc), but lots of it will be references to files on disk (binaries, libraries, mmap()ed files, etc).

What's more important to keep an eye on is the JVM heap - typically statically allocated to a fixed size at cassandra startup. You can get info about its used/capacity values via "nodetool -h localhost info". You can also hook up jconsole and trend it over time.

The other critical piece is the process's RESident memory size, which includes the JVM heap but also other off-heap data structures and miscellanea. Cassandra has recently been making more use of off-heap structures (for example, row caching via SerializingCacheProvider). This is done as a matter of efficiency - a serialized off-heap row is much smaller than a classical object sitting in the JVM heap - so you can do more with less.

Unfortunately, in my experience, it's not perfect. They still have a cost, in terms of on-heap usage, as well as off-heap growth over time. Specifically, my experience with cassandra 1.1.0 showed that off-heap row caches incurred a very high on-heap cost (ironic) - see my post at http://mail-archives.apache.org/mod_mbox/cassandra-user/201206.mbox/%3c6feb097f-287b-471d-bea2-48862b30f...@bloomdigital.com%3E - as documented in that email, I managed that with regularly scheduled full GC runs via System.gc().

I have, since then, moved away from scheduled System.gc() to scheduled row cache invalidations. While this had the same effect as the System.gc() I described in my email, it eliminated the 20-30 second pause associated with it. It did however introduce (or maybe I never noticed it earlier) a slow creep in memory usage outside of the heap. It's typical in my case, for example, for a process configured with 6G of JVM heap to start up, stabilize at 6.5 - 7GB RESident usage, then creep up slowly throughout a week to the 10-11GB range. Depending on what else the box is doing, I've experienced the linux OOM killer killing cassandra as you've described, or heavy swap usage bringing everything down (we're latency-sensitive), etc.

And now for the good news. Since I've upgraded to 1.1.2:
1. There's no more need for regularly scheduled System.gc()
2. There's no more need for regularly scheduled row cache invalidation
3. The HEAP usage within the JVM is stable over time
4. The RESident size of the process appears also stable over time

Point #4 above is still pending, as I only have 3-day graphs since the upgrade, but they show promising results compared to the slope of the same graph before the upgrade to 1.1.2.

So my advice is give 1.1.2 a shot - just be mindful of https://issues.apache.org/jira/browse/CASSANDRA-4411

On 2012-07-26, at 2:18 AM, Thomas Spengler wrote:

I saw this. All works fine up to version 1.1.0: 0.8.x takes 5 GB of memory on an 8 GB machine, 1.0.x takes between 6 and 7 GB on an 8 GB machine, but 1.1.0 takes it all, and that is a problem for me. It is no solution to wait for the OOM killer from the linux kernel and restart the Cassandra process; when my machine has less than 100 MB of RAM available, I have a problem.

On 07/25/2012 07:06 PM, Tyler Hobbs wrote:

Are you actually seeing any problems from this? High virtual memory usage on its own really doesn't mean anything. See http://wiki.apache.org/cassandra/FAQ#mmap

On Wed, Jul 25, 2012 at 1:21 AM, Thomas Spengler <thomas.speng...@toptarif.de> wrote:

No one has any idea? We tried updating to 1.1.2, with DiskAccessMode standard, indexAccessMode standard, row_cache_size_in_mb: 0, key_cache_size_in_mb: 0. Our next try will be to change SerializingCacheProvider to ConcurrentLinkedHashCacheProvider. Any other proposals are welcome.

On 07/04/2012 02:13 PM, Thomas Spengler wrote:

Hi @all, since our upgrade from cassandra 1.0.3 to 1.1.0 the virtual memory usage of the cassandra nodes explodes. Our setup is:
* 5 centos 5.8 nodes
* each with 4 CPUs and 8 GB RAM
* each node holds about
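To make the virtual-vs-resident distinction above concrete, here is a rough sketch of the inspection commands mentioned in the thread (Linux only; the "CassandraDaemon" process-name pattern is an assumption about your install):

```shell
# Find the Cassandra PID (the "CassandraDaemon" pattern is an assumption)
PID=$(pgrep -f CassandraDaemon | head -n1)

# Kernel's view: VmSize (virtual, includes mmap()ed sstables) vs VmRSS (resident)
grep -E 'VmSize|VmRSS' "/proc/$PID/status"

# Break down the virtual size: anonymous memory, libraries, mmap()ed files;
# sort by mapping size and show the largest entries
pmap -x "$PID" | sort -k2 -n | tail -20

# JVM heap used/capacity as reported by Cassandra itself
nodetool -h localhost info
```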
Re: Unsuccessful attempt to add a second node to a ring.
I found a similar thread from March: http://www.mail-archive.com/user@cassandra.apache.org/msg21007.html

For me, clearing the data and starting from the beginning didn't help. It's interesting because on my dev environment I was able to add another node without any problems. The only difference is that the second node now is in a different data center (but I'm not using any different settings, SimpleSnitch). Ports 7000, 9160 and 7199 were open between those 2 nodes.

How else can I check if the communication between those 2 nodes is working? In the logs I see:

DEBUG [WRITE-NODE1/node1.ip] 2012-07-31 13:50:39,642 OutboundTcpConnection.java (line 206) attempting to connect to NODE1/node1.ip

So I assume that the communication is somehow established?

-- regards, Jakub Glapa
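Beyond a single telnet check, a minimal script-form probe of the ports involved (a sketch: `node1` is a placeholder hostname, and this only proves TCP reachability, not that gossip itself works):

```shell
#!/usr/bin/env bash
# Report whether a TCP connection to host:port succeeds.
# Uses bash's /dev/tcp; "timeout" bounds the wait on filtered ports.
check_port() {
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 closed"
  fi
}

check_port node1 7000   # storage_port: gossip / internode traffic
check_port node1 9160   # rpc_port: Thrift clients
check_port node1 7199   # JMX, used by nodetool
```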
Re: Creating counter columns in cassandra
Hi All, I faced this same problem when trying to query counter values. I am using a phone number as the row key and updating the number of calls taken to that number. So my query is like:

SELECT KEY FROM columnFamily WHERE No_of_Calls5

This returns no data and no exception, though I am 100% sure there are entries which satisfy the query. I used the same code Amila mentioned. My suspicion is that this is due to some mismatch between the counter value representation and the query value, but I have failed to resolve it. :( Any ideas or guidance are greatly appreciated. Thanks in advance!

On Tue, Jul 31, 2012 at 1:49 PM, Amila Paranawithana amila1...@gmail.com wrote:

Hi all, Thanks all for the valuable feedback. I have a problem with running queries with cqlsh. My query is:

SELECT * FROM rule1 WHERE sms=3;

java.lang.NumberFormatException: An hex string representing bytes must have an even length
        at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:52)
        at org.apache.cassandra.utils.ByteBufferUtil.hexToBytes(ByteBufferUtil.java:501)
        at org.apache.cassandra.db.marshal.CounterColumnType.fromString(CounterColumnType.java:57)
        at org.apache.cassandra.cql.Term.getByteBuffer(Term.java:96)
        at org.apache.cassandra.cql.QueryProcessor.multiRangeSlice(QueryProcessor.java:185)
        at org.apache.cassandra.cql.QueryProcessor.processStatement(QueryProcessor.java:484)
        at org.apache.cassandra.cql.QueryProcessor.process(QueryProcessor.java:877)
        at org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1235)
        at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3542)
        at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3530)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

but when I say SELECT * FROM rule1 WHERE sms=03; no exception is shown, yet though I have entries where the sms count = 3, that entry is not retrieved. And for queries like SELECT * FROM rule1 WHERE sms>03; I get: Bad Request: No indexed columns present in by-columns clause with equals operator. Can anyone recognize the problem here?

Following are the methods I used:

// for indexing columns
void indexColumn(String idxColumnName, String CountercfName) {
    Cluster cluster = HFactory.getOrCreateCluster(BasicConf.CASSANDRA_CLUSTER,
            BasicConf.CLUSTER_PORT);
    KeyspaceDefinition keyspaceDefinition = cluster.describeKeyspace(BasicConf.KEYSPACE);
    List<ColumnFamilyDefinition> cdfs = keyspaceDefinition.getCfDefs();
    ColumnFamilyDefinition cfd = null;
    for (ColumnFamilyDefinition c : cdfs) {
        if (c.getName().equals(CountercfName)) {
            System.out.println(c.getName());
            cfd = c;
            break;
        }
    }
    BasicColumnFamilyDefinition columnFamilyDefinition = new BasicColumnFamilyDefinition(cfd);
    BasicColumnDefinition bcdf = new BasicColumnDefinition();
    bcdf.setName(StringSerializer.get().toByteBuffer(idxColumnName));
    bcdf.setIndexName(idxColumnName + "index");
    bcdf.setIndexType(ColumnIndexType.KEYS);
    bcdf.setValidationClass(ComparatorType.COUNTERTYPE.getClassName());
    columnFamilyDefinition.addColumnDefinition(bcdf);
    cluster.updateColumnFamily(new ThriftCfDef(columnFamilyDefinition));
}

// for adding a new counter column
void insertCounterColumn(String cfName, String counterColumnName, String phoneNumberKey) {
    Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
    mutator.insertCounter(phoneNumberKey, cfName,
            HFactory.createCounterColumn(counterColumnName, 1L, StringSerializer.get()));
    mutator.execute();
    CounterQuery<String, String> counter = new ThriftCounterColumnQuery<String, String>(
            keyspace, StringSerializer.get(), StringSerializer.get());
    counter.setColumnFamily(cfName).setKey(phoneNumberKey).setName(counterColumnName);
    indexColumn(counterColumnName, cfName);
}

// incrementing counter values
void incrementCounter(String ruleName, String columnName, HashMap<String, Long> entries) {
    Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
    Set<String> keys = entries.keySet();
    for (String s : keys) {
        mutator.incrementCounter(s, ruleName, columnName, entries.get(s));
    }
    mutator.execute();
}

On Sun, Jul 29, 2012 at 3:29 PM, Paolo Bernardi berna...@gmail.com wrote: On Sun, Jul 29, 2012 at 9:30 AM, Abhijit Chanda abhijit.chan...@gmail.com wrote: There should be at least one =
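The NumberFormatException above points at the likely root cause: CounterColumnType.fromString parses the CQL term as a hex byte string. "3" has odd length and is rejected outright, while "03" parses but decodes to the single byte 0x03, which can never equal the 8-byte big-endian long a counter actually stores, hence the silent empty result. A minimal sketch of the encoding mismatch (in Python for brevity; the real parsing lives in Cassandra's Hex.hexToBytes):

```python
import struct

def hex_to_bytes(s):
    # Mirrors org.apache.cassandra.utils.Hex.hexToBytes: rejects odd-length input.
    if len(s) % 2 != 0:
        raise ValueError("An hex string representing bytes must have an even length")
    return bytes.fromhex(s)

# A counter value of 3 is stored as an 8-byte big-endian long.
counter_bytes = struct.pack(">q", 3)
print(counter_bytes.hex())                   # 0000000000000003

print(hex_to_bytes("03") == counter_bytes)   # False: 1 byte vs. 8 bytes
try:
    hex_to_bytes("3")                        # odd length -> the exception in the trace
except ValueError as e:
    print(e)
```

So `WHERE sms=3` fails to parse at all, and `WHERE sms=03` parses but compares against the wrong byte string; the comparison would only line up against the full 16-hex-digit encoding of the value.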
Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
Mina, Thanks for that post. Very interesting :-) What sort of things are you graphing? Standard *nix stuff (mem/cpu/etc)? Or do you have some hooks into the C* process (I saw something about port 1414 in the .yaml file). Best, -g

On Thu, Jul 26, 2012 at 9:27 AM, Mina Naguib mina.nag...@bloomdigital.com wrote:

Hi Thomas

On a modern 64-bit server, I recommend you pay little attention to the virtual size. It's made up of almost everything within the process's address space, including on-disk files mmap()ed in for zero-copy access. It's not unreasonable for a machine with N amount of RAM to have a process whose virtual size is several times N. That in and of itself is not problematic. In a default cassandra 1.1.x setup, the bulk of that will be your sstables' data and index files. On linux you can invoke the pmap tool on the cassandra process's PID to see what's in there. Much of it will be anonymous memory allocations (the JVM heap itself, off-heap data structures, etc), but lots of it will be references to files on disk (binaries, libraries, mmap()ed files, etc).

What's more important to keep an eye on is the JVM heap - typically statically allocated to a fixed size at cassandra startup. You can get info about its used/capacity values via nodetool -h localhost info. You can also hook up jconsole and trend it over time.

The other critical piece is the process's RESident memory size, which includes the JVM heap but also other off-heap data structures and miscellanea. Cassandra has recently been making more use of off-heap structures (for example, row caching via SerializingCacheProvider). This is done as a matter of efficiency - a serialized off-heap row is much smaller than a classical object sitting in the JVM heap - so you can do more with less. Unfortunately, in my experience, it's not perfect. They still have a cost, in terms of on-heap usage, as well as off-heap growth over time.

Specifically, my experience with cassandra 1.1.0 showed that off-heap row caches incurred a very high on-heap cost (ironic) - see my post at http://mail-archives.apache.org/mod_mbox/cassandra-user/201206.mbox/%3c6feb097f-287b-471d-bea2-48862b30f...@bloomdigital.com%3E - as documented in that email, I managed that with regularly scheduled full GC runs via System.gc().

I have since moved away from scheduled System.gc() to scheduled row cache invalidations. While this had the same effect as the System.gc() I described in my email, it eliminated the 20-30 second pause associated with it. It did, however, introduce (or maybe I never noticed it earlier) a slow creep in memory usage outside of the heap. It's typical in my case, for example, for a process configured with 6G of JVM heap to start up, stabilize at 6.5-7GB resident usage, then creep up slowly throughout a week to the 10-11GB range. Depending on what else the box is doing, I've experienced the linux OOM killer killing cassandra as you've described, or heavy swap usage bringing everything down (we're latency-sensitive), etc.

And now for the good news. Since I've upgraded to 1.1.2:
1. There's no more need for regularly scheduled System.gc()
2. There's no more need for regularly scheduled row cache invalidation
3. The HEAP usage within the JVM is stable over time
4. The RESident size of the process appears also stable over time

Point #4 above is still pending, as I only have 3-day graphs since the upgrade, but they show promising results compared to the slope of the same graph before the upgrade to 1.1.2. So my advice is give 1.1.2 a shot - just be mindful of https://issues.apache.org/jira/browse/CASSANDRA-4411

On 2012-07-26, at 2:18 AM, Thomas Spengler wrote: I saw this.
All works fine up to version 1.1.0: the 0.8.x takes 5GB of memory on an 8GB machine, the 1.0.x takes between 6 and 7GB on an 8GB machine, and the 1.1.0 takes all of it - and that is a problem for me. It is no solution to wait for the OOM killer from the linux kernel and restart the cassandra process; when my machine has less than 100MB of RAM available, I have a problem.

On 07/25/2012 07:06 PM, Tyler Hobbs wrote: Are you actually seeing any problems from this? High virtual memory usage on its own really doesn't mean anything. See http://wiki.apache.org/cassandra/FAQ#mmap

On Wed, Jul 25, 2012 at 1:21 AM, Thomas Spengler thomas.speng...@toptarif.de wrote: No one has any idea? We tried updating to 1.1.2 with DiskAccessMode standard, indexAccessMode standard, row_cache_size_in_mb: 0, key_cache_size_in_mb: 0. Our next try will be to change SerializingCacheProvider to ConcurrentLinkedHashCacheProvider. Any other proposals are welcome.

On 07/04/2012 02:13 PM, Thomas Spengler wrote: Hi @all, since our upgrade from cassandra 1.0.3 to 1.1.0 the virtual memory usage of the cassandra nodes explodes. Our setup is: * 5 - centos 5.8 nodes * each 4
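Mina's distinction between virtual size and resident size can be checked directly from /proc on Linux. A rough sketch (Python; point it at /proc/<cassandra-pid>/status instead of the current process for a real check):

```python
def memory_kb(field, status_path="/proc/self/status"):
    # field is e.g. "VmRSS" (resident set) or "VmSize" (virtual address space);
    # /proc reports both in kB.
    with open(status_path) as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    return None

rss = memory_kb("VmRSS")
vsz = memory_kb("VmSize")
print(f"resident={rss} kB, virtual={vsz} kB")
```

On a 64-bit JVM with mmap()ed sstables, the virtual figure can be many times the resident one without anything being wrong; resident size (plus swap activity) is the number worth alerting on.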
Re: virtual memory of all cassandra-nodes is growing extremly since Cassandra 1.1.0
All our servers (cassandra and otherwise) get monitored with nagios + get many basic metrics graphed by pnp4nagios. This covers a large chunk of a box's health, as well as cassandra basics (specifically the pending tasks and JVM heap state). IMO it's not possible to clearly debug a cassandra issue if you don't have a good holistic view of the boxes' health (CPU, RAM, swap, disk throughput, etc.)

Separate from that we have an operational dashboard. It's a bunch of manually-defined RRD files and custom scripts that grab metrics, store them, and graph the health of various layers in the infrastructure in an easy-to-digest way (for example, each data center gets a color scheme - stacked machines within multiple DCs can just be eyeballed). There we can see, for example, our total read volume, total write volume, struggling boxes, dynamic endpoint snitch reaction, etc.

Finally, almost all the software we write integrates with statsd + graphite. In graphite we have more metrics than we know what to do with, but it's better than the other way around. From there, for example, we can see cassandra's response time including things cassandra itself can't measure (network, thrift, etc), across the various client softwares that talk to it. Within graphite we have several dashboards defined (users make their own; some infrastructure components have shared dashboards.)

-- Mina Naguib :: Director, Infrastructure Engineering Bloom Digital Platforms :: T 514.394.7951 #208 http://bloom-hq.com/

On 2012-08-01, at 3:43 PM, Greg Fausak wrote: Mina, Thanks for that post. Very interesting :-) What sort of things are you graphing? Standard *nix stuff (mem/cpu/etc)? Or do you have some hooks into the C* process (I saw something about port 1414 in the .yaml file). Best, -g
Re: Does Cassandra support operations in a transaction?
Hi Greg, Thank you for your answers. I will have to shift my thinking from relational SQL to NoSQL while using Cassandra. Best Regards, Ivan

On Wed, Aug 1, 2012 at 9:20 PM, Greg Fausak g...@named.com wrote: Hi Ivan, No, Cassandra does not support transactions. I believe each operation is atomic: if that operation returns a successful result, then it worked. You can't do things like bind two operations together and guarantee that if either fails, they both fail. You will find that Cassandra doesn't do a lot of things compared to a sql db :-) But it does write a lot of data quickly. -g

On Wed, Aug 1, 2012 at 5:21 AM, Ivan Jiang wiwi1...@gmail.com wrote: Hi, I am new to Cassandra, and I wonder if it is possible to call Cassandra within one transaction such as in a relational DB. Thanks in advance. Best Regards, Ivan Jiang
Re: Does Cassandra support operations in a transaction?
Hi Ivan, Cassandra supports 'tunable consistency'. If you always read and write at quorum (or local quorum for multi-data-center setups), you can guarantee that the results will be consistent: all the replicas' data will be compared, the latest will be returned, and no data will be out of date. This comes at a loss of performance - it is fastest to just read and write at consistency level one rather than check a quorum of nodes. What you choose depends on what your application needs. Is it OK if some users receive out-of-date data (it isn't earth-shattering if someone doesn't know what you're eating right now), or is it a banking transaction system where all entities must be consistently updated?

Design in cassandra prioritizes de-normalization. You cannot have referential integrity guaranteeing that 2 tables (column families in cassandra) are in sync the way a database designed with foreign keys does. The application needs to ensure that all data in the column families is accurate and not out of sync, because data elements may be duplicated in different column families. You cannot have 2 different entities and ensure that changes to both will be done and only then be visible to others.

Regards,

From: Jeffrey Kesselman jef...@gmail.com
Reply-To: user@cassandra.apache.org
To: user@cassandra.apache.org
Subject: Re: Does Cassandra support operations in a transaction?

Short story is that few if any of the NoSQL systems support transactions natively. That's one of the big compromises they make. What they call eventual consistency is actually eventual Durability in ACID terms. Consistency, as meant by the C in ACID, is not guaranteed at all.
On Wed, Aug 1, 2012 at 6:21 AM, Ivan Jiang wiwi1...@gmail.com wrote: Hi, I am new to Cassandra, and I wonder if it is possible to call Cassandra within one transaction such as in a relational DB. Thanks in advance. Best Regards, Ivan Jiang

-- It's always darkest just before you are eaten by a grue.

This email and any files transmitted with it are confidential and intended solely for the individual or entity to whom they are addressed. If you have received this email in error destroy it immediately. *** Walmart Confidential ***
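The quorum guarantee Roshni describes is just set arithmetic: with replication factor N, a read of R replicas and a write of W replicas are guaranteed to overlap whenever R + W > N, so every read set contains at least one replica holding the latest write. A toy check (Python; the helper names are made up for illustration, this is not a Cassandra API):

```python
from itertools import combinations

def quorum(n):
    # Quorum of n replicas: a strict majority.
    return n // 2 + 1

def always_overlaps(n, r, w):
    # True iff every possible r-replica read set intersects
    # every possible w-replica write set.
    replicas = range(n)
    return all(set(rs) & set(ws)
               for rs in combinations(replicas, r)
               for ws in combinations(replicas, w))

n = 3
print(always_overlaps(n, quorum(n), quorum(n)))  # True: QUORUM/QUORUM is consistent
print(always_overlaps(n, 1, 1))                  # False: ONE/ONE can miss a write
```

Any R and W with R + W > N gives the same guarantee; quorum reads plus quorum writes are just the balanced choice.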
Re: Looking for a good Ruby client
Harry, we're in a similar situation and are starting to work out our own ruby client. The biggest issue is that it doesn't make much sense to build a higher-level abstraction on anything other than CQL3, given where things are headed - at least that is our opinion. At the same time, CQL3 is only just becoming usable and still seems rather deficient for wide-row usage. The tricky part is that with current CQL3 you have to construct quite complex iterators to retrieve a large result set. That means you end up having to either parse incoming CQL3 to insert the iteration logic, or pass CQL3 fragments in and compose them with iterator clauses. Not fun either way. The only good solution I see is to switch to a streaming protocol (or build some form of continuation on top of thrift) so that the client can ask for a huge result set and the cassandra coordinator can break it into sub-queries as it sees fit and return results chunk by chunk. If that is really the path forward, then any abstraction built above CQL3 before then will either contain a sizable piece of complex code that can eventually be deleted or, worse, an interface that is no longer best practice. Good luck! Thorsten

On 8/1/2012 1:47 PM, Harry Wilkinson wrote: Hi, I'm looking for a Ruby client for Cassandra that is pretty high-level. I am really hoping to find a Ruby gem of high quality that allows a developer to create models like you would with ActiveModel. So far I have figured out that the canonical Ruby client for Cassandra is Twitter's Cassandra gem https://github.com/twitter/cassandra/ of the same name. It looks great - mature, still in active development, etc. No stated support for Ruby 1.9.3 that I can see, but I can probably live with that for now.
What I'm looking for is a higher-level gem built on that one that works like ActiveModel, in that you just include a module in your model class and it gives you methods to declare your model's serialized attributes, plus the usual ActiveModel methods like 'save!', 'valid?', 'find', etc. I've been trying out some different NoSQL databases recently; for example, there is an official Ruby client https://github.com/basho/riak-ruby-client for Riak with a domain model that is close to Riak's, but there's also a gem called 'Ripple' https://github.com/seancribbs/ripple that uses a domain model closer to what most Ruby developers are used to. So it looks like Twitter's Cassandra gem is the one that stays close to the domain model of Cassandra, and what I'm looking for is a gem that's a Cassandra equivalent of Ripple. From some searching I found cassandra_object https://github.com/NZKoz/cassandra_object, which has been inactive for a couple of years, but there's a fork https://github.com/data-axle/cassandra_object that looks like it's being maintained, though I have not found anything to suggest the maintained fork is in general use yet. I have found quite a lot of gems in a similar style that people have started and then not really got very far with. So, does anybody know of a suitable gem? Would you recommend it? Or perhaps you would recommend not using such a gem and sticking with the lower-level client gem? Thanks in advance for your advice. Harry
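The "complex iterators" Thorsten mentions boil down to continuation-based paging: issue a bounded sub-query, remember the last key seen, and ask for the next chunk after it. A language-neutral sketch (Python here; `fetch_page` is a hypothetical callback standing in for a real CQL "WHERE key > :last LIMIT :n" sub-query):

```python
def paged_query(fetch_page, page_size=100):
    """Iterate a large ordered result set chunk by chunk via a continuation key."""
    last = None
    while True:
        page = fetch_page(last, page_size)  # one bounded sub-query
        if not page:
            return
        yield from page
        last = page[-1][0]                  # continue after the last key seen

# Demo against an in-memory "table" of 250 ordered rows.
table = {i: i * i for i in range(250)}

def fetch_page(last, limit):
    keys = sorted(k for k in table if last is None or k > last)
    return [(k, table[k]) for k in keys[:limit]]

rows = list(paged_query(fetch_page, page_size=100))
print(len(rows))  # 250, fetched in three sub-queries
```

This is exactly the logic a pre-streaming CQL3 client has to weave into user-supplied queries, which is why composing it with arbitrary CQL fragments gets messy.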
Re: Does Cassandra support operations in a transaction?
Roshni,

That's not what consistency in ACID means. It's not consistency of reading the same data; it's referential integrity between related pieces of data.

Consistency: Data is in a consistent state when a transaction starts and when it ends. For example, in an application that transfers funds from one account to another, the consistency property ensures that the total value of funds in both accounts is the same at the start and end of each transaction. http://publib.boulder.ibm.com/infocenter/cicsts/v3r2/index.jsp?topic=%2Fcom.ibm.cics.ts.productoverview.doc%2Fconcepts%2Facid.html

A lot of people in the NoSQL world use the term consistency when what they mean is durability.

Durability: After a transaction successfully completes, changes to data persist and are not undone, even in the event of a system failure.

Many NoSQL databases (including Cassandra) are eventually durable, in the sense that a read immediately after a write may not reflect that write, but at some later point it will. None provide true consistency that I am aware of.

On Thu, Aug 2, 2012 at 12:24 AM, Roshni Rajagopal roshni.rajago...@wal-mart.com wrote: Hi Ivan, Cassandra supports 'tunable consistency'...

-- It's always darkest just before you are eaten by a grue.
Re: Does Cassandra support operations in a transaction?
True consistency, by the way, is pretty much only possible in a transactional environment.

On Thu, Aug 2, 2012 at 12:56 AM, Jeffrey Kesselman jef...@gmail.com wrote: Roshni, That's not what consistency in ACID means. It's not consistency of reading the same data; it's referential integrity between related pieces of data...

-- It's always darkest just before you are eaten by a grue.