Duplicate result of get_indexed_slices, depending on indexClause.count

2011-04-15 Thread sam_
Hi All,

I have been using Cassandra 0.7.2 and 0.7.4 with Thrift API (using Java).

I noticed that if I am querying a Column Family with indexed columns
sometimes I get a duplicate result in get_indexed_slices depending on the
number of rows in the CF and the count that I set in IndexClause.count.
It also depends on the order of rows in CF.

For example consider the following CF that I call Attributes:

create column family Attributes with comparator=UTF8Type
    and column_metadata=[
        {column_name: range_id, validation_class: LongType, index_type: KEYS},
        {column_name: attr_key, validation_class: UTF8Type, index_type: KEYS},
        {column_name: attr_val, validation_class: BytesType, index_type: KEYS}
    ];

And suppose I have the following rows in the CF:

key       range_id   attr_key   attr_val
1/@1/0    1          A          1
1/5/0     1          B          1000
3/@1/0    2          A          1
3/5/0     2          B          1001
5/@1/0    3          A          2
5/5/0     3          B          1002
7/@1/0    4          A          2
7/5/0     4          B          1003

Now if I have a query with IndexClause like this (in pseudo code):

attr_key == A AND attr_val == 1

with indexClause.count = 4;

Then I will get rows with the following keys from get_indexed_slices:

1/@1/0, 3/@1/0, 3/@1/0

The last key is a duplicate!
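
For reference, this is roughly how I issue the query (a minimal pycassa
sketch, assuming a pycassa version with index support, a local node on port
9160, and string-encoded values; names are illustrative):

    import pycassa
    from pycassa.index import create_index_expression, create_index_clause

    pool = pycassa.connect('MyKeyspace', ['localhost:9160'])
    attributes = pycassa.ColumnFamily(pool, 'Attributes')

    # both expressions use the EQ operator, matching the pseudo code above
    clause = create_index_clause(
        [create_index_expression('attr_key', 'A'),
         create_index_expression('attr_val', '1')],
        count=4)

    seen = set()
    for key, columns in attributes.get_indexed_slices(clause):
        if key in seen:
            print 'duplicate key:', key
        seen.add(key)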

This is very sensitive to the order of rows in the CF, the number of rows,
and the value you set in indexClause.count. I noticed that when the number
of rows in the CF is twice indexClause.count, this issue can happen,
depending on the order of rows in the CF.

This seems to be a bug, and it occurs in both 0.7.2 and 0.7.4.

Is there a solution to this problem? 

Many Thanks,
Sam







Re: Indexes on heterogeneous rows

2011-04-15 Thread Wangpei (Peter)
Does get_indexed_slices in 0.7.4 already do things that way?
It seems to always take the first indexed column with EQ.
Or is this a new feature of the coming 0.7.5 or 0.8?

-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: April 15, 2011 0:21
To: user@cassandra.apache.org
Cc: David Boxenhorn; aaron morton
Subject: Re: Indexes on heterogeneous rows

This should work reasonably well w/ 0.7 indexes. Cassandra tracks
statistics on index selectivity, so it would plan that query as index
lookup on e=5, then iterate over those results and return only rows
that also have type=2.
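
(Sketching that with pycassa, hypothetically: both predicates go into a
single index clause; Cassandra drives the lookup from the more selective
indexed EQ expression and filters the rest. Names here are from the example
below, not a real schema:)

    from pycassa.index import create_index_expression, create_index_clause

    # index lookup on e=5 drives the scan; type=2 is applied as a filter
    clause = create_index_clause(
        [create_index_expression('e', '5'),
         create_index_expression('type', '2')],
        count=100)
    rows = cf.get_indexed_slices(clause)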

On Thu, Apr 14, 2011 at 5:33 AM, David Boxenhorn da...@taotown.com wrote:
 Thank you for your answer, and sorry about the sloppy terminology.

 I'm thinking of the scenario where there are a small number of results in
 the result set, but there are billions of rows in the first of your
 secondary indexes.

 That is, I want to do something like (not sure of the CQL syntax):

 select * where type=2 and e=5

 where there are billions of rows of type 2, but some manageable number of
 those rows have e=5.

 As I understand it, secondary indexes are like column families, where each
 value is a column. So the billions of rows where type=2 would go into a
 single row of the secondary index. This sounds like a problem to me, is it?

 I'm assuming that the billions of rows that don't have column e at all
 (those rows of other types) are not a problem at all...

 On Thu, Apr 14, 2011 at 12:12 PM, aaron morton aa...@thelastpickle.com
 wrote:

 Need to clear up some terminology here.
 Rows have a key and can be retrieved by key. This is *sort of* the primary
 index, but not primary in the normal RDBMS sense.
 Rows can have different columns and the column names are sorted and can be
 efficiently selected.
 There are secondary indexes in cassandra 0.7 based on column
 values http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes
 So you could create secondary indexes on the a,e, and h columns and get
 rows that have specific values. There are some limitations to secondary
 indexes, read the linked article.
 Or you can make your own secondary indexes using row keys as the index
 values.
 If you have billions of rows, how many do you need to read back at once?
 Hope that helps
 Aaron

 On 14 Apr 2011, at 04:23, David Boxenhorn wrote:

 Is it possible in 0.7.x to have indexes on heterogeneous rows, which have
 different sets of columns?

 For example, let's say you have three types of objects (1, 2, 3) which
 each had three members. If your rows had the following pattern

 type=1 a=? b=? c=?
 type=2 d=? e=? f=?
 type=3 g=? h=? i=?

 could you index type as your primary index, and also index a, e, h
 as secondary indexes, to get the objects of that type that you are looking
 for?

 Would it work if you had billions of rows of each type?






-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Duplicate result of get_indexed_slices, depending on indexClause.count

2011-04-15 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-2406

On Fri, Apr 15, 2011 at 1:43 AM, sam_ amin_shar...@yahoo.com wrote:
 Hi All,

 I have been using Cassandra 0.7.2 and 0.7.4 with Thrift API (using Java).

 I noticed that if I am querying a Column Family with indexed columns
 sometimes I get a duplicate result in get_indexed_slices depending on the
 number of rows in the CF and the count that I set in IndexClause.count.
 It also depends on the order of rows in CF.

 For example consider the following CF that I call Attributes:

 create column family Attributes with comparator=UTF8Type
        and column_metadata=[
                {column_name: range_id, validation_class: LongType, index_type: KEYS},
                {column_name: attr_key, validation_class: UTF8Type, index_type: KEYS},
                {column_name: attr_val, validation_class: BytesType, index_type: KEYS}
        ];

 And suppose I have the following rows in the CF:

 key       range_id   attr_key   attr_val
 1/@1/0    1          A          1
 1/5/0     1          B          1000
 3/@1/0    2          A          1
 3/5/0     2          B          1001
 5/@1/0    3          A          2
 5/5/0     3          B          1002
 7/@1/0    4          A          2
 7/5/0     4          B          1003

 Now if I have a query with IndexClause like this (in pseudo code):

 attr_key == A AND attr_val == 1

 with indexClause.count = 4;

 Then I will get rows with the following keys from get_indexed_slices:

 1/@1/0, 3/@1/0, 3/@1/0

 The last key is a duplicate!

 This is very sensitive to the order of rows in the CF, the number of rows,
 and the value you set in indexClause.count. I noticed that when the number
 of rows in the CF is twice indexClause.count, this issue can happen,
 depending on the order of rows in the CF.

 This seems to be a bug, and it occurs in both 0.7.2 and 0.7.4.

 Is there a solution to this problem?

 Many Thanks,
 Sam









-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Mick Semb Wever
Just experienced something I don't understand yet.

Running a 3 node cluster successfully for a few days now, then one of
the nodes went down (server required reboot).
After this the other two nodes kept throwing UnavailableExceptions like

UnavailableException()
at 
org.apache.cassandra.service.WriteResponseHandler.assureSufficientLiveNodes(WriteResponseHandler.java:127)
at 
org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:118)
at 
no.finntech.countstats.listener.CassandraMessageListener$1.run(CassandraMessageListener.java:356)

(this code being loosely based off the second example in
http://wiki.apache.org/cassandra/ScribeToCassandra ).

This seems a bit weird to me when the StorageProxy.mutate(..) is being
called with ConsistencyLevel.ONE.

I'm running 0.7.4, so I doubt it's CASSANDRA-2069.

~mck

-- 
"Everything you can imagine is real." Pablo Picasso
| http://semb.wever.org | http://sesat.no
| http://tech.finn.no   | Java XSS Filter





How to warm up a cold node

2011-04-15 Thread Héctor Izquierdo Seliva
Hi everyone, is there any recommended procedure to warm up a node before
bringing it up? 

Thanks!



Re: How to warm up a cold node

2011-04-15 Thread Peter Schuller
 Hi everyone, is there any recommended procedure to warm up a node before
 bringing it up?

Currently the only out-of-the-box support for warming up caches is
that implied by the key cache and row cache, which will pre-heat on
start-up. Indexes will be indirectly preheated by index sampling, to
the extent that the operating system retains them in page cache.

If you want to pre-heat sstables there's currently no way to do
that (but it would be a useful feature to have). Pragmatically, you can
script something that e.g. does "cat path/to/keyspace/* > /dev/null"
or similar. But that only works if the total database size fits
reasonably well in page cache.
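
(A tiny warm-up sketch along those lines, reading one CF's Data files in
chunks; the path and naming pattern here are illustrative, not prescriptive:)

    import glob

    # sequentially pull one column family's sstable data into page cache
    for path in glob.glob('/var/lib/cassandra/data/MyKeyspace/MyCF-*-Data.db'):
        with open(path, 'rb') as f:
            while f.read(1 << 20):   # 1 MB at a time
                pass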

Pre-heating sstables on a per-cf basis on start-up would be a nice
feature to have.

-- 
/ Peter Schuller


Re: How to warm up a cold node

2011-04-15 Thread Héctor Izquierdo Seliva
How difficult do you think this could be? I would be interested in
developing this if it's feasible.

On Fri, 2011-04-15 at 16:19 +0200, Peter Schuller wrote:
  Hi everyone, is there any recommended procedure to warm up a node before
  bringing it up?
 
 Currently the only out-of-the-box support for warming up caches is
 that implied by the key cache and row cache, which will pre-heat on
 start-up. Indexes will be indirectly preheated by index sampling, to
 the extent that the operating system retains them in page cache.
 
 If you want to pre-heat sstables there's currently no way to do
 that (but it would be a useful feature to have). Pragmatically, you can
 script something that e.g. does "cat path/to/keyspace/* > /dev/null"
 or similar. But that only works if the total database size fits
 reasonably well in page cache.
 
 Pre-heating sstables on a per-cf basis on start-up would be a nice
 feature to have.
 




question about performance of Cassandra 0.7.4 under a read-heavy workload.

2011-04-15 Thread 魏金仙
I just deployed cassandra 0.7.4 as a 6-server cluster and tested its
performance via YCSB.
The result seems confusing when compared to that of Cassandra 0.6.6. Under a
write-heavy workload (i.e., write/read: 50%/50%), Cassandra 0.7.4 obtains a
really satisfactory latency. I mean both the read latency and the write
latency are much lower than those of Cassandra 0.6.6.
However, under a read-heavy workload (i.e., write/read: 5%/95%),
Cassandra 0.7.4 performs far worse than Cassandra 0.6.6 does.

Did I miss something?


Consistency model

2011-04-15 Thread James Cipar
I've been experimenting with the consistency model of Cassandra, and I found 
something that seems a bit unexpected.  In my experiment, I have 2 processes, a 
reader and a writer, each accessing a Cassandra cluster with a replication 
factor greater than 1.  In addition, sometimes I generate background traffic to 
simulate a busy cluster by uploading a large data file to another table.

The writer executes a loop where it writes a single row that contains just a
sequentially increasing sequence number and a timestamp.  In python this looks
something like:

while time.time() < start_time + duration:
    target_server = random.sample(servers, 1)[0]
    target_server = '%s:9160' % target_server

    row = {'seqnum': str(seqnum), 'timestamp': str(time.time())}
    seqnum += 1
    # print 'uploading to server %s, %s' % (target_server, row)

    pool = pycassa.connect('Keyspace1', [target_server])
    cf = pycassa.ColumnFamily(pool, 'Standard1')
    cf.insert('foo', row, write_consistency_level=consistency_level)
    pool.dispose()

    if sleeptime > 0.0:
        time.sleep(sleeptime)


The reader simply executes a loop reading this row and reporting whenever a 
sequence number is *less* than the previous sequence number.  As expected, with 
consistency_level=ConsistencyLevel.ONE there are many inconsistencies, 
especially with a high replication factor.
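
(For completeness, the reader loop is essentially this sketch, assuming a
pool/ColumnFamily set up like the writer's:)

    prev_seqnum = -1
    while time.time() < start_time + duration:
        row = cf.get('foo', read_consistency_level=consistency_level)
        seq = int(row['seqnum'])
        if seq < prev_seqnum:
            print 'went backwards: %d after %d' % (seq, prev_seqnum)
        prev_seqnum = max(prev_seqnum, seq)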

What is unexpected is that I still detect inconsistencies when it is set at 
ConsistencyLevel.QUORUM.  This is unexpected because the documentation seems to 
imply that QUORUM will give consistent results.  With background traffic the 
average difference in timestamps was 0.6s, and the maximum was 3.5s.  This 
means that a client sees a version of the row, and can subsequently see another 
version of the row that is 3.5s older than the previous.

What I imagine is happening is this, but I'd like someone who knows what 
they're talking about to tell me if it's actually the case:

I think Cassandra is not using an atomic commit protocol to commit to the 
quorum of servers chosen when the write is made.  This means that at some point 
in the middle of the write, some subset of the quorum have seen the write, 
while others have not.  At this time, there is a quorum of servers that have 
not seen the update, so depending on which quorum the client reads from, it may 
or may not see the update.

Of course, I understand that the client is not *choosing* a bad quorum to read 
from, it is just the first `q` servers to respond, but in this case it is 
effectively random and sometimes a bad quorum is chosen.

Does anyone have any other insight into what is going on here?

Key cache hit rate

2011-04-15 Thread mcasandra

How do I interpret "Key cache hit rate"? What does this number mean?


Keyspace: StressKeyspace
        Read Count: 87579
        Read Latency: 11.792417360326105 ms.
        Write Count: 179749
        Write Latency: 0.009272318622078566 ms.
        Pending Tasks: 0
                Column Family: StressStandard
                SSTable count: 59
                Space used (live): 52432078035
                Space used (total): 52432078035
                Memtable Columns Count: 229
                Memtable Data Size: 114103248
                Memtable Switch Count: 375
                Read Count: 87579
                Read Latency: NaN ms.
                Write Count: 179751
                Write Latency: 0.007 ms.
                Pending Tasks: 0
                Key cache capacity: 100
                Key cache size: 78576
                Key cache hit rate: 3.8880248833592535E-4
                Row cache: disabled
                Compacted row minimum size: 182786
                Compacted row maximum size: 5839588
                Compacted row mean size: 532956




Re: What's the best modeling approach for ordering events by date?

2011-04-15 Thread Ethan Rowe
Hi.

So, the OPP will direct all activity for a range of keys to a particular
node (or set of nodes, in accordance with your replication factor).
 Depending on the volume of writes, this could be fine.  Depending on the
distribution of key values you write at any given time, it can also be fine.
 But if you're using the OPP, and your keys align with the time of receiving
the data, and your application writes that data as it receives it, you're
going to be placing write activity on effectively one node at a time, for
the range of time allocated to that node.

If you use RP, and can divide time into finer slices such that you have
multiple tweets in a row, you trade off a more complex read in exchange for
better distribution of load throughout your cluster.  The necessity of this
depends on your particulars.

In your TweetsBySecond example, you're using a deterministic set of keys
(the keys correspond to seconds since epoch).  Querying for ranges of time
is nice with OPP, but if the ranges of time you're interested in are
constrained, you don't specifically need OPP.  You could use RP and request
all the keys for the seconds contained within the time range of interest.
 In this way, you balance writes across the cluster more effectively than
you would with OPP, while still getting a workable data set.  Again, the
degree to which you need this is dependent on your situation.  Others on the
list will no doubt have more informed opinions on this than me.  :)
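
(As an illustration of the RP approach with deterministic keys, a pycassa
sketch; the keyspace/CF names and key encoding are hypothetical:)

    import pycassa

    pool = pycassa.connect('Tweets', ['host:9160'])
    tweets_by_second = pycassa.ColumnFamily(pool, 'TweetsBySecond')

    # enumerate the per-second row keys in the window and multiget them;
    # ordering is recovered from the key list, not from the partitioner
    start, end = 12121121212, 12121121272          # a 60-second window
    keys = [str(s) for s in xrange(start, end)]
    rows = tweets_by_second.multiget(keys)
    ordered_ids = [tid for k in keys for tid in rows.get(k, {})]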

On Thu, Apr 14, 2011 at 8:00 PM, Guillermo Winkler gwink...@inconcertcc.com
 wrote:

 Hi Ethan,

 I want to present the events ordered by time, always in pages of 20/40
 events. If the events are tweets, you can have 1000 tweets from the same
 second or you can have 30 tweets in a 10 minute range. But I always wanna be
 able to page through the results in an orderly fashion.

 I think that using seconds since epoch is what I'm doing, that is, dividing
 time into a fixed series of intervals. Each second is an interval, and all of
 the events for that particular second are columns of that row.

 Again with tweets for easier visualization

 TweetsBySecond : {
   12121121212 : {    <- seconds since epoch
     id1, id2, id3    <- all the tweet ids that occurred in that particular second
   },
   12121212123 : {
     id4, id5
   },
   12121212124 : {
     id6
   }
 }

 The problem is you can't do that using OPP in cassandra 0.7, or is it just
 me missing something?

 Thanks for your answer,
 Guille

 On Thu, Apr 14, 2011 at 4:49 PM, Ethan Rowe et...@the-rowes.com wrote:

 How do you plan to read the data?  Entire histories, or in relatively
 confined slices of time?  Do the events have any attributes by which you
 might segregate them, apart from time?

 If you can divide time into a fixed series of intervals, you can insert
 members of a given interval as columns (or supercolumns) in a row.  But it
 depends how you want to use the data on the read side.


 On Thu, Apr 14, 2011 at 12:25 PM, Guillermo Winkler 
 gwink...@inconcertcc.com wrote:

 I have a huge number of events I need to consume later, ordered by the
 date the event occured.

 My first approach to this problem was to use seconds since epoch as row
 key, and event ids as column names (empty value), this way:

 EventsByDate : {
 SecondsSinceEpoch: {
 evid:, evid:, evid:
 }
 }

 And use OPP as partitioner. Using GetRangeSlices to retrieve ordered
 events secuentially.

 Now I have two problems to solve:

 1) The system is realtime, so all the events in a given moment are
 hitting the same box
 2) Migrating from cassandra 0.6 to cassandra 0.7, OPP doesn't seem to like
 LongType for row keys; was this deliberately deprecated?

 I was thinking about secondary indexes, but they do not assure the order
 in which rows come out of cassandra.

 Anyone has a better approach to model events by date given that
 restrictions?

 Thanks,
 Guille







Two versions of schema

2011-04-15 Thread mcasandra
Is there a problem?


[default@StressKeyspace] update column family StressStandard with
keys_cached=100;
854ee0a0-6792-11e0-81f9-93d987913479
Waiting for schema agreement...
The schema has not settled in 10 seconds; further migrations are ill-advised
until it does.
Versions are 854ee0a0-6792-11e0-81f9-93d987913479:[10.18.62.202,
10.18.62.203, 10.18.62.200, 10.18.62.204, 10.18.62.199, 10.18.62.196,
10.18.62.197],22d165ff-6783-11e0-81f9-93d987913479:[10.18.62.198]


I remember reading somewhere that when you have 2 versions of schemas
you are basically in trouble. Can someone explain what that means and its
implications?



Problems with subcolumn retrieval after upgrade from 0.6 to 0.7

2011-04-15 Thread Abraham Sanderson
I'm having some issues with a few of my ColumnFamilies after a cassandra
upgrade/import from 0.6.1 to 0.7.4.  I followed the instructions to upgrade
and everything seemed to work OK...until I got into the application and
noticed some weird behavior.  I was getting the following stacktrace in
cassandra occasionally when I did get operations for a single subcolumn for
some of the Super type CFs:

ERROR 12:56:05,669 Internal error processing get
java.lang.AssertionError
        at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300)
        at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655)
        at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)

The assertion that is failing is the check that only one column is retrieved
by the get.  I did some debugging with the cli and a remote debugger and
found a few interesting patterns.  First, the problem does not seem
consistently reproducible.  If one supercolumn is affected, though, it will
happen more frequently for subcolumns that, when sorted, appear at the
beginning of the range.  For columns near the end of the range, it seems to
be more intermittent, and almost never occurs when I step through the code
line by line.  The only factor I can think of that might cause issues is
that I am using custom data types for all supercolumns and columns.  I
originally thought I might be reading past the end of the ByteBuffer, but I
have quadruple-checked that this is not the case.

Abe Sanderson


recurring EOFException exception in 0.7.4

2011-04-15 Thread Jonathan Colby
I've been struggling with these kinds of exceptions for some time now.  I 
thought it might have been a one-time thing, so on the 2 nodes where I saw this 
problem I pulled in fresh data with a repair on an empty data directory.

Unfortunately, this problem is now coming up on a new node that has, up until 
now, not had this problem.

What could be causing this?  Could it be related to encoding?   Why are these 
rows not readable?   

This exception prevents cassandra from doing repairs, and even minor 
compactions.  It also messes up memtable management (with a normal load of 
25GB,  disk goes to almost 100% full on a 500 GB hd).

This is incredibly frustrating.  This is the only pain-point I have had with 
cassandra so far.   By the way, this node was never upgraded - it was 0.7.4 
from the start, so that eliminates format compatibility problems.

ERROR [CompactionExecutor:1] 2011-04-15 21:31:23,479 PrecompactedRow.java (line 82) Skipping row DecoratedKey(105452551814086725777389040553659117532, 4d657373616765456e726963686d656e743a313032343937) in /var/lib/cassandra/data/DFS/main-f-91-Data.db
java.io.EOFException
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:361)
        at org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:270)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:315)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:272)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176)
        at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
        at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:147)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:449)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)



cluster IP question and Jconsole?

2011-04-15 Thread tinhuty he
I have followed the description here 
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/lauching_5_node_cassandra_clusters 
to create 5 instances of cassandra on one CentOS 5.5 machine. Using 
nodetool shows the 5 nodes are all running fine.


Note the 5 nodes are using IPs 127.0.0.1 to 127.0.0.5. I understand 127.0.0.1 
points to the local server, but what about 127.0.0.2 to 127.0.0.5? They look 
like invalid IPs to me, so how come all 5 nodes are working OK?


Another question. I have installed MX4J in instance 127.0.0.1 on port 8081. 
I am able to connect to http://server:8081/ from the browser. However, how do 
I connect using JConsole installed on another Windows machine? (My CentOS 5.5 
server doesn't have X installed; only SSH is allowed.)


Thanks.



Re: Cassandra 2 DC deployment

2011-04-15 Thread Peter Schuller
 You are right about the automatic fallback to ONE. It's quite possible that if 2 
 nodes die for some reason I will have the same problem. So probably the right 
 thing to do would be to read/write at ONE only when we lose a DC, by changing 
 some manual configuration. Since we shouldn't be losing DCs that often, this 
 should be an acceptable change. So my follow-up questions would be -

Seems reasonable to have a human do it, since it seems that you really
want QUORUM - so presumably there is some kind of negative impact and
you don't want that sporadically happening every time there is a
hiccup. But of course I don't know the context.

 When would be the right time to start reading/writing at QUORUM again?

I'd say usually as soon as possible, but it will depend on the details of
your situation. For example, if you have 2 DCs with 5 nodes in one
and 1 node in another, and there is a partition, the DC with just one
node will start seeing older data (from the point of view of writes
done in the 1-node DC) if you start asking for quorum, since a lot of
the time a quorum will be 4 nodes in the other DC. So if there is
interest in preferring the local DC's copy of the data after an
emergency fallback to CL.ONE, it may be detrimental to go QUORUM too
early.

But this will depend on what your application is actually doing and
what is important to you.

 Should we be marking the 2 nodes in the lost DC as down?
 Should we be doing some administrative work on Cassandra before we start 
 reading/writing at QUORUM again?

Are you talking about permanently losing a DC then, rather than just a
transient partition? For non-permanent situations it seems
counter-productive to mark other DC's nodes as down. Oh and btw, keep
in mind you can choose to use LOCAL_QUORUM to get intra-site
consistency (rather than ONE).

As for administrative work: I can't answer in general since we're
talking about very special circumstances, but at least it's valid to
say that whenever you have some kind of issue that has caused
inconsistency, running 'nodetool repair' (perhaps earlier than the
standard weekly/whatever repair) is the most efficient way to achieve
consistency again.

-- 
/ Peter Schuller


RE: recurring EOFException exception in 0.7.4

2011-04-15 Thread Dan Hendry
Try running nodetool scrub on the CF: it's pretty good at detecting and
fixing most corruption problems.
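
(Going by the log below, the affected keyspace/CF is DFS/main, so something
roughly like the following; the exact argument shape of scrub is from memory,
so check `nodetool scrub` usage on your 0.7.4 install:)

    nodetool -h localhost scrub DFS main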

Dan


-Original Message-
From: Jonathan Colby [mailto:jonathan.co...@gmail.com] 
Sent: April-15-11 15:41
To: user@cassandra.apache.org
Subject: recurring EOFException exception in 0.7.4

I've been struggling with these kinds of exceptions for some time now.  I
thought it might have been a one-time thing, so on the 2 nodes where I saw
this problem I pulled in fresh data with a repair on an empty data
directory.

Unfortunately, this problem is now coming up on a new node that has, up
until now, not had this problem.

What could be causing this?  Could it be related to encoding?   Why are
these rows not readable?   

This exception prevents cassandra from doing repairs, and even minor
compactions.  It also messes up memtable management (with a normal load of
25GB,  disk goes to almost 100% full on a 500 GB hd).

This is incredibly frustrating.  This is the only pain-point I have had with
cassandra so far.   By the way, this node was never upgraded - it was 0.7.4
from the start, so that eliminates format compatibility problems.

ERROR [CompactionExecutor:1] 2011-04-15 21:31:23,479 PrecompactedRow.java (line 82) Skipping row DecoratedKey(105452551814086725777389040553659117532, 4d657373616765456e726963686d656e743a313032343937) in /var/lib/cassandra/data/DFS/main-f-91-Data.db
java.io.EOFException
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:361)
        at org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:270)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:315)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:272)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176)
        at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
        at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:147)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:449)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)




RE: Consistency model

2011-04-15 Thread Dan Hendry
So Cassandra does not use an atomic commit protocol at the cluster level.
Strong consistency on a quorum read is only guaranteed *after* a successful
quorum write. The behaviour you are seeing is possible if you are reading in
the middle of a write or the write failed (which should be reported to your
code via an exception). 
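
(To make the overlap argument concrete: a quorum is floor(RF/2) + 1 nodes, so
with RF = 3 a quorum is 2. After a successful quorum write, R + W = 2 + 2 =
4 > 3 = RF, so any quorum read must overlap the write on at least one replica.
While a write is still in flight, or after it has failed, fewer than 2
replicas may have it, and a quorum read can legally miss it; that matches the
behaviour described above.)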

Dan

-Original Message-
From: James Cipar [mailto:jci...@cmu.edu] 
Sent: April-15-11 14:15
To: user@cassandra.apache.org
Subject: Consistency model

I've been experimenting with the consistency model of Cassandra, and I found
something that seems a bit unexpected.  In my experiment, I have 2
processes, a reader and a writer, each accessing a Cassandra cluster with a
replication factor greater than 1.  In addition, sometimes I generate
background traffic to simulate a busy cluster by uploading a large data file
to another table.

The writer executes a loop where it writes a single row that contains just
a sequentially increasing sequence number and a timestamp.  In python this
looks something like:

while time.time() < start_time + duration:
    target_server = random.sample(servers, 1)[0]
    target_server = '%s:9160' % target_server

    row = {'seqnum': str(seqnum), 'timestamp': str(time.time())}
    seqnum += 1
    # print 'uploading to server %s, %s' % (target_server, row)

    pool = pycassa.connect('Keyspace1', [target_server])
    cf = pycassa.ColumnFamily(pool, 'Standard1')
    cf.insert('foo', row, write_consistency_level=consistency_level)
    pool.dispose()

    if sleeptime > 0.0:
        time.sleep(sleeptime)


The reader simply executes a loop reading this row and reporting whenever a
sequence number is *less* than the previous sequence number.  As expected,
with consistency_level=ConsistencyLevel.ONE there are many inconsistencies,
especially with a high replication factor.

What is unexpected is that I still detect inconsistencies when it is set at
ConsistencyLevel.QUORUM.  This is unexpected because the documentation seems
to imply that QUORUM will give consistent results.  With background traffic
the average difference in timestamps was 0.6s, and the maximum was 3.5s.
This means that a client sees a version of the row, and can subsequently see
another version of the row that is 3.5s older than the previous.

What I imagine is happening is this, but I'd like someone who knows what
they're talking about to tell me if it's actually the case:

I think Cassandra is not using an atomic commit protocol to commit to the
quorum of servers chosen when the write is made.  This means that at some
point in the middle of the write, some subset of the quorum have seen the
write, while others have not.  At this time, there is a quorum of servers
that have not seen the update, so depending on which quorum the client reads
from, it may or may not see the update.

Of course, I understand that the client is not *choosing* a bad quorum to
read from, it is just the first `q` servers to respond, but in this case it
is effectively random and sometimes a bad quorum is chosen.

Does anyone have any other insight into what is going on here?



Schemas diverging while dynamically creating CF.

2011-04-15 Thread Alejandro Perez
Hello,

We're testing cassandra for integration with indextank. In this first try,
we're creating one column family for each user. In practice, on the first
run and for the first few documents (a few 100s), a new CF is created, and a
document is immediately added to it. A few (up to 50) requests of this type
are issued in parallel (for different column families).

The end result, and quite repeatable, is having the cluster split with
different schema versions, and they never agree.

Any thoughts?


Thanks,

Spike.

-- 
Alejandro Perez
IndexTank

follow us @indextank (http://twitter.com/indextank) | read our blog
(http://blog.indextank.com/) | subscribe to our user mailing list
(http://groups.google.com/group/indextank)


Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Jonathan Ellis
Sure sounds like you have RF=1 to me.

On Fri, Apr 15, 2011 at 7:45 AM, Mick Semb Wever m...@apache.org wrote:
 Just experienced something I don't understand yet.

 Running a 3 node cluster successfully for a few days now, then one of
 the nodes went down (server required reboot).
 After this the other two nodes kept throwing UnavailableExceptions like

 UnavailableException()
        at 
 org.apache.cassandra.service.WriteResponseHandler.assureSufficientLiveNodes(WriteResponseHandler.java:127)
        at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:118)
        at 
 no.finntech.countstats.listener.CassandraMessageListener$1.run(CassandraMessageListener.java:356)

 (this code being loosely based off the second example in
 http://wiki.apache.org/cassandra/ScribeToCassandra ).

 This seems a bit weird to me when the StorageProxy.mutate(..) is being
 called with ConsistencyLevel.ONE.

 I'm running 0.7.4, so I doubt it's CASSANDRA-2069.

 ~mck

 --
 Everything you can imagine is real. Pablo Picasso
 | http://semb.wever.org | http://sesat.no
 | http://tech.finn.no       | Java XSS Filter





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Mick Semb Wever
On Fri, 2011-04-15 at 15:43 -0500, Jonathan Ellis wrote:
 Sure sounds like you have RF=1 to me.

Yes that's right.

I see... so the answer here is that I should be using CL.ANY?
(so the write goes through and hinted handoff can get it to the correct
node later on).

~mck

-- 
"The fox condemns the trap, not himself." William Blake
| http://semb.wever.org | http://sesat.no
| http://tech.finn.no   | Java XSS Filter




RE: Schemas diverging while dynamically creating CF.

2011-04-15 Thread Dan Hendry
Uh... don't create a column family per user. Column families are meant to be
fairly static; conceptually equivalent to a table in a relational database.
Why do you need (or even want) a CF per user? Reconsider your data model, a
single column family with an inverted index for a 'user' column is probably
more what you are looking for. Operationally, the fewer CFs the better.

 

Dan

 

From: Alejandro Perez [mailto:sp...@indextank.com] 
Sent: April-15-11 16:39
To: user@cassandra.apache.org
Cc: Support
Subject: Schemas diverging while dynamically creating CF.

 

Hello,

 

We're testing cassandra for integration with indextank. In this first try,
we're creating one column family for each user. In practice, on the first
run and for the first few documents (a few 100s), a new CF is created, and a
document is immediately added to it. A few (up to 50) requests of this type
are issued in parallel (for different column families).

 

The end result, and quite repeatable, is having the cluster split with
different schema versions, and they never agree.

 

Any thoughts?

 

 

Thanks,

 

Spike.


-- 

Alejandro Perez
IndexTank

follow us @indextank (http://twitter.com/indextank) | read our blog
(http://blog.indextank.com/) | subscribe to our user mailing list
(http://groups.google.com/group/indextank)





Re: Schemas diverging while dynamically creating CF.

2011-04-15 Thread Alejandro Perez
Thanks for the quick response! I will reconsider the schema.

However, the problem troubles me somewhat. How are schema changes supposed to
be done? Should I serialize them? Should I halt other cluster operations
while I do the schema change? Is this a known problem with cassandra?

The other question, and I think the more important one for me now: how do I
repair the cluster without losing data once the schemas diverge? Right now
the only way I have is to erase all data and have the cluster start empty.
Should this problem ever happen in production, it's important that there's a
way to recover the data.

On Fri, Apr 15, 2011 at 1:57 PM, Dan Hendry dan.hendry.j...@gmail.comwrote:

 Uh... don’t create a column family per user. Column families are meant to
 be fairly static; conceptually equivalent to a table in a relational
 database. Why do you need (or even want) a CF per user? Reconsider your data
 model, a single column family with an inverted index for a ‘user’ column is
 probably more what you are looking for. Operationally, the fewer CFs the
 better.



 Dan



 *From:* Alejandro Perez [mailto:sp...@indextank.com]
 *Sent:* April-15-11 16:39
 *To:* user@cassandra.apache.org
 *Cc:* Support
 *Subject:* Schemas diverging while dynamically creating CF.



 Hello,



 We're testing cassandra for integration with indextank. In this first try,
 we're creating one column family for each user. In practice, on the first
 run and for the first few documents (a few 100s), a new CF is created, and a
 document is immediately added to it. A few (up to 50) requests of this type
 are issued in parallel (for different column families).



 The end result, and quite repeatable, is having the cluster split with
 different schema versions, and they never agree.



 Any thoughts?





 Thanks,



 Spike.


 --

 Alejandro Perez
 IndexTank

 follow us @indextank (http://twitter.com/indextank) | read our blog
 (http://blog.indextank.com/) | subscribe to our user mailing list
 (http://groups.google.com/group/indextank)





-- 
Alejandro Perez
IndexTank

follow us @indextank (http://twitter.com/indextank) | read our blog
(http://blog.indextank.com/) | subscribe to our user mailing list
(http://groups.google.com/group/indextank)


Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Jonathan Ellis
Yes, if you want to keep writes available w/ RF=1 then you need to use CL.ANY.
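
(With pycassa that looks roughly like the sketch below; a hedged example, and
the ConsistencyLevel import path may differ by client version:)

    import pycassa
    from pycassa.cassandra.ttypes import ConsistencyLevel

    pool = pycassa.connect('Keyspace1', ['host:9160'])
    cf = pycassa.ColumnFamily(pool, 'Standard1')

    # ANY lets the write succeed even when the sole replica is down;
    # a hint is stored and delivered when the node comes back
    cf.insert('some-key', {'col': 'val'},
              write_consistency_level=ConsistencyLevel.ANY)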

On Fri, Apr 15, 2011 at 3:48 PM, Mick Semb Wever m...@apache.org wrote:
 On Fri, 2011-04-15 at 15:43 -0500, Jonathan Ellis wrote:
 Sure sounds like you have RF=1 to me.

 Yes that's right.

 I see... so the answer here is that I should be using CL.ANY?
 (so the write goes through and hinted handoff can get it to the correct
 node later on).

 ~mck

 --
 The fox condemns the trap, not himself. William Blake
 | http://semb.wever.org | http://sesat.no
 | http://tech.finn.no       | Java XSS Filter




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Upcoming Bay area Cassandra events

2011-04-15 Thread Jonathan Ellis
FYI, there's a couple Cassandra events coming up in April and May in
the Bay area:

Wednesday, April 27, 1pm-6pm: Free Cassandra training by DataStax,
hosted by Ooyala! *Space is limited*; you can sign up at
http://www.datastax.com/freetraining.

Wednesday, April 27, 6pm-8pm (yes, the evening of the training day):
DataStax and Ooyala will be hosting a meet n' greet with pizza, beer,
and Cassandra. The event begins with a happy hour from 6PM to 7PM.
Following the happy hour, Ooyala staff will show how they're using
Cassandra to power their analytics. (Some background material at [1].)
 DataStax engineers will also be there to share details about
Brisk[2], the new open source Hadoop distribution that uses Cassandra
for its core services.

RSVP at http://www.meetup.com/Cassandra-User-Group-Meeting/events/17283903/

Monday, May 9, 2011, 6:45pm: The San Francisco Geo Meetup will feature
a presentation by Mike Malone of SimpleGeo. Mike will explain how and
why the company built its own data indexing scheme using Apache
Cassandra; some background is at [3]. This is a great opportunity to
to see the type of problems that arise when working with
multidimensional spatial data.

RSVP at http://www.meetup.com/geomeetup/events/17034143/

[1] http://www.ooyala.com/whitepapers/Cassandrawhitepaper.pdf
[2] http://www.datastax.com/wp-content/uploads/2011/03/WP-Brisk.pdf
[3] 
http://www.slideshare.net/mmalone/working-with-dimensional-data-in-distributed-hash-tables

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Cassandra Database Modeling

2011-04-15 Thread Aaron Morton
Rows can have 2 billion columns, and the max column size is 2 GB. But less than 
10 MB sounds like a sane limit for a single column.

For the serialisation it depends on what your data looks like; the point is that 
JSON is not space efficient. You may get away with just compressing it (gzip, 
lzo...), or you may need to create your own space-efficient binary format. Start 
with compressing and use the C-accelerated simplejson package.

struct.pack is a way to encode bytes, typically to exchange with other programs.
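
(A tiny illustration of the two options, with made-up field names:)

    import json, struct, zlib

    pair = {'distance': 1.25, 'angle': 0.7853, 'flags': 3}

    # option 1: compress the JSON text
    blob_json = zlib.compress(json.dumps(pair))

    # option 2: fixed binary layout, two doubles plus one unsigned int (20 bytes)
    blob_packed = struct.pack('>ddI', pair['distance'], pair['angle'], pair['flags'])

    print len(blob_json), len(blob_packed)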

Good luck.
Aaron

On 15/04/2011, at 3:59 PM, csharpplusproject csharpplusproj...@gmail.com 
wrote:

 Aaron,
 
 Thank you so much.
 
 So, the way things appear, it is definitely possible that I could be making 
 queries that would return all 10M particle pairs (at least, I should plan for 
 it). What would be the best design in such a case?
 I read somewhere that the recommended maximum size of a row (meaning, 
 including all columns) should be around 10[MB], and better not to exceed 
 that. Is that correct?
 
 As per packing data efficiently, what would be the best way? would packing 
 the data using say (in python terms) struct.pack( ... ) be at all helpful?
 
 Thanks,
 Shalom.
 
 -Original Message-
 From: aaron morton aa...@thelastpickle.com
 Reply-to: user@cassandra.apache.org
 To: user@cassandra.apache.org
 Subject: Re: Cassandra Database Modeling
 Date: Thu, 14 Apr 2011 20:54:43 +1200
 
 WRT your query, it depends on how big a slice you want to get and how time 
 critical it is. e.g. Could you be making queries that would return all 10M 
 pairs? Or would the queries generally want to get some small fraction of the 
 data set? Again, depends on how the sim runs. 
 
 If your sim has stop-the-world pauses where you have a full view of the data 
 space, then you could grab all the points at a certain distance and 
 efficiently pack them up. Where efficiently means not using JSON. 
 
 http://wiki.apache.org/cassandra/LargeDataSetConsiderations
 http://wiki.apache.org/cassandra/CassandraLimitations
 
 Aaron 
 
 On 13 Apr 2011, at 15:48, csharpplusproject wrote: 
 Aaron,
 
 Thank you so much for your help. It is greatly appreciated!
 
 Looking at the design of the particle pairs:
 
 - key: expriement_id.time_interval 
 - column name: pair_id 
 - column value: distance, angle, other data packed together as JSON or some 
 other format
 
 You wrote that retrieving millions of columns (I will have about 10,000,000 
 particle pairs) would be slow. You are also right that the retrieval of 
 millions of columns into Python won't be fast.
 
 If my desired query is to get all particle pairs on time interval [ 
 Tn..T(n+1) ] where the distance between the two particles is smaller than X 
 and the angle between the two particles is greater than Y.
 
 In such a query (as the above), given the fact that retrieving millions of 
 columns could be slow, would it be best to say 'concatenate' all values for 
 all particle pairs for a given 'expriement_id.time_interval' into one column?
 
 If data is stored in this way, I will be getting from Cassandra a binary 
 string / JSON Object that I will have to 'unpack' in my application. Is this 
 a recommended approach? are there better approaches?
 
 Is there a limit to the size that can be stored in one 'cell' (by 'cell' I 
 mean the intersection between a key and a data column)? is there a limit to 
 the size of data of one key?  one data column?
 
 Thanks in advance for any help / guidance.
 
 -Original Message-
 From: aaron morton aa...@thelastpickle.com
 Reply-to: user@cassandra.apache.org
 To: user@cassandra.apache.org
 Subject: Re: Cassandra Database Modeling
 Date: Wed, 13 Apr 2011 10:14:21 +1200
 
 Yes for  interactive == real time queries.  Hadoop based techniques are non 
 time critical queries, but they do have greater analytical capabilities.  
 
 particle_pairs: 1) Yes and no and sort of. Under the hood the get_slice api 
 call will be used by your client library to pull back chunks of (ordered) 
 columns. Most client libraries abstract away the chunking for you.  
 
 2) If you are using a packed structure like JSON then no, Cassandra will 
 have no idea what you've put in the columns other than bytes . It really 
 depends on how much data you have per pair, but generally it's easier to 
 pull back more data than try to get exactly what you need. Downside is you 
 have to update all the data.  
 
 3) No, you would need to update all the data for the pair. I was assuming 
 most of the data was written once, and that your simulation had something 
 like a stop-the-world phase between time slices where state was dumped and 
 then read to start the next interval. You could either read it first, or we 
 can come up with something else. 
 
 distance_cf 1) the query would return a list of columns, which have a name 
 and value (as well as a timestamp and ttl). 2) depends on the client 
 library; if using python go for https://github.com/pycassa/pycassa 

Re: question about performance of Cassandra 0.7.4 under a read-heavy workload.

2011-04-15 Thread Aaron Morton
Will need to know more about the number of requests, iostats etc. There is no 
reason for it to run slower.

Aaron
On 16/04/2011, at 2:35 AM, 魏金仙 sei_...@126.com wrote:

 I just deployed cassandra 0.7.4 as a 6-server cluster and tested its 
 performance via YCSB.
 The result seems confusing when compared to that of Cassandra0.6.6. Under a 
 write heavy workload(i.e., write/read: 50%/50%), Cassandra0.7.4 obtains a 
 really satisfactory latency. I mean both the read latency and write latency 
 is much lower than those of Cassandra0.6.6.
 However, under a read heavy workload(i.e., write/read:5%/95%), Cassandra0.7.4 
 performs far worse than Cassandra0.6.6 does.
 
 Did I miss something?
 
 


Re: Key cache hit rate

2011-04-15 Thread Aaron Morton
Move the decimal point 4 places to the left. It's the fraction of your queries 
that get a hit from the key cache.
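
(Worked out against the stats below: 3.8880248833592535E-4 is about 0.00039,
i.e. roughly 0.039% of reads were served from the key cache, or about 34 of
the 87,579 reads.)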

Aaron
On 16/04/2011, at 6:25 AM, mcasandra mohitanch...@gmail.com wrote:

 
 How do I interpret "Key cache hit rate"? What does this number mean?
 
 
 Keyspace: StressKeyspace
Read Count: 87579
Read Latency: 11.792417360326105 ms.
Write Count: 179749
Write Latency: 0.009272318622078566 ms.
Pending Tasks: 0
Column Family: StressStandard
SSTable count: 59
Space used (live): 52432078035
Space used (total): 52432078035
Memtable Columns Count: 229
Memtable Data Size: 114103248
Memtable Switch Count: 375
Read Count: 87579
Read Latency: NaN ms.
Write Count: 179751
Write Latency: 0.007 ms.
Pending Tasks: 0
Key cache capacity: 100
Key cache size: 78576
Key cache hit rate: 3.8880248833592535E-4
Row cache: disabled
Compacted row minimum size: 182786
Compacted row maximum size: 5839588
Compacted row mean size: 532956
 
 


DatabaseDescriptor.defsVersion

2011-04-15 Thread Jeffrey Wang
Hey all,

I've been seeing a very rare issue with schema change conflicts on 0.7.3 (I am 
serializing all schema changes to a single Cassandra node and waiting for them 
to finish before continuing). Occasionally a node in the cluster will never 
report the correct schema, and I think it may have to do with synchronization 
on DatabaseDescriptor.defsVersion.

As far as I can tell, it is a static variable accessed by multiple threads but 
is not protected by synchronized/volatile. I was able to write a test in which 
one thread never reads the modification done by another thread (as is expected 
by an unsynchronized variable). Should this be fixed or is there a higher level 
reason this does not need to be synchronized (in which case I should continue 
looking for the reason why my schemas don't agree)? Thanks.

-Jeffrey



Re:Re: question about performance of Cassandra 0.7.4 under a read-heavy workload.

2011-04-15 Thread 魏金仙
To make a comparison, 10 threads were run against the two workloads 
separately. Below are the results for Cassandra 0.7.4:

write-heavy workload (write/read: 50%/50%):
    median throughput: 5816 operations/second (i.e., 2908 writes and 2908 reads)
    update latency: 1.32 ms, read latency: 1.81 ms
read-heavy workload (write/read: 5%/95%):
    median throughput: 40 operations/second (i.e., 2 writes and 38 reads)
    update latency: 1.85 ms, read latency: 90.43 ms

And for Cassandra 0.6.6 the results are:

write-heavy workload (write/read: 50%/50%):
    median throughput: 3284 operations/second (i.e., 1642 writes and 1642 reads)
    update latency: 2.29 ms, read latency: 3.51 ms
read-heavy workload (write/read: 5%/95%):
    median throughput: 2759 operations/second (i.e., 2621 writes and 138 reads)
    update latency: 2.33 ms, read latency: 3.53 ms

All the tests were run in the same environment, and most Cassandra
configuration was left at the defaults, except that we chose
OrderPreservingPartitioner for all the tests and set concurrent_reads to 8
(the default value in 0.6.6, though the default value in 0.7.4 is 32).





At 2011-04-16 06:53:01,Aaron Morton aa...@thelastpickle.com wrote:

Will need to know more about the number of requests, iostats etc. There is no 
reason for it to run slower.


Aaron
On 16/04/2011, at 2:35 AM, 魏金仙 sei_...@126.com wrote:


I just deployed cassandra 0.7.4 as a 6-server cluster and tested its 
performance via YCSB.
The result seems confusing when compared to that of Cassandra0.6.6. Under a 
write heavy workload(i.e., write/read: 50%/50%), Cassandra0.7.4 obtains a 
really satisfactory latency. I mean both the read latency and write latency is 
much lower than those of Cassandra0.6.6.
However, under a read heavy workload(i.e., write/read:5%/95%), Cassandra0.7.4 
performs far worse than Cassandra0.6.6 does.

Did I miss something?




Re: cluster IP question and Jconsole?

2011-04-15 Thread Maki Watanabe
127.0.0.2 to 127.0.0.5 are valid IP addresses. Those are just alias
addresses for your loopback interface.
Verify:
  % ifconfig -a

127.0.0.0/8 is for loopback, so you can't connect to these addresses from
remote machines.
You may be able to configure SSH port forwarding from your monitoring
host to the cassandra node, though I haven't tried it.
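
(For the MX4J HTTP interface that should be straightforward, something like:

  % ssh -L 8081:127.0.0.1:8081 user@centos-host

and then browsing http://localhost:8081/ from the Windows machine. Note that
JConsole itself speaks JMX over RMI, which opens a second, ephemeral port, so
forwarding a single port is often not enough for it.)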

maki

2011/4/16 tinhuty he tinh...@hotmail.com:
 I have followed the description here
 http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/lauching_5_node_cassandra_clusters
 to create 5 instances of cassandra on one CentOS 5.5 machine. Using
 nodetool shows the 5 nodes are all running fine.

 Note the 5 nodes are using IPs 127.0.0.1 to 127.0.0.5. I understand 127.0.0.1
 points to the local server, but what about 127.0.0.2 to 127.0.0.5? They look
 like invalid IPs to me, so how come all 5 nodes are working OK?

 Another question. I have installed MX4J in instance 127.0.0.1 on port 8081.
 I am able to connect to http://server:8081/ from the browser. However, how do
 I connect using JConsole installed on another Windows machine? (My CentOS 5.5
 server doesn't have X installed; only SSH is allowed.)

 Thanks.


Re: cluster IP question and Jconsole?

2011-04-15 Thread tinhuty he
Maki, thanks for your reply. For the second question, I wasn't using the 
loopback address; I was using the actual IP address for that server. I am 
able to telnet to that IP on port 8081, but connecting with JConsole failed.


-Original Message- 
From: Maki Watanabe

Sent: Friday, April 15, 2011 9:43 PM
To: user@cassandra.apache.org
Cc: tinhuty he
Subject: Re: cluster IP question and Jconsole?

127.0.0.2 to 127.0.0.5 are valid IP addresses. Those are just alias
addresses for your loopback interface.
Verify:
 % ifconfig -a

127.0.0.0/8 is for loopback, so you can't connect to these addresses from
remote machines.
You may be able to configure SSH port forwarding from your monitoring
host to the cassandra node, though I haven't tried it.

maki

2011/4/16 tinhuty he tinh...@hotmail.com:

I have followed the description here
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/lauching_5_node_cassandra_clusters
to create 5 instances of cassandra on one CentOS 5.5 machine. Using
nodetool shows the 5 nodes are all running fine.

Note the 5 nodes are using IPs 127.0.0.1 to 127.0.0.5. I understand 127.0.0.1
points to the local server, but what about 127.0.0.2 to 127.0.0.5? They look
like invalid IPs to me, so how come all 5 nodes are working OK?

Another question. I have installed MX4J in instance 127.0.0.1 on port 8081.
I am able to connect to http://server:8081/ from the browser. However, how do
I connect using JConsole installed on another Windows machine? (My CentOS 5.5
server doesn't have X installed; only SSH is allowed.)

Thanks.




RE: DatabaseDescriptor.defsVersion

2011-04-15 Thread Jeffrey Wang
Done: https://issues.apache.org/jira/browse/CASSANDRA-2490

-Jeffrey

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Friday, April 15, 2011 7:39 PM
To: user@cassandra.apache.org
Cc: Jeffrey Wang
Subject: Re: DatabaseDescriptor.defsVersion

I think you found a bug; it should be volatile.  (Cassandra does
already make sure that only one change runs internally at a time.)

Can you create a ticket?

On Fri, Apr 15, 2011 at 6:04 PM, Jeffrey Wang jw...@palantir.com wrote:
 Hey all,



 I've been seeing a very rare issue with schema change conflicts on 0.7.3 (I
 am serializing all schema changes to a single Cassandra node and waiting for
 them to finish before continuing). Occasionally a node in the cluster will
 never report the correct schema, and I think it may have to do with
 synchronization on DatabaseDescriptor.defsVersion.



 As far as I can tell, it is a static variable accessed by multiple threads
 but is not protected by synchronized/volatile. I was able to write a test in
 which one thread never reads the modification done by another thread (as is
 expected by an unsynchronized variable). Should this be fixed or is there a
 higher level reason this does not need to be synchronized (in which case I
 should continue looking for the reason why my schemas don't agree)? Thanks.



 -Jeffrey





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


What will be the steps for adding new nodes

2011-04-15 Thread Roni
I have a 0.6.4 Cassandra cluster of two nodes in full replica (replication
factor 2). I want to add two more nodes and balance the cluster (keeping
replication factor 2).

I want all of them to be seeds.

What should be the simple steps:

1. add <AutoBootstrap>true</AutoBootstrap> to all the nodes, or only
the new ones?

2. add <Seed>[new_node]</Seed> to the config file of the old nodes
before adding the new ones? (see the config fragment below)

3. do the old nodes need to be restarted (if no change is needed in their
config file)?
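
(For reference, the relevant 0.6 storage-conf.xml fragment would look roughly
like this; the IPs are placeholders:

   <AutoBootstrap>true</AutoBootstrap>
   <Seeds>
       <Seed>10.0.0.1</Seed>
       <Seed>10.0.0.2</Seed>
   </Seeds>
)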

 

TX,

 

 


