from:"mcasandra"

Speaking purely from my personal experience, I haven't found cassandra
optimal for storing big fat rows. Even at only 100s of KB per row I didn't
find cassandra suitable for it. In my case I am looking at 400 writes + 400
reads per sec, growing 20%-30% every year, with file sizes from 70k-300k. What
I found is that simultaneous inserts and reads of big rows running in
parallel kill the performance of cassandra. Even if you add more nodes it
doesn't scale at the level you would expect it to. You start to see dropped
messages all around. With an 8 node cluster, good disks (SAS) and following
the recommendations for tuning cassandra performance I was only able to get
140 inserts and 80 reads per sec.
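
To put those numbers in perspective, here is a rough sketch of the raw payload bandwidth they imply. It assumes 1 KB = 1000 bytes and ignores replication, compaction and commit log overhead, so treat it as back-of-the-envelope arithmetic only.

```python
# Back-of-the-envelope payload bandwidth for the workload described above.
# Assumption: 1 KB = 1000 bytes, 1 MB = 1000 KB; replication traffic is ignored.

def payload_mb_per_sec(ops_per_sec, size_kb):
    """Raw payload bandwidth in MB/s for a given request rate and object size."""
    return ops_per_sec * size_kb / 1000.0

target_low = payload_mb_per_sec(400, 70)     # smallest files at the target rate
target_high = payload_mb_per_sec(400, 300)   # largest files at the target rate
seen_low = payload_mb_per_sec(140, 70)       # measured insert rate, smallest files
seen_high = payload_mb_per_sec(140, 300)     # measured insert rate, largest files

print(f"target writes:   {target_low:.1f} to {target_high:.1f} MB/s")
print(f"measured writes: {seen_low:.1f} to {seen_high:.1f} MB/s")
```

The gap between the two ranges is the shortfall I am describing, before reads are even counted.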

You can simply test it with the stress tool and you will see the difference
as you start to increase the column size. Performance that starts at
1000s / sec with small columns drops quickly as the column size grows.

But if your traffic is low volume it might work ok. Also, if over a period you
will have tons of blobs you might find yourself in a difficult situation. I
suggest doing some tests.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Storing-files-in-blob-into-Cassandra-tp6503165p6505188.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.

Re: 99.999% uptime - Operations Best Practices?

In my opinion 5 9s don't matter. It's the number of impacted customers. You
might be down during peak for 5 minutes causing 1000s of customer turn-aways,
while you might be down during the night causing only a few customer turn-aways.
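
For reference, a small sketch of what each availability target allows per year. It assumes a year of 365.25 days; the three targets are only examples.

```python
# Allowed downtime per year for a given availability target.
# Assumption: a year of 365.25 days; the targets listed are illustrative.

MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes_per_year(availability):
    """Minutes of downtime per year permitted by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR

for label, availability in [("99.9%", 0.999), ("99.99%", 0.9999), ("99.999%", 0.99999)]:
    print(f"{label}: {downtime_minutes_per_year(availability):.2f} minutes/year")
```

So a single 5 minute outage uses up roughly the whole yearly budget for 5 9s, whether it hits at peak or at night.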

There is no magic bullet. It's all about learning and improving. You will
not get HA right away, but over a period of time, as you learn and improve, you
will do better.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/99-999-uptime-Operations-Best-Practices-tp6506227p6506511.html

Re: 99.999% uptime - Operations Best Practices?

Start with reading comments on cassandra.yaml and 
http://wiki.apache.org/cassandra/Operations

As far as I know there is no comprehensive list for performance tuning, or
more specifically no common settings applicable to everyone. For the most part
issues revolve around compactions and GC tuning.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/99-999-uptime-Operations-Best-Practices-tp6506227p6506529.html

Re: 99.999% uptime - Operations Best Practices?


Les Hazlewood wrote:
 
 I have architected, built and been responsible for systems that support
 4-5 9s for years.
 

So have most of us. But by now it should probably be clear that no
technology can provide concrete recommendations. It can only provide what
might be helpful, which varies from env to env. That's why I suggest looking at
the comments in cassandra.yaml and seeing which are applicable in your
scenario. I learn something new every time I read it.

BTW: Can you be clear as to what kind of recommendations you are referring
to? NetworkTopology, how many copies to store, uptime, load balancing,
request routing when one DC is down? If you ask specific questions you might
get a better response.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/99-999-uptime-Operations-Best-Practices-tp6506227p6506565.html

Re: Is LOCAL_QUORUM as strong as QUORUM?

LOCAL_QUORUM guarantees consistency in the local data center only. Other
replica nodes in the same DC and in other DCs that are not part of the QUORUM
will be eventually consistent. If you want to ensure consistency across DCs you
can use EACH_QUORUM, but keep in mind the latency involved, assuming the DCs are
not located within a short distance.
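
A minimal sketch of how many replica acknowledgements each level waits for. The two data centers and the replication factor of 3 per DC are made-up example values, not anything from this thread.

```python
# Number of replica acknowledgements each consistency level waits for.
# Assumption: a hypothetical two-DC keyspace with 3 replicas per data center.

def quorum(replicas):
    """Smallest majority of a replica count."""
    return replicas // 2 + 1

def required_acks(replication, local_dc):
    """Acks needed per consistency level, given {dc_name: replication_factor}."""
    return {
        "QUORUM": quorum(sum(replication.values())),
        "LOCAL_QUORUM": quorum(replication[local_dc]),
        "EACH_QUORUM": {dc: quorum(rf) for dc, rf in replication.items()},
    }

acks = required_acks({"DC1": 3, "DC2": 3}, local_dc="DC1")
print(acks)
```

With these example numbers LOCAL_QUORUM never has to wait on the remote DC, while EACH_QUORUM needs a majority in both.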

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Is-LOCAL-QUORUM-as-strong-as-QUORUM-tp6506592p6506621.html

Re: Is LOCAL_QUORUM as strong as QUORUM?

Well it depends on the requirements. If you use any combination of CL with
EACH_QUORUM it means you are accepting that requests will fail if one of
the DCs is down. In your scenario you care more about the DCs being
consistent, even if writes were to fail. You are also ok with the network
latency.

I think there is a broader design question here and you might be able to
solve it with LOCAL_QUORUM if you handle it at the application or load
balancing layer. Is this an active/active data center setup? What are your
actual requirements? Are these external clients that can go to any data center?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Is-LOCAL-QUORUM-as-strong-as-QUORUM-tp6506592p6506937.html

Re: Direct control over where data is stored?

2011-06-05 Thread mcasandra

Please give more detailed info about what exactly you are worried about or
trying to solve.

Please take a step back and look at cassandra's architecture again and what
it's trying to solve. It's a distributed database, so if you do what you are
describing there is a potential for hotspots, which will probably
lead to other problems. You might solve one problem but then introduce
another, like slow reads or one node getting overloaded.

If you really want to do what you described you can solve it simply by
designing your data model that way. For example, for User A you can store
information for all its friends. This will lead to duplicate data but will
solve your problem.

I also suggest running some stress tests and worrying about load and
performance only if it is a real problem for your kind of data.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Direct-control-over-where-data-is-stored-tp6441048p6442802.html

Re: Direct control over where data is stored?

2011-06-05 Thread mcasandra


Khanh Nguyen wrote:
 
 Is there a way to tell where a piece of data is stored in a cluster?
 For example, can I tell if LastNameColumn['A'] is stored at node 1 in
 the ring?
 

I have not used it but you can see getNaturalEndpoints in jmx. It will tell
you which nodes are responsible for a given row *key*
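
To illustrate the idea of a key mapping to a set of nodes, here is a toy model of a token ring. It assumes MD5-based tokens and simple clockwise replica placement; the node names and tokens are invented, and this is not cassandra's actual code or the JMX call itself.

```python
# Toy model of how a row key maps to replica nodes on a token ring.
# Assumptions: MD5-based tokens, SimpleStrategy-style placement (walk the ring
# clockwise), and made-up node names and tokens. Not Cassandra's actual code.
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127

def token_for(key):
    """Hash a row key onto the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % RING_SIZE

def natural_endpoints(key, ring, replication_factor):
    """Return the nodes responsible for a key; ring is a sorted list of (token, node).

    replication_factor should not exceed the number of nodes in the ring.
    """
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token, wrapping around at the end.
    start = bisect_left(tokens, token_for(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]

# Four hypothetical nodes with evenly spaced tokens.
ring = [(i * RING_SIZE // 4, f"node{i + 1}") for i in range(4)]
print(natural_endpoints("A", ring, replication_factor=3))
```

Note that the lookup is by row key only, which is why you cannot ask where an individual column lives.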

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Direct-control-over-where-data-is-stored-tp6441048p6443571.html

Re: Reading quorum

2011-06-03 Thread mcasandra


Fredrik Stigbäck wrote:
 
 Does reading quorum mean only waiting for quorum respones or does it mean
 quorum respones with same latest timestamp?
 
 Regards
 /Fredrik
 

Well it depends on what your CL is for writes. If you write with QUORUM and
then read with QUORUM then yes, you will get at least one response with the
latest timestamp.

http://wiki.apache.org/cassandra/API
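
The reasoning is the usual R + W > N overlap rule. A small sketch, where the replication factor of 3 is only an example value:

```python
# Check whether read and write consistency levels overlap on at least one replica.
# Assumption: replication factor 3 is only an example value.

def quorum(replication_factor):
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1

def overlaps(read_acks, write_acks, replication_factor):
    """True when every read set must share a replica with every write set."""
    return read_acks + write_acks > replication_factor

rf = 3
quorum_both = overlaps(quorum(rf), quorum(rf), rf)  # QUORUM reads + QUORUM writes
one_both = overlaps(1, 1, rf)                       # ONE reads + ONE writes
print("QUORUM/QUORUM overlap:", quorum_both)
print("ONE/ONE overlap:", one_both)
```

When the sets overlap, at least one replica that answers the read has seen the latest write, and the coordinator picks the value with the newest timestamp.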

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Reading-quorum-tp6435568p6436020.html

Re: EC2 node adding trouble

2011-05-25 Thread mcasandra

Can you post the output of netstat -anp|grep LISTEN|grep java from all
3 nodes?

Also compare the second node's yaml with the new node's yaml and see what
differences you find, if any.

Another thing: try telnet tests from the seed node to the new node.
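
If telnet is not installed, something like the following does roughly the same check. This is a sketch: the host is a placeholder, and 7000 / 9160 are assumed to be the storage and Thrift ports from a default cassandra.yaml, so check your own file.

```python
# Minimal TCP reachability check, roughly what a manual telnet test tells you.
# Assumptions: the host below is a placeholder, and 7000 / 9160 are the
# storage and Thrift ports from a default cassandra.yaml of this era.
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for host in ["127.0.0.1"]:   # placeholder; replace with the new node's address
        for port in (7000, 9160):
            print(host, port, "open" if port_open(host, port) else "unreachable")
```

Run it from the seed node against the new node, and then in the other direction.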

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/EC2-node-adding-trouble-tp6399102p6403602.html

Re: Exception when starting

Whenever I hear someone say data is corrupted I panic :) I have seen a few
people report that but have not seen the real reason for it. Is it a
manual error, config error, bug etc.? It would be good to identify why these
things happen so that they can be fixed before it happens in PROD :(

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-Exception-when-starting-tp6383464p6386809.html

Re: How to reduce the Read Latency.

What's your avg column size and row size? Your read latency in most cases will
be directly related to how much you are trying to read. In my experience you
will see high read latency if you have a big column size.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-to-reduce-the-Read-Latency-tp6385107p6386817.html

Re: Cassandra Vs. Oracle Coherence

Coherence is similar to memcached (free). It's an in-memory cache layer on top
of the DB. You as a user need to keep that cache in sync with the DB.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-Vs-Oracle-Coherence-tp6375561p6386847.html

Re: Exception when starting


Brandon Williams wrote:
 
 There was a bug, it is fixed.  It's just a cache, chill.
 

There is no time to chill when fighting it in production :) It's good to
know it's fixed.

Another question, when this happens are we able to restore data from replica
nodes?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-Exception-when-starting-tp6383464p6386925.html

Re: Exception when starting

In this case, yes. I was asking for the cases where commit log corruption was
reported.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-Exception-when-starting-tp6383464p6387101.html

Re: Inconsistent results using secondary indexes between two DC

2011-05-19 Thread mcasandra

I am wondering if running nodetool repair will help in any way.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Inconsistent-results-using-secondary-indexes-between-two-DC-tp632p6382819.html

Re: Commitlog Disk Full

2011-05-17 Thread mcasandra

Do you see anything in log files?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Commitlog-Disk-Full-tp6356797p6374234.html

Re: Commitlog Disk Full

2011-05-17 Thread mcasandra

Those messages are ok to ignore. It's basically deleting the files that are
already flushed as SSTables.

Which version are you running?

Have you tried restarting the node?

Pick one node and send the ls -ltr output and also the complete log files since
your last restart from the same node. I looked at the code and it looks like
you should see something in the logs for those files.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Commitlog-Disk-Full-tp6356797p6375353.html

Re: Commitlog Disk Full

2011-05-16 Thread mcasandra

You can try to update column family using cassandra-cli. Try to set
memtable_throughput to 32 first.

[default@unknown] help update column family;
update column family Bar;
update column family Bar with att1=value1;
update column family Bar with att1=value1 and att2=value2...;

Update a column family with the specified values for the given set of
attributes. Note that you must be using a keyspace.

valid attributes are:
- column_type: Super or Standard
- comment: Human-readable column family description. Any string is
acceptable
- rows_cached: Number or percentage of rows to cache
- row_cache_save_period: Period with which to persist the row cache, in
seconds
- keys_cached: Number or percentage of keys to cache
- key_cache_save_period: Period with which to persist the key cache, in
seconds
- read_repair_chance: Probability (0.0-1.0) with which to perform read
repairs on CL.ONE reads
- gc_grace: Discard tombstones after this many seconds
- column_metadata: null
- memtable_operations: Flush memtables after this many operations (in
millions)
- memtable_throughput: ... or after this many MB have been written
- memtable_flush_after: ... or after this many minutes
- default_validation_class: null
- min_compaction_threshold: Avoid minor compactions of less than this
number of sstable files
- max_compaction_threshold: Compact no more than this number of sstable
files at once
- column_metadata: Metadata which describes columns of column family.
Supported format is [{ k:v, k:v, ... }, { ... }, ...]
Valid attributes: column_name, validation_class (see comparator),
index_type (integer), index_name.

--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Commitlog-Disk-Full-tp6356797p6370913.html

Re: Commitlog Disk Full

2011-05-13 Thread mcasandra

Is there a way to look at the actual size of memtable? Would that help?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Commitlog-Disk-Full-tp6356797p6360001.html

Re: Commitlog Disk Full

2011-05-13 Thread mcasandra

5G in one hour is actually very low. Something else is wrong. Peter pointed
out that something related to memtable size could be causing this problem. Can
you turn down memtable_throughput and see if that helps?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Commitlog-Disk-Full-tp6356797p6362301.html

Re: Strange corrupt sstable

2011-04-28 Thread mcasandra

What do you mean by bad memory? Is it too small a heap size, OOM issues or
something else? What happens in such a scenario, is there data loss?

Sorry for the many questions, just trying to understand since data is critical
after all :)

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Strange-corrupt-sstable-tp6314052p6314218.html

Re: AW: AW: Two versions of schema

2011-04-19 Thread mcasandra

What would be the procedure in this case? Run drain on the node that is
disagreeing? But is it enough to run just drain, or do you suggest drain + rm
system files?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Two-versions-of-schema-tp6277365p6287863.html

Re: AW: Two versions of schema

2011-04-18 Thread mcasandra

In my case all hosts were reachable and I ran nodetool ring before running
the schema update. I don't think it was because of a node being down. I think
for some reason it just took over 10 secs because I was reducing key_cache
from 1M to 1000. I think it might be taking long to trim the keys, hence the 10
sec default may not be right.

What is drain?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Two-versions-of-schema-tp6277365p6284276.html

Re: Two versions of schema

2011-04-16 Thread mcasandra

I don't think I got a correct answer to my original post. Can someone please
help?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Two-versions-of-schema-tp6277365p6280070.html

Key cache hit rate

2011-04-15 Thread mcasandra


How to interpret Key cache hit rate? What does this number mean?


Keyspace: StressKeyspace
Read Count: 87579
Read Latency: 11.792417360326105 ms.
Write Count: 179749
Write Latency: 0.009272318622078566 ms.
Pending Tasks: 0
Column Family: StressStandard
SSTable count: 59
Space used (live): 52432078035
Space used (total): 52432078035
Memtable Columns Count: 229
Memtable Data Size: 114103248
Memtable Switch Count: 375
Read Count: 87579
Read Latency: NaN ms.
Write Count: 179751
Write Latency: 0.007 ms.
Pending Tasks: 0
Key cache capacity: 100
Key cache size: 78576
Key cache hit rate: 3.8880248833592535E-4
Row cache: disabled
Compacted row minimum size: 182786
Compacted row maximum size: 5839588
Compacted row mean size: 532956
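
For what it's worth, the value is just printed in scientific notation. A small sketch to read it, assuming it is a plain fraction between 0 and 1 (hits divided by lookups):

```python
# Read the key cache hit rate that cfstats prints in scientific notation.
# Assumption: the value is a fraction between 0 and 1 (hits / lookups).

raw = "3.8880248833592535E-4"   # copied from the cfstats output above

hit_rate = float(raw)
lookups_per_hit = round(1 / hit_rate)

print(f"hit rate: {hit_rate:.6f} ({hit_rate * 100:.4f}% of lookups)")
print(f"roughly 1 hit per {lookups_per_hit} key lookups")
```

Read that way, almost every key lookup is missing the cache.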


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Key-cache-hit-rate-tp6277236p6277236.html

Two versions of schema

2011-04-15 Thread mcasandra

Is there a problem?


[default@StressKeyspace] update column family StressStandard with
keys_cached=100;
854ee0a0-6792-11e0-81f9-93d987913479
Waiting for schema agreement...
The schema has not settled in 10 seconds; further migrations are ill-advised
until it does.
Versions are 854ee0a0-6792-11e0-81f9-93d987913479:[10.18.62.202,
10.18.62.203, 10.18.62.200, 10.18.62.204, 10.18.62.199, 10.18.62.196,
10.18.62.197],22d165ff-6783-11e0-81f9-93d987913479:[10.18.62.198]


I remember reading somewhere before that when you have 2 versions of the schema
you are basically in trouble. Can someone explain what it means and its
implications?
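
To see quickly which node is the odd one out, the "Versions are" line can be parsed with something like this sketch. It assumes the uuid:[ip, ip, ...] shape shown in the output above and nothing more.

```python
# Parse the "Versions are ..." line printed by cassandra-cli and list the
# nodes that are not on the majority schema version.
# Assumption: the line has the uuid:[ip, ip, ...] shape shown above.
import re

line = ("854ee0a0-6792-11e0-81f9-93d987913479:[10.18.62.202, "
        "10.18.62.203, 10.18.62.200, 10.18.62.204, 10.18.62.199, 10.18.62.196, "
        "10.18.62.197],22d165ff-6783-11e0-81f9-93d987913479:[10.18.62.198]")

def schema_versions(text):
    """Map each schema version UUID to the list of nodes reporting it."""
    pairs = re.findall(r"([0-9a-f-]{36}):\[([^\]]*)\]", text)
    return {version: [ip.strip() for ip in ips.split(",")] for version, ips in pairs}

versions = schema_versions(line)
majority = max(versions, key=lambda v: len(versions[v]))
stragglers = [ip for v, ips in versions.items() if v != majority for ip in ips]

print("majority version:", majority)
print("nodes disagreeing:", stragglers)
```

Here only one node is still on the old version, the rest agree on the new one.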

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Two-versions-of-schema-tp6277365p6277365.html

All nodes down even though ring shows up

2011-04-14 Thread mcasandra

I ran a stress test to read 50K rows and since then I am getting the below
error even though ring shows all nodes are up:

ERROR 12:40:29,999 Exception:
me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.
    at me.prettyprint.cassandra.connection.HConnectionManager.getClientFromLBPolicy(HConnectionManager.java:308)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:213)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:203)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:200)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:200)
    at com.riptano.cassandra.stress.InsertCommand.call(InsertCommand.java:117)
    at com.riptano.cassandra.stress.InsertCommand.call(InsertCommand.java:1)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)


---

No errors logged in the system.log and tpstats shows nothing.

nodetool -h dsdb1 tpstats
Pool Name               Active  Pending  Completed
ReadStage                    0        0      50176
RequestResponseStage         0        0     207223
MutationStage                0        0     199473
ReadRepairStage              0        0      14615
GossipStage                  0        0      39835
AntiEntropyStage             0        0          0
MigrationStage               0        0        207
MemtablePostFlusher          0        0        386
StreamStage                  0        0          0
FlushWriter                  0        0        385
FILEUTILS-DELETE-POOL        0        0       1446
MiscStage                    0        0          0
FlushSorter                  0        0          0
InternalResponseStage        0        0       1230
HintedHandoff                0        0          7


compaction stats say pending: 0



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/All-nodes-down-even-though-ring-shows-up-tp6274152p6274152.html

Re: flush_largest_memtables_at messages in 7.4


Peter Schuller wrote:
 
 Saturated.
 
But read latency is still something like 30ms which I would think would be
much higher if it's saturated.


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6269655.html

Re: flush_largest_memtables_at messages in 7.4

One correction: queue size in iostat ranges between 6-120. But this still
doesn't explain why read latency is low in cfstats.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6269875.html

Re: flush_largest_memtables_at messages in 7.4

I still don't understand. You would expect read latency to increase
drastically when it's fully saturated, and a lot of READ drop messages as well,
correct? I don't see that in cfstats or system.log, and I don't really
understand why.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6270244.html

Re: flush_largest_memtables_at messages in 7.4

Actually when I run 2 stress clients in parallel I see Read Latency stay the
same. I wonder if cassandra is reporting accurate numbers.

I understand your analogy but for some reason I don't see that happening
with the results I am seeing with multiple stress clients running. So I am
just confused where the real bottleneck is.



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6270942.html

Re: Timeout during stress test

Here is what cfhistograms looks like. I don't really understand what this
means, will try to read up. I also see %util in iostat continuously at 90%. Not
sure if this is caused by extra reads by cassandra. It seems unusual.

[root@dsdb4 ~]# nodetool -h `hostname` cfhistograms StressKeyspace StressStandard
StressKeyspace/StressStandard histograms
Offset   SSTables  Write Latency  Read Latency  Row Size  Column Count
1           45720              0             0         0        498857
2               0              0             0         0             0
3               0              0             0         0             0
4               0              0             0         0             0
5               0              0             0         0             0
6               0              0             1         0             0
7               0              0             1         0             0
8               0              0             0         0             0
10              0              0             0         0             0
12              0              0             0         0             0
14              0              0             0         0             0
17              0              1             0         0             0
20              0              2             0         0             0
24              0              1             0         0             0
29              0              6             0         0             0
35              0             68             0         0             0
42              0            509             0         0             0
50              0           1128             0         0             0
60              0           1449             0         0             0
72              0            789             0         0             0
86              0            400             0         0             0
103             0            319             0         0             0
124             0            388             0         0             0
149             0            456             0         0             0
179             0            519             0         0             0
215             0            262             0         0             0
258             0            194             0         0             0
310             0             48             0         0             0
372             0              5             0         0             0
446             0              1             0         0             0
535             0              0             0         0             0
642             0              0             0         0             0
770             0              1             0         0             0
924             0              1             0         0             0
1109            0              0             0         0             0
1331            0              1             0         0             0
1597                           0             0         0             0
1916                           1             0         0             0
2299                           0             0         0             0
2759                           0             0         0             0
3311                           0             0         0             0
3973                           1             0         0             0
4768                           5             0         0             0
5722                          19             0         0             0
6866                          46             0         0             0
8239                         102             0         0             0
9887                         226             0         0             0
11864                        368             0         0             0
14237                        572             0
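
One way to read a histogram like this is to find the bucket where a given percentile falls. The sketch below assumes each row is an (offset, count) pair and that, for the latency columns, the offset is a latency in microseconds. The sample rows are only a subset of the Write Latency column above, so the result is illustrative.

```python
# Find the histogram bucket that contains a given percentile.
# Assumptions: each row is (offset, count), and for the latency columns the
# offset is a latency in microseconds. The sample rows are a subset of the
# Write Latency column above, so the result is only illustrative.

def percentile_bucket(histogram, fraction):
    """Return the offset of the first bucket where the running count reaches fraction of the total."""
    total = sum(count for _, count in histogram)
    running = 0
    for offset, count in histogram:
        running += count
        if running >= fraction * total:
            return offset
    return None  # empty histogram

write_latency = [(35, 68), (42, 509), (50, 1128), (60, 1449), (72, 789), (86, 400)]
print("median bucket:", percentile_bucket(write_latency, 0.5))
print("95th percentile bucket:", percentile_bucket(write_latency, 0.95))
```

The same function can be pointed at the Read Latency or Row Size column once the full output is available.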

Re: Lot of pending tasks for writes

Can someone please help?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Lot-of-pending-tasks-for-writes-tp6263462p6266213.html

flush_largest_memtables_at messages in 7.4

I am using cassandra 0.7.4 and getting these messages.

Heap is 0.7802529021498031 full. You may need to reduce memtable and/or
cache sizes Cassandra will now flush up to the two largest memtables to free
up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if
you don't want Cassandra to do this automatically

How do I verify that I need to adjust any thresholds? And how do I calculate
the correct value?
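
As a first sanity check, the heap fraction in the message can be compared with the configured threshold. This sketch assumes flush_largest_memtables_at is 0.75, which I believe is the value shipped in cassandra.yaml; check your own file.

```python
# Compare the heap fraction from the log line with the flush threshold.
# Assumption: flush_largest_memtables_at is 0.75, which I believe is the
# shipped default in cassandra.yaml; check your own file.

heap_fraction = 0.7802529021498031      # from the log message above
flush_largest_memtables_at = 0.75       # assumed threshold

over_by = heap_fraction - flush_largest_memtables_at
if over_by > 0:
    print(f"heap is {over_by * 100:.1f} percentage points over the threshold")
else:
    print("heap is below the threshold; no emergency flush expected")
```

This only shows why the message fired, not what the right threshold is.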

When I got this message only reads were occurring.

create keyspace StressKeyspace
with replication_factor = 3
and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';

use StressKeyspace;
drop column family StressStandard;
create column family StressStandard
with comparator = UTF8Type
and keys_cached = 100
and memtable_flush_after = 1440
and memtable_throughput = 128;

 nodetool -h dsdb4 tpstats
Pool Name               Active  Pending  Completed
ReadStage                   32      281     456598
RequestResponseStage         0        0     797237
MutationStage                0        0     499205
ReadRepairStage              0        0     149077
GossipStage                  0        0     217227
AntiEntropyStage             0        0          0
MigrationStage               0        0        201
MemtablePostFlusher          0        0       1842
StreamStage                  0        0          0
FlushWriter                  0        0       1841
FILEUTILS-DELETE-POOL        0        0       3670
MiscStage                    0        0          0
FlushSorter                  0        0          0
InternalResponseStage        0        0          0
HintedHandoff                0        0         15

cfstats

Keyspace: StressKeyspace
Read Count: 460988
Read Latency: 38.07654727454945 ms.
Write Count: 499205
Write Latency: 0.007409593253272703 ms.
Pending Tasks: 0
Column Family: StressStandard
SSTable count: 9
Space used (live): 247408645485
Space used (total): 247408645485
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 1878
Read Count: 460989
Read Latency: 28.237 ms.
Write Count: 499205
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 100
Key cache size: 299862
Key cache hit rate: 0.6031833150384193
Row cache: disabled
Compacted row minimum size: 219343
Compacted row maximum size: 5839588
Compacted row mean size: 497474


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6266221.html

Re: flush_largest_memtables_at messages in 7.4

64 bit 12 core 96 GB RAM

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6266400.html

Re: flush_largest_memtables_at messages in 7.4

Yes

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6266726.html

Re: flush_largest_memtables_at messages in 7.4

One thing I am noticing is that the cache hit rate is very low even though my
key cache size is 1M and I have less than 1M rows. Not sure why there are so
many cache misses.

Keyspace: StressKeyspace
Read Count: 162506
Read Latency: 45.22479006928975 ms.
Write Count: 247180
Write Latency: 0.011610943442026053 ms.
Pending Tasks: 0
Column Family: StressStandard
SSTable count: 184
Space used (live): 99616537894
Space used (total): 99616537894
Memtable Columns Count: 351
Memtable Data Size: 171716049
Memtable Switch Count: 543
Read Count: 162507
Read Latency: 317.892 ms.
Write Count: 247180
Write Latency: 0.006 ms.
Pending Tasks: 0
Key cache capacity: 100
Key cache size: 256013
Key cache hit rate: 0.33801452784503633
Row cache: disabled
Compacted row minimum size: 182786
Compacted row maximum size: 5839588
Compacted row mean size: 537470



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6267234.html

Re: flush_largest_memtables_at messages in 7.4

Does it really matter how long cassandra has been running? I thought it will
keep keys of 1M at least.

Regarding your previous question about queue size in iostat I see it ranging
from 114-300.



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/flush-largest-memtables-at-messages-in-7-4-tp6266221p6267728.html

Timeout during stress test

I am running stress test using hector. In the client logs I see:

me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:32)
    at me.prettyprint.cassandra.service.HColumnFamilyImpl$1.execute(HColumnFamilyImpl.java:256)
    at me.prettyprint.cassandra.service.HColumnFamilyImpl$1.execute(HColumnFamilyImpl.java:227)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:221)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
    at me.prettyprint.cassandra.service.HColumnFamilyImpl.doExecuteSlice(HColumnFamilyImpl.java:227)
    at me.prettyprint.cassandra.service.HColumnFamilyImpl.getColumns(HColumnFamilyImpl.java:139)
    at com.riptano.cassandra.stress.SliceCommand.call(SliceCommand.java:48)
    at com.riptano.cassandra.stress.SliceCommand.call(SliceCommand.java:20)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:7174)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:540)
    at org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:512)
    at me.prettyprint.cassandra.service.HColumnFamilyImpl$1.execute(HColumnFamilyImpl.java:236)


But I don't see anything in cassandra logs.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Timeout-during-stress-test-tp6262430p6262430.html

Re: Timeout during stress test

I see this occurring often when all cassandra nodes all of a sudden show a CPU
spike. All reads fail for about 2 minutes. GC.log and system.log don't reveal
much.

The only thing I notice is that when I restart nodes there are tons of files
that get deleted. tpstats and cfstats from one of the nodes look like this:

nodetool -h `hostname` tpstats
Pool Name               Active  Pending  Completed
ReadStage                   27       27      21491
RequestResponseStage         0        0     201641
MutationStage                0        0     236513
ReadRepairStage              0        0       7222
GossipStage                  0        0      31498
AntiEntropyStage             0        0          0
MigrationStage               0        0          0
MemtablePostFlusher          0        0        324
StreamStage                  0        0          0
FlushWriter                  0        0        324
FILEUTILS-DELETE-POOL        0        0       1220
MiscStage                    0        0          0
FlushSorter                  0        0          0
InternalResponseStage        0        0          0
HintedHandoff                1        3          9

--


Keyspace: StressKeyspace
    Read Count: 21957
    Read Latency: 46.91765058978913 ms.
    Write Count: 222104
    Write Latency: 0.008302124230090408 ms.
    Pending Tasks: 0
        Column Family: StressStandard
        SSTable count: 286
        Space used (live): 377916657941
        Space used (total): 377916657941
        Memtable Columns Count: 362
        Memtable Data Size: 164403613
        Memtable Switch Count: 326
        Read Count: 21958
        Read Latency: 631.464 ms.
        Write Count: 222104
        Write Latency: 0.007 ms.
        Pending Tasks: 0
        Key cache capacity: 100
        Key cache size: 22007
        Key cache hit rate: 0.002453626459907744
        Row cache: disabled
        Compacted row minimum size: 87
        Compacted row maximum size: 5839588
        Compacted row mean size: 552698




--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Timeout-during-stress-test-tp6262430p6263087.html

Re: Timeout during stress test

It looks like hector did retry on all the nodes and failed. Does this then
mean cassandra is down for clients in this scenario? That would be bad.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Timeout-during-stress-test-tp6262430p6263270.html

Lot of pending tasks for writes

I am running stress test and on one of the nodes I see:

[root@dsdb5 ~]# nodetool -h `hostname` tpstats
Pool Name                    Active   Pending      Completed
ReadStage                         0         0           2495
RequestResponseStage              0         0         242202
MutationStage                    48       521         287850
ReadRepairStage                   0         0            799
GossipStage                       0         0          10639
AntiEntropyStage                  0         0              0
MigrationStage                    0         0            202
MemtablePostFlusher               1         2           1047
StreamStage                       0         0              0
FlushWriter                       1         1           1047
FILEUTILS-DELETE-POOL             0         0           2048
MiscStage                         0         0              0
FlushSorter                       0         0              0
InternalResponseStage             0         0              0
HintedHandoff                     1         3              5

and cfstats

Keyspace: StressKeyspace
    Read Count: 2494
    Read Latency: 4987.431669206095 ms.
    Write Count: 281705
    Write Latency: 0.017631469090005503 ms.
    Pending Tasks: 49
        Column Family: StressStandard
        SSTable count: 882
        Space used (live): 139589196497
        Space used (total): 139589196497
        Memtable Columns Count: 6
        Memtable Data Size: 14204955
        Memtable Switch Count: 1932
        Read Count: 2494
        Read Latency: 5921.633 ms.
        Write Count: 282522
        Write Latency: 0.017 ms.
        Pending Tasks: 32
        Key cache capacity: 100
        Key cache size: 1198
        Key cache hit rate: 0.0013596193065941536
        Row cache: disabled
        Compacted row minimum size: 219343
        Compacted row maximum size: 5839588
        Compacted row mean size: 557125

I am just running a simple test on a 6-node cassandra cluster with 4 GB heap,
96 GB RAM and 12 cores per host. I am inserting 1M rows with an avg column
size of 250k. I keep getting "Dropped mutation" messages in the logs. Not sure
how to troubleshoot or tune it.

Can someone please help?
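
One rough way to read the cfstats numbers above, offered as a sketch rather
than a diagnosis: dividing the column family's Write Count by its Memtable
Switch Count gives the average number of writes a memtable absorbs before it
is flushed. The 250 KB per write used below is an assumption taken from the
average column size quoted in this message, not something cfstats reports.

```python
def writes_per_flush(write_count, memtable_switch_count):
    """Average number of writes absorbed by a memtable before it is flushed."""
    return write_count / memtable_switch_count


# Column family figures copied from the cfstats output above.
avg_writes = writes_per_flush(write_count=282522, memtable_switch_count=1932)
print(f"writes per memtable flush: {avg_writes:.0f}")

# Assumption: each write carries roughly 250 KB (the quoted average column size).
approx_flush_mb = avg_writes * 250 / 1000
print(f"approx data per flush: {approx_flush_mb:.1f} MB")
```

That works out to roughly 146 writes, or somewhere around 37 MB, per flush.
With columns this large the memtables fill and flush very often, which would
be consistent with the high SSTable count (882) and the pending flush work
shown in tpstats.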

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Lot-of-pending-tasks-for-writes-tp6263462p6263462.html

Re: Timeout during stress test

But I don't understand the reason for the overload. It was doing a simple read
with 12 threads, reading 5 rows. Avg CPU is only 20% and there are no GC issues
that I can see. I would expect cassandra to be able to process more with 6
nodes, 12 cores, 96 GB RAM and 4 GB heap.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Timeout-during-stress-test-tp6262430p6263470.html

Re: Timeout during stress test


aaron morton wrote:
 
 You'll need to provide more information, from the TP stats the read stage
 could not keep up. If the node is not CPU bound then it is probably IO
 bound. 
 
 
 What sort of read?
 How many columns was it asking for ? 
 How many columns do the rows have ?
 Was the test asking for different rows ?
 How many ops requests per second did it get up to?
 What do the io stats look like ? 
 What does nodetool cfhistograms say ?
 
It's a simple read of 1M rows, each with one column of avg size 200K. I got
around 70 requests per sec.
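
For scale, the figures above can be turned into an approximate data rate. This
is only back-of-the-envelope arithmetic: it treats 200K as 200 KB per request,
and the per-node number assumes the load is spread evenly over the 6 nodes and
ignores replication and read repair traffic.

```python
def read_throughput_mb_per_sec(requests_per_sec, avg_value_kb):
    """Approximate payload throughput in MB/s (1 MB taken as 1000 KB)."""
    return requests_per_sec * avg_value_kb / 1000.0


total = read_throughput_mb_per_sec(requests_per_sec=70, avg_value_kb=200)
print(f"cluster-wide: {total:.1f} MB/s")

# Assumption: 6 nodes sharing the load evenly, no replication overhead counted.
print(f"per node: {total / 6:.1f} MB/s")
```

So 70 requests per second at this column size is on the order of 14 MB/s of
payload for the whole cluster.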

Not sure how to interpret the iostat output with things happening
asynchronously in cassandra. Can you give a short description of how to
interpret it?

I have posted the output of cfstats. Does cfhistograms provide better info?


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Timeout-during-stress-test-tp6262430p6263859.html

Re: Columns values(integer) need frequent updates/ increments

2011-04-10 Thread mcasandra

What's the difference between a row index and sstable index?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Columns-values-integer-need-frequent-updates-increments-tp6251464p6259882.html

Re: What need to be monitored while running stress test

What is a storage proxy latency?

By query latency, do you mean the one in cfstats and cfhistograms?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/What-need-to-be-monitored-while-running-stress-test-tp6255765p6257932.html

Re: Columns values(integer) need frequent updates/ increments

If there are multiple updates to the same column and they are scattered
across multiple sstables, how does cassandra know which sstable has the most
recent value?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Columns-values-integer-need-frequent-updates-increments-tp6251464p6257960.html

Re: Heads up: restarting a node with autobootstrap=true after nodetool move will re-bootstrap the node in 0.7.0-0.7.4

Thanks for the info!

Does this also happen if initial_token is set?

Also, I am unable to understand the last line in that JIRA



 A potential complication was that seed nodes were moved without using the
 correct procedure of de-seeding them first. This was clearly wrong
 

What is de-seeding and why would it cause this problem? The relation between
seeding and auto_bootstrap always confuses me. If and when you have time, can
you please explain it to me?

Thanks

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Heads-up-restarting-a-node-with-autobootstrap-true-after-nodetool-move-will-re-bootstrap-the-node-in4-tp6257434p6257988.html

Re: Columns values(integer) need frequent updates/ increments

That I understand, but my basic question was: how does it know that multiple
updates have occurred on the same column? And how does it efficiently know
which sstables have these updates?
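
For what it's worth, my understanding is that cassandra does not track which
sstable holds the newest value. Every column carries a timestamp, a read
collects the candidate versions from the memtable and from the sstables whose
bloom filters say they may contain the row, and the version with the highest
timestamp wins. A minimal sketch of that reconciliation step follows; the
ColumnVersion type and reconcile function are illustrative names, not
cassandra's actual classes, and tie-breaking on equal timestamps is left out.

```python
from typing import Iterable, NamedTuple, Optional


class ColumnVersion(NamedTuple):
    value: bytes
    timestamp: int  # supplied by the client, commonly microseconds since epoch


def reconcile(versions: Iterable[ColumnVersion]) -> Optional[ColumnVersion]:
    """Return the version with the highest timestamp, or None if there are none."""
    return max(versions, key=lambda v: v.timestamp, default=None)


# Hypothetical versions of one column found in the memtable and two sstables.
found = [
    ColumnVersion(b"1", timestamp=1000),  # older sstable
    ColumnVersion(b"3", timestamp=3000),  # memtable
    ColumnVersion(b"2", timestamp=2000),  # newer sstable
]
print(reconcile(found).value)
```

Compaction applies the same rule when it merges sstables, which is how older
versions eventually disappear from disk.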

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Columns-values-integer-need-frequent-updates-increments-tp6251464p6258033.html

auto_bootstrap

2011-04-08 Thread mcasandra

in yaml:
# Set to true to make new [non-seed] nodes automatically migrate data
# to themselves from the pre-existing nodes in the cluster. 

Why only non-seed nodes? What if seed nodes need to bootstrap?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/auto-bootstrap-tp6254993p6254993.html

CF config for Stress Test

2011-04-08 Thread mcasandra

I am starting a stress test using hector on a 6-node cluster, each node with
4 GB heap and 12 cores. In the hector readme this is what I got by default:

create keyspace StressKeyspace
with replication_factor = 3
and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';

use StressKeyspace;
drop column family StressStandard;
create column family StressStandard
with comparator = UTF8Type
and keys_cached = 1
and memtable_flush_after = 1440
and memtable_throughput = 32;

Are these good values? I was thinking of a higher keys_cached but am not sure
if it's in bytes or number of keys.

Also not sure how to tune memtable values.

I have set concurrent_readers to 32 and writers to 48.

Can someone please help me with good values that I can start this test with?

Also, any other suggested values that I need to change?

Thanks

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/CF-config-for-Stress-Test-tp6255608p6255608.html

What need to be monitored while running stress test

2011-04-08 Thread mcasandra

What are the key things to monitor while running a stress test? There are tons
of details in nodetool tpstats/netstats/cfstats. What in particular should I
be looking at?

Also, I've been looking at iostat and await goes really high, but cfstats
shows low latency in microseconds. Is latency in cfstats calculated per
operation?

I am just trying to understand what I need to look at, to make sure I don't
overlook important points in the process of evaluating cassandra.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/What-need-to-be-monitored-while-running-stress-test-tp6255765p6255765.html

Re: 0.7.4 - Cassandra Nodes Do Not Start

2011-04-06 Thread mcasandra

I see this error in the logs posted. Is this normal?

java.io.IOError: java.io.EOFException
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:73)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:61)



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/0-7-4-Cassandra-Nodes-Do-Not-Start-tp6246431p6246900.html

Re: LB scenario

2011-04-06 Thread mcasandra

I think the best approach is to have a global load balancer in front of the
web servers/app servers, and leave the app servers to handle requests at local
quorum. If a data center goes down, the load balancer will simply hand out
only the surviving DC's IPs.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/LB-scenario-tp6224754p6246968.html

RE: Ditching Cassandra

2011-04-01 Thread mcasandra

Where can I read more about CQL? I am assuming it's similar to SQL and
drivers like JDBC can be written on top of it. Is that right?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ditching-Cassandra-tp6221436p6231654.html

Re: Endless minor compactions after heavy inserts

2011-04-01 Thread mcasandra

Is there a way to monitor compactions using nodetool? I don't see them in
tpstats.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Endless-minor-compactions-after-heavy-inserts-tp6229633p6231672.html

Re: Understanding cfhistogram output

2011-04-01 Thread mcasandra

I can't find it on the wiki. Do you have a link that gives detailed help?

Also, is the latency in microseconds or milliseconds?

How about latency in cfstats? Is it micro or milli? It says ms, which
generally means milliseconds.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Understanding-cfhistogram-output-tp6231927p6232572.html

Re: How to determine if repair need to be run

2011-03-31 Thread mcasandra

If I am not wrong, nodetool repair needs to be run on all the nodes in a
staggered manner. It is required to take care of tombstones. Please correct
me, team, if I am wrong :)

See Distributed Deletes:

http://wiki.apache.org/cassandra/Operations



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-to-determine-if-repair-need-to-be-run-tp6220005p6227778.html

nodetool cfstathistogram error

2011-03-31 Thread mcasandra

Cassandra 0.7.4:

nodetool -h `hostname` cfhistograms system schema
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at $Proxy5.getRecentReadLatencyHistogramMicros(Unknown Source)
    at org.apache.cassandra.tools.NodeCmd.printCfHistograms(NodeCmd.java:452)
    at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:605)
Caused by: javax.management.InstanceNotFoundException: org.apache.cassandra.db:type=ColumnFamilies,keyspace=system,columnfamily=schema
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:662)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
    at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
    at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
    at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
    at sun.rmi.transport.Transport$1.run(Transport.java:159)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
    at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:255)
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:233)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:142)
    at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:878)
    at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:263)


Am I doing something wrong?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/nodetool-cfstathistogram-error-tp6228995p6228995.html

Re: nodetool cfstathistogram error

2011-03-31 Thread mcasandra

It looks like if I use system schema it fails. Is it because of
LocalPartitioner?

I ran with other keyspace and got following output.

Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1              0              0             0         0             0
2              0              0             0         0             0
179            0              0             0       320           320


Can someone please help me understand the output in first 2 columns? Why are
SSTables always 0?

I am writing shell/awk scripts to parse this data and send it out to a
monitoring tool.

So far I am planning to monitor the output of netstats, tpstats and
cfhistograms. Is there anything else I should monitor that might be helpful?
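
As an illustration of the kind of parsing involved, here is a minimal sketch
in Python rather than awk. It assumes the four-column tpstats layout shown in
the 0.7 output in this archive (pool name, Active, Pending, Completed); other
versions may print additional columns. The parse_tpstats and backlogged names
and the max_pending threshold are made up for the example.

```python
def parse_tpstats(text):
    """Parse `nodetool tpstats` output into {pool: {"active", "pending", "completed"}}."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # blank line, or the header ("Pool Name ..." splits into 5 fields)
        name, *counts = parts
        try:
            active, pending, completed = (int(c) for c in counts)
        except ValueError:
            continue  # not a data row
        pools[name] = {"active": active, "pending": pending, "completed": completed}
    return pools


def backlogged(pools, max_pending=0):
    """Names of pools whose pending count exceeds max_pending, sorted."""
    return sorted(name for name, s in pools.items() if s["pending"] > max_pending)


SAMPLE = """\
Pool Name                    Active   Pending      Completed
ReadStage                         0         0           2495
MutationStage                    48       521         287850
HintedHandoff                     1         3              5
"""

pools = parse_tpstats(SAMPLE)
print(backlogged(pools, max_pending=10))
```

With the sample rows above, only MutationStage is reported, since it is the
only pool with more than 10 pending tasks.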

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/nodetool-cfstathistogram-error-tp6228995p6229038.html

How to determine if repair need to be run

Is there a way to monitor and tell if one of the nodes requires repair? For
example: a node was down and came back up, but in the meantime hinted handoffs
(HH) were dropped. If we are really careful in all scenarios we wouldn't have
any problems :) but in general, when things are going awry, you might forget
about running repair or other commands until there is customer impact.

Is there a way to monitor and alert on things like repair?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-to-determine-if-repair-need-to-be-run-tp6220005p6220005.html

Re: How to determine if repair need to be run

Yes, but that doesn't really provide the monitoring that would be helpful. If
I don't realize it for 2 days, we could potentially be returning inconsistent
results or have data out of sync for 2 days until repair is run. It would be
best to be able to monitor these things so that repair can be run as soon as
it is required (e.g. after a node goes down). Having such monitoring would
also help the operations team, who may not know all the internals of
cassandra.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-to-determine-if-repair-need-to-be-run-tp6220005p6220171.html

Re: How to determine if repair need to be run

I think what I feel is that there is a need for a "repair is required" flag
in order for the team to manage the cluster.

At a minimum, is there a flag somewhere that tells if repair was run
within GCGraceSeconds?
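
Absent such a flag, one workaround would be to record when repair last
finished on each node and alert when that time gets too old relative to
GCGraceSeconds. The sketch below assumes the last-repair time is recorded
externally (for example by a hypothetical cron wrapper around nodetool
repair); the repair_overdue function and its safety_margin parameter are
illustrative, not part of cassandra. 864000 seconds (10 days) is the default
GCGraceSeconds.

```python
from datetime import datetime, timedelta

GC_GRACE_SECONDS = 864000  # default gc grace period: 10 days


def repair_overdue(last_repair, now, gc_grace_seconds=GC_GRACE_SECONDS,
                   safety_margin=0.5):
    """True once `safety_margin` of the gc grace period has passed since last repair.

    Alerting well before the full period expires leaves time to actually run
    the repair before tombstones become eligible for collection.
    """
    deadline = last_repair + timedelta(seconds=gc_grace_seconds * safety_margin)
    return now >= deadline


# Hypothetical timestamps: repair finished on March 1st.
last = datetime(2011, 3, 1)
print(repair_overdue(last, now=datetime(2011, 3, 4)))  # 3 days later
print(repair_overdue(last, now=datetime(2011, 3, 7)))  # 6 days later
```

With the default settings the alert threshold is 5 days, so the first check
reports False and the second True.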

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-to-determine-if-repair-need-to-be-run-tp6220005p6221157.html

Re: Ditching Cassandra

I am also interested in knowing when 0.8 will be released. Also, is there
someplace where we can read about the features that will be released in 0.8?
Looks like some major changes are going to come out.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ditching-Cassandra-tp6221436p6221685.html

Re: Central monitoring of Cassandra cluster

2011-03-25 Thread mcasandra

Thanks everyone, this gives me a good head start.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Central-monitoring-of-Cassandra-cluster-tp6205275p6208331.html

Central monitoring of Cassandra cluster

2011-03-24 Thread mcasandra

Can someone share whether they have centralized monitoring for all their
cassandra servers? With many nodes it becomes difficult to monitor them
individually unless we can look at the data in one place. I am looking at
solutions where this can be done. I am currently looking at Cacti but am not
sure how to integrate it with JMX.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Central-monitoring-of-Cassandra-cluster-tp6205275p6205275.html

Re: Active / Active Data Center and RF

2011-03-21 Thread mcasandra

I think what I am trying to ask is this:

What happens if it's RF=3 with network topology (RackInferringSnitch) and 2
copies are stored in the Site A data center and 1 copy in Site B? Now the
client for some reason is directed to the Site B data center and does a
write/update on an existing column. Would Site B now have 2 copies too because
of the network topology (RackInferringSnitch)?



--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Active-Active-Data-Center-and-RF-tp6185528p6192916.html

Re: Active / Active Data Center and RF

2011-03-20 Thread mcasandra

CL is just a way to satisfy consistency but you still want majority of your
reads (preferably) occurring in the same DC.

I don't think that answers my question at all. I understand CL, but I
think I have a more basic and important question about active/active data
centers and the replicas in that very specific scenario, which to me looks
like an issue somehow. Can someone please look at my question specifically
again?




--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Active-Active-Data-Center-and-RF-tp6185528p6191120.html

Active / Active Data Center and RF

2011-03-18 Thread mcasandra

In an active/active data center setup, how do you decide on the right
replication factor? A client may connect and request information from either
data center, so if locally it's RF=3, should it be RF=6 across multiple data
centers in active/active?

Or what happens if it's RF=3 with network topology and 2 copies are stored in
Site A and 1 copy in the Site B data center? Now the client for some reason is
directed to the Site B data center and does a write. Would Site B now have 2
copies and Site A one (or still 2)? It's slowly getting confusing :) I have
several more questions but will start with understanding this first.
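
To make the arithmetic concrete: as far as I understand, the number of
replicas in each data center is fixed by the replication strategy and the row
key, not by which site the client happens to connect to, so a write arriving
at Site B would not change the 2 + 1 split. What changes with the layout is
how many replicas must respond at each consistency level. A small sketch,
using quorum = floor(replicas / 2) + 1; the site names and the helper
functions are illustrative only.

```python
def quorum(replicas):
    """Smallest majority of a replica set."""
    return replicas // 2 + 1


def consistency_requirements(replicas_per_dc):
    """Replica responses needed for QUORUM (cluster-wide) and LOCAL_QUORUM (per DC)."""
    total = sum(replicas_per_dc.values())
    return {
        "QUORUM": quorum(total),
        "LOCAL_QUORUM": {dc: quorum(n) for dc, n in replicas_per_dc.items()},
    }


# RF=3 split as 2 copies in Site A and 1 copy in Site B.
print(consistency_requirements({"SiteA": 2, "SiteB": 1}))

# RF=6 split as 3 copies per site.
print(consistency_requirements({"SiteA": 3, "SiteB": 3}))
```

With the 2 + 1 split, a local quorum in Site B is a single replica, while with
3 + 3 each site can serve local-quorum requests from 2 of its own 3 replicas
and tolerate one local node being down.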

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Active-Active-Data-Center-and-RF-tp6185528p6185528.html

Re: Does concurrent_reads relate to number of drives in RAID0?

2011-03-17 Thread mcasandra

Also, when it comes to the RAID controller there are other options like write
policy, read policy, and cache I/O vs. direct I/O. Is there any preference on
which policies should be chosen?

In our case:

http://support.dell.com/support/edocs/software/svradmin/1.9/en/stormgmt/cntrls.html

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Does-concurrent-reads-relate-to-number-of-drives-in-RAID0-tp6182346p6183075.html

Re: Seed

2011-03-15 Thread mcasandra

That is from the wiki: http://wiki.apache.org/cassandra/StorageConfiguration

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Seed-tp6162837p6174450.html

How does new node know about other hosts and joins the cluster

2011-03-15 Thread mcasandra

I am assuming it is the seed node that tells a new node who the other members
of the cluster are. Does the new node joining the cluster then send a join
message (or something like that) to the other nodes, or is there a master
coordinator (like in a JBoss cluster) that tells the other nodes that a new
node has joined?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-does-new-node-know-about-other-hosts-and-joins-the-cluster-tp6174900p6174900.html

Re: Seed


Tyler Hobbs-2 wrote:
 
 Seeds:
 Never use a node's own address as a seed if you are bootstrapping it by
 setting autobootstrap to true! 
 

I came across this on the wiki. Can someone please help me understand this
with an example?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Seed-tp6162837p6169871.html

Linux HugePages and mmap

Currently, in cassandra.yaml disk_access_mode is set to auto but the
recommendation seems to be to use 'mmap_index_only'.

If we use HugePages then do we still need to worry about setting
disk_access_mode to mmap? I am planning to enable HugePages and use
-XX:+UseLargePages option in JVM. I had a very good experience using
HugePages with Oracle.


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Linux-HugePages-and-mmap-tp6170193p6170193.html

Re: Linux HugePages and mmap


Jonathan Ellis-3 wrote:
 
 Wrong.  The recommendation is to leave it on auto.
 
This is where I see mmap recommended for the index:
http://wiki.apache.org/cassandra/StorageConfiguration



Jonathan Ellis-3 wrote:
 
 HugePages has nothing to do with disk access mode.
 
Can you explain a little more? Isn't mmap pinning the process memory in RAM
similar to HugePages?


--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Linux-HugePages-and-mmap-tp6170193p6170423.html

Re: Linux HugePages and mmap