If your shared disk is fast enough to handle I/O requests from
multiple Cassandra nodes, you can do it in theory. But the disk will be
a single point of failure in your system.
For optimal performance, each node should have at least two HDDs, one for
the commitlog and one for data.
maki
2012/4/26
If you set the TRACE level for IncomingTCPConnection, the message "Version
is now ..." will be printed for every inter-Cassandra message received
by the node, including Gossip.
Enabling this log under high traffic will saturate the I/O of your log disk by itself.
You had better inspect nodetool tpstats,
http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1
Hope this helps.
maki
2012/4/12 puneet loya puneetl...@gmail.com:
what is composite columns?
super column under it can contain just multiple columns.. will composite
columns be useful?
On Thu, Apr 12, 2012 at 3:50
You can configure sstable size by sstable_size_in_mb parameter for LCS.
The default value is 5MB.
You should also check with nodetool tpstats and compactionstats that you
don't have many pending compaction tasks.
If you have enough IO throughput, you can increase
for details, open conf/log4j-server.properties and add the following configuration:
log4j.logger.org.apache.cassandra.db.compaction.LeveledManifest=DEBUG
fyi.
maki
2012/4/10 Jonathan Ellis jbel...@gmail.com:
CompactionExecutor doesn't have level information available to it; it
just compacts the
Check your cassandra log.
If you can't find anything interesting in the log, set the Cassandra log level
to DEBUG and run your program again.
maki
2012/4/10 puneet loya puneetl...@gmail.com:
hi,
Sorry, I posted the port as 7000. I'm using 9160 but still get the same
error.
Cannot read, Remote side has
You can find it in the bin directory of the binary distribution.
maki
2012/3/27 puneet loya puneetl...@gmail.com:
How do I use the cqlsh command line utility?
I'm using Cassandra 1.0.8. Does the cqlsh command line utility come with the
download of Cassandra 1.0.8 or do we have to do it
:)
Reply
On Wed, Mar 28, 2012 at 8:34 AM, Maki Watanabe watanabe.m...@gmail.com
wrote:
You can find it in the bin directory of the binary distribution.
maki
2012/3/27 puneet loya puneetl...@gmail.com:
How do I use the cqlsh command line utility?
I'm using Cassandra 1.0.8. Does cqlsh
auto_bootstrap has been removed from cassandra.yaml and has always been
enabled since 1.0.
fyi.
maki
2012/3/26 R. Verlangen ro...@us2.nl:
Yes, you can add nodes to a running cluster. It's very simple: configure
the cluster name and seed node(s) in cassandra.yaml, set auto_bootstrap to
true and start
What version are you using?
Anyway, try nodetool repair and compact.
maki
2012/3/26 Tamar Fraenkel ta...@tok-media.com
Hi!
I created an Amazon ring using the DataStax image and started filling the db.
The cluster seems unbalanced.
nodetool ring returns:
Address DC Rack
user ---n:m--- role ---n:m--- resource
It can work, but as you know Cassandra is not an RDBMS, so RDBMS-style data
modeling may not fit in production. (It depends on your performance
requirements. I'm not sure.)
In general, you had better design your schema from your access patterns.
maki
2012/3/20
Snapshot files are hard links to the original sstables.
As you know, on Windows you can't delete files opened by another process.
If you try to delete the hard link, Windows thinks you are trying to delete
the sstables in production.
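To illustrate the hard-link point, here is a minimal Python sketch. It is not Cassandra code, and the file names are made up for the example. It only shows that a hard link is a second name for the same file, so the OS treats both names as one file.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, hard-link it, and report what the OS sees."""
    with tempfile.TemporaryDirectory() as tmp:
        sstable = os.path.join(tmp, "Example-Data.db")    # hypothetical name
        snapshot = os.path.join(tmp, "snapshot-Data.db")  # hypothetical name
        with open(sstable, "w") as f:
            f.write("immutable sstable contents")

        # Conceptually what taking a snapshot does: add a second name.
        os.link(sstable, snapshot)
        same_file = os.path.samefile(sstable, snapshot)
        link_count = os.stat(sstable).st_nlink

        # Removing the snapshot name leaves the original name and data intact.
        os.remove(snapshot)
        original_survives = os.path.exists(sstable)
        return same_file, link_count, original_survives


if __name__ == "__main__":
    print(hard_link_demo())
```

On a POSIX system the removal succeeds because only a name is dropped. The Windows behavior described above comes from the file being held open by the running process, which this sketch does not reproduce.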
maki
2012/3/14 Jim Newsham jnews...@referentia.com:
Hi,
I'm using
Have you built and installed SimpleAuthenticator from the source repository?
It is not included in the binary kit.
maki
2012/3/14 Sabbiolina sabbiol...@gmail.com:
HI. I followed this:
To set up simple authentication and authorization
1. Edit cassandra.yaml, setting
Do you use the same storage_port across the 3 nodes?
Can you access the storage_port of the seed node from the last (failed) node?
2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
I was able to successfully join a node to an already existing one-node cluster
(without giving any initial_token),
Fixed in 1.0.9, 1.1.0
https://issues.apache.org/jira/browse/CASSANDRA-3989
You had better avoid using cleanup/scrub/upgradesstables on 1.0.7 if you
can, though
it will not corrupt sstables.
2012/3/14 Thomas van Neerijnen t...@bossastudios.com:
Hi all
I am trying to run a cleanup on a
Try:
% telnet seed_node 7000
on the 3rd node and see what it says.
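If telnet is not installed, the same reachability check can be sketched in Python. This is only an illustration: seed_node is a placeholder host name as in the command above, and 7000 is the default storage_port mentioned in this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Usage on the 3rd node ("seed_node" is a placeholder for your seed's address):
#   can_connect("seed_node", 7000)
```

A False result only says the port is unreachable from this host; it does not say whether the cause is a firewall, a wrong address, or Cassandra not listening.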
maki
2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
I am using storage port 7000 (default) across all nodes.
-Original Message-
From: Maki Watanabe [mailto:watanabe.m...@gmail.com]
Sent: Wednesday, March
Fixed in 1.0?
https://issues.apache.org/jira/browse/CASSANDRA-3176
2012/3/2 Radim Kolar h...@sendmail.cz:
Can something be done to remove these empty delivery attempts from the log?
It's just a tombstoned row.
[default@system] list HintsColumnFamily;
Using default limit of 100
initially..
Could you guys also recommend some minimum memory to start with ? Of course
that would depend on my workload as well, but that's why I am asking for the
min
On Wed, Feb 29, 2012 at 7:40 AM, Maki Watanabe watanabe.m...@gmail.com
wrote:
If you run your service with 2 node and RF=2
DataStax does not recommend running major compaction now:
http://www.datastax.com/docs/1.0/operations/tuning
But if you can afford it, major compaction will improve read latency as you see.
Major compaction is expensive, so you will not want to run it during
high traffic hours. And you should not
If you have 3 nodes with RF=3, you can continue the service on Cassandra even if
one of the nodes fails (by hardware or software failure).
Another benefit is that you can shut down one node for maintenance or patching
without service interruption.
If you run your service with 2 nodes and RF=2, your
If you run your service with 2 nodes and RF=2, your data will be replicated but
your service will not be redundant. (You can't stop either of the nodes.)
If your service doesn't need strong consistency (that is, it allows Cassandra to
return old data after a write, and possibly to lose writes), you can use CL=ONE
for
I've verified it in the source: deliverHintsToEndpointInternal in
HintedHandOffManager.java
Yes, it adds a random delay before HH delivery.
2012/2/24 Todd Burruss bburr...@expedia.com:
if I remember correctly, cassandra has a random delay in it so hint
delivery is staggered and does not overwhelm
to get propagated to all apache mirrors).
--
Sylvain
On Wed, Feb 22, 2012 at 2:46 AM, Maki Watanabe watanabe.m...@gmail.com
wrote:
The link is wrong.
http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.0/apache-cassandra-1.1.0-beta1-bin.tar.gz
Should be:
http://www.apache.org/dyn
The link is wrong.
http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.0/apache-cassandra-1.1.0-beta1-bin.tar.gz
Should be:
http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.0-beta1/apache-cassandra-1.1.0-beta1-bin.tar.gz
2012/2/21 Sylvain Lebresne sylv...@datastax.com:
The
I found I can get the info with git tag.
I had better learn more git before switching...
2012/2/13 Maki Watanabe watanabe.m...@gmail.com:
Perfect! Thanks.
2012/2/13 Dave Brosius dbros...@mebigfatguy.com:
Based on the tags listed here:
http://git-wip-us.apache.org/repos/asf?p=cassandra.git
I
Updated http://wiki.apache.org/cassandra/HowToBuild .
2012/2/13 Maki Watanabe watanabe.m...@gmail.com:
I found I can get the info with git tag.
I had better learn more git before switching...
2012/2/13 Maki Watanabe watanabe.m...@gmail.com:
Perfect! Thanks.
2012/2/13 Dave Brosius dbros
Hello,
The current trunk limits the value of phi_convict_threshold to between 5 and 16
in DatabaseDescriptor.java.
And phi value is calculated in FailureDetector.java as
PHI_FACTOR x time_since_last_gossip / mean_heartbeat_interval
And the PHI_FACTOR is a predefined value:
PHI_FACTOR = 1 / Log(10) =~
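The formula quoted above can be sketched in Python as follows. This is an illustration of the arithmetic only, not the FailureDetector code. It assumes Log means the natural logarithm (as with Java's Math.log), and the threshold of 8 and the strict comparison are assumptions made for the example.

```python
import math

# PHI_FACTOR as written in the post, assuming Log is the natural logarithm.
PHI_FACTOR = 1.0 / math.log(10)


def phi(time_since_last_gossip_ms, mean_heartbeat_interval_ms):
    """phi = PHI_FACTOR x time_since_last_gossip / mean_heartbeat_interval."""
    return PHI_FACTOR * time_since_last_gossip_ms / mean_heartbeat_interval_ms


def is_convicted(time_since_last_gossip_ms, mean_heartbeat_interval_ms,
                 phi_convict_threshold=8):
    """True once phi exceeds the threshold (8 is an assumed example value)."""
    current = phi(time_since_last_gossip_ms, mean_heartbeat_interval_ms)
    return current > phi_convict_threshold


if __name__ == "__main__":
    # With a mean heartbeat interval of 1 second (made-up figure):
    for silence_ms in (5000, 18000, 19000):
        print(silence_ms, round(phi(silence_ms, 1000), 3),
              is_convicted(silence_ms, 1000))
```

Under these assumptions, a node is convicted only after roughly 18.4 mean heartbeat intervals of silence when the threshold is 8, because 8 / PHI_FACTOR is about 18.4.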
Is there any significant difference in the number of sstables on each node?
2012/1/18 Marcel Steinbach marcel.steinb...@chors.de:
We are running regular repairs, so I don't think that's the problem.
And the data dir sizes match approx. the load from the nodetool.
Thanks for the advice, though.
Small correction:
The token range for each node is (Previous_token, My_Token].
"(" means exclusive and "]" means inclusive.
So N1 is responsible for X+1 to A in the following case.
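The (Previous_token, My_Token] rule can be sketched as a lookup on a sorted ring. This is a minimal illustration with small integer tokens and made-up node names, not Cassandra's implementation.

```python
from bisect import bisect_left


def owner_of(token, ring):
    """Return the node responsible for a token.

    ring is a list of (node_token, node_name) pairs sorted by token.
    Each node owns (previous_token, node_token]; tokens beyond the
    largest node token wrap around to the first node.
    """
    tokens = [node_token for node_token, _ in ring]
    index = bisect_left(tokens, token)  # first node_token >= token
    return ring[index % len(ring)][1]


if __name__ == "__main__":
    ring = [(10, "N1"), (20, "N2"), (30, "N3")]  # hypothetical ring
    for t in (5, 10, 11, 20, 31):
        print(t, owner_of(t, ring))
```

In this example token 10 belongs to N1 (inclusive upper bound) while token 11 already belongs to N2 (exclusive lower bound), and token 31 wraps around to N1.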
maki
2012/1/11 Roland Gude roland.g...@yoochoose.com:
Each node in the cluster is assigned a token (can be done
project switched to git from svn.
See Source control section of http://cassandra.apache.org/download/ .
Regards,
Yuki
--
Yuki Morishita
On Thursday, January 5, 2012 at 7:59 PM, Maki Watanabe wrote:
Sorry, ignore my reply.
I had the same result with import. (1 error in unit test code, many
How about using File-Import... rather than File-New Java Project?
After extracting the source and running ant build and ant generate-eclipse-files:
1. File-Import...
2. Choose Existing Projects into Workspace...
3. Choose your source directory as the root directory and then push Finish
2012/1/6 bobby saputra
Sorry, ignore my reply.
I had the same result with import. (1 error in unit test code, many warnings)
2012/1/6 Maki Watanabe watanabe.m...@gmail.com:
How about using File-Import... rather than File-New Java Project?
After extracting the source and running ant build and ant generate-eclipse-files:
1
I missed the news.
How does nodetool move work in recent versions (0.8.x or later)?
Does it just stream the appropriate range of data between nodes?
2011/11/10 Peter Schuller peter.schul...@infidyne.com:
Keep in mind that if you're using an older version of Cassandra a move
is actually a decommission
Hello, I'm writing a draft of the CassandraCli wiki page (sorry for being late,
aaron), and found 2 problems in the schema-sample.txt shipped with the 1.0.0
release.
cassandra-cli prints the following warning and error on loading the schema.
WARNING: [{}] strategy_options syntax is deprecated, please use {}
Konstantin,
You can modify the RF of the keyspace with the following command in cassandra-cli:
update keyspace KEYSPACE_NAME with strategy_options = {replication_factor:N};
When you decrease RF, you need to run nodetool cleanup on each node.
When you increase RF, you need to run nodetool repair on
Hello aaron,
I raise my hand too.
If you have a to-do list for the wiki, please let us know.
maki
2011/10/10 aaron morton aa...@thelastpickle.com:
Hi there,
The devs have been very busy and Cassandra 1.0 is just around the corner
and full of new features. To celebrate I'm trying to give the
The book is a bit outdated now.
You had better use cassandra-cli to define your application schema.
Please refer to conf/schema-sample.txt and the help in cassandra-cli.
% cassandra-cli
[default@unknown] help;
[default@unknown] help create keyspace;
[default@unknown] help create column family;
You have a chance to write one on your own. I'll buy one :-)
maki
2011/9/22 Sajith Kariyawasam saj...@gmail.com:
Thanks Maki.
If you come across any other book covering the latest Cassandra
versions, please let me know.
On Thu, Sep 22, 2011 at 12:03 PM, Maki Watanabe watanabe.m...@gmail.com
This kind of information is very helpful.
Thank you for sharing your experience.
maki
2011/7/27 Teijo Holzer thol...@wetafx.co.nz:
Hi,
I thought I'd share the following with this mailing list as a number of other
users seem to have had similar problems.
We have the following set-up:
OS:
Offset represents different units for each column.
In the SSTables column, you can see the following histogram:
20 4291637
24 28680590
29 3876198
It means 4291637 of your read operations required 20 SSTables to read,
28680590 ops required 24, and so on.
In the Write/Read latency columns, Offset
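As a small illustration of reading the SSTables column, the three rows quoted above can be summarized in Python. The parsing helper is hypothetical and only handles "offset count" pairs; the figures are the ones from the post.

```python
SAMPLE = """
20 4291637
24 28680590
29 3876198
"""


def parse_histogram(text):
    """Parse lines of '<offset> <count>' into a list of integer pairs."""
    rows = []
    for line in text.strip().splitlines():
        offset, count = line.split()
        rows.append((int(offset), int(count)))
    return rows


def summarize(rows):
    """Return (total reads, mean SSTables touched per read) for the rows."""
    total = sum(count for _, count in rows)
    weighted = sum(offset * count for offset, count in rows)
    return total, weighted / total


if __name__ == "__main__":
    total_reads, mean_sstables = summarize(parse_histogram(SAMPLE))
    print(total_reads, round(mean_sstables, 2))
```

The mean covers only the three rows shown here, so it says nothing about rows of the histogram that were not quoted.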
These 0 byte files with -Compacted suffix indicate that the
associated sstables can be removed.
In the current version, Cassandra deletes compacted sstables at full GC and
on startup.
maki
2011/7/14 Sameer Farooqui cassandral...@gmail.com:
Running Cassandra 0.8.1. Ran major compaction via:
sudo
I'll write a FAQ for this topic :-)
maki
2011/7/13 Peter Schuller peter.schul...@infidyne.com:
To be sure that I didn't misunderstand (English is not my mother tongue) here
is what the entire repair paragraph says ...
Read it, I maintain my position - the book is wrong or at the very
least
Consistency and availability trade off against each other.
If you use RF=7 + CL=ONE, your reads/writes will succeed as long as one
replica node is alive while the data is being replicated to the 7 nodes.
Of course, you may read old data in this case.
If you need strong consistency, you must use CL=QUORUM.
Cassandra has an authentication interface, but doesn't have authorization.
So you need to implement authorization in your application layer.
maki
2011/7/11 David McNelis dmcne...@agentisenergy.com:
I've been looking in the documentation and haven't found anything about
this... but is there
A little addendum
Key := Your data that identifies a row
Token := Index on the ring calculated from the Key. The calculation is
defined by the partitioner.
You can look up the responsible nodes (endpoints) for a specific key with
the JMX getNaturalEndpoints interface.
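As a rough illustration of the Key to Token step, here is a Python sketch. It assumes RandomPartitioner, which to my knowledge takes the MD5 digest of the key and uses the absolute value of the resulting signed big integer; that detail is not from this post, so treat it as an assumption and verify against your version's source.

```python
import hashlib


def random_partitioner_token(key_bytes):
    """Sketch of a RandomPartitioner-style token (assumed: abs of signed MD5)."""
    digest = hashlib.md5(key_bytes).digest()
    # Interpret the 16 digest bytes as a signed big-endian integer, then abs().
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


if __name__ == "__main__":
    print(random_partitioner_token(b"example-row-key"))  # hypothetical key
```

The token alone does not name the endpoints; mapping a token to nodes still depends on the ring and the replication settings, which is what getNaturalEndpoints resolves for you.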
maki
2011/6/24 aaron morton
But decreasing min_compaction_threshold will affect minor
compaction frequency, won't it?
maki
2011/6/10 Terje Marthinussen tmarthinus...@gmail.com:
bug in the 0.8.0 release version.
Cassandra splits the sstables depending on size and tries to find (by
default) at least 4 files of
You can find useful information in:
http://www.datastax.com/docs/0.8/operations/scheduled_tasks
sstables are immutable. Once an sstable is written to disk, it won't be updated.
When you take a snapshot, the tool makes hard links to the sstable files.
After certain time, you will have some times of memtable
getNaturalEndpoints tells you which key will be stored on which nodes,
but we can't force Cassandra to store a given key on specific nodes.
maki
2011/6/6 mcasandra mohitanch...@gmail.com:
Khanh Nguyen wrote:
Is there a way to tell where a piece of data is stored in a cluster?
For example, can
You may be able to do it with the Order Preserving Partitioner by
building a key-to-node mapping before storing data, or you may need your
own custom Partitioner. Please note that you are responsible for distributing
load between nodes in this case.
From an application design perspective, it is not clear for
Amrita,
I recommend that you take a bit more time to investigate, think about, and
struggle with problems by yourself before posting questions.
It will increase your technical skill, and help you a lot when you
face a really serious problem in the future.
For the current problem, if I were you, I'd
What replication factor did you set for the keyspace?
If the RF is 2, your data should be replicated to both nodes. If
the RF is 1, you will lose half of the data when node A is down.
maki
2011/5/31 Preston Chang zhangyf2...@gmail.com:
Hi,
I have a cluster with two nodes (node A
http://thobbs.github.com/phpcassa/installation.html
They also have a mailing list and an IRC channel.
http://thobbs.github.com/phpcassa/
maki
2011/5/31 Amrita Jayakumar amritajayakuma...@gmail.com:
I have log files of the format id key value. I want to load these
files into cassandra using
at org.apache.log4j.Category.info(Category.java:666)
It seems that your Cassandra can't write its log because the device is full.
Check where your Cassandra log is written. The log file path is
configured by the log4j.appender.R.File property
in conf/log4j-server.properties.
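A quick way to check this is sketched below in Python: read the log4j.appender.R.File property and look at the free space of the directory it points to. The helper names are hypothetical, the parser only handles simple key=value lines, and the sample log path is an assumption made for the example.

```python
import os
import shutil


def read_property(properties_text, key):
    """Return the value of key from simple 'key=value' properties text."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def free_bytes(log_path):
    """Free bytes on the filesystem holding the log file's directory."""
    directory = os.path.dirname(log_path) or "."
    return shutil.disk_usage(directory).free


if __name__ == "__main__":
    # The path below is an assumed example value, not taken from the post.
    sample = "# sample\nlog4j.appender.R.File=/var/log/cassandra/system.log\n"
    print(read_property(sample, "log4j.appender.R.File"))
```

In practice you would pass the contents of conf/log4j-server.properties to read_property and then call free_bytes on the returned path.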
maki
2011/6/1 Bryce Godfrey
Did you read Jonathan's reply?
If you can't understand what the README says, please let us know where you
are stuck.
maki
2011/5/31 Amrita Jayakumar amritajayakuma...@gmail.com:
can anyone help me how to start with cassandra??? starting from the
basics???
Thanks and Regards,
Amrita
On
chown -R `whoami` /var/lib/cassandra
Now are there any configuration settings to be made in
apache-cassandra-0.7.6-2/conf/ before I fire
bin/cassandra -f ???
If so, which ones should I change???
Thanks and Regards,
Amrita
On Tue, May 31, 2011 at 10:00 AM, Maki Watanabe
- Consistency Level
- Distributed Delete
- Compaction
before going further.
maki
On Tue, May 31, 2011 at 10:42 AM, Maki Watanabe watanabe.m...@gmail.com
wrote:
You can just start
bin/cassandra -f
.
Readme.txt says:
Now that we're ready, let's start it up!
* bin/cassandra -f
I assume your question is about how CL affects throughput.
In theory, I believe CL will not affect the throughput of the
Cassandra system.
At any CL, the coordinator node needs to submit write/read requests
according to the RF specified for the KS.
But for latency, CL will affect
It depends on which CL you actually use for your operations.
Your RF is 2, so if you read/write with CL=ALL, your r/w will
always be consistent. If you read with CL=ONE, you may read old
data at any time, whether or not you decommission. CL=QUORUM on RF=2 is
semantically identical to CL=ALL.
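The last point follows from the usual quorum size of floor(RF / 2) + 1. A minimal Python sketch of that arithmetic (the function names are made up for the example):

```python
def quorum(rf):
    """Number of replicas that form a quorum for a replication factor."""
    return rf // 2 + 1


def replicas_required(cl, rf):
    """Replicas that must respond for consistency levels ONE, QUORUM, ALL."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[cl]


if __name__ == "__main__":
    for rf in (2, 3):
        print(rf, [replicas_required(cl, rf) for cl in ("ONE", "QUORUM", "ALL")])
```

With RF=2 a quorum is 2 replicas, the same as ALL, so QUORUM cannot tolerate a node being down. With RF=3 a quorum is still 2, which is why one node can be down without failing QUORUM operations.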
Just FYI for beginners like me: I've also written it in Jython.
Getting attributes is easier than invoking operations. I feel
Jython will be a good option for creating custom monitoring/management
tools.
#!/usr/bin/jython
#
# *** This is JYTHON script. You can't run it on CPython. ***
import
, expecting that to be sufficient.
I will try both set to ALL and see if I get better consistency.
-Ryan
On May 14, 2011, at 4:41 AM, Maki Watanabe wrote:
It depends on which CL you actually use for your operations.
Your RF is 2, so if you read/write with CL=ALL, your r/w will
always be
HH will be stored on one of the live replica nodes. It is just a hint,
rather than data to be replicated.
maki
2011/5/12 Anurag Gujral anurag.guj...@gmail.com:
Hi All,
I have two questions:
a) Is there a way to turn on and off hinted handoff per keyspace rather
than for multiple
/11 Jonathan Ellis jbel...@gmail.com:
Thanks!
On Wed, May 11, 2011 at 10:20 AM, Maki Watanabe watanabe.m...@gmail.com
wrote:
Add a new faq:
http://wiki.apache.org/cassandra/FAQ#jconsole_array_arg
2011/5/11 Nick Bailey n...@datastax.com:
Yes.
On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe
correct name, the tool runs fine.
Thanks.
2011/5/13 Alex Araujo cassandra-us...@alex.otherinbox.com:
On 5/13/11 10:08 AM, Maki Watanabe wrote:
I wrote a small JMX client to invoke getNaturalEndpoints.
It works fine in my test environment, but throws an NPE for the keyspace we
will use for our
Hello,
This is a question about jconsole rather than Cassandra: how can I invoke
getNaturalEndpoints with jconsole?
org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints
I want to run this method to find the nodes which are responsible for storing
data for a specific row key.
I can find
array as a parameter and jconsole doesn't
provide a way for inputting a byte array. You might be able to use the
thrift call 'describe_ring' to do what you want though. You will have
to manually hash your key to see what range it falls in however.
On Wed, May 11, 2011 at 6:14 AM, Maki Watanabe
Add a new faq:
http://wiki.apache.org/cassandra/FAQ#jconsole_array_arg
2011/5/11 Nick Bailey n...@datastax.com:
Yes.
On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe watanabe.m...@gmail.com
wrote:
Thanks,
So my options are:
1. Write Thrift client code to call describe_ring with hashed
Done. Thank you for your comment.
maki
2011/4/24 aaron morton aa...@thelastpickle.com:
May also want to add that seed nodes do not auto bootstrap.
Thanks
Aaron
storage_port: Used for Gossip and data exchange. So in your words, it
is the port for the seeds.
You CAN change the storage_port, but all nodes in your ring need to
use the same storage_port number.
That's why you need a different IP address for each node.
rpc_port: Used for Thrift which the Cassandra
Hello,
I found that the Gossiper is initialized with the seconds since the Unix epoch
(= System.currentTimeMillis() / 1000)
as the HeartBeatState generation.
Do we get the same generation value on a very quick restart? Is there any risk here?
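The arithmetic behind the question can be sketched in Python. The timestamps are made-up values, and the sketch only shows the integer division; whether an equal generation actually causes a problem is the open question asked above.

```python
def generation(current_time_millis):
    """Seconds since the Unix epoch, as in System.currentTimeMillis() / 1000."""
    return current_time_millis // 1000


if __name__ == "__main__":
    first_start = 1303300000100     # hypothetical start time in milliseconds
    quick_restart = 1303300000900   # 800 ms later, within the same second
    later_restart = 1303300001100   # one second after the first start
    print(generation(first_start) == generation(quick_restart))
    print(generation(first_start) == generation(later_restart))
```

Two starts inside the same wall-clock second produce the same generation value, while any restart that crosses a second boundary produces a larger one.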
regards,
maki
gossip message
- To a known live node (picked randomly)
- To a known dead node (based on some probability)
- To a seed node (based on some probability)
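The three targets listed above can be sketched as one gossip round in Python. This is not the Gossiper code: the function name is hypothetical, and the two probabilities are passed in as parameters because the text only says "some probability".

```python
import random


def pick_gossip_targets(live, dead, seeds, p_dead, p_seed, rng=random):
    """Pick the nodes to gossip with in one round.

    Always one random live node (if any), plus a dead node and a seed
    node, each chosen only with the given probability.
    """
    targets = []
    if live:
        targets.append(rng.choice(live))
    if dead and rng.random() < p_dead:
        targets.append(rng.choice(dead))
    if seeds and rng.random() < p_seed:
        targets.append(rng.choice(seeds))
    return targets


if __name__ == "__main__":
    # Addresses and probabilities are made-up example values.
    print(pick_gossip_targets(["10.0.0.1", "10.0.0.2"], ["10.0.0.3"],
                              ["10.0.0.1"], p_dead=0.3, p_seed=0.5))
```

Passing a seeded random.Random instance as rng makes a round reproducible when experimenting with different probabilities.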
Thanks,
Naren
On Wed, Apr 20, 2011 at 7:13 PM, Maki Watanabe watanabe.m...@gmail.com
wrote:
I made some self-answered FAQs on seeds after
I made some self-answered FAQs on seeds after reading the wiki and code.
If I misunderstood something, please point it out to me.
== What are seeds? ==
Seeds, or seed nodes, are the nodes which new nodes refer to on
bootstrap to learn ring information.
When you add a new node to the ring, you need to specify
127.0.0.2 to 127.0.0.5 are valid IP addresses. Those are just alias
addresses for your loopback interface.
Verify:
% ifconfig -a
127.0.0.0/8 is for loopback, so you can't connect to these addresses from
remote machines.
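If you want to confirm that an address falls in the loopback block without reading ifconfig output, Python's standard ipaddress module can tell you. A minimal sketch:

```python
import ipaddress


def is_loopback(address):
    """True if the address belongs to the loopback range (127.0.0.0/8 for IPv4)."""
    return ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    for addr in ("127.0.0.2", "127.0.0.5", "192.168.0.1"):
        print(addr, is_loopback(addr))
```

Any address for which this returns True is only reachable from the machine itself, which is why a remote monitoring host cannot connect to it directly.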
You may be able to configure SSH port forwarding from your monitoring
host to
Hello Mark,
Disable Ruby's verbose mode (-w or $VERBOSE).
Or, you can clean up the Ruby Thrift library yourself.
2011/4/12 Mark Lilback mlilb...@stat.wvu.edu:
I'm trying to connect to Cassandra from a Ruby script. I'm using rvm, and
made a clean install of Ruby 1.9.2 and then did gem install
in all sstable in
gc_grace_period, it is safe to run nodetool compact at least once in
gc_grace_period, isn't it?
maki
2011/4/6 Sylvain Lebresne sylv...@datastax.com:
On Tue, Apr 5, 2011 at 12:01 AM, Maki Watanabe watanabe.m...@gmail.com
wrote:
Hello,
On reading O'Reilly's Cassandra book
Hello,
After reading O'Reilly's Cassandra book and the wiki, I'm a bit confused about
nodetool repair and compact.
I believe we need to run nodetool repair regularly, and that it synchronizes
all replica nodes in the end.
According to the documents, repair also invokes major compaction
(as a side effect?).
Will
ant on my command line had completed without error.
Next I tried to build cassandra 0.7.4 in eclipse, and had luck.
So I'll explore cassandra code with eclipse, rather than IDEA.
maki
2011/3/31 Maki Watanabe watanabe.m...@gmail.com:
Not yet. I'll try.
maki
2011/3/31 Tommy Tynjä
Would the Cassandra team consider adding an alias for the nodetool
repair command?
I mean, the word "repair" scares some people.
When I say we need to run nodetool repair regularly on Cassandra
nodes, they think "OH... Those are broken so often!".
So if I could say it with a softer word, e.g. sync, tune,
A client thread needs to wait for each response, whereas the server can
handle multiple requests simultaneously.
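A rough way to see why the stress tools use many threads: if each client thread blocks for one round trip per request, a single thread can issue at most 1000 / latency_ms requests per second. The sketch below is only this back-of-the-envelope arithmetic, with a made-up latency figure, and it ignores any server-side limits.

```python
def max_requests_per_second(threads, latency_ms):
    """Upper bound on client request rate when every thread blocks per request."""
    return threads * 1000 / latency_ms


if __name__ == "__main__":
    # 5 ms per round trip is an assumed example value.
    for threads in (1, 10, 50):
        print(threads, max_requests_per_second(threads, 5))
```

Under this model, one thread at 5 ms per round trip tops out at 200 requests per second, so the client needs more threads just to offer the server enough concurrent work.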
2011/3/22 Sheng Chen chensheng2...@gmail.com:
I am just wondering, why the stress test tools (python, java) need more
threads ?
Is the bottleneck of a single thread in the client, or in
How did you define your Keyspace?
As you may know, in Cassandra, the replication factor is defined as an
attribute of the Keyspace.
And what do you mean:
However replication never happened.
I can't get data I set at other node.
What did you do on cassandra, and what did you get in response?
maki
Refer to:
http://wiki.apache.org/cassandra/StorageConfiguration
You can specify the data directories with the following parameters in
storage-conf.xml (or cassandra.yaml in 0.7+).
commitlog_directory : where the commitlog will be written
data_file_directories : data files
saved_caches_directory : saved
According to the Cassandra Wiki, the best strategy is no swap at all.
http://wiki.apache.org/cassandra/MemtableThresholds#Virtual_Memory_and_Swap
2011/3/16 ruslan usifov ruslan.usi...@gmail.com:
Dear community!
Please share your settings for swap on a Linux box
--
w3m
Hello Bob,
1. What does lsof say about port TCP:9160?
$ lsof -i TCP:9160
2. Have you tried changing rpc_port in conf/cassandra.yaml?
ex. rpc_port: 19160
maki
2011/3/12 Jeremy Hanna jeremy.hanna1...@gmail.com:
I don't know if others have asked this but do you have a firewall running
that would
Hello,
According to the Wiki/StorageConfiguration page, auto_bootstrap is
described as below:
auto_bootstrap
Set to 'true' to make new [non-seed] nodes automatically migrate the
right data to themselves. (If no InitialToken is specified, they will
pick one such that they will get half the
Thx!
2011/3/8 aaron morton aa...@thelastpickle.com:
AFAIK yes. The node marks itself as bootstrapped whenever it starts, and
will not re-bootstrap once that is set.
More info here
http://wiki.apache.org/cassandra/Operations#Bootstrap
Hope that helps.
Aaron
On 8/03/2011, at 9:35 PM, Maki
Hello folks,
I'm translating the Wiki pages into Japanese now.
I found that all of the images in MemtableThresholds are broken:
http://wiki.apache.org/cassandra/MemtableThresholds
Can anyone fix the links?
Thanks
--
maki
Ok, I got it.
2011/2/21 13:37 morishita.y...@future.co.jp:
I think apache infra team is working on the issue...
https://issues.apache.org/jira/browse/INFRA-3352
-Original Message-
From: Maki Watanabe [mailto:watanabe.m...@gmail.com]
Sent: Monday, February 21, 2011 1:20 PM