Advantage of pre-defining column metadata

2012-08-28 Thread A J
For static column family what is the advantage in pre-defining column metadata ? I can see ease of understanding type of values that the CF contains and that clients will reject incompatible insertion. But are there any major advantages in terms of performance or something else that makes it

Astyanax - build

2012-09-14 Thread A J
Hi, I am new to Java and trying to get the Astyanax client running for Cassandra. I downloaded Astyanax from https://github.com/Netflix/astyanax. How do I compile the source code in a simple fashion from the Linux command line? Thanks.

Astyanax error

2012-09-17 Thread A J
Hello, I am trying to retrieve a list of Column Names (that are defined as Integer) from a CF with RowKey as Integer as well. (I don't care for the column values that are just nulls) Following is a snippet of my Astyanax code. I am getting 0 columns but I know the key that I am querying contains a

Specifying exact nodes for Consistency Levels

2011-05-02 Thread A J
Is it possible in some way to specify what specific nodes I want to include (or exclude) from the Consistency Level fulfillment ? Example, I have a cluster of 4 nodes (n1,n2,n3 and n4) and set N=4. I want to set W=3 and want to ensure that it is n1,n2 and n3 only that are used to satisfy w=3 (i.e.

Backup full cluster

2011-05-03 Thread A J
Snapshot runs on a local node. How do I ensure I have a 'point in time' snapshot of the full cluster ? Do I have to stop the writes on the full cluster and then snapshot all the nodes individually ? Thanks.

Force a node to form part of quorum

2011-06-15 Thread A J
Is there a way to favor a node to always participate (or never participate) towards fulfillment of read consistency as well as write consistency ? Thanks AJ

Re: Force a node to form part of quorum

2011-06-16 Thread A J
hits see https://github.com/apache/cassandra/blob/cassandra-0.8.0/conf/cassandra.yaml#L308 Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 16 Jun 2011, at 05:58, A J wrote: Is there a way to favor a node to always

Auto compaction to be staggered ?

2011-06-27 Thread A J
Is there an enhancement on the roadmap to stagger the auto compactions on different nodes, to avoid more than one node compacting at any given time (or as few nodes as possible to compact at any given time). If not, any workarounds ? Thanks.

Clock skew

2011-06-27 Thread A J
During writes, the timestamp field in the column is the system-time of that node (correct me if that is not the case and the system-time of the co-ordinator is what gets applied to all the replicas). During reads, the latest write wins. What if there is a clock skew ? It could lead to a stale

Re: Clock skew

2011-06-28 Thread A J
://ria101.wordpress.com/2011/02/08/cassandra-the-importance-of-system-clocks-avoiding-oom-and-how-to-escape-oom-meltdown/ Hope this helps Dominic On 27 June 2011 23:03, A J s5a...@gmail.com wrote: During writes, the timestamp field in the column is the system-time of that node (correct me

Data storage security

2011-06-29 Thread A J
Are there any options to encrypt the column families when they are stored in the database. Say in a given keyspace some CF has sensitive info and I don't want a 'select *' of that CF to layout the data in plain text. Thanks.

Chunking if size > 64MB

2011-06-29 Thread A J
From what I read, Cassandra allows a single column value to be up to 2GB but would chunk the data if greater than 64MB. Is the chunking transparent to the application or does the app need to know if/how/when the chunking happened for a specific column value that happened to be > 64MB? Thank you.

api to extract gossiper results

2011-06-29 Thread A J
Cassandra uses accrual failure detector to interpret the gossips. Is it somehow possible to extract these (gossip values and results of the failure detector) in an external system ? Thanks

Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread A J
I am a little confused about the reason why nodetool repair has to run within GCGraceSeconds. The documentation at: http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair is not very clear to me. How can a delete be 'unforgotten' if I don't run nodetool repair? (I understand that if

Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread A J
Thanks all ! In other words, I think it is safe to say that a node as a whole can be made consistent only on 'nodetool repair'. Has there been enough interest in providing anti-entropy without compaction as a separate operation (nodetool repair does both) ? On Thu, Jun 30, 2011 at 5:27 PM,

cql error

2011-07-05 Thread A J
cqlsh> CREATE KEYSPACE twissandra with ... strategy_class = ... 'org.apache.cassandra.locator.NetworkTopologyStrategy' ... and strategy_options=[{DC1:1, DC2:1}]; Bad Request: line 4:37 no viable alternative at character ']' What is wrong with the above syntax? Thanks.

Re: cql error

2011-07-05 Thread A J
Thanks. That worked. On Tue, Jul 5, 2011 at 11:35 AM, Jonathan Ellis jbel...@gmail.com wrote: replace the s_o line with and strategy_options:DC1=1 and strategy_options:DC2=2 On Tue, Jul 5, 2011 at 10:09 AM, A J s5a...@gmail.com wrote: cqlsh CREATE KEYSPACE twissandra

How to make a node an exact replica of another node ?

2011-07-05 Thread A J
Hello, Let me explain what I am trying to do: I am prototyping 2 Data centers (DC1 and DC2) with two nodes each. Say DC1_n1 and DC1_n2 nodes in DC1 and DC2_n1 and DC2_n2 in DC2. With PropertyFileSnitch and NetworkTopologyStrategy and 'strategy_options of DC1=1 and DC2=1', I am able to ensure that

Re: How to make a node an exact replica of another node ?

2011-07-05 Thread A J
Perfect ! Thanks. On Tue, Jul 5, 2011 at 1:51 PM, Eric tamme eta...@gmail.com wrote: AJ, You can use offset mirror tokens to achieve this.   Pick your initial tokens for DC1N1 and DC1N2 as if they were the only nodes in your cluster.  Now increment each by 1 and use them as the tokens for
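
The "offset mirror tokens" suggestion quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes RandomPartitioner with a token range of 0 to 2**127, it assumes the incremented tokens are meant for the DC2 nodes (the quoted message is cut off at that point), and the function names are hypothetical.

```python
# Sketch of offset mirror tokens for two data centers.
# Assumption: RandomPartitioner, whose tokens lie in 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Evenly spaced tokens, computed as if these were the only nodes in the cluster."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def mirror_tokens(tokens, offset=1):
    """Shift every token by a small offset to get the tokens for the mirror data center."""
    return [token + offset for token in tokens]


dc1_tokens = balanced_tokens(2)          # tokens for DC1_n1 and DC1_n2
dc2_tokens = mirror_tokens(dc1_tokens)   # assumed to be the tokens for DC2_n1 and DC2_n2

for name, token in zip(["DC1_n1", "DC1_n2", "DC2_n1", "DC2_n2"], dc1_tokens + dc2_tokens):
    print(name, token)
```

Each value would then go into the initial_token setting of the corresponding node.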

Details of 'nodetool move'

2011-07-05 Thread A J
Hello, Where can I find details of nodetool move. Most places just mention that 'move the target node to a given Token. Moving is essentially a convenience over decommission + bootstrap.' Stuff like, when do I need to do and on what nodes? What is the value of 'new token' to be provided ? What

deduct token values for BOP

2011-07-06 Thread A J
I wish to use the order preserving byte-ordered partitioner. How do I figure the initial token values based on the text key value. Say I wish to put all keys starting from a to d on N1. e to m on N2 and n to z on N3. What would be the initial_token values on each of the 3 nodes to accomplish this

When is 'Cassandra High Performance Cookbook' expected to be available ?

2011-07-07 Thread A J
https://www.packtpub.com/cassandra-apache-high-performance-cookbook/book

List nodes where write was applied to

2011-07-07 Thread A J
Is there a way to find which nodes a write was applied to? It could be a successful write (i.e. W was met) or an unsuccessful write (i.e. fewer than W nodes were reached). In either case, I am interested in finding: Number of nodes written to (before timeout or on success) Name of nodes written to

What does a write lock ?

2011-07-07 Thread A J
Does a write lock: 1. Just the columns in question for the specific row in question ? 2. The full row in question ? 3. The full CF ? I doubt read does any locks. Thanks.

'select * from cf' - FTS or Index

2011-07-07 Thread A J
Does a 'select * from cf' with no filter still use the primary index on the key or do a 'full table scan' ? Thanks.

Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-07-11 Thread A J
Instead of doing nodetool repair, is it not a cheaper operation to keep tab of failed writes (be it deletes or inserts or updates) and read these failed writes at a set frequency in some batch job ? By reading them, RR would get triggered and they would get to a consistent state. Because these

Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-07-11 Thread A J
Never mind. I see the issue with this. I will be able to catch the writes as failed only if I set CL=ALL. For other CLs, I may not know that it failed on some node. On Mon, Jul 11, 2011 at 2:33 PM, A J s5a...@gmail.com wrote: Instead of doing nodetool repair, is it not a cheaper operation

Node repair questions

2011-07-11 Thread A J
Hello, Have the following questions related to nodetool repair: 1. I know that Nodetool Repair Interval has to be less than GCGraceSeconds. How do I come up with an exact value of GCGraceSeconds and 'Nodetool Repair Interval'. What factors would want me to change the default of 10 days of

Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-07-12 Thread A J
with neighboring nodes. So is this text from the book misleading ? On Fri, Jul 8, 2011 at 10:36 AM, Jonathan Ellis jbel...@gmail.com wrote: that's an internal term meaning background i/o, not sstable merging per se. On Fri, Jul 8, 2011 at 9:24 AM, A J s5a...@gmail.com wrote: I think node repair involves

Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-07-12 Thread A J
Just confirming. Thanks for the clarification. On Tue, Jul 12, 2011 at 10:53 AM, Peter Schuller peter.schul...@infidyne.com wrote: From Cassandra the definitive guide - Basic Maintenance - Repair Running nodetool repair causes Cassandra to execute a major compaction. During a major

OOM

2011-11-02 Thread A J
Hi, For a single node of cassandra(1.0 version) having 15G of data+index, 48GB RAM, 8GB heap and about 2.6G memtable threshold, I am getting OOM when I have 1000 concurrent inserts happening at the same time. I have kept concurrent_writes: 128 in cassandra.yaml as there are a total of 16 cores

(A or B) AND C AND !D

2011-11-13 Thread A J
Hello Say I have 4 nodes: A, B, C and D and wish to have consistency level for writes defined in such a way that writes meet the following consistency level: (A or B) AND C AND !D, i.e. either of A or B will suffice and C to be included into consistency level as well. But the write should not

Re: (A or B) AND C AND !D

2011-11-15 Thread A J
To clarify, I wish to keep N=4 and W=2 in the following scenario. Thanks. On Sun, Nov 13, 2011 at 11:20 PM, A J s5a...@gmail.com wrote: Hello Say I have 4 nodes: A, B, C and D and wish to have consistency level for writes defined in such a way that writes meet the following consistency

Continuous export of data out of database

2011-11-15 Thread A J
Hello VoltDB has an export feature to stream the data out of the database. http://voltdb.com/company/blog/voltdb-export-connecting-voltdb-other-systems This is different from Cassandra's export feature (http://wiki.apache.org/cassandra/Operations#Import_.2BAC8_export) which is more of a different

Re: Continuous export of data out of database

2011-11-15 Thread A J
The issue with that is that I wish to have EACH_QUORUM in our other 2 datacenters but not in the third DC. Could not figure out a way to accomplish that, so I am exploring having a near-realtime backup copy in the third DC via some streaming process. On Tue, Nov 15, 2011 at 12:12 PM, Robert Jackson

garbage collecting tombstones

2011-12-01 Thread A J
Hello, Is 'garbage collecting tombstones ' a different operation than the JVM GC. Garbage collecting tombstones is controlled by gc_grace_seconds which by default is set to 10 days. But the traditional GC seems to happen much more frequently (when observed through jconsole) ? How can I force the

Increase replication factor

2011-12-05 Thread A J
If I update a keyspace to increase the replication factor, what happens to existing data for that keyspace? Does the existing data automatically increase its replication? Or does the existing data increase its replication factor only on a RR or node repair? Thanks.

each_quorum in pycassa

2011-12-12 Thread A J
What is the syntax for each_quorum in pycassa ? Thanks.

setStrategy_options syntax in thrift

2011-12-19 Thread A J
What is the syntax of setStrategy_options in thrift? The following fails: Util.java:22: setStrategy_options(java.util.Map<java.lang.String,java.lang.String>) in org.apache.cassandra.thrift.KsDef cannot be applied to (java.lang.String) newKs.setStrategy_options("{replication_factor:2}");

java thrift error

2011-12-20 Thread A J
The following syntax: import org.apache.cassandra.thrift.*; . . ColumnOrSuperColumn col = client.get(count_key.getBytes("UTF-8"), cp, ConsistencyLevel.QUORUM); is giving the error: get(java.nio.ByteBuffer,org.apache.cassandra.thrift.ColumnPath,org.apache.cassandra.thrift.ConsistencyLevel)

Re: setStrategy_options syntax in thrift

2011-12-20 Thread A J
is the option and the value is the option value. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 20/12/2011, at 12:02 PM, A J wrote: What is the syntax of setStrategy_options in thrift. The following fails: Util.java:22

Re: java thrift error

2011-12-20 Thread A J
toByteBuffer(String value) throws UnsupportedEncodingException { return ByteBuffer.wrap(value.getBytes("UTF-8")); } see http://wiki.apache.org/cassandra/ThriftExamples *- Original Message -* *From:* A J s5a...@gmail.com *Sent:* Tue, December 20, 2011 15:52 *Subject

Re: setStrategy_options syntax in thrift

2011-12-20 Thread A J
-* *From:* A J s5a...@gmail.com *Sent:* Tue, December 20, 2011 16:03 *Subject:* Re: setStrategy_options syntax in thrift I am new to java. Can you specify the exact syntax for replication_factor=2 ? Thanks. On Tue, Dec 20, 2011 at 1:50 PM, aaron morton aa...@thelastpickle.com wrote: It looks

Restart for change of endpoint_snitch ?

2011-12-27 Thread A J
If I change endpoint_snitch from SimpleSnitch to PropertyFileSnitch, does it require restart of cassandra on that node ? Thanks.

Encryption related question

2012-01-20 Thread A J
Hello, I am trying to use internode encryption in Cassandra (1.0.6) for the first time. 1. Followed the steps 1 to 5 at http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html#CreateKeystore Q. In cassandra.yaml , what value goes for keystore ? I exported the

Command to display config values

2012-01-24 Thread A J
Is there a command in cqlsh or cassandra CLI that can display the various values of the configuration parameters in use? I am particularly interested in finding the value of 'commitlog_sync' that the current session is using. Thanks. AJ

Re: Command to display config values

2012-01-24 Thread A J
- Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 25/01/2012, at 10:10 AM, A J wrote: Is there a command in cqlsh or cassandra CLI that can display the various values of the configuration parameters at use. I am particularly interested

batch mode and flushing

2012-02-02 Thread A J
Hello, when you set 'commitlog_sync: batch' on all the nodes in a multi-DC cluster and call writes with CL=ALL, does the operation wait till the write is flushed to all the disks on all the nodes ? Thanks.

Logging 'write' operations

2012-02-21 Thread A J
Hello, What is the best way to log write operations (insert,remove, counter add, batch operations) in Cassandra. I need to store the operations (with values being passed) in some fashion or the other for audit purposes (and possibly to undo some operation after inspection). Thanks.

Test Data creation in Cassandra

2012-03-02 Thread A J
What is the best way to create millions of rows of test data in Cassandra? I would like to have some script where I first insert say 100 rows in a CF. Then reinsert the same data on 'server side' with new unique key. That will make it 200 rows. Then continue the exercise a few times till I get lot of

Does the 'batch' order matter ?

2012-03-13 Thread A J
I know batch operations are not atomic but does the success of a write imply all writes preceding it in the batch were successful? For example, using cql: BEGIN BATCH USING CONSISTENCY QUORUM AND TTL 864 INSERT INTO users (KEY, password, name) VALUES ('user2', 'ch@ngem3b', 'second user')

Re: Does the 'batch' order matter ?

2012-03-14 Thread A J
at 4:15 PM, Tyler Hobbs ty...@datastax.com wrote: On Wed, Mar 14, 2012 at 11:50 AM, A J s5a...@gmail.com wrote: Are you saying the way 'batch mutate' is coded, the order of writes in the batch does not mean anything ? You can ask the batch to do A,B,C and then D in sequence; but sometimes

Order rows numerically

2012-03-16 Thread A J
If I define my rowkeys to be Integer (key_validation_class=IntegerType) , how can I order the rows numerically ? ByteOrderedPartitioner orders lexically and retrieval using get_range does not seem to make sense in order. If I were to change rowkey to be UTF8 (key_validation_class=UTF8Type), BOP

Max # of CFs

2012-03-19 Thread A J
How many Column Families are one too many for Cassandra ? I created a db with 5000 CFs (I can go into the reasons later) but the latency seems to be very erratic now. Not sure if it is because of the number of CFs. Thanks.

Re: Max # of CFs

2012-03-20 Thread A J
+state:results If you still got questions after reading this thread or some others about the same topic, do not hesitate asking again, Alain 2012/3/19 A J s5a...@gmail.com How many Column Families are one too many for Cassandra ? I created a db with 5000 CFs (I can go into the reasons

Re: Max # of CFs

2012-03-20 Thread A J
column family. 20.03.12 16:05, A J написав(ла): ok, the last thread says that 1.0+ onwards, thousands of CFs should not be a problem. But I am finding that all the allocated heap memory is getting consumed. I started with 8GB heap and then on reading http://www.datastax.com/dev/blog/whats

Re: Max # of CFs

2012-03-21 Thread A J
histogram first, heap dump second. Best regards, Vitalii Tymchyshyn 20.03.12 18:12, A J написав(ла): I have both row cache and column cache disabled for all my CFs. cfstats says Bloom Filter Space Used: 1760 per CF. Assuming it is in bytes, it is total of about 9MB of bloom filter size for 5K

Re: Order rows numerically

2012-03-21 Thread A J
Yes, that is good enough for now. Thanks. On Fri, Mar 16, 2012 at 6:49 PM, Watanabe Maki watanabe.m...@gmail.com wrote: How about to fill zeros before smaller digits? Ex. 0001, 0002, etc maki On 2012/03/17, at 6:29, A J s5a...@gmail.com wrote: If I define my rowkeys to be Integer
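
The zero-padding suggestion quoted above can be illustrated with a short Python sketch. It only demonstrates the ordering property (lexical order matching numeric order for fixed-width keys); the helper name and the width of 4 are assumptions for illustration, and the width has to cover the largest key you expect.

```python
def padded_key(number, width=4):
    """Render an integer row key as a fixed-width, zero-padded string."""
    return str(number).zfill(width)


numbers = [10, 2, 1, 100]

# Unpadded strings sort lexically, which is not numeric order.
unpadded_sorted = sorted(str(n) for n in numbers)

# Zero-padded strings of equal width sort lexically in numeric order.
padded_sorted = sorted(padded_key(n) for n in numbers)

print(unpadded_sorted)
print(padded_sorted)
```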

Re: solr query for string match in CQL

2012-04-12 Thread A J
Never mind. Double quotes within the single quotes worked: select * from solr where solr_query='body:"sixty eight million nine hundred forty three thousand four hundred twenty four"'; On Thu, Apr 12, 2012 at 11:42 AM, A J s5a...@gmail.com wrote: What is the syntax for a string match in CQL

Physical storage of rowkey

2012-08-09 Thread A J
Are row keys hashed before being physically stored in Cassandra? If so, what hash function is used to ensure collisions are minimal? Thanks.

Custom Partitioner Type

2012-08-13 Thread A J
Is it possible to use a custom Partitioner type (other than RP or BOP)? Say if my rowkeys are all Integers and I want all even keys to go to node1 and odd keys to node2, is it feasible? How would I go about it? Thanks.

'WHERE' with several indexed columns

2012-08-16 Thread A J
Hi If I have a WHERE clause in CQL with several 'AND' and each column is indexed, which index(es) is(are) used ? Just the first field in the where clause or all the indexes involved in the clause ? Also is index used only with an equality operator or also with greater than /less than comparator

nodetool , localhost connection refused

2012-08-20 Thread A J
I am running 1.1.3 Nodetool on the database node (just a single node db) is giving the error: Failed to connect to 'localhost:7199': Connection refused Any idea what could be causing this ? Thanks.

Re: nodetool , localhost connection refused

2012-08-20 Thread A J
on that port? So database node did not start and bind to that port and you would see exception in the logs of that database node... just a guess. Dean On 8/20/12 4:10 PM, A J s5a...@gmail.com wrote: I am running 1.1.3 Nodetool on the database node (just a single node db) is giving the error

Subscribe

2011-02-15 Thread A J

Binary object storage in Cassandra

2011-02-15 Thread A J
Hello Is it possible to store binary objects (images, pdfs, videos etc) in Cassandra. The size of my images are less than 100MB. If so, how do I try inserting and retrieving a few files from cassandra ? Would prefer if someone can give examples using pycassa. Thanks ! AJ

What if write consistency level cannot be met ?

2011-02-15 Thread A J
Say I set write consistency level to ALL and all but one node are down. What happens to writes ? Does it rollback from the live node before returning failure to client ? Thanks.

Update of value for a given name

2011-02-15 Thread A J
If I update a column (i.e. change the value contents for a given name in a given key), is the physical disk operation equivalent to a delete followed by an insert? Or is it just an insert that somehow marks the previous value as stale? In the Definitive Guide, it says the following about SSTable: *All

Re: Coordinator node

2011-02-15 Thread A J
determines the correct value to send to the client based on the responses it receives and then sends it. On Tue, Feb 15, 2011 at 3:55 PM, A J s5a...@gmail.com wrote: Thanks. 1. That is somewhat disappointing. Wish the redundancy of write on the coordinator node could have been avoided somehow

Re: Partitioning

2011-02-16 Thread A J
Yes, I read the same and it sounded weird. *Note that with RackAwareStrategy, succeeding nodes along the ring should alternate data centers to avoid hot spots. For instance, if you have nodes A, B, C, and D in increasing Token order, and instead of alternating you place A and B in DC1, and C and

Re: Coordinator node

2011-02-16 Thread A J
...@gmail.com wrote: A J s5alye at gmail.com writes: Makes sense ! Thanks. Just a quick follow-up: Now I understand the write is not made to coordinator (unless it is part of the replica for that key). But does the write column traffic 'flow' through the coordinator node. For a 2G column

Commercial support for cassandra

2011-02-16 Thread A J
By any chance are there companies that provide support for Cassandra ? Consult on setup and configuration and annual support packages ?

test

2011-02-17 Thread A J

Able to send only blank emails without contents to the group. What could be going on ?

2011-02-17 Thread A J

Replica details

2011-02-17 Thread A J
Where can I get good detailed explanation of the various replication options (Simple, Old Network and Network) along with snitches. I did read the definitive guide but not really satisfied. Is there a good post somewhere explaining this ? I will have 4 datacenters (assume) and 3 nodes in each

Re: Able to send only blank emails without contents to the group. What could be going on ?

2011-02-17 Thread A J
Thanks ! Finally. Did several retries since morning. On Thu, Feb 17, 2011 at 1:39 PM, Jonathan Ellis jbel...@gmail.com wrote: Maybe https://issues.apache.org/jira/browse/INFRA-3356? On Thu, Feb 17, 2011 at 12:37 PM, A J s5a...@gmail.com wrote: -- Jonathan Ellis Project Chair, Apache

Re: Coordinator node

2011-02-18 Thread A J
Hi, Are there any blogs/writeups anyone is aware of that talks of using primary replica as coordinator node (rather than a random coordinator node) in production scenarios ? Thank you. On Wed, Feb 16, 2011 at 10:53 AM, A J s5a...@gmail.com wrote: Thanks for the confirmation. Interesting

Re: R and N

2011-02-18 Thread A J
? Thanks. On Fri, Feb 18, 2011 at 10:23 AM, A J s5a...@gmail.com wrote: Questions about R and N (and W): 1. If I set R to Quorum and cassandra identifies a need for read repair before returning, would the read repair happen on R nodes (I mean subset of R that needs repair) or N nodes before

Metadata

2011-02-18 Thread A J
If I wish to find name of all the keys in all the column families along with other related metadata (such as last updated, size of column value field), is there an additional solution that caches this metadata OR do I have to always perform range queries and get the information ? I am not

Re: Async write

2011-02-18 Thread A J
W always stands for number of sync writes. N-W is the number of async writes. Note, N decides number of replicas. W only decides out of those N replicas, how many should be written synchronously before returning success of write to client. All writes always happen to a total of N nodes (W right
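
The relationship described in this message (W synchronous writes, N-W asynchronous ones, N replicas in total) can be written out as a tiny Python sketch. The function name is hypothetical and the code only restates the arithmetic from the message; it does not model how Cassandra actually dispatches writes.

```python
def write_plan(n, w):
    """Split N replica writes into those acknowledged before returning to the
    client (W) and those that complete in the background (N - W)."""
    if not 1 <= w <= n:
        raise ValueError("W must be between 1 and N")
    return {"sync": w, "async": n - w}


# With N=3 replicas and W=2, two writes are awaited and one completes asynchronously.
plan = write_plan(3, 2)
print(plan)
```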

Non-latin implementation

2011-02-24 Thread A J
Hello, Have there been Cassandra implementations in non-latin languages. In particular: Mandarin (China) ,Devanagari (India), Korean (Korea) I am interested in finding if there are storage, sorting or other types of issues one should be aware of in these languages. Thanks.

Re: New Chain for : Does Cassandra use vector clocks

2011-02-24 Thread A J
but could be broken in case of a failed write You can think of a scenario where R + W > N still leads to inconsistency even for successful writes. Say you keep W=1 and R=N. Let's say the one node where a write happened with success goes down before it made it to the other N-1 nodes. Let's say it goes
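
For reference, the overlap condition under discussion in this thread is simple arithmetic, sketched below in Python with a hypothetical function name. It only checks whether any set of R replicas must share at least one member with any set of W replicas; as the message argues, that alone does not cover nodes failing after a write.

```python
def read_set_overlaps_write_set(n, r, w):
    """True when any R replicas and any W replicas out of N must share a node."""
    return r + w > n


print(read_set_overlaps_write_set(3, 2, 2))  # QUORUM reads and writes with N=3
print(read_set_overlaps_write_set(3, 1, 1))  # ONE/ONE with N=3
print(read_set_overlaps_write_set(4, 4, 1))  # the W=1, R=N case from the message
```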

Re: New Chain for : Does Cassandra use vector clocks

2011-02-24 Thread A J
in error could have gone through partially Again, this is not an absolutely unfamiliar territory and can be dealt with. -JA On Thu, Feb 24, 2011 at 1:16 PM, A J s5a...@gmail.com wrote: but could be broken in case of a failed write You can think of a scenario where R + W > N still leads

Re: New Chain for : Does Cassandra use vector clocks

2011-02-24 Thread A J
While we are at it, there's more to consider than just CAP in distributed :) http://voltdb.com/blog/clarifications-cap-theorem-and-data-related-errors On Thu, Feb 24, 2011 at 3:31 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Thu, Feb 24, 2011 at 3:03 PM, A J s5a...@gmail.com wrote: yes

Re: New Chain for : Does Cassandra use vector clocks

2011-02-25 Thread A J
PM, A J s5a...@gmail.com wrote: While we are at it, there's more to consider than just CAP in distributed :) http://voltdb.com/blog/clarifications-cap-theorem-and-data-related-errors On Thu, Feb 24, 2011 at 3:31 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Thu, Feb 24, 2011 at 3:03 PM

Re: New Chain for : Does Cassandra use vector clocks

2011-02-25 Thread A J
-tolerance/ Then, Jeff Darcy's response: http://pl.atyp.us/wordpress/?p=3110 On Thu, Feb 24, 2011 at 2:56 PM, A J s5a...@gmail.com wrote: While we are at it, there's more to consider than just CAP in distributed :) http://voltdb.com/blog/clarifications-cap-theorem-and-data-related-errors On Thu

2x storage

2011-02-25 Thread A J
I read in some cassandra notes that each node should be allocated twice the storage capacity you wish it to contain. I think the reason was during compaction another copy of SSTables have to be made before the original ones are discarded. Can someone confirm if that is actually true ? During

Re: 2x storage

2011-02-25 Thread A J
OK. Is it also driven by type of compaction ? Does a minor compaction require less working space than major compaction ? On Fri, Feb 25, 2011 at 12:40 PM, Robert Coli rc...@digg.com wrote: On Fri, Feb 25, 2011 at 9:22 AM, A J s5a...@gmail.com wrote: I read in some cassandra notes that each node

Re: 2x storage

2011-02-25 Thread A J
Thanks. What happens when my compaction fails for space reasons ? Is no compaction possible till I add more space ? I would assume writes are not impacted though the latency of reads would increase, right ? Also though writes are not seek-intensive, compactions are seek-intensive, no ? On Fri,

Re: 2x storage

2011-02-25 Thread A J
Another related question: Can the minor compactions across nodes be staggered so that I can control how many nodes are compacting at any given point ? On Fri, Feb 25, 2011 at 2:01 PM, A J s5a...@gmail.com wrote: Thanks. What happens when my compaction fails for space reasons ? Is no compaction

Time to rebuild a node

2011-02-28 Thread A J
Hello, I know it depends on lot of factors but are there any ballpark number on how long it takes to recreate a node from other nodes (or add a new node to a ring). Something like x GBs take y minutes to build ? Thanks.

Re: Storing photos, images, docs etc.

2011-03-01 Thread A J
Depends on the specs of your large files. If the files are less than 64MB, there will be no splitting. Cassandra(actually thrift) has no streaming abilities. But if your objects are small (in a few MBs) they would fit in memory easily. I will have lot of binaries less than few MBs in size. I am

Re: Storing photos, images, docs etc.

2011-03-02 Thread A J
What are other options then Several. 1. Mogilefs. Stores on filesystem but metadata in database (MySQL or Postgres). Also has redundancy built in. Does not require RAID. No SPOF. But I think it has too many moving parts and requires a few more boxes than cassandra. 2. Of course the good old Blob

cassandra.yaml

2011-03-02 Thread A J
Hello, I am trying to set up a cluster (for the first time) of a few nodes. Had a few questions related to that. I want the following properties in my cluster: 1. Not to use RP but BOP 2. Specify initial token myself on each node. 3. Change a few memtable defaults. 4. For Keyspaces to use

Defrag

2011-03-02 Thread A J
Are there any details on how much of an issue fragmentation is (with Cassandra)? With all the merging and deletes that happen during compactions, what does the disk fragmentation look like over time? Any rules of thumb on how frequently and how to defrag? Thanks.

cassandra-rack.properties or cassandra-topology.properties

2011-03-03 Thread A J
In PropertyFileSnitch is cassandra-rack.properties or cassandra-topology.properties file used ? Little confused by the stmt: PropertyFileSnitch determines the location of nodes by referring to a user-defined description of the network details located in the property file

Re: cassandra-rack.properties or cassandra-topology.properties

2011-03-03 Thread A J
, 2011 at 1:28 PM, Jonathan Ellis jbel...@gmail.com wrote: Did you try ls conf/ ? On Thu, Mar 3, 2011 at 11:27 AM, A J s5a...@gmail.com wrote: In PropertyFileSnitch is cassandra-rack.properties or cassandra-topology.properties file used ? Little confused by the stmt: PropertyFileSnitch determines

Network Topology Strategy error

2011-03-03 Thread A J
using latest cassandra (0.7.2). I want to try out Network Topology Strategy. Following is related setting in cassandra.yaml endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch I have four nodes. Set them accordingly in ./conf/cassandra-topology.properties: 10.252.219.224=DC2:RAC1

Re: Storing photos, images, docs etc.

2011-03-03 Thread A J
why would you keep metadata in cassandra ? Even for millions of documents, metadata would be very small, mysql/postgres should suffice. Lustre of course is well known and widely used along with glusterfs. Lustre I think requires kernel modifications and will be much more complex. Also it is easier

Several 'TimedOutException' in stress.py

2011-03-08 Thread A J
Trying out stress.py on AWS EC2 environment (4 Large instances. Each of 2-cores and 7.5GB RAM. All in the same region/zone.) python stress.py -o insert -d 10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t 10 -n 500 -S 100 -k (I want to try with column size of about 1MB.
