Re: column with TTL of 10 seconds lives very long...

2013-05-25 Thread Jeremiah Jordan
If you do that same get again, is the column still being returned? (days later) -Jeremiah On Thu, May 23, 2013 at 6:16 AM, Tamar Fraenkel ta...@tok-media.com wrote: Hi! TTL was set: [default@HLockingManager] get HLocks['/LockedTopic/31a30c12-652d-45b3-9ac2-0401cce85517']; =

Re: remove DC

2012-11-12 Thread Jeremiah Jordan
If you have written any data to DC2 since the last time you ran repair, you should probably run repair to make sure that data made it over to DC1. If you never wrote data directly to DC2, then you are correct: you don't need to run repair. You should just need to update the schema, and

Re: CREATE COLUMNFAMILY

2012-11-11 Thread Jeremiah Jordan
That is fine. You just have to be careful that you haven't already inserted data which would be rejected by the type you update to, as a client will have issues reading that data back. -Jeremiah On Nov 11, 2012, at 4:09 PM, Kevin Burton rkevinbur...@charter.net wrote: What happens when you

Re: NetworkTopologyStrategy with 1 node

2012-05-26 Thread Jeremiah Jordan
What is the output of nodetool ring? Does the cluster actually think your node is in DC1? -Jeremiah On May 26, 2012, at 6:36 AM, Cyril Auburtin wrote: I get the same issue on Cassandra 1.1: create keyspace ks with strategy_class = 'NetworkTopologyStrategy' AND strategy_options ={DC1:1};

RE: understanding of native indexes: limitations, potential side effects,...

2012-05-16 Thread Jeremiah Jordan
The limitation is because number of columns could be equal to number of rows. If number of rows is large this can become an issue. -Jeremiah From: David Vanderfeesten [feest...@gmail.com] Sent: Wednesday, May 16, 2012 6:58 AM To: user@cassandra.apache.org

Re: Does or will Cassandra support OpenJDK ?

2012-05-14 Thread Jeremiah Jordan
Open JDK is java 1.7. Once Cassandra supports Java 1.7 it would most likely work on Open JDK, as the 1.7 Open JDK really is the same thing as Oracle JDK 1.7 without some licensed stuff. -Jeremiah On May 11, 2012, at 10:02 PM, ramesh wrote: I've had problem downloading the Sun (Oracle) JDK

Re: DELETE from table with composite keys

2012-05-14 Thread Jeremiah Jordan
Slice deletes are not supported currently. It is being worked on. https://issues.apache.org/jira/browse/CASSANDRA-3708 -Jeremiah On May 14, 2012, at 12:18 PM, Roland Mechler wrote: I have a table with a 3 part composite key and I want to delete rows based on the first 2 parts of the key.

RE: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Jeremiah Jordan
You have to use nodetool move to change the token after the node has started the first time. The value in the config file is only used on first startup. Unless you were using RF=3 on your 3 node ring, you can't just start with a new token without using nodetool. You have to do move so that

Re: Resident size growth

2012-04-09 Thread Jeremiah Jordan
He says he disabled JNA. You can't mmap without JNA, can you? On Apr 9, 2012, at 4:52 AM, aaron morton wrote: see http://wiki.apache.org/cassandra/FAQ#mmap Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com/ On

RE: Write performance compared to Postgresql

2012-04-03 Thread Jeremiah Jordan
So Cassandra may or may not be faster than your current system when you have a couple connections. Where it is faster, and scales, is when you get hundreds of clients across many nodes. See: http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html With 60 clients running

RE: Counter Column

2012-04-03 Thread Jeremiah Jordan
Right, it affects every version of Cassandra from 0.8 beta 1 until the Fix Version, which right now is None, so it isn't fixed yet... From: Avi-h [avih...@gmail.com] Sent: Tuesday, April 03, 2012 5:23 AM To: cassandra-u...@incubator.apache.org Subject:

RE: Compression on client side vs server side

2012-04-02 Thread Jeremiah Jordan
The server side compression can compress across columns/rows so it will most likely be more efficient. Whether you are CPU bound or IO bound depends on your application and node setup. Unless your working set fits in memory you will be IO bound, and in that case server side compression helps

Re: data size difference between supercolumn and regular column

2012-04-01 Thread Jeremiah Jordan
Is that 80% with compression? If not, the first thing to do is turn on compression. Cassandra doesn't behave well when it runs out of disk space. You really want to try and stay around 50%, 60-70% works, but only if it is spread across multiple column families, and even then you can run

RE: Any improvements in Cassandra JDBC driver ?

2012-03-29 Thread Jeremiah Jordan
There is no such thing as a pure insert which will give an error if the thing already exists. Everything is really UPDATE OR INSERT. Whether you say UPDATE or INSERT, it will act like UPDATE OR INSERT: if the thing is there it gets overwritten, if it isn't there it gets inserted.

Re: copy data for dev

2012-03-27 Thread Jeremiah Jordan
If you have the disk space you can just copy all the data files from the snapshot onto the dev node, renaming any with conflicting names. Then bring up the dev node and it should see the data. You can then compact to merge and drop all the duplicate data. You can also use the sstable loader

RE: Network, Compaction, Garbage collection and Cache monitoring in cassandra

2012-03-21 Thread Jeremiah Jordan
You can also use any network/server monitoring tool which can talk to JMX. We are currently using vFabric Hyperic's JMX plugin for this. IIRC there are some cacti and nagios scripts on github for getting the data into those. -Jeremiah From: R. Verlangen

RE: repair broke TTL based expiration

2012-03-20 Thread Jeremiah Jordan
You need to create the tombstone in case the data was inserted without a timestamp at some point. -Jeremiah From: Radim Kolar [h...@filez.com] Sent: Monday, March 19, 2012 4:48 PM To: user@cassandra.apache.org Subject: Re: repair broke TTL based

RE: Hector counter question

2012-03-19 Thread Jeremiah Jordan
No, Cassandra doesn't support atomic counters. IIRC it is on the list of things for 1.2. -Jeremiah From: Tamar Fraenkel [ta...@tok-media.com] Sent: Monday, March 19, 2012 1:26 PM To: cassandra-u...@incubator.apache.org Subject: Hector counter question Hi! Is

RE: 0.8.1 Vs 1.0.7

2012-03-16 Thread Jeremiah Jordan
I would guess more aggressive compaction settings, did you update rows or insert some twice? If you run major compaction a couple times on the 0.8.1 cluster does the data size get smaller? You can use the describe command to check if compression got turned on. -Jeremiah

RE: Composite keys and range queries

2012-03-14 Thread Jeremiah Jordan
Right, so until the new CQL stuff exists to actually query with something smart enough to know about composite keys, you have to define and query on your own. Row Key = UUID Column = CompositeColumn(string, string) You want to then use COLUMN slicing, not row ranges to query the data. Where

Re: Schema change causes exception when adding data

2012-03-06 Thread Jeremiah Jordan
? [Edit: corrected to a question] Then I can block the insertion of data until then. On Thu, Mar 1, 2012 at 4:33 AM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote: The error is that the specified column family

Re: Adding a second datacenter

2012-03-05 Thread Jeremiah Jordan
You need to make sure your clients are reading using LOCAL_* settings so that they don't try to get data from the other data center. But you shouldn't get errors while replication_factor is 0. Once you change the replication factor to 4, you should get missing data if you are using LOCAL_*

Re: Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Jeremiah Jordan
There is a requirement that all nodes have a unique token. There is still one global cluster/ring that each node needs to be unique on. The logically separate rings that NetworkTopologyStrategy puts them into are hidden from the rest of the code. -Jeremiah On 03/05/2012 05:13 AM, Hontvári

Re: unidirectional communication/replication

2012-02-29 Thread Jeremiah Jordan
You might check out some of the stuff Netflix does with their Cassandra backup, and Cassandra ETL tools.: http://techblog.netflix.com/2012/02/aegisthus-bulk-data-pipeline-out-of.html http://techblog.netflix.com/2012/02/announcing-priam.html -Jeremiah On 02/29/2012 11:04 AM, Alexandru Sicoe

RE: Schema change causes exception when adding data

2012-02-29 Thread Jeremiah Jordan
The error is that the specified column family doesn't exist. If you connect with the CLI and describe the keyspace does it show up? Also, after adding a new column family programmatically you can't use it immediately, you have to wait for it to propagate. You can use calls to describe schema to

Chicago Cassandra Meetup on 3/1 (Preview of my Pycon talk)

2012-02-22 Thread Jeremiah Jordan
I am going to be doing a trial run of my Pycon talk about setting up a development instance of Cassandra and accessing it from Python (Pycassa mostly, some thrift just to scare people off of using thrift) for a Chicago Cassandra Meetup. Anyone in Chicago feel free to come by. The talk is

Re: Deleting a column vs setting it's value to empty

2012-02-10 Thread Jeremiah Jordan
Either one works fine. Setting it to empty may cause you fewer headaches as you won't have to deal with tombstones. Deleting a non-existent column is fine. -Jeremiah On 02/10/2012 02:15 PM, Drew Kutcharian wrote: Hi Everyone, Let's say I have the following object which I would like to save in

Re: Cassandra 1.0.6 multi data center question

2012-02-09 Thread Jeremiah Jordan
No, not an issue. The nodes in DC2 know that they aren't supposed to have data, so they go ask the nodes in DC1 for the data to return to you. -Jeremiah On 02/09/2012 05:28 AM, Roshan Pradeep wrote: Thanks Peter for the replies. Previously it was a typing mistake and it should be getting. I

Re: Disable Nagle algoritm in thrift i.e. TCP_NODELAY

2012-01-26 Thread Jeremiah Jordan
Should already be on for all of the server side stuff. All of the clients that I have used set it as well. -Jeremiah On 01/26/2012 07:17 AM, ruslan usifov wrote: Hello Is it possible set TCP_NODELAY on thrift socket in cassandra?

Re: Unbalanced cluster with RandomPartitioner

2012-01-17 Thread Jeremiah Jordan
Are you deleting data or using TTLs? Expired/deleted data won't go away until the sstable holding it is compacted. So if compaction has happened on some nodes, but not on others, you will see this. The disparity is pretty big, 400GB to 20GB, so this probably isn't the issue, but with our

Re: nodetool ring question

2012-01-17 Thread Jeremiah Jordan
There were some nodetool ring load reporting issues with early versions of 1.0.X. I don't remember when they were fixed, but that could be your issue. Are you using compressed column families? A lot of the issues were with those. You might update to 1.0.7. -Jeremiah On 01/16/2012 04:04 AM, Michael

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Jeremiah Jordan
Correct, any kind of locking in Cassandra requires clocks that are in sync, and requires you to wait out the maximum possible clock skew before reading to check if you got the lock, to prevent the issue you describe below. There was a pretty detailed discussion of locking with only Cassandra a

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Jeremiah Jordan
Since a Zookeeper cluster is a quorum based system similar to Cassandra, it only goes down when n/2 nodes go down. And the same way you have to stop writing to Cassandra if N/2 nodes are down (if using QUORUM), your app will have to wait for the Zookeeper cluster to come online again before

Re: How to reliably achieve unique constraints with Cassandra?

2012-01-06 Thread Jeremiah Jordan
By using quorum. One of the partitions may be able to acquire locks, the other one won't... On 01/06/2012 03:36 PM, Drew Kutcharian wrote: Bryce, I'm not sure about ZooKeeper, but I know if you have a partition between HazelCast nodes, then the nodes can acquire the same lock

Re: Replacing supercolumns with composite columns; Getting the equivalent of retrieving a list of supercolumns by name

2012-01-04 Thread Jeremiah Jordan
.george. Is there a way to query a slice of numbers with a list of ids? As in, I want all the columns with numbers between 4 and 10 which have ids steve or greg. Cheers, Steve -Original Message- From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com

Re: Replacing supercolumns with composite columns; Getting the equivalent of retrieving a list of supercolumns by name

2012-01-03 Thread Jeremiah Jordan
The main issue with replacing super columns with composite columns right now is that if you don't know all your sub-column names you can't select multiple super columns worth of data in the same query without getting extra stuff. You have to use a slice to get all subcolumns of a given super

Re: Newbie question about writer/reader consistency

2011-12-29 Thread Jeremiah Jordan
So you can do this with Cassandra, but you need more logic in your code. Basically, you get the last safe number, M, then get N..M, if there are any gaps, you try again reading those numbers. As long as you are not over writing data, and you only update the last safe number after a successful
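The retry logic described above can be sketched roughly as follows. This is only an illustration, not code from the thread: `read_range` is a hypothetical callable standing in for whatever client call fetches the numbered entries, and `max_attempts` is an assumed safety limit.

```python
def read_without_gaps(read_range, start, last_safe, max_attempts=5):
    """Read entries start..last_safe, re-reading any numbers that come back missing.

    read_range(lo, hi) is a hypothetical function returning a dict of
    {number: value} for the entries it found between lo and hi inclusive.
    """
    found = {}
    missing = list(range(start, last_safe + 1))
    for _ in range(max_attempts):
        if not missing:
            break
        # Ask only for the span that still has gaps.
        got = read_range(missing[0], missing[-1])
        found.update({n: v for n, v in got.items() if n in missing})
        missing = [n for n in missing if n not in found]
    # Anything left in `missing` was never seen within max_attempts reads.
    return found, missing
```

This only works under the condition stated in the message: data is never overwritten, and the last safe number is advanced only after a successful write.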

Re: memory estimate for each key in the key cache

2011-12-19 Thread Jeremiah Jordan
It is not telling you to multiply your key size by 10-12, it is telling you to multiply the output of the nodetool cfstats reported key cache size by 10-12. -Jeremiah On Dec 18, 2011, at 6:37 PM, Guy Incognito wrote: to be blunt, this doesn't sound right to me, unless it's doing something

Re: gracefully recover from data file corruptions

2011-12-16 Thread Jeremiah Jordan
You need to run repair on the node once it is back up (to get back the data you just deleted). If this is happening on more than one node you could have data loss... -Jeremiah On 12/16/2011 07:46 AM, Ramesh Natarajan wrote: We are running a 30 node 1.0.5 cassandra cluster running RHEL 5.6

Re: Cassandra C client implementation

2011-12-14 Thread Jeremiah Jordan
If you are OK linking to a C++ based library you can look at: https://github.com/minaguib/libcassandra/tree/kickstart-libcassie-0.7/libcassie It is wrapper code around libcassandra which exports a C++ interface. If you look at the function names etc in the other languages, just use the similar

Re: Slow Compactions - CASSANDRA-3592

2011-12-13 Thread Jeremiah Jordan
Does your issue look similar to this one? https://issues.apache.org/jira/browse/CASSANDRA-3532 It is also dealing with compaction taking 10X longer in 1.0.X On 12/13/2011 09:00 AM, Dan Hendry wrote: I have been observing that major compaction can be incredibly slow in Cassandra 1.0 and was

Re: cassandra in production environment

2011-12-12 Thread Jeremiah Jordan
What java are you using? OpenJDK or Sun/Oracle (http://www.oracle.com/technetwork/java/javase/downloads/index.html)? If you are using OpenJDK you might try Sun. Have you run diagnostics on the disk? It is more likely there is an issue with your disk, not with Cassandra. On 12/11/2011

Re: Need to reconcile data from 2 drives

2011-12-12 Thread Jeremiah Jordan
If you don't want downtime, you can take the original data and use the bulk sstable loader to send it back into the cluster. If you don't mind downtime you can take all the files from both data folders and put them together, make sure there aren't any with the same names (rename them if there

Re: exporting data from Cassandra cluster

2011-12-09 Thread Jeremiah Jordan
for copying data from a Cassandra cluster to somewhere on a disk where there is no Cassandra instance? If not what is the best way/tool to achieve that? Cheers, Alexandru On Wed, Dec 7, 2011 at 10:00 PM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote

Re: exporting data from Cassandra cluster

2011-12-07 Thread Jeremiah Jordan
Stop your current cluster. Start a new cassandra instance on the machine you want to store your data on. Use the sstable loader to load the sstables from all of the current machines into the new machine. Run major compaction a couple times. You will have all of the data on one machine.

Re: Insufficient disk space to flush

2011-12-01 Thread Jeremiah Jordan
If you are writing data with QUORUM or ALL you should be safe to restart cassandra on that node. If the extra space is all from *tmp* files from compaction they will get deleted at startup. You will then need to run repair on that node to get back any data that was missed while it was full.

Re: JMX monitoring

2011-11-23 Thread Jeremiah Jordan
jconsole is going to be the most up to date documentation for the JMX interface =(. -Jeremiah On 11/23/2011 10:49 AM, David McNelis wrote: Ok. in that case I think the Docs are wrong. http://wiki.apache.org/cassandra/JmxInterface has StorageService as part of org.apache.cassandra.service.

Re: DataCenters each with their own local data source

2011-11-22 Thread Jeremiah Jordan
Cassandra's Multiple Data Center Support is meant for replicating all data across multiple datacenters efficiently. You could use the Byte Order Partitioner to prefix data with a key and assign those keys to nodes in specific data centers, though the edge nodes would get tricky as those would

Re: DataCenters each with their own local data source

2011-11-22 Thread Jeremiah Jordan
Oops, I was thinking all in the same keyspace. If you made a new keyspace for each DC you could specify where to put the data and have them only be in one place. -Jeremiah On Nov 22, 2011, at 8:49 PM, Jeremiah Jordan wrote: Cassandra's Multiple Data Center Support is meant for replicating

Re: 7199

2011-11-22 Thread Jeremiah Jordan
Yes, that is the port nodetool needs to access. On Nov 22, 2011, at 8:43 PM, Maxim Potekhin wrote: Hello, I have this in my cassandra-env.sh JMX_PORT=7199 Does this mean that if I use nodetool from another node, it will try to connect to that particular port? Thanks, Maxim

Re: Efficiency of Cross Data Center Replication...?

2011-11-20 Thread Jeremiah Jordan
in this case? (assume hint is disable) Thanks in advance. On Thu, Nov 17, 2011 at 10:46 AM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote: Pretty sure data is sent to the coordinating node in DC2 at the same time it is sent to replicas in DC1, so I would think 10's of milliseconds after

Re: Efficiency of Cross Data Center Replication...?

2011-11-16 Thread Jeremiah Jordan
Pretty sure data is sent to the coordinating node in DC2 at the same time it is sent to replicas in DC1, so I would think 10's of milliseconds after the transport time to DC2. On Nov 16, 2011, at 3:48 PM, ehers...@gmail.com wrote: On a related note - assuming there are available resources

Re: Is a direct upgrade from .6 to 1.0 possible?

2011-11-14 Thread Jeremiah Jordan
You should be able to do it as long as you shut down the whole cluster for it: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Upgrading-to-1-0-tp6954908p6955316.html On 11/13/2011 02:14 PM, Timothy Smith wrote: Due to some application dependencies I've been holding off on a

Re: questions on frequency and timing of async replication between DCs

2011-11-11 Thread Jeremiah Jordan
If you query with ALL do you get the data? If you query with a range slice do you get the data (list from the cli)? On 11/11/2011 04:10 PM, Subrahmanya Harve wrote: I have cross dc replication set up using 0.8.7 with 3 nodes on each DC by following the +1 rule for tokens. I am seeing an

Re: Data retrieval inconsistent

2011-11-10 Thread Jeremiah Jordan
I am pretty sure the way you have K1 configured it will be placed across both DC's as if you had one large ring. If you want it only in DC1 you need to say DC1:1, DC2:0. If you are writing and reading at ONE you are not guaranteed to get the data if RF > 1. If RF = 2, and you write with ONE, you

Re: Data retrieval inconsistent

2011-11-10 Thread Jeremiah Jordan
at 1:30 PM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote: I am pretty sure the way you have K1 configured it will be placed across both DC's as if you had one large ring. If you want it only in DC1 you need to say DC1:1, DC2:0. If you are writing and reading at ONE you are not guaranteed

Re: : Cassandra reads under write-only load, read degradation after massive writes

2011-11-09 Thread Jeremiah Jordan
Indexed columns cause read before write so that the index can be updated if the column already exists. On 11/09/2011 02:46 PM, Oleg Tsernetsov wrote: When monitoring JMX metrics of cassandra 0.8.7 loaded by write-only test I observe significant read activity on column family where I write to.

Re: Second Cassandra users survey

2011-11-07 Thread Jeremiah Jordan
Actually, the data will be visible at QUORUM as well if you can see it with ONE. QUORUM actually gives you a higher chance of seeing the new value than ONE does. In the case of RF=3 you have 2/3 chance of seeing the new value with QUORUM, with ONE you have 1/3... And this JIRA fixed an issue
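The 1/3 and 2/3 figures can be checked with a little counting. The sketch below is an illustration only and rests on two assumptions not stated in the message: the new value has reached exactly one of the three replicas, and the replicas consulted by a read are picked uniformly at random.

```python
from fractions import Fraction
from math import comb


def chance_of_seeing_write(rf, replicas_with_write, replicas_read):
    """Probability that a read touches at least one replica holding the new value.

    Assumes (for illustration) that the replicas read are chosen uniformly
    at random from the rf replicas.
    """
    stale = rf - replicas_with_write
    # Probability that every replica consulted is a stale one.
    miss = Fraction(comb(stale, replicas_read), comb(rf, replicas_read))
    return 1 - miss


one = chance_of_seeing_write(rf=3, replicas_with_write=1, replicas_read=1)
quorum = chance_of_seeing_write(rf=3, replicas_with_write=1, replicas_read=2)
```

Under those assumptions `one` comes out as 1/3 and `quorum` as 2/3, matching the numbers quoted above.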

Re: Second Cassandra users survey

2011-11-07 Thread Jeremiah Jordan
- Batch read/slice from multiple column families. On 11/01/2011 05:59 PM, Jonathan Ellis wrote: Hi all, Two years ago I asked for Cassandra use cases and feature requests. [1] The results [2] have been extremely useful in setting and prioritizing goals for Cassandra development. But with

Re: Cassandra 1.0.0 - Node Load Bug

2011-10-21 Thread Jeremiah Jordan
I thought this patch made it into the 1.0 release? I remember it being referenced in one of the re-rolls. On Oct 20, 2011, at 9:56 PM, Jonathan Ellis jbel...@gmail.com wrote: That looks to me like it's reporting uncompressed size as the load. Should be fixed in the 1.0 branch for 1.0.1.

Re: Massive writes when only reading from Cassandra

2011-10-21 Thread Jeremiah Jordan
I could be totally wrong here, but if you are doing a QUORUM read and there is a bad value encountered from the QUORUM, won't a repair happen? I thought read_repair_chance 0 just means it won't query extra nodes to check for bad values. -Jeremiah On Oct 17, 2011, at 4:22 PM, Jeremy Hanna

Re: nodetool ring Load column

2011-10-21 Thread Jeremiah Jordan
Are you using compressed sstables? or the leveled sstables? Make sure you include how you are configured in any JIRA you make, someone else was seeing a similar issue with compression turned on. -Jeremiah On Oct 14, 2011, at 1:13 PM, Ramesh Natarajan wrote: What does the Load column in

Re: How to speed up Waiting for schema agreement for a single node Cassandra cluster?

2011-10-04 Thread Jeremiah Jordan
But truncate is still slow, especially if it can't use JNA (windows) as it snapshots. Depending on how much data you are inserting during your unit tests, just paging through all the keys and then deleting them is the fastest way, though if you use timestamps besides now this won't work, as

Re: Very large rows VS small rows

2011-09-29 Thread Jeremiah Jordan
If A works for our use case, it is a much better option. A given row has to be read in full to return data from it, there used to be limitations that a row had to fit in memory, but there is now code to page through the data, so while that isn't a limitation any more, it means rows that don't

Re: Very large rows VS small rows

2011-09-29 Thread Jeremiah Jordan
So I need to read what I write before hitting send. Should have been, If A works for YOUR use case. and Wide rows DON'T spread across nodes well On 09/29/2011 02:34 PM, Jeremiah Jordan wrote: If A works for our use case, it is a much better option. A given row has to be read in full

Re: Thrift CPU Usage

2011-09-26 Thread Jeremiah Jordan
Yes. All the stress tool does is flood data through the API, no real processing or anything happens. So thrift reading/writing data should be the majority of the CPU time... On 09/26/2011 08:32 AM, Baskar Duraikannu wrote: Hello - I have been running read tests on Cassandra using stress

Re: [BETA RELEASE] Apache Cassandra 1.0.0-beta1 released

2011-09-15 Thread Jeremiah Jordan
Is it possible to update an existing column family with {sstable_compression: SnappyCompressor, compaction_strategy: LeveledCompactionStrategy}? Or will I have to make a new column family and migrate my data to it? -Jeremiah On 09/15/2011 01:01 PM, Sylvain Lebresne wrote: The Cassandra team

Re: Updates lost

2011-09-01 Thread Jeremiah Jordan
Are you running on windows? If the default timestamp is just using time.time()*1e6 you will get the same timestamp twice if the code is close together. time.time() on windows is only millisecond resolution. I don't use pycassa, but in the Thrift api wrapper I created for our python code I
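One way to avoid handing out the same timestamp twice on a coarse clock is to bump the value whenever the clock has not advanced. The sketch below is not the wrapper mentioned in the message and is not part of pycassa or Thrift; `MonotonicTimestamps` is a hypothetical helper written for illustration.

```python
import threading
import time


class MonotonicTimestamps:
    """Hypothetical helper: hands out strictly increasing microsecond timestamps."""

    def __init__(self, clock=time.time):
        self._clock = clock            # injectable so a coarse clock can be simulated
        self._last = 0
        self._lock = threading.Lock()  # keep concurrent callers from sharing a value

    def next(self):
        with self._lock:
            now = int(self._clock() * 1e6)
            # If the clock has not moved since the last call (for example a
            # millisecond-resolution clock), add one microsecond instead of
            # reusing the previous timestamp.
            self._last = max(now, self._last + 1)
            return self._last
```

Note that this only guards against collisions within one process; two separate clients can still produce identical timestamps.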

Solandra distributed search

2011-08-15 Thread Jeremiah Jordan
When using Solandra, do I need to use the Solr sharding syntax in my queries? I don't think I do because Cassandra is handling the sharding, not Solr, but just want to make sure. The Solandra wiki references the distributed search limitations, which talks about the shard syntax further down

Re: Cassandra in Multiple Datacenters Active - Standby configuration

2011-08-15 Thread Jeremiah Jordan
Assign the tokens like they are two separate rings, just make sure you don't have any duplicate tokens. http://wiki.apache.org/cassandra/Operations#Token_selection The two datacenters are treated as separate rings, LOCAL_QUORUM will only delay the client as long as it takes to write the data
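A rough sketch of that token assignment, assuming RandomPartitioner with its token range of 0 to 2**127. The node counts and the offset of 1 for the second datacenter are illustrative choices, and `balanced_tokens` is a hypothetical helper, not a Cassandra tool.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token range (assumed partitioner)


def balanced_tokens(node_count, offset=0):
    """Evenly spaced tokens for one datacenter, shifted by `offset`.

    The offset exists only to keep tokens unique across datacenters.
    """
    return [i * RING_SIZE // node_count + offset for i in range(node_count)]


dc1 = balanced_tokens(3)             # tokens for a 3-node DC1
dc2 = balanced_tokens(3, offset=1)   # same spacing, shifted by one for DC2
duplicates = set(dc1) & set(dc2)     # must be empty: every token is unique
```

Each datacenter is balanced on its own, and the small offset satisfies the rule that no two nodes in the cluster share a token.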

Re: thrift c++ insert Exception [Column value is required]

2011-08-14 Thread Jeremiah Jordan
You can checkout libcassandra for a C++ client built on top of thrift. It is not feature complete, but it is pretty good. https://github.com/matkor/libcassandra On Aug 14, 2011, at 3:59 AM, Konstantinos Chasapis wrote: Hi, Thank you for your answer. Is there any documentation that

Re: Restarting servers

2011-08-12 Thread Jeremiah Jordan
You need to wait for each server to be up again before restarting the next one. nodetool ring on one of the servers you aren't restarting will tell you when it is back up. You can also watch for "Starting up server gossip" in the log file to know when it is starting to join the cluster again.

RE: Write everywhere, read anywhere

2011-08-04 Thread Jeremiah Jordan
If you have RF=3 quorum won't fail with one node down. So R/W quorum will be consistent in the case of one node down. If two nodes go down at the same time, then you can get inconsistent data from quorum write/read if the write fails with TimeOut, the nodes come back up, and then read asks

Re: Cassandra 0.6.8 snapshot problem?

2011-08-02 Thread Jeremiah Jordan
Does snapshot in 0.6 cause a flush to happen first? If not there could be data in the database that won't be in the snapshot. Though that seems like a long time for data to be sitting in the commit log and not make it to the sstables. On Thu, 2011-07-28 at 17:30 -0500, Jonathan Ellis wrote:

Re: RF=1

2011-08-02 Thread Jeremiah Jordan
If you have RF=1, taking one node down is going to cause 25% of your data to be unavailable. If you want to tolerate a machine going down you need to have at least RF=2, if you want to use quorum and have a machine go down, you need at least RF=3. On Tue, 2011-08-02 at 16:22 +0200, Patrik
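The RF=3 requirement for quorum follows from simple arithmetic: a quorum is a majority of the replicas. A minimal sketch (the function names are made up for illustration):

```python
def quorum(rf):
    """Number of replicas that must respond for a QUORUM operation."""
    return rf // 2 + 1


def tolerated_failures(rf):
    """How many replicas can be down while QUORUM still succeeds."""
    return rf - quorum(rf)
```

With RF=1 or RF=2 `tolerated_failures` is 0, so a single node outage blocks quorum operations for the affected data; RF=3 is the smallest value where it becomes 1.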

Re: 8 million Cassandra data files on disk

2011-08-02 Thread Jeremiah Jordan
Connect with jconsole and run garbage collection. All of the files that have a -Compacted with the same name will get deleted the next time a full garbage collection runs, or when the node is restarted. They have already been combined into new files, the old ones just haven't been deleted yet.

Re: Nodetool ring not showing all nodes in cluster

2011-08-02 Thread Jeremiah Jordan
All of the nodes should have the same seedlist. Don't use localhost as one of the items in it if you have multiple nodes. On Tue, 2011-08-02 at 10:10 -0700, Aishwarya Venkataraman wrote: Nodetool does not show me all the nodes. Assuming I have three nodes A, B and C. The seedlist of A is

RE: custom StoragePort?

2011-07-11 Thread Jeremiah Jordan
If you are on linux see: https://github.com/pcmanus/ccm -Original Message- From: Yang [mailto:tedd...@gmail.com] Sent: Monday, July 11, 2011 3:08 PM To: user@cassandra.apache.org Subject: Re: custom StoragePort? never mind, found this..

RE: Node repair questions

2011-07-11 Thread Jeremiah Jordan
The more often you repair, the quicker it will be. The more often your nodes go down the longer it will be. Repair streams data that is missing between nodes. So the more data that is different the longer it will take. Your workload is impacted because the node has to scan the data it has to

RE: Cassandra memory problem

2011-07-07 Thread Jeremiah Jordan
We are running into the same issue on some of our machines. Still haven't tracked down what is causing it. From: William Oberman [mailto:ober...@civicscience.com] Sent: Thursday, July 07, 2011 7:19 AM To: user@cassandra.apache.org Subject: Re: Cassandra memory

RE: custom reconciling columns?

2011-06-30 Thread Jeremiah Jordan
The reason to break it up is that the information will then be on different servers, so you can have server 1 spending time retrieving row 1, while you have server 2 retrieving row 2, and server 3 retrieving row 3... So instead of getting 3000 things from one server, you get 1000 from 3 servers

RE: Cassandra ACID

2011-06-30 Thread Jeremiah Jordan
For your Consistency case, it is actually an ALL read that is needed, not an ALL write. ALL read, with whatever consistency level of write that you need (to support machines dying) is the only way to get consistent results in the face of a failed write which was at ONE that went to one node,

RE: RAID or no RAID

2011-06-29 Thread Jeremiah Jordan
With multiple data dirs you are still limited by the space free on any one drive. So if you have two data dirs with 40GB free on each, and you have 50GB to be compacted, it won't work, but if you had a raid, you would have 80GB free and could compact... -Original Message- From:
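The 40GB/50GB example can be written out as a small check. This is only a sketch of the arithmetic, assuming (as the message implies) that a compaction has to write its output onto a single volume; `can_compact` is a hypothetical helper.

```python
def can_compact(needed_gb, free_gb_per_volume):
    """True if at least one volume has room for the whole compaction output.

    Assumption for illustration: the output cannot be split across volumes.
    """
    return any(free >= needed_gb for free in free_gb_per_volume)


separate_dirs = can_compact(50, [40, 40])  # two data dirs, 40GB free on each
raid_volume = can_compact(50, [80])        # the same disks combined into one volume
```

Here `separate_dirs` is False and `raid_volume` is True, even though the total free space is 80GB in both cases.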

RE: Docs: Token Selection

2011-06-17 Thread Jeremiah Jordan
Run two Cassandra clusters... -Original Message- From: Eric tamme [mailto:eta...@gmail.com] Sent: Friday, June 17, 2011 11:31 AM To: user@cassandra.apache.org Subject: Re: Docs: Token Selection What I don't like about NTS is I would have to have more replicas than I need.  {DC1=2,

RE: Docs: Token Selection

2011-06-17 Thread Jeremiah Jordan
] Sent: Friday, June 17, 2011 1:02 PM To: user@cassandra.apache.org Subject: Re: Docs: Token Selection Hi Jeremiah, can you give more details? Thanks On 6/17/2011 10:49 AM, Jeremiah Jordan wrote: Run two Cassandra clusters... -Original Message- From: Eric tamme [mailto:eta...@gmail.com

RE: Docs: Why do deleted keys show up during range scans?

2011-06-14 Thread Jeremiah Jordan
I am pretty sure how Cassandra works will make sense to you if you think of it that way, that rows do not get deleted, columns get deleted. While you can delete a row, if I understand correctly, what happens is a tombstone is created which matches every column, so in effect it is deleting the

RE: Docs: Why do deleted keys show up during range scans?

2011-06-14 Thread Jeremiah Jordan
Also, tombstones are not attached anywhere. A tombstone is just a column with a special value which says I was deleted. And I am pretty sure they go into SSTables etc the exact same way regular columns do. -Original Message- From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com

RE: how to know there are some columns in a row

2011-06-08 Thread Jeremiah Jordan
I am pretty sure this would cut down on network traffic, but not on Disk IO or CPU use. I think Cassandra would still have to deserialize the whole column to get to the name. So if you really have a use case where you just want the name, it would be better to store a separate name with no data

RE: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread Jeremiah Jordan
Don't manually delete things. Let Cassandra do it. Force a garbage collection or restart your instance and Cassandra will delete the unused files. -Original Message- From: AJ [mailto:a...@dude.podzone.net] Sent: Tuesday, June 07, 2011 10:15 AM To: user@cassandra.apache.org Subject: Re:

RE: Reading quorum

2011-06-03 Thread Jeremiah Jordan
Only waiting for quorum responses and then resolving the one with the latest timestamp to return to the client. From: Fredrik Stigbäck [mailto:fredrik.l.stigb...@sitevision.se] Sent: Friday, June 03, 2011 9:44 AM To: user@cassandra.apache.org Subject: Reading
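The read path described above can be sketched in a few lines. This is an illustration under simplifying assumptions, not Cassandra's implementation: each replica response is modelled as a `(timestamp, value)` tuple, and `quorum_size` and `resolve` are hypothetical helper names.

```python
def quorum_size(replication_factor):
    """A quorum is a majority of the replicas."""
    return replication_factor // 2 + 1


def resolve(responses):
    """Pick the response with the latest timestamp.

    `responses` is a list of (timestamp, value) tuples, one per replica
    that answered.
    """
    return max(responses, key=lambda response: response[0])


rf = 3
needed = quorum_size(rf)

# Pretend two of the three replicas answered; one holds an older value.
answers = [(100, "old value"), (250, "new value")]
assert len(answers) >= needed  # enough responses to satisfy the quorum

timestamp, value = resolve(answers)
print(needed, timestamp, value)
```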

RE: Loading Keyspace from YAML in 0.8

2011-06-03 Thread Jeremiah Jordan
Or at least someone should write a script which will take a YAML config and turn it into a CLI script. From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Friday, June 03, 2011 12:00 PM To: user@cassandra.apache.org Subject: Re: Loading Keyspace from YAML

RE: Appending to fields

2011-06-01 Thread Jeremiah Jordan
Cassandra handles this by using a different design: you don't append anything. You use the fact that in Cassandra you have dynamic columns and you make a new column every time you want to put more data in. Then when you do finally need to read the data out you read out a slice of columns, not
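The pattern in this message, one new column per piece of data and a slice on read, can be sketched with a toy in-memory row. `ToyRow`, `put`, and `get_slice` are illustrative names and not a client API; column names are assumed to be sortable values such as timestamps.

```python
import bisect


class ToyRow:
    """A single wide row: column names kept sorted, values stored by name."""

    def __init__(self):
        self._names = []   # sorted column names
        self._values = {}  # column name -> value

    def put(self, name, value):
        # "Appending" is just writing one more column.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def get_slice(self, start, finish):
        # Read back a contiguous range of columns, both ends inclusive.
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


log = ToyRow()
log.put(1001, "first chunk")
log.put(1003, "third chunk")
log.put(1002, "second chunk")

recent = log.get_slice(1002, 1003)
print(recent)
```

Nothing is read and rewritten on write; the cost of assembling the pieces is paid only when the slice is read.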

java.lang.RuntimeException: Cannot recover SSTable with version a (current version f).

2011-05-05 Thread Jeremiah Jordan
This cluster was updated from 0.6.8 -> 0.7.4 -> 0.7.5. Do I need to run scrub or compact or something to get all the sstables updated to the new version? Jeremiah Jordan Application Developer Morningstar, Inc. Morningstar. Illuminating investing worldwide. +1 312 696

RE: Replica data distributing between racks

2011-05-03 Thread Jeremiah Jordan
So we are currently running a 10 node ring in one DC, and we are going to be adding 5 more nodes in another DC. To keep the rings in each DC balanced, should I really calculate the tokens independently and just make sure none of them are the same? Something like: DC1 (RF 5): 1: 0 2:
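One way to picture the idea in the question is below. It is a sketch only, assuming RandomPartitioner's token range of 0 to 2**127 and using a small per-DC offset to keep tokens unique; `balanced_tokens` is a hypothetical helper, and the offset of 1 is an assumption rather than something taken from the thread.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token range (assumed partitioner)


def balanced_tokens(node_count, offset=0):
    """Evenly spaced tokens for one data center, shifted by `offset`."""
    return [(i * RING_SIZE // node_count + offset) % RING_SIZE
            for i in range(node_count)]


# Each DC is balanced on its own; DC2 is nudged by 1 to avoid collisions.
dc1 = balanced_tokens(10)
dc2 = balanced_tokens(5, offset=1)

collisions = set(dc1) & set(dc2)
print(dc1[:2], dc2[:2], collisions)
```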

RE: best way to backup

2011-04-30 Thread Jeremiah Jordan
The files inside the keyspace folders are the SSTables. From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Friday, April 29, 2011 4:49 PM To: user@cassandra.apache.org Subject: Re: best way to backup William, Some info on the sstables from me

Changing replica placement strategy

2011-04-18 Thread Jeremiah Jordan
to be able to use NetworkTopologyStrategy. I am pretty sure RackUnawareStrategy and NetworkTopologyStrategy pick the same nodes to put data on if there is only one DC, so it should be ok right? Jeremiah Jordan Application Developer Morningstar, Inc. Morningstar

Link to Hudson on the download page is broken

2011-04-13 Thread Jeremiah Jordan
/lastSuccessfulBuild/artifact/cassandra/build/ To: https://builds.apache.org/hudson/job/Cassandra/lastSuccessfulBuild/artifact/cassandra/build/ The old link doesn't work anymore. -Jeremiah Jeremiah Jordan Application Developer Morningstar, Inc. Morningstar. Illuminating

RE: Abnormal memory consumption

2011-04-07 Thread Jeremiah Jordan
Connect with jconsole and watch the memory consumption graph. Click the force GC button and watch what the low point is; that is how much memory is being used for persistent stuff, the rest is garbage generated while satisfying queries. Run a query, watch how the graph spikes up when you run your

Secondary Index keeping track of column names

2011-04-06 Thread Jeremiah Jordan
Jeremiah Jordan Application Developer Morningstar, Inc. Morningstar. Illuminating investing worldwide. +1 312 696-6128 voice jeremiah.jor...@morningstar.com www.morningstar.com This e-mail contains privileged and confidential information and is intended only

Thrift version

2011-04-05 Thread Jeremiah Jordan
Anyone know if 0.7.4 will work with Thrift 0.6? Or do I have to keep Thrift 0.5 around to use it? Thanks! Jeremiah Jordan Application Developer Morningstar, Inc. Morningstar. Illuminating investing worldwide. +1 312 696-6128 voice jeremiah.jor
