Re: SurgeCon 2012

2012-09-06 Thread Dan Kuebrich
I'm down--we had a good mini-meetup last year at lunch. How about trying to get something together on Wed or Thurs night? On Wed, Sep 5, 2012 at 5:46 PM, Chris Burroughs chris.burrou...@gmail.com wrote: Surge [1] is scalability focused conference in late September hosted in Baltimore. It's a

Re: Python CQL Batching is slower than single statements

2012-01-25 Thread Dan Kuebrich
Not that familiar with CQL in particular, but what timeout is set in pycassa? It could be too low for your batch size. If your request is timing out, it will do exponential back off between retries. On Jan 25, 2012 2:53 AM, aaron morton aa...@thelastpickle.com wrote: There are few slight
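To make the retry behaviour mentioned above concrete, here is a minimal sketch of exponential backoff around a call that can time out. It is not pycassa's actual code: `call_with_backoff`, `max_retries`, and `base_delay` are hypothetical names chosen for illustration, and the real client may use different delays and exception types.

```python
import time


def call_with_backoff(operation, max_retries=5, base_delay=0.01, sleep=time.sleep):
    """Run `operation`, retrying on TimeoutError with a doubling delay.

    Returns (result, delays) so the caller can see how long was spent waiting.
    All names and defaults here are illustrative assumptions.
    """
    delays = []
    for attempt in range(max_retries + 1):
        try:
            return operation(), delays
        except TimeoutError:
            if attempt == max_retries:
                raise  # out of retries, surface the timeout to the caller
            delay = base_delay * (2 ** attempt)  # 1x, 2x, 4x, ... the base delay
            delays.append(delay)
            sleep(delay)
```

If a large batch keeps hitting the timeout, the waits add up quickly, which could make a batch look slower than single statements even though the server is doing the same work.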

Re: Surgecon Meetup?

2011-09-26 Thread Dan Kuebrich
I'll be at Surge on Thursday, would love to meet up. Anyone else planning to be there? On Sun, Sep 25, 2011 at 7:27 PM, Chris Burroughs chris.burrou...@gmail.com wrote: Surge [1] is scalability focused conference in late September hosted in Baltimore. It's a pretty cool conference with a good

Re: dropping secondary indexes

2011-08-18 Thread Dan Kuebrich
@aaronmorton http://www.thelastpickle.com On 18/08/2011, at 3:16 AM, Dan Kuebrich wrote: Thanks, Aaron! In terms of dropping stuff from the CLI, I tried to re-drop the remaining built column index and get the following error message. I wonder if there's some sort of parser bug related to numeric

Re: dropping secondary indexes

2011-08-17 Thread Dan Kuebrich
Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 17/08/2011, at 2:12 AM, Dan Kuebrich wrote: I think I've dropped all the indexes on a CF, but I see traces of them in the CLI output of show keyspaces. I see a few validators left behind, and one built index. (output below

dropping secondary indexes

2011-08-16 Thread Dan Kuebrich
I think I've dropped all the indexes on a CF, but I see traces of them in the CLI output of show keyspaces. I see a few validators left behind, and one built index. (output below) 1. Is there a better way to check schema for indexes? 2. I can't drop the built one so I assume they're all gone?

strange json2sstable cast exception

2011-08-06 Thread Dan Kuebrich
Having run into a recurring compaction problem due to a corrupt sstable (perceived row size was 13 petabytes or something), I sstable2json -x 'd the key and am now trying to re-import the sstable without it. However, I'm running into the following exception: Importing 2882 keys...

Re: strange json2sstable cast exception

2011-08-06 Thread Dan Kuebrich
with expiring columns. On Sat, Aug 6, 2011 at 9:29 AM, Dan Kuebrich dan.kuebr...@gmail.com wrote: Having run into a recurring compaction problem due to a corrupt sstable (perceived row size was 13 petabytes or something), I sstable2json -x 'd the key and am now trying to re-import the sstable

Re: 8.0.1 Released - Debian Package ETA?

2011-06-28 Thread Dan Kuebrich
0.8.1 should be up--I've already installed it. Here's directions: http://wiki.apache.org/cassandra/DebianPackaging On Tue, Jun 28, 2011 at 8:24 PM, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: Hi, First of all, thank you for releasing v8.0.1 and congrats! the list of fixes and improvements is

Re: 8.0.1 Released - Debian Package ETA?

2011-06-28 Thread Dan Kuebrich
Try running apt-get update (as opposed to upgrade) to pull down the latest listings from the repo. On Tue, Jun 28, 2011 at 8:40 PM, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: Thank you Dan! But I only see 0.8.0 there :( On Tue, Jun 28, 2011 at 5:35 PM, Dan Kuebrich dan.kuebr

Re: RAID or no RAID

2011-06-27 Thread Dan Kuebrich
Not sure what the intended purpose is, but we've mostly used it as an emergency disk-capacity-increase option. It's not as good as raid because each disk size is counted individually (a compacted sstable can only be on one disk) so compaction size limits aren't expanded as one might expect. On

Re: solandra or pig or....?

2011-06-21 Thread Dan Kuebrich
Solandra is indeed distributed search, not distributed number-crunching. As a previous poster said, you could imagine structuring the data in a series of documents with fields containing playername, teamname, position, location, day, time, inning, at bat, outcome, etc. Then you could query to

Re: Cassandra Statistics and Metrics

2011-06-14 Thread Dan Kuebrich
Here's what people usually monitor from munin (and how they get at it): https://github.com/jbellis/cassandra-munin-plugins . Sounds a lot like what these guys are doing (even the stack?): http://datadoghq.com/ On Tue, Jun 14, 2011 at 10:13 AM, Viktor Jevdokimov vjevdoki...@gmail.com wrote:

Re: problem in using get_range() function

2011-06-13 Thread Dan Kuebrich
Are you using the order preserving partitioner or the random partitioner for this CF? In order to get the results you expect, you'll need to use the OPP. More info: http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/ On Mon, Jun 13, 2011 at 8:47 AM,
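A rough illustration of why the partitioner matters for range scans: with the random partitioner, rows are placed on the ring by a hash of the key, so walking the ring does not visit keys in their natural sort order. The sketch below is a simplification, assuming an MD5-derived integer token; `md5_token` and `ring_order` are hypothetical helper names, not part of Cassandra or pycassa.

```python
import hashlib


def md5_token(key: str) -> int:
    """Simplified stand-in for a hash-based token: the MD5 digest as an integer."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def ring_order(keys, order_preserving):
    """Return keys in the order a range scan would walk them on the ring."""
    if order_preserving:
        return sorted(keys)               # OPP: ring order is key order
    return sorted(keys, key=md5_token)    # random partitioner: ring order is token order


keys = ["user1", "user2", "user3", "user4"]
print(ring_order(keys, order_preserving=True))
print(ring_order(keys, order_preserving=False))
```

Both calls return the same set of keys; only the order-preserving variant guarantees they come back in key order, which is what a get_range() over a key interval usually expects.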

Re: CLI set command returns null

2011-06-07 Thread Dan Kuebrich
Null response may mean an error on the server side. Have you checked your cassandra server's logs? On Tue, Jun 7, 2011 at 2:22 PM, AJ a...@dude.podzone.net wrote: Ver 0.8.0. Please help. I don't know what I'm doing wrong. One simple keyspace with one simple CF with one simple column.

Re: how to know there are some columns in a row

2011-06-07 Thread Dan Kuebrich
There might not be a built-in way to do this, but if you make two rows for each author, e.g.:
nabokov_fulltext [ 'lolita' : 'Lolita, light of my life ...' , ...]
nabokov_bookindex [ 'lolita' : None , ... ]
you could query the bookindex for each author without cassandra having to load the full
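The two-row layout suggested above can be sketched with plain dictionaries standing in for a column family. This is only a model of the idea, not client code: `add_book` and `list_titles` are hypothetical helpers, and the `_fulltext` / `_bookindex` suffixes follow the row names used in the message.

```python
def add_book(store, author, title, text):
    """Write the full text to one row and an empty marker column to the index row."""
    store.setdefault(f"{author}_fulltext", {})[title] = text
    store.setdefault(f"{author}_bookindex", {})[title] = None  # column name only, no value


def list_titles(store, author):
    """Read only the small index row; the large full-text row is never touched."""
    return sorted(store.get(f"{author}_bookindex", {}))


store = {}
add_book(store, "nabokov", "lolita", "Lolita, light of my life ...")
add_book(store, "nabokov", "pnin", "...")
print(list_titles(store, "nabokov"))
```

The cost is a second write per book; the benefit is that listing an author's titles reads a row whose size depends on the number of books, not on the length of their text.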

Re: Appending to fields

2011-05-31 Thread Dan Kuebrich
On Tue, May 31, 2011 at 4:57 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: As Jonathan stated I believe that the insert is in O(N + M), unless there are some operations that I don't know. There are other NoSQL database that can be used with Cassandra as buffers for quick access and

Re: Priority queue in a single row - performance falls over time

2011-05-25 Thread Dan Kuebrich
It sounds like the problem is that the row is getting filled up with tombstones and becoming enormous? Another idea then, which might not be worth the added complexity, is to progressively use new rows. Depending on volume, this could mean having 5-minute-window rows, or 1 minute, or whatever
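One way to picture the "progressively use new rows" idea is to derive the row key from the time window an item falls into, so old rows (and their tombstones) stop being read once their window has passed. The sketch below assumes Unix timestamps in seconds; `window_row_key`, `window_row_keys`, and the `queue:window_start` key format are hypothetical choices for illustration.

```python
def window_row_key(queue_name, timestamp, window_seconds=300):
    """Row key for the fixed-size time window containing `timestamp`."""
    window_start = int(timestamp) - int(timestamp) % window_seconds
    return f"{queue_name}:{window_start}"


def window_row_keys(queue_name, start, end, window_seconds=300):
    """All row keys a consumer would need to read to cover [start, end]."""
    first = int(start) - int(start) % window_seconds
    return [
        f"{queue_name}:{t}"
        for t in range(first, int(end) + 1, window_seconds)
    ]
```

Picking the window size is the trade-off mentioned in the message: smaller windows keep each row short, but a consumer catching up after a pause has more rows to visit.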

Re: Upgrade to a different version?

2011-03-17 Thread Dan Kuebrich
Do people have success stories with 0.7.4? It seems like the list only hears if there's a major problem with a release, which means that if you're trying to judge the stability of a release you're looking for silence. But maybe that means not many people have tried it yet. Is there a record of

Re: null vs value not found?

2011-02-24 Thread Dan Kuebrich
When I've gotten null as a result in cassandra-cli, it turned out to mean that there were exceptions being thrown on the server side. Have you checked your Cassandra logs? On Thu, Feb 24, 2011 at 3:44 PM, buddhasystem potek...@bnl.gov wrote: Thanks Tyler, ColumnFamily: index1

Re: null vs value not found?

2011-02-24 Thread Dan Kuebrich
I should mention that it took me a while to figure this out too. Might be a candidate for an improvement in the cli? On Thu, Feb 24, 2011 at 4:01 PM, buddhasystem potek...@bnl.gov wrote: Thanks! You are right. I see exception but have no idea what went wrong. ERROR [ReadStage:14]

Re: read latency in cassandra

2011-02-20 Thread Dan Kuebrich
for being so verbose! dan Sorry for all the questions, the answer to your initial question is mmm, that does not sound right. It will depend on Aaron On 5 Feb 2011, at 08:13, Dan Kuebrich wrote: Hi all, It often takes more than two seconds to load: - one row of ~450 events comprising

Re: RandomPartitioner

2011-02-14 Thread Dan Kuebrich
You may find this part of the wiki helpful: http://wiki.apache.org/cassandra/Operations#Range_changes If you explicitly specify an InitialToken in the configuration, the new node will bootstrap to that position on the ring. Otherwise, it will pick a Token that will give it half the keys from the
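For the case where InitialToken is set explicitly, a common approach with the RandomPartitioner is to space tokens evenly over its 0 to 2**127 range. The helper below is a small sketch of that arithmetic; `balanced_tokens` is a hypothetical name, and it assumes every node should own an equal share of the ring.

```python
RING_SIZE = 2 ** 127  # token space of the RandomPartitioner


def balanced_tokens(node_count):
    """Evenly spaced initial tokens for `node_count` nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


for index, token in enumerate(balanced_tokens(4)):
    print(f"node {index}: InitialToken = {token}")
```

Note that adding a node to an existing ring changes the ideal spacing for every node, so the existing nodes would need their tokens moved to stay balanced.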

read latency in cassandra

2011-02-04 Thread Dan Kuebrich
Hi all, It often takes more than two seconds to load:
- one row of ~450 events comprising ~600k
- cluster size of 1
- client is pycassa 1.04
- timeout on recv
- cold read (I believe)
- load generally 0.5 on a 4-core machine, 2 EC2 instance store drives for cassandra
- cpu wait generally 1%

Re: Using Cassandra to store files

2011-02-03 Thread Dan Kuebrich
CouchDB That's not what document-oriented means! (har har) I don't know all the details of your case, but with serving static files I suspect you could do ok with something that has a much smaller memory/cpu footprint as you won't have as great of write throughput / read latency concerns.

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Dan Kuebrich
We've done hundreds of gigs in and out of cassandra 0.6.8 with pycassa 0.3. Working on upgrading to 0.7 and pycassa 1.03. I don't know if we're using it wrong, but the constraint that the connection object is tied to a particular keyspace isn't that awesome--we have a number of keyspaces used

Re: Cassandra Monitoring

2010-12-17 Thread Dan Kuebrich
Is anyone using cassandra with monit? All I have is this embarrassing bit of monit config:
check process cassandra with pidfile /var/run/cassandra.pid
  start program = "/etc/init.d/cassandra start" with timeout 60 seconds
  stop program = "/etc/init.d/cassandra stop"
  if failed port 9160 type tcp