new nodetool ring output and unbalanced ring?

2012-09-06 Thread William Oberman
Hi, I recently upgraded from 0.8.x to 1.1.x (through 1.0 briefly) and nodetool -ring seems to have changed from owns to effectively owns. Effectively owns seems to account for replication factor (RF). I'm ok with all of this, yet I still can't figure out what's up with my cluster. I have a

Re: new nodetool ring output and unbalanced ring?

2012-09-06 Thread William Oberman
the range being considered, not the last node that was chosen as a replica). To fix this, you'll either need to make the 1d node a 1c node, or make 42535295865117307932921825928971026432 a 1d node so that you're alternating racks within that DC. On Thu, Sep 6, 2012 at 12:54 PM, William Oberman

pig and widerows

2012-09-26 Thread William Oberman
Hi, I'm trying to figure out what's going on with my cassandra/hadoop/pig system. I created a mini copy of my main cassandra data by randomly subsampling to get ~50,000 keys. I was then writing pig scripts but also the equivalent operation using simple single threaded code to double check pig.

Re: pig and widerows

2012-09-27 Thread William Oberman
to undo all of my other hacks to get logging/printing working to confirm if those were actually the only two changes I had to make. will On Thu, Sep 27, 2012 at 1:43 PM, William Oberman ober...@civicscience.com wrote: Ok, this is painful. The first problem I found is in stock 1.1.5

Re: pig and widerows

2012-09-27 Thread William Oberman
of the integration between cassandra/pig/hadoop. will On Thu, Sep 27, 2012 at 3:26 PM, William Oberman ober...@civicscience.com wrote: The next painful lesson for me was figuring out how to get logging working for a distributed hadoop process. In my test environment, I have a single node that runs

cassandra + pig

2012-10-11 Thread William Oberman
I'm wondering how many people are using cassandra + pig out there? I recently went through the effort of validating things at a much higher level than I previously did(*), and found a few issues: https://issues.apache.org/jira/browse/CASSANDRA-4748

Re: cassandra + pig

2012-10-11 Thread William Oberman
there are some rough edges like you say. But issues that are reproducible on tickets for any problems are much appreciated and they will get addressed. On Oct 11, 2012, at 10:43 AM, William Oberman ober...@civicscience.com wrote: I'm wondering how many people are using cassandra + pig out there? I

Re: cassandra + pig

2012-10-11 Thread William Oberman
paging through them. I don't recall if we did paging in pig or mapreduce but you should be able to do that in both since pig allows you to specify the slice start. On Oct 11, 2012, at 11:28 AM, William Oberman ober...@civicscience.com wrote: If you don't mind me asking, how are you handling

Re: hadoop consistency level

2012-10-18 Thread William Oberman
A recent thread made it sound like Brisk was no longer a datastax supported thing (it's DataStax Enterprise, or DSE, now): http://www.mail-archive.com/user@cassandra.apache.org/msg24921.html In particular this response: http://www.mail-archive.com/user@cassandra.apache.org/msg25061.html On Thu,

remove DC

2012-11-12 Thread William Oberman
There is a great guide here on how to add resources: http://www.datastax.com/docs/1.1/operations/cluster_management#adding-capacity What about deleting resources? I'm thinking of removing a data center. Clearly I'd need to change strategy options, which is currently something like this:

Re: remove DC

2012-11-13 Thread William Oberman
, if you never wrote data directly to DC2, then you are correct you don't need to run repair. You should just need to update the schema, and then decommission the node. -Jeremiah On Nov 12, 2012, at 2:25 PM, William Oberman ober...@civicscience.com wrote: There is a great guide here on how

Re: AWS EMR - Cassandra

2013-01-04 Thread William Oberman
that the initial address isn't set (on the slave, the master is ok). Can you post the full error ? Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 4/01/2013, at 11:15 AM, William Oberman ober

Re: AWS EMR - Cassandra

2013-01-04 Thread William Oberman
way (of setting the 3 system variables on the pig starter box, and having the config flow into the task trackers) doesn't work! will On Fri, Jan 4, 2013 at 9:04 AM, William Oberman ober...@civicscience.com wrote: On all tasktrackers, I see: java.io.IOException: PIG_OUTPUT_INITIAL_ADDRESS

Re: AWS EMR - Cassandra

2013-01-16 Thread William Oberman
what are you using for cassandra storage? I am currently using EC2 local disks, but I am looking for an alternative. Best regards, Marcelo. 2013/1/4 William Oberman ober...@civicscience.com So I've made it work, but I don't get it yet. I have no idea why my DIY server works when I set

Re: Cassandra at Amazon AWS

2013-01-17 Thread William Oberman
I have a peer EBS disk to the ephemeral disk. Then I do nodetool snapshot -> rsync from ephemeral to EBS -> take snapshot of EBS. Syncing nodetool snapshot directly to S3 would involve fewer steps and be cheaper (EBS costs more than S3), but I do post processing on the snapshot for EMR, and it

sstable2json had random behavior

2013-01-21 Thread William Oberman
I'm running 1.1.6 from the datastax repo. I ran sstable2json and got the following error: Exception in thread main java.io.IOError: java.io.IOException: dataSize of 7020023552240793698 starting at 993981393 would be larger than file /var/lib/cassandra/data/X-Data.db length 7502161255

Re: sstable2json had random behavior

2013-01-22 Thread William Oberman
files (Compression, Filter...) On Mon, Jan 21, 2013 at 11:36 AM, William Oberman ober...@civicscience.com wrote: I'm running 1.1.6 from the datastax repo. I ran sstable2json and got the following error: Exception in thread main java.io.IOError: java.io.IOException: dataSize

odd timestamps

2013-03-05 Thread William Oberman
I happened to notice some bizarre timestamps coming out of the cassandra-cli. Example: [default@XXX] get CF['e2b753aa33b13e74e5e803d787b06000']; => (column=c35ef420-c37a-11e0-ac88-09b2f4397c6a, value=XXX, timestamp=2013042719) => (column=c3845ea0-c37a-11e0-8f6f-09b2f4397c6a, value=XXX,

how to stop out of control compactions?

2013-04-01 Thread William Oberman
I'll skip the prelude, but I worked myself into a bit of a jam. I'm recovering now, but I want to double check if I'm thinking about things correctly. Basically, I was in a state where a majority of my servers wanted to do compactions, and rather large ones. This was impacting my site

Re: how to stop out of control compactions?

2013-04-02 Thread William Oberman
column family On Mon, Apr 1, 2013 at 12:38 PM, William Oberman ober...@civicscience.com wrote: I'll skip the prelude, but I worked myself into a bit of a jam. I'm recovering now, but I want to double check if I'm thinking about things

Re: how to stop out of control compactions?

2013-04-02 Thread William Oberman
it usually means one of these things: 1) you have a corrupt table that the compaction never finishes on, so the sstable count keeps growing 2) you do not have enough hardware to handle your write load On Tue, Apr 2, 2013 at 7:50 AM, William Oberman ober...@civicscience.com wrote: Thanks Gregg Aaron

Re: StatusLogger format?

2013-04-15 Thread William Oberman
99% sure it's in bytes. On Mon, Apr 15, 2013 at 11:25 AM, William Oberman ober...@civicscience.com wrote: Mainly the: ColumnFamilyMemtable ops,data section. Is data in bytes/kb/mb/etc? Example line: StatusLogger.java (line 116) civicscience.sessions

Re: advice for EC2 deployment

2011-04-26 Thread William Oberman
a failure mode where the west is disconnected from the east. Could you start simple with 3 replicas in one AZ in us-east and 3 replicas in an AZ+Region ? Then work through some failure scenarios. Hope that helps. Aaron On 22 Apr 2011, at 03:28, William Oberman wrote: Hi, My service is not yet

practice failure recovery

2011-04-26 Thread William Oberman
In my test cluster I managed to jam up a cassandra server. I figure the easy failsafe solution is to just boot a replacement node, but I thought I'd try a minute to either figure out what I did, or try to figure out how to properly recover it before I lose my current state. The symptom = on

Re: advice for EC2 deployment

2011-04-26 Thread William Oberman
-archive.com/user@cassandra.apache.org/msg11414.html Aaron On 27 Apr 2011, at 01:18, William Oberman wrote: Thanks Aaron! Unless no one on this list uses EC2, there were a few minor troubles end of last week through the weekend which taught me a lot about obscure failure modes in various

Re: practice failure recovery

2011-04-26 Thread William Oberman
go with the nuclear option. Aaron On 27 Apr 2011, at 07:11, William Oberman wrote: In my test cluster I managed to jam up a cassandra server. I figure the easy failsafe solution is to just boot a replacement node, but I thought I'd try a minute to either figure out what I did, or try

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
While I haven't configured it for multi-region yet, Sasha is exactly right about how Amazon's DNS works (returning private vs. public IP depending on if the machine is local to the region or not). For extra fun, now that Route53 exists you can (somewhat trivially) map and dynamically maintain all

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
performance is good enough. There are lots of dials to play with http://wiki.apache.org/cassandra/MemtableThresholds Hope that helps. Aaron On 27 Apr 2011, at 09:31, William Oberman wrote: I see what you're saying. I was able to control write latency on mysql using insert vs insert delayed

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
: + ec2zone + .); ApplicationState.DC = ec2region ApplicationState.RACK = ec2zone We leverage cassandra instances in APAC, US Europe ... so it's important for us to know that we have one data center in each 'region' and multiple racks per DC ... -sasha On Wed, Apr 27, 2011 at 3:06 PM, William

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
this could be avoided if cassandra maintained hostname references and not just IP references for nodes. -sasha On Wed, Apr 27, 2011 at 2:56 PM, William Oberman ober...@civicscience.com wrote: While I haven't configured it for multi-region yet, Sasha is exactly right about how Amazon's DNS works

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
is already in the works (to route EC2 traffic to the closest region). will On Wed, Apr 27, 2011 at 9:33 AM, William Oberman ober...@civicscience.com wrote: I don't think of it as migrating an instance, it's more of a destroy/start with EC2. But, I still think it would be very useful to spin up a set

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
not a good idea to switch NetworkTopologyStrategy ... On Wed, Apr 27, 2011 at 3:29 PM, William Oberman ober...@civicscience.com wrote: Thanks Sasha. Fortunately/unfortunately I did realize the default current behavior of the Ec2Snitch, but my application isn't multi-region capable (yet), so I

best way to backup

2011-04-28 Thread William Oberman
Even with N-nodes for redundancy, I still want to have backups. I'm an amazon person, so naturally I'm thinking S3. Reading over the docs, and messing with nodetool, it looks like each new snapshot contains the previous snapshot as a subset (and I've read how cassandra uses hard links to avoid

Re: best way to backup

2011-04-28 Thread William Oberman
pointless anyways. will On Thu, Apr 28, 2011 at 3:57 PM, Sasha Dolgy sdo...@gmail.com wrote: You could take a snapshot to an EBS volume. then, take a snapshot of that via AWS. of course, this is ok.when they -arent- having outages and issues ... On Apr 28, 2011 9:54 PM, William Oberman ober

Re: best way to backup

2011-04-28 Thread William Oberman
a json file with the current files in the directory, so you can know what to restore in that event (as far as I understand). On Apr 28, 2011, at 2:53 PM, William Oberman wrote: Even with N-nodes for redundancy, I still want to have backups. I'm an amazon person, so naturally I'm thinking S3

Re: best way to backup

2011-04-29 Thread William Oberman
impacting the main data raid. But the main reason to do this is to have an 'omg we screwed up big time and deleted / corrupted data' recovery. On Apr 28, 2011, at 9:53 PM, William Oberman wrote: Even with N-nodes for redundancy, I still want to have backups. I'm an amazon person, so

Re: best way to backup

2011-04-30 Thread William Oberman
://wiki.apache.org/cassandra/ArchitectureOverview Aaron On 29 Apr 2011, at 23:43, William Oberman wrote: Dumb question, but referenced twice now: which files are the SSTables and why is backing them up incrementally a win? Or should I not bother to understand internals, and instead just roll

hadoop/pig notes

2011-06-08 Thread William Oberman
I decided to try out hadoop/pig + cassandra. I had my ups and downs to get the script I wanted to run to work. I'm sure everyone who tries will have their own experiences/problems, but mine were: -Everything I need to know was in http://hadoop.apache.org/common/docs/r0.20.2/cluster_setup.html

prep for cassandra storage from pig

2011-06-15 Thread William Oberman
I think I'm stuck on typing issues trying to store data in cassandra. To verify, cassandra wants (key, {tuples}) My pig script is fairly brief: raw = LOAD 'cassandra://test_in/test_cf' USING CassandraStorage() AS (key:chararray, columns:bag {column:tuple (name, value)}); --colums == timeUUID -

Re: prep for cassandra storage from pig

2011-06-15 Thread William Oberman
and ToCassandraBag from pygmalion - it does the work for you to get it back into a form that cassandra understands. Others may know better how to massage the data into that form using just pig, but if all else fails, you could write a udf to do that. Jeremy On Jun 15, 2011, at 1:17 PM, William Oberman

Re: prep for cassandra storage from pig

2011-06-15 Thread William Oberman
I'll do a reply all, to keep this more consistent (sorry!). Rather than staying stuck, I wrote a custom function: TupleToBagOfTuple. I'm curious if I could have avoided it with proper pig scripting though. On Wed, Jun 15, 2011 at 3:08 PM, William Oberman ober...@civicscience.com wrote: My

Re: Docs: Token Selection

2011-06-17 Thread William Oberman
I haven't done it yet, but when I researched how to make geo-diverse/failover DCs, I figured I'd have to do something like RF=6, strategy = {DC1=3, DC2=3}, and LOCAL_QUORUM for reads/writes. This gives you an ack after 2 local nodes do the read/write, but the data eventually gets distributed to

OOM (or, what settings to use on AWS large?)

2011-06-22 Thread William Oberman
I woke up this morning to all 4 of 4 of my cassandra instances reporting they were down in my cluster. I quickly started them all, and everything seems fine. I'm doing a postmortem now, but it appears they all OOM'd at roughly the same time, which was not reported in any cassandra log, but I

Re: OOM (or, what settings to use on AWS large?)

2011-06-22 Thread William Oberman
, William Oberman ober...@civicscience.com wrote: I woke up this morning to all 4 of 4 of my cassandra instances reporting they were down in my cluster. I quickly started them all, and everything seems fine. I'm doing a postmortem now, but it appears they all OOM'd at roughly the same time

Re: OOM (or, what settings to use on AWS large?)

2011-06-22 Thread William Oberman
before it was finally killed because 'apt' was fighting for resource. At least, that's as far as I got in my investigation before giving up, moving to 0.8.0 and implementing 24hr nodetool repair on each node via cronjob. So far ... no problems. On Wed, Jun 22, 2011 at 2:49 PM, William Oberman

Re: OOM (or, what settings to use on AWS large?)

2011-06-22 Thread William Oberman
is running on the boxes? On Wed, Jun 22, 2011 at 9:06 AM, William Oberman ober...@civicscience.com wrote: I was wondering/I figured that /var/log/kern indicated the OS was killing java (versus an internal OOM). The nodetool repair is interesting. My application never deletes, so I didn't

rpm from 0.7.x - 0.8?

2011-06-22 Thread William Oberman
I'm running 0.7.4 from rpm (riptano). If I do a yum upgrade, it's trying to do 0.7.6. To get 0.8.x I have to do install apache-cassandra08. But that is going to install two copies. Is there a semi-official way of properly upgrading to 0.8 via rpm? -- Will Oberman Civic Science, Inc. 3030

Re: rpm from 0.7.x - 0.8?

2011-06-22 Thread William Oberman
on the version of nodetool will On Wed, Jun 22, 2011 at 10:15 AM, William Oberman ober...@civicscience.com wrote: I'm running 0.7.4 from rpm (riptano). If I do a yum upgrade, it's trying to do 0.7.6. To get 0.8.x I have to do install apache-cassandra08. But that is going to install two

Re: rpm from 0.7.x - 0.8?

2011-06-22 Thread William Oberman
immediately. You should turn this on when you start adding new nodes to a cluster that already has data on it. I'm not adding new nodes, but the cluster does have data on it... will On Wed, Jun 22, 2011 at 11:39 AM, William Oberman ober...@civicscience.com wrote: I just did a remove then install

Re: rpm from 0.7.x - 0.8?

2011-06-22 Thread William Oberman
wrote: Doesn't matter. auto_bootstrap only applies to first start ever. On Wed, Jun 22, 2011 at 10:48 AM, William Oberman ober...@civicscience.com wrote: I have a question about auto_bootstrap. When I originally brought up the cluster, I did: -seed with auto_boot = false -1,2,3

Re: Backup/Restore: Coordinating Cassandra Nodetool Snapshots with Amazon EBS Snapshots?

2011-06-23 Thread William Oberman
I've been doing EBS snapshots for mysql for some time now, and was using a similar pattern as Josep (XFS with freeze, snap, unfreeze), with the extra complication that I was actually using 8 EBS's in RAID-0 (and the extra extra complication that I had to lock the MyISAM tables... glad to be moving

hadoop results

2011-06-29 Thread William Oberman
I'll start with my question: given a CF with comparator TimeUUIDType, what is the most efficient way to get the greatest column's value? Context: I've been running cassandra for a couple of months now, so obviously it's time to start layering more on top :-) In my test environment, I managed to

Re: hadoop results

2011-06-30 Thread William Oberman
that is the current metric to use. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 30 Jun 2011, at 06:35, William Oberman wrote: I'll start with my question: given a CF with comparator TimeUUIDType, what is the most

Re: Strong Consistency with ONE read/writes

2011-07-02 Thread William Oberman
Ok, I see the "you happen to choose the 'right' node" idea, but it sounds like you want to solve C* problems in the client, and they already wrote that complicated code to make clients simple. You're talking about reimplementing key-node mappings, network topology (with failures), etc... Plus, if

Re: Strong Consistency with ONE read/writes

2011-07-03 Thread William Oberman
Was just going off of: "Send the value to the primary replica and send placeholder values to the other replicas." Sounded like you wanted to write the value to one, and write the placeholder to N-1 to me. But, C* will propagate the value to N-1 eventually anyways, 'cause that's just what it does

Re: Strong Consistency with ONE read/writes

2011-07-03 Thread William Oberman
you are hacking (or at least looking) at the source, so all the power to you if/when you try these kind of changes. will On Sun, Jul 3, 2011 at 8:45 PM, AJ a...@dude.podzone.net wrote: On 7/3/2011 6:32 PM, William Oberman wrote: Was just going off of: Send the value to the primary

cassandra/hadoop/pig

2011-07-06 Thread William Oberman
I have a few cassandra/hadoop/pig questions. I currently have things set up in a test environment, and for the most part everything works. But, before I start to roll things out to production, I wanted to check on/confirm some things. When I originally set things up, I used:

Re: cassandra/hadoop/pig

2011-07-06 Thread William Oberman
, Jul 6, 2011 at 2:48 PM, William Oberman ober...@civicscience.com wrote: I have a few cassandra/hadoop/pig questions. I currently have things set up in a test environment, and for the most part everything works. But, before I start to roll things out to production, I wanted to check on/confirm

Re: Cassandra memory problem

2011-07-07 Thread William Oberman
I think I had (and have) a similar problem: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/OOM-or-what-settings-to-use-on-AWS-large-td6504060.html My memory usage grew slowly until I ran out of mem and the OS killed my process (due to no swap). I'm still on 0.7.4, but I'm

Re: What does a write lock ?

2011-07-08 Thread William Oberman
wrong yet again. For me, if I need two pieces of data to be consistently related to each other and stored in cassandra, I encode them (usually JSON) and store them in one column. will On Fri, Jul 8, 2011 at 8:30 AM, William Oberman ober...@civicscience.com wrote: Questions like this seem to come

Re: What does a write lock ?

2011-07-08 Thread William Oberman
in memory, lock the mutex for A's key as a precursor to (1) and release it in a post-update function. But I am always very nervous about inserting locking into a process that wasn't designed with it already in mind... On Fri, Jul 8, 2011 at 8:30 AM, William Oberman ober

Re: What does a write lock ?

2011-07-08 Thread William Oberman
. will On Fri, Jul 8, 2011 at 10:35 AM, William Oberman ober...@civicscience.com wrote: I think you need to look into Zookeeper, or other distributed coordinator, as you have little/no guarantees from cassandra between 1-3 (in terms of the guarantees you want and need). And my terminology in my

Re: What does a write lock ?

2011-07-08 Thread William Oberman
I use a language specific wrapper around thrift as my client, but yes, I guess I fundamentally mean thrift == client, and the cassandra server == server. will On Fri, Jul 8, 2011 at 11:08 AM, Jeffrey Kesselman jef...@gmail.com wrote: I am confused by what you mean by Cassandra client code. Is

Re: What does a write lock ?

2011-07-08 Thread William Oberman
, William Oberman ober...@civicscience.com wrote: I use a language specific wrapper around thrift as my client, but yes, I guess I fundamentally mean thrift == client, and the cassandra server == server. will On Fri, Jul 8, 2011 at 11:08 AM, Jeffrey Kesselman jef...@gmail.com wrote: I am

Re: Survey: Cassandra/JVM Resident Set Size increase

2011-07-14 Thread William Oberman
I finally upgraded from 0.7.4 to 0.8.0 (using riptano packages) 2 days ago. Before, my resident memory (for the java process) would slowly grow without bound and the OS would kill the process. But, over the last 2 days, I _think_ it's been stable. I'll let you know in a week :-) My other stats:

how to migrate?

2011-08-24 Thread William Oberman
I was hoping to transition my simple cassandra cluster (where each node is a cassandra + hadoop tasktracker) to a cluster with two virtual datacenters (vanilla cassandra vs. cassandra + hadoop tasktracker), based on this: http://wiki.apache.org/cassandra/HadoopSupport#ClusterConfig The problem

Re: how to migrate?

2011-08-25 Thread William Oberman
create keyspace civicscience with replication_factor=3 and strategy_options = [{us-east:3}] and placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy'; FYI the replication_factor property with the NTS is incorrect, the next(?) revision of 0.8 will raise an error on

cassandra 0.8.4 + pig (using cloudera rpms)

2011-09-04 Thread William Oberman
I've had some troubles, so I thought I'd pass on my various bug fixes: -Cass 0.8.4 has troubles with pig/hadoop (you get NPE's when trying to connect to cassandra in the pig logs). You need this patch: http://svn.apache.org/viewvc?revision=1158940&view=revision And maybe this:

Re: cassandra 0.8.4 + pig (using cloudera rpms)

2011-09-05 Thread William Oberman
AM, William Oberman wrote: I've had some troubles, so I thought I'd pass on my various bug fixes: -Cass 0.8.4 has troubles with pig/hadoop (you get NPE's when trying to connect to cassandra in the pig logs). You need this patch: http://svn.apache.org/viewvc?revision=1158940&view=revision

Re: Professional Support

2011-09-06 Thread William Oberman
I also have used datastax with great success (same disclaimer). A specific example: -I setup a one-on-one call to talk through an issue, in my case a server reconfiguration. It took 2 days to find a time to meet, though that was my fault as I believe they could have worked me in within a day. I

Re: Aamzon EC2 Cassandra to ebs or not..

2011-03-09 Thread William Oberman
I'm considering similar issues right now. The problem with ephemeral storage is I don't know an easy way to back it up, while on an EBS it's a simple snapshot API call. Otherwise, I believe the performance of the ephemeral (certainly in the case of large or greater, where you can RAID0 multiple

Re: Aamzon EC2 Cassandra to ebs or not..

2011-03-09 Thread William Oberman
i'm getting at. why would you want to back it up if the cluster is working properly? backup is silly ; ) On Wed, Mar 9, 2011 at 4:54 PM, William Oberman ober...@civicscience.com wrote: I'm considering similar issues right now. The problem with ephemeral storage is I don't know an easy

Re: Aamzon EC2 Cassandra to ebs or not..

2011-03-09 Thread William Oberman
point the snapshot to an external filesystem? will On Wed, Mar 9, 2011 at 11:31 AM, Sasha Dolgy sdo...@gmail.com wrote: Could you not nodetool snapshot the data into a mounted ebs/s3 bucket and satisfy your development requirement? -sd On Wed, Mar 9, 2011 at 5:23 PM, William Oberman ober

Re: Aamzon EC2 Cassandra to ebs or not..

2011-03-09 Thread William Oberman
to developers This is what I had in my head -sd On Wed, Mar 9, 2011 at 5:39 PM, William Oberman ober...@civicscience.com wrote: I thought nodetool snapshot writes the snapshot locally, requiring 2x of expensive storage allocation 24x7 (vs. cheap storage allocation of a ebs snapshot

Re: Aamzon EC2 Cassandra to ebs or not..

2011-03-09 Thread William Oberman
/ In this directory i have: 1299689801925-09032011 -sd On Wed, Mar 9, 2011 at 5:54 PM, William Oberman ober...@civicscience.com wrote: I haven't done backups yet, so I don't know where the data is written. Is it where the nodetool is run from? Or local to the instance running cassandra

who to contact?

2011-03-30 Thread William Oberman
I think I found a bug in the cassandra PHP client. I'm using phpcassa, but the bug is in thrift itself, which I think that library phpcassa just wraps. In any case, I was trying to test on my local machine, which has limited RAM, so I reduced the JVM heap size. Of course I immediately had an

Re: who to contact?

2011-03-30 Thread William Oberman
Nevermind, the header of the file says it's an apache project, so I'll contact them. Though, if anyone else is running PHP and is worried about dropped connections thrashing their server, apply this patch :-) On Wed, Mar 30, 2011 at 3:18 PM, William Oberman ober...@civicscience.com wrote: I

Re: who to contact?

2011-03-30 Thread William Oberman
with phpcassa 0.7.a.3, or the phpcassa master branch? On Wed, Mar 30, 2011 at 2:26 PM, William Oberman ober...@civicscience.com wrote: Nevermind, the header of the file says it's an apache project, so I'll contact them. Though, if anyone else is running PHP and is worried about dropped connections

Re: who to contact?

2011-03-30 Thread William Oberman
master branch. On Wed, Mar 30, 2011 at 2:28 PM, Tyler Hobbs ty...@datastax.com wrote: Are you looking at Thrift trunk, the thrift package that ships with phpcassa 0.7.a.3, or the phpcassa master branch? On Wed, Mar 30, 2011 at 2:26 PM, William Oberman ober...@civicscience.com wrote

Ec2Snitch + NetworkTopologyStrategy if only in one region?

2011-04-12 Thread William Oberman
Hi, I'm getting closer to committing to cassandra, and now I'm in system/IT issues and questions. I'm in the amazon EC2 cloud. I previously used this forum to discover the best practice for disk layouts (large instance + the two ephemeral disks in RAID0 for data + root volume for everything

Re: Ec2Snitch + NetworkTopologyStrategy if only in one region?

2011-04-12 Thread William Oberman
NTS, than first migrating to NTS (changing strategy is painful). I can't think of any downsides to using NTS in a single-DC environment, so that's the safe option. On Tue, Apr 12, 2011 at 1:15 PM, William Oberman ober...@civicscience.com wrote: Hi, I'm getting closer to committing

Re: Ec2Snitch + NetworkTopologyStrategy if only in one region?

2011-04-13 Thread William Oberman
were already using NTS, than first migrating to NTS (changing strategy is painful). I can't think of any downsides to using NTS in a single-DC environment, so that's the safe option. On Tue, Apr 12, 2011 at 1:15 PM, William Oberman ober...@civicscience.com wrote: Hi, I'm getting closer

Re: Cassandra 0.7.4 and LOCAL_QUORUM Consistency level

2011-04-19 Thread William Oberman
I had a similar error today when I tried using LOCAL_QUORUM without having a properly configured NetworkTopologyStrategy. QUORUM worked fine however. will On Tue, Apr 19, 2011 at 8:52 PM, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: Earlier I've posted the same message to a hector-users list.

Re: Cassandra 0.7.4 and LOCAL_QUORUM Consistency level

2011-04-19 Thread William Oberman
NetworkTopologyStrategy. Are you saying that it works after configuring it? On Tue, Apr 19, 2011 at 6:04 PM, William Oberman ober...@civicscience.com wrote: I had a similar error today when I tried using LOCAL_QUORUM without having a properly configured NetworkTopologyStrategy. QUORUM worked fine however

Re: Ec2Snitch + NetworkTopologyStrategy if only in one region?

2011-04-20 Thread William Oberman
in EC2 deployment. I felt silly afterwards, but I couldn't find official docs on the structure of strategy_options anywhere. will On Wed, Apr 13, 2011 at 5:14 PM, William Oberman ober...@civicscience.com wrote: One last coda, for other noobs to cassandra like me. If you use

system_* consistency level?

2011-04-20 Thread William Oberman
Hi, My unit tests started failing once I upgraded from a single node cassandra cluster to a full N node cluster (I'm starting with 4). I had a few various bugs, mostly due to forgetting to read/write at a quorum level in places I needed stronger consistency guarantees. But, I kept getting

Re: system_* consistency level?

2011-04-20 Thread William Oberman
That was the trick. Thanks! On Apr 20, 2011, at 6:05 PM, Jonathan Ellis jbel...@gmail.com wrote: See the comments for describe_schema_versions. On Wed, Apr 20, 2011 at 4:59 PM, William Oberman ober...@civicscience.com wrote: Hi, My unit tests started failing once I upgraded from a single

normal thread counts?

2013-04-29 Thread William Oberman
Hi, I'm having some issues. I keep getting: ERROR [GossipStage:1] 2013-04-28 07:48:48,876 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[GossipStage:1,5,main] java.lang.OutOfMemoryError: unable to create new native thread -- after a day or two of

Re: normal thread counts?

2013-04-30 Thread William Oberman
with that first. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 30/04/2013, at 3:07 AM, William Oberman ober...@civicscience.com wrote: Hi, I'm having some issues. I keep getting: ERROR [GossipStage

Re: normal thread counts?

2013-05-01 Thread William Oberman
- Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 1/05/2013, at 2:18 AM, William Oberman ober...@civicscience.com wrote: I use phpcassa. I did a thread dump. 99% of the threads look very similar (I'm using 1.1.9 in terms of matching

Re: normal thread counts?

2013-05-01 Thread William Oberman
. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 1/05/2013, at 2:18 AM, William Oberman ober...@civicscience.com wrote: I use phpcassa. I did a thread dump. 99% of the threads look very similar (I'm

1.1.9 - 1.1.11 rpm upgrade issue

2013-05-03 Thread William Oberman
I get this: Running rpm_check_debug ERROR with rpm_check_debug vs depsolve: apache-cassandra11 conflicts with apache-cassandra11-1.1.11-1.noarch I'm using Centos. Problem with my OS, or problem with the package? (And how can it conflict with itself??) will

cqlsh + existing cf's + query

2013-07-03 Thread William Oberman
I've been running cassandra a while, and have used the PHP api and cassandra-cli, but never gave cqlsh a shot. I'm not quite getting it. My most simple CF is a dumping ground for testing things created as: create column family stats; I was putting random stats I was computing in it. All keys,

dependencies for cassandra's pig integration?

2013-07-31 Thread William Oberman
I'm using AWS's EMR (hadoop as a service), and one step copies some data from EMR -> my cassandra cluster. I used to patch EMR with pig 0.11, but now AWS officially supports 0.11, so I thought I'd give it a try. I was having issues. The AWS forum on it is here:

Re: in AWS is it worth trying to talk to a server in the same zone as your client?

2014-02-12 Thread William Oberman
Same region, cross zone transfer is $0.01 / GB (see http://aws.amazon.com/ec2/pricing/, Data Transfer section). On Wed, Feb 12, 2014 at 3:04 PM, Russell Bradberry rbradbe...@gmail.com wrote: Cross zone data transfer does not cost any extra money. LOCAL_QUORUM = QUORUM if all 6 servers are

Re: using hadoop + cassandra for CF mutations (delete)

2014-04-08 Thread William Oberman
, William Oberman ober...@civicscience.com wrote: Hi, I have some history with cassandra + hadoop: 1.) Single DC + integrated hadoop = Was ok until I needed steady performance (the single DC was used in a production environment) 2.) Two DC's + integrated hadoop on 1 of 2 DCs = Was ok until

clearing tombstones?

2014-04-11 Thread William Oberman
I'm wondering what will clear tombstoned rows? nodetool cleanup, nodetool repair, or time (as in just wait)? I had a CF that was more or less storing session information. After some time, we decided that one piece of this information was pointless to track (and was 90%+ of the columns, and in

Re: clearing tombstones?

2014-04-11 Thread William Oberman
of it; for me it never worked so I run nodetool compaction on every node; that does it. 2014-04-11 16:05 GMT+02:00 William Oberman ober...@civicscience.com: I'm wondering what will clear tombstoned rows? nodetool cleanup, nodetool repair, or time (as in just wait)? I had a CF that was more

Re: clearing tombstones?

2014-04-11 Thread William Oberman
for consistency to be achieved prior to deletion. If you are operationally confident that you can achieve consistency via anti-entropy repairs within a shorter period you can always reduce that 10 day interval. Mark On Fri, Apr 11, 2014 at 3:16 PM, William Oberman ober...@civicscience.com wrote

Re: clearing tombstones?

2014-04-11 Thread William Oberman
to the impact to minor compactions). I'm hesitant to write the offending sentence again :-) On Fri, Apr 11, 2014 at 10:44 AM, William Oberman ober...@civicscience.com wrote: So, if I was impatient and just wanted to make this happen now, I could: 1.) Change GCGraceSeconds of the CF to 0 2.) run
