unsubscribe

2012-05-30 Thread Maxim Potekhin

row cache -- does it have data from other nodes?

2012-05-17 Thread Maxim Potekhin
Hello, when I chose to have a rowcache -- will it contain data that is owned by other nodes? Thanks Maxim

Re: Server Side Logic/Script - Triggers / StoreProc

2012-04-29 Thread Maxim Potekhin
About a year ago I started getting a strange feeling that the noSQL community is busy re-creating RDBMS in minute detail. Why did we bother in the first place? Maxim On 4/27/2012 6:49 PM, Data Craftsman wrote: Howdy, Some Polyglot Persistence(NoSQL) products started support server side

Re: Cassandra search performance

2012-04-29 Thread Maxim Potekhin
Jason, I'm using plenty of secondary indexes with no problem at all. Looking at your example, as I think you understand, you forgo indexes by combining two conditions in one query, thinking along the lines of what is often done in RDBMS. A scan is expected in this case, and there is no magic to

Re: RMI/JMX errors, weird

2012-04-24 Thread Maxim Potekhin
disablegossip and disablethrift, and then turn off the IO limiter with nodetool setcompactionthroughput 0. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 20/04/2012, at 12:29 AM, Maxim Potekhin wrote: Hello Aaron, how should I go about

Re: RMI/JMX errors, weird

2012-04-18 Thread Maxim Potekhin
/2012 10:03 PM, aaron morton wrote: Look at the server side logs for errors. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 13/04/2012, at 11:47 AM, Maxim Potekhin wrote: Hello, I'm doing compactions under 0.8.8. Recently, I started

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
http://www.thelastpickle.com On 17/04/2012, at 1:06 PM, Maxim Potekhin wrote: I noticed that nodetool compactionstats shows the building of the secondary index while I initiate compaction. Is this to be expected? Cassandra version 0.8.8. Thank you Maxim

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
manager to rebuild. On Tue, Apr 17, 2012 at 9:47 AM, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: Thanks Aaron. Just to be clear, every time I do a compaction, I rebuild all indexes from scratch. Right? Maxim On 4/17/2012 6:16 AM, aaron morton wrote: Yes

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
the secondary indexes are themselves column families they too are compacted along with everything else. On Tue, Apr 17, 2012 at 10:02 AM, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: Thanks Jake. Then I am definitely seeing weirdness, as there are tons of pending

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
On Tue, Apr 17, 2012 at 10:09 AM, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: I understand that indexes are CFs. But the compaction stats says it's building the index, not compacting the corresponding CF. Either that's an ambiguous diagnostic, or indeed

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
The offending CF only has one. The other one, that seems to behave well, has nine. Maxim On 4/17/2012 10:20 AM, Jake Luciani wrote: How many indexes are there? On Tue, Apr 17, 2012 at 10:16 AM, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: Yes. Sorry I didn't mention

Is the secondary index re-built under compaction?

2012-04-16 Thread Maxim Potekhin
I noticed that nodetool compactionstats shows the building of the secondary index while I initiate compaction. Is this to be expected? Cassandra version 0.8.8. Thank you Maxim

RMI/JMX errors, weird

2012-04-12 Thread Maxim Potekhin
Hello, I'm doing compactions under 0.8.8. Recently, I started seeing a stack trace like the one below, and I can't figure out what causes this to appear. The cluster has been in operation for more than half a year w/o errors like this one. Any help will be appreciated, Thanks Maxim WARNING:

a very simple indexing question (strange thing seen in CLI)

2012-04-07 Thread Maxim Potekhin
Greetings, Cassandra 0.8.8 is used. I'm trying to create an additional CF which is trivial in all respects. Just ascii columns and a few indexes. This is how I add an index: update column family files with column_metadata = [{column_name : '1', validation_class : AsciiType, index_type : 0,

Re: import

2012-04-01 Thread Maxim Potekhin
Since Python has a native csv module, it's trivial to achieve. I load lots of csv data into my database daily. Maxim On 3/27/2012 11:44 AM, R. Verlangen wrote: You can write your own script to parse the excel file (export as csv) and import it with batch inserts. Should be pretty easy if you
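
The CSV route mentioned in this message can be sketched with Python's standard csv module. This is a minimal sketch, not the poster's actual loader: the `read_batches` helper, the assumption that the first column holds the row key, and the batch size are all illustrative. Each yielded batch would then be handed to whatever client call performs the batch insert.

```python
import csv


def read_batches(csv_file, batch_size=500):
    """Yield lists of (row_key, columns) pairs parsed from a CSV file object.

    Assumption: the first line is a header and the first column is the row key.
    """
    reader = csv.DictReader(csv_file)
    if not reader.fieldnames:
        return  # empty input: nothing to load
    key_field = reader.fieldnames[0]
    batch = []
    for record in reader:
        row_key = record.pop(key_field)
        batch.append((row_key, dict(record)))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch
```

Reading in batches keeps memory use flat even for large daily files, and the batch size can be tuned independently of the file size.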

Building a brand new cluster and readying it for production -- advice needed

2012-03-13 Thread Maxim Potekhin
Dear All, after all the testing and continuous operation of my first cluster, I've been given an OK to build a second production Cassandra cluster in Europe. There were posts in recent weeks regarding the most stable and solid Cassandra version. I was wondering if anything better has

Re: Implications of length of column names

2012-02-28 Thread Maxim Potekhin
When I migrated data from our RDBMS, I hashed column names to integers. This makes for some footwork, but the space gain is clearly there so it's worth it. I de-hash on read. Maxim On 2/10/2012 5:15 PM, Narendra Sharma wrote: It is good to have short column names. They save space all the
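
The idea of replacing long column names with integers on write and restoring them on read can be sketched as follows. This is only an illustration under stated assumptions: the message does not say how the mapping was built, so the `ColumnNameCodec` class and its lookup-table approach are hypothetical, and the example column names are made up.

```python
class ColumnNameCodec:
    """Map long column names to small integers and back.

    Assumption: `names` is a fixed, append-only list shared by every reader
    and writer, so a given name always gets the same integer code.
    """

    def __init__(self, names):
        self._code_by_name = {name: code for code, name in enumerate(names)}
        self._name_by_code = dict(enumerate(names))

    def encode_row(self, row):
        """Replace column names with integer codes before writing."""
        return {self._code_by_name[name]: value for name, value in row.items()}

    def decode_row(self, stored_row):
        """Restore the original column names after reading."""
        return {self._name_by_code[code]: value for code, value in stored_row.items()}
```

The list of names has to be stored somewhere durable and only ever appended to; reordering it would silently change the meaning of data already written.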

Please advise -- 750MB object possible?

2012-02-22 Thread Maxim Potekhin
Hello everybody, I'm being asked whether we can serve an object, which I assume is a blob, of 750MB size? I guess the real question is how to chunk it, and whether it's even possible to chunk it. Thanks! Maxim
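
Chunking of the kind asked about here can be sketched in a few lines of Python. This is a minimal sketch, not a recommendation from the thread: `split_blob` and `join_chunks` are hypothetical helpers, and storing each chunk under its integer index (for example as one column per chunk) is an assumption about the layout.

```python
def split_blob(data, chunk_size):
    """Split a bytes object into {chunk_index: chunk_bytes}."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        offset // chunk_size: data[offset:offset + chunk_size]
        for offset in range(0, len(data), chunk_size)
    }


def join_chunks(chunks):
    """Reassemble the original bytes from {chunk_index: chunk_bytes}."""
    return b"".join(chunks[index] for index in sorted(chunks))
```

For a very large object the data would normally be read and written one chunk at a time from a file or stream rather than held in memory as a single bytes value.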

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Maxim Potekhin
, but it's not ideal. On Wed, Feb 22, 2012 at 9:04 AM, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: Hello everybody, I'm being asked whether we can serve an object, which I assume is a blob, of 750MB size

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Maxim Potekhin
Thank you so much, looks nice, I'll be looking into it. On 2/22/2012 3:08 PM, Rob Coli wrote: On Wed, Feb 22, 2012 at 10:37 AM, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: The idea was to provide redundancy, resilience, automatic load balancing and automatic

Re: nodetool hangs and didn't print anything with firewall

2012-02-08 Thread Maxim Potekhin
That's good to hear because it does present a problem for a strictly managed and firewalled campus environment. Maxim On 2/6/2012 11:57 AM, Nick Bailey wrote: JMX is not very firewall friendly. The problem is that JMX is a two connection process. The first connection happens on port 7199 and

Re: Encrypting traffic between Hector client and Cassandra server

2012-01-31 Thread Maxim Potekhin
Hello, do you see any value in having a web service over cassandra, with actual client-clients talking to it via https/ssl? This way the cluster can be firewalled and therefore protected, plus you get decent auth/auth right there. Maxim On 1/31/2012 5:21 PM, Xaero S wrote: I have been

Re: Restart cassandra every X days?

2012-01-28 Thread Maxim Potekhin
Sorry if this has been covered, I was concentrating solely on 0.8x -- can I just d/l 1.0.x and continue using same data on same cluster? Maxim On 1/28/2012 7:53 AM, R. Verlangen wrote: Ok, seems that it's clear what I should do next ;-) 2012/1/28 aaron morton aa...@thelastpickle.com

Problematic deletes in 0.8.8

2012-01-27 Thread Maxim Potekhin
Hello, after I thought I was out of the woods with data deletion in 0.8.8, I unfortunately see undead data and other strange behavior. Let me clarify: a) I do run repair and compaction well within GC_GRACE b) deletes happen daily c) after a few repairs, when I run an indexed query on the data

Re: Restart cassandra every X days?

2012-01-25 Thread Maxim Potekhin
I also do repair, compact and cleanup every couple of days, and also have daily restarts on crontab. It doesn't hurt and I avoid having a node becoming unresponsive after many days of operation, that has happened before. Older files get cleaned up on restart. It doesn't take long to shut down

Re: Cassandra x MySQL Sharded - Insert Comparison

2012-01-24 Thread Maxim Potekhin
a) I hate to break it to you, but 6GB x 4 cores != 'high-end machine'. It's pretty much middle of the road consumer level these days. b) Hosting the client and Cassandra on the same node is a Bad Idea. It will depend on what exactly the client will do, but in my experience it won't work too

Re: Cassandra usage

2012-01-24 Thread Maxim Potekhin
You provide zero information on what you are planning to do with the data. Thus, your question is impossible to answer. On 1/24/2012 9:38 PM, francesco.tangari@gmail.com wrote: Do you think that for a standard project with 50.000.000 of rows on 2-3 machines cassandra is appropriate or i

Re: Cassandra x MySQL Sharded - Insert Comparison

2012-01-22 Thread Maxim Potekhin
Hello, I have some experience in benchmarking Cassandra against Oracle and in running on a VM cluster. While the VM solution will work for many applications, it simply won't cut it for all. In particular, I observed a large difference in insert performance when I moved from VM to real

Re: delay in data deleting in cassandra

2012-01-20 Thread Maxim Potekhin
Did you run repairs within GC_GRACE all the time? On 1/20/2012 3:42 AM, Shammi Jayasinghe wrote: Hi, I am experiencing a delay in delete operations in cassandra. It's as follows. I am running a thread which contains following three steps. Step 01: Read data from column family foo[1]

Re: Cassandra to Oracle?

2012-01-20 Thread Maxim Potekhin
What makes you think that RDBMS will give you acceptable performance? I guess you will try to index it to death (because otherwise the ad hoc queries won't work well if at all), and at this point you may be hit with a performance penalty. It may be a good idea to interview users and build

Re: Cassandra to Oracle?

2012-01-20 Thread Maxim Potekhin
I certainly agree with "difficult to predict". There is a Danish proverb, which goes "it's difficult to make predictions, especially about the future". My point was that it's equally difficult with noSQL and RDBMS. The latter requires indexing to operate well, and that's a potential performance

Re: ideal cluster size

2012-01-20 Thread Maxim Potekhin
You can also scale not horizontally but diagonally, i.e. raid SSDs and have multicore CPUs. This means that you'll have the same performance with fewer nodes, making it far easier to manage. SSDs by themselves will give you an order of magnitude improvement on I/O. On 1/19/2012 9:17 PM, Thorsten

Re: Using 5-6 bytes for cassandra timestamps vs 8…

2012-01-18 Thread Maxim Potekhin
I must have accidentally deleted all messages in this thread save this one. At face value, we are talking about saving 2 bytes per column. I know it can add up with many columns, but relative to the size of the column -- is it THAT significant? I made an effort to minimize my CF

Re: About initial token, autobootstraping and load balance

2012-01-15 Thread Maxim Potekhin
I see. Sure, that's a bit more complicated and you'd have to move tokens after adding a machine. Maxim On 1/15/2012 4:40 AM, ??? wrote: It's nothing wrong for 3 nodes. It's a problem for cluster of 20+ nodes, growing. 2012/1/14 Maxim Potekhin potek...@bnl.gov mailto:potek

Re: About initial token, autobootstraping and load balance

2012-01-14 Thread Maxim Potekhin
I'm just wondering -- what's wrong with manual specification of tokens? I'm so glad I did it and have not had problems with balancing and all. Before I was indeed stuck with 25/25/50 setup in a 3 machine cluster, when I had to move tokens to make it 33/33/33 and I screwed up a little in that
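
The 33/33/33 layout mentioned above comes from spacing the nodes' tokens evenly around the ring. A minimal sketch of that arithmetic follows, assuming the RandomPartitioner (token range 0 to 2**127), which the message does not state explicitly; the `balanced_tokens` helper is hypothetical.

```python
RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token space


def balanced_tokens(node_count):
    """Return evenly spaced initial tokens, one per node."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(balanced_tokens(3)):
        print(f"node {node}: initial_token = {token}")
```

Adding a node to an already balanced ring changes every target token, which is why tokens have to be moved after growing the cluster if an even split is wanted.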

Exception thrown during repair, contains jmx classes -- why?

2012-01-11 Thread Maxim Potekhin
As per below trace, there is jmx.mbeanserver involved. What I ran was a common repair. Is that right? What does this failure indicate? at org.apache.cassandra.service.StorageService.forceTableRepair(StorageService.java:1613) at

Re: Should I throttle deletes?

2012-01-10 Thread Maxim Potekhin
Thanks, this makes sense. I'll try that. Maxim On 1/6/2012 10:51 AM, Vitalii Tymchyshyn wrote: Do you mean on writes? Yes, your timeouts must be so that your write batch could complete until timeout elapsed. But this will lower write load, so reads should not timeout. Best regards, Vitalii

Re: How does Cassandra decide when to do a minor compaction?

2012-01-07 Thread Maxim Potekhin
, at 3:17 PM, Maxim Potekhin wrote: The subject says it all -- pointers appreciated. Thanks Maxim

How to find out when a nodetool operation has ended?

2012-01-06 Thread Maxim Potekhin
Suppose I start a repair on one or a few nodes in my cluster, from an interactive machine in the office, and leave for the day (which is a very realistic scenario imho). Is there a way to know, from a remote machine, when a particular action, such as compaction or repair, has been finished? I

Re: How to find out when a nodetool operation has ended?

2012-01-06 Thread Maxim Potekhin
Thanks, so I take it there is no solution outside of OpsCenter. I mean of course I can redirect the output, with additional timestamps if needed, to a log file -- which I can access remotely. I just thought there would be some status command by chance, to tell me what maintenance the node is

How does Cassandra decide when to do a minor compaction?

2012-01-06 Thread Maxim Potekhin
The subject says it all -- pointers appreciated. Thanks Maxim

Re: Should I throttle deletes?

2012-01-05 Thread Maxim Potekhin
Hello Aaron, On 1/5/2012 4:25 AM, aaron morton wrote: I use a batch mutator in Pycassa to delete ~1M rows based on a longish list of keys I'm extracting from an auxiliary CF (with no problem of any sort). What is the size of the deletion batches ? 2000 mutations. Now, it appears that

Re: Should I throttle deletes?

2012-01-05 Thread Maxim Potekhin
Thanks, that's quite helpful. I'm wondering though if multiplying the number of clients will end up doing same thing. On 1/5/2012 3:29 PM, Philippe wrote: Then I do have a question, what do people generally use as the batch size? I used to do batches from 500 to 2000 like you do.
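
The batch sizes discussed in this thread (500 to 2000 mutations) suggest a simple throttling pattern: send deletions in fixed-size batches and optionally pause between them. The sketch below is client-agnostic and makes several assumptions: `delete_in_batches` is a hypothetical helper, `delete_batch` stands in for whatever client call actually removes a list of keys, and the default batch size and pause are illustrative only.

```python
import time


def delete_in_batches(keys, delete_batch, batch_size=500, pause_seconds=0.0):
    """Call delete_batch(list_of_keys) for successive batches; return the key count."""
    deleted = 0
    batch = []
    for key in keys:
        batch.append(key)
        if len(batch) >= batch_size:
            delete_batch(batch)
            deleted += len(batch)
            batch = []
            if pause_seconds:
                time.sleep(pause_seconds)  # give the cluster room between batches
    if batch:
        delete_batch(batch)  # final, possibly shorter, batch
        deleted += len(batch)
    return deleted
```

Because `keys` can be any iterable, a long key list extracted from an auxiliary column family can be streamed through without being held in memory all at once.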

Re: Strange OOM when doing list in CLI

2012-01-04 Thread Maxim Potekhin
have counters using composite keys and about 1k columns causes this to happen. We should have some paging support with list. On Tuesday, January 3, 2012, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: I came back from Xmas vacation only to see that what always was an innocuous

Should I throttle deletes?

2012-01-04 Thread Maxim Potekhin
Now that my cluster appears to run smoothly and after a few successful repairs and compacts, I'm back in the business of deletion of portions of data based on its date of insertion. For reasons too lengthy to be explained here, I don't want to use TTL. I use a batch mutator in Pycassa to delete

Re: Cassandra WebUI with Sources released

2012-01-03 Thread Maxim Potekhin
Congrats on what seems to be a nice piece of work, need to check it out. Nicely complements other tools. Maxim On 1/2/2012 12:48 PM, Markus Wiesenbacher | Codefreun.de wrote: Hi, I wish you all a happy and healthy new year! As you may remember, I coded a little GUI for Apache Cassandra.

Strange OOM when doing list in CLI

2012-01-03 Thread Maxim Potekhin
I came back from Xmas vacation only to see that what always was an innocuous procedure in CLI now reliably results in OOM -- does anyone have ideas why? It never happened before. Version of Cassandra is 0.8.8. 2956 java -ea -javaagent:/home/cassandra/cassandra/bin/../lib/jamm-0.2.2.jar

Re: Doubts related to composite type column names/values

2011-12-20 Thread Maxim Potekhin
With regards to static, what are major benefits as it compares with string catenation (with some convenient separator inserted)? Thanks Maxim On 12/20/2011 1:39 PM, Richard Low wrote: On Tue, Dec 20, 2011 at 5:28 PM, Ertio Lewertio...@gmail.com wrote: With regard to the composite columns

Re: Doubts related to composite type column names/values

2011-12-20 Thread Maxim Potekhin
AM, Maxim Potekhin wrote: With regards to static, what are major benefits as it compares with string catenation (with some convenient separator inserted)? Thanks Maxim On 12/20/2011 1:39 PM, Richard Low wrote: On Tue, Dec 20, 2011 at 5:28 PM, Ertio Lewertio...@gmail.com mailto:ertio

Can I slice on composite indexes?

2011-12-20 Thread Maxim Potekhin
Let's say I have rows with composite columns like (key1, {('xyz', 'abc'): 'colval1'}, {('xyz', 'def'): 'colval2'}) (key2, {('ble', 'meh'): 'otherval'}) Is it possible to create a composite type index such that I can query on 'xyz' and get the first two columns? Thanks Maxim
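
To make the question concrete, the desired result can be modelled in plain Python, with composite column names represented as tuples. This only illustrates what "query on 'xyz'" should return for the example row; it says nothing about how Cassandra stores or indexes composites, and `slice_by_prefix` is a hypothetical helper.

```python
def slice_by_prefix(columns, prefix):
    """Return the columns whose composite name starts with `prefix`, in sorted order."""
    prefix = tuple(prefix)
    return {
        name: value
        for name, value in sorted(columns.items())
        if name[:len(prefix)] == prefix
    }


if __name__ == "__main__":
    row = {("xyz", "abc"): "colval1", ("xyz", "def"): "colval2"}
    print(slice_by_prefix(row, ("xyz",)))
```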

Re: commit log size

2011-12-14 Thread Maxim Potekhin
Alexandru, Jeremiah -- what setting needs to be tweaked, and what's the recommended value? I observed similar behavior this morning. Maxim On 11/28/2011 2:53 PM, Jeremiah Jordan wrote: Yes, the low volume memtables are causing the problem. Lower the thresholds for those tables if you don't

Re: Keys for deleted rows visible in CLI

2011-12-14 Thread Maxim Potekhin
#range_ghosts On Wed, Dec 14, 2011 at 4:36 AM, Radim Kolarh...@sendmail.cz wrote: On 14.12.2011 1:15, Maxim Potekhin wrote: Thanks. It could be hidden from a human operator, I suppose :) I agree. Open JIRA for it.

Asymmetric load

2011-12-14 Thread Maxim Potekhin
What could be the reason I see unequal loads on a 3-node cluster? This all started happening during repairs (which again are not going smoothly). Maxim

Crazy compactionstats

2011-12-14 Thread Maxim Potekhin
Hello I ran repair like this: nohup repair.sh where repair.sh contains simply nodetool repair plus timestamp. The process dies while dumping this: Exception in thread main java.io.IOException: Repair command #1: some repair session(s) failed (see log for details). at

Best way to implement indexing for high-cardinality values?

2011-12-14 Thread Maxim Potekhin
I now have a CF with extremely skinny rows (in the current implementation), and the application will want to query by the values of more than one column. Problem is that the values in a lot of cases will be high cardinality. One other factor is that I want to rotate data in and out of the system in one

show schema bombs in 0.8.6

2011-12-13 Thread Maxim Potekhin
Running cli --debug: [default@PANDA] show schema; null java.lang.RuntimeException at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:310) at org.apache.cassandra.cli.CliMain.processStatement(CliMain.java:217) at

Keys for deleted rows visible in CLI

2011-12-13 Thread Maxim Potekhin
Hello, I searched the archives and it appears that this question was once asked but was not answered. I just deleted a lot of rows, and want to list in cli. I still see the keys. This is not the same as getting slices, is it? Anyhow, what's the reason and rationale? I run 0.8.8. Thanks

Re: Keys for deleted rows visible in CLI

2011-12-13 Thread Maxim Potekhin
that an operation has been performed to delete the data. Harold -Original Message- From: Maxim Potekhin [mailto:potek...@bnl.gov] Sent: Tuesday, December 13, 2011 4:03 PM To: user@cassandra.apache.org Subject: Keys for deleted rows visible in CLI Hello, I searched the archives

Deleted rows re-appearing on repair in 0.8.6

2011-12-12 Thread Maxim Potekhin
Hello, I know that this problem used to exist in 0.8.1 -- I delete rows, run a repair and these rows are back with a vengeance. I recall I was told that this was fixed in 0.8.6 -- is that the case? I still keep seeing that behavior. Thanks Maxim

Really old files in the data directory

2011-12-09 Thread Maxim Potekhin
Hello, I varied the GC grace a few times over the period of my cluster's lifetime, but I never went above 10 days. I did compactions, repairs etc. Now, I see that some files in the data directories of the nodes that were there from day one carry timestamps back from July. There are files

Re: Cassandra 0.8.8

2011-12-09 Thread Maxim Potekhin
Hello everyone, so what's the update on 0.8.8? Many thanks Maxim On 12/2/2011 4:49 AM, Patrik Modesto wrote: Hi, It's been almost 2 months since the release of the 0.8.7 version and there are quite some changes in 0.8.8, so I'd like to ask is there a release date? Regards, Patrik

forceUserDefinedCompaction -- how to use it?

2011-12-07 Thread Maxim Potekhin
Can anyone provide an example of how to use forceUserDefinedCompaction? Thanks Maxim

Re: exporting data from Cassandra cluster

2011-12-07 Thread Maxim Potekhin
Hello Alexandru, as you probably know, my group is using Amazon S3 to permanently (or semi-permanently) park the data in CSV format, which makes it portable and we can load it into anything if needed, or analyze on its own. Just my half of a Swiss centime :) And, because the S3 option is not

Cassandra behavior too fragile?

2011-12-07 Thread Maxim Potekhin
OK, thanks to the excellent help of Datastax folks, some of the more severe inconsistencies in my Cassandra cluster were fixed (after a node was down and compactions failed etc). I'm still having problems as reported in the "Repair failure under 0.8.6" thread. Thing is, why is it so easy for the repair process

Re: Repair failure under 0.8.6

2011-12-05 Thread Maxim Potekhin
Basically I tweaked the phi, put in more verbose GC reporting and decided to do a compaction before I proceed. I'm getting this on the node where compaction is being run. And the system log for the other two nodes follows. It's obvious that the cluster is sick, but I can't determine why --

Could not reach schema agreement... 0.8.6

2011-12-05 Thread Maxim Potekhin
Hello, upon startup, in my cluster of 3 machines, I see similar messages in system.log on each node (below). I start nodes one by one, after I ascertain the previous one is online. So they can't reach schema agreement, all of them. Why? No unusual load visible in Ganglia plots. ERROR

Re: Repair failure under 0.8.6

2011-12-04 Thread Maxim Potekhin
) at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:57) at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:157) On 12/3/2011 8:34 PM, Maxim Potekhin wrote: Thank you Peter. Before I look into details as you suggest, may I ask what you mean by "automatically restarted"? The way the box

Re: Repair failure under 0.8.6

2011-12-04 Thread Maxim Potekhin
Thanks Peter! I will try to increase phi_convict -- I will just need to restart the cluster after the edit, right? I do recall that I see nodes temporarily marked as down, only to pop up later. In the current situation, there is no load on the cluster at all, outside the maintenance like

Re: Repair failure under 0.8.6

2011-12-04 Thread Maxim Potekhin
Please disregard the GC part of the question -- I found it. On 12/4/2011 4:12 PM, Maxim Potekhin wrote: Thanks Peter! I will try to increase phi_convict -- I will just need to restart the cluster after the edit, right? I do recall that I see nodes temporarily marked as down, only to pop up

Re: can not create a column family named 'index'

2011-12-04 Thread Maxim Potekhin
I seem to recall problems when using a cf called indexRegistry, don't remember much detail now. Maxim On 11/30/2011 7:24 PM, Shu Zhang wrote: Hi, just wondering if this is intentional: [default@test] create column family index; Syntax error at position 21: mismatched input 'index' expecting

Re: Repair failure under 0.8.6

2011-12-04 Thread Maxim Potekhin
As a side effect of the failed repair (so it seems) the disk usage on the affected node prevents compaction from working. It still works on the remaining nodes (we have 3 total). Is there a way to scrub the extraneous data? Thanks Maxim On 12/4/2011 4:29 PM, Peter Schuller wrote: I will try

Repair failure under 0.8.6

2011-12-03 Thread Maxim Potekhin
Please help -- I've been having pretty consistent failures that look like this one. Don't know how to proceed. Below text comes from the system log. The cluster was all up before and after the attempted repair, so I don't quite understand how Cassandra declared a node dead (in the below). Was

Re: Repair failure under 0.8.6

2011-12-03 Thread Maxim Potekhin
Thank you Peter. Before I look into details as you suggest, may I ask what you mean by "automatically restarted"? The way the box and Cassandra are set up in my case is such that the death of either is final. Also, how do I look for full GC? I just realized that in the latest install, I might have

How many indexes to keep? Guidelines

2011-11-29 Thread Maxim Potekhin
As a matter of practice, how many secondary indexes on a CF do you usually keep? What are rules of thumb? Is 10 too many? 100? 1000? Thanks Maxim

Re: Yanking a dead node

2011-11-29 Thread Maxim Potekhin
Thanks! Looks pretty obvious in retrospect... Regards, Maxim On 11/24/2011 6:54 AM, Filipe Gonçalves wrote: Just remove its token from the ring using nodetool removetoken <token> 2011/11/23 Maxim Potekhin potek...@bnl.gov: This was discussed a long time ago, but I need to know what's the

Yanking a dead node

2011-11-23 Thread Maxim Potekhin
This was discussed a long time ago, but I need to know what's the state of the art answer to that: assume one of my few nodes is very dead. I have no resources or time to fix it. Data is replicated so the data is still available in the cluster. How do I completely remove the dead node without

7199

2011-11-22 Thread Maxim Potekhin
Hello, I have this in my cassandra-env.sh JMX_PORT=7199 Does this mean that if I use nodetool from another node, it will try to connect to that particular port? Thanks, Maxim

Re: 7199

2011-11-22 Thread Maxim Potekhin
Thanks. I'm trying to look up HttpAdaptor and what it does, can you give any pointers? Thanks. I didn't find much useful info just yet. Maxim On 11/22/2011 9:52 PM, Jeremiah Jordan wrote: Yes, that is the port nodetool needs to access. On Nov 22, 2011, at 8:43 PM, Maxim Potekhin wrote

Re: read performance problem

2011-11-19 Thread Maxim Potekhin
Try to see if there is a lot of paging going on, and run some benchmarks on the disk itself. Are you running Windows or Linux? Do you think the disk may be fragmented? Maxim On 11/19/2011 8:58 PM, Kent Tong wrote: Hi, On my computer with 2G RAM and a core 2 duo CPU E4600 @ 2.40GHz, I am

A Cassandra CLI question: null vs 0 rows

2011-11-17 Thread Maxim Potekhin
Hello everyone, I run a query on a secondary index. For some queries, I get 0 rows returned. In other cases, I just get a string that reads null. What's going on? TIA Maxim

Re: A Cassandra CLI question: null vs 0 rows

2011-11-17 Thread Maxim Potekhin
Thanks Jonathan. I get the error below. Don't have a clue as to what it means. null java.lang.RuntimeException at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:310) at org.apache.cassandra.cli.CliMain.processStatement(CliMain.java:217) at

What sort of load do the tombstones create on the cluster?

2011-11-17 Thread Maxim Potekhin
In view of my unpleasant discovery last week that deletions in Cassandra lead to a very real and serious performance loss, I'm working on a strategy of moving forward. If the tombstones do cause such problem, where should I be looking for performance bottlenecks? Is it disk, CPU or something

Varying number of rows coming from same query on same database

2011-11-17 Thread Maxim Potekhin
Hello, I'm running the same query repeatedly. It's a secondary index query, done from a Pycassa client. I see that when I iterate the result object, I get a slightly different number of entries when running the test serially. There are no deletions in the database, and no writes, it's static for

Re: Data Model Design for Login Service

2011-11-17 Thread Maxim Potekhin
1122: { gender: MALE birthdate: 1987.11.09 name: Alfred Tester pwd: e72c504dc16c8fcd2fe8c74bb492affa alias1: alfred.tes...@xyz.de mailto:alfred.tes...@xyz.de alias2: alf...@aad.de mailto:alf...@aad.de alias3: a...@dd.de

Re: A Cassandra CLI question: null vs 0 rows

2011-11-17 Thread Maxim Potekhin
Should I file a ticket? I consistently see this behavior after a mass delete. On 11/17/2011 12:46 PM, Maxim Potekhin wrote: Thanks Jonathan. I get the error below. Don't have a clue as to what it means. null java.lang.RuntimeException

Re: Mass deletion -- slowing down

2011-11-14 Thread Maxim Potekhin
that your slice range only goes back 2 weeks, rather than to the beginning of time. this would avoid iterating over all the tombstones from prior to the 2 week window. this wouldn't work if you are deleting arbitrary days in the middle of your date range. On 14/11/2011 02:02, Maxim Potekhin wrote

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
such behavior? Thanks, Maxim On 11/10/2011 8:30 PM, Maxim Potekhin wrote: Hello, My data load comes in batches representing one day in the life of a large computing facility. I index the data by the day it was produced, to be able to quickly pull data for a specific day within the last year

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
Thanks to all for valuable insight! Two comments: a) this is not actually time series data, but yes, each item has a timestamp and thus chronological attribution. b) so, what do you practically recommend? I need to delete half a million to a million entries daily, then insert fresh data. What's

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
Brandon, thanks for the note. Each row represents a computational task (a job) executed on the grid or in the cloud. It naturally has a timestamp as one of its attributes, representing the time of the last update. This timestamp is used to group the data into buckets each representing one day
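
Grouping rows into one-day buckets by their last-update timestamp, as described in this message, can be sketched as below. The details are assumptions: the thread does not say how timestamps are stored or how bucket keys are named, so epoch seconds, UTC days, and the `YYYYMMDD` key format used by the hypothetical `day_bucket` and `group_by_day` helpers are illustrative choices.

```python
from collections import defaultdict
from datetime import datetime, timezone


def day_bucket(epoch_seconds):
    """Return the UTC day of a timestamp as a 'YYYYMMDD' bucket key."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%Y%m%d")


def group_by_day(jobs):
    """Group (job_id, epoch_seconds) pairs into {bucket_key: [job_id, ...]}."""
    buckets = defaultdict(list)
    for job_id, last_update in jobs:
        buckets[day_bucket(last_update)].append(job_id)
    return dict(buckets)
```

With such a bucket key stored as an indexed column, retiring a day of data amounts to looking up every job in one bucket and deleting those rows.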

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
Brandon, it won't work in my application, as I need a few indexes on attributes of the job. In addition, a large portion of queries is based on key-value lookup, and that key is the unique job ID. I really can't have data packed in one row per day. Thanks, Maxim On 11/13/2011 8:34 PM, Brandon

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
Thanks Peter, I'm not sure I entirely follow. By the oldest data, do you mean the primary key corresponding to the limit of the time horizon? Unfortunately, unique IDs and the timestamps do not correlate in the sense that chronologically newer entries might have a smaller sequential ID. That's

Mass deletion -- slowing down

2011-11-10 Thread Maxim Potekhin
Hello, My data load comes in batches representing one day in the life of a large computing facility. I index the data by the day it was produced, to be able to quickly pull data for a specific day within the last year or two. There are 6 other indexes. When it comes to retiring the data, I

Is there a way to get only keys with get_indexed_slices?

2011-11-10 Thread Maxim Potekhin
Is there a way to get only keys with get_indexed_slices? Looking at the code, it's not possible, but -- is there some way anyhow? I don't want to extract any data, just a list of matching keys. TIA, Maxim

Error connection to remote JMX agent during repair

2011-11-07 Thread Maxim Potekhin
Hello, I'm trying to run repair on one of my nodes which needs to be repopulated after a failure of the hard drive. What I'm getting is below. Note: I'm not loading JMX with Cassandra, it always worked before... The version is 0.8.6. Any help will be appreciated, Maxim Error connection to

Re: Tool for SQL - Cassandra data movement

2011-11-01 Thread Maxim Potekhin
Just a short comment -- we are going the CSV way as well because of its compactness and extreme portability. The CSV files are kept in the cloud as backup. They can also find other uses. JSON would work as well, but it would be at least twice as large in size. Maxim On 9/22/2011 1:25 PM,

Re: CMS GC initial-mark taking 6 seconds , bad?

2011-10-20 Thread Maxim Potekhin
Hello Aaron, I happen to have 48GB on each machine I use in the cluster. Can I assume that I can't really use all of this memory productively? Do you have any suggestion related to that? Can I run more than one instance of Cassandra on the same box (using different ports) to take advantage

Re: hw requirements

2011-09-01 Thread Maxim Potekhin
only ask cause we have started using the Composite as rowkeys and column names to replace the use of concatenated strings mainly for lookup purposes. Anthony On Wed, Aug 31, 2011 at 10:27 AM, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: Plenty of comments in this thread

Re: hw requirements

2011-08-31 Thread Maxim Potekhin
Plenty of comments in this thread already, and I agree with those saying it depends. From my experience, a cluster with 18 spindles total could not match the performance and throughput of our primary Oracle server which had 108 spindles. After we upgraded to SSD, things have definitely changed

Re: Repair taking a long, long time

2011-07-20 Thread Maxim Potekhin
slower than looping through all my data. On Wed, Jul 20, 2011 at 12:18 AM, Maxim Potekhin potek...@bnl.gov mailto:potek...@bnl.gov wrote: Thanks Edward. I'm told by our IT that the switch connecting the nodes is pretty fast. Seriously, in my house I copy complete DVD images from my

Repair taking a long, long time

2011-07-19 Thread Maxim Potekhin
We have something of the order of 200GB load on each of 3 machines in a balanced cluster under 0.8.1. I started repair about 24hrs ago and did some moderate amount of inserts since then (a small fraction of data load). The repair still appears to be running. What could go wrong? Thanks, Maxim
