EOFException: attempted to skip x bytes

2011-02-21 Thread shimi
I upgraded to 0.7.2 from 0.7.0, which was upgraded from 0.6.8, and I get the following exception. I have a 4-node cluster across 2 data centers (2 nodes in each). I see the error only on 2 nodes in the same data center. I didn't see this error on 0.7.0. ERROR [HintedHandoff:5] 2011-02-21 03:53:22,341

Re: EOFException: attempted to skip x bytes

2011-02-21 Thread Karl Hiramoto
On 21/02/2011 09:01, shimi wrote: I upgraded to 0.7.2 from 0.7.0, which was upgraded from 0.6.8, and I get the following exception. I have a 4-node cluster across 2 data centers (2 nodes in each). I see the error only on 2 nodes in the same data center. I didn't see this error on 0.7.0. ERROR

[ANN] Mojo's Cassandra Maven Plugin 0.7.2-1 released

2011-02-21 Thread Stephen Connolly
Hi, The Mojo team is pleased to announce the release of Mojo's Cassandra Maven Plugin version 0.7.2-1. Mojo's Cassandra Plugin is used when you want to install and control a test instance of Apache Cassandra from within your Apache Maven build. The plugin has the following goals. *

java.io.IOException in CompactionExecutor

2011-02-21 Thread ruslan usifov
I launched a clean Cassandra 0.7.2 installation, and after a few days I see the following error in system.log (more than 10 times): ERROR [CompactionExecutor:1] 2011-02-19 02:56:17,965 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[CompactionExecutor:1,1,main]

Re: How to use NetworkTopologyStrategy

2011-02-21 Thread Héctor Izquierdo Seliva
Thanks! I totally overlooked that. On Mon, 21-02-2011 at 08:14 +1300, Aaron Morton wrote: The best examples I know of are in the internal CLI help and conf/cassandra.yaml. Aaron On 19/02/2011, at 12:51 AM, Héctor Izquierdo Seliva izquie...@strands.com wrote: Hi! Can some

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread Aaron Morton
From the F:/ I assume you are on Windows? What version? I just did a quick test on Ubuntu 10.04 and it works, but the File.renameTo() function used has different behavior depending on the host OS. There may be some issues on

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread Norman Maurer
The problem on Windows is that it is a bit stricter about renaming a file if a handle is still open. So maybe some stream on the file is not closed. Bye, Norman 2011/2/21 Aaron Morton aa...@thelastpickle.com: From the F:/ I assume you are on Windows? What version? Just did a quick test on

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread Aaron Morton
The code creates a new .tmp file in the saved_caches directory and then renames it to a non-.tmp file name, so there is nothing else with a handle open. The rename is to an existing file, though. Ruslan, can you please raise a bug against 0.7.2 for this and include the platform. Thanks, Aaron On 22 Feb,
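
The write-to-.tmp-then-rename pattern described in this message can be sketched as follows. This is only an illustration, not Cassandra's code: the class and method names are hypothetical, and it uses `java.nio.file.Files.move` (Java 7 and later), which was not available to Cassandra 0.7. The thread says the actual code uses `File.renameTo()`, whose Javadoc warns that the rename is platform-dependent and may fail when the destination already exists, returning `false` instead of throwing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CacheSaveSketch {
    /**
     * Writes data to a sibling ".tmp" file, then moves it over the target.
     * REPLACE_EXISTING asks for an existing target to be overwritten, and a
     * failure surfaces as an IOException rather than a boolean result.
     */
    static void saveViaTempFile(Path target, byte[] data) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, data);
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical directory and file name, for illustration only.
        Path dir = Files.createTempDirectory("saved_caches");
        Path target = dir.resolve("KeyCache");
        saveViaTempFile(target, "first".getBytes());
        saveViaTempFile(target, "second".getBytes()); // target already exists here
        System.out.println(new String(Files.readAllBytes(target)));
    }
}
```

The second call is the interesting one, since it is the rename onto an existing file that the thread suspects of failing on Windows.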

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread ruslan usifov
2011/2/21 Aaron Morton aa...@thelastpickle.com The code creates a new .tmp file in the saved_caches directory and then renames it to a non .tmp file name, so there is nothing else with a handle open. The rename is to an existing file though. Ruslan can you please raise a bug against 0.7.2

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread Aaron Morton
Yes https://issues.apache.org/jira/browse/CASSANDRA Thanks, Aaron On 22 Feb, 2011, at 12:55 AM, ruslan usifov ruslan.usi...@gmail.com wrote: 2011/2/21 Aaron Morton aa...@thelastpickle.com The code creates a new .tmp file in the saved_caches directory and then renames it to a non-.tmp file name, so there

Re: 0.7.2 slow memtables flushing

2011-02-21 Thread Ivan Georgiev
I did some very rough measurements in a desperate attempt to see if I can find the issue (if there is an issue). Since I don't know the code base well enough, I chose BufferedRandomAccessFile as my suspect, since it was rewritten from 0.7.0 to 0.7.1. I did rough measurements on how many times

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread ruslan usifov
I have a question. Do you think this happens only on Windows (I am not worried about that platform, because it is only a test platform) or everywhere (Linux)? And how dangerous is this error; can I simply ignore it for now? 2011/2/21 Aaron Morton aa...@thelastpickle.com Yes

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread Daniel Josefsson
There is no antivirus program or similar running on that machine I guess? That could definitely lock the file if Cassandra is creating the .tmp file and then fairly shortly after tries to rename it. /Daniel On Mon, 2011-02-21 at 11:34 +, Aaron Morton wrote: The code creates a new .tmp file

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread ruslan usifov
2011/2/21 Daniel Josefsson daniel.josefs...@shazamteam.com There is no antivirus program or similar running on that machine I guess? That could definitely lock the file if Cassandra is creating the .tmp file and then fairly shortly after tries to rename it. No, I don't have any antivirus

Distribution Factor: part of the solution to many-CF problem?

2011-02-21 Thread David Boxenhorn
Cassandra is both distributed and replicated. We have Replication Factor but no Distribution Factor! Distribution Factor would define over how many nodes a CF should be distributed. Say you want to support millions of multi-tenant users in clusters with thousands of nodes, where you don't know

Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread Aaron Morton
The workaround is to disable saving caches, as they may not be correctly saved. However, you do not have access to change the settings for the system CFs. I imagine it may only be an issue if a node is rebooted, as it may load stale caches. I think it's Windows only, as the function used to rename

Re: Data model for activity feed

2011-02-21 Thread Rauan Maemirov
Any advice? Maybe I should group events from the application? Wouldn't it be too much overhead? 2011/2/19 Rauan Maemirov ra...@maemirov.com Hi, with the help of the twissandra example, I tried to create a schema for an activity feed. The Activities CF stores all activities: Activities: {

Re: Queries on secondary indexes

2011-02-21 Thread Norman Maurer
Not sure what your problem is. Using two EQ operations works without a problem here (even via the CLI). Bye, Norman 2011/2/18 Rauan Maemirov ra...@maemirov.com: With this schema: create column family Userstream with comparator=UTF8Type and rows_cached = 1 and keys_cached = 10 and

Replicate changes from DC1 to DC2, but not from DC2 to DC1

2011-02-21 Thread Héctor Izquierdo Seliva
Hi all. Is there a way (besides changing the code) to replicate data from a Data center 1 to a Data center 2, but not the other way around? I need to have a preproduction environment with production data, and ideally with only a fraction of the data (for example, by key prefixes). I have poked

millions of columns in a row vs millions of rows with one column

2011-02-21 Thread Héctor Izquierdo Seliva
Hi Everyone. I'm testing performance differences of millions of columns in a row vs millions of rows. So far it seems wide rows perform better in terms of reads, but there can be potentially hundreds of millions of columns in a row. Is this going to be a problem? Should I go with individual rows?

Re: cant seem to figure out secondary index definition

2011-02-21 Thread Roland Gude
Yes, it has such a clause. I am very certain that this is not my code, because the very same program works against a cluster if the index is created with the CLI, and it does not when the index is configured with cassandra.yaml. My assumption is that the index creation with the configured file is

Re: 0.7.2 slow memtables flushing

2011-02-21 Thread Ivan Georgiev
Some more digging. This is the code path causing the excessive rebuffer() calls. java.lang.Exception: Stack trace at java.lang.Thread.dumpStack(Unknown Source) at org.apache.cassandra.io.util.BufferedRandomAccessFile.reBuffer(BufferedRandomAccessFile.java:204) at

Re: Understand eventually consistent

2011-02-21 Thread mcasandra
David Strauss-2 wrote: On Fri, 2011-02-18 at 12:01 -0600, Anthony John wrote: Writes will go thru w/hinted handoff, read will fail That is not correct. Hinted handoffs do not count toward reaching QUORUM counts.[1] [1] http://wiki.apache.org/cassandra/HintedHandoff -- David
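
The point made in this reply, that hints do not count toward QUORUM, can be made concrete with a small sketch. It assumes the usual definition of QUORUM as a majority of replicas, floor(RF / 2) + 1; the class and method names are hypothetical and this is not Cassandra's implementation.

```java
public class QuorumSketch {
    /** Majority of replicas: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * A QUORUM operation succeeds only if enough live replicas acknowledge it.
     * The hintsStored argument is deliberately ignored: a hint held on behalf
     * of a down replica is not an acknowledgement from that replica.
     */
    static boolean quorumMet(int replicationFactor, int liveReplicaAcks, int hintsStored) {
        return liveReplicaAcks >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        // RF = 3, one live replica, two hints stored: QUORUM (2) is not reached.
        System.out.println(quorumMet(3, 1, 2));
        // RF = 3, two live replicas: QUORUM is reached.
        System.out.println(quorumMet(3, 2, 0));
    }
}
```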

Re: Cassandra as write-behind, Cassandra as Cache

2011-02-21 Thread Peter Schuller
I'm afraid I didn't think too hard about your overall problem, but since you haven't gotten other responses I can at least say something: Theory 1: use EHCache or something like it. Theory 2: having it in memory in the Cassandra server is nearly as good as having it in memory in my jvm,

Re: frequent client exceptions on 0.7.0

2011-02-21 Thread Peter Schuller
AFAIK the MemtablePostFlusher is the TP writing sstables, if it has a queue then there is the potential for writes to block while it waits for Memtables to be flushed. Take a look at your Memtable settings per CF, could it be that all the Memtables are flushing at once? There is info in the

[RELEASE] 0.6.12

2011-02-21 Thread Eric Evans
It's been about a month since 0.6.11, and there have been a handful of changes since, so I'm pleased to announce the release of 0.6.12. Source and binary archives are available from the Downloads page[3], and packages for Debian-based systems are available from the project repository[4].

Re: 0.7.2 slow memtables flushing

2011-02-21 Thread Jonathan Ellis
BRAF.seek has not changed since 0.7.0. Here is the implementation: public void seek(long newPosition) throws IOException { current = newPosition; if (newPosition >= bufferEnd || newPosition < bufferOffset) { reBuffer(); // this will set bufferEnd for us
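
To see when a seek of this shape triggers a rebuffer, here is a minimal, self-contained model of the bounds check. It is a sketch, not the real BufferedRandomAccessFile: the class is hypothetical, `reBuffer()` here only moves the buffer window and counts calls instead of reading from disk, and the field names simply mirror those in the quoted snippet.

```java
public class SeekSketch {
    private long current;       // logical position in the file
    private long bufferOffset;  // file offset of the first buffered byte
    private long bufferEnd;     // file offset just past the last buffered byte
    private final int bufferSize;
    int reBufferCount = 0;      // how many times the window was refilled

    SeekSketch(int bufferSize) {
        this.bufferSize = bufferSize;
        reBuffer(); // initial fill at position 0
    }

    /** Stand-in for the real disk read: re-centers the window at the current position. */
    private void reBuffer() {
        bufferOffset = current;
        bufferEnd = current + bufferSize;
        reBufferCount++;
    }

    /** Rebuffers only when the new position falls outside [bufferOffset, bufferEnd). */
    void seek(long newPosition) {
        current = newPosition;
        if (newPosition >= bufferEnd || newPosition < bufferOffset) {
            reBuffer();
        }
    }

    public static void main(String[] args) {
        SeekSketch f = new SeekSketch(1024);
        f.seek(10);    // inside the window, no rebuffer
        f.seek(1024);  // at bufferEnd, rebuffer
        f.seek(0);     // before bufferOffset, rebuffer
        System.out.println(f.reBufferCount);
    }
}
```

Under this model, a workload that alternates between two positions more than one buffer apart rebuffers on every seek, which is the kind of pattern the measurements in this thread are looking for.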

Fwd: java.io.IOException in CompactionExecutor

2011-02-21 Thread Aaron Morton
For those playing along at home. Begin forwarded message: From: ruslan usifov ruslan.usi...@gmail.com Date: 22 February 2011 2:43:52 AM To: aa...@thelastpickle.com Subject: java.io.IOException in CompactionExecutor Let me know when you have created the bug. I create

Re: 0.7.2 slow memtables flushing

2011-02-21 Thread Ivan Georgiev
That is strange. In 0.7.0 I see this for seek: public void seek(long pos) throws IOException { this.curr_ = pos; } Ivan On 21.2.2011, 21:20, Jonathan Ellis wrote: BRAF.seek has not changed since 0.7.0. Here is the implementation: public void seek(long newPosition) throws

Re: Replicate changes from DC1 to DC2, but not from DC2 to DC1

2011-02-21 Thread Aaron Morton
Take a look at the NetworkTopologyStrategy and/or the RackInferringSnitch; together they decide where to place replicas. It's probably not a great idea to muck around with this stuff though. How about a Hadoop job to pull out the data you want? It would be a full scan, but in parallel. Aaron

Re: millions of columns in a row vs millions of rows with one column

2011-02-21 Thread Aaron Morton
My preference is to go with more rows as it distributes load better. But the best design is the one that supports your read patterns. See http://wiki.apache.org/cassandra/LargeDataSetConsiderations for background. Aaron On 22/02/2011, at 3:56 AM, Héctor Izquierdo Seliva izquie...@strands.com

Re: Distribution Factor: part of the solution to many-CF problem?

2011-02-21 Thread Aaron Morton
Sounds a bit like this idea http://www.mail-archive.com/dev@cassandra.apache.org/msg01799.html Aaron On 22/02/2011, at 1:28 AM, David Boxenhorn da...@lookin2.com wrote: Cassandra is both distributed and replicated. We have Replication Factor but no Distribution Factor! Distribution

Re: 0.7.2 slow memtables flushing

2011-02-21 Thread ruslan usifov
2011/2/21 Ivan Georgiev yngw...@bk.ru That is strange. In 0.7.0 I see this for seek: public void seek(long pos) throws IOException { this.curr_ = pos; } You are not looking at the 0.7.0 version; you are looking at a version before cassandra/branches/cassandra-0.7@1052531 (2010-12-24 16:57:07 + (8 weeks ago))

Re: 0.7.2 slow memtables flushing

2011-02-21 Thread Jonathan Ellis
And it's not like it didn't have to rebuffer then, just that the code organization was different. On Mon, Feb 21, 2011 at 2:12 PM, ruslan usifov ruslan.usi...@gmail.com wrote: 2011/2/21 Ivan Georgiev yngw...@bk.ru That is strange. In 0.7.0 i see this for seek: public void seek(long pos)

Re: 0.7.2 slow memtables flushing

2011-02-21 Thread Ivan Georgiev
I meant what was tagged as 0.7.0, at least that is what I used in my 0.7.0 tests: http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.7.0/ Ivan On 21.2.2011, 22:12, ruslan usifov wrote: 2011/2/21 Ivan Georgiev yngw...@bk.ru That is strange. In 0.7.0 I

Re: 0.7.2 slow memtables flushing

2011-02-21 Thread Jonathan Ellis
If you look in that code, the bounds are checked on each write and reBuffer is called from there instead of from seek. On Mon, Feb 21, 2011 at 2:21 PM, Ivan Georgiev yngw...@bk.ru wrote: I meant what was tagged as 0.7.0, at least that is what I used in my 0.7.0 tests:

Re: Understand eventually consistent

2011-02-21 Thread Peter Schuller
I read the logic of why writes are not allowed. But the other alternative is to allow writes and just fail the reads until it's in sync again. Is there some other problem with this logic? The problem lies in "until it's in sync again". A given node cannot easily know, for a given read, whether

Re: C++ client for Cassandra

2011-02-21 Thread Padraig O'Sullivan
Hi, I had some spare time this weekend so I updated libcassandra to work with the latest stable release of Cassandra (0.7.2). Creating/dropping column families and keyspaces is now supported through libcassandra. I updated the API a fair bit based on the new changes in 0.7. -Padraig On Wed, Dec

R: Re: Are row-keys sorted by the compareWith?

2011-02-21 Thread cbert...@libero.it
Sorry Dan, I just noticed I answered you and not the group! I didn't want to bother you, it was just a mistake. Best Regards Carlo Original message From: d...@reactive.org Date: 21/02/2011 4.23 To: user@cassandra.apache.org, cbert...@libero.it Subject: Re: Are row-keys sorted by the

I: Re: Are row-keys sorted by the compareWith?

2011-02-21 Thread cbert...@libero.it
As Jonathan mentions, the compareWith on a column family definition defines the order for the columns *within* a row... In order to control the ordering of rows you'll need to use the OrderPreservingPartitioner (http://www.datastax.com/docs/0.7/operations/clustering#tokens-partitioners-ring).

Rows and deletion

2011-02-21 Thread Ásgeir Halldórsson
Hi, since it's not possible to get an accurate row range list because of ghost rows at the moment, what is the best solution to get an accurate list without ghosts? What I am doing is listing objects that have more data in detail. Regards, Ásgeir Halldórsson

Re: Rows and deletion

2011-02-21 Thread Tyler Hobbs
Since it's not possible to get an accurate row range list because of ghost rows at the moment, what is the best solution to get an accurate list without ghosts? Page through all of the rows normally, but skip rows which have zero columns. -- Tyler Hobbs Software Engineer, DataStax http://datastax.com/
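
The suggestion above, skipping rows that come back with zero columns, can be sketched like this. The `Map` of row key to column list is a hypothetical stand-in for one page of range-slice results, not a client API from the thread; a real client would fetch each page from Cassandra and apply the same filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GhostRowFilter {
    /** Returns the keys of rows that still have at least one column, in page order. */
    static <K, C> List<K> liveKeys(Map<K, List<C>> page) {
        List<K> live = new ArrayList<>();
        for (Map.Entry<K, List<C>> row : page.entrySet()) {
            if (!row.getValue().isEmpty()) { // a "ghost" row has a key but no columns
                live.add(row.getKey());
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Map<String, List<String>> page = new LinkedHashMap<>();
        page.put("a", Arrays.asList("col1"));
        page.put("b", Collections.<String>emptyList()); // deleted row, still listed
        page.put("c", Arrays.asList("col2"));
        System.out.println(liveKeys(page)); // prints [a, c]
    }
}
```

Because a filtered page can contain fewer live rows than requested, a caller that needs fixed-size pages would have to keep fetching until it has collected enough.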

Re: Distribution Factor: part of the solution to many-CF problem?

2011-02-21 Thread David Boxenhorn
No, that's not what I mean at all. That message is about the ability to use different partitioners for different CFs, say, RandomPartitioner for one, OPP for another. I'm talking about defining how many nodes a CF should be distributed over, which would be useful if you have a lot of nodes and a

Re: Rows and deletion

2011-02-21 Thread Jeremy Hanna
On Feb 21, 2011, at 4:33 PM, Ásgeir Halldórsson wrote: Thanks for the fast response but that would be quite difficult on paging results, do you know if there is a fix in the works? I don't think the range ghosts behavior is going away. Is it possible to buffer results and return them once

RE: Rows and deletion

2011-02-21 Thread Ásgeir Halldórsson
Thanks for the fast response, but that would be quite difficult when paging results. Do you know if there is a fix in the works? Regards, Ásgeir From: Tyler Hobbs [mailto:ty...@datastax.com] Sent: 21 February 2011 23:20 To: user@cassandra.apache.org Subject: Re: Rows and deletion

Re: Error when bringing up 3rd node

2011-02-21 Thread mcasandra
I am still getting the following: On node 1: ERROR 16:57:31,365 Fatal error: Bootstraping to existing token 0 is not allowed (decommission/removetoken the old node first). On node 2: ERROR 16:57:42,300 Fatal error: Bootstraping to existing token 56713727820156410577229101238628035242 is not
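
The token on node 2 in this error equals floor(2^127 / 3), which is what an evenly spaced three-node ring would use for its second node. The sketch below computes such tokens, assuming RandomPartitioner's token range of 0 to 2^127; the class and method names are hypothetical. Each node needs a token that no other node in the ring already owns, which is what the error is complaining about.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class TokenSketch {
    /** Evenly spaced tokens: token(i) = floor(i * 2^127 / nodeCount). */
    static List<BigInteger> evenTokens(int nodeCount) {
        BigInteger range = BigInteger.ONE.shiftLeft(127); // 2^127
        List<BigInteger> tokens = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            tokens.add(range.multiply(BigInteger.valueOf(i))
                            .divide(BigInteger.valueOf(nodeCount)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // For 3 nodes the first two values are 0 and
        // 56713727820156410577229101238628035242, the tokens named in the error.
        for (BigInteger token : evenTokens(3)) {
            System.out.println(token);
        }
    }
}
```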

does cassandra support IBM JDK ?

2011-02-21 Thread Xiaobo Gu
does anybody -- Sent from my mobile device

does cassandra support IBM JDK ?

2011-02-21 Thread Xiaobo Gu
Is there anybody doing this? -- Sent from my mobile device

Re: Error when bringing up 3rd node

2011-02-21 Thread mcasandra
mcasandra wrote: I am still getting the following: On node 1: ERROR 16:57:31,365 Fatal error: Bootstraping to existing token 0 is not allowed (decommission/removetoken the old node first). On node 2: ERROR 16:57:42,300 Fatal error: Bootstraping to existing token

Re: Inconsistent result in super range slice query (reversed order)

2011-02-21 Thread Shotaro Kamio
Hi Tyler, your script doesn't cause the problem, but the problem does occur in a certain situation. My colleague analyzed the problem and found out how to reproduce it. Please look at the JIRA. https://issues.apache.org/jira/browse/CASSANDRA-2212 Best regards, Shotaro On Fri, Feb 18, 2011

Re: does cassandra support IBM JDK ?

2011-02-21 Thread Jonathan Ellis
Yes. 2011/2/21 Xiaobo Gu guxiaobo1...@gmail.com: Is there anybody doing this? -- Sent from my mobile device -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com

Re: C++ client for Cassandra

2011-02-21 Thread Jonathan Ellis
Thanks, Padraig! On Mon, Feb 21, 2011 at 3:30 PM, Padraig O'Sullivan osullivan.padr...@gmail.com wrote: Hi, I had some spare time this weekend so I updated libcassandra to work with the latest stable release of Cassandra (0.7.2). Creating/dropping column families and keyspaces is now

Re: Inconsistent result in super range slice query (reversed order)

2011-02-21 Thread Tyler Hobbs
I checked out #2212 and was able to reproduce the problem. Thanks for investigating this and putting together a good script to reproduce! - Tyler

Re: Replicate changes from DC1 to DC2, but not from DC2 to DC1

2011-02-21 Thread Héctor Izquierdo Seliva
On Tue, 22-02-2011 at 08:46 +1300, Aaron Morton wrote: Take a look at the NetworkTopologyStrategy and/or the RackInferringSnitch; together they decide where to place replicas. It's probably not a great idea to muck around with this stuff though. How about a Hadoop job to pull out the

Re: millions of columns in a row vs millions of rows with one column

2011-02-21 Thread Héctor Izquierdo Seliva
On Tue, 22-02-2011 at 08:49 +1300, Aaron Morton wrote: My preference is to go with more rows as it distributes load better. But the best design is the one that supports your read patterns. See http://wiki.apache.org/cassandra/LargeDataSetConsiderations for background. Aaron