Re: Zurich / Swiss / Alps meetup

2012-07-11 Thread Benoit Perroud
Coming back on this thread, we are proud to announce we opened a Swiss BigData UserGroup. http://www.bigdata-usergroup.ch/ Next meetup is July 16, with topic NoSQL Storage: War Stories and Best Practices. Hope to meet you there ! Benoit. 2012/5/17 Sasha Dolgy sdo...@gmail.com: All, A year

Re: Zurich / Swiss / Alps meetup

2012-05-18 Thread Benoit Perroud
+1 ! 2012/5/17 Sasha Dolgy sdo...@gmail.com: All, A year ago I made a simple query to see if there were any users based in and around Zurich, Switzerland or the Alps region, interested in participating in some form of Cassandra User Group / Meetup.  At the time, 1-2 replies happened.  I

SSTableWriter and Bulk Loading life cycle enhancement

2012-05-03 Thread Benoit Perroud
Hi All, I'm bulk loading (a lot of) data from Hadoop into Cassandra 1.0.x. The provided CFOutputFormat is not the best fit here, so I wanted to use the bulk loading feature. I know 1.1 comes with a BulkOutputFormat but I wanted to propose a simple enhancement to SSTableSimpleUnsortedWriter that

Re: Bulkload into a different CF

2012-05-01 Thread Benoit Perroud
!! Without any guarantee. I know it works but I never used this in production !! You can copy the sstables (renaming them accordingly) and call nodetool refresh. Don't forget to create your column family CF2 before. 2012/5/1 Oleg Proudnikov ol...@cloudorange.com: Hello, Is it possible to

Re: Bulkload into a different CF

2012-05-01 Thread Benoit Perroud
I would first try copying instead of moving, then drop the old CF or the unneeded snapshot, if necessary, once everything is OK. 2012/5/1 Oleg Proudnikov ol...@cloudorange.com: Benoit Perroud benoit at noisette.ch writes: You can copy the sstables (renaming them accordingly

Re: Building SSTables with SSTableSimpleUnsortedWriter

2012-04-29 Thread Benoit Perroud
A big buffer size will use more heap memory when creating the tables. I'm not sure of the impact on the server side, but it shouldn't make a big difference. I personally use 512 MB. 2012/4/28 sj.climber sj.clim...@gmail.com: Can anyone comment on best practices for setting the buffer size used by

Re: unsubscribe

2012-04-27 Thread Benoit Perroud
http://wiki.apache.org/cassandra/FAQ#unsubscribe On 27 April 2012 19:20, Ramkumar Vaidyanathan (PDF) ramkumar.vaidyanat...@pdf.com wrote: unsubscribe

Re: unsubscribe

2012-04-07 Thread Benoit Perroud
http://wiki.apache.org/cassandra/FAQ#unsubscribe On 7 April 2012 14:37, Jeffrey Fass jeffreyf...@lineardesign.net wrote: unsubscribe -- sent from my Nokia 3210

Bulk loading errors with 1.0.8

2012-04-05 Thread Benoit Perroud
Hi All, I'm experiencing the following errors while bulk loading data into a cluster ERROR [Thread-23] 2012-04-05 09:58:12,252 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-23,5,main] java.lang.RuntimeException: Insufficient disk space to flush

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-27 Thread Benoit Perroud
Hi All, Thanks a lot for the release. I just upgraded my 1.1-beta1 to 1.1-beta2, and I get the following error : INFO 10:56:17,089 Opening /app/cassandra/data/data/system/LocationInfo/system-LocationInfo-hc-18 (74 bytes) INFO 10:56:17,092 Opening

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-27 Thread Benoit Perroud
. Sorry for any inconvenience. -- Sylvain On Tue, Mar 27, 2012 at 12:57 PM, Benoit Perroud ben...@noisette.ch wrote: Hi All, Thanks a lot for the release. I just upgraded my 1.1-beta1 to 1.1-beta2, and I get the following error :  INFO 10:56:17,089 Opening /app/cassandra/data/data/system

Re: Cassandra - crash with “free() invalid pointer”

2012-03-22 Thread Benoit Perroud
Sounds like a race condition in the off-heap caching while calling Unsafe.free(). Do you use the cache? What is your use case when you encounter this error? Are you able to reproduce it? 2012/3/22 Maciej Miklas mac.mik...@googlemail.com: Hi *, My Cassandra installation runs on the following system:

Re: design that mimics twitter tweet search

2012-03-18 Thread Benoit Perroud
The simplest model you could have uses the keyword as the key, a timestamp/time UUID as the column name and the tweetid as the value: cf['keyword']['timestamp'] = tweetid. Then you do a range query to get all tweetids sorted by time (you may want them in reverse order) and you can limit to the number
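The model in this reply can be sketched in memory as follows. This is only an illustration under stated assumptions: a plain Python dict stands in for the column family, integer timestamps stand in for time UUIDs, and the helper names `index_tweet` and `latest_tweets` are hypothetical, not part of any Cassandra client.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the column family:
# row key = keyword, column name = timestamp, column value = tweet id.
cf = defaultdict(dict)


def index_tweet(keyword, timestamp, tweet_id):
    """Store one tweet id under its keyword, using the timestamp as column name."""
    cf[keyword][timestamp] = tweet_id


def latest_tweets(keyword, limit):
    """Return up to `limit` tweet ids for a keyword, newest first."""
    columns = cf.get(keyword, {})
    # Sorting the column names in reverse mimics a reversed range query.
    newest_first = sorted(columns, reverse=True)[:limit]
    return [columns[ts] for ts in newest_first]


index_tweet("cassandra", 100, "tweet-a")
index_tweet("cassandra", 300, "tweet-c")
index_tweet("cassandra", 200, "tweet-b")
```

Calling `latest_tweets("cassandra", 2)` plays the role of the reversed, limited range query on the row for that keyword.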

Re: Link in Wiki broken

2012-03-18 Thread Benoit Perroud
http://blip.tv/datastax/getting-to-know-the-cassandra-codebase-4034648 2012/3/18 Tharindu Mathew mcclou...@gmail.com: Hi, It seems that [1] is broken. Wonder if it exists somewhere else? [1] - http://www.channels.com/episodes/show/11765800/Getting-to-know-the-Cassandra-Codebase --

Re: Cassandra 1.1 row isolation cross datacenter replication

2012-02-21 Thread Benoit Perroud
Isolation is guaranteed locally on the node. If two clients are reading from and writing to the same node, the one that reads will not see partial mutations. 2012/2/21 Allen Servedio allen.serve...@gmail.com: Hi, I saw that row level isolation was added in the beta of Cassandra 1.1 and I have the

Re: Counters and Top 10

2011-12-25 Thread Benoit Perroud
With composite column names, you can even have a column name composed of score (int) and userid (uuid or whatever). Leave the column value empty to avoid repeating the user UUID. 2011/12/22 R. Verlangen ro...@us2.nl: I would suggest you to create a CF with a single row (or multiple for historical data) with a date
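A rough sketch of the composite-name idea, assuming a sorted Python list of `(score, user_id)` tuples can stand in for one wide row ordered by a composite comparator. The names `row`, `add_score` and `top` are hypothetical and only serve the illustration.

```python
import bisect

# Hypothetical stand-in for one wide row whose column names are
# (score, user_id) composites; column values are left empty, so the
# user id lives only in the column name.
row = []  # kept sorted by (score, user_id), as a composite comparator would


def add_score(score, user_id):
    """Insert a (score, user_id) column name, keeping the row sorted."""
    bisect.insort(row, (score, user_id))


def top(n):
    """Return the n highest (score, user_id) column names, best first."""
    if n <= 0:
        return []
    return row[-n:][::-1]


add_score(10, "u1")
add_score(50, "u2")
add_score(30, "u3")
```

Reading the last `n` columns of the row in reverse order gives the leaderboard without a separate sort step on the client.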

Re: need help with choosing correct tokens for ByteOrderedPartitioner

2011-11-28 Thread Benoit Perroud
You may want to add 29991231 instead of appending. On Monday, 28 November 2011, Piavlo lolitus...@gmail.com wrote: Anyone can help with this? Thanks On 11/24/2011 11:55 AM, Piavlo wrote: Hi, We need help with choosing correct tokens for ByteOrderedPartitioner Originally the key where

Re: Off-heap caching through ByteBuffer.allocateDirect when JNA not available ?

2011-11-10 Thread Benoit Perroud
/browse/CASSANDRA-3271 On Wed, Nov 9, 2011 at 5:54 AM, Benoit Perroud ben...@noisette.ch wrote: Hi, I wonder if you have already discussed ByteBuffer.allocateDirect as an alternative to JNA memory allocation? If so, would someone mind sending me a pointer? Thanks ! Benoit. -- Jonathan

Off-heap caching through ByteBuffer.allocateDirect when JNA not available ?

2011-11-09 Thread Benoit Perroud
Hi, I wonder if you have already discussed ByteBuffer.allocateDirect as an alternative to JNA memory allocation? If so, would someone mind sending me a pointer? Thanks ! Benoit.

Re: Multiple Keyword Lookup Indexes

2011-11-07 Thread Benoit Perroud
You could directly use secondary indexes on the other fields instead of maintaining your indexes yourself: define your global id (it can be a UUID), and have columns loginName, email etc. with a secondary index. Retrieval will then be fast. 2011/11/7 Felix Sprick fspr...@gmail.com: Hello, We are

Re: Bulk uploader issue on multi-node cluster

2011-09-23 Thread Benoit Perroud
On the sstableloader config, make sure you have the seed set and rpc_address and rpc_port pointing to your cassandra instance (127.0.0.2) 2011/9/23 Thamizh tceg...@yahoo.co.in Hi All, I am using bulk-loading to upload data(from lab02) to multi-node cluster of 3 machines(lab02,lab03 lab04)

Re: import data into cassandra

2011-09-18 Thread Benoit Perroud
There is no direct way to do that, but reading a CSV and inserting rows in Java is really easy. You may also want to have a look at the new bulk loading tool, sstableloader, described here : http://www.datastax.com/dev/blog/bulk-loading Small detail: it seems you still write emails to the incubator

SSTableSimpleUnsortedWriter take long time when inserting big rows

2011-09-02 Thread Benoit Perroud
Hi All, I started using SSTableSimpleUnsortedWriter to load data, and my data has few rows but a lot of column names in each row. I call SSTableSimpleUnsortedWriter.newRow every 10'000 columns inserted. But the time taken to insert columns increases as the column family grows. The

Re: SSTableSimpleUnsortedWriter take long time when inserting big rows

2011-09-02 Thread Benoit Perroud
Thanks for your answer. 2011/9/2 Sylvain Lebresne sylv...@datastax.com: On Fri, Sep 2, 2011 at 10:29 AM, Benoit Perroud ben...@noisette.ch wrote: Hi All, I started using SSTableSimpleUnsortedWriter to load data, and my data has few rows but a lot of column names in each row. I call

Re: The way to query a CF with start 10 and end 100

2011-08-29 Thread Benoit Perroud
Queries such as start 10 and end 100 are not straightforward to model: you should use the value of start as the column name, and check the second condition on the client side. Just for comparison, modeling 10 value 100 is much easier if you set your values as column names, or using CompositeType if

Re: CompositeType

2011-08-15 Thread Benoit Perroud
You should have a look at https://github.com/edanuff/CassandraIndexedCollections This is a rather good starting point for Composites. 2011/8/15 Stephen Pope stephen.p...@quest.com:  Hey, is there any documentation or examples of how to use the CompositeType? I can't find anything about it on

Re: Need help in CF design

2011-08-11 Thread Benoit Perroud
You can apply this query really simply using cassandra and secondary indexes. You will have a CF TABLE, where row keys are your PK. Just to be sure of my understanding, your SQL query will either return 1 row or no row, right ? 3) SliceQuery returns a range of columns for a given key, it

Re: Fewer wide rows vs. more smaller rows

2011-08-07 Thread Benoit Perroud
performance http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ There is no magic number. The best advice is to follow Jonathan's advice. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 5 Aug 2011, at 08:22, Benoit Perroud

Re: Setup Cassandra0.8 in Eclipse

2011-08-07 Thread Benoit Perroud
Make sure svn is on the PATH. If you open a terminal (or cmd), running svn command should work. On 07. 08. 11 23:39, Alvin UW wrote: It seems svn wasn't installed, but i did install it.

Re: How to solve this kind of schema disagreement...

2011-08-05 Thread Benoit Perroud
Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement, 75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and remove the schema* and migration* sstables from both 192.168.1.28 and 192.168.1.27 2011/8/5 Dikang Gu dikan...@gmail.com: [default@unknown] describe cluster;

Fewer wide rows vs. more smaller rows

2011-08-04 Thread Benoit Perroud
Hi All, From a conceptual point of view, I'm wondering what the pros and cons are, mainly in terms of access efficiency, of both approaches: - Grouping row keys together to reduce the number of keys, but having wider rows (with more columns) - One object in one row Let's illustrate with an example : I

Re: HOW TO select a column or all columns that start with X

2011-08-04 Thread Benoit Perroud
https://github.com/edanuff/CassandraIndexedCollections 2011/8/4 CASSANDRA learner cassandralear...@gmail.com: Can you please gimme an example on this using hector client On Thu, Aug 4, 2011 at 7:18 AM, Boris Yen yulin...@gmail.com wrote: It seems to me that your column name consists of two

Re: Fewer wide rows vs. more smaller rows

2011-08-04 Thread Benoit Perroud
Thanks for your advice. Makes sense. And without sticking to my dummy example, conceptually, what has a smaller memory footprint: 1M rows of 1 column or 1 row with 1M columns? And if the row key and column name are known, is there any performance difference between both scenarios ? Thanks

Re: Sample Cassandra project in Tomcat

2011-08-03 Thread Benoit Perroud
I suppose what you are looking for is an example of interacting with a java app. You should have a look at the high(er) level client hector https://github.com/rantav/hector/ You should find what you are looking for there. If you are looking for a tomcat (and .war) example, you should send an

Re: Killing cassandra is not working

2011-08-03 Thread Benoit Perroud
It seems you already have a Cassandra instance running, so the second instance cannot open the same port. I would suggest killing all instances of Cassandra and starting it again. 2011/8/3 Nilabja Banerjee nilabja.baner...@gmail.com try to use *grep* command to check the port where

Re: Killing cassandra is not working

2011-08-03 Thread Benoit Perroud
So use netstat to find out which process has opened the port. 2011/8/3 CASSANDRA learner cassandralear...@gmail.com Thnks for the reply Nila When i did PS command, I could not able to find any process related to cassandra. Thts the problem.. On Wed, Aug 3, 2011 at 4:12 PM, Benoit

Re: Sample Cassandra project in Tomcat

2011-08-03 Thread Benoit Perroud
2011/8/3 CASSANDRA learner cassandralear...@gmail.com: Hi,  can you please send me the mailing list address of tomcat http://tomcat.apache.org/lists.html On Wed, Aug 3, 2011 at 4:07 PM, Benoit Perroud ben...@noisette.ch wrote: I suppose what you are looking for is an example of interacting

Re: Cassandra start/stop scripts

2011-08-02 Thread Benoit Perroud
kill -9 (SIGKILL) is the worst signal to use. It has the advantage of killing the process quickly, but no shutdown hooks are called. You had better use kill -15 (SIGTERM, which is the default). 2011/7/26 mcasandra mohitanch...@gmail.com: I need to write cassandra start/stop script. Currently I run
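The difference between the two signals can be illustrated with a small Python analogue of a shutdown hook. This is a sketch under stated assumptions, not how Cassandra itself handles shutdown (that happens inside the JVM): the names `cleanup_log` and `on_sigterm` are hypothetical, and the example assumes a POSIX system.

```python
import os
import signal

# Hypothetical record of cleanup work, standing in for flushing data
# or closing files in a real daemon.
cleanup_log = []


def on_sigterm(signum, frame):
    """Runs when the process receives SIGTERM (what plain `kill` sends)."""
    cleanup_log.append("flushed and closed cleanly")
    # A real daemon would exit here, for example with sys.exit(0).


# SIGTERM can be intercepted, so cleanup code gets a chance to run.
signal.signal(signal.SIGTERM, on_sigterm)

# SIGKILL (kill -9) cannot be caught, blocked or ignored: no handler
# can be registered for it, so a process killed that way never runs
# any cleanup code.
```

Sending `os.kill(os.getpid(), signal.SIGTERM)` to this process triggers the handler, whereas SIGKILL would end the process before any cleanup code could run.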

Small typo in conf/cassandra.yaml

2011-05-10 Thread Benoit Perroud
Hi all, I found out a small typo in cassandra.yaml, which can confuse inattentive copy-paster. Here is the patch. Index: conf/cassandra.yaml === --- conf/cassandra.yaml (revision 1101465) +++ conf/cassandra.yaml (working copy) @@

Re: Usage Pattern : "unique" value of a key.

2011-01-13 Thread Benoit Perroud
, then both nodes think the key belongs to them. So my idea of writing a lock is not well suited... Does anyone have another idea to share regarding this topic ? Thanks, Kind regards, Benoit. 2011/1/13 Oleg Anastasyev olega...@gmail.com: Benoit Perroud benoit at noisette.ch writes: My idea

Usage Pattern : unique value of a key.

2011-01-12 Thread Benoit Perroud
Hi ML, I wonder if someone has already experimented with some kind of unique index on a column family key. Let's take a short example: the key is the username. What happens if 2 users want to sign up at the same time with the same username? So has someone already addressed this pattern in Cassandra

Re: Quick Poll: Server names

2010-07-27 Thread Benoit Perroud
We use names of (European) cities for logical functionalities: - berlin01, berlin02, berlin03 are part of the MySQL cluster, - zurich1 and zurich2 are AD, - roma01, roma02, and so on are the Cassandra cluster for the Roma project - and so on. We found this to be a good tradeoff. Regards, Benoit.

Re: Does anybody work about transaction on cassandra ?

2010-04-24 Thread Benoit Perroud
orthogonal means go to the opposite direction, but without going back. Including transaction in Cassandra needs to turn 90 degrees the design of Cassandra. Kind regards, Benoit. 2010/4/24 dir dir sikerasa...@gmail.com: Transactions are orthogonal to the design of Cassandra Sorry, Would you

Re: Does anybody work about transaction on cassandra ?

2010-04-24 Thread Benoit Perroud
not understand what is the meaning of needs to turn 90 degrees?? Thank you. On Sun, Apr 25, 2010 at 12:30 AM, Benoit Perroud ben...@noisette.ch wrote: orthogonal means go to the opposite direction, but without going back. Including transaction in Cassandra needs to turn 90 degrees the design

Re: Does anybody work about transaction on cassandra ?

2010-04-24 Thread Benoit Perroud
OK, in this particular context it means no dependencies. Thanks for the clarification. Kind regards, Benoit. 2010/4/24 Jonathan Ellis jbel...@gmail.com: On Sat, Apr 24, 2010 at 12:44 PM, Benoit Perroud ben...@noisette.ch wrote: orthogonal means 90 degrees.  Two lines are orthogonal

Re: ORM in Cassandra?

2010-04-23 Thread Benoit Perroud
I understand the question more as: is there already a lib which helps to get rid of writing hardcoded and hard-to-maintain lines like: MyClass data; String[] myFields = {"name", "label", ...}; List<Column> columns; for (String field : myFields) { if (field.equals("name")) { columns.add(new

Re: How many KeySpace will you use in a single application?

2010-04-10 Thread Benoit Perroud
One point in favor of using several keyspaces is that the replication factor is per keyspace. If a part of your application generates a lot of data which can be lost (some non-critical logs?), then a dedicated keyspace with a smaller replication factor can be a good thing. Kind regards,

Re: Heap sudden jump during import

2010-04-03 Thread Benoit Perroud
There are tools other than jhat for browsing a heap dump, which stream the heap dump instead of loading it fully into memory as jhat does. Kind regards, Benoit. 2010/4/3 Weijun Li weiju...@gmail.com: I'm running a test to write 30 million columns (700bytes each) to Cassandra: the process ran

Re: Heap sudden jump during import

2010-04-03 Thread Benoit Perroud
...@gmail.com: Thank you Benoit. I did a search but couldn't find any that you mentioned. Both jhat and netbean load entire map file int memory. Do you know the name of the tools that requires less memory to view map file? Thanks, -Weijun On Sat, Apr 3, 2010 at 12:55 AM, Benoit Perroud ben

Re: multinode cluster wiki page

2010-04-03 Thread Benoit Perroud
Hi, Nice work. I spotted just one small mistake: the second <ListenAddress>192.168.1.1</ListenAddress> should be <ListenAddress>192.168.2.34</ListenAddress>. I would also suggest adding a small part on making the Thrift interface listen on more than localhost. Kind regards, Benoit. 2010/4/3 Benjamin

Re: get_range_slice leads to java.lang.OutOfMemoryError?

2010-04-02 Thread Benoit Perroud
One way to read the whole database without running out of memory is to limit the number of rows returned and to iterate over the query, using the last returned key as the next starting key. Note that, done this way, the first key of the next iteration is the same as the last key of the previous iteration. The
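The paging loop described in this reply might look like the sketch below. It is an illustration only: `DATA` and `get_range` are hypothetical in-memory stand-ins for the cluster and for a range query with an inclusive start key, not a real client API, and keys are assumed to come back in a stable order.

```python
# Hypothetical data set standing in for the rows stored in the cluster.
DATA = {"k%02d" % i: {"col": i} for i in range(10)}


def get_range(start_key, count):
    """Hypothetical range query: up to `count` rows with key >= start_key."""
    keys = sorted(k for k in DATA if k >= start_key)
    return [(k, DATA[k]) for k in keys[:count]]


def iterate_all(page_size):
    """Yield every row exactly once, fetching at most `page_size` rows per query."""
    if page_size < 2:
        # With a page of 1, the only row returned would be the start key itself.
        raise ValueError("page_size must be at least 2")
    start_key = ""
    first_page = True
    while True:
        page = get_range(start_key, page_size)
        # Every page after the first begins with the last key already seen,
        # so that duplicate is skipped.
        rows = page if first_page else page[1:]
        for key, row in rows:
            yield key, row
        if len(page) < page_size:
            break  # a short page means the end of the data was reached
        start_key = page[-1][0]
        first_page = False
```

Only one page of rows is held in memory at a time, which is what keeps a full scan from exhausting the heap.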

Re: Nodes Timing Out

2010-03-28 Thread Benoit Perroud
Does ulimit -n return unlimited for you? 2010/3/28 James Golick jamesgol...@gmail.com: unlimited On Sat, Mar 27, 2010 at 12:09 PM, Chris Goffinet goffi...@digg.com wrote: what's the ulimit set to? -Chris On Mar 27, 2010, at 10:29 AM, James Golick wrote: Hey, I put our first cluster in to