Re: using hadoop + cassandra for CF mutations (delete)

2014-04-08 Thread William Oberman
I use PHP and phpCassa to talk to Cassandra from within my app. I'm using the structure of the script below to run a local mutation on each of my nodes: === <?php require_once('PATH/TO/phpcassa-1.0.a.6/lib/autoload.php'); use phpcassa\ColumnFamily; use

Re: Fwd: using hadoop + cassandra for CF mutations (delete)

2014-04-07 Thread Suraj Nayak
to be hadoop 1.0.3 + pig 11. will -- Forwarded message -- From: William Oberman ober...@civicscience.com Date: Fri, Apr 4, 2014 at 12:24 PM Subject: using hadoop + cassandra for CF mutations (delete) To: user@cassandra.apache.org user@cassandra.apache.org Hi, I have some history

Re: using hadoop + cassandra for CF mutations (delete)

2014-04-04 Thread Paulo Ricardo Motta Gomes
You said you have tried the Pig URL split_size, but have you actually tried decreasing the value of the cassandra.input.split.size Hadoop property? The default is 65536, so you may want to decrease that to see if the number of mappers increases. But at some point, even if you lower that value it will
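
To illustrate the split-size advice above, here is a rough sketch in Python. It assumes the number of input splits (and so the number of mappers) is approximately the estimated row count divided by cassandra.input.split.size. The function name and the row counts are hypothetical, and the real split calculation depends on Cassandra's own per-range row estimates, so treat the numbers as a back-of-the-envelope guide only.

```python
import math

# Default of the cassandra.input.split.size property quoted in the message above.
DEFAULT_SPLIT_SIZE = 65536


def estimated_splits(estimated_rows, split_size=DEFAULT_SPLIT_SIZE):
    """Approximate how many input splits (mappers) a job would get.

    Assumption: splits ~= ceil(rows / split_size). This is a simplification,
    not the exact algorithm used by the Cassandra Hadoop input format.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    if estimated_rows < 0:
        raise ValueError("estimated_rows must not be negative")
    # Even an almost empty column family yields at least one split.
    return max(1, math.ceil(estimated_rows / split_size))


if __name__ == "__main__":
    rows = 1_000_000  # hypothetical row count
    for size in (65536, 16384, 8192):
        print(size, estimated_splits(rows, size))
```

Lowering the split size raises the estimate roughly in proportion, which matches the suggestion to decrease the property and watch whether the mapper count goes up.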

Re: Data modelling for range retrieval. Was: Re: Hadoop/Cassandra for data transformation (rather than analysis)?

2013-08-14 Thread Aaron Morton
Is it good practice then to find an attribute in my data that would allow me to form wide row row keys with approx. 1000 values each? You can do that using get_range_slice() via thrift. And via CQL 3 you use the token() function and LIMIT with a SELECT statement. Check the DS docs for more
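
The CQL 3 approach mentioned above (token() plus LIMIT) can be sketched as a small query builder in Python. This only builds query strings and does not talk to a cluster. The table name `products` and key column `product_id` are hypothetical, and `?` stands for a bind marker that would carry the last partition key seen on the previous page.

```python
def page_query(table, key_column, page_size, after_last_key=False):
    """Build a CQL 3 SELECT for one page of a full-table scan.

    The first page has no WHERE clause. Later pages continue after the
    token of the last key returned by the previous page.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    query = "SELECT * FROM " + table
    if after_last_key:
        # token() orders rows by partitioner token, not by key value.
        query += " WHERE token({0}) > token(?)".format(key_column)
    return query + " LIMIT {0}".format(page_size)


if __name__ == "__main__":
    print(page_query("products", "product_id", 1000))
    print(page_query("products", "product_id", 1000, after_last_key=True))
```

A caller would run the first query, remember the last key returned, bind it into the second query, and repeat until a page comes back with fewer rows than the limit.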

Re: Hadoop/Cassandra for data transformation (rather than analysis)?

2013-08-12 Thread Aaron Morton
As I do not have billions of input records (but a max of 10 million) the added benefit of scaling out the per-line processing is probably not worth the additional setup and operations effort of Hadoop. I would start with a regular app and then go to Hadoop if needed, assuming you are only

Data modelling for range retrieval. Was: Re: Hadoop/Cassandra for data transformation (rather than analysis)?

2013-08-12 Thread Jan Algermissen
Aaron, On 12.08.2013, at 23:17, Aaron Morton aa...@thelastpickle.com wrote: As I do not have billions of input records (but a max of 10 million) the added benefit of scaling out the per-line processing is probably not worth the additional setup and operations effort of Hadoop. I would

Hadoop/Cassandra for data transformation (rather than analysis)?

2013-08-10 Thread Jan Algermissen
Hi, I have a specific use case to address with Cassandra and I can't get my head around whether using Hadoop on top creates any significant benefit or not. Situation: I have product data and each product 'contains' a number of articles (100 / product), representing individual colors/sizes

Re: hadoop/cassandra integration using CL_ONE...

2013-07-29 Thread aaron morton
Is it possible to use CL_ONE with hadoop/cassandra when doing an M/R job? That's the default. https://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/hadoop/ConfigHelper.java#L383 And more importantly is there a way to configure that such that if my RF=3

hadoop/cassandra integration using CL_ONE...

2013-07-26 Thread Hiller, Dean
Is it possible to use CL_ONE with hadoop/cassandra when doing an M/R job? And more importantly is there a way to configure that such that if my RF=3, that it only reads from 1 of the nodes in that 3. We have 12 nodes and ideally we would for example hope M/R runs on a2, a9, a5, a12 which happen

Re: Hadoop/Cassandra 1.2 timeouts

2013-06-26 Thread aaron morton
It's an inter node timeout waiting for the read to complete. Normally means the cluster is overloaded in some fashion, check for GC activity and/or overloaded IOPs. If you reduce the batch_size it should help. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand

Hadoop/Cassandra 1.2 timeouts

2013-06-24 Thread Brian Jeltema
I'm having problems with Hadoop job failures on a Cassandra 1.2 cluster due to Caused by: TimedOutException() 2013-06-24 11:29:11,953 INFO Driver -at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12932) This is running on a 6-node cluster,

Hadoop+Cassandra

2013-03-11 Thread oualid ait wafli
Hi, I need a tutorial for deploying Hadoop+Cassandra on a single node. Thanks

Re: Hadoop+Cassandra

2013-03-11 Thread Renato Marroquín Mogrovejo
Hi there, Check this out [1]. It's kinda old but I think it will help you get started. Renato M. [1] http://www.datastax.com/docs/0.7/map_reduce/hadoop_mr 2013/3/11 oualid ait wafli oualid.aitwa...@gmail.com: Hi, I need a tutorial for deploying Hadoop+Cassandra on a single node. Thanks

Re: Hadoop+Cassandra

2013-03-11 Thread oualid ait wafli
wafli oualid.aitwa...@gmail.com: Hi, I need a tutorial for deploying Hadoop+Cassandra on a single node. Thanks

RE: cryptic exception in Hadoop/Cassandra job

2013-01-30 Thread Pieter Callewaert
exception in Hadoop/Cassandra job I have a Hadoop/Cassandra map/reduce job that performs a simple transformation on a table with very roughly 1 billion columns spread across roughly 4 million rows. During reduction, I see a relative handful of the following: Exception in thread Streaming

Re: cryptic exception in Hadoop/Cassandra job

2013-01-30 Thread Brian Jeltema
[mailto:brian.jelt...@digitalenvoy.net] Sent: woensdag 30 januari 2013 13:20 To: user@cassandra.apache.org Subject: cryptic exception in Hadoop/Cassandra job I have a Hadoop/Cassandra map/reduce job that performs a simple transformation on a table with very roughly 1 billion columns spread across

RE: cryptic exception in Hadoop/Cassandra job

2013-01-30 Thread Pieter Callewaert
@cassandra.apache.org Subject: Re: cryptic exception in Hadoop/Cassandra job Cassandra 1.1.5, using BulkOutputFormat Brian On Jan 30, 2013, at 7:39 AM, Pieter Callewaert wrote: Hi Brian, Which version of cassandra are you using? And are you using the BOF to write to Cassandra? Kind regards

Re: cryptic exception in Hadoop/Cassandra job

2013-01-30 Thread Brian Jeltema
/CASSANDRA-4813) Kind regards, Pieter -Original Message- From: Brian Jeltema [mailto:brian.jelt...@digitalenvoy.net] Sent: woensdag 30 januari 2013 13:58 To: user@cassandra.apache.org Subject: Re: cryptic exception in Hadoop/Cassandra job Cassandra 1.1.5, using

Re: Hybrid Hadoop Cassandra Cluster

2013-01-18 Thread Jeremy Hanna
product. Jeremy On Jan 18, 2013, at 6:01 AM, Naveen Reddy naveen_2...@yahoo.co.in wrote: Hi, I want to setup a hybrid Hadoop Cassandra Cluster. Can anyone point me to a sample documentation for this ? Thank you Naveen

Re: inconsistent hadoop/cassandra results

2013-01-10 Thread aaron morton
But this is the first time I've tried to use the wide-row support, which makes me a little suspicious. The wide-row support is not very well documented, so maybe I'm doing something wrong there in ignorance. This was the area I was thinking about. Can you drill in and see a pattern. Are

Re: inconsistent hadoop/cassandra results

2013-01-10 Thread Michael Kjellman
I found that overall Hadoop input/output from Cassandra could use a little more QA and input from the community. (Especially with large datasets). There were some serious BOF bugs in 1.1 that have been resolved in 1.2. (Yay!) But, the problems in 1.1 weren't immediately apparent. Testing in my

Re: inconsistent hadoop/cassandra results

2013-01-09 Thread Brian Jeltema
Sorry if this is a duplicate - I was having mailer problems last night: Assuming there were no further writes, running repair or using CL ALL should have fixed it. Can you describe the inconsistency between runs? Sure. The job output is generated by a single reducer and consists of a

Re: inconsistent hadoop/cassandra results

2013-01-08 Thread aaron morton
Assuming there were no further writes, running repair or using CL ALL should have fixed it. Can you describe the inconsistency between runs? Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 8/01/2013, at 2:16 AM,

Hadoop + Cassandra

2012-01-06 Thread Alain RODRIGUEZ
in order to get more statistics. I'll be glad to learn about any interesting things you learnt with your own experiences with hadoop + Cassandra. Thanks in advance.

Re: Hadoop + Cassandra

2012-01-06 Thread Jeremy Hanna
learnt with your own experiences with hadoop + Cassandra. Thanks in advance.

hadoop cassandra

2011-03-17 Thread Sagar Kohli
Hi all, is there any example of Hadoop and Cassandra integration where the input is from HDFS and the output goes to Cassandra? NOTE: I have gone through the word count example provided with the source code, but it does not cover the above case. Regards, Sagar Are you exploring

Re: hadoop cassandra

2011-03-17 Thread Jeremy Hanna
You can start with a word count example that's only for hdfs. Then you can replace the reducer in that with the ReducerToCassandra that's in the cassandra word_count example. You need to match up your Mapper's output to the Reducer's input and set a couple of configuration variables to tell

RE: hadoop cassandra

2011-03-17 Thread Sagar Kohli
Thanks Jeremy, it's a good pointer to start with. Regards, Sagar From: Jeremy Hanna [jeremy.hanna1...@gmail.com] Sent: Thursday, March 17, 2011 7:34 PM To: user@cassandra.apache.org Subject: Re: hadoop cassandra You can start with a word count example that's