Re: RF=1 w/ hadoop jobs

2011-09-05 Thread Mick Semb Wever
On Fri, 2011-09-02 at 09:28 +0200, Patrik Modesto wrote: We use Cassandra as a storage for web-pages, we store the HTML, all URLs that have the same HTML data and some computed data. We run Hadoop MR jobs to compute lexical and thematic data for each page and for exporting the data to a

Re: RF=1 w/ hadoop jobs

2011-09-05 Thread Patrik Modesto
On Mon, Sep 5, 2011 at 09:39, Mick Semb Wever m...@apache.org wrote: I've entered a JIRA issue covering this request. https://issues.apache.org/jira/browse/CASSANDRA-3136 Would you mind attaching your patch to the issue? (No review of it will happen anywhere else.) I see Jonathan didn't

Re: RF=1 w/ hadoop jobs

2011-09-05 Thread Mick Semb Wever
On Mon, 2011-09-05 at 21:52 +0200, Patrik Modesto wrote: I'm not sure about 0.8.x and 0.7.9 (to be released today with your patch) but 0.7.8 will fail even with RF>1 when there is a Hadoop TaskTracker without local Cassandra. So increasing RF is not a solution. This isn't true (or not the

Re: RF=1 w/ hadoop jobs

2011-09-02 Thread Patrik Modesto
Hi, On Thu, Sep 1, 2011 at 12:36, Mck m...@apache.org wrote: It's available here: http://pastebin.com/hhrr8m9P (for version 0.7.8) I'm interested in this patch and see its usefulness but no one will act until you attach it to an issue. (I think a new issue is appropriate here). I'm glad

Re: RF=1 w/ hadoop jobs

2011-09-02 Thread Mick Semb Wever
On Fri, 2011-09-02 at 08:20 +0200, Patrik Modesto wrote: As Jonathan already explained himself: ignoring unavailable ranges is a misfeature, imo. Generally it's not what one would want, I think. But I can see the case when data is to be treated as volatile and ignoring unavailable ranges may be

Re: RF=1 w/ hadoop jobs

2011-09-02 Thread Patrik Modesto
On Fri, Sep 2, 2011 at 08:54, Mick Semb Wever m...@apache.org wrote: Patrik: is it possible to describe the use-case you have here? Sure. We use Cassandra as a storage for web-pages, we store the HTML, all URLs that have the same HTML data and some computed data. We run Hadoop MR jobs to compute

Re: RF=1 w/ hadoop jobs

2011-09-01 Thread Mck
On Thu, 2011-08-18 at 08:54 +0200, Patrik Modesto wrote: But there is another problem with Hadoop-Cassandra: if there is no node available for a range of keys, it fails on RuntimeError. For example, with a keyspace at RF=1 and one node down, all MapReduce tasks fail. CASSANDRA-2388 is