Just trying to understand your use case:
you need an hourly job to run on data between 6:40 AM and 7:40 AM. Would
it be like a moving window? For example, run hourly jobs on
6:41 AM to 7:41 AM
6:42 AM to 7:42 AM
and so on...
On Mon, Feb 27, 2012 at 1:01 PM, Stuti Awasthi wrote:
> Hi all,
>
> I have to imp
No. The data will be at a 5-minute interval, a 1-hour interval, a 1-day
interval, and so on.
So suppose utilization is for 40 days: then I will charge 30 days through the
monthly billing job and the remaining 10 days through the daily billing job.
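
The 40-day example above (30 days billed monthly, 10 days billed daily) can be sketched as a greedy split from the largest interval down. This is only an illustration, not the poster's implementation: the class name BillingSplit, the method split, the flat 30-day month, and the "leftoverMinutes" bucket are all assumptions made for the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BillingSplit {
    // Assumption: billing intervals in minutes, largest first.
    // A "month" is treated as a flat 30 days, as in the 40-day example.
    private static final String[] NAMES = {"month", "day", "hour", "fiveMin"};
    private static final long[] MINUTES = {30L * 24 * 60, 24L * 60, 60L, 5L};

    /** Greedily splits a utilization period into counts per billing interval. */
    static Map<String, Long> split(long totalMinutes) {
        if (totalMinutes < 0) {
            throw new IllegalArgumentException("totalMinutes must be >= 0");
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        long remaining = totalMinutes;
        for (int i = 0; i < NAMES.length; i++) {
            // Take as many whole intervals of this size as fit,
            // then pass the remainder down to the next smaller interval.
            counts.put(NAMES[i], remaining / MINUTES[i]);
            remaining %= MINUTES[i];
        }
        // Anything shorter than 5 minutes is reported, not billed, in this sketch.
        counts.put("leftoverMinutes", remaining);
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(split(40L * 24 * 60)); // 40 days
        System.out.println(split(130));           // 2 hours 10 minutes
    }
}
```

With these assumptions, 40 days comes out as 1 month plus 10 days, and 130 minutes comes out as 2 hours plus two 5-minute units. Each count could then be routed to the job for that interval.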
-Original Message-
From: Rohit Kelkar [mailto:rohitk
Well, first, you can design 6 MR jobs:
1- for the 5-minute interval
2- for 1 hour
3- for 1 day
4- for 1 month
5- for 1 year
6- and a last one for any other interval
If each interval needs a different calculation, this could be a
solution (at least I think so).
You can read the "
Pig tries to do this with some of their optimizations. You ultimately have to
combine them together into a single map/reduce job, with two separate execution
paths. It is complicated, especially in the shuffle phase. It would probably
look something like
MapCollectorWrapper implements collec
Hi.
We are testing Hadoop. We are using Hadoop (0.20.2-cdh3u3). I am using
a customized conf directory with "--config mypath". I modified the
log4j.properties file in this path, adding "
log4j.logger.com.mycompany=DEBUG". It works fine with our
pseudo-one-node-cluster setup (1.00). But
Hi,
I have been looking for a way to do unit testing of map reduce programs too.
There is not much help or documentation available for MRUnit. Is there
any other good method for testing map reduce programs?
Thank you
On Mon, Feb 27, 2012 at 1:01 AM, Justin Woody wrote:
> Shuja,
>
> MRUn
Have you checked out this example:
https://cwiki.apache.org/confluence/display/MRUNIT/Testing+Word+Count
On Mon, Feb 27, 2012 at 2:54 PM, Akhtar Muhammad Din
wrote:
> Hi,
> I have been looking for a way to do unit testing of map reduce programs too.
> There is not much of help or documentation a
Hi Marcos,
Thanks for the pointers. I am also thinking along similar lines.
I am doubtful on one point:
I will have separate data files for every interval. For example, suppose
I have a 5-minute interval file which contains data for 2 hours and 10 minutes. In this
scenario I want to process 2 h
Yes, I have checked it before. There is only a single example out there and
not much help material available.
Is there any other way?
On Tue, Feb 28, 2012 at 5:39 AM, Joey Echeverria wrote:
> Have you checked out this example:
>
> https://cwiki.apache.org/confluence/display/MRUNIT/Testing+Word+Co