Jeff Zhang wrote:
Mridul,

What do you mean by "Counters are not synchronized in 'real-time'"?
As far as I know, the JT aggregates Counters from the TTs, so I think the
aggregated Counter in the JT should be correct.

Aggregate counters are guaranteed to be correct only at the end of a logical state, not necessarily in between. Consider cases of mapper/reducer task re-execution, caching at the task nodes (counters piggyback on the heartbeat, so they are sent only every XX seconds), etc.

So trying to limit output based on a counter would typically give suboptimal results.
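
To illustrate the lag, here is a toy model in plain Java. It is not Hadoop code: the class and method names (CounterLagDemo, Task, heartbeat, rowsWritten) are made up for this sketch, and it only assumes that each task counts locally and reports its count at heartbeat time. It does not model task re-execution.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Hadoop code): tasks count locally and report only on a heartbeat. */
public class CounterLagDemo {

    /** A task that keeps a local count and publishes it only when heartbeat() is called. */
    static final class Task {
        private long localCount;     // rows written by this task so far
        private long reportedCount;  // value last sent to the aggregator

        void writeRow() { localCount++; }

        /** Publishes the local count; returns the delta since the previous report. */
        long heartbeat() {
            long delta = localCount - reportedCount;
            reportedCount = localCount;
            return delta;
        }
    }

    /**
     * Runs `tasks` tasks that each write `rowsPerHeartbeat` rows between heartbeats
     * and stop once the aggregated (stale) counter reaches `limit`.
     * Returns the number of rows actually written.
     */
    static long rowsWritten(int tasks, int rowsPerHeartbeat, long limit) {
        List<Task> all = new ArrayList<>();
        for (int i = 0; i < tasks; i++) all.add(new Task());

        long aggregated = 0; // what the central aggregator believes
        long actual = 0;     // what has really been written
        while (aggregated < limit) {
            // Between heartbeats the aggregator sees none of these writes.
            for (Task t : all) {
                for (int r = 0; r < rowsPerHeartbeat; r++) { t.writeRow(); actual++; }
            }
            // Heartbeat: only now does the aggregated value catch up.
            for (Task t : all) aggregated += t.heartbeat();
        }
        return actual;
    }

    public static void main(String[] args) {
        System.out.println("limit=100, rows actually written=" + rowsWritten(4, 30, 100));
    }
}
```

In this model the tasks can only react to the aggregated value after a heartbeat, so with several tasks or many rows per heartbeat the number of rows written goes past the limit.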

Regards,
Mridul



On Tue, Jan 26, 2010 at 3:08 PM, Mridul Muralidharan
<[email protected]> wrote:

Jeff Zhang wrote:

*See my comments below*


On Mon, Jan 25, 2010 at 3:22 PM, Something Something <
[email protected]> wrote:

If I set the number of reduce tasks to 1 using setNumReduceTasks(1), would
the Reducer class always be instantiated on only one machine?  I mean, if I
have a cluster of, say, 1 master, 10 workers & 3 zookeepers, is the Reducer
class guaranteed to be instantiated on only 1 machine?

*--Yes*


If the answer is yes, then I will use a static variable as a counter to see
how many rows have been added to my HBase table so far.  In my use case, I
want to write only N rows to a table.  Is there a better way to do this?
Please let me know.  Thanks.


*--Maybe you can use a Counter to track the number of rows you add to HBase;
then you do not need to limit the number of reduce tasks to 1*



Counters are not synchronized in 'real-time', so you can't use them to
limit at addition time, imo.
They are meant more for aggregation, not realtime messaging.
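
For comparison, when there is only one reduce task, the limit could be checked against a plain field held by that task, which is always up to date within that task. A minimal sketch in plain Java follows; RowLimit and tryAcquire are hypothetical names, not part of Hadoop or HBase, and in a real job the check would presumably sit in the reducer just before each write. A re-executed task attempt would start again from zero, so this alone does not guard against rows written by an earlier attempt.

```java
/** Hypothetical helper (not a Hadoop class): caps the rows one task writes. */
public class RowLimit {
    private final long maxRows; // N, the most rows this task may write
    private long written;       // rows this task has written so far

    public RowLimit(long maxRows) { this.maxRows = maxRows; }

    /** Returns true and records the write if one more row is allowed. */
    public boolean tryAcquire() {
        if (written >= maxRows) return false;
        written++;
        return true;
    }

    public long written() { return written; }

    public static void main(String[] args) {
        RowLimit limit = new RowLimit(3);
        for (int i = 0; i < 5; i++) {
            // Only the first three candidates pass the check.
            if (limit.tryAcquire()) System.out.println("write row " + i);
        }
    }
}
```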

- Mridul




