There has been a change in the way progress reporting is done since 0.14. The
application has to explicitly send the status (incrCounter doesn't send any
status). Even if the application hasn't made any progress, it is okay to call
setStatus with the earlier status.
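The setStatus call above refers to the Java Reporter API. For streaming jobs, the analogue is writing reporter lines to stderr; the sketch below is a hypothetical helper assuming your streaming build supports the `reporter:status:` / `reporter:counter:` stderr protocol.

```python
import sys

def report_status(message, err=sys.stderr):
    """Emit a Hadoop Streaming status update on stderr.

    Streaming tasks report status with lines of the form
    'reporter:status:<message>'; counters use
    'reporter:counter:<group>,<name>,<amount>'.
    """
    err.write("reporter:status:%s\n" % message)
    err.flush()

def report_counter(group, name, amount, err=sys.stderr):
    """Emit a Hadoop Streaming counter update on stderr."""
    err.write("reporter:counter:%s,%s,%d\n" % (group, name, amount))
    err.flush()
```

Calling report_status periodically (even with an unchanged message) keeps a long-running task from being killed for inactivity, matching the advice above.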
Hi,
I have a cluster made of only 2 PCs. The master also acts as a slave.
The cluster seems to start properly. It is functional (I can access the
dfs, monitor it with the web interfaces, no errors in the log files...)
but it reports that only 1 node is up. For some reason the datanode on
the
Hello,
For testing purposes I am running Hadoop in local mode.
Is there a possibility to split the output (TextOutputFormat) of a
MapReduce job into several output files (e.g. part-, part-0001,
etc.) according to some maximal file size per file?
I.e., is there a setting for such a file size
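As far as I know there is no built-in per-file size cap for TextOutputFormat in this era; one workaround is to split a part file after the job finishes. A minimal sketch (hypothetical helper, splitting by byte size):

```python
def split_by_size(lines, max_bytes):
    """Group output lines into chunks of at most max_bytes each.

    A single line larger than max_bytes still forms its own chunk,
    so no line is ever broken in the middle.
    """
    chunks, current, size = [], [], 0
    for line in lines:
        n = len(line.encode("utf-8"))
        if current and size + n > max_bytes:
            chunks.append(current)   # flush the full chunk
            current, size = [], 0
        current.append(line)
        size += n
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be written out as its own part file (e.g. part-0000, part-0001, ...) by a small driver script.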
On Sat, Nov 10, 2007 at 07:56:22PM +, Holger Stenzhorn wrote:
Hello,
For testing purposes I am running Hadoop in local mode.
Is there a possibility to split the output (TextOutputFormat) of a
MapReduce job into several output files (e.g. part-, part-0001,
etc.) according to some maximal
I am using map/reduce with hadoop-0.15.0-streaming.jar to process the data
with PHP scripts. I have written scripts to process the data; below is an
example of word counts from the input.
bin/hadoop jar contrib/hadoop-0.15.0-streaming.jar -mapper
/var/www/search/hadoop/wc-mapper.php -reducer
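The original scripts here are PHP; the sketch below shows the same stdin/stdout contract that any streaming mapper and reducer must follow (mapper emits tab-separated key/value lines, reducer receives them sorted by key). These function names are illustrative, not from the thread.

```python
from itertools import groupby

def map_words(lines):
    """Streaming mapper: emit one 'word<TAB>1' pair per word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word

def reduce_counts(pairs):
    """Streaming reducer: input arrives sorted by key, so consecutive
    lines with the same word can be summed with groupby."""
    keyed = (p.split("\t") for p in pairs)
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))
```

A real streaming job would wrap these around sys.stdin/sys.stdout; the framework handles the sort between the map and reduce phases.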
Billy wrote:
..
What I am looking to do is get and store the input and output from/in hbase.
I haven't tried it but it looks like you can specify input and output
classes for streaming with -inputformat and -outputformat options.
Try setting these to TableInputFormat [1] and
Devaraj Das wrote:
There has been a change in the way progress reporting is done since 0.14. The
application has to explicitly send the status (incrCounter doesn't send any
status). Even if the application hasn't made any progress, it is okay to call
setStatus with the earlier
This bug is driving me crazy! What tools could I use to find out why
slaves are not reported being part of the cluster? I can't find anything
wrong in the log files.
Using Wireshark, I confirmed that the heartbeat between the slaves
and the master is working. The ssh communication between
Did anyone consider the impact of making such a change on existing
applications? Curious how it didn't fail any regression tests (the
pattern that is reported to be broken is so common).
(I suffer from upgradephobia and this doesn't help)
-Original Message-
From: Doug Cutting
Hi
I built Nutch from the SVN source: svn co
http://svn.apache.org/repos/asf/lucene/nutch/trunk nutch
And I got nutch-0.9.war from
http://apache.mirror.phpchina.com/lucene/nutch/nutch-0.9.tar.gz
After I configured Nutch following
http://wiki.apache.org/nutch/NutchHadoopTutorial
when
I favor considering this a bug. It is easy enough to rework my code,
but it seems like odd behaviour.
On Nov 10, 2007 6:41 PM, Doug Cutting [EMAIL PROTECTED] wrote:
Devaraj Das wrote:
There has been a change in the way progress reporting is done since 0.14. The
application
Basically what I am trying to do is access HBase from PHP, since I do not
know Java and have not found it fun to learn.
I was looking around and found this:
https://issues.apache.org/jira/browse/HADOOP-2171
but I am unsure if it is what I think it is; it looks like a way to access
HBase from a socket.
Actually, in the previous approach, progress reporting used to happen from a
separate thread in tasks. The issues
https://issues.apache.org/jira/browse/HADOOP-1431,
https://issues.apache.org/jira/browse/HADOOP-1462 changed this behavior.
But, yes, I agree that incrCounter should be indicative of