Aayush, out of curiosity, why do you want to model wordcount this way?
What benefit do you see?
Norbert
On 4/6/09, Aayush Garg aayush.g...@gmail.com wrote:
Hi,
I want to experiment with the wordcount example in a different way.
Suppose we have very large data. Instead of splitting all the
Have you looked into MogileFS already? Seems like a good fit, based
on your description. This question has come up more than once here,
and MogileFS is an oft-recommended solution.
Norbert
On 3/26/09, phil cryer p...@cryer.us wrote:
When you say that you have huge images, how big is huge?
What platform are you running Eclipse on? If Windows, see this thread
regarding Cygwin:
http://www.mail-archive.com/core-user@hadoop.apache.org/msg07669.html
In my case, I've never had to touch any of the plugin's advanced
parameters. Usually, setting just the Map/Reduce Master and DFS Master is enough.
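For reference, those two plugin fields mirror the cluster's own configuration; a minimal hadoop-site.xml sketch (hostnames and ports below are assumed examples, not values from this thread):

<!-- "DFS Master" in the plugin corresponds to fs.default.name -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>
<!-- "Map/Reduce Master" corresponds to mapred.job.tracker -->
<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
</property>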
a...@node0:~$
The boxes are just connected with a cat5 cable. I have not done this with
the hadoop account, but af is my normal account and I figure it should work
too. /etc/init.d/interfaces is empty/does not exist on the machines. (I am
using Ubuntu 8.10.)
Please advise.
Norbert Burger
I have commented out the 192.* addresses and changed /etc/hosts to map
127.0.1.1 to node0 and 127.0.1.2 to node1. With this done I can ssh from one
machine to itself and to the other, but the prompt does not change when I
ssh to the other machine. I don't know if there is a firewall preventing it;
I will look into the link that you gave.
-zander
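For what it's worth, a minimal /etc/hosts sketch for a direct cat5 link (the 192.168.0.x addresses are assumed; the point is to give each box a non-loopback address instead of 127.0.1.x entries):

# Run on both machines; addresses are assumed examples:
sudo tee -a /etc/hosts <<'EOF'
192.168.0.1 node0
192.168.0.2 node1
EOF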
On Fri, Feb 13, 2009 at 8:37 AM, Steve Loughran ste...@apache.org wrote:
Michael Lynch wrote:
Hi,
As far as I can tell I've followed the setup instructions for a Hadoop
cluster to the letter, but I find that the datanodes can't connect to the
namenode on port 9000 because it is only listening on the loopback interface.
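One quick way to check this (a generic sketch, not a command from the original thread):

# If the namenode bound only to loopback, remote datanodes can't reach it:
netstat -tlnp | grep 9000
# 127.0.0.1:9000 means loopback only; you want 0.0.0.0:9000 or the LAN IP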
Are you running Eclipse on Windows? If so, be aware that you need to spawn
Eclipse from within Cygwin in order to access HDFS. It seems that the
plugin uses whoami to get info about the active user. This thread has
some more info:
I'm no Ruby programmer, but don't you need a call to system() instead of the
backtick operator here? It appears that the backtick operator returns STDOUT
instead of the return value:
http://hans.fugal.net/blog/2007/11/03/backticks-2-0
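The same distinction exists in the shell, which may make it clearer (a rough analogy in bash, not Ruby code; the file path is just an example):

out=$(md5sum /etc/hosts)  # command substitution captures STDOUT, like Ruby backticks
echo $?                   # the exit status is reported separately
if md5sum /etc/hosts > /dev/null; then  # branching on the status, like system()
  echo "command succeeded"
fi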
Norbert
On Tue, Feb 3, 2009 at 6:03 PM, S D
need to be a datanode. If my production node is *not* a datanode, then how
can I do hadoop dfs put?
I was under the impression that when I install HDFS on a cluster, each node
in the cluster is a datanode.
Shahab
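For what it's worth, a machine doesn't have to run a datanode to write into HDFS; it only needs the client configuration pointing at the namenode. A sketch (paths assumed):

bin/hadoop dfs -put mylocalfile /user/me/mylocalfile
bin/hadoop dfs -ls /user/me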
On Fri, Oct 31, 2008 at 1:46 PM, Norbert Burger [EMAIL PROTECTED] wrote:
Seems that the slides for each of the 3 Rapleaf talks are posted in the
descriptions:
The Collector - A Tool to Have Multi-Writer Appends into HDFS
http://docs.google.com/Present?docid=dgz78tv5_10gpjhnvg9
Katta - Distributed Lucene Index in Production
Along these lines, I'm curious what management tools folks are using to
ensure cluster availability (i.e., auto-restarting failed datanodes/namenodes).
Are you using a custom cron script, or maybe something more complex
(Ganglia, Nagios, Puppet, etc.)?
Thanks,
Norbert
On 10/28/08, Steve Loughran
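As one data point, the simplest cron-based approach might look like this (the /opt/hadoop path and the five-minute interval are assumptions, not a recommendation from this thread):

# Crontab entry for the hadoop user: restart the datanode if its JVM is gone
*/5 * * * * pgrep -f DataNode > /dev/null || /opt/hadoop/bin/hadoop-daemon.sh start datanode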
I ran into this problem also. From your logs, it seems like you
haven't set mapred.system.dir to a fixed value:
http://wiki.apache.org/hadoop/FAQ#14.
The impact is that your job control files are written from your submit
machine into HDFS at /tmp/hadoop-user2/mapred/system, while your
jobtracker looks for them under a different per-user path.
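The fix suggested by the FAQ is to pin the property in hadoop-site.xml; a sketch (the path itself is an assumed example):

<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
</property>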
Yes, this is the suggested configuration. Hadoop relies on password-less
SSH to be able to start tasks on slave machines. You can find instructions
on creating/transferring the SSH keys here:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
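The key setup from that guide boils down to something like this (run as the hadoop user; the slave hostname is assumed):

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa   # create a passphrase-less keypair
ssh-copy-id hadoop@slave1                  # repeat for every slave
ssh hadoop@slave1 hostname                 # should now succeed without a password prompt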
On Wed, Apr
-42.compute-1.amazonaws.com. Please set up DNS so localhost points to .
Then it asks for an enter key.
The Java exception was not appearing earlier.
You mean I should set prerna.dyndns.org to 75.101.217.228?
Thanks
Prerna
On Wed, Apr 16, 2008 at 8:27 AM, Norbert Burger [EMAIL PROTECTED] wrote:
There is no need to maintain a server and client Cygwin session on a
local machine. In the typical Hadoop-on-EC2 setup, all of your nodes are
EC2 hosts, spawned dynamically
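With the contrib EC2 scripts, the whole cluster is brought up remotely; a sketch (cluster name and size are assumed):

bin/hadoop-ec2 launch-cluster my-cluster 2   # master plus 2 slaves, all on EC2
bin/hadoop-ec2 login my-cluster              # shell on the master when needed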
Shouldn't some key be generated on dyndns,
since I am not able to ssh to that host?
On Thu, Apr 17, 2008 at 12:17 PM, Norbert Burger
[EMAIL PROTECTED] wrote:
You need to create a DynDNS account and then add host records to this
account.
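Once the host record exists, you can sanity-check it from any machine (using the hostname mentioned earlier in the thread):

dig +short prerna.dyndns.org   # should print the EC2 instance's public IP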
On Thu, Apr 17, 2008 at 12:03 PM, Prerna
and hence I can't run bin/hadoop.
From here I do not know how to proceed.
I basically want to implement
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873.
Hence I created a host using dyndns.
If you can help me, it will be great.
On Tue, Apr 15, 2008 at 2:15 PM, Norbert Burger [EMAIL PROTECTED] wrote:
Are you trying to run Hadoop on a local cluster, or in the EC2
environment?
If EC2, then your MASTER_HOST setting is wrong, because it points to a
residential ISP (*.rr.com). It should instead point to your jobtracker
instance's public EC2 hostname.
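For the EC2 case, MASTER_HOST lives in the EC2 contrib environment script; a sketch (both the file location and the hostname value are assumptions):

# src/contrib/ec2/bin/hadoop-ec2-env.sh
MASTER_HOST=ec2-75-101-217-228.compute-1.amazonaws.com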
Colin, how about writing a streaming mapper which simply runs md5sum on each
file it gets as input? Run this task along with the identity reducer, and
you should be able to identify pretty quickly whether there's an HDFS
corruption issue.
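A sketch of what that job might look like (the streaming jar path and the input/output paths are assumed):

bin/hadoop jar contrib/hadoop-0.15.3-streaming.jar \
  -input /user/colin/files \
  -output /user/colin/md5check \
  -mapper /usr/bin/md5sum \
  -reducer org.apache.hadoop.mapred.lib.IdentityReducer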
Norbert
On Tue, Apr 8, 2008 at 5:50 PM, Colin Freas [EMAIL
, Amareshwari Sriramadasu
[EMAIL PROTECTED] wrote:
Norbert Burger wrote:
I'm trying to use the cacheArchive command-line option with
hadoop-0.15.3-streaming.jar. I'm using the option as follows:
-cacheArchive hdfs://host:50001/user/root/lib.jar#lib
Unfortunately, my PERL scripts