I am using Hadoop streaming and I need to pass arguments to my map/reduce
script. Because a map/reduce script is triggered by hadoop, like
hadoop -file MAPPER -mapper $MAPPER -file REDUCER -reducer $REDUCER
...
How can I pass arguments to MAPPER?
I tried -cmdenv name=val, but it does not
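For what it's worth, -cmdenv values arrive in the mapper's process
environment, not as script arguments. A minimal sketch of a streaming mapper
reading one such value (the name my_arg and the length-filter logic are made
up for illustration; dots in names are awkward in most shells, so an
underscore name is safer than my.mapper.arg1):

```python
import io
import os

# Hypothetical streaming mapper body: `threshold` would be passed on the
# hadoop streaming command line via `-cmdenv my_arg=3` and read from the
# environment. The filter itself is just a stand-in for real map logic.
def run_mapper(stdin, stdout, threshold=None):
    if threshold is None:
        # Fall back to the -cmdenv value when not given explicitly.
        threshold = int(os.environ.get("my_arg", "0"))
    # Pass through only lines longer than `threshold` characters.
    for line in stdin:
        if len(line.rstrip("\n")) > threshold:
            stdout.write(line)

# In the real mapper script the entry point would be:
#   run_mapper(sys.stdin, sys.stdout)
```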
Qin's question actually raises an issue: it seems that a
close() call which does not throw IOException and which does not
give the user access to the OutputCollector object makes this
important piece of functionality (from a client's perspective) hard to
use. Does anyone feel
Hi
I'm seeing repeated AlreadyBeingCreatedExceptions during the reduce phase of
my job, which eventually causes the job to fail. Can anyone suggest what
could be causing this exception? I have hadoop configured with just one
slave, running two reduces simultaneously.
regards
Barry
Hi,
Can some expert differentiate between or compare HDFS and KFS? They appear
to have similar architectures, little difference, and the same objective.
Thanks,
Wasim
Hi,
what can I guarantee about the values of the map input keys (using the
TextInputFormat)?
Sometimes it could be useful for certain needs. For example:
- if I want, in some cases, to apply map to only some percentage of the
data. If the input keys are indexes, then I can ignore (do nothing) when
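As far as I know, TextInputFormat keys are the byte offsets of each line's
start within the file (a LongWritable), not sequential line indexes, so
sampling "some percent" by key means sampling over offsets. A small
stand-alone simulation of that contract (my own sketch, not Hadoop code):

```python
# Simulate the (key, value) pairs TextInputFormat/LineRecordReader hand
# to the mapper: key = byte offset of the line's start, value = the line
# with its terminator stripped. Offsets are not line numbers.
def text_input_records(data: bytes):
    offset = 0
    for line in data.splitlines(keepends=True):
        yield offset, line.rstrip(b"\r\n")
        offset += len(line)
```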
I ended up using my own MapRunner, so that I can control the calls to the map
function, and then calling close() is not necessary. However, I think it is
reasonable to have close() throw IOException, but providing the OutputCollector
may make the framework a little messy; my suggestion is to stay with the
On Thu, Aug 21, 2008 at 9:44 PM, Wasim Bari [EMAIL PROTECTED] wrote:
Hi,
Can some expert differentiate between or compare HDFS and KFS? They appear
to have similar architectures, little difference, and the same objective.
What's KFS? Which KFS?
Here everyone knows HDFS, but someone like me
On Thu, Aug 21, 2008 at 3:14 PM, Gopal Gandhi
[EMAIL PROTECTED] wrote:
I am using Hadoop streaming and I need to pass arguments to my map/reduce
script. Because a map/reduce script is triggered by hadoop, like
hadoop -file MAPPER -mapper $MAPPER -file REDUCER -reducer $REDUCER
...
How
Thank you, but I still don't get it.
I've read tons of websites and papers, but there's no clear and well-founded
answer to why one should use BigTable instead of a relational database.
MySQL Cluster seems to offer the same scalability and level of
abstraction, without switching to a non-relational paradigm.
I'm no expert, but maybe I can explain it the way I see it; maybe it
will resonate with other newbies like me :) Sorry if it's long-winded
or boring for those who already know all this.
BigTable and Hadoop are inherently sharded and distributed. They are
architected to store the data in
A few very big differences...
- HBase/BigTable don't have transactions in the same way that a relational
database does. While it is possible (and was just recently implemented for
HBase, see HBASE-669) it is not at the core of this design. A major bottleneck
of distributed multi-master
KFS is another distributed file system, implemented in C++. You can
get the details here:
http://kosmosfs.sourceforge.net/
--
From: rae l [EMAIL PROTECTED]
Sent: Thursday, August 21, 2008 4:52 PM
To: core-user@hadoop.apache.org
Subject: Re:
That's interesting. Suppose your mapper script is a Perl script; how do you
assign my.mapper.arg1's value to a variable $x?
$x = $my.mapper.arg1
I just tried that, and my Perl script does not recognize $my.mapper.arg1.
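The likely culprit is the variable name: -cmdenv puts the value into the
process environment, so inside the script it is an environment-variable
lookup, not shell interpolation. In Perl, $my.mapper.arg1 parses as $my
concatenated with the string ".mapper.arg1"; the Perl analogue of the lookup
would be $x = $ENV{'my_mapper_arg1'}, using underscores since dots are
unreliable in environment names. The same lookup, sketched in Python (all
names here are illustrative):

```python
import os

# Read one -cmdenv value from the environment, with a default when the
# job was launched without it. Prefer underscore names such as
# my_mapper_arg1 over dotted names like my.mapper.arg1.
def read_cmdenv(name, env=None, default=""):
    env = os.environ if env is None else env
    return env.get(name, default)
```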
--- On Thu, 8/21/08, Rong-en Fan [EMAIL PROTECTED] wrote:
From: Rong-en Fan
Hi folks;
I'm new to Hadoop, and I'm trying to set it up on a cluster for which
almost all the disk is mounted via the Lustre filesystem. That
filesystem is visible to all the nodes, so I don't actually need HDFS to
implement a shared filesystem. (I know the philosophical reasons why
people
Thanks a lot for all replies, this is really helpful.
As you describe it, its a problem of implementation. BigTable is
designed to scale, there are routines to shard the data, desitribute it
to the pool of connected servers. Could MySQL perhaps decide tomorrow to
implement something similar
On Fri, Aug 22, 2008 at 12:34 AM, Wasim Bari [EMAIL PROTECTED] wrote:
KFS is another distributed file system, implemented in C++. You can
get the details here:
http://kosmosfs.sourceforge.net/
Just from the basic information:
http://sourceforge.net/projects/kosmosfs
# Developers : 2
#
I haven't used KFS, but I believe a major difference is that you can
(apparently) mount KFS as a standard device under Linux, allowing you to
read and write directly to it without having to re-compile the
application (as far as I know that's not possible with HDFS, although
the last time I
On Thursday 21 August 2008 00:14:56 Gopal Gandhi wrote:
I am using Hadoop streaming and I need to pass arguments to my map/reduce
script. Because a map/reduce script is triggered by hadoop, like hadoop
-file MAPPER -mapper $MAPPER -file REDUCER -reducer $REDUCER ...
How can I pass
Isn't there FUSE for HDFS, as well as the WebDAV option?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Tim Wintle [EMAIL PROTECTED]
To: core-user@hadoop.apache.org
Sent: Thursday, August 21, 2008 1:42:51 PM
Subject: Re: HDFS Vs KFS
I'm trying to install Hadoop version 0.17.2 on a Linux box (a Xen guest).
bin/start-all.sh works fine, but
hadoop-hadoop-jobtracker-softtek-helio-dev.log shows the error below.
Do you know how to fix it?
Thanks in advance
2008-08-21 11:20:28,020 INFO org.apache.hadoop.mapred.JobTracker:
Unfortunately this does not work. Hadoop complains:
08/08/21 18:04:46 ERROR streaming.StreamJob: Unexpected arg1 while processing
I'm getting NotReplicatedYet exceptions when I try to put a file on
DFS for a newly created cluster. If I wait a while, the put works.
Is there a way to tell if the DFS is ready from the master node?
hadoop dfs -put isn't giving me a meaningful error status
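One hedged approach: poll `hadoop dfsadmin -safemode get` from the master
node (or simply run `hadoop dfsadmin -safemode wait`, which blocks) until the
namenode reports safe mode is OFF; a fresh cluster stays in safe mode until
enough blocks are reported. A sketch with an injectable probe so the loop can
be exercised without a cluster:

```python
import subprocess
import time

# Sketch, not an official API: wait until HDFS leaves safe mode. The
# default probe shells out to `hadoop dfsadmin -safemode get` and looks
# for "OFF" in its output; pass a custom `probe` callable to test the
# loop without a running cluster.
def wait_until_ready(probe=None, attempts=24, delay=5.0):
    if probe is None:
        probe = lambda: "OFF" in subprocess.run(
            ["hadoop", "dfsadmin", "-safemode", "get"],
            capture_output=True, text=True).stdout
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```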
It wouldn't be too much of a stretch to use Lustre directly...
although it isn't trivial either.
You'd need to implement the 'FileSystem' interface for Lustre, define
a URI scheme (e.g. lfs://), etc. Please take a look at the KFS/
S3 implementations.
Arun
On Aug 21, 2008, at 9:59 AM,
In the cluster stats I see a number reported as the max reduce task
capacity for the cluster. How is this number computed? I didn't
see any info in the javadoc for the method.
thanks.
- Manish
Co-Founder Rapleaf.com
We're looking for a product manager, sys admin, and software
Thanks for the answer!
I guess safe mode ends after a while, but I was wondering whether safe-mode
problems are causing my problem.
Basically, the Hadoop server starts just fine, but when I run the example it
never finishes; here is some of the log:
bin/hadoop jar hadoop-0.17.2-examples.jar wordcount
Hi all!!
I just noticed that a secondary namenode log file stored this exception:
java.net.NoRouteToHostException: No route to host
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at
A more specific exception:
org.apache.hadoop.mapred.ReduceTask: java.net.NoRouteToHostException: No
route to host
On Thu, Aug 21, 2008 at 12:03 PM, Gerardo Velez [EMAIL PROTECTED]wrote:
Hi all!!
I just noticed that a secondary namenode log file stored this exception:
Hi,
Anyone experience with installing Hadoop or HDFS on Suse Linux?
Thanks
yes and it works out-of-the-box
Miles
2008/8/21 Wasim Bari [EMAIL PROTECTED]
Hi,
Anyone experience with installing Hadoop or HDFS on Suse Linux?
Thanks
--
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.
Hi mailing,
I want to get information about the current input split inside the MapRunner
object (or the map function); however, the only object I can get from the
MapRunner is the RecordReader, and I saw no method defined on RecordReader to
fetch the InputSplit object. Do you have any suggestions on this?
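Not a direct API answer, but a possible workaround: for file-based splits,
Hadoop of this era publishes the current split in the job configuration as
map.input.file, map.input.start, and map.input.length (readable via
conf.get(...) in a Java task); in streaming jobs those surface, as far as I
know, as environment variables with dots replaced by underscores. A sketch
reading them, tolerating their absence:

```python
import os

# Hedged sketch: read the per-task split properties that Hadoop exposes
# to streaming children as environment variables. Missing values map to
# None (file) or -1 (start/length).
def current_split(env=None):
    env = os.environ if env is None else env
    return {
        "file": env.get("map_input_file"),
        "start": int(env.get("map_input_start", -1)),
        "length": int(env.get("map_input_length", -1)),
    }
```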
Since HBase became a subproject of Hadoop, it started its own release numbering
scheme. However, because a particular release of HBase requires a specific
release of Hadoop, HBase will change so that its major releases correspond to
the Hadoop major release that it depends on.
For example, in
On Fri, Aug 22, 2008 at 12:51 AM, Steve Gao [EMAIL PROTECTED] wrote:
That's interesting. Suppose your mapper script is a Perl script, how do you
assign my.mapper.arg1's value to a variable $x?
$x = $my.mapper.arg1
I just tried the way and my perl script does not recognize $my.mapper.arg1.
Hi all,
Firstly, I know that there is an FsDirectory class in Nutch-0.9, so
we can access the index on HDFS. But after I tested it, I found that we can
only read the index and cannot append to or modify it. I think the reason is
the one mentioned in the HDFS file-append issues; am I right?