[Streaming] How to pass arguments to a map/reduce script

2008-08-21 Thread Gopal Gandhi
I am using Hadoop streaming and I need to pass arguments to my map/reduce script. Because the map/reduce script is launched by Hadoop, like hadoop -file MAPPER -mapper $MAPPER -file REDUCER -reducer $REDUCER ..., how can I pass arguments to MAPPER? I tried -cmdenv name=val, but it does not

Re: Know how many records remain?

2008-08-21 Thread Chris Dyer
Qin's question actually raises an issue: it seems that having a close() call that does not throw IOException and does not give the user access to the OutputCollector object makes this important piece of functionality (from a client's perspective) hard to use. Does anyone feel
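
A sketch of the workaround implied here, against the old org.apache.hadoop.mapred API: cache the OutputCollector that map() receives and reuse it in close(), which the framework calls once after the last record of the split. The class name, key names and record-count payload are illustrative, not from the thread.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // Cache the OutputCollector handed to map() so that close() can still
    // emit records after the last input record has been seen.
    public class ClosingMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      private OutputCollector<Text, Text> cachedOutput;
      private long records = 0;

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        cachedOutput = output;   // the same collector instance on every call
        records++;
        output.collect(new Text("line"), value);
      }

      @Override
      public void close() throws IOException {
        // Called once after the last map() call for this task's split.
        if (cachedOutput != null) {
          cachedOutput.collect(new Text("record-count"),
                               new Text(Long.toString(records)));
        }
      }
    }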

AlreadyBeingCreatedException during reduce

2008-08-21 Thread Barry Haddow
Hi, I'm seeing repeated AlreadyBeingCreatedExceptions during the reduce phase of my job, which eventually cause the job to fail. Can anyone suggest what could be causing this exception? I have Hadoop configured with just one slave, running two reduces simultaneously. Regards, Barry

HDFS Vs KFS

2008-08-21 Thread Wasim Bari
Hi, can some expert compare or contrast HDFS with KFS? They appear to have very similar architectures, with only small differences, and the same objective. Thanks, Wasim

map input key values?

2008-08-21 Thread Deyaa Adranale
Hi, what can I guarantee about the values of the map input keys (using TextInputFormat)? Sometimes this could be useful, for example: if I want, in some cases, to apply map to only a certain percentage of the data. If the input keys are indexes, then I can ignore (do nothing) when
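
For reference: with TextInputFormat the map input key is a LongWritable holding the byte offset of the line within the file (the value is the line itself), so keys are not sequential record indexes. A sketch of percentage sampling that therefore ignores the key; the 10% fraction and class name are illustrative.

    import java.io.IOException;
    import java.util.Random;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // TextInputFormat gives (byte offset, line) pairs, so sampling a
    // percentage of records is easier done randomly than via the key.
    public class SamplingMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, LongWritable, Text> {

      private static final double SAMPLE_FRACTION = 0.10; // keep roughly 10%
      private final Random random = new Random();

      public void map(LongWritable offset, Text line,
                      OutputCollector<LongWritable, Text> output,
                      Reporter reporter) throws IOException {
        if (random.nextDouble() < SAMPLE_FRACTION) {
          output.collect(offset, line); // offset = position of the line in bytes
        }
      }
    }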

Re: Know how many records remain?

2008-08-21 Thread Qin Gao
I ended up using my own MapRunner so that I can control the calls to the map function; then calling close() is not necessary. However, I think it is reasonable to have close() throw IOException, but providing the OutputCollector may make the framework a little messy. My suggestion is to stay with the
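
A minimal sketch of the MapRunner approach Qin describes, against the old org.apache.hadoop.mapred API; the key names and the end-of-split record are illustrative. The runner would be plugged in with JobConf.setMapRunnerClass().

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapRunnable;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.RecordReader;
    import org.apache.hadoop.mapred.Reporter;

    // Because this class owns the read loop, it knows when the last record of
    // the split has been consumed and can do any end-of-input work itself
    // instead of relying on close().
    public class EndAwareMapRunner
        implements MapRunnable<LongWritable, Text, Text, Text> {

      public void configure(JobConf job) {
        // read any per-job settings here
      }

      public void run(RecordReader<LongWritable, Text> input,
                      OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        LongWritable key = input.createKey();
        Text value = input.createValue();
        long records = 0;
        while (input.next(key, value)) {
          records++;
          output.collect(new Text("line"), value); // stand-in for real map logic
        }
        // No more records in this split: emit whatever must come last.
        output.collect(new Text("records-in-split"),
                       new Text(Long.toString(records)));
      }
    }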

Re: HDFS Vs KFS

2008-08-21 Thread rae l
On Thu, Aug 21, 2008 at 9:44 PM, Wasim Bari [EMAIL PROTECTED] wrote: Hi, can some expert compare or contrast HDFS with KFS? They appear to have very similar architectures, with only small differences, and the same objective. What's KFS? Which KFS? Everyone here knows HDFS, but someone like me

Re: [Streaming] How to pass arguments to a map/reduce script

2008-08-21 Thread Rong-en Fan
On Thu, Aug 21, 2008 at 3:14 PM, Gopal Gandhi [EMAIL PROTECTED] wrote: I am using Hadoop streaming and I need to pass arguments to my map/reduce script. Because a map/reduce script is triggered by hadoop, like hadoop -file MAPPER -mapper $MAPPER -file REDUCER -reducer $REDUCER ... How

Re: Why is scaling HBase much simpler than scaling a relational db?

2008-08-21 Thread Mork0075
Thank you, but I still don't get it. I've read tons of websites and papers, but there's no clear and well-founded answer as to why one should use BigTable instead of a relational database. MySQL Cluster seems to offer the same scalability and level of abstraction, without switching to a non-relational paradigm.

Re: Why is scaling HBase much simpler than scaling a relational db?

2008-08-21 Thread Fernando Padilla
I'm no expert, but maybe I can explain it the way I see it; maybe it will resonate with other newbies like me :) Sorry if it's long-winded, or boring for those who already know all this. BigTable and Hadoop are inherently sharded and distributed. They are architected to store the data in

RE: Why is scaling HBase much simpler than scaling a relational db?

2008-08-21 Thread Jonathan Gray
A few very big differences... - HBase/BigTable don't have transactions in the same way that a relational database does. While it is possible (and was just recently implemented for HBase, see HBASE-669), it is not at the core of the design. A major bottleneck of distributed multi-master

Re: HDFS Vs KFS

2008-08-21 Thread Wasim Bari
KFS is another distributed file system, implemented in C++. You can find details here: http://kosmosfs.sourceforge.net/

Re: [Streaming] How to pass arguments to a map/reduce script

2008-08-21 Thread Steve Gao
That's interesting. Suppose your mapper script is a Perl script; how do you assign my.mapper.arg1's value to a variable $x? $x = $my.mapper.arg1? I just tried that and my Perl script does not recognize $my.mapper.arg1. --- On Thu, 8/21/08, Rong-en Fan [EMAIL PROTECTED] wrote:

Hadoop over Lustre?

2008-08-21 Thread Joel Welling
Hi folks; I'm new to Hadoop, and I'm trying to set it up on a cluster for which almost all the disk is mounted via the Lustre filesystem. That filesystem is visible to all the nodes, so I don't actually need HDFS to implement a shared filesystem. (I know the philosophical reasons why people

Re: Why is scaling HBase much simpler than scaling a relational db?

2008-08-21 Thread Mork0075
Thanks a lot for all the replies, this is really helpful. As you describe it, it's a problem of implementation. BigTable is designed to scale: there are routines to shard the data and distribute it to the pool of connected servers. Could MySQL perhaps decide tomorrow to implement something similar

Re: HDFS Vs KFS

2008-08-21 Thread rae l
On Fri, Aug 22, 2008 at 12:34 AM, Wasim Bari [EMAIL PROTECTED] wrote: KFS is also another Distributed file system implemented in C++. Here you can get details: http://kosmosfs.sourceforge.net/ Just from the basic information: http://sourceforge.net/projects/kosmosfs # Developers : 2 #

Re: HDFS Vs KFS

2008-08-21 Thread Tim Wintle
I haven't used KFS, but I believe a major difference is that you can (apparently) mount KFS as a standard device under Linux, allowing you to read and write directly to it without having to re-compile the application (as far as I know that's not possible with HDFS, although the last time I

Re: [Streaming] How to pass arguments to a map/reduce script

2008-08-21 Thread Yuri Pradkin
On Thursday 21 August 2008 00:14:56 Gopal Gandhi wrote: I am using Hadoop streaming and I need to pass arguments to my map/reduce script. Because a map/reduce script is triggered by hadoop, like hadoop -file MAPPER -mapper $MAPPER -file REDUCER -reducer $REDUCER ... How can I pass

Re: HDFS Vs KFS

2008-08-21 Thread Otis Gospodnetic
Isn't there FUSE for HDFS, as well as the WebDAV option? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

Hadoop 0.17.2 configuration problems!

2008-08-21 Thread Gerardo Velez
I'm trying to install Hadoop version 0.17.2 on a Linux box (running under Xen). bin/start-all.sh works fine, but hadoop-hadoop-jobtracker-softtek-helio-dev.log shows the error below. Do you know how to fix it? Thanks in advance. 2008-08-21 11:20:28,020 INFO org.apache.hadoop.mapred.JobTracker:

Re: [Streaming] How to pass arguments to a map/reduce script

2008-08-21 Thread Steve Gao
Unfortunately this does not work. Hadoop complains: 08/08/21 18:04:46 ERROR streaming.StreamJob: Unexpected arg1 while processing

how to tell if the DFS is ready?

2008-08-21 Thread Karl Anderson
I'm getting NotReplicatedYet exceptions when I try to put a file on DFS for a newly created cluster. If I wait a while, the put works. Is there a way to tell if the DFS is ready from the master node? hadoop dfs -put isn't giving me a meaningful error status
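
One common reason right after cluster start-up is that the namenode is still in safe mode (or the datanodes have not all reported in yet); hadoop dfsadmin -safemode get reports that state and -safemode wait blocks until it clears. Below is a hedged Java sketch that polls the same flag; the package names are the 0.17/0.18-era ones (org.apache.hadoop.dfs) and moved in later releases, so treat this as an assumption.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.dfs.DistributedFileSystem;
    import org.apache.hadoop.dfs.FSConstants;
    import org.apache.hadoop.fs.FileSystem;

    // Poll the namenode's safe-mode flag before starting to load data.
    public class WaitForDfs {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        if (!(fs instanceof DistributedFileSystem)) {
          System.out.println("Not talking to HDFS: " + fs.getUri());
          return;
        }
        DistributedFileSystem dfs = (DistributedFileSystem) fs;
        while (dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_GET)) {
          System.out.println("Namenode still in safe mode, waiting...");
          Thread.sleep(5000);
        }
        System.out.println("Safe mode is off; DFS should accept writes now.");
      }
    }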

Re: Hadoop over Lustre?

2008-08-21 Thread Arun C Murthy
It wouldn't be too much of a stretch to use Lustre directly... although it isn't trivial either. You'd need to implement the 'FileSystem' interface for Lustre, define a URI scheme (e.g. lfs://), etc. Please take a look at the KFS / S3 implementations. Arun On Aug 21, 2008, at 9:59 AM,
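
A possible shortcut, offered as an untested assumption rather than anything from this thread: since Lustre is already POSIX-mounted on every node, one could start from Hadoop's local-filesystem code and only register a new URI scheme, instead of implementing the whole FileSystem contract from scratch the way the KFS and S3 filesystems do.

    import java.net.URI;

    import org.apache.hadoop.fs.RawLocalFileSystem;

    // Reuse the local-filesystem implementation for a shared POSIX mount and
    // give it its own scheme. A first-class port would still flesh out the
    // FileSystem contract the way the KFS/S3 code does.
    public class LustreFileSystem extends RawLocalFileSystem {
      @Override
      public URI getUri() {
        return URI.create("lfs:///");
      }
    }

If the fs.SCHEME.impl convention applies here as it does for KFS and S3, the class would be registered by pointing an fs.lfs.impl property at it and writing paths as lfs://... Alternatively, because the mount is visible everywhere, simply pointing fs.default.name at file:/// and using absolute paths on the Lustre mount may be enough for an experiment. Either way, this would need testing.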

getMaxReduceTasks()

2008-08-21 Thread Manish Shah
In the cluster stats I see a number reported as the max capacity of reduce tasks for the cluster. How is this number computed? I didn't see any info in the Javadoc for the method. Thanks. - Manish Co-Founder Rapleaf.com We're looking for a product manager, sys admin, and software
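
As far as I know, the reported max reduce capacity is the sum, over all live tasktrackers, of each tracker's mapred.tasktracker.reduce.tasks.maximum setting (2 per tracker by default), and the map capacity is computed the same way from the map slots. A sketch that reads the same numbers programmatically, using the method names of that era's JobClient/ClusterStatus API.

    import org.apache.hadoop.mapred.ClusterStatus;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    // Print the cluster capacity figures shown in the web UI's cluster stats.
    public class ShowClusterCapacity {
      public static void main(String[] args) throws Exception {
        JobClient client = new JobClient(new JobConf());
        ClusterStatus status = client.getClusterStatus();
        System.out.println("Task trackers      : " + status.getTaskTrackers());
        System.out.println("Max map capacity   : " + status.getMaxMapTasks());
        System.out.println("Max reduce capacity: " + status.getMaxReduceTasks());
      }
    }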

Re: Hadoop 0.17.2 configuration problems!

2008-08-21 Thread Gerardo Velez
Thanks for the answer! I guess safe mode turns off after a while, but I was wondering whether safe-mode problems are causing my issue. Basically, the Hadoop server starts just fine, but when I run the example it never finishes. Here is some log output: bin/hadoop jar hadoop-0.17.2-examples.jar wordcount

Re: Hadoop 0.17.2 configuration problems!

2008-08-21 Thread Gerardo Velez
Hi all!! I just looked in a secondary namenode log file and it contains this exception: java.net.NoRouteToHostException: No route to host at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333) at

Re: Hadoop 0.17.2 configuration problems!

2008-08-21 Thread Gerardo Velez
A more specific exception: org.apache.hadoop.mapred.ReduceTask: java.net.NoRouteToHostException: No route to host On Thu, Aug 21, 2008 at 12:03 PM, Gerardo Velez [EMAIL PROTECTED] wrote: Hi all!! I just looked in a secondary namenode log file and it contains this exception

Hadoop on Suse

2008-08-21 Thread Wasim Bari
Hi, does anyone have experience with installing Hadoop or HDFS on Suse Linux? Thanks

Re: Hadoop on Suse

2008-08-21 Thread Miles Osborne
Yes, and it works out of the box. Miles 2008/8/21 Wasim Bari [EMAIL PROTECTED]: Hi, does anyone have experience with installing Hadoop or HDFS on Suse Linux? Thanks -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

Get information of input split from MapRunner?

2008-08-21 Thread Qin Gao
Hi mailing list, I want to get information about the current input split inside the MapRunner object (or the map function); however, the only object I can get from the MapRunner is the RecordReader, and I see no method defined in RecordReader to fetch the InputSplit object. Do you have any suggestions on this?
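
One workaround, assuming a file-based input format: the old API publishes the current split's file, start offset and length to the task's JobConf as map.input.file, map.input.start and map.input.length, so they can be read in configure() of a MapRunner (or Mapper) even though RecordReader does not expose the InputSplit. The runner below is a sketch (plugged in via JobConf.setMapRunnerClass()); the logging is illustrative.

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapRunner;

    // For FileSplit-based input formats the framework sets map.input.* in the
    // task's configuration, which is visible here before run() is called.
    public class SplitAwareMapRunner
        extends MapRunner<LongWritable, Text, Text, Text> {

      @Override
      public void configure(JobConf job) {
        super.configure(job);
        String file = job.get("map.input.file");         // path of the split's file
        long start = job.getLong("map.input.start", 0);  // byte offset of the split
        long length = job.getLong("map.input.length", 0);
        System.err.println("Processing " + file + " ["
            + start + ", " + (start + length) + ")");
      }
    }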

[Announcement] HBase major release version to track Hadoop major releases.

2008-08-21 Thread Jim Kellerman
Since HBase became a subproject of Hadoop, it started its own release numbering scheme. However, because a particular release of HBase requires a specific release of Hadoop, HBase will change so that its major releases correspond to the Hadoop major release that it depends on. For example, in

Re: [Streaming] How to pass arguments to a map/reduce script

2008-08-21 Thread Rong-en Fan
On Fri, Aug 22, 2008 at 12:51 AM, Steve Gao [EMAIL PROTECTED] wrote: That's interesting. Suppose your mapper script is a Perl script, how do you assign my.mapper.arg1's value to a variable $x? $x = $my.mapper.arg1 I just tried the way and my perl script does not recognize $my.mapper.arg1.
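
For completeness: -cmdenv NAME=value puts NAME into the environment of the streaming mapper/reducer process, and, if I remember correctly, streaming also exports job configuration properties to that environment with non-alphanumeric characters replaced by underscores, so a property set as my.mapper.arg1 would be read in Perl as $ENV{'my_mapper_arg1'} rather than as $my.mapper.arg1. Below is a minimal Java streaming mapper illustrating the environment-variable read; MY_MAPPER_ARG and the launch line in the comment are illustrative, not from the thread.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    // A streaming mapper that reads a job argument from its environment.
    // Launched with something like:
    //   hadoop jar contrib/streaming/hadoop-*-streaming.jar \
    //     -cmdenv MY_MAPPER_ARG=foo -mapper 'java ArgEchoMapper' ...
    // A Perl mapper would do the equivalent with $ENV{'MY_MAPPER_ARG'}.
    public class ArgEchoMapper {
      public static void main(String[] args) throws Exception {
        String arg = System.getenv("MY_MAPPER_ARG"); // value passed via -cmdenv
        BufferedReader in =
            new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
          // Streaming expects key<TAB>value lines on stdout.
          System.out.println(arg + "\t" + line);
        }
      }
    }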

Questions about lucene index on HDFS

2008-08-21 Thread Jarvis . Guo
Hi all, first of all, I know that there is an FsDirectory class in Nutch 0.9, so we can access the index on HDFS. But after testing it, I found that we can only read the index and cannot append to or modify it. I think the reason is the one mentioned in the HDFS file-append issues, am I right?
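
For context: HDFS at this point supports neither appends nor random writes, so a Lucene index stored on HDFS can effectively only be read (e.g. through Nutch's FsDirectory). A common workaround is to build or update the index on local disk and then copy the finished index into HDFS; a sketch of that copy step, with illustrative paths.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Build/modify the Lucene index on local disk with a normal Lucene
    // directory, then publish the finished index to HDFS for read-only use.
    public class PublishIndex {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem dfs = FileSystem.get(conf);
        Path localIndex = new Path("/tmp/lucene-index");    // built locally
        Path hdfsIndex = new Path("/indexes/lucene-index"); // read-only copy
        dfs.copyFromLocalFile(localIndex, hdfsIndex);
      }
    }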