Re: libhdfs working for test program when run from ant but failing when run individually

2008-03-28 Thread Raghavendra K
Hi,
The segmentation fault described below still exists.
I rewrote my application to use ant, but when I integrate it with libhdfs
it fails with a segmentation fault, exiting with code 139.
Please do help, as I have already spent a lot of time rewriting my
application to use Hadoop, and it fails for no apparent reason.

On Wed, Mar 19, 2008 at 12:52 PM, Raghavendra K [EMAIL PROTECTED]
wrote:

 I am passing the following arguments

 OS_NAME=Linux
 OS_ARCH=i386
 LIBHDFS_BUILD_DIR=/garl/garl-alpha1/home1/raghu/Desktop/hadoop-0.15.3/build/libhdfs
 JAVA_HOME=/garl/garl-alpha1/home1/raghu/Desktop/jdk1.5.0_14
 PLATFORM=linux
 SHLIB_VERSION=1

 I have commented out the line
 #PLATFORM = $(shell echo $$OS_NAME | tr [A-Z] [a-z])
 and am passing PLATFORM=linux instead,
 as the line was not executing if I just typed
 make test
 separately.
 I also changed the line

 $(HDFS_TEST): hdfs_test.c
 $(CC) $(CPPFLAGS) $< -L$(LIBHDFS_BUILD_DIR) -l$(LIB_NAME)
 $(LDFLAGS) -o $@

 (I added LDFLAGS because, when run, it complained that libjvm.so was
 not found)
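A minimal sketch of what LDFLAGS could look like for that link step (the JDK
library directory is an assumption derived from the JAVA_HOME and OS_ARCH
values above; the original message does not show the actual value used):

  # Hypothetical LDFLAGS for linking hdfs_test against libhdfs and the JVM.
  # jre/lib/i386/server is where a 32-bit Sun JDK 1.5 keeps libjvm.so; the
  # rpath entries let the binary locate libjvm.so and libhdfs.so at run time.
  LDFLAGS = -L$(JAVA_HOME)/jre/lib/$(OS_ARCH)/server -ljvm \
            -Wl,-rpath,$(JAVA_HOME)/jre/lib/$(OS_ARCH)/server \
            -Wl,-rpath,$(LIBHDFS_BUILD_DIR)

Note that a run-time segmentation fault can also come from the environment
rather than the link line, so the CLASSPATH and HADOOP_CONF_DIR that
test-libhdfs.sh exports are worth comparing against the standalone run.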

 Where am I going wrong? Kindly let me know if I have to provide any other
 information.


 On Tue, Mar 18, 2008 at 11:41 PM, Arun C Murthy [EMAIL PROTECTED]
 wrote:

 
  On Mar 14, 2008, at 11:48 PM, Raghavendra K wrote:
 
   Hi,
 My apologies for bugging the forum again and again.
   I am able to get the sample program for libhdfs working. I followed
   these
   steps.
  
   --- compiled using ant
   --- modified the test-libhdfs.sh to include CLASSPATH, HADOOP_HOME,
   HADOOP_CONF_DIR, HADOOP_LOG_DIR, LIBHDFS_BUILD_DIR (since I ran
   test-libhdfs.sh individually and didn't invoke it from ant)
   --- The program ran successfully and was able to write, read, and so on.
  
   Now I copied the same program to a different directory, used the same
   Makefile (the one used by ant), and modified the variables accordingly. I ran
   make test
   and it compiled successfully.
   I used the same test-libhdfs.sh to invoke hdfs_test, but now it fails
   with a segmentation fault.
   I don't know where it is going wrong.
   Can't libhdfs be compiled without using ant? I want to test it and
   integrate libhdfs with my program.
   Please do reply and help me out, as this is driving me crazy.
 
  I can only assume there is something wrong with the values you are
  passing for the requisite environment variables: OS_{NAME|ARCH},
  SHLIB_VERSION, LIBHDFS_VERSION, HADOOP_{HOME|CONF_DIR|LOG_DIR}, since
  it works when you run 'make test'.
 
  Sorry it isn't of much help... could you share the values you are
  using for these?
 
  Arun
 
 
   Thanks in advance.
  
   --
   Regards,
   Raghavendra K
 
 


 --
 Regards,
 Raghavendra K




-- 
Regards,
Raghavendra K


Re: [Map/Reduce][HDFS]

2008-03-28 Thread Peeyush Bishnoi
Hello,

Yes, you can do this by specifying in hadoop-site.xml the location of the
namenode where your data is already distributed.

---
<property>
  <name>fs.default.name</name>
  <value>IPAddress:PortNo</value>
</property>

---
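For example, a filled-in version might look like the following (the host name
and port are placeholders, not values from the original message); every node
that should use that HDFS instance needs it in its hadoop-site.xml:

  <property>
    <name>fs.default.name</name>
    <value>namenode.example.com:9000</value>
  </property>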

Thanks

---
Peeyush


On Thu, 2008-03-27 at 15:41 -0400, Jean-Pierre wrote:

 Hello,
 
 I'm working on a large amount of logs, and I've noticed that the
 distribution of data over the network (./hadoop dfs -put input input)
 takes a lot of time.
 
 Let's say that my data is already distributed among the machines; is
 there any way to tell Hadoop to use the already existing
 distribution?
 
 Thanks
 


DFS gets blocked when writing a file.

2008-03-28 Thread Iván de Prado
Hello, 

I'm working with Hadoop 0.16.1. I have an issue with the DFS. Sometimes
when writing to the HDFS it gets blocked. Sometimes it doesn't happen,
so it's not easily reproducible. 

My cluster has 4 nodes and one master with the NameNode and JobTracker.
These are the logs that appear when everything gets blocked. Look at block
blk_7857709233639057851, which seems to be the problematic one. It raises
the exception:

Exception in receiveBlock for block  java.io.IOException: Trying to
change block file offset of block blk_7857709233639057851 to 33357824
but actual size of file is 33353728

A longer excerpt of the logs and part of the stack trace:

hn3: 2008-03-28 07:34:44,499 INFO org.apache.hadoop.dfs.DataNode:
Receiving block blk_7857709233639057851 src: /172.16.3.2:46092
dest: /172.16.3.2:50010
hn3: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
Datanode 2 got response for connect ack  from downstream datanode with
firstbadlink as 
hn3: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
Datanode 2 forwarding connect ack to upstream firstbadlink is 
hn2: 2008-03-28 07:34:44,496 INFO org.apache.hadoop.dfs.DataNode:
Received block blk_8152094109584962620 of size 67108864 from /172.16.3.2
hn2: 2008-03-28 07:34:44,496 INFO org.apache.hadoop.dfs.DataNode:
PacketResponder 2 for block blk_8152094109584962620 terminating
hn2: 2008-03-28 07:34:44,500 INFO org.apache.hadoop.dfs.DataNode:
Receiving block blk_7857709233639057851 src: /172.16.3.5:35904
dest: /172.16.3.5:50010
hn2: 2008-03-28 07:34:44,502 INFO org.apache.hadoop.dfs.DataNode:
Datanode 1 got response for connect ack  from downstream datanode with
firstbadlink as 
hn2: 2008-03-28 07:34:44,502 INFO org.apache.hadoop.dfs.DataNode:
Datanode 1 forwarding connect ack to upstream firstbadlink is 
hn1: 2008-03-28 07:34:44,495 INFO org.apache.hadoop.dfs.DataNode:
Received block blk_8152094109584962620 of size 67108864 from /172.16.3.4
hn1: 2008-03-28 07:34:44,495 INFO org.apache.hadoop.dfs.DataNode:
PacketResponder 1 for block blk_8152094109584962620 terminating
hn4: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
Receiving block blk_7857709233639057851 src: /172.16.3.4:36887
dest: /172.16.3.4:50010
hn4: 2008-03-28 07:34:44,501 INFO org.apache.hadoop.dfs.DataNode:
Datanode 0 forwarding connect ack to upstream firstbadlink is 
hn4: 2008-03-28 07:34:44,615 INFO org.apache.hadoop.dfs.DataNode:
Changing block file offset of block blk_7857709233639057851 from 4325376
to 4325376 meta file offset to 33799
hn3: 2008-03-28 07:34:45,304 INFO org.apache.hadoop.dfs.DataNode:
Changing block file offset of block blk_7857709233639057851 from
33353728 to 33357824 meta file offset to 260615
hn3: 2008-03-28 07:34:45,305 INFO org.apache.hadoop.dfs.DataNode:
Exception in receiveBlock for block  java.io.IOException: Trying to
change block file offset of block blk_7857709233639057851 to 33357824
but actual size of file is 33353728
hn1: 2008-03-28 07:35:31,835 INFO org.apache.hadoop.dfs.DataNode:
BlockReport of 564 blocks got processed in 128 msecs

Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b19 mixed
mode):

ResponseProcessor for block blk_7857709233639057851 prio=10
tid=0x5c557800 nid=0x23ad waiting for monitor entry
[0x40e15000..0x40e15a10]
   java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream
$ResponseProcessor.run(DFSClient.java:1771)
- waiting to lock 0x2aaab43ad910 (a java.util.LinkedList)

DataStreamer for file /user/properazzi/test/output/index/_0.cfs block
blk_7857709233639057851 prio=10 tid=0x5c59f000 nid=0x2392
runnable [0x41219000..0x41219d10]
   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at
java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at
java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
- locked 0x2aaade9b8120 (a java.io.BufferedOutputStream)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
- locked 0x2aaade9b8148 (a java.io.DataOutputStream)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream
$DataStreamer.run(DFSClient.java:1623)
- locked 0x2aaab43ad910 (a java.util.LinkedList)

[EMAIL PROTECTED] daemon prio=10
tid=0x5c7f1000 nid=0x2254 waiting on condition
[0x41118000..0x41118a90]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.dfs.DFSClient
$LeaseChecker.run(DFSClient.java:597)
at java.lang.Thread.run(Thread.java:619)

[EMAIL PROTECTED] daemon prio=10
tid=0x5c4fec00 nid=0x224f waiting on condition
[0x40f16000..0x40f16c90]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
 

Re: DFS gets blocked when writing a file.

2008-03-28 Thread Raghu Angadi


 Exception in receiveBlock for block  java.io.IOException: Trying to
 change block file offset of block blk_7857709233639057851 to 33357824
 but actual size of file is 33353728

This was fixed in HADOOP-3033. You can try running the latest 0.16 branch
(svn...hadoop/core/branches/branch-016). The 0.16.2 release is scheduled for
early next week.


This exception does not fully explain the blocked client. If the client
blocks again with the latest 0.16 branch, could you include stack traces from
the datanodes as well? You could file a JIRA so that it is convenient to attach
logs and stack traces.
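For reference, checking out that branch would look roughly like this; the full
repository URL is elided in the message above, so the one shown here is an
assumption based on the usual Apache SVN layout:

  svn checkout http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16
  # then build with ant and deploy the resulting jars to the cluster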


Raghu.


Re: [Map/Reduce][HDFS]

2008-03-28 Thread Jean-Pierre
Hello,

I'm not sure I've understood... actually I've already set this field in
the configuration file. I think this field just specifies the master
for the HDFS.

My problem is that I have many machines, each with a bunch of
files which represent the distributed data, and I want to use this
distribution of data with Hadoop. Maybe there is another configuration
file which allows me to tell Hadoop how to use my file distribution.
Is that possible? Or should I look at adapting my distribution of data to
Hadoop's?

Anyway, thanks for your answer, Peeyush.

On Fri, 2008-03-28 at 16:22 +0530, Peeyush Bishnoi wrote:
 Hello,
 
 Yes, you can do this by specifying in hadoop-site.xml the location of the
 namenode where your data is already distributed.
 
 ---
 <property>
   <name>fs.default.name</name>
   <value>IPAddress:PortNo</value>
 </property>
 
 ---
 
 Thanks
 
 ---
 Peeyush
 
 
 On Thu, 2008-03-27 at 15:41 -0400, Jean-Pierre wrote:
 
  Hello,
  
  I'm working on a large amount of logs, and I've noticed that the
  distribution of data over the network (./hadoop dfs -put input input)
  takes a lot of time.
  
  Let's say that my data is already distributed among the machines; is
  there any way to tell Hadoop to use the already existing
  distribution?
  
  Thanks
  




Re: EC2 contrib scripts

2008-03-28 Thread Chris K Wensel

make that xen kernels..

btw, they scale much better (see previous post) under heavy load. So
now, instead of timeouts and dropped connections, JVM instances exit
prematurely. I'm unsure of the cause of this just yet, but they are so few that
the impact is negligible.


ckw

On Mar 28, 2008, at 10:00 AM, Chris K Wensel wrote:

Hey all

I pushed up a patch (and tar) for the ec2 contrib scripts that
provides support for instance sizes, new zen kernels, availability zones,
concurrent clusters, resizing, ganglia, etc.


the patch can be found here:
https://issues.apache.org/jira/browse/HADOOP-2410

I use these daily, but it is likely wise for others to give them a  
shot to make sure they work for someone besides me.


Since I can't publish to the hadoop-images bucket, you will need to  
build your own image stored in your own bucket. This only takes a  
few minutes.


Feedback against the JIRA issue is best.

cheers,
ckw

Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/






Chris K Wensel
[EMAIL PROTECTED]
http://chris.wensel.net/
http://www.cascading.org/






Re: Performance / cluster scaling question

2008-03-28 Thread Doug Cutting

Doug Cutting wrote:
Seems like we should force things onto the same availability zone by
default, now that this is available.  Patch, anyone?


It's already there!  I just hadn't noticed.

https://issues.apache.org/jira/browse/HADOOP-2410

Sorry for missing this, Chris!

Doug


Re: hadoop 0.15.3 r612257 freezes on reduce task

2008-03-28 Thread Bradford Stephens
Hey everyone,

I'm having a similar problem:

Map output lost, rescheduling:
getMapOutput(task_200803281212_0001_m_00_2,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
task_200803281212_0001_m_00_2/file.out.index in any of the
configured local directories

Then it fails in about 10 minutes. I'm just trying to grep some etexts.

New HDFS installation on 2 nodes (one master, one slave). Ubuntu
Linux, Dell Core 2 Duo processors, Java 1.5.0.

I have a feeling it's a configuration issue. Anyone else run into it?


On Tue, Jan 29, 2008 at 11:08 AM, Jason Venner [EMAIL PROTECTED] wrote:
 We are running under Linux with DFS on GigE LANs, kernel
  2.6.15-1.2054_FC5smp, with a variety of Xeon steppings for our processors.
  Our replication factor was set to 3.



  Florian Leibert wrote:
   Maybe it helps to know that we're running Hadoop inside amazon's EC2...
  
   Thanks,
   Florian
  

  --
  Jason Venner
  Attributor - Publish with Confidence http://www.attributor.com/
  Attributor is hiring Hadoop Wranglers, contact if interested



RE: Reduce Hangs

2008-03-28 Thread Natarajan, Senthil
Hi,
Thanks for your suggestions.

It looks like the problem is with the firewall. I created a firewall rule to
allow ports 5 to 50100 (the range in which I found Hadoop listening).

It looks like I am missing some ports, though, and those get blocked by the
firewall.

Could anyone please let me know how to configure Hadoop to use only certain
specified ports, so that those ports can be allowed through the firewall?

Thanks,
Senthil
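A partial sketch of pinning the two RPC endpoints to known ports in
hadoop-site.xml (the host name and port numbers are placeholders; the web UI,
datanode, and tasktracker ports are controlled by separate properties whose
names are listed in hadoop-default.xml for the version in use):

  <property>
    <name>fs.default.name</name>
    <value>master.example.com:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>master.example.com:9001</value>
  </property>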

-Original Message-
From: 朱盛凯 [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 27, 2008 12:32 PM
To: core-user@hadoop.apache.org
Subject: Re: Reduce Hangs

Hi,

I met this problem in my cluster before, I think I can share with you some
of my experience.
But it may not work in you case.

The job in my cluster always hung at 16% of reduce. It occured because the
reduce task could not fetch the
map output from other nodes.

In my case, two factors may result in this faliure of communication between
two task trackers.

One is the firewall block the trackers from communications. I solved this by
disabling the firewall.
The other factor is that trackers refer to other nodes by host name only,
but not ip address. I solved this by editing the file /etc/hosts
with mapping from hostname to ip address of all nodes in cluster.
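A minimal sketch of such an /etc/hosts mapping (host names and addresses are
made up for illustration; the real file would list every node in the cluster
and be kept identical on all of them):

  # hostname-to-IP mappings for all cluster nodes
  192.168.1.10   master
  192.168.1.11   slave1
  192.168.1.12   slave2
  192.168.1.13   slave3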

I hope my experience will be helpful for you.

On 3/27/08, Natarajan, Senthil [EMAIL PROTECTED] wrote:

 Hi,
 I have a small Hadoop cluster: one master and three slaves.
 When I try the example wordcount on one of our log files (size ~350 MB),

 Map runs fine but reduce always hangs (sometimes around 19%, 60%, ...); after
 a very long time it finishes.
 I am seeing this error:
 Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out
 In the log I am seeing this:
 INFO org.apache.hadoop.mapred.TaskTracker:
 task_200803261535_0001_r_00_0 0.1834% reduce  copy (11 of 20 at
 0.02 MB/s)

 Do you know what might be the problem?
 Senthil




Re: hadoop 0.15.3 r612257 freezes on reduce task

2008-03-28 Thread Bradford Stephens
Also, I'm running hadoop 0.16.1 :)

On Fri, Mar 28, 2008 at 1:23 PM, Bradford Stephens
[EMAIL PROTECTED] wrote:
 Hey everyone,

  I'm having a similar problem:

  Map output lost, rescheduling:
  getMapOutput(task_200803281212_0001_m_00_2,0) failed :

 org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
  task_200803281212_0001_m_00_2/file.out.index in any of the
  configured local directories

  Then it fails in about 10 minutes. I'm just trying to grep some etexts.

  New HDFS installation on 2 nodes (one master, one slave). Ubuntu
  Linux, Dell Core 2 Duo processors, Java 1.5.0.

  I have a feeling it's a configuration issue. Anyone else run into it?




  On Tue, Jan 29, 2008 at 11:08 AM, Jason Venner [EMAIL PROTECTED] wrote:
    We are running under Linux with DFS on GigE LANs, kernel
2.6.15-1.2054_FC5smp, with a variety of Xeon steppings for our processors.
Our replication factor was set to 3.
  
  
  
Florian Leibert wrote:
 Maybe it helps to know that we're running Hadoop inside amazon's EC2...

 Thanks,
 Florian

  
--
Jason Venner
Attributor - Publish with Confidence http://www.attributor.com/
Attributor is hiring Hadoop Wranglers, contact if interested
  



RE: hadoop 0.15.3 r612257 freezes on reduce task

2008-03-28 Thread Devaraj Das
Hi Bradford,
Could you please check what your mapred.local.dir is set to?
Devaraj. 

 -Original Message-
 From: Bradford Stephens [mailto:[EMAIL PROTECTED] 
 Sent: Saturday, March 29, 2008 1:54 AM
 To: core-user@hadoop.apache.org
 Cc: [EMAIL PROTECTED]
 Subject: Re: hadoop 0.15.3 r612257 freezes on reduce task
 
 Hey everyone,
 
 I'm having a similar problem:
 
 Map output lost, rescheduling:
 getMapOutput(task_200803281212_0001_m_00_2,0) failed :
 org.apache.hadoop.util.DiskChecker$DiskErrorException: Could 
 not find task_200803281212_0001_m_00_2/file.out.index in 
 any of the configured local directories
 
 Then it fails in about 10 minutes. I'm just trying to grep 
 some etexts.
 
 New HDFS installation on 2 nodes (one master, one slave). 
 Ubuntu Linux, Dell Core 2 Duo processors, Java 1.5.0.
 
 I have a feeling it's a configuration issue. Anyone else run into it?
 
 
 On Tue, Jan 29, 2008 at 11:08 AM, Jason Venner 
 [EMAIL PROTECTED] wrote:
  We are running under Linux with DFS on GigE LANs, kernel
  2.6.15-1.2054_FC5smp, with a variety of Xeon steppings for
 our processors.
   Our replication factor was set to 3.
 
 
 
   Florian Leibert wrote:
Maybe it helps to know that we're running Hadoop inside 
 amazon's EC2...
   
Thanks,
Florian
   
 
   --
   Jason Venner
   Attributor - Publish with Confidence http://www.attributor.com/  
  Attributor is hiring Hadoop Wranglers, contact if interested
 
 



Re: hadoop 0.15.3 r612257 freezes on reduce task

2008-03-28 Thread Bradford Stephens
Thanks for the hint, Devaraj! I was using paths for
mapred.local.dir that were based on ~/, so I gave it an absolute path
instead. Also, the directory for hadoop.tmp.dir did not exist on one
machine :)
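For anyone hitting the same thing, a minimal hadoop-site.xml sketch with
absolute paths (the directories shown are placeholders; they must exist and be
writable on every node):

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/hadoop/tmp</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/var/hadoop/mapred-local</value>
  </property>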


On Fri, Mar 28, 2008 at 2:00 PM, Devaraj Das [EMAIL PROTECTED] wrote:
 Hi Bradford,
  Could you please check what your mapred.local.dir is set to?
  Devaraj.



   -Original Message-
   From: Bradford Stephens [mailto:[EMAIL PROTECTED]
   Sent: Saturday, March 29, 2008 1:54 AM
   To: core-user@hadoop.apache.org
   Cc: [EMAIL PROTECTED]
   Subject: Re: hadoop 0.15.3 r612257 freezes on reduce task
  
   Hey everyone,
  
   I'm having a similar problem:
  
   Map output lost, rescheduling:
   getMapOutput(task_200803281212_0001_m_00_2,0) failed :
   org.apache.hadoop.util.DiskChecker$DiskErrorException: Could
   not find task_200803281212_0001_m_00_2/file.out.index in
   any of the configured local directories
  
   Then it fails in about 10 minutes. I'm just trying to grep
   some etexts.
  
   New HDFS installation on 2 nodes (one master, one slave).
   Ubuntu Linux, Dell Core 2 Duo processors, Java 1.5.0.
  
   I have a feeling it's a configuration issue. Anyone else run into it?
  
  
   On Tue, Jan 29, 2008 at 11:08 AM, Jason Venner
   [EMAIL PROTECTED] wrote:
 We are running under Linux with DFS on GigE LANs, kernel
 2.6.15-1.2054_FC5smp, with a variety of Xeon steppings for
    our processors.
  Our replication factor was set to 3.
   
   
   
 Florian Leibert wrote:
  Maybe it helps to know that we're running Hadoop inside
   amazon's EC2...
 
  Thanks,
  Florian
 
   
 --
 Jason Venner
 Attributor - Publish with Confidence http://www.attributor.com/
Attributor is hiring Hadoop Wranglers, contact if interested
   
  




Experience with Hadoop on Open Solaris

2008-03-28 Thread Pete Wyckoff

Anyone have experience running a production cluster on Open Solaris? The
advantage, of course, is the availability of ZFS, but I haven't seen much in
the way of people on the list mentioning that they use Open Solaris.

Thanks, pete



small sized files - how to use MultiFileInputFormat

2008-03-28 Thread Jason Curtes
Hello,

I have been trying to run Hadoop on a set of small text files, none larger
than 10k each. The total input size is 15MB. If I run the example
word count application, it takes about 2000 seconds, more than half an hour,
to complete. However, if I merge all the files into one large file, it takes
much less than a minute. I think using MultiFileInputFormat could be helpful
here. However, the API documentation is not really helpful. I
wonder whether MultiFileInputFormat can really solve my problem, and if so, can
you suggest a reference on how to use it, or a few lines to be added to
the word count example to make things clearer?

Thanks in advance.

Regards,

Jason Curtes
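As a rough answer to the question above, here is a hedged, untested sketch of
subclassing MultiFileInputFormat so that many small files share one map task.
The class names WholeFileInputFormat and WholeFileRecordReader are invented for
illustration, and the MultiFileInputFormat/MultiFileSplit API is assumed from
the old org.apache.hadoop.mapred package (it may not exist in every 0.1x
release). Each record becomes (file path, whole file contents), so a word
count mapper would tokenize the value rather than a single line:

  import java.io.IOException;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.*;

  // Packs many small files into fewer map tasks; every file in a
  // MultiFileSplit is emitted as one (path, contents) record.
  public class WholeFileInputFormat extends MultiFileInputFormat<Text, Text> {

    public RecordReader<Text, Text> getRecordReader(InputSplit split, JobConf job,
        Reporter reporter) throws IOException {
      return new WholeFileRecordReader(job, (MultiFileSplit) split);
    }

    static class WholeFileRecordReader implements RecordReader<Text, Text> {
      private final JobConf conf;
      private final MultiFileSplit split;
      private int index = 0;

      WholeFileRecordReader(JobConf conf, MultiFileSplit split) {
        this.conf = conf;
        this.split = split;
      }

      public boolean next(Text key, Text value) throws IOException {
        if (index >= split.getNumPaths()) {
          return false;                       // no more files in this split
        }
        Path file = split.getPath(index);
        FileSystem fs = file.getFileSystem(conf);
        FSDataInputStream in = fs.open(file);
        try {
          // Files are small, so reading each one whole is acceptable here.
          byte[] contents = new byte[(int) split.getLength(index)];
          in.readFully(contents);
          key.set(file.toString());
          value.set(contents, 0, contents.length);
        } finally {
          in.close();
        }
        index++;
        return true;
      }

      public Text createKey() { return new Text(); }
      public Text createValue() { return new Text(); }
      public long getPos() { return index; }     // approximate position: files read so far
      public void close() { }
      public float getProgress() {
        return split.getNumPaths() == 0 ? 1.0f : (float) index / split.getNumPaths();
      }
    }
  }

The job would then use conf.setInputFormat(WholeFileInputFormat.class) in place
of the default input format.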