Re: why can't I initialize an object in the Map class's constructor?

2008-03-13 Thread ma qiang
I'm sorry for my mistake. The object was initialized.
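For reference, the framework instantiates the Mapper by reflection and then calls configure(JobConf) once per task, so that is another common place for this kind of per-task setup. A minimal sketch of that pattern against the 0.16 mapred API (the class and field names simply mirror the snippet quoted below; treat it as an illustration, not a requirement):

import java.io.IOException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MapTest extends MapReduceBase implements Mapper {

    private int[][] myTestArray;
    private int myTestInt;

    // configure() is called once per task, after the framework has
    // created the mapper instance, so fields set here are visible
    // to every map() call of this task.
    public void configure(JobConf job) {
        super.configure(job);
        myTestArray = new int[10][10];
        myTestInt = 100;
    }

    public void map(WritableComparable key, Writable value,
                    OutputCollector output, Reporter reporter) throws IOException {
        // ... use myTestArray and myTestInt here ...
    }
}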






On Thu, Mar 13, 2008 at 3:13 PM, ma qiang [EMAIL PROTECTED] wrote:
 Hi all,
 My code is as below:

  public class MapTest extends MapReduceBase implements Mapper {
      private int[][] myTestArray;
      private int myTestInt;

      public MapTest()
      {
          System.out.println("The constructor ran!");
          myTestArray = new int[10][10];
          myTestInt = 100;
      }
      ...
  }

   When I run this program, I find that the constructor runs and myTestInt is
  100, but myTestArray is null and is never initialized. I have no idea why
  the object is not initialized in the Map class's constructor. Thank you for
  your reply!

  Qiang Ma



Re: Separate data-nodes from worker-nodes

2008-03-13 Thread Andrey Pankov

Thanks, Ted!

I also thought it was not a good idea to separate them. I was just 
wondering whether it is possible at all. Thanks!



Ted Dunning wrote:

It is quite possible to do this.

It is also a bad idea.

One of the great things about map-reduce architectures is that data is near
the computation so that you don't have to wait for the network.  If you
separate data and computation, you impose additional load on the cluster.

What this will do to your throughput is an open question and it depends a
lot on your programs.


On 3/13/08 1:42 AM, Andrey Pankov [EMAIL PROTECTED] wrote:


Hi,

Is it possible to configure a Hadoop cluster so that data-nodes and
worker-nodes are separate, i.e. nodes 1, 2, 3 store data in HDFS while
nodes 3, 4 and 5 run the map-reduce jobs and read their data from HDFS?

If it is possible, what impact would it have on performance? Any suggestions?

Thanks in advance,

--- Andrey Pankov





---
Andrey Pankov


Re: Separate data-nodes from worker-nodes

2008-03-13 Thread Ted Dunning

It is very possible (even easy).

The data nodes run the datanode process.  The task nodes run the task
tracker.  If the data nodes don't have a task tracker running, then they
won't do any computation.


On 3/13/08 8:22 AM, Andrey Pankov [EMAIL PROTECTED] wrote:

 Thanks, Ted!
 
 I also thought it was not a good idea to separate them. I was just
 wondering whether it is possible at all. Thanks!
 
 
 Ted Dunning wrote:
 It is quite possible to do this.
 
 It is also a bad idea.
 
 One of the great things about map-reduce architectures is that data is near
 the computation so that you don't have to wait for the network.  If you
 separate data and computation, you impose additional load on the cluster.
 
 What this will do to your throughput is an open question and it depends a
 lot on your programs.
 
 
 On 3/13/08 1:42 AM, Andrey Pankov [EMAIL PROTECTED] wrote:
 
 Hi,
 
 Is it possible to configure a Hadoop cluster so that data-nodes and
 worker-nodes are separate, i.e. nodes 1, 2, 3 store data in HDFS while
 nodes 3, 4 and 5 run the map-reduce jobs and read their data from HDFS?
 
 If it is possible, what impact would it have on performance? Any suggestions?
 
 Thanks in advance,
 
 --- Andrey Pankov
 
 
 
 ---
 Andrey Pankov



HadoopDfsReadWriteExample

2008-03-13 Thread Cagdas Gerede
I tried HadoopDfsReadWriteExample. I am getting the following error. I
appreciate any help. I provide more info at the end.


Error while copying file
Exception in thread "main" java.io.IOException: Cannot run program "df": CreateProcess error=2, The system cannot find the file specified
   at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
   at java.lang.Runtime.exec(Runtime.java:593)
   at java.lang.Runtime.exec(Runtime.java:466)
   at org.apache.hadoop.fs.ShellCommand.runCommand(ShellCommand.java:48)
   at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
   at org.apache.hadoop.fs.DF.getAvailable(DF.java:72)
   at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:296)
   at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:326)
   at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:155)
   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.newBackupFile(DFSClient.java:1483)
   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.openBackupStream(DFSClient.java:1450)
   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:1592)
   at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:140)
   at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:122)
   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1728)
   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
   at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
   at HadoopDFSFileReadWrite.main(HadoopDFSFileReadWrite.java:106)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
   at java.lang.ProcessImpl.create(Native Method)
   at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
   at java.lang.ProcessImpl.start(ProcessImpl.java:30)
   at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
   ... 17 more



Note: I am on a Windows machine. The namenode is running on the same
Windows machine. The way I initialized the configuration is:

Configuration conf = new Configuration();
conf.addResource(new Path("C:\\cygwin\\hadoop-management\\hadoop-conf\\hadoop-site.xml"));
FileSystem fs = FileSystem.get(conf);
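For context, the stack trace shows the client-side write path: FSDataOutputStream.close() flushes through DFSClient, which stages data in a local backup file and asks org.apache.hadoop.fs.DF for free space, and DF shells out to the df command, which is not found on a plain Windows PATH (CreateProcess error=2). A minimal sketch of the write path the example exercises, assuming the 0.16 FileSystem API (the output path is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DfsWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("C:\\cygwin\\hadoop-management\\hadoop-conf\\hadoop-site.xml"));

        FileSystem fs = FileSystem.get(conf);

        // Create a file in HDFS and write a few bytes to it.
        Path out = new Path("/user/cagdas/hello.txt");   // illustrative path
        FSDataOutputStream stream = fs.create(out);
        stream.writeBytes("hello, dfs\n");

        // close() flushes the buffered data; in the stack trace above this
        // is the call that ends up shelling out to 'df' on the client.
        stream.close();
        fs.close();
    }
}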


Any suggestions?

Cagdas


Re: file permission problem

2008-03-13 Thread s29752-hadoopuser
Hi Johannes,

 i'm using the 0.16.0 distribution.
I assume you mean the 0.16.0 release 
(http://hadoop.apache.org/core/releases.html) without any additional patch.

I have just tried it but cannot reproduce the problem you described.  I did the 
following:
1) start a cluster with tsz
2) run a job with nicholas

The output directory and files are owned by nicholas.  Am I doing the same 
thing you did?  Could you try again?

Nicholas


 - Original Message 
 From: Johannes Zillmann [EMAIL PROTECTED]
 To: core-user@hadoop.apache.org
 Sent: Wednesday, March 12, 2008 5:47:27 PM
 Subject: file permission problem

 Hi,

 I have a question regarding the file permissions.
 I have a kind of workflow where I submit a job from my laptop to a
 remote hadoop cluster.
 After the job finishes I do some file operations on the generated output.
 The cluster user is different from the laptop user. As output I
 specify a directory inside the user's home. This output directory,
 created by the map-reduce job, has cluster-user permissions, so
 I cannot move or delete the output folder as my laptop user.

 So it looks as follows:
 /user/jz/        rwxrwxrwx  jz      supergroup
 /user/jz/output  rwxr-xr-x  hadoop  supergroup

 I tried different things to achieve what I want (moving/deleting the
 output folder); a rough sketch of the two setPermission attempts in code
 follows the list:
 - jobConf.setUser("hadoop") on the client side
 - System.setProperty("user.name", "hadoop") before jobConf instantiation
 on the client side
 - add a user.name node in the hadoop-site.xml on the client side
 - setPermission(777) on the home folder on the client side (does not work
 recursively)
 - setPermission(777) on the output folder on the client side (permission
 denied)
 - create the output folder before running the job ("Output directory
 already exists" exception)
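 Roughly, the two setPermission attempts via the FileSystem API (a sketch
 against 0.16; the exact calls may differ slightly from what I actually ran):

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.fs.permission.FsPermission;

 public class ChmodSketch {
     public static void main(String[] args) throws Exception {
         Configuration conf = new Configuration();
         FileSystem fs = FileSystem.get(conf);

         // open up the home folder (applies to that directory only,
         // it is not recursive) ...
         fs.setPermission(new Path("/user/jz"), new FsPermission((short) 0777));

         // ... and the job output folder (this is the call that fails with
         // permission denied, since the directory is owned by the cluster user)
         fs.setPermission(new Path("/user/jz/output"), new FsPermission((short) 0777));
     }
 }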

 None of the things I tried worked. Is there a way to achieve what I want?
 Any ideas appreciated!

 cheers
 Johannes


   


-- 
~~~ 
101tec GmbH

Halle (Saale), Saxony-Anhalt, Germany
http://www.101tec.com






Re: copy - sort hanging

2008-03-13 Thread Chris K Wensel

here is a reset, followed by three attempts to write the block.

2008-03-13 13:40:06,892 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7813471133156061911 src: /10.251.26.3:35762 dest: /10.251.26.3:50010
2008-03-13 13:40:06,957 INFO org.apache.hadoop.dfs.DataNode: Exception in receiveBlock for block blk_7813471133156061911 java.net.SocketException: Connection reset
2008-03-13 13:40:06,957 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.net.SocketException: Connection reset
2008-03-13 13:40:06,958 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.net.SocketException: Connection reset
   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
   at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
   at java.io.DataOutputStream.flush(DataOutputStream.java:106)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receivePacket(DataNode.java:2194)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:2244)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1150)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
   at java.lang.Thread.run(Thread.java:619)

2008-03-13 13:40:11,751 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7813471133156061911 src: /10.251.27.148:48384 dest: /10.251.27.148:50010
2008-03-13 13:40:11,752 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
2008-03-13 13:40:11,752 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
   at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:638)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:1983)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1074)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
   at java.lang.Thread.run(Thread.java:619)

2008-03-13 13:48:37,925 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7813471133156061911 src: /10.251.70.210:37345 dest: /10.251.70.210:50010
2008-03-13 13:48:37,925 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
2008-03-13 13:48:37,925 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
   at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:638)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:1983)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1074)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
   at java.lang.Thread.run(Thread.java:619)

2008-03-13 14:08:36,089 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7813471133156061911 src: /10.251.26.223:49176 dest: /10.251.26.223:50010
2008-03-13 14:08:36,089 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
2008-03-13 14:08:36,089 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
   at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:638)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:1983)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1074)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
   at java.lang.Thread.run(Thread.java:619)

On Mar 13, 2008, at 11:25 AM, Chris K Wensel wrote:



should add that 10.251.65.207 (receiving end of NameSystem.pendingTransfer below) has this datanode log entry.



2008-03-13 14:08:36,089 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
2008-03-13 14:08:36,089 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.io.IOException: Block

Question about recovering from a corrupted namenode 0.16.0

2008-03-13 Thread Jason Venner
The namenode ran out of disk space and on restart was throwing the error 
at the end of this message.


We copied edit.tmp from the secondary in as edit, copied srcimage in as 
fsimage, and removed edit.new, and our file system started up and 
/appears/ to be intact.

What is the proper procedure? We didn't find any details on the wiki.

Namenode error:
2008-03-13 13:19:32,493 ERROR org.apache.hadoop.dfs.NameNode: java.io.EOFException
   at java.io.DataInputStream.readFully(DataInputStream.java:180)
   at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
   at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:90)
   at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:507)
   at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:744)
   at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:624)
   at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:222)
   at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:79)
   at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:254)
   at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:235)
   at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:130)
   at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:175)
   at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:161)
   at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:843)
   at org.apache.hadoop.dfs.NameNode.main(NameNode.java:852)



--
Jason Venner
Attributor - Publish with Confidence http://www.attributor.com/
Attributor is hiring Hadoop Wranglers, contact if interested


Problem retrieving entry from compressed MapFile

2008-03-13 Thread Xavier Stevens
Currently I can retrieve entries if I use MapFileOutputFormat via 
conf.setOutputFormat with no compression specified.  But I was trying to do 
this:

public void configure(JobConf jobConf) {
    ...
    this.writer = new MapFile.Writer(jobConf, fileSys, dirName, Text.class,
        Text.class, SequenceFile.CompressionType.BLOCK);
    ...
}

public void map(WritableComparable key, Writable value,
        OutputCollector output, Reporter reporter) throws IOException {
    ...
    writer.append(newkey, newvalue);
    ...
}

The idea is to use SequenceFile block compression. Then, later, I try to retrieve 
the output values in a separate class:

public static void main(String[] args) throws Exception {
    ...
    conf.setInputFormat(org.apache.hadoop.mapred.SequenceFileInputFormat.class);
    ...
    MapFile.Reader[] readers = MapFileOutputFormat.getReaders(fileSys, inDataPath, defaults);
    Partitioner part = (Partitioner) ReflectionUtils.newInstance(conf.getPartitionerClass(), conf);
    Text entryValue = null;
    entryValue = (Text) MapFileOutputFormat.getEntry(readers, part, new Text("mykey"), new Text());
    if (entryValue != null) {
        System.out.println("My Entry's Value: ");
        System.out.println(entryValue.toString());
    }
    for (MapFile.Reader reader : readers) {
        if (reader != null) {
            reader.close();
        }
    }
}

But when I use block compression I no longer get a result from 
MapFileOutputFormat.getEntry.  What am I doing wrong?  And/or is there a way 
for this to work using conf.setOutputFormat(MapFileOutputFormat.class) and 
conf.setMapOutputCompressionType(SequenceFile.CompressionType.BLOCK)? 
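For reference, a small standalone check that writes a block-compressed MapFile and reads a key straight back with MapFile.Reader, bypassing MapFileOutputFormat.getReaders/getEntry and the partitioner (class and path names are illustrative; it uses the same 0.16 MapFile.Writer constructor as above). If this also works against the job's output directory, block compression itself is fine and the problem is somewhere in the getEntry path:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class MapFileCompressionCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        String dirName = "/tmp/mapfile-block-test";   // illustrative path

        // Write a few entries with BLOCK compression; MapFile requires
        // keys to be appended in sorted order.
        MapFile.Writer writer = new MapFile.Writer(conf, fs, dirName,
                Text.class, Text.class, SequenceFile.CompressionType.BLOCK);
        writer.append(new Text("a"), new Text("value-a"));
        writer.append(new Text("b"), new Text("value-b"));
        writer.append(new Text("mykey"), new Text("value-mykey"));
        writer.close();

        // Read one key back directly, without MapFileOutputFormat.
        MapFile.Reader reader = new MapFile.Reader(fs, dirName, conf);
        Text value = new Text();
        if (reader.get(new Text("mykey"), value) != null) {
            System.out.println("Found: " + value);
        } else {
            System.out.println("Key not found");
        }
        reader.close();
    }
}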


Fault Tolerance: Inquiry for approaches to solve single point of failure when name node fails

2008-03-13 Thread Cagdas Gerede
I have a question. As we know, the name node forms a single point of failure.
In a production environment, I imagine the name node would run in a data center.
If that data center fails, how would you put a new name node in place in another
data center so that it can take over with minimal interruption?

I was wondering if anyone has any experience/ideas/comments on this.

Thanks

-Cagdas


Re: Fault Tolerance: Inquiry for approaches to solve single point of failure when name node fails

2008-03-13 Thread Cagdas Gerede
 If your data center fails, then you probably have to worry more about how to 
 get your data.

I am assuming multiple data centers. I know that, thanks to HDFS replication,
the data in the other data center will be enough.
However, as far as I can see, HDFS has no support for replication of the
namenode. Is this true?
If there is no automated support and I need to do this replication with some
custom code or manual intervention, what are the steps to do it?

Any help is appreciated.

Cagdas


RE: Question about recovering from a corrupted namenode 0.16.0

2008-03-13 Thread dhruba Borthakur
Your procedure is right:

1. Copy edit.tmp from secondary to edit on primary
2. Copy srcimage from secondary to fsimage on primary 
3. remove edits.new on primary
4. restart cluster, put in Safemode, fsck /

However, the above steps are not foolproof, because the transactions that
occurred between the time the last checkpoint was taken by the secondary
and the time the disk became full are lost. This could also cause some
blocks to go missing, because the last checkpoint might refer to blocks
that are no longer present. If fsck does not report any missing blocks,
then you are good to go.

Thanks,
dhruba

-Original Message-
From: Jason Venner [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 13, 2008 1:37 PM
To: core-user@hadoop.apache.org
Subject: Question about recovering from a corrupted namenode 0.16.0

The namenode ran out of disk space and on restart was throwing the error
at the end of this message.

We copied edit.tmp from the secondary in as edit, copied srcimage in as
fsimage, and removed edit.new, and our file system started up and
/appears/ to be intact.

What is the proper procedure? We didn't find any details on the wiki.

Namenode error:
2008-03-13 13:19:32,493 ERROR org.apache.hadoop.dfs.NameNode: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:90)
at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:507)
at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:744)
at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:624)
at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:222)
at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:79)
at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:254)
at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:235)
at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:130)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:175)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:161)
at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:843)
at org.apache.hadoop.dfs.NameNode.main(NameNode.java:852)



-- 
Jason Venner
Attributor - Publish with Confidence http://www.attributor.com/
Attributor is hiring Hadoop Wranglers, contact if interested


Re: Summit Move: More Seats, new Venue (Re: Hadoop summit on March 25th)

2008-03-13 Thread Jeremy Zawodny
No problem!

Clearly there's demand.  As of a few minutes ago, we're at capacity once
again.  So I hope everyone who wanted in was able to get on the list.

See some of you in just over a week...

Jeremy

On 3/12/08, Marc Boucher [EMAIL PROTECTED] wrote:

 Great news Jeremy, thank you for this.

 Marc Boucher
 Hyperix

 On Wed, Mar 12, 2008 at 11:02 AM, Jeremy Zawodny [EMAIL PROTECTED]
 wrote:

  Good news!
 
  We've located a new venue for the Hadoop Summit (not far from Yahoo), have
  capacity for another 75 people, and are still keeping the event free.
  Thanks to Amazon Web Services for chipping in some food money. :-)
 
  Sign up now and pass the word:
 
  http://upcoming.yahoo.com/event/436226/
  http://developer.yahoo.com/blogs/hadoop/2008/03/hadoop-summit-move.html
 
  We're in the process of updating the summit site and notifying a few other
  groups as well.
  
  Thanks for all the interest.  We're looking forward to a day packed full of
  Hadoop.
 
  Jeremy
 
  On 3/5/08, Ted Dunning [EMAIL PROTECTED] wrote:
  
  
   +1
  
   I have a colleague I would like to bring as well.
  
   Maybe we need to have an unconference in a park next door and take turns
   actually being in the hall for the talks.
  
  
  
   On 3/5/08 1:58 PM, Bruce Williams [EMAIL PROTECTED] wrote:
  
It seems like a bigger room in Sunnyvale could be found, doesn't it?
There are tech presentations of different sizes going on every day in the
San Jose area.
   
I registered, but have people who I work with who would love to attend.
There are many new people coming into Hadoop who would benefit from the
Summit. As it has turned out, with the people presently attending, the
result may somewhat be preaching to the choir and informing the already
well informed, compared to what could happen (to the great benefit of
Hadoop).
   
Anyway, I am looking forward to a great time! :-)
   
Bruce Williams
   
   
Marc Boucher wrote:
I'm on the waiting list as well, and I'll be in the area anyway on a
business trip, so I'm wondering: with so many people wanting to attend,
is there no way to get a bigger venue?
   
Marc Boucher
Hyperix
   
   
On 3/5/08, mickey hsieh [EMAIL PROTECTED] wrote:
   
Hi Jeremy,
   
It is full again. Current size is 140. The demand is really high; I am
desperately looking for an opportunity to attend.
   
Is there any chance to get up a couple more slots?
   
Thanks,
   
Mickey Hsieh
Fox Interactive Media
   
   
   
On 2/28/08, Jeremy Zawodny [EMAIL PROTECTED] wrote:
   
I've bumped up the numbers on Upcoming.org to allow more folks to attend.
The room might be a little crowded, but we'll make it work.

We're also looking at webcasting in addition to posting video after the
summit.
   
   
   
   
   
  
  http://developer.yahoo.com/blogs/hadoop/2008/02/hadoop_summit_nearly_full_well.html
   
http://upcoming.yahoo.com/event/436226/
http://developer.yahoo.com/hadoop/summit/
   
Register soon if you haven't already.
   
Thanks!
   
Jeremy
   
On 2/25/08, chris [EMAIL PROTECTED] wrote:
   
I see the class is full with more than 50 watchers. Any chance the size
will expand? If not, any date in mind for a second one?
   
   
   
   
   
   
  
  
 



Re: copy - sort hanging

2008-03-13 Thread Chris K Wensel


I don't really have these logs any more, as I've bounced my cluster, but I am
willing to ferret out anything in particular on my next failed run.


On Mar 13, 2008, at 4:32 PM, Raghu Angadi wrote:



Yeah, it's kind of hard to deal with these failures once they start
occurring.


Are all these logs from the same datanode? Could you separate the logs
from different datanodes?


If the first exception stack is from replicating a block (as opposed to an
initial write), then http://issues.apache.org/jira/browse/HADOOP-3007 would
help there, i.e. a failure on the next datanode should not affect this
datanode; you still need to check why the remote datanode failed.


Another problem is that once a DataNode fails to write a block, the same
block cannot be written to this node for the next one hour. These are the
"can not be written to" errors you see below. We should really fix this.
I will file a jira.


Raghu.

Chris K Wensel wrote:

here is a reset, followed by three attempts to write the block.
2008-03-13 13:40:06,892 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7813471133156061911 src: /10.251.26.3:35762 dest: /10.251.26.3:50010
2008-03-13 13:40:06,957 INFO org.apache.hadoop.dfs.DataNode: Exception in receiveBlock for block blk_7813471133156061911 java.net.SocketException: Connection reset
2008-03-13 13:40:06,957 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.net.SocketException: Connection reset
2008-03-13 13:40:06,958 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.net.SocketException: Connection reset
   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
   at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
   at java.io.DataOutputStream.flush(DataOutputStream.java:106)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receivePacket(DataNode.java:2194)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:2244)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1150)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
   at java.lang.Thread.run(Thread.java:619)
2008-03-13 13:40:11,751 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7813471133156061911 src: /10.251.27.148:48384 dest: /10.251.27.148:50010
2008-03-13 13:40:11,752 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
2008-03-13 13:40:11,752 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
   at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:638)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:1983)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1074)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
   at java.lang.Thread.run(Thread.java:619)
2008-03-13 13:48:37,925 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7813471133156061911 src: /10.251.70.210:37345 dest: /10.251.70.210:50010
2008-03-13 13:48:37,925 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
2008-03-13 13:48:37,925 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
   at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:638)
   at org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:1983)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1074)
   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
   at java.lang.Thread.run(Thread.java:619)
2008-03-13 14:08:36,089 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_7813471133156061911 src: /10.251.26.223:49176 dest: /10.251.26.223:50010
2008-03-13 14:08:36,089 INFO org.apache.hadoop.dfs.DataNode: writeBlock blk_7813471133156061911 received exception java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
2008-03-13 14:08:36,089 ERROR org.apache.hadoop.dfs.DataNode: 10.251.65.207:50010:DataXceiver: java.io.IOException: Block blk_7813471133156061911 has already been started (though not completed), and thus cannot be created.
   at

if I cannot close the connection to HBase using HTable ...?

2008-03-13 Thread ma qiang
Hi all,
 If I cannot close the connection to HBase using HTable, and the object
 is instead set to null, will the resources of this connection be
 released?

 The code is as below:

 public class MyMap extends MapReduceBase implements Mapper {
     private HTable connection;

     public MyMap() {
         connection = new HTable(new HBaseConfiguration(), new Text("HBaseTest"));
     }

     public void map(...) {
         ...
         connection = null;  // I couldn't use connection.close()
     }
 }
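
As a general Java note (not HBase-specific): setting the reference to null only makes the object eligible for garbage collection; any sockets or threads it holds stay open until the object is actually collected and finalized, which may happen much later or not at all. A minimal sketch with a plain java.net.Socket standing in for the client object (the names and host are illustrative, and nothing here is HBase API):

import java.io.IOException;
import java.net.Socket;

public class NullVsClose {
    public static void main(String[] args) throws IOException {
        // Open a connection-like resource.
        Socket s = new Socket("example.org", 80);

        // Dropping the reference does NOT close the socket; the OS-level
        // connection stays open until the object is garbage collected.
        s = null;

        // The reliable way to release the resource is an explicit close,
        // typically in a finally block.
        Socket s2 = new Socket("example.org", 80);
        try {
            // ... use the connection ...
        } finally {
            s2.close();   // releases the file descriptor immediately
        }
    }
}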