Slides/Videos of Hadoop Summit

2009-06-22 Thread jaideep dhok
Hi all,
Are the slides or videos of the talks given at Hadoop Summit available
online? I checked the Yahoo! website for the summit but could not find
any links.

Regards,
-- 
Jaideep


FYI, Large-scale graph computing at Google

2009-06-22 Thread Edward J. Yoon
http://googleresearch.blogspot.com/2009/06/large-scale-graph-computing-at-google.html
-- It sounds like Pregel is a computing framework based on dynamic
programming for graph operations. I guess maybe they removed the file
communications/intermediate files during the iterations.

Anyway, what do you think?
-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardy...@apache.org
http://blog.udanax.org


Re: Multiple NIC Cards

2009-06-22 Thread JQ Hadoop
The address of the JobTracker (NameNode) is specified using
mapred.job.tracker (fs.default.name) in the configurations. When the
JobTracker (NameNode) starts, it will listen on the address specified by
mapred.job.tracker (fs.default.name); and when a TaskTracker (DataNode)
starts, it will talk to the address specified by mapred.job.tracker
(fs.default.name) through RPC. So there is no confusion (about the
communication between TaskTracker and JobTracker, as well as between
DataNode and NameNode) even for multi-homed nodes, as long as those two
addresses are correctly specified.
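
For illustration only, a hadoop-site.xml shared by all the nodes would carry those
two addresses along these lines (host names and ports below are made up):

<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode-host:9000</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker-host:9001</value>
</property>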

On the other hand, when a TaskTracker (DataNode) starts, it will also listen
on its own service addresses, which are usually specified in the
configurations as 0.0.0.0 (e.g., mapred.task.tracker.http.address and
dfs.datanode.address); that is, it will accept connections from all the
NICs in the node. In addition, the TaskTracker (DataNode) regularly sends
status messages to the JobTracker (NameNode), which contain its hostname.
Consequently, when a Map or Reduce task obtains the addresses of the
TaskTrackers (DataNodes) from the JobTracker (NameNode), e.g., for copying
the Map output or reading an HDFS block, it will get the hostnames specified
in the status messages and talk to the TaskTrackers (DataNodes) using those
hostnames.

The hostname specified in the status messages is determined roughly as shown
below (as of Hadoop 0.19.1), which can be a little tricky for multi-homed
nodes.

String hostname = conf.get("slave.host.name");
if (hostname == null) {
  String strInterface =
      conf.get("mapred.tasktracker.dns.interface", "default");
  String nameserver =
      conf.get("mapred.tasktracker.dns.nameserver", "default");
  if (strInterface.equals("default"))
    hostname = InetAddress.getLocalHost().getCanonicalHostName();
  else {
    String[] ips = getIPs(strInterface);
    Vector<String> hosts = new Vector<String>();
    for (int i = 0; i < ips.length; i++) {
      hosts.add(reverseDns(InetAddress.getByName(ips[i]), nameserver));
    }
    if (hosts.size() == 0)
      hostname = InetAddress.getLocalHost().getCanonicalHostName();
    else
      hostname = hosts.get(0);
  }
}

I think the easiest way for multiple NICs is probably to start each
TaskTracker (DataNode) by specifying an appropriate slave.host.name at its
command line, which can be done in bin/slaves.sh.
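
Another option, assuming you keep a per-node hadoop-site.xml rather than passing
the value on the command line, is an entry like the following on each slave (the
host name is just an illustration):

<property>
  <name>slave.host.name</name>
  <value>node3.private.example.com</value>
</property>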



On Thu, Jun 11, 2009 at 11:35 AM, John Martyniak 
j...@beforedawnsolutions.com wrote:

 So it turns out the reason that I was getting the duey.local. was because
 that is what was in the reverse DNS on the nameserver from a previous test.
  So that is fixed, and now the machine says duey.local.xxx.com.

 The only remaining issue is the trailing . (Period) that is required by
 DNS to make the name fully qualified.

 So I'm not sure if this is a bug in how Hadoop uses this information or some
 other issue.

 If anybody has run across this issue before any help would be greatly
 appreciated.

 Thank you,

 -John

 On Jun 10, 2009, at 9:21 PM, Matt Massie wrote:

  If you look at the documentation for the getCanonicalHostName() function
 (thanks, Steve)...


 http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#getCanonicalHostName()

 you'll see two Java security properties (networkaddress.cache.ttl and
 networkaddress.cache.negative.ttl).

 You might take a look at your /etc/nsswitch.conf configuration as well to
 learn how hosts are resolved on your machine, e.g...

 $ grep hosts /etc/nsswitch.conf
 hosts:  files dns

 and lastly, you may want to check if you are running nscd (the NameService
 cache daemon).  If you are, take a look at /etc/nscd.conf for the caching
 policy it's using.
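
 If the JVM-level cache turns out to be the culprit, those two security properties
 can also be set programmatically before the first lookup; a minimal sketch (the
 TTL values here are arbitrary examples):

 // Shorten the JVM's positive and negative DNS caches (values are in seconds).
 // This must run before the first host name lookup in the process.
 java.security.Security.setProperty("networkaddress.cache.ttl", "60");
 java.security.Security.setProperty("networkaddress.cache.negative.ttl", "10");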

 Good luck.

 -Matt



 On Jun 10, 2009, at 1:09 PM, John Martyniak wrote:

  That is what I thought also, is that it needs to keep that information
 somewhere, because it needs to be able to communicate with all of the
 servers.

 So I deleted the /tmp/had* and /tmp/hs* directories, removed the log
 files, and grepped for the duey name in all files in config.  And the
 problem still exists.  Originally I thought that it might have had something
 to do with multiple entries in the .ssh/authorized_keys file but removed
 everything there.  And the problem still existed.

 So I think that I am going to grab a new install of hadoop 0.19.1, delete
 the existing one and start out fresh to see if that changes anything.

 Wish me luck:)

 -John

 On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:

  John Martyniak wrote:

 Does hadoop cache the server names anywhere?  Because I changed to
 using DNS for name resolution, but when I go to the nodes view, it is 
 trying
 to view with the old name.  And I changed the hadoop-site.xml file so that
 it no longer has any of those values.


 in SVN head, we try and get Java to tell us what is going on

 

Can we submit a mapreduce job from another mapreduce job?

2009-06-22 Thread Ramakishore Yelamanchilli
Is there any way we can submit a mapreduce job from another map job? The 
requirement is:

I have customers with start date and end date as follows:

CustomerID  Start Date  End Date
Xxx         mm/dd/yy    mm/dd/yy
YYY         mm/dd/yy    mm/dd/yy
ZZZ
ABC
XYZ         mm/dd/yy    mm/dd/yy

I have to read the above file, check the customers whose start date is null, and
run another map-reduce job for the past 30 days (this means I have to run that
map-reduce job 30 times for each customer whose start date is null). I'm just
wondering how I can do this?

Regards

Ram




Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Raghu Angadi


Is this before 0.20.0? Assuming you have closed these streams, it is 
mostly https://issues.apache.org/jira/browse/HADOOP-4346


It is the JDK internal implementation that depends on GC to free up its 
cache of selectors. HADOOP-4346 avoids this by using hadoop's own cache.


Raghu.

Stas Oskin wrote:

Hi.

After tracing some more with the lsof utility, I managed to stop the
growth on the DataNode process, but I still have issues with my DFS client.

It seems that my DFS client opens hundreds of pipes and eventpolls. Here is
a small part of the lsof output:

java10508 root  387w  FIFO0,6   6142565 pipe
java10508 root  388r  FIFO0,6   6142565 pipe
java10508 root  389u     0,100  6142566
eventpoll
java10508 root  390u  FIFO0,6   6135311 pipe
java10508 root  391r  FIFO0,6   6135311 pipe
java10508 root  392u     0,100  6135312
eventpoll
java10508 root  393r  FIFO0,6   6148234 pipe
java10508 root  394w  FIFO0,6   6142570 pipe
java10508 root  395r  FIFO0,6   6135857 pipe
java10508 root  396r  FIFO0,6   6142570 pipe
java10508 root  397r     0,100  6142571
eventpoll
java10508 root  398u  FIFO0,6   6135319 pipe
java10508 root  399w  FIFO0,6   6135319 pipe

I'm using FSDataInputStream and FSDataOutputStream, so this might be related
to pipes?

So, my questions are:

1) What causes these pipes/epolls to appear?

2) More importantly, how can I prevent their accumulation and growth?

Thanks in advance!

2009/6/21 Stas Oskin stas.os...@gmail.com


Hi.

I have HDFS client and HDFS datanode running on same machine.

When I try to access a dozen files at once from the client, several
times in a row, I start to receive the following errors on the client and in the
HDFS browse function.

HDFS Client: Could not get block locations. Aborting...
HDFS browse: Too many open files

I can increase the maximum number of files that can be opened, as I have it
set to the default 1024, but I would like to first solve the problem, as a
larger value just means it would run out of files again later on.

So my questions are:

1) Does the HDFS datanode keep any files open, even after the HDFS
client has already closed them?

2) Is it possible to find out who keeps the files open - datanode or
client (so I could pinpoint the source of the problem)?

Thanks in advance!







Measuring runtime of Map-reduce Jobs

2009-06-22 Thread bharath vissapragada
Hi ,

Are there any tools which can measure the run-time of map-reduce jobs?
Any help is appreciated.

Thanks in advance


Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Stas Oskin
Hi.

I've started doing just that, and indeed the number of fd's of the DataNode
process has been reduced significantly.

My problem is that my own app, which works with DFS, still has dozens of
pipes and epolls open.

The usual level seems to be about 300-400 fd's, but when I access the DFS
to read several files concurrently, this number easily climbs to
700-800. Moreover, the number sometimes seems to get stuck above 1000, and
only restarting the app brings this number back to
300-400.

Any idea why this happens, and what else can be released to get it working?

Also, every file I open seems to bump the fd count by as much as
12. Any idea why a single file requires so many fd's?

Thanks in advance.

2009/6/22 jason hadoop jason.had...@gmail.com

 Yes.
 Otherwise the file descriptors will flow away like water.
 I also strongly suggest having at least 64k file descriptors as the open
 file limit.

 On Sun, Jun 21, 2009 at 12:43 PM, Stas Oskin stas.os...@gmail.com wrote:

  Hi.
 
  Thanks for the advice. So you advice explicitly closing each and every
 file
  handle that I receive from HDFS?
 
  Regards.
 
  2009/6/21 jason hadoop jason.had...@gmail.com
 
   Just to be clear, I second Brian's opinion. Relying on finalizes is a
  very
   good way to run out of file descriptors.
  
   On Sun, Jun 21, 2009 at 9:32 AM, brian.lev...@nokia.com wrote:
  
IMHO, you should never rely on finalizers to release scarce resources
   since
you don't know when the finalizer will get called, if ever.
   
-brian
   
   
   
-Original Message-
From: ext jason hadoop [mailto:jason.had...@gmail.com]
Sent: Sunday, June 21, 2009 11:19 AM
To: core-user@hadoop.apache.org
Subject: Re: Too many open files error, which gets resolved after
  some
time
   
HDFS/DFS client uses quite a few file descriptors for each open file.
   
Many application developers (but not the hadoop core) rely on the JVM
finalizer methods to close open files.
   
This combination, especially when many HDFS files are open, can result
  in
very large demands for file descriptors for Hadoop clients.
We as a general rule never run a cluster with nofile less than 64k,
 and
   for
larger clusters with demanding applications have had it set 10x
 higher.
  I
also believe there was a set of JVM versions that leaked file
  descriptors
used for NIO in the HDFS core. I do not recall the exact details.
   
On Sun, Jun 21, 2009 at 5:27 AM, Stas Oskin stas.os...@gmail.com
   wrote:
   
 Hi.

 After tracing some more with the lsof utility, and I managed to
 stop
   the
 growth on the DataNode process, but still have issues with my DFS
   client.

 It seems that my DFS client opens hundreds of pipes and eventpolls.
   Here
is
 a small part of the lsof output:

 java10508 root  387w  FIFO0,6   6142565
   pipe
 java10508 root  388r  FIFO0,6   6142565
   pipe
 java10508 root  389u     0,100  6142566
 eventpoll
 java10508 root  390u  FIFO0,6   6135311
   pipe
 java10508 root  391r  FIFO0,6   6135311
   pipe
 java10508 root  392u     0,100  6135312
 eventpoll
 java10508 root  393r  FIFO0,6   6148234
   pipe
 java10508 root  394w  FIFO0,6   6142570
   pipe
 java10508 root  395r  FIFO0,6   6135857
   pipe
 java10508 root  396r  FIFO0,6   6142570
   pipe
 java10508 root  397r     0,100  6142571
 eventpoll
 java10508 root  398u  FIFO0,6   6135319
   pipe
 java10508 root  399w  FIFO0,6   6135319
   pipe

 I'm using FSDataInputStream and FSDataOutputStream, so this might
 be
 related
 to pipes?

 So, my questions are:

 1) What happens these pipes/epolls to appear?

 2) More important, how I can prevent their accumation and growth?

 Thanks in advance!

 2009/6/21 Stas Oskin stas.os...@gmail.com

  Hi.
 
  I have HDFS client and HDFS datanode running on same machine.
 
  When I'm trying to access a dozen of files at once from the
 client,
 several
  times in a row, I'm starting to receive the following errors on
   client,
 and
  HDFS browse function.
 
  HDFS Client: Could not get block locations. Aborting...
  HDFS browse: Too many open files
 
  I can increase the maximum number of files that can opened, as I
  have
it
  set to the default 1024, but would like to first solve the
 problem,
   as
  larger value just means it would run out of files again later on.
 
  So my questions are:

Re: Name Node HA (HADOOP-4539)

2009-06-22 Thread Steve Loughran

Andrew Wharton wrote:

https://issues.apache.org/jira/browse/HADOOP-4539

I am curious about the state of this fix. It is listed as
Incompatible, but is resolved and committed (according to the
comments). Is the backup name node going to make it into 0.21? Will it
remove the SPOF for HDFS? And if so, what is the proposed release
timeline for 0.21?




The way to deal with HA -which the BackupNode doesn't promise- is to get 
involved in developing and testing the leading edge source tree.


The 0.21 cutoff is approaching, BackupNode is in there, but it needs a 
lot more tests. If you want to aid the development, helping to get more 
automated BackupNode tests in there (indeed, tests that simulate more 
complex NN failures, like a corrupt EditLog) would go a long way.


-steve


Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Steve Loughran

jason hadoop wrote:

Yes.
Otherwise the file descriptors will flow away like water.
I also strongly suggest having at least 64k file descriptors as the open
file limit.

On Sun, Jun 21, 2009 at 12:43 PM, Stas Oskin stas.os...@gmail.com wrote:


Hi.

Thanks for the advice. So you advice explicitly closing each and every file
handle that I receive from HDFS?

Regards.


I must disagree somewhat

If you use FileSystem.get() to get your client filesystem class, then 
that is shared by all threads/classes that use it. Call close() on that 
and any other thread or class holding a reference is in trouble. You 
have to wait for the finalizers for them to get cleaned up.


If you use FileSystem.newInstance() - which came in fairly recently 
(0.20? 0.21?) then you can call close() safely.


So: it depends on how you get your handle.
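
A rough sketch of the difference, assuming a release that already has
FileSystem.newInstance() (classes are org.apache.hadoop.fs.*, conf is an existing
Configuration, and the path is made up):

// Shared, cached instance: do NOT close it, other threads/classes may hold it too.
FileSystem shared = FileSystem.get(conf);

// Private instance: safe to close once you are done with it.
FileSystem own = FileSystem.newInstance(FileSystem.getDefaultUri(conf), conf);
try {
  FSDataInputStream in = own.open(new Path("/some/file"));
  try {
    // ... read from the stream ...
  } finally {
    in.close();   // always close streams explicitly
  }
} finally {
  own.close();    // releases only this instance's sockets and descriptors
}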

see: https://issues.apache.org/jira/browse/HADOOP-5933

Also: the too many open files problem can be caused in the NN - you need
to set up the kernel to have lots more file handles around. Lots.




Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Steve Loughran

Scott Carey wrote:

Furthermore, if for some reason it is required to dispose of any objects after 
others are GC'd, weak references and a weak reference queue will perform 
significantly better in throughput and latency - orders of magnitude better - 
than finalizers.




Good point.

It would make sense for the FileSystem cache to be weakly referenced, so
that in long-lived processes the client references will get cleaned up
without waiting for app termination.
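
For reference, the pattern Scott describes looks roughly like this; a generic
sketch, not Hadoop code, with made-up class names. A background thread runs
Cleaner, each tracked object is registered by creating a ResourceRef against the
cleaner's queue, and a real implementation would also keep those ResourceRef
objects strongly reachable (e.g. in a Set) until they are processed.

import java.io.Closeable;
import java.io.IOException;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// The resource to release is kept in the Reference subclass itself, because the
// referent is already unreachable by the time the reference is enqueued.
class ResourceRef extends WeakReference<Object> {
  final Closeable resource;
  ResourceRef(Object referent, Closeable resource, ReferenceQueue<Object> queue) {
    super(referent, queue);
    this.resource = resource;
  }
}

class Cleaner implements Runnable {
  final ReferenceQueue<Object> queue = new ReferenceQueue<Object>();

  public void run() {
    try {
      while (true) {
        Reference<?> ref = queue.remove();   // blocks until the GC enqueues one
        try {
          ((ResourceRef) ref).resource.close();
        } catch (IOException ignored) {
          // best effort; nothing sensible to do here
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();    // shut the cleaner down
    }
  }
}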


Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Steve Loughran

Raghu Angadi wrote:


Is this before 0.20.0? Assuming you have closed these streams, it is 
mostly https://issues.apache.org/jira/browse/HADOOP-4346


It is the JDK internal implementation that depends on GC to free up its 
cache of selectors. HADOOP-4346 avoids this by using hadoop's own cache.


yes, and it's that change that led to my stack traces :(

http://jira.smartfrog.org/jira/browse/SFOS-1208


Specifying which systems to be used as DataNode

2009-06-22 Thread Santosh Bs1

Hi,

I am very new to Hadoop and I have a few basic questions...

How and where do I need to specify which systems in the given cluster are to
be used as DataNodes?

Can I change this set dynamically?


Thanks and Regards
Santosh.B.S
India Systems & Technology Lab
Bangalore 560045



java.io.IOException: Error opening job jar

2009-06-22 Thread Shravan Mahankali
Hi Group,

 

I was having trouble getting through an example Hadoop program. I have
searched the mailing list but could not find anything useful. Below is the
issue:

 

1) Executed below command to submit a job to Hadoop:

 /hadoop-0.18.3/bin/hadoop jar -libjars AggregateWordCount.jar
org.apache.hadoop.examples.AggregateWordCount words/*
aggregatewordcount_output 2 textinputformat

 

2) Following is the error:

java.io.IOException: Error opening job jar:
org.apache.hadoop.examples.AggregateWordCount

at org.apache.hadoop.util.RunJar.main(RunJar.java:90)

at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

Caused by: java.util.zip.ZipException: error in opening zip file

at java.util.zip.ZipFile.open(Native Method)

at java.util.zip.ZipFile.<init>(ZipFile.java:203)

at java.util.jar.JarFile.<init>(JarFile.java:132)

at java.util.jar.JarFile.<init>(JarFile.java:70)

at org.apache.hadoop.util.RunJar.main(RunJar.java:88)

... 4 more

 

Please advise.

 

Thank You,

Shravan Kumar. M 

Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

-

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator -  mailto:netopshelpd...@catalytic.com
netopshelpd...@catalytic.com

 



Re: java.io.IOException: Error opening job jar

2009-06-22 Thread Harish Mallipeddi
It cannot find your job jar file. Make sure you run this command from the
directory that has the AggregateWordCount.jar (and you can lose the -libjars
flag too - you need that only if you need to specify extra jar dependencies
apart from your job jar file).
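
In other words, assuming the jar really is in the current directory, the
invocation would presumably be just:

/hadoop-0.18.3/bin/hadoop jar AggregateWordCount.jar org.apache.hadoop.examples.AggregateWordCount words/* aggregatewordcount_output 2 textinputformat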

- Harish

On Mon, Jun 22, 2009 at 3:45 PM, Shravan Mahankali 
shravan.mahank...@catalytic.com wrote:

 Hi Group,



 I was having trouble getting through an example Hadoop program. I have
 searched the mailing list but could not find any thing useful. Below is the
 issue:



 1) Executed below command to submit a job to Hadoop:

  /hadoop-0.18.3/bin/hadoop jar -libjars AggregateWordCount.jar
 org.apache.hadoop.examples.AggregateWordCount words/*
 aggregatewordcount_output 2 textinputformat



 2) Following is the error:

 java.io.IOException: Error opening job jar:
 org.apache.hadoop.examples.AggregateWordCount

at org.apache.hadoop.util.RunJar.main(RunJar.java:90)

at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

 Caused by: java.util.zip.ZipException: error in opening zip file

at java.util.zip.ZipFile.open(Native Method)

at java.util.zip.ZipFile.<init>(ZipFile.java:203)

at java.util.jar.JarFile.<init>(JarFile.java:132)

at java.util.jar.JarFile.<init>(JarFile.java:70)

at org.apache.hadoop.util.RunJar.main(RunJar.java:88)

... 4 more



 Please advice.



 Thank You,

 Shravan Kumar. M

 Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

 -

 This email and any files transmitted with it are confidential and intended
 solely for the use of the individual or entity to whom they are addressed.
 If you have received this email in error please notify the system
 administrator -  mailto:netopshelpd...@catalytic.com
 netopshelpd...@catalytic.com






-- 
Harish Mallipeddi
http://blog.poundbang.in


Re: problem about put a lot of files

2009-06-22 Thread stchu
Hi,

Thanks for your quick responses.
I tried to relax this limit to 204800, but it still does not work.
Is this possibly caused by fs objects?

Anyway, thanks a lot!



2009/6/22 zhuweimin xim-...@tsm.kddilabs.jp

 Hi

 The max number of open files has a limit on a Linux box. Please use ulimit to view and
 modify the limit:
 1.view limit
   # ulimit -a
 2.modify limit
   For example
   # ulimit -n 10240

 Best wish

  -Original Message-
  From: stchu [mailto:stchu.cl...@gmail.com]
  Sent: Monday, June 22, 2009 12:57 PM
  To: core-user@hadoop.apache.org
  Subject: problem about put a lot of files
 
  Hi,
  Is there any restriction on the number of files that can be put? I tried to
  put/copyFromLocal about 50,573 files to HDFS, but I faced a problem:
  ==
  ==
  09/06/22 11:34:34 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
 140.96.89.57:51010
  09/06/22 11:34:34 INFO dfs.DFSClient: Abandoning block
  blk_8245450203753506945_65955
  09/06/22 11:34:40 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
 140.96.89.57:51010
  09/06/22 11:34:40 INFO dfs.DFSClient: Abandoning block
  blk_-8257846965500649510_65956
  09/06/22 11:34:46 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
 140.96.89.57:51010
  09/06/22 11:34:46 INFO dfs.DFSClient: Abandoning block
  blk_4751737303082929912_65956
  09/06/22 11:34:56 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
 140.96.89.57:51010
  09/06/22 11:34:56 INFO dfs.DFSClient: Abandoning block
  blk_5912850890372596972_66040
  09/06/22 11:35:02 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
  140.96.89.193:51010
  09/06/22 11:35:02 INFO dfs.DFSClient: Abandoning block
  blk_6609198685444611538_66040
  09/06/22 11:35:08 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
  140.96.89.193:51010
  09/06/22 11:35:08 INFO dfs.DFSClient: Abandoning block
  blk_6696101244177965180_66040
  09/06/22 11:35:17 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
 140.96.89.57:51010
  09/06/22 11:35:17 INFO dfs.DFSClient: Abandoning block
  blk_-5430033105510098342_66105
  09/06/22 11:35:26 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
 140.96.89.57:51010
  09/06/22 11:35:26 INFO dfs.DFSClient: Abandoning block
  blk_5325140471333041601_66165
  09/06/22 11:35:32 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
  140.96.89.205:51010
  09/06/22 11:35:32 INFO dfs.DFSClient: Abandoning block
  blk_1121864992752821949_66165
  09/06/22 11:35:39 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
  140.96.89.205:51010
  09/06/22 11:35:39 INFO dfs.DFSClient: Abandoning block
  blk_-2096783021040778965_66184
  09/06/22 11:35:45 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
  140.96.89.205:51010
  09/06/22 11:35:45 INFO dfs.DFSClient: Abandoning block
  blk_6949821898790162970_66184
  09/06/22 11:35:51 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
  140.96.89.205:51010
  09/06/22 11:35:51 INFO dfs.DFSClient: Abandoning block
  blk_4708848202696905125_66184
  09/06/22 11:35:57 INFO dfs.DFSClient: Exception in
  createBlockOutputStream
  java.io.IOException: Bad connect ack with firstBadLink
  140.96.89.205:51010
  09/06/22 11:35:57 INFO dfs.DFSClient: Abandoning block
  blk_8031882012801762201_66184
  09/06/22 11:36:03 WARN dfs.DFSClient: DataStreamer Exception:
  java.io.IOException: Unable to create new block.
  at
  org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(
  DFSClient.java:2359)
  at
  org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.
  java:1745)
  at
  org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSCl
  ient.java:1922)
 
  09/06/22 11:36:03 WARN dfs.DFSClient: Error Recovery for block
  blk_8031882012801762201_66184 bad datanode[2]
  put: Could not get block locations. Aborting...
  Exception closing file /osmFiles/a/109103.gpx.txt
  java.io.IOException: Could not get block locations. Aborting...
  at
  org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(D
  FSClient.java:2153)
  at
  org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.
  java:1745)
  at
  

RE: java.io.IOException: Error opening job jar

2009-06-22 Thread Shravan Mahankali
Thanks for your reply Harish.

 

I am running this example from within the directory containing the
AggregateWordCount.jar file. But even then, I have this issue. Earlier I had an
issue of java.lang.ClassNotFoundException:
org.apache.hadoop.examples.AggregateWordCount$WordCountPlugInClass, so in
some thread someone suggested using -libjars; I tried that, but it
was no help either!!!

 

I did not think it is SUCH A HARD JOB (almost 20+ hours with no success) to
just run an example provided by Hadoop in its distribution!!!

 

Thank You,

Shravan Kumar. M 

Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

-

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator -  mailto:netopshelpd...@catalytic.com
netopshelpd...@catalytic.com

  _  

From: Harish Mallipeddi [mailto:harish.mallipe...@gmail.com] 
Sent: Monday, June 22, 2009 4:03 PM
To: core-user@hadoop.apache.org; shravan.mahank...@catalytic.com
Subject: Re: java.io.IOException: Error opening job jar

 

It cannot find your job jar file. Make sure you run this command from the
directory that has the AggregateWordCount.jar (and you can lose the -libjars
flag too - you need that only if you need to specify extra jar dependencies
apart from your job jar file).

- Harish

On Mon, Jun 22, 2009 at 3:45 PM, Shravan Mahankali
shravan.mahank...@catalytic.com wrote:

Hi Group,



I was having trouble getting through an example Hadoop program. I have
searched the mailing list but could not find any thing useful. Below is the
issue:



1) Executed below command to submit a job to Hadoop:

 /hadoop-0.18.3/bin/hadoop jar -libjars AggregateWordCount.jar
org.apache.hadoop.examples.AggregateWordCount words/*
aggregatewordcount_output 2 textinputformat



2) Following is the error:

java.io.IOException: Error opening job jar:
org.apache.hadoop.examples.AggregateWordCount

   at org.apache.hadoop.util.RunJar.main(RunJar.java:90)

   at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)

   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

   at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

Caused by: java.util.zip.ZipException: error in opening zip file

   at java.util.zip.ZipFile.open(Native Method)

   at java.util.zip.ZipFile.<init>(ZipFile.java:203)

   at java.util.jar.JarFile.<init>(JarFile.java:132)

   at java.util.jar.JarFile.<init>(JarFile.java:70)

   at org.apache.hadoop.util.RunJar.main(RunJar.java:88)

   ... 4 more



Please advice.



Thank You,

Shravan Kumar. M

Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

-

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator -  mailto:netopshelpd...@catalytic.com
netopshelpd...@catalytic.com







-- 
Harish Mallipeddi
http://blog.poundbang.in



RE: java.io.IOException: Error opening job jar

2009-06-22 Thread Ramakishore Yelamanchilli
Can you attach the jar file you have?

-Ram

-Original Message-
From: Shravan Mahankali [mailto:shravan.mahank...@catalytic.com] 
Sent: Monday, June 22, 2009 3:49 AM
To: 'Harish Mallipeddi'; core-user@hadoop.apache.org
Subject: RE: java.io.IOException: Error opening job jar

Thanks for your reply Harish.

 

Am running this example from with in the directory containing
AggregateWordCount.jar file. But even then, I have this issue. Earlier I had
issue of java.lang.ClassNotFoundException:
org.apache.hadoop.examples.AggregateWordCount$WordCountPlugInClass, so in
some thread some one have suggested using -libjars, so I tried, but there
was not great!!!

 

I did not think it is SUCH A HARD JOB (almost 20+ hours with no success) to
just run an example provided by Hadoop in its distribution!!!

 

Thank You,

Shravan Kumar. M 

Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

-

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator -  mailto:netopshelpd...@catalytic.com
netopshelpd...@catalytic.com

  _  

From: Harish Mallipeddi [mailto:harish.mallipe...@gmail.com] 
Sent: Monday, June 22, 2009 4:03 PM
To: core-user@hadoop.apache.org; shravan.mahank...@catalytic.com
Subject: Re: java.io.IOException: Error opening job jar

 

It cannot find your job jar file. Make sure you run this command from the
directory that has the AggregateWordCount.jar (and you can lose the -libjars
flag too - you need that only if you need to specify extra jar dependencies
apart from your job jar file).

- Harish

On Mon, Jun 22, 2009 at 3:45 PM, Shravan Mahankali
shravan.mahank...@catalytic.com wrote:

Hi Group,



I was having trouble getting through an example Hadoop program. I have
searched the mailing list but could not find any thing useful. Below is the
issue:



1) Executed below command to submit a job to Hadoop:

 /hadoop-0.18.3/bin/hadoop jar -libjars AggregateWordCount.jar
org.apache.hadoop.examples.AggregateWordCount words/*
aggregatewordcount_output 2 textinputformat



2) Following is the error:

java.io.IOException: Error opening job jar:
org.apache.hadoop.examples.AggregateWordCount

   at org.apache.hadoop.util.RunJar.main(RunJar.java:90)

   at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)

   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

   at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

Caused by: java.util.zip.ZipException: error in opening zip file

   at java.util.zip.ZipFile.open(Native Method)

   at java.util.zip.ZipFile.<init>(ZipFile.java:203)

   at java.util.jar.JarFile.<init>(JarFile.java:132)

   at java.util.jar.JarFile.<init>(JarFile.java:70)

   at org.apache.hadoop.util.RunJar.main(RunJar.java:88)

   ... 4 more



Please advice.



Thank You,

Shravan Kumar. M

Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

-

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator -  mailto:netopshelpd...@catalytic.com
netopshelpd...@catalytic.com







-- 
Harish Mallipeddi
http://blog.poundbang.in



Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Stas Oskin
Hi.

So what would be the recommended approach for the pre-0.20.x series?

To ensure each file is used by only one thread, and then it is safe to close
the handle in that thread?

Regards.

2009/6/22 Steve Loughran ste...@apache.org

 Raghu Angadi wrote:


 Is this before 0.20.0? Assuming you have closed these streams, it is
 mostly https://issues.apache.org/jira/browse/HADOOP-4346

 It is the JDK internal implementation that depends on GC to free up its
 cache of selectors. HADOOP-4346 avoids this by using hadoop's own cache.


 yes, and it's that change that led to my stack traces :(

 http://jira.smartfrog.org/jira/browse/SFOS-1208



RE: Name Node HA (HADOOP-4539)

2009-06-22 Thread Brian.Levine
If the BackupNode doesn't promise HA, then how would additional testing on this 
feature aid in the HA story?  Maybe you could expand on the purpose of 
HADOOP-4539 because now I'm confused.

How does the approaching 0.21 cutoff translate into a release date for 0.21?

-brian
 

-Original Message-
From: ext Steve Loughran [mailto:ste...@apache.org] 
Sent: Monday, June 22, 2009 5:36 AM
To: core-user@hadoop.apache.org
Subject: Re: Name Node HA (HADOOP-4539)

Andrew Wharton wrote:
 https://issues.apache.org/jira/browse/HADOOP-4539
 
 I am curious about the state of this fix. It is listed as
 Incompatible, but is resolved and committed (according to the
 comments). Is the backup name node going to make it into 0.21? Will it
 remove the SPOF for HDFS? And if so, what is the proposed release
 timeline for 0.21?
 
 

The way to deal with HA -which the BackupNode doesn't promise- is to get 
involved in developing and testing the leading edge source tree.

The 0.21 cutoff is approaching, BackupNode is in there, but it needs a 
lot more tests. If you want to aid the development, helping to get more 
automated BackupNode tests in there (indeed, tests that simulate more 
complex NN failures, like a corrupt EditLog) would go a long way.

-steve


Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Raghu Angadi


64k might help in the sense that you might hit GC before you hit the limit.

Otherwise, your only options are to use the patch attached to 
HADOOP-4346 or run System.gc() occasionally.
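
If neither option is attractive, the System.gc() workaround can be as blunt as a
background timer; a hedged sketch only (the 60-second interval is arbitrary, and
the HADOOP-4346 patch remains the real fix):

// Workaround: periodically force GC so the JDK's cached selectors/pipes get released.
java.util.Timer gcTimer = new java.util.Timer("periodic-gc", true);  // daemon thread
gcTimer.schedule(new java.util.TimerTask() {
  public void run() {
    System.gc();
  }
}, 60 * 1000L, 60 * 1000L);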


I think it should be committed to 0.18.4

Raghu.

Stas Oskin wrote:

Hi.

Yes, it happens with 0.18.3.

I'm closing now every FSData stream I receive from HDFS, so the number of
open fd's in DataNode is reduced.

The problem is that my own DFS client still has a high number of fd's open,
mostly pipes and epolls.
They sometimes quickly drop to the level of ~400-500, and sometimes just
get stuck at ~1000.

I'm still trying to find out how well it behaves if I set the maximum fd
number to 65K.

Regards.



2009/6/22 Raghu Angadi rang...@yahoo-inc.com


Is this before 0.20.0? Assuming you have closed these streams, it is mostly
https://issues.apache.org/jira/browse/HADOOP-4346

It is the JDK internal implementation that depends on GC to free up its
cache of selectors. HADOOP-4346 avoids this by using hadoop's own cache.

Raghu.


Stas Oskin wrote:


Hi.

After tracing some more with the lsof utility, and I managed to stop the
growth on the DataNode process, but still have issues with my DFS client.

It seems that my DFS client opens hundreds of pipes and eventpolls. Here
is
a small part of the lsof output:

java10508 root  387w  FIFO0,6   6142565 pipe
java10508 root  388r  FIFO0,6   6142565 pipe
java10508 root  389u     0,100  6142566
eventpoll
java10508 root  390u  FIFO0,6   6135311 pipe
java10508 root  391r  FIFO0,6   6135311 pipe
java10508 root  392u     0,100  6135312
eventpoll
java10508 root  393r  FIFO0,6   6148234 pipe
java10508 root  394w  FIFO0,6   6142570 pipe
java10508 root  395r  FIFO0,6   6135857 pipe
java10508 root  396r  FIFO0,6   6142570 pipe
java10508 root  397r     0,100  6142571
eventpoll
java10508 root  398u  FIFO0,6   6135319 pipe
java10508 root  399w  FIFO0,6   6135319 pipe

I'm using FSDataInputStream and FSDataOutputStream, so this might be
related
to pipes?

So, my questions are:

1) What happens these pipes/epolls to appear?

2) More important, how I can prevent their accumation and growth?

Thanks in advance!

2009/6/21 Stas Oskin stas.os...@gmail.com

 Hi.

I have HDFS client and HDFS datanode running on same machine.

When I'm trying to access a dozen of files at once from the client,
several
times in a row, I'm starting to receive the following errors on client,
and
HDFS browse function.

HDFS Client: Could not get block locations. Aborting...
HDFS browse: Too many open files

I can increase the maximum number of files that can opened, as I have it
set to the default 1024, but would like to first solve the problem, as
larger value just means it would run out of files again later on.

So my questions are:

1) Does the HDFS datanode keeps any files opened, even after the HDFS
client have already closed them?

2) Is it possible to find out, who keeps the opened files - datanode or
client (so I could pin-point the source of the problem).

Thanks in advance!








HDFS out of space

2009-06-22 Thread Kris Jirapinyo
Hi all,
How does one handle a mount running out of space for HDFS?  We have two
disks mounted on /mnt and /mnt2 respectively on one of the machines that are
used for HDFS, and /mnt is at 99% while /mnt2 is at 30%.  Is there a way to
tell the machine to balance itself out?  I know for the cluster, you can
balance it using start-balancer.sh but I don't think that it will tell the
individual machine to balance itself out.  Our hack right now would be
just to delete the data on /mnt, since we have replication of 3x, we should
be OK.  But I'd prefer not to do that.  Any thoughts?


RE: java.io.IOException: Error opening job jar

2009-06-22 Thread Ramakishore Yelamanchilli
There's no file attached Shravan.

Regards

Ram

-Original Message-
From: Shravan Mahankali [mailto:shravan.mahank...@catalytic.com] 
Sent: Monday, June 22, 2009 4:43 AM
To: core-user@hadoop.apache.org; 'Harish Mallipeddi'
Subject: RE: java.io.IOException: Error opening job jar


Hi Harish,

PFA the AggregateWordCount.jar file.

I was able to open this jar file using a sample Java program written with the
JarFile API, and this is the same API used inside the
org.apache.hadoop.util.RunJar class, where my jar is rejected and the error is
thrown!!!

Also find attached the sample java file - RunJar1.java

Thank You,
Shravan Kumar. M 
Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
-
This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator - netopshelpd...@catalytic.com
-Original Message-
From: Ramakishore Yelamanchilli [mailto:kyela...@cisco.com] 
Sent: Monday, June 22, 2009 5:04 PM
To: core-user@hadoop.apache.org; shravan.mahank...@catalytic.com; 'Harish
Mallipeddi'
Subject: RE: java.io.IOException: Error opening job jar

Can you attach the jar file you have?

-Ram

-Original Message-
From: Shravan Mahankali [mailto:shravan.mahank...@catalytic.com] 
Sent: Monday, June 22, 2009 3:49 AM
To: 'Harish Mallipeddi'; core-user@hadoop.apache.org
Subject: RE: java.io.IOException: Error opening job jar

Thanks for your reply Harish.

 

Am running this example from with in the directory containing
AggregateWordCount.jar file. But even then, I have this issue. Earlier I had
issue of java.lang.ClassNotFoundException:
org.apache.hadoop.examples.AggregateWordCount$WordCountPlugInClass, so in
some thread some one have suggested using -libjars, so I tried, but there
was not great!!!

 

I did not think it is SUCH A HARD JOB (almost 20+ hours with no success) to
just run an example provided by Hadoop in its distribution!!!

 

Thank You,

Shravan Kumar. M 

Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

-

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator -  mailto:netopshelpd...@catalytic.com
netopshelpd...@catalytic.com

  _  

From: Harish Mallipeddi [mailto:harish.mallipe...@gmail.com] 
Sent: Monday, June 22, 2009 4:03 PM
To: core-user@hadoop.apache.org; shravan.mahank...@catalytic.com
Subject: Re: java.io.IOException: Error opening job jar

 

It cannot find your job jar file. Make sure you run this command from the
directory that has the AggregateWordCount.jar (and you can lose the -libjars
flag too - you need that only if you need to specify extra jar dependencies
apart from your job jar file).

- Harish

On Mon, Jun 22, 2009 at 3:45 PM, Shravan Mahankali
shravan.mahank...@catalytic.com wrote:

Hi Group,



I was having trouble getting through an example Hadoop program. I have
searched the mailing list but could not find any thing useful. Below is the
issue:



1) Executed below command to submit a job to Hadoop:

 /hadoop-0.18.3/bin/hadoop jar -libjars AggregateWordCount.jar
org.apache.hadoop.examples.AggregateWordCount words/*
aggregatewordcount_output 2 textinputformat



2) Following is the error:

java.io.IOException: Error opening job jar:
org.apache.hadoop.examples.AggregateWordCount

   at org.apache.hadoop.util.RunJar.main(RunJar.java:90)

   at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)

   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

   at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

Caused by: java.util.zip.ZipException: error in opening zip file

   at java.util.zip.ZipFile.open(Native Method)

   at java.util.zip.ZipFile.<init>(ZipFile.java:203)

   at java.util.jar.JarFile.<init>(JarFile.java:132)

   at java.util.jar.JarFile.<init>(JarFile.java:70)

   at org.apache.hadoop.util.RunJar.main(RunJar.java:88)

   ... 4 more



Please advice.



Thank You,

Shravan Kumar. M

Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

-

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system
administrator -  mailto:netopshelpd...@catalytic.com
netopshelpd...@catalytic.com







-- 
Harish Mallipeddi
http://blog.poundbang.in



Interfaces/Implementations and Key/Values for M/R

2009-06-22 Thread Grant Ingersoll

Hi,

Over at Mahout (http://lucene.apache.org/mahout) we have a Vector  
interface with two implementations DenseVector and SparseVector.  When  
it comes to writing Mapper/Reducer, we have been able to just use  
Vector, but when it comes to actually binding real data via a  
Configuration, we need to specify, I think, the actual implementation  
being used, as in something like  
conf.setOutputValueClass(SparseVector.class);


Ideally, we'd like to avoid having to pick a particular implementation
until as late as possible.  Right now, we've pushed this off to the user
to pass in the implementation, but even that is less than ideal for a
variety of reasons.  While we typically wouldn't expect the data to be
a mixture of Dense and Sparse, there really shouldn't be a reason why
it can't be.  We realize we could write out the class name to the
DataOutput (we implement Writable), but that forces us either to hack
in some String compares or to use Class.forName(), which seems like it
wouldn't perform well (although I admit I haven't tested that yet;
presumably the JDK can cache the info).
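
One way around the string-compare/Class.forName() concern is to tag each record
with a one-byte id and switch on it when reading. The sketch below is
illustrative only - VectorWritable is a made-up name, and it assumes both
implementations are themselves Writable and have no-arg constructors:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;
// Vector, DenseVector and SparseVector are the Mahout types discussed above.

public class VectorWritable implements Writable {
  private static final byte DENSE = 0, SPARSE = 1;
  private Vector vector;

  public Vector get() { return vector; }
  public void set(Vector v) { vector = v; }

  public void write(DataOutput out) throws IOException {
    out.writeByte(vector instanceof SparseVector ? SPARSE : DENSE);
    ((Writable) vector).write(out);        // assumes the implementations are Writable
  }

  public void readFields(DataInput in) throws IOException {
    Vector v = (in.readByte() == SPARSE) ? new SparseVector() : new DenseVector();
    ((Writable) v).readFields(in);
    vector = v;
  }
}

The Mapper/Reducer output value class is then always VectorWritable, regardless of
which implementation each individual record carries.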


Thanks,
Grant


Re: multiple file input

2009-06-22 Thread Erik Paulson
On Thu, Jun 18, 2009 at 01:36:14PM -0700, Owen O'Malley wrote:
 On Jun 18, 2009, at 10:56 AM, pmg wrote:
 
 Each line from FileA gets compared with every line from FileB1,  
 FileB2 etc.
 etc. FileB1, FileB2 etc. are in a different input directory
 
 In the general case, I'd define an InputFormat that takes two  
 directories, computes the input splits for each directory and  
 generates a new list of InputSplits that is the cross-product of the  
 two lists. So instead of FileSplit, it would use a FileSplitPair that  
 gives the FileSplit for dir1 and the FileSplit for dir2 and the record  
 reader would return a TextPair with left and right records (ie.  
 lines). Clearly, you read the first line of split1 and cross it by  
 each line from split2, then move to the second line of split1 and  
 process each line from split2, etc.
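
A minimal sketch of that FileSplitPair idea (illustrative names, not part of the
Hadoop API; the old-style org.apache.hadoop.mapred classes are assumed):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;

public class FileSplitPair implements InputSplit {
  private FileSplit left;    // split from the first directory
  private FileSplit right;   // split from the second directory

  public FileSplitPair() {}
  public FileSplitPair(FileSplit left, FileSplit right) {
    this.left = left;
    this.right = right;
  }

  public long getLength() throws IOException {
    return left.getLength() + right.getLength();
  }

  public String[] getLocations() throws IOException {
    // Union of both splits' hosts; the scheduler treats these as locality hints.
    String[] l = left.getLocations(), r = right.getLocations();
    String[] all = new String[l.length + r.length];
    System.arraycopy(l, 0, all, 0, l.length);
    System.arraycopy(r, 0, all, l.length, r.length);
    return all;
  }

  public void write(DataOutput out) throws IOException {
    writeSplit(out, left);
    writeSplit(out, right);
  }

  public void readFields(DataInput in) throws IOException {
    left = readSplit(in);
    right = readSplit(in);
  }

  private static void writeSplit(DataOutput out, FileSplit s) throws IOException {
    out.writeUTF(s.getPath().toString());
    out.writeLong(s.getStart());
    out.writeLong(s.getLength());
  }

  private static FileSplit readSplit(DataInput in) throws IOException {
    return new FileSplit(new Path(in.readUTF()), in.readLong(), in.readLong(),
                         (String[]) null);
  }
}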
 

Out of curiosity, how does Hadoop schedule tasks when a task needs
multiple inputs and the data for a task is on different nodes?  How does
it decide which node will be more local and should have the task
steered to it?

-Erik



Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Stas Oskin
Hi Raghu.

A question - this issue does not influence Hadoop itself (DataNodes,
etc.), but rather influences any application using DFS, correct?

If so, without patching, should I either increase the fd limit (which might
fill up as well?) or periodically launch the GC?

Regards.


2009/6/22 Raghu Angadi rang...@yahoo-inc.com


 64k might help in the sense, you might hit GC before you hit the limit.

 Otherwise, your only options are to use the patch attached to HADOOP-4346
 or run System.gc() occasionally.

 I think it should be committed to 0.18.4


 Raghu.

 Stas Oskin wrote:

 Hi.

 Yes, it happens with 0.18.3.

 I'm closing now every FSData stream I receive from HDFS, so the number of
 open fd's in DataNode is reduced.

 Problem is that my own DFS client still have a high number of fd's open,
 mostly pipes and epolls.
 They sometimes quickly drop to the level of ~400 - 500, and sometimes just
 stuck at ~1000.

 I'm still trying to find out how well it behaves if I set the maximum fd
 number to 65K.

 Regards.



 2009/6/22 Raghu Angadi rang...@yahoo-inc.com

  Is this before 0.20.0? Assuming you have closed these streams, it is
 mostly
 https://issues.apache.org/jira/browse/HADOOP-4346

 It is the JDK internal implementation that depends on GC to free up its
 cache of selectors. HADOOP-4346 avoids this by using hadoop's own cache.

 Raghu.


 Stas Oskin wrote:

  Hi.

 After tracing some more with the lsof utility, and I managed to stop the
 growth on the DataNode process, but still have issues with my DFS
 client.

 It seems that my DFS client opens hundreds of pipes and eventpolls. Here
 is
 a small part of the lsof output:

 java10508 root  387w  FIFO0,6   6142565 pipe
 java10508 root  388r  FIFO0,6   6142565 pipe
 java10508 root  389u     0,100  6142566
 eventpoll
 java10508 root  390u  FIFO0,6   6135311 pipe
 java10508 root  391r  FIFO0,6   6135311 pipe
 java10508 root  392u     0,100  6135312
 eventpoll
 java10508 root  393r  FIFO0,6   6148234 pipe
 java10508 root  394w  FIFO0,6   6142570 pipe
 java10508 root  395r  FIFO0,6   6135857 pipe
 java10508 root  396r  FIFO0,6   6142570 pipe
 java10508 root  397r     0,100  6142571
 eventpoll
 java10508 root  398u  FIFO0,6   6135319 pipe
 java10508 root  399w  FIFO0,6   6135319 pipe

 I'm using FSDataInputStream and FSDataOutputStream, so this might be
 related
 to pipes?

 So, my questions are:

 1) What happens these pipes/epolls to appear?

 2) More important, how I can prevent their accumation and growth?

 Thanks in advance!

 2009/6/21 Stas Oskin stas.os...@gmail.com

  Hi.

 I have HDFS client and HDFS datanode running on same machine.

 When I'm trying to access a dozen of files at once from the client,
 several
 times in a row, I'm starting to receive the following errors on client,
 and
 HDFS browse function.

 HDFS Client: Could not get block locations. Aborting...
 HDFS browse: Too many open files

 I can increase the maximum number of files that can opened, as I have
 it
 set to the default 1024, but would like to first solve the problem, as
 larger value just means it would run out of files again later on.

 So my questions are:

 1) Does the HDFS datanode keeps any files opened, even after the HDFS
 client have already closed them?

 2) Is it possible to find out, who keeps the opened files - datanode or
 client (so I could pin-point the source of the problem).

 Thanks in advance!







Re: Making sure the tmp directory is cleaned?

2009-06-22 Thread Pankil Doshi
Yes, if your job completes successfully it is possibly removed after
completion of both the map and reduce tasks.

Pankil

On Mon, Jun 22, 2009 at 3:15 PM, Qin Gao q...@cs.cmu.edu wrote:

 Hi All,

 Do you know if the tmp directory on every map/reduce task will be deleted
 automatically after the map task finishes, or do I have to delete them?

 I mean the tmp directory that is automatically created under the current
 directory.

 Thanks a lot
 --Q



Re: Making sure the tmp directory is cleaned?

2009-06-22 Thread Qin Gao
Thanks!

But what if the job gets killed or fails? Does Hadoop try to clean it up? We
are considering bad situations - if a job gets killed, will the tmp dirs sit
on the local disks forever and eat up all the disk space?

I guess this is handled for the distributed cache, but those files are
read-only, and our program will generate new temporary files.


--Q


On Mon, Jun 22, 2009 at 4:19 PM, Pankil Doshi forpan...@gmail.com wrote:

 Yes, If your job gets completed successfully .possibly it removes after
 completion of both map and reduce tasks.

 Pankil

 On Mon, Jun 22, 2009 at 3:15 PM, Qin Gao q...@cs.cmu.edu wrote:

  Hi All,
 
  Do you know if the tmp directory on every map/reduce task will be deleted
  automatically after the map task finishes or will do I have to delete
 them?
 
  I mean the tmp directory that automatically created by on current
  directory.
 
  Thanks a lot
  --Q
 



Re: Slides/Videos of Hadoop Summit

2009-06-22 Thread Alex Loddengaard
The Cloudera talks are here:


http://www.cloudera.com/blog/2009/06/22/a-great-week-for-hadoop-summit-west-roundup/


As for the rest, I'm not sure.

Alex

On Sun, Jun 21, 2009 at 11:46 PM, jaideep dhok jdd...@gmail.com wrote:

 Hi all,
 Are the slides or videos of the talks given at Hadoop Summit available
 online? I checked the Yahoo! website for the summit but could not find
 any links.

 Regards,
 --
 Jaideep



Re: Measuring runtime of Map-reduce Jobs

2009-06-22 Thread Alex Loddengaard
What specific information are you interested in?

The job history logs show all sorts of great information (look in the
history sub directory of the JobTracker node's log directory).

Alex

On Mon, Jun 22, 2009 at 1:23 AM, bharath vissapragada 
bhara...@students.iiit.ac.in wrote:

 Hi ,

 Are there any tools which can measure the run-time of the map-reduce jobs
 ??
 any help is appreciated .

 Thanks in advance



Re: Problem in viewing WEB UI

2009-06-22 Thread Pankil Doshi
I am not sure, but sometimes you might see that datanodes are working from
the command prompt,
while when you actually look at the logs you find some kind of error in
them. Check the logs of the datanode.

Pankil

On Wed, Jun 17, 2009 at 1:42 AM, ashish pareek pareek...@gmail.com wrote:

 Hi,

  When I run the command bin/hadoop dfsadmin -report it shows that 2
 datanodes are alive, but when I try http://hadoopmster:50070/ the
 problem is that it does not open the
 http://hadoopmaster:50070/dfshealth.jsp page and throws an HTTP 404 error.
 So why is it happening like this?
 Regards,
 Ashish Pareek


  On Wed, Jun 17, 2009 at 10:06 AM, Sugandha Neaolekar 
 sugandha@gmail.com wrote:

  Well, you just have to specify the address in the URL address bar as
  http://hadoopmaster:50070 and you'll be able to see the web UI!
 
 
  On Tue, Jun 16, 2009 at 7:17 PM, ashish pareek pareek...@gmail.com
 wrote:
 
  HI Sugandha,
  Hmmm, your suggestion helped and now I am able
  to run two datanodes, one on the same machine as the name node and the other
  on the different machine. Thanks a lot :)

  But the problem is now I am not able to see the web UI
  for both the datanodes as well as the name node.
  Should I have to consider some more things in the site.xml? If so,
  please help...
 
  Thanking you again,
  regards,
  Ashish Pareek.
 
  On Tue, Jun 16, 2009 at 3:10 PM, Sugandha Naolekar 
  sugandha@gmail.com wrote:
 
  hi,,!
 
 
  First of all, get your concepts clear of hadoop.
  You can refer to the following
 
  site::
  http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster)
 
 
 
  I have a small doubt: whether in the master's and slave's site.xml we can have
  the same port numbers for both of them, like
 
 
  for slave :
 
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoopslave:9000</value>
  </property>
 
 
   for master:::
 
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoopmaster:9000</value>
  </property>
 
 
 
  Well, any two daemons or services can run on the same port as long as they
  are not run on the same machine. If you wish to run DN and NN on the same
  machine, their port numbers have to be different.
 
 
 
 
  On Tue, Jun 16, 2009 at 2:55 PM, ashish pareek pareek...@gmail.com
 wrote:
 
  HI sugandha,
 
 
 
  and one more thing can we have in slave:::
 
  <property>
    <name>dfs.datanode.address</name>
    <value>hadoopmaster:9000</value>
    <value>hadoopslave:9001</value>
  </property>
 
 
 
  Also, fs.default.name is the tag which specifies the default filesystem.
  And generally, it runs on the namenode. So, its value has to be a
  namenode's address only and not the slave's.
 
 
 
  Else, if you have a complete procedure for installing and running Hadoop in
  a cluster, can you please send it to me? I need to set up Hadoop within
  two days and show it to my guide. Currently I am doing my masters.
 
  Thanks for your spending time
 
 
  Try for the above, and this should work!
 
 
 
  regards,
  Ashish Pareek
 
 
  On Tue, Jun 16, 2009 at 2:33 PM, Sugandha Naolekar 
  sugandha@gmail.com wrote:
 
  Following changes are to be done::
 
  Under master folder::
 
  - Put the slave's address as well under the values of the
  dfs.datanode.address tag.

  - You want to make the namenode a datanode as well. As per your config
  file, you have specified hadoopmaster in your slaves file. If you don't want
  that, remove it from the slaves file.
 
  UNder slave folder::
 
  - Put only the slave's (the machine where you intend to run your datanode)
  address under the dfs.datanode.address tag. Else
  it should go as such::
 
  <property>
    <name>dfs.datanode.address</name>
    <value>hadoopmaster:9000</value>
    <value>hadoopslave:9001</value>
  </property>
 
  Also, your port numbers should be different. The daemons NN, DN, JT, and TT
  should run independently on different ports.
 
 
  On Tue, Jun 16, 2009 at 2:05 PM, Sugandha Naolekar 
  sugandha@gmail.com wrote:
 
 
 
  -- Forwarded message --
  From: ashish pareek pareek...@gmail.com
  Date: Tue, Jun 16, 2009 at 2:00 PM
  Subject: Re: org.apache.hadoop.ipc.client : trying connect to server
  failed
  To: Sugandha Naolekar sugandha@gmail.com
 
 
 
 
  On Tue, Jun 16, 2009 at 1:58 PM, ashish pareek pareek...@gmail.com
 wrote:
 
  

Re: Disk Usage Overhead of Hadoop Upgrade

2009-06-22 Thread Pankil Doshi
hi Stu,

Which block conversion are you talking about? If you are talking about the block
size of the data, then it remains the same across an upgrade unless you change it.

Pankil

On Tue, Jun 16, 2009 at 5:16 PM, Stu Hood stuart.h...@rackspace.com wrote:

 Hey gang,

 We're preparing to upgrade our cluster from Hadoop 0.15.3 to 0.18.3.

 How much disk usage overhead can we expect from the block conversion before
 we finalize the upgrade? In the worst case, will the upgrade cause our disk
 usage to double?

 Thanks,

 Stu Hood
 Search Team Technical Lead
 Email & Apps Division, Rackspace Hosting




Re: HDFS out of space

2009-06-22 Thread Alex Loddengaard
Are you seeing any exceptions because of the disk being at 99% capacity?

Hadoop should do something sane here and write new data to the disk with
more capacity.  That said, it is ideal to be balanced.  As far as I know,
there is no way to balance an individual DataNode's hard drives (Hadoop does
round-robin scheduling when writing data).
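
(For what it's worth, a DataNode only spreads blocks across the directories listed
in dfs.data.dir, so both mounts need to appear there; the paths below are just an
example.)

<property>
  <name>dfs.data.dir</name>
  <value>/mnt/hdfs/data,/mnt2/hdfs/data</value>
</property>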

Alex

On Mon, Jun 22, 2009 at 10:12 AM, Kris Jirapinyo kjirapi...@biz360.comwrote:

 Hi all,
How does one handle a mount running out of space for HDFS?  We have two
 disks mounted on /mnt and /mnt2 respectively on one of the machines that
 are
 used for HDFS, and /mnt is at 99% while /mnt2 is at 30%.  Is there a way to
 tell the machine to balance itself out?  I know for the cluster, you can
 balance it using start-balancer.sh but I don't think that it will tell the
 individual machine to balance itself out.  Our hack right now would be
 just to delete the data on /mnt, since we have replication of 3x, we should
 be OK.  But I'd prefer not to do that.  Any thoughts?



Re: HDFS out of space

2009-06-22 Thread Pankil Doshi
Hey Alex,

Will Hadoop balancer utility work in this case?

Pankil

On Mon, Jun 22, 2009 at 4:30 PM, Alex Loddengaard a...@cloudera.com wrote:

 Are you seeing any exceptions because of the disk being at 99% capacity?

 Hadoop should do something sane here and write new data to the disk with
 more capacity.  That said, it is ideal to be balanced.  As far as I know,
 there is no way to balance an individual DataNode's hard drives (Hadoop
 does
 round-robin scheduling when writing data).

 Alex

 On Mon, Jun 22, 2009 at 10:12 AM, Kris Jirapinyo kjirapi...@biz360.com
 wrote:

  Hi all,
 How does one handle a mount running out of space for HDFS?  We have
 two
  disks mounted on /mnt and /mnt2 respectively on one of the machines that
  are
  used for HDFS, and /mnt is at 99% while /mnt2 is at 30%.  Is there a way
 to
  tell the machine to balance itself out?  I know for the cluster, you can
  balance it using start-balancer.sh but I don't think that it will tell
 the
  individual machine to balance itself out.  Our hack right now would be
  just to delete the data on /mnt, since we have replication of 3x, we
 should
  be OK.  But I'd prefer not to do that.  Any thoughts?
 



Re: Making sure the tmp directory is cleaned?

2009-06-22 Thread Pankil Doshi
No. If your job gets killed or fails, the temp directories won't be cleaned up,
and in that case you will have to carefully clean them on your own. If you don't
clean them up yourself, they will eat up your disk space.

Pankil

On Mon, Jun 22, 2009 at 4:24 PM, Qin Gao q...@cs.cmu.edu wrote:

 Thanks!

 But what if the job gets killed or fails? Does Hadoop try to clean it? We
 are considering bad situations - if a job gets killed, will the tmp dirs sit
 on local disks forever and eat up all the disk space?

 I guess this would be taken care of for the distributed cache, but those files
 are read-only, and our program will generate new temporary files.


 --Q


 On Mon, Jun 22, 2009 at 4:19 PM, Pankil Doshi forpan...@gmail.com wrote:

  Yes, if your job completes successfully - it should be removed after
  completion of both the map and reduce tasks.
 
  Pankil
 
  On Mon, Jun 22, 2009 at 3:15 PM, Qin Gao q...@cs.cmu.edu wrote:
 
   Hi All,
  
   Do you know if the tmp directory on every map/reduce task will be
 deleted
   automatically after the map task finishes or will do I have to delete
  them?
  
   I mean the tmp directory that automatically created by on current
   directory.
  
   Thanks a lot
   --Q
  
 



Re: HDFS out of space

2009-06-22 Thread Matt Massie

Pankil-

I'd be interested to know the size of the /mnt and /mnt2 partitions.   
Are they the same?  Can you run the following and report the output...


% df -h /mnt /mnt2
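
Or, if it is easier to drop into a Java tool than to shell out, a rough
equivalent of that check (untested sketch; the mount points from this thread
are hard-coded as defaults):

import java.io.File;

public class DfLike {
  public static void main(String[] args) {
    String[] mounts = args.length > 0 ? args : new String[] {"/mnt", "/mnt2"};
    for (String m : mounts) {
      File f = new File(m);
      long totalGb = f.getTotalSpace() >> 30;   // bytes -> GiB (0 if the path does not exist)
      long freeGb = f.getUsableSpace() >> 30;
      System.out.printf("%-6s total=%dG free=%dG used=%.0f%%%n",
          m, totalGb, freeGb, 100.0 * (totalGb - freeGb) / Math.max(totalGb, 1));
    }
  }
}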

Thanks.

-Matt

On Jun 22, 2009, at 1:32 PM, Pankil Doshi wrote:


Hey Alex,

Will Hadoop balancer utility work in this case?

Pankil

On Mon, Jun 22, 2009 at 4:30 PM, Alex Loddengaard  
a...@cloudera.com wrote:


Are you seeing any exceptions because of the disk being at 99%  
capacity?


Hadoop should do something sane here and write new data to the disk  
with
more capacity.  That said, it is ideal to be balanced.  As far as I  
know,
there is no way to balance an individual DataNode's hard drives  
(Hadoop

does
round-robin scheduling when writing data).

Alex

On Mon, Jun 22, 2009 at 10:12 AM, Kris Jirapinyo kjirapi...@biz360.com

wrote:



Hi all,
  How does one handle a mount running out of space for HDFS?  We  
have

two
disks mounted on /mnt and /mnt2 respectively on one of the  
machines that

are
used for HDFS, and /mnt is at 99% while /mnt2 is at 30%.  Is there  
a way

to
tell the machine to balance itself out?  I know for the cluster,  
you can
balance it using start-balancer.sh but I don't think that it will  
tell

the
individual machine to balance itself out.  Our hack right now  
would be

just to delete the data on /mnt, since we have replication of 3x, we

should

be OK.  But I'd prefer not to do that.  Any thoughts?







Re: HDFS out of space

2009-06-22 Thread Allen Wittenauer



On 6/22/09 10:12 AM, Kris Jirapinyo kjirapi...@biz360.com wrote:

 Hi all,
 How does one handle a mount running out of space for HDFS?  We have two
 disks mounted on /mnt and /mnt2 respectively on one of the machines that are
 used for HDFS, and /mnt is at 99% while /mnt2 is at 30%.  Is there a way to
 tell the machine to balance itself out?  I know for the cluster, you can
 balance it using start-balancer.sh but I don't think that it will tell the
 individual machine to balance itself out.  Our hack right now would be
 just to delete the data on /mnt, since we have replication of 3x, we should
 be OK.  But I'd prefer not to do that.  Any thoughts?

Decommission the entire node, wait for data to be replicated,
re-commission, then do HDFS rebalance.  It blows, no doubt about it, but the
admin tools in the space are... lacking.




Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Steve Loughran

Stas Oskin wrote:

Hi.

So what would be the recommended approach to pre-0.20.x series?

To insure each file is used only by one thread, and then it safe to close
the handle in that thread?

Regards.


Good question - I'm not sure. For anything you get with
FileSystem.get(), it is now dangerous to close, so try just setting the
reference to null and hoping that GC will do the finalize() when needed.
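
Something like this untested sketch is the shape of that workaround (the
holder class is hypothetical, not Hadoop API): keep one cache-backed
FileSystem around and drop the reference instead of closing it.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsHolder {
  private static FileSystem fs;   // one shared, cache-backed instance

  static synchronized FileSystem get(Configuration conf) throws IOException {
    if (fs == null) {
      fs = FileSystem.get(conf);  // comes out of FileSystem's internal cache
    }
    return fs;
  }

  static synchronized void release() {
    // do NOT call fs.close() here: the cached instance may be shared by
    // other threads, so just drop the reference and let GC finalize it
    fs = null;
  }
}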


Re: Disk Usage Overhead of Hadoop Upgrade

2009-06-22 Thread Raghu Angadi


The initial overhead is fairly small (an extra hard link for each file).

After that, the overhead grows as you delete files (and thus their blocks)
that existed before the upgrade, since the physical block files are deleted
only after you finalize. For example, if you delete 500GB of pre-upgrade data
before finalizing, roughly 500GB of disk stays occupied by the old block
copies until you finalize.

So the overhead == (the blocks that got deleted after the upgrade).

Raghu.

Stu Hood wrote:

Hey gang,

We're preparing to upgrade our cluster from Hadoop 0.15.3 to 0.18.3.

How much disk usage overhead can we expect from the block conversion before we 
finalize the upgrade? In the worst case, will the upgrade cause our disk usage 
to double?

Thanks,

Stu Hood
Search Team Technical Lead
Email & Apps Division, Rackspace Hosting





Re: HDFS out of space

2009-06-22 Thread Usman Waheed
I have used the balancer to balance the data in the cluster with the
-threshold option. The transfer bandwidth was set to 1MB/sec (I think
that's the default setting) in one of the config files, and it had to move
500GB of data around. It did take some time, but eventually the data got
spread out evenly. In my case I was using one of the machines as the
master node and a datanode at the same time, which is why this one machine
consumed more than the other datanodes.
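
That cap is the dfs.balance.bandwidthPerSec property (bytes per second;
1048576, i.e. 1MB/sec, is the default as far as I know, and DataNodes read it
at startup). An untested sketch to confirm what your config files resolve to:

import org.apache.hadoop.conf.Configuration;

public class ShowBalancerBandwidth {
  public static void main(String[] args) {
    Configuration conf = new Configuration();  // picks up hadoop-site.xml from the classpath
    long bytesPerSec = conf.getLong("dfs.balance.bandwidthPerSec", 1024 * 1024);
    System.out.println("balancer bandwidth cap: "
        + (bytesPerSec / 1024) + " KB/sec per DataNode");
  }
}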


Thanks,
Usman



Hey Alex,

Will Hadoop balancer utility work in this case?

Pankil

On Mon, Jun 22, 2009 at 4:30 PM, Alex Loddengaard a...@cloudera.com  
wrote:



Are you seeing any exceptions because of the disk being at 99% capacity?

Hadoop should do something sane here and write new data to the disk with
more capacity.  That said, it is ideal to be balanced.  As far as I  
know,

there is no way to balance an individual DataNode's hard drives (Hadoop
does
round-robin scheduling when writing data).

Alex

On Mon, Jun 22, 2009 at 10:12 AM, Kris Jirapinyo kjirapi...@biz360.com
wrote:

 Hi all,
How does one handle a mount running out of space for HDFS?  We have
two
 disks mounted on /mnt and /mnt2 respectively on one of the machines  
that

 are
 used for HDFS, and /mnt is at 99% while /mnt2 is at 30%.  Is there a  
way

to
 tell the machine to balance itself out?  I know for the cluster, you  
can

 balance it using start-balancer.sh but I don't think that it will tell
the
 individual machine to balance itself out.  Our hack right now would  
be

 just to delete the data on /mnt, since we have replication of 3x, we
should
 be OK.  But I'd prefer not to do that.  Any thoughts?






--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/


Re: Making sure the tmp directory is cleaned?

2009-06-22 Thread Allen Wittenauer



On 6/22/09 12:15 PM, Qin Gao q...@cs.cmu.edu wrote:
 Do you know if the tmp directory on every map/reduce task will be deleted
 automatically after the map task finishes or will do I have to delete them?
 
 I mean the tmp directory that automatically created by on current directory.

Past experience says that users will find writable space on nodes and fill
it, regardless of what Hadoop may do to try and keep it clean.  It is a good
idea to just wipe those spaces clean during hadoop upgrades and other
planned downtimes.
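
For the wipe itself, something like this untested sketch can be run per node
during such downtime (the default path below is just an assumption - check
mapred.local.dir / hadoop.tmp.dir in your own config first):

import java.io.File;

public class WipeLocalTmp {
  public static void main(String[] args) {
    File root = new File(args.length > 0 ? args[0] : "/tmp/hadoop/mapred/local");
    long cutoff = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000;  // older than a week
    deleteOld(root, cutoff);
  }

  static void deleteOld(File dir, long cutoff) {
    File[] children = dir.listFiles();
    if (children == null) return;                 // not a directory, or unreadable
    for (File f : children) {
      if (f.isDirectory()) deleteOld(f, cutoff);  // depth-first so empty dirs can go too
      if (f.lastModified() < cutoff) f.delete();  // delete() on a non-empty dir is a no-op
    }
  }
}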



Re: Making sure the tmp directory is cleaned?

2009-06-22 Thread Qin Gao
Thanks, then I will try to keep a log of the files and clean them out. Thanks.
--Q


On Mon, Jun 22, 2009 at 4:34 PM, Pankil Doshi forpan...@gmail.com wrote:

 No..If your job gets killed or failed.Temp wont clean up.. and In that case
 you will have to carefully clean that on your own. If you dont clean it up
 yourself it will eat up your disk space.

 Pankil

 On Mon, Jun 22, 2009 at 4:24 PM, Qin Gao q...@cs.cmu.edu wrote:

  Thanks!
 
  But what if the jobs get killed or failed? Does hadoop try to clean it?
 we
  are considering bad situations - if job gets killed, will the tmp dirs
 sit
  on local disks forever and eats up all the diskspace?
 
  I guess this should be considered in distributed cache, but those files
 are
  read-only, and our program will generate new temporary files.
 
 
  --Q
 
 
  On Mon, Jun 22, 2009 at 4:19 PM, Pankil Doshi forpan...@gmail.com
 wrote:
 
   Yes, If your job gets completed successfully .possibly it removes after
   completion of both map and reduce tasks.
  
   Pankil
  
   On Mon, Jun 22, 2009 at 3:15 PM, Qin Gao q...@cs.cmu.edu wrote:
  
Hi All,
   
Do you know if the tmp directory on every map/reduce task will be
  deleted
automatically after the map task finishes or will do I have to delete
   them?
   
I mean the tmp directory that automatically created by on current
directory.
   
Thanks a lot
--Q
   
  
 



Re: HDFS out of space

2009-06-22 Thread Pankil Doshi
Matt,

Kris can give that info - I am one of the users from the mailing list.

Pankil

On Mon, Jun 22, 2009 at 4:37 PM, Matt Massie m...@cloudera.com wrote:

 Pankil-

 I'd be interested to know the size of the /mnt and /mnt2 partitions.  Are
 they the same?  Can you run the following and report the output...

 % df -h /mnt /mnt2

 Thanks.

 -Matt


 On Jun 22, 2009, at 1:32 PM, Pankil Doshi wrote:

  Hey Alex,

 Will Hadoop balancer utility work in this case?

 Pankil

 On Mon, Jun 22, 2009 at 4:30 PM, Alex Loddengaard a...@cloudera.com
 wrote:

  Are you seeing any exceptions because of the disk being at 99% capacity?

 Hadoop should do something sane here and write new data to the disk with
 more capacity.  That said, it is ideal to be balanced.  As far as I know,
 there is no way to balance an individual DataNode's hard drives (Hadoop
 does
 round-robin scheduling when writing data).

 Alex

 On Mon, Jun 22, 2009 at 10:12 AM, Kris Jirapinyo kjirapi...@biz360.com

 wrote:


  Hi all,
  How does one handle a mount running out of space for HDFS?  We have

 two

 disks mounted on /mnt and /mnt2 respectively on one of the machines that
 are
 used for HDFS, and /mnt is at 99% while /mnt2 is at 30%.  Is there a way

 to

 tell the machine to balance itself out?  I know for the cluster, you can
 balance it using start-balancer.sh but I don't think that it will tell

 the

 individual machine to balance itself out.  Our hack right now would be
 just to delete the data on /mnt, since we have replication of 3x, we

 should

 be OK.  But I'd prefer not to do that.  Any thoughts?






Re: Too many open files error, which gets resolved after some time

2009-06-22 Thread Stas Oskin
Ok, seems this issue is already patched in the Hadoop distro I'm using
(Cloudera).

Any idea if I still should call GC manually/periodically to clean out all
the stale pipes / epolls?

2009/6/22 Steve Loughran ste...@apache.org

 Stas Oskin wrote:

 Hi.

 So what would be the recommended approach to pre-0.20.x series?

 To insure each file is used only by one thread, and then it safe to close
 the handle in that thread?

 Regards.


 good question -I'm not sure. For anythiong you get with FileSystem.get(),
 its now dangerous to close, so try just setting the reference to null and
 hoping that GC will do the finalize() when needed



Re: HDFS out of space

2009-06-22 Thread Kris Jirapinyo
It's a typical Amazon EC2 Large instance, so 414G each.

-- Kris.

On Mon, Jun 22, 2009 at 1:37 PM, Matt Massie m...@cloudera.com wrote:

 Pankil-

 I'd be interested to know the size of the /mnt and /mnt2 partitions.  Are
 they the same?  Can you run the following and report the output...

 % df -h /mnt /mnt2

 Thanks.

 -Matt


 On Jun 22, 2009, at 1:32 PM, Pankil Doshi wrote:

  Hey Alex,

 Will Hadoop balancer utility work in this case?

 Pankil

 On Mon, Jun 22, 2009 at 4:30 PM, Alex Loddengaard a...@cloudera.com
 wrote:

  Are you seeing any exceptions because of the disk being at 99% capacity?

 Hadoop should do something sane here and write new data to the disk with
 more capacity.  That said, it is ideal to be balanced.  As far as I know,
 there is no way to balance an individual DataNode's hard drives (Hadoop
 does
 round-robin scheduling when writing data).

 Alex

 On Mon, Jun 22, 2009 at 10:12 AM, Kris Jirapinyo kjirapi...@biz360.com

 wrote:


  Hi all,
  How does one handle a mount running out of space for HDFS?  We have

 two

 disks mounted on /mnt and /mnt2 respectively on one of the machines that
 are
 used for HDFS, and /mnt is at 99% while /mnt2 is at 30%.  Is there a way

 to

 tell the machine to balance itself out?  I know for the cluster, you can
 balance it using start-balancer.sh but I don't think that it will tell

 the

 individual machine to balance itself out.  Our hack right now would be
 just to delete the data on /mnt, since we have replication of 3x, we

 should

 be OK.  But I'd prefer not to do that.  Any thoughts?






Re: Hadoop Vaidya tool

2009-06-22 Thread Vitthal Gogate
Hello Pratik, -joblog should also be a specific job history file path, not a
directory. Usually, I copy the job conf xml file and the job history log file to
a local file system and then use the file:// protocol (although hdfs:// should
also work), e.g.,

sh /home/hadoop/Desktop/hadoop-0.20.0/contrib/vaidya/bin/vaidya.sh -jobconf
file://localhost/logs/job_200906221335_0001_conf.xml  -joblog
file://localhost/logs/job_00906221335_0001_jobxxx

I discovered a few problems with the tool in Hadoop 0.20 for some specific
scenarios such as map-only jobs. The following JIRAs fix those problems.

If you download the latest Hadoop (trunk), then 5582 is already part of it;
otherwise, with Hadoop 0.20, you can apply the following JIRAs in sequence.

https://issues.apache.org/jira/browse/HADOOP-5582
https://issues.apache.org/jira/browse/HADOOP-5950

1. Hadoop Vaidya being a standalone tool, you may not need to change your
existing installed version of Hadoop; rather, separately download the Hadoop
trunk, apply patch 5950, re-build, and replace the
$HADOOP_HOME/contrib/vaidya/hadoop-0.20.0-vaidya.jar file in your existing
Hadoop 0.20 installation with the newly built one.

2. Also, if you have a big job (i.e. lots of map/reduce tasks), you may face
an out-of-memory problem while analyzing it, in which case you can edit
$HADOOP_HOME/contrib/vaidya/bin/vaidya.sh and add the -Xmx1024m option on the
java command line before the classpath.

Hope it helps

Thanks & Regards, Suhas


On 6/22/09 1:13 PM, Pankil Doshi forpan...@gmail.com wrote:

 Hello ,
 
 I am trying to use Hadoop Vaidya tool . Its available with version 0.20.0.
 But I see following error.Can anyone Guide me on that. I have pseudo mode
 cluster i/e single node cluster for testing..
 
 *cmd I submit is * sh
 /home/hadoop/Desktop/hadoop-0.20.0/contrib/vaidya/bin/vaidya.sh -jobconf
 hdfs://localhost:9000/logs/job_200906221335_0001_conf.xml  -joblog
 hdfs://localhost:9000/logs/ 
 
 *Error :-*
 Exception:java.net.MalformedURLException: unknown protocol:
 hdfsjava.net.MalformedURLException: unknown protocol: hdfs
 at java.net.URL.init(URL.java:590)
 at java.net.URL.init(URL.java:480)
 at java.net.URL.init(URL.java:429)
 at
 org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.readJobInf
 ormation(PostExPerformanceDiagnoser.java:124)
 at
 org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.init(Pos
 tExPerformanceDiagnoser.java:112)
 at
 org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.main(PostE
 xPerformanceDiagnoser.java:220)
 
 Can anyone guide me on that..
 
 Regards
 Pankil

--Regards Suhas
[Getting started w/ Grid]
http://twiki.corp.yahoo.com/view/GridDocumentation/GridDocAbout
[Search HADOOP/PIG Information]
http://ucdev20.yst.corp.yahoo.com/griduserportal/griduserportal.php





THIS WEEK: PNW Hadoop / Apache Cloud Stack Users' Meeting, Wed Jun 24th, Seattle

2009-06-22 Thread Bradford Stephens
Hey all, just a friendly reminder that this is Wednesday! I hope to see
everyone there again. Please let me know if there's something interesting
you'd like to talk about -- I'll help however I can. You don't even need a
Powerpoint presentation -- there's many whiteboards. I'll try to have a
video cam, but no promises.
Feel free to call at 904-415-3009 if you need directions or any questions :)

~~`

Greetings,

On the heels of our smashing success last month, we're going to be
convening the Pacific Northwest (Oregon and Washington)
Hadoop/HBase/Lucene/etc. meetup on the last Wednesday of June, the
24th.  The meeting should start at 6:45, organized chats will end
around  8:00, and then there shall be discussion and socializing :)

The meeting will be at the University of Washington in
Seattle again. It's in the Computer Science building (not electrical
engineering!), room 303, located here:
http://www.washington.edu/home/maps/southcentral.html?80,70,792,660

If you've ever wanted to learn more about distributed computing, or
just see how other people are innovating with Hadoop, you can't miss
this opportunity. Our focus is on learning and education, so every
presentation must end with a few questions for the group to research
and discuss. (But if you're an introvert, we won't mind).

The format is two or three 15-minute deep dive talks, followed by
several 5 minute lightning chats. We had a few interesting topics
last month:

-Building a Social Media Analysis company on the Apache Cloud Stack
-Cancer detection in images using Hadoop
-Real-time OLAP on HBase -- is it possible?
-Video and Network Flow Analysis in Hadoop vs. Distributed RDBMS
-Custom Ranking in Lucene

We already have one deep dive scheduled this month, on truly
scalable Lucene with Katta. If you've been looking for a way to handle
those large Lucene indices, this is a must-attend!

Looking forward to seeing everyone there again.

Cheers,
Bradford

http://www.roadtofailure.com -- The Fringes of Distributed Computing,
Computer Science, and Social Media.


Strange Exception

2009-06-22 Thread akhil1988

Hi All!

I have been running Hadoop jobs through my user account on a cluster for a
while now, but now I am getting this strange exception when I try to execute
a job. If anyone knows why this is happening, please let me know.

[akhil1...@altocumulus WordCount]$ hadoop jar wordcount_classes_dir.jar
org.uiuc.upcrc.extClasses.WordCount /home/akhil1988/input
/home/akhil1988/output
JO
09/06/22 19:19:01 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
org.apache.hadoop.ipc.RemoteException: java.io.FileNotFoundException:
/hadoop/tmp/hadoop/mapred/local/jobTracker/job_200906111015_0167.xml
(Read-only file system)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.init(FileOutputStream.java:179)
at
org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.init(RawLocalFileSystem.java:187)
at
org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.init(RawLocalFileSystem.java:183)
at
org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:241)
at
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.init(ChecksumFileSystem.java:327)
at
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:360)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:487)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:468)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:375)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:208)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1214)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1195)
at org.apache.hadoop.mapred.JobInProgress.init(JobInProgress.java:212)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2230)
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)

at org.apache.hadoop.ipc.Client.call(Client.java:696)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
at org.apache.hadoop.mapred.$Proxy1.submitJob(Unknown Source)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:828)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1127)
at org.uiuc.upcrc.extClasses.WordCount.main(WordCount.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)


Thanks,
Akhil

-- 
View this message in context: 
http://www.nabble.com/Strange-Exeception-tp24158395p24158395.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Determining input record directory using Streaming...

2009-06-22 Thread C G
Hi All:
Is there any way using Hadoop Streaming to determine the directory from which 
an input record is being read?  This is straightforward in Hadoop using 
InputFormats, but I am curious if the same concept can be applied to streaming.
The goal here is to read in data from 2 directories, say A/ and B/, and make 
decisions about what to do based on where the data is rooted.
Thanks for any help...CG



  

Re: Strange Exception

2009-06-22 Thread jason hadoop
The directory specified by the configuration parameter mapred.system.dir,
defaulting to /tmp/hadoop/mapred/system, doesn't exist.

Most likely your tmp cleaner task has removed it, and I am guessing it is
only created at cluster start time.
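
An untested sketch of a quick check for those directories (the property names
are from 0.18/0.19; the fallback values below are only illustrative, not the
shipped defaults):

import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckMapredDirs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // reads hadoop-site.xml from the classpath
    String sysDir = conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system");
    String localDir = conf.get("mapred.local.dir", "/tmp/hadoop/mapred/local")
                          .split(",")[0].trim();  // may be a comma-separated list
    FileSystem fs = FileSystem.get(conf);
    System.out.println(sysDir + " exists on the default FS: " + fs.exists(new Path(sysDir)));
    File local = new File(localDir);
    System.out.println(localDir + " writable on local disk: " + local.canWrite());
  }
}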

On Mon, Jun 22, 2009 at 6:19 PM, akhil1988 akhilan...@gmail.com wrote:


 Hi All!

 I have been running Hadoop jobs through my user account on a cluster, for a
 while now. But now I am getting this strange exception when I try to
 execute
 a job. If anyone knows, please let me know why this is happening.

 [akhil1...@altocumulus WordCount]$ hadoop jar wordcount_classes_dir.jar
 org.uiuc.upcrc.extClasses.WordCount /home/akhil1988/input
 /home/akhil1988/output
 JO
 09/06/22 19:19:01 WARN mapred.JobClient: Use GenericOptionsParser for
 parsing the arguments. Applications should implement Tool for the same.
 org.apache.hadoop.ipc.RemoteException: java.io.FileNotFoundException:
 /hadoop/tmp/hadoop/mapred/local/jobTracker/job_200906111015_0167.xml
 (Read-only file system)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.init(FileOutputStream.java:179)
at

 org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.init(RawLocalFileSystem.java:187)
at

 org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.init(RawLocalFileSystem.java:183)
at
 org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:241)
at

 org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.init(ChecksumFileSystem.java:327)
at
 org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:360)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:487)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:468)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:375)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:208)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
at
 org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1214)
at
 org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1195)
at
 org.apache.hadoop.mapred.JobInProgress.init(JobInProgress.java:212)
at
 org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2230)
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
at

 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)

at org.apache.hadoop.ipc.Client.call(Client.java:696)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
at org.apache.hadoop.mapred.$Proxy1.submitJob(Unknown Source)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:828)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1127)
at org.uiuc.upcrc.extClasses.WordCount.main(WordCount.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at

 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at

 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)


 Thanks,
 Akhil

 --
 View this message in context:
 http://www.nabble.com/Strange-Exeception-tp24158395p24158395.html
 Sent from the Hadoop core-user mailing list archive at Nabble.com.




-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals


Re: When is configure and close run

2009-06-22 Thread jason hadoop
configure and close are run for each task, mapper and reducer. The configure
and close are NOT run on the combiner class.
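
In other words, with the old org.apache.hadoop.mapred API the lifecycle looks
like this (untested sketch):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class LoggingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, LongWritable> {

  public void configure(JobConf job) {
    System.err.println("configure(): once per task attempt, before any map() call");
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, LongWritable> out, Reporter reporter)
      throws IOException {
    out.collect(value, key);   // called once per record of this task's input split
  }

  public void close() throws IOException {
    System.err.println("close(): once per task attempt, after the last map() call");
  }
}

As far as I can tell, even with JVM reuse enabled a single JVM may run several
tasks back to back, but configure()/close() still bracket each task attempt.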

On Mon, Jun 22, 2009 at 9:23 AM, Saptarshi Guha saptarshi.g...@gmail.com wrote:

 Hello,
 In a mapreduce job, a given map JVM will run N map tasks. Are the
 configure and close methods executed for every one of these N tasks?
 Or is configure executed once when the JVM starts and the close method
 executed once when all N have been completed?

 I have the same question for the reduce task. Will it be run before
 for every reduce task? And close is run when all the values for a
 given key have been processed?

 We can assume there isn't a combiner.

 Regards
 Saptarshi




-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals


Re: Determining input record directory using Streaming...

2009-06-22 Thread jason hadoop
Check the process environment of your streaming tasks; generally the
configuration variables are exported into the process environment.

The Mapper input file is normally stored as some variant of
mapred.input.file. The reducer's input is the mapper output for that reduce,
so the input file is not relevant.
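
In streaming, the job conf entries show up in the child's environment with
dots turned into underscores, so the value usually appears as map_input_file
(or a similar variant depending on the version). An untested sketch of a
streaming mapper - written in Java here, but any executable works - that
branches on the input directory:

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class DirAwareMapper {
  public static void main(String[] args) throws Exception {
    String inputFile = System.getenv("map_input_file");   // e.g. hdfs://nn:9000/data/A/part-00000
    if (inputFile == null) {
      inputFile = System.getenv("mapred_input_file");     // the name varies across versions
    }
    String tag = (inputFile != null && inputFile.contains("/A/")) ? "A" : "B";  // decide by directory
    BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(tag + "\t" + line);   // emit key<TAB>value for the reducer
    }
  }
}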

On Mon, Jun 22, 2009 at 7:21 PM, C G parallel...@yahoo.com wrote:

 Hi All:
 Is there any way using Hadoop Streaming to determining the directory from
 which an input record is being read?  This is straightforward in Hadoop
 using InputFormats, but I am curious if the same concept can be applied to
 streaming.
 The goal here is to read in data from 2 directories, say A/ and B/, and
 make decisions about what to do based on where the data is rooted.
 Thanks for any help...CG








-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals


RE: java.io.IOException: Error opening job jar

2009-06-22 Thread Shravan Mahankali
Hi Ramakishore,

Unable to attach files to mailing list! I hope Harish received the attached
docs to his gmail a/c.

PFA attached those here.

Any help would be appreciated.

Thank You,
Shravan Kumar. M 
Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
-Original Message-
From: Ramakishore Yelamanchilli [mailto:kyela...@cisco.com] 
Sent: Monday, June 22, 2009 10:54 PM
To: core-user@hadoop.apache.org; shravan.mahank...@catalytic.com; 'Harish
Mallipeddi'
Subject: RE: java.io.IOException: Error opening job jar

There's no file attached Shravan.

Regards

Ram

-Original Message-
From: Shravan Mahankali [mailto:shravan.mahank...@catalytic.com] 
Sent: Monday, June 22, 2009 4:43 AM
To: core-user@hadoop.apache.org; 'Harish Mallipeddi'
Subject: RE: java.io.IOException: Error opening job jar


Hi Harish,

PFA the AggregateWordCount.jar file.

I was able to open this jar file using a sample Java file written with the
JarFile API, and this is the same API used inside the
org.apache.hadoop.util.RunJar class, where my jar is rejected and the error
is thrown!!!

Also find attached the sample java file - RunJar1.java
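
For reference, a check of that sort can be as small as the following sketch
(inlined here since attachments do not go through the list; this is not the
actual RunJar1.java):

import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class JarCheck {
  public static void main(String[] args) throws Exception {
    // the same java.util.jar.JarFile call that org.apache.hadoop.util.RunJar uses internally
    JarFile jar = new JarFile(args.length > 0 ? args[0] : "AggregateWordCount.jar");
    Manifest mf = jar.getManifest();
    System.out.println("entries: " + jar.size());
    System.out.println("Main-Class: "
        + (mf == null ? "(no manifest)" : mf.getMainAttributes().getValue("Main-Class")));
    jar.close();
  }
}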

Thank You,
Shravan Kumar. M 
Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
-Original Message-
From: Ramakishore Yelamanchilli [mailto:kyela...@cisco.com] 
Sent: Monday, June 22, 2009 5:04 PM
To: core-user@hadoop.apache.org; shravan.mahank...@catalytic.com; 'Harish
Mallipeddi'
Subject: RE: java.io.IOException: Error opening job jar

Can you attach the jar file you have?

-Ram

-Original Message-
From: Shravan Mahankali [mailto:shravan.mahank...@catalytic.com] 
Sent: Monday, June 22, 2009 3:49 AM
To: 'Harish Mallipeddi'; core-user@hadoop.apache.org
Subject: RE: java.io.IOException: Error opening job jar

Thanks for your reply Harish.

 

I am running this example from within the directory containing the
AggregateWordCount.jar file, but even then I have this issue. Earlier I had a
java.lang.ClassNotFoundException for
org.apache.hadoop.examples.AggregateWordCount$WordCountPlugInClass, so in
some thread someone suggested using -libjars; I tried it, but that did not
help either!!!

 

I did not think it would be SUCH A HARD JOB (almost 20+ hours with no success)
just to run an example provided with the Hadoop distribution!!!

 

Thank You,

Shravan Kumar. M 

Catalytic Software Ltd. [SEI-CMMI Level 5 Company]


  _  

From: Harish Mallipeddi [mailto:harish.mallipe...@gmail.com] 
Sent: Monday, June 22, 2009 4:03 PM
To: core-user@hadoop.apache.org; shravan.mahank...@catalytic.com
Subject: Re: java.io.IOException: Error opening job jar

 

It cannot find your job jar file. Make sure you run this command from the
directory that has the AggregateWordCount.jar (and you can lose the -libjars
flag too - you need that only if you need to specify extra jar dependencies
apart from your job jar file).

- Harish

On Mon, Jun 22, 2009 at 3:45 PM, Shravan Mahankali
shravan.mahank...@catalytic.com wrote:

Hi Group,



I was having trouble getting through an example Hadoop program. I have
searched the mailing list but could not find anything useful. Below is the
issue:



1) Executed below command to submit a job to Hadoop:

 /hadoop-0.18.3/bin/hadoop jar -libjars AggregateWordCount.jar
org.apache.hadoop.examples.AggregateWordCount words/*
aggregatewordcount_output 2 textinputformat



2) Following is the error:

java.io.IOException: Error opening job jar:
org.apache.hadoop.examples.AggregateWordCount

   at org.apache.hadoop.util.RunJar.main(RunJar.java:90)

   at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)

   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

   at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

Caused by: java.util.zip.ZipException: error in opening zip file

   at java.util.zip.ZipFile.open(Native Method)

   at java.util.zip.ZipFile.init(ZipFile.java:203)

   at java.util.jar.JarFile.init(JarFile.java:132)

   at 

Re: Slides/Videos of Hadoop Summit

2009-06-22 Thread jaideep dhok
Thanks for the link.

-
JD

On Tue, Jun 23, 2009 at 1:55 AM, Alex Loddengaard a...@cloudera.com wrote:
 The Cloudera talks are here:

 
 http://www.cloudera.com/blog/2009/06/22/a-great-week-for-hadoop-summit-west-roundup/


 As for the rest, I'm not sure.

 Alex

 On Sun, Jun 21, 2009 at 11:46 PM, jaideep dhok jdd...@gmail.com wrote:

 Hi all,
 Are the slides or videos of the talks given at Hadoop Summit available
 online? I checked the Yahoo! website for the summit but could not find
 any links.

 Regards,
 --
 Jaideep





-- 
- JDD