Re: Hama Problem
Currently, eigenvalue decomposition is not implemented in Hama, so Step 4 is hard to migrate. I worked out an idea to bypass it: before Step 4 I can keep L as a DenseMatrix, and when I come to Step 4 I can transform L into a SubMatrix. Jama does support eigenvalue decomposition, although it is not parallel, so I can get the eigValues and eigVectors. But after that, in Step 5, the two matrices need to be sorted, and I want to use the HBase sort function. So how can I transform these two SubMatrix objects back into two DenseMatrix objects? Or is there another way?

/**
 * STEP 4
 * Calculate the eigen values and vectors of this covariance matrix
 *
 * % Get the eigenvectors (columns of Vectors) and eigenvalues (diag of Values)
 */
EigenvalueDecomposition eigen = L.eig();
eigValues = eigen.getD();
eigVectors = eigen.getV();

/**
 * STEP 5
 * % Sort the vectors/values according to size of eigenvalue
 */
Matrix[] eigDVSorted = sortem(eigValues, eigVectors);
eigValues = eigDVSorted[0];
eigVectors = eigDVSorted[1];

/**
 * STEP 6
 * % Convert the eigenvectors of A'*A into eigenvectors of A*A'
 */
eigVectors = A.times(eigVectors);

/**
 * STEP 7
 * % Get the eigenvalues out of the diagonal matrix and
 * % normalize them so the evalues are specifically for cov(A'), not A*A'.
 */
double[] values = diag(eigValues);
for (int i = 0; i < values.length; i++)
    values[i] /= A.getColumnDimension() - 1;

/**
 * STEP 8
 * % Normalize Vectors to unit length, kill vectors corr. to tiny evalues
 */
numEigenVecs = 0;
for (int i = 0; i < eigVectors.getColumnDimension(); i++) {
    Matrix tmp;
    if (values[i] < 0.0001) {
        tmp = new Matrix(eigVectors.getRowDimension(), 1);
    } else {
        tmp = eigVectors.getMatrix(0, eigVectors.getRowDimension() - 1, i, i).times(
                1 / eigVectors.getMatrix(0, eigVectors.getRowDimension() - 1, i, i).normF());
        numEigenVecs++;
    }
    eigVectors.setMatrix(0, eigVectors.getRowDimension() - 1, i, i, tmp);
    //eigVectors.timesEquals(1 / eigVectors.getMatrix(0, eigVectors.getRowDimension() - 1, i, i).normInf());
}
eigVectors = eigVectors.getMatrix(0, eigVectors.getRowDimension() - 1, 0, numEigenVecs - 1);
trained = true;

/*System.out.println("There are " + numGood + " eigenVectors\n\nEigenVectorSize");
System.out.println(eigVectors.getRowDimension());
System.out.println(eigVectors.getColumnDimension());
try {
    PrintWriter pw = new PrintWriter("c:\\tmp\\test.txt");
    eigVectors.print(pw, 8, 4);
    pw.flush();
    pw.close();
} catch (Exception e) {
    e.printStackTrace();
}
int width = pics[0].img.getWidth(null);
BufferedImage biAvg = imageFromMatrix(bigAvg.getArrayCopy()[0], width);
try {
    saveImage(new File("c:\\tmp\\test.jpg"), biAvg);
} catch (IOException e1) {
    e1.printStackTrace();
}*/
}

/**
 * Returns a number of eigenFace values to be used in a feature space
 * @param pic
 * @param number number of eigen feature values.
 * @return will be of length number or this.getNumEigenVecs, whichever is the smaller
 */
public double[] getEigenFaces(Picture pic, int number) {
    if (number > numEigenVecs) // adjust the number to the maximum number of eigenvectors available
        number = numEigenVecs;
    double[] ret = new double[number];
    double[] pixels = pic.getImagePixels();
    Matrix face = new Matrix(pixels, pixels.length);
    Matrix Vecs =
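(For context, the sortem call in Step 5 above just reorders the eigenvalue/eigenvector pair by descending eigenvalue. Below is a minimal sketch of such a helper using only the Jama API; the method name is taken from the code above, but this body is an illustration, not the poster's actual implementation.)

    // Hypothetical sortem helper (requires: import Jama.Matrix;).
    // D is the diagonal eigenvalue matrix and V holds the eigenvectors as columns,
    // as returned by Jama's EigenvalueDecomposition (eigen.getD(), eigen.getV()).
    public static Matrix[] sortem(Matrix D, Matrix V) {
        int n = D.getColumnDimension();
        int rows = V.getRowDimension();
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Selection sort of column indices by eigenvalue, largest first.
        for (int i = 0; i < n; i++) {
            int best = i;
            for (int j = i + 1; j < n; j++)
                if (D.get(order[j], order[j]) > D.get(order[best], order[best])) best = j;
            int t = order[i]; order[i] = order[best]; order[best] = t;
        }
        // Rebuild D and V with columns in the sorted order.
        Matrix sortedD = new Matrix(n, n);
        Matrix sortedV = new Matrix(rows, n);
        for (int j = 0; j < n; j++) {
            int src = order[j];
            sortedD.set(j, j, D.get(src, src));
            sortedV.setMatrix(0, rows - 1, j, j, V.getMatrix(0, rows - 1, src, src));
        }
        return new Matrix[] { sortedD, sortedV };
    }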
input/output error while setting up superblock
Hi, I have a problem using HDFS. I mounted HDFS using fuse-dfs. I created a dummy file for Xen in HDFS and then formatted the dummy file using 'mke2fs', but the operation failed. The error message is as follows:

[r...@localhost hdfs]# mke2fs -j -F ./file_dumy
mke2fs 1.40.2 (12-Jul-2007)
./file_dumy: Input/output error while setting up superblock

Also, I copied a Xen image file to HDFS, but Xen couldn't use the image file in HDFS.

[r...@localhost hdfs]# fdisk -l fedora6_demo.img
last_lba(): I don't know how to handle files with mode 81a4
You must set cylinders. You can do this from the extra functions menu.
Disk fedora6_demo.img: 0 MB, 0 bytes
255 heads, 63 sectors/track, 0 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
fedora6_demo.img1 * 1 156 1253038+ 83 Linux

Could you tell me anything about this problem? Thank you.
Re: input/output error while setting up superblock
I don't think HDFS is a good place to store your Xen image file, as it will likely be updated/appended frequently in small blocks. Given the way HDFS is designed, you can't quite use it like a regular filesystem (e.g. one that supports frequent small-block appends/updates within files). My suggestion is to use other storage such as NAS or SAN.

/Taeho

2009/5/22 신승엽 mikas...@naver.com
Re: Hama Problem
Hi, before considering this, let's talk about your problem and why you want to use these. If your application isn't huge, then I think an MPI-based matrix package could be much more helpful to you, since the Hama concept is also large scale, not high performance for small matrices. And have you tried subscribing/mailing here: http://incubator.apache.org/hama/mailing_lists.html

On Fri, May 22, 2009 at 4:51 PM, ykj ykj...@163.com wrote:

Currently, eigenvalue decomposition is not implemented in Hama, so Step 4 is hard to migrate. I worked out an idea to bypass it: before Step 4 I can keep L as a DenseMatrix, and when I come to Step 4 I can transform L into a SubMatrix. Jama does support eigenvalue decomposition, although it is not parallel, so I can get the eigValues and eigVectors. But after that, in Step 5, the two matrices need to be sorted, and I want to use the HBase sort function. So how can I transform these two SubMatrix objects back into two DenseMatrix objects? Or is there another way?
Re: ssh issues
Well, I made the SSH keys with passphrases, because the systems I need to log in to require passphrase-protected keys, and those systems have to be part of my cluster. So I need a way to specify -i /path/to/key and the passphrase to Hadoop beforehand.

Pankil

On Thu, May 21, 2009 at 9:35 PM, Aaron Kimball aa...@cloudera.com wrote:

Pankil, that means that either you're using the wrong SSH key and it's falling back to password authentication, or else you created your SSH keys with passphrases attached; try making new SSH keys with ssh-keygen and distributing those to start again? - Aaron

On Thu, May 21, 2009 at 3:49 PM, Pankil Doshi forpan...@gmail.com wrote:

The problem is that it also prompts for the passphrase.

On Thu, May 21, 2009 at 2:14 PM, Brian Bockelman bbock...@cse.unl.edu wrote:

Hey Pankil, use ~/.ssh/config to set the default key location to the proper place for each host, if you're going down that route. I'd remind you that SSH is only used as a convenient method to launch daemons. If you have a preferred way to start things up on your cluster, you can use that (I think most large clusters don't use ssh... could be wrong). Brian

On May 21, 2009, at 2:07 PM, Pankil Doshi wrote:

Hello everyone, I got a hint on how to solve the problem where clusters have different usernames, but the other problem I face is that I can only ssh to a machine by using -i /path/to/key; I can't ssh to them directly, I always have to pass the key. Now I have a problem ssh-ing to my machines. Does anyone have any ideas how to deal with that?

Regards, Pankil
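(Regarding Brian's ~/.ssh/config suggestion quoted above: a per-host entry along these lines makes ssh pick up the right key automatically. The host name and key path are placeholders, not values from this thread.)

    # ~/.ssh/config
    Host slave1.example.com
        User hadoop
        IdentityFile ~/.ssh/hadoop_cluster_key

The passphrase itself cannot be stored in ssh_config; for passphrase-protected keys you would normally load the key into an agent first, e.g. eval `ssh-agent` followed by ssh-add ~/.ssh/hadoop_cluster_key, before running the Hadoop start scripts.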
Re: ssh issues
Well, you are trying to manage a system whose security policy is incompatible with Hadoop's current shell scripts. If you push out the configs and manage the lifecycle using other tools, this becomes a non-issue. Don't raise the topic of HDFS security with your ops team, though, as they will probably be unhappy about what is currently on offer.

-steve
Re: ssh issues
Steve,

Security through obscurity is always a good practice from a development standpoint, and one of the reasons why tricking you out is an easy task. Please keep hiding relevant details from people in order to keep everyone smiling.

Hal
Can not start task tracker because java.lang.NullPointerException
Version 19.1 with patches:
- 4780-2v19.patch (Jira 4780)
- closeAll3.patch (Jira 3998)

I have confirmed that the https://issues.apache.org/jira/browse/HADOOP-4924 patch is in, so that is not the fix. We are having task trackers die every night with a NullPointerException, usually 2 or so out of 8 (25% each night). Here are the logs:

2009-05-22 02:46:49,911 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_200905211749_0451
2009-05-22 02:46:49,911 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200905211749_0451_m_00_0 done; removing files.
2009-05-22 02:46:54,911 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200905211749_0421/attempt_200905211749_0421_r_09_0/output/file.out in any of the configured local directories
2009-05-22 02:47:13,968 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_200905211749_0444
2009-05-22 02:47:13,969 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200905211749_0444_m_00_0 done; removing files.
2009-05-22 02:47:18,968 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200905211749_0421/attempt_200905211749_0421_r_09_0/output/file.out in any of the configured local directories
2009-05-22 02:48:52,324 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200905211749_0421/attempt_200905211749_0421_r_09_0/output/file.out in any of the configured local directories
2009-05-22 02:49:10,779 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200905211749_0452_m_06_0 task's state:UNASSIGNED
2009-05-22 02:49:10,779 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200905211749_0452_m_06_0
2009-05-22 02:49:10,779 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 4 and trying to launch attempt_200905211749_0452_m_06_0
2009-05-22 02:49:15,274 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_200905211749_0452_m_1998728288 spawned.
2009-05-22 02:49:15,765 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_200905211749_0452_m_1998728288 given task: attempt_200905211749_0452_m_06_0
2009-05-22 02:49:15,781 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200905211749_0421/attempt_200905211749_0421_r_09_0/output/file.out in any of the configured local directories
2009-05-22 02:49:15,781 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200905211749_0452/attempt_200905211749_0452_m_06_0/output/file.out in any of the configured local directories
2009-05-22 02:49:19,784 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200905211749_0452_m_06_0 1.0% hdfs://ec2-75-101-247-52.compute-1.amazonaws.com:54310/paragraphInstances/2009-05-22/rollup#06h#3e04c188-245a-4856-9a54-2fec60e85e3d.seq:0+9674259
2009-05-22 02:49:19,785 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_200905211749_0452_m_06_0 is done.
2009-05-22 02:49:19,785 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_200905211749_0452_m_06_0 was 0
2009-05-22 02:49:19,787 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 4
2009-05-22 02:49:19,954 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_200905211749_0452_m_1998728288 exited. Number of tasks it ran: 1
2009-05-22 02:59:19,297 INFO org.apache.hadoop.mapred.TaskTracker: Recieved RenitTrackerAction from JobTracker
2009-05-22 02:59:19,298 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
        at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:2300)
        at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.jobHasFinished(TaskTracker.java:2273)
        at org.apache.hadoop.mapred.TaskTracker.close(TaskTracker.java:840)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1728)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2785)
2009-05-22 02:59:19,300 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down TaskTracker at domU-12-31-38-01-AD-91/10.253.178.95
/
Re: ssh issues
Pankil, I used to be very confused by Hadoop and SSH keys. SSH is NOT required; each component can be started by hand. This gem of knowledge is hidden away among the hundreds of Digg-style articles entitled 'HOW TO RUN A HADOOP MULTI-MASTER CLUSTER!'. The SSH keys are only required by the shell scripts shipped with Hadoop, like start-all; they are wrappers that kick off other scripts on a list of nodes. I PERSONALLY dislike using SSH keys as a software component and believe they should only be used by administrators.

We chose the Cloudera distribution (http://www.cloudera.com/distribution). A big factor behind this was the simple init.d scripts they provide: each Hadoop component has its own start script (hadoop-namenode, hadoop-datanode, etc.). My suggestion is to take a look at the Cloudera startup scripts. Even if you decide not to use the distribution, you can take a look at their startup scripts and fit them to your needs.
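(To make the "each component can be started by hand" point concrete: with a stock Apache 0.19 tarball, the per-daemon script can be run locally on each node, with no SSH keys involved. The commands below are the standard hadoop-daemon.sh invocations; paths assume the default tarball layout.)

    # on the master node
    bin/hadoop-daemon.sh start namenode
    bin/hadoop-daemon.sh start jobtracker

    # on each slave node (run locally, no ssh required)
    bin/hadoop-daemon.sh start datanode
    bin/hadoop-daemon.sh start tasktracker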
Re: input/output error while setting up superblock
More specifically: HDFS does not support operations such as opening a file for write/append after it has already been closed, or seeking to a new location in a writer. You can only write files linearly; all other operations will return a "not supported" error. You'll also find that random-access read performance, while implemented, is not particularly high-throughput. For serving Xen images, even in read-only mode, you'll likely have much better luck with a different FS.

- Aaron

2009/5/22 Taeho Kang tka...@gmail.com
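(A minimal illustration of the write-once, linear-write model described above, using the standard Hadoop FileSystem API; the path is a placeholder and the snippet is only a sketch, not code from this thread.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsLinearWrite {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up hadoop-site.xml from the classpath
            FileSystem fs = FileSystem.get(conf);
            Path p = new Path("/tmp/example.dat");      // placeholder path
            FSDataOutputStream out = fs.create(p);      // bytes can only be streamed front to back
            out.writeUTF("written once, linearly");
            out.close();                                // once closed, the file cannot be reopened for
                                                        // write/append, and a writer can never seek backwards
            fs.close();
        }
    }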
Tutorial on building an AMI
Hello, is there a tutorial available for building a Hadoop AMI (like Cloudera's)? Cloudera has an 18.2 AMI, and for reasons I understand they can't provide (as of now) AMIs for higher Hadoop versions until they become stable. I would like to create an AMI for 19.2, so I was hoping there is a guide for building one. Thank you, Saptarshi Guha
Re: Can not start task tracker because java.lang.NullPointerException
Hi Lance, is it possible that your mapred.local.dir is in /tmp and you have a cron job that cleans it up at night (the default on many systems)?

Thanks,
-Todd
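(If mapred.local.dir does turn out to live under /tmp, a common fix is to point it at a directory that nightly cleanup jobs will not touch, e.g. in hadoop-site.xml; the path below is only an example.)

    <property>
      <name>mapred.local.dir</name>
      <value>/dist/app/hadoop-0.19.1/mapred/local</value>
      <description>Keep intermediate MapReduce data out of /tmp so tmpwatch/cron cleanup cannot delete live task files.</description>
    </property>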
Re: Can not start task tracker because java.lang.NullPointerException
Sure, I'll try out 19.2... but where is it? I don't see it here: http://svn.apache.org/repos/asf/hadoop/core/ (looking under tags)

On Fri, May 22, 2009 at 2:11 PM, Todd Lipcon t...@cloudera.com wrote:

Hi Lance, it's possible this is related to the other JIRA (HADOOP-5761). If it's not too much trouble to try out the 19.2 branch from SVN, it would be helpful in determining whether this is a problem that's already fixed or whether you've discovered something new. Thanks -Todd

On Fri, May 22, 2009 at 2:01 PM, Lance Riedel la...@dotspots.com wrote:

Hi Todd, we had looked at that before; here is the location of the tmp directory:

[dotsp...@domu-12-31-38-00-80-21 hadoop-0.19.1]$ du -sh /dist/app/hadoop-0.19.1/tmp
248G    /dist/app/hadoop-0.19.1/tmp

There are no cron jobs that would have anything to do with that directory. Here is /tmp:

[dotsp...@domu-12-31-38-00-80-21 hadoop-0.19.1]$ du -sh /tmp
204K    /tmp

Does this look like a disk error? I had seen that the org.apache.hadoop.util.DiskChecker$DiskErrorException is bogus. Thanks! Lance