Team, I have written a MapReduce program. The scenario of my program is to emit <userid, seqid> pairs.
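Since the job emits one intermediate record per (user, seqid) combination, the map output volume is very large. Here is a quick back-of-envelope calculation I did (the ~20-byte serialized record size is only my assumption, not a measured value):

```java
// Rough estimate of the map output volume for this job.
// 825 users x 6,583,100 seqids, one emitted record per pair.
public class ShuffleEstimate {
    public static void main(String[] args) {
        long users = 825L;           // total distinct userids
        long seqIds = 6_583_100L;    // rows in the ObjectSequence table
        long records = users * seqIds;       // total map output records
        long bytesPerRecord = 20L;           // ASSUMED serialized size per record
        long totalBytes = records * bytesPerRecord;
        System.out.println("records=" + records
                + " approxGiB=" + (totalBytes / (1024L * 1024L * 1024L)));
        // prints: records=5431057500 approxGiB=101
    }
}
```

So even at ~20 bytes per record the map side has to spill on the order of 100 GiB, which is far more local disk than any single node in my cluster has free.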
Total no. of users: 825
Total no. of seqids: 6,583,100
Number of map output records the program will emit: 825 * 6,583,100

I have an HBase table called ObjectSequence, which consists of 6,583,100 rows, and I used TableMapper and TableReducer for my MapReduce program.

Problem definition:

Processor: i7
Replication factor: 1
Live datanodes: 3

Node     | Last Contact | Admin State | Configured Capacity (GB) | Used (GB) | Non-DFS Used (GB) | Remaining (GB) | Used (%) | Remaining (%) | Blocks
chethan  | 1 | In Service | 28.59 | 0.62 | 25.17 | 2.82 | 2.11 |  9.87 |  73
shashwat | 2 | In Service | 28.98 | 0.87 | 22.01 | 6.13 |  --  | 21.04 |  69
syed     | 0 | In Service | 28.98 | 4.29 | 18.37 | 6.32 | 14.8 | 21.82 | 129

When I run the balancer in Hadoop, I see that the blocks are not equally distributed. Can you tell me what the reason for this may be?

Task summary for job_201207121836_0007:

Kind   | % Complete | Num Tasks | Pending | Running | Complete | Killed | Failed/Killed Task Attempts
map    | 85.71%     | 7         | 0       | 1       | 6        | 0      | 3 / 1
reduce | 28.57%     | 1         | 0       | 1       | 0        | 0      | 0 / 0

I see that only 8 tasks were allocated in total (7 map + 1 reduce).
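Given how little space is left on chethan (2.82 GB remaining) and how large the map output is, I have been considering the task-tracker settings below. This is only a sketch of what I might try, not something I have verified, and the directory paths are illustrative for my machines:

```xml
<!-- mapred-site.xml (Hadoop 1.x property names; values are guesses to try) -->
<property>
  <name>mapred.local.dir</name>
  <!-- spread map spill files over more than one partition, so one
       nearly-full disk does not exhaust the spill directories -->
  <value>/data1/mapred/local,/data2/mapred/local</value>
</property>
<property>
  <name>io.sort.mb</name>
  <!-- larger in-memory sort buffer means fewer spill files (default 100) -->
  <value>200</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <!-- cap the child JVM heap so forking helper processes does not
       exhaust memory on these small nodes -->
  <value>-Xmx512m</value>
</property>
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <!-- more concurrent map slots per node (default 2) -->
  <value>4</value>
</property>
```

Would changing these in this direction be reasonable for a 3-node cluster of this size?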
Is there any possibility to increase the number of map tasks?

Completed map tasks:

Task                            | Complete | Status                    | Start Time           | Finish Time (duration)               | Counters
task_201207121836_0007_m_000001 | 100.00%  | UserID: 777 SEQID: 415794 | 12-Jul-2012 21:35:48 | 12-Jul-2012 21:36:12 (24sec)         | 16
task_201207121836_0007_m_000002 | 100.00%  | UserID: 777 SEQID: 422256 | 12-Jul-2012 21:35:50 | 12-Jul-2012 21:36:47 (57sec)         | 16
task_201207121836_0007_m_000003 | 100.00%  | UserID: 777 SEQID: 563544 | 12-Jul-2012 21:35:50 | 12-Jul-2012 22:00:08 (24mins, 17sec) | 16
task_201207121836_0007_m_000004 | 100.00%  | UserID: 777 SEQID: 592918 | 12-Jul-2012 21:35:50 | 12-Jul-2012 21:42:09 (6mins, 18sec)  | 16
task_201207121836_0007_m_000005 | 100.00%  | UserID: 777 SEQID: 618121 | 12-Jul-2012 21:35:50 | 12-Jul-2012 21:44:34 (8mins, 43sec)  | 16
task_201207121836_0007_m_000006 | 100.00%  | UserID: 777 SEQID: 685810 | 12-Jul-2012 21:36:12 | 12-Jul-2012 21:44:18 (8mins, 6sec)   | 16

Why is the last map task taking nearly 2 hours? Please give me some suggestions on how to optimize this. The task that is still running:
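On the slow and uneven map tasks: since the mappers scan HBase, I suspect scanner caching matters. In the HBase release I am running, the client-side default fetches very few rows per scanner RPC, so a full-table TableMapper scan pays an RPC round trip almost per row. A setting I am considering (the value is a guess, not tuned):

```xml
<!-- hbase-site.xml: rows fetched per scanner RPC for the job's scans
     (hbase.client.scanner.caching defaults to 1 in older HBase releases) -->
<property>
  <name>hbase.client.scanner.caching</name>
  <value>500</value>
</property>
```

Is this the right knob for a TableMapper job, or should the caching be set on the Scan object in the job driver instead?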
task_201207121836_0007_m_000000 | 0.00% | UserID: 482 SEQID: 99596 | started 12-Jul-2012 21:35:48

Errors:

java.io.IOException: Spill failed
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:205)
	at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:1)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill712.out
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
	at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)

java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: Cannot run program "/bin/ls": java.io.IOException: error=12, Cannot allocate memory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:475)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
	at org.apache.hadoop.util.Shell.run(Shell.java:182)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
	at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:703)
	at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:443)
	at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
	at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:251)
	at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:164)
	at java.lang.ProcessImpl.start(ProcessImpl.java:81)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:468)
	... 15 more
	at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:468)
	at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
	at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:251)
	at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

java.io.IOException: Spill failed
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:205)
	at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:1)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill934.out
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
	at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)

I saw these errors for the last task. What may be the reason for them?

NOTE: When I run the HBase table import, it takes 10 minutes.

Team, please give suggestions on what should be done to solve these issues.

Thanks and Regards,
S SYED ABDUL KATHER