Please try the attached patch and let me know if it works.

Thanks,
dhruba

-----Original Message-----
From: KrzyCube [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 24, 2007 6:19 PM
To: [email protected]
Subject: Re: Calling FsShell.doMain() hold so many threads

First of all, thanks, Raghu.
Here's the exception info:

------------------------------------------------------------------------
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:116)
    at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.initialize(DistributedFileSystem.java:67)
    at org.apache.hadoop.fs.FilterFileSystem.initialize(FilterFileSystem.java:57)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:160)
    at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:119)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:91)
    at org.apache.hadoop.fs.FsShell.init(FsShell.java:41)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:809)
    at kingsoft.lab.duba.CustomInterface.CreateDir(CustomInterface.java:138)
    at kingsoft.lab.duba.CustomInterface.main(CustomInterface.java:155)
------------------------------------------------------------------------

So, is there a recommended API for this kind of use?
By "this" I mean uploading or downloading files and creating directories
programmatically, including from concurrent operations.


Raghu Angadi wrote:
>
> Can you get the stack trace of the threads that are left? It was not
> obvious from the code where a thread is started. It might be the 'trash
> handler'.
>
> You could add sleep(10sec) to give you enough time to get the trace.
>
> FsShell might not be designed for this use, but it seems like a pretty
> useful feature.
>
> Raghu.
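
A minimal sketch of the suggestion above (getting the stack traces of the threads that are left), using only the JDK. The class name ThreadDump and the "leaked-worker" thread are hypothetical stand-ins and are not part of Hadoop; in the real program, dump() would be called after the FsShell calls have returned.

```java
import java.util.Map;

// Hypothetical helper, not part of Hadoop: lists every live thread in this JVM.
final class ThreadDump {

    /** Returns one header line per live thread, followed by its stack frames. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" daemon=").append(t.isDaemon())
               .append(" state=").append(t.getState())
               .append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a thread that a client library leaves running.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // exit quietly when interrupted
            }
        }, "leaked-worker");
        leaked.setDaemon(true);
        leaked.start();

        System.out.println("live threads: " + Thread.getAllStackTraces().size());
        System.out.print(dump());
    }
}
```

Calling dump() every few thousand iterations of the mkdir loop should show which thread names keep accumulating and where each one is parked.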
>
> KrzyCube wrote:
>> I have tried the way TestDFSShell.java does it;
>> here's my code:
>>
>> ------------------------------------------------------------
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FsShell;
>>
>> public class CustomInterface
>> {
>>     Configuration conf;
>>     FsShell fs;
>>
>>     public CustomInterface()
>>     {
>>         conf = new Configuration();
>>         fs = new FsShell();
>>
>>         fs.setConf(conf);
>>     }
>>
>>     public int createDir(String strDirName, String strPath)
>>     {
>>         // omit exception catch
>>         int iRet = 0;
>>         strPath += strDirName;
>>         String[] strCmd = new String[2];
>>         strCmd[0] = "-mkdir";
>>         strCmd[1] = strPath;
>>         return fs.run(strCmd);
>>     }
>> }
>> ------------------------------------------------------------
>>
>> Then I just call the createDir method:
>>
>> for (int i = 0; i < 100000; i++)
>> {
>>     custom.createDir("someName");
>> }
>>
>> This causes the JVM process to hold many threads,
>> and these threads eat memory
>> until the JVM heap is used up and exceptions are thrown.
>> A larger heap size only allows more threads; it does not fix the problem.
>>
>> Thanks.
>>
>>
>> Dhruba Borthakur wrote:
>>> One example of programmatically using FsShell is in
>>> src/test/org/apache/hadoop/dfs/TestDFSShell.java
>>>
>>> Thanks,
>>> dhruba
>>>
>>> -----Original Message-----
>>> From: KrzyCube [mailto:[EMAIL PROTECTED]
>>> Sent: Monday, July 23, 2007 7:49 PM
>>> To: [email protected]
>>> Subject: Calling FsShell.doMain() hold so many threads
>>>
>>>
>>> Hi there:
>>>
>>> I have two questions:
>>>
>>> Q1:
>>> I am trying to call FsShell.doMain() from my own code, which is only
>>> a simple wrapper around FsShell.
>>> But when I try to create many dirs (10000 etc.), an exception like
>>> "Not enough memory for more threads" is thrown, even though I have
>>> set -Xmx512m.
>>> I then looked at the process info while the program was running and
>>> found that more and more threads are started during the run and eat
>>> more and more memory; all the threads stay there without exiting.
>>> Then I went to the source code and found that FsShell.main(), for
>>> the terminal call, has one line:
>>> "System.exit(return_value_of_doMain)".
>>> Does that mean the call to ToolBase.run(), which is implemented in
>>> FsShell.java, always creates a new thread that has to be forcibly
>>> terminated by System.exit() killing the process?
>>> If so, how can I write my own code to use Hadoop with FsShell in
>>> multi-threaded mode, or is there any other way to do this?
>>>
>>> Q2:
>>> I checked out the code from svn and run it in Eclipse (I mention
>>> Eclipse only to indicate my environment) under Ubuntu 7.04.
>>> Just out of curiosity, I wanted to see how much time FsShell.doMain()
>>> takes. I use "new Date()" and get the interval with
>>> "DateEnd.getTime() - DateBeg.getTime()".
>>> I found that even mkdir takes more than 1000 ms [as getTime shows].
>>> With no arguments it takes 25 ms, but even if I just give it a wrong
>>> argument, such as "-sl", it takes more than 1000 ms. Does that mean
>>> the argument check accounts for most of the time?
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Calling-FsShell.doMain%28%29-hold-so-many-threads-tf4133557.html#a11756139
>>> Sent from the Hadoop Users mailing list archive at Nabble.com.
>>>
>>
>

--
View this message in context: http://www.nabble.com/Calling-FsShell.doMain%28%29-hold-so-many-threads-tf4133557.html#a11774684
Sent from the Hadoop Users mailing list archive at Nabble.com.
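
On the timing question (Q2), the thread does not establish whether the 1000 ms goes to argument checking or to something else, such as one-time set-up on the first call. A minimal sketch of how the two could be told apart, assuming only the JDK: time the first call separately from repeated calls, using System.nanoTime() rather than new Date(). The class CallTimer and the sleeping placeholder are hypothetical; the placeholder would be replaced by the real call into the shell.

```java
// Hypothetical timing helper; the names are illustrative and not part of Hadoop.
final class CallTimer {

    /** Runs the action once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Placeholder for the real call, e.g. a wrapper around the shell's run(String[]).
        Runnable call = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        // The first call is timed on its own because it may include one-time set-up.
        long first = timeMillis(call);

        // Later calls are averaged to estimate the steady-state cost per call.
        int repeats = 5;
        long total = 0;
        for (int i = 0; i < repeats; i++) {
            total += timeMillis(call);
        }

        System.out.println("first call: " + first + " ms");
        System.out.println("later calls (avg): " + (total / repeats) + " ms");
    }
}
```

If the first call is much slower than the average of the later ones, the cost is more likely set-up than argument parsing; if every call is equally slow, the cost is paid per call.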
