Ok, I created http://issues.apache.org/jira/browse/HADOOP-1666 to track this issue. I also attached a patch to that issue.
Thanks,
dhruba

-----Original Message-----
From: KrzyCube [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 29, 2007 7:46 PM
To: [email protected]
Subject: RE: Calling FsShell.doMain() hold so many threads

Although a little late, I tested the second patch and it really works fine.
But I still have a question: who is really holding the threads?
Is the reason that the FsShell instance cannot be disposed? Or the DFSClient instance?

Dhruba Borthakur wrote:
>
> Ok, can you please remove the earlier patch I gave you and use this
> modified patch instead? This should work.
>
> Thanks,
> Dhruba
>
> -----Original Message-----
> From: KrzyCube [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 25, 2007 12:51 AM
> To: [email protected]
> Subject: RE: Calling FsShell.doMain() hold so many threads
>
> Hi, dhruba
>
> I have tried the patch [restartableFsShell.patch], but the problem is still
> there.
>
> I stepped through the code in debug mode: the "fs = null" statements in both
> init() and the finally block are all hit, yet the threads are still created.
>
> So I think it must be some other problem.
> I will give a more detailed description later, with my code, my exceptions,
> and the snapshot I captured under Windows XP
> [only because I don't know how to view the thread count of a process under
> Ubuntu Linux].
>
> Dhruba Borthakur wrote:
>>
>> Please try this attached patch, let me know if it works.
>>
>> Thanks,
>> dhruba
>>
>> -----Original Message-----
>> From: KrzyCube [mailto:[EMAIL PROTECTED]
>> Sent: Tuesday, July 24, 2007 6:19 PM
>> To: [email protected]
>> Subject: Re: Calling FsShell.doMain() hold so many threads
>>
>> First of all, thanks, Raghu.
>>
>> Here's the exception info:
>> ------------------------------------------------------------------------
>> Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
>>     at java.lang.Thread.start0(Native Method)
>>     at java.lang.Thread.start(Unknown Source)
>>     at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:116)
>>     at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.initialize(DistributedFileSystem.java:67)
>>     at org.apache.hadoop.fs.FilterFileSystem.initialize(FilterFileSystem.java:57)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:160)
>>     at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:119)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:91)
>>     at org.apache.hadoop.fs.FsShell.init(FsShell.java:41)
>>     at org.apache.hadoop.fs.FsShell.run(FsShell.java:809)
>>     at kingsoft.lab.duba.CustomInterface.CreateDir(CustomInterface.java:138)
>>     at kingsoft.lab.duba.CustomInterface.main(CustomInterface.java:155)
>> ------------------------------------------------------------------------
>>
>> Then, is there any recommended API for this use?
>> By "this" I mean: uploading or downloading files and creating directories
>> programmatically, even under concurrent operation.
>>
>> Raghu Angadi wrote:
>>>
>>> Can you get the stack trace of the threads that are left? It was not
>>> obvious from the code where a thread is started. It might be 'trash
>>> handler'.
>>>
>>> You could add sleep(10sec) to give you enough time to get the trace.
>>>
>>> FsShell might not be designed for this use, but seems like a pretty
>>> useful feature.
>>>
>>> Raghu.
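
Regarding the request above for a stack trace of the threads that are left: one way to get it from inside the program, without any OS-specific tool, is the standard `Thread.getAllStackTraces()` call. The sketch below is only an illustration; the class name `LiveThreadDump` and its `dump()` helper are made up for this example and are not part of Hadoop. It could be called after the loop of `FsShell.run()` calls (or after the suggested sleep) to see which threads remain and where they are blocked.

```java
import java.util.Map;

// Hypothetical helper, not part of Hadoop: lists every live thread in this JVM.
class LiveThreadDump {

    /** Returns the name, daemon flag, state and stack of each live thread as text. */
    static String dump() {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        StringBuilder sb = new StringBuilder();
        sb.append("live threads: ").append(traces.size()).append('\n');
        for (Map.Entry<Thread, StackTraceElement[]> entry : traces.entrySet()) {
            Thread t = entry.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" daemon=").append(t.isDaemon())
              .append(" state=").append(t.getState())
              .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In the real program this would run after the FsShell calls have finished.
        System.out.print(dump());
    }
}
```

From outside the process, the JDK's `jstack <pid>` tool or sending `kill -QUIT <pid>` to a HotSpot JVM on Linux should give a similar dump, which also covers the question of how to inspect threads under Ubuntu.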
>>>
>>> KrzyCube wrote:
>>>> I have tried the way TestDFSShell.java does it;
>>>> here's my code:
>>>>
>>>> ------------------------------------------------------------
>>>> import org.apache.hadoop.conf.Configuration;
>>>> import org.apache.hadoop.fs.FsShell;
>>>>
>>>> public class CustomInterface
>>>> {
>>>>     Configuration conf;
>>>>     FsShell fs;
>>>>
>>>>     public CustomInterface()
>>>>     {
>>>>         conf = new Configuration();
>>>>         fs = new FsShell();
>>>>
>>>>         fs.setConf(conf);
>>>>     }
>>>>
>>>>     public int createDir(String strDirName, String strPath)
>>>>     {
>>>>         // omit exception catch
>>>>         int iRet = 0;
>>>>         strPath += strDirName;
>>>>         String[] strCmd = new String[2];
>>>>         strCmd[0] = "-mkdir";
>>>>         strCmd[1] = strPath;
>>>>         return fs.run(strCmd);
>>>>     }
>>>> }
>>>> ------------------------------------------------------------
>>>>
>>>> Then I just call the createDir method:
>>>>
>>>> for(int i =0 ; i < 100000 ; i ++)
>>>> {
>>>>     custom.createDir("someName");
>>>> }
>>>>
>>>> This causes the Java VM process to hold many threads,
>>>> and these threads eat memory
>>>> until the JVM heap is eaten up and exceptions are thrown.
>>>> A larger heap size only allows more threads; it does not fix the problem.
>>>>
>>>> Thanks.
>>>>
>>>> Dhruba Borthakur wrote:
>>>>> One example of programmatically using FsShell is in
>>>>> src/test/org/apache/hadoop/dfs/TestDFSShell.java
>>>>>
>>>>> Thanks,
>>>>> dhruba
>>>>>
>>>>> -----Original Message-----
>>>>> From: KrzyCube [mailto:[EMAIL PROTECTED]
>>>>> Sent: Monday, July 23, 2007 7:49 PM
>>>>> To: [email protected]
>>>>> Subject: Calling FsShell.doMain() hold so many threads
>>>>>
>>>>> Hi there:
>>>>>
>>>>> I have two questions:
>>>>>
>>>>> Q1:
>>>>> I am trying to call FsShell.doMain() from my own code, which is only
>>>>> a simple wrapper around FsShell.
>>>>> But when I try to create many dirs, 10000 etc., an exception like "Not
>>>>> enough memory for more threads" is thrown, even though I have set -Xmx512m.
>>>>> Then I looked at the process info while the program was running and
>>>>> found that more and more threads are created during the run, eating
>>>>> more and more memory; all of the threads stay there without exiting.
>>>>> Then I went to the source code and found that in FsShell.main(), for the
>>>>> terminal call, there is one line "System.exit(return_value_of_doMain)".
>>>>> Does that mean a call to ToolBase.run(), which is implemented in
>>>>> FsShell.java, always creates a new thread that has to be forcibly
>>>>> terminated by System.exit() killing the process?
>>>>> If so, how can I write my own code to use Hadoop with FsShell
>>>>> in multi-threaded mode, or is there any other way to do this?
>>>>>
>>>>> Q2:
>>>>> I checked out the code from svn and ran it in Eclipse [the only reason I
>>>>> mention Eclipse is to indicate my environment], under Ubuntu 7.04.
>>>>> Just out of curiosity, I wanted to see how much time FsShell.doMain()
>>>>> takes, so I used "new Date()" and computed the interval with
>>>>> "DateEnd.getTime() - DateBeg.getTime()".
>>>>> Then I found that even mkdir takes more than 1000 ms [per getTime].
>>>>> With no arguments it takes 25 ms, but even if I just give it a wrong
>>>>> argument, such as "-sl", it takes more than 1000 ms. Does that mean the
>>>>> argument check accounts for most of the time?
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://www.nabble.com/Calling-FsShell.doMain%28%29-hold-so-many-threads-tf4133557.html#a11756139
>>>>> Sent from the Hadoop Users mailing list archive at Nabble.com.
>>>>>
>>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Calling-FsShell.doMain%28%29-hold-so-many-threads-tf4133557.html#a11774684
>> Sent from the Hadoop Users mailing list archive at Nabble.com.
>>
>
> --
> View this message in context:
> http://www.nabble.com/Calling-FsShell.doMain%28%29-hold-so-many-threads-tf4133557.html#a11778036
> Sent from the Hadoop Users mailing list archive at Nabble.com.
>

--
View this message in context:
http://www.nabble.com/Calling-FsShell.doMain%28%29-hold-so-many-threads-tf4133557.html#a11857398
Sent from the Hadoop Users mailing list archive at Nabble.com.
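
On the question of who holds the threads: the stack trace quoted in this thread shows `Thread.start()` being called from the `DFSClient` constructor, which is reached through `FsShell.init()` and `FileSystem.get()` on every `FsShell.run()` call. That suggests each call creates a new client with its own background thread, and that those threads are never stopped. The sketch below illustrates only this general pattern with plain JDK classes. `BackgroundClient` is a hypothetical stand-in, not Hadoop's `DFSClient`, and nothing here is taken from the HADOOP-1666 patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Illustration only: shows how per-call clients with background threads pile up.
class ThreadLeakSketch {

    /** Hypothetical client that starts a worker thread in its constructor. */
    static class BackgroundClient {
        private final CountDownLatch stop = new CountDownLatch(1);
        private final Thread worker;

        BackgroundClient() {
            worker = new Thread(() -> {
                try {
                    stop.await(); // stays alive until close() is called
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "background-client-worker");
            worker.start();
        }

        boolean isWorkerAlive() {
            return worker.isAlive();
        }

        /** Stops the worker and waits for it to exit; safe to call more than once. */
        void close() {
            stop.countDown();
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Creates one client per call and reports how many worker threads are still alive. */
    static int aliveWorkersAfter(int calls, boolean closeEachClient) {
        List<BackgroundClient> created = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            BackgroundClient client = new BackgroundClient();
            created.add(client);
            if (closeEachClient) {
                client.close();
            }
        }
        int alive = 0;
        for (BackgroundClient client : created) {
            if (client.isWorkerAlive()) {
                alive++;
            }
        }
        // Clean up so this demo JVM can exit.
        for (BackgroundClient client : created) {
            client.close();
        }
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("never closed: " + aliveWorkersAfter(5, false) + " workers alive");
        System.out.println("closed each:  " + aliveWorkersAfter(5, true) + " workers alive");
    }
}
```

If the real cause matches this pattern, the usual remedies are to create the client once and reuse it across calls, or to close it explicitly after each use. Whether either option applies to a given Hadoop version would need to be checked against that version's source.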
