Kurt, we are facing the exact same problem. We use OCFS2 (ocfs2-2.6.9-34.ELsmp-1.2.3-1) for a shared APPL_TOP in an 11i environment.
We hit the problem when trying to do backups! Is there a timeframe for the release in which this OOM problem will be resolved? Any advice on the best approach for backing up an OCFS2 volume with hundreds of thousands of files, such as an APPL_TOP (or multiple APPL_TOPs, as in our case)?

TIA
Peter

--- Alexander Finger <[EMAIL PROTECTED]> wrote:

> Hello!
>
> Thanks for the fast reply.
>
> Kurt Hackel wrote:
> > Hi,
> >
> > Alexander Finger wrote:
> >> Hello,
> >>
> >> my problem: When I want to create a large number of small files on
> >> any node at my ocfs2 cluster, after some time the oom killer starts
> >> killing processes because of low LowMem. All error messages and
> >> memory stats are at the end of this mail.
> >
> > This is a known issue that is being currently fixed for the next
> > scheduled release. At this time, once a node masters a lock resource
> > (from the filesystem this would happen if the node were the first node
> > to access that file) it cannot drop the mastery of that resource until
> > it unmounts. The fix is nontrivial but I'm almost done with it. Once
> > the fix is done it will need extensive testing.
> This is very bad... I have prepared the whole cluster (9 nodes) already
> and thought I was "close" to deployment... while functional testing the
> cluster's behavior was "normal" (bonnie & iozone reported good results)
> after setting the scheduler to deadline, and doing other fine tuning it
> crashed within minutes when I tried to copy our production data into it.
> I need just minutes to crash the cluster because I need the cluster to
> hold about 10 million files (each about 3-5 kB).
>
> So I would suggest you send your fix to me for testing... once it's
> done. ;-) ... please!
>
> >> The only way to avoid this behavior is to unmount the ocfs2 partition
> >> after some disk operations, because LowMem (LowFree) stays low until
> >> unmount... I searched the web and found many descriptions of this
> >> error, but no answer how to handle this problem.
> > Correct. The only current workaround is to unmount, or to attempt to
> > spread the lock resources out across all the nodes of the cluster
> > (which may be impossible in your usage case).
> Wonderful, how can I spread the resources? I did recognize such an
> option at the documentation. The ocfs2 volume is needed "just" to store
> a fast changing and very large directory tree, containing metadata files
> (xml). I do not use it (at this point) for database(s) or anything
> else. The cluster has a size of ~ 290 GB. If you need further
> information to explain if spreading the lock resources to other nodes or
> not may help me, I'll be happy to send them to you.
>
> Best regards,
>
> Alexander
>
> --
> Fotofinder GmbH          USt-IdNr. DE812854514
> Software Entwicklung     Web: http://www.fotofinder.net/
> Potsdamer Str. 96        Tel: +49 30 25792890
> 10785 Berlin             Fax: +49 30 257928999
>
> _______________________________________________
> Ocfs2-users mailing list
> [email protected]
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
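
On the "spread the lock resources out across all the nodes" workaround quoted above: Kurt describes a node as mastering a lock resource when it is the first node to access a file. One way to read that is to split the directory tree so that each node is the first to touch a different share of it. The following is only a minimal sketch of that planning step, not an OCFS2 feature or setting. The node names and subtree paths are hypothetical, and whether this actually relieves LowMem pressure on a given cluster is an assumption that would need testing.

```python
from collections import defaultdict


def assign_round_robin(subtrees, nodes):
    """Assign each subtree to exactly one node, cycling through the nodes.

    The idea (an assumption based on the thread, not a documented OCFS2
    mechanism) is that the assigned node does the first access to its
    subtrees, so lock mastery is not concentrated on a single node.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    plan = defaultdict(list)
    # Sort so that every node computes the same plan independently.
    for index, subtree in enumerate(sorted(subtrees)):
        plan[nodes[index % len(nodes)]].append(subtree)
    return dict(plan)


if __name__ == "__main__":
    # Hypothetical node names and top-level directories, for illustration only.
    nodes = ["node1", "node2", "node3"]
    subtrees = ["xml/a", "xml/b", "xml/c", "xml/d", "xml/e"]
    for node, dirs in assign_round_robin(subtrees, nodes).items():
        print(node, dirs)
```

Each node would then copy or back up only the subtrees listed under its own name. As Kurt notes, this may be impossible for some usage cases, for example when a single node has to read the whole tree.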
