Francesc Altet wrote:
> On Monday 09 April 2007 15:57, Michael Hoffman wrote:
>> As a follow-up to my previous message, I have realized that I am
>> supposed to tune the Lustre filesystem for large files. Hopefully
>> that will solve my performance problems.
>
> Maybe. A good crosscheck would be to copy the file to a local
> filesystem and test the performance. If you still see high latency,
> please explain what hierarchy you have given your data and I'll try
> to provide you more feedback.
Well, I tried that and it was still really slow. So I tried balancing
the tree by creating groups named _00 through _ff, taken from the first
octet of the MD5 digest of each dataset name. This afforded a
considerable speedup in opening, even on a remote filesystem:

$ time python -c 'import tables; tables.openFile("original.h5")'
Closing remaining opened files... original.h5... done.

real    2m25.643s
user    0m1.271s
sys     0m1.379s

$ time python -c 'import tables; tables.openFile("balanced.h5")'
Closing remaining opened files... balanced.h5... done.

real    0m2.186s
user    0m0.158s
sys     0m0.106s

So perhaps sticking to <4096 nodes per group (or here, <256) is still a
good idea. I'm thankful that I don't need to move to multiple files,
which would have been a real pain. It would be nice if this sort of
thing were done automatically, but that would probably be best handled
upstream in HDF5.
-- 
Michael Hoffman
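
P.S. In case it helps anyone doing the same, here is a minimal sketch of
the bucketing scheme (illustrative only; it assumes the camelCase
PyTables 1.x/2.x API and Python 2's hashlib, and the balanced_group
helper is a made-up name, not my actual code or part of PyTables):

    import hashlib

    import tables

    def balanced_group(fileh, dataset_name):
        # Bucket by the first octet (two hex digits) of the MD5 of the
        # dataset name, giving at most 256 groups: _00 through _ff.
        group_name = "_" + hashlib.md5(dataset_name).hexdigest()[:2]
        try:
            return fileh.getNode("/", group_name)
        except tables.NoSuchNodeError:
            return fileh.createGroup("/", group_name)

    # Usage: store each dataset under its hash bucket, not the root.
    fileh = tables.openFile("balanced.h5", "w")
    group = balanced_group(fileh, "some_dataset")
    fileh.createArray(group, "some_dataset", [1, 2, 3])
    fileh.close()

With 256 buckets, each group holds roughly 1/256th of the datasets,
which keeps you comfortably under the <4096 nodes-per-group rule of
thumb for anything up to around a million datasets.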