On Mon, Mar 19, 2012 at 1:08 PM, sreeaurovindh viswanathan <sreeaurovi...@gmail.com> wrote:
>
> Sorry to have misphrased my question. By querying speed I meant the speed of
> the PyTables querying, not the PostgreSQL querying. To rephrase:
>
> 1) Will I be able to query (using in-kernel queries) a single HDF5 file with
> PyTables in parallel from five different programs? How efficient would that be?
>
> Secondly, as per the suggestions:
>
> I will break it into 6 chunks as you advised and try to incorporate that into
> the code. I will also try to break my query into chunks and write the results
> into HDF5 tables as chunks, as Francesc advised. But..
What you are describing is I/O bound; this means you are only going to get as much throughput as your disk subsystem can handle. Writing in larger batches exploits the caching and block-write nature of fixed-disk mechanisms. Reading in batches does the same thing, exploiting built-in caching and block reads.

Profile your disks; if you are already getting maximum throughput, buy faster hardware. If you are I/O bound, multiple threads of execution will almost certainly reduce throughput and the overall performance of your application. These are the laws of physics at work: there is no multi-threaded royal road to better I/O performance with fixed disks.
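For what it's worth, here is a minimal sketch of what batching looks like in PyTables. The file name, table layout, and batch size are made up for illustration, and the calls use the snake_case spellings (older releases use camelCase names like openFile/createTable):

    import numpy as np
    import tables

    BATCH = 10000  # hypothetical batch size; tune it against your table's chunkshape

    # Batched write: append one block instead of 10000 individual row writes.
    with tables.open_file("data.h5", mode="w") as h5:
        table = h5.create_table("/", "readings",
                                {"x": tables.Float64Col(), "y": tables.Float64Col()})
        block = np.empty(BATCH, dtype=table.dtype)
        block["x"] = np.random.rand(BATCH)
        block["y"] = np.random.rand(BATCH)
        table.append(block)   # one sequential block write to disk
        table.flush()

    # Batched read: the in-kernel query streams the table chunk by chunk.
    with tables.open_file("data.h5", mode="r") as h5:
        table = h5.root.readings
        hits = [row["x"] for row in table.where("y > 0.5")]

The point being: each append/where call moves data in disk-sized blocks, and running five of these programs against the same spindle just makes the head seek more.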