Apparently sent from an unsubscribed address.
Begin forwarded message:
> From: pytables-users-boun...@lists.sourceforge.net
> Subject: Auto-discard notification
> Date: February 10, 2012 12:14:45 AM GMT+01:00
> To: pytables-users-ow...@lists.sourceforge.net
>
> The attached message has been automatically discarded.
> From: Jimmy Paillet <jimmy.pail...@gmail.com>
> Subject: Performance advice/Blosc
> Date: February 10, 2012 12:14:19 AM GMT+01:00
> To: pytables-users@lists.sourceforge.net
>
>
> Hey,
>
> I'd like to ask some advice about pytables data organization and compression
> performance....
>
> My data set is just one big table (500 Mrows, 45 columns); the file size is
> 70 GB, compressed with blosc-4, and the compression ratio is around 2-3.
> It has several ultralight indexes.
> Python 2.5, PyTables 2.3.1, Ubuntu 8.04 64-bit, 4-core Intel Xeon, 12 GB RAM.
>
> The file is on a NAS, which I am connected to over a GbE link.
> Performance was not that bad for a maximum I/O bandwidth of 90 MB/s.
>
> To see how it would scale with I/O speed, I set up a 3-SSD RAID 0 (sequential
> read speeds up to 660 MB/s).
> I was a bit disappointed. Yes, very selective queries that can use indexes
> are much faster on the RAID (up to 6 times).
> However, broader queries run at almost the same speed as on the NAS,
> which seems weird, as they are getting close
> to sequential reads. These are the queries I wanted to speed up!
>
> It seems I can't get past 80-90 MB/s when reading a compressed h5 file.
> It's roughly the same with lzo or blosc (except that lzo compresses 2 times
> more)...
> Does that number seem reasonable? From what I read about lzo and especially
> blosc on the web, it looks a bit underwhelming in comparison...
> Am I missing something?
>
> One of my issues, I believe, is that I can't get more than one decompressing
> blosc thread, even though I set tables.setBloscMaxThreads(6).
> Any idea what is happening here?
>
> On uncompressed files, I can reach the 600 MB/s limit when doing reads. But
> since those files are 2 to 6 times bigger,
> I often end up with similar performance. So I wonder how to scale my system.
>
> Thanks for any input.
> J.
>
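
For reference, the trade-off described in the message (compressed reads capped at 80-90 MB/s versus uncompressed reads at 600 MB/s on files 2 to 6 times bigger) can be put into rough numbers. This is only a back-of-the-envelope sketch, not a measurement. It assumes that the 80-90 MB/s figure counts on-disk (compressed) bytes, that 85 MB/s is a fair midpoint, that 1 GB = 1000 MB, and that a broad query amounts to one full sequential scan. The function name scan_seconds is made up for illustration.

```python
# Rough full-scan time estimate; every constant below is an assumption
# taken from, or derived from, the figures quoted in the message.

def scan_seconds(on_disk_gb, read_mb_s):
    """Seconds to read on_disk_gb gigabytes at read_mb_s MB/s (1 GB = 1000 MB)."""
    return on_disk_gb * 1000.0 / read_mb_s

COMPRESSED_GB = 70.0      # file size reported in the message
COMPRESSED_MB_S = 85.0    # assumed midpoint of the 80-90 MB/s ceiling
UNCOMPRESSED_MB_S = 600.0 # read rate reported for uncompressed files

compressed_s = scan_seconds(COMPRESSED_GB, COMPRESSED_MB_S)
print("compressed scan: %.0f s" % compressed_s)

# An uncompressed file is `ratio` times bigger on disk.
for ratio in (2.0, 3.0, 6.0):
    uncompressed_s = scan_seconds(COMPRESSED_GB * ratio, UNCOMPRESSED_MB_S)
    print("ratio %.0f: uncompressed scan %.0f s" % (ratio, uncompressed_s))

# Compression ratio at which both scans take the same time.
break_even_ratio = UNCOMPRESSED_MB_S / COMPRESSED_MB_S
print("break-even ratio: %.2f" % break_even_ratio)
```

Under these assumptions the two setups only come out "similar" when the compression ratio approaches 6 or more. At ratios of 2-3 the uncompressed scan would be noticeably shorter, which suggests the compressed read path, rather than the disks, is the limit on the RAID.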
-- Francesc Alted