On Thursday 14 October 2010 21:09:18, braingateway wrote:
> Sorry about the super big paragraph! Thanks a lot for your detailed
> response!
> I was aware it is pointless to compress pure random data, so I did
> not mention the compression rate at all in my post. Unfortunately,
> the dynamic range of my data is very large and it is very
> “random”-like. Blosc only reduces the file size of my real dataset by
> 10%, so I am not a fan of the compression feature.

I see.  Then it would be better to take compression out of the 
measurements and focus on I/O speed.

> I am really confused about the dimension order. I cannot see any
> freedom to choose between column-major and row-major, because HDF5 is
> row-major. For example, I have N different sensors, and each sensor
> generates 1e9 samples/s. The fixed-length dimension (fastest
> dimension) should always store the N samples from the sensor network,
> so time always has to be the column. In most cases we want to access
> data from all sensors during a certain period of time; in some cases
> we only want to access data from just one or two sensors. So I think
> it is correct to make a row store the data from all sensors at the
> same time point. In my opinion, for almost all kinds of real-world
> data, the slowest dimension should always represent time. Probably I
> should invert the dimension order when I load the data into RAM.

You always have to find the best way to combine convenience and 
performance.  If in some cases you cannot do this, then you have to 
choose: convenience *or* performance.

> Even though I did invert the dimension order, the speed did not
> improve for accessing all channels, but it did improve a lot when
> accessing data from only one sensor, for both memmap and PyTables.

Exactly.  That was my point.
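
Just to illustrate the layout effect with plain NumPy (a minimal 
sketch; the shapes and names are made up):

    import numpy as np

    nsensors, nsamples = 24, 1000000

    # Row-major (C order): the last axis is contiguous in memory/on disk.
    time_major = np.zeros((nsamples, nsensors))    # one row per time point
    sensor_major = np.zeros((nsensors, nsamples))  # one row per sensor

    one_sensor_strided = time_major[:, 0]     # touches every row
    one_sensor_contig = sensor_major[0, :]    # one sequential block

With memmap or HDF5 the strided read has to walk through the whole 
file, while the contiguous one reads a single consecutive block of 
bytes; that is why your single-sensor times improved so much after 
inverting the order, while the all-sensors times stayed about the same.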

> However, PyTables is still much slower than memmap:
> 
> Reading a 24x1e6 data chunk at a random position:
> 
> Memmap: 128 ms (without shifting the dimension order: 81 ms)
> PyTables (automatic chunkshape = (1, 32768)): 327 ms (without shifting
> the dimension order: 358 ms)
> PyTables (chunkshape = (24, 65536)): 270 ms (without shifting the
> dimension order: 255 ms)
> PyTables (chunkshape = (1, 65535)): 328 ms

That 'bad' performance of PyTables relative to numpy.memmap is kind of 
expected.  You should be aware that HDF5 is quite a bit more complex 
(but more featured too) than memmap, so the overhead is significant, 
especially in this case, where you are using a chunked dataset (CArray) 
for the comparison.

It is also important that you are benchmarking with datasets that fit 
well into the OS filesystem cache (for example, when using a machine 
with > 1 GB of RAM), so the disk is touched very little.  But as soon 
as your datasets exceed the amount of available memory, the performance 
of memmap and HDF5 will become much closer.

In case you are interested only in situations where your datasets fit 
well in memory, you will probably get much better results if you use a 
non-chunked dataset (i.e. a plain Array) in PyTables.  But still, if you 
are expecting PyTables to be faster than the much simpler memmap 
approach, then I'm going to disappoint you: that simply will not happen 
(unless your data is very compressible, but that's not your case).
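
In case it helps, here is a minimal sketch of what I mean (PyTables 2.x 
API; the file name, node names and shape are just examples):

    import numpy as np
    import tables as tb

    data = np.random.rand(24, 1000000)

    f = tb.openFile('bench.h5', 'w')
    # Chunked dataset (what you are benchmarking now):
    ca = f.createCArray(f.root, 'carray', tb.Float64Atom(), data.shape,
                        chunkshape=(1, 65536))
    ca[:] = data
    # Non-chunked, contiguous dataset; usually cheaper to read when the
    # data fits in memory and you need neither compression nor enlarging:
    a = f.createArray(f.root, 'array', data)
    f.close()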

> Calculating an expression on the whole array:
> 
> Memmap: 1.4~1.8 s (without shifting the dimension order: 1.4~1.6 s)
> PyTables (automatic chunkshape = (1, 32768)): 9.4 s (without shifting
> the dimension order: 14 s)
> PyTables (chunkshape = (24, 65536)): 16 s (without shifting the
> dimension order: 9 s)
> PyTables (chunkshape = (1, 65535)): 13 s
> 
> Should I change some default parameters, such as the buffer size, to
> improve the performance?

No, I don't think you will be able to get much more performance out of 
tables.Expr.  But, mind you, you are not comparing apples with apples 
here.  numpy.memmap and tables.Expr are both meant for out-of-core 
computations (i.e. computations with operands that do not fit in 
memory).  But in your example, for the numpy.memmap case, you are 
loading everything in memory and then calling numexpr to perform the 
operations, while you are using tables.Expr to do the same operations 
but *on disk*; hence the big difference in performance.

In order to compare apples with apples, my advice is to use 
tables.Array + numexpr if you want to compare against numpy.memmap + 
numexpr for the in-memory paradigm.  Or, if you really want to do 
out-of-core computations, then use numpy.memmap and perform the 
operations directly on disk (i.e. without using numexpr) and compare 
that against tables.Expr.  See:

https://portal.g-node.org/python-autumnschool/_media/materials/starving_cpus/poly2.py

for an example of an apples-with-apples comparison for out-of-core 
computations.
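
Roughly, the two pairings look like this (a sketch only; the file 
names, node names and the expression are made up):

    import numpy as np
    import numexpr as ne
    import tables as tb

    # In-memory pairing: load the operand, then evaluate with numexpr.
    a = np.memmap('a.bin', dtype='float64', mode='r', shape=(24, 1000000))
    x = np.array(a)                       # everything brought into RAM
    r1 = ne.evaluate('0.25*x**2 + 0.75*x + 1')

    # Out-of-core pairing: operand and result stay on disk.
    f = tb.openFile('bench.h5', 'a')
    y = f.root.array                      # an existing Array/CArray node
    out = f.createCArray(f.root, 'result', tb.Float64Atom(), y.shape)
    expr = tb.Expr('0.25*y**2 + 0.75*y + 1')
    expr.setOutput(out)
    expr.eval()
    f.close()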

BTW, I've seen that you are still using numexpr 1.3; you may want to use 
1.4 instead, where I implemented multi-threading some time ago.  That 
might boost your computations quite a bit.
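
You can control the number of threads explicitly with 
numexpr.set_num_threads() (a sketch; the thread count here is arbitrary, 
so check the docs for your version):

    import numexpr as ne
    ne.set_num_threads(4)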

> By the way, after I changed the shape to (24, 3e6), tables.Expr
> raised an error:
> The error was --> <type 'exceptions.AttributeError'>: 'Expr' object
> has no attribute 'BUFFERTIMES'.

Hmm, that looks like a bug.  Could you send a small self-contained 
example so that I can fix it?

> I think this is because expression.py has not been updated for the
> new ‘BUFFER_TIMES’ parameter?  So I added
>     from tables.parameters import BUFFER_TIMES
> and changed self.BUFFERTIMES to BUFFER_TIMES.
> 
> I hope this is correct.

Hope that helps,

-- 
Francesc Alted
