>> The data is split between many RB basis functions (about 1GB of data
>> split between ca 60000 files). So I was thinking the cause of the very
>> slow reading was overhead induced by the large number of individual
>> calls to read_serialized_data, and perhaps related memory management?
>> 
>> Any ideas on this would be very appreciated!
> 
> A lot of filesystem types do behave badly when you put too many
> entries in the same directory.  How long does copying take if your
> "1GB of data" is in those 60000 separate files?

Agreed.  Copying 1 byte of data from each of 1e9 files will take vastly
longer than copying 1e9 bytes from one file, provided your filesystem
doesn't fall over first.

> On the other hand, we may have some performance bug.  Can you stick
> logging into the read_serialized_data internals and figure out where
> the holdup is?

Or, if you could create a simple test case, I can help profile it here
too.

What is the average vector size?

Note that the overhead of the system header and the other metadata in XDR
restart files will at some point become overwhelming.  This use case is
different from what the restart files were designed for - I, at least,
always thought of restarting an entire simulation, where a lot of data is
contained in a handful of vectors.  There may be a better implementation
for this case...

-Ben



_______________________________________________
Libmesh-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libmesh-users