Otherwise, the whole object orientation part of Scilab (tlist and
mlist etc.) would be hard to use for anything that comes in large
numbers, which would be a shame, especially as it used to work just
fine (well, I can see how the old structure wasn’t “just fine” in
other ways, but still).
Cheers,
Arvid
*From: *users <[email protected]> on behalf of Stéphane
Mottelet <[email protected]>
*Organization: *Université de Technologie de Compiègne
*Reply-To: *Users mailing list for Scilab <[email protected]>
*Date: *Monday, 15 October 2018 at 14:37
*To: *"[email protected]" <[email protected]>
*Subject: *Re: [Scilab-users] HDF5 save is super slow
Hello,
I looked into the sources a little: the obvious bottleneck is the
nested creation of an HDF5 group each time a container variable is
encountered.
This is particularly visible in the given example. If you replace
each syslin structure with the corresponding [A,B;C,D] matrix, save
is roughly nine times faster (0.72 s versus 0.08 s below):
N = 4;
n = 1000;
filters = list();
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());
0.724754
N = 4;
n = 1000;
filters = list();
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = [G.a G.b;G.c G.d];
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());
0.082302
Serializing container objects into plain arrays seems to be the
solution, but it runs counter to the portability spirit of HDF5.
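A minimal sketch of what such serialization could look like for the list
of filters in the example. The helpers pack_filters and unpack_filters are
hypothetical names, not part of Scilab; the sketch assumes all systems are
continuous-time, single-input single-output and of the same order, and it
drops the initial state x0:

```scilab
// Hypothetical helper: pack a list of state-space systems of identical
// size into one plain matrix, one [A B; C D] block per system.
function M = pack_filters(filters)
    n = length(filters);
    [A, B, C, D] = abcd(filters(1));
    r = size(A, 1) + size(C, 1);   // rows of one block
    c = size(A, 2) + size(B, 2);   // columns of one block
    M = zeros(r, c * n);
    for i = 1:n
        [A, B, C, D] = abcd(filters(i));
        M(:, (i-1)*c+1 : i*c) = [A B; C D];
    end
endfunction

// Hypothetical helper: rebuild the list from the packed matrix.
// N is the state dimension; SISO continuous-time systems are assumed.
function filters = unpack_filters(M, N)
    c = N + 1;
    n = size(M, 2) / c;
    filters = list();
    for i = 1:n
        blk = M(:, (i-1)*c+1 : i*c);
        filters($+1) = syslin('c', blk(1:N, 1:N), blk(1:N, N+1), ..
                              blk(N+1, 1:N), blk(N+1, N+1));
    end
endfunction

N = 4;
n = 1000;
filters = list();
for i = 1:n
    filters($+1) = syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
end

M = pack_filters(filters);          // a single (N+1) x (N+1)*n matrix
save('filters_packed.dat', 'M');    // only one non-container variable is saved
clear M
load('filters_packed.dat');
filters2 = unpack_filters(M, N);
```

The file then holds a single dataset instead of one group per tlist, but
it is no longer self-describing, which is the portability trade-off
mentioned above.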
S.
Le 15/10/2018 à 12:22, Antoine Monmayrant a écrit :
Le 15/10/2018 à 11:55, Arvid Rosén a écrit :
Hi,
Thanks for getting back to me!
Unfortunately, we used Scilab’s pretty cool way of doing
object orientation, so we have big nested tlist structures
with multiple instances of various lists of filters and other
structures, as in my example. Saving those structures in some
explicit manual way would be extremely complicated. Or is
there some way of writing explicit HDF5 saving/loading schemes
using overloading? That would be great! I am sure we could
find the main culprits and do something explicit for them, but
as they can be located wherever in a big nested structure, it
would be painful to do anything on the top level.
Another, related I guess, problem here is that the new file
format uses about 15 times as much disk space as the old
format (for a typical ill-behaved nested structure). That adds
to the save/load time too I guess, but is probably not the
main source here.
Argh, yes, I tested it and with your example I get a file 8.5 times bigger.
I think that both increases in time and size are real issues and
should be reported as bugs.
By the way, I rewrote your script to run it under both 6.0 and 5.5:
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end
ver=getversion('scilab');
if ver(1)<6 then
tic();
save('filters_old.dat', filters);
ts1 = toc();
else
tic();
save('filters_new.dat', 'filters');
ts1 = toc();
end
printf("Time for save %.2fs\n", ts1);
/////////////////////////////////
Hope it helps,
Antoine
I think I might have reported this earlier using Bugzilla, but
I’m not sure. I’ll check and report it if not.
Cheers,
Arvid
*From: *users <[email protected]> on behalf of
"[email protected]" <[email protected]>
*Reply-To: *"[email protected]" <[email protected]>, Users mailing list
for Scilab <[email protected]>
*Date: *Monday, 15 October 2018 at 11:08
*To: *"[email protected]" <[email protected]>
*Subject: *Re: [Scilab-users] HDF5 save is super slow
Hello,
I tried your code in 5.5.1 and in the latest nightly build of 6.0:
I see a slowdown by a factor of around 175 between the old save in
5.5.1 and the new (and only) save in 6.0.
It's really related to the data structure, because we use hdf5
read/write a lot here and did not experience significant
slowdowns using 6.0.
I think the overhead might come from the translation of your
fairly complex variable (a long list of tlists) into the
corresponding hdf5 structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
For example:
1) you could save each element of "filters" in a separate file.
2) you could bypass save and write your data to an hdf5 file
directly with h5open() and h5write(). It means you need to write
your own load() for your custom file format. But this way, you
can try to find the best way to lay out your data in hdf5
format.
3) in addition to 2) you could try to save each entry of your
"filters" array as one dataset in a given hdf5 file.
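A rough sketch of options 2) and 3) combined, in case it helps. The file
name, the dataset names "filter_00001", ... and the [A B; C D] layout are
only assumptions made for illustration, and the matching "load" has to be
written by hand. It assumes SISO continuous-time systems of the same
order; it is worth checking on your own data that the matrices come back
with the expected orientation:

```scilab
N = 4;
n = 100;
filters = list();
for i = 1:n
    filters($+1) = syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
end

// Write: one dataset per filter, each holding the plain [A B; C D] matrix.
fname = fullfile(TMPDIR, "filters_manual.h5");   // illustrative file name
h5 = h5open(fname, "w");
for i = 1:n
    [A, B, C, D] = abcd(filters(i));
    h5write(h5, msprintf("filter_%05d", i), [A B; C D]);
end
h5close(h5);

// Read: the custom "load" that rebuilds the list of syslin objects.
h5 = h5open(fname, "r");
loaded = list();
for i = 1:n
    M = h5read(h5, "/" + msprintf("filter_%05d", i));
    loaded($+1) = syslin('c', M(1:N, 1:N), M(1:N, N+1), ..
                         M(N+1, 1:N), M(N+1, N+1));
end
h5close(h5);
```

This avoids the nested tlist groups that save() creates, at the cost of
maintaining the read and write code yourself.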
Did you search on bugzilla whether this bug was already submitted?
Could you try to report it?
Antoine
Le 15/10/2018 à 10:11, Arvid Rosén a écrit :
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end
tic();
save('filters.dat', filters);
ts1 = toc();
tic();
save('filters.dat', 'filters');
ts2 = toc();
printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE
Tel:+33 5 61 33 64 59
email: [email protected]
permanent email: [email protected]
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
_______________________________________________
users mailing list
[email protected] <mailto:[email protected]>
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users
--
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet