Hi
The code you provided does not run on my Scilab 5.5.0.
Anyway, I would read the matrix only once, as strings, and then use
csvTextScan on
mydat(:,2:6)
to convert that part to double. That already saves time.
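
A minimal sketch of that idea, not tested against the real file. The small `raw` matrix is made-up sample data standing in for the output of the string-mode csvRead call, and the names `raw` and `numtxt` are only illustrative. As far as I know csvTextScan is documented for a vector of strings, so the sketch flattens the numeric block first and reshapes it afterwards:

```scilab
// Hypothetical stand-in for the single read of the file:
//   raw = csvRead('dat04-2011.csv', ';', ',', 'string', [], [], [], 6);
// Assumed layout: column 1 = timestamp, columns 2 to 6 = numbers with a decimal comma.
row1 = ["01.04.2011 00:00", "1,5", "2,25", "3", "4,75", "5"];
row2 = ["01.04.2011 00:15", "6,5", "7,25", "8", "9,75", "10"];
raw  = [row1; row2];

mystring = raw(:, 1);      // date strings, kept as text for later parsing
numtxt   = raw(:, 2:6);    // numeric part, still text at this point

// Flatten column-wise, convert everything in one call, then restore the shape.
vals  = csvTextScan(numtxt(:), ';', ',', 'double');
mydat = matrix(vals, size(numtxt, 1), size(numtxt, 2));

disp(mydat);
```

This way the file is parsed from disk once instead of twice, which should also ease the memory pressure.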
In your loop, I believe strtod is extremely slow. Again, use csvTextScan
instead.
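
A rough sketch of how the loop could be replaced, assuming the dates look like dd.mm.yyyy hh:mm (I am guessing the format from the delimiters in your strsplit call; the sample strings below are made up). The idea is to turn every delimiter into the same separator and let csvTextScan parse all rows at once:

```scilab
// Hypothetical date strings; in the real script this would be the first
// column of the string matrix read from the csv file.
mystring = ["01.04.2011 00:00"; "01.04.2011 00:15"; "02.04.2011 13:30"];

// Replace each field delimiter ('.', ' ' and ':') by ';'.
// strsubst works on the whole string vector, so no loop is needed.
tmp = strsubst(mystring, '.', ';');
tmp = strsubst(tmp, ' ', ';');
tmp = strsubst(tmp, ':', ';');

// Parse all rows in one call: one row per date, one column per field
// (day, month, year, hour, minute under the assumed format).
mydate = csvTextScan(tmp, ';', '.', 'double');

disp(size(mydate));
```

If the strings also carry seconds, the same code should simply return one more column.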
Hope this helps
Adrien
On 20/05/2014 14:44, Richard Llom wrote:
Hello,
I need to read in a csv of about 360.000 lines with date and numerical
values. Attached is a sample excerpt of that file.
So far I did:
==== CODE ====
// read in
tic
mydat = csvRead('dat04-2011.csv', ';', ',', 'double', [], [], [], 6);
toc // = 5,213 secs
mydat = mydat(:,2:6);
tic
mystring = csvRead('dat04-2011.csv', ';', ',', 'string', [], [], [], 6);
toc // = 3,077 secs
mystring = mystring(:,1);
tic
for i=1:size(mydat,1)
mydate(i,:) = strtod(strsplit(mystring(i,1),['.';' ';':']))';
end
toc // = 186,473 secs
==== CODE ====
(I filled in the toc values).
As you can see, this is unfortunately very slow: reading in the csv, but
especially the for loop.
So I have several questions:
1)
Is there a faster way to read in the csv? Note that I need the 'header'
option.
2)
Instead of the loop I would like to use
mydate = strtod(strsplit(mystring(:,1),['.';' ';':']))';
but this doesn't work. Is there another way to avoid the loop?
3)
The raw csv file is around 15MB, but when I try to read it in a second
time, Scilab says this will exceed the stacksize, which is 76MB by default.
So I don't quite understand how reading the 15MB file twice takes so much
memory. I have raised the stacksize for now, but I would rather not.
Any help is appreciated.
Thanks!
Richard
_______________________________________________
users mailing list
[email protected]
http://lists.scilab.org/mailman/listinfo/users
--
Adrien Vogt-Schilb
Consultant (World Bank) and PhD Candidate (Cired)
1 202 473 7980