Re: [Scilab-users] Why so slow?

2014-05-20 Thread Adrien Vogt-Schilb

Hi

The code you provide does not run on my Scilab 5.5.0.

Anyway, I would read the matrix only once, as a string, and then use
csvTextScan on

mydat(:,2:6)

to convert that part to double. That already saves time.

In your loop, I believe strtod is extremely slow. Again, use csvTextScan
instead.
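
Something like this (an untested sketch; it assumes the 6 header lines and
the ';'/',' separators from your csvRead call, and a date format like
'03.01.2004 14:20' as in your sample):

```scilab
// single read, everything as strings (6 header lines skipped)
mystring = csvRead('dat04-2011.csv', ';', ',', 'string', [], [], [], 6);

// numeric part: one vectorized strtod over the whole sub-matrix,
// instead of one call per line (strtod expects '.' as the decimal
// mark, hence the substitution)
mydat = strtod(strsubst(mystring(:, 2:6), ',', '.'));

// date part: turn '03.01.2004 14:20' into '03;01;2004;14;20' and
// let csvTextScan parse the whole column in one call
txt = strsubst(strsubst(strsubst(mystring(:, 1), '.', ';'), ' ', ';'), ':', ';');
mydate = csvTextScan(txt, ';');
```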


Hope this helps

Adrien

On 20/05/2014 14:44, Richard Llom wrote:

Hello,
I need to read in a csv of about 360.000 lines with date and numerical
values. Attached is a sample excerpt of that file.

So far I did:
 CODE 

// read in
tic
mydat = csvRead('dat04-2011.csv', ';', ',', 'double', [], [], [], 6);
toc (= 5,213 secs)
mydat = mydat(:,2:6);
tic
mystring = csvRead('dat04-2011.csv', ';', ',', 'string', [], [], [], 6);
toc (= 3,077 secs)
mystring = mystring(:,1);


tic
for i=1:size(mydat,1)
 mydate(i,:) = strtod(strsplit(mystring(i,1),['.';' ';':']))';
end
toc (= 186,473 secs)


 CODE 
(I filled in the toc values).


As you can see, this is unfortunately very slow: the read-in of the csv,
but especially the for loop.

So I have several questions:

1)
Is there a faster way to read in the csv? Note that I need the 'header'
option.

2)
Instead of the loop I would like to use
mydate = strtod(strsplit(mystring(:,1),['.';' ';':']))';
but this doesn't work. Is there another way to avoid the loop?

3)
The raw csv file is around 15MB, but when I want to read it in a second
time, Scilab says this will exceed the stacksize, which defaults to 76MB.
So I don't quite understand why two copies of the 15MB file take so much
memory. I raised the stacksize for now, but I would rather not have to.


Any help is appreciated.
Thanks!
Richard


___
users mailing list
users@lists.scilab.org
http://lists.scilab.org/mailman/listinfo/users



--
Adrien Vogt-Schilb
Consultant (World Bank) and PhD Candidate (Cired)
1 202 473 7980



Re: [Scilab-users] Why so slow?

2014-05-20 Thread Michael Dunn
I had a whole slew of CSV problems too - over a year ago. Guess they
haven't been fixed yet.


Michael Dunn | Editor, EDN Design Ideas
<http://www.edn.com/design-ideas/all>
PCB <http://www.edn.com/design/pc-board>, IC/FPGA
<http://www.edn.com/design/integrated-circuit-design> & Medical
<http://www.edn.com/design/medical> Design Centers

(519) 744-9395 (Canada)
(226) 336-6033 (Mobile)
(415) 947-6096 (USA)
EDN Profile <http://edn.com/user/Michael%20Dunn> | LinkedIn
<http://www.linkedin.com/profile/view?id=29419994> | Skype: MichaelDunn_UBM






-Original Message-
From: Richard Llom 
Reply-To: "International users mailing list for Scilab."

Date: Tuesday, May 20, 2014 2:44 PM
To: "users@lists.scilab.org" 
Subject: [Scilab-users] Why so slow?

>Hello,
>I need to read in a csv of about 360.000 lines with date and numerical
>values. Attached is a sample excerpt of that file.
>
>So far I did:
> CODE 
>
>// read in
>tic
>mydat = csvRead('dat04-2011.csv', ';', ',', 'double', [], [], [], 6);
>toc (= 5,213 secs)
>mydat = mydat(:,2:6);
>tic
>mystring = csvRead('dat04-2011.csv', ';', ',', 'string', [], [], [], 6);
>toc (= 3,077 secs)
>mystring = mystring(:,1);
>
>
>tic
>for i=1:size(mydat,1)
>mydate(i,:) = strtod(strsplit(mystring(i,1),['.';' ';':']))';
>end
>toc (= 186,473 secs)
>
>
> CODE 
>(I filled in the toc values).
>
>
>As you can see, this is unfortunately very slow: the read-in of the csv,
>but especially the for loop.
>
>So I have several questions:
>
>1)
>Is there a faster way to read in the csv? Note that I need the 'header'
>option.
>
>2)
>Instead of the loop I would like to use
>mydate = strtod(strsplit(mystring(:,1),['.';' ';':']))';
>but this doesn't work. Is there another way to avoid the loop?
>
>3)
>The raw csv file is around 15MB, but when I want to read it in a second
>time, Scilab says this will exceed the stacksize, which defaults to 76MB.
>So I don't quite understand why two copies of the 15MB file take so much
>memory. I raised the stacksize for now, but I would rather not have to.
>
>
>Any help is appreciated.
>Thanks!
>Richard



[Scilab-users] Why so slow?

2014-05-20 Thread Richard Llom
Hello,
I need to read in a csv of about 360.000 lines with date and numerical 
values. Attached is a sample excerpt of that file.

So far I did:
 CODE 

// read in
tic
mydat = csvRead('dat04-2011.csv', ';', ',', 'double', [], [], [], 6);
toc (= 5,213 secs)
mydat = mydat(:,2:6);
tic
mystring = csvRead('dat04-2011.csv', ';', ',', 'string', [], [], [], 6);
toc (= 3,077 secs)
mystring = mystring(:,1);


tic
for i=1:size(mydat,1)
mydate(i,:) = strtod(strsplit(mystring(i,1),['.';' ';':']))';
end
toc (= 186,473 secs)


 CODE 
(I filled in the toc values).


As you can see, this is unfortunately very slow: the read-in of the csv,
but especially the for loop.

So I have several questions:

1)
Is there a faster way to read in the csv? Note that I need the 'header' 
option.

2)
Instead of the loop I would like to use
mydate = strtod(strsplit(mystring(:,1),['.';' ';':']))';
but this doesn't work. Is there another way to avoid the loop?

3)
The raw csv file is around 15MB, but when I want to read it in a second
time, Scilab says this will exceed the stacksize, which defaults to 76MB.
So I don't quite understand why two copies of the 15MB file take so much
memory. I raised the stacksize for now, but I would rather not have to.


Any help is appreciated.
Thanks!
Richard

# S
# Parame
# Unit:
# Titles:
Timity
# Data:
03.01.2004 14:20;9,33;6,96;11,1;0,75;2
05.01.2004 13:40;8,58;7,34;9,56;0,38;2
10.01.2004 13:10;7,33;6,19;8,79;0,58;2
13.01.2004 06:10;16,07;12,92;20,62;1,27;2
25.01.2004 18:20;4,15;3,88;4,46;0,15;2
15.02.2004 00:30;3,49;3,11;3,78;0,17;2
27.02.2004 03:10;8,33;7,34;9,46;0,36;2
15.03.2004 08:50;15,04;13,31;17,16;0,49;2
19.03.2004 06:00;14,4;13,02;15,62;0,38;2