Hello Adrian,
In essence, your extremely useful solution is similar to what Samuel and
Jan proposed: grab the whole file once.
I must admit I did not even consider it, given the length of the files
involved and how easily I managed to crash Scilab on small files.
Thanks,
Antoine
On 27/04/2020 18:58, Adrian Weeks wrote:
Hi Antoine,
I often have to read csv files with odd lines that trip functions like
csvRead so I often use the method below. It may solve your problem.
dataread = mgetl(readfile); // Read everything
a = [];
b = [];
…
for i = 1:size(dataread, 'r') do
    line = dataread(i);
    if length(line) ~= 0 then // Ignore blank lines
        // Accept spaces, commas or tabs
        line = tokens(line, [' ', ',', ascii(9)]);
        if and(isnum(line)) then // If the line is all-numeric
            line = strtod(line);
            a = [a; line(1)];
            b = [b; line(2)];
            …
        end
    end
end
Adrian Weeks
Development Engineer, Hardware Engineering EMEA
From: users <[email protected]> On Behalf Of Antoine Monmayrant
Sent: 27 April 2020 16:41
To: Users mailing list for Scilab <[email protected]>
Subject: [EXT] [Scilab-users] parsing TSV (or CSV) file with scilab is a nightmare
Hi all,
This is both a rant and a desperate cry for help.
I'm trying to parse some TSV data (tab separated data file) with
scilab and I cannot find a way to navigate around the minefield of
bugs present in meof/mgetl/mgetstr/csvRead.
A bit of context: I need to load into scilab data generated by a
closed source software.
The data is in the form of many TSV files (that I cannot share in
full, just some redacted bits) with a header and a footer.
I don't want to hand modify these files or edit them in any way (I
need to keep this as portable as possible, so no sed/awk/grep...)
OPTION 1: csvRead
That's the most intuitive solution, however, because of
http://bugzilla.scilab.org/show_bug.cgi?id=16391
and the presence of more than 1 empty line in my header/footer, this
crashes Scilab.
OPTION 2: hand parsing line by line using mgetl/meof
I tried:
filename="tsv.txt";
[fd, err] = mopen(filename, 'rt');
while ~meof(fd) do
txtline=mgetl(fd,1);
end
mclose(fd)
Sadly, and contrary to what's written in "help mgetl", meof keeps on
returning 0 well past the end of the file, and the while loop never ends!
OPTION 3: hand parsing chunk by chunk using mgetstr/meof
"help meof" does not confirm that meof should work with mgetl, but
mgetstr is specifically listed.
I thus tried:
filename="tsv.txt";
[fd, err] = mopen(filename, 'rt');
while ~meof(fd) do
txtchunk=mgetstr(80,fd);
end
mclose(fd)
But thanks to http://bugzilla.scilab.org/show_bug.cgi?id=16419
this is also crashing Scilab.
OPTION 4: Can anyone here help me with this?
I am really running out of ideas.
Did I miss some -hmm- obvious combination of available file parsing
scilab functions to achieve my goal?
I have the feeling that it would have been faster for me to just learn
a totally new language that does not suck at parsing files than trying
to get it to work with scilab....
Antoine
(depressed)
_______________________________________________
users mailing list
[email protected]
http://lists.scilab.org/mailman/listinfo/users