Re: [Scilab-users] [EXT] parsing TSV (or CSV) file with scilab

Jan Åge Langeland Tue, 28 Apr 2020 02:20:10 -0700

I find it safer to process the data without returning to a disk file. As mentioned, I actually prefer to start with mgeti() and read the file as binary, as then all byte values are accepted.
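
A minimal sketch of that binary route, in case it helps. The sample file, the name `path` and the choice to strip CR characters are my assumptions for illustration, not part of the original workflow; `in_text` is the name used in the loop further down.

```scilab
// Hypothetical sample file, written only so the sketch is self-contained.
tab = ascii(9);
path = fullfile(TMPDIR, "sample.tsv");
mputl(["x"+tab+"y"; "1,5"+tab+"2"], path);

info = fileinfo(path);              // info(1) is the file size in bytes
fd = mopen(path, "rb");
bytes = mgeti(info(1), "uc", fd);   // unsigned bytes, any value 0..255
mclose(fd);

raw = ascii(double(bytes));         // byte codes -> one long string
raw = strsubst(raw, ascii(13), ""); // drop CR in case of CRLF line ends
in_text = strsplit(raw, ascii(10)); // column vector, one entry per line
```

If the file ends with a newline, the last entry of `in_text` is an empty string. Bytes above 127 or zero bytes may need cleaning on `bytes` before the `ascii()` conversion.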

But anyway, with the data separated into lines, it is relatively simple to split it up with the wanted separators and decimal sign:


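clear dataset;
headerlines=3;
footerlines=2;
n=size(in_text,1);  // total number of lines
for k=1:n
    // keep only the lines between the header and footer blocks
    if k>headerlines && k<=n-footerlines then
       // split on tab or ";", then convert using "," as the decimal sign
       datatemp=strtod(strsplit(in_text(k),[ascii(9),";"]),",");
       dataset(k-headerlines,1:length(datatemp))=datatemp;
    end
end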

disp(in_text(1:headerlines));
disp(dataset);
disp(in_text(($-footerlines+1):$));
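
For anyone who wants to try the splitting step on its own, here is a small self-contained sketch. The contents of `in_text` are made-up sample lines (3 header lines, 2 data lines, 2 footer lines), and the loop runs directly over the data range, which is a variation on the `if` test, not a different method.

```scilab
// Made-up sample: 3 header lines, 2 data lines, 2 footer lines.
in_text = ["Title"; "units"; "a;b;c"; "1,5;2;3"; "4;5,25;6"; "end"; "checksum"];
headerlines = 3;
footerlines = 2;
n = size(in_text, 1);

clear dataset;
for k = headerlines+1 : n-footerlines
    // split on tab or ";", convert with "," as the decimal sign
    datatemp = strtod(strsplit(in_text(k), [ascii(9), ";"]), ",");
    // strsplit returns a column, so transpose before storing it as a row
    dataset(k-headerlines, 1:length(datatemp)) = datatemp';
end
disp(dataset);
```

Rows with fewer fields than the widest row are padded with zeros by the indexed assignment, so a ragged file may need an extra check.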

JÅ


On 2020-04-28 10:14 AM, Rafael Guerra wrote:

Antoine,
One workflow that works fast for me, for large data files, is to first load the whole file with mgetl, then remove all empty lines using isempty in a loop (as shown below), process the header block, isolate the data block and save it to a temporary backup file on disk using mputl, then load that backup file from disk very efficiently using fscanfMat.

tlines=mgetl(fid,-1); // reads lines until end of file into a 1-column text vector
bool=~cellfun(isempty,tlines);
tlines=tlines(bool); // removes empty lines

function out_text=cellfun(fun, in_text)
    // Applies function to input text (column strings vector), line by line
    n=size(in_text,1);
    for i=1:n
        out_text(i)=fun(in_text(i));
    end
endfunction
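
The last two steps of that workflow (mputl to a temporary file, then fscanfMat) could look roughly like the sketch below. The variable names `data_block`, `tmpfile` and `M` and the sample values are assumptions for illustration, and the data block is taken to be purely numeric, tab-separated and written with a decimal point.

```scilab
// Hypothetical data block, already stripped of header and empty lines.
tab = ascii(9);
data_block = ["1.5"+tab+"2"+tab+"3"; "4"+tab+"5.25"+tab+"6"];

tmpfile = fullfile(TMPDIR, "data_block.txt");
mputl(data_block, tmpfile);   // save the data block to a temporary file
M = fscanfMat(tmpfile);       // read it back as a numeric matrix
mdelete(tmpfile);             // remove the temporary file
disp(M);
```

If the source file uses decimal commas, they would probably have to be replaced first, for example with strsubst(data_block, ",", "."), since fscanfMat, as far as I know, expects a decimal point.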

Regards,

Rafael


_______________________________________________
users mailing list
[email protected]
http://lists.scilab.org/mailman/listinfo/users
