I find it safer to process the data in memory rather than going back through a file on disk. As mentioned, I actually prefer to start with mgeti() and read the file as binary, since then all byte values are accepted.
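To illustrate the binary read, here is a minimal sketch rather than a definitive recipe. The file name sample.txt, its contents, and the choice to drop carriage returns are assumptions made only so the example is self-contained. It writes a tiny file, reads it back byte by byte with mgeti(), and splits the result into the in_text column used further down.

```scilab
// Sketch only: the sample file and its contents are placeholders.
fname = fullfile(TMPDIR, "sample.txt");
mputl(["header"; "1,5;2,5"; "3,5;4,5"], fname);

info = fileinfo(fname);                        // info(1) is the file size in bytes
fid = mopen(fname, "rb");
bytes = double(mgeti(info(1), "uc", fid));     // unsigned bytes, values 0..255
mclose(fid);

bytes(bytes == 13) = [];                       // assumption: drop CR so CRLF and LF files behave alike
in_text = strsplit(ascii(bytes), ascii(10));   // one string per line, as a column
if in_text($) == "" then
    in_text($) = [];                           // a trailing newline leaves an empty last entry
end
```

Files containing zero bytes or a multi-byte encoding would probably need extra handling before the ascii() conversion.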

But anyway, with the data separated into lines, it is relatively simple to split each line with the wanted separators and decimal sign:

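clear dataset;
headerlines=3;
footerlines=2;
n=size(in_text,1);   // total number of lines
for k=1:n
    if k>headerlines && k<=n-footerlines then
        // split on tab or ";", then convert with "," as the decimal sign
        datatemp=strtod(strsplit(in_text(k),[ascii(9),";"]),",");
        // strsplit returns a column, so transpose it into a row of dataset
        dataset(k-headerlines,1:length(datatemp))=datatemp';
    end
end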

disp(in_text(1:headerlines));
disp(dataset);
disp(in_text(($-footerlines+1):$));

JÅ


On 2020-04-28 10:14 AM, Rafael Guerra wrote:

Antoine,

One workflow that works fast for me for large data files is this: first load the whole file with mgetl, then remove all empty lines using isempty in a loop (as shown below), process the header block, isolate the data block, save it to a temporary file on disk with mputl, and finally load that file very efficiently with fscanfMat.

function out_text=cellfun(fun, in_text)
    // Applies function to input text (column strings vector), line by line
    n=size(in_text,1);
    for i=1:n
        out_text(i)=fun(in_text(i));
    end
endfunction

tlines=mgetl(fid,-1);   // reads lines until end of file into 1 column text vector
bool=~cellfun(isempty,tlines);
tlines=tlines(bool);    // removes empty lines
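The remaining steps (isolating the data block, mputl, fscanfMat) could look roughly like the sketch below. The sample lines, the header length nheader, and the temporary file name are made up for illustration. Note that fscanfMat expects plain numeric columns with a decimal point, so this assumes the data block is already in that form.

```scilab
// Sketch only: tlines stands in for the cleaned lines from the real file.
tlines = ["# header"; "1.5 2.5"; "3.5 4.5"];
nheader = 1;                                   // hypothetical header length

header = tlines(1:nheader);                    // header block, kept as text
tmpfile = fullfile(TMPDIR, "datablock.txt");   // temporary backup file
mputl(tlines(nheader+1:$), tmpfile);           // write only the data block
M = fscanfMat(tmpfile);                        // fast numeric load from disk
```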

Regards,

Rafael


_______________________________________________
users mailing list
[email protected]
http://lists.scilab.org/mailman/listinfo/users