If you're spending most of your time in I/O, then I doubt the language will make much of a difference. However, you might see an improvement if you buffer the input (i.e., read a chunk into a byte buffer, grab what you need, and repeat until done). I'm not sure whether this exists already, but if it helps, I did this a while ago:
https://github.com/tpbreloff/CTechCommon.jl/blob/master/src/bufio.jl

On Wednesday, December 2, 2015, Rajn wrote:

> I am reading a binary file with a structure. Matlab takes long and I
> wanted to run Julia for speed improvement.
> It reads camera frames, each with certain bytesize and each frame
> consisting of parameters of certain byte size and Type.
>
> frames: is declared Int64 and runs from 1:frames
> PACKSIZE is declared Int64 and is constant
> key_ind is Int16 and is a vector. This denotes the parameters which
> defines the frames. There are 200 but I am only interested in 3.
> Type declared Type as: 1x3 Array of Int16, Int16, Float64...
> Num as Int64 and 3x1 elements and each is 1. This says that there are only
> 1 elements of each Type in a frame.
>
> This is where I want to store the data...
> dCell = 3x1 Array{Any,2} of Array{Int16}(1xframes)
> Array{Int16}(1xframes) and Array{Float64}(1xframes)
>
> each camera frame is of PACKSIZE in bytes.
>
> To summarize I have declared the type of each variable before I start the
> read process.
>
> #Read and assemble the data structured as series of Int16,Int16,Float64...
> for j=1:length(frames);
>     Pos = frames[j]*PACKSIZE ;
>     for k=1:size(key_ind,1)
>         seek(fid,Pos[k]);
>         value = read(ifov_fid, Type[k][1],Num[k]); #Num[k] is all ones.
>         dCell[k][:,j]=value;
>     end
> end
>
> For 500,000 frames it takes as much time as Matlab. Not getting warnings
> or errors but would like to know how to improve the code for speed or
> efficiency.
>
> I am not familiar with binary read and was wondering if this is 'normal'
> and one cannot do any better? If I were to write this in C would it also
> take the same amount (maybe because of my sloppy code)? I should perhaps
> give details on the processor I use but because I am only comparing Matlab
> with Julia - and I am not doing any parallelization - I think that OS
> details are not important?
> In the Matlab code I am declaring few variables types but not being very
> careful as I am in with Julia.
> I am not reading the entire structure i.e., key_ind =1:3 of 200, each with
> different byte size. Therefore, I am not providing @time but if curious it
> is ~30-45 seconds for 500000 frames. The inner for-loop takes about
> ~0.00007 seconds.
> Thanks
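
For what it's worth, here is a minimal sketch of the buffering idea from the top of this message: read a chunk of frames into a byte buffer with one large read, pull out only the fields you need, and repeat until done. This is not the code from the linked bufio.jl, and it is written in current (1.x) Julia syntax. The names read_fields, fill_column! and chunkframes are made up for illustration, and the byte offsets of the three fields within a frame are an assumption, since the real frame layout isn't given in the thread.

```julia
# Sketch only. The frame layout (packsize and the byte offset of each
# wanted field inside a frame) is hypothetical; take it from the real format.

# Copy one field out of each of the `n` frames currently held in `buf`.
# Keeping this in its own function with a concrete `T` makes the inner loop
# type-stable, unlike indexing into an Array{Any}.
function fill_column!(col::Vector{T}, buf::Vector{UInt8}, n::Int,
                      packsize::Int, offset::Int, start::Int) where {T}
    io = IOBuffer(buf)                     # in-memory stream over the chunk
    for i in 0:n-1
        seek(io, i * packsize + offset)    # 0-based byte position in the chunk
        col[start + i] = read(io, T)       # host byte order is assumed
    end
    return col
end

# Read `nframes` frames from `path`, keeping only the fields at `offsets`.
# `types` holds the element type of each field, e.g. (Int16, Int16, Float64).
function read_fields(path::AbstractString, nframes::Int, packsize::Int,
                     offsets, types; chunkframes::Int = 4096)
    cols = map(T -> Vector{T}(undef, nframes), types)  # one typed vector per field
    buf = Vector{UInt8}(undef, chunkframes * packsize)
    open(path, "r") do f
        filled = 0
        while filled < nframes
            n = min(chunkframes, nframes - filled)
            nread = readbytes!(f, buf, n * packsize)   # one disk read per chunk
            nread == n * packsize ||
                error("file ended after $filled complete frames")
            for k in 1:length(cols)
                fill_column!(cols[k], buf, n, packsize, offsets[k], filled + 1)
            end
            filled += n
        end
    end
    return cols
end
```

With a hypothetical 32-byte frame whose three fields start at bytes 0, 2 and 8, the call would look like read_fields("frames.bin", 500000, 32, (0, 2, 8), (Int16, Int16, Float64)). The point of the design is that the file is touched once per chunk rather than once per field per frame, so all the seeks happen in memory. Whether that helps in practice depends on how much of the 30-45 seconds really is I/O.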
