From a quick read, I get the impression that you're still incrementally extending the length of the result. Done row by row, that is an O(n^2) process overall: there's a lot of catenation in your code, and each catenation almost certainly copies the entire result built so far.
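To put numbers on it: if each of the roughly 14000 rows is catenated onto the result one at a time, the k-th catenation has to copy a result about k rows long, so the total work is on the order of 1+2+...+14000, i.e. about 10^8 row copies. That quadratic growth would also explain why each row takes longer to load than the one before.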
Try this instead:

1. Get the size of the file to be read.
2. Preallocate a vector large enough to hold the entire file.
3. Read the file by chunks (I'm assuming that lib_file_io won't let you
   read it all in one go) and use indexed assignment to copy each chunk
   into its position in the preallocated vector.

A sketch of this pattern follows the quoted message below.

On Fri, 2014-04-25 at 00:21 +0800, Elias Mårtenson wrote:
> In writing a function that uses lib_file_io to load the content of an
> entire file into an array of strings, I came across a bad performance
> problem that I am having trouble narrowing down.
>
> Here is my current version of the function:
> https://github.com/lokedhs/apl-tools/blob/e3e81816f3ccb4d8c56acc8e4012d53f05de96d6/io.apl#L8
>
> The first version did not do blocked reads and resized the array after
> each row was read. That was terribly slow, so now I preallocate a block
> of 1000 elements and resize every 1000 lines, giving the version you
> can see linked above.
>
> I was testing with a text file containing almost 14000 rows, and on my
> laptop it takes many minutes to load the file. One would expect that
> loading such a small file should not take any noticeable time at all.
>
> One interesting aspect of this is that it takes longer and longer to
> load each row as the loading proceeds. I have no explanation for why
> that is the case. It is not the resizing that takes the time: I
> measured the time taken to load a block of rows excluding the array
> resize.
>
> Any ideas?
>
> Regards,
> Elias
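For concreteness, here is a minimal sketch of the preallocate-then-fill pattern. Note the assumptions: FIO∆fsize, FIO∆fopen, FIO∆fread and FIO∆fclose are placeholder names standing in for whichever lib_file_io calls do the stat/open/read/close in your workspace (they are not the real names), and I'm assuming the read call returns an empty vector at end of file; adjust to the actual interface.

    ∇Z←READ∆FILE NAME;FD;SIZE;POS;BLOCK;RC
     ⍝ FIO∆fsize / FIO∆fopen / FIO∆fread / FIO∆fclose are placeholders
     ⍝ for the corresponding lib_file_io calls, not their real names.
     SIZE←FIO∆fsize NAME           ⍝ 1. size of the file in bytes
     Z←SIZE⍴0                      ⍝ 2. preallocate the whole result once
     FD←FIO∆fopen NAME
     POS←0
    LOOP: BLOCK←FIO∆fread FD       ⍝ 3. read the next chunk (byte vector)
     →(0=↑⍴BLOCK)/DONE             ⍝ empty chunk assumed to mean end of file
     Z[POS+⍳↑⍴BLOCK]←BLOCK         ⍝ indexed assignment: no copying of Z
     POS←POS+↑⍴BLOCK
     →LOOP
    DONE: RC←FIO∆fclose FD
    ∇

Once the whole file is in Z, turning it into your array of strings (converting the bytes to characters and partitioning on newlines) can then be done as one whole-array operation instead of growing the result row by row.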