John Little wrote:
> On Jan 21, 9:56 am, Bram Moolenaar <[email protected]> wrote:
>
>> Sounds good. I'll add this in the todo list.
>
> There is already an item about readfile() and realloc in the todo
> list:
>
> 8  When editing a file with extremely long lines (e.g., an executable),
>    the "linerest" in readfile() is allocated twice to be able to copy
>    what was read so far.  Use realloc() instead?  Or split the line when
>    allocating memory fails and "linerest" is big (> 100000)?
>
> I read this as worrying about the case where there's limited memory,
> and the copy from readfile's working string to store in the list. I
> suspect that the item was really prompted by the problem of the O(n^2)
> growth strategy causing poor performance, which hit me in my use of
> the yank ring plugin.
>
> I was going to consider realloc soon, but have worked to establish
> correctness first. I get a similar speed up as Dominique using a 50%
> growth factor for the working string, and I was thinking about how
> realloc could be applied to avoid copying a very long line. I wonder
> how good realloc implementations are at reducing the length of
> strings.
Hi John,

The speed of realloc() can depend at least on the OS and on memory
fragmentation. But realloc() can only be as good as or better than the
old pattern of allocating a new block, memmove(), then freeing the old
block. I'm not sure whether a growth factor as in ga_grow(...) would be
better. In practice, with the realloc patch, I see the time grow
linearly with the line size, rather than quadratically as before the
patch, as shown in these measurements:
$ cat time-readfile.txt
              time      time
line          with      without
size          patch     patch
-------------------------------
   500000     0.04        0.46
  1000000     0.04        2.40
  2000000     0.06       12.26
  4000000     0.08       61.69
  8000000     0.13      252.75
 16000000     0.22
 32000000     0.39
 64000000     0.75
128000000     1.47
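
As a rough check of the linear vs. quadratic reading, one can look at how
the time changes each time the line size doubles: a ratio near 2 suggests
linear growth, a ratio near 4 or more suggests quadratic growth. The
sketch below only re-uses the numbers from the table above; the helper
name doubling_ratios is mine, not something from Vim or the patch.

```python
# Timings copied from time-readfile.txt: line size in bytes -> seconds.
with_patch = {
    500000: 0.04, 1000000: 0.04, 2000000: 0.06, 4000000: 0.08,
    8000000: 0.13, 16000000: 0.22, 32000000: 0.39, 64000000: 0.75,
    128000000: 1.47,
}
without_patch = {
    500000: 0.46, 1000000: 2.40, 2000000: 12.26, 4000000: 61.69,
    8000000: 252.75,
}


def doubling_ratios(timings):
    """Return time(2n) / time(n) for each consecutive pair of sizes.

    Assumes the sizes in `timings` double from one entry to the next,
    which is the case for the measurements above.
    """
    sizes = sorted(timings)
    return [timings[b] / timings[a] for a, b in zip(sizes, sizes[1:])]


for name, data in (("with patch", with_patch),
                   ("without patch", without_patch)):
    print(name, [round(r, 2) for r in doubling_ratios(data)])
```

Without the patch every ratio is above 4. With the patch the ratios stay
below 2 and get close to 2 for the largest sizes, where the timings are
large enough to be less affected by measurement noise.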
$ cat time-readfile.gnuplot
set xtics rotate
set xlabel "size in mega-bytes of file big-line.txt"
set ylabel "time in sec"
set grid
set title "time in sec of: vim -u NONE -c \":call readfile('big-line.txt')|q\""
set terminal png
plot 'time-readfile.txt' using ($1/1000000):2 with linespoints title 'time after patch'
$ gnuplot time-readfile.gnuplot < time-readfile.txt > time-readfile.png
The graph after the patch is here:
http://dominique.pelle.free.fr/time-readfile.png
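
To illustrate why the alloc + memmove + free pattern can go quadratic,
here is a small model that counts the bytes copied while a buffer grows.
It is only a sketch under assumptions: the 8192-byte chunk size and the
function names are made up for the example, and it does not model
realloc() itself (which may extend a block in place without copying). It
compares growing to exactly the needed size with the 50% growth factor
John mentioned.

```python
def bytes_copied(total, next_capacity, chunk=8192):
    """Count bytes copied while appending `chunk` bytes at a time.

    Models the alloc-new-block + copy + free-old-block pattern: whenever
    the buffer is too small, a bigger one is allocated and everything
    stored so far is copied into it. `chunk` is an assumed read size.
    """
    capacity = used = copied = 0
    while used < total:
        needed = used + chunk
        if needed > capacity:
            capacity = next_capacity(capacity, needed)
            copied += used  # old contents are copied to the new block
        used = needed
    return copied


def grow_exact(capacity, needed):
    """Hypothetical strategy: allocate exactly what is needed."""
    return needed


def grow_by_half(capacity, needed):
    """Hypothetical strategy: grow by at least 50% each time."""
    return max(needed, capacity + capacity // 2)


for size in (1 << 20, 1 << 21, 1 << 22):
    print(size, bytes_copied(size, grow_exact),
          bytes_copied(size, grow_by_half))
```

In this model, doubling the line size roughly quadruples the bytes copied
with grow_exact, while with grow_by_half the total copied stays within a
small constant multiple of the line size.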
> The old code's handling of allocation failures is haphazard, or at
> least I can't see a clear outcome if an allocation fails. I would
> like to test with such failures; what's an easy way to do this on
> Linux? And can we write tests for this kind of problem?
Yes, I'm not sure either what happens if it runs out of memory.
But with the patch, it should behave as before. Also, running out
of memory becomes less likely, as realloc() generally does not
need to hold both the old and the new buffer at the same time.
I also suspect that with realloc() there is less fragmentation,
as there are fewer free() calls, but intuition can be wrong.
To test memory allocation failures on Linux, you can limit the
amount of virtual memory of the shell and its subprocesses
with: ulimit -v <kbytes>.
For example:
# Create a file of 100 Mb and a file of 10 Mb
$ perl -e 'print "x" x 100000000' > file-100Mb.txt
$ perl -e 'print "x" x 10000000' > file-10Mb.txt
# First test vim without ulimit...
$ vim -u NONE -c ":echo len(readfile('file-10Mb.txt')[0])"
(this prints 10000000: good, it works)
$ vim -u NONE -c ":echo len(readfile('file-100Mb.txt')[0])"
(this prints 100000000: good, it works)
# Now limit virtual memory to a max of 100000 Kb = 100 Mb
$ ulimit -v 100000
# Doing readfile on a 10Mb line still works.
$ vim -u NONE -c ":echo len(readfile('file-10Mb.txt')[0])"
(prints 10000000)
# Doing a readfile on a 100Mb line no longer works.
# Vim exceeds the 100Mb virtual memory limit and crashes.
$ vim -u NONE -c ":echo len(readfile('file-100Mb.txt')[0])"
Vim: Caught deadly signal SEGV
Vim: Finished.
Segmentation fault (core dumped)
I have not tried to debug the crash, but in my opinion,
when running out of memory, it is usually hopeless to try
to recover, except in a few critical places.
-- Dominique