"read...until <string>" -- buffer size

Richard Gaskin Tue, 05 Aug 2014 08:40:21 -0700

The "read...until <string>" form is wonderfully convenient, but very slow.

It's so slow, in fact, that I've found I can write a few dozen lines of code to perform a functionally identical task at much greater speed.

The algo I use I picked up from some old HyperCard article back in the day. In short, I read a specified amount from disk into a buffer, then walk through the lines within the buffer in memory. When I reach the last line in the buffer, which doesn't have a terminator, I read another batch from the file, append it to my buffer, and repeat this process until I've completed my traversal of the file.
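
As a rough illustration of that approach, here is a minimal sketch in Python rather than LiveCode. It is not my actual script; the function name iter_lines_buffered and the 128 KiB default are made up for the example. It reads a fixed-size chunk, splits it on the delimiter, and carries the unterminated last piece over to the next read.

```python
def iter_lines_buffered(path, buf_size=128 * 1024, delim=b"\n"):
    """Yield delimiter-separated records, reading buf_size bytes per disk read.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    with open(path, "rb") as f:
        tail = b""  # unterminated fragment carried over from the previous chunk
        while True:
            chunk = f.read(buf_size)
            if not chunk:  # end of file
                break
            parts = (tail + chunk).split(delim)
            tail = parts.pop()  # last piece has no terminator yet; keep it
            for part in parts:
                yield part
        if tail:
            # The file did not end with a delimiter; emit the final record.
            yield tail
```

Because the carried-over tail is joined to the next chunk before splitting, a record (or a multi-byte delimiter) that straddles a chunk boundary still comes out whole.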

While this has given me a satisfying speed bump well worth the time it took to write it, it occurs to me that it shouldn't be necessary at all.

After all, it's not like "read...until <string>" is reading only one byte at a time from disk; I haven't read that part of the source but I'd be surprised if it's not reading at least one block's worth in each pass (usually 4k, depending on the file system).

So in essence, it would seem the current "read...until <string>" algo in the engine is nearly identical to what I'm doing in script, with only one critical difference: the buffer size.

Oddly enough, experimenting with different buffer sizes has yielded surprising results. At first I thought that minimizing disk access would be the primary boost, so I tried reading in 1 MB chunks but found far greater performance with much smaller amounts, even though it meant touching the disk more often. Ultimately, it seems the optimal buffer size in my experiments on one system was 128k; when reading smaller amounts the extra disk accesses take a toll, and when reading larger amounts it's also slower, perhaps due to malloc anomalies or the way LC uses malloc in that context.
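
For anyone who wants to try the same kind of comparison, here is a hedged sketch of a timing harness, again in Python rather than LiveCode. The names count_lines and time_buffer_sizes are hypothetical, and it assumes a single-byte line terminator. It says nothing about what the engine does internally, and results will vary with the OS file cache and the hardware.

```python
import time

def count_lines(path, buf_size):
    """Count newline-terminated lines, reading buf_size bytes per call."""
    count = 0
    unterminated = False  # True if the last chunk did not end with a newline
    # buffering=0 disables Python's own buffering so buf_size is what is requested
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(buf_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
            unterminated = not chunk.endswith(b"\n")
    return count + (1 if unterminated else 0)

def time_buffer_sizes(path, sizes, repeats=3):
    """Return {buf_size: (best_seconds, line_count)} for each size in sizes."""
    results = {}
    for size in sizes:
        best = float("inf")
        lines = 0
        for _ in range(repeats):
            start = time.perf_counter()
            lines = count_lines(path, size)
            best = min(best, time.perf_counter() - start)
        results[size] = (best, lines)
    return results
```

Calling time_buffer_sizes(path, [4096, 131072, 1048576]) on a large file gives one best-of-three timing per buffer size; the line counts should match across sizes, which doubles as a sanity check.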


All this leaves me with a proposal:

Should we consider adding an optional argument for "read...until <string>" to specify the buffer size the engine will use?


E.g.:

  read from file tMyFile until CR with buffer 128000

...or:

  read from file tMyFile until CR for 128000

...or if we're going to try to be more English-like about it:

  read from file tMyFile until CR for 128k


Worth submitting as an enhancement request?

Anyone here in a position to implement this themselves?

And is there anyone here who happens to know the buffer size the engine currently uses for this?


--
 Richard Gaskin
 Fourth World Systems
 Software Design and Development for the Desktop, Mobile, and the Web
 ____________________________________________________________________
 ambassa...@fourthworld.com                http://www.FourthWorld.com
