Re: "read...until <string>" -- buffer size

Alex Tweedly Tue, 05 Aug 2014 14:18:13 -0700

Summary: Richard said:

Should we consider adding an optional argument for "read...until <string>" to specify the buffer size the engine will use?

IMHO, No.

LC is supposed to be an easy-to-use language/system. I don't need to deal with malloc/free, I'm not vulnerable to memory leaks, I don't need to fiddle with unnecessary detail - LC deals with it for me. So rather than this "enhancement request", I'd say this would be a

    "report of a bug due to unacceptable performance",

and it should just say something like "Make this work at acceptable speed so it can be used" :-).

Even if you or I were able and willing to experiment with different buffer sizes, how could we choose suitable sizes for all the different OSes, disk setups, etc. that users might end up using?

I wonder if the problem is some generality in the string search for the terminator? Shouldn't be, really - but if it is, I guess it would be OK to have a more restricted form, such as


 read lines from file ...

if that would get better performance.

But I would definitely want to make it easier to get good results, not harder :-)

-- Alex.


On 05/08/2014 16:31, Richard Gaskin wrote:

The "read...until <string>" form is wonderfully convenient, but very slow.
It's so slow, in fact, that I've found I can write a few dozen lines of code to perform a functionally identical task at much greater speed.
The algo I use I picked up from some old HyperCard article back in the day. In short, I read from disk into a buffer by a specified amount, then walk through the lines within the buffer in memory. When I reach the last item, which doesn't have a terminator, I read another batch from the file, appending it to my buffer, and repeat this process until I've completed my traversal of the file.
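
For illustration, the chunked approach described above might look roughly like the sketch below. It is written in Python rather than LiveCode script, purely so it can be run standalone. The function name `iter_lines_buffered`, the 128 KiB default and the byte-string terminator are assumptions made for the example; this is not Richard's actual script, nor anything taken from the engine source.

```python
def iter_lines_buffered(path, buffer_size=128 * 1024, terminator=b"\n"):
    """Yield terminator-delimited records, reading buffer_size bytes per disk read.

    Hypothetical helper for illustration only.
    """
    pending = b""  # trailing partial record carried over between reads
    # buffering=0 turns off Python's own buffering so buffer_size is what
    # actually governs each read request.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:  # empty read means end of file
                break
            pending += chunk
            # Everything before the last terminator is complete; the last
            # piece has no terminator yet, so keep it for the next pass.
            *complete, pending = pending.split(terminator)
            for record in complete:
                yield record
    if pending:
        yield pending  # final record that had no terminator
```

A caller would simply loop over it, e.g. `for line in iter_lines_buffered("data.txt"): ...`. Because the unterminated tail is carried over and re-split after the next read, a terminator that happens to straddle two reads is still found.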
While this has given me a satisfying speed bump well worth the time it took to write it, it occurs to me that it shouldn't be necessary at all.
After all, it's not like "read...until <string>" is reading only one byte at a time from disk; I haven't read that part of the source but I'd be surprised if it's not reading at least one block's worth in each pass (usually 4k, depending on the file system).
So in essence, it would seem the current "read...until <string>" algo in the engine is nearly identical to what I'm doing in script, with only one critical difference: the buffer size.
Oddly enough, experimenting with different buffer sizes has yielded surprising results. At first I thought that minimizing disk access would be the primary boost, so I tried reading in 1 MB chunks but found far greater performance with much smaller amounts, even though it meant touching the disk more often. Ultimately, it seems the optimal buffer size in my experiments on one system was 128k; when reading smaller amounts the extra disk accesses take a toll, and when reading larger amounts it's also slower, perhaps due to malloc anomalies or the way LC uses malloc in that context.
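
Anyone wanting to repeat this kind of buffer-size experiment could start from a small timing harness like the one below. It is a Python sketch, not LiveCode, and the names `count_lines` and `time_buffer_sizes`, the sample file and the three sizes tried are all assumptions chosen for the example. The numbers it prints depend heavily on the OS file cache and the hardware, and say nothing about how the LC engine itself behaves.

```python
import os
import tempfile
import time


def count_lines(path, buffer_size):
    """Count newline-terminated lines, reading buffer_size bytes per call."""
    count = 0
    last = b""
    with open(path, "rb", buffering=0) as f:  # no extra Python-level buffering
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
            last = chunk[-1:]
    # A final line without a terminator still counts as a line.
    if last and last != b"\n":
        count += 1
    return count


def time_buffer_sizes(path, sizes):
    """Return {buffer_size: (seconds, line_count)} for one pass per size."""
    results = {}
    for size in sizes:
        start = time.perf_counter()
        lines = count_lines(path, size)
        results[size] = (time.perf_counter() - start, lines)
    return results


if __name__ == "__main__":
    # Hypothetical test data: 200,000 short lines in a temporary file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"some sample line of text\n" * 200_000)
    try:
        sizes = [4 * 1024, 128 * 1024, 1024 * 1024]
        for size, (secs, lines) in time_buffer_sizes(tmp.name, sizes).items():
            print(f"{size:>8} bytes: {secs:.4f} s, {lines} lines")
    finally:
        os.remove(tmp.name)
```

Running each size several times and discarding the first pass helps separate cold-cache disk reads from everything else.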
All this leaves me with a proposal:
Should we consider adding an optional argument for "read...until <string>" to specify the buffer size the engine will use?
E.g.:

  read from file tMyFile until CR with buffer 128000

...or:

  read from file tMyFile until CR for 128000

...or if we're going to try to be more English-like about it:

  read from file tMyFile until CR for 128k


Worth submitting as an enhancement request?

Anyone here in a position to implement this themselves?
And is there anyone here who happens to know the buffer size the engine currently uses for this?


