I'm very much hoping that Mark W might magically fix this in 9.6.5.

But in the meantime, FWIW, the place where this was really hurting (a script that took 8 minutes under LC6 was taking 8 hours under LC9, though I've gradually tamed it down to under an hour by buffering the large accumulations) was a single sort command, on 70 MB of data in approx 223,000 lines.

I've replaced this line:
        sort lines of tNewTable by item iSortCol of each

which took 1 second on Mac and 2063 seconds (i.e. about 34 minutes) on Windows, with a call to this command:

        command sortLinesByTabbedColumn @tTable, iSortCol
           local aTable, tSortTable, iARcounter, tARbuffer, tRow, k

           -- load table into an array for fast access by line number
           put tTable into aTable
           split aTable using return

           -- compile an index of just the column to sort on, plus the line number
           -- (appendRow is the buffer-accumulation helper; see the sketch below)
           set the itemDelimiter to tab
           repeat for each key k in aTable
              get (item iSortCol of aTable[k]) && k
              appendRow it, iARcounter, tARbuffer, tSortTable
           end repeat
           put tARbuffer after tSortTable

           -- sort the index
           sort lines of tSortTable

           -- rebuild the table out of the array, in sorted order
           put empty into tARbuffer
           put empty into tTable
           repeat for each line tRow in tSortTable
              put last word of tRow into k
              appendRow aTable[k], iARcounter, tARbuffer, tTable
           end repeat
           put tARbuffer after tTable

        end sortLinesByTabbedColumn

which takes 25 seconds on Windows (to my surprise, most of that time was in the final 'rebuild' loop).
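For completeness: appendRow is the buffering helper along the lines Mark suggested earlier in the thread. This isn't necessarily the exact version used above, but a minimal sketch of the idea (the 10,000-row flush threshold and the defensive counter initialisation are arbitrary choices):

        -- accumulate rows in a small buffer var, and only append the buffer
        -- to the big variable every kFlushEvery rows, so the engine doesn't
        -- reallocate the large string on every iteration
        constant kFlushEvery = 10000

        command appendRow pRow, @xCounter, @xBuffer, @xTarget
           if xCounter is not a number then put 0 into xCounter
           put pRow & return after xBuffer
           add 1 to xCounter
           if xCounter >= kFlushEvery then
              put xBuffer after xTarget
              put empty into xBuffer
              put 0 into xCounter
           end if
        end appendRow

(The caller still appends any remaining tARbuffer after each loop, as above.)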


On 02/09/2021 23:53, Bob Sneidar via use-livecode wrote:
I am going to say no, because you still have to traverse the file once to get 
it into SQLite, then do the sort, then write out the file when done. I might be 
mistaken; the subsequent SQL sort may make up for the lost time. An in-memory 
SQL database really shines when you need to make multiple passes at the data 
using different queries. One pass may not impress you much.

For instance, I have a File Management module built into my application. A file 
can belong to a customer, and also to a site, and also to a device. Like so:

custid  siteid  deviceid  filepath
123                       disk/folder/file1
456     098               disk/folder/file2
789     765     432       disk/folder/file3

Note all have a custid, some have a siteid as well, and some also have a 
deviceid.

So rather than query MySQL for the files for each site or device as I select 
them, I instead, upon selecting a customer, query MySQL for ALL the file 
records for that customer (which of course include the file records for all 
the sites and devices), then store them in a memory database. Then when a 
different site or device belonging to that customer is selected, I query the 
memory database for the records belonging to that site or that device, in 
those modules respectively.

The performance enhancement is significant.
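
In LiveCode terms the pattern is roughly this -- a sketch only, with invented 
table and column names, using the standard revDB calls:

        -- Sketch: cache ALL of one customer's file records in an
        -- in-memory SQLite database (table/column names invented)
        local sMemDB

        command cacheCustomerFiles pCustID, pMySQLConnID
           local tRecords, tRow, tCust, tSite, tDevice, tPath

           -- ":memory:" gives a private in-memory SQLite database
           put revOpenDatabase("sqlite", ":memory:", , , ) into sMemDB
           revExecuteSQL sMemDB, "CREATE TABLE files (custid, siteid, deviceid, filepath)"

           -- one round trip to the server for ALL this customer's files
           put revDataFromQuery(tab, return, pMySQLConnID, \
                 "SELECT custid, siteid, deviceid, filepath FROM files" && \
                 "WHERE custid =" && pCustID) into tRecords

           -- load the rows into the memory database
           set the itemDelimiter to tab
           repeat for each line tRow in tRecords
              put item 1 of tRow into tCust
              put item 2 of tRow into tSite
              put item 3 of tRow into tDevice
              put item 4 of tRow into tPath
              revExecuteSQL sMemDB, "INSERT INTO files VALUES (:1,:2,:3,:4)", \
                    "tCust", "tSite", "tDevice", "tPath"
           end repeat
        end cacheCustomerFiles

        -- thereafter, selecting a site or device costs no server round trip
        function filesForSite pSiteID
           return revDataFromQuery(tab, return, sMemDB, \
                 "SELECT filepath FROM files WHERE siteid =" && pSiteID)
        end filesForSite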

Another way I apply this is to get the objects on a card, passing a list of 
properties I'm interested in, then store the data in a memory database. I can 
then query for objects with certain properties without having to iterate 
through all the objects on the card in a repeat loop. For instance: the 
farthest left, top, right and bottom objects whose visible is true, in four 
memory db queries, giving me the total rect of all the visible objects without 
grouping/ungrouping and the hell that can ensue.
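
A rough sketch of that last trick (illustrative names only, not production 
code; it collects the geometry into a memory db in one pass, then asks SQL 
for the aggregates):

        -- Sketch: union rect of all visible objects on the card via a
        -- memory database
        command getVisibleBounds @rTotalRect
           local tDB, tName, tVis, tRect, tL, tT, tR, tB, i

           put revOpenDatabase("sqlite", ":memory:", , , ) into tDB
           revExecuteSQL tDB, "CREATE TABLE objs (name, vis, L, T, R, B)"

           repeat with i = 1 to the number of controls of this card
              put the short name of control i into tName
              put the visible of control i into tVis
              put the rect of control i into tRect -- "left,top,right,bottom"
              put item 1 of tRect into tL
              put item 2 of tRect into tT
              put item 3 of tRect into tR
              put item 4 of tRect into tB
              revExecuteSQL tDB, "INSERT INTO objs VALUES (:1,:2,:3,:4,:5,:6)", \
                    "tName", "tVis", "tL", "tT", "tR", "tB"
           end repeat

           -- one aggregate query instead of another repeat loop
           put revDataFromQuery(comma, return, tDB, \
                 "SELECT MIN(L), MIN(T), MAX(R), MAX(B) FROM objs" && \
                 "WHERE vis = 'true'") into rTotalRect
           revCloseDatabase tDB
        end getVisibleBounds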

Bob S


On Sep 2, 2021, at 11:22, Bernard Devlin via use-livecode 
<use-livecode@lists.runrev.com> wrote:

Whilst waiting for a fix, would a temporary solution be to use SQLite to
create an in-memory database and let SQLite do the sorting for you?
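
Something along these lines, perhaps -- just a sketch, assuming a
three-column tab-delimited table and the revDB library, and untested at
70 MB scale:

        -- Sketch: let an in-memory SQLite db do the sort
        function sortViaSQLite pTable, pSortCol
           local tDB, tRow, tC1, tC2, tC3, tSorted

           put revOpenDatabase("sqlite", ":memory:", , , ) into tDB
           revExecuteSQL tDB, "CREATE TABLE t (c1, c2, c3)"

           set the itemDelimiter to tab
           repeat for each line tRow in pTable
              put item 1 of tRow into tC1
              put item 2 of tRow into tC2
              put item 3 of tRow into tC3
              revExecuteSQL tDB, "INSERT INTO t VALUES (:1,:2,:3)", \
                    "tC1", "tC2", "tC3"
           end repeat

           -- ORDER BY the chosen column, e.g. "c2" when pSortCol is 2
           put revDataFromQuery(tab, return, tDB, \
                 "SELECT * FROM t ORDER BY c" & pSortCol) into tSorted
           revCloseDatabase tDB
           return tSorted
        end sortViaSQLite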

Regards, Bernard.

On Mon, Aug 30, 2021 at 8:23 PM Ben Rubinstein via use-livecode 
<use-livecode@lists.runrev.com> wrote:

Thanks to Mark Waddingham's advice about using a buffer var when accumulating 
a large text variable in stages, I've now got a script that took 8 hours under 
LC9 (8 minutes under LC6) down by stages to just under 1 hour under LC9.

However I have some remaining issues not amenable to this approach, of which 
the most significant relates to the sort command.

In all cases it seems to take much longer under LC9 than it did under LC6, 
although the factor is quite variable. The most dramatic is one instance, in 
which this statement:

        sort lines of tNewTable by item iSortCol of each

takes 35 minutes to execute. `tNewTable` is a variable consisting of some 
223,000 lines of text; approx 70 MB. The exact same statement with the same 
data on the same computer in LC6 takes just 1 second.

Has anyone else noticed something of this sort? As I said, the effect varies: 
e.g. 54 seconds versus 1 second; 22 seconds versus 1 second. So it may not be 
so noticeable in all cases.

TIA,

Ben


