BK> will get to in a moment.  You make a point about the speed of mk
BK> versus the speed of raw python dictionary,
I don't think it could ever get that good -- operations on disk-based
data structures comparable in speed to memory-based ones. It's
unrealistic to expect that, I think, unless you're some sort of
Donald Knuth on steroids.

agreed. i was writing under the impression that all of an mk storage always gets fully loaded into memory on opening; i am happy to hear this is not the case. my caching example was only possible in such a simple way because i know my tables are not too big for memory. i am thinking about functionality somewhere in my wrapping class that manages such caching on demand.
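to make the "caching on demand" idea concrete, here is a minimal sketch of what such a wrapper might look like. all names are hypothetical, and the wrapped view is represented by a plain sequence of row tuples so the sketch stays self-contained (a real metakit view would work the same way, since it is iterable):

```python
# hypothetical sketch of on-demand caching in a wrapping class.
# 'view' stands in for a metakit view; here it is just any
# sequence of (name, comment, termnr) tuples.
class CachedView:
    def __init__(self, view, keyindex=0):
        self._view = view          # the underlying (possibly disk-based) view
        self._keyindex = keyindex  # which column to index the cache by
        self._cache = None         # built lazily, only on first lookup

    def lookup(self, key):
        # build the in-memory index on demand, not at open time,
        # so small tables pay the cost once and big ones never
        # pay it unless they are actually queried by key
        if self._cache is None:
            self._cache = dict((row[self._keyindex], row)
                               for row in self._view)
        return self._cache.get(key)

rows = [('*0*', 'QfoY', 88), ('*1*', 'dn', 430)]
cv = CachedView(rows)
print(cv.lookup('*1*'))  # -> ('*1*', 'dn', 430)
```

the point of the lazy build is exactly the one above: the storage itself is not pulled into memory on opening, only the index for tables one chooses to cache.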

view = storage.getas("test[_B[a:s,b:s,c:s]]").blocked()
view.append(('1','2','3'))

600000, time: 21.30, delta: 2.78
Values written, now syncing, time: 22.83
After syncing: 23.08
end.

i tried this myself. the results so far are slightly puzzling to me. here is a short test report:

============================================================

table creation strings: for blocked and unblocked views:

    node[_B[name:S,comment:S,termnr:I]]
    node[name:S,comment:S,termnr:I]

data: several thousand rows with nearly identical
content. all data was produced prior to each test run and
kept in memory throughout. for mode t1 a list of
tuples was produced, for mode t0 a list of
dictionaries::

    [
        ('*0*', 'QfoY', 88),
        ('*1*', 'dn', 430),
        ('*2*', 'CJnnTZTLD', 502),
        ... ]

    [
        {'termnr': 88, 'comment': 'QfoY', 'name': '*0*'},
        {'termnr': 430, 'comment': 'dn', 'name': '*1*'},
        {'termnr': 502, 'comment': 'CJnnTZTLD', 'name': '*2*'},
        ... ]
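for completeness, the test data could have been generated along these lines. this is only a sketch in the shape of the sample rows above; the seed, the helper name, and the exact value ranges are my own assumptions, not the original generator:

```python
import random
import string

random.seed(0)  # arbitrary seed, purely for reproducibility

def make_rows(count):
    # produce rows shaped like ('*0*', 'QfoY', 88), plus the
    # equivalent list of dictionaries for the t0 test mode
    tuples = []
    for i in range(count):
        comment = ''.join(random.choice(string.ascii_letters)
                          for _ in range(random.randint(2, 10)))
        tuples.append(('*%d*' % i, comment, random.randint(0, 999)))
    dicts = [{'name': n, 'comment': c, 'termnr': t}
             for n, c, t in tuples]
    return tuples, dicts

ENTRYTUPLES, ENTRYDICTS = make_rows(1000)
```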



core code::

    stopwatch.start( _inter( '$ax $bx $ex $tx' ) )
    if useAppend:
        if useTuples:
            for entry in ENTRYTUPLES:
                targetTable.append( entry )
        else:
            for entry in ENTRYDICTS:
                targetTable.append( **entry )
    else:
        if useTuples:
            targetTable[ 0 : ROWCOUNT ] = ENTRYTUPLES
        else:
            targetTable[ 0 : ROWCOUNT ] = ENTRYDICTS
    stopwatch.stop()
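as an aside, a minimal stand-in for the stopwatch object used in the core code might look like the following. the real one presumably also interpolates the `$ax $bx $ex $tx` label and formats the report lines; this sketch only covers start/stop and keeps the labeled timings:

```python
import time

class Stopwatch:
    # minimal hypothetical stand-in for the stopwatch in the
    # core code above: records elapsed wall-clock time per label
    def __init__(self):
        self.timings = {}
        self._label = None
        self._t0 = None

    def start(self, label):
        self._label = label
        self._t0 = time.time()

    def stop(self):
        self.timings[self._label] = time.time() - self._t0

stopwatch = Stopwatch()
stopwatch.start('a0 b0 x0 t0')
sum(range(100000))  # stand-in for the timed append/assignment work
stopwatch.stop()
```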

the test runs created 16 data storages, each about 2MB in size.

results::

    test run 2, 100'000 rows

TOTAL : 749.9780
a0 b0 x0 t0: 10.1340 ****
a1 b1 x1 t1: 10.3540 *****
a0 b0 x0 t1: 11.0860 *****
a1 b0 x0 t0: 11.7670 *****
a1 b1 x0 t1: 14.6310 ******
a0 b0 x1 t1: 22.5230 **********
a0 b0 x1 t0: 27.2790 ************
a1 b1 x1 t0: 28.9120 *************
a1 b0 x1 t0: 41.5600 ******************
a0 b1 x1 t1: 60.7580 ***************************
a1 b0 x0 t1: 63.1210 ****************************
a0 b1 x1 t0: 65.4740 *****************************
a1 b0 x1 t1: 67.9580 ******************************
a1 b1 x0 t0: 78.3630 **********************************
a0 b1 x0 t1: 109.0270 ************************************************
a0 b1 x0 t0: 114.1740 **************************************************



    test run 2, 50'000 rows

TOTAL : 187.3690
a1 b1 x1 t1: 3.7960 ********
a0 b0 x0 t1: 4.0160 ********
a1 b1 x0 t1: 4.4560 *********
a0 b0 x0 t0: 4.9470 **********
a1 b0 x0 t0: 4.9770 **********
a0 b0 x1 t1: 5.3680 ***********
a0 b0 x1 t0: 6.6990 **************
a1 b1 x1 t0: 9.2140 *******************
a1 b0 x1 t0: 11.0960 ***********************
a1 b0 x0 t1: 13.3400 ****************************
a1 b1 x0 t0: 14.8110 *******************************
a1 b0 x1 t1: 16.5630 ***********************************
a0 b1 x1 t1: 16.6240 ***********************************
a0 b1 x1 t0: 17.5650 *************************************
a0 b1 x0 t1: 23.6340 *************************************************
a0 b1 x0 t0: 23.9150 **************************************************


    a0  --  use slice assignment (see code)
    a1  --  use append with loop (see code)

    b0  --  do not use blocked view
    b1  --  use blocked view

    x0  --  use normal commit mode
    x1  --  use extend commit mode

    t1  --  use tuples (see code)
    t0  --  use dictionaries (see code)

============================================================
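the 16 rows in each table are just the cartesian product of the four binary factors from the legend, which can be sketched like this:

```python
from itertools import product

# enumerate all 16 test configurations from the four binary
# factors (assignment mode, blocked view, commit mode, row type)
configs = [' '.join(f) for f in product(('a0', 'a1'), ('b0', 'b1'),
                                        ('x0', 'x1'), ('t0', 't1'))]
print(len(configs))  # 16
print(configs[0])    # a0 b0 x0 t0
```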

there are huge differences in the timings, but i find
myself unable to distill any kind of clear policy for
using metakit from them -- the 0s and 1s of all four
options seem to be scattered all over the plot. i would
have expected slice assignment from a list of tuples on
a blocked view in a storage opened with extend-commit
mode to be fastest, but the top runners in both runs at
best only partially corroborate that expectation.
furthermore, the results do not allow the interpretation
that these factors act together in a synergetic way:
even if we say that factor b (blocked views) does not
kick in here because even 100'000 rows are not enough,
the other factors still do not appear to act together.

the only three interpretations i have to offer right now are:

1)  the testing code contains some grave blunder that mars
    the results;

2)  what is missing here is a large number of randomly
    shuffled test runs -- perhaps the order in which the
    storages were produced matters (i cannot see how,
    but i'll try);

3)  the results are correct and metakit's behavior simply
    *is* not very predictable.

perhaps someone would be eager to falsify at least the
last hypothesis.
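regarding interpretation 2), the shuffled repetition could look like the following sketch. the configuration tuples mirror the four binary factors from the legend; `run_one` is a hypothetical placeholder for the actual timed test of one configuration:

```python
import random
from itertools import product

# all 16 (a, b, x, t) flag combinations from the legend
configs = list(product((0, 1), repeat=4))

def run_shuffled(repeats, run_one):
    # run every configuration 'repeats' times, in a fresh random
    # order on each pass, so that ordering effects average out
    results = {}
    for _ in range(repeats):
        order = configs[:]
        random.shuffle(order)
        for cfg in order:
            results.setdefault(cfg, []).append(run_one(cfg))
    return results

# stand-in for the real timed test of one configuration
results = run_shuffled(3, lambda cfg: sum(cfg))
```

averaging the timings per configuration over many such shuffled passes would show whether the scatter above is real or an artifact of run order.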


_wolf

_____________________________________________ Metakit mailing list - [email protected] http://www.equi4.com/mailman/listinfo/metakit