BK> will get to in a moment. You make a point about the speed of mk
BK> versus the speed of raw python dictionary, I don't think it could
BK> ever get as good as operations on disk-based data structures being
BK> comparable in speed to memory-based data structures. It's
BK> unrealistic to expect that I think, unless you're some sort of
BK> Donald Knuth on steroids.
agreed. i was writing under the impression that an mk storage always gets fully loaded into memory on opening; i am happy to hear this is not the case. my caching example was only possible in such a simple way because i know my tables are not too big for memory. i am thinking about functionality somewhere in my wrapping class that manages such caching on demand.
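such on-demand caching could be sketched in plain python roughly like this; the `RowCache` name and the `load_row` callable are invented for illustration here, they are not part of the metakit api or of my actual wrapper:

```python
# a minimal sketch of on-demand row caching, assuming the backing
# store exposes a load-one-row callable; the names are invented for
# illustration and are not part of the metakit api.

class RowCache:
    """cache rows lazily instead of loading the whole table up front."""

    def __init__(self, load_row):
        self._load_row = load_row   # callable: index -> row
        self._cache = {}            # index -> cached row

    def __getitem__(self, index):
        # hit the backing store only on first access
        if index not in self._cache:
            self._cache[index] = self._load_row(index)
        return self._cache[index]

    def invalidate(self, index=None):
        # drop one row, or everything, so the next access reloads
        if index is None:
            self._cache.clear()
        else:
            self._cache.pop(index, None)
```

the point of the sketch is only that rows enter memory when first touched, and can be dropped again, instead of materializing the whole table on open.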
view = storage.getas("test[_B[a:s,b:s,c:s]]").blocked()
view.append(('1','2','3'))
600000, time: 21.30, delta: 2.78
Values written, now syncing, time: 22.83
After syncing: 23.08
end.
i tried this myself. the results so far are slightly puzzling to me. here is a short test report:
============================================================
table creation strings for blocked and unblocked views::

  node[_B[name:S,comment:S,termnr:I]]
  node[name:S,comment:S,termnr:I]

data: several thousands of rows with nearly identical content. all data was produced prior to each test run and kept in memory throughout. for mode t1 a list of tuples, for mode t0 a list of dictionaries was produced::
[
('*0*', 'QfoY', 88),
('*1*', 'dn', 430),
('*2*', 'CJnnTZTLD', 502),
... ]

[
{'termnr': 88, 'comment': 'QfoY', 'name': '*0*'},
{'termnr': 430, 'comment': 'dn', 'name': '*1*'},
{'termnr': 502, 'comment': 'CJnnTZTLD', 'name': '*2*'},
... ]

core code::
stopwatch.start( _inter( '$ax $bx $ex $tx' ) )
if useAppend:
if useTuples:
for entry in ENTRYTUPLES:
targetTable.append( entry )
else:
for entry in ENTRYDICTS:
targetTable.append( **entry )
else:
if useTuples:
targetTable[ 0 : ROWCOUNT ] = ENTRYTUPLES
else:
targetTable[ 0 : ROWCOUNT ] = ENTRYDICTS
    stopwatch.stop()

the test runs created 16 data storages with identical sizes of about 2MB each.
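the `stopwatch` and `_inter` helpers above are from my own toolbox and not shown in this post; a minimal stand-in for the stopwatch, built on the standard library, would look roughly like this (an assumed reconstruction, not the code actually used in the test):

```python
# a minimal stand-in for the stopwatch helper used in the test code;
# the real helper is not shown in the post, so this is an assumed
# reconstruction on top of the standard library.
import time

class Stopwatch:
    def __init__(self):
        self.label = ''
        self._start = None
        self.elapsed = None

    def start(self, label=''):
        # remember the label and the current high-resolution time
        self.label = label
        self._start = time.perf_counter()

    def stop(self):
        # record wall-clock seconds since start()
        self.elapsed = time.perf_counter() - self._start
        return self.elapsed
```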
results::
test run 2, 100'000 rows
TOTAL : 749.9780
a0 b0 x0 t0: 10.1340 ****
a1 b1 x1 t1: 10.3540 *****
a0 b0 x0 t1: 11.0860 *****
a1 b0 x0 t0: 11.7670 *****
a1 b1 x0 t1: 14.6310 ******
a0 b0 x1 t1: 22.5230 **********
a0 b0 x1 t0: 27.2790 ************
a1 b1 x1 t0: 28.9120 *************
a1 b0 x1 t0: 41.5600 ******************
a0 b1 x1 t1: 60.7580 ***************************
a1 b0 x0 t1: 63.1210 ****************************
a0 b1 x1 t0: 65.4740 *****************************
a1 b0 x1 t1: 67.9580 ******************************
a1 b1 x0 t0: 78.3630 **********************************
a0 b1 x0 t1: 109.0270 ************************************************
a0 b1 x0 t0: 114.1740 **************************************************
test run 2, 50'000 rows
TOTAL : 187.3690
a1 b1 x1 t1: 3.7960 ********
a0 b0 x0 t1: 4.0160 ********
a1 b1 x0 t1: 4.4560 *********
a0 b0 x0 t0: 4.9470 **********
a1 b0 x0 t0: 4.9770 **********
a0 b0 x1 t1: 5.3680 ***********
a0 b0 x1 t0: 6.6990 **************
a1 b1 x1 t0: 9.2140 *******************
a1 b0 x1 t0: 11.0960 ***********************
a1 b0 x0 t1: 13.3400 ****************************
a1 b1 x0 t0: 14.8110 *******************************
a1 b0 x1 t1: 16.5630 ***********************************
a0 b1 x1 t1: 16.6240 ***********************************
a0 b1 x1 t0: 17.5650 *************************************
a0 b1 x0 t1: 23.6340 *************************************************
a0 b1 x0 t0: 23.9150 **************************************************
a0 -- use slice assignment (see code)
a1 -- use append with loop (see code)
b0 -- do not use blocked view
b1 -- use blocked view
x0 -- use normal commit mode
x1 -- use extend commit mode
t1 -- use tuples (see code)
t0 -- use dictionaries (see code)
============================================================
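the 16 storages correspond to all combinations of the four two-valued factors; the labels in the tables above can be enumerated like this (a plain-python sketch for illustration, not part of the test code):

```python
# the 16 test configurations are the cartesian product of four
# two-valued factors; this just enumerates their labels
# (illustration only, not the code used in the actual test runs).
from itertools import product

labels = ['%s %s %s %s' % (a, b, x, t)
          for a, b, x, t in product(('a0', 'a1'), ('b0', 'b1'),
                                    ('x0', 'x1'), ('t0', 't1'))]
```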
there are huge differences in the timings, but i find myself unable to distill any kind of clear policy for using metakit from them -- the 0s and 1s of all four options seem to be scattered all over the plot. i would have expected that slice assignment from a list of tuples on a blocked view, in a storage opened with extend commit mode, would be fastest; the top runners in both test runs only partially corroborate that expectation. furthermore, the results do not seem to allow the interpretation that these factors act together in a synergetic way. even if we say that factor b (blocked views) does not kick in here because even 100'000 rows are not enough, the other factors still do not appear to act together.
the only three interpretations i have to offer right now are:

1) the testing code contains some grave blunder that mars the results;

2) what is missing here are many test runs in randomly shuffled order -- perhaps the order in which the storages were produced is important (i can not see how, but i'll try);

3) the results are correct and metakit's behavior *is* not very predictable.

perhaps someone would be eager to falsify at least the last hypothesis.
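for hypothesis 2, repeating the whole suite several times with the configuration order shuffled each round could be sketched like this; `run_configuration` is a hypothetical stand-in for one timed test run and not part of my actual test code:

```python
# sketch for hypothesis 2: run the suite several times, shuffling the
# configuration order each round, so that ordering effects average
# out. `run_configuration` is a hypothetical stand-in for one timed
# test run; it is not part of the original test code.
import random
from itertools import product

def shuffled_rounds(run_configuration, rounds=5, seed=None):
    rng = random.Random(seed)
    configs = list(product((0, 1), repeat=4))  # (a, b, x, t) flags
    timings = {c: [] for c in configs}
    for _ in range(rounds):
        order = configs[:]
        rng.shuffle(order)            # fresh order every round
        for config in order:
            timings[config].append(run_configuration(config))
    return timings
```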
_wolf
_____________________________________________
Metakit mailing list - [email protected]
http://www.equi4.com/mailman/listinfo/metakit
