This needed more time: more work on realistic tests, and a fix that addresses all issues. Sorry for the delay.
Summary:
- Bug: locks can result in false error reports or missing error
reports when the locks are not interpreted in the same way by all
vim functions and commands.
When this is fixed, locks
- can reliably be used to find mistakes
- do not terminate production runs (or test runs) because of
unexpected false positives.
- Bug: on an attempt to change a locked item, extend() exits with
an error, but all changes made up to that point are kept
(exit-on-error).
- Bug: when the target dict is locked, extend() reports false errors
for 'extend(d, {})' and 'extend({"a": 1}, {"a": 2}, "keep")'.
- New, less artificial tests show that a lock check loop adds less
overhead than suggested by previous tests, esp. for common use
cases.
For the more common use cases, a separate lock test loop adds an
overhead of
avg.: 17% stddev: ± 16, max. 40%.
Across the whole range of use cases, the tests show an overhead of
avg.: 34% stddev: ± 27, max. 100%.
These tests are still somewhat artificial. The overhead shrinks
beyond these values when larger item values (e.g. real-life
strings) are used.
- Fix: all above bugs are fixed for extend() by using a lock check
loop that is only used when the target dict has locks set.
This implements all-or-nothing behavior at near-zero cost when no
locks are used on the target dict.
When locks are used on the target dict, the slowdown is as shown
in above tests.
- Open issue: are lock check loops needed in other situations (e.g.
for lists), and should a similar lock count optimization be used?
On 02-May-15, Bram Moolenaar wrote:
>
> Olaf Dabrunz wrote:
>
> > > > Example:
> > > >
> > > > :let d = {'a': 99, 'b': 100}
> > > > :lockvar 1 d
> > > > :call extend(d, {'a': 123})
> > > > E741: Value is locked: extend() argument
> > > > :let d.a = 123
> > > > :let d
> > > > d {'a': 123, 'b': 100}
> > > >
> > > > The dictionary is locked, but the 'a' item is not, so extend() on 'a'
> > > > should succeed in the same way as the 'let' does. According to ':he
> > > > lockvar', a dictionary lock protects against item deletion and addition,
> > > > but not against changing an existing item.
> > >
> > > Hmm, one could argue that extend() adds the new key/value pair, thus
> > > that it should fail.
> >
> > For lists, I agree -- extend(list, list2[, where]) does not touch
> > existing items.
> >
> > For dicts, extend() modifies existing items and adds new items when
> > their keys do not exist yet.
> >
> > > Zyx's argument that it may change one existing key/value but then fail
> > > on a following one that adds a key makes sense.
> >
> > This was also discussed by you and ZyX for patch 7.4.419 ("when part of
> > a list is locked it's possible to make changes"), which is a related
> > situation. ZyX proposed several approaches, and the result was "[list]
> > items will be removed [or changed] only if all items are not locked".
> >
> > > Perhaps we should just
> > > state "extend() on a locked list or dict does not work".
The implementation advantage is that the dict lock only needs to be
checked once. This achieves all-or-nothing behavior, and a lock check
loop can be avoided. This is what the current implementation does; it
only needs to be documented.
But see below.
> >
> > If we let extend() fail here, then ':let' (and map()) should fail as
> > well. The lock should not have a different effect when the change is
> > attempted by different means.
>
> Well, it depends on how you look at it. Although extend() can be used
> to change the value of an existing item, it is not what it is normally
> used for. Using an assignment to an item with a specific key is a more
> obvious way to change the value. Of course, if the key does not exist
> the assignment would fail if the dict is locked.
>
> Also: It's really easy to document that extend() does not work on a
> locked dictionary. That is also an indication that this is easy to
> understand. Trying to explain that it might work in specific situations
> is more complicated, indicating it's also harder to understand for
> users.
Hmmm, yes, from the viewpoint of "how do I use extend()?" it is simple:
extend() does not change a locked dict. This is easy to understand, and
should be easy to use.
But on the other hand, from the viewpoint of "how can I protect data
with locks?", extend() is a special case. (Which also needs special
documentation in ':he extend()' and ':he lockvar'.)
The main problem is that we generally cannot predict whether extend() or
:let or some other function or command will be used on a dict.
So we cannot know if the dict will be protected according to extend()'s
interpretation of locks or according to the interpretation by :let or
others (or by both, at different points).
Example:
:let d = {'x': 1, 'func': function('Userfunc')}
:lockvar 1 d
:lockvar d.func
:
:call SomeFunc(d)
Is d.x locked or not?
To find out if extend() or :let or something else is used on the
dict, we need to analyze all possible code paths, including
functions called in other scripts, triggered autocommands and used
expressions ('statusline', 'diffexpr', ...). The latter can also
have been set up by any script.
We cannot predict all scripts and script versions that a user has
installed, or future code changes. And even the analysis of the
known code can be tedious (without tools or help from VimL).
So we cannot know if d.x will be treated as locked or not.
But locks should help with code analysis, rather than require one.
So to make locks helpful, the interpretation of locks should be the same
everywhere.
False positives by inconsistent locks can break production runs:
----------------------------------------------------------------
In a production situation, a new version of a foreign script may switch
from using :let to use extend(). If this touches a dict that was locked
by our script, this may generate false (= unintended) errors which could
terminate an expression ('diffexpr' or 'statusline', etc.), an 'abort'
function or a :try block.
So inconsistent locks can break production runs.
Example use case for a consistent dict lock that permits item changes:
----------------------------------------------------------------------
I had a 'statusline' expression that used a dict as a cache for
statusline items.
Sometimes bogus new items were added to the cache. A dict lock would
throw an error on item addition, to detect the source of the bogus new
items.
But the dict lock also threw errors for item changes, when done by
extend(). And unfortunately, it was not easily possible to selectively
allow item updates while still catching the bogus item additions.
(To do that, I would have needed to know where the bogus updates came
from.)
As the statusline expression terminates on the first error, item updates
terminated it before it got to the point where the bogus item
additions appeared.
Note that termination on first error can also happen in 'abort'
functions, in a :try block and in some other expressions.
I ended up debugging this with many ':echo's.
Lessons for locks:
- Inconsistent locks may generate false positives, which may prevent
reaching the point where the bug can be reproduced.
- A more specific dict lock that does not lock items against changes
is useful in finding specific mistakes.
Other problems:
---------------
- When an item change is prevented by a dict item lock, then
extend() returns with an error, keeping previously made changes.
:let d = {'a': 1, 'b': 2}
:lockvar d.b
:call extend(d, {'a': 99, 'b': 101})
E741: Value is locked: extend() argument
:let d
d {'a': 99, 'b': 2}
For all-or-nothing behavior, we need a separate lock check
loop.
When this separate lock check loop is used, the main advantage of
interpreting a dict lock (!) also as a change lock for all items
is lost: a separate lock check loop is needed anyway.
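The check-then-apply pattern behind this can be modeled outside of C. Below is a minimal Python sketch (hypothetical names, not the actual Vim code; ValueError stands in for E741). It follows the ':he lockvar' interpretation: the dict lock forbids adding keys, an item lock forbids changing that item's value.

```python
def extend_all_or_nothing(d1, d2, locked_keys, dict_locked=False):
    """Model of all-or-nothing extend() on dicts: check every lock
    first, then apply all changes, so a lock error leaves d1
    completely untouched."""
    # Pass 1: lock check loop -- no mutation yet.
    for key in d2:
        if key not in d1:
            if dict_locked:          # dict lock forbids adding keys
                raise ValueError("E741: Value is locked")
        elif key in locked_keys:     # item lock forbids changing the value
            raise ValueError("E741: Value is locked")
    # Pass 2: change loop -- can no longer fail on locks.
    for key, val in d2.items():
        d1[key] = val

d = {'a': 1, 'b': 2}
try:
    extend_all_or_nothing(d, {'a': 99, 'b': 101}, locked_keys={'b'})
except ValueError:
    pass
# d is unchanged here -- unlike in the exit-on-error example above,
# where 'a' was already overwritten when the error was reported.
```

The second pass repeats the lookups of the first pass; this duplication is exactly the overhead measured in the tests below.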
- extend(d1, d2, action) reports lock errors even when no changes
would have been made.
This happens when d2 is empty, and when d2 would only change
existing items in d1 and 'action' is "keep".
:let d = {'a': 1, 'b': 2}
:lockvar 1 d
:call extend(d, {})
E741: Value is locked: extend() argument
:call extend(d, {'a': 99, 'b': 101}, 'keep')
E741: Value is locked: extend() argument
When "keep" is used, d2 items that would change existing items in
d1 should silently be ignored.
The lock detects a false positive.
Fix: add a loop over d2 items to find an addition to d1, then
report a lock error. If d2 has NO additions to d1, then return
without error. In both cases, d1 remains unchanged.
For this fix, the checks *could* also be integrated into the
normal change loop: no changes need to be undone, because "keep"
together with a dict lock prevents all changes anyway (the dict
lock does not protect against item changes, but "keep" already
does).
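The fix for these false positives can be modeled the same way: report the dict lock only when d2 would actually add a key. A hypothetical Python sketch (not the Vim code; ValueError stands in for E741):

```python
def extend_keep_locked(d1, d2, dict_locked=True):
    """Model of extend(d1, d2, 'keep') on a dict-locked d1: report a
    lock error only when d2 would add a new key; otherwise succeed
    without changing anything."""
    # Check loop: look for an addition to d1.
    for key in d2:
        if key not in d1 and dict_locked:
            raise ValueError("E741: Value is locked")
    # Change loop: 'keep' never overwrites existing keys, so with a
    # locked d1 and no additions this loop changes nothing.
    for key, val in d2.items():
        d1.setdefault(key, val)

d = {'a': 1, 'b': 2}
extend_keep_locked(d, {})                    # no error: nothing to add
extend_keep_locked(d, {'a': 99, 'b': 101})   # no error: only existing keys
```

In both calls d stays {'a': 1, 'b': 2} and no error is raised, matching the "d1 remains unchanged" requirement above.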
Here are some use cases that show that extend() is useful when mixing
item addition and change:
---------------------------------------------------------------------
These can be found in scripts on http://www.vim.org/scripts and in my
own scripts.
- Config dict: override / augment defaults with user settings.
- Option dict for a function: override / augment defaults with
caller-supplied options.
- l:self: initialize or update a dict function's data.
- Associative cache: add items, or update already cached items.
- Transactional update (e.g. at the end of a function): compute
several values, then update / augment the target dict in a single
transaction: extend(target, {'keyA': valA, 'keyB': valB, ...}).
Reasons to use extend() here:
- extend() combines add-or-change operation with all-or-nothing.
No additional save and rollback is needed to implement
transactional item addition / update with extend().
(Unless several dictionaries need to be updated within a single
transaction. But I see this happen less often, and even in this
case extend()s can be used to simplify the code.)
- extend() makes code more readable and simplifies it:
- It summarizes updates in one place: a temp dict can be created
from plain local variables (which have been set up during the
execution of the function) just when needed in the extend().
- Items can be initialized and updated by the same code.
- extend() is more efficient than using :let in a loop. E.g. when
the source dict is prepared in some other software layer and its
contents need to be integrated into a target dict.
> > This would be a different lock model.
> >
> >
> > The all-or-nothing semantics you and ZyX mention can be implemented for
> > extend(dict, dict2[, how]) as well. But note:
> >
> > - extend(dict, dict2[, how]) currently already has several
> > exit-on-error conditions:
> >
> > - adding a function to scopes l: or g: that conflicts with a
> > builtin function
> >
> > - adding or replacing an item in a variable scope, when the key
> > is not a valid variable name
> >
> > - changing an existing dict item while 'how' is 'error'
> >
> > - changing a locked dict item (since 7.4.698).
> >
> > For all-or-nothing semantics, all of these need to be moved into a
> > separate checking loop.
> >
> > (This is the reason why I considered all-or-nothing semantics
> > beyond the scope of my patch: the current implementation has
> > exit-on-error semantics.)
> >
> >
> > - A separate checking loop needs to repeat all dictionary lookups.
> > For my tests, this made extend(dict, ...) take twice as much time
> > to complete.
>
> Yes, this would be the reason to only use this for error conditions that
> do not require looping over all the items (in the dict and in the
> arguments).
>
> > :let d = {}
> > :for i in range(10000) | let d[i] = 't'.i | endfor
> > :for t in range(3)
> > : let start = reltime()
> > : for i in range(10000) | call extend(d, d) | endfor
> > : echomsg reltimestr(reltime(start))
> > :endfor
> >
> > Unpatched:
> >
> > 3.255444
> > 3.243517
> > 3.248523
> >
> > Old patch (exit-on-error semantics):
> >
> > 3.249932
> > 3.256253
> > 3.255438
> >
> > New patch with separate checking loop (see below):
> >
> > 6.452288
> > 6.390474
> > 6.420924
> >
> > I am not sure if taking twice as long is a good tradeoff for
> > having all-or-nothing semantics.
>
> For things where the script writer made an obvious mistake, such as
> trying to overwrite a builtin function, the slowness is not good.
> For the cases where the dict would be halfway a change, it was already
> modified when an unexpected error condition is encountered, it would be
> worth it.
Yes. So for all-or-nothing, only locks need to be checked before
entering the change loop that does the remaining checks.
A separate checking loop is still needed though, for all-or-nothing
behavior when checking locks.
The limitation to only do some checks saves only a small fraction of
the overall time, as most of the time is spent on loading an item
into the CPU caches. After that, additional checks are fast, they
do not need to wait for cache loads.
Additionally, when dict items are mostly contiguous in memory or can
be accessed in a striding pattern (because they were added to the
dict without interjecting memory allocations of random size, or
memory deallocations), then automatic cache prefetching should pick
up the next item while we do more checks on the current item. This
means that our bottleneck -- cache fill -- runs at top speed while
we make more checks.
But to find out how helpful the speedup by prefetching is in
practice, we would need to find out how prevalent striding patterns
in dicts are.
However, even if the checks are not completely "hidden" by
concurrent cache preloads, the cache load latency should already
dwarf the time spent in checks (as mentioned above).
Anyhow, none of us saw a necessity to do the more esoteric checks in
all-or-nothing mode in the separate check loop. This simplifies the
code when the separate check loop is only run under some conditions (as
in my patch below), because checks do not have to be done in both the
separate check loop and the change loop.
> It's a tough decision. In those cases I prefer to leave it as it is.
I hope that the new patch below and the new test results will help.
> > - In case of an out of memory error, extend() skips that item and
> > continues with the next one (continue-after-error semantics).
> >
> > For OOM, all-or-nothing semantics can only be achieved by
> > implementing a rollback, because OOM can only happen when we
> > actually execute the changes.
> >
> > A rollback requires copies of the changed items. Making them is
> > relatively expensive.
> >
> > Note that list functions do not implement this: 'extend(list,
> > list2, how)' and ':let list[i:j] = list2' and probably others do
> > have exit-on-OOM semantics.
> >
> > One reason for not implementing all-or-nothing for OOM may be that
> > OOM rarely happens: as discussed in another thread, users tend to
> > terminate other processes before OOM (to avoid a slow system that
> > is swapping or thrashing), or an OOM-killer may kill some other
> > process to avoid OOM.
> >
> > It may not be worth the slowdown just to have a rollback on OOM.
> >
> > An updated patch that implements all-or-nothing semantics in
> > extend(dict, dict2[, how]) (except for OOM) is at the end of this mail.
> >
> >
> > On another note, while looking at the 7.4.419 patch again, I realized
> > that it adds checks to ':unlet' for item locks before it deletes any
> > items. This is not needed under the locking model described in ':he
> > lockvar': item locks protect against item change, not against item
> > deletion.
> >
> > The list lock, which protects against deletion, is checked before
> > already.
> >
> > Here is a patch that removes the unneeded item lock checks. Note that
> > no changes to the tests are needed, as the list lock already makes sure
> > the items cannot be deleted.
>
> So this makes extend() much slower. Not sure what part of this I would
> include.
New test results suggest that extend() is not slowed down so much in
actual use.
For overlaps of 0% and 10% between source and target dict keys, the
measured lock check loop overhead averages around 17%:
avg.: 17% stddev: ± 16, max. 40%.
Many use cases seem to have dict overlaps in this range.
For overlaps from 0% to 100%, the tests show an overhead of
avg.: 34% stddev: ± 27, max. 100%.
The worst case for realistic, but uncommon use cases is 83% overhead
(for 1 million items in both the source dict and the target dict, and
100% overlap between them).
The above results include measurements for dictionaries with numbers as
values. With longer, real-life strings as values, the overhead tends to
shrink considerably (see also below).
Last minute addition: some applications use long strings as keys. I
have no test results, but the overhead of a lock check loop should
grow in these cases: the dict lookup in both loops (lock check and
change loop) needs to compare the keys. Long keys make this
comparison a more dominant factor of the overall time and the time
spent on item addition and value copying relatively smaller, so
both loops approach the same running time and the overhead gets
closer to 100%.
But this also seems to be used more for temporary data (counting
string occurrences, or an intermediate step when making a list of
unique strings), where locks are less likely to be used.
Here are the test results for the lock check loop overhead for different
dict sizes and different overlaps. These tests use the new patch (see
below):
Using short strings (eval('"t".i')) as dict values:
overlap between dicts: 100% 90% 50% 10% 0%
----------------------------------------------------------------
10000000 items 76.0% 65.8% 45.4% 39.9% crash
1000000 items 59.5% 58.4% 28.9% 24.7% 18.1%
100000 items 66.6% 57.2% 43.6% 15.3% 9.5%
10000 items 52.3% 26.6% 31.9% 23.9% 13.6%
1000 items 38.7% 34.2% 29.6% 16.3% 16.3%
100 items 6.4% 28.4% 31.5% -2.7% -16.9%
10 items 10.0% 8.0% 8.1% 13.0% 21.2%
Using numbers as dict values:
overlap between dicts: 100% 90% 50% 10% 0%
----------------------------------------------------------------
10000000 items 100.3% 84.5% 53.7% 39.4% crash
1000000 items 82.6% 77.4% 28.5% 26.4% 23.6%
100000 items 80.7% 79.2% 64.0% 16.6% 17.2%
10000 items 69.5% 71.3% 58.0% 38.2% 36.2%
1000 items 76.3% 61.1% 42.3% 34.7% 18.6%
100 items 41.8% 32.4% 30.5% -12.8% -14.7%
10 items 0.5% 5.1% 15.0% -5.2% -0.2%
Note: the source dict and the target dict have the same preloaded
size in each test.
Disclaimer: test results varied to some extent, although I tried to
minimize disturbing influences and removed outliers. See also the
attached test scripts. Also, for various reasons, I could not run
the tests on a dedicated test system.
There is also a crash for 10 million items and 0% overlap. So far,
I did not put sufficient time into debugging that crash. But it
happens in both the patched vim and the unpatched vim.
Lock test loop overhead becomes smaller when the overlap between source
dict and target dict becomes smaller.
With more items to be added to the target dict, more new items need
to be allocated and filled (instead of just copying the item
values). The time spent in the creation of new items becomes more
dominant, so that a smaller share of the total time is spent in the
lock test loop.
Lock test loop overhead also shrinks when the values of dict items use
more memory, e.g. for longer strings.
Similarly, copying larger values makes the time spent for that more
dominant, so that a smaller portion of the total time is spent in
the lock test loop.
The worst lock check loop overheads appear when the values are plain
numbers.
It also seems that lock test loop overhead is smaller when the looked up
items "just fit" into a CPU cache layer, while the change loop is "too
large" for that CPU cache layer, because it accesses more memory to add
items and to copy values between the dictionaries.
In these cases, the lock check loop can benefit from CPU cache
prefetches because it does not kick out prefetched cache lines
before it needs them, while in the change loop the additional memory
accesses contribute to kicking out more prefetched cache lines
before they can be used.
This seems to happen for different dict sizes, whenever the data
needed in the lock check loop "just fits" into one CPU cache at one
level in the cache hierarchy, while item creation and value copying
in the change loop produces more cache collisions in that cache.
Note that my computer took about 60 seconds to create a dictionary
with 10 million items in an item-by-item add loop, and about 6 seconds
for 1 million items. This seems to be the limit for usable dict sizes.
Please see the full test results in the attached file.
Also, the test scripts are in another attachment. The scripts also
create a test log with more detailed results.
New patch:
----------
This patch adds a separate lock check loop to extend(). The loop is
only executed when needed.
When no locks are used on the target dict, it has near zero cost.
Otherwise, the slowdown is shown in the above test results.
The lock check loop checks only for lock violations and, when "error"
mode is used, also for item changes, so that the expected error is
reported. Other error conditions are checked only later, in the change
loop.
To quickly find out when the lock check loop can be skipped, a counter
for locked items is kept in each dictionary. When the counter is
non-zero or the dict lock is set, then the lock check loop is executed.
For variable scope dicts (v:, l:, ...), the counter is initialized
to a base count of 1, so they are always checked. These dicts can
contain items with DI_FLAGS_RO/RO_SBX (which are not individually
counted).
User locks (VAR_LOCKED) can only be added or removed by
:(un)lockvar. Counting is done in these commands.
All functions that add items to a dict already make sure that the
locks on the added items are cleared. (All functions that
ultimately call hash_add() or hash_add_item() on a dict's hash.)
So there is no need to count locks there.
The counter also works when the dict is referenced from another
variable, as the same dict header (dict_T) is referenced.
Note that VAR_FIXED protects only against item deletion. This is
not relevant for extend(). And deletion with 'remove(dict, key)'
and ':unlet dict.key ...' only removes one item from a dict at a
time (:unlet essentially does exit-on-error when more than one
'dict.key' is given to it), and 'filter(dict, "v:key != key")' does
exit-on-error.
So no need to count VAR_FIXED.
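The counter logic described above can be sketched as a small model (Python, hypothetical names; in the patch the counter is the dv_ilocked field of dict_T, updated by :lockvar and :unlockvar):

```python
class LockCountedDict:
    """Toy model of a Vim dict with a dict lock and a locked-item
    counter that gates the separate lock check loop in extend()."""
    def __init__(self, scope=False):
        self.items = {}
        self.lock = False        # models VAR_LOCKED on the dict itself
        self.locked = set()      # keys that carry an item lock
        # Scope dicts (v:, l:, ...) may hold read-only items that are
        # not individually counted, so they start at a base count of 1
        # and are always checked.
        self.ilocked = 1 if scope else 0

    def lockvar(self, key):      # models :lockvar d.key
        if key not in self.locked:
            self.locked.add(key)
            self.ilocked += 1

    def unlockvar(self, key):    # models :unlockvar d.key
        if key in self.locked:
            self.locked.remove(key)
            self.ilocked -= 1

    def needs_lock_check(self):
        # extend() runs the separate lock check loop only when this
        # returns True; otherwise the check loop is skipped entirely.
        return self.ilocked > 0 or self.lock

d = LockCountedDict()
assert not d.needs_lock_check()  # no locks: near-zero cost path
d.lockvar('b')
assert d.needs_lock_check()      # item lock present: run check loop
```

Since items added through extend() always have their locks cleared, only :lockvar/:unlockvar need to touch the counter, which is what makes the gating cheap.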
---
src/eval.c | 43 ++++++++++++++++++++++++++++++++++++++-----
src/structs.h | 1 +
src/testdir/test55.in | 24 ++++++++++++++++++++++++
src/testdir/test55.ok | 7 +++++++
4 files changed, 70 insertions(+), 5 deletions(-)
Index: b/src/eval.c
===================================================================
--- a/src/eval.c 2015-06-21 15:43:04.278647940 +0200
+++ b/src/eval.c 2015-06-22 23:57:55.304056700 +0200
@@ -3810,8 +3810,11 @@ do_lock_var(lp, name_end, deep, lock)
/* (un)lock a List item. */
item_lock(&lp->ll_li->li_tv, deep, lock);
else
+ {
/* un(lock) a Dictionary item. */
item_lock(&lp->ll_di->di_tv, deep, lock);
+ lp->ll_dict->dv_ilocked += lock ? 1 : -1;
+ }
return ret;
}
@@ -3879,6 +3882,7 @@ item_lock(tv, deep, lock)
{
--todo;
item_lock(&HI2DI(hi)->di_tv, deep - 1, lock);
+ d->dv_ilocked += lock ? 1 : -1;
}
}
}
@@ -7195,6 +7199,7 @@ dict_alloc()
hash_init(&d->dv_hashtab);
d->dv_lock = 0;
+ d->dv_ilocked = 0;
d->dv_scope = 0;
d->dv_refcount = 0;
d->dv_copyID = 0;
@@ -10509,6 +10514,37 @@ dict_extend(d1, d2, action)
int todo;
char_u *arg_errmsg = (char_u *)N_("extend() argument");
+ /* When d1 has locks, first check for lock breaks and overwrite errors. */
+ if (d1->dv_ilocked || d1->dv_lock & (VAR_LOCKED | VAR_FIXED))
+ {
+ todo = (int)d2->dv_hashtab.ht_used;
+ for (hi2 = d2->dv_hashtab.ht_array; todo > 0; ++hi2)
+ {
+ if (!HASHITEM_EMPTY(hi2))
+ {
+ --todo;
+ di1 = dict_find(d1, hi2->hi_key, -1);
+ if (di1 == NULL)
+ {
+ if (tv_check_lock(d1->dv_lock, arg_errmsg, TRUE))
+ return;
+ }
+ else if (*action == 'e')
+ {
+ EMSG2(_("E737: Key already exists: %s"), hi2->hi_key);
+ return;
+ }
+ else if (*action == 'f' && HI2DI(hi2) != di1)
+ {
+ if (tv_check_lock(di1->di_tv.v_lock, arg_errmsg, TRUE)
+ || var_check_ro(di1->di_flags, arg_errmsg, TRUE))
+ return;
+ }
+ }
+ }
+ }
+
+ /* Check for errors and make changes. No need to check locks again. */
todo = (int)d2->dv_hashtab.ht_used;
for (hi2 = d2->dv_hashtab.ht_array; todo > 0; ++hi2)
{
@@ -10542,9 +10578,6 @@ dict_extend(d1, d2, action)
}
else if (*action == 'f' && HI2DI(hi2) != di1)
{
- if (tv_check_lock(di1->di_tv.v_lock, arg_errmsg, TRUE)
- || var_check_ro(di1->di_flags, arg_errmsg, TRUE))
- break;
clear_tv(&di1->di_tv);
copy_tv(&HI2DI(hi2)->di_tv, &di1->di_tv);
}
@@ -10608,8 +10641,7 @@ f_extend(argvars, rettv)
d1 = argvars[0].vval.v_dict;
d2 = argvars[1].vval.v_dict;
- if (d1 != NULL && !tv_check_lock(d1->dv_lock, arg_errmsg, TRUE)
- && d2 != NULL)
+ if (d1 != NULL && d2 != NULL)
{
/* Check the third argument. */
if (argvars[2].v_type != VAR_UNKNOWN)
@@ -21330,6 +21362,7 @@ init_var_dict(dict, dict_var, scope)
{
hash_init(&dict->dv_hashtab);
dict->dv_lock = 0;
+ dict->dv_ilocked = 1; /* must expect DI_FLAGS_RO* on items */
dict->dv_scope = scope;
dict->dv_refcount = DO_NOT_FREE_CNT;
dict->dv_copyID = 0;
Index: b/src/structs.h
===================================================================
--- a/src/structs.h 2015-06-21 15:42:20.922696781 +0200
+++ b/src/structs.h 2015-06-22 23:43:25.029495192 +0200
@@ -1219,6 +1219,7 @@ struct dictvar_S
int dv_refcount; /* reference count */
int dv_copyID; /* ID used by deepcopy() */
hashtab_T dv_hashtab; /* hashtab that refers to the items */
+ long_u dv_ilocked; /* locked items + 1 if DI_FLAGS_RO* possible */
dict_T *dv_copydict; /* copied dict used by deepcopy() */
dict_T *dv_used_next; /* next dict in used dicts list */
dict_T *dv_used_prev; /* previous dict in used dicts list */
Index: b/src/testdir/test55.in
===================================================================
--- a/src/testdir/test55.in 2015-06-21 15:42:20.922696781 +0200
+++ b/src/testdir/test55.in 2015-06-21 20:26:18.759475460 +0200
@@ -408,6 +408,30 @@ let l = [0, 1, 2, 3]
:endtry
:$put =string(d)
:"
+:$put ='extend() after lock on dict:'
+:unlet! d
+:let d = {'a': 99, 'b': 100, 'd': 101}
+:lockvar 1 d
+:$put =string(extend(d, {'a': 123}))
+:try
+: $put =string(extend(d, {'x': 'new'}))
+: $put ='wrong: did extend() with new item'
+:catch
+: $put =v:exception[:14]
+:endtry
+:$put =string(d)
+:unlet! d2
+:let d2 = {'a': 789, 'b': 42, 'c': 'new', 'd': 52}
+:$put ='key order is ok to recognize error modes: '.(keys(d2)[0] != 'c').(keys(d2)[-1] != 'c')
+:try
+: $put =string(extend(d, d2))
+: $put ='wrong: did extend() with mixed-in new item'
+:catch
+: $put =v:exception[:14]
+:endtry
+:$put =string(d)
+:unlet d d2
+:"
:$put ='No remove() of write-protected scope-level variable:'
:fun! Tfunc(this_is_a_loooooooooong_parameter_name)
: try
Index: b/src/testdir/test55.ok
===================================================================
--- a/src/testdir/test55.ok 2015-06-21 15:42:20.922696781 +0200
+++ b/src/testdir/test55.ok 2015-06-21 20:26:18.763475458 +0200
@@ -138,6 +138,13 @@ did map()
No extend() after lock on dict item:
Vim(put):E741:
{'a': 99, 'b': 100}
+extend() after lock on dict:
+{'a': 123, 'b': 100, 'd': 101}
+Vim(put):E741:
+{'a': 123, 'b': 100, 'd': 101}
+key order is ok to recognize error modes: 11
+Vim(put):E741:
+{'a': 123, 'b': 100, 'd': 101}
No remove() of write-protected scope-level variable:
Vim(put):E795:
No extend() of write-protected scope-level variable:
--
Olaf Dabrunz (oda <at> fctrace.org)
Measurement of the overhead of the lock checking loop used in 'rollback'
mode (all-or-nothing) over 'stop' mode (exit-on-error) in vim's extend()
function.
------------------------------------------------------------------------
Test notes:
-----------
- Tests for smaller dictionaries are done more often, keeping the
product (test iterations * dict size) constant: 1000 tests * 10000
items, 10000 test * 1000 items, ...
This keeps the running time of a test loop large enough to be
accurately measured.
- There is some variance between test results on my test system,
probably caused by L2/L3 cache utilization by other processes (on
other cores / HTs) and by CPU frequency throttling due to heat (as
done by the systems management BIOS).
To minimize the influence of disturbances, each test loop was
repeated 11 times and the results were averaged.
Before averaging, outliers were removed: the measurements beyond a
factor of 1.1 below the end of the first quarter of test results
and above the start of the last quarter of test results.
With 11 tests, two measurements lie in the first quarter and two
in the last quarter, and these can possibly be recognized as
outliers. In practice, this seems sufficient to detect and remove
outliers.
- To derive the number of cache events caused by extend(), cache
overhead for everything but the extend() is measured separately
and subtracted from cache event counts measured when the full test
is run (including extend()).
- Tests for dict sizes of 1000 items or above measure the time spent
in extend() directly.
This prevents influences from variations in overhead (most notably
in copy()) from distorting the measurements for extend().
When dict sizes become smaller, the tests appear to become less
susceptible to variations, possibly because the whole test becomes
more likely to run completely from L2 or L1 cache, reducing or
eliminating memory bus and shared cache contention from other
processes.
Also, for small dict sizes, the running time spent in a single
extend() became too small to measure it directly with high enough
accuracy.
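The averaging with outlier removal described in the notes above can be sketched as follows (Python; the exact quartile boundary indices are my reading of the description, not taken from the attached scripts):

```python
def robust_mean(samples, factor=1.1):
    """Average measurements after dropping outliers: values more than
    `factor` below the end of the first quarter of the sorted results,
    or more than `factor` above the start of the last quarter."""
    s = sorted(samples)
    n = len(s)
    q1_end = s[n // 4]            # end of first quarter (assumed index)
    q4_start = s[n - n // 4 - 1]  # start of last quarter (assumed index)
    kept = [x for x in s if q1_end / factor <= x <= q4_start * factor]
    return sum(kept) / len(kept)

# With 11 measurements, up to two values at each end can be dropped:
times = [3.2, 3.25, 3.21, 3.22, 3.26, 3.24, 3.23, 3.25, 3.24, 2.4, 4.9]
avg = robust_mean(times)  # 2.4 and 4.9 are discarded before averaging
```

With a normal spread the thresholds keep everything, so the result is a plain mean; only measurements distorted by throttling or contention are cut off.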
Testing the "check-when-needed" patch:
--------------------------------------
To test the overhead of the lock check loop, the vim test script creates
and locks a special item when needed to activate the lock test loop in
extend().
Using strings (eval('"t".i')) as dict values:
overlap between dicts: 100% 90% 50% 10% 0%
------------------------------------------------------------------
10000000 items 76.0% 65.8% 45.4% 39.9% crash
L1 dcache-loads -128.0% 33.2% -37.1% -0.6% crash
L1 dcache-stores 34.6% 27.4% 13.0% 9.9% crash
L1 dcache-l+s 242.2% 31.0% -23.7% 3.4% crash
1000000 items 59.5% 58.4% 28.9% 24.7% 18.1%
L1 dcache-loads 38.0% 33.6% 19.1% 13.2% 10.6%
L1 dcache-stores 33.4% 28.5% 14.9% 10.9% 8.6%
L1 dcache-l+s 36.4% 31.7% 17.5% 12.3% 9.8%
100000 items 66.6% 57.2% 43.6% 15.3% 9.5%
L1 dcache-loads 37.9% 32.1% 21.0% 12.9% 11.6%
L1 dcache-stores 36.5% 29.4% 19.9% 11.5% 10.5%
L1 dcache-l+s 37.4% 31.1% 20.6% 12.3% 11.2%
10000 items 52.3% 26.6% 31.9% 23.9% 13.6%
L1 dcache-loads 37.3% 30.4% 19.6% 12.5% 10.9%
L1 dcache-stores 34.9% 26.7% 16.7% 11.0% 9.7%
L1 dcache-l+s 36.4% 29.0% 18.4% 11.9% 10.4%
1000 items 38.7% 34.2% 29.6% 16.3% 16.3%
L1 dcache-loads 39.0% 32.7% 21.4% 15.4% 13.4%
L1 dcache-stores 36.4% 29.3% 18.8% 16.5% 14.4%
L1 dcache-l+s 38.1% 31.4% 20.4% 15.8% 13.8%
100 items 6.4% 28.4% 31.5% -2.7% -16.9%
L1 dcache-loads 36.8% 33.8% 20.4% 6.4% 2.1%
L1 dcache-stores 36.3% 31.7% 19.1% 4.3% -2.0%
L1 dcache-l+s 36.6% 33.0% 19.9% 5.6% 0.6%
10 items 10.0% 8.0% 8.1% 13.0% 21.2%
L1 dcache-loads 45.1% 43.0% 27.5% 20.6% 18.7%
L1 dcache-stores 47.9% 45.7% 27.5% 20.3% 20.3%
L1 dcache-l+s 46.1% 44.0% 27.5% 20.5% 19.3%
No priming with copy() (not needed as overlap is 100%):
-------------------------------------------------------
overlap between dicts:    100%
------------------------------
10000000 items           58.7%
  L1 dcache-loads        40.3%
  L1 dcache-stores       34.6%
  L1 dcache-l+s          38.2%
1000000 items            50.5%
  L1 dcache-loads        38.2%
  L1 dcache-stores       33.7%
  L1 dcache-l+s          36.6%
100000 items             67.2%
  L1 dcache-loads        37.0%
  L1 dcache-stores       33.4%
  L1 dcache-l+s          35.7%
10000 items              44.7%
  L1 dcache-loads        36.1%
  L1 dcache-stores       33.4%
  L1 dcache-l+s          35.1%
1000 items               36.0%
  L1 dcache-loads        36.8%
  L1 dcache-stores       33.3%
  L1 dcache-l+s          35.5%
100 items                17.9%
  L1 dcache-loads        34.1%
  L1 dcache-stores       32.6%
  L1 dcache-l+s          33.5%
10 items                 20.6%
  L1 dcache-loads        24.9%
  L1 dcache-stores       25.4%
  L1 dcache-l+s          25.1%
extend dict with itself:
------------------------------
10000000 items          117.5%
  L1 dcache-loads        98.0%
  L1 dcache-stores      100.0%
  L1 dcache-l+s          98.6%
1000000 items           107.9%
  L1 dcache-loads        98.9%
  L1 dcache-stores      101.3%
  L1 dcache-l+s          99.7%
100000 items             42.5%
  L1 dcache-loads        97.6%
  L1 dcache-stores       99.0%
  L1 dcache-l+s          98.0%
10000 items             128.2%
  L1 dcache-loads        98.5%
  L1 dcache-stores      100.5%
  L1 dcache-l+s          99.2%
1000 items               87.5%
  L1 dcache-loads        97.5%
  L1 dcache-stores       99.1%
  L1 dcache-l+s          98.0%
100 items                41.7%
  L1 dcache-loads        90.5%
  L1 dcache-stores       91.9%
  L1 dcache-l+s          91.0%
10 items                  1.1%
  L1 dcache-loads        53.8%
  L1 dcache-stores       55.0%
  L1 dcache-l+s          54.2%
Same tests with numbers as dict values:
overlap between dicts:    100%     90%     50%     10%      0%
--------------------------------------------------------------
10000000 items          100.3%   84.5%   53.7%   39.4%   crash
  L1 dcache-loads        92.5%   68.0%   32.7%   19.7%   crash
  L1 dcache-stores       89.6%   61.0%   24.7%   15.3%   crash
  L1 dcache-l+s          91.5%   65.5%   29.5%   17.9%   crash
1000000 items            82.6%   77.4%   28.5%   26.4%   23.6%
  L1 dcache-loads        81.2%   68.6%   29.3%   19.8%   16.6%
  L1 dcache-stores       75.4%   59.3%   21.8%   15.9%   13.4%
  L1 dcache-l+s          79.2%   65.2%   26.2%   18.2%   15.3%
100000 items             80.7%   79.2%   64.0%   16.6%   17.2%
  L1 dcache-loads        90.4%   68.0%   35.7%   19.1%   16.7%
  L1 dcache-stores       91.7%   67.1%   33.6%   16.5%   14.1%
  L1 dcache-l+s          90.8%   67.6%   34.9%   18.0%   15.6%
10000 items              69.5%   71.3%   58.0%   38.2%   36.2%
  L1 dcache-loads        85.7%   61.2%   33.0%   19.4%   17.1%
  L1 dcache-stores       83.7%   53.3%   28.5%   17.0%   14.9%
  L1 dcache-l+s          85.0%   58.2%   31.2%   18.5%   16.2%
1000 items               76.3%   61.1%   42.3%   34.7%   18.6%
  L1 dcache-loads        82.3%   65.7%   35.4%   22.7%   16.4%
  L1 dcache-stores       76.2%   56.5%   30.9%   22.2%   14.6%
  L1 dcache-l+s          80.1%   62.2%   33.7%   22.5%   15.7%
100 items                41.8%   32.4%   30.5%  -12.8%  -14.7%
  L1 dcache-loads        89.1%   69.9%   35.8%    6.1%    4.2%
  L1 dcache-stores       88.0%   66.4%   33.7%    2.7%    1.7%
  L1 dcache-l+s          88.7%   68.5%   35.0%    4.9%    3.3%
10 items                  0.5%    5.1%   15.0%   -5.2%   -0.2%
  L1 dcache-loads        79.5%   72.6%   42.9%   29.0%   23.5%
  L1 dcache-stores       85.9%   78.5%   43.6%   28.5%   25.5%
  L1 dcache-l+s          81.9%   74.7%   43.2%   28.8%   24.3%
No priming with copy() (not needed as overlap is 100%):
-------------------------------------------------------
overlap between dicts:    100%
------------------------------
10000000 items           89.0%
  L1 dcache-loads        83.1%
  L1 dcache-stores       76.2%
  L1 dcache-l+s          80.8%
1000000 items            87.4%
  L1 dcache-loads        84.7%
  L1 dcache-stores       79.5%
  L1 dcache-l+s          82.9%
100000 items             89.4%
  L1 dcache-loads        87.6%
  L1 dcache-stores       83.9%
  L1 dcache-l+s          86.3%
10000 items             110.3%
  L1 dcache-loads        86.7%
  L1 dcache-stores       83.3%
  L1 dcache-l+s          85.5%
1000 items               78.7%
  L1 dcache-loads        86.0%
  L1 dcache-stores       82.3%
  L1 dcache-l+s          84.7%
100 items                44.6%
  L1 dcache-loads        79.4%
  L1 dcache-stores       77.0%
  L1 dcache-l+s          78.5%
10 items                 19.2%
  L1 dcache-loads        46.4%
  L1 dcache-stores       47.8%
  L1 dcache-l+s          46.9%
extend dict with itself:
------------------------------
10000000 items           98.1%
  L1 dcache-loads        96.3%
  L1 dcache-stores       97.2%
  L1 dcache-l+s          96.6%
1000000 items            96.1%
  L1 dcache-loads        97.9%
  L1 dcache-stores       99.4%
  L1 dcache-l+s          98.4%
100000 items            114.2%
  L1 dcache-loads        98.3%
  L1 dcache-stores      100.1%
  L1 dcache-l+s          98.9%
10000 items             154.1%
  L1 dcache-loads        98.0%
  L1 dcache-stores      100.1%
  L1 dcache-l+s          98.7%
1000 items              109.1%
  L1 dcache-loads        97.6%
  L1 dcache-stores       99.3%
  L1 dcache-l+s          98.2%
100 items                40.0%
  L1 dcache-loads        90.5%
  L1 dcache-stores       91.8%
  L1 dcache-l+s          90.9%
10 items                  0.9%
  L1 dcache-loads        53.7%
  L1 dcache-stores       54.9%
  L1 dcache-l+s          54.1%
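For clarity, the percentages in the tables above are relative slowdowns.
A minimal sketch of how such a figure is derived from two measured
runtimes (illustrative only; the timings here are made-up example
values, assuming both runs time the same extend() calls, once with and
once without the dummy lock set):

    " t_nolock: runtime without any lock on the target dict (seconds)
    " t_lock:   runtime with the dummy item locked, so the lock check
    "           loop is active (seconds)
    let t_nolock = 0.50
    let t_lock   = 0.67
    let overhead = (t_lock - t_nolock) / t_nolock * 100.0
    echo printf('overhead: %.1f%%', overhead)   " -> overhead: 34.0%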
testscripts-for-vim-extend-error-mode.tar
Description: Unix tar archive
