On Tue, Oct 31, 2017 at 11:38 AM, Israel Brewster <isr...@ravnalaska.net> wrote:
> A question that has arisen before (for example, here:
> https://mail.python.org/pipermail/python-list/2010-January/565497.html) is
> the question of "is defaultdict thread safe", with the answer generally being
> a conditional "yes", with the condition being what is used as the default
> value: apparently default values of python types, such as list, are thread
> safe,

I would not rely on this. It might be true for current versions of CPython,
but I don't think there's any general guarantee and you could run into
trouble with other implementations.

> whereas more complicated constructs, such as lambdas, make it not thread
> safe. In my situation, I'm using a lambda, specifically:
>
> lambda: datetime.min
>
> So presumably *not* thread safe.
>
> My goal is to have a dictionary of aircraft and when they were last "seen",
> with datetime.min being effectively "never". When a data point comes in for a
> given aircraft, the data point will be compared with the value in the
> defaultdict for that aircraft, and if the timestamp on that data point is
> newer than what is in the defaultdict, the defaultdict will get updated with
> the value from the datapoint (not necessarily current timestamp, but rather
> the value from the datapoint). Note that data points do not necessarily
> arrive in chronological order (for various reasons not applicable here, it's
> just the way it is), thus the need for the comparison.

Since you're going to immediately replace the default value with an actual
value, it's not clear to me what the purpose of using a defaultdict is here.
You could use a regular dict and just check if the key is present, perhaps
with the additional argument to .get() to return a default value.

Individual lookups and updates of ordinary dicts are atomic (at least in
CPython). A lookup followed by an update is not, and this would be true for
defaultdict as well.

> When the program first starts up, two things happen:
>
> 1) a thread is started that watches for incoming data points and updates the
> dictionary as per above, and
> 2) the dictionary should get an initial population (in the main thread) from
> hard storage.
>
> The behavior I'm seeing, however, is that when step 2 happens (which
> generally happens before the thread gets any updates), the dictionary gets
> populated with 56 entries, as expected.
> However, none of those entries are
> visible when the thread runs. It's as though the thread is getting a separate
> copy of the dictionary, although debugging says that is not the case -
> printing the variable from each location shows the same address for the
> object.
>
> So my questions are:
>
> 1) Is this what it means to NOT be thread safe? I was thinking of race
> conditions where individual values may get updated wrong, but this apparently
> is overwriting the entire dictionary.

No, a thread-safety issue would be something like this:

    account[user] = account[user] + 1

where the value of account[user] could potentially change between the time it
is looked up and the time it is set again.

That said, it's not obvious to me what your problem actually is.
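
To make the lookup-followed-by-update point concrete, here is a minimal
sketch of the "regular dict plus .get()" approach with a lock held across the
compare and the store. The names last_seen, last_seen_lock and
record_sighting are made up for illustration and are not from your code, and
I'm assuming the timestamps are naive datetime objects so they compare
cleanly against datetime.min.

```python
import threading
from datetime import datetime

last_seen = {}                      # aircraft id -> newest timestamp seen so far
last_seen_lock = threading.Lock()   # guards every read-compare-write on last_seen


def record_sighting(aircraft, timestamp):
    """Store timestamp for aircraft only if it is newer than the stored one.

    Returns True if the dictionary was updated, False otherwise.
    """
    # Hold the lock across both the lookup and the update so that no other
    # thread can change the entry between the comparison and the assignment.
    with last_seen_lock:
        # .get() supplies datetime.min ("never seen") without inserting a key,
        # which is the job the defaultdict was doing.
        if timestamp > last_seen.get(aircraft, datetime.min):
            last_seen[aircraft] = timestamp
            return True
        return False
```

Both the initial load from storage and the update thread would go through
record_sighting(), so a late-arriving older data point can't overwrite a
newer one regardless of which thread gets there first. This only addresses
the race; it probably does not explain the "entries not visible" symptom.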