A question that has come up before (for example, here:
<https://mail.python.org/pipermail/python-list/2010-January/565497.html>) is
"is defaultdict thread safe?" The answer is generally a conditional "yes",
the condition being what is used as the default factory: apparently factories
that are built-in Python types, such as list, are thread safe, whereas
factories written in Python, such as lambdas, make it not thread safe.
In my situation, I'm using a lambda, specifically:
lambda: datetime.min
So presumably *not* thread safe.
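For reference, a minimal sketch of how a defaultdict with this lambda behaves on a missing key. The aircraft identifiers are made up for illustration. Note that a plain lookup of a missing key stores the default, so lookups alone can change len() of the dictionary:

```python
from collections import defaultdict
from datetime import datetime

last_points = defaultdict(lambda: datetime.min)

# Reading a missing key calls the factory and stores the result under that key.
first_seen = last_points["N123AB"]  # "N123AB" is a hypothetical identifier
print(first_seen == datetime.min, "N123AB" in last_points)

# .get() does not call the factory, so no entry is created for this key.
print(last_points.get("N999ZZ"), "N999ZZ" in last_points)
```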
My goal is to have a dictionary of aircraft and when they were last "seen",
with datetime.min being effectively "never". When a data point comes in for a
given aircraft, the data point will be compared with the value in the
defaultdict for that aircraft, and if the timestamp on that data point is newer
than what is in the defaultdict, the defaultdict will get updated with the
value from the datapoint (not necessarily current timestamp, but rather the
value from the datapoint). Note that data points do not necessarily arrive in
chronological order (for various reasons not applicable here, it's just the way
it is), thus the need for the comparison.
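The compare-then-update step described above could be sketched roughly as follows. This is only an illustration: the lock, the function name record_point, and the sample identifier are assumptions, not part of the actual program. The lock makes the comparison and the assignment a single step with respect to other threads that use the same lock:

```python
import threading
from collections import defaultdict
from datetime import datetime

last_points = defaultdict(lambda: datetime.min)
last_points_lock = threading.Lock()  # hypothetical; not in the original code


def record_point(aircraft_id, point_time):
    """Store point_time only if it is newer than the stored value.

    Returns True if the dictionary was updated, False otherwise.
    """
    with last_points_lock:
        if last_points[aircraft_id] < point_time:
            last_points[aircraft_id] = point_time
            return True
        return False


# Data points may arrive out of order; the older one is ignored.
record_point("N123AB", datetime(2020, 1, 2))
record_point("N123AB", datetime(2020, 1, 1))
```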
When the program first starts up, two things happen:
1) a thread is started that watches for incoming data points and updates the
dictionary as per above, and
2) the dictionary should get an initial population (in the main thread) from
hard storage.
The behavior I'm seeing, however, is that when step 2 happens (which generally
happens before the thread gets any updates), the dictionary gets populated with
56 entries, as expected. However, none of those entries are visible when the
thread runs. It's as though the thread is getting a separate copy of the
dictionary, although debugging says that is not the case - printing the
variable from each location shows the same address for the object.
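A slightly fuller version of that check might record the process ID and thread name alongside the address. This is a sketch with made-up names (report, reports), not the actual debugging code. The reason for including os.getpid() is that id() values are only comparable within a single process; whether more than one process is involved here is purely an assumption that this check would confirm or rule out:

```python
import os
import threading
from collections import defaultdict
from datetime import datetime

last_points = defaultdict(lambda: datetime.min)
reports = {}  # hypothetical container for the debugging output


def report(label):
    # Record which object, which process, and which thread is looking at it.
    reports[label] = {
        "object_id": id(last_points),
        "size": len(last_points),
        "pid": os.getpid(),
        "thread": threading.current_thread().name,
    }


last_points["N123AB"] = datetime(2020, 1, 1)  # hypothetical identifier
report("main")

worker = threading.Thread(target=report, args=("worker",), name="watcher")
worker.start()
worker.join()
```

Within one process, both reports show the same address, size, and process ID; only the thread name differs.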
So my questions are:
1) Is this what it means to NOT be thread safe? I was thinking of race
conditions where individual values may get updated wrong, but this apparently
is overwriting the entire dictionary.
2) How can I fix this?
Note: I really don't care if the "initial" update happens after the thread
receives a data point or two, and therefore overwrites one or two values. I
just need the dictionary to be fully populated at some point early in
execution. In usage, the dictionary is used to see if an aircraft has been seen
"recently", so if the most recent datapoint gets overwritten with a slightly
older one from disk storage, that's fine. The problem is when an entry still
shows datetime.min because we haven't received any datapoint since we launched
the program, even though we have "recent" data in disk storage. So
I don't care about the obvious race condition between the two operations, just
that the end result is a populated dictionary. Note also that as datapoints come
in, they are being written to disk, so the disk storage doesn't lag
significantly anyway.
The framework of my code is below:
File: watcher.py

    from collections import defaultdict
    from datetime import datetime

    last_points = defaultdict(lambda: datetime.min)

    # This function is launched as a thread using the threading module when the
    # first client connects
    def watch():
        while True:
            <wait for datapoint>
            pointtime = <extract/parse timestamp from datapoint>
            if last_points[<aircraft_identifier>] < pointtime:
                <do stuff>
                last_points[<aircraft_identifier>] = pointtime
                # DEBUGGING
                print("At update:", len(last_points))
File: main.py

    from .watcher import last_points

    # This function will be triggered by a web call from a client, so could happen
    # at any time.
    # Client will call this function immediately after connecting, as well as in
    # response to various user actions.
    def getac():
        <load list of aircraft and times from disk>
        <do stuff to send the list to the client>
        for record in aclist:
            last_points[<aircraft_identifier>] = record_timestamp
        # DEBUGGING
        print("At get AC:", len(last_points))
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------