It's best to avoid those synchronization barriers if possible. If all of the data is in SHM (RAM) on one node, and you need to notify processes, or wait for other workers to become available for a task that requires that data, you need some method of IPC: a queue, channel subscriptions, a source/sink, or frequent polling, which is more resilient against dropped messages. (But you only need to scale to one node.)
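As a rough illustration of the queue option, here is a minimal sketch using only the standard library (Python 3.8+). It is one possible arrangement, not how Dask or Plasma do it: the helper names `publish` and `consume_one` are made up for this example. The data stays in a shared-memory block, and only the block's name and length travel over the queue, so a worker blocks on `get()` instead of polling.

```python
import multiprocessing as mp
from multiprocessing import shared_memory


def publish(payload: bytes, tasks) -> shared_memory.SharedMemory:
    """Hypothetical helper: copy payload into a new block and announce it."""
    # payload must be non-empty; SharedMemory rejects a size of 0.
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    # Only the block name and length cross the queue, not the data itself.
    # The length is sent explicitly because some platforms round the block
    # size up to a page boundary.
    tasks.put((shm.name, len(payload)))
    return shm


def consume_one(tasks, results, timeout: float = 10.0) -> None:
    """Hypothetical worker step: wait for one task and report a byte sum."""
    name, size = tasks.get(timeout=timeout)  # blocks until notified
    shm = shared_memory.SharedMemory(name=name)  # attach to the existing block
    try:
        # Release the slice before close(); close() fails while views exist.
        with shm.buf[:size] as view:
            total = sum(view)
        results.put((name, total))
    finally:
        shm.close()  # detach only; the creator is responsible for unlink()


if __name__ == "__main__":
    tasks, results = mp.Queue(), mp.Queue()
    worker = mp.Process(target=consume_one, args=(tasks, results))
    worker.start()
    shm = publish(b"hello", tasks)
    try:
        print(results.get(timeout=10))
    finally:
        worker.join()
        shm.close()
        shm.unlink()  # free the block once every process is done with it
```

The creating process keeps ownership of the block and is the only one that calls `unlink()`; workers only attach and `close()`. A real pool would still need some record of which workers hold which blocks before it can free them.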
There needs to be a shared structure that tracks allocations, right? What does it need to do lookups by?

[ [obj_id_or_shm_pointer, [subscribers]] ]

Does the existing memory pool solve for that?

And there also needs to be an instruction pipeline; a queue/channel/source of messages for each worker or only some workers to process.

...
https://distributed.dask.org/en/latest/journey.html
https://distributed.dask.org/en/latest/work-stealing.html

"Accelerate intra-node IPC with shared memory"
https://github.com/dask/dask/issues/6267

On Sun, Aug 2, 2020, 3:21 AM Vinay Sharma <vinay04sha...@icloud.com> wrote:

> I understand that I won’t need locks with immutable objects at some level,
> but I don’t understand how they can be used to synchronise shared memory
> segments.
>
> For every change in an immutable object, a copy is created which will have
> a different address. Now, for processes to use this updated object they
> will have to remap a new address in their address space for them to see any
> changes, and this remap will have to occur whenever a change takes place,
> which is obviously not feasible.
>
> So, changes in the shared memory segment should be done in the shared
> memory segment itself, therefore shared memory segments should be mutable.
>
> On 02-Aug-2020, at 5:11 AM, Wes Turner <wes.tur...@gmail.com> wrote:
>
> https://docs.dask.org/en/latest/shared.html#known-limitations :
>
> > Known Limitations
> > The shared memory scheduler has some notable limitations:
> >
> > - It works on a single machine
> > - The threaded scheduler is limited by the GIL on Python code, so if
> your operations are pure python functions, you should not expect a
> multi-core speedup
> > - The multiprocessing scheduler must serialize functions between
> workers, which can fail
> > - The multiprocessing scheduler must serialize data between workers and
> the central process, which can be expensive
> > - The multiprocessing scheduler cannot transfer data directly between
> worker processes; all data routes through the master process.
>
> ...
> https://distributed.dask.org/en/latest/memory.html#difference-with-dask-compute
>
> (... https://github.com/dask/dask-labextension )
>
> On Sat, Aug 1, 2020 at 7:34 PM Wes Turner <wes.tur...@gmail.com> wrote:
>
>> PyArrow Plasma object ids, "sealing" makes an object immutable, pyristent
>>
>> https://arrow.apache.org/docs/python/plasma.html#object-ids
>> https://arrow.apache.org/docs/python/plasma.html#creating-an-object-buffer
>>
>> > Objects are created in Plasma in two stages. First, they are created,
>> which allocates a buffer for the object. At this point, the client can
>> write to the buffer and construct the object within the allocated buffer.
>> >
>> > To create an object for Plasma, you need to create an object ID, as
>> well as give the object’s maximum size in bytes.
>> > ```python
>> > # Create an object buffer.
>> > object_id = plasma.ObjectID(20 * b"a")
>> > object_size = 1000
>> > buffer = memoryview(client.create(object_id, object_size))
>> >
>> > # Write to the buffer.
>> > for i in range(1000):
>> >     buffer[i] = i % 128
>> > ```
>> >
>> > When the client is done, the client seals the buffer, making the object
>> immutable, and making it available to other Plasma clients.
>> >
>> > ```python
>> > # Seal the object. This makes the object immutable and available to
>> other clients.
>> > client.seal(object_id)
>> > ```
>>
>> https://pypi.org/project/pyrsistent/ also supports immutable structures
>>
>> On Sat, Aug 1, 2020 at 4:44 PM Eric V. Smith <e...@trueblade.com> wrote:
>>
>>> On 8/1/2020 1:25 PM, Marco Sulla wrote:
>>> > You don't need locks with immutable objects. Since they're immutable,
>>> > any operation that usually will mutate the object, generate another
>>> > immutable instead. The most common example is str: the sum of two
>>> > strings in Python (and in many other languages) produces a new string.
>>>
>>> While they're immutable at the Python level, strings (and all other
>>> objects) are mutated at the C level, due to reference count updates. You
>>> need to consider this if you're sharing objects without locking or other
>>> synchronization.
>>>
>>> Eric
>>>
>>> _______________________________________________
>>> Python-ideas mailing list -- python-ideas@python.org
>>> To unsubscribe send an email to python-ideas-le...@python.org
>>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>>> Message archived at
>>> https://mail.python.org/archives/list/python-ideas@python.org/message/FEJEHFKBK7TMH6KIYJBPLBYBDU4IA4EB/
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/IRDFSJP7CIQRPQQEP54T42HN33BUOOOV/
> Code of Conduct: http://python.org/psf/codeofconduct/
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/S6HLASS4SJ6KGEI3JFY4TMUBSOGBRHBR/
Code of Conduct: http://python.org/psf/codeofconduct/