It still surprises me how people in scientific computing refuse to learn about
databases and instead replicate database functionality in files, in a
complicated and probably buggy way. HDF5 is one example; there are many
others. If you want fast search (i.e. speeding up search via indices) or
things like parallel writes/concurrency, you REALLY should use a database.
That's what they were invented for, decades ago. Nowadays there is a bigger
choice than ever: relational or non-relational (NoSQL), single-host or
distributed, web interface or not, disk-based or in-memory, ... There really
is no excuse anymore not to use a database if you want to go beyond just
reading a bunch of data into memory in one go.
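For the concurrent-writes case described below, even an embedded database like SQLite gets you most of the way, since it does the cross-process locking for you. A minimal sketch, assuming the SQLite.jl and DBInterface.jl packages (the table name and values are made up for illustration):

```julia
using SQLite, DBInterface

# One database file; any number of processes can open it independently.
db = SQLite.DB("results.db")

DBInterface.execute(db, """
    CREATE TABLE IF NOT EXISTS results (
        worker INTEGER,
        value  REAL
    )""")

# Make concurrent writers wait (up to 5 s) instead of erroring out
# when another process holds the write lock.
DBInterface.execute(db, "PRAGMA busy_timeout = 5000")

# Each worker just inserts; SQLite serializes the writes internally.
DBInterface.execute(db, "INSERT INTO results VALUES (?, ?)", [1, 3.14])
```

No application-level coordination needed: every worker opens the same file and inserts, which is exactly the scenario the queue-of-`Condition`s approach below tries to hand-roll.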
On Monday, October 10, 2016 at 5:09:39 PM UTC+2, Zachary Roth wrote:
> Hi, everyone,
> I'm trying to save to a single file from multiple worker processes, but
> don't know of a nice way to coordinate this. When I don't coordinate,
> saving works fine much of the time. But I sometimes get errors with
> reading/writing of files, which I'm assuming is happening because multiple
> processes are trying to use the same file simultaneously.
> I tried to coordinate this with a queue/channel of `Condition`s managed by
> a task running in process 1, but this isn't working for me. I've tried to
> simplify this to track down the problem. At least part of the issue seems
> to be writing to the channel from process 2. Specifically, when I `put!`
> something onto a channel (or `push!` onto an array) from process 2, the
> channel/array is still empty back on process 1. I feel like I'm missing
> something simple. Is there an easier way to go about coordinating multiple
> processes that are trying to access the same file? If not, does anyone
> have any tips?
> Thanks for any help you can offer.
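
That said, the channel behavior you saw is expected: a plain `Channel` (or array) created on process 1 and captured in code sent to process 2 gets serialized, so process 2 `put!`s into its own local copy. For cross-process coordination you want a `RemoteChannel`, which lives on one process but is shared by all. A rough sketch of using one as a write token, assuming modern Julia where `Distributed` must be loaded (the file name and helper function are made up):

```julia
using Distributed
addprocs(2)

# A RemoteChannel is hosted on one process (here, process 1) but visible
# everywhere: put!/take! from any worker act on the same underlying channel.
const token = RemoteChannel(() -> Channel{Bool}(1))
put!(token, true)  # one token in the channel = one writer at a time

@everywhere function write_safely(tok, path, msg)
    take!(tok)              # acquire the "lock" (blocks if another worker has it)
    try
        open(path, "a") do io
            println(io, msg)
        end
    finally
        put!(tok, true)     # release, so the next worker can proceed
    end
end

@sync for w in workers()
    @async remotecall_wait(write_safely, w, token, "out.txt", "from worker $w")
end
```

The same pattern works as a job queue: put work items on one `RemoteChannel`, have workers `take!` from it, and have a single task on process 1 drain a results channel and do all the file writing itself.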