Well, if you want multiple processes to write into the DB, you should use one 
that can handle concurrency, i.e. a "real" DB server, not a simple 
desktop/embedded DB like SQLite. So for example Postgres, or if you do not 
want to deal with SQL, then use a NoSQL DB, e.g. MongoDB (there are many 
more). For a column-store relational DB (good for analytics): MonetDB.
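As a rough sketch of that pattern (hedged: this assumes the LibPQ.jl package, 
a running Postgres instance with an `experiments` database, and a pre-created 
`results` table — none of which come from this thread):

```julia
using Distributed
addprocs(4)

@everywhere using LibPQ

# Each worker opens its own connection; Postgres handles the
# concurrent INSERTs, so no file-level coordination is needed.
@everywhere function record(job)
    conn = LibPQ.Connection("dbname=experiments")
    execute(conn, "INSERT INTO results (job, value) VALUES (\$1, \$2)",
            [job, job^2])
    close(conn)
end

pmap(record, 1:100)
```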
If you still want all the data in one file at the end, then write a program 
that exports the data from the DB to a file when everything is done (that 
program is a single process, so there are no concurrency issues).
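A minimal sketch of such an export step, assuming SQLite.jl and CSV.jl and a 
hypothetical `results` table (the file and table names are placeholders):

```julia
using SQLite, CSV, DBInterface

# Single process: read everything out of the DB and dump it to one CSV file.
db = SQLite.DB("results.db")
CSV.write("results.csv", DBInterface.execute(db, "SELECT * FROM results"))
```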

You could also do everything in-memory and let it serialize to disk 
asynchronously, e.g. Apache Ignite (there are a bunch of others).
There's also SciDB, an array-oriented DB.

This is just a small sample of the possibilities. If you want a pure Julia 
solution, then you could do it with the Julia multiprocessing functionality, 
but you'll have to work with locking to coordinate between the processes 
(i.e. it isn't just the typical trivial "divide and conquer" data 
parallelism anymore).
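One way to sketch that pure-Julia route without explicit locks is to funnel 
all writes through a single task on process 1 via a `RemoteChannel` (names 
and job counts below are made up for illustration; this assumes a Julia 
version where `Distributed` is a standard library):

```julia
using Distributed
addprocs(3)

# All results flow through one RemoteChannel hosted on process 1.
results = RemoteChannel(() -> Channel{String}(100))

# Workers compute and send; they never touch the output file themselves.
@everywhere compute(job, ch) = put!(ch, "job $job done")

njobs = 30
writer = @async open("results.txt", "w") do io
    for _ in 1:njobs          # single writer: no concurrent file access
        println(io, take!(results))
    end
end

pmap(job -> compute(job, results), 1:njobs)
wait(writer)
```

The key difference from a plain `Channel` is that a `RemoteChannel` is 
visible from all processes; a `Channel` (or array) that a worker `put!`s 
(or `push!`es) into is just a local copy on that worker, which is why the 
master's copy stays empty.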

On Monday, October 17, 2016 at 7:07:28 PM UTC+2, Zachary Roth wrote:
> Thanks for the responses.
> Ralph, thank you again.  I very much appreciate your "humble offering". 
>  I'll take a further look into your gist.
> Steven, I'm happy to use the right tool for the job...so long as I have an 
> idea of what it is.  Would you care to offer more insights or suggestions 
> for the ill-informed (such as myself)?
> ---Zachary
> On Sunday, October 16, 2016 at 7:51:19 AM UTC-4, Steven Sagaert wrote:
>> That's because SQLite isn't a multi-user DB server but a single-user 
>> embedded (desktop) DB. Use the right tool for the job.
>> On Saturday, October 15, 2016 at 7:02:58 PM UTC+2, Ralph Smith wrote:
>>> How are the processes supposed to interact with the database?  Without 
>>> extra synchronization logic, SQLite.jl gives (occasionally)
>>> ERROR: LoadError: On worker 2:
>>> SQLite.SQLiteException("database is locked")
>>> which on the face of it suggests that all workers are using the same 
>>> connection, although I opened the DB separately in each process.
>>> (I think we should get "busy" instead of "locked", but then still have 
>>> no good way to test for this and wait for a wake-up signal.)
>>> So we seem to be at least as badly off as the original post, except with 
>>> DB calls instead of simple writes.
>>> We shouldn't have to stand up a separate multithreaded DB server just 
>>> for this. Would you be kind enough to give us an example of simple (i.e. 
>>> not client-server) multiprocess DB access in Julia?
>>> On Saturday, October 15, 2016 at 9:40:17 AM UTC-4, Steven Sagaert wrote:
>>>> It still surprises me how in the scientific computing field people 
>>>> still refuse to learn about databases and then replicate database 
>>>> functionality in files in a complicated and probably buggy way. HDF5 is 
>>>> one example; there are many others. If you want to do fancy search (i.e. 
>>>> speed up search via indices) or do things like parallel 
>>>> writes/concurrency, you REALLY should use databases. That's what they 
>>>> were invented for decades ago. Nowadays there is a bigger choice than 
>>>> ever: relational or non-relational (NoSQL), single host or distributed, 
>>>> web interface or not, disk-based or in-memory... There really is no 
>>>> excuse anymore not to use a database if you want to go beyond just 
>>>> reading a bunch of data into memory in one go.
>>>> On Monday, October 10, 2016 at 5:09:39 PM UTC+2, Zachary Roth wrote:
>>>>> Hi, everyone,
>>>>> I'm trying to save to a single file from multiple worker processes, 
>>>>> but don't know of a nice way to coordinate this.  When I don't 
>>>>> coordinate, saving works fine much of the time.  But I sometimes get 
>>>>> errors with reading/writing of files, which I'm assuming is happening 
>>>>> because multiple processes are trying to use the same file 
>>>>> simultaneously.
>>>>> I tried to coordinate this with a queue/channel of `Condition`s 
>>>>> managed by a task running in process 1, but this isn't working for me. 
>>>>>  I've tried to simplify this to track down the problem.  At least part 
>>>>> of the issue seems to be writing to the channel from process 2.  
>>>>> Specifically, when I `put!` something onto a channel (or `push!` onto 
>>>>> an array) from process 2, the channel/array is still empty back on 
>>>>> process 1.  I feel like I'm missing something simple.  Is there an 
>>>>> easier way to go about coordinating multiple processes that are trying 
>>>>> to access the same file?  If not, does anyone have any tips?
>>>>> Thanks for any help you can offer.
>>>>> Thanks for any help you can offer.
>>>>> Cheers,
>>>>> ---Zachary