> Seems that an easy work-around would be to have your reader and writer
> threads synchronize on their access to the CAS.  If we implemented
> concurrent access, this is what we would have to do, inside the CAS
> itself. 
>
> When new data are added to the CAS, indexes are often updated.  If these
> are concurrently being accessed, *bad things* can happen, which is
> probably what's happening in your case.  
>
>   
Well, not exactly, because I do not *write* any data to the CAS: the
threads only read the annotations it contains, and in my real annotators
data is written to the CAS only after all threads have terminated.
I'm not an expert in thread safety, so I might be missing something, but
at first sight I don't understand how concurrent read access can fail.
(Though I must admit I have not tried to study the source code of the
FSIndexRepositoryImpl class.)
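
For reference, the work-around quoted above (every thread synchronizing on
its access to the shared object) could look roughly like the sketch below.
SharedStore is a hypothetical stand-in for the CAS, not the UIMA API, and
the class and method names are made up for illustration. Each reader holds
the store's monitor for its whole pass, so the reads are serialized.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a shared CAS; this is NOT the UIMA API.
class SharedStore {
    private final List<String> annotations = new ArrayList<>();

    void add(String annotation) { annotations.add(annotation); }
    int size() { return annotations.size(); }
    String get(int i) { return annotations.get(i); }
}

class SynchronizedReaders {
    // Starts threadCount reader threads; each one walks the whole store
    // while holding the store's monitor, so only one thread touches it
    // at a time. Returns the total number of items seen by all threads.
    static int countAll(SharedStore store, int threadCount) {
        int[] counts = new int[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                synchronized (store) {
                    for (int i = 0; i < store.size(); i++) {
                        if (store.get(i) != null) {
                            counts[id]++;
                        }
                    }
                }
            });
            threads[t].start();
        }
        int total = 0;
        for (int t = 0; t < threadCount; t++) {
            try {
                // join() makes the thread's writes to counts visible here.
                threads[t].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            total += counts[t];
        }
        return total;
    }
}
```

The obvious cost is that the readers no longer run in parallel while they
hold the lock.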


> The CAS is used as a "unit-of-work" in many places in UIMA, as well.  If
> you used it for this purpose, then a workflow might be:
>
> Have the Writer write to the process, so the process gets all its
> inputs, then have the reader read from the process the results.
>
> For scale-out, have multiple CASes.
>
> Would this work in your use case?  -Marshall
>   
Yes, indeed. The only real drawback of this solution is that it
requires fully duplicating the data at each input or output step,
which costs a bit more time and memory. I guess this solution is more
"UIMA standard" than synchronizing every CAS access in my threads?
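
To make the trade-off concrete, here is a minimal sketch of the
one-copy-per-worker idea using plain Java collections instead of real
CASes (PerWorkerCopies and totalLength are hypothetical names, and the
list copy merely stands in for whatever CAS copying would be used in
practice). No locking is needed because nothing is shared, at the price of
one full copy per worker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerWorkerCopies {
    // Gives every worker its own private copy of the input, lets each
    // worker sum the string lengths of its copy, and returns the grand
    // total over all workers.
    static int totalLength(List<String> source, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                // The duplication step: extra time and memory per worker.
                List<String> privateCopy = new ArrayList<>(source);
                results.add(pool.submit(() -> {
                    int sum = 0;
                    for (String s : privateCopy) {
                        sum += s.length();
                    }
                    return sum;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get(); // waits for that worker to finish
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```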

Thanks again!
Erwan
