On 7/6/11 5:23 PM, Jason Baldridge wrote:
This is very micro-managed, and it should be possible to use a single instance on multiple threads as well. E.g., I'd like to use parallel sequences in Scala to distribute a single model instance over several documents (so, if I have documents in a List mydocuments, then I should be able to do mydocuments.par and process each document, but I can't). Better encapsulation would do the trick here. Is there a good reason not to?
A new threading strategy must perform similarly to the current one: a thread-safe instance must scale with the number of threads and CPU cores. I believe it is not as easy as just doing a little encapsulation; it requires a deep redesign of a few things. For example, the feature generation of the POS tagger, name finder and chunker must be changed, and the beam search code isn't thread-safe, among other things.

Isn't it possible in Scala to pass in some factory which can create an instance per thread?

Jörn
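
For illustration, here is a minimal sketch of the per-thread factory idea, not a tested recipe for the actual tools. UnsafeTagger is a hypothetical stand-in for any component that keeps mutable state and is not thread-safe; PerThread and PerThreadDemo are likewise made-up names. The sketch uses Futures from the standard library so that it is self-contained; with a parallel collection the same taggers.get call would sit inside the closure passed to mydocuments.par.map.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in for a component that is NOT thread-safe:
// it reuses one mutable buffer across calls.
final class UnsafeTagger {
  private val buffer = new StringBuilder

  def tag(doc: String): String = {
    buffer.clear()
    doc.split("\\s+").foreach(tok => buffer.append(tok).append("/X "))
    buffer.toString.trim
  }
}

// Wraps a factory so that each thread lazily creates its own instance
// on first use and reuses it afterwards.
final class PerThread[A <: AnyRef](factory: () => A) {
  private val local = new ThreadLocal[A] {
    override def initialValue(): A = factory()
  }

  def get: A = local.get()
}

object PerThreadDemo {
  // Tags every document on the global thread pool. No tagger instance
  // is ever shared between two threads.
  def tagAll(docs: List[String]): List[String] = {
    val taggers = new PerThread(() => new UnsafeTagger)
    val work = Future.traverse(docs)(doc => Future(taggers.get.tag(doc)))
    Await.result(work, 30.seconds)
  }
}
```

Each pool thread pays the construction cost once, so this only helps if the expensive, immutable part (the model) can be shared by the instances the factory creates. The per-thread instances also stay alive for as long as the pool threads do.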
