I'm trying to write a multithreaded crawler using Cayenne. I previously
had it working with Torque.
I'm writing different information out to the database (and Solr). Some
of the information is shared by multiple threads and should only be
created if it doesn't already exist in the db. Outgoing links are what
is giving me trouble. Many of our pages point to the same URL, so each
page should reuse the existing database row for that link if there is
one, and create the row if there isn't. Every later lookup needs to make
the same existence check.
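To make the requirement concrete, here is a rough sketch of the
"get or create" behaviour I'm after. It is plain Java, not Cayenne code:
an in-memory map stands in for the links table, and the names
LinkRegistry, getOrCreate and insertCount are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LinkRegistry {
    // Stand-in for the links table: URL -> generated id (assumption, not a real schema).
    private final ConcurrentHashMap<String, Integer> idsByUrl = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger(1);
    // Counts how many times a "row" was actually created.
    private final AtomicInteger inserts = new AtomicInteger(0);

    // Returns the existing id for the URL, or creates one if none exists.
    // computeIfAbsent runs the creation function at most once per key,
    // even when many threads ask for the same URL at the same time.
    public int getOrCreate(String url) {
        return idsByUrl.computeIfAbsent(url, u -> {
            inserts.incrementAndGet();
            return nextId.getAndIncrement();
        });
    }

    public int insertCount() {
        return inserts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        LinkRegistry registry = new LinkRegistry();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        // 1000 lookups spread over only 10 distinct URLs, from 8 threads.
        for (int i = 0; i < 1000; i++) {
            final String url = "http://example.org/page" + (i % 10);
            pool.submit(() -> registry.getOrCreate(url));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("inserts: " + registry.insertCount());
    }
}
```

However many threads ask for the same URL, only one "insert" happens per
distinct URL, so the run above reports 10 inserts. That is the guarantee
I want from the database-backed version.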
If I don't commit the context frequently enough, it starts attempting to
insert duplicate URLs. I have that fixed, but now I'm getting this sort
of message:
Cannot set object as destination of relationship toResource because it
is in a different ObjectContext
What's the best strategy for doing frequent updates to the database with
multiple threads?
I am beginning to think I'm headed down the wrong path and should switch
to something else completely to store this data, such as NoSQL.
- Multithreaded application trouble with contexts Richard Frovarp