On Tue, Dec 4, 2012 at 3:24 PM, Rob Hall <[email protected]> wrote:
> *tl;dr:* I had to extend DatasetImpl to bypass its internal caching in
> order to retrieve the same models that I added to the Dataset. The only
> performance hit I can see is that the Models which wrap the underlying
> graphs will not be garbage collected. Is this a correct assumption about
> the effect of this change, and are my expectations regarding the behavior
> of Dataset in error?
>
> *background:* I extended the OntModel interface (and created a
> corresponding extension to OntModelImpl) to facilitate some domain-specific
> entity representations and actions. I wished to execute queries over sets
> of these Models. These queries were to include their sub-models, and
> potentially inferred triples from any attached reasoner.
>

I think a better way to handle this is composition rather than
inheritance from the OntModelImpl class.  If your extended model class
does not need direct access to the instance variables of OntModelImpl,
this approach is a lot more flexible, especially if the implementation
of OntModelImpl changes in the future.  Essentially, make your custom
Model a stateless wrapper around another Model.

Make your own interface that extends OntModel, then have your
implementation class implement that new interface and delegate all
calls to an OntModel that is passed into the constructor.  This way
your class is very cheap to construct, and you can create a new one
backed by any OntModel that you retrieve from a Dataset.
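
For instance, a rough sketch (DomainOntModel and its methods are just
placeholders for your own interface; most of the OntModel delegation
is elided here, but an IDE can generate those methods for you):

    import com.hp.hpl.jena.ontology.OntClass;
    import com.hp.hpl.jena.ontology.OntModel;

    // Hypothetical domain-specific extension of OntModel.
    public interface DomainOntModel extends OntModel {
        void doDomainSpecificThing();
    }

    // Stateless wrapper: all state lives in the OntModel it delegates to.
    public class DomainOntModelImpl implements DomainOntModel {
        private final OntModel inner;

        public DomainOntModelImpl(OntModel inner) {
            this.inner = inner;
        }

        public void doDomainSpecificThing() {
            // ... implemented purely in terms of calls on 'inner' ...
        }

        // Delegate each OntModel method to 'inner', for example:
        public OntClass getOntClass(String uri) {
            return inner.getOntClass(uri);
        }
        // ... and so on for the rest of the OntModel interface.
    }

Because the wrapper holds no state of its own, you can re-wrap whatever
comes back from the Dataset, e.g.
new DomainOntModelImpl(ModelFactory.createOntologyModel(spec,
dataset.getNamedModel(uri))).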


> To allow this, I added instances of my new implementation to a Jena
> in-memory Dataset.
>
> So far, this has worked as I had expected (with fairly thorough integration
> testing taking place). Recently, I discovered a crash whenever I had added
> a sufficient number of models to the system. When I went to retrieve a
> named model, I could not cast it back to its original type. The following
> bit of code from DatasetImpl makes it clear why:
>
>     private Model graph2model(Graph graph)
>     {
>         // Called from readers -- outer synchronization needed.
>         Model model = cache.get(graph) ;
>         if ( model == null )
>         {
>             model = ModelFactory.createModelForGraph(graph) ;
>             cache.put(graph, model) ;
>         }
>         return model ;
>     }
>
> Inside DatasetImpl, an LRU cache is used to keep track of recently requested
> (or inserted) models. After executing for some time, the original model has
> been lost, and a subsequent request (and cast) results in a
> ClassCastException.
>

I think that cache needs to be eliminated.  It is probably a relic
from the days of expensive object creation.


> *workaround:* My workaround would have been to extend DatasetImpl and
> override the 4 methods that interact with the cache to, instead, work with
> a standard HashMap. As those methods are private, I am presently creating
> my own implementation.
>
> *question:* My workaround makes several assumptions, and I wanted to be
> certain that they made sense.
>
> *1) *My use of Dataset is outside of its expected use. Multiple requests to
> get the same model should not expect to retrieve the same implementation,
> let alone the same instance, of the model that they saw on a previous
> request. (i.e., *that this isn't a bug*)

Model is meant to be a stateless wrapper around an underlying Graph,
and Dataset is similarly a wrapper around a DatasetGraph.  Therefore
it is expected that you may get different instances of Model back from
a Dataset than the ones you added.
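
A quick way to see this (a sketch against the in-memory dataset; the
URI is just an example):

    import com.hp.hpl.jena.query.Dataset;
    import com.hp.hpl.jena.query.DatasetFactory;
    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;

    public class WrapperIdentity {
        public static void main(String[] args) {
            Dataset ds = DatasetFactory.createMem();
            ds.addNamedModel("urn:example:m", ModelFactory.createDefaultModel());

            Model m1 = ds.getNamedModel("urn:example:m");
            Model m2 = ds.getNamedModel("urn:example:m");

            // No guarantee the wrappers are the same object ...
            System.out.println("same Model? " + (m1 == m2));
            // ... but the state lives in the Graph, which is stable.
            System.out.println("same Graph? " + (m1.getGraph() == m2.getGraph()));
        }
    }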


> *2)* The caching that occurs within DatasetImpl primarily exists to
> minimize held references to objects in order to facilitate Java
> garbage collection (the Model may be garbage collected, but its backing
> graph, retained by the Dataset, will not). Therefore, *overriding this
> behavior will have no effect on applications which are already retaining
> references to those Models*.

The caching that is there now probably isn't needed, and it should be
returning a new Model instance every time.
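
That is, something along these lines (untested sketch):

    private Model graph2model(Graph graph)
    {
        // No cache: Models are cheap, stateless wrappers over the Graph.
        return ModelFactory.createModelForGraph(graph) ;
    }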

-Stephen
