On Wed, Jun 26, 2013 at 11:00 AM, Michael A Barborak <
[email protected]> wrote:

> Hi,
>
> Thanks for the reply. I'm hearing a few things at once so please let me
> try to tease it apart.
>
> The first is that UIMA is caching feature structure address lookups
> already. My tests show that the time performance of this cache is not as
> good as a Java HashMap. Does the UIMA implementation have benefits I am
> not seeing? Perhaps in memory usage? If those benefits do not apply to my
> use case, is there a way that I can register my own Map implementation?
>
I believe JCas has a custom hashmap. I don't know more than that.


> The second is that caching can hurt performance and so there is an option
> to turn it off. I believe you are saying that if the percentage of cache
> accesses that are misses is above some level then caching is a performance
> drag rather than a performance help. I believe that is generally true of
> caching. That is not the case for my use case, though. (By the way, I
> wasn't able to find this option to measure the effect of turning off
> caching on my tests; I couldn't think of good keywords to search for it.)
>
The scenario where the JCas hashmap causes CPU degradation is not due to
cache misses; it is due to populating the hashmap but never using it.

Documentation for disabling the JCas cache is at
http://uima.apache.org/d/uimaj-2.4.0/tutorials_and_users_guides.html#tug.application.pto
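
To make the "populated but never used" scenario concrete, here is a
minimal sketch in plain Java. It is not UIMA code: the class
WriteOnlyCacheSketch, the Wrapper type, and the counters are all
hypothetical names, and a java.util.HashMap stands in for whatever map
JCas actually uses. It only counts the work done when every created
object is put into a cache that may or may not be read later.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a stand-in for an address-to-wrapper cache, not UIMA code. */
public class WriteOnlyCacheSketch {

    /** Hypothetical wrapper object created for each address. */
    public static final class Wrapper {
        final int address;
        Wrapper(int address) { this.address = address; }
    }

    /** Counts the work done by one run. */
    public static final class Stats {
        public int puts;     // entries written into the cache
        public int hits;     // lookups answered from the cache
        public int created;  // wrapper objects allocated
    }

    /**
     * Creates one wrapper per address in [0, count), caching each one when
     * cacheEnabled is true, then resolves the given lookups. A lookup that
     * the cache cannot answer allocates a fresh wrapper instead.
     */
    public static Stats run(int count, int[] lookups, boolean cacheEnabled) {
        Map<Integer, Wrapper> cache = new HashMap<>();
        Stats stats = new Stats();
        for (int addr = 0; addr < count; addr++) {
            Wrapper w = new Wrapper(addr);
            stats.created++;
            if (cacheEnabled) {
                cache.put(addr, w);   // cost paid up front, whether or not it is read
                stats.puts++;
            }
        }
        for (int addr : lookups) {
            Wrapper cached = cacheEnabled ? cache.get(addr) : null;
            if (cached != null) {
                stats.hits++;
            } else {
                new Wrapper(addr);    // no cache entry: rebuild the wrapper
                stats.created++;
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        // Nothing is ever looked up, so every put is pure overhead.
        Stats s = run(100_000, new int[0], true);
        System.out.println("puts=" + s.puts + " hits=" + s.hits);
    }
}
```

With no lookups, the cached run pays for every put and gets no hits
back, which is the case where turning the cache off helps. With repeated
lookups of the same addresses, the same puts save allocations instead.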


> You didn't mention improving feature value access performance. It seems
> that could be improved?
>
I would not be surprised if some improvement was possible.


>
> I'm puzzled by your comment about the likelihood of being able to improve
> UIMA's performance for all or even most scenarios. Is this based on a
> survey of typical UIMA applications? What was the decision process that
> led to the current map implementation? I just wonder what criteria would
> need to be achieved by a proposed change. My feeling continues to be that
> there is room for improvement.
>
The JCas cache is a good example. It can make performance much better
for some scenarios and worse in others. An improvement to the cache
implementation would be hard to argue with, as would a reduction in some
code path that didn't lose necessary functionality.

Eddie
