Right, Michael... of course. Sometimes I cannot see the forest or the trees...
We don't have to worry that much about the IndexReader being read-only,
though, do we? There is not much to worry about now: I believe that a deleted
doc's field value remains in the field cache until the Reader is reopened. If
someone is changing docs/fields, they need to reopen the Reader.
Heck... I think that deleted docs' field values even currently get loaded
into the cache. So regardless of "read-only", this method will duplicate
the current behavior, right?
- Mark
Michael McCandless wrote:
I was picturing that you'd first call an API on IndexReader to
retrieve an object for accessing your stored field, like the
IndexReader.getCachedData() in the current patch on LUCENE-831. That
method must be synchronized so that the underlying cache is initially
loaded by at most one thread.
You then use the returned object to ask for the field value(s) for an
individual doc. I think calling "getIntValue(int docID)" on this
object would not have to be synchronized, assuming the original
IndexReader was "read-only"?
Mike
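A minimal sketch of the two-step access pattern described above, under stated assumptions: ReadOnlyReaderSketch, IntFieldValues, the loader function and getLoadCount() are hypothetical names made up for illustration, and this is neither Lucene's IndexReader nor the code in the LUCENE-831 patch. Only the lookup that may trigger loading is synchronized; the returned object wraps an array that never changes after construction, so per-document reads take no lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Immutable per-field view (hypothetical). Reads need no lock because the
 *  array is copied once and never modified afterwards. */
final class IntFieldValues {
    private final int[] values;

    IntFieldValues(int[] values) {
        this.values = values.clone();
    }

    /** Unsynchronized per-document read. */
    int getIntValue(int docID) {
        return values[docID];
    }
}

/** Hypothetical stand-in for a read-only reader; not Lucene's IndexReader. */
final class ReadOnlyReaderSketch {
    private final Map<String, IntFieldValues> cache = new HashMap<>();
    private final Function<String, int[]> loader; // assumed source of field data
    private int loadCount = 0;

    ReadOnlyReaderSketch(Function<String, int[]> loader) {
        this.loader = loader;
    }

    /** Synchronized so that each field is loaded by at most one thread. */
    synchronized IntFieldValues getCachedData(String field) {
        IntFieldValues cached = cache.get(field);
        if (cached == null) {
            cached = new IntFieldValues(loader.apply(field));
            cache.put(field, cached);
            loadCount++;
        }
        return cached;
    }

    /** Number of times the loader has run; exposed only for illustration. */
    synchronized int getLoadCount() {
        return loadCount;
    }
}
```

A caller would pay for the lock once per search (one getCachedData call) and then call getIntValue for every hit without synchronization.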
Mark Miller wrote:
The reason I am thinking you have to sync on every getCachedField
call is that the cache needs to be lazily loaded... I don't see a way
to do this without sync unless you have an ugly "you must call
this method before repeatedly calling getCachedField" rule.
Maybe I am wrong? Or maybe the cost of synchronization is low enough
now not to matter, as long as we provide the backward-compatible way?
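For comparison, here is one way lazy loading can avoid a lock on the common cache-hit path. This is a sketch using java.util.concurrent.ConcurrentHashMap, a technique not proposed anywhere in this thread; LazyFieldCacheSketch, its loader function and loadCount() are hypothetical names, and the loader is assumed never to return null.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical lazily loaded cache; not part of Lucene or the LUCENE-831 patch. */
final class LazyFieldCacheSketch {
    private final ConcurrentMap<String, int[]> cache = new ConcurrentHashMap<>();
    private final Function<String, int[]> loader; // assumed to never return null
    private final AtomicInteger loads = new AtomicInteger();

    LazyFieldCacheSketch(Function<String, int[]> loader) {
        this.loader = loader;
    }

    int[] getCachedField(String field) {
        // Cache hit: ConcurrentHashMap.get does not block.
        int[] values = cache.get(field);
        if (values == null) {
            // Cache miss: computeIfAbsent applies the function at most once per key,
            // so concurrent first callers do not load the same field twice.
            values = cache.computeIfAbsent(field, f -> {
                loads.incrementAndGet();
                return loader.apply(f);
            });
        }
        return values;
    }

    /** Number of loads performed; exposed only for illustration. */
    int loadCount() {
        return loads.get();
    }
}
```

The returned array is shared between callers, so this only stays safe if nobody writes to it, which matches the read-only assumption discussed in the thread.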
Michael McCandless (JIRA) wrote:
[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582576#action_12582576
]
Michael McCandless commented on LUCENE-831:
-------------------------------------------
I think if we can finally move to having read-only IndexReaders then
they would not sync on this method?
Also, we should still provide the "give me the full array as of
right now" fallback which in a read/write usage would allow you to
spend lots of RAM in order to not synchronize. Of course you'd also
have to update your array (or, periodically ask for a new one) if
you are altering fields.
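The "full array as of right now" fallback could look roughly like the sketch below. MutableIntFieldSketch and its methods are hypothetical names chosen for illustration, not an API from Lucene or the patch. Per-document access on the live data is synchronized, while snapshot() hands out a private copy that the caller can read without locking, at the cost of extra RAM and of going stale once fields are altered.

```java
import java.util.Arrays;

/** Hypothetical read/write field store; not part of Lucene. */
final class MutableIntFieldSketch {
    private final int[] values;

    MutableIntFieldSketch(int[] initial) {
        this.values = initial.clone();
    }

    /** Synchronized read of the live value. */
    synchronized int getIntValue(int docID) {
        return values[docID];
    }

    /** Synchronized write to the live value. */
    synchronized void setIntValue(int docID, int value) {
        values[docID] = value;
    }

    /** Full copy of the data "as of right now"; later writes do not affect it. */
    synchronized int[] snapshot() {
        return Arrays.copyOf(values, values.length);
    }
}
```

A caller that alters fields would need to request a fresh snapshot periodically, as the comment above notes.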
Complete overhaul of FieldCache API/Implementation
--------------------------------------------------
Key: LUCENE-831
URL: https://issues.apache.org/jira/browse/LUCENE-831
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Reporter: Hoss Man
Assignee: Michael Busch
Fix For: 2.4
Attachments: fieldcache-overhaul.032208.diff,
fieldcache-overhaul.diff, fieldcache-overhaul.diff
Motivation:
1) Completely overhaul the API/implementation of "FieldCache"-type
things...
a) eliminate the global static map keyed on IndexReader (thus
eliminating the synch block between completely independent
IndexReaders)
b) allow more customization of cache management (i.e. use
expiration/replacement strategies, disk-backed caches, etc.)
c) allow people to define custom cache data logic (i.e. custom
parsers, complex datatypes, etc... anything tied to a reader)
d) allow people to inspect what's in a cache (list of CacheKeys) for
an IndexReader so a new IndexReader can be likewise warmed
e) lend support for smarter cache management if/when
IndexReader.reopen is added (merging of cached data from
subReaders)
2) Provide backwards compatibility to support the existing FieldCache
API with the new implementation, so there is no redundant caching as
client code migrates to the new API.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]