[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657401#action_12657401 ]
Jeremy Volkman commented on LUCENE-831:
---------------------------------------

A couple of things:
# Looking at the getCachedData method for MultiReader and MultiSegmentReader, it doesn't appear that the CacheData objects produced by merge operations are themselves cached. Is there any reason for this?
# I've written a merge method for StringIndexCacheKey. The process isn't all that complicated (apart from all of the off-by-ones), but it's expensive.

{code:java}
public boolean isMergable() {
  return true;
}

private static class OrderNode {
  int index;
  OrderNode next;
}

public CacheData mergeData(int[] starts, CacheData[] data)
    throws UnsupportedOperationException {
  int[] mergedOrder = new int[starts[starts.length - 1]];
  // Lookup map is 1-based
  String[] mergedLookup = new String[starts[starts.length - 1] + 1];

  // Unwrap cache payloads and flip order arrays
  StringIndex[] unwrapped = new StringIndex[data.length];

  /* Flip the order arrays (reverse indices and values)
   * Since the ord map has a many-to-one relationship with the lookup table,
   * the flipped structure must be one-to-many which results in an array of
   * linked lists.
   */
  OrderNode[][] flippedOrders = new OrderNode[data.length][];
  for (int i = 0; i < data.length; i++) {
    StringIndex si = (StringIndex) data[i].getCachePayload();
    unwrapped[i] = si;
    flippedOrders[i] = new OrderNode[si.lookup.length];
    for (int j = 0; j < si.order.length; j++) {
      OrderNode a = new OrderNode();
      a.index = j;
      a.next = flippedOrders[i][si.order[j]];
      flippedOrders[i][si.order[j]] = a;
    }
  }

  // Lookup map is 1-based
  int[] lookupIndices = new int[unwrapped.length];
  Arrays.fill(lookupIndices, 1);

  int lookupIndex = 0;
  String currentVal;
  int currentSeg;
  while (true) {
    currentVal = null;
    currentSeg = -1;
    int remaining = 0;
    // Find the next ordered value from all the segments
    for (int i = 0; i < unwrapped.length; i++) {
      if (lookupIndices[i] < unwrapped[i].lookup.length) {
        remaining++;
        String that = unwrapped[i].lookup[lookupIndices[i]];
        if (currentVal == null || currentVal.compareTo(that) > 0) {
          currentVal = that;
          currentSeg = i;
        }
      }
    }
    if (remaining == 1) {
      break;
    } else if (remaining == 0) {
      /* The only way this could happen is if there are 0 segments or if
       * all segments have 0 terms. In either case, we can return
       * early.
       */
      return new CacheData(new StringIndex(
          new int[starts[starts.length - 1]], new String[1]));
    }
    if (!currentVal.equals(mergedLookup[lookupIndex])) {
      lookupIndex++;
      mergedLookup[lookupIndex] = currentVal;
    }
    OrderNode a = flippedOrders[currentSeg][lookupIndices[currentSeg]];
    while (a != null) {
      mergedOrder[a.index + starts[currentSeg]] = lookupIndex;
      a = a.next;
    }
    lookupIndices[currentSeg]++;
  }
{code}

> Complete overhaul of FieldCache API/Implementation
> --------------------------------------------------
>
>                 Key: LUCENE-831
>                 URL: https://issues.apache.org/jira/browse/LUCENE-831
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>             Fix For: 3.0
>
>         Attachments: ExtendedDocument.java, fieldcache-overhaul.032208.diff,
> fieldcache-overhaul.diff, fieldcache-overhaul.diff,
> LUCENE-831.03.28.2008.diff, LUCENE-831.03.30.2008.diff,
> LUCENE-831.03.31.2008.diff, LUCENE-831.patch, LUCENE-831.patch,
> LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch
>
>
> Motivation:
> 1) Completely overhaul the API/implementation of "FieldCache" type things...
>     a) eliminate global static map keyed on IndexReader (thus
> eliminating synch block between completely independent IndexReaders)
>     b) allow more customization of cache management (ie: use
> expiration/replacement strategies, disk backed caches, etc)
>     c) allow people to define custom cache data logic (ie: custom
> parsers, complex datatypes, etc... anything tied to a reader)
>     d) allow people to inspect what's in a cache (list of CacheKeys) for
> an IndexReader so a new IndexReader can be likewise warmed.
>     e) Lend support for smarter cache management if/when
> IndexReader.reopen is added (merging of cached data from subReaders).
> 2) Provide backwards compatibility to support existing FieldCache API with
> the new implementation, so there is no redundant caching as client code
> migrates to new API.

--
This message is automatically generated by JIRA.
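To illustrate what the StringIndex merge discussed in the comment has to produce, here is a minimal, self-contained sketch. It is not the patch's code: `SegmentIndex` is a hypothetical stand-in for Lucene's `StringIndex` (an `order` array mapping each doc to a 1-based ordinal, and a sorted `lookup` array whose slot 0 is null), and it swaps in a `TreeMap` to assign merged ordinals instead of the k-way merge over flipped order arrays used in the comment. It assumes segments are concatenated in order, so the doc base of each segment is the sum of the preceding segment sizes.

```java
import java.util.Map;
import java.util.TreeMap;

final class StringIndexMergeSketch {

    /** Hypothetical stand-in for a per-segment StringIndex. */
    static final class SegmentIndex {
        final int[] order;     // doc -> ordinal into lookup; 0 means "no term"
        final String[] lookup; // sorted terms, 1-based; slot 0 is always null

        SegmentIndex(int[] order, String[] lookup) {
            this.order = order;
            this.lookup = lookup;
        }
    }

    /** Merges per-segment indexes into one index over the concatenated docs. */
    static SegmentIndex merge(SegmentIndex[] segments) {
        // Collect the distinct terms of all segments in sorted order.
        TreeMap<String, Integer> ords = new TreeMap<String, Integer>();
        int totalDocs = 0;
        for (SegmentIndex s : segments) {
            totalDocs += s.order.length;
            for (int i = 1; i < s.lookup.length; i++) {
                ords.put(s.lookup[i], 0);
            }
        }

        // Assign 1-based merged ordinals and build the merged lookup table.
        String[] lookup = new String[ords.size() + 1];
        int next = 1;
        for (Map.Entry<String, Integer> e : ords.entrySet()) {
            lookup[next] = e.getKey();
            e.setValue(next++);
        }

        // Remap every doc's segment-local ordinal to the merged ordinal.
        int[] order = new int[totalDocs];
        int docBase = 0;
        for (SegmentIndex s : segments) {
            for (int d = 0; d < s.order.length; d++) {
                int ord = s.order[d];
                order[docBase + d] = (ord == 0) ? 0 : ords.get(s.lookup[ord]);
            }
            docBase += s.order.length;
        }
        return new SegmentIndex(order, lookup);
    }
}
```

This version does one map lookup per document, which is simple but not necessarily cheaper than the linked-list approach; it is meant only to make the expected output of a merge concrete (a single sorted, 1-based lookup table plus a doc-to-ordinal array spanning all segments).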