[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect

2012-03-17 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3867:
---

Lucene Fields: New,Patch Available  (was: New)
 Assignee: (was: Shai Erera)

Wow, what awesome improvements you guys have added!

Uwe, +1 to commit. I unassigned myself - you and Dawid definitely deserve the 
credit!

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER and other constants are incorrect
 --

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch


 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed as follows: 
 NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
 NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
 this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
 {quote}
 A single-dimension array is a single object. As expected, the array has the 
 usual object header. However, this object head is 12 bytes to accommodate a 
 four-byte array length. Then comes the actual array data which, as you might 
 expect, consists of the number of elements multiplied by the number of bytes 
 required for one element, depending on its type. The memory usage for one 
 element is 4 bytes for an object reference ...
 {quote}
 While at it, I wrote a sizeOf(String) impl, and I wonder how people feel 
 about including such helper methods in RUE as static, stateless methods? 
 It's not perfect, and I'm sure there's room for improvement. Here it is:
 {code}
   /**
    * Computes the approximate size of a String object. Note that if this object
    * is also referenced by another object, you should add
    * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
    * method.
    */
   public static int sizeOf(String str) {
     return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
         + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
         + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
         + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
   }
 {code}
 If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
 long[] / double[] ... and String[]).
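
 To make the proposed fix concrete, here is a minimal sketch of the corrected array header and the raw (unaligned) size of a few primitive arrays. The class name ArrayHeaderSketch and the constant values are assumptions chosen to match the 12-byte header in the quote above; a real JVM may use different values.

```java
// Illustrative sketch only; the constant values are assumptions,
// not what RamUsageEstimator detects at runtime.
final class ArrayHeaderSketch {
    static final int NUM_BYTES_OBJECT_HEADER = 8; // assumed header size
    static final int NUM_BYTES_INT = 4;
    static final int NUM_BYTES_LONG = 8;

    // Object header plus the int length field; no NUM_BYTES_OBJECT_REF term.
    static final int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT;

    /** Raw size of a byte[]: header plus one byte per element. */
    static long sizeOf(byte[] arr) {
        return NUM_BYTES_ARRAY_HEADER + (long) arr.length;
    }

    /** Raw size of an int[]: header plus four bytes per element. */
    static long sizeOf(int[] arr) {
        return NUM_BYTES_ARRAY_HEADER + (long) NUM_BYTES_INT * arr.length;
    }

    /** Raw size of a long[]: header plus eight bytes per element. */
    static long sizeOf(long[] arr) {
        return NUM_BYTES_ARRAY_HEADER + (long) NUM_BYTES_LONG * arr.length;
    }
}
```

 The sketch returns long so that very large arrays do not overflow an int, and it deliberately leaves alignment out.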

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect

2012-03-15 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3867:
---

Attachment: LUCENE-3867.patch

Thanks Uwe!

I ran the test, and now with both J9 (IBM) and Oracle, I get this print 
(without enabling any flag):

{code}
[junit] NOTE: running test testReferenceSize
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 8
{code}

* I modified the test name to testReferenceSize (was testCompressedOops).

I wrote this small test to print the differences between sizeOf(String) and 
estimateRamUsage(String):

{code}
  public void testSizeOfString() throws Exception {
    String s = "abcdefgkjdfkdsjdskljfdskfjdsf";
    String sub = s.substring(0, 4);
    System.out.println("original=" + RamUsageEstimator.sizeOf(s));
    System.out.println("sub=" + RamUsageEstimator.sizeOf(sub));
    System.out.println("checkInterned=true(orig): "
        + new RamUsageEstimator().estimateRamUsage(s));
    System.out.println("checkInterned=false(orig): "
        + new RamUsageEstimator(false).estimateRamUsage(s));
    System.out.println("checkInterned=false(sub): "
        + new RamUsageEstimator(false).estimateRamUsage(sub));
  }
{code}

It prints:
{code}
original=104
sub=56
checkInterned=true(orig): 0
checkInterned=false(orig): 98
checkInterned=false(sub): 98
{code}

So clearly estimateRamUsage factors in the sub-string's larger char[]. The 
difference in the sizes of 'orig' stems from AverageGuessMemoryModel, which 
computes the reference size to be 4 (hardcoded) and the array size to be 16 
(hardcoded). I modified AverageGuess to use constants from RUE (they are best 
guesses themselves). The test still printed a difference, but I think that's 
because sizeOf(String) aligns the size to mod 8, while estimateRamUsage 
doesn't. I fixed that in size(Object), and now the prints are the same.

* I also fixed sizeOfArray -- if array.length == 0, it returned 0, but it 
should return the array header size, aligned to mod 8 as well.

* I modified sizeOf(String[]) to sizeOf(Object[]) and compute its raw size 
only. I started to add sizeOf(String), fastSizeOf(String) and 
deepSizeOf(String[]), but reverted to avoid the hassle -- the documentation 
confuses even me :).

* Changed all sizeOf() to return long, and align() to take and return long.
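
The alignment and empty-array changes above can be sketched roughly as follows. This is not the code from the patch: the class name AlignSketch, the 8-byte alignment and the 12-byte array header are assumptions for illustration.

```java
// Illustrative sketch only; ALIGNMENT and ARRAY_HEADER are assumed values.
final class AlignSketch {
    static final int ALIGNMENT = 8;     // assumed object alignment
    static final int ARRAY_HEADER = 12; // assumed array header size

    /** Rounds size up to the next multiple of ALIGNMENT. */
    static long align(long size) {
        long remainder = size % ALIGNMENT;
        return remainder == 0 ? size : size + (ALIGNMENT - remainder);
    }

    /**
     * Shallow size of an array with the given length and element width.
     * An empty array still costs its (aligned) header, never 0.
     */
    static long sizeOfArray(int length, int bytesPerElement) {
        return align(ARRAY_HEADER + (long) length * bytesPerElement);
    }
}
```

Taking and returning long matters for large arrays: with int arithmetic, a long[] of Integer.MAX_VALUE elements would overflow.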

I think this is ready to commit, though I'd appreciate a second look at the 
MemoryModel and size(Obj) changes.

Also, how about renaming MemoryModel methods to: arrayHeaderSize(), 
classHeaderSize(), objReferenceSize() to make them more clear and accurate? For 
instance, getArraySize does not return the size of an array, but its object 
header ...

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
 -

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch



[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect

2012-03-15 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3867:
---

Attachment: LUCENE-3867.patch

Ok removed sizeOf(Object[]). One can compute it by using RUE.estimateRamSize to 
do a deep calculation.

Geez Dawid, you took away all the reasons I originally opened the issue for ;).

But at least AvgGuessMemoryModel and RUE.size() are more accurate now. And we 
have some useful utility methods.

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
 -

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, 
 LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch






[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect

2012-03-14 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3867:
---

Attachment: LUCENE-3867.patch

Patch adds RUE.sizeOf(String) and various sizeOf(arr[]) methods. Also fixes the 
ARRAY_HEADER.

Uwe, I merged with your patch, with one difference -- the System.out prints in 
the test are printed only if VERBOSE.

 RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
 -

 Key: LUCENE-3867
 URL: https://issues.apache.org/jira/browse/LUCENE-3867
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch






[jira] [Updated] (LUCENE-3786) Create SearcherTaxoManager

2012-03-06 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3786:
---

Fix Version/s: (was: 3.6)

Removing the 3.6 Fix Version. If I make it by the release, I'll put it back.

 Create SearcherTaxoManager
 --

 Key: LUCENE-3786
 URL: https://issues.apache.org/jira/browse/LUCENE-3786
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 4.0


 If an application wants to use an IndexSearcher and TaxonomyReader in a 
 SearcherManager-like fashion, it cannot use a separate SearcherManager, and 
 say a TaxonomyReaderManager, because the IndexSearcher and TaxoReader 
 instances need to be in sync. That is, the IS-TR pair must match, or 
 otherwise the category ordinals that are encoded in the search index might 
 not match the ones in the taxonomy index.
 This can happen if someone reopens the IndexSearcher's IndexReader, but does 
 not refresh the TaxonomyReader, and the category ordinals that exist in the 
 reopened IndexReader are not yet visible to the TaxonomyReader instance.
 I'd like to create a SearcherTaxoManager (which is a ReferenceManager) which 
 manages an IndexSearcher and TaxonomyReader pair. Then an application will 
 call:
 {code}
 SearcherTaxoPair pair = manager.acquire();
 try {
   IndexSearcher searcher = pair.searcher;
   TaxonomyReader taxoReader = pair.taxoReader;
   // do something with them
 } finally {
   manager.release(pair);
   pair = null;
 }
 {code}




[jira] [Updated] (LUCENE-3793) Use ReferenceManager in DirectoryTaxonomyReader

2012-03-06 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3793:
---

Fix Version/s: (was: 3.6)

Removing the 3.6 Fix Version. If I make it by the release, I'll put it back.

 Use ReferenceManager in DirectoryTaxonomyReader
 ---

 Key: LUCENE-3793
 URL: https://issues.apache.org/jira/browse/LUCENE-3793
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 4.0


 DirTaxoReader uses hairy code to protect its indexReader instance from 
 being modified while threads use it. It maintains a ReentrantLock 
 (indexReaderLock) which is obtained on every 'read' access, while 
 refresh() locks it for 'write' operations (refreshing the IndexReader). 
 Instead of all that, now that we have ReferenceManager in place, I think 
 that we can write a ReaderManager<IndexReader> which will be used by 
 DirTR. Every method that requires access to the indexReader will 
 acquire/release (not too different from obtaining/releasing the read 
 lock), and refresh() will call ReaderManager.maybeRefresh(). It will 
 simplify the code and remove some rather long comments that go to 
 great lengths to explain why the code looks the way it does. 
 This ReaderManager cannot be used for every IndexReader, because DirTR's
 refresh() logic is special -- it reopens the indexReader, and then
 verifies that the createTime still matches on the reopened reader as
 well. Otherwise, it closes the reopened reader and fails with an exception.
 Therefore, this ReaderManager.refreshIfNeeded will need to take the
 createTime into consideration and fail if they do not match.
 And while we're at it ... I wonder if we should have a manager for an
 IndexReader/ParentArray pair? I think that it makes sense because we
 don't want DirTR to use a ParentArray that does not match the IndexReader.
 Today this can happen in refresh() if e.g. after the indexReader instance
 has been replaced, parentArray.refresh(indexReader) fails. DirTR will be
 left with a newer IndexReader instance, but old (or worse, corrupt?)
 ParentArray ... I think it'll be good if we introduce clone() on ParentArray,
 or a new ctor which takes an int[].
 I'll work on a patch once I finish with LUCENE-3786.
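
 The createTime check described above could look roughly like the sketch below. FakeReader is a hypothetical stand-in, not a Lucene class, and the real refreshIfNeeded signature may differ; the point is only that a mismatching reopened reader is closed and rejected.

```java
// Illustrative sketch only; FakeReader is a hypothetical stand-in for an
// index reader that exposes the taxonomy's creation time.
final class CreateTimeCheckSketch {
    static final class FakeReader {
        final String createTime;
        boolean closed;

        FakeReader(String createTime) {
            this.createTime = createTime;
        }

        void close() {
            closed = true;
        }
    }

    /**
     * Returns the reopened reader if its createTime matches the current one,
     * or null if nothing was reopened. On a mismatch the reopened reader is
     * closed and an exception is thrown.
     */
    static FakeReader refreshIfNeeded(FakeReader current, FakeReader reopened) {
        if (reopened == null) {
            return null; // nothing changed
        }
        if (!current.createTime.equals(reopened.createTime)) {
            reopened.close();
            throw new IllegalStateException("taxonomy was recreated: expected createTime="
                + current.createTime + " but found " + reopened.createTime);
        }
        return reopened;
    }
}
```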




[jira] [Updated] (LUCENE-3138) IW.addIndexes should fail fast if an index is too old/new

2012-03-05 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3138:
---

Fix Version/s: (was: 3.6)

Removing 3.6 Fix Version.

 IW.addIndexes should fail fast if an index is too old/new
 -

 Key: LUCENE-3138
 URL: https://issues.apache.org/jira/browse/LUCENE-3138
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Shai Erera
Priority: Minor
 Fix For: 4.0


 Today IW.addIndexes (both Dir and IR versions) does not check the format of 
 the incoming indexes. Therefore it could add an index that is too old/new, 
 and the app will discover that only later, maybe during commit() or segment 
 merges. We should check that up front and fail fast.
 This issue is relevant only to 4.0 at the moment, which will not support 2.x 
 indexes anymore.
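
 A fail-fast check of this kind might be sketched as below. The class name, the use of a plain major version number and the supported range (3.x and 4.x) are assumptions for illustration; the actual check would read the format from the incoming index.

```java
// Illustrative sketch only; version numbers and names are assumptions.
final class FailFastSketch {
    static final int OLDEST_SUPPORTED_MAJOR = 3; // assumed: 2.x is no longer readable
    static final int CURRENT_MAJOR = 4;

    /** Throws if an index written by the given major version cannot be added. */
    static void checkSupported(int indexMajor) {
        if (indexMajor < OLDEST_SUPPORTED_MAJOR) {
            throw new IllegalArgumentException("index is too old: " + indexMajor + ".x");
        }
        if (indexMajor > CURRENT_MAJOR) {
            throw new IllegalArgumentException("index is too new: " + indexMajor + ".x");
        }
    }

    /**
     * Validates every incoming index before touching any of them, so a bad
     * index is reported up front rather than during commit or a merge.
     * Returns the number of indexes accepted (a stand-in for the real copy).
     */
    static int addIndexes(int... incomingMajors) {
        for (int major : incomingMajors) {
            checkSupported(major);
        }
        return incomingMajors.length;
    }
}
```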




[jira] [Updated] (LUCENE-2921) Now that we track the code version at the segment level, we can stop tracking it also in each file level

2012-03-05 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-2921:
---

Fix Version/s: (was: 3.6)

Removing 3.6 version. Since this requires an index format change, I think that 
it'd be good if we can resolve it by 4.0 Alpha.

 Now that we track the code version at the segment level, we can stop tracking 
 it also in each file level
 

 Key: LUCENE-2921
 URL: https://issues.apache.org/jira/browse/LUCENE-2921
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Shai Erera
 Fix For: 4.0


 Now that we track the code version that created the segment at the segment 
 level, we can stop tracking versions in each file. This has several major 
 benefits:
 # Today the constant names that we use to track versions are confusing - they 
 do not state which version they apply from, and so it's harder to determine 
 which formats we can stop supporting when working on the next major release.
 # Those format numbers are usually negative, but in some cases positive 
 (inconsistency) -- we need to remember to decrement the negative ones, 
 which I always find confusing.
 # It will remove the format tracking from all the *Writers, and the *Reader 
 will receive the code format (String) and work w/ the appropriate constant 
 (e.g. Constants.LUCENE_30). Centralizing version tracking to SegmentInfo is 
 an advantage IMO.
 It's not urgent that we do it for 3.1 (though it requires an index format 
 change), because starting from 3.1 all segments track their version number 
 anyway (or are migrated to track it), so we can safely release it in a 
 follow-on 3.x release.




[jira] [Updated] (LUCENE-3794) DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail)

2012-02-16 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3794:
---

Attachment: LUCENE-3794.patch

Patch fixes the bug + adds a couple of test cases to ensure the correct 
behavior.

 DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing 
 DirTaxoReader.refresh() to falsely succeed (or fail)
 -

 Key: LUCENE-3794
 URL: https://issues.apache.org/jira/browse/LUCENE-3794
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3794.patch


 DirTaxoWriter sets createTime to null after it puts it in the commit data 
 once. But that's wrong because if one calls commit(Map) twice, the second 
 time doesn't record the creation time. Also, in the ctor, if an index exists 
 and OpenMode is not CREATE, the creation time property is not read.
 I wrote a couple of unit tests that assert this, and modified DirTaxoWriter 
 to always record the creation time (in every commit) -- that's the only safe 
 way.
 Will upload a patch shortly.
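
 The "always record the creation time" approach can be sketched as follows. The class name and the key string are hypothetical, not the ones used by DirTaxoWriter; the sketch only shows that the creation time is kept and merged into every commit's user data.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; the class and key names are hypothetical.
final class CommitDataSketch {
    static final String INDEX_CREATE_TIME = "index.create.time"; // assumed key

    // Kept for the writer's whole lifetime; never reset to null after a commit.
    private final String createTime;

    CommitDataSketch(String createTime) {
        this.createTime = createTime;
    }

    /** Merges the caller's commit data with the creation time on every commit. */
    Map<String, String> combinedCommitData(Map<String, String> userData) {
        Map<String, String> combined = new HashMap<>();
        if (userData != null) {
            combined.putAll(userData);
        }
        combined.put(INDEX_CREATE_TIME, createTime);
        return combined;
    }
}
```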




[jira] [Updated] (LUCENE-3761) Generalize SearcherManager

2012-02-13 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3761:
---

Attachment: LUCENE-3761.patch

Updated patch:
- ThingyManager renamed to ReferenceManager
- Declared 'current' volatile (thanks Simon!)
- Added two tests to TestSM. While they could be under a TestReferenceManager 
new class, I didn't think that creating another class + a ReferenceManager 
extension is worth it.
- Added a CHANGES entry under back-compat (following maybeReopen to 
maybeRefresh).

If nobody objects, I'd like to rename maybeRefresh to just refresh, and commit 
it. Otherwise, I'll commit what I have.

I've decided to deal with the SearcherTaxoManager in a different issue.

 Generalize SearcherManager
 --

 Key: LUCENE-3761
 URL: https://issues.apache.org/jira/browse/LUCENE-3761
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3761.patch, LUCENE-3761.patch, LUCENE-3761.patch


 I'd like to generalize SearcherManager to a class which can manage instances 
 of a certain type of interface. The reason is that today SearcherManager 
 knows how to handle IndexSearcher instances. I have a SearcherManager which 
 manages an IndexSearcher and TaxonomyReader pair.
 Recently, a few concurrency bugs were fixed in SearcherManager, and I realized 
 that I need to apply them to my version as well. That led me to wonder: why 
 can't we have an SM version which is generic enough that both my version 
 and Lucene's can benefit from it?
 The way I see SearcherManager, it can be divided into two parts: (1) the part 
 that manages the logic of acquire/release/maybeReopen (i.e., ensureOpen, 
 protect from concurrency stuff etc.), and (2) the part which handles 
 IndexSearcher, or my SearcherTaxoPair. I'm thinking that if we'll have an 
 interface with incRef/decRef/tryIncRef/maybeRefresh, we can make 
 SearcherManager a generic class which handles this interface.
 I will post a patch with the initial idea, and we can continue from there.




[jira] [Updated] (LUCENE-3761) Generalize SearcherManager

2012-02-09 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3761:
---

Attachment: LUCENE-3761.patch

Option #2:

ThingyManager<G> is an abstract class which implements all the concurrency 
administration and exposes the abstract methods tryIncRef(), decRef() and 
refreshIfNeeded().

SearcherManager now extends ThingyManager<IndexSearcher> and implements just 
these 3 methods (in addition to isSearcherCurrent()).

What I like about this approach is that SearcherManager remains a concrete 
class, so that code can reference it and not ThingyManager. Also, IMO it's a 
simplified impl vs. the composite ThingyManager/Thingy. AND besides the rename 
of maybeReopen to maybeRefresh, NONE of the code was affected by this 
refactoring.
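
To illustrate the shape of this option, here is a rough sketch of an abstract generic manager with the three abstract methods named above, plus a toy subclass. It is not the patch: RefManagerSketch, Counted and CountedManager are hypothetical names, and details such as close() and error handling are left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; all class names here are hypothetical.
abstract class RefManagerSketch<G> {
    protected volatile G current;

    protected abstract boolean tryIncRef(G ref);
    protected abstract void decRef(G ref);
    /** Returns a new instance holding one reference, or null if nothing changed. */
    protected abstract G refreshIfNeeded(G ref);

    /** Retries until a reference on the current instance is obtained. */
    final G acquire() {
        G ref;
        do {
            ref = current;
            if (ref == null) {
                throw new IllegalStateException("manager is closed");
            }
        } while (!tryIncRef(ref));
        return ref;
    }

    final void release(G ref) {
        decRef(ref);
    }

    /** Swaps in a refreshed instance if one is available. */
    final synchronized boolean maybeRefresh() {
        G ref = acquire();
        try {
            G newRef = refreshIfNeeded(ref);
            if (newRef == null) {
                return false;
            }
            G old = current;
            current = newRef;
            release(old); // drop the manager's own reference on the old instance
            return true;
        } finally {
            release(ref); // drop the reference taken by acquire() above
        }
    }
}

/** Toy ref-counted resource; starts with one reference owned by the manager. */
final class Counted {
    final int version;
    final AtomicInteger refs = new AtomicInteger(1);

    Counted(int version) {
        this.version = version;
    }
}

final class CountedManager extends RefManagerSketch<Counted> {
    volatile int latestVersion;

    CountedManager(Counted initial) {
        current = initial;
        latestVersion = initial.version;
    }

    @Override
    protected boolean tryIncRef(Counted ref) {
        int count;
        while ((count = ref.refs.get()) > 0) {
            if (ref.refs.compareAndSet(count, count + 1)) {
                return true;
            }
        }
        return false; // already fully released
    }

    @Override
    protected void decRef(Counted ref) {
        ref.refs.decrementAndGet();
    }

    @Override
    protected Counted refreshIfNeeded(Counted ref) {
        return ref.version == latestVersion ? null : new Counted(latestVersion);
    }
}
```

With this split, the subclass contains no concurrency logic of its own beyond the reference counting of its resource.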

I've left the unneeded code commented out in SearcherManager for easy 
comparison, but it should go away. TestSM passes (as well as all core tests), 
so I think that ThingyManager handles all the concurrency cases that 
SearcherManager did. However, it could use another pair of eyes :).

As for the name -- now the name is less important b/c I don't think we'll 
reference ThingyManagers. I lean towards something like ReferenceManager / 
RefCountManager or remove Manager. Something simple. Suggestions are welcome.

 Generalize SearcherManager
 --

 Key: LUCENE-3761
 URL: https://issues.apache.org/jira/browse/LUCENE-3761
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3761.patch, LUCENE-3761.patch






[jira] [Updated] (LUCENE-3761) Generalize SearcherManager

2012-02-08 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3761:
---

Attachment: LUCENE-3761.patch

Initial patch. Introduces a new package 'thingy' (a temporary one, this will 
eventually move to o.a.l.search) with the class ThingyManager, a Thingy 
interface and a SearcherThingy implementation.

As far as I can tell (if there are no bugs), this can replace SearcherManager 
as-is, aside from a 'nocommit' which I know how to handle, but didn't get to it 
yet.

The approach is that ThingyManager receives a Thingy<G> instance and delegates 
calls to it.

Robert and I discussed another approach - have ThingyManager abstract with a 
concrete (final) SearcherManager impl which overrides methods like 
incRef/decRef etc. I haven't tried to impl that approach yet; I think that 
I'll give it a try later.

Oh, and BTW, ThingyManager (even though a cool name!) will not be its final 
name! :). It's just easier to progress like that, without thinking too much 
about the name.

 Generalize SearcherManager
 --

 Key: LUCENE-3761
 URL: https://issues.apache.org/jira/browse/LUCENE-3761
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3761.patch


 I'd like to generalize SearcherManager to a class which can manage instances 
 of a certain type of interface. The reason is that today SearcherManager 
 only knows how to handle IndexSearcher instances. I have a SearcherManager 
 which manages an IndexSearcher and TaxonomyReader pair.
 Recently, a few concurrency bugs were fixed in SearcherManager, and I realized 
 that I need to apply the fixes to my version as well. Which led me to think: 
 why can't we have an SM version which is generic enough that both my version 
 and Lucene's can benefit from it?
 The way I see SearcherManager, it can be divided into two parts: (1) the part 
 that manages the logic of acquire/release/maybeReopen (i.e., ensureOpen, 
 protection from concurrency issues etc.), and (2) the part which handles 
 IndexSearcher, or my SearcherTaxoPair. I'm thinking that if we have an 
 interface with incRef/decRef/tryIncRef/maybeRefresh, we can make 
 SearcherManager a generic class which handles this interface.
 I will post a patch with the initial idea, and we can continue from there.
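
The split described above can be sketched roughly as follows. This is only an 
illustration, not the attached patch: RefCounted, ReferenceManagerSketch and 
CountedResource are made-up names, and the refresh step is passed in as a 
function rather than living on the interface, to keep the sketch short.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

/** Hypothetical interface for a ref-counted resource (names are illustrative). */
interface RefCounted {
  void incRef();
  boolean tryIncRef(); // returns false once the count has dropped to zero
  void decRef();
}

/** Generic part: owns the acquire/release/refresh logic for any RefCounted type. */
final class ReferenceManagerSketch<G extends RefCounted> {
  private volatile G current;

  ReferenceManagerSketch(G initial) {
    this.current = initial;
  }

  /** Returns the current instance with its reference count incremented. */
  G acquire() {
    G ref;
    do {
      ref = current;
      if (ref == null) {
        throw new IllegalStateException("manager is closed");
      }
    } while (!ref.tryIncRef()); // retry if the instance was swapped out and released
    return ref;
  }

  void release(G ref) {
    ref.decRef();
  }

  /** Swaps in the instance returned by the refresher, if it differs from the current one. */
  synchronized boolean maybeRefresh(UnaryOperator<G> refresher) {
    G old = current;
    G fresh = refresher.apply(old);
    if (fresh == null || fresh == old) {
      return false;
    }
    current = fresh;
    old.decRef(); // drop the manager's own reference to the old instance
    return true;
  }
}

/** Type-specific part: a minimal resource used only to exercise the sketch. */
final class CountedResource implements RefCounted {
  final AtomicInteger refCount = new AtomicInteger(1);

  public void incRef() {
    refCount.incrementAndGet();
  }

  public boolean tryIncRef() {
    int count;
    while ((count = refCount.get()) > 0) {
      if (refCount.compareAndSet(count, count + 1)) {
        return true;
      }
    }
    return false;
  }

  public void decRef() {
    refCount.decrementAndGet();
  }
}
```

With this shape, a searcher/taxonomy pair would only need to implement the 
small interface; the concurrency-sensitive code stays in one place.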

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3703) DirectoryTaxonomyReader.refresh misbehaves with ref counts

2012-01-19 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3703:
---

Attachment: LUCENE-3703.patch

Patch addresses Doron's comments.

 DirectoryTaxonomyReader.refresh misbehaves with ref counts
 --

 Key: LUCENE-3703
 URL: https://issues.apache.org/jira/browse/LUCENE-3703
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3703.patch, LUCENE-3703.patch


 DirectoryTaxonomyReader uses the internal IndexReader in order to track its 
 own reference counting. However, when you call refresh(), it reopens the 
 internal IndexReader, and from that point, all previous reference counting 
 gets lost (since the new IndexReader's refCount is 1).
 The solution is to track reference counting in DTR itself. I wrote a simple 
 unit test which exposes the bug (will be attached with the patch shortly).
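
A rough sketch of the idea, assuming simplified stand-in classes 
(InnerReaderStub and SelfCountingReader are illustrative names, not the real 
DTR code): the wrapper owns the counter, so reopening the inner reader cannot 
reset it.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for the internal reader; a freshly opened one always starts at refCount 1. */
final class InnerReaderStub {
  final AtomicInteger refCount = new AtomicInteger(1);
  boolean closed;

  void close() {
    closed = true;
  }
}

/** Hypothetical taxonomy-reader-like wrapper that keeps its own reference count. */
final class SelfCountingReader {
  private final AtomicInteger refCount = new AtomicInteger(1); // owned by the wrapper
  private InnerReaderStub inner = new InnerReaderStub();

  void incRef() {
    refCount.incrementAndGet();
  }

  synchronized void decRef() {
    if (refCount.decrementAndGet() == 0) {
      inner.close(); // last reference gone: release the inner reader
    }
  }

  int getRefCount() {
    return refCount.get();
  }

  /** Reopens the inner reader; the wrapper's own count is left untouched. */
  synchronized void refresh() {
    InnerReaderStub old = inner;
    inner = new InnerReaderStub();
    old.close();
  }

  synchronized boolean isInnerClosed() {
    return inner.closed;
  }
}
```

Had the count been delegated to the inner reader, the two extra references 
taken before refresh() would have been forgotten after the reopen.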




[jira] [Updated] (LUCENE-3703) DirectoryTaxonomyReader.refresh misbehaves with ref counts

2012-01-18 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3703:
---

Attachment: LUCENE-3703.patch

Patch fixes the bug by tracking the reference count in DTR itself. It also adds 
a test which covers that bug.

While at it, I fixed close() to synchronize on this when the instance is not 
already closed. Otherwise, when two threads call close() concurrently, one of 
them might fail in decRef().

I think it's ready to commit, will wait until tomorrow for review.

 DirectoryTaxonomyReader.refresh misbehaves with ref counts
 --

 Key: LUCENE-3703
 URL: https://issues.apache.org/jira/browse/LUCENE-3703
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3703.patch


 DirectoryTaxonomyReader uses the internal IndexReader in order to track its 
 own reference counting. However, when you call refresh(), it reopens the 
 internal IndexReader, and from that point, all previous reference counting 
 gets lost (since the new IndexReader's refCount is 1).
 The solution is to track reference counting in DTR itself. I wrote a simple 
 unit test which exposes the bug (will be attached with the patch shortly).




[jira] [Updated] (LUCENE-3649) Facet userguide link is broken after ant javadocs-all

2012-01-01 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3649:
---

Attachment: LUCENE-3649.patch

Patch against 3x:
* Move docs/ under src/java/org/apache/lucene/facet/doc-files -- that way the 
javadocs tool takes these files as they are
* Fix references to the userguide in overview.html and o.a.l.facet/package.html.
* Remove 'javadocs' target from facet/build.xml.

I will commit this shortly.

 Facet userguide link is broken after ant javadocs-all
 ---

 Key: LUCENE-3649
 URL: https://issues.apache.org/jira/browse/LUCENE-3649
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3649.patch, LUCENE-3649.patch


 Spinoff from 
 http://mail-archives.apache.org/mod_mbox/lucene-java-user/201112.mbox/%3CCAO9cvUaZePZ3faWo==xx7x8-5+snwlsbdqqjo_n-ycxr0lj...@mail.gmail.com%3E.
 When javadocs-all is run, the userguide is not copied at all, and therefore 
 the link is broken. Two options: inline the userguide in 
 package/overview.html or fix the Ant target to copy the userguide correctly.
 Thanks Lukas for reporting this.




[jira] [Updated] (LUCENE-3637) Make IndexReader.decRef() call refCount.decrementAndGet instead of getAndDecrement

2011-12-11 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3637:
---

Attachment: LUCENE-3637.patch

Very trivial patch. If there are no objections, I'd like to commit this.

 Make IndexReader.decRef() call refCount.decrementAndGet instead of 
 getAndDecrement
 --

 Key: LUCENE-3637
 URL: https://issues.apache.org/jira/browse/LUCENE-3637
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Reporter: Shai Erera
Priority: Trivial
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3637.patch


 IndexReader.decRef() has this code:
 {code}
 final int rc = refCount.getAndDecrement();
 if (rc == 1) {
 {code}
 I think it would be clearer if it were written like this:
 {code}
 final int rc = refCount.decrementAndGet();
 if (rc == 0) {
 {code}
 It's a very simple change, which makes reading the code (at least IMO) 
 easier. Will post a patch shortly.
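
To make the equivalence of the two forms concrete, here is a small 
self-contained sketch. DecRefForms and its method names are made up for the 
illustration; only the AtomicInteger calls come from the snippets above.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class DecRefForms {
  /** Current form: compare the value before the decrement against 1. */
  static boolean lastRefReleasedOld(AtomicInteger refCount) {
    final int rc = refCount.getAndDecrement();
    return rc == 1;
  }

  /** Proposed form: compare the value after the decrement against 0. */
  static boolean lastRefReleasedNew(AtomicInteger refCount) {
    final int rc = refCount.decrementAndGet();
    return rc == 0;
  }
}
```

Both methods decrement exactly once and report true only for the caller that 
released the last reference; the second simply reads as "did we reach zero?".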




[jira] [Updated] (LUCENE-3635) Allow setting arbitrary objects on PerfRunData

2011-12-10 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3635:
---

Attachment: LUCENE-3635.patch

Patch (against trunk) adds a perfObjects Map<String, Object> with matching 
set/get methods.

 Allow setting arbitrary objects on PerfRunData
 --

 Key: LUCENE-3635
 URL: https://issues.apache.org/jira/browse/LUCENE-3635
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/benchmark
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3635.patch


 PerfRunData is used as the intermediary object between PerfRunTasks. Just 
 like we can set IndexReader/Writer on it, it would be good if it allowed 
 setting other arbitrary objects that are e.g. created by one task and used by 
 another.
 A recent example is the enhancement to the benchmark package following the 
 addition of the facet module. We had to add TaxoReader/Writer.
 The proposal is to add a HashMap<String, Object> that custom PerfTasks can 
 set()/get(). I do not propose to move IR/IW/TR/TW etc. into that map. If 
 however people think that we should, I can do that as well.




[jira] [Updated] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-07 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3620:
---

Attachment: LUCENE-3620-trunk.patch

Patch adds the test to TestFilterIndexReader. Uwe asked that I do not commit 
these changes (test + FIR/IR fixes) until he merges in the branch on 
IR-read-only. We decided that Uwe will apply that patch to the branch, fix 
FIR/IR there and merge the branch afterwards.

 FilterIndexReader does not override all of IndexReader methods
 --

 Key: LUCENE-3620
 URL: https://issues.apache.org/jira/browse/LUCENE-3620
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/search
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3620-trunk.patch, LUCENE-3620.patch, 
 LUCENE-3620.patch, LUCENE-3620.patch


 FilterIndexReader does not override all of IndexReader's methods. We've hit an 
 error in LUCENE-3573 (and fixed it). So I thought to write a simple test 
 which asserts that FIR overrides all methods of IR (and we can filter out 
 methods that we don't think it should override). The test is very simple 
 (attached), and it currently fails over these methods:
 {code}
 getRefCount
 incRef
 tryIncRef
 decRef
 reopen
 reopen
 reopen
 reopen
 clone
 numDeletedDocs
 document
 setNorm
 setNorm
 termPositions
 deleteDocument
 deleteDocuments
 undeleteAll
 getIndexCommit
 getUniqueTermCount
 getTermInfosIndexDivisor
 {code}
 I didn't yet fix anything in FIR -- if you spot a method that you think we 
 should not override and delegate, please comment.
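
For illustration, a reflection-based check of this kind could look roughly 
like the sketch below. It is not the attached test: OverrideChecker and the 
toy BaseReader/FilteringReader classes are made up, and missing methods are 
reported by name only, so overloads collapse into one entry (unlike the list 
above, where reopen appears four times).

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

final class OverrideChecker {
  /**
   * Returns the names of overridable methods declared by base that sub does not
   * redeclare, skipping any names listed in allowedMissing.
   */
  static Set<String> missingOverrides(Class<?> base, Class<?> sub, Set<String> allowedMissing) {
    Set<String> missing = new TreeSet<>();
    for (Method m : base.getDeclaredMethods()) {
      int mod = m.getModifiers();
      if (Modifier.isStatic(mod) || Modifier.isFinal(mod) || Modifier.isPrivate(mod)
          || m.isSynthetic() || allowedMissing.contains(m.getName())) {
        continue; // cannot or need not be overridden
      }
      try {
        sub.getDeclaredMethod(m.getName(), m.getParameterTypes());
      } catch (NoSuchMethodException e) {
        missing.add(m.getName());
      }
    }
    return missing;
  }
}

// Toy classes standing in for IndexReader / FilterIndexReader.
class BaseReader {
  public int numDocs() { return 0; }
  public void incRef() {}
  public final void ensureOpen() {}
}

class FilteringReader extends BaseReader {
  @Override public int numDocs() { return 1; }
}
```

Here the check would report incRef as missing, unless that name is passed in 
the allowed set.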




[jira] [Updated] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-06 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3620:
---

Attachment: LUCENE-3620.patch

Attached patch against 3x adds the test to TestFilterIndexReader.

Even if there are methods which you don't think need to be overridden by FIR, I 
prefer that we still override them and call super.<method>(), with a comment 
explaining why we don't delegate.

 FilterIndexReader does not override all of IndexReader methods
 --

 Key: LUCENE-3620
 URL: https://issues.apache.org/jira/browse/LUCENE-3620
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/search
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3620.patch


 FilterIndexReader does not override all of IndexReader's methods. We've hit an 
 error in LUCENE-3573 (and fixed it). So I thought to write a simple test 
 which asserts that FIR overrides all methods of IR (and we can filter out 
 methods that we don't think it should override). The test is very simple 
 (attached), and it currently fails over these methods:
 {code}
 getRefCount
 incRef
 tryIncRef
 decRef
 reopen
 reopen
 reopen
 reopen
 clone
 numDeletedDocs
 document
 setNorm
 setNorm
 termPositions
 deleteDocument
 deleteDocuments
 undeleteAll
 getIndexCommit
 getUniqueTermCount
 getTermInfosIndexDivisor
 {code}
 I didn't yet fix anything in FIR -- if you spot a method that you think we 
 should not override and delegate, please comment.




[jira] [Updated] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-06 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3620:
---

Attachment: LUCENE-3620.patch

Patch against 3x:
* Adds a HashSet of methods that should not be overridden by FilterIndexReader.
** If a method appears there and is not overridden, it is an error.
** If a method appears there and is overridden, it is an error as well.
* Override more methods by FIR.

see previous comment for the rest of the methods.

 FilterIndexReader does not override all of IndexReader methods
 --

 Key: LUCENE-3620
 URL: https://issues.apache.org/jira/browse/LUCENE-3620
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/search
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3620.patch, LUCENE-3620.patch


 FilterIndexReader does not override all of IndexReader's methods. We've hit an 
 error in LUCENE-3573 (and fixed it). So I thought to write a simple test 
 which asserts that FIR overrides all methods of IR (and we can filter out 
 methods that we don't think it should override). The test is very simple 
 (attached), and it currently fails over these methods:
 {code}
 getRefCount
 incRef
 tryIncRef
 decRef
 reopen
 reopen
 reopen
 reopen
 clone
 numDeletedDocs
 document
 setNorm
 setNorm
 termPositions
 deleteDocument
 deleteDocuments
 undeleteAll
 getIndexCommit
 getUniqueTermCount
 getTermInfosIndexDivisor
 {code}
 I didn't yet fix anything in FIR -- if you spot a method that you think we 
 should not override and delegate, please comment.




[jira] [Updated] (LUCENE-3603) jar-src fails if ${build.dir} does not exist

2011-11-28 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3603:
---

Attachment: LUCENE-3603.patch

Patch fixes jar-src to:
* not depend on init, as there's no need to compile anything (saves time)
* create ${build.dir}

I've decided not to modify the build.dir definitions in the other build.xmls 
for now, as it's more delicate.

I intend to commit this soon.

 jar-src fails if ${build.dir} does not exist
 

 Key: LUCENE-3603
 URL: https://issues.apache.org/jira/browse/LUCENE-3603
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3603.patch


 Simple fix -- make jar-src depend on a target which creates the build.dir. 
 Also, I noticed that build.dir is set in multiple places across our 
 build.xmls, so I'd like to improve that a bit (minor fixes as well).




[jira] [Updated] (LUCENE-3583) benchmark tests always fail on windows because directory cannot be removed

2011-11-21 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3583:
---

Attachment: LUCENE-3583.patch

Patch fixes the problem in LineDocSourceTest - add tasks.close() (otherwise LDS 
keeps a reader open on the file).

I intend to commit shortly, after verifying all tests pass and no other such 
changes are required.

 benchmark tests always fail on windows because directory cannot be removed
 --

 Key: LUCENE-3583
 URL: https://issues.apache.org/jira/browse/LUCENE-3583
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.5, 4.0
 Environment: Only fails for Lucene trunk
Reporter: Uwe Schindler
 Attachments: LUCENE-3583.patch, LUCENE-3583.patch, 
 benchmark-test-output.txt, io-event-log.txt


 This seems to be a bug recently introduced. I have no idea what's wrong. 
 Attached is a log file; it reproduces every time.




[jira] [Updated] (LUCENE-3269) Speed up Top-K sampling tests

2011-11-14 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3269:
---

Attachment: LUCENE-3269.patch

Patch introduces the following:
* HashMap<Integer, SearchTaxoDirPair> which is initialized in beforeClass and 
maps a partition size to the pair of Directories.
* initIndex first checks the map for the partition size, and creates the 
indexes only if no matching pair is found.

The sampling tests do not benefit from that directly, as they only run one test 
method. However, if they run in the same JVM, they will reuse the 
already created indexes.
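
The caching scheme can be sketched as below. This is an illustration under 
assumptions, not the patch: DirPairSketch and IndexCacheSketch are made-up 
names, directories are represented by plain strings, and the map is created in 
a static initializer instead of beforeClass.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical pair of directories (search index + taxonomy index). */
final class DirPairSketch {
  final String searchDir;
  final String taxoDir;

  DirPairSketch(String searchDir, String taxoDir) {
    this.searchDir = searchDir;
    this.taxoDir = taxoDir;
  }
}

final class IndexCacheSketch {
  // Keyed by partition size; static so it is shared by all tests in one JVM.
  private static final Map<Integer, DirPairSketch> DIRS_PER_PARTITION_SIZE = new HashMap<>();

  /** Returns the cached pair for this partition size, building it only on a miss. */
  static synchronized DirPairSketch initIndex(int partitionSize, IntFunction<DirPairSketch> builder) {
    DirPairSketch pair = DIRS_PER_PARTITION_SIZE.get(partitionSize);
    if (pair == null) {
      pair = builder.apply(partitionSize); // the expensive indexing step
      DIRS_PER_PARTITION_SIZE.put(partitionSize, pair);
    }
    return pair;
  }
}
```

Two test methods asking for the same partition size would then pay the 
indexing cost only once.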

Patch is against 3x and seems trivial, so I intend to commit this later today 
or tomorrow if there are no objections.

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch, 
 LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.




[jira] [Updated] (LUCENE-3556) Make DirectoryTaxonomyWriter's indexWriter member private

2011-11-03 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3556:
---

Attachment: LUCENE-3556.patch

Trivial patch against trunk. I'd like to commit this shortly.

 Make DirectoryTaxonomyWriter's indexWriter member private
 -

 Key: LUCENE-3556
 URL: https://issues.apache.org/jira/browse/LUCENE-3556
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Trivial
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3556.patch


 DirectoryTaxonomyWriter has a protected indexWriter member. As far as I can 
 tell, for two reasons:
 # protected openIndexWriter method which lets you open your own IW (e.g. with 
 a custom IndexWriterConfig).
 # protected closeIndexWriter which is a hook for letting you close the IW you 
 opened in the previous one.
 The fixes are trivial IMO:
 # Modify the method to return IW, and have the calling code set DTW's 
 indexWriter member
 # Eliminate closeIW. DTW already has a protected closeResources() which lets 
 you clean other resources you've allocated, so I think that's enough.
 I'll post a patch shortly.
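
A small sketch of the first fix, with stand-in classes (WriterStub, 
TaxoWriterSketch and CustomTaxoWriter are illustrative, not the real DTW): the 
protected hook returns the writer, and only the base class assigns the now 
private member.

```java
/** Stand-in for IndexWriter. */
class WriterStub {
  final String config;

  WriterStub(String config) {
    this.config = config;
  }
}

/** Hypothetical taxonomy-writer-like class with a private writer member. */
class TaxoWriterSketch {
  private final WriterStub indexWriter; // no longer visible to subclasses

  TaxoWriterSketch() {
    // The calling code assigns the member; the hook only returns the writer.
    this.indexWriter = openIndexWriter();
  }

  /** Extension hook: subclasses may return a writer with a custom configuration. */
  protected WriterStub openIndexWriter() {
    return new WriterStub("default");
  }

  String writerConfig() {
    return indexWriter.config;
  }
}

class CustomTaxoWriter extends TaxoWriterSketch {
  @Override
  protected WriterStub openIndexWriter() {
    return new WriterStub("custom");
  }
}
```

Note that in this sketch the hook runs from the base constructor, so an 
override should not rely on fields of the subclass being initialized yet.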




[jira] [Updated] (LUCENE-3552) TaxonomyReader/Writer and their Lucene* implementation

2011-11-02 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3552:
---

Attachment: LUCENE-3552.patch

Patch renames LTW/R to DirectoryTW/TR. Also, I renamed LTW's 
openLuceneIndex/closeLuceneIndex to open/closeIndexWriter.

I've also made TW extend TwoPhaseCommit.

I think that it's ready to commit. I'll port the changes to trunk afterwards. 
I'll wait until tomorrow before I commit (the changes are trivial).

 TaxonomyReader/Writer and their Lucene* implementation
 --

 Key: LUCENE-3552
 URL: https://issues.apache.org/jira/browse/LUCENE-3552
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Trivial
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3552.patch


 The facet module contains two interfaces TaxonomyWriter and TaxonomyReader, 
 with two implementations Lucene*. We've never actually implemented two 
 TaxonomyWriters/Readers, so I'm not sure if these interfaces are useful 
 anymore. Therefore I'd like to propose that we do either of the following:
 # Remove the interfaces and remove the Lucene part from the implementation 
 classes (to end up with TW/TR impls). Or,
 # Keep the interfaces, but rename the Lucene* impls to Directory*.
 Whatever we do, I'd like to make the impls/interfaces also implement 
 TwoPhaseCommit.
 Any preferences?




[jira] [Updated] (LUCENE-3549) Remove DocumentBuilder interface from facet module

2011-11-01 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3549:
---

Attachment: LUCENE-3549.patch

Patch against 3x (but easy to apply on trunk as well).

I will commit this soon.

 Remove DocumentBuilder interface from facet module
 --

 Key: LUCENE-3549
 URL: https://issues.apache.org/jira/browse/LUCENE-3549
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Trivial
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3549.patch


 The facet module contains an interface called DocumentBuilder, which contains 
 a single method, build(Document) (it's a builder API). We use it in my 
 company to standardize how different modules populate a Document object. I've 
 included it with the facet contribution so that things will compile with as 
 few code changes as possible.
 Now it's time to do some cleanup and I'd like to start with this interface. 
 If people think that this interface is useful to reside in 'core', then I 
 don't mind moving it there. But otherwise, let's remove it from the code. It 
 has only one impl in the facet module: CategoryDocumentBuilder, and we can 
 certainly do without the interface.
 Moreover, it's under the o.a.l package, which is inappropriate IMO. If it's 
 moved to 'core', it should be under o.a.l.document.
 If people see any problem with that, please speak up. I will do the changes 
 and post a patch here shortly.




[jira] [Updated] (LUCENE-3522) TermsFilter.getDocIdSet(context) NPE on missing field

2011-10-17 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3522:
---

Fix Version/s: 3.5

Added 3.5 as a fix version as well

 TermsFilter.getDocIdSet(context) NPE on missing field
 -

 Key: LUCENE-3522
 URL: https://issues.apache.org/jira/browse/LUCENE-3522
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/other
Affects Versions: 4.0
Reporter: Dan Climan
Assignee: Michael McCandless
Priority: Minor
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3522.patch


 If the context does not contain the field for a term when calling 
 TermsFilter.getDocIdSet(AtomicReaderContext context) then a 
 NullPointerException is thrown due to not checking for null Terms before 
 getting iterator.
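
The shape of the missing guard can be illustrated with a sketch that does not 
use the Lucene API at all: a plain map stands in for the per-segment reader, 
and a lookup of an absent field returns null. MissingFieldSketch and 
matchingTerms are made-up names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class MissingFieldSketch {
  /**
   * Collects the wanted terms that exist in the given field. A field that is
   * absent from the map yields null, much like asking for the terms of a field
   * the segment does not contain.
   */
  static List<String> matchingTerms(Map<String, List<String>> fields, String field, List<String> wanted) {
    List<String> terms = fields.get(field);
    if (terms == null) {
      return Collections.emptyList(); // field absent here: nothing can match
    }
    List<String> hits = new ArrayList<>();
    for (Iterator<String> it = terms.iterator(); it.hasNext();) {
      String term = it.next();
      if (wanted.contains(term)) {
        hits.add(term);
      }
    }
    return hits;
  }
}
```

Without the null check, the call to terms.iterator() is where the 
NullPointerException would surface.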




[jira] [Updated] (LUCENE-3522) TermsFilter.getDocIdSet(context) NPE on missing field

2011-10-17 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3522:
---

Fix Version/s: (was: 3.5)

Ah. I thought that we need the Fix Version to properly track which issues are 
part of a release. But you're right - if this bug didn't exist in 3.x, then we 
better not mark that it was fixed there.

 TermsFilter.getDocIdSet(context) NPE on missing field
 -

 Key: LUCENE-3522
 URL: https://issues.apache.org/jira/browse/LUCENE-3522
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/other
Affects Versions: 4.0
Reporter: Dan Climan
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3522.patch


 If the context does not contain the field for a term when calling 
 TermsFilter.getDocIdSet(AtomicReaderContext context) then a 
 NullPointerException is thrown due to not checking for null Terms before 
 getting iterator.




[jira] [Updated] (LUCENE-3485) LuceneTaxonomyReader .decRef() may close the inner IR, renderring the LTR in a limbo.

2011-10-04 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3485:
---

Lucene Fields: New,Patch Available  (was: New)
Fix Version/s: 4.0
   3.5

 LuceneTaxonomyReader .decRef() may close the inner IR, renderring the LTR in 
 a limbo.
 -

 Key: LUCENE-3485
 URL: https://issues.apache.org/jira/browse/LUCENE-3485
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Affects Versions: 3.4
Reporter: Gilad Barkai
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.5, 4.0

 Attachments: LUCENE-3485.patch


 TaxonomyReader, which supports ref-counting, has a decRef() method which 
 delegates to an inner IndexReader and calls its .decRef(). The latter may 
 close the reader (if the ref count reaches zero), but the taxonomy would 
 remain 'open', and many of its method calls would then fail.
 Also, the LTR's .close() method does not work in the same manner as 
 IndexReader's, which calls decRef() and leaves the real closing logic to 
 decRef(). I believe this should be the right approach for the fix.
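
The proposed close()/decRef() relationship can be sketched as follows, using a 
made-up class (RefCountedReaderSketch) in place of the real LTR; a boolean 
flag stands in for closing the inner IndexReader.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical ref-counted reader where close() only releases the owner's reference. */
final class RefCountedReaderSketch {
  private final AtomicInteger refCount = new AtomicInteger(1);
  private volatile boolean closed;         // set once close() has been called
  private volatile boolean resourcesFreed; // set when the count reaches zero

  void incRef() {
    ensureOpen();
    refCount.incrementAndGet();
  }

  /** The real closing logic lives here, so it runs no matter who drops the last ref. */
  void decRef() {
    if (refCount.decrementAndGet() == 0) {
      resourcesFreed = true; // stands in for closing the inner IndexReader
    }
  }

  /** Idempotent: the first call gives up the owner's reference, later calls do nothing. */
  synchronized void close() {
    if (!closed) {
      closed = true;
      decRef();
    }
  }

  boolean isFreed() {
    return resourcesFreed;
  }

  private void ensureOpen() {
    if (refCount.get() <= 0) {
      throw new IllegalStateException("reader is closed");
    }
  }
}
```

With this arrangement the reader and its inner resource are released 
together, so the wrapper cannot be left 'open' over a closed inner reader.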
