[ 
https://issues.apache.org/jira/browse/LUCENE-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12531510
 ] 

Yonik Seeley commented on LUCENE-1012:
--------------------------------------

> We could just fix the javadocs to match the current approach?
That sounds like the right approach.
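
For illustration, here is a toy model of why the two readings differ. It is not Lucene code: the class MaxMergeDocsSketch and the method mergeAll are hypothetical names, and it assumes the "current approach" means maxMergeDocs only decides which segments are still eligible as merge inputs. Under that assumption a merged segment can end up larger than maxMergeDocs, which is what the test in the issue observes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model only (not Lucene's LogDocMergePolicy): maxMergeDocs gates merge inputs, not outputs. */
public class MaxMergeDocsSketch {

  /**
   * Repeatedly merges the first run of mergeFactor adjacent segments in which
   * every segment holds at most maxMergeDocs docs. The merged size is NOT capped.
   */
  static List<Integer> mergeAll(List<Integer> docCounts, int mergeFactor, int maxMergeDocs) {
    if (mergeFactor < 2) {
      throw new IllegalArgumentException("mergeFactor must be >= 2");
    }
    List<Integer> segs = new ArrayList<>(docCounts); // work on a copy
    boolean merged = true;
    while (merged) {
      merged = false;
      for (int start = 0; start + mergeFactor <= segs.size(); start++) {
        int total = 0;
        boolean eligible = true;
        for (int i = start; i < start + mergeFactor; i++) {
          if (segs.get(i) > maxMergeDocs) { // too large to take part in a merge
            eligible = false;
            break;
          }
          total += segs.get(i);
        }
        if (eligible) {
          // Replace the run with a single segment holding all of its docs.
          for (int i = 0; i < mergeFactor; i++) {
            segs.remove(start);
          }
          segs.add(start, total);
          merged = true;
          break; // rescan from the beginning
        }
      }
    }
    return segs;
  }

  public static void main(String[] args) {
    // Twelve flushed segments of 30 docs each, mergeFactor 3, maxMergeDocs 50.
    List<Integer> flushed = new ArrayList<>(Collections.nCopies(12, 30));
    List<Integer> result = mergeAll(flushed, 3, 50);
    System.out.println(result);                  // prints [90, 90, 90, 90]
    System.out.println(Collections.max(result)); // prints 90, larger than maxMergeDocs
  }
}
```

In this model every input to a merge respects the limit, yet each resulting segment holds 90 docs. If that matches the real behavior, the javadoc would need to describe the largest segment that may still be merged, not the largest segment allowed.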

> Problems with maxMergeDocs parameter
> ------------------------------------
>
>                 Key: LUCENE-1012
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1012
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Michael Busch
>            Priority: Minor
>             Fix For: 2.3
>
>
> I found two possible problems regarding IndexWriter's maxMergeDocs value. I'm 
> using the following code to test maxMergeDocs:
> {code:java} 
>   public void testMaxMergeDocs() throws IOException {
>     final int maxMergeDocs = 50;
>     final int numSegments = 40;
>     
>     MockRAMDirectory dir = new MockRAMDirectory();
>     IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
>     writer.setMergePolicy(new LogDocMergePolicy());
>     writer.setMaxMergeDocs(maxMergeDocs);
>     Document doc = new Document();
>     doc.add(new Field("field", "aaa", Field.Store.YES, Field.Index.TOKENIZED,
>         Field.TermVector.WITH_POSITIONS_OFFSETS));
>     for (int i = 0; i < numSegments * maxMergeDocs; i++) {
>       writer.addDocument(doc);
>       //writer.flush();      // uncomment to avoid the DocumentsWriter bug
>     }
>     writer.close();
>     
>     new SegmentInfos.FindSegmentsFile(dir) {
>       protected Object doBody(String segmentFileName)
>           throws CorruptIndexException, IOException {
>         SegmentInfos infos = new SegmentInfos();
>         infos.read(directory, segmentFileName);
>         for (int i = 0; i < infos.size(); i++) {
>           assertTrue(infos.info(i).docCount <= maxMergeDocs);
>         }
>         return null;
>       }
>     }.run();
>   }
> {code} 
>   
> - It seems that DocumentsWriter does not obey the maxMergeDocs parameter. If 
> I don't flush manually, then the index only contains one segment at the end 
> and the test fails.
> - If I flush manually after each addDocument() call, then the index contains 
> more segments. But some segments still contain more docs than 
> maxMergeDocs, e.g. 55 vs. 50. The javadoc in IndexWriter says:
> {code:java}
>   /**
>    * Returns the largest number of documents allowed in a
>    * single segment.
>    *
>    * @see #setMaxMergeDocs
>    */
>   public int getMaxMergeDocs() {
>     return getLogDocMergePolicy().getMaxMergeDocs();
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

