[jira] Created: (LUCENE-1509) IndexCommit.getFileNames() should not return dups
IndexCommit.getFileNames() should not return dups
-------------------------------------------------

                 Key: LUCENE-1509
                 URL: https://issues.apache.org/jira/browse/LUCENE-1509
             Project: Lucene - Java
          Issue Type: Bug
          Components: Index
    Affects Versions: 2.4, 2.9
            Reporter: Michael McCandless
            Assignee: Michael McCandless
            Priority: Minor
             Fix For: 2.9


If the index was created with autoCommit false, and more than one segment was flushed during the IndexWriter session, then the shared doc-store files are incorrectly duplicated in IndexCommit.getFileNames(). This is because that method walks through each SegmentInfo, appending its files to a list. Since multiple SegmentInfos may share the doc-store files, this causes dups.

To fix this, I've added a SegmentInfos.files(...) method, and refactored all places that were computing their files one SegmentInfo at a time to use this new method instead.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
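[Editor's note: the dedup described in the issue can be sketched with plain collections. The segment file names and helper class below are illustrative stand-ins, not the real SegmentInfos internals: appending per-segment file lists duplicates the shared doc-store files, while collecting into a Set keeps each name once.]

```java
import java.util.*;

// Illustrative sketch of the LUCENE-1509 fix idea. Segment names and the
// shared doc-store files (_a.fdt/_a.fdx) are made up for the example.
public class DedupFiles {
    // Each "segment" reports its files, including shared doc-store files.
    static List<List<String>> segments = Arrays.asList(
        Arrays.asList("_0.fnm", "_0.tis", "_a.fdt", "_a.fdx"),
        Arrays.asList("_1.fnm", "_1.tis", "_a.fdt", "_a.fdx"));

    // Old approach: appending per-segment lists duplicates shared files.
    static List<String> filesWithDups() {
        List<String> files = new ArrayList<>();
        for (List<String> seg : segments) files.addAll(seg);
        return files;
    }

    // Fixed approach (the idea behind SegmentInfos.files(...)): a Set
    // naturally drops the duplicates contributed by shared doc stores.
    static Collection<String> filesDeduped() {
        Set<String> files = new LinkedHashSet<>();
        for (List<String> seg : segments) files.addAll(seg);
        return files;
    }

    public static void main(String[] args) {
        System.out.println(filesWithDups().size());  // 8: _a.fdt/_a.fdx twice
        System.out.println(filesDeduped().size());   // 6: shared files once
    }
}
```

LinkedHashSet is used here so the deduped collection also preserves first-seen order, which callers iterating the old list would have observed.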
[jira] Updated: (LUCENE-1509) IndexCommit.getFileNames() should not return dups
[ https://issues.apache.org/jira/browse/LUCENE-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1509:
---------------------------------------
    Attachment: LUCENE-1509.patch

Attached patch. I plan to commit in a day or two.

> IndexCommit.getFileNames() should not return dups
> -------------------------------------------------
>
>                 Key: LUCENE-1509
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1509
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.4, 2.9
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1509.patch
[jira] Commented: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660322#action_12660322 ]

Mark Miller commented on LUCENE-1483:
-------------------------------------

So what looks like a promising strategy? Off the top I am thinking something as simple as:

start with ORD with no fallback on the largest.
if the next segments are fairly large, use ORD_VAL
if the segments get somewhat smaller, move to ORD_DEM

Oddly, I've seen VAL perform well in certain situations, so maybe it has its place, but I don't know where yet.

> Change IndexSearcher multisegment searches to search each individual segment
> using a single HitCollector
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch,
> LUCENE-1483.patch, LUCENE-1483.patch, sortBench.py, sortCollate.py
>
> FieldCache and Filters are forced down to a single segment reader, allowing
> for individual segment reloading on reopen.
[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660322#action_12660322 ]

markrmil...@gmail.com edited comment on LUCENE-1483 at 1/2/09 6:24 AM:
----------------------------------------------------------------------

So what looks like a promising strategy? Off the top I am thinking something as simple as:

start with ORD with no fallback on the largest.
if the next segments are fairly large, use ORD_VAL
if the segments get somewhat smaller, move to ORD_DEM

Oddly, I've seen VAL perform well in certain situations, so maybe it has its place, but I don't know where yet.

*edit* Oh, yeah, queue size should also play a role in the switching

was (Author: markrmil...@gmail.com):

So what looks like a promising strategy? Off the top I am thinking something as simple as:

start with ORD with no fallback on the largest.
if the next segments are fairly large, use ORD_VAL
if the segments get somewhat smaller, move to ORD_DEM

Oddly, I've seen VAL perform well in certain situations, so maybe it has its place, but I don't know where yet.
[jira] Created: (LUCENE-1510) InstantiatedIndexReader throws NullPointerException in norms() when used with a MultiReader
InstantiatedIndexReader throws NullPointerException in norms() when used with a MultiReader
-------------------------------------------------------------------------------------------

                 Key: LUCENE-1510
                 URL: https://issues.apache.org/jira/browse/LUCENE-1510
             Project: Lucene - Java
          Issue Type: Bug
          Components: contrib/*
    Affects Versions: 2.4
            Reporter: Robert Newson


When using InstantiatedIndexReader under a MultiReader where the other Reader contains documents, a NullPointerException is thrown here:

  public void norms(String field, byte[] bytes, int offset) throws IOException {
    byte[] norms = getIndex().getNormsByFieldNameAndDocumentNumber().get(field);
    System.arraycopy(norms, 0, bytes, offset, norms.length);
  }

The 'norms' variable is null. Performing the copy only when norms is not null does work, though I'm sure it's not the right fix.

java.lang.NullPointerException
        at org.apache.lucene.store.instantiated.InstantiatedIndexReader.norms(InstantiatedIndexReader.java:297)
        at org.apache.lucene.index.MultiReader.norms(MultiReader.java:273)
        at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:70)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:131)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:112)
        at org.apache.lucene.search.Searcher.search(Searcher.java:136)
        at org.apache.lucene.search.Searcher.search(Searcher.java:146)
        at org.apache.lucene.store.instantiated.TestWithMultiReader.test(TestWithMultiReader.java:41)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at junit.framework.TestCase.runTest(TestCase.java:164)
        at junit.framework.TestCase.runBare(TestCase.java:130)
        at junit.framework.TestResult$1.protect(TestResult.java:106)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.framework.TestResult.run(TestResult.java:109)
        at junit.framework.TestCase.run(TestCase.java:120)
        at junit.framework.TestSuite.runTest(TestSuite.java:230)
        at junit.framework.TestSuite.run(TestSuite.java:225)
        at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
        at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)
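[Editor's note: the workaround the reporter describes (copying only when norms is non-null) looks roughly like the following. The map and class here are simplified stand-ins so the guard is runnable in isolation, not the actual InstantiatedIndexReader internals.]

```java
import java.util.*;

// Sketch of the reported workaround: guard the arraycopy when the field has
// no norms in this reader. Stand-in names, not real Lucene classes.
public class NormsGuard {
    // Stand-in for getIndex().getNormsByFieldNameAndDocumentNumber().
    static Map<String, byte[]> normsByField = new HashMap<>();

    static void norms(String field, byte[] bytes, int offset) {
        byte[] norms = normsByField.get(field);
        if (norms == null) return; // field absent here: skip the copy (the
                                   // reporter notes this avoids the NPE but
                                   // may not be the right fix)
        System.arraycopy(norms, 0, bytes, offset, norms.length);
    }

    public static void main(String[] args) {
        byte[] out = new byte[4];
        norms("missing", out, 0);                 // no NPE: copy is skipped
        normsByField.put("body", new byte[]{1, 2});
        norms("body", out, 2);
        System.out.println(Arrays.toString(out)); // [0, 0, 1, 2]
    }
}
```

A fuller fix would presumably fill the destination range with the default norm value for missing fields rather than leaving it untouched, so that MultiReader sees consistent norms; the guard above only demonstrates the NPE avoidance.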
[jira] Updated: (LUCENE-1510) InstantiatedIndexReader throws NullPointerException in norms() when used with a MultiReader
[ https://issues.apache.org/jira/browse/LUCENE-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Newson updated LUCENE-1510:
----------------------------------
    Attachment: TestWithMultiReader.java

Test case to demonstrate NPE.

> InstantiatedIndexReader throws NullPointerException in norms() when used with
> a MultiReader
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-1510
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1510
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/*
>    Affects Versions: 2.4
>            Reporter: Robert Newson
>         Attachments: TestWithMultiReader.java
Too many open files
Hello. I'm struggling with the following exception:

Exception in thread "Lucene Merge Thread #1037"
org.apache.lucene.index.MergePolicy$MergeException: java.io.FileNotFoundException: /home/plopes/aktwise/server-commons/data/lucenedata/_80c.tii (Too many open files)
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
Caused by: java.io.FileNotFoundException: /home/plopes/aktwise/server-commons/data/lucenedata/_80c.tii (Too many open files)

I have only one IndexWriter instantiated, and every time I update the index (add or remove a document), I commit the IndexWriter and run the following code in order to make the searcher aware of the new documents:

  private synchronized void refreshSearcher() throws CorruptIndexException, IOException {
    try {
      IndexReader reader = searcher.getIndexReader().reopen();
      searcher.close();
      searcher = new IndexSearcher(reader);
      if (reader != searcher.getIndexReader())
        searcher.getIndexReader().close();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

I have tried lowering the merge factor and the number of threads by executing the following:

  ((ConcurrentMergeScheduler) writer.getMergeScheduler()).setMaxThreadCount(1);
  writer.setMergeFactor(3);

I am using Lucene 2.4.0 and there is only one thread manipulating the index. Any help would be appreciated.

Nuno Seco
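[Editor's note: one plausible source of the handle leak in the posted code is that after `searcher = new IndexSearcher(reader)`, the comparison `reader != searcher.getIndexReader()` is always false, so the previous reader is never closed and its files stay open. A sketch of a reopen pattern that does close the old reader, using minimal stand-in classes (not the Lucene API) so the lifecycle is visible without Lucene on the classpath:]

```java
// Stand-in model of IndexReader.reopen(): it returns a NEW reader only when
// the index has changed; the OLD reader must then be closed by the caller.
public class ReopenSketch {
    static class Reader {
        boolean closed;
        Reader reopen(boolean changed) { return changed ? new Reader() : this; }
        void close() { closed = true; }
    }
    static class Searcher {
        final Reader reader;
        Searcher(Reader r) { reader = r; }
    }

    static Searcher searcher = new Searcher(new Reader());

    static synchronized void refreshSearcher(boolean indexChanged) {
        Reader oldReader = searcher.reader;
        Reader newReader = oldReader.reopen(indexChanged);
        if (newReader != oldReader) {      // reopen produced a fresh reader...
            searcher = new Searcher(newReader);
            oldReader.close();             // ...so release the old one's files
        }
    }

    public static void main(String[] args) {
        Reader first = searcher.reader;
        refreshSearcher(true);
        System.out.println(first.closed);             // true: no handle leak
        System.out.println(searcher.reader == first); // false: swapped
    }
}
```

The key ordering difference from the posted code: compare the reopened reader against the reader it was reopened from, and close that old reader only when reopen actually returned a new one; also avoid `searcher.close()` before the new searcher exists, since other threads may still be searching.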
Re: Too many open files
I have just noticed that I subscribed to the wrong list. I meant to subscribe and email the Java User List. Sorry.

Nuno Seco wrote:
> Hello. I'm struggling with the following exception:
> [...]