[jira] Commented: (LUCENE-887) Interruptible segment merges
[ https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544724 ]

Michael Busch commented on LUCENE-887:
--------------------------------------

close(false) does not "kill" a merge thread, right? So there is actually no guarantee that close(false) will return within x minutes?

E.g. if a cascading merge is running, then close(false) will wait for the current merge job to finish and then abort, meaning it will not perform the other planned merge jobs of the cascading merge? But if the current merge job is very big, with huge segments involved, then it can still take a long time for close(false) to return?

> Interruptible segment merges
> ----------------------------
>
> Key: LUCENE-887
> URL: https://issues.apache.org/jira/browse/LUCENE-887
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Attachments: ExtendedIndexWriter.java
>
>
> Adds the ability to IndexWriter to interrupt an ongoing merge. This might be
> necessary when Lucene is e.g. running as a service and has to stop indexing
> within a certain period of time due to a shutdown request.
> A solution would be to add a new method shutdown() to IndexWriter which
> satisfies the following two requirements:
> - if a merge is happening, abort it
> - flush the buffered docs but do not trigger a merge
> See also discussions about this feature on java-dev:
> http://www.gossamer-threads.com/lists/lucene/java-dev/49008

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544730 ]

Michael Busch commented on LUCENE-1044:
---------------------------------------

I think changing the only constructor in FSDirectory.FSIndexOutput is an API change. I have a class that extends FSIndexOutput, and it no longer compiles after switching to the 2.3-dev jar.

I think we should put this constructor back:

    public FSIndexOutput(File path) throws IOException {
      this(path, DEFAULT_DO_SYNC);
    }

> Behavior on hard power shutdown
> -------------------------------
>
> Key: LUCENE-1044
> URL: https://issues.apache.org/jira/browse/LUCENE-1044
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Environment: Windows Server 2003, Standard Edition, Sun Hotspot Java 1.5
> Reporter: venkat rangan
> Assignee: Michael McCandless
> Fix For: 2.3
> Attachments: LUCENE-1044.patch, LUCENE-1044.take2.patch, LUCENE-1044.take3.patch
>
>
> When indexing a large number of documents, upon a hard power failure (e.g.
> pull the power cord), the index seems to get corrupted. We start a Java
> application as a Windows Service, and feed it documents. In some cases
> (after an index size of 1.7GB, with 30-40 index segment .cfs files), the
> following is observed.
> The 'segments' file contains only zeros. Its size is 265 bytes - all bytes
> are zeros.
> The 'deleted' file also contains only zeros. Its size is 85 bytes - all bytes
> are zeros.
> Before corruption, the segments file and deleted file appear to be correct.
> After this corruption, the index is corrupted and lost.
> This is a problem observed in Lucene 1.4.3. We are not able to upgrade our
> customer deployments to 1.9 or later version, but would be happy to back-port
> a patch, if the patch is small enough and if this problem is already solved.
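The compatibility concern raised in the comment above can be illustrated with a minimal sketch. It is not the Lucene source: `SketchIndexOutput`, `LegacyOutput`, and `CtorCompatDemo` are hypothetical stand-ins, and the value chosen for `DEFAULT_DO_SYNC` is an assumption, since the thread does not state it. The point is only that keeping the old one-argument constructor, delegating to the new two-argument one, lets existing subclasses compile unchanged.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical stand-in for FSDirectory.FSIndexOutput; not the real Lucene class.
class SketchIndexOutput {
    // Assumed default; the real value of DEFAULT_DO_SYNC is not given in this thread.
    static final boolean DEFAULT_DO_SYNC = true;

    private final RandomAccessFile file;
    final boolean doSync;

    // Old signature, kept so that existing subclasses still compile.
    public SketchIndexOutput(File path) throws IOException {
        this(path, DEFAULT_DO_SYNC);
    }

    // New signature carrying the extra flag.
    public SketchIndexOutput(File path, boolean doSync) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
        this.doSync = doSync;
    }

    public void close() throws IOException {
        if (doSync) {
            file.getFD().sync(); // ask the OS to push buffered bytes to the device
        }
        file.close();
    }
}

// A subclass written against the old one-argument constructor.
class LegacyOutput extends SketchIndexOutput {
    LegacyOutput(File path) throws IOException {
        super(path);
    }
}

class CtorCompatDemo {
    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("sketch", ".bin");
        tmp.deleteOnExit();
        LegacyOutput out = new LegacyOutput(tmp);
        System.out.println("doSync = " + out.doSync);
        out.close();
    }
}
```

If the one-argument constructor were removed, `LegacyOutput` would fail to compile at the `super(path)` call, which matches the breakage described in the comment.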
[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544758 ]

Michael McCandless commented on LUCENE-1044:
--------------------------------------------

Woops, OK I will put it back ...
[jira] Commented: (LUCENE-887) Interruptible segment merges
[ https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544769 ]

Michael McCandless commented on LUCENE-887:
-------------------------------------------

I believe close(false) marks all still-running merges as aborted (calls OneMerge.abort()) and then returns immediately without waiting for them to finish?

However, you're right, those merge threads are not "killed", in that they keep running until they "notice" they were aborted (typically when the merge tries to commit its results). Right now we don't make any effort to have the merging process notice sooner that it was aborted and abort the thread.

If we could somehow reach in & find all FSIndexOutputs that are presently opened by SegmentMerger, and forcefully close them (forcing an IOException which ConcurrentMergeScheduler catches & ignores if the merge was aborted), that'd give us a fast way to have the threads stop working.

Oh, one issue is that we are failing to call setDaemon(true) on these threads, which means the JVM will not exit until they complete. I'll fix that.
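The behavior described in this comment (mark the merge as aborted, let the thread notice on its own, and make the thread a daemon so it cannot hold up JVM exit) can be sketched as follows. This is a minimal illustration, not Lucene code: `SketchMerge` and `AbortDemo` are hypothetical names, and checking the flag between units of work is an assumption about how "noticing sooner" might look. Only `Thread.setDaemon` is a real API taken from the discussion.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical merge job; stands in for the merge work discussed above.
class SketchMerge implements Runnable {
    private final AtomicBoolean aborted = new AtomicBoolean(false);
    private final int totalSteps;
    volatile int stepsDone = 0;
    volatile boolean committed = false;

    SketchMerge(int totalSteps) {
        this.totalSteps = totalSteps;
    }

    // Only sets a flag; the running thread is not killed.
    void abort() {
        aborted.set(true);
    }

    @Override
    public void run() {
        for (int i = 0; i < totalSteps; i++) {
            if (aborted.get()) {
                return; // notice the abort between units of work
            }
            stepsDone = i + 1;
        }
        if (!aborted.get()) {
            committed = true; // results are committed only if never aborted
        }
    }
}

class AbortDemo {
    static Thread startDaemon(Runnable r) {
        Thread t = new Thread(r, "sketch-merge");
        t.setDaemon(true); // a daemon thread does not keep the JVM alive
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        SketchMerge merge = new SketchMerge(Integer.MAX_VALUE);
        Thread t = startDaemon(merge);
        merge.abort(); // returns immediately, like the flag-setting described above
        t.join();
        System.out.println("committed = " + merge.committed);
    }
}
```

How quickly the thread stops depends on how often it checks the flag. A merge that only checks at commit time, as described above, would keep doing I/O until then.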
[jira] Created: (LUCENE-1064) Make TopDocs constructor public
Make TopDocs constructor public
-------------------------------

Key: LUCENE-1064
URL: https://issues.apache.org/jira/browse/LUCENE-1064
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Affects Versions: 2.2
Environment: All
Reporter: Shai Erera

The TopDocs constructor is package visible. This prevents instantiating it from outside this package. For example, I wrote a HitCollector that couldn't extend directly from TopDocCollector. I need to create a new TopDocs instance; however, since the c'tor is package visible, I can't do that.

For now, I completely duplicated the code, but I hope you'll fix it soon.
[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544803 ]

Doron Cohen commented on LUCENE-1044:
-------------------------------------

{quote}
I'll look into the separate thread to sync/close files in the background next...
{quote}

I was wondering if delaying sync to the actual commit point would run faster than a background thread. I thought it would, because the background thread, though it does not hold up the current thread from continuing to index, does force the sync *now* rather than letting the IO subsystem actually write stuff in its own time. I was also hoping that by doing them later, some of the syncs would become no-ops, and hence faster.

I found out, however, that delaying the syncs (but intending to sync) also means keeping the file handles open, and therefore this is not a practical approach. Still, it was interesting to compare.

So... my small test sequentially writes M characters to N files and either does not sync (just closes), or syncs in one of three ways: (1) at the end, (2) immediately, (3) in a background thread. The results (in millis) on my Windows XP were:

|| num files || num chars per file || No Sync || Sync At End || Background Sync || Immediate Sync ||
| 100 | 1 | 631 | 5778 | 5729 | 5828 |
| 100 | 1 | 581 | 4486 | 4117 | 4687 |
| 1000 | 1000 | 1612 | 38996 | 34900 | 35852 |
| 1000 | 1000 | 1432 | 37153 | 35051 | 37263 |
| 1 | 100 | 10335 | 154262 | 162103 | 174251 |
| 1 | 100 | 11276 | 147752 | 159480 | 222450 |

Each configuration ran twice and there are fluctuations, but it is obvious (as Mike noticed) that no-sync is much faster than sync. In fact, in my test no-sync is at least 7 times faster than any sync approach, while in Mike's test, which uses Lucene, the penalty is smaller. The difference might be because in my test there is no CPU work involved, just IO.

Comparing "immediate" to "background", it is not clearly worth it to add a background thread (unless Mike's test proves otherwise..)
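For readers who want to reproduce the general shape of this comparison, here is a rough sketch covering only the "no sync" and "immediate sync" variants. It is not the attached FSyncPerfTest.java, whose contents are not shown in this thread. `SyncTimingSketch` and `writeFiles` are hypothetical names, the sketch writes each file in one call rather than character by character, and it uses try-with-resources, which is newer than the Java 1.5 mentioned in the issue. `FileDescriptor.sync()` is the standard JDK call for forcing file contents to the device.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

class SyncTimingSketch {
    // Writes numFiles files of numChars bytes each into dir and
    // returns the elapsed wall-clock time in milliseconds.
    static long writeFiles(File dir, int numFiles, int numChars, boolean sync)
            throws IOException {
        byte[] data = new byte[numChars];
        Arrays.fill(data, (byte) 'x');
        long start = System.nanoTime();
        for (int i = 0; i < numFiles; i++) {
            File f = new File(dir, "f" + i + (sync ? ".sync" : ".nosync"));
            try (FileOutputStream out = new FileOutputStream(f)) {
                out.write(data);
                if (sync) {
                    out.getFD().sync(); // "immediate sync": force before close
                }
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("sync-sketch").toFile();
        System.out.println("no sync:        " + writeFiles(dir, 100, 1000, false) + " ms");
        System.out.println("immediate sync: " + writeFiles(dir, 100, 1000, true) + " ms");
        // Clean up the temporary files and directory.
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                f.delete();
            }
        }
        dir.delete();
    }
}
```

Timings will vary widely by operating system, file system, and disk write-cache settings, so the absolute numbers from such a sketch should not be compared with the table above.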
[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544811 ]

Doron Cohen commented on LUCENE-1044:
-------------------------------------

With some artificial CPU activity added to the test program:

|| num files || num chars per file || No Sync || Sync At End || Background Sync || Immediate Sync ||
| 100 | 1 | 6690 | 11516 | 10706 | 11216 |
| 100 | 1 | 7200 | 11006 | 10575 | 10846 |
| 1000 | 1000 | 8002 | 48570 | 48479 | 51825 |
| 1000 | 1000 | 7801 | 43142 | 43693 | 43342 |
| 1 | 100 | 16303 | 152730 | 326810 | 207939 |
| 1 | 100 | 17805 | 156375 | 160040 | 165398 |
[jira] Issue Comment Edited: (LUCENE-1044) Behavior on hard power shutdown
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544812 ]

doronc edited comment on LUCENE-1044 at 11/22/07 6:16 AM:
----------------------------------------------------------

Attached FSyncPerfTest.java is the standalone (non Lucene) perf test that I used.

was (Author: doronc):
The standalone (non Lucene) perf test that I used.
[jira] Updated: (LUCENE-1044) Behavior on hard power shutdown
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doron Cohen updated LUCENE-1044:
-------------------------------

Attachment: FSyncPerfTest.java

The standalone (non Lucene) perf test that I used.
[jira] Commented: (LUCENE-887) Interruptible segment merges
[ https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544854 ]

Michael Busch commented on LUCENE-887:
--------------------------------------

{quote}
I believe close(false) marks all still-running merges as aborted (calls OneMerge.abort()) and then returns immediately without waiting for them to finish?
{quote}

OK, I got it. So if Lucene runs as a service and a shutdown request is received, then a call to close(false) will cause the IndexWriter to mark the running merges as aborted, flush the buffer, and return once flush+commit is done. The caller then knows that it is safe to shut down the JVM, which will also stop the running merge thread (because it's a daemon thread now).

We should probably add a testcase that tests such a shutdown scenario.

{quote}
If we could somehow reach in & find all FSIndexOutputs that are presently opened by SegmentMerger, and forcefully close them (forcing an IOException which ConcurrentMergeScheduler catches & ignores if the merge was aborted) that'd give us a fast way to have the threads stop working.
{quote}

The fact that the background merge thread keeps running doesn't hurt us, but stopping it sooner would reduce system load and thus possibly speed up the flush+commit that the other thread is doing. Maybe for now we could also set the priority of the background thread to a minimum value as soon as close(false) is called?

{quote}
Oh, one issue is we are failing to setDaemon(true) on these threads, which means the JVM will not exit until they complete. I'll fix that.
{quote}

Cool, thanks! Without this, the shutdown scenario explained above wouldn't work.
[jira] Updated: (LUCENE-584) Decouple Filter from BitSet
[ https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Elschot updated LUCENE-584:
-------------------------------

Attachment: (was: Matcher-20071122-1ground.patch)

> Decouple Filter from BitSet
> ---------------------------
>
> Key: LUCENE-584
> URL: https://issues.apache.org/jira/browse/LUCENE-584
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Affects Versions: 2.0.1
> Reporter: Peter Schäfer
> Priority: Minor
> Attachments: bench-diff.txt, bench-diff.txt, Matcher-20070905-1ground.patch, Matcher-20070905-2default.patch, Matcher-20070905-3core.patch, Matcher-20071008-1ground.patch, Some Matchers.zip
>
>
> {code}
> package org.apache.lucene.search;
> public abstract class Filter implements java.io.Serializable
> {
>   public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
> }
> public interface AbstractBitSet
> {
>   public boolean get(int index);
> }
> {code}
> It would be useful if the method =Filter.bits()= returned an abstract
> interface, instead of =java.util.BitSet=.
> Use case: there is a very large index, and, depending on the user's
> privileges, only a small portion of the index is actually visible.
> Sparsely populated =java.util.BitSet=s are not efficient and waste lots of
> memory. It would be desirable to have an alternative BitSet implementation
> with smaller memory footprint.
> Though it _is_ possible to derive classes from =java.util.BitSet=, it was
> obviously not designed for that purpose.
> That's why I propose to use an interface instead. The default implementation
> could still delegate to =java.util.BitSet=.
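To make the use case in the issue description concrete, here is a small sketch of the proposed interface with two implementations. The `AbstractBitSet` interface follows the description (with modifiers trimmed). `SortedIntBitSet`, `JavaUtilBitSetAdapter`, and `BitSetSketchDemo` are hypothetical names that do not come from the attached patches. A sorted int array is just one possible sparse representation, chosen here for brevity.

```java
import java.util.Arrays;
import java.util.BitSet;

// Interface as proposed in the issue description.
interface AbstractBitSet {
    boolean get(int index);
}

// Hypothetical sparse implementation: stores only the set indices, sorted.
// Memory grows with the number of set bits, not with the highest index.
final class SortedIntBitSet implements AbstractBitSet {
    private final int[] sortedDocs;

    SortedIntBitSet(int... docs) {
        sortedDocs = docs.clone();
        Arrays.sort(sortedDocs);
    }

    @Override
    public boolean get(int index) {
        return Arrays.binarySearch(sortedDocs, index) >= 0;
    }
}

// Default implementation that delegates to java.util.BitSet,
// as the description suggests.
final class JavaUtilBitSetAdapter implements AbstractBitSet {
    private final BitSet bits;

    JavaUtilBitSetAdapter(BitSet bits) {
        this.bits = bits;
    }

    @Override
    public boolean get(int index) {
        return bits.get(index);
    }
}

class BitSetSketchDemo {
    public static void main(String[] args) {
        // Three visible documents in an index of several million.
        AbstractBitSet sparse = new SortedIntBitSet(5_000_000, 42, 7);
        System.out.println(sparse.get(42));        // prints "true"
        System.out.println(sparse.get(6));         // prints "false"

        BitSet dense = new BitSet();
        dense.set(3);
        AbstractBitSet adapted = new JavaUtilBitSetAdapter(dense);
        System.out.println(adapted.get(3));        // prints "true"
    }
}
```

The trade-off is lookup cost: the sorted array answers `get` in O(log n) over the set bits, while `java.util.BitSet` answers in constant time but allocates space up to the highest set index.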
[jira] Updated: (LUCENE-584) Decouple Filter from BitSet
[ https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Elschot updated LUCENE-584:
-------------------------------

Attachment: Matcher-20071122-1ground.patch

Resolved a local conflict in the javadocs of HitCollector.
[jira] Updated: (LUCENE-584) Decouple Filter from BitSet
[ https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Elschot updated LUCENE-584:
-------------------------------

Attachment: (was: Matcher-20070905-1ground.patch)
[jira] Updated: (LUCENE-584) Decouple Filter from BitSet
[ https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Elschot updated LUCENE-584:
-------------------------------

Attachment: Matcher-20071122-1ground.patch

Resolved a local conflict in the javadocs of HitCollector.
This time with licence granted to ASF.
[jira] Updated: (LUCENE-584) Decouple Filter from BitSet
[ https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Elschot updated LUCENE-584:
-------------------------------

Attachment: (was: Matcher-20071008-1ground.patch)
[jira] Updated: (LUCENE-1064) Make TopDocs constructor public
[ https://issues.apache.org/jira/browse/LUCENE-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-1064:
------------------------------

Attachment: TopDocs-patch

The simple patch details
Making TopDocs constructor public
Hi,

I opened an issue in JIRA on making the TopDocs constructor public (https://issues.apache.org/jira/browse/LUCENE-1064). I think it's a very small change and is required by applications that need to write their own HitCollector and create a corresponding TopDocs (but can't extend TopDocCollector).

What's your opinion on that?

Shai Erera