date:20071122

[jira] Commented: (LUCENE-887) Interruptible segment merges

2007-11-22 Thread Michael Busch (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544724
 ] 

Michael Busch commented on LUCENE-887:
--

close(false) does not "kill" a merge thread, right? So there is
actually no guarantee that close(false) will return within x minutes?

E.g. if a cascading merge is running, then close(false) will wait
for the current merge job to finish and then abort, meaning it will
not perform the other planned merge jobs of the cascading merge?
But if the current merge job is very big, with huge segments
involved, then it can still take a long time for close(false) to
return?

> Interruptible segment merges
> 
>
> Key: LUCENE-887
> URL: https://issues.apache.org/jira/browse/LUCENE-887
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Attachments: ExtendedIndexWriter.java
>
>
> Adds to IndexWriter the ability to interrupt an ongoing merge. This might be 
> necessary when Lucene is, e.g., running as a service and has to stop indexing 
> within a certain period of time due to a shutdown request.
> A solution would be to add a new method shutdown() to IndexWriter which 
> satisfies the following two requirements:
> - if a merge is happening, abort it
> - flush the buffered docs but do not trigger a merge 
> See also discussions about this feature on java-dev:
> http://www.gossamer-threads.com/lists/lucene/java-dev/49008

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-22 Thread Michael Busch (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544730
 ] 

Michael Busch commented on LUCENE-1044:
---

I think changing the only constructor in FSDirectory.FSIndexOutput is
an API change. I have a class that extends FSIndexOutput, and it 
no longer compiles after switching to the 2.3-dev jar.

I think we should put this constructor back:
public FSIndexOutput(File path) throws IOException {
  this(path, DEFAULT_DO_SYNC);
}
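
The compatibility concern can be illustrated with a minimal sketch. The class names below (SketchIndexOutput, LegacyOutput) are hypothetical stand-ins, not the real FSDirectory.FSIndexOutput, and the value chosen for DEFAULT_DO_SYNC is only an assumption for the example. It shows why restoring a one-argument constructor that delegates to the new two-argument one lets existing subclasses compile unchanged.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical stand-in for FSDirectory.FSIndexOutput; not the real Lucene class.
class SketchIndexOutput {
    // Assumed default, chosen only for illustration.
    static final boolean DEFAULT_DO_SYNC = true;

    final File path;
    final boolean doSync;

    // Restored one-argument constructor: delegates to the newer one,
    // so subclasses written against the old API keep compiling.
    SketchIndexOutput(File path) throws IOException {
        this(path, DEFAULT_DO_SYNC);
    }

    // Newer constructor that takes the extra flag.
    SketchIndexOutput(File path, boolean doSync) throws IOException {
        this.path = path;
        this.doSync = doSync;
    }
}

// A subclass written against the old API: it only knows the one-argument form.
class LegacyOutput extends SketchIndexOutput {
    LegacyOutput(File path) throws IOException {
        super(path);
    }
}
```

Without the one-argument constructor, the `super(path)` call in LegacyOutput would fail to compile, which is the breakage described above.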

> Behavior on hard power shutdown
> ---
>
> Key: LUCENE-1044
> URL: https://issues.apache.org/jira/browse/LUCENE-1044
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Environment: Windows Server 2003, Standard Edition, Sun Hotspot Java 1.5
> Reporter: venkat rangan
> Assignee: Michael McCandless
> Fix For: 2.3
>
> Attachments: LUCENE-1044.patch, LUCENE-1044.take2.patch, 
> LUCENE-1044.take3.patch
>
>
> When indexing a large number of documents, upon a hard power failure (e.g. 
> pulling the power cord), the index seems to get corrupted. We start a Java 
> application as a Windows Service, and feed it documents. In some cases 
> (after an index size of 1.7GB, with 30-40 index segment .cfs files), the 
> following is observed.
> The 'segments' file contains only zeros. Its size is 265 bytes - all bytes 
> are zeros.
> The 'deleted' file also contains only zeros. Its size is 85 bytes - all bytes 
> are zeros.
> Before corruption, the segments file and deleted file appear to be correct. 
> After this corruption, the index is corrupted and lost.
> This is a problem observed in Lucene 1.4.3. We are not able to upgrade our 
> customer deployments to 1.9 or later version, but would be happy to back-port 
> a patch, if the patch is small enough and if this problem is already solved.

[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-22 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544758
 ] 

Michael McCandless commented on LUCENE-1044:


Woops, OK I will put it back ...

[jira] Commented: (LUCENE-887) Interruptible segment merges

2007-11-22 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544769
 ] 

Michael McCandless commented on LUCENE-887:
---


I believe close(false) marks all still-running merges as aborted
(calls OneMerge.abort()) and then returns immediately without waiting
for them to finish?

However, you're right, those merge threads are not "killed" in that
they keep running until they "notice" they were aborted (typically
when the merge tries to commit its results).  Right now we don't make
any effort to have the merging process notice sooner that it was
aborted and abort the thread.

If we could somehow reach in & find all FSIndexOutputs that are
presently opened by SegmentMerger, and forcefully close them (forcing
an IOException which ConcurrentMergeScheduler catches & ignores if the
merge was aborted) that'd give us a fast way to have the threads stop
working.

Oh, one issue is we are failing to setDaemon(true) on these threads,
which means the JVM will not exit until they complete.  I'll fix
that.
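
To make the "notice" behaviour concrete, here is a small sketch of cooperative abort. SketchMerge is a hypothetical class, not Lucene's OneMerge, and the chunk loop only stands in for real merge work. With checkEachChunk set to false, the job sees the abort flag only when it is about to commit, as described above; with it set to true, the job polls the flag between chunks and stops sooner.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical merge job; illustrates cooperative abort, not Lucene's OneMerge.
class SketchMerge implements Runnable {
    private final AtomicBoolean aborted = new AtomicBoolean(false);
    private final int totalChunks;
    private final boolean checkEachChunk;

    volatile int chunksDone = 0;        // how much work was actually performed
    volatile boolean committed = false; // whether the result was published

    SketchMerge(int totalChunks, boolean checkEachChunk) {
        this.totalChunks = totalChunks;
        this.checkEachChunk = checkEachChunk;
    }

    // Marks the merge as aborted; returns immediately without waiting.
    void abort() {
        aborted.set(true);
    }

    @Override
    public void run() {
        for (int i = 0; i < totalChunks; i++) {
            if (checkEachChunk && aborted.get()) {
                return; // noticed early: stop before doing more work
            }
            chunksDone++; // stands in for copying one chunk of segment data
        }
        if (aborted.get()) {
            return; // noticed only at commit time: work was done, result discarded
        }
        committed = true;
    }
}
```

In both variants an aborted merge never commits; the difference is only how much work is wasted before the thread stops.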


[jira] Created: (LUCENE-1064) Make TopDocs constructor public

2007-11-22 Thread Shai Erera (JIRA)

Make TopDocs constructor public
---

 Key: LUCENE-1064
 URL: https://issues.apache.org/jira/browse/LUCENE-1064
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Affects Versions: 2.2
 Environment: All
Reporter: Shai Erera


The TopDocs constructor is package visible, which prevents instantiating it from 
outside its package. For example, I wrote a HitCollector that couldn't extend 
TopDocCollector directly. I need to create a new TopDocs instance; however, 
since the constructor is package visible, I can't do that.
For now, I completely duplicated the code, but I hope you'll fix it soon.

[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-22 Thread Doron Cohen (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544803
 ] 

Doron Cohen commented on LUCENE-1044:
-

{quote}
I'll look into the separate thread to sync/close files in the
background next...
{quote}

I was wondering if delaying the sync to the actual commit point would run faster
than a background thread. I thought it would, because the background
thread, though it does not hold up the current thread from continuing with 
indexing, does force the sync *now* rather than letting the IO subsystem 
actually write stuff in its own time. I was also hoping that by doing them later, 
some of the syncs would become no-ops, and hence faster. I found
out, however, that delaying the syncs (while still intending to sync) also 
means keeping the file handles open, and therefore this is not 
a practical approach. Still, it was interesting to compare. 

So... my small test sequentially writes M characters to N files 
and either does not sync (just closes), or syncs in one of three 
ways: (1) at the end, (2) immediately, (3) in a background thread. 
The results (in millis) on my Windows XP were:

|| num files || num chars per file || No Sync || Sync At End || Background Sync || Immediate Sync ||
|   100 | 1 |   631 |   5778 |   5729 |   5828 |
|   100 | 1 |   581 |   4486 |   4117 |   4687 |
|  1000 |  1000 |  1612 |  38996 |  34900 |  35852 |
|  1000 |  1000 |  1432 |  37153 |  35051 |  37263 |
| 1 |   100 | 10335 | 154262 | 162103 | 174251 |
| 1 |   100 | 11276 | 147752 | 159480 | 222450 |

Each configuration ran twice and there are fluctuations, 
but it is obvious (as Mike noticed) that no-sync is much faster
than sync. In fact, in my test no-sync is at least 7 times faster
than any sync approach, while in Mike's test, which uses 
Lucene, the penalty is smaller. The difference might be because 
in my test there is no CPU work involved, just IO. 

Comparing "immediate" to "background", it is not clearly worth it 
to add a background thread (unless Mike's test proves otherwise..)
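
For readers who want to reproduce the comparison, the four strategies can be sketched roughly as follows. This is an illustrative reconstruction, not the FSyncPerfTest.java used for the numbers above; the class name SyncSketch, the file naming and the single background thread are all assumptions. It relies only on the standard FileDescriptor.sync() call to force data to the device.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; this is not the FSyncPerfTest.java from the issue.
class SyncSketch {
    enum Mode { NO_SYNC, SYNC_AT_END, BACKGROUND_SYNC, IMMEDIATE_SYNC }

    // Forces the file's bytes to the storage device, then closes it.
    static void syncAndClose(FileOutputStream out) {
        try {
            out.getFD().sync();
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes numFiles files of charsPerFile bytes each; returns elapsed millis.
    static long run(File dir, int numFiles, int charsPerFile, Mode mode)
            throws IOException, InterruptedException {
        byte[] data = new byte[charsPerFile];
        Arrays.fill(data, (byte) 'x');
        List<FileOutputStream> pending = new ArrayList<>();
        ExecutorService background = Executors.newSingleThreadExecutor();
        long start = System.nanoTime();
        try {
            for (int i = 0; i < numFiles; i++) {
                FileOutputStream out = new FileOutputStream(new File(dir, "f" + i + ".bin"));
                out.write(data);
                switch (mode) {
                    case NO_SYNC:         out.close(); break;
                    case IMMEDIATE_SYNC:  syncAndClose(out); break;
                    case SYNC_AT_END:     pending.add(out); break; // handle stays open
                    case BACKGROUND_SYNC: background.execute(() -> syncAndClose(out)); break;
                }
            }
            // Only non-empty in SYNC_AT_END mode.
            for (FileOutputStream out : pending) {
                syncAndClose(out);
            }
        } finally {
            background.shutdown();
        }
        // Count the background syncs in the measured time as well.
        background.awaitTermination(10, TimeUnit.MINUTES);
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Note how SYNC_AT_END has to keep every FileOutputStream open until the end, which is the file-handle problem mentioned above.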

[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-22 Thread Doron Cohen (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544811
 ] 

Doron Cohen commented on LUCENE-1044:
-

With some artificial CPU activity added to the test program:
|| num files || num chars per file || No Sync || Sync At End || Background Sync || Immediate Sync ||
|   100 | 1 |  6690  |  11516 |  10706 |  11216 |
|   100 | 1 |  7200  |  11006 |  10575 |  10846 |
|  1000 |  1000 |  8002  |  48570 |  48479 |  51825 |
|  1000 |  1000 |  7801  |  43142 |  43693 |  43342 |
| 1 |   100 | 16303  | 152730 | 326810 | 207939 |
| 1 |   100 | 17805  | 156375 | 160040 | 165398 |



[jira] Issue Comment Edited: (LUCENE-1044) Behavior on hard power shutdown

2007-11-22 Thread Doron Cohen (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544812
 ] 

doronc edited comment on LUCENE-1044 at 11/22/07 6:16 AM:
---

Attached FSyncPerfTest.java is the standalone (non Lucene) perf test that I 
used.

  was (Author: doronc):
The standalone (non Lucene) perf test that I used.
  
[jira] Updated: (LUCENE-1044) Behavior on hard power shutdown

2007-11-22 Thread Doron Cohen (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen updated LUCENE-1044:


Attachment: FSyncPerfTest.java

The standalone (non Lucene) perf test that I used.

[jira] Commented: (LUCENE-887) Interruptible segment merges

2007-11-22 Thread Michael Busch (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544854
 ] 

Michael Busch commented on LUCENE-887:
--

{quote}
I believe close(false) marks all still-running merges as aborted
(calls OneMerge.abort()) and then returns immediately without waiting
for them to finish?
{quote}
OK, I got it. So if Lucene runs as a service and a shutdown request is
received, then a call to close(false) will cause the IndexWriter to mark
the running merges as aborted, flush the buffer and return once 
flush+commit is done. Then the caller knows that it is now safe to
shut down the JVM, which will also stop the running merge thread
(because it's a daemon thread now). 
We should probably add a testcase that tests such a shutdown 
scenario.

 {quote}
If we could somehow reach in & find all FSIndexOutputs that are
presently opened by SegmentMerger, and forcefully close them (forcing
an IOException which ConcurrentMergeScheduler catches & ignores if the
merge was aborted) that'd give us a fast way to have the threads stop
working.
{quote}
The fact that the background merge thread keeps running doesn't hurt
us, but the advantage would be to reduce system load and thus
possibly speed up the flush+commit that the other thread is doing.
Maybe for now we could also set the priority of the background thread
to the minimum value as soon as close(false) is called?

{quote}
Oh, one issue is we are failing to setDaemon(true) on these threads,
which means the JVM will not exit until they complete. I'll fix
that.
{quote}
Cool, thanks! Without this, the shutdown scenario explained above
wouldn't work.
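
The two thread settings discussed in this comment map onto standard java.lang.Thread calls. A minimal sketch, assuming a hypothetical helper class MergeThreadSketch (this is not how ConcurrentMergeScheduler is actually structured):

```java
// Hypothetical helper; shows only the standard java.lang.Thread calls discussed above.
class MergeThreadSketch {
    // Creates a merge thread that will not keep the JVM alive on shutdown.
    static Thread newMergeThread(Runnable merge) {
        Thread t = new Thread(merge, "sketch-merge-thread");
        t.setDaemon(true); // must be set before start()
        return t;
    }

    // Lowers the priority of a merge thread, e.g. once close(false) was called,
    // so the flushing thread is favoured by the scheduler.
    static void deprioritize(Thread t) {
        t.setPriority(Thread.MIN_PRIORITY);
    }
}
```

Thread priorities are only a hint to the operating system scheduler, so the effect of deprioritize() on flush+commit time would have to be measured.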

[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

2007-11-22 Thread Paul Elschot (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot updated LUCENE-584:


Attachment: (was: Matcher-20071122-1ground.patch)

> Decouple Filter from BitSet
> ---
>
> Key: LUCENE-584
> URL: https://issues.apache.org/jira/browse/LUCENE-584
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Affects Versions: 2.0.1
> Reporter: Peter Schäfer
> Priority: Minor
> Attachments: bench-diff.txt, bench-diff.txt, 
> Matcher-20070905-1ground.patch, Matcher-20070905-2default.patch, 
> Matcher-20070905-3core.patch, Matcher-20071008-1ground.patch, 
> Some Matchers.zip
>
>
> {code}
> package org.apache.lucene.search;
> import java.io.IOException;
> import org.apache.lucene.index.IndexReader;
> public abstract class Filter implements java.io.Serializable 
> {
>   public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
> }
> // in its own source file, AbstractBitSet.java
> public interface AbstractBitSet 
> {
>   public boolean get(int index);
> }
> {code}
> It would be useful if the method =Filter.bits()= returned an abstract 
> interface, instead of =java.util.BitSet=.
> Use case: there is a very large index, and, depending on the user's 
> privileges, only a small portion of the index is actually visible.
> Sparsely populated =java.util.BitSet=s are not efficient and waste lots of 
> memory. It would be desirable to have an alternative BitSet implementation 
> with a smaller memory footprint.
> Though it _is_ possible to derive classes from =java.util.BitSet=, it was 
> obviously not designed for that purpose.
> That's why I propose to use an interface instead. The default implementation 
> could still delegate to =java.util.BitSet=.
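
A rough sketch of what the proposal could look like in practice. Only the AbstractBitSet interface comes from the issue text; DenseBitSet and SortedIntBitSet are hypothetical names, and the sorted-array representation is just one possible sparse layout, not the one used in the attached patches.

```java
import java.util.Arrays;
import java.util.BitSet;

// Mirrors the interface proposed in the issue; everything else here is hypothetical.
interface AbstractBitSet {
    boolean get(int index);
}

// Default implementation that simply delegates to java.util.BitSet.
class DenseBitSet implements AbstractBitSet {
    private final BitSet bits;

    DenseBitSet(BitSet bits) {
        this.bits = bits;
    }

    @Override
    public boolean get(int index) {
        return bits.get(index);
    }
}

// Sparse implementation: stores only the set positions in a sorted int array,
// so memory grows with the number of visible documents, not with index size.
class SortedIntBitSet implements AbstractBitSet {
    private final int[] sortedDocs;

    SortedIntBitSet(int[] docs) {
        sortedDocs = docs.clone();
        Arrays.sort(sortedDocs);
    }

    @Override
    public boolean get(int index) {
        // binarySearch returns a non-negative position only when the key is present.
        return Arrays.binarySearch(sortedDocs, index) >= 0;
    }
}
```

The trade-off is that get() becomes a binary search (logarithmic in the number of set bits) instead of a constant-time bit test, which is usually acceptable when only a small portion of the index is visible.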

[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

2007-11-22 Thread Paul Elschot (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot updated LUCENE-584:


Attachment: Matcher-20071122-1ground.patch

Resolved a local conflict in the javadocs of HitCollector.

[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

2007-11-22 Thread Paul Elschot (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot updated LUCENE-584:


Attachment: (was: Matcher-20070905-1ground.patch)

[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

2007-11-22 Thread Paul Elschot (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot updated LUCENE-584:


Attachment: Matcher-20071122-1ground.patch

Resolved a local conflict in the javadocs of HitCollector. This time with 
licence granted to ASF.

[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

2007-11-22 Thread Paul Elschot (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot updated LUCENE-584:


Attachment: (was: Matcher-20071008-1ground.patch)

[jira] Updated: (LUCENE-1064) Make TopDocs constructor public

2007-11-22 Thread Shai Erera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-1064:
---

Attachment: TopDocs-patch

The simple patch details

Making TopDocs constructor public

2007-11-22 Thread Shai Erera

Hi

I opened an issue in JIRA on making TopDocs constructor public (
https://issues.apache.org/jira/browse/LUCENE-1064). I think it's a very
small change and is required by applications that have the need to write
their own HitCollector and create a corresponding TopDocs (but can't extend
TopDocCollector).

What's your opinion on that?

Shai Erera
