David,
I was able to find the Java PageFilter source. It seems
that the Java code snippet you quoted came from this
article:
http://www.sys-con.com/read/37296.htm
The code samples are linked from the bottom of the
article:
http://res.sys-con.com/story/37296/Walls0712.zip
The Java PageFilter code is pretty short:

    import java.io.IOException;
    import java.util.BitSet;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.Filter;

    public class PageFilter extends Filter {
        private int start;
        private int end;

        public PageFilter(int pageNum, int pageSize) {
            start = pageNum * pageSize;
            end = (pageNum+1) * pageSize;
        }

        public BitSet bits(IndexReader reader) throws IOException {
            BitSet result = new BitSet(reader.maxDoc());
            for(int i=start; (i<end) && (i<result.size()); i++) {
                result.set(i);
            }
            return result;
        }
    }

You can implement this in PyLucene like this:

    import PyLucene

    class PageFilter(object):

        def __init__(self, page_num=0, size=10):
            self.start = page_num * size
            self.end = self.start + size

        def bits(self, reader):
            results = PyLucene.BitSet(reader.maxDoc())
            for i in xrange(self.start, min(self.end, results.size())):
                results.set(i)
            return results

Then you can do what you originally tried:
hits = searcher.search(query, PageFilter(1, 20))
HOWEVER, the PageFilter code in Java doesn't work right and
neither does the PageFilter code in Python. As far as I can
tell this is because the article's author made a mistake.
There's a comment on the article that shows Java Lucene
users haven't been able to get the PageFilter example to
work either:
http://www.sys-con.com/read/37296_f.htm
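I'm not very experienced with Filters in Lucene (I just
started with Lucene, via PyLucene, a few weeks ago).
However, reading the Java Lucene documentation, it appears
that the author's strategy can't work: a Filter's bits()
marks which documents in the index may appear in the
results, by document number, so PageFilter restricts the
search to documents numbered start through end-1 rather
than selecting a page of the ranked hits. You can read the
Java Lucene docs yourself: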
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Filter.html
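
To make the difference concrete, here is a toy model in plain
Python. It is not Lucene code: the matches dict, the scores, and
the helper names are all made up for illustration. It only
assumes that a filter permits documents by document number before
the hits are ranked.

```python
# Toy model, NOT Lucene: document number -> score for the documents
# that match some query. All values here are invented.
matches = {7: 0.9, 2: 0.8, 5: 0.6, 0: 0.3, 9: 0.1}


def ranked(doc_scores):
    """Return document numbers sorted by descending score."""
    return sorted(doc_scores, key=lambda d: -doc_scores[d])


def search_with_page_filter(doc_scores, page_num, page_size):
    """Mimic PageFilter: permit only doc numbers in [start, end), then rank."""
    start = page_num * page_size
    end = start + page_size
    allowed = {d: s for d, s in doc_scores.items() if start <= d < end}
    return ranked(allowed)


def slice_ranked(doc_scores, page_num, page_size):
    """What paging is supposed to do: rank first, then take a slice."""
    start = page_num * page_size
    return ranked(doc_scores)[start:start + page_size]


filtered_page = search_with_page_filter(matches, 0, 3)  # docs 0-2 only
wanted_page = slice_ranked(matches, 0, 3)                # top three hits
```

In this toy, the filter approach returns only the matching
documents numbered 0 through 2, while the page we actually want
is the three best-scoring hits regardless of document number.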
You can read about extending Java Lucene objects from
PyLucene in the README, here:
http://svn.osafoundation.org/pylucene/trunk/README
Look for the section called "'Extending' Java classes from
Python". And you can see an example here:
http://svn.osafoundation.org/pylucene/trunk/samples/LuceneInAction/lia/extsearch/filters/SpecialsFilter.py
FWIW, the logic for pagination is sufficiently simple that
I'd probably just apply it directly to the hits object. You
should be able to figure that out from the examples above.
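
Something along these lines is what I have in mind. It's only a
sketch: page_bounds, page_of_hits and FakeHits are names I made
up, and the only thing assumed about the hits object is that it
has the length() and doc(i) methods. FakeHits is a stand-in so
the example runs without an index. I used range() rather than
xrange() so it runs on any Python.

```python
def page_bounds(total, page_num=0, size=10):
    """Return (start, end) for a page, clamped to the number of hits."""
    start = min(page_num * size, total)
    end = min(start + size, total)
    return start, end


def page_of_hits(hits, page_num=0, size=10):
    """Return the documents on one page of an already-ranked hits object."""
    start, end = page_bounds(hits.length(), page_num, size)
    return [hits.doc(i) for i in range(start, end)]


class FakeHits(object):
    """Hypothetical stand-in for a hits object, for demonstration only."""

    def __init__(self, docs):
        self._docs = list(docs)

    def length(self):
        return len(self._docs)

    def doc(self, i):
        return self._docs[i]


hits = FakeHits("doc%d" % i for i in range(25))
second_page = page_of_hits(hits, page_num=1, size=10)  # doc10 .. doc19
last_page = page_of_hits(hits, page_num=2, size=10)    # doc20 .. doc24
```

With real hits you would pass the object returned by
searcher.search(query) instead of FakeHits. A page past the end
just comes back as an empty list.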
Hope that helps.
-matthew
David Pratt [EMAIL PROTECTED] said:
> Hi. I've read a few things about paging functionality for the searcher.
> I have already rolled my own in the meantime for batching and paging but
> still wondering if this functionality already exists somewhere that I am
> just unaware of. I am providing a start position and calculating an end
> position for xrange based on hits.length() to keep the end position
> within the range of results. In any case, I read:
>
> Hits hits = searcher.search(query, new PageFilter(1,20));
>
> In another, this version:
>
> hits = searcher.search(query, 0, 10);
>
> I could not locate a PageFilter method in the java docs and the second
> method throws an exception.
>
> Regards,
> David
> _______________________________________________
> pylucene-dev mailing list
> [email protected]
> http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
>