RE: RE: howto: handle temporal visibility of a document?

2015-01-12 Thread Clemens Wyss DEV
reduced to: ( ( *:* -visiblefrom:[* TO *] AND -visibleto:[* TO *] ) OR (-visiblefrom:[* TO *] AND visibleto:[ TO ]) OR (-visibleto:[ * TO *] AND visiblefrom:[0 TO ]) OR ( visiblefrom:[0 TO ] AND visibleto:[ TO ]) ) > also if you index an explicit null value you won't nee

Problem with Custom FieldComparator

2015-01-12 Thread Victor Podberezski
I'm migrating a web application from Lucene 2.4.1 to Lucene 2.9.4 (basically because of this bug: https://issues.apache.org/jira/browse/LUCENE-1304). I'm trying to migrate a custom sort field according to some examples I read, but I cannot make it work right. I have a field with string values and w

StoredField available in Collector.setNextReader

2015-01-12 Thread Hasenberger, Josef
Hello, I have tried to retrieve values stored via StoredField type inside a Collector when its method setNextReader(AtomicReaderContext) was called. I used the following method from FieldCache, but do not get back any values: FieldCache.DEFAULT.getTerms(indexReader, field, false); Retrievi

Re: Details on setting block parameters for Lucene41PostingsFormat

2015-01-12 Thread Tom Burton-West
Thanks Mike, Do you know how I can configure Solr to use the min=200 and max=398 block sizes you suggested? Or should I ask on the Solr list? Tom On Sat, Jan 10, 2015 at 4:46 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > The first int to Lucene41PostingsFormat is the min block s

Re: Details on setting block parameters for Lucene41PostingsFormat

2015-01-12 Thread Tom Burton-West
Thanks Mike, > OK. It would be good to know where all your RAM is being consumed, > and how much of that is really the terms index: it ought to be a very > small part of it. > > I made a bunch of heap dumps. I just watched with jconsole and ran jmap -histo when memory use got high. I've appende

AW: AW: howto: handle temporal visibility of a document?

2015-01-12 Thread Clemens Wyss DEV
Thx, I will simplify/optimize ;) -Original Message- From: Michael Sokolov [mailto:msoko...@safaribooksonline.com] Sent: Monday, 12 January 2015 14:41 To: java-user@lucene.apache.org Subject: Re: AW: howto: handle temporal visibility of a document? The basic idea seems sound, bu

MultiPhraseQuery:Rewrite to BooleanQuery

2015-01-12 Thread ku3ia
Hi folks! I have a multiphrase query, for example, from units: Directory indexStore = newDirectory(); RandomIndexWriter writer = new RandomIndexWriter(random(), indexStore); add("blueberry chocolate pie", writer); add("blueberry chocolate tart", writer); IndexReader r = writer.getReader(); writer.

Re: Finding a match for an automaton against a FST

2015-01-12 Thread Michael McCandless
On Sat, Jan 10, 2015 at 8:23 AM, Olivier Binda wrote: > On 01/10/2015 11:00 AM, Michael McCandless wrote: >> >> On Fri, Jan 9, 2015 at 6:42 AM, Olivier Binda >> wrote: >>> >>> Hello. >>> >>> 1) What is the best way to check if an automaton (from a regex or a >>> string >>> with a wildcard) >>> ha

Re: AW: howto: handle temporal visibility of a document?

2015-01-12 Thread Michael Sokolov
The basic idea seems sound, but I think you can simplify that query a bit. For one thing, the *:* clauses can be removed in a few places. Also, if you index an explicit null value you won't need them at all: for visiblefrom, if you don't have a from time, use 0; for visibleto, if you don't have
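
The explicit-default idea from this reply can be sketched in plain Java, without any Lucene calls. This is only an illustration under stated assumptions: the class and method names are hypothetical, 0 as the visiblefrom default comes from the post, and Long.MAX_VALUE as the visibleto default is this sketch's own choice, since the snippet is cut off before it names one. Once every document carries both values, the four-way OR from the proposal collapses into a single two-sided check.

```java
// Hypothetical sketch; none of these names come from the thread or from Lucene.
final class VisibilitySketch {
    // Default for a missing visiblefrom (0 is the value suggested in the post).
    static final long NO_FROM = 0L;
    // Default for a missing visibleto (ASSUMPTION: the post is truncated here).
    static final long NO_TO = Long.MAX_VALUE;

    // Value to index for visiblefrom; null means "no start time".
    static long indexedFrom(Long from) {
        return from == null ? NO_FROM : from;
    }

    // Value to index for visibleto; null means "no end time".
    static long indexedTo(Long to) {
        return to == null ? NO_TO : to;
    }

    // With defaults indexed, visibility is one inclusive two-sided comparison.
    static boolean isVisible(long indexedFrom, long indexedTo, long now) {
        return indexedFrom <= now && now <= indexedTo;
    }

    // Shape of the simplified query only. For numeric fields a numeric
    // range query would be needed instead of this textual form.
    static String rangeQuery(long now) {
        return "visiblefrom:[" + NO_FROM + " TO " + now + "] AND visibleto:["
                + now + " TO " + NO_TO + "]";
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // A document with neither bound set is always visible.
        System.out.println(isVisible(indexedFrom(null), indexedTo(null), now));
        // A document whose start time lies in the future is hidden.
        System.out.println(isVisible(indexedFrom(now + 1000L), indexedTo(null), now));
        System.out.println(rangeQuery(now));
    }
}
```

The trade-off is that every document has to be indexed with both fields, so existing documents that lack them would need reindexing before the simplified query matches them.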

MultiPhraseQuery:Rewrite to BooleanQuery

2015-01-12 Thread dennis yermakov
Hi folks! I have a multiphrase query, for example, from units: Directory indexStore = newDirectory(); RandomIndexWriter writer = new RandomIndexWriter(random(), indexStore); add("blueberry chocolate pie", writer); add("blueberry chocolate tart", writer); IndexReader r = writer.getReader(); writer.

fill 'empty' facet-values, sampling, taxoreader

2015-01-12 Thread Rob Audenaerde
Hi all, I'm building an application in which users can add arbitrary documents, and all fields will be added as facets as well. This allows users to browse their documents by their own defined facets easily. However, when the number of documents gets very large, I switch to random-sampled facets

AW: howto: handle temporal visibility of a document?

2015-01-12 Thread Clemens Wyss DEV
I'll add/start with my proposal ;) Document-meta fields: + visiblefrom [long] + visibleto [long] Query or query filter: (*:* -visiblefrom:[* TO *] AND -visibleto:[* TO *]) OR (*:* -visiblefrom:[* TO *] AND visibleto:[ TO *]) OR (*:* -visibleto:[ * TO *] AND visiblefrom:[* TO ]) OR ( visiblefr

howto: handle temporal visibility of a document?

2015-01-12 Thread Clemens Wyss DEV
We have documents that are not always visible (visiblefrom-visibleto). In order to not have to query the originating object of the document whether it is currently visible (after the query), we'd like to put metadata into the documents, so that the visibility can be determined at query-time (by

RE: Custom tokenizer

2015-01-12 Thread Uwe Schindler
> Thanks for the reply. > > Hmm, I understand. > I know about AnalyzerWrapper, but that is not what I am looking for. > > I also know about cloning and overriding. I want my analyzer to behave > exactly the same as EnglishAnalyzer and right now I am copying the code > from the EnglishAnalyzer to

Re: Custom tokenizer

2015-01-12 Thread Vihari Piratla
Thanks for the reply. Hmm, I understand. I know about AnalyzerWrapper, but that is not what I am looking for. I also know about cloning and overriding. I want my analyzer to behave exactly the same as EnglishAnalyzer and right now I am copying the code from the EnglishAnalyzer to mimic the behavi

RE: Custom tokenizer

2015-01-12 Thread Uwe Schindler
Hi, Extending an existing Analyzer is not useful, because it is just a factory that returns a TokenStream instance to consumers. If you want to change the Tokenizer of an existing Analyzer, just clone it and rewrite its createComponents() method, see the example in the Javadocs: http://lucene.