Re: Sharding Techniques
Hi Tom,

The more responses I get in this thread, the more I feel that our application needs optimization. 350 GB and under 2 seconds! That's much better than my expectation :-) (in the current scenario).

*Characteristics of slow queries:*
There are a few reasons for the longer search times.

1. Two of our fields contain decimal values but are not NumericFields :( . These fields are searched as a range. Whenever the ranges are larger and/or both fields are used in a search, the search time and server load go up. I have already started work to convert them to NumericField - but suggestions and experiences are most welcome.

2. When queries (without the two fields mentioned above) have a lot of words/phrases, search time is high. E.g. I took a query with around 80 unique terms (not words) in 5 fields. These terms occur repeatedly and add up to 225 terms (non-unique). This particular query took 4.2 seconds. The 15 indexes used for this query had a total size of 5 GB.
Is 225 terms (80 unique) a very big number?

And yes, slow queries are always slow - but obviously high load will add to their slowness.

Here I have another curiosity about something I noticed. If I have a query like the following:

title:xyz title:xyz title:xyz title:xyz title:xyz title:xyz title:xyz title:xyz title:xyz title:xyz title:xyz

*Will Lucene search for the term 11 times, or will it reuse the results of the first term?*
If the latter is true (which I think it is), is there a particular reason, or could it be optimized inside Lucene?

On Tue, May 10, 2011 at 9:46 PM, Burton-West, Tom wrote:
> Hi Samar,
>
> >> Normal queries go fine under 500 ms but when people start searching
> >> "anything" some queries take up to > 100 seconds. Don't you think
> >> distributing smaller indexes on different machines would reduce the average
> >> search time. (Although I have a feeling that search time for smaller queries
> >> may be slightly increased)
>
> What are the characteristics of your slow queries? Can you give examples?
> Are the slow queries always slow or only under heavy load? Whether splitting
> into smaller indexes would help depends on just what your bottleneck is.
> It's not clear that your index is large enough that the size of the index is
> causing your bottleneck.
>
> We run indexes of about 350GB with average response times under 200ms and
> 99th percentile response times of under 2 seconds. (We have a very low qps
> rate, however.)
>
> Tom

--
Regards,
Samar
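For point 1, a minimal sketch of what the NumericField conversion might look like (Lucene 2.9/3.x trie API; the "price" field name and the precisionStep of 4 are placeholder choices, not taken from the thread):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.NumericField;
    import org.apache.lucene.search.NumericRangeQuery;

    public class NumericRangeSketch {
        // Index time: trie-encode the decimal value instead of indexing it as text.
        static Document buildDoc(double value) {
            Document doc = new Document();
            doc.add(new NumericField("price", 4, Field.Store.NO, true) // precisionStep 4
                    .setDoubleValue(value));
            return doc;
        }

        // Search time: NumericRangeQuery walks the trie terms instead of
        // enumerating every term in the range, which is what keeps large
        // ranges cheap compared to text range queries.
        static NumericRangeQuery<Double> rangeQuery(double min, double max) {
            return NumericRangeQuery.newDoubleRange("price", 4, min, max, true, true);
        }
    }

The precisionStep trades index size against range-query speed; the same value has to be used at index and query time for a given field.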
Re: Sharding Techniques
Ganesh,

Nobody is saying that sharding is never a good idea - it just doesn't seem to be applicable in the case being discussed.

On my indexes I care much more about the speed of searching than the speed of indexing. The latter typically happens in the background in the dead of night and, within reason, I don't really care how long it takes. Your application and requirements will be different and you may come to different conclusions.

--
Ian.

On Wed, May 11, 2011 at 6:09 AM, Ganesh wrote:
>
> We also use a similar kind of technique, breaking indexes into smaller ones and
> searching using ParallelMultiSearcher. We have to do incremental indexing, and
> records older than 6 months or 1 year (based on an age-out setting) should be
> deleted. Having multiple small indexes is really fast in terms of indexing.
>
> Since you guys mentioned keeping a single large index: search time would be
> faster, but indexing and index optimization will take more time. How are you
> handling it in the case of incremental indexing? If we keep the index size to
> 100+ GB then each file (fdt, fdx etc.) would be GBs in size. A small addition
> or deletion to the file will not cause more IO, as it has to skip those bytes
> and write at the end of the file.
>
> Regards
> Ganesh
>
> ----- Original Message -----
> From: "Burton-West, Tom"
> To:
> Sent: Tuesday, May 10, 2011 9:46 PM
> Subject: RE: Sharding Techniques
> [...]
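As an aside on the age-out requirement Ganesh describes: with a single large index, the usual alternative to dropping whole small indexes is deleting by a date range. A rough sketch, assuming a numeric "timestamp" field (a made-up field name) holding epoch millis:

    import java.io.IOException;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.search.NumericRangeQuery;

    public class AgeOutSketch {
        // Delete everything whose "timestamp" falls before the cutoff and let
        // normal segment merging reclaim the space later.
        static void ageOut(IndexWriter writer, long cutoffMillis) throws IOException {
            writer.deleteDocuments(
                NumericRangeQuery.newLongRange("timestamp", 0L, cutoffMillis, true, false));
            writer.commit();  // make the deletes visible to new readers
        }
    }

This avoids rewriting the large fdt/fdx files on every age-out pass; deletes are only marked and physically removed when segments merge.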
Re: Sharding Techniques
I'm sure that you should try building one large index and converting to NumericField wherever you can. I'm convinced that will be faster - but as ever, the proof will be in the numbers.

On repeated terms, I believe that Lucene will search multiple times. If so, I'd guess it is just something that has never been optimized. 225 terms is not a very big number, but not very small either. Complex queries with lots of terms can be expected to be slower than simple queries with few terms. If you have a particular problem with repeated terms, perhaps you could dedup them yourself.

--
Ian.

On Wed, May 11, 2011 at 9:10 AM, Samarendra Pratap wrote:
> Hi Tom,
> the more i am getting responses in this thread the more i feel that our
> application needs optimization.
> [...]
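For the repeated-terms case, a minimal sketch of the client-side dedup Ian mentions, done before the query is built (field and term values are placeholders). Note that dropping duplicates changes scoring slightly, since a query term repeated n times otherwise contributes n times:

    import java.util.Arrays;
    import java.util.LinkedHashSet;
    import java.util.Set;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.TermQuery;

    public class DedupSketch {
        // Collapse exact duplicates (e.g. "title:xyz" repeated 11 times) before
        // handing the terms to Lucene; LinkedHashSet keeps the original order.
        static BooleanQuery buildDeduped(String field, String[] words) {
            Set<String> unique = new LinkedHashSet<String>(Arrays.asList(words));
            BooleanQuery query = new BooleanQuery();
            for (String w : unique) {
                query.add(new TermQuery(new Term(field, w)), BooleanClause.Occur.SHOULD);
            }
            return query;
        }
    }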
RE: Can I omit ShingleFilter's filler tokens
Hi Bill,

I can think of two possible interpretations of "removing filler tokens":

1. Don't create shingles across stopwords, e.g. for text "one two three four five" and stopword "three", bigrams only, you'd get ("one two", "four five"), instead of the current ("one two", "two _", "_ four", "four five").

2. Create shingles as if the stopwords were never there, e.g. for the same text and stopword, bigrams only, you'd get ("one two", "two four", "four five").

Which one did you have in mind? #2 can be achieved by adding PositionFilter after StopFilter and before ShingleFilter. I think #1 requires ShingleFilter modifications.

Steve

> -----Original Message-----
> From: William Koscho [mailto:wkos...@gmail.com]
> Sent: Wednesday, May 11, 2011 12:05 AM
> To: java-user@lucene.apache.org
> Subject: Can I omit ShingleFilter's filler tokens
>
> Hi,
>
> Can I remove the filler token _ from the n-gram tokens that are generated by
> a ShingleFilter?
>
> I'm using a chain of filters: ClassicFilter, StopFilter, LowerCaseFilter,
> and ShingleFilter to create phrase n-grams. The ShingleFilter inserts
> FILLER_TOKENs in place of the stopwords, but I don't want them.
>
> How can I omit the filler tokens?
>
> thanks
> Bill
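For option #2, a rough sketch of where PositionFilter (from the contrib analyzers module) would sit in the chain Bill described, using Lucene 3.1-era APIs. The stopword set, shingle size, and class name are placeholder choices, and the lower-case filter is moved ahead of the stop filter so stopwords match regardless of case:

    import java.io.Reader;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.StopAnalyzer;
    import org.apache.lucene.analysis.StopFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.position.PositionFilter;
    import org.apache.lucene.analysis.shingle.ShingleFilter;
    import org.apache.lucene.analysis.standard.ClassicFilter;
    import org.apache.lucene.analysis.standard.ClassicTokenizer;
    import org.apache.lucene.util.Version;

    public class ShingleNoFillerAnalyzer extends Analyzer {
        @Override
        public TokenStream tokenStream(String fieldName, Reader reader) {
            TokenStream ts = new ClassicTokenizer(Version.LUCENE_31, reader);
            ts = new ClassicFilter(ts);
            ts = new LowerCaseFilter(Version.LUCENE_31, ts);
            ts = new StopFilter(Version.LUCENE_31, ts, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
            // PositionFilter collapses the position holes left by StopFilter, so
            // ShingleFilter builds bigrams as if the stopwords were never there
            // (no "_" filler tokens).
            ts = new PositionFilter(ts);
            ts = new ShingleFilter(ts, 2);  // bigrams (unigrams are also emitted by default)
            return ts;
        }
    }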
Re: Can I omit ShingleFilter's filler tokens
Another idea is to call .setEnablePositionIncrements(false) on your StopFilter.

On Wed, May 11, 2011 at 8:27 AM, Steven A Rowe wrote:
> Hi Bill,
>
> I can think of two possible interpretations of "removing filler tokens":
>
> 1. Don't create shingles across stopwords, e.g. for text "one two three four
> five" and stopword "three", bigrams only, you'd get ("one two", "four five"),
> instead of the current ("one two", "two _", "_ four", "four five").
>
> 2. Create shingles as if the stopwords were never there, e.g. for the same
> text and stopword, bigrams only, you'd get ("one two", "two four", "four
> five").
> [...]
RE: Can I omit ShingleFilter's filler tokens
Yes, StopFilter.setEnablePositionIncrements(false) will almost certainly get higher throughput than inserting PositionFilter. Like PositionFilter, this will buy you #2 (create shingles as if stopwords were never there), but not #1 (don't create shingles across stopwords).

> -----Original Message-----
> From: Robert Muir [mailto:rcm...@gmail.com]
> Sent: Wednesday, May 11, 2011 9:02 AM
> To: java-user@lucene.apache.org
> Subject: Re: Can I omit ShingleFilter's filler tokens
>
> another idea is to .setEnablePositionIncrements(false) on your
> stopfilter.
> [...]
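And a correspondingly small sketch of Robert's setEnablePositionIncrements(false) variant, under the same assumptions as the chain above. Here "lowerCased" stands in for the output of the tokenizer / ClassicFilter / LowerCaseFilter stages from the earlier sketch; only the tail of the chain changes:

    import org.apache.lucene.analysis.StopAnalyzer;
    import org.apache.lucene.analysis.StopFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.shingle.ShingleFilter;
    import org.apache.lucene.util.Version;

    public class ShingleTailSketch {
        static TokenStream shingleTail(TokenStream lowerCased) {
            StopFilter stop = new StopFilter(Version.LUCENE_31, lowerCased,
                                             StopAnalyzer.ENGLISH_STOP_WORDS_SET);
            stop.setEnablePositionIncrements(false); // no position holes at stopwords...
            return new ShingleFilter(stop, 2);       // ...so no "_" filler tokens
        }
    }

Disabling position increments does mean phrase queries and highlighting on that field no longer "know" where stopwords were, which is the trade-off against the extra PositionFilter stage.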
Re: Non-English Languages Search
On Mon, May 9, 2011 at 5:32 PM, Provalov, Ivan wrote:
> We are planning to ingest some non-English content into our application. All
> content is OCR'ed and there are a lot of misspellings and garbage terms
> because of this. Each document has one primary language with some
> exceptions (e.g. a few English terms mixed in with primarily non-English
> document terms).

Sounds like you should talk to Tom Burton-West!

> 1. Does it make sense to mix two or more different Latin-based languages in
> the same index directory in Lucene (e.g. Spanish/French/English)?

I think it depends upon the application. If the user is specifying the language via the UI somehow, then it's probably simplest to just use different indexes for each collection.

> 2. What about mixing Latin and non-Latin languages? We ran tests on English
> and Chinese collections mixed together and didn't see any negative impact
> (precision/recall). Any other potential issues?

Right, none of the terms would overlap here... the only "issue" would be a skewed maxDoc, but this is probably not a big deal at all. But what's the benefit of mixing them?

> 3. Any recommendations for an Urdu analyzer?

You can always start with StandardAnalyzer, as it will tokenize the text... you might be able to make use of resources such as
http://www.crulp.org/software/ling_resources/UrduClosedClassWordsList.htm
and
http://www.crulp.org/software/ling_resources/UrduHighFreqWords.htm
as a stoplist.
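A minimal sketch of the StandardAnalyzer-plus-stoplist idea (Lucene 3.x API; the file path, class name, and the assumption of a UTF-8 file with one word per line - e.g. saved from one of the CRULP lists above - are all placeholders, not from the thread):

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.HashSet;
    import java.util.Set;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.util.Version;

    public class UrduAnalyzerSketch {
        // Load a one-word-per-line UTF-8 stopword file into a Set.
        static Set<String> loadStopwords(File f) throws IOException {
            Set<String> stop = new HashSet<String>();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(new FileInputStream(f), "UTF-8"));
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    line = line.trim();
                    if (line.length() > 0) stop.add(line);
                }
            } finally {
                in.close();
            }
            return stop;
        }

        // StandardAnalyzer tokenizes Urdu text on Unicode word boundaries;
        // the custom set replaces the default English stopwords.
        static StandardAnalyzer buildAnalyzer(File stopwordFile) throws IOException {
            return new StandardAnalyzer(Version.LUCENE_31, loadStopwords(stopwordFile));
        }
    }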
Bug in BrazilianAnalyzer?
Hi,

I did a test to understand the use of '*' and '?'.
If I use StandardAnalyzer I get the expected results, but if I use BrazilianAnalyzer I get a wrong result.
Please, where is my mistake? The JUnit is at the end.

Paulo Cesar

cities = {"Brasília", "Brasilândia", "Braslândia", "São Paulo", "São Roque", "Salvador"};

>>> Using StandardAnalyzer

>>> Using BrazilianAnalyzer

>>> JUnit
Re: Bug in BrazilianAnalyzer?
Hi,

I think you forgot to attach the JUnit.

On Wed, May 11, 2011 at 10:04 AM, wrote:
> Hi,
> I did a test to understand the use of '*' and '?'.
> If I use StandardAnalyzer I have expected results but if I use
> BrazilianAnalyzer I have a mistake result.
> Please, where is my mistake? Junit is at the end.
> Paulo Cesar
> cities = {"Brasília", "Brasilândia", "Braslândia", "São Paulo",
> "São Roque", "Salvador"};
> [...]
Re: Can I omit ShingleFilter's filler tokens
#1 is what I'm trying for, so I'll give setEnablePositionIncrements(false) a try. Thanks for everyone's help.

Bill

On 5/11/11, Steven A Rowe wrote:
> Yes, StopFilter.setEnablePositionIncrements(false) will almost certainly get
> higher throughput than inserting PositionFilter. Like PositionFilter, this
> will buy you #2 (create shingles as if stopwords were never there), but not
> #1 (don't create shingles across stopwords).
> [...]

--
Sent from my mobile device
Re: Can I omit ShingleFilter's filler tokens
I meant I'm trying for #2, so this should work (got my numbers mixed up). Thanks again.

Bill

On 5/11/11, William Koscho wrote:
> #1 is what I'm trying for, so I'll give setEnablePositionIncrements(false) a
> try. Thanks for everyone's help.
>
> Bill
> [...]

--
Sent from my mobile device
found workaround: Query on using Payload with MoreLikeThis class
Hi All,

I am not sure if anyone got a chance to go over my question (below). The question was to check whether I can modify the MoreLikeThis.like() result using index-time boosting.

I have found a workaround, as there is no easy way to influence the MoreLikeThis result using an index-time payload value. The workaround is to write a class similar to MoreLikeThis (it cannot be extended as it is final) and, in the createQuery method of that class, change the query class from TermQuery to PayloadTermQuery.

Change:

  TermQuery tq = new TermQuery(new Term((String) ar[1], (String) ar[0]));

To:

  Term payloadTerm = new Term((String) ar[1], (String) ar[0]);
  Query tq = new PayloadTermQuery(payloadTerm, new AveragePayloadFunction());

That's it, the rest of the MoreLikeThis code stays the same :)

With this change, I could boost my MoreLikeThis result with the payload value set up at index time.

If anyone has any better thoughts, I would be glad to hear about them.

Thanks
Saurabh

On Tue, May 10, 2011 at 1:36 PM, Saurabh Gokhale wrote:
> Hi,
>
> In the Lucene 2.9.4 project, there is a requirement to boost some of the
> keywords in the document using payload.
>
> Now while searching, is there a way I can boost the MoreLikeThis result
> using the index time payload values?
>
> Or can I merge MoreLikeThis output and PayloadTermQuery output somehow to
> get the final percentage output?
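One thing worth flagging (an assumption on my part, not something stated in Saurabh's mail): PayloadTermQuery only changes scores if the Similarity in use decodes the payload bytes. A hedged sketch of what that might look like, using the Lucene 2.9/3.x scorePayload signature and assuming the payloads were written as 4-byte floats (e.g. with PayloadHelper.encodeFloat):

    import org.apache.lucene.analysis.payloads.PayloadHelper;
    import org.apache.lucene.search.DefaultSimilarity;

    public class PayloadSimilarity extends DefaultSimilarity {
        @Override
        public float scorePayload(int docId, String fieldName,
                                  int start, int end,
                                  byte[] payload, int offset, int length) {
            // Decode the 4-byte float written at index time; fall back to 1.0
            // (no boost) for terms that carry no payload.
            if (payload == null || length < 4) {
                return 1.0f;
            }
            return PayloadHelper.decodeFloat(payload, offset);
        }
    }

It would then be registered on the searcher (searcher.setSimilarity(new PayloadSimilarity())) before running the payload-aware MoreLikeThis query.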
Help needed on Ant build script for creating Lucene index
Hi,

Can someone please direct me to an example of an Ant build script for creating a Lucene index? It is part of Lucene contrib, but I did not get much of an idea from the documentation on the Lucene site.

Thanks
Saurabh
Re: How do I sort lucene search results by relevance and time?
If only you were using Solr:
http://wiki.apache.org/solr/DisMaxQParserPlugin#bf_.28Boost_Functions.29

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message -----
> From: Johnbin Wang
> To: java-user@lucene.apache.org
> Sent: Sun, May 8, 2011 11:59:11 PM
> Subject: How do I sort lucene search results by relevance and time?
>
> What I want to do is just like Google search results. The results on the
> first page are the most relevant and also recent documents, but not
> absolutely sorted by time desc.
>
> --
> cheers,
> Johnbin Wang
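On the plain Lucene side, the closest out-of-the-box option is a Sort that uses relevance first and the time field only as a tie-breaker - not the blended recency boost Solr's bf gives, but a minimal sketch (the "created" long timestamp field name is a placeholder):

    import java.io.IOException;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;
    import org.apache.lucene.search.TopDocs;

    public class RecencySortSketch {
        // Relevance first, then newest first among equally scored hits.
        // "created" is assumed to be an indexed numeric (long) timestamp field.
        static TopDocs search(IndexSearcher searcher, Query query) throws IOException {
            Sort sort = new Sort(new SortField[] {
                SortField.FIELD_SCORE,
                new SortField("created", SortField.LONG, true)  // true = descending
            });
            return searcher.search(query, null, 20, sort);
        }
    }

A genuinely blended "relevant and recent" ranking, like the Google-style behavior Johnbin describes, would instead need the recency folded into the score itself (e.g. a custom score or function query), which is exactly what the Solr bf link above does declaratively.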