Re: test

2009-04-08 Thread Michael McCandless
Can you provide more details? E.g., a full exception? What was the app
doing (indexing, searching, both)?

Mike

On Wed, Apr 8, 2009 at 2:40 AM, Antony Joseph  wrote:
> Hi,
>
> In a long-running process, Lucene crashes in my application. Is there any
> way to diagnose this, or how can I turn on debug/trace logging for Lucene?
>
>
> Thanks
> Antony
>
>
> --
> DigitalGlue, India
>
>
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>




Suggestive Search

2009-04-08 Thread Matt Schraeder
I want to add a suggestive search similar to Google's, to autocomplete
search phrases as the user types.  It doesn't have to be very elaborate,
and for the most part it will just involve searching single fields.  How
can I perform a search that lets me fill in autocomplete text?
 
For instance, if I start typing "Harr" it should bring up "Harry
Potter" "Harry Houdini" and "Harry S. Truman"
 
I have tried doing search queries for "Harr*" but it's still doing
term-based searching rather than searching a full field.  To make a
field both searchable as the full field as well as tokenized, would I
have to duplicate the field and make one a keyword field? Is there a
more convenient way to do this? I have also considered making a second
index for suggestive search, which would only have the fields that I
want to enable suggestive search on, but this seems like it would be
unnecessary duplication of data as well, though it would probably make
suggestive search faster due to a smaller index.
 
Ideally it would also be nice to be able to rank these terms based on
the number of times they have been searched for, so that the results are
tailored more to our users rather than just the score that Lucene
chooses.


Lucene searching across documents

2009-04-08 Thread Dan Scrima
So I have a requirement where I have a directory filled with xml files.
I wrote a parser to parse these files, and index all of the xml
attributes and properties into documents. An example of one of these
documents is below. I'm parsing sentences into words, and tagging the
sentences based on certain criteria.

My issue is trying to find out if lucene can handle cross-document
searching. So below is indexed as a single document... and there will be
multiple sentences before, after, and throughout an entire transcript.
Is it possible somehow to say, "I want a result where one line marked as
Symptom is 5 lines away from another line marked as Brand." So in
essence, I'm trying to search across multiple lucene documents.

 

Any thoughts or literature out there?

[XML example omitted: its markup did not survive the mail archive. What
remains is the sentence "Coughing is caused by Mucinex" tokenized into
word/tag pairs: Coughing/SBJ, is/VB, caused/NP, by/PP, Mucinex/PDC.]
Thanks so much!



How can I change that lucene use by default the AND operator between terms ???

2009-04-08 Thread Ariel
When I do a search, Lucene internally uses the OR operator between terms
by default. How can I change Lucene to use the AND operator between terms
by default?

Regards
Ariel


RE: How can I change that lucene use by default the AND operator between terms ???

2009-04-08 Thread Uwe Schindler
The query parser has an option to change that. After creating the query
parser, just set the corresponding option before parsing the query.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
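For reference, a minimal sketch of that option against the 2.x QueryParser API; the field name and analyzer below are placeholders:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class AndByDefault {
  public static Query parseWithAnd(String userInput) throws ParseException {
    // "contents" is a placeholder default field name
    QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
    parser.setDefaultOperator(QueryParser.AND_OPERATOR);  // the built-in default is OR_OPERATOR
    return parser.parse(userInput);  // "foo bar" now parses as contents:foo AND contents:bar
  }
}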

> -Original Message-
> From: Ariel [mailto:isaacr...@gmail.com]
> Sent: Wednesday, April 08, 2009 4:46 PM
> To: lucene user
> Subject: How can I change that lucene use by default the AND operator
> between terms ???
> 
> When I do a search using lucene internally lucene use by default the OR
> operator between terms, How can I change that lucene use by default the
> AND
> operator between terms ???
> 
> Regards
> Ariel





Re: How to customize score according to field value?

2009-04-08 Thread Jinming Zhang
Hi,

Yes, CustomScoreQuery.customScore() meets the requirement I described.

Thank you all!
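For anyone following the thread, a minimal sketch of that kind of CustomScoreQuery override; it assumes you index a numeric recency field yourself, and the field name and weighting below are only illustrative:

import org.apache.lucene.search.Query;
import org.apache.lucene.search.function.CustomScoreQuery;
import org.apache.lucene.search.function.FieldScoreQuery;

public class RecencyBoost {
  public static Query boostByRecency(Query textQuery) {
    // Assumes a "recency" field indexed as a plain float per document
    // (e.g. field2's publishing date scaled into 0..1); both the field
    // name and the scaling are placeholders.
    FieldScoreQuery recency = new FieldScoreQuery("recency", FieldScoreQuery.Type.FLOAT);
    return new CustomScoreQuery(textQuery, recency) {
      public float customScore(int doc, float subQueryScore, float valSrcScore) {
        // Blend relevance with recency; this particular weighting is arbitrary.
        return subQueryScore * (1.0f + valSrcScore);
      }
    };
  }
}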

On Tue, Apr 7, 2009 at 9:01 PM, Tim Williams  wrote:

> On Tue, Apr 7, 2009 at 3:08 AM, Jinming Zhang 
> wrote:
> > Hi,
> >
> > I have the following situation which needs to customize the final score
> > according to field value.
> >
> > Suppose there are two docs in my query result, and they are ordered by
> > default score sort:
> >
> > doc1(field1:bookA, field2:2000-01-01) -- score:0.80
> > doc2(field1:bookB, filed2:2009-01-01) -- score:0.70
> >
> > I want "doc2" to have a higher score since its publishing date is more
> > recent, and "doc1" to have a lower score:
> >
> > doc2(field1:bookB, filed2:2009-01-01) -- score:0.77
> > doc1(field1:bookA, field2:2000-01-01) -- score:0.73
> >
> > I found this scenario is different from doc.setBoost() and
> field.setBoost().
> > Is there any way to impact the score calculated for "doc1" & "doc2"
> > according to the value of "field2"?
> >
> > Thank you in advance!
>
> If you have access to the MEAP for Lucene in Action, 2nd Edition, it
> demonstrates using a CustomScoreQuery[1] to boost a doc's score
> based on recency.
>
> --tim
>
> [1] -
> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/function/CustomScoreQuery.html
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


Query any data

2009-04-08 Thread addman

Hi,
   Is it possible to create a query to search a field for any value?  I just
need to know whether the optional field contains any data at all.
-- 
View this message in context: 
http://www.nabble.com/Query-any-data-tp22953431p22953431.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.





Re: Query any data

2009-04-08 Thread Tim Williams
On Wed, Apr 8, 2009 at 11:45 AM, addman  wrote:
>
> Hi,
>   Is it possible to create a query to search a field for any value?  I just
> need to know if the optional field contain any data at all.

google for:  lucene field existence

There's no built-in way; one strategy[1] is to have a 'meta field'
that contains the names of the fields the document carries.

--tim

[1] - http://www.mail-archive.com/lucene-u...@jakarta.apache.org/msg07703.html
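A minimal sketch of that strategy (the field names are just examples):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class FieldExistence {
  // Index time: record which optional fields this document actually carries.
  public static Document docWithMetaField(String price) {
    Document doc = new Document();
    if (price != null) {
      doc.add(new Field("price", price, Field.Store.YES, Field.Index.NOT_ANALYZED));
      doc.add(new Field("hasField", "price", Field.Store.NO, Field.Index.NOT_ANALYZED));
    }
    return doc;
  }

  // Search time: "documents where the optional price field has any value".
  public static Query anyPrice() {
    return new TermQuery(new Term("hasField", "price"));
  }
}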




Lucene 1.4.3: Error when creating Searcher

2009-04-08 Thread Zhang, Lisheng
Hi,

We are using Lucene 1.4.3. Sometimes when two threads try to search,
one thread gets an error when creating a MultiSearcher:

Lock obtain timed out: 
Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock

I read the Lucene FAQ and searched previous discussions; it seems that this
error should be related to indexing, but we are only creating a Searcher?

Thanks very much for your help, Lisheng





RE: Lucene searching across documents

2009-04-08 Thread Steven A Rowe
Hi Dan,

My guess, though you didn't directly say so, is that you're representing each 
sentence/"line" as a separate Lucene document.  To directly answer your 
question about whether inter-document relations (like database joins) are 
queryable in Lucene, I don't think so, other than performing multiple searches, 
where you feed the results of one query into another one (e.g.: first query for 
all lines with tag X, retrieve the line-ID and transcript-ID field values, then 
query for tag Y, requiring the same transcript-ID field value, and any one of 
the line-ID values that are within the window you want).

If instead (or perhaps in addition, depending on your other needs), each full 
transcript is a Lucene document, you can perform the kinds of searches you're 
talking about with tools available in Lucene.

I'm thinking of a lucene document with a "line-tags" field, populated with the 
tags you've associated with each line, and with the position of each line tag 
adjusted so that two tags assigned to the same line are given the same position 
(sometimes Lucene users call terms with the same position "synonyms", because 
that's the most common thing this capability is used for).

Then you can run a SpanNearQuery over the line-tags field, to return matches 
where tag X is within N lines of tag Y.

(See 

 for info on the Lucene Span Query family.)
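A minimal sketch of that query, assuming the tags were indexed lowercased in a "line-tags" field with one position per line (same-line tags sharing a position, e.g. via Token.setPositionIncrement(0)):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class TagProximity {
  // "Within 5 lines" becomes a slop of 5 positions when each line occupies one position.
  public static Query symptomNearBrand() {
    SpanQuery symptom = new SpanTermQuery(new Term("line-tags", "symptom"));
    SpanQuery brand = new SpanTermQuery(new Term("line-tags", "brand"));
    return new SpanNearQuery(new SpanQuery[] { symptom, brand }, 5, false);  // false = either order
  }
}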

Steve

On 4/8/2009 at 9:33 AM, Dan Scrima wrote:
> So I have a requirement where I have a directory filled with xml files.
> I wrote a parser to parse these files, and index all of the xml
> attributes and properties into documents. An example of one of these
> documents is below. I'm parsing sentences into words, and tagging the
> sentences based on certain criteria.
> 
> My issue is trying to find out if lucene can handle cross-document
> searching. So below is indexed as a single document... and there will
> be multiple sentences before, after, and throughout an entire
> transcript. Is it possible somehow to say, "I want a result where one
> line marked as Symptom is 5 lines away from another line marked as
> Brand." So in essence, I'm trying to search across multiple lucene
> documents.
> 
> Any thoughts or literature out there?
> 
>   [XML example with its markup stripped by the archive; the word/tag pairs
>   are: Coughing/SBJ, is/VB, caused/NP, by/PP, Mucinex/PDC]





How Can I make an analyzer that ignore the numbers o the texts ???

2009-04-08 Thread Ariel
Hi everybody:

I would like to know how I can make an analyzer that ignores the numbers in
the text, the same way stop words are ignored. For example, the terms
3.8, 100, 4.15, 4,33 should not be added to the index.
How can I do that?

Regards
Ariel


Re: Suggestive Search

2009-04-08 Thread Karl Wettin
For this you probably want to use ngrams. Whether or not this is
something that fits in your current index is hard to say. My guess is
that you want to create a new index with one document per unique
phrase. You might also want to try loading this index into an
InstantiatedIndex; that could speed things up quite a bit if the
corpus is not too large.

If your suggestion text corpus is really large and you only want
forward-only suggestions, then you might want to consider a
trie-pattern solution instead. These can be rather resource efficient,
even when loaded into memory.

If you have a lot of user load on your search engine then it might be
interesting to use old user queries as the base of your suggestions,
and perhaps boost a bit on trends, i.e. the more people search for
something, the more it gets boosted in the suggestions list.



 karl

On 8 Apr 2009, at 15:26, Matt Schraeder wrote:


I want to add a suggestive search similar to google's to autocomplete
search phrases as the user types.  It doesn't have to be very  
elaborate

and for the most part will just involve searching single fields.  How
can I perform a search  to be able to fill in autocomplete text?

For instance, if I start typing "Harr" it should bring up "Harry
Potter" "Harry Houdini" and "Harry S. Truman"

I have tried doing search queries for "Harr*" but it's still doing
term-based searching rather than searching a full field.  To make a
field both searchable as the full field as well as tokenized, would I
have to duplicate the field and make one a keyword field? Is there a
more convenient way to do this? I have also considered making a second
index for suggestive search, which would only have the fields that I
want to enable suggestive search on, but this seems like it would be
unneccesary duplication of data as well, though it would probably make
suggestive search faster due to a smaller index.

Ideally it would also be nice to be able to rank these terms based on
the number of times they have been searched for so that the results  
are
tailored more to our users rather than simply just the score that  
Lucene

chooses.






Re: How Can I make an analyzer that ignore the numbers o the texts ???

2009-04-08 Thread Matthew Hall
You can define your own stop list and pass it in as a constructor
argument to most analyzers.


For example from the Lucene Javadocs:


StandardAnalyzer

public StandardAnalyzer(String[] stopWords)

Builds an analyzer with the given stop words.

The only thing that you need to be careful of is to make sure that the 
analyzer isn't doing some sort of conversion of the tokens before the 
stoplist is checked, but otherwise that should work out just fine.
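A minimal sketch of that constructor; note this only helps when the values to drop are a small, known set (the values below are just the ones from the question):

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class NoNumbersAnalyzer {
  // Only practical when the "numbers" to suppress can be enumerated up front.
  private static final String[] STOP_WORDS = { "3.8", "100", "4.15", "4,33" };

  public static Analyzer create() {
    // StandardAnalyzer lowercases tokens before applying the stop list,
    // which does not affect numeric tokens, so these entries match as-is.
    return new StandardAnalyzer(STOP_WORDS);
  }
}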


Matt

Ariel wrote:

Hi everybody:

I would want to know how Can I make an analyzer that ignore the numbers o
the texts like the stop words are ignored ??? For example that the terms :
3.8, 100, 4.15, 4,33 don't be added to the index.
How can I do that ???

Regards
Ariel

  



--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012






Re: How Can I make an analyzer that ignore the numbers o the texts ???

2009-04-08 Thread Koji Sekiguchi

Ariel wrote:

Hi everybody:

I would want to know how Can I make an analyzer that ignore the numbers o
the texts like the stop words are ignored ??? For example that the terms :
3.8, 100, 4.15, 4,33 don't be added to the index.
How can I do that ???

Regards
Ariel

  


There is a patch for filtering out number tokens:

https://issues.apache.org/jira/browse/SOLR-448

Koji





RE: How Can I make an analyzer that ignore the numbers o the texts ???

2009-04-08 Thread Steven A Rowe
Hi Ariel,

As Koji mentioned, https://issues.apache.org/jira/browse/SOLR-448 contains a 
NumberFilter.  It filters out tokens that successfully parse as Doubles.  I'm 
not sure, since the examples you gave seem to use "," as the decimal character, 
how this interacts with the Locale.  (Koji, I don't see any ","-as-decimal 
tests in your patch.)

There is another one here that filters out tokens that have an initial digit 
character:



Steve
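A minimal sketch of a home-grown filter along those lines; this is not the SOLR-448 patch, just an illustration against the 2.4 TokenFilter API, and the comma-as-decimal handling is an assumption:

import java.io.IOException;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

// Drops tokens that parse as numbers; everything else passes through.
public class DropNumbersFilter extends TokenFilter {

  public DropNumbersFilter(TokenStream input) {
    super(input);
  }

  public Token next(final Token reusableToken) throws IOException {
    for (Token t = input.next(reusableToken); t != null; t = input.next(reusableToken)) {
      if (!isNumber(t.term())) {
        return t;            // keep ordinary words
      }
      // otherwise skip this token and look at the next one
      // (note: unlike StopFilter, position increments are not adjusted here)
    }
    return null;             // end of stream
  }

  private static boolean isNumber(String s) {
    try {
      Double.parseDouble(s.replace(',', '.'));  // also treat "4,33" as numeric
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }
}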

On 4/8/2009 at 1:43 PM, Ariel wrote:
> I would want to know how Can I make an analyzer that ignore the numbers
> o the texts like the stop words are ignored ??? For example that the
> terms : 3.8, 100, 4.15, 4,33 don't be added to the index.
> How can I do that ???
> 
> Regards
> Ariel





Re: Suggestive Search

2009-04-08 Thread Matt Schraeder
Correct me if I'm wrong, but I don't think n-grams is really what I'm
looking for here.  I'm not looking for a spellchecker or phrase checker
style suggestive search, but only based on the exact phrases the user is
currently typing.  Since Lucene uses term-based searching, I'm not sure
how to have it search on portions of a full phrase.  Using a standard
lucene search typing in "harr" will result in searching for "harr" as a
term, which will not find "Harry Potter".  Using ngrams it would find
"Harry" as a term, but not at the beginning of an entire phrase.  This
would bring back "My Dog Harry" as a result, which isn't what I'm
looking for. I just want phrases from fields beginning with "Harr"
only.
 
I could easily do this all with our database server by simply doing a
query for "where searchqueries like 'harr%'" but we're trying to limit
our hits to the database to keep speed up on the site.

>>> karl.wet...@gmail.com 4/8/2009 12:49:45 PM >>>

For this you probably want to use ngrams. Wether or not this is  
something that fits in your current index is hard to say. My guess is 

that you want to create a new index with one document per unique  
phrase. You might also want to try to load this index in an  
InstantiatedIndex, that could speed things up quite a bit if the  
corpus is not too large.

If your suggestion text corpus is really large and you only want  
forward-only suggestions then you might want to consider a trie- 
pattern solution instead. These can be rather resource efficient, even 

when loaded to memory.

If you have a lot of user load on your search eninge then it might be 

interesting to use old user queries as the base of your suggestions  
and perhaps boost a bit on trends, i.e. the more people search for  
something the more it get boosted in the suggestions list.


  karl

On 8 Apr 2009, at 15:26, Matt Schraeder wrote:

> I want to add a suggestive search similar to google's to
autocomplete
> search phrases as the user types.  It doesn't have to be very  
> elaborate
> and for the most part will just involve searching single fields. 
How
> can I perform a search  to be able to fill in autocomplete text?
>
> For instance, if I start typing "Harr" it should bring up "Harry
> Potter" "Harry Houdini" and "Harry S. Truman"
>
> I have tried doing search queries for "Harr*" but it's still doing
> term-based searching rather than searching a full field.  To make a
> field both searchable as the full field as well as tokenized, would
I
> have to duplicate the field and make one a keyword field? Is there a
> more convenient way to do this? I have also considered making a
second
> index for suggestive search, which would only have the fields that I
> want to enable suggestive search on, but this seems like it would be
> unneccesary duplication of data as well, though it would probably
make
> suggestive search faster due to a smaller index.
>
> Ideally it would also be nice to be able to rank these terms based
on
> the number of times they have been searched for so that the results 

> are
> tailored more to our users rather than simply just the score that  
> Lucene
> chooses.






Re: Suggestive Search

2009-04-08 Thread Gary Moore
I use TermEnum for this sort of "browsing" on untokenized, unstored
fields, e.g. TermEnum terms = reader.terms(new Term("mybrowsefld", "harr")).

-Gary
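A minimal sketch of that approach, assuming the phrases were indexed lowercased and untokenized in a field such as "mybrowsefld":

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;

public class PrefixSuggester {
  // Walks the term dictionary of an untokenized field, collecting up to max
  // phrases that start with the (lowercased) prefix. The field name is a placeholder.
  public static List<String> suggest(IndexReader reader, String prefix, int max) throws IOException {
    List<String> suggestions = new ArrayList<String>();
    String p = prefix.toLowerCase();
    TermEnum terms = reader.terms(new Term("mybrowsefld", p));
    try {
      do {
        Term t = terms.term();
        if (t == null || !t.field().equals("mybrowsefld") || !t.text().startsWith(p)) {
          break;                      // ran past the prefix (or past the field)
        }
        suggestions.add(t.text());
      } while (terms.next() && suggestions.size() < max);
    } finally {
      terms.close();
    }
    return suggestions;
  }
}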
Matt Schraeder wrote:

Corerct me if I'm wrong, but I don't think n-grams is really what I'm
looking for here.  I'm not looking for a spellchecker or phrase checker
style suggestive search, but only based on the exact phrases the user is
currently typing.  Since Lucene uses term-based searching, I'm not sure
how to have it search on portions of a full phrase.  Using a standard
lucene search typing in "harr" will result in searching for "harr" as a
term, which will not find "Harry Potter".  Using ngrams it would find
"Harry" as a term, but not at the beginning of an entire phrase.  This
would bring back "My Dog Harry" as a result, which isn't what I'm
looking for. I just want phrases from fields beginning with "Harr"
only.
 
I could easily do this all with our database server by simply doing a

query for "where searchqueries like 'harr%'" but we're trying to limit
our hits to the database to keep speed up on the site.


  






RE: Lucene 1.4.3: Error when creating Searcher

2009-04-08 Thread Zhang, Lisheng
Hi,

Sorry that my initial message was not clear. I have read the Lucene source
code (both 1.4.3 and 2.4.0) and understand more now.

The problem is that when using lucene 1.4.3 sometimes when searching, we got
the error:

Lock obtain timed out: 
Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock

It seems that in 2.4.0 we will never have this issue, because this error
can only happen with concurrent writing.

Is this true?

Thanks very much for helps, Lisheng


>  -Original Message-
> From: Zhang, Lisheng  
> Sent: Wednesday, April 08, 2009 9:08 AM
> To:   'java-user@lucene.apache.org'
> Subject:  Lucene 1.4.3: Error when creating Searcher
> 
> Hi,
> 
> We are using lucene 1.4.3, sometimes when two threads try to search,
> one thread got error when creating MultiSearcher:
> 
> Lock obtain timed out: 
> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
> 
> I read lucene FAQ and searched previous discussions, it seems that this
> error should be related to indexing, but we are only creating Searcher? 
> 
> Thanks very much for helps, Lisheng
> 




Re: Lucene 1.4.3: Error when creating Searcher

2009-04-08 Thread Michael McCandless
Likely your exception happened because a reader was trying to open
just as a writer was committing, twice in a row.

Do you commit (flush or close) frequently from your writer?

As of 2.1, Lucene no longer uses a commit lock -- commits are now
lockless, so you won't hit this on upgrading to 2.4.

Mike

On Wed, Apr 8, 2009 at 3:40 PM, Zhang, Lisheng
 wrote:
> Hi,
>
> Sorry that my initial message is not clear, I read lucene source code (both 
> 1.4.3
> and 2.4.0), and understood more.
>
> The problem is that when using lucene 1.4.3 sometimes when searching, we got
> the error:
>
> Lock obtain timed out: 
> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>
> It seems that in 2.4.0 we will never have this issue because this error can 
> only
> happen when concurrent writing.
>
> Is this true?
>
> Thanks very much for helps, Lisheng
>
>
>>  -Original Message-
>> From:         Zhang, Lisheng
>> Sent: Wednesday, April 08, 2009 9:08 AM
>> To:   'java-user@lucene.apache.org'
>> Subject:      Lucene 1.4.3: Error when creating Searcher
>>
>> Hi,
>>
>> We are using lucene 1.4.3, sometimes when two threads try to search,
>> one thread got error when creating MultiSearcher:
>>
>> Lock obtain timed out: 
>> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>>
>> I read lucene FAQ and searched previous discussions, it seems that this
>> error should be related to indexing, but we are only creating Searcher?
>>
>> Thanks very much for helps, Lisheng
>>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>




RE: Lucene 1.4.3: Error when creating Searcher

2009-04-08 Thread Zhang, Lisheng
Hi,

The client said they did not index; all they do is search (create
Searcher objects). I looked at 1.4.3 and think this issue can
happen in:

private static IndexReader open(final Directory directory, final boolean closeDirectory)
  synchronized(directory) {...}

if calls are coming from different Java processes (in our case 
a few AppServer clusters)?

Thanks very much for helps, Lisheng

-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Wednesday, April 08, 2009 1:00 PM
To: java-user@lucene.apache.org
Subject: Re: Lucene 1.4.3: Error when creating Searcher


Likely your exception happened because a reader was trying to open
just as a writer was committing, twice in a row.

Do you commit (flush or close) frequently from your writer?

As of 2.1, Lucene no longer uses a commit locks -- commits are now
lockless, so you won't hit this on upgrading to 2.4.

Mike

On Wed, Apr 8, 2009 at 3:40 PM, Zhang, Lisheng
 wrote:
> Hi,
>
> Sorry that my initial message is not clear, I read lucene source code (both 
> 1.4.3
> and 2.4.0), and understood more.
>
> The problem is that when using lucene 1.4.3 sometimes when searching, we got
> the error:
>
> Lock obtain timed out: 
> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>
> It seems that in 2.4.0 we will never have this issue because this error can 
> only
> happen when concurrent writing.
>
> Is this true?
>
> Thanks very much for helps, Lisheng
>
>
>>  -Original Message-
>> From:         Zhang, Lisheng
>> Sent: Wednesday, April 08, 2009 9:08 AM
>> To:   'java-user@lucene.apache.org'
>> Subject:      Lucene 1.4.3: Error when creating Searcher
>>
>> Hi,
>>
>> We are using lucene 1.4.3, sometimes when two threads try to search,
>> one thread got error when creating MultiSearcher:
>>
>> Lock obtain timed out: 
>> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>>
>> I read lucene FAQ and searched previous discussions, it seems that this
>> error should be related to indexing, but we are only creating Searcher?
>>
>> Thanks very much for helps, Lisheng
>>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>




Re: Lucene 1.4.3: Error when creating Searcher

2009-04-08 Thread Michael McCandless
Ahh yes right.  If multiple IndexSearchers are trying to open at once,
they each try to acquire the commit lock and can thus starve one
another.

The simplest workaround is to just keep retrying opening the IndexSearcher.

Though if you accidentally get an orphan'd commit lock in the
directory (eg if the JRE was killed while IndexSearcher was trying to
open) then you'll have to remove that file.

Or upgrade to Lucene >= 2.1 with lockless commits.

Mike
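A crude sketch of that retry workaround against the 1.4.x API; the path, retry count and back-off are arbitrary placeholders:

import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;

public class SearcherOpener {
  // Keep retrying if another process currently holds the commit lock.
  public static IndexSearcher openWithRetry(String indexPath)
      throws IOException, InterruptedException {
    IOException last = null;
    for (int attempt = 0; attempt < 5; attempt++) {
      try {
        return new IndexSearcher(indexPath);
      } catch (IOException e) {   // "Lock obtain timed out" surfaces as an IOException
        last = e;
        Thread.sleep(250);        // back off briefly before trying again
      }
    }
    throw last;
  }
}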

On Wed, Apr 8, 2009 at 4:47 PM, Zhang, Lisheng
 wrote:
> Hi,
>
> Client said they did not index, all they do is searching (create
> Searcher objects), I looked at 1.4.3 and think this issue can
> happen in:
>
> private static IndexReader open(final Directory directory, final boolean 
> closeDirectory)
>  syncronized(directory) {...}
>
> if calls are coming from different Java processes (in our case
> a few AppServer clusters)?
>
> Thanks very much for helps, Lisheng
>
> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Wednesday, April 08, 2009 1:00 PM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene 1.4.3: Error when creating Searcher
>
>
> Likely your exception happened because a reader was trying to open
> just as a writer was committing, twice in a row.
>
> Do you commit (flush or close) frequently from your writer?
>
> As of 2.1, Lucene no longer uses a commit locks -- commits are now
> lockless, so you won't hit this on upgrading to 2.4.
>
> Mike
>
> On Wed, Apr 8, 2009 at 3:40 PM, Zhang, Lisheng
>  wrote:
>> Hi,
>>
>> Sorry that my initial message is not clear, I read lucene source code (both 
>> 1.4.3
>> and 2.4.0), and understood more.
>>
>> The problem is that when using lucene 1.4.3 sometimes when searching, we got
>> the error:
>>
>> Lock obtain timed out: 
>> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>>
>> It seems that in 2.4.0 we will never have this issue because this error can 
>> only
>> happen when concurrent writing.
>>
>> Is this true?
>>
>> Thanks very much for helps, Lisheng
>>
>>
>>>  -Original Message-
>>> From:         Zhang, Lisheng
>>> Sent: Wednesday, April 08, 2009 9:08 AM
>>> To:   'java-user@lucene.apache.org'
>>> Subject:      Lucene 1.4.3: Error when creating Searcher
>>>
>>> Hi,
>>>
>>> We are using lucene 1.4.3, sometimes when two threads try to search,
>>> one thread got error when creating MultiSearcher:
>>>
>>> Lock obtain timed out: 
>>> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>>>
>>> I read lucene FAQ and searched previous discussions, it seems that this
>>> error should be related to indexing, but we are only creating Searcher?
>>>
>>> Thanks very much for helps, Lisheng
>>>
>>
>> -
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>




Lucene help with query

2009-04-08 Thread The Seer

Hello,

I have 5 lucene documents 

name: Apple
name: Apple martini
name: Apple drink
name: Apple sweet drink

I am using Lucene's default similarity and the standard analyzer.

When I search for apple I get all 4 documents back with the same score.
If I use Hits the score is 1.0; if I use a HitCollector it is some other
number.

Can someone explain why? 


I am generating my query using query parser
The field is ANALYZED 


Thanks

-- 
View this message in context: 
http://www.nabble.com/Lucene-help-with-query-tp22959498p22959498.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.





RE: Lucene 1.4.3: Error when creating Searcher

2009-04-08 Thread Zhang, Lisheng
Hi,

So it is always OK in 2.4 when multiple Java processes
try to create an IndexSearcher at the same time? Just
want to make sure.

I think upgrading should be the best option.

Thanks very much for helps, Lisheng

-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Wednesday, April 08, 2009 1:59 PM
To: java-user@lucene.apache.org
Subject: Re: Lucene 1.4.3: Error when creating Searcher


Ahh yes right.  If multiple IndexSearchers are trying to open at once,
they each try to acquire the commit lock and can thus starve one
another.

The simplest workaround is to just keep retrying opening the IndexSearcher.

Though if you accidentally get an orphan'd commit lock in the
directory (eg if the JRE was killed while IndexSearcher was trying to
open) then you'll have to remove that file.

Or upgrade to Lucene >= 2.1 with lockless commits.

Mike

On Wed, Apr 8, 2009 at 4:47 PM, Zhang, Lisheng
 wrote:
> Hi,
>
> Client said they did not index, all they do is searching (create
> Searcher objects), I looked at 1.4.3 and think this issue can
> happen in:
>
> private static IndexReader open(final Directory directory, final boolean 
> closeDirectory)
>  syncronized(directory) {...}
>
> if calls are coming from different Java processes (in our case
> a few AppServer clusters)?
>
> Thanks very much for helps, Lisheng
>
> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Wednesday, April 08, 2009 1:00 PM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene 1.4.3: Error when creating Searcher
>
>
> Likely your exception happened because a reader was trying to open
> just as a writer was committing, twice in a row.
>
> Do you commit (flush or close) frequently from your writer?
>
> As of 2.1, Lucene no longer uses a commit locks -- commits are now
> lockless, so you won't hit this on upgrading to 2.4.
>
> Mike
>
> On Wed, Apr 8, 2009 at 3:40 PM, Zhang, Lisheng
>  wrote:
>> Hi,
>>
>> Sorry that my initial message is not clear, I read lucene source code (both 
>> 1.4.3
>> and 2.4.0), and understood more.
>>
>> The problem is that when using lucene 1.4.3 sometimes when searching, we got
>> the error:
>>
>> Lock obtain timed out: 
>> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>>
>> It seems that in 2.4.0 we will never have this issue because this error can 
>> only
>> happen when concurrent writing.
>>
>> Is this true?
>>
>> Thanks very much for helps, Lisheng
>>
>>
>>>  -Original Message-
>>> From:         Zhang, Lisheng
>>> Sent: Wednesday, April 08, 2009 9:08 AM
>>> To:   'java-user@lucene.apache.org'
>>> Subject:      Lucene 1.4.3: Error when creating Searcher
>>>
>>> Hi,
>>>
>>> We are using lucene 1.4.3, sometimes when two threads try to search,
>>> one thread got error when creating MultiSearcher:
>>>
>>> Lock obtain timed out: 
>>> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>>>
>>> I read lucene FAQ and searched previous discussions, it seems that this
>>> error should be related to indexing, but we are only creating Searcher?
>>>
>>> Thanks very much for helps, Lisheng
>>>
>>
>> -
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>




Re: Lucene 1.4.3: Error when creating Searcher

2009-04-08 Thread Michael McCandless
Yes, no locking is done anymore (as of 2.1) when opening
IndexSearchers.  So, it's fine.

Though... if they are within a single JRE, it's best to open a single
IndexSearcher and share.

Mike

On Wed, Apr 8, 2009 at 5:14 PM, Zhang, Lisheng
 wrote:
> Hi,
>
> So it is always OK in 2.4 when multiple java processes
> try to create IndexerSearcher at the same time? Just
> want to make sure.
>
> I think upgrading should be the best option.
>
> Thanks very much for helps, Lisheng
>
> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Wednesday, April 08, 2009 1:59 PM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene 1.4.3: Error when creating Searcher
>
>
> Ahh yes right.  If multiple IndexSearchers are trying to open at once,
> they each try to acquire the commit lock and can thus starve one
> another.
>
> The simplest workaround is to just keep retrying opening the IndexSearcher.
>
> Though if you accidentally get an orphan'd commit lock in the
> directory (eg if the JRE was killed while IndexSearcher was trying to
> open) then you'll have to remove that file.
>
> Or upgrade to Lucene >= 2.1 with lockless commits.
>
> Mike
>
> On Wed, Apr 8, 2009 at 4:47 PM, Zhang, Lisheng
>  wrote:
>> Hi,
>>
>> Client said they did not index, all they do is searching (create
>> Searcher objects), I looked at 1.4.3 and think this issue can
>> happen in:
>>
>> private static IndexReader open(final Directory directory, final boolean 
>> closeDirectory)
>>  syncronized(directory) {...}
>>
>> if calls are coming from different Java processes (in our case
>> a few AppServer clusters)?
>>
>> Thanks very much for helps, Lisheng
>>
>> -Original Message-
>> From: Michael McCandless [mailto:luc...@mikemccandless.com]
>> Sent: Wednesday, April 08, 2009 1:00 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: Lucene 1.4.3: Error when creating Searcher
>>
>>
>> Likely your exception happened because a reader was trying to open
>> just as a writer was committing, twice in a row.
>>
>> Do you commit (flush or close) frequently from your writer?
>>
>> As of 2.1, Lucene no longer uses a commit locks -- commits are now
>> lockless, so you won't hit this on upgrading to 2.4.
>>
>> Mike
>>
>> On Wed, Apr 8, 2009 at 3:40 PM, Zhang, Lisheng
>>  wrote:
>>> Hi,
>>>
>>> Sorry that my initial message is not clear, I read lucene source code (both 
>>> 1.4.3
>>> and 2.4.0), and understood more.
>>>
>>> The problem is that when using lucene 1.4.3 sometimes when searching, we got
>>> the error:
>>>
>>> Lock obtain timed out: 
>>> Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock
>>>
>>> It seems that in 2.4.0 we will never have this issue because this error can 
>>> only
>>> happen when concurrent writing.
>>>
>>> Is this true?
>>>
>>> Thanks very much for helps, Lisheng
>>>
>>>
  -Original Message-
 From:         Zhang, Lisheng
 Sent: Wednesday, April 08, 2009 9:08 AM
 To:   'java-user@lucene.apache.org'
 Subject:      Lucene 1.4.3: Error when creating Searcher

 Hi,

 We are using lucene 1.4.3, sometimes when two threads try to search,
 one thread got error when creating MultiSearcher:

 Lock obtain timed out: 
 Lock@/tmp/lucene-ba94511756a2670adeac03a50532c63c-commit.lock

 I read lucene FAQ and searched previous discussions, it seems that this
 error should be related to indexing, but we are only creating Searcher?

 Thanks very much for helps, Lisheng

>>>
>>> -
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>>
>>
>> -
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>> -
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>




Re: Lucene help with query

2009-04-08 Thread John Seer

Any  ideas?



John Seer wrote:
> 
> Hello,
> 
> I have 5 lucene documents 
> 
> name: Apple
> name: Apple martini
> name: Apple drink
> name: Apple sweet drink
> 
> I am using lucene default similarity and standard analyzer .
> 
> When I am searching for apple I am getting all 4 documents with the same
> score back. If I use hits the score is 1.0 if I use hit collator is some
> number
> 
> Can someone explain why? 
> 
> 
> I am generating my query using query parser
> The field is ANALYZED 
> 
> 
> Thanks
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Lucene-help-with-query-tp22959498p22960441.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.





Re: Suggestive Search

2009-04-08 Thread Karl Wettin
If you use prefix grams only, then you'll get a forward-only suggestion
scheme. I've seen several implementations that use that, and it works
quite well.


harry potter: ^ha, ^har, ^harr, ^harry, ^harry p, ^harry po..
harry houdini: ^ha, ^har, ^harr, ^harry, ^harry h, ^harry ho..

I prefer the trie pattern though. Just remembered there is an old one
in LUCENE-625.


 karl
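A minimal sketch of building such a suggestion index with prefix terms, using the 2.4-era field API; the field names are placeholders, and the lookup would then be a plain TermQuery on the "prefix" field:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class SuggestionDoc {
  // One document per unique phrase; every prefix of the phrase is indexed as an
  // untokenized term, so "harr" matches "Harry Potter" but not "My Dog Harry".
  public static Document forPhrase(String phrase) {
    Document doc = new Document();
    doc.add(new Field("phrase", phrase, Field.Store.YES, Field.Index.NO));
    String lower = phrase.toLowerCase();
    for (int i = 1; i <= lower.length(); i++) {
      doc.add(new Field("prefix", lower.substring(0, i),
                        Field.Store.NO, Field.Index.NOT_ANALYZED));
    }
    return doc;
  }
}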

On 8 Apr 2009, at 20:50, Matt Schraeder wrote:


Corerct me if I'm wrong, but I don't think n-grams is really what I'm
looking for here.  I'm not looking for a spellchecker or phrase  
checker
style suggestive search, but only based on the exact phrases the  
user is
currently typing.  Since Lucene uses term-based searching, I'm not  
sure

how to have it search on portions of a full phrase.  Using a standard
lucene search typing in "harr" will result in searching for "harr"  
as a

term, which will not find "Harry Potter".  Using ngrams it would find
"Harry" as a term, but not at the beginning of an entire phrase.  This
would bring back "My Dog Harry" as a result, which isn't what I'm
looking for. I just want phrases from fields beginning with "Harr"
only.

I could easily do this all with our database server by simply doing a
query for "where searchqueries like 'harr%'" but we're trying to limit
our hits to the database to keep speed up on the site.


karl.wet...@gmail.com 4/8/2009 12:49:45 PM >>>


For this you probably want to use ngrams. Wether or not this is
something that fits in your current index is hard to say. My guess is

that you want to create a new index with one document per unique
phrase. You might also want to try to load this index in an
InstantiatedIndex, that could speed things up quite a bit if the
corpus is not too large.

If your suggestion text corpus is really large and you only want
forward-only suggestions then you might want to consider a trie-
pattern solution instead. These can be rather resource efficient, even

when loaded to memory.

If you have a lot of user load on your search eninge then it might be

interesting to use old user queries as the base of your suggestions
and perhaps boost a bit on trends, i.e. the more people search for
something the more it get boosted in the suggestions list.


 karl

On 8 Apr 2009, at 15:26, Matt Schraeder wrote:


I want to add a suggestive search similar to google's to

autocomplete

search phrases as the user types.  It doesn't have to be very
elaborate
and for the most part will just involve searching single fields.

How

can I perform a search  to be able to fill in autocomplete text?

For instance, if I start typing "Harr" it should bring up "Harry
Potter" "Harry Houdini" and "Harry S. Truman"

I have tried doing search queries for "Harr*" but it's still doing
term-based searching rather than searching a full field.  To make a
field both searchable as the full field as well as tokenized, would

I

have to duplicate the field and make one a keyword field? Is there a
more convenient way to do this? I have also considered making a

second

index for suggestive search, which would only have the fields that I
want to enable suggestive search on, but this seems like it would be
unneccesary duplication of data as well, though it would probably

make

suggestive search faster due to a smaller index.

Ideally it would also be nice to be able to rank these terms based

on

the number of times they have been searched for so that the results



are
tailored more to our users rather than simply just the score that
Lucene
chooses.






Wordnet indexing error

2009-04-08 Thread Sudarsan, Sithu D.
Hi All,

We're using Lucene 2.3.2 on Windows. When we try to generate an index for
WordNet 2.0 using the Syns2Index class, the following error is thrown
while indexing:

java.lang.NoSuchMethodError:
org.apache.lucene.document.Field.UnIndexed(Ljava/lang/String;Ljava/lang/String;)Lorg/apache/lucene/document/Field;

Our code looks like this:

String[] filelocations = {"path/to/prolog/file", "path/to/index"};
try {
  Syns2Index.main(filelocations);
} catch (Exception e) {
  e.printStackTrace();
}


The error typically happens at about line number 13 in the wn_s.pl
file.

No luck with WordNet 3.0 either; we get the same error.

Any fix or solutions? 

Thanks in advance,
Sithu D Sudarsan

sithu.sudar...@fda.hhs.gov
sdsudar...@ualr.edu



Re: Lucene help with query

2009-04-08 Thread Koji Sekiguchi

If you omit norms when indexing the name field, you'll get the same score back.

Koji
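One way to see what is going on is to ask Lucene for its score explanation; a rough sketch (field name and analyzer as in the original post):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

public class ExplainScores {
  // Prints Lucene's own breakdown (tf, idf, norms, ...) for each hit, which
  // shows where the identical scores for the "Apple ..." documents come from.
  public static void dump(IndexSearcher searcher, String userQuery) throws Exception {
    Query q = new QueryParser("name", new StandardAnalyzer()).parse(userQuery);
    TopDocs hits = searcher.search(q, null, 10);
    for (int i = 0; i < hits.scoreDocs.length; i++) {
      ScoreDoc sd = hits.scoreDocs[i];
      System.out.println(searcher.explain(q, sd.doc));
    }
  }
}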

The Seer wrote:

Hello,

I have 5 lucene documents 


name: Apple
name: Apple martini
name: Apple drink
name: Apple sweet drink

I am using lucene default similarity and standard analyzer .

When I am searching for apple I am getting all 4 documents with the same
score back. If I use hits the score is 1.0 if I use hit collator is some
number

Can someone explain why? 



I am generating my query using query parser
The field is ANALYZED 



Thanks

  






Re: How Can I make an analyzer that ignore the numbers o the texts ???

2009-04-08 Thread Koji Sekiguchi

Steven A Rowe wrote:

Hi Ariel,

As Koji mentioned, https://issues.apache.org/jira/browse/SOLR-448 contains a NumberFilter.  It 
filters out tokens that successfully parse as Doubles.  I'm not sure, since the examples you gave 
seem to use "," as the decimal character, how this interacts with the Locale.  (Koji, I 
don't see any ","-as-decimal tests in your patch.)

  

Right. It should be. Thanks!

Koji





Re: How can I change that lucene use by default the AND operator between terms ???

2009-04-08 Thread 王巍巍
Call the setDefaultOperator method of QueryParser.

2009/4/8 Uwe Schindler 

> The query parser has a option to change that. After creating the query
> parser, just set the corresponding option before parsing the query.
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: Ariel [mailto:isaacr...@gmail.com]
> > Sent: Wednesday, April 08, 2009 4:46 PM
> > To: lucene user
> > Subject: How can I change that lucene use by default the AND operator
> > between terms ???
> >
> > When I do a search using lucene internally lucene use by default the OR
> > operator between terms, How can I change that lucene use by default the
> > AND
> > operator between terms ???
> >
> > Regards
> > Ariel
>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


-- 
王巍巍(Weiwei Wang)
Department of Computer Science
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Mobile: 86-13913310569
MSN: ww.wang...@gmail.com
Homepage: http://cs.nju.edu.cn/rl/weiweiwang


Re: Query any data

2009-04-08 Thread 王巍巍
First change your QueryParser to accept wildcard queries by calling its
setAllowLeadingWildcard method; then you can query like this:  fieldname:*
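A small sketch of that suggestion; the field name is a placeholder, and note that a bare * wildcard has to enumerate every term in the field, which can be slow on a large index:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class AnyValueQuery {
  public static Query anyValue(String field) throws ParseException {
    QueryParser parser = new QueryParser(field, new StandardAnalyzer());
    parser.setAllowLeadingWildcard(true);  // required before a query may start with * or ?
    return parser.parse(field + ":*");
  }
}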

2009/4/9 Tim Williams 

> On Wed, Apr 8, 2009 at 11:45 AM, addman  wrote:
> >
> > Hi,
> >   Is it possible to create a query to search a field for any value?  I
> just
> > need to know if the optional field contain any data at all.
>
> google for:  lucene field existence
>
> There's no way built in, one strategy[1] is to have a 'meta field'
> that contains the names of the fields the document contains.
>
> --tim
>
> [1] -
> http://www.mail-archive.com/lucene-u...@jakarta.apache.org/msg07703.html
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


-- 
王巍巍(Weiwei Wang)
Department of Computer Science
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Mobile: 86-13913310569
MSN: ww.wang...@gmail.com
Homepage: http://cs.nju.edu.cn/rl/weiweiwang


Re: Suggestive Search

2009-04-08 Thread 王巍巍
I tested the Lucene spell checker and it doesn't support Chinese spell
checking; how can I achieve this the way Google does?

2009/4/9 Karl Wettin 

> If you use prefix grams only then you'll get a forward-only suggestion
> scheme. I've seen several implementation that use that and it works quite
> well.
>
> harry potter: ^ha, ^har, ^harr, ^harry, ^harry p, ^harry po..
> harry houdini: ^ha, ^har, ^harr, ^harry, ^harry h, ^harry ho..
>
> I prefere the trie-pattern though. Just rememberd there is an old one in
> LUCENE-625.
>
> karl
>
> On 8 Apr 2009, at 20:50, Matt Schraeder wrote:
>
>
>  Corerct me if I'm wrong, but I don't think n-grams is really what I'm
>> looking for here.  I'm not looking for a spellchecker or phrase checker
>> style suggestive search, but only based on the exact phrases the user is
>> currently typing.  Since Lucene uses term-based searching, I'm not sure
>> how to have it search on portions of a full phrase.  Using a standard
>> lucene search typing in "harr" will result in searching for "harr" as a
>> term, which will not find "Harry Potter".  Using ngrams it would find
>> "Harry" as a term, but not at the beginning of an entire phrase.  This
>> would bring back "My Dog Harry" as a result, which isn't what I'm
>> looking for. I just want phrases from fields beginning with "Harr"
>> only.
>>
>> I could easily do this all with our database server by simply doing a
>> query for "where searchqueries like 'harr%'" but we're trying to limit
>> our hits to the database to keep speed up on the site.
>>
>>  karl.wet...@gmail.com 4/8/2009 12:49:45 PM >>>
>

>> For this you probably want to use ngrams. Wether or not this is
>> something that fits in your current index is hard to say. My guess is
>>
>> that you want to create a new index with one document per unique
>> phrase. You might also want to try to load this index in an
>> InstantiatedIndex, that could speed things up quite a bit if the
>> corpus is not too large.
>>
>> If your suggestion text corpus is really large and you only want
>> forward-only suggestions then you might want to consider a trie-
>> pattern solution instead. These can be rather resource efficient, even
>>
>> when loaded to memory.
>>
>> If you have a lot of user load on your search eninge then it might be
>>
>> interesting to use old user queries as the base of your suggestions
>> and perhaps boost a bit on trends, i.e. the more people search for
>> something the more it get boosted in the suggestions list.
>>
>>
>> karl
>>
>> On 8 Apr 2009, at 15:26, Matt Schraeder wrote:
>>
>>  I want to add a suggestive search similar to google's to
>>>
>> autocomplete
>>
>>> search phrases as the user types.  It doesn't have to be very
>>> elaborate
>>> and for the most part will just involve searching single fields.
>>>
>> How
>>
>>> can I perform a search  to be able to fill in autocomplete text?
>>>
>>> For instance, if I start typing "Harr" it should bring up "Harry
>>> Potter" "Harry Houdini" and "Harry S. Truman"
>>>
>>> I have tried doing search queries for "Harr*" but it's still doing
>>> term-based searching rather than searching a full field.  To make a
>>> field both searchable as the full field as well as tokenized, would
>>>
>> I
>>
>>> have to duplicate the field and make one a keyword field? Is there a
>>> more convenient way to do this? I have also considered making a
>>>
>> second
>>
>>> index for suggestive search, which would only have the fields that I
>>> want to enable suggestive search on, but this seems like it would be
>>> unneccesary duplication of data as well, though it would probably
>>>
>> make
>>
>>> suggestive search faster due to a smaller index.
>>>
>>> Ideally it would also be nice to be able to rank these terms based
>>>
>> on
>>
>>> the number of times they have been searched for so that the results
>>>
>>
>>  are
>>> tailored more to our users rather than simply just the score that
>>> Lucene
>>> chooses.
>>>
>>
>>
>> -
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>>
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


-- 
王巍巍(Weiwei Wang)
Department of Computer Science
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Mobile: 86-13913310569
MSN: ww.wang...@gmail.com
Homepage: http://cs.nju.edu.cn/rl/weiweiwang


Re: Wordnet indexing error

2009-04-08 Thread Otis Gospodnetic

Hi,

The simplest thing to do is to grab the latest Lucene and the latest jar for 
that Wordnet (syns2index) code.  That should work for you (that UnIndexed 
method is an old method that doesn't exist any more).


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: "Sudarsan, Sithu D." 
> To: java-user@lucene.apache.org
> Sent: Wednesday, April 8, 2009 7:01:16 PM
> Subject: Wordnet indexing error
> 
> Hi All,
> 
> We're using Lucene 2.3.2 on Windows. When we try to generate index for
> WordNet2.0 using Syns2Index class, while indexing, the following error
> is thrown:
> 
> Java.lang.NoSuchMethodError:
> org.apache.lucene.document.Field.UnIndexed(Ljava/lang/String;Ljava/lang/
> String;)Lorg/apache/lucene/document/Field;
> 
> Our code is looks like this:
> 
> String[] filelocations = {"path/to/prolog/file", "path/to/index"};
> try{
>  Syns2Index.main(filelocations);
> } catch 
> 
> 
> The error typically happens at about line number 13 in the wn_s.pl
> file.
> 
> No luck with WordNet3.0 as well. We get the same error.
> 
> Any fix or solutions? 
> 
> Thanks in advance,
> Sithu D Sudarsan
> 
> sithu.sudar...@fda.hhs.gov
> sdsudar...@ualr.edu






Vector space implementation

2009-04-08 Thread Andy
Hello all,

I'm trying to implement a vector space model using Lucene. I need to have a
file (or an in-memory structure) with the TF/IDF weight of each term in each
document (in effect a matrix in which each document is a vector whose
elements are the term weights).

Please help me with this; contact me via andykan1...@yahoo.com if you need
any further info.
Many Many thanks
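A rough sketch of pulling raw term frequencies and document frequencies out of a 2.4 index, assuming the field was indexed with Field.TermVector.YES; the field name and the idf formula are only placeholders:

import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermFreqVector;

public class TfIdfVector {
  // Prints the tf*idf weight of each term in one document's "contents" field.
  public static void print(IndexReader reader, int docId) throws IOException {
    int numDocs = reader.numDocs();
    TermFreqVector tfv = reader.getTermFreqVector(docId, "contents");
    if (tfv == null) {
      return;                        // no term vector stored for this doc/field
    }
    String[] terms = tfv.getTerms();
    int[] freqs = tfv.getTermFrequencies();
    for (int i = 0; i < terms.length; i++) {
      int df = reader.docFreq(new Term("contents", terms[i]));
      double idf = Math.log((double) numDocs / (df + 1)) + 1.0;  // one common idf variant
      System.out.println(terms[i] + "\t" + (freqs[i] * idf));
    }
  }
}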