You could also phrase queries like "Economic Meltdown" AND "Asian
Countries", but the two phrases may occur too far apart in a matching
document to be relevant for your search.
To get better results with respect to position (the distance between the
phrases), you can use SpanNearQuery.
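For illustration, a minimal sketch of such a query (the field name "contents" and the slop value of 10 are assumptions, not something from your setup):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class SpanNearExample {
    // Build a query that matches documents where both terms occur
    // within `slop` positions of each other, in any order.
    public static SpanNearQuery buildQuery() {
        SpanQuery meltdown = new SpanTermQuery(new Term("contents", "meltdown"));
        SpanQuery asian    = new SpanTermQuery(new Term("contents", "asian"));
        // slop = 10 positions, inOrder = false (either order is fine)
        return new SpanNearQuery(new SpanQuery[] { meltdown, asian }, 10, false);
    }
}
```

A smaller slop tightens the proximity requirement; setting inOrder to true additionally requires the clauses to appear in the given order.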
Let me know if you need more
AUTOMATIC REPLY
LUX is closed until 5th January 2009
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
I am wondering whether there is an easy way to avoid duplication while
indexing, just using the index being created, without creating other data
structures.
In some cases, the incoming document list can have duplicates, for example
when creating spell-checking indexes for phrases. Each phrase is
Hi,
I am having a hard time indexing Arabic content and
searching it via Lucene. I have also used an Arabic Analyzer from
the Lucene package but had no luck. I have also used a snowball jar, but
it doesn't contain an Arabic stemmer. So I had put the Lucene Arabic
Stemmer in
Hi Girish,
Can you provide some sample code and info about what isn't working?
All you have said so far is that the Arabic Analyzer doesn't work for
you, but you have said nothing about how you are actually using it.
Are you getting exceptions? Do the tokens not look right? Are no
Sorry for that,
Here is how the Analyzer is Selected:
public static Analyzer getAnalyzerInstance(String localeKey) {
    Analyzer analyzer = null;
    if (localeKey == null || localeKey.trim().equals("")) {
        localeKey = AppContext.getSetting("defaultLocale");
        System.out.println("Locale key taken
That sounds pretty cool Karl, and I also dig your use of Motorhead as an
example : )
I recently built an application where payloads were a lifesaver, but my
usage of them is pretty basic. I am indexing pages of text, so I use
payloads to store metadata about each word on the page - size, color,
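As a rough sketch of that idea, per-word metadata like size and color can be packed into the byte array a payload carries. The field layout below (a 4-byte size followed by three color bytes) is purely illustrative, not the poster's actual format:

```java
import java.nio.ByteBuffer;

public class PayloadPacker {
    // Pack a font size and an RGB color into a 7-byte payload:
    // bytes [0-3] hold the size as an int, bytes [4-6] hold r, g, b.
    public static byte[] pack(int size, int r, int g, int b) {
        ByteBuffer buf = ByteBuffer.allocate(7);
        buf.putInt(size);
        buf.put((byte) r).put((byte) g).put((byte) b);
        return buf.array();
    }

    // Read the size back out of a packed payload.
    public static int unpackSize(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt();
    }
}
```

At query time the same layout would be used to decode whatever bytes the payload-aware query hands back.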
Hi Guys,
Can you please tell me where to get login details for Luke?
Thanks
Nagesh
Hi Karl,
I use payloads for weight only, too, with BoostingTermQuery (see:
http://www.nabble.com/BoostingTermQuery-scoring-td20323615.html#a20323615)
A custom tokenizer looks for the reserved character '\b' followed by a 2
byte 'boost' value. It then creates a special Token type for a custom
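A sketch of the decoding step that scheme implies is below. The exact layout is my assumption (the two characters after '\b' read as a big-endian 16-bit value); the class and method names are hypothetical:

```java
public class BoostDecoder {
    public static final int DEFAULT_BOOST = 1;

    // Return the term text with the '\b' marker and boost bytes stripped.
    public static String termText(String raw) {
        int i = raw.indexOf('\b');
        return i < 0 ? raw : raw.substring(0, i);
    }

    // Decode the two characters after '\b' as a big-endian 16-bit boost;
    // fall back to DEFAULT_BOOST if the marker is absent or truncated.
    public static int boost(String raw) {
        int i = raw.indexOf('\b');
        if (i < 0 || i + 2 >= raw.length()) return DEFAULT_BOOST;
        return ((raw.charAt(i + 1) & 0xFF) << 8) | (raw.charAt(i + 2) & 0xFF);
    }
}
```

A tokenizer along these lines would call termText() for the token text and feed the decoded boost into the payload it attaches.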
On Dec 29, 2008, at 9:59 AM, Girish Naik wrote:
FIELD_BODY is defined as
public static final String FIELD_BODY = AVS_FIELD_BODY;
and its indexed as
ParsedDoc webdoc = ParsedDoc.getDoc(page);
...
document.add(new Field(Constants.FIELD_BODY, webdoc.getContents(),
Field.Store.NO,
Ummm, I don't understand the question. You don't need to log in; Luke is a
stand-alone program for examining Lucene indexes. You *do* have to point
Luke at your index, there should be some choice about opening a file. I
don't
have Luke in front of me here at home, but poke around the menus and it
It is just File | Open Lucene Index
:)
- Original Message
From: Erick Erickson erickerick...@gmail.com
To: java-user@lucene.apache.org
Sent: Monday, December 29, 2008 11:05:01 AM
Subject: Re: Where to get login details for Luke
Ummm, I don't understand the question. You don't need to
Thanks Grant I will check this out.
BTW, as far as the Lucene version is concerned, I had checked out Lucene
from svn and created a build; its version says 2.9 :) . And Luke is
version 0.9.1.
Regards,
Please do not print this email
unless it is absolutely necessary.
Chris,
Mark Miller & Co. are working on (Near) Duplicate Detection. I think the work
is in Solr's JIRA, but some of it might be applicable to Lucene.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Chris Lu chris...@gmail.com
To:
On Dec 29, 2008, at 11:25 AM, Girish Naik wrote:
Thanks Grant I will check this out.
BTW, as far as Lucene version is concerned I had checked out the svn
of lucene and created a build its version says as 2.9 :) . And Luke
is of version 0.9.1
You will need to plug in your own Lucene
Otis, thanks for the pointer.
I think the question can be:
How to access TermEnum or TermInfos during indexing.
If this is possible, things would be easier.
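One possible sketch of that check, assuming you can afford to open (and periodically reopen) an IndexReader against the index being built; the field name "phrase" is an assumption, and note the reader only sees documents that have already been flushed/committed:

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;

public class DuplicateCheck {
    // A phrase term with a nonzero document frequency is already indexed.
    public static boolean alreadyIndexed(IndexReader reader, String phrase)
            throws IOException {
        return reader.docFreq(new Term("phrase", phrase)) > 0;
    }
}
```

Documents added since the reader was opened won't be visible, so this alone can't catch duplicates arriving within the same un-flushed batch.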
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo:
I use JDBM to store each document's key ID.
Hello,
Solr uses IndexCommit#getFileNames() to get a list of files for replication.
One Windows user reported an exception which looks like it may have been
caused by IndexCommit#getFileNames() returning duplicate file names. The
exception in his case was caused by _21e.tvx appearing more than once.
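While the underlying cause is investigated, one client-side workaround sketch is to collapse duplicates from the returned list before using it (the class name here is hypothetical):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class FileNameDedup {
    // Collapse duplicate entries while preserving the original order.
    public static List<String> dedupe(List<String> fileNames) {
        return new ArrayList<String>(new LinkedHashSet<String>(fileNames));
    }
}
```

This only masks the symptom; if getFileNames() really does return duplicates, that still looks like a bug worth filing.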
Hi
Thanks for your reply. It turns out you were correct and I was not
loading the correct document. User error!
Cheers
Amin
On 28 Dec 2008, at 19:50, Grant Ingersoll gsing...@apache.org wrote:
How do you know that document in question has an id of 1, as in when
you do: Document