Please don't hijack a thread; start a new topic. From Hossman:
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email. Even if you change the
What are you trying to do? I think you'd get a better response if you
explained what higher-level task/feature you're trying to
implement.
Best
Erick
On Mon, Jul 13, 2009 at 4:54 AM, liat oren oren.l...@gmail.com wrote:
Hi all,
I have a list of synonyms for every word.
Is there a good way to
It would be helpful if you told us what analyzers you're using and what your
search code looks like. Even better would be a small,
self-contained demonstration app showing the issue.
You could well be right that the text format is tripping up tokenizing,
but there are other issues. You may have to
It depends (tm). How much data are we talking about here? I dislike having to
have two data sources for a running app
just because it's more complicated, so my first try would
be to store all the data in the index and try it. A several-
gigabyte index is not a problem at all (depending upon
how you
WARNING: I haven't actually tried using RegexTermEnum in a
long time, but...
I *think* that the constructor positions you at the first term that
matches, without calling next(). At least there's nothing I saw
in the documentation that indicates you need to call next() before
calling term().
Hmmm. I'm having trouble understanding what you want
to accomplish and why you think storing a Java object
in a Lucene index is appropriate.
Perhaps you could expand on your use case here.
Best
Erick
On Fri, Jul 3, 2009 at 3:32 PM, MilleBii mille...@gmail.com wrote:
I want to store in
You have to tell us what analyzers you are using. Many analyzers
will throw out non-alphanumeric characters.
Even better, a small, self-contained test case illustrating your problem
would help us help you.
Best
Erick
On Fri, Jul 3, 2009 at 5:11 PM, shbn sharon.benkovi...@ewave.co.il wrote:
Hi,
in Ian's link, particularly see the section "Don't iterate over more hits
than necessary".
A couple of other things:
1 Loading the entire document just to get a field or two isn't
very efficient; think about lazy loading (see FieldSelector).
2 What do you mean when you say "not very good"? Using
can you please tune my code to work it faster and better
Are you willing to pay me to do your job for you? Sorry to be snarky, but
please be aware that we're volunteers here, it's
pretty presumptuous to ask for this.
You still haven't answered what it is you're trying to do. Why are
you
You probably need to make sure you understand analyzers before you think
about escaping/encoding. For instance, if you use
StandardAnalyzer when indexing, the text "Las Vegas-Food Dining Place"
would index the tokens
las
vegas
food
dining
place
nary a hyphen to be seen. If you used StandardAnalyzer
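Erick's token breakdown is easy to verify yourself. A rough sketch against the 2.4-era analysis API (the field name "f" is arbitrary, and the Lucene jar must be on the classpath):

```java
import java.io.StringReader;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class ShowTokens {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        // Dump the tokens the analyzer actually produces for the text.
        TokenStream ts = analyzer.tokenStream("f",
                new StringReader("Las Vegas-Food Dining Place"));
        Token t;
        while ((t = ts.next()) != null) {
            System.out.println(t.termText()); // las, vegas, food, dining, place
        }
    }
}
```

Running something like this against whatever analyzer you actually use answers most "why didn't my search match?" questions.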
This is really a permissions problem, which has been discussed
frequently. I think you'd get farther faster by searching the mail
archive (see this page, near the bottom:
http://lucene.apache.org/java/docs/mailinglists.html
and see if those
First, I highly, highly recommend you get a copy of Luke to examine your
index. It'll also help you understand the role of Analyzers.
Your first problem is that StandardAnalyzer probably removes
the open and close parens. See:
http://lucene.apache.org/java/2_4_1/api/index.html
so you can't search
Opening a searcher and doing the first query incurs a significant amount of
overhead, cache loading, etc. Inferring search times relative to index size
with a program like you describe is unreliable.
Try firing a few queries at the index without measuring, *then* measure the
time it takes for
Are you measuring search time *only* or are you measuring total response
time
including assembling whatever you assemble? If you're measuring total
response
time, everything from network latency to what you're doing with each hit may
affect response time.
This is especially true if you're
NOT isn't a boolean operator, which is a source of continuous confusion.
See:
http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#NOT
for a part of the explanation, and
http://wiki.apache.org/lucene-java/BooleanQuerySyntax
Best
Erick
On Tue, Jun 16, 2009 at 11:24 AM, Sumanta Bhowmik
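For reference, a query like "apple NOT pie" parses to roughly the following BooleanQuery; this is a sketch against the 2.4-era API with a made-up "body" field. The key point is that MUST_NOT never selects documents on its own, which is why a pure-NOT query matches nothing:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

public class NotExample {
    public static BooleanQuery buildQuery() {
        BooleanQuery q = new BooleanQuery();
        // Selects documents containing "apple"...
        q.add(new TermQuery(new Term("body", "apple")), BooleanClause.Occur.MUST);
        // ...then removes those that also contain "pie". A query made of
        // only MUST_NOT clauses has nothing to subtract from, so it
        // matches no documents at all.
        q.add(new TermQuery(new Term("body", "pie")), BooleanClause.Occur.MUST_NOT);
        return q;
    }
}
```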
Well, if you're seeing it, it's possible <G>
But the first question is always what were you measuring? Be aware
that when you open a searcher, the first few queries can fill caches, etc
and
may take an anomalously long time, especially if you're sorting. So could
you give more details of your
Why wouldn't two RangeQuerys work for this? Essentially something expressing
startdate:[0 TO systemtime] AND enddate:[systemtime TO infinity]?
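A sketch of what that might look like programmatically against the 2.4-era API, assuming the dates are indexed as sortable yyyyMMdd strings (an assumption; any lexicographically sortable encoding works):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.RangeQuery;

public class CurrentlyValid {
    // Match docs whose startdate <= now AND enddate >= now.
    // A null Term leaves that end of the range open-ended.
    public static BooleanQuery build(String now) {
        BooleanQuery q = new BooleanQuery();
        q.add(new RangeQuery(null, new Term("startdate", now), true),
              BooleanClause.Occur.MUST);
        q.add(new RangeQuery(new Term("enddate", now), null, true),
              BooleanClause.Occur.MUST);
        return q;
    }
}
```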
Best
Erick
On Fri, Jun 12, 2009 at 1:00 PM, Muhammad Momin Rashid mo...@abdere.com wrote:
Hello Everyone,
I need to filter records based on whether
Enumerating terms will be inefficient compared to getting the stored field. I'd
try storing the fields first until and unless you can demonstrate a problem.
BTW, if you're not going to *search* on the field, there's no reason to
index
it at all.
Why do you think you don't want to store the paths? How
to understand the queries and the content of the index.
Thanks (Erick Balasubramanian Sudaakeran)
Tom
--- On Sun, 5/31/09, Erick Erickson erickerick...@gmail.com
wrote:
From: Erick Erickson erickerick...@gmail.com
Subject: Re: Index and search terms containing character -
To: java-user
It's really unclear to me what
PhysicianFieldInfo.FIRST_NAME_EXACT.toString()
returns. I assume the intent is to return a field name, but
how that relates to
FIRST_NAME_EXACT(Field.Store.YES, Field.Index.UN_TOKENIZED)
doesn't mean anything to me. Could you provide some details?
Note that if you
The most common issue with this kind of thing is that UN_TOKENIZED implies no
case folding. So if your case differs you won't get a match.
That aside, the very first thing I'd do is get a copy of Luke (google Lucene
Luke)
and examine the index to see if what's in your index is what you *think* is
StandardAnalyzer is fine. I loaded your index into Luke and there is exactly
one document with philipcimiano in the name field.
There is only one document that has researcher in the name field.
Both of these documents (using StandardAnalyzer) return one
document (doc 12 for PHILIPCIMIANO and doc 4
analyzer won't be that difficult after
going thru your mail. I'll give it a try. I don't have any idea on filters
but I'm pretty sure it must be simple and will definitely go through the
examples
of LIA 2ndEdn. Thank you.
--KK
On Tue, May 26, 2009 at 6:55 PM, Erick Erickson erickerick
I don't think there's anything you can use out of the box, but if you
search for the mail thread (see searchable archives) for a thread
titled Hebrew and Hindi analyzers you might find something
useful.
Not much help I know, but perhaps a place to start.
And yes, you should use the same analyzer
I suspect that your boost values are too small to really influence the scores
very much. Have you tried using boost values of, say,
d:5^100 OR uid:10^10 OR lang:lisp ?
But if you have specific documents that you *know* you want in
specific places, why play around with boosting at all? You can use
Unless something about your problem space *requires* that you reopen the index,
you're better off just opening it once, writing all your documents to
it, then closing it. Although what you're doing will work, it's not very
efficient.
And the same thing is *especially* true of the searcher. There's
The Lucene In Action book (at least the first edition and, I presume, the
second)
has exactly this, called SynonymAnalyzer. The basic idea is that at index
time
you index your multiple terms with no increment between, so all your
synonyms
get indexed in the same position.
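The index-time trick is a TokenFilter that emits each synonym with a position increment of zero. A rough sketch against the 2.4-era Token API (the synonym table is an assumption; LIA's SynonymAnalyzer does this more carefully):

```java
import java.io.IOException;
import java.util.LinkedList;
import java.util.Map;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

// Stack synonyms at the same position as the original term by giving
// them a position increment of 0, so phrase queries match either form.
public class SimpleSynonymFilter extends TokenFilter {
    private final Map<String, String[]> synonyms; // your synonym table
    private final LinkedList<Token> pending = new LinkedList<Token>();

    public SimpleSynonymFilter(TokenStream in, Map<String, String[]> synonyms) {
        super(in);
        this.synonyms = synonyms;
    }

    public Token next() throws IOException {
        if (!pending.isEmpty()) return pending.removeFirst();
        Token t = input.next();
        if (t == null) return null;
        String[] syns = synonyms.get(t.termText());
        if (syns != null) {
            for (int i = 0; i < syns.length; i++) {
                Token syn = new Token(syns[i], t.startOffset(), t.endOffset());
                syn.setPositionIncrement(0); // same position as the original
                pending.addLast(syn);
            }
        }
        return t;
    }
}
```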
I highly recommend the
Well, you haven't really provided much in the way of details. For instance,
what does it mean that your Lucene index is
stored in a database? Did you store it as a BLOB? Your
problem statement is very hard to understand, please explain
in more detail. Pretend you don't know a thing about your
app
the fields I need.
Could you please give me an example of how I create the Filter that filters
out a given list of ids?
Thanks!
Liat
2009/5/18 Erick Erickson erickerick...@gmail.com
I'm still unclear what you want the statistics *for*. Statistics
are pretty meaningless as far as I understand
Have you looked at TopDocCollector? Basically, you can tell it to only return
you the top N docs by score (N is arbitrary).
What you then have is an array of raw score and doc ID pairs
AND a max score.
NOTE: raw score is not normalized, i.e. is not guaranteed to be
between 0 and 1.
So now you can
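A sketch of the TopDocCollector usage being described, against the 2.4-era API; dividing by maxScore is the do-it-yourself normalization if you want something that looks like a percentage:

```java
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocCollector;
import org.apache.lucene.search.TopDocs;

public class TopNExample {
    public static void topN(IndexSearcher searcher, Query query) throws Exception {
        // Collect only the top 10 hits by raw (unnormalized) score.
        TopDocCollector collector = new TopDocCollector(10);
        searcher.search(query, collector);
        TopDocs topDocs = collector.topDocs();
        float max = topDocs.getMaxScore();
        ScoreDoc[] hits = topDocs.scoreDocs;
        for (int i = 0; i < hits.length; i++) {
            // score/max gives a 0..1 figure relative to the best hit
            System.out.println(hits[i].doc + " " + (hits[i].score / max));
        }
    }
}
```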
-
From: Erick Erickson erickerick...@gmail.com
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: Re: relevance function for scores
Date: Mon, 18 May 2009 09:13:27 -0400
Have you looked at TopDocCollector? Basically, you can tell it to only
return
you the top N docs
a list of ids that the query should look at, which Filter should
I
use?
Thanks a lot,
Liat
2009/5/14 Erick Erickson erickerick...@gmail.com
Hmmm, come to think of it, if you pass the Filter to the search I *think*
you
don't get scores for that clause, but you may want to
check it out
issue?
On Thu, May 14, 2009 at 4:59 PM, Erick Erickson
erickerick...@gmail.com wrote:
I suspect that what's happening is that StandardAnalyzer is breaking
your stream up on the odd characters. All escaping them on the
query does is ensure that they're not interpreted by the parser
I don't know if I'm understanding what you want, but if you have a
pre-defined list of documents, couldn't you form a Filter? Then
your results would only be the documents you care about.
If this is irrelevant, perhaps you could explain a bit more about
the problem you're trying to solve.
Best
on these, but it will take the statistics of the
whole index, right?
2009/5/14 Erick Erickson erickerick...@gmail.com
I don't know if I'm understanding what you want, but if you have a
pre-defined list of documents, couldn't you form a Filter? Then
your results would only be the documents you care about
No. What is "correctly"? Are you stemming? In which case using the same
analyzer on different languages will not work.
This topic has been discussed on the user list frequently, so if you
searched
that archive (see: http://wiki.apache.org/lucene-java/MailingListArchives)
you'd find a wealth of
I suspect that what's happening is that StandardAnalyzer is breaking
your stream up on the odd characters. All escaping them on the
query does is ensure that they're not interpreted by the parser as (in
this case), the beginning of a group and a MUST operator. So, I
claim it correctly feeds
I'd recommend you get a copy of Luke and examine what's actually in
your index when anomalous things happen. In your first post you didn't
specify what analyzer you used, I suspect you weren't getting the tokens
broken up as you expected. Luke would have shown you.
But if you're satisfied
The class is contained in
org.apache.lucene.index.memory.AnalyzerUtil
which is located in the contrib area. Assuming you've installed 2.4,
try looking in your 2.4 installation
directory/contrib/memory/lucene-memory-2.4.0.jar
Best
Erick
2009/5/11 Kamal Najib kamal.na...@mytum.de
I don't understand your regex at all. Isn't it looking for "in" with any
*single* character in front and back? Given your example, I don't
see how you're getting anything back at all. Is this code you're
actually executing or just an example?
What does toString and/or Explain show? Think about
You haven't forced the double quotes through to the parser. Try
Query query = qp.parse("\"word1 word2\"");
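The point is purely about Java string escaping: the backslash-quote pairs embed literal double quotes inside the string, so the query parser sees a phrase rather than two separate terms. A tiny self-contained illustration:

```java
public class PhraseString {
    public static void main(String[] args) {
        // The Java string literal must contain literal double-quote
        // characters; otherwise the parser sees two separate terms,
        // not one phrase.
        String phrase = "\"word1 word2\"";
        System.out.println(phrase); // prints: "word1 word2"
    }
}
```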
On Thu, May 7, 2009 at 11:14 AM, Seid Mohammed seidy...@gmail.com wrote:
I have set the slop for my search to be some terms away for inclusion.
Unfortunately, the result is the same.
how much data are you talking about here? Could you use a KeywordAnalyzer
(perhaps in a duplicated field) with appropriate filtering (to lowercase,
remove
punctuation, etc)?
Best
Erick
On Wed, May 6, 2009 at 4:50 AM, Laura Hollink lau...@cs.vu.nl wrote:
Hi,
I am trying to distinguish between
Why are you using MultiPhraseQuery? It appears (warning,
I haven't really used it) to be designed to handle *phrases*.
Your problem statement isn't looking at phrases at all,
just single wildcarded terms. And you're supposed to
call the first MPQ.add with, say, the first word of the
*phrase*,
with '*' (e.g. * phrase *), so I tried
MultiPhraseQuery instead.
Forgive me if I am too newbie, 10 days ago I didn't know this tool
existed...
Erick Erickson wrote:
Why are you using MultiPhraseQuery? It appears (warning,
I haven't really used it) to be designed to handle *phrases*.
Your
RegexQuery that appears in the
API
documentation but doesn't exist in the lucene-core-2.4.1.jar? I think that
class would be very useful for my problem...
Thank you so much!!
Erick Erickson wrote:
the guys really helped me understand the issues with wildcards,
it's harder than you think <G>
Hmmm, tricky. Let's see if I understand your problem.
Basically, you have a bunch of HSTs that have had
some number of items arbitrarily assigned to them, and
you want to see if you can make Lucene behave as a kind
of expert system to help you classify the next item.
I *think* you'd get better
everyone's help
Christian
On Mon, May 4, 2009 at 11:40 AM, Erick Erickson erickerick...@gmail.com
wrote:
Hmmm, tricky. Let's see if I understand your problem.
Basically, you have a bunch of HSTs that have had
some number of items arbitrarily assigned to them, and
you want to see if you
This looks like a job for PerFieldAnalyzerWrapper, no
MultiFieldQueryParser required.
Best
Erick
On Fri, May 1, 2009 at 3:33 PM, theDude_2 aornst...@webmd.net wrote:
Hello fellow Lucene developers!
I have a bit of a question - and I can't find the answer in my lucene
book
Im
-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, May 01, 2009 11:42 PM
To: java-user@lucene.apache.org
Subject: Re: MultiFieldQueryParser - using a different analyzer per
field...
This looks like a job for PerFieldAnalyzerWrapper, no
MultiFieldQueryParser required.
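A sketch of the PerFieldAnalyzerWrapper setup against the 2.4-era API (the field names here are invented). The same wrapper should be handed to both the IndexWriter and the query parser:

```java
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class AnalyzerPerField {
    public static PerFieldAnalyzerWrapper build() {
        // StandardAnalyzer is the default for every field except the
        // ones explicitly registered below.
        PerFieldAnalyzerWrapper wrapper =
                new PerFieldAnalyzerWrapper(new StandardAnalyzer());
        wrapper.addAnalyzer("id", new KeywordAnalyzer());     // untokenized IDs
        wrapper.addAnalyzer("tags", new WhitespaceAnalyzer()); // split on spaces only
        return wrapper;
    }
}
```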
This is surprising behavior, which is another way of saying that,
given what you've said so far, this shouldn't be happening. I'd
really look at system metrics, like whether you're swapping
etc. In particular you might want to try varying how big you
allow your memory footprint to grow before you
Would a TopDocCollector work for you? You can get a TopDoc
object from that collector, from which you can get the max score.
That, along with the score provided for each doc should give you
a percentage.
Best
Erick
On Wed, Apr 29, 2009 at 5:30 AM, joseph.christopher jos...@kottsoftware.com
People (including me) use Lucene to page through results all the time,
so I'm pretty sure you're OK.
so here are my answers...
(1) yes.
(2) Well, the default sort is by score so if you want some other
ordering you have to sort.
(3) You can boost things at index time, but I don't think that's
of a difference when
paging through hits 1-10 vs. hits 300-310. They all seem to take about the
same time to evaluate. I'll try using one of the HitCollectors as you
suggest to see if it makes a difference.
regards,
--
Bill Chesky
-Original Message-
From: Erick Erickson
Well, you haven't shown us your program, so it's hard to tell <G>
But my first uninformed guess would be that the case of your search
doesn't exactly match the case you indexed when you add letters
to your IDs.
We need to see the search code particularly, including the
analyzers you use (a
Well, you can always implement your own HitCollector and just take
the end of the list.
But perhaps a fuller explanation of why you need to do this would
lead to a better answer
Best
Erick
On Sun, Apr 26, 2009 at 11:41 PM, samd sdoyl...@yahoo.com wrote:
I have 2500 documents and need to
about ranking pieces, it's about all no
matter
what the rank should be available.
Erick Erickson wrote:
Well, you can always implement your own HitCollector and just take
the end of the list.
But perhaps a fuller explanation of why you need to do this would
lead to a better answer
. From that I'm able to do this kind
of
research work.
Please help me in this.
Erick Erickson wrote:
OK, this is a much different problem than you were originally
asking about, effectively how to index/search mixed language
documents.
This topic has been discussed multiple times
specifikation - aftaleseddel nr. 12.]]/com:Note
I'm searching for a word like rådgiver. When I see the result, it is clearly
searching for r dgiver. It is omitting the Danish element.
Please help me in this.
Erick Erickson wrote:
Are you *also* using the DutchAnalyzer for your *query
*If* your terms are simple (that is, not wildcarded), you may get
some joy from TermEnum. The idea here would be to find the
longest term *already in your index* that satisfies your need and
use that to form a simple TermQuery
Essentially using TermEnum.skipTo on successively shorter
strings
to identify.
Please tell me how to use DutchAnalyzer in my application. A sample example or
series of steps would help me.
I also attached my index file(.java file).
Please help me in this. please..
Erick Erickson wrote:
Take a look at DutchAnalyzer. The problem you'll have is if you're
indexing
Take a look at DutchAnalyzer. The problem you'll have is if you're indexing
this document along with a bunch of documents from other languages.
You could search the mail archive for extensive discussions of indexing/
searching documents from several languages.
Best
Erick
On Tue, Apr 21, 2009 at
to correctly find this.
Thanks,
Billy
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, April 17, 2009 8:08 PM
To: java-user@lucene.apache.org
Subject: Re: IndexWriter update method
What you're missing is that the example has no unique ID, it wasn't
Lucene is an *engine*, not an application. *You* have to process the
XML, decide what the structure of your index is and index the data. There
are many
XML parser options, this is just straight Java code. You'll decide
what's relevant, and add the contents of the relevant elements to a Lucene
2009/4/16 Erick Erickson erickerick...@gmail.com
Hmmm, try query.toString() and/or query.explain().
Also, try using Luke to see what is actually in the document.
BTW, what analyzer did you use in Luke? Luke also has an
explain (tab?) that will show you what Luke does, which may
What you're missing is that the example has no unique ID, it wasn't created
with update in mind.
There's no hidden magic for Lucene knowing *what* document you want
to have updated, you have to provide it yourself, and it should be unique.
Imagine a parts catalog, or an index of a directory
Hmmm, try query.toString() and/or query.explain().
Also, try using Luke to see what is actually in the document.
BTW, what analyzer did you use in Luke? Luke also has an
explain (tab?) that will show you what Luke does, which may
be useful.
The default operator should be OR, but looking at the
Well, under the covers, the old Hits object *was* reloading the first N
pages to
get page N + 1, you just didn't see it. Hits also had other, undesirable
behaviors.
But loading docs N-1 times is not as expensive as you perhaps fear.
To get a sorted list, you must sort the entire set of
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email. Even if you change the
subject line of
Wildcard queries are not lowercased, so depending upon
how you're indexing, that may be tripping you up.
See
http://wiki.apache.org/lucene-java/LuceneFAQ#head-133cf44dd3dff3680c96c1316a663e881eeac35a
Best
Erick
On Fri, Apr 10, 2009 at 2:56 PM, John Seer pulsph...@yahoo.com wrote:
Hello,
I
That'll teach me to scan a post. The link I sent you
is still relevant, but wildcards are NOT intended to be used to
concatenate terms. You want a phrase query or a span query
for that, i.e. "A C F"~# where # is the slop, that is, the number
of other terms allowed to appear between your desired
...@thetaphi.de
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Saturday, April 11, 2009 6:42 PM
To: java-user@lucene.apache.org
Subject: Re: RangeFilter performance problem using MultiReader
OK, I scanned all the e-mails in this thread so I may
searching for fieldname:* will be *extremely* expensive as it will, by
default,
build a giant OR clause consisting of every term in the field. You'll throw
MaxClauses exceptions right and left. I'd follow Tim's thread lead first
Best
Erick
2009/4/8 王巍巍 ww.wang...@gmail.com
first you should
Do you want the dates to *influence* or *determine* the order? I
don't have much help if what you're after is something like docs
that are more recent tend to rank higher, although I vaguely
remember this question coming up on the user list, maybe a
search of the archive would turn something
properly so that search becomes better.
Regards,
Allahbaksh
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Monday, April 06, 2009 9:31 PM
To: java-user@lucene.apache.org
Subject: Re: Multiple Analyzer on Single field
This really doesn't make sense
the documents that have the exact phrase "the
bank of america".
Could you help me please ???
Regards
Ariel
On Mon, Apr 6, 2009 at 5:26 PM, Erick Erickson erickerick...@gmail.com
wrote:
If you have luke, you should be able to submit your query and use
the explain functionality to gain some insights
We really need some more data. First, I *strongly* recommend you
get a copy of Luke and examine your index to see what is
*actually* there. Google lucene luke. That often answers
many questions.
Second, query.toString is your friend. For instance, if the query
you provided below is all that
fine.
The field where I am searching is the content field.
I am using the same analyzer in query and indexing time: SnowBall English
Analyzer.
I am going to submit later the snippet code.
Regards
Ariel
On Mon, Apr 6, 2009 at 4:37 PM, Erick Erickson erickerick...@gmail.com
wrote:
We
How much memory are you allocating for the JVM? And what are your
various indexwriter settings (e.g. MaxBufferedDocs, MaxMergeDocs, etc).
Have you tried different settings in setRamBufferSizeMB?
Best
Erick
On Fri, Apr 3, 2009 at 7:13 AM, John Byrne john.by...@propylon.com wrote:
Hi, I'm
only
happened in a production environment that I can't mess with. I am planning
to try reproducing it locally soon, but it takes quite a while before it
happens.
-John
Erick Erickson wrote:
How much memory are you allocating for the JVM? And what are your
various indexwriter settings (e.g
.
Thanks for the ideas anyway - I know I really need to come up with some
more info on the problem, so I think the next thing I'll do is try to
reproduce it locally.
-John
Erick Erickson wrote:
Hmmm, that's odd. How many is "a large number of documents"? And
what is your index size when
: Erick Erickson erickerick...@gmail.com
To: java-user@lucene.apache.org
Sent: Wednesday, April 1, 2009 6:51:13 PM
Subject: Re: Search using MultiSearcher generates OOM on a 1GB total
Partitioned indices
Think about putting this query in Luke and doing an explain for details,
but
I'm
default Max Clause is
1024, is there any reason behind this max?
Thanks,
M
From: Erick Erickson erickerick...@gmail.com
To: java-user@lucene.apache.org
Sent: Thursday, April 2, 2009 2:34:47 PM
Subject: Re: Search using MultiSearcher generates OOM on a 1GB
This seems really odd, especially with an index that size. The
first question is usually Do you open an IndexReader for
each query? If you do, be aware that opening a reader/searcher
is expensive, and the first few queries through the system are
slow as the caches are built up.
The second
Think about putting this query in Luke and doing an explain for details,
but
I'm surprised this is working at all without throwing TooManyClauses errors.
Under the covers, Lucene expands your wildcards to all terms in the field
that match. For instance, assume your document field has the
What kind of failures do you get? And I'm confused by the code. Are
you creating a new IndexWriter every time? Do you ever close it?
It'd help to see the surrounding code...
Best
Erick
On Sat, Mar 28, 2009 at 1:36 PM, Raymond Balmès raymond.bal...@gmail.com wrote:
Hi guys,
I'm using a
Yes, updating a document in Lucene is expensive for two
reasons:
1 deleting and adding a document does mean there's internal
work being done. But it's not all *that* expensive. So this really
comes down to how many records you expect to update
every 15 minutes. You've gotta try it.
2
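Since Lucene 2.1, the delete-then-add dance described above is wrapped up in IndexWriter.updateDocument. A sketch, assuming a unique "id" field (the field name is an assumption):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class UpdateById {
    // updateDocument is delete-then-add under the covers; the Term must
    // identify exactly one document, or all matching docs are replaced.
    public static void update(IndexWriter writer, String id, Document doc)
            throws Exception {
        writer.updateDocument(new Term("id", id), doc);
    }
}
```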
You've got a great grasp of the issues, comments below. But before you
do, a lot of this kind of thing is incorporated in SOLR, which is built on
Lucene. Particularly updating an index then using it.
So you might take a look over there. It even has a DataImportHandler...
WARNING: I've only been
What does the front end look like? Is it a web page or a custom app? And
do you expect your users to actually enter the field name? I'd be reluctant
to allow any but the geekiest of users to enter the Lucene syntax (i.e. the
field
names). Users shouldn't know anything about the underlying
Could you provide more information about what you expect and what
you are seeing? As well as an example of what you've tried? Just
saying "it didn't work" doesn't give us much to go on.
Best
Erick
On Wed, Mar 25, 2009 at 5:02 AM, m.harig m.ha...@gmail.com wrote:
Hello all
Can anyone
Try searching the mail archives, the searchable archive is linked to
off the Wiki. This topic has been discussed multiple times but I forget
the solutions...
Best
Erick
On Sun, Mar 22, 2009 at 4:30 PM, Paul Libbrecht p...@activemath.org wrote:
Hello list,
in an auto-completion task, I would
This might help you understand Lucene scoring better...
http://lucene.apache.org/java/2_4_1/scoring.html
The number of occurrences is not the sole determinant of a
document's score and boosting won't change that.
But I have to ask why counting words is important to you. What problem
are you
What's the query? Wildcard or did you just construct a huge
number of clauses?
You can always bump the allowed, see BooleanQuery.setMaxClauseCount()
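Bumping the limit is one line; a sketch (the 4096 value is arbitrary, and note the 1024 default is a safety valve against runaway wildcard expansions, not a hard technical limit):

```java
import org.apache.lucene.search.BooleanQuery;

public class RaiseClauseLimit {
    public static void main(String[] args) {
        // Static setting: affects all BooleanQuerys in the JVM. Raising it
        // trades TooManyClauses exceptions for more memory and slower queries.
        BooleanQuery.setMaxClauseCount(4096);
        System.out.println(BooleanQuery.getMaxClauseCount());
    }
}
```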
Best
Erick
On Mon, Mar 16, 2009 at 6:52 AM, liat oren oren.l...@gmail.com wrote:
Hi,
I try to search a long query and get the following error:
Have you tried working through the getting started guide at
http://lucene.apache.org/java/2_4_1/gettingstarted.html? That
should give you a good idea of how to create a document in Lucene.
Best
Erick
On Sun, Mar 15, 2009 at 8:49 AM, Seid Mohammed seidy...@gmail.com wrote:
that is exactly my
You could do something with FieldSortedHitQueue as a post-search
sort, but I wonder if this would work for you...
public TopFieldDocs search(Query
(see http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/search/TopFieldDocs.html)
it can be applied to the search method exposed by
MultiSearcher. Would it be possible to clarify a bit more or even point to
some reference documentation?
Cheers
Amin
On Sun, Mar 15, 2009 at 1:08 PM, Erick Erickson erickerick...@gmail.com
wrote:
You could do something with FieldSortedHitQueue
to parse the query and take
out the fields and assign their specific analyzer to them.
Rokham
Erick Erickson wrote:
PerFieldAnalyzerWrapper is your friend, assuming that you have separate
fields, some tokenized and some not. If you *don't* have separate
fields, then we need more details
- From: Erick Erickson
erickerick...@gmail.com
To: java-user@lucene.apache.org
Sent: Friday, March 06, 2009 6:47 PM
Subject: Re: Questions about analyzer
See below
On Fri, Mar 6, 2009 at 1:44 AM, Ganesh emailg...@yahoo.co.in wrote:
Hello all
1)
Which is best to use Snowball analyzer
Sure there are other options. You could decide to index in chunks
rather than entire documents. You could decide many things.
None of which we can recommend unless we have a clue what
you're really trying to accomplish or whether you're encountering
a specific problem.
I can say that we've
You have my sympathy. Let's see, you're being told we can't give
you the tools you need to diagnose/fix the problem, but fix it anyway.
Probably with the addendum And fix it by Friday.
You might want to consider staging a mutiny until the powers that be
can give you a solution. Perhaps working
to figure out as to whether Lucene is suited for this kind of application.
Once again thanks for all the inputs.
On Fri, Mar 6, 2009 at 7:15 PM, Erick Erickson erickerick...@gmail.com
wrote:
Whatever you do will be wrong <G>. What you're saying is
that you have structured data that the user wants