On Sunday 11 July 2004 10:03, Doug Cutting wrote:
Doug Cutting wrote:
The calls would look like:
new Field(name, value, Stored.YES, Indexed.NO, Tokenized.YES);
.
Actually, while we're at it, Indexed and Tokenized are confounded. A
single entry would be better, something like:
...
then
On Monday 19 April 2004 14:01, Mario Ivankovits wrote:
Stephane James Vaucher wrote:
Anyone try what Joerg suggested here?
http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]
pache.orgmsgNo=6231
Don't know what you would like to do, but if you simply would like to
extract text, you could
On Tuesday 13 April 2004 15:31, Holger Klawitter wrote:
Hi Erik,
What is wrong with simply creating a new token that replaces an
incoming one for synonyms?
I'm just playing devil's advocate here since you can already get
the termText() through the public _method_.
Well, you're
On Saturday 03 April 2004 08:34, [EMAIL PROTECTED] wrote:
On Saturday 03 April 2004 17:11, Erik Hatcher wrote:
No objections that error messages and such could be made clearer.
Patches welcome! Care to submit better error message handling in this
case? Or perhaps allow lower-case to?
I
On Friday 02 April 2004 08:12, Phil brunet wrote:
Hi all.
I'm migrating a part of an application from Oracle intermedia to Lucene
(1.3) to perform full text searches.
Congratulations! :-)
I'd like to know if there is a way to perform exact queries. By exact
query, I mean being able to
On Monday 08 March 2004 12:34, Erik Hatcher wrote:
In the RealWorld... many applications actually just re-run a search and
jump to the appropriate page within the hits; searching is generally
plenty fast enough to alleviate concerns of caching.
However, if you need to cache Hits, you need
On Wednesday 21 January 2004 08:38, Doug Cutting wrote:
Francesco Bellomi wrote:
I agree that synchronization in Vector is a waste of time if it isn't
required,
It would be interesting to see if such synchronization actually impairs
overall performance significantly. This would be fairly
On Wednesday 07 January 2004 20:48, Dror Matalon wrote:
On Wed, Jan 07, 2004 at 07:24:22PM -0700, Scott Smith wrote:
...
Thanks for the suggestions. I wonder how much faster I can go if I
implement some of those?
25 msecs to insert a document is on the high side, but it depends of
course
On Tuesday 16 December 2003 03:37, Hohwiller, Joerg wrote:
Hi there,
I have not yet got any response about my problem.
While debugging into the depth of lucene (really hard to read deep inside) I
discovered that it is possible to disable the Locks using a System
property.
...
Am I safe
On Friday 05 December 2003 10:45, Doug Cutting wrote:
Tatu Saloranta wrote:
Also, shouldn't there be at least 3 methods that take Readers; one for
Text-like handling, another for UnStored, and last for UnIndexed.
How do you store the contents of a Reader? You'd have to double-buffer
On Tuesday 02 December 2003 09:51, Tun Lin wrote:
Anyone knows a search engine that supports xml formats?
There's no way to generally support xml formats, as xml is just a
meta-language. However, building specific search engines using Lucene core it
should be reasonably straight-forward to
On Monday 01 December 2003 15:13, Dion Almaer wrote:
...
Interesting. I implemented an approach which boosted based on the number
of months in the past, and after tweaking the boost amounts, it seems to do
the job. I do a fresh reindex every night (since the indexing process takes
no time at
On Monday 17 November 2003 07:40, Chong, Herb wrote:
i don't know what the Java implementation is like but the C++ one is very
fast.
...
I personally do not have any experience with the BreakIterator in Java. Has
anyone used it in any production environment? I'd be very interested to
learn
On Monday 17 November 2003 08:39, Chong, Herb wrote:
the core of the search engine has to have certain capabilities, however,
because they are next to impossible to add as a layer on top with any
efficiency. detecting sentence boundaries outside the core search engine is
really hard to do
On Tuesday 21 October 2003 17:31, Otis Gospodnetic wrote:
It does seem handy to avoid exact phrase matches on "phone boy" when a
stop word is removed though, so patching StopFilter to put in the
missing positions seems reasonable to me currently. Any objections
to that?
So phone boy
On Monday 20 October 2003 16:41, Erik Hatcher wrote:
One more thought related to this subject - once a nice scheme for
representing hierarchies within a Lucene index emerges, having XPath as
a query language would rock! Has anyone implemented O/R or XPath-like
query expressions on top of
On Monday 06 October 2003 08:35, Lars Hammer wrote:
...
to iterate the Hits. I thought that Hits was an array of pointers to docs,
^^^
Actually, Hits contains a Vector (could be an array as well), but is not a
Collection itself
On Thursday 18 September 2003 14:50, Michael Giles wrote:
I know, I know, the HTML Parser in the demo is just that (i.e. a demo), but
I also know that it is updated from time to time and performs much better
than the other ones that I have tested. Frustratingly, the very first page
I tried to
On Wednesday 17 September 2003 07:07, Erik Hatcher wrote:
On Wednesday, September 17, 2003, at 08:43 AM, Killeen, Tom wrote:
I would suggest XML as well.
Again, I'd like to hear more about how you'd do this generically. Tell
me what the field names and values would correspond to when
On Friday 29 August 2003 10:02, Terry Steichen wrote:
I agree. One problem, however, that new (and not-so-new) Lucene users face
is a learning curve when they want to get past the simplest and most
obvious uses of Lucene. For example, I don't think any of the docs mention
the fact that you
On Monday 11 August 2003 01:07, Kevin A. Burton wrote:
Why was an int chosen to represent document handles? Is there a reason
for this? Why wasn't a long chosen to represent document handles? 64
bits seems like the obvious choice here except for a potentially bloated
datastore (32 extra
On Thursday 17 July 2003 07:20, greg wrote:
I have several document sections that are being indexed via the
StandardAnalyzer. One of these documents has the line "access, the
manager". When searching for the phrase "access manager", this document is
being returned. I understand why (at least i
On Monday 14 July 2003 08:52, Guilherme Barile wrote:
Hi
I'm writing a web application which will index files using
textmining to extract text and lucene to store it. I do have the
following implementation questions:
1) Only one user can write to an index at each time. How are you people
On Wednesday 25 June 2003 09:47, Ulrich Mayring wrote:
John Takacs wrote:
I'd love to try Lucene with the above, but the Lucene install fails
because of JavaCC issues. Surprised more people haven't encountered this
problem, as the install instructions are out of date.
Well, what do you
On Tuesday 17 June 2003 05:43, Kevin L. Cobb wrote:
I have an index that has three fields in it. When I do a search using
MultiFieldQueryParser, the search applies the same importance (weight)
to each of the fields. BUT, what if I want to apply a different weight
to each field, i.e. I want to
On Friday 30 May 2003 09:55, Leo Galambos wrote:
Ah, I got it. THX. In the good old days, the wildcards were used as a
fix for missing stemming module. I am not sure if you can combine these
two opposite approaches successfully. I see the following drawbacks of
your solution.
Example:
On Wednesday 28 May 2003 05:43, David Medinets wrote:
- Original Message -
From: Andrei Melis [EMAIL PROTECTED]
As far as I have understood, lucene does not allow search queries
starting with wildcards. I have a file database indexed by content
and also by filename. It would be
On Friday 04 April 2003 05:24, Rob Outar wrote:
Hi all,
Sorry for the flood of questions this week, clients finally started using
the search engine I wrote which uses Lucene. When I first started
Yup... that's the root of all evil. :-)
(I'm in similar situation, going through user
On Friday 28 March 2003 08:37, [EMAIL PROTECTED] wrote:
Ok, thanks Otis,
you have to write the terms lowercase when you're searching with wildcards.
Or use the set method in QueryParser to ask it to automatically lower case
those terms. Patch for that was added before 1.3RC1 (check javadocs or
On Friday 28 March 2003 15:48, Shah, Vineel wrote:
One of my clients is asking for an old-style boolean query search on my
keywords fields. A string might look like this:
oracle admin* and java and oracle and (8.1.6 or 8.1.7) and
(solaris or unix or linux)
There would probably be need
On Monday 24 March 2003 18:03, Michael Wechner wrote:
John Bresnik wrote:
anyone know of a quick and easy way to get this demo
[org.apache.lucene.demo.IndexHTML] to parse JSP files as well? I used a
crawler to create a local [static] version of the site [i.e. they are no
longer JSP files
On Friday 21 March 2003 03:55, Pierre Lacchini wrote:
Heya,
as u can see, I want to create my own french Analyzer, using the snowball's
FrenchStemmer...
But i don't really know how to proceed...
Does anyone know where I can find a tutorial, or a clear example of How to
create an analyzer
On Wednesday 19 March 2003 01:44, Morus Walter wrote:
...
Searches must be possible on any combination of collections.
A typical search includes ~ 40 collections.
Now the question is, how to implement this in lucene best.
Currently I see basically three possibilities:
- create a data field
On Thursday 13 March 2003 00:52, Magnus Johansson wrote:
Tatu Saloranta wrote:
...
But same happens during indexing; fotbollsmatch should be properly
split and stemmed to fotboll and match terms, right?
Yes but the word fotbollsmatch was never indexed in this example. Only
the word fotboll
On Wednesday 05 March 2003 13:35, Leo Galambos wrote:
I'm all eyes and I'm a serious grown-up with good manners :)
Constructive suggestions for improvement are always welcome.
First a disclaimer: I don't mean to sound too negative. I'm genuinely curious
about many of the issues you mention.
On Friday 28 February 2003 05:15, Alain Lauzon wrote:
At 07:16 2003-02-28 +0100, you wrote:
May it be, that microsoft is found, because the search is not case
sensitive (text) and ct is not found because there the search is case
sensitive (Keyword)
Did you try
+state:CT
On Friday 21 February 2003 13:22, Günter Kukies wrote:
Hello,
I don't have any line number.
You unfortunately do need to know the line number, if you do get an exception
and try to see where it occurs.
Another less frequent problem is that you actually get the exception as an
object and
On Friday 14 February 2003 02:58, Volker Luedeling wrote:
Hi,
I am writing an application that constructs Lucene searches from XML
queries. Each item from the XML is represented by a Query of the
corresponding type. I have a problem when I try to search for number
ranges, since RangeQuery
On Friday 14 February 2003 07:27, Aaron Galea wrote:
I had this problem when using xerces to parse xml documents. The problem I
think lies in the Java garbage collector. The way I solved it was to create
It's unlikely that GC is the culprit. Current ones are good at purging objects
that are
On Tuesday 11 February 2003 07:48, Nellai wrote:
Hi!
can anyone tell me how to calculate the % of relevance using Lucene.
Lucene's hit score is normalized float, ] 0.0, 1.0 ] (since 0.0 ones are never
included). From there it's basic arithmetic (perhaps this could be included
in FAQ, even
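A minimal sketch of that arithmetic, assuming the score has already been
obtained as a float in ] 0.0, 1.0 ] as described above. The class and method
names are hypothetical and are not part of Lucene:

```java
// Hypothetical helper, not a Lucene class: turns a normalized hit score
// into a whole-number percentage for display.
final class RelevancePercent {

    private RelevancePercent() {
    }

    // Assumes score is a normalized value; anything outside [0.0, 1.0]
    // is clamped so the result always falls between 0 and 100.
    static int toPercent(float score) {
        float clamped = Math.max(0.0f, Math.min(1.0f, score));
        return Math.round(clamped * 100.0f);
    }
}
```

For example, RelevancePercent.toPercent(0.5f) gives 50. Rounding to a whole
number is a display choice only; keep the raw float if hits need to be
compared or sorted.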
On Monday 03 February 2003 07:19, Terry Steichen wrote:
I believe that the tokenizer treats a dash as a token separator. Hence,
the only way, as I recall, to eliminate this behavior is to modify
QueryParser.jj so it doesn't do this. However, doing this can cause some
other problems, like
On Saturday 01 February 2003 00:19, Otis Gospodnetic wrote:
1) to what extent are wildcards supported by lucenes?
You can use * and ? the way they usually are used.
I think there was one exception; first character of a simple term
can not be a wildcard? (this from query syntax page).
-+ Tatu
On Wednesday 22 January 2003 07:49, Erik Hatcher wrote:
Unfortunately I don't believe date field range queries work with
QueryParser, or at least not human-readable dates.
Is that correct?
I think it supports date ranges if they are turned into a numeric
format, but no human would type that
On Wednesday 22 January 2003 08:27, Michael Barry wrote:
I utilize the earlier version and queries such as this work fine with
QueryParser:
field:[ 20030120 - 20030125 ]
of course the back-end indexer canonicalizes all date fields to MMDD.
The front-end search code is responsible for
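A rough sketch of the canonicalization idea, assuming an eight-digit
year-month-day key like the 20030120 in the query above, so that string
order matches date order. The class and method names are hypothetical, the
code uses only java.time rather than any Lucene API, and the range string
simply mirrors the form quoted in this message:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not a Lucene class: builds sortable date keys
// for indexing and a range query string in the form quoted above.
final class DateKeys {

    private DateKeys() {
    }

    // BASIC_ISO_DATE renders a LocalDate as eight digits, e.g. 20030120.
    static String toIndexKey(LocalDate date) {
        return date.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Assumption: the field name and the bracket syntax are copied from
    // the example in the message, not checked against any Lucene version.
    static String rangeQuery(String field, LocalDate from, LocalDate to) {
        return field + ":[ " + toIndexKey(from) + " - " + toIndexKey(to) + " ]";
    }
}
```

The same toIndexKey call would be used on both the indexing side and the
search side, so a human-readable date typed by a user can be converted
before the query string is built.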
My apologies if this is a FAQ (which is possible as I am new to Lucene,
however, I tried checking the web page for the answer).
I read through the Query syntax web page first, and then checked the
matching query classes. It seems like query syntax page is missing some
details; the one I was