Re: The case of the disappearing index files

2004-01-28 Thread Rociel Buico
I also ran into this. What I did was wrap my
searcher in a singleton object and check whether it is being used
by another thread. If it is, the calling thread is put into a wait state
until the other thread finishes using the searcher.

Maybe it can help.
 
buics
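
The approach described in this reply (one shared object, with callers waiting while another thread uses it) might look roughly like the sketch below. It is only an illustration under assumptions: ExclusiveHolder is a hypothetical name, the generic resource stands in for the searcher, and none of this is Lucene API.

```java
import java.util.function.Function;

/**
 * Hands one shared resource (standing in for the searcher) to a single
 * thread at a time; other callers wait until it is released.
 * Hypothetical helper, not part of Lucene.
 */
final class ExclusiveHolder<T> {
    private final T resource;
    private boolean inUse = false;

    ExclusiveHolder(T resource) {
        this.resource = resource;
    }

    /** Blocks while another thread holds the resource, then claims it. */
    synchronized T acquire() {
        while (inUse) {
            try {
                wait();
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up on the request.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        inUse = true;
        return resource;
    }

    /** Gives the resource back and wakes any waiting threads. */
    synchronized void release() {
        inUse = false;
        notifyAll();
    }

    /** Runs the action with exclusive access and always releases afterwards. */
    <R> R withResource(Function<T, R> action) {
        T held = acquire();
        try {
            return action.apply(held);
        } finally {
            release();
        }
    }
}
```

Keeping a single ExclusiveHolder in a static field would give the singleton behaviour mentioned above. Note that this serializes all searches, so it trades throughput for simplicity.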


Scott Smith <[EMAIL PROTECTED]> wrote:
We have started using lucene as the indexer for messages on our website. We
are seeing a problem where some index files seem to disappear (we've seen
the segment file vanish as well as some others).

My first thought after looking through some archives is that maybe we are
getting the "too many open files" problem and this means that a file might
get deleted in preparation for being rewritten, but it can't be rewritten
because there are no file handles (this is on a Windows XP box). Since the
indexer is pretty straightforward in that it opens an IndexWriter, adds new
messages received in the last minute and then closes the IndexWriter, I'm
pretty sure it's ok. Besides, we didn't see this problem until we started
doing lots of searches.

I'm feeling less comfortable with the search code. Here are a couple of
snippets. The first was a transliteration of some code that I saw in a Doug
C. posting (it was in v1.2 form and I needed it in v1.3)

private Searcher m_Searcher = null;
private long m_LastModified;

private void getSearcher()
    throws IOException
{
    // has the index been modified since last we looked?
    long newModified =
        IndexReader.getCurrentVersion(m_IndexDirectory);
    if (m_LastModified != newModified)
    {
        // Get a new searcher and orphan the old one w/o closing
        m_Searcher = new IndexSearcher(m_IndexDirectory);
        m_LastModified = newModified;
    }
}

Here's a somewhat simplified version (I search more fields) of the search
code that calls it.

public synchronized Hits SimpleSearch(String a_SearchString)
    throws IOException, ParseException
{
    Query q = QueryParser.parse(a_SearchString, "Body",
        m_Analyzer);

    try
    {
        getSearcher();
    }
    catch (IOException e)
    {
        // if we can't generate searcher, then claim
        // nothing is there
        m_lggr.error(e.getMessage());
        return null;
    }

    Hits hits = m_Searcher.search(q);

    return hits;
}

The caller then can walk through the hits list to get the messages.

Originally, I would close the searcher after I got the hits, but I found
that you couldn't access the documents in the Hits structure once the
IndexSearcher was closed (Looking at the source, it looks like the Hits list
doesn't actually have the documents in it, but simply has references to them
which it uses the Searcher object to get at). So, I now never close the
Searcher (though I'll create a new one if the index has been modified since
the last time I looked).

One other thing: I know the web guy using this is creating a new object
every time he does a search (which I will talk to him about since I think
that's the wrong thing based on what I've read). Is that my only problem?
Do I really want to wait until garbage collection deletes the old Searchers
for the files they have opened to get closed?

Does anyone see anything wrong with the above code or anything I should do
to optimize it? Suggestions anyone?

Scott

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


"We shape clay into a pot but it is the emptyness inside that holds whatever we want." 
Lao Tzu


RE: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Juan Manuel Hernandez Garcia
Sorry, but I can't send mail to the server. What is the address, or can
you help me?
I want to install Lucene on Windows XP. Do you know where I can find
information? I've tried, but when I test the demo by executing

java org.apache.lucene.demo.IndexFiles {full-path-to-lucene}/src

I get the following error:

C:\Documents and Settings\juan>java org.apache.lucene.demo.IndexFiles e:\lucene\src
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/lucene/demo/IndexFiles

Does anyone know why?

Thanks a lot






The case of the disappearing index files

2004-01-28 Thread Scott Smith
We have started using lucene as the indexer for messages on our website.  We
are seeing a problem where some index files seem to disappear (we've seen
the segment file vanish as well as some others).

My first thought after looking through some archives is that maybe we are
getting the "too many open files" problem and this means that a file might
get deleted in preparation for being rewritten, but it can't be rewritten
because there are no file handles (this is on a Windows XP box).  Since the
indexer is pretty straightforward in that it opens an IndexWriter, adds new
messages received in the last minute and then closes the IndexWriter, I'm
pretty sure it's ok.  Besides, we didn't see this problem until we started
doing lots of searches.

I'm feeling less comfortable with the search code.  Here are a couple of
snippets.  The first was a transliteration of some code that I saw in a Doug
C. posting (it was in v1.2 form and I needed it in v1.3)

private Searcher m_Searcher = null;
private long m_LastModified;

private void getSearcher()
    throws IOException
{
    // has the index been modified since last we looked?
    long newModified =
        IndexReader.getCurrentVersion(m_IndexDirectory);
    if (m_LastModified != newModified)
    {
        // Get a new searcher and orphan the old one w/o closing
        m_Searcher = new IndexSearcher(m_IndexDirectory);
        m_LastModified = newModified;
    }
}

Here's a somewhat simplified version (I search more fields) of the search
code that calls it.

public synchronized Hits SimpleSearch(String a_SearchString)
    throws IOException, ParseException
{
    Query q = QueryParser.parse(a_SearchString, "Body",
        m_Analyzer);

    try
    {
        getSearcher();
    }
    catch (IOException e)
    {
        // if we can't generate searcher, then claim
        // nothing is there
        m_lggr.error(e.getMessage());
        return null;
    }

    Hits hits = m_Searcher.search(q);

    return hits;
}

The caller then can walk through the hits list to get the messages.

Originally, I would close the searcher after I got the hits, but I found
that you couldn't access the documents in the Hits structure once the
IndexSearcher was closed (Looking at the source, it looks like the Hits list
doesn't actually have the documents in it, but simply has references to them
which it uses the Searcher object to get at).  So, I now never close the
Searcher (though I'll create a new one if the index has been modified since
the last time I looked).

One other thing: I know the web guy using this is creating a new object
every time he does a search (which I will talk to him about since I think
that's the wrong thing based on what I've read).  Is that my only problem?
Do I really want to wait until garbage collection deletes the old Searchers
for the files they have opened to get closed?

Does anyone see anything wrong with the above code or anything I should do
to optimize it?  Suggestions anyone?

Scott
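
On the question above about waiting for garbage collection: one possible alternative is to count the callers still using each searcher and close a replaced searcher once the last caller is done. The sketch below illustrates that idea only. StubSearcher, CountedHandle and HandleManager are hypothetical names, and StubSearcher merely stands in for the real searcher so that the example runs without Lucene.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Stand-in for a searcher (assumption); records whether close() ran. */
final class StubSearcher implements Closeable {
    volatile boolean closed = false;

    @Override
    public void close() {
        closed = true;
    }
}

/** Pairs a resource with a count of the callers still using it. */
final class CountedHandle<T extends Closeable> {
    private final T resource;
    private int users = 0;
    private boolean retired = false;

    CountedHandle(T resource) {
        this.resource = resource;
    }

    T get() {
        return resource;
    }

    synchronized void acquire() {
        users++;
    }

    /** Called by a searching thread once it has finished reading results. */
    synchronized void release() {
        users--;
        closeIfUnused();
    }

    /** Called once a newer resource has replaced this one. */
    synchronized void retire() {
        retired = true;
        closeIfUnused();
    }

    private void closeIfUnused() {
        if (retired && users == 0) {
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}

/** Hands out the current resource and swaps in replacements. */
final class HandleManager<T extends Closeable> {
    private CountedHandle<T> current;

    /** Returns the current handle with its use count already incremented. */
    synchronized CountedHandle<T> checkout() {
        if (current == null) {
            throw new IllegalStateException("no resource installed");
        }
        current.acquire();
        return current;
    }

    /** Installs a fresh resource; the old one closes once its users finish. */
    synchronized void swap(T fresh) {
        CountedHandle<T> old = current;
        current = new CountedHandle<>(fresh);
        if (old != null) {
            old.retire();
        }
    }
}
```

In code shaped like getSearcher() above, swap(...) would take the place of the plain assignment to m_Searcher, and each caller would call release() after it has finished walking its Hits. That assumes callers can be relied on to release, which the original code does not require.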




Re: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Hamish Carpenter
Hi,

My company has implemented a socket-based interface for Lucene.  To index and query documents you need to construct an XML document and then send it to our "luceneserver", which listens on a socket (it can be on the same machine or a different one).

I can email this to you if you wish, it is ~2.5Mb including all libs to run it.  It is currently licensed under GPL.

btw; installing tomcat to test the lucene webapp locally is not too difficult.

Hamish Carpenter.

Sebastian Fey wrote:

hi,

My task is to implement a search engine for documentation in HTML. The files are not
online but local.
But the "getting started" guide at lucene-home just explains how to set up Lucene with
Tomcat. (I've never set up a web server.)
I was able to create an index of my files, but now the web front-end is missing. I think it's in the luceneweb.war, right?
So, my question: how can I use Lucene locally? Can someone provide an HTML front-end?
 
thx in advance,

Sebastian



Re: AW: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Erik Hatcher
On Jan 28, 2004, at 9:37 AM, Sebastian Fey wrote:
> This is JSP. I think I need to set up a web server to run it. (Sorry,
> all this web and server stuff really isn't my field ... :) )
>
> Is there actually a way to use Lucene without a web server?

Yes.

Lucene has *nothing* to do with web applications.  It is completely 
orthogonal.

Look at the many Lucene articles (mine at java.net are the most 
recent).  You have to write some Java code, but you can easily write a 
few lines of code that search an index and output the results.  In 
fact, look at Luke if you want a pre-built desktop application to 
browse/search an index.  Also, look at the lucli project in the sandbox 
for a command-line tool.

The Lucene website has pointers to all that I've mentioned here.

	Erik



RE: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread David Townsend
Why don't you take a look at Luke? That way you can play with the index you built and
work from there.  If you're looking to replicate something like Luke, I'd get studying
now ;).

http://www.getopt.org/luke/



-Original Message-
From: Sebastian Fey [mailto:[EMAIL PROTECTED]
Sent: 28 January 2004 14:23
To: Lucene Users List
Subject: AW: use Lucene LOCAL (looking for a frontend)


>Not being funny, but if you have no experience in Java, then why are you using a Java
>API for index building/text searching ?

I'm just testing some possibilities.
Though I can't write a Java application, I can read it and, if someone gives me
something to start with, I'm sure I'll make it. If Lucene seems to be the best solution,
I'll spend some time learning something about Java.





AW: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Sebastian Fey
hi again,

one more question...

>No offense intended at all, but you'll really need some Java experience 
>to do stuff with Lucene.  There is no real good out-of-the-box 
>front-end at the moment, unless you went with something like Searchblox 
>(www.searchblox.com).

This is JSP. I think I need to set up a web server to run it. (Sorry, all this web and
server stuff really isn't my field ... :) )

Is there actually a way to use Lucene without a web server?

thx :)







RE: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Ben Keeping
For an "out of the box" job, I found searchblox pretty impressive, and easy to install.

-Original Message-
From: Sebastian Fey [mailto:[EMAIL PROTECTED]
Sent: 28 January 2004 14:23
To: Lucene Users List
Subject: AW: use Lucene LOCAL (looking for a frontend)


>Not being funny, but if you have no experience in Java, then why are you using a Java
>API for index building/text searching ?

I'm just testing some possibilities.
Though I can't write a Java application, I can read it and, if someone gives me
something to start with, I'm sure I'll make it. If Lucene seems to be the best solution,
I'll spend some time learning something about Java.





This e-mail and any attachments may be confidential and/or legally
privileged. If you have received this e-mail and you are not a named
addressee, please inform Landmark Information Group on 01392 441700
and then delete the e-mail from your system. If you are not a named
addressee you must not use, disclose, distribute, copy, print or rely 
on this e-mail. This email and any attachments have been scanned for
viruses and to the best of our knowledge are clean. To ensure 
regulatory compliance and for the protection of our clients and 
business, we may monitor and read e-mails sent to and from our 
servers.





AW: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Sebastian Fey
>>> How you present the search results will be up to you and the needs of
>>> your project.
>>
>> I've NO experience with Java.
>> It would be nice to see an example of a web interface that implements
>> Lucene, to have something to start with.
>
>No offense intended at all,
:)

>but you'll really need some Java experience to do stuff with Lucene.
>There is no real good out-of-the-box front-end at the moment,
>unless you went with something like Searchblox (www.searchblox.com).

Nice, I'll take a look.

>My JavaDevWithAnt project provides a front-end (using Struts) similar
>to the one that comes with the demo.  You can get JavaDevWithAnt (and
>build it yourself) at http://www.ehatchersolutions.com/JavaDevWithAnt

Thx for the info, I'll do some further reading about all this stuff.




AW: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Sebastian Fey
>Not being funny, but if you have no experience in Java, then why are you using a Java
>API for index building/text searching ?

I'm just testing some possibilities.
Though I can't write a Java application, I can read it and, if someone gives me
something to start with, I'm sure I'll make it. If Lucene seems to be the best solution,
I'll spend some time learning something about Java.





Re: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Erik Hatcher
On Jan 28, 2004, at 9:01 AM, Sebastian Fey wrote:
>> How you present the search results will be up to you and the needs of
>> your project.
>
> I've NO experience with Java.
> It would be nice to see an example of a web interface that implements
> Lucene, to have something to start with.

No offense intended at all, but you'll really need some Java experience
to do stuff with Lucene.  There is no real good out-of-the-box
front-end at the moment, unless you went with something like Searchblox
(www.searchblox.com).

My JavaDevWithAnt project provides a front-end (using Struts) similar 
to the one that comes with the demo.  You can get JavaDevWithAnt (and 
build it yourself) at http://www.ehatchersolutions.com/JavaDevWithAnt

	Erik



RE: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Ben Keeping
Not being funny, but if you have no experience in Java, then why are you using a Java 
API for index building/text searching ?

-Original Message-
From: Sebastian Fey [mailto:[EMAIL PROTECTED]
Sent: 28 January 2004 14:01
To: Lucene Users List
Subject: RE: use Lucene LOCAL (looking for a frontend)


>To index local files leverage some of the
>code I have put in my java.net articles, or use the Ant task
>that resides in the sandbox repository, or write your own.

I'm satisfied with the index I have for now, but later on I'll take a look ...

>How you present the search results will be up to you and the needs of your
>project.

I've NO experience with Java.
It would be nice to see an example of a web interface that implements Lucene, to have
something to start with.

thx,

Sebastian





RE: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Sebastian Fey
>To index local files leverage some of the
>code I have put in my java.net articles, or use the Ant task
>that resides in the sandbox repository, or write your own.

I'm satisfied with the index I have for now, but later on I'll take a look ...

>How you present the search results will be up to you and the needs of your
>project.

I've NO experience with Java.
It would be nice to see an example of a web interface that implements Lucene, to have
something to start with.

thx,

Sebastian





Re: use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Erik Hatcher
Lucene is a Java API, and can be used within any type of Java program 
(command-line, web, etc).

It is up to you as the developer embedding Lucene to put whatever kind 
of interface you want on it.  To index local files, leverage some of the
code I have put in my java.net articles, or use the Ant task
that resides in the sandbox repository, or write your own.  How you
present the search results will be up to you and the needs of your 
project.

	Erik

On Jan 28, 2004, at 7:44 AM, Sebastian Fey wrote:

hi,

My task is to implement a search engine for documentation in HTML.
The files are not online but local.
But the "getting started" guide at lucene-home just explains how to set
up Lucene with Tomcat. (I've never set up a web server.)

I was able to create an index of my files, but now the web front-end is
missing. I think it's in the luceneweb.war, right?
So, my question: how can I use Lucene locally? Can someone provide an
HTML front-end?

thx in advance,

Sebastian



use Lucene LOCAL (looking for a frontend)

2004-01-28 Thread Sebastian Fey
hi,

My task is to implement a search engine for documentation in HTML. The files are not
online but local.
But the "getting started" guide at lucene-home just explains how to set up Lucene with
Tomcat. (I've never set up a web server.)

I was able to create an index of my files, but now the web front-end is missing. I
think it's in the luceneweb.war, right?
So, my question: how can I use Lucene locally? Can someone provide an HTML front-end?
 
thx in advance,

Sebastian




Re: arrays of values in a field

2004-01-28 Thread Andrzej Bialecki
Erik Hatcher wrote:
On Jan 27, 2004, at 2:27 PM, Gabe wrote:

If I have a group of documents and I want to filter on
a category, it is fairly straightforward. I just
create a Field that contains the category and filter
on it.
However, what if I want the field "category" to have
multiple possible values? Is there a known best way to
filter on that?
I imagine it is possible to "hack" it by, say,
creating a field with value:
|category1|category2|category3| etc.
And then query "|category1|"

I was wondering if there was a better way.


Simply add multiple (probably Keyword) fields with the same name.
Lucene supports this nicely.

There are other tricks you can use here, too... In one of my projects I
had a need to store a list of weighted keywords. No problem storing
multiple tokens under the same field name, as Erik explained above.
However, in Lucene you can only apply a single boost value to a field. I
ended up encoding the keywords like "10.0 keyword" and then writing an
analyzer which skips the initial numbers when processing this particular
field (which was stored, indexed and tokenized).
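
The "10.0 keyword" encoding described above could be produced and read back along the lines of the sketch below. This only illustrates the string format; WeightedKeyword is a hypothetical class, and the custom analyzer that skips the leading number is not shown.

```java
/**
 * One keyword with its weight, stored in a field value as "10.0 keyword".
 * Hypothetical helper for illustration; not part of Lucene.
 */
final class WeightedKeyword {
    final float weight;
    final String keyword;

    WeightedKeyword(float weight, String keyword) {
        this.weight = weight;
        this.keyword = keyword;
    }

    /** Builds the stored form: the weight, one space, then the keyword. */
    String encode() {
        return weight + " " + keyword;
    }

    /** Splits at the first space: the prefix is the weight, the rest the keyword. */
    static WeightedKeyword decode(String value) {
        int space = value.indexOf(' ');
        if (space < 0) {
            throw new IllegalArgumentException("missing weight prefix: " + value);
        }
        float weight = Float.parseFloat(value.substring(0, space));
        return new WeightedKeyword(weight, value.substring(space + 1));
    }
}
```

Each encoded value would then be added as its own field instance under the shared field name, and the weight can be recovered from the stored value at search time.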



--
Best regards,
Andrzej Bialecki
-
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-
FreeBSD developer (http://www.freebsd.org)