Re: indexing/searching a website

2003-11-28 Thread Michal S
Robert Taylor wrote:
> Check out http://www.searchblox.com/ .
> It's based on Lucene and extremely easy to use and set up.
> It basically crawls your website and creates the index.
> Search results are in XML and you can transform it using the
> XSL style sheet shipped with it or create your own.

Great app, but it doesn't support the language of my content.
Thanks,
Michal



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


RE: indexing/searching a website

2003-11-27 Thread David Townsend
I would advise you to read the excellent articles listed here:

http://jakarta.apache.org/lucene/docs/resources.html

They contain some good examples, and by the end you should have a good
understanding of the major classes and their use.

-----Original Message-----
From: Michal S [mailto:[EMAIL PROTECTED]
Sent: 27 November 2003 10:52
To: Lucene Users List
Subject: Re: indexing/searching a website



> Another option is to deploy your site and crawl it from the outside
> (have a look at Nutch at SourceForge - or write your own using
> HttpClient and some HTML parsing for hyperlinks).
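
For the write-your-own route, the link-extraction step might look roughly
like the sketch below. It uses only the JDK. The class name LinkExtractor
and the example.com URLs are made up for illustration, and the regular
expression is a crude stand-in for a real HTML parser. Fetching the page
(with HttpClient or otherwise) is left out.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene, Nutch or HttpClient.
class LinkExtractor {
    // Crude match for href="..." or href='...' inside <a> tags.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])([^\"'>]*)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns the distinct same-host links found in html, resolved against baseUrl.
    static List<String> extractLinks(String baseUrl, String html) {
        URI base = URI.create(baseUrl);
        Set<String> links = new LinkedHashSet<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(2).trim();
            int hash = href.indexOf('#');
            if (hash >= 0) {
                href = href.substring(0, hash); // drop in-page fragments
            }
            if (href.isEmpty()) {
                continue;
            }
            try {
                URI resolved = base.resolve(href);
                // Stay on the site being crawled; skips mailto: and external links.
                if (base.getHost() != null
                        && base.getHost().equalsIgnoreCase(resolved.getHost())) {
                    links.add(resolved.toString());
                }
            } catch (IllegalArgumentException e) {
                // Malformed href: skip it rather than abort the crawl.
            }
        }
        return new ArrayList<>(links);
    }

    public static void main(String[] args) {
        String html = "<a href=\"about.jsp\">About</a> "
                + "<a href=\"/contact.do#form\">Contact</a>";
        for (String link : extractLinks("http://example.com/site/index.do", html)) {
            System.out.println(link);
        }
    }
}
```

A crawler would keep a queue of URLs to visit and a set of URLs already
seen, feeding each fetched page through a method like this one.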

I realize that I will need to write an HTML parser or use an existing
one. But I don't know what the whole framework would look like (how to
translate pages on the web server into Lucene documents, how to index
them, how to search them).
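
As a rough illustration of the first of those steps (turning a fetched
page into something indexable), here is a sketch that reduces an HTML page
to a few named text fields. The class PageToFields and the field names
url, title and contents are assumptions for illustration, not Lucene API.
In Lucene, each entry would become a field on a Document that is handed to
an IndexWriter. The regex-based tag stripping is a stand-in for a real
HTML parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps one HTML page to field name -> text.
class PageToFields {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", FLAGS);
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("<(script|style)\\b[^>]*>.*?</\\1\\s*>", FLAGS);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    static Map<String, String> toFields(String url, String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", url); // stored so a search hit can link back to the page

        Matcher title = TITLE.matcher(html);
        fields.put("title", title.find() ? clean(title.group(1)) : "");

        // Remove non-visible blocks and the title, then strip the remaining tags.
        String body = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        body = TITLE.matcher(body).replaceAll(" ");
        fields.put("contents", clean(TAG.matcher(body).replaceAll(" ")));
        return fields;
    }

    // Decodes two common entities and collapses runs of whitespace.
    private static String clean(String s) {
        return s.replace("&nbsp;", " ")
                .replace("&amp;", "&")
                .replaceAll("\\s+", " ")
                .trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>About us</title></head>"
                + "<body><h1>Hello</h1><p>Lucene &amp; Struts</p></body></html>";
        System.out.println(toFields("http://example.com/about.jsp", html));
    }
}
```

Because the input is the HTML the server actually sends, this works the
same whether the page came from a static file or from JSP tiles rendered
through Struts.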

The example on the Lucene home page is very simple and doesn't give me
many answers.


> I would argue that content within the JSP is a bad thing given that you
> want to index it - perhaps it makes more sense to put the content
> somewhere easier to get at like a database?

You are absolutely right. But my client wants to edit the content as
easily as possible (via Notepad or another text editor). If the content
were in a database, it would be necessary to provide my client with some
kind of application to let him update the content. The budget of the
project is very limited, so I can't afford to allocate more developers
to build a content editor.

Thanks for the reply.
Michal.






RE: indexing/searching a website

2003-11-27 Thread Robert Taylor
Check out http://www.searchblox.com/ . 
It's based on Lucene and extremely easy to use and set up.
It basically crawls your website and creates the index.
Search results are in XML and you can transform it using the
XSL style sheet shipped with it or create your own.
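
To give an idea of what transforming XML results with your own style sheet
involves, here is a sketch using the XSLT support built into the JDK
(javax.xml.transform). The XML layout, the style sheet and the class name
ResultsToHtml are all invented for illustration; the real SearchBlox
result format may well differ.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: renders an XML result list as an HTML list.
class ResultsToHtml {
    // Assumed result format, NOT the actual SearchBlox schema.
    static final String SAMPLE_RESULTS =
            "<results query=\"lucene\">"
            + "<result><title>About</title>"
            + "<url>http://example.com/about.jsp</url></result>"
            + "<result><title>Contact</title>"
            + "<url>http://example.com/contact.do</url></result>"
            + "</results>";

    // Minimal style sheet: one <li> with a link per <result> element.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"html\" indent=\"no\"/>"
            + "<xsl:template match=\"/results\">"
            + "<ul><xsl:for-each select=\"result\">"
            + "<li><a href=\"{url}\"><xsl:value-of select=\"title\"/></a></li>"
            + "</xsl:for-each></ul>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Applies the XSL style sheet to the XML and returns the output as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_RESULTS, STYLESHEET));
    }
}
```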

robert

> -----Original Message-----
> From: Michal S [mailto:[EMAIL PROTECTED]
> Sent: Thursday, November 27, 2003 3:37 AM
> To: [EMAIL PROTECTED]
> Subject: indexing/searching a website
>
>
> Dear Group Members,
>
> I have looked in the archives for a simple tutorial which could guide me
> through the process of integrating Lucene with a website based on Struts.
> The website uses Tiles; the content of the tiles is kept in multiple JSP
> files.
>
> I have read several of Marco's posts which seem to be close to my problem.
> However, my experience with Lucene is limited to indexing a static HTML
> file repository, so I need some kind of tutorial.
>
> Also, I didn't find anything like this on Google.
> Any help appreciated.
>
> All the best,
> Michal
 
 