date:20031020

RE: Does the Lucene search engine work with PDF's?

2003-10-20 Thread MOYSE Gilles (Cetelem)

You can also use the TextMining.org toolbox, which provides classes to extract text from PDF and DOC files, using the Jakarta POI project. They are all free, under Apache Licence. The URL: http://www.textmining.org/modules.php?op=modload&name=News&file=article&sid=6&mode=thread&order=0&thold=0). (URL

[OT] Open Source Goes to COMDEX

2003-10-20 Thread petite_abeille

Hello, This is pretty much off topic, but... ZOE has been nominated as one of the candidate projects to go to the Open Source Innovation Area on the COMDEX Exhibit Floor. http://www.oreillynet.com/contest/comdex/ ZOE is one of the few Java projects shortlisted and it uses Lucene quite

Hierarchical document

2003-10-20 Thread Tom Howe

Hi, I have a very hierarchical document structure where each level of the hierarchy contains indexable information. It looks like this: Study - Section - DataFile - Variable.

Lucene on Windows

2003-10-20 Thread Steve Jenkins

Hi, Wonder if anyone can help. Has anyone used Lucene on a Windows environment? Anyone know of any documentation specifically focused on doing that? Or anyone know of any gotchas to avoid? Thanks for any help, Cheers Steve.

Re: Lucene on Windows

2003-10-20 Thread Erik Hatcher

On Monday, October 20, 2003, at 12:00 PM, Steve Jenkins wrote: Hi, Wonder if anyone can help. Has anyone used Lucene on a Windows environment? Anyone know of any documentation specifically focused on doing that? Or anyone know of any gotchas to avoid? Yup, used Lucene on Windows lots. Is there

Does the Lucene search engine work with PDF's?

2003-10-20 Thread Konrad Kolosowski

Return Receipt. Your document: Does the Lucene search engine work with PDF's?

RE: Lucene on Windows

2003-10-20 Thread Otis Gospodnetic

The CVS version of Lucene has a patch that allows one to use a 'Compound Index' instead of the traditional one. This reduces the number of open files. For more info, see/make the Javadocs for IndexWriter. Otis --- Tate Avery [EMAIL PROTECTED] wrote: You might have trouble with too many open

Re: Hierarchical document

2003-10-20 Thread Erik Hatcher

On Monday, October 20, 2003, at 11:06 AM, Tom Howe wrote: contain Section and Study information and then, if a user wants a set of Study documents, just aggregate them after the search by hand or is there a more lucene way of doing this? I'm trying to avoid storing too much redundant
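The "aggregate them after the search by hand" option mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Hit` class and the `groupByStudy` method are hypothetical names, not part of Lucene, and the sketch assumes each indexed Variable-level document also carries the identifiers of its Study, Section and DataFile so that ranked hits can be grouped afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StudyAggregationSketch {

    /** Hypothetical flattened hit: one Variable plus the ids of its ancestors. */
    static final class Hit {
        final String study;
        final String section;
        final String dataFile;
        final String variable;

        Hit(String study, String section, String dataFile, String variable) {
            this.study = study;
            this.section = section;
            this.dataFile = dataFile;
            this.variable = variable;
        }
    }

    /** Groups ranked hits by study id, keeping studies in order of first appearance. */
    static Map<String, List<Hit>> groupByStudy(List<Hit> rankedHits) {
        Map<String, List<Hit>> byStudy = new LinkedHashMap<>();
        for (Hit h : rankedHits) {
            byStudy.computeIfAbsent(h.study, k -> new ArrayList<>()).add(h);
        }
        return byStudy;
    }

    public static void main(String[] args) {
        // Made-up hits, in the order a search might have ranked them.
        List<Hit> hits = Arrays.asList(
                new Hit("study-2", "sec-A", "file-1", "age"),
                new Hit("study-1", "sec-B", "file-7", "income"),
                new Hit("study-2", "sec-C", "file-3", "region"));
        for (Map.Entry<String, List<Hit>> e : groupByStudy(hits).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " matching variable(s)");
        }
    }
}
```

The trade-off is the one raised in the thread: every leaf document repeats its ancestors' identifiers, which is redundant, but the grouping step after the search stays trivial.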

Re: Dash Confusion in QueryParser - Bug? Feature?

2003-10-20 Thread Erik Hatcher

On Wednesday, October 15, 2003, at 10:24 AM, Michael Giles wrote: So how do we move this issue forward. I can't think of a single case where a - with no whitespace on either side (i.e. t-shirt, Wal-Mart) should be interpreted as a NOT command. Is there a feeling that changing the
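The rule argued for in this message (a dash with no whitespace on either side is part of the term, not a NOT operator) could be expressed as a small check like the one below. This is a sketch of the proposed rule only; `isProhibitDash` is a hypothetical helper and does not describe how Lucene's QueryParser is actually implemented.

```java
public class DashRuleSketch {

    /**
     * Returns true when the '-' at index i would act as a NOT/prohibit
     * operator under the proposed rule: it must begin a term (start of the
     * query or preceded by whitespace) and be followed by a non-space char.
     */
    static boolean isProhibitDash(String query, int i) {
        if (i < 0 || i >= query.length() || query.charAt(i) != '-') {
            return false;
        }
        boolean beginsTerm = i == 0 || Character.isWhitespace(query.charAt(i - 1));
        boolean hasOperand = i + 1 < query.length()
                && !Character.isWhitespace(query.charAt(i + 1));
        return beginsTerm && hasOperand;
    }

    public static void main(String[] args) {
        System.out.println(isProhibitDash("t-shirt", 1));        // dash inside a word
        System.out.println(isProhibitDash("shirt -cotton", 6));  // dash starting a term
    }
}
```

Under this rule the dash in `t-shirt` or `Wal-Mart` stays inside the term, while the dash in `shirt -cotton` is still treated as an operator.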

positional token info

2003-10-20 Thread Erik Hatcher

Is anyone doing anything interesting with the Token.setPositionIncrement during analysis? Just for fun, I've written a simple stop filter that bumps the position increments to account for the stop words removed: public final Token next() throws IOException { int increment = 0; for
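Since the snippet above is cut off, here is a minimal, self-contained sketch of the same idea: dropping stop words while carrying their positions forward onto the next kept token. It deliberately avoids the Lucene classes; `Tok` and `filter` are hypothetical stand-ins for a Lucene `Token` and a `TokenFilter`, and the code is not the filter from the original message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PositionStopFilterSketch {

    /** Hypothetical stand-in for a token: its text plus a position increment. */
    static final class Tok {
        final String text;
        final int positionIncrement;

        Tok(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Removes stop words and adds each skipped position to the next kept token. */
    static List<Tok> filter(List<String> words, Set<String> stopWords) {
        List<Tok> out = new ArrayList<>();
        int skipped = 0; // stop words dropped since the last kept token
        for (String w : words) {
            if (stopWords.contains(w)) {
                skipped++;
                continue;
            }
            out.add(new Tok(w, 1 + skipped));
            skipped = 0;
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("the", "over"));
        List<String> words = Arrays.asList("the", "fox", "jumps", "over", "the", "dog");
        for (Tok t : filter(words, stops)) {
            System.out.println(t.text + " +" + t.positionIncrement);
        }
    }
}
```

For the sample input this prints `fox +2`, `jumps +1` and `dog +3`, so the gaps left by the removed words remain visible in the position information.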

Re: Hierarchical document

2003-10-20 Thread Tatu Saloranta

On Monday 20 October 2003 16:41, Erik Hatcher wrote: One more thought related to this subject - once a nice scheme for representing hierarchies within a Lucene index emerges, having XPath as a query language would rock! Has anyone implemented O/R or XPath-like query expressions on top of

11 matches
