SQLDirectory

2004-02-01 Thread lucene
Hi!

There was a third-party SQLDirectory for Lucene 1.2 which was abandoned for 
performance reasons. Well, why not load the index into RAM? Is there 
an (official) SQLDirectory for 1.3?

searcher = new IndexSearcher(IndexReader.open(new RAMDirectory(new SQLDirectory())));

I'd really like to have the index where I already have all the data: in the 
database.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Jem has a new address

2004-02-01 Thread jem
Thank you for your recent email. Unfortunately, due to spam overload, I no longer 
accept mail to this address. 

If you would like to receive my new email address, please call me. My phone number is 
listed in www.whitepages.com.au

Best wishes
Jem






HTMLDocument

2004-02-01 Thread lucene
Hi!

Is there any HTMLDocument out there? The one in Lucene's demo package 
does not handle malformed HTML files (what about NekoHTML?) and seems to 
have other limitations and bugs as well. And why is it in a demo package 
rather than part of the distribution?





Re: SQLDirectory

2004-02-01 Thread Erik Hatcher
On Feb 1, 2004, at 6:16 AM, [EMAIL PROTECTED] wrote:
 There was some third-party SQLDirectory for lucene 1.2 which was
 abandoned for a matter of performance. Well, why not loading the
 index into RAM? Is there some (official) SQLDirectory for 1.3?

If you look back a few weeks in the list archives, I believe someone 
has implemented a Directory on top of SQL Server and reported that the 
performance is quite good.

As of a few weeks ago there is also a Berkeley DB DBDirectory 
implementation in Lucene's sandbox repository.

	Erik




Re: HTMLDocument

2004-02-01 Thread Erik Hatcher
On Feb 1, 2004, at 6:19 AM, [EMAIL PROTECTED] wrote:
 Hi!

 Is there any HTMLDocument out there? The one in the demo package of
 lucene does not handle non-wellformed HTML files (what about
 nekohtml?) and seems to have some other inabilities and bugs as well
 (and why isn't it part of the distro but in a demo package?!)?

Nutch uses NekoHTML, so you can browse around that codebase and borrow 
its implementation.  The sandbox has a contribution/ant directory which 
contains an HTMLDocument that uses JTidy to parse HTML; JTidy does a 
pretty good job of handling bad HTML.

Why isn't it in the distribution?  There is no single right way to 
parse HTML and turn it into a Lucene document, and doing so is really a 
layer on top of the core, not integral to it.

	Erik




Re: Date Range support

2004-02-01 Thread Erik Hatcher
On Jan 29, 2004, at 5:08 AM, tom wa wrote:
 I'm trying to create an index which can also be searched with date
 ranges. My first attempt using the Lucene date format ran into
 trouble after my index grew and I couldn't search over more than a
 few days.
 I saw some other posts explaining why this happens and the suggestion
 seemed to be to use strings of the format MMdd. Using that format
 worked great until I remembered that my search needs to be able to
 support different timezones. Adding the hour to my field causes the
 same problem as above and my queries stop working when using a range
 of about 2 months.

When you say you couldn't search and that it stopped working, do you 
mean it was just unacceptably slow?
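
For illustration only, here is a minimal sketch of the trade-off described 
in the question, in plain Java with no Lucene calls. It assumes date fields 
are indexed as fixed-width UTC strings so that string order matches time 
order, and it counts how many distinct values a two-month range spans at 
day versus hour granularity. The class name, the patterns yyyyMMdd and 
yyyyMMddHH, and the UTC+11 searcher are all assumptions made for the 
example, not something stated in the thread.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class DateTerms {
    // Fixed-width UTC patterns (assumed): string order equals chronological order.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);
    private static final DateTimeFormatter HOUR =
            DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);

    static String dayTerm(Instant t) { return DAY.format(t); }
    static String hourTerm(Instant t) { return HOUR.format(t); }

    // Start of a calendar day as seen in the searcher's zone, as a UTC instant.
    static Instant startOfLocalDay(LocalDate day, ZoneId zone) {
        return day.atStartOfDay(zone).toInstant();
    }

    public static void main(String[] args) {
        ZoneId userZone = ZoneOffset.ofHours(11); // hypothetical searcher at UTC+11
        Instant from = startOfLocalDay(LocalDate.of(2004, 1, 1), userZone);
        Instant to = startOfLocalDay(LocalDate.of(2004, 3, 1), userZone);

        // Range bounds as they would be compared against hour-granularity values.
        System.out.println(hourTerm(from) + " .. " + hourTerm(to));
        // Distinct values the half-open range [from, to) covers at each granularity.
        System.out.println("day values:  " + ChronoUnit.DAYS.between(from, to));
        System.out.println("hour values: " + ChronoUnit.HOURS.between(from, to));
    }
}
```

The same two-month range covers 60 distinct values at day granularity and 
1440 at hour granularity, so if the failure depends on the number of 
distinct values inside the range (which the thread does not confirm), 
adding the hour would be expected to bring it back much sooner.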




Re: Lucene with Postgres db

2004-02-01 Thread Leo Galambos
Have you tried tsearch, a special add-on for pgsql?
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

Lucene is faster than tsearch (I hope so), but tsearch need not be 
synchronized with the main DB. It's up to you.

Cheers,
Leo
Ankur Goel wrote:

Hi,

I have to search the documents which are stored in postgres db. 

Can someone give a clue how to go about it?

Thanks

Ankur Goel
Brickred Technologies
B-2 IInd Floor, Sector-31
Noida,India
P:+91-1202456361
C:+91-9810161323
E:[EMAIL PROTECTED]
http://www.brickred.com





Re: HTMLDocument

2004-02-01 Thread lucene
On Sunday 01 February 2004 13:21, Erik Hatcher wrote:
 On Feb 1, 2004, at 6:19 AM, [EMAIL PROTECTED] wrote:

 Nutch uses NekoHTML, so you can browse around that codebase and borrow

Nutch(.org)? No code there...





RE: Japanese Analyzer

2004-02-01 Thread Masami Saotome

As far as I know, there is a Japanese analyzer that uses morphological 
analysis at the following URL:

http://yamaguch.sytes.net/~tora/opensource/sen/
However, this page is in Japanese only.

Regards

 I've been using the CJKAnalyzer for a while now and our native Japanese-speaking 
 development staff haven't had any complaints about the results they are getting in 
 their searches.
 
 Just be sure you get all the character encoding issues straight. One of the gotchas 
 I ran into when I first started working with this was improper character encoding 
 handling in my web application.
 
 Eric
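
As a minimal sketch of the kind of encoding gotcha mentioned above (the 
thread does not say what the actual bug was), the following plain Java 
example decodes the same bytes twice: once with the charset they were 
written in, and once with a wrong one. The class name, the sample string 
and the choice of UTF-8 versus ISO-8859-1 are assumptions for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // Decode raw bytes (for example a request parameter) with an explicit charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Three kanji, written as escapes so the source file encoding does not matter.
        String query = "\u65E5\u672C\u8A9E";
        byte[] wire = query.getBytes(StandardCharsets.UTF_8);

        String good = decode(wire, StandardCharsets.UTF_8);
        String bad = decode(wire, StandardCharsets.ISO_8859_1);

        System.out.println("matches with UTF-8:      " + good.equals(query));
        System.out.println("matches with ISO-8859-1: " + bad.equals(query));
        System.out.println("length with ISO-8859-1:  " + bad.length());
    }
}
```

With the wrong charset the three characters come back as nine unrelated 
ones, so a query decoded that way would not match text that was indexed 
correctly, whichever analyzer is in use.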
 
 -----Original Message-----
 From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, January 29, 2004 1:46 PM
 To: Lucene Users List
 Subject: Re: Japanese Analyzer
 
 
 I think that's the only one we've got.
 You can browse the Lucene Sandbox contributions directory, it's there.
 
 Otis
 
 --- Weir, Michael [EMAIL PROTECTED] wrote:
  Is the CJKAnalyzer the best to use for Japanese?  If not, which is?
  If so, from where can I download it?
  Thanks.
  
  Michael Weir . Transform Research Inc. . 613.238.1363 x.114
  
  
 
 