commit e-mails changed

2005-02-28 Thread Erik Hatcher
Commits are now sent to [EMAIL PROTECTED] - both lucene4c and lucene (Java) commits currently go to this address. To subscribe, mail to: [EMAIL PROTECTED] Erik Begin forwarded message: From: [EMAIL PROTECTED] Date: February 27, 2005 3:22:58 PM EST To: [EMAIL PROTECTED] Subject: svn commit:

Re: svn commit: r155648 - lucene/trunk

2005-02-28 Thread Erik Hatcher
Garrett, Thanks for the very detailed reply. It's nice to have a Subversion guru on board :) My main goal was only to have a placeholder where I could check out all the latest Lucene stuff, across all implementations. I hadn't considered actually putting things in the /trunk directory

Fast access to a random page of the search results.

2005-02-28 Thread Stanislav Jordanov
Hi Guys, I am cross-posting this issue from the lucene-users list, because: 1. it is still unanswered there 2. it feels more like a development issue. 3. the company I work for wishes to offer you our dev-force ;-) in case the issue is solvable (or more accurately speaking, the feature is

special character with lucene

2005-02-28 Thread Philipp_Breuss
Hello, I would like to build a search engine using several different languages - e.g. Spanish names, French names, ... - Using a different analyzer for each language would be one solution. - But how about replacing each special character (umlauts, ...ä, ö, ...) with its HTML special character

Re: special character with lucene

2005-02-28 Thread Erik Hatcher
On Feb 28, 2005, at 10:01 AM, [EMAIL PROTECTED] wrote: Hello, I would like to build a search engine using several different languages - f.e. Spanish names, French names, ... Will your text be a mix of languages within a single field? Or would each document (or field) be a single language? -

Re: special character with lucene

2005-02-28 Thread Philipp_Breuss
Usually the text is in one specific language: English, German, Spanish, French, ... However, I don't really have a runtime identifier that tells me which language it is. I could only pick a few words and decide from there (?) - is this a good idea? Is there a tool that is part of Lucene that helps deciding what

Re: special character with lucene

2005-02-28 Thread slagraulet
[EMAIL PROTECTED] nydadc.com A

WG: Re: special character with lucene

2005-02-28 Thread Philipp_Breuss
My file.encoding is set to Cp1252. Maybe this is the reason. However, it's a good point to replace all the umlauts (Ä, ...) with A, ... before indexing, so that people with non-umlaut keyboards can search for them. I might do that. Greetings, Philipp Daniel Naber [EMAIL PROTECTED]
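
The idea of replacing umlauts with their base letters before indexing can be sketched roughly as follows. This is only an illustration, not code from the thread: the class name `AccentFolder` and the method `fold` are hypothetical, and it assumes a JDK that ships `java.text.Normalizer` (Java 6 or later, so newer than this discussion). The same folding would have to be applied to query text as well as indexed text. The sample string uses Unicode escapes so that the result does not depend on the `file.encoding` used to compile the source.

```java
import java.text.Normalizer;

// Hypothetical helper (not part of Lucene): strips diacritics such as umlauts
// so that "\u00C4pfel" and "Apfel" end up as the same indexed term.
public final class AccentFolder {

    private AccentFolder() {
        // static utility, no instances
    }

    public static String fold(String input) {
        // NFD splits a character like U+00C4 into 'A' plus the combining diaeresis U+0308.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Drop every combining mark, leaving only the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the literal independent of the source file encoding.
        System.out.println(fold("\u00C4pfel \u00FCber \u00D6l"));
    }
}
```

Characters without a canonical decomposition, such as ß, pass through unchanged and would need an explicit mapping if they should also be folded.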

Re: updating jakarta site

2005-02-28 Thread Erik Hatcher
I'm gun-shy about forging ahead without community consensus, since I pushed too fast earlier. I'm +1 on forging ahead with all of what Henri brings up. Doug? Others? As Henri mentions, we need a mail and download page added to the Lucene website since this will be removed from the Jakarta

Re: updating jakarta site

2005-02-28 Thread Henri Yandell
Your download page is already separate, you're using the global closer.cgi file. Hen On Mon, 28 Feb 2005 11:52:56 -0500, Erik Hatcher [EMAIL PROTECTED] wrote: I'm gun shy for forging ahead without community consensus based on me pushing too fast earlier. I'm +1 on forging ahead with all of

RE: special character with lucene

2005-02-28 Thread Zhaohui Li
Basis Technology has a commercial product, Rosette Language Identifier, that identifies the input language. If you are interested, you can send email to [EMAIL PROTECTED] -zhaohui -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Monday, February 28, 2005 10:37

Re: special character with lucene

2005-02-28 Thread Steven Rowe
Also, check out Nutch's language identification stuff: URL:http://cvs.sourceforge.net/viewcvs.py/nutch/nutch/src/plugin/languageidentifier/ Zhaohui Li wrote: Basis Technology has a commercial product Rosette Language Identifier to identify the input language. If you are interested in, you can send

Re: updating jakarta site

2005-02-28 Thread Doug Cutting
Henri Yandell wrote: Your download page is already separate, you're using the global closer.cgi file. So we need to: - rename Lucene Java's mailing lists, with forwards put into place. - add a mailing list page to Lucene Java's website, modelled after

Re: updating jakarta site

2005-02-28 Thread Garrett Rooney
Doug Cutting wrote: Henri Yandell wrote: Your download page is already separate, you're using the global closer.cgi file. So we need to: - rename Lucene Java's mailing lists, with forwards put into place. - add a mailing list page to Lucene Java's website, modelled after

Re: updating jakarta site

2005-02-28 Thread Doug Cutting
Garrett Rooney wrote: Actually, currently we've got both lucene4c and java commits going to [EMAIL PROTECTED], and there was some talk of just leaving it that way, since it isn't that much traffic and it encourages people to keep an eye on what's going on in other languages. I think that's a

Re: patch - DEFAULT_ vars in IndexWriter non-final and DEFAULT for useCompoundFile

2005-02-28 Thread Doug Cutting
Kevin A. Burton wrote: Wolf Siberski wrote: Kevin A. Burton wrote: I see following issues with your patch: - you changed the DEFAULT_... semantics from constant to modifiable, but didn't adjust the names according to Java conventions (default_...). Java doesn't have any naming conventions

Re: patch - DEFAULT_ vars in IndexWriter non-final and DEFAULT for useCompoundFile

2005-02-28 Thread Doug Cutting
Kevin A. Burton wrote: Doug Cutting wrote: Wolf Siberski wrote: So, if anything at all, I would rather opt for making these constants private :-). I agree. In general, fields should either be final, or private with accessor methods. So, we could change this to: private static int
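
The rule quoted above (fields are either final, or private with accessor methods) might look something like the sketch below. It is an illustration under stated assumptions, not the patch from this thread: the class `WriterDefaults`, its field and method names, and the initial values are all hypothetical and are not taken from Lucene's `IndexWriter`.

```java
// Hypothetical holder for modifiable defaults; names and values are illustrative only.
public final class WriterDefaults {

    // Non-final, so private and lower camel case rather than a DEFAULT_... constant name.
    private static int defaultMergeFactor = 10;
    private static boolean defaultUseCompoundFile = true;

    private WriterDefaults() {
        // static holder, no instances
    }

    public static synchronized int getDefaultMergeFactor() {
        return defaultMergeFactor;
    }

    public static synchronized void setDefaultMergeFactor(int mergeFactor) {
        // The accessor is the single place where a new default can be validated.
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        defaultMergeFactor = mergeFactor;
    }

    public static synchronized boolean isDefaultUseCompoundFile() {
        return defaultUseCompoundFile;
    }

    public static synchronized void setDefaultUseCompoundFile(boolean useCompoundFile) {
        defaultUseCompoundFile = useCompoundFile;
    }
}
```

Compared with a public non-final field, the setter gives one place to reject bad values, and the synchronized accessors make a changed default visible to other threads.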

Re: updating jakarta site

2005-02-28 Thread Garrett Rooney
Doug Cutting wrote: Garrett Rooney wrote: Actually, currently we've got both lucene4c and java commits going to [EMAIL PROTECTED], and there was some talk of just leaving it that way, since it isn't that much traffic and it encourages people to keep an eye on what's going on in other languages.

Re: updating jakarta site

2005-02-28 Thread Erik Hatcher
There is a pending JIRA issue for Lucene e-mail lists here: http://issues.apache.org/jira/browse/INFRA-195 Please adjust it and prod infrastructure when agreement is made on the lists. I prefer a single commits list for now, but no problem if the consensus is to separate them.

Re: updating jakarta site

2005-02-28 Thread Henri Yandell
On Sun, 27 Feb 2005 01:37:38 -0500, Henri Yandell [EMAIL PROTECTED] wrote: Would anyone mind me changing jakarta.apache.org to switch Lucene to TLP there? Mainly this would involve: Addition of news item concerning promotion Movement of Lucene from Subprojects to Ex-Jakarta Movement of

Re: updating jakarta site

2005-02-28 Thread Otis Gospodnetic
I'd like to help. So are we still in jakarta-site2 land, Toto? Otis --- Erik Hatcher [EMAIL PROTECTED] wrote: I'm gun shy for forging ahead without community consensus based on me pushing too fast earlier. I'm +1 on forging ahead with all of what Henri brings up. Doug? Others? As

Re: updating jakarta site

2005-02-28 Thread Otis Gospodnetic
I recall somebody mentioning jakarta-site2 being locked... and moved to SVN, if I recall correctly. Should I be checking out http://svn.apache.org/repos/asf/jakarta/site/ and using that to generate the new Lucene site docs, or the old CVS version of jakarta-site2? Thanks, Otis --- Otis