Re: lucene.apache.org

2005-02-14 Thread Kelvin Tan
I'm happy to help, but I haven't been keeping track of this thread. What needs to be done, and how can I help? k On Mon, 14 Feb 2005 07:37:38 -0500, Erik Hatcher wrote: > We now have lucene.apache.org mapped, yet we don't have a site > there yet. > > Doug - do you have your Forrest work handy?

RE: Study Group (WAS Re: Normalized Scoring)

2005-02-07 Thread Kelvin Tan
There is a pile of stuff in Citeseer, but those papers never really > dig into the details and always require solid previous knowledge of > the field.  They are no replacement for a class or a textbook. > > If you find a good forum for IR, please share. > > Otis > > > --- Kelvin Tan <

Re: Study Group (WAS Re: Normalized Scoring)

2005-02-07 Thread Kelvin Tan
Hey Paul, thanks for responding. On Sun, 6 Feb 2005 13:26:24 +0100, Paul Elschot wrote: > > Tuning the scoring is difficult because one needs to avoid the trap > of optimizing for the test collection and test queries at hand. The > interplays between query structure, coord(), idf() and tf() add to

Study Group (WAS Re: Normalized Scoring)

2005-02-06 Thread Kelvin Tan
. What do you think? k On Sat, 5 Feb 2005 22:10:26 -0800 (PST), Otis Gospodnetic wrote: > Exactly.  Luckily, since then I've learned a bit from lucene-dev > discussions and side IR readings, so some of the topics are making > more sense now. > > Otis > > --- Kelvin Tan <

Re: Normalized Scoring -- was RE: idf and explain(), was Re: Search and Scoring

2005-02-05 Thread Kelvin Tan
Hi Otis, I was re-reading this whole theoretical thread about idf, scoring, normalization, etc. from last Oct and couldn't help laughing out loud when I read your post, coz it summed up what I was thinking the whole time. I think it's really great to have people like Chuck and Paul (Elschot) around

Re: cvs commit: jakarta-lucene-sandbox/contributions/javascript/queryConstructor luceneQueryConstructor.html

2004-05-17 Thread Kelvin Tan
Erik, I used the Google Advanced Search feature for the demo/introduction, but in so doing removed the enhancements you made for drop-downs, multi-select and radio fields. Do you still think they're necessary? Cheers, Kelvin On 17 May 2004 13:29:24 -, [EMAIL PROTECTED] said: > kelvint 2

Disappearing segments

2004-04-28 Thread Kelvin Tan
I've been experiencing intermittent disappearing segments which result in the following stacktrace: Caused by: java.io.FileNotFoundException: C:\index\_1ae.fnm (The system cannot find the file specified) at java.io.RandomAccessFile.open(Native Method) at java.io.RandomAccessFile.(R

Update to benchmarks.xml

2004-01-10 Thread Kelvin Tan
Incorporated Geoffrey's benchmarks. Index: benchmarks.xml === RCS file: /home/cvspublic/jakarta-lucene/xdocs/benchmarks.xml,v retrieving revision 1.2 diff -u -r1.2 benchmarks.xml --- benchmarks.xml 12 Dec 2002 06:23:48 - 1.2 +++

Re: Bug/enhancement request 21921

2003-09-10 Thread Kelvin Tan
>Kevin, > >I think those two changes are okay. Please submit a patch. I imagine >you already made those two changes in your local copy of Lucene and >have actually been using it with your highlighting code....? > >Otis > > >--- Kelvin Tan <[EMAIL PROTECTED]> wrote:

Re: Fwd: Re: [PROPOSAL] Add Lucene Distribution To Mirrors

2003-09-09 Thread Kelvin Tan
On Tue, 9 Sep 2003 15:02:25 -0700 (PDT), Otis Gospodnetic said: > >What do you think about a 1.3 release? >I think we should resolve the JavaCC situation and then make the 1.3 >release. Perhaps it would be best to include JavaCC-generated .java >files in the CVS, as Doug described the other day.

Normalizer

2003-07-27 Thread Kelvin Tan
I found this SF project doing a search for 'lucene' on SF. http://sourceforge.net/projects/normalizer/ Excerpt says "Contextual rule-based text normalization engine written in java, that can be used to implement stemming algorithms or phonetic normalizers. The project includes a french stemmer/ph

Text mining/classification

2003-07-27 Thread Kelvin Tan
Here's a pretty cool project at SF http://sourceforge.net/projects/exteca "The Exteca platform is an ontology-based technology written in Java for high-quality knowledge management and document categorisation. It can be used in conjunction with search engines." SF Project page says " Other/Propr

Highlighting

2003-07-27 Thread Kelvin Tan
Maik, your [EMAIL PROTECTED] email bounced, so I'm hoping you're still lurking on this list... I've updated the highlighting code you posted on your website to work with the current version of Lucene (except RangeQuery), that's 1.3 rc1 I believe. One change to the Lucene core classes needs to be m

RE: Lucene

2003-06-05 Thread Kelvin Tan
On Thu, 5 Jun 2003 16:49:18 -0500, Armbrust, Daniel C. said: >Maybe you should add another page (or section on the page) that is >for people to list the names of their companies or products that are >using lucene (with a brief description of what for), as the current >page only shows that Lucene

Re: Indyo

2003-05-27 Thread Kelvin Tan
at your companies web site, you have cool >products. Why thank you... > >Bryan > >- Original Message - >From: "Kelvin Tan" <[EMAIL PROTECTED]> >To: "Lucene Developers List" <[EMAIL PROTECTED]> >Sent: Tuesday, May 27, 2003 10:56 PM >Subject: Re

Re: Indyo

2003-05-27 Thread Kelvin Tan
Bryan, I've removed it from sandbox coz there never was a great deal of interest in it, and the codebase from which I had originated it has moved on, so I had no wish to keep maintaining it. Which aspect of Indyo interests you? Kelvin On Tue, 27 May 2003 22:36:39 -0500, Bryan LaPlante said: >Anyb

Re: [FAQ] Finding number of occurrences of a given word in a document

2003-01-30 Thread Kelvin Tan
this be the correct answer? >http://www.mail-archive.com/lucene- >[EMAIL PROTECTED]/msg01738.html > >I haven't tested it... > >Otis > >--- Kelvin Tan <[EMAIL PROTECTED]> wrote: >>Maybe this is a good FAQ entry, under Searching? >> >> >ht

[FAQ] Finding number of occurrences of a given word in a document

2003-01-30 Thread Kelvin Tan
Maybe this is a good FAQ entry, under Searching? http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg01735.html Regards, Kelvin The book giving manifesto - http://how.to/sharethisbook - To unsubscribe,

Failed Build: Query.java

2003-01-15 Thread Kelvin Tan
Someone's got JDK 1.4 installed...:-) compile: [javac] Compiling 73 source files to C:\checkout\jakarta-lucene\bin\classes [javac] C:\checkout\jakarta-lucene\src\java\org\apache\lucene\search\Query.jav a:175: cannot resolve symbol [javac] symbol : constructor RuntimeException (java.la

Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/document Document.java

2003-01-06 Thread Kelvin Tan
*shrug* just thought I'd point it out... On Mon, 6 Jan 2003 18:36:50 -0800 (PST), Otis Gospodnetic said: >Be my guest. > >--- Kelvin Tan <[EMAIL PROTECTED]> wrote: >>I couldn't help noticing that the code formatting for the new >>methods are different from the rest o

Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/document Document.java

2003-01-06 Thread Kelvin Tan
I couldn't help noticing that the code formatting for the new methods is different from the rest of the class (Turbine vs Sun). Shouldn't it be corrected? On 7 Jan 2003 02:29:21 -, [EMAIL PROTECTED] said: >otis 2003/01/06 18:29:21 > >Modified: src/java/org/apache/lucene/document Doc

[REPOST] [Benchmarks] Daniel's numbers

2002-12-11 Thread Kelvin Tan
Please see attached for diff to benchmarks.xml for Daniel's numbers. Thanks Dan! Regards, Kelvin The book giving manifesto - http://how.to/sharethisbook cvs -z9 diff benchmarks.xml (in directory C:\checkout\jakarta-lucene\xdocs\) Index: benchmarks.xml

[Benchmarks] Daniel's numbers

2002-12-09 Thread Kelvin Tan
Please see attached for diff to benchmarks.xml for Daniel's numbers. Thanks Dan! Regards, Kelvin The book giving manifesto - http://how.to/sharethisbook cvs -z9 diff benchmarks.xml (in directory C:\checkout\jakarta-lucene\xdocs\) Index: benchmarks.xml

Re: How do I get TermPositions for a given document?

2002-10-24 Thread Kelvin Tan
Dmitry would need commit access to the Lucene-sandbox to add the code in, I believe... Regards, Kelvin On Wed, 23 Oct 2002 23:21:45 -0700, Peter Carlson wrote: >Into the sandbox area sound great. > >Just add it to the contributions area in a project called >TermPositions >or something more cleve

Re: Updated Site - Indyo Tutorial

2002-09-16 Thread Kelvin Tan
Peter, The index.xml file in Lucene xdocs should contain a link to it because I modified it with: Indyo is a datasource-independent Lucene indexing framework. A tutorial for using Indyo can be found here. It _should_ work, but it clearly isn't. :) Regards, Kelvin On Mon, 16 Sep 2002 22

Re: fixed url and How to contribute code to lucene sandbox?

2002-09-09 Thread Kelvin Tan
al web server from the contributor. > >--Peter > > > > >On Sunday, September 8, 2002, at 08:10 PM, Kelvin Tan wrote: > >>For code to be added to Sandbox, it also has to be APL. >> >>Otis, I suggest creating a space on Lucene's website for these ad- >>

Re: fixed url and How to contribute code to lucene sandbox?

2002-09-08 Thread Kelvin Tan
For code to be added to Sandbox, it also has to be APL. Otis, I suggest creating a space on Lucene's website for these ad-hoc contribs. I know Sandbox was meant for this, but it's not reasonable to expect everyone to APL their code. I'm willing to maintain this section if necessary. Attachments ca

Re: Configuration RFC

2002-07-14 Thread Kelvin Tan
[snip] >Having a framework for dealing with multiple file types (text, HTML, >PDF, Word, etc) is critical. There was a proposal that floated >around >a few months ago which should be dusted off. Indyo, the indexing framework I checked into Sandbox (under the appex project) handles this aspect of

Re: Becoming a collaborator?

2002-06-27 Thread Kelvin Tan
I think this might be my cue. :) Right now, appex consists (afaik) of Andy's Centipede tool and the indexing framework I checked in. Right now, the package structure is rather confusing and it's not immediately clear where the sources are for these two components. That's partially my fault, I

SearchService

2002-06-11 Thread Kelvin Tan
's there, and we could discuss refactorings to introduce to make it implementation-neutral. Regards, Kelvin Tan

[sandbox] Non-project contributions

2002-05-06 Thread Kelvin Tan
What's the consensus on code which doesn't belong in a project? Does it warrant a directory of its own under /contributions even if it's only one file (like in the case of a javascript query validator)? Or should there be a "misc" directory for code like this? Regards, Kelv

Re: Proposal for Lucene

2002-05-04 Thread Kelvin Tan
Andy, I'm up for it. I've made further changes to what I previously posted and am keen on getting it into sandbox. K - Original Message - From: "Andrew C. Oliver" <[EMAIL PROTECTED]> To: "Lucene Developers List" <[EMAIL PROTECTED]> Sent: Saturday, May 04, 2002 10:23 AM Subject: Re: Propo

Re: [VOTE] Lucene Sandbox committer nomination

2002-05-01 Thread Kelvin Tan
Great! Thanks for the support. Regards, Kelvin - Original Message - From: "Peter Carlson" <[EMAIL PROTECTED]> To: "Lucene Developers List" <[EMAIL PROTECTED]> Cc: "Kelvin Tan" <[EMAIL PROTECTED]> Sent: Wednesday, May 01, 2002 6:46

Re: [VOTE] Lucene Sandbox committer nomination

2002-04-24 Thread Kelvin Tan
e Developers List" <[EMAIL PROTECTED]> Sent: Wednesday, April 24, 2002 11:56 PM Subject: Re: [VOTE] Lucene Sandbox committer nomination > I'm using his contribution, so +1, if he wants it, of course. > > Otis > > --- Peter Carlson <[EMAIL PROTECTED]> wrote

Asterisk in field search

2002-04-10 Thread Kelvin Tan
... ... ... ... ... ... org.apache.lucene.queryParser.ParseException: Encountered ":" at line 1, column 9. Was expecting one of: ... ... .. Regards, Kelvin Tan Relevanz Pte Ltd http://www.relevanz.com 180B Bencoolen St. The Bencoolen, #04-01 S(1896

Minor javadoc patch for DateFilter

2002-04-09 Thread Kelvin Tan
Copy and paste javadoc error in DateFilter. Really minor. Please see attached. Regards, Kelvin Tan Relevanz Pte Ltd http://www.relevanz.com 180B Bencoolen St. The Bencoolen, #04-01 S(189648) Tel: 6238 6229 Fax: 6337 4417 cvs diff DateFilter.java (in directory C:\checkout\jakarta-lucene

Search framework

2002-03-01 Thread Kelvin Tan
Torque DB index that Reptile uses. Sounds interesting doesn't it? :) Regards, Kelvin Tan Relevanz Pte Ltd http://www.relevanz.com 180B Bencoolen St. The Bencoolen, #04-01 S(189648) Tel: 238 6229 Fax: 337 4417

Re: Proposal for Lucene / new component

2002-02-26 Thread Kelvin Tan
Pls see inline reply. - Original Message - From: "Manfred Schäfer" <[EMAIL PROTECTED]> To: "Lucene Developers List" <[EMAIL PROTECTED]> Sent: Tuesday, February 26, 2002 7:23 PM Subject: Re: Proposal for Lucene / new component > Hi, > > "Andrew C. Oliver" wrote: > > > > > I'm trying to l

Re: Proposal for Lucene

2002-02-26 Thread Kelvin Tan
'd be more than happy if you could do that. It would be nice if Lucene had the equivalent of the commons-sandbox or turbine-stratum, a workplace kind-of. Regards, Kelvin > > Has anyone else looked at this? Any objections? > > -Andy > > > On Sat, 2002-02-09 at 07:58, Kelvi

Re: Proposal for Lucene

2002-02-26 Thread Kelvin Tan
Mark, My web server is acting all weird -- somehow this zip file refuses to download completely via HTTP (both in IE and Netscape, but downloading via FTP is fine). The workaround is that I've renamed it to http://www.relevanz.com/search_full.z. If your friendly zip program doesn't recognize it

Re: Proposal for Lucene

2002-02-09 Thread Kelvin Tan
- Original Message - From: Andrew C. Oliver <[EMAIL PROTECTED]> To: Lucene Developers List <[EMAIL PROTECTED]> Sent: Saturday, February 09, 2002 8:57 PM Subject: Re: Proposal for Lucene [snip] > > > 5. There's a JDBCDatasource for indexing a table from databases (the table > > stores me

Re: Proposal for Lucene

2002-02-09 Thread Kelvin Tan
D]> To: Lucene Developers List <[EMAIL PROTECTED]> Sent: Friday, February 08, 2002 9:18 PM Subject: Re: Proposal for Lucene > Is this open source? APL'd? Where can I look at it? > > -Andy > > On Thu, 2002-02-07 at 20:27, Kelvin Tan wrote: > > Great suggestions all arou

Re: Proposal for Lucene

2002-02-07 Thread Kelvin Tan
Great suggestions all around, and I'm pretty much in agreement with what's been said. In my app, I've built a mini-framework around the searching such that I'm able to map ContentHandlers (which index file contents) to file extensions. I've been wanting to clean it up and contribute it for awhi
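
For readers skimming this thread: the idea of mapping ContentHandlers to file extensions could look roughly like the sketch below. This is not the code from the message (the snippet above is truncated and the original source is not shown here). The `ContentHandler` interface, its `extractText` method, and the `HandlerRegistry` class are all hypothetical names chosen for illustration, and the sketch uses modern Java syntax rather than the JDK of 2002. No Lucene classes are involved; a real handler would presumably hand its output to the indexer.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical interface: turns a file's raw contents into indexable text. */
interface ContentHandler {
    String extractText(String rawContents);
}

/** Hypothetical registry that maps file extensions to content handlers. */
class HandlerRegistry {
    private final Map<String, ContentHandler> handlers = new HashMap<>();

    /** Registers a handler for an extension given without the dot, e.g. "txt". */
    void register(String extension, ContentHandler handler) {
        handlers.put(extension.toLowerCase(Locale.ROOT), handler);
    }

    /** Returns the handler for the file's extension, or null if none is registered. */
    ContentHandler handlerFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension to look up
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return handlers.get(extension);
    }
}

class HandlerRegistryDemo {
    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        // Plain text: index the contents as they are, minus surrounding whitespace.
        registry.register("txt", raw -> raw.trim());
        // HTML: crude tag stripping, for illustration only.
        registry.register("html", raw -> raw.replaceAll("<[^>]*>", ""));

        ContentHandler handler = registry.handlerFor("page.HTML");
        System.out.println(handler.extractText("<b>bold</b> text"));
    }
}
```

Keeping the lookup keyed on a lower-cased extension means new file types can be supported by registering another handler, without touching the indexing loop.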