RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-18 Thread Philippe Laflamme
List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] On Monday 17 November 2003 07:40, Chong, Herb wrote: i don't know what the Java implementation is like but the C++ one is very fast. ... I personally do not have any experience with the BreakIterator

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-18 Thread Chong, Herb
[was Re: Vector Space Model in Lucene?] In terms of speed I would tend to agree with you. My question regarding efficiency was directed more towards the quality of the results it provides. Is the BreakIterator breaking on correct sentence boundaries or is it being confused by dots at the end

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Chong, Herb
: Andrzej Bialecki [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 5:54 PM To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] Well ... Sure, nothing can replace a human mind. But believe it or not, there are studies which show that even human

RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Chong, Herb
looking for one. Herb -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 6:45 PM To: Lucene Users List Subject: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?]) Hello Herb, I don't

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Chong, Herb
Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] Isn't that quite a strict interpretation, however? There are many cases where linguistically separate sentences do have strong dependencies; in the web world simple things like list items may be very closely related. Put

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Chong, Herb
[mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 8:30 PM To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] Hmmh? You implied that there are some useful distance heuristics (words 5 words apart or more correlate much less), and others have

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Chong, Herb
To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] What you can do is use a POS tagger (e.g. a maximum-entropy-model-based or Brill tagger if you just have English) and use a data mining algorithm to weight your terms. Maybe you can use a hidden

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Chong, Herb
Message- From: Karsten Konrad [mailto:[EMAIL PROTECTED] Sent: Saturday, November 15, 2003 7:16 AM To: Lucene Users List Subject: AW: inter-term correlation [was Re: Vector Space Model in Lucene?] Anyway, Herb is right, sentence boundaries do carry a meaning and the linguistic rule could

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Philippe Laflamme
: RE: inter-term correlation [was Re: Vector Space Model in Lucene?] i have a program written in Icon that does basic sentence splitting. with about 5 heuristics and one small lookup table, i can get well over 90% accuracy doing sentence boundary detection on email. for well edited English

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Chong, Herb
i don't know what the Java implementation is like but the C++ one is very fast. Herb -Original Message- From: Philippe Laflamme [mailto:[EMAIL PROTECTED] Sent: Monday, November 17, 2003 9:39 AM To: Lucene Users List Subject: RE: inter-term correlation [was Re: Vector Space Model

AW: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Karsten Konrad
] www.xtramind.com -Original Message- From: Philippe Laflamme [mailto:[EMAIL PROTECTED] Sent: Monday, November 17, 2003 15:39 To: Lucene Users List Subject: RE: inter-term correlation [was Re: Vector Space Model in Lucene?] There is already an implementation in the Java API

RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Dan Quaroni
My only concern with this being integrated into Lucene is that it be done in a way that doesn't make its use mandatory. Lucene is powerful enough that it can be used for a lot of cases where NLP doesn't make any sense. For example, I think that sentence boundaries would severely screw up the

RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Chong, Herb
show an example document. Herb -Original Message- From: Dan Quaroni [mailto:[EMAIL PROTECTED] Sent: Monday, November 17, 2003 9:48 AM To: 'Lucene Users List' Subject: RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?]) My only

Re: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Joe Paulsen
Message - From: Chong, Herb [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Monday, November 17, 2003 10:00 AM Subject: RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?]) show an example document. Herb -Original Message

RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Chong, Herb
-Original Message- From: Joe Paulsen [mailto:[EMAIL PROTECTED] Sent: Monday, November 17, 2003 10:12 AM To: Lucene Users List Subject: Re: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?]) Hope this isn't out of context - but Dan makes a very

RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Dan Quaroni
and the needs of the user. -Original Message- From: Chong, Herb [mailto:[EMAIL PROTECTED] Sent: Monday, November 17, 2003 10:01 AM To: Lucene Users List Subject: RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?]) show an example document. Herb

Re: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Andrzej Bialecki
Joe Paulsen wrote: Hope this isn't out of context - but Dan makes a very valid point. Besides the potential performance slowdown if NLP was always applied to a user's query - there are times that an exact term match is desired without the query expansion that an NLP process normally requires.

RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Chong, Herb
to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?]) I'm not sure I can share a sample, but the specific situation I'm thinking of is when you have data that doesn't exist within a sentence, for example the name, address, etc. of a company. Some foreign companies have

RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Chong, Herb
to do that, there is no point in using Lucene. Herb... -Original Message- From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] Sent: Monday, November 17, 2003 10:26 AM To: Lucene Users List Subject: Re: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model

RE: Contributing to Lucene (was RE: inter-term correlation [was Re: Vector Space Model in Lucene?])

2003-11-17 Thread Otis Gospodnetic
correlation [was Re: Vector Space Model in Lucene?]) Query expansion can (and I believe should) be done efficiently outside the core of the search engine. After all, it's a process of changing the query according to some expansion/rewriting algorithms, but it is still the unchanged search
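
To illustrate the point about doing query expansion outside the engine, here is a minimal sketch in plain Python (not Lucene code). The `SYNONYMS` table and the `expand_query` helper are hypothetical, and the AND/OR string it produces is just one plausible boolean form that an unchanged search core could then parse.

```python
# Hypothetical synonym table; a real expansion step might use a thesaurus or statistics.
SYNONYMS = {"tax": ["taxes", "taxation"]}


def expand_query(query, synonyms):
    """Rewrite a plain multi-word query into a boolean query string.

    Each term is OR-ed with its alternatives; the terms are AND-ed together.
    The search engine itself is untouched: it only sees the rewritten string.
    """
    clauses = []
    for term in query.lower().split():
        alternatives = [term] + synonyms.get(term, [])
        if len(alternatives) == 1:
            clauses.append(term)
        else:
            clauses.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(clauses)


print(expand_query("capital gains tax", SYNONYMS))
```

The rewriting happens entirely before the query reaches the index, which is the separation the message argues for.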

Re: AW: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Doug Cutting
Karsten Konrad wrote: I was wondering whether we could, while indexing, make a use of this by increasing the position counter by a large number, let's say 1000, whenever we encounter a sentence separator (Note, this is not trivial; not every '.' ends a sentence etc. etc. etc.). Thus, searching
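
A minimal sketch of the position-gap idea quoted above, in plain Python rather than Lucene's API. The gap of 1000 is the figure from the message; `assign_positions`, `within_slop`, and the naive sentence detection are hypothetical stand-ins for a real analyzer and boundary detector.

```python
import re

SENTENCE_GAP = 1000  # figure from the message; it only needs to exceed the largest slop used


def assign_positions(text, gap=SENTENCE_GAP):
    """Return (token, position) pairs, jumping by `gap` after each sentence end.

    Sentence detection is deliberately naive (a token ending in . ! ?);
    as the message notes, real boundary detection is not trivial.
    """
    positions = []
    pos = 0
    increment = 1
    for raw in text.split():
        token = re.sub(r"\W+", "", raw).lower()
        if not token:
            continue
        pos += increment
        positions.append((token, pos))
        # The first token of the next sentence gets the large increment.
        increment = gap if raw[-1] in ".!?" else 1
    return positions


def within_slop(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` positions in between."""
    firsts = [p for t, p in positions if t == first]
    seconds = [p for t, p in positions if t == second]
    return any(0 < s - f <= slop + 1 for f in firsts for s in seconds)


tokens = assign_positions("Hello Herb. Have a nice day!")
print(tokens)
print(within_slop(tokens, "herb", "have", slop=5))  # the sentence gap blocks this match
print(within_slop(tokens, "have", "nice", slop=1))  # same sentence, one word in between
```

Because the gap is larger than any slop a proximity query would use, no such query can match across a sentence boundary, while matching inside a sentence is unaffected.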

RE: AW: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Chong, Herb
Space Model in Lucene?] This is exactly the sort of approach I was advocating in earlier messages. (Although I think you'd only need to increase the position counter by 101 for the first word in each sentence.) Herb Chong didn't seem to think this was appropriate, but I never understood why. Doug

RE: AW: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Chong, Herb
you could use the negative of the actual value. Herb -Original Message- From: Doug Cutting [mailto:[EMAIL PROTECTED] Sent: Monday, November 17, 2003 2:56 PM To: Lucene Users List Subject: Re: AW: inter-term correlation [was Re: Vector Space Model in Lucene?] This is exactly

Re: AW: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Doug Cutting
PROTECTED] Sent: Monday, November 17, 2003 2:56 PM To: Lucene Users List Subject: Re: AW: inter-term correlation [was Re: Vector Space Model in Lucene?] This is exactly the sort of approach I was advocating in earlier messages. (Although I think you'd only need to increase the position counter

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-17 Thread Tatu Saloranta
On Monday 17 November 2003 07:40, Chong, Herb wrote: i don't know what the Java implementation is like but the C++ one is very fast. ... I personally do not have any experience with the BreakIterator in Java. Has anyone used it in any production environment? I'd be very interested to learn

Re: Contributing to Lucene (was RE: inter-term correlation [was R e: Vector Space Model in Lucene?])

2003-11-17 Thread Tatu Saloranta
On Monday 17 November 2003 08:39, Chong, Herb wrote: the core of the search engine has to have certain capabilities, however, because they are next to impossible to add as a layer on top with any efficiency. detecting sentence boundaries outside the core search engine is really hard to do

Re: understanding IR topics on this list [was: Re: Vector Space Model in Lucene?]

2003-11-16 Thread Magnus Johansson
[mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 12:39 PM To: Lucene Users List Subject: Re: Vector Space Model in Lucene? Herb Hmm... Are you perhaps familiar with some open system which doesn't? I'm curious because one of my projects (already using Lucene) could benefit from

AW: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-15 Thread Karsten Konrad
you are in flame mode anyway now :) Regards, Karsten -Original Message- From: petite_abeille [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 20:04 To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] On Nov 14, 2003

understanding IR topics on this list [was: Re: Vector Space Model in Lucene?]

2003-11-15 Thread Gerret Apelt
. my project at the time was cancelled after TREC-7 and so there haven't been any new developments. Herb -Original Message- From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 12:39 PM To: Lucene Users List Subject: Re: Vector Space Model in Lucene? Herb

Re: Vector Space Model in Lucene?

2003-11-14 Thread Leo Galambos
Really? And what model is used/implemented by Lucene? THX Leo Otis Gospodnetic wrote: Lucene does not implement vector space model. Otis --- [EMAIL PROTECTED] wrote: Hi, does Lucene implement a Vector Space Model? If yes, does anybody have an example of how to use it? Cheers, Ralf

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
does it matter? vector space is only one of several important ones. Herb -Original Message- From: Leo Galambos [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 4:00 AM To: Lucene Users List Subject: Re: Vector Space Model in Lucene? Really? And what model is used

AW: Vector Space Model in Lucene?

2003-11-14 Thread Karsten Konrad
. November 2003 14:35 To: Lucene Users List Subject: RE: Vector Space Model in Lucene? does it matter? vector space is only one of several important ones. Herb -Original Message- From: Leo Galambos [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 4:00 AM To: Lucene Users List

Re: Vector Space Model in Lucene?

2003-11-14 Thread Leo Galambos
: Friday, November 14, 2003 4:00 AM To: Lucene Users List Subject: Re: Vector Space Model in Lucene? Really? And what model is used/implemented by Lucene? THX Leo - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
-Original Message- From: Karsten Konrad [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 9:08 AM To: Lucene Users List Subject: AW: Vector Space Model in Lucene? what are these several other important ones?

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
to me, vector space implies thinking inside the box. Herb...

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
like all vector space models i have come across, Lucene ignores interterm correlation. Herb

Re: Vector Space Model in Lucene?

2003-11-14 Thread Andrzej Bialecki
Chong, Herb wrote: like all vector space models i have come across, Lucene ignores interterm correlation. Herb Hmm... Are you perhaps familiar with some open system which doesn't? I'm curious because one of my projects (already using Lucene) could benefit from such a feature. Right now I'm

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
: Re: Vector Space Model in Lucene? Herb Hmm... Are you perhaps familiar with some open system which doesn't? I'm curious because one of my projects (already using Lucene) could benefit from such a feature. Right now I'm using a bastardized version of Markov chains, but it's more of a hack

inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Joshua O'Madadhain
Incorporating inter-term correlation into Lucene isn't that hard; I've done it. Nor is it incompatible with the vector-space model. I'm not happy with the specific correlation metric that I picked, which is why I'm not eager to generally release the code I wrote, but I think that the basic

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Chong, Herb
14, 2003 1:14 PM To: Lucene Users List Subject: inter-term correlation [was Re: Vector Space Model in Lucene?] Incorporating inter-term correlation into Lucene isn't that hard; I've done it. Nor is it incompatible with the vector-space model. I'm not happy with the specific correlation metric

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Joshua O'Madadhain
- From: Joshua O'Madadhain [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 1:14 PM To: Lucene Users List Subject: inter-term correlation [was Re: Vector Space Model in Lucene?] Incorporating inter-term correlation into Lucene isn't that hard; I've done it. Nor is it incompatible

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Chong, Herb
: Friday, November 14, 2003 1:53 PM To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] Not sure what you mean by terms can't cross sentence boundaries. If you're only using single-word terms, that's trivially true. What is it that you're trying

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Erik Hatcher
On Friday, November 14, 2003, at 01:13 PM, Chong, Herb wrote: if you didn't have to change the index then you haven't got all the factors needed to do it well. terms can't cross sentence boundaries and the index doesn't store sentence boundaries. You mean if you have text like this: Hello Herb.

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Chong, Herb
, 2003 1:52 PM To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] You mean if you have text like this: "Hello Herb. Have a nice day!", you want to prevent phrase queries for "herb have"? You could prevent sentence boundary crossing with clever use

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread petite_abeille
On Nov 14, 2003, at 19:50, Chong, Herb wrote: if you are handling inter correlation properly, then terms can't cross sentence boundaries. Could you not break down your document along sentence boundaries? If you manage to figure out what a sentence is, that is. if you are not paying attention to

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Erik Hatcher
On Friday, November 14, 2003, at 02:02 PM, Chong, Herb wrote: if i just run this query against a million document newswire index, i know i am going to get lots of hits. the phrase capital gains tax hits a lot fewer documents, but is overrestrictive. the fact that the three terms occur next to

Re: Vector Space Model in Lucene?

2003-11-14 Thread Dror Matalon
. Herb -Original Message- From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 12:39 PM To: Lucene Users List Subject: Re: Vector Space Model in Lucene? Herb Hmm... Are you perhaps familiar with some open system which doesn't? I'm curious

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Chong, Herb
, November 14, 2003 2:10 PM To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] With Lucene's analysis process, you can assign a position increment to tokens. The default value is 1, meaning it's the next position. Phrase queries default to a slop of 0

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Philippe Laflamme
Space Model in Lucene?] On Nov 14, 2003, at 19:50, Chong, Herb wrote: if you are handling inter correlation properly, then terms can't cross sentence boundaries. Could you not break down your document along sentence boundaries? If you manage to figure out what a sentence

Re: Vector Space Model in Lucene?

2003-11-14 Thread petite_abeille
On Nov 14, 2003, at 20:27, Dror Matalon wrote: I might be the only person on the list who's having a hard time following this discussion. Nope. I don't understand a word of what those guys are talking about either :) Would one of you wise folks care to point me to a good dummies, also known as

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
:28 PM To: Lucene Users List Subject: Re: Vector Space Model in Lucene? Hi, I might be the only person on the list who's having a hard time following this discussion. Would one of you wise folks care to point me to a good dummies, also known as an executive summary, resource about the theoretical

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread petite_abeille
On Nov 14, 2003, at 20:29, Philippe Laflamme wrote: Rules of linguistics? Is there such a thing? :) Actually, yes there is. Natural Language Processing (NLP) is a very broad research subject but a lot has come out of it. A lot of what? If statements? :) More specifically, Rule-based taggers

Re: Vector Space Model in Lucene?

2003-11-14 Thread Erik Hatcher
On Friday, November 14, 2003, at 02:32 PM, Chong, Herb wrote: when people type in multiword queries, mostly they are interested in phrases in the linguistic sense. phrases don't cross sentence boundaries. you need certain features in the index and in the ranking algorithm to capture that

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
Subject: Re: Vector Space Model in Lucene? In the Lucene-sense of things, sounds like you're after one Document per sentence. You then get your boundaries automatically as well as the distance weighting through the coord() Similarity function. At least that seems like a close approximation

Re: Vector Space Model in Lucene?

2003-11-14 Thread Erik Hatcher
On Friday, November 14, 2003, at 02:54 PM, Chong, Herb wrote: it solves one part of the problem, but there are a lot of sentences in a typical document. you'll need to composite a rank of a document from its constituent sentences then. there are less drastic ways to solve the problem. the
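
The compositing step mentioned above (one index entry per sentence, then a document rank built from its sentences) could look roughly like the following sketch. It is plain Python, not Lucene code; `rank_documents`, the `(doc_id, score)` input shape, and the choice of `max` or `sum` as the combining function are all assumptions, since the thread does not say how the scores should be combined.

```python
from collections import defaultdict


def rank_documents(sentence_hits, combine=max):
    """Combine per-sentence scores into one score per parent document.

    sentence_hits: iterable of (doc_id, sentence_score) pairs, one per matching sentence.
    combine: function from a list of scores to a single score (an assumption;
             max rewards the best sentence, sum rewards many matching sentences).
    """
    per_doc = defaultdict(list)
    for doc_id, score in sentence_hits:
        per_doc[doc_id].append(score)
    ranked = [(doc_id, combine(scores)) for doc_id, scores in per_doc.items()]
    # Highest score first; ties broken by doc_id for a stable order.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


hits = [("doc1", 4), ("doc2", 9), ("doc1", 7), ("doc3", 2)]
print(rank_documents(hits))
print(rank_documents(hits, combine=sum))
```

With `max`, doc2 ranks first on the strength of a single sentence; with `sum`, doc1 overtakes it because two of its sentences match.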

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Philippe Laflamme
analysis. Maybe someone out there has some experience they might want to share with us? Thanks, Phil -Original Message- From: petite_abeille [mailto:[EMAIL PROTECTED] Sent: November 14, 2003 14:36 To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Doug Cutting
Chong, Herb wrote: since i am working now on financial news, here is an example: capital gains tax if i just run this query against a million document newswire index, i know i am going to get lots of hits. the phrase capital gains tax hits a lot fewer documents, but is overrestrictive. the fact

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
, but implementing it can be. Herb -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 3:08 PM To: Lucene Users List Subject: Re: Vector Space Model in Lucene? I get the feeling you're looking for reasons that Lucene is inadequate. This may

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
Space Model in Lucene? This all sounds wonderfully exotic, but, from all the different esoteric approaches you ever tried, what, if anything, made a concrete and noticeable impact on the quality of your search?

Re: Vector Space Model in Lucene?

2003-11-14 Thread petite_abeille
On Nov 14, 2003, at 21:16, Chong, Herb wrote: if you know what TREC is, you know what i meant earlier. this isn't exotic technology, this is close to 15 year old technology. This is not really what I asked. What I would be interested to know is what approach you consider to provide the biggest

RE: Vector Space Model in Lucene?

2003-11-14 Thread Chong, Herb
to implement efficiently. Herb... -Original Message- From: petite_abeille [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 3:20 PM To: Lucene Users List Subject: Re: Vector Space Model in Lucene? This is not really what I asked. What I would be interested to know is what approach you

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Doug Cutting
. there is psychology of query creation too and that is one thing i am taking advantage of. Herb -Original Message- From: Doug Cutting [mailto:[EMAIL PROTECTED] Sent: Friday, November 14, 2003 3:15 PM To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model

RE: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Chong, Herb
:33 PM To: Lucene Users List Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?] Certainly there are lots of scoring algorithms that one cannot easily implement with Lucene. I'm just not yet clear on what you need to do that Lucene cannot support

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Doug Cutting
Leo Galambos wrote: There are other (more trivial) problems as well. One geek from UFAL (our NLP lab) reported that it was a hard problem to find the boundaries, or rather, to say whether a dot is a dot or something else, i.e. blah, i.e. blah i.b.m. i.p. pavlov 3.14 28.10.2003 etc. On the
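
The ambiguous dots listed above (abbreviations, initials, decimals, dates) can be handled to a first approximation with a few heuristics and a small lookup table. The sketch below is plain Python and is not anyone's actual implementation from this thread; the `ABBREVIATIONS` table, the `DOTTED_INITIALS` pattern, and the "next word is capitalized" rule are assumptions chosen for illustration.

```python
import re

# Hypothetical lookup table; a real one would be larger and language-specific.
ABBREVIATIONS = {"i.e.", "e.g.", "etc.", "i.b.m.", "i.p."}
# Single letters separated by dots, such as "U.S." (assumed never to end a sentence here).
DOTTED_INITIALS = re.compile(r"(?:[A-Za-z]\.)+")


def is_sentence_end(token, next_token):
    """Guess whether `token` closes a sentence."""
    if token[-1] not in ".!?":
        return False  # covers 3.14 and 28.10.2003, whose dots are internal
    if token.lower() in ABBREVIATIONS or DOTTED_INITIALS.fullmatch(token):
        return False
    # End of text, or the next word starts with a capital letter.
    return next_token is None or next_token[0].isupper()


def split_sentences(text):
    """Split whitespace-tokenized text into sentences using the heuristics above."""
    tokens = text.split()
    sentences, current = [], []
    for i, tok in enumerate(tokens):
        current.append(tok)
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if is_sentence_end(tok, nxt):
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences


sample = ("Pi is about 3.14, i.e. a bit more than three. "
          "The memo from I.B.M. is dated 28.10.2003. Was it signed?")
for sentence in split_sentences(sample):
    print(sentence)
```

A known weakness of this sketch is that an abbreviation that really does end a sentence (for example "... sold to I.B.M. The deal ...") is never treated as a boundary.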

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread petite_abeille
On Nov 14, 2003, at 21:14, Philippe Laflamme wrote: Rules of linguistics? Is there such a thing? :) Actually, yes there is. Natural Language Processing (NLP) is a very broad research subject but a lot has come out of it. A lot of what? If statements? :) Yes... just like every software boils down

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Stefan Groschupf
PA, But Lucene is a low level indexing library. I'm sure most people here will agree that Lucene is much more than a _low level_ indexing library. Maybe it is just a library, but definitely the *highest level* search technology available on the web for free. You ride roughshod over the

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Andrzej Bialecki
Well ... Sure, nothing can replace a human mind. But believe it or not, there are studies which show that even human experts can significantly differ in their opinions on what are key-phrases for a given text. So, the results are never clear cut with humans either... So, in this sense a

Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

2003-11-14 Thread Stefan Groschupf
Herb, On Friday 14 November 2003 13:39, Chong, Herb wrote: you're describing ad-hoc solutions to a problem that have an effect, but not one that is easily predictable. one can concoct all sorts of combinations of the query operators that would have something of the effect that i am describing.

Re: Vector Space Model in Lucene?

2003-11-13 Thread Otis Gospodnetic
Lucene does not implement vector space model. Otis --- [EMAIL PROTECTED] wrote: Hi, does Lucene implement a Vector Space Model? If yes, does anybody have an example of how to use it? Cheers, Ralf

Vector Space Model in Lucene?

2003-11-12 Thread ambiesense
Hi, does Lucene implement a Vector Space Model? If yes, does anybody have an example of how to use it? Cheers, Ralf