On Sep 21, 2007, at 3:37 AM, Pieter Berkel wrote:
Thanks for the response guys:
Grant: I had a brief look at LingPipe, it looks quite interesting but I'm
concerned that the licensing may prevent me from using it in my project.
Does the OpenNLP license look good to you? It's LGPL. Not
Thanks for the response guys:
Grant: I had a brief look at LingPipe, it looks quite interesting but I'm
concerned that the licensing may prevent me from using it in my project.
Michael: I have used the Yahoo API in the past but due to its generic
nature, I wasn't entirely happy with the results
On 9/21/07, Pieter Berkel [EMAIL PROTECTED] wrote:
Yonik: This is the approach I had in mind, will it still work if I put the
SynonymFilter after the word-delimiter filter in the schema config?
SynonymFilter doesn't currently have the capability to handle multiple tokens
at the same position in
Not sure if this is in the same league or not, but Yahoo offers a term
extraction web service.
http://developer.yahoo.com/search/content/V1/termExtraction.html
On 9/20/07, Grant Ingersoll [EMAIL PROTECTED] wrote:
You might investigate some tools like Alias-i's LingPipe or do some searches
On 9/19/07, Pieter Berkel [EMAIL PROTECTED] wrote:
However, I'd like to be able to
analyze documents more intelligently to recognize phrase keywords such as
open source, Microsoft Office, Bill Gates rather than splitting each
word into separate tokens (the field is never used in search queries
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents. I've been experimenting with
MoreLikeThis and values returned by the mlt.interestingTerms parameter and
so far this approach has worked well. However, I'd like to be able to analyze
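
For readers following along, here is a rough sketch of the kind of request the message above describes: asking Solr's MoreLikeThis handler for the interesting terms of one document. The handler path, host, field names, and document id are assumptions made up for illustration, not details from the thread; only the `mlt.*` parameter names are standard Solr MoreLikeThis parameters. The function only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_query, fields, max_terms=10):
    """Build a MoreLikeThis request URL that asks for interesting terms.

    base_url, doc_query and fields are placeholders chosen by the caller;
    nothing here is taken from the original poster's setup.
    """
    params = {
        "q": doc_query,                     # query selecting the source document
        "mlt.fl": ",".join(fields),         # fields to mine for terms
        "mlt.interestingTerms": "details",  # return terms with their scores
        "mlt.mintf": 1,                     # minimum term frequency in the document
        "mlt.mindf": 1,                     # minimum document frequency in the index
        "mlt.maxqt": max_terms,             # cap on the number of query terms
        "rows": 0,                          # only the terms are wanted, not similar docs
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical handler path and field names.
    print(build_mlt_url("http://localhost:8983/solr/mlt", "id:123", ["title", "body"]))
```

Note that terms returned this way are single tokens produced by the field's analyzer, which is why multi-word phrases need a separate step.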
On Sep 19, 2007, at 9:58 PM, Pieter Berkel wrote:
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents.
We do it manually (not in Solr, but we put the results in Solr). We
do it the usual way - chunk (into n-grams, named entities noun
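
The n-gram chunking step mentioned above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the poster's actual pipeline (which the thread does not show): the stopword list, the tokenizer regex, and the function name `candidate_phrases` are all made up for the example, and named-entity or noun-phrase chunking is not covered.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with"}


def candidate_phrases(text, max_n=3, min_count=2):
    """Count word n-grams (1..max_n) as candidate keywords.

    N-grams that start or end with a stopword are skipped, and only
    candidates seen at least min_count times are returned.
    """
    tokens = re.findall(r"[A-Za-z0-9]+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return [(phrase, c) for phrase, c in counts.most_common() if c >= min_count]


if __name__ == "__main__":
    sample = ("Open source software is popular. "
              "Bill Gates talked about open source.")
    for phrase, count in candidate_phrases(sample):
        print(count, phrase)
```

The resulting phrases could then be stored in a separate field at index time, matching the "put the results in Solr" approach described in the message.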