Hi,
  I want to know how to restrict a search query to a category.  For example,
"java" can mean coffee or a programming language.  Is there a way to put
something in the query that distinguishes them?  Something like searching for
"java +category:food"?
[Iain>>] This is really not an easy question to answer.  Or rather, the short
answer is easy: "You can't do it".

Automatically identifying which of a word's several meanings is intended
(word sense disambiguation) is a research problem that has not been solved
in general.

If it were solved, you could rewrite the word before it was added to the
index, e.g. 'java' -> 'java_food'.  However, there is no clear way to define
these categories in general (see something like WordNet for how complex this
can get), and the person searching would have to know that 'food' was an
option for 'java'.
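
To make the 'java' -> 'java_food' idea concrete, here is a minimal sketch in
plain Python (not Lucene or Nutch code).  The SENSES table, the cue words and
the tag_senses function are all made up for illustration; a real system would
need a far better disambiguator than counting cue words.

```python
import re

# Hypothetical sense lexicon: ambiguous term -> {sense: cue words}.
# Both the senses and the cue words are illustrative assumptions.
SENSES = {
    "java": {
        "food": {"coffee", "cup", "brew", "roast", "drink"},
        "computing": {"code", "compiler", "class", "jvm", "programmer"},
    }
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def tag_senses(text):
    """Return the tokens to index, rewriting ambiguous terms as term_sense.

    A term is rewritten only when exactly one sense has the most cue words
    in the same text; otherwise it is left untouched.
    """
    tokens = tokenize(text)
    context = set(tokens)
    out = []
    for tok in tokens:
        senses = SENSES.get(tok)
        if not senses:
            out.append(tok)
            continue
        # Count how many cue words of each sense occur in the text.
        scores = {sense: len(cues & context) for sense, cues in senses.items()}
        best = max(scores.values())
        winners = [sense for sense, n in scores.items() if n == best]
        if best > 0 and len(winners) == 1:
            out.append(f"{tok}_{winners[0]}")
        else:
            out.append(tok)  # no clear winner: index the plain term
    return out


print(tag_senses("A fresh cup of java"))
print(tag_senses("The programmer brews java coffee"))
```

The first text yields 'java_food'.  The second has one cue word for each
sense, so the sketch leaves 'java' untagged, which is exactly the kind of
case that makes the general problem hard.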

If you have a controlled set of texts that covers only certain themes, then
you could automatically classify each text and add the classification to a
separate Lucene index.  Not easy, but doable.  You still get into trouble if
a text talks about the tendency of programmers to drink coffee.
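
The classify-then-index approach can be sketched as follows, again in plain
Python with a toy in-memory index standing in for Lucene.  CATEGORY_CUES,
classify and TinyIndex are hypothetical names invented for this example, and
the keyword classifier is only a placeholder for whatever classifier you
actually use.

```python
import re
from collections import defaultdict

# Hypothetical cue words per category (an assumption for this sketch).
CATEGORY_CUES = {
    "food": {"coffee", "cup", "brew", "roast", "espresso"},
    "computing": {"code", "compiler", "class", "jvm", "programmer"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify(tokens):
    """Return every category that has at least one cue word in the text."""
    words = set(tokens)
    return {cat for cat, cues in CATEGORY_CUES.items() if cues & words}


class TinyIndex:
    """Toy inverted index keyed by (field, term), standing in for Lucene."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids

    def add(self, doc_id, text):
        tokens = tokenize(text)
        for tok in tokens:
            self.postings[("body", tok)].add(doc_id)
        # Store the classification separately from the body text.
        for cat in classify(tokens):
            self.postings[("category", cat)].add(doc_id)

    def search(self, term, category):
        """Return ids of documents containing term AND tagged with category."""
        body_hits = self.postings.get(("body", term.lower()), set())
        cat_hits = self.postings.get(("category", category), set())
        return sorted(body_hits & cat_hits)


index = TinyIndex()
index.add(1, "Java is a dark roast coffee")
index.add(2, "The Java compiler turns code into bytecode")
index.add(3, "Every programmer drinks coffee while writing Java")

print(index.search("java", "food"))       # [1, 3]
print(index.search("java", "computing"))  # [2, 3]
```

Document 3 turns up under both categories, which is the coffee-drinking
programmer problem in miniature.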

Iain


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
