I think a big part of the problem is that most approaches to word
similarity (especially thesaurus-based approaches like WordNet, but also the
significantly better distributional approaches) use very impoverished
representations of knowledge.  They cannot make useful inferences because
they lack the underlying knowledge and experience that people draw on when
they make similarity judgements.
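
To make "distributional approach" concrete, here is a minimal sketch of the
basic idea: represent each word by counts of the words that occur near it,
then compare words by the cosine of their count vectors.  The toy corpus, the
stopword list, the window size, and the function names are all assumptions
chosen for illustration; real systems use large corpora and better weighting.

```python
from collections import Counter, defaultdict
from math import sqrt

# Illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "in", "on", "into", "he", "she"}


def cooccurrence_vectors(sentences, window=2, stopwords=STOPWORDS):
    """Map each word to a Counter of the words seen within `window`
    positions of it, after dropping stopwords."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = [t for t in sentence.lower().split() if t not in stopwords]
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[w] for w, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for this example.
corpus = [
    "he cast the reel into the river",
    "she caught a fish in the river",
    "the fish pulled hard on the reel",
    "the stock market fell sharply today",
]
vecs = cooccurrence_vectors(corpus)
print(cosine(vecs["fish"], vecs["reel"]))
print(cosine(vecs["fish"], vecs["market"]))
```

Note that the vectors only record which words appear near which other words.
Nothing in them says what a fish or a reel is, which is the sense in which
the representation is impoverished.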

I am especially interested in how Gärdenfors-style conceptual space
modeling can improve the situation, alongside other NLP
techniques.  The Cognitive Geography group at UCSB is doing some interesting
work, including a Java library for conceptual space modeling.  They plan to
release it as free software, but it's not finished yet.  I got an early version
of the code from them last year and experimented with it a bit in Clojure.  I
plan to do more with it and see what results I can get.
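
As a rough illustration of the conceptual-space idea (and not the API of the
UCSB library, which is not shown in this thread), the sketch below places
concepts as points along a few quality dimensions, measures a weighted
distance between them, and turns distance into similarity with an
exponential decay.  Context is modeled by changing the dimension weights.
The dimensions, coordinates, weights, and function names are all
hypothetical values made up for the example.

```python
from math import exp, sqrt


def weighted_distance(p, q, weights):
    """Weighted Euclidean distance between two points given as
    {dimension: value} dicts; `weights` gives each dimension's salience."""
    return sqrt(sum(w * (p[d] - q[d]) ** 2 for d, w in weights.items()))


def similarity(p, q, weights, c=1.0):
    """Similarity decays exponentially with weighted distance.
    `c` is a sensitivity parameter (an assumption, set to 1.0 here)."""
    return exp(-c * weighted_distance(p, q, weights))


# Hypothetical coordinates on three made-up quality dimensions.
fish = {"animate": 1.0, "aquatic": 1.0, "fishing": 0.9}
reel = {"animate": 0.0, "aquatic": 0.2, "fishing": 0.9}

# Two hypothetical contexts: every dimension equally salient, versus a
# fishing context where the "fishing" dimension dominates.
general_context = {"animate": 1.0, "aquatic": 1.0, "fishing": 1.0}
fishing_context = {"animate": 0.1, "aquatic": 0.1, "fishing": 1.0}

print(similarity(fish, reel, general_context))
print(similarity(fish, reel, fishing_context))
```

With these made-up numbers the two words come out more similar under the
fishing weights than under the uniform ones, because the dimensions on which
they differ count for less in that context.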

Also, most of NLTK works in Jython*, and by extension in Jython running in
Clojure (which is why I started writing a convenience wrapper to make it
easier to use Python libraries: http://code.google.com/p/clojure-python/).


*Getting NLTK to work in Jython is currently somewhat awkward, because you
need to modify a few things before it will run.

I think it's great that there's a Clojure NLP library in the works.  If the
Clojure NLP libs turn out better than NLTK, then everyone in computational
linguistics will switch to Clojure. :)

On Thu, Jul 29, 2010 at 9:16 AM, Michael Harrison (goodmike) <
goodmike...@gmail.com> wrote:

> As others have said, this is a difficult problem, but a fascinating
> one too. I'm currently nibbling on building some grouping-by-
> similarity algorithms for Clojure, although I'm sticking to numerical
> criteria for similarity or "distance". New developments in text
> analysis and in the Learning by Reading approach to AI (described at,
> e.g., http://blog.steinberg.org/?p=11) are making data science an
> exciting place. If you make some headway, please do share with us. I
> for one would love to see where you go and contribute if possible.
>
> Cheers,
> Michael
>
>
> On Jul 28, 4:58 pm, Daniel <doubleagen...@gmail.com> wrote:
> > I want to write a Clojure program that searches for similarities between
> > words in the English language and places them in a graph, where the
> > distance between nodes indicates their similarity.  I don't mean
> > syntactical similarity.  Related contextual meaning is closer to the
> > mark.
> >
> > For instance: "fish" and "reel" don't have much similarity, but in the
> > context of fishing they do, so the distance in such a graph wouldn't
> > be very large.
> >
> > I'm sure research has been done in this area (I suspect with no small
> > portion belonging to Google), so can anybody point me in the right
> > direction?
> >
> > Thanks.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>
