Hi all, I have to develop a prototype for semantic search over a corpus. My idea is to build a topic-space model of the corpus using LDA (after creating the TF vectors), then represent each query as a point in that space, measure the distance between the query and the documents, and retrieve the nearest one.
I'm also thinking about clustering the documents before running LDA, so that I can return to the user some similar documents on the same topics. My problem is that I'm not sure whether I can "extract" the LDA model and represent new points (the queries) in it. It would be great if somebody could point me to an explanation or a web page that does this task, or something along those lines. Not only code: articles explaining these techniques would also be very helpful to me.
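
To make the idea concrete, here is a rough sketch of the pipeline I have in mind. It assumes scikit-learn (my choice for illustration, I'm not tied to any library), a made-up toy corpus, two topics, and cosine similarity as the closeness measure. The `search` helper is a hypothetical name of mine. I don't know whether projecting a query with `transform` like this is the right way to do it, which is really what I'm asking.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus, invented purely for illustration.
corpus = [
    "the cat sat on the mat with another cat",
    "dogs and cats are common household pets",
    "stock markets fell as investors sold shares",
    "the central bank raised interest rates again",
]

# Step 1: term-frequency (TF) vectors for the corpus.
vectorizer = CountVectorizer(stop_words="english")
tf = vectorizer.fit_transform(corpus)

# Step 2: fit LDA and get each document's topic distribution.
# n_components=2 is an arbitrary assumption for this tiny corpus.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(tf)  # shape: (n_docs, n_topics)


def search(query, top_k=1):
    """Project a query into the fitted topic space and rank documents.

    Returns a list of (document index, cosine similarity) pairs,
    most similar first.
    """
    # Reuse the fitted vocabulary; unseen words are simply ignored.
    q_tf = vectorizer.transform([query])
    # Infer the query's topic distribution with the already fitted model.
    q_topics = lda.transform(q_tf)
    sims = cosine_similarity(q_topics, doc_topics)[0]
    ranked = sims.argsort()[::-1][:top_k]
    return [(int(i), float(sims[i])) for i in ranked]


results = search("investors and interest rates", top_k=2)
for idx, score in results:
    print(idx, round(score, 3), corpus[idx])
```

The fitted `vectorizer` and `lda` objects are what I mean by "extracting" the model: they would be kept around (or saved to disk) and reused for every new query.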
