I can't speak about the automatic classification of unstructured information, other than to say that most semantic analysers I have seen are not effective compared to a person. (Talk about manual versus automatic recounts :) As the size of the underlying information sources grows, the uncertainty grows as well, but essentially you don't see it unless you look for it. Also, the initial promises of the WWW, with webs of trusted linked information, seem naive today. You can't expect enough people to maintain the necessary links to make a very large collection of information coherent. As to structured data and data mining, having witnessed such a project here with UMHS's clinical data warehouse, I can say that cleaning the data and data feeds was an enormous task and still requires a pretty significant effort.
