Rajesh Chandrakar wrote:
It seems to be a very nice thesis; I will look at it.
> >
> > Another problem has to do with searching/indexing. Search/index applications
> > are "broken" by non-Standard encodings.
>
> But how far is searching and indexing possible for encoded standards?
> Hopefully, someone on our list with better knowledge of search
engine technology will respond to your question. Here is a link which might also be helpful:
A Devanagari Search Engine for Unicode Documents with Compression
http://www.cse.iitk.ac.in/research/mtech1998/9811101.html
I have visited the above website. I can't say much about the Tamil text, since I don't know any of its letters and can't read it, but I can comment on Devanagari, which is the script of my mother tongue. Some "matras", such as "EE" and "U", are not displayed in their actual positions: in some places they come before, and in others after, where they belong in the word. (Look at the first line just below "Pallavi:", given in brackets.)
And, you can see one problem with Private Use Area encoding
and searching on one of my pages,
http://home.att.net/~jameskass/tamiltutf.htm
cheery regards
rajesh chandrakar

