Hi Fredrik and Terry,

Well, I got this in IDLE. I think I have done something wrong.
>>> import codecs
>>> f = open("C:\Documents and Settings\admin\My Documents\corpus\dainaikAikya collected by sushant.txt","r", "utf_8")

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    f = open("C:\Documents and Settings\admin\My Documents\corpus\dainaikAikya collected by sushant.txt","r", "utf_8")
TypeError: an integer is required

After that I tried binary read mode and read the first 32 bytes, and this is what I got:

>>> f = open("C:\Documents and Settings\\admin\\My Documents\\corpus\\dainaikAikya collected by sushant.txt","rb")
>>> f.read(32)
'\xef\xbb\xbf\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa5\x80 \xe0\xa4\xa6\xe0\xa4\xbf\xe0\xa4\xb2\xe0\xa5\x8d\xe0\xa4\xb2\xe0\xa5\x80,'

Based on my knowledge of Unicode, I think this is a UTF-8 file (the first 3 bytes are \xef\xbb\xbf). Please correct me if I am wrong. How do I read this?

Atul.

PS: I wrote the above code using the information in section 4.8, "Codecs", of the Library Reference PDF. Am I doing something wrong? Please do let me know.

On Jul 25, 6:21 am, Terry Reedy <[EMAIL PROTECTED]> wrote:
> Atul. wrote:
> > Hello All,
> >
> > I wanted to know what encoding should I use to open the files with
> > Devanagari characters. I was thinking of UTF-8 but was not sure, any
> > leads on this? Anyone used it earlier?
>
> You cannot hurt your machine by giving that a try.
>
> This is a general comment for all beginners. Before posting, open the
> interactive interpreter (or IDLE) and try something(s). If the result
> puzzles you, copy and paste into a post. Or if more appropriate, open
> the Python manuals and search a bit, or try a search engine.
--
http://mail.python.org/mailman/listinfo/python-list
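For reference, here is a minimal sketch of one way such a file could be read. The TypeError above is consistent with the built-in open() treating its third argument as an integer buffer size, whereas codecs.open() takes an encoding in that position. The sketch assumes the file really is UTF-8 with a leading byte order mark, as the first three bytes suggest. The names SAMPLE, read_utf8_text and demo_path are hypothetical, and a temporary file stands in for the real corpus file.

```python
import codecs
import os
import tempfile

# The 32 bytes returned by f.read(32) in the session above:
# a UTF-8 byte order mark followed by Devanagari text.
SAMPLE = (b'\xef\xbb\xbf\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa5\x80 '
          b'\xe0\xa4\xa6\xe0\xa4\xbf\xe0\xa4\xb2\xe0\xa5\x8d'
          b'\xe0\xa4\xb2\xe0\xa5\x80,')


def read_utf8_text(path):
    """Return the file's contents as a unicode string, without a leading BOM."""
    # codecs.open (unlike the built-in open used in the session) accepts an
    # encoding as its third argument. "utf_8_sig" decodes UTF-8 and skips
    # the byte order mark when one is present.
    f = codecs.open(path, "r", "utf_8_sig")
    try:
        return f.read()
    finally:
        f.close()


# Hypothetical stand-in for the corpus file: write the sample bytes to a
# temporary file, read them back as text, then clean up.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.write(fd, SAMPLE)
os.close(fd)

text = read_utf8_text(demo_path)
os.remove(demo_path)
```

For the real Windows path, a raw string such as r"C:\Documents and Settings\admin\..." avoids doubling the backslashes. In an ordinary string literal, "\a" in "\admin" is read as the ASCII bell character, so the single-backslash path in the first call would likely not have matched the file even without the TypeError.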