Hi to all,

I generated some text files with another language that cannot handle Unicode. Since I need special characters (IPA Extensions) in the resulting files, my idea was to write special ASCII sequences into the text files, open them in Python, replace the special sequences with the corresponding Unicode characters, and encode the result as UTF-8. I ran some tests in the console and everything seemed fine.
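
A minimal sketch of that idea, using a made-up sample string and only two of the sequences (the names replace_specials and sample are hypothetical, just for illustration):

```python
# Sketch only: replace ASCII placeholders inside a unicode string,
# then encode the finished string as UTF-8 bytes in one final step.
replacements = {
    u"[O]": u"\u0254",  # LATIN SMALL LETTER OPEN O
    u"[E]": u"\u025b",  # LATIN SMALL LETTER OPEN E
}

def replace_specials(text):
    # text is expected to be a unicode string, not a byte string
    for placeholder, char in replacements.items():
        text = text.replace(placeholder, char)
    return text

sample = u"k[O]r[E]"                  # made-up input, not from a real file
converted = replace_specials(sample)  # still unicode at this point
encoded = converted.encode("utf-8")   # bytes, ready to be written out
```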

But my script keeps on raising exceptions related to encoding.

Sorry if it's obvious but I really can't figure out what to do.

The script follows.

Thanks a lot

-a-

# a class for replacing ascii with unicode


import codecs
import os

class Unicoder:

        def __init__(self, folder):
            files = os.listdir(folder)
            paths = []
            for x in files:
                paths.append(folder+"/"+x)
            self.files = paths
            # a list containing all the sc-generated .ly files

        def intoText(self, inFile):
            aFile = codecs.open(inFile, "r")
            text = aFile.read() # read all its content in text
            return text

        def replaceSpecials(self, text):
            replacementDict = (
            {"[O]":u"\u0254",
             "[U]":u"\u0277",
             "[E]":u"\u025b",
             "[o|]":u"\xf8",
             "[oe]":u"\u0153",
             "[e:]":u"\u0259",
             "[I]":u"\u026a",
             "[ae]":u"\xe6",
             "[A]":u"\u0251",
             "[Q]":u"\u0252",
             "[V]":u"\u028c"
             }

            )
            # lookup table mapping each ASCII sequence to its unicode replacement
            for ascii in replacementDict:
                print ascii
                utf = replacementDict[ascii]
                text = text.replace(ascii, utf.encode("utf-8"))
            return text

        def toFile(self, text, outFileName):
            outFile = codecs.open(outFileName, encoding='utf-8', mode="w")
            outFile.write(text)
            outFile.close()

        def run(self):
            for aFileName in self.files:
                outFileName = aFileName.split(".")[0]+"UTF.ly"
                text = self.intoText(aFileName)
                text = self.replaceSpecials(text)
                self.toFile(text, outFileName)

if __name__ == "__main__":
    a = Unicoder("/musica/antigone/scores/")
    a.run()  # convert every file found in the folder

# EOF
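
For comparison, here is a minimal sketch of how the same conversion might look if the text stays unicode all the way from reading to writing, so that byte strings and unicode strings are never mixed. It assumes the input files are pure ASCII; convert_file and the shortened REPLACEMENTS table are hypothetical names for illustration and are not part of the script.

```python
import io

# Shortened, illustrative subset of the replacement table.
REPLACEMENTS = {
    u"[O]": u"\u0254",
    u"[ae]": u"\xe6",
}

def convert_file(in_path, out_path, table=REPLACEMENTS):
    # Decode while reading: from here on the text is unicode.
    # Assumption: the source files contain ASCII only.
    with io.open(in_path, "r", encoding="ascii") as src:
        text = src.read()
    # Replace unicode with unicode; no .encode() call at this stage.
    for placeholder, char in table.items():
        text = text.replace(placeholder, char)
    # Encode only once, while writing.
    with io.open(out_path, "w", encoding="utf-8") as dst:
        dst.write(text)
```

The point of the sketch is that encoding happens exactly once, at the file boundary, instead of inside the replacement loop.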

--------------------------------------------------
Andrea Valle
--------------------------------------------------
CIRMA - DAMS
Università degli Studi di Torino
--> http://www.cirma.unito.it/andrea/
--> [EMAIL PROTECTED]
--------------------------------------------------


-- 
http://mail.python.org/mailman/listinfo/python-list