While working on the http://FreeTamilEbooks.com project, we receive many
books in non-Unicode text.

We convert the text using the open-tamil library.

https://github.com/arcturusannamalai/open-tamil
https://github.com/arulalant/txt2unicode

It is a Python script that runs in a terminal.
It can convert only plain text.

The authors give us their works as MS Word documents,
with a lot of formatting such as bold, italics, tables, headings, etc.

We convert them to plain text and then convert that to Unicode.
After that, it becomes a tiring job to reapply the original
formatting to all the text.

I am looking for ideas on how to convert the encoding of the rich text
without losing any formatting.
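
One direction that might work, as a rough sketch only: if the documents
are saved as .docx (an assumption; older binary .doc files would need to
be re-saved first), the file is a zip archive and the visible text sits
inside <w:t> elements of word/document.xml. Converting only that text and
copying everything else unchanged should leave the formatting alone. The
names convert_document_xml and convert_docx are hypothetical, and convert
is a stand-in for whichever conversion function fits the source font.

```python
import html
import re
import zipfile

# Matches one <w:t> element, the text node inside a Word run.
# It does not match other tags such as <w:tab/>, <w:tbl> or <w:tc>.
W_T = re.compile(r"(<w:t(?:\s[^>/]*)?>)(.*?)(</w:t>)", re.DOTALL)


def convert_document_xml(xml, convert):
    """Apply convert() to the text of every <w:t> element; markup is untouched."""
    def repl(match):
        text = html.unescape(match.group(2))          # XML entities -> characters
        converted = html.escape(convert(text), quote=False)
        return match.group(1) + converted + match.group(3)
    return W_T.sub(repl, xml)


def convert_docx(src, dst, convert):
    """Copy src to dst, converting only the text in word/document.xml."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                xml = data.decode("utf-8")
                data = convert_document_xml(xml, convert).encode("utf-8")
            zout.writestr(item, data)
```

Two caveats with this sketch: Word sometimes splits one word across
several runs, so a converter that needs to see neighbouring characters
together may give wrong output at run boundaries, and the font names in
the document are left as they were, so they would still have to be
switched to a Unicode font afterwards.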

Share your thoughts.

Thanks.


-- 
Regards,
T.Shrinivasan


My Life with GNU/Linux : http://goinggnu.wordpress.com
Free E-Magazine on Free Open Source Software in Tamil : http://kaniyam.com

Get CollabNet Subversion Edge :     http://www.collab.net/svnedge
_______________________________________________
ILUGC Mailing List:
http://www.ae.iitm.ac.in/mailman/listinfo/ilugc
ILUGC Mailing List Guidelines:
http://ilugc.in/mailinglist-guidelines
