Introduction: Parsing a binary file into text is always hard. Today we will look at a module that comes to the rescue by converting .docx files to Markdown (.md) files.
Module: docx2md

Installation:

    pip install docx2md

About: Converts Microsoft Word document files (.docx extension) to Markdown files.

Execution:

    % python -m docx2md ~/Downloads/example.docx output.msd
    # save output.msd
    # save media/image1.png
    # save media/image4.jpg
    # save media/image3.gif
    # save media/image2.png

Output:

    % cat output.msd
    <div class="break"></div>

    # chapter 1

    text of chapter 1

    ## section 1-1

    text of section 1-1

    ### subsection 1-1-1

    text of subsection 1-1-1

    <div class="break"></div>

    insert png
    <img src="media/image1.png" id="image1">

    insert bmp
    <img src="media/image2.png" id="image2">

    insert gif
    <img src="media/image3.gif" id="image3">

    insert jpg
    <img src="media/image4.jpg" id="image4">

    <div class="break"></div>

    * aaaaa
    * bbbbb
    * ccccc
    * ddddd
    * eeeee
    * fffff
    * ggggg
    * hhhhh
    * iiiii
    * jjjjj

    <table id="table1">
    <tr>
    <td>a</td>
    <td>b</td>
    <td>c</td>
    </tr>
    <tr>
    <td>d</td>
    <td>e</td>
    <td>f</td>
    </tr>
    <tr>
    <td>g</td>
    <td>h</td>
    <td>i</td>
    </tr>
    </table>

Reference: https://pypi.org/project/docx2md/
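To run the conversion from a script, the command shown under Execution can be wrapped in Python. The following is only a minimal sketch, assuming docx2md is installed for the same interpreter that runs the script. The helper names build_command, convert, and image_sources are made up for this example and are not part of docx2md. The wrapper simply shells out to the same "python -m docx2md" invocation, and the last helper lists the image paths found in <img> tags like the ones in the sample output.

```python
import re
import subprocess
import sys
from pathlib import Path

# Matches <img src="..."> tags of the kind seen in the sample output.
IMG_SRC = re.compile(r'<img\s+src="([^"]+)"')


def build_command(src, dst):
    """Return the argument list for `python -m docx2md SRC DST`."""
    return [sys.executable, "-m", "docx2md", str(src), str(dst)]


def convert(src, dst):
    """Run the converter as a subprocess and return the output path.

    Assumes docx2md is installed; this helper is hypothetical.
    """
    src = Path(src).expanduser()
    if src.suffix.lower() != ".docx":
        raise ValueError(f"expected a .docx file, got {src.name!r}")
    # check=True raises CalledProcessError if the converter exits non-zero.
    subprocess.run(build_command(src, dst), check=True)
    return Path(dst)


def image_sources(markdown_text):
    """List the image paths referenced by <img> tags in the converted text."""
    return IMG_SRC.findall(markdown_text)
```

For example, convert("~/Downloads/example.docx", "output.msd") would run the same command as above, and image_sources(Path("output.msd").read_text()) could then be used to check that every referenced file under media/ was actually saved.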
_______________________________________________
Chennaipy mailing list
Chennaipy@python.org
https://mail.python.org/mailman/listinfo/chennaipy