Bug#932044: calibre: PDF to EPUB conversion failed with "No module named html2text"
On 2019-07-15 03:36, Norbert Preining wrote:
> Can you try installing python-html2text
> and see if that fixes the problem?

Yes, it did the trick!
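
For anyone hitting the same ImportError, here is a minimal sketch of how one might check whether an interpreter can import the module named in the traceback. The helper name `module_available` is made up for illustration and is not part of calibre. The check only tells you something if it is run with the same Python interpreter that calibre itself uses (presumably the system Python 2 for this package version, given the python-* dependencies listed in the report).

```python
import importlib


def module_available(name):
    """Return True if `name` can be imported by the running interpreter.

    Hypothetical helper for illustration; it simply attempts the import
    and reports whether an ImportError was raised.
    """
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # "html2text" is the module the traceback in this report failed to import.
    for name in ("html2text",):
        status = "found" if module_available(name) else "MISSING"
        print("%s: %s" % (name, status))
```

If the module is reported as missing, installing python-html2text (as suggested in this thread) is the workaround that was confirmed to work here.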
Bug#932044: calibre: PDF to EPUB conversion failed with "No module named html2text"
Can you try installing
	python-html2text
and see if that fixes the problem?

I sense that there are insufficient dependencies declared.

Thanks

Norbert

(Away from PC so cannot check myself atm)

On July 14, 2019 7:59:32 PM GMT+09:00, Vincas Dargis wrote:
>Package: calibre
>Version: 3.45.2+dfsg-1
>Severity: normal
>
>[...]
Bug#932044: calibre: PDF to EPUB conversion failed with "No module named html2text"
Hi Vincas,

Thanks for the report, I'll look into it asap, most probably already tomorrow.

Best

Norbert

On July 14, 2019 7:59:32 PM GMT+09:00, Vincas Dargis wrote:
>Package: calibre
>Version: 3.45.2+dfsg-1
>Severity: normal
>
>[...]
Bug#932044: calibre: PDF to EPUB conversion failed with "No module named html2text"
Package: calibre
Version: 3.45.2+dfsg-1
Severity: normal

Dear Maintainer,

I've tried to convert the freely available "Elements of Programming" PDF
(from http://elementsofprogramming.com) into EPUB, and got this error:

```
Traceback (most recent call last):
  File "/usr/lib/calibre/calibre/ebooks/oeb/reader.py", line 198, in _manifest_add_missing
    data = item.data
  File "/usr/lib/calibre/calibre/ebooks/oeb/base.py", line 1043, in data
    data = self._parse_xhtml(data)
  File "/usr/lib/calibre/calibre/ebooks/oeb/base.py", line 960, in _parse_xhtml
    filename=fname, non_html_file_tags={'ncx'})
  File "/usr/lib/calibre/calibre/ebooks/oeb/parse_utils.py", line 207, in parse_html
    data = preprocessor(data)
  File "/usr/lib/calibre/calibre/ebooks/conversion/preprocess.py", line 684, in __call__
    html = preprocessor(html)
  File "/usr/lib/calibre/calibre/ebooks/conversion/utils.py", line 784, in __call__
    html = self.markup_chapters(html, self.totalwords, self.blanks_between_paragraphs)
  File "/usr/lib/calibre/calibre/ebooks/conversion/utils.py", line 334, in markup_chapters
    html = recurse_patterns(html, False)
  File "/usr/lib/calibre/calibre/ebooks/conversion/utils.py", line 329, in recurse_patterns
    html = chapdetect.sub(self.chapter_head, html)
  File "/usr/lib/calibre/calibre/ebooks/conversion/utils.py", line 63, in chapter_head
    txt_chap = delete_quotes.sub('', delete_whitespace.sub('\\g', html2text(chap)))
  File "/usr/lib/calibre/calibre/utils/html2text.py", line 8, in html2text
    from html2text import HTML2Text
ImportError: No module named html2text

Spine item 'id1' not found
Traceback (most recent call last):
  File "/usr/bin/calibre-parallel", line 20, in <module>
    sys.exit(main())
  File "/usr/lib/calibre/calibre/utils/ipc/worker.py", line 200, in main
    result = func(*args, **kwargs)
  File "/usr/lib/calibre/calibre/gui2/convert/gui_conversion.py", line 42, in gui_convert_override
    override_input_metadata=True)
  File "/usr/lib/calibre/calibre/gui2/convert/gui_conversion.py", line 27, in gui_convert
    plumber.run()
  File "/usr/lib/calibre/calibre/ebooks/conversion/plumber.py", line 1121, in run
    for_regex_wizard=self.for_regex_wizard, removed_items=getattr(self.input_plugin, 'removed_items_to_ignore', ()))
  File "/usr/lib/calibre/calibre/ebooks/conversion/plumber.py", line 1315, in create_oebbook
    reader()(oeb, path_or_stream)
  File "/usr/lib/calibre/calibre/ebooks/oeb/reader.py", line 71, in __call__
    self._all_from_opf(opf)
  File "/usr/lib/calibre/calibre/ebooks/oeb/reader.py", line 703, in _all_from_opf
    self._spine_from_opf(opf)
  File "/usr/lib/calibre/calibre/ebooks/oeb/reader.py", line 348, in _spine_from_opf
    raise OEBError("Spine is empty")
calibre.ebooks.oeb.base.OEBError: Spine is empty
```

Is it expected to install additional Calibre dependencies manually
or..?

Full conversion log is attached.

-- System Information:
Debian Release: bullseye/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (1, 'experimental-debug'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.19.0-5-amd64 (SMP w/8 CPU cores)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=lt_LT.UTF-8, LC_CTYPE=lt_LT.UTF-8 (charmap=UTF-8), LANGUAGE=lt (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages calibre depends on:
ii  calibre-bin                      3.45.2+dfsg-1
ii  fonts-liberation                 1:1.07.4-10
ii  imagemagick                      8:6.9.10.23+dfsg-2.1
ii  imagemagick-6.q16 [imagemagick]  8:6.9.10.23+dfsg-2.1
ii  libjpeg-turbo-progs              1:1.5.2-2+b1
ii  libjs-coffeescript               1.12.8~dfsg-4
ii  libjs-mathjax                    2.7.4+dfsg-1
ii  optipng                          0.7.7-1
ii  poppler-utils                    0.71.0-5
ii  python-apsw                      3.27.2-r1-1
ii  python-bs4                       4.7.1-1
ii  python-chardet                   3.0.4-3
ii  python-cherrypy3                 8.9.1-2
ii  python-css-parser                1.0.4-1
ii  python-cssselect                 1.0.3-1
ii  python-cssutils                  1.0.2-2
ii  python-dateutil                  2.7.3-3
ii  python-dbus                      1.2.8-3
ii  python-feedparser                5.2.1-1
ii  python-html5-parser              0.4.5-1
ii  python-html5lib                  1.0.1-1
ii  python-lxml                      4.3.3-2
ii  python-markdown                  3.0.1-3
ii  python-mechanize                 1:0.2.5-3
ii  python-msgpack                   0.5.6-1+b1
ii  python-netifaces                 0.10.4-1+b1
ii  python-pil                       6.1.0-1
ii  python-pkg-resources             41.0.1-1
ii  python-pyparsing                 2.2.0+dfsg1-2
ii  python-pyqt5                     5.11.3+dfsg-1+b3
ii