Hello community,

here is the log from the commit of package python3-html2text for openSUSE:Factory checked in at 2015-01-07 09:39:08
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python3-html2text (Old)
 and      /work/SRC/openSUSE:Factory/.python3-html2text.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "python3-html2text"

Changes:
--------
--- /work/SRC/openSUSE:Factory/python3-html2text/python3-html2text.changes 2014-12-21 12:04:32.000000000 +0100
+++ /work/SRC/openSUSE:Factory/.python3-html2text.new/python3-html2text.changes 2015-01-07 09:39:12.000000000 +0100
@@ -1,0 +2,17 @@
+Mon Jan 5 20:03:53 UTC 2015 - [email protected]
+
+- specfile: update copyright year
+
+- update to version 2014.12.29:
+  * Feature #51: Add single line break option. This feature is useful
+    for ensuring that lots of extra line breaks do not end up in the
+    resulting Markdown file in situations like Evernote .enex
+    exports. Note that this only works properly if body-width is set
+    to 0.
+
+- changes from version 2014.12.24:
+  * Feature #49: Added a images_to_alt option to discard images and keep only their alt.
+  * Feature #50: Protect links, surrounding them with angle brackets to avoid breaking...
+  * Feature: Add setup.cfg file.
+
+-------------------------------------------------------------------

Old:
----
  html2text-2014.12.5.tar.gz

New:
----
  html2text-2014.12.29.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python3-html2text.spec ++++++
--- /var/tmp/diff_new_pack.nKdzuc/_old 2015-01-07 09:39:13.000000000 +0100
+++ /var/tmp/diff_new_pack.nKdzuc/_new 2015-01-07 09:39:13.000000000 +0100
@@ -1,7 +1,7 @@
 #
 # spec file for package python3-html2text
 #
-# Copyright (c) 2014 SUSE LINUX Products GmbH, Nuernberg, Germany.
+# Copyright (c) 2015 SUSE LINUX Products GmbH, Nuernberg, Germany.
 #
 # All modifications and additions to the file contributed by third parties
 # remain the property of their copyright owners, unless otherwise agreed
@@ -17,7 +17,7 @@
 
 
 Name: python3-html2text
-Version: 2014.12.5
+Version: 2014.12.29
 Release: 0
 Url: https://github.com/Alir3z4/html2text/
 Summary: Turn HTML into equivalent Markdown-structured text

++++++ html2text-2014.12.5.tar.gz -> html2text-2014.12.29.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/AUTHORS.rst new/html2text-2014.12.29/AUTHORS.rst
--- old/html2text-2014.12.5/AUTHORS.rst 2014-09-25 17:35:28.000000000 +0200
+++ new/html2text-2014.12.29/AUTHORS.rst 2014-12-29 08:30:17.000000000 +0100
@@ -7,7 +7,10 @@
     * Alex Musayev
     * Matěj Cepl
     * Stefano Rivera
+    * Alireza Savand <[email protected]>
     * Ivan Gromov <[email protected]>
+    * Jocelyn Delalande <[email protected]>
+    * Matt Dorn <[email protected]>
 
 
 Maintainer:
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/ChangeLog.rst new/html2text-2014.12.29/ChangeLog.rst
--- old/html2text-2014.12.5/ChangeLog.rst 2014-12-05 19:01:01.000000000 +0100
+++ new/html2text-2014.12.29/ChangeLog.rst 2014-12-29 08:32:53.000000000 +0100
@@ -1,3 +1,23 @@
+2014.12.29
+==========
+----
+
+* Feature #51: Add single line break option.
+  This feature is useful for ensuring that lots of extra line breaks do not
+  end up in the resulting Markdown file in situations like Evernote .enex
+  exports. Note that this only works properly if ``body-width`` is set
+  to ``0``.
+
+
+2014.12.24
+==========
+----
+
+* Feature #49: Added a images_to_alt option to discard images and keep only their alt.
+* Feature #50: Protect links, surrounding them with angle brackets to avoid breaking...
+* Feature: Add ``setup.cfg`` file.
+
+
 2014.12.5
 =========
 ----
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/PKG-INFO new/html2text-2014.12.29/PKG-INFO
--- old/html2text-2014.12.5/PKG-INFO 2014-12-05 19:04:27.000000000 +0100
+++ new/html2text-2014.12.29/PKG-INFO 2014-12-29 08:35:48.000000000 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 1.1
 Name: html2text
-Version: 2014.12.5
+Version: 2014.12.29
 Summary: Turn HTML into equivalent Markdown-structured text.
 Home-page: https://github.com/Alir3z4/html2text/
 Author: Alireza Savand
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/README.md new/html2text-2014.12.29/README.md
--- old/html2text-2014.12.5/README.md 2014-11-01 16:18:36.000000000 +0100
+++ new/html2text-2014.12.29/README.md 2014-12-29 08:30:24.000000000 +0100
@@ -17,11 +17,13 @@
 
 
 | Option | Description
-|--------------------------------------------------------|--------------------------------------------------
+|--------------------------------------------------------|---------------------------------------------------
 | `--version` | Show program's version number and exit
 | `-h`, `--help` | Show this help message and exit
 | `--ignore-links` | Don't include any formatting for links
+|`--protect-links` | Protect links from line breaks surrounding them "+" with angle brackets
 |`--ignore-images` | Don't include any formatting for images
+|`--images-to-alt` | Discard image data, only keep alt text
 |`-g`, `--google-doc` | Convert an html-exported Google Document
 |`-d`, `--dash-unordered-list` | Use a dash rather than a star for unordered list items
 |`-b` `BODY_WIDTH`, `--body-width`=`BODY_WIDTH` | Number of characters per output line, `0` for no wrap
@@ -29,6 +31,7 @@
 |`-s`, `--hide-strikethrough` | Hide strike-through text. only relevent when `-g` is specified as well
 |`--escape-all` | Escape all special characters. Output is less readable, but avoids corner case formatting issues.
 | `--bypass-tables` | Format tables in HTML rather than Markdown syntax.
+| `--single-line-break` | Use a single line break after a block element rather than two.
 
 Or you can use it from within `Python`:
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/html2text/__init__.py new/html2text-2014.12.29/html2text/__init__.py
--- old/html2text-2014.12.5/html2text/__init__.py 2014-12-05 19:02:01.000000000 +0100
+++ new/html2text-2014.12.29/html2text/__init__.py 2014-12-29 08:33:15.000000000 +0100
@@ -28,7 +28,7 @@
     skipwrap
 )
 
-__version__ = "2014.12.5"
+__version__ = "2014.12.29"
 
 
 # TODO:
@@ -55,15 +55,18 @@
         self.body_width = bodywidth
         self.skip_internal_links = config.SKIP_INTERNAL_LINKS
         self.inline_links = config.INLINE_LINKS
+        self.protect_links = config.PROTECT_LINKS
         self.google_list_indent = config.GOOGLE_LIST_INDENT
         self.ignore_links = config.IGNORE_ANCHORS
         self.ignore_images = config.IGNORE_IMAGES
+        self.images_to_alt = config.IMAGES_TO_ALT
         self.ignore_emphasis = config.IGNORE_EMPHASIS
         self.bypass_tables = config.BYPASS_TABLES
         self.google_doc = False
         self.ul_item_mark = '*'
         self.emphasis_mark = '_'
         self.strong_mark = '**'
+        self.single_line_break = config.SINGLE_LINE_BREAK
 
         if out is None:
             self.out = self.outtextf
@@ -367,6 +370,8 @@
                              attrs['href'].startswith('#')):
                     self.astack.append(attrs)
                     self.maybe_automatic_link = attrs['href']
+                    if self.protect_links:
+                        attrs['href'] = '<'+attrs['href']+'>'
                 else:
                     self.astack.append(None)
             else:
@@ -390,23 +395,29 @@
 
         if tag == "img" and start and not self.ignore_images:
             if 'src' in attrs:
-                attrs['href'] = attrs['src']
+                if not self.images_to_alt:
+                    attrs['href'] = attrs['src']
                 alt = attrs.get('alt') or ''
 
-                self.o("![" + escape_md(alt) + "]")
-                if self.inline_links:
-                    href = attrs.get('href') or ''
-                    self.o("(" + escape_md(href) + ")")
-                else:
-                    i = self.previousIndex(attrs)
-                    if i is not None:
-                        attrs = self.a[i]
+                # If we have images_to_alt, we discard the image itself,
+                # considering only the alt text.
+                if self.images_to_alt:
+                    self.o(escape_md(alt))
+                else:
+                    self.o("![" + escape_md(alt) + "]")
+                    if self.inline_links:
+                        href = attrs.get('href') or ''
+                        self.o("(" + escape_md(href) + ")")
                     else:
-                        self.acount += 1
-                        attrs['count'] = self.acount
-                        attrs['outcount'] = self.outcount
-                        self.a.append(attrs)
-                    self.o("[" + str(attrs['count']) + "]")
+                        i = self.previousIndex(attrs)
+                        if i is not None:
+                            attrs = self.a[i]
+                        else:
+                            self.acount += 1
+                            attrs['count'] = self.acount
+                            attrs['outcount'] = self.outcount
+                            self.a.append(attrs)
+                        self.o("[" + str(attrs['count']) + "]")
 
         if tag == 'dl' and start:
             self.p()
@@ -507,7 +518,7 @@
         self.p_p = 1
 
     def p(self):
-        self.p_p = 2
+        self.p_p = 1 if self.single_line_break else 2
 
     def soft_br(self):
         self.pbr()
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/html2text/cli.py new/html2text-2014.12.29/html2text/cli.py
--- old/html2text-2014.12.5/html2text/cli.py 2014-12-05 18:59:05.000000000 +0100
+++ new/html2text-2014.12.29/html2text/cli.py 2014-12-29 08:30:24.000000000 +0100
@@ -24,6 +24,13 @@
         default=config.IGNORE_ANCHORS,
         help="don't include any formatting for links")
     p.add_option(
+        "--protect-links",
+        dest="protect_links",
+        action="store_true",
+        default=config.PROTECT_LINKS,
+        help=("protect links from line breaks surrounding them " +
+              "with angle brackets"))
+    p.add_option(
         "--ignore-images",
         dest="ignore_images",
         action="store_true",
@@ -31,6 +38,13 @@
         help="don't include any formatting for images"
     )
     p.add_option(
+        "--images-to-alt",
+        dest="images_to_alt",
+        action="store_true",
+        default=config.IMAGES_TO_ALT,
+        help="Discard image data, only keep alt text"
+    )
+    p.add_option(
         "-g", "--google-doc",
         action="store_true",
         dest="google_doc",
@@ -90,6 +104,16 @@
         default=config.BYPASS_TABLES,
         help="Format tables in HTML rather than Markdown syntax."
     )
+    p.add_option(
+        "--single-line-break",
+        action="store_true",
+        dest="single_line_break",
+        default=config.SINGLE_LINE_BREAK,
+        help=(
+            "Use a single line break after a block element rather than two "
+            "line breaks. NOTE: Requires --body-width=0"
+        )
+    )
     (options, args) = p.parse_args()
 
     # process input
@@ -139,10 +163,13 @@
     h.list_indent = options.list_indent
     h.ignore_emphasis = options.ignore_emphasis
     h.ignore_links = options.ignore_links
+    h.protect_links = options.protect_links
     h.ignore_images = options.ignore_images
+    h.images_to_alt = options.images_to_alt
     h.google_doc = options.google_doc
     h.hide_strikethrough = options.hide_strikethrough
     h.escape_snob = options.escape_snob
     h.bypass_tables = options.bypass_tables
+    h.single_line_break = options.single_line_break
 
-    wrapwrite(h.handle(data))
\ No newline at end of file
+    wrapwrite(h.handle(data))
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/html2text/config.py new/html2text-2014.12.29/html2text/config.py
--- old/html2text-2014.12.5/html2text/config.py 2014-09-25 17:32:04.000000000 +0200
+++ new/html2text-2014.12.29/html2text/config.py 2014-12-29 08:30:24.000000000 +0100
@@ -20,11 +20,16 @@
 # Use inline, rather than reference, formatting for images and links
 INLINE_LINKS = True
 
+# Protect links from line breaks surrounding them with angle brackets (in
+# addition to their square brackets)
+PROTECT_LINKS = False
+
 # Number of pixels Google indents nested lists
 GOOGLE_LIST_INDENT = 36
 
 IGNORE_ANCHORS = False
 IGNORE_IMAGES = False
+IMAGES_TO_ALT = False
 IGNORE_EMPHASIS = False
 
 # For checking space-only lines on line 771
@@ -102,4 +107,8 @@
     'rlm': ''
 }
 
-BYPASS_TABLES = False
\ No newline at end of file
+BYPASS_TABLES = False
+
+# Use a single line break after a block element rather an two line breaks.
+# NOTE: Requires body width setting to be 0.
+SINGLE_LINE_BREAK = False
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/html2text.egg-info/PKG-INFO new/html2text-2014.12.29/html2text.egg-info/PKG-INFO
--- old/html2text-2014.12.5/html2text.egg-info/PKG-INFO 2014-12-05 19:04:26.000000000 +0100
+++ new/html2text-2014.12.29/html2text.egg-info/PKG-INFO 2014-12-29 08:35:47.000000000 +0100
@@ -1,6 +1,6 @@
 Metadata-Version: 1.1
 Name: html2text
-Version: 2014.12.5
+Version: 2014.12.29
 Summary: Turn HTML into equivalent Markdown-structured text.
 Home-page: https://github.com/Alir3z4/html2text/
 Author: Alireza Savand
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/html2text.egg-info/SOURCES.txt new/html2text-2014.12.29/html2text.egg-info/SOURCES.txt
--- old/html2text-2014.12.5/html2text.egg-info/SOURCES.txt 2014-12-05 19:04:27.000000000 +0100
+++ new/html2text-2014.12.29/html2text.egg-info/SOURCES.txt 2014-12-29 08:35:47.000000000 +0100
@@ -3,6 +3,7 @@
 ChangeLog.rst
 MANIFEST.in
 README.md
+setup.cfg
 setup.py
 html2text/__init__.py
 html2text/cli.py
@@ -32,6 +33,8 @@
 test/doc_with_table_bypass.md
 test/emdash-para.html
 test/emdash-para.md
+test/images_to_alt.html
+test/images_to_alt.md
 test/invalid_start.html
 test/invalid_start.md
 test/nbsp.html
@@ -46,6 +49,10 @@
 test/pre.md
 test/preformatted_in_list.html
 test/preformatted_in_list.md
+test/protect_links.html
+test/protect_links.md
+test/single_line_break.html
+test/single_line_break.md
 test/test_html2text.py
 test/test_memleak.py
 test/url-escaping.html
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/setup.cfg new/html2text-2014.12.29/setup.cfg
--- old/html2text-2014.12.5/setup.cfg 2014-12-05 19:04:27.000000000 +0100
+++ new/html2text-2014.12.29/setup.cfg 2014-12-29 08:35:48.000000000 +0100
@@ -1,3 +1,6 @@
+[metadata]
+description-file = README.md
+
 [egg_info]
 tag_build =
 tag_date = 0
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/setup.py new/html2text-2014.12.29/setup.py
--- old/html2text-2014.12.5/setup.py 2014-12-05 19:01:55.000000000 +0100
+++ new/html2text-2014.12.29/setup.py 2014-12-29 08:33:05.000000000 +0100
@@ -34,7 +34,7 @@
 
 setup(
     name="html2text",
-    version="2014.12.5",
+    version="2014.12.29",
     description="Turn HTML into equivalent Markdown-structured text.",
     author="Aaron Swartz",
     author_email="[email protected]",
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/test/images_to_alt.html new/html2text-2014.12.29/test/images_to_alt.html
--- old/html2text-2014.12.5/test/images_to_alt.html 1970-01-01 01:00:00.000000000 +0100
+++ new/html2text-2014.12.29/test/images_to_alt.html 2014-12-24 20:35:10.000000000 +0100
@@ -0,0 +1,3 @@
+<a href="http://example.com">
+<img src="http://example.com/img.png" alt="ALT TEXT" />
+</a>
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/test/images_to_alt.md new/html2text-2014.12.29/test/images_to_alt.md
--- old/html2text-2014.12.5/test/images_to_alt.md 1970-01-01 01:00:00.000000000 +0100
+++ new/html2text-2014.12.29/test/images_to_alt.md 2014-12-24 20:35:10.000000000 +0100
@@ -0,0 +1,2 @@
+[ ALT TEXT ](http://example.com)
+
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/test/protect_links.html new/html2text-2014.12.29/test/protect_links.html
--- old/html2text-2014.12.5/test/protect_links.html 1970-01-01 01:00:00.000000000 +0100
+++ new/html2text-2014.12.29/test/protect_links.html 2014-12-24 20:35:10.000000000 +0100
@@ -0,0 +1 @@
+<a href="http://im-a-very-very-very-very-very-very-very-very-very-very-long/link.html">foo</a>
\ No newline at end of file
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/test/protect_links.md new/html2text-2014.12.29/test/protect_links.md
--- old/html2text-2014.12.5/test/protect_links.md 1970-01-01 01:00:00.000000000 +0100
+++ new/html2text-2014.12.29/test/protect_links.md 2014-12-24 20:35:10.000000000 +0100
@@ -0,0 +1,3 @@
+[foo](<http://im-a-very-very-very-very-very-very-very-very-very-very-
+long/link.html>)
+
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/test/single_line_break.html new/html2text-2014.12.29/test/single_line_break.html
--- old/html2text-2014.12.5/test/single_line_break.html 1970-01-01 01:00:00.000000000 +0100
+++ new/html2text-2014.12.29/test/single_line_break.html 2014-12-29 08:30:24.000000000 +0100
@@ -0,0 +1,6 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd">
+<en-note>
+<div>Hello world.</div>
+<div>And hello html2text.</div>
+</en-note>
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/test/single_line_break.md new/html2text-2014.12.29/test/single_line_break.md
--- old/html2text-2014.12.5/test/single_line_break.md 1970-01-01 01:00:00.000000000 +0100
+++ new/html2text-2014.12.29/test/single_line_break.md 2014-12-29 08:30:24.000000000 +0100
@@ -0,0 +1,2 @@
+Hello world.
+And hello html2text.
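
Reader's note on the single_line_break fixtures above: the effect of the one-line change to p() can be modelled in a few lines of plain Python. This is only a simplified sketch under stated assumptions, not html2text's implementation. join_blocks is a hypothetical helper, and it assumes body width 0 (no wrapping), which the option's help text says is required.

```python
def join_blocks(blocks, single_line_break=False):
    """Join block-level text chunks with one or two newlines.

    Hypothetical helper for illustration only. html2text itself records
    the number of pending newlines in self.p_p (1 or 2); this sketch just
    picks the separator directly.
    """
    newlines = 1 if single_line_break else 2
    return ("\n" * newlines).join(blocks) + "\n"


# The two <div> blocks from test/single_line_break.html.
blocks = ["Hello world.", "And hello html2text."]

print(join_blocks(blocks, single_line_break=True), end="")
print(join_blocks(blocks), end="")
```

With single_line_break=True the two blocks end up on adjacent lines, as in test/single_line_break.md; without it they are separated by an empty line.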
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/html2text-2014.12.5/test/test_html2text.py new/html2text-2014.12.29/test/test_html2text.py
--- old/html2text-2014.12.5/test/test_html2text.py 2014-12-03 14:53:16.000000000 +0100
+++ new/html2text-2014.12.29/test/test_html2text.py 2014-12-29 08:30:24.000000000 +0100
@@ -127,6 +127,20 @@
         module_args['body_width'] = 0
         cmdline_args.append('--body-width=0')
 
+    if base_fn.startswith('protect_links'):
+        module_args['protect_links'] = True
+        cmdline_args.append('--protect-links')
+
+    if base_fn.startswith('images_to_alt'):
+        module_args['images_to_alt'] = True
+        cmdline_args.append('--images-to-alt')
+
+    if base_fn.startswith('single_line_break'):
+        module_args['body_width'] = 0
+        cmdline_args.append('--body-width=0')
+        module_args['single_line_break'] = True
+        cmdline_args.append('--single-line-break')
+
     return test_mod, test_cmd
 
 # Originally from http://stackoverflow.com/questions/32899/\
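
Reader's note on the test_html2text.py change: the new fixtures are wired up purely by file-name prefix. The mapping can be sketched as a stand-alone function. The name options_for_fixture is hypothetical and only the three new prefixes are covered; the real test module builds these lists inside its own generator function and handles further prefixes not shown in this diff.

```python
def options_for_fixture(base_fn):
    """Return (module_args, cmdline_args) for a fixture file name.

    Hypothetical stand-alone version of the prefix checks added to
    test/test_html2text.py. module_args holds attribute values for the
    Python API, cmdline_args the matching command-line flags.
    """
    module_args = {}
    cmdline_args = []

    if base_fn.startswith('protect_links'):
        module_args['protect_links'] = True
        cmdline_args.append('--protect-links')

    if base_fn.startswith('images_to_alt'):
        module_args['images_to_alt'] = True
        cmdline_args.append('--images-to-alt')

    if base_fn.startswith('single_line_break'):
        # The option only works properly without wrapping, so the
        # body width is forced to 0 alongside it.
        module_args['body_width'] = 0
        cmdline_args.append('--body-width=0')
        module_args['single_line_break'] = True
        cmdline_args.append('--single-line-break')

    return module_args, cmdline_args


print(options_for_fixture('single_line_break.html'))
```

Keeping both lists in step means each fixture exercises the module API and the command-line interface with the same settings.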
