Hello community,

here is the log from the commit of package urlwatch for openSUSE:Factory checked in at 2018-09-04 22:57:50
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/urlwatch (Old)
 and      /work/SRC/openSUSE:Factory/.urlwatch.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "urlwatch"

Tue Sep 4 22:57:50 2018 rev:12 rq:632968 version:2.14

Changes:
--------
--- /work/SRC/openSUSE:Factory/urlwatch/urlwatch.changes 2018-06-08 23:16:24.210031361 +0200
+++ /work/SRC/openSUSE:Factory/.urlwatch.new/urlwatch.changes 2018-09-04 22:58:07.189400169 +0200
@@ -1,0 +2,14 @@
+Tue Sep 4 06:34:45 UTC 2018 - mvet...@suse.com
+
+- Update to 2.14:
+  * Added filter to pretty-print JSON data: format-json (by Niko Böckerman, PR#250)
+  * Added list active Telegram chats using --telegram-chats (with fixes by Georg Pichler, PR#270)
+  * Added support for HTTP ETag header in URL jobs and If-None-Match (by Karol Babioch, PR#256)
+  * Added support for filtering HTML using XPath expressions, with lxml (PR#274, Fixes #226)
+  * Added install_dependencies to setup.py commands for easy installing of dependencies
+  * Added ignore_connection_errors per-job configuration option (by Karol Babioch, PR#261)
+  * Improved code (HTTP status codes, by Karol Babioch PR#258)
+  * Improved documentation for setting up Telegram chat bots
+  * Allow multiple chats for Telegram reporting (by Georg Pichler, PR#271)
+
+-------------------------------------------------------------------

Old:
----
  urlwatch-2.13.tar.gz

New:
----
  urlwatch-2.14.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ urlwatch.spec ++++++
--- /var/tmp/diff_new_pack.wj9slx/_old 2018-09-04 22:58:07.541401369 +0200
+++ /var/tmp/diff_new_pack.wj9slx/_new 2018-09-04 22:58:07.541401369 +0200
@@ -17,7 +17,7 @@
 Name:           urlwatch
-Version:        2.13
+Version:        2.14
 Release:        0
 Summary:        A tool for monitoring webpages for updates
 License:        BSD-3-Clause

++++++ urlwatch-2.13.tar.gz -> urlwatch-2.14.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/.travis.yml new/urlwatch-2.14/.travis.yml
--- old/urlwatch-2.13/.travis.yml 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/.travis.yml 2018-08-30 10:36:16.000000000 +0200
@@ -4,5 +4,5 @@
   - "3.5"
   - "3.6"
 install:
-  - pip install pyyaml minidb requests keyring pycodestyle appdirs
+  - python setup.py install_dependencies
 script: nosetests -v
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/CHANGELOG.md new/urlwatch-2.14/CHANGELOG.md
--- old/urlwatch-2.13/CHANGELOG.md 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/CHANGELOG.md 2018-08-30 10:36:16.000000000 +0200
@@ -4,6 +4,22 @@
 The format mostly follows [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
+## [2.14] -- 2018-08-30
+
+### Added
+- Filter to pretty-print JSON data: `format-json` (by Niko Böckerman, PR#250)
+- List active Telegram chats using `--telegram-chats` (with fixes by Georg Pichler, PR#270)
+- Support for HTTP `ETag` header in URL jobs and `If-None-Match` (by Karol Babioch, PR#256)
+- Support for filtering HTML using XPath expressions, with `lxml` (PR#274, Fixes #226)
+- Added `install_dependencies` to `setup.py` commands for easy installing of dependencies
+- Added `ignore_connection_errors` per-job configuration option (by Karol Babioch, PR#261)
+
+### Changed
+- Improved code (HTTP status codes, by Karol Babioch PR#258)
+- Improved documentation for setting up Telegram chat bots
+- Allow multiple chats for Telegram reporting (by Georg Pichler, PR#271)
+
+
 ## [2.13] -- 2018-06-03
 
 ### Added
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/README.md new/urlwatch-2.14/README.md
--- old/urlwatch-2.13/README.md 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/README.md 2018-08-30 10:36:16.000000000 +0200
@@ -26,28 +26,19 @@
   * [requests](http://python-requests.org/)
   * [keyring](https://github.com/jaraco/keyring/)
   * [appdirs](https://github.com/ActiveState/appdirs)
-  * [chump](https://github.com/karanlyons/chump/) (for Pushover support)
-  * [pushbullet.py](https://github.com/randomchars/pushbullet.py) (for Pushbullet support)
+  * [lxml](https://lxml.de)
 
 The dependencies can be installed with (add `--user` to install to `$HOME`):
 
-`python3 -m pip install pyyaml minidb requests keyring appdirs`
+`python3 -m pip install pyyaml minidb requests keyring appdirs lxml`
 
-For optional pushover support the chump package is required:
-`python3 -m pip install chump`
+Optional dependencies (install via `python3 -m pip install <packagename>`):
 
-For optional pushbullet support the pushbullet.py package is required:
-
-`python3 -m pip install pushbullet.py`
-
-For optional support for the "browser" job kind, Requests-HTML is needed:
-
-`python3 -m pip install requests-html`
-
-For unit tests, you also need to install pycodestyle:
-
-`python3 -m pip install pycodestyle`
+  * Pushover reporter: [chump](https://github.com/karanlyons/chump/)
+  * Pushbullet reporter: [pushbullet.py](https://github.com/randomchars/pushbullet.py)
+  * "browser" job kind: [requests-html](https://html.python-requests.org)
+  * Unit testing: [pycodestyle](http://pycodestyle.pycqa.org/en/latest/)
 
 
 MIGRATION FROM URLWATCH 1.x
@@ -144,6 +135,30 @@
 `brew install wdiff` on macOS). Coloring is supported for `wdiff`-style
 output, but potentially not for other diff tools.
 
+To filter based on an [XPath](https://www.w3.org/TR/1999/REC-xpath-19991116/)
+expression, you can use the `xpath` filter like so (see Microsoft's
+[XPath Examples](https://msdn.microsoft.com/en-us/library/ms256086(v=vs.110).aspx)
+page for some other examples):
+
+```yaml
+url: https://example.net/
+filter: xpath:/body
+```
+
+This filters only the `<body>` element of the HTML document, stripping
+out everything else.
+
+In some cases, it might be useful to ignore (temporary) network errors to
+avoid notifications being sent. While there is a `display.error` config
+option (defaulting to `True`) to control reporting of errors globally, to
+ignore network errors for specific jobs only, you can use the
+`ignore_connection_errors` key in the job list configuration file:
+
+```yaml
+url: https://example.com/
+ignore_connection_errors: true
+```
+
 
 PUSHOVER
 --------
@@ -168,6 +183,7 @@
 Telegram notifications are configured using the Telegram Bot API. For this,
 you'll need a Bot API token and a chat id (see https://core.telegram.org/bots).
 Sample configuration:
+
 ```yaml
 telegram:
   bot_token: '999999999:3tOhy2CuZE0pTaCtszRfKpnagOG8IQbP5gf' # your bot api token
@@ -175,6 +191,28 @@
   enabled: true
 ```
 
+To set up Telegram, from your Telegram app, chat up BotFather (New Message,
+Search, "BotFather"), then say `/newbot` and follow the instructions.
+Eventually it will tell you the bot token (in the form seen above,
+`<number>:<random string>`) - add this to your config file.
+
+You can then click on the link of your bot, which will send the message `/start`.
+At this point, you can use the command `urlwatch --telegram-chats` to list the
+private chats the bot is involved with. This is the chat ID that you need to put
+into the config file as `chat_id`. You may add multiple chat IDs as a YAML list:
+
+```yaml
+telegram:
+  bot_token: '999999999:3tOhy2CuZE0pTaCtszRfKpnagOG8IQbP5gf' # your bot api token
+  chat_id:
+    - '11111111'
+    - '22222222'
+  enabled: true
+```
+
+Don't forget to also enable the reporter.
+
+
 BROWSER
 -------
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/__init__.py new/urlwatch-2.14/lib/urlwatch/__init__.py
--- old/urlwatch-2.13/lib/urlwatch/__init__.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/__init__.py 2018-08-30 10:36:16.000000000 +0200
@@ -12,5 +12,5 @@
 __author__ = 'Thomas Perl <m...@thp.io>'
 __license__ = 'BSD'
 __url__ = 'https://thp.io/2008/urlwatch/'
-__version__ = '2.13'
+__version__ = '2.14'
 __user_agent__ = '%s/%s (+https://thp.io/2008/urlwatch/info.html)' % (pkgname, __version__)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/command.py new/urlwatch-2.14/lib/urlwatch/command.py
--- old/urlwatch-2.13/lib/urlwatch/command.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/command.py 2018-08-30 10:36:16.000000000 +0200
@@ -33,6 +33,7 @@
 import os
 import shutil
 import sys
+import requests
 
 from .filters import FilterBase
 from .handler import JobState
@@ -175,6 +176,41 @@
         if self.urlwatch_config.edit_config:
             sys.exit(self.urlwatcher.config_storage.edit())
 
+    def check_telegram_chats(self):
+        if self.urlwatch_config.telegram_chats:
+            config = self.urlwatcher.config_storage.config['report'].get('telegram', None)
+            if not config:
+                print('You need to configure telegram in your config first (see README.md)')
+                sys.exit(1)
+
+            bot_token = config.get('bot_token', None)
+            if not bot_token:
+                print('You need to set up your bot token first (see README.md)')
+                sys.exit(1)
+
+            info = requests.get('https://api.telegram.org/bot{}/getMe'.format(bot_token)).json()
+
+            chats = {}
+            for chat_info in requests.get('https://api.telegram.org/bot{}/getUpdates'.format(bot_token)).json()['result']:
+                chat = chat_info['message']['chat']
+                if chat['type'] == 'private':
+                    chats[str(chat['id'])] = ' '.join((chat['first_name'], chat['last_name'])) if 'last_name' in chat else chat['first_name']
+
+            if not chats:
+                print('No chats found. Say hello to your bot at https://t.me/{}'.format(info['result']['username']))
+                sys.exit(1)
+
+            headers = ('Chat ID', 'Name')
+            maxchat = max(len(headers[0]), max((len(k) for k, v in chats.items()), default=0))
+            maxname = max(len(headers[1]), max((len(v) for k, v in chats.items()), default=0))
+            fmt = '%-' + str(maxchat) + 's %s'
+            print(fmt % headers)
+            print(fmt % ('-' * maxchat, '-' * maxname))
+            for k, v in sorted(chats.items(), key=lambda kv: kv[1]):
+                print(fmt % (k, v))
+            print('\nChat up your bot here: https://t.me/{}'.format(info['result']['username']))
+            sys.exit(0)
+
     def check_smtp_login(self):
         if self.urlwatch_config.smtp_login:
             config = self.urlwatcher.config_storage.config['report']['email']
@@ -222,6 +258,7 @@
     def run(self):
         self.check_edit_config()
         self.check_smtp_login()
+        self.check_telegram_chats()
         self.handle_actions()
         self.urlwatcher.run_jobs()
         self.urlwatcher.close()
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/config.py new/urlwatch-2.14/lib/urlwatch/config.py
--- old/urlwatch-2.13/lib/urlwatch/config.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/config.py 2018-08-30 10:36:16.000000000 +0200
@@ -89,6 +89,7 @@
 
         group = parser.add_argument_group('Authentication')
         group.add_argument('--smtp-login', action='store_true', help='Enter password for SMTP (store in keyring)')
+        group.add_argument('--telegram-chats', action='store_true', help='List telegram chats the bot is joined to')
 
         group = parser.add_argument_group('job list management')
         group.add_argument('--list', action='store_true', help='list jobs')
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/filters.py new/urlwatch-2.14/lib/urlwatch/filters.py
--- old/urlwatch-2.13/lib/urlwatch/filters.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/filters.py 2018-08-30 10:36:16.000000000 +0200
@@ -32,11 +32,14 @@
 import logging
 import itertools
 import os
+import io
 import imp
 import html.parser
 import hashlib
+import json
 
 from enum import Enum
+from lxml import etree
 
 from .util import TrackSubClasses
 
@@ -183,6 +186,19 @@
         return ical2text(data)
 
 
+class JsonFormatFilter(FilterBase):
+    """Convert to formatted json"""
+
+    __kind__ = 'format-json'
+
+    def filter(self, data, subfilter=None):
+        indentation = 4
+        if subfilter is not None:
+            indentation = int(subfilter)
+        parsed_json = json.loads(data)
+        return json.dumps(parsed_json, sort_keys=True, indent=indentation)
+
+
 class GrepFilter(FilterBase):
     """Filter only lines matching a regular expression"""
 
@@ -349,3 +365,18 @@
         return '\n'.join('%s %s' % (' '.join('%02x' % c for c in block),
                                     ''.join((chr(c) if (c > 31 and c < 127) else '.') for c in block))
                          for block in blocks)
+
+
+class XPathFilter(FilterBase):
+    """Filter XML/HTML using XPath expressions"""
+
+    __kind__ = 'xpath'
+
+    def filter(self, data, subfilter=None):
+        if subfilter is None:
+            raise ValueError('Need an XPath expression for filtering')
+
+        parser = etree.HTMLParser()
+        tree = etree.parse(io.StringIO(data), parser)
+        return '\n'.join(etree.tostring(element, pretty_print=True, method='html', encoding='unicode')
+                         for element in tree.xpath(subfilter))
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/handler.py new/urlwatch-2.14/lib/urlwatch/handler.py
--- old/urlwatch-2.13/lib/urlwatch/handler.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/handler.py 2018-08-30 10:36:16.000000000 +0200
@@ -51,9 +51,10 @@
         self.exception = None
         self.traceback = None
         self.tries = 0
+        self.etag = None
 
     def load(self):
-        self.old_data, self.timestamp, self.tries = self.cache_storage.load(self.job, self.job.get_guid())
+        self.old_data, self.timestamp, self.tries, self.etag = self.cache_storage.load(self.job, self.job.get_guid())
         if self.tries is None:
             self.tries = 0
@@ -62,7 +63,7 @@
             # If no new data has been retrieved due to an exception, use the old job data
             self.new_data = self.old_data
 
-        self.cache_storage.save(self.job, self.job.get_guid(), self.new_data, time.time(), self.tries)
+        self.cache_storage.save(self.job, self.job.get_guid(), self.new_data, time.time(), self.tries, self.etag)
 
     def process(self):
         logger.info('Processing: %s', self.job)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/jobs.py new/urlwatch-2.14/lib/urlwatch/jobs.py
--- old/urlwatch-2.13/lib/urlwatch/jobs.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/jobs.py 2018-08-30 10:36:16.000000000 +0200
@@ -180,7 +180,7 @@
     __required__ = ('url',)
     __optional__ = ('cookies', 'data', 'method', 'ssl_no_verify', 'ignore_cached', 'http_proxy', 'https_proxy',
-                    'headers')
+                    'headers', 'ignore_connection_errors')
 
     CHARSET_RE = re.compile('text/(html|plain); charset=([^;]*)')
 
@@ -197,10 +197,14 @@
                 'https': os.getenv('HTTPS_PROXY'),
             }
 
+        if job_state.etag is not None:
+            headers['If-None-Match'] = job_state.etag
+
         if job_state.timestamp is not None:
             headers['If-Modified-Since'] = email.utils.formatdate(job_state.timestamp)
 
         if self.ignore_cached:
+            headers['If-None-Match'] = None
             headers['If-Modified-Since'] = email.utils.formatdate(0)
             headers['Cache-Control'] = 'max-age=172800'
             headers['Expires'] = email.utils.formatdate()
@@ -234,9 +238,12 @@
                                 proxies=proxies)
         response.raise_for_status()
-        if response.status_code == 304:
+        if response.status_code == requests.codes.not_modified:
             raise NotModifiedError()
 
+        # Save ETag from response into job_state, which will be saved in cache
+        job_state.etag = response.headers.get('ETag')
+
         # If we can't find the encoding in the headers, requests gets all
         # old-RFC-y and assumes ISO-8859-1 instead of UTF-8. Use the old
         # urlwatch behavior and try UTF-8 decoding first.
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/reporters.py new/urlwatch-2.14/lib/urlwatch/reporters.py
--- old/urlwatch-2.13/lib/urlwatch/reporters.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/reporters.py 2018-08-30 10:36:16.000000000 +0200
@@ -485,7 +485,7 @@
         try:
             json_res = result.json()
 
-            if (result.status_code == 200):
+            if (result.status_code == requests.codes.ok):
                 logger.info("Mailgun response: id '{0}'. {1}".format(json_res['id'], json_res['message']))
             else:
                 logger.error("Mailgun error: {0}".format(json_res['message']))
@@ -506,7 +506,8 @@
     def submit(self):
 
         bot_token = self.config['bot_token']
-        chat_id = self.config['chat_id']
+        chat_ids = self.config['chat_id']
+        chat_ids = [chat_ids] if isinstance(chat_ids, str) else chat_ids
 
         text = '\n'.join(super().submit())
 
@@ -515,9 +516,11 @@
             return
 
         result = None
-        for chunk in self.chunkstring(text, self.MAX_LENGTH):
-            result = self.submitToTelegram(bot_token, chat_id, chunk)
+        for chat_id in chat_ids:
+            res = self.submitToTelegram(bot_token, chat_id, chunk)
+            if res.status_code != requests.codes.ok or res is None:
+                result = res
 
         return result
 
@@ -529,7 +532,7 @@
         try:
             json_res = result.json()
 
-            if (result.status_code == 200):
+            if (result.status_code == requests.codes.ok):
                 logger.info("Telegram response: ok '{0}'. {1}".format(json_res['ok'], json_res['result']))
             else:
                 logger.error("Telegram error: {0}".format(json_res['description']))
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/storage.py new/urlwatch-2.14/lib/urlwatch/storage.py
--- old/urlwatch-2.13/lib/urlwatch/storage.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/storage.py 2018-08-30 10:36:16.000000000 +0200
@@ -97,6 +97,11 @@
             'enabled': False,
             'api_key': '',
         },
+        'telegram': {
+            'enabled': False,
+            'bot_token': '',
+            'chat-id': '',
+        },
         'mailgun': {
             'enabled': False,
             'api_key': '',
@@ -359,7 +364,7 @@
         ...
 
     @abstractmethod
-    def save(self, job, guid, data, timestamp, tries):
+    def save(self, job, guid, data, timestamp, tries, etag=None):
         ...
 
     @abstractmethod
@@ -372,12 +377,12 @@
     def backup(self):
         for guid in self.get_guids():
-            data, timestamp, tries = self.load(None, guid)
-            yield guid, data, timestamp, tries
+            data, timestamp, tries, etag = self.load(None, guid)
+            yield guid, data, timestamp, tries, etag
 
     def restore(self, entries):
-        for guid, data, timestamp, tries in entries:
-            self.save(None, guid, data, timestamp, tries)
+        for guid, data, timestamp, tries, etag in entries:
+            self.save(None, guid, data, timestamp, tries, etag)
 
     def gc(self, known_guids):
         for guid in set(self.get_guids()) - set(known_guids):
@@ -420,10 +425,10 @@
         timestamp = os.stat(filename)[stat.ST_MTIME]
 
-        return data, timestamp
+        return data, timestamp, None
 
-    def save(self, job, guid, data, timestamp):
-        # Timestamp is always ignored
+    def save(self, job, guid, data, timestamp, etag=None):
+        # Timestamp and ETag are always ignored
         filename = self._get_filename(guid)
         with open(filename, 'w+') as fp:
             fp.write(data)
@@ -443,6 +448,7 @@
     timestamp = int
     data = str
     tries = int
+    etag = str
 
 
 class CacheMiniDBStorage(CacheStorage):
@@ -464,15 +470,15 @@
         return (guid for guid, in CacheEntry.query(self.db, minidb.Function('distinct', CacheEntry.c.guid)))
 
     def load(self, job, guid):
-        for data, timestamp, tries in CacheEntry.query(self.db, CacheEntry.c.data // CacheEntry.c.timestamp // CacheEntry.c.tries,
-                                                       order_by=minidb.columns(CacheEntry.c.timestamp.desc, CacheEntry.c.tries.desc),
-                                                       where=CacheEntry.c.guid == guid, limit=1):
-            return data, timestamp, tries
+        for data, timestamp, tries, etag in CacheEntry.query(self.db, CacheEntry.c.data // CacheEntry.c.timestamp // CacheEntry.c.tries // CacheEntry.c.etag,
+                                                             order_by=minidb.columns(CacheEntry.c.timestamp.desc, CacheEntry.c.tries.desc),
+                                                             where=CacheEntry.c.guid == guid, limit=1):
+            return data, timestamp, tries, etag
 
-        return None, None, 0
+        return None, None, 0, None
 
-    def save(self, job, guid, data, timestamp, tries):
-        self.db.save(CacheEntry(guid=guid, timestamp=timestamp, data=data, tries=tries))
+    def save(self, job, guid, data, timestamp, tries, etag=None):
+        self.db.save(CacheEntry(guid=guid, timestamp=timestamp, data=data, tries=tries, etag=etag))
         self.db.commit()
 
     def delete(self, guid):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/worker.py new/urlwatch-2.14/lib/urlwatch/worker.py
--- old/urlwatch-2.13/lib/urlwatch/worker.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/worker.py 2018-08-30 10:36:16.000000000 +0200
@@ -70,6 +70,8 @@
             if isinstance(job_state.exception, NotModifiedError):
                 logger.info('Job %s has not changed (HTTP 304)', job_state.job)
                 report.unchanged(job_state)
+            elif isinstance(job_state.exception, requests.exceptions.ConnectionError) and job_state.job.ignore_connection_errors:
+                logger.info('Connection error while executing job %s, ignored due to ignore_connection_errors', job_state.job)
             elif job_state.tries < max_tries:
                 logger.debug('This was try %i of %i for job %s', job_state.tries, max_tries, job_state.job)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/setup.cfg new/urlwatch-2.14/setup.cfg
--- old/urlwatch-2.13/setup.cfg 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/setup.cfg 2018-08-30 10:36:16.000000000 +0200
@@ -1,2 +1,2 @@
-[pep8]
+[pycodestyle]
 max-line-length = 120
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/setup.py new/urlwatch-2.14/setup.py
--- old/urlwatch-2.13/setup.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/setup.py 2018-08-30 10:36:16.000000000 +0200
@@ -1,6 +1,7 @@
 #!/usr/bin/env python3
 
 from setuptools import setup
+from distutils import cmd
 
 import os
 import re
@@ -16,7 +17,7 @@
 m['name'] = 'urlwatch'
 m['author'], m['author_email'] = re.match(r'(.*) <(.*)>', m['author']).groups()
 m['description'], m['long_description'] = docs[0].strip().split('\n\n', 1)
-m['install_requires'] = ['minidb', 'PyYAML', 'requests', 'keyring', 'pycodestyle', 'appdirs']
+m['install_requires'] = ['minidb', 'PyYAML', 'requests', 'keyring', 'pycodestyle', 'appdirs', 'lxml']
 m['scripts'] = ['urlwatch']
 m['package_dir'] = {'': 'lib'}
 m['packages'] = ['urlwatch']
@@ -29,5 +30,29 @@
     ]),
 ]
 
+
+class InstallDependencies(cmd.Command):
+    """Install dependencies only"""
+
+    description = 'Only install required packages using pip'
+    user_options = []
+
+    def initialize_options(self):
+        ...
+
+    def finalize_options(self):
+        ...
+
+    def run(self):
+        global m
+        try:
+            from pip._internal import main
+        except ImportError:
+            from pip import main
+        main(['install', '--upgrade'] + m['install_requires'])
+
+
+m['cmdclass'] = {'install_dependencies': InstallDependencies}
+
 del m['copyright']
 setup(**m)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/test/test_filters.py new/urlwatch-2.14/test/test_filters.py
--- old/urlwatch-2.13/test/test_filters.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/test/test_filters.py 2018-08-30 10:36:16.000000000 +0200
@@ -1,5 +1,6 @@
 from urlwatch.filters import GetElementById
 from urlwatch.filters import GetElementByTag
+from urlwatch.filters import JsonFormatFilter
 
 from nose.tools import eq_
 
@@ -35,3 +36,29 @@
     """, 'div')
     print(result)
     eq_(result, """<div>foo</div><div>bar</div>""")
+
+
+def test_json_format_filter():
+    json_format_filter = JsonFormatFilter(None, None)
+    result = json_format_filter.filter(
+        """{"field1": {"f1.1": "value"},"field2": "value"}""")
+    print(result)
+    eq_(result, """{
+    "field1": {
+        "f1.1": "value"
+    },
+    "field2": "value"
+}""")
+
+
+def test_json_format_filter_subfilter():
+    json_format_filter = JsonFormatFilter(None, None)
+    result = json_format_filter.filter(
+        """{"field1": {"f1.1": "value"},"field2": "value"}""", "2")
+    print(result)
+    eq_(result, """{
+  "field1": {
+    "f1.1": "value"
+  },
+  "field2": "value"
+}""")
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/test/test_handler.py new/urlwatch-2.14/test/test_handler.py
--- old/urlwatch-2.13/test/test_handler.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/test/test_handler.py 2018-08-30 10:36:16.000000000 +0200
@@ -161,14 +161,14 @@
 def test_number_of_tries_in_cache_is_increased():
     urlwatcher, cache_storage = prepare_retry_test()
     job = urlwatcher.jobs[0]
-    old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+    old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
     assert tries == 0
 
     urlwatcher.run_jobs()
     urlwatcher.run_jobs()
 
     job = urlwatcher.jobs[0]
-    old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+    old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
     assert tries == 2
     assert urlwatcher.report.job_states[-1].verb == 'error'
@@ -179,7 +179,7 @@
     urlwatcher, cache_storage = prepare_retry_test()
 
     job = urlwatcher.jobs[0]
-    old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+    old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
     assert tries == 0
 
     urlwatcher.run_jobs()
@@ -194,13 +194,13 @@
     urlwatcher, cache_storage = prepare_retry_test()
 
     job = urlwatcher.jobs[0]
-    old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+    old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
     assert tries == 0
 
     urlwatcher.run_jobs()
 
     job = urlwatcher.jobs[0]
-    old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+    old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
    assert tries == 1
 
     # use an url that definitely exists
@@ -210,5 +210,5 @@
     urlwatcher.run_jobs()
 
     job = urlwatcher.jobs[0]
-    old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+    old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
     assert tries == 0
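
A short usage note on the new filters and the new per-job option, since the README hunks above only show them in isolation. The following `urls.yaml` sketch is illustrative only: the URLs are placeholders, and the `format-json:2` spelling (indentation passed as the subfilter, as exercised by `test_json_format_filter_subfilter`) assumes urlwatch's usual `filter: name:subfilter` syntax rather than anything stated explicitly in this changelog.

```yaml
# Hypothetical jobs file illustrating the 2.14 additions; URLs are placeholders.
name: "Example JSON endpoint, pretty-printed with 2-space indent"
url: https://api.example.com/status.json
filter: format-json:2
ignore_connection_errors: true   # new per-job option from PR#261
---
name: "Example page reduced to its <body> via the new xpath filter"
url: https://example.net/
filter: xpath:/body
```

Jobs are separated by `---` as in any multi-document YAML job list; pretty-printing the JSON with a fixed key order keeps the reported diffs stable even when the server reorders fields.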
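The ETag support added across jobs.py, handler.py and storage.py boils down to standard HTTP conditional requests: store the `ETag` a server returns, send it back as `If-None-Match`, and treat a 304 response as "unchanged". A minimal standalone sketch of that mechanism, using `requests` directly (the function and variable names here are illustrative and not part of urlwatch's API):

```python
# Standalone sketch of the conditional-request flow the new ETag support builds on.
import requests


def fetch_if_changed(url, cached_etag=None, cached_body=None):
    """Return (body, etag); reuse the cached body when the server answers 304."""
    headers = {}
    if cached_etag is not None:
        # Same idea as the new code in jobs.py: send the stored ETag back.
        headers['If-None-Match'] = cached_etag

    response = requests.get(url, headers=headers)
    if response.status_code == requests.codes.not_modified:
        # 304 Not Modified: keep the cached copy and the old ETag.
        return cached_body, cached_etag

    response.raise_for_status()
    # Remember the new ETag (if any) for the next poll, just as handler.py
    # now persists job_state.etag into the cache database.
    return response.text, response.headers.get('ETag')


if __name__ == '__main__':
    body, etag = fetch_if_changed('https://example.com/')
    body, etag = fetch_if_changed('https://example.com/', etag, body)
```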