Hello community,

here is the log from the commit of package python-tld for openSUSE:Factory checked in at 2019-12-04 13:52:53
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-tld (Old)
 and /work/SRC/openSUSE:Factory/.python-tld.new.4691 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "python-tld"

Wed Dec 4 13:52:53 2019 rev:8 rq:753264 version:0.9.8

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-tld/python-tld.changes 2019-09-27 14:48:48.584708391 +0200
+++ /work/SRC/openSUSE:Factory/.python-tld.new.4691/python-tld.changes 2019-12-04 14:20:07.370425125 +0100
@@ -1,0 +2,10 @@
+Tue Nov 26 14:03:36 UTC 2019 - Sebastian Wagner <[email protected]>
+
+- update to version 0.9.8:
+  - Fix for occasional issue when some domains are not correctly recognised.
+- update to version 0.9.7:
+  - Handling urls that are only a TLD.
+  - Accepts already splitted URLs.
+  - Tested against Python 3.8.
+
+-------------------------------------------------------------------

Old:
----
  tld-0.9.6.tar.gz

New:
----
  tld-0.9.8.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-tld.spec ++++++
--- /var/tmp/diff_new_pack.ZNeoRo/_old 2019-12-04 14:20:07.918425587 +0100
+++ /var/tmp/diff_new_pack.ZNeoRo/_new 2019-12-04 14:20:07.922425590 +0100
@@ -1,7 +1,7 @@
 #
 # spec file for package python-tld
 #
-# Copyright (c) 2019 SUSE LINUX GmbH, Nuernberg, Germany.
+# Copyright (c) 2019 SUSE LLC
 #
 # All modifications and additions to the file contributed by third parties
 # remain the property of their copyright owners, unless otherwise agreed
@@ -18,12 +18,12 @@
 
 %{?!python_module:%define python_module() python-%{**} python3-%{**}}
 Name: python-tld
-Version: 0.9.6
+Version: 0.9.8
 Release: 0
 Summary: URL top level domain (TLD) extraction module
 License: MPL-1.1 OR GPL-2.0-only OR LGPL-2.1-only
 Group: Development/Languages/Python
-Url: https://github.com/barseghyanartur/tld
+URL: https://github.com/barseghyanartur/tld
 Source: https://files.pythonhosted.org/packages/source/t/tld/tld-%{version}.tar.gz
 # PATCH-FIX-OPENSUSE skip_internet_tests.patch
 Patch0: skip_internet_tests.patch

++++++ tld-0.9.6.tar.gz -> tld-0.9.8.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/CHANGELOG.rst new/tld-0.9.8/CHANGELOG.rst
--- old/tld-0.9.6/CHANGELOG.rst 2019-09-12 23:07:10.000000000 +0200
+++ new/tld-0.9.8/CHANGELOG.rst 2019-11-15 23:11:05.000000000 +0100
@@ -15,6 +15,25 @@
   0.3.4 to 0.4).
 - All backwards incompatible changes are mentioned in this document.
 
+0.9.8
+-----
+2019-11-15
+
+- Fix for occasional issue when some domains are not correctly recognised.
+
+0.9.7
+-----
+2019-10-30
+
+.. note::
+
+    This release is dedicated to my newborn daughter. Happy birthday, my dear
+    Ani.
+
+- Handling urls that are only a TLD.
+- Accepts already splitted URLs.
+- Tested against Python 3.8.
+
 0.9.6
 -----
 2019-09-12
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/PKG-INFO new/tld-0.9.8/PKG-INFO
--- old/tld-0.9.6/PKG-INFO 2019-09-13 01:12:20.000000000 +0200
+++ new/tld-0.9.8/PKG-INFO 2019-11-15 23:21:31.000000000 +0100
@@ -1,7 +1,7 @@
 Metadata-Version: 1.1
 Name: tld
-Version: 0.9.6
-Summary: Extract the top level domain (TLD) from the URL given.
+Version: 0.9.8
+Summary: Extract the top-level domain (TLD) from the URL given.
 Home-page: https://github.com/barseghyanartur/tld
 Author: Artur Barseghyan
 Author-email: [email protected]
@@ -38,7 +38,7 @@
 
         Prerequisites
         =============
-        - Python 2.7, 3.4, 3.5, 3.6, 3.7 and PyPy
+        - Python 2.7, 3.4, 3.5, 3.6, 3.7, 3.8 and PyPy
 
         Documentation
         =============
@@ -255,7 +255,7 @@
         ======
         Artur Barseghyan <[email protected]>
 
-Keywords: tld,top level domain names,python
+Keywords: tld,top-level domain names,python
 Platform: UNKNOWN
 Classifier: Programming Language :: Python
 Classifier: Programming Language :: Python :: 2.7
@@ -264,6 +264,7 @@
 Classifier: Programming Language :: Python :: 3.5
 Classifier: Programming Language :: Python :: 3.6
 Classifier: Programming Language :: Python :: 3.7
+Classifier: Programming Language :: Python :: 3.8
 Classifier: Environment :: Web Environment
 Classifier: Intended Audience :: Developers
 Classifier: Operating System :: OS Independent
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/README.rst new/tld-0.9.8/README.rst
--- old/tld-0.9.6/README.rst 2019-09-11 00:55:59.000000000 +0200
+++ new/tld-0.9.8/README.rst 2019-10-30 21:33:22.000000000 +0100
@@ -30,7 +30,7 @@
 
 Prerequisites
 =============
-- Python 2.7, 3.4, 3.5, 3.6, 3.7 and PyPy
+- Python 2.7, 3.4, 3.5, 3.6, 3.7, 3.8 and PyPy
 
 Documentation
 =============
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/docs/changelog.rst new/tld-0.9.8/docs/changelog.rst
--- old/tld-0.9.6/docs/changelog.rst 2019-09-12 23:07:44.000000000 +0200
+++ new/tld-0.9.8/docs/changelog.rst 2019-11-15 23:12:24.000000000 +0100
@@ -15,6 +15,25 @@
   0.3.4 to 0.4).
 - All backwards incompatible changes are mentioned in this document.
 
+0.9.8
+-----
+2019-11-15
+
+- Fix for occasional issue when some domains are not correctly recognised.
+
+0.9.7
+-----
+2019-10-30
+
+.. note::
+
+    This release is dedicated to my newborn daughter. Happy birthday, my dear
+    Ani.
+
+- Handling urls that are only a TLD.
+- Accepts already splitted URLs.
+- Tested against Python 3.8.
+
 0.9.6
 -----
 2019-09-12
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/docs/index.rst new/tld-0.9.8/docs/index.rst
--- old/tld-0.9.6/docs/index.rst 2019-09-12 23:07:44.000000000 +0200
+++ new/tld-0.9.8/docs/index.rst 2019-11-15 23:12:24.000000000 +0100
@@ -30,7 +30,7 @@
 
 Prerequisites
 =============
-- Python 2.7, 3.4, 3.5, 3.6, 3.7 and PyPy
+- Python 2.7, 3.4, 3.5, 3.6, 3.7, 3.8 and PyPy
 
 Documentation
 =============
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/setup.py new/tld-0.9.8/setup.py
--- old/tld-0.9.6/setup.py 2019-09-12 23:08:54.000000000 +0200
+++ new/tld-0.9.8/setup.py 2019-11-15 23:11:26.000000000 +0100
@@ -6,12 +6,12 @@
 except:
     readme = ''
 
-version = '0.9.6'
+version = '0.9.8'
 
 setup(
     name='tld',
     version=version,
-    description="Extract the top level domain (TLD) from the URL given.",
+    description="Extract the top-level domain (TLD) from the URL given.",
     long_description=readme,
     classifiers=[
         "Programming Language :: Python",
@@ -21,6 +21,7 @@
         "Programming Language :: Python :: 3.5",
         "Programming Language :: Python :: 3.6",
         "Programming Language :: Python :: 3.7",
+        "Programming Language :: Python :: 3.8",
         "Environment :: Web Environment",
         "Intended Audience :: Developers",
         "Operating System :: OS Independent",
@@ -31,7 +32,7 @@
         "License :: OSI Approved :: GNU Lesser General Public License v2 or "
         "later (LGPLv2+)",
     ],
-    keywords='tld, top level domain names, python',
+    keywords='tld, top-level domain names, python',
     author='Artur Barseghyan',
     author_email='[email protected]',
     url='https://github.com/barseghyanartur/tld',
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/src/tld/__init__.py new/tld-0.9.8/src/tld/__init__.py
--- old/tld-0.9.6/src/tld/__init__.py 2019-09-12 22:54:34.000000000 +0200
+++ new/tld-0.9.8/src/tld/__init__.py 2019-11-15 23:11:46.000000000 +0100
@@ -9,7 +9,7 @@
 )
 
 __title__ = 'tld'
-__version__ = '0.9.6'
+__version__ = '0.9.8'
 __author__ = 'Artur Barseghyan'
 __copyright__ = '2013-2019 Artur Barseghyan'
 __license__ = 'MPL-1.1 OR GPL-2.0-only OR LGPL-2.0-or-later'
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/src/tld/res/effective_tld_names.dat.txt new/tld-0.9.8/src/tld/res/effective_tld_names.dat.txt
--- old/tld-0.9.6/src/tld/res/effective_tld_names.dat.txt 2019-09-12 23:00:11.000000000 +0200
+++ new/tld-0.9.8/src/tld/res/effective_tld_names.dat.txt 2019-11-15 23:12:17.000000000 +0100
@@ -1368,7 +1368,7 @@
 gov.it
 edu.it
 // Reserved geo-names (regions and provinces):
-// http://www.nic.it/sites/default/files/docs/Regulation_assignation_v7.1.pdf
+// https://www.nic.it/sites/default/files/archivio/docs/Regulation_assignation_v7.1.pdf
 // Regions
 abr.it
 abruzzo.it
@@ -6038,15 +6038,28 @@
 perso.sn
 univ.sn
 
-// so : http://www.soregistry.com/
+// so : http://sonic.so/policies/
 so
 com.so
+edu.so
+gov.so
+me.so
 net.so
 org.so
 
 // sr : https://en.wikipedia.org/wiki/.sr
 sr
 
+// ss : https://registry.nic.ss/
+// Submitted by registry <[email protected]>
+ss
+biz.ss
+com.ss
+edu.ss
+gov.ss
+net.ss
+org.ss
+
 // st : http://www.nic.st/html/policyrules/
 st
 co.st
@@ -6789,6 +6802,9 @@
 // xn--e1a4c ("eu", Cyrillic) : EU
 ею
 
+// xn--mgbah1a3hjkrd ("Mauritania", Arabic) : MR
+موريتانيا
+
 // xn--node ("ge", Georgian Mkhedruli) : GE
 გე
 
@@ -7062,7 +7078,7 @@
 
 // newGTLDs
 
-// List of new gTLDs imported from https://www.icann.org/resources/registries/gtlds/v2/gtlds.json on 2019-09-10T15:21:14Z
+// List of new gTLDs imported from https://www.icann.org/resources/registries/gtlds/v2/gtlds.json on 2019-11-15T17:07:54Z
 // This list is auto-generated, don't edit it manually.
 // aaa : 2015-02-26 American Automobile Association, Inc.
 aaa
@@ -7574,9 +7590,6 @@
 // cars : 2014-11-13 Cars Registry Limited
 cars
 
-// cartier : 2014-06-23 Richemont DNS Inc.
-cartier
-
 // casa : 2013-11-21 Minds + Machines Group Limited
 casa
 
@@ -8045,9 +8058,6 @@
 // events : 2013-12-05 Binky Moon, LLC
 events
 
-// everbank : 2014-05-15 EverBank
-everbank
-
 // exchange : 2014-03-06 Binky Moon, LLC
 exchange
 
@@ -8321,7 +8331,7 @@
 // gmbh : 2016-01-29 Binky Moon, LLC
 gmbh
 
-// gmo : 2014-01-09 GMO Internet Pte. Ltd.
+// gmo : 2014-01-09 GMO Internet, Inc.
 gmo
 
 // gmx : 2014-04-24 1&1 Mail & Media GmbH
@@ -9332,9 +9342,6 @@
 // physio : 2014-05-01 PhysBiz Pty Ltd
 physio
 
-// piaget : 2014-10-16 Richemont DNS Inc.
-piaget
-
 // pics : 2013-11-14 Uniregistry, Corp.
 pics
 
@@ -9446,7 +9453,7 @@
 // quebec : 2013-12-19 PointQuébec Inc
 quebec
 
-// quest : 2015-03-26 Quest ION Limited
+// quest : 2015-03-26 XYZ.COM LLC
 quest
 
 // qvc : 2015-07-30 QVC, Inc.
@@ -9827,6 +9834,9 @@
 // soy : 2014-01-23 Charleston Road Registry Inc.
 soy
 
+// spa : 2019-09-19 Asia Spa and Wellness Promotion Council Limited
+spa
+
 // space : 2014-04-03 DotSpace Inc.
 space
 
@@ -10115,7 +10125,7 @@
 // university : 2014-03-06 Binky Moon, LLC
 university
 
-// uno : 2013-09-11 Dot Latin LLC
+// uno : 2013-09-11 DotSite Inc.
 uno
 
 // uol : 2014-05-01 UBN INTERNET LTDA.
@@ -10346,7 +10356,7 @@
 // xn--3bst00m : 2013-09-13 Eagle Horizon Limited
 集团
 
-// xn--3ds443g : 2013-09-08 TLD REGISTRY LIMITED
+// xn--3ds443g : 2013-09-08 TLD REGISTRY LIMITED OY
 在线
 
 // xn--3oq18vl8pn36a : 2015-07-02 Volkswagen (China) Investment Co., Ltd.
@@ -10424,7 +10434,7 @@
 // xn--cg4bki : 2013-09-27 SAMSUNG SDS CO., LTD
 삼성
 
-// xn--czr694b : 2014-01-16 Dot Trademark TLD Holding Company Limited
+// xn--czr694b : 2014-01-16 Internet DotTrademark Organisation Limited
 商标
 
 // xn--czrs0t : 2013-12-19 Binky Moon, LLC
@@ -10451,7 +10461,7 @@
 // xn--fhbei : 2015-01-15 VeriSign Sarl
 كوم
 
-// xn--fiq228c5hs : 2013-09-08 TLD REGISTRY LIMITED
+// xn--fiq228c5hs : 2013-09-08 TLD REGISTRY LIMITED OY
 中文网
 
 // xn--fiq64b : 2013-10-14 CITIC Group Corporation
@@ -10481,7 +10491,7 @@
 // xn--i1b6b1a6a2e : 2013-11-14 Public Interest Registry
 संगठन
 
-// xn--imr513n : 2014-12-11 Dot Trademark TLD Holding Company Limited
+// xn--imr513n : 2014-12-11 Internet DotTrademark Organisation Limited
 餐厅
 
 // xn--io0a7i : 2013-11-14 China Internet Network Information Center (CNNIC)
@@ -10550,7 +10560,7 @@
 // xn--nyqy26a : 2014-11-07 Stable Tone Limited
 健康
 
-// xn--otu796d : 2017-08-06 Dot Trademark TLD Holding Company Limited
+// xn--otu796d : 2017-08-06 Internet DotTrademark Organisation Limited
 招聘
 
 // xn--p1acf : 2013-12-12 Rusnames Limited
@@ -10688,6 +10698,10 @@
 *.compute.estate
 *.alces.network
 
+// Altervista: https://www.altervista.org
+// Submitted by Carlo Cannas <[email protected]>
+altervista.org
+
 // alwaysdata : https://www.alwaysdata.com
 // Submitted by Cyril <[email protected]>
 alwaysdata.net
@@ -10822,12 +10836,6 @@
 // Submitted by Vincent Tseng <[email protected]>
 myasustor.com
 
-// Automattic Inc. : https://automattic.com/
-// Submitted by Alex Concha <[email protected]>
-go-vip.co
-go-vip.net
-wpcomstaging.com
-
 // AVM : https://avm.de
 // Submitted by Andreas Weise <[email protected]>
 myfritz.net
@@ -11770,6 +11778,10 @@
 // Submitted by Mads Hartmann <[email protected]>
 glitch.me
 
+// GMO Pepabo, Inc. : https://pepabo.com/
+// Submitted by dojineko <[email protected]>
+lolipop.io
+
 // GOV.UK Platform as a Service : https://www.cloud.service.gov.uk/
 // Submitted by Tom Whitwell <[email protected]>
 cloudapps.digital
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/src/tld/tests/test_core.py new/tld-0.9.8/src/tld/tests/test_core.py
--- old/tld-0.9.6/src/tld/tests/test_core.py 2019-09-12 22:58:59.000000000 +0200
+++ new/tld-0.9.8/src/tld/tests/test_core.py 2019-11-15 23:09:21.000000000 +0100
@@ -6,6 +6,7 @@
 import unittest
 
 import six
+from six.moves.urllib.parse import urlsplit
 
 from .. import defaults
 from ..conf import get_setting, reset_settings, set_setting
@@ -260,6 +261,33 @@
                 'tld': 'xn--11b4c3d',
                 'kwargs': {'fail_silently': True},
             },
+            {
+                'url': 'http://cloud.fedoraproject.org',
+                'fld': 'cloud.fedoraproject.org',
+                'subdomain': '',
+                'domain': 'cloud.fedoraproject.org',
+                'suffix': 'cloud.fedoraproject.org',
+                'tld': 'cloud.fedoraproject.org',
+                'kwargs': {'fail_silently': True}
+            },
+            {
+                'url': 'github.io',
+                'fld': 'github.io',
+                'subdomain': '',
+                'domain': 'github.io',
+                'suffix': 'github.io',
+                'tld': 'github.io',
+                'kwargs': {'fail_silently': True, 'fix_protocol': True}
+            },
+            {
+                'url': urlsplit('http://lemonde.fr/article.html'),
+                'fld': 'lemonde.fr',
+                'subdomain': '',
+                'domain': 'lemonde',
+                'suffix': 'fr',
+                'tld': 'fr',
+                'kwargs': {'fail_silently': True}
+            },
         ]
 
         self.bad_patterns = {
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/src/tld/utils.py new/tld-0.9.8/src/tld/utils.py
--- old/tld-0.9.6/src/tld/utils.py 2019-09-12 23:03:26.000000000 +0200
+++ new/tld-0.9.8/src/tld/utils.py 2019-11-15 23:09:21.000000000 +0100
@@ -3,7 +3,7 @@
 import codecs
 
 from six import PY3, text_type
-from six.moves.urllib.parse import urlsplit
+from six.moves.urllib.parse import urlsplit, SplitResult
 from six.moves.urllib.request import urlopen
 
 from .conf import get_setting
@@ -42,10 +42,14 @@
 
     def __init__(self, tld, domain, subdomain, parsed_url):
         self.tld = tld
-        self.domain = domain
+        self.domain = domain if domain != '' else tld
         self.subdomain = subdomain
         self.parsed_url = parsed_url
-        self.__fld = "{0}.{1}".format(self.domain, self.tld)
+
+        if domain:
+            self.__fld = "{0}.{1}".format(self.domain, self.tld)
+        else:
+            self.__fld = self.tld
 
     @property
     def extension(self):
@@ -273,28 +277,25 @@
             "set to True."
         )
 
-    url = url.lower()
-
-    if fix_protocol:
-        if (
-            not url.startswith('//')
-            and not (url.startswith('http://') or url.startswith('https://'))
-        ):
-            url = 'https://{}'.format(url)
-
     tld_names = get_tld_names(fail_silently=fail_silently)  # Init
 
-    # Get parsed URL as we might need it later
-    parsed_url = urlsplit(url)
-    # Get (sub) domain name
-    domain_name = parsed_url.netloc
+    if not isinstance(url, SplitResult):
+        url = url.lower()
 
-    # Handling auth
-    if '@' in domain_name:
-        domain_name = domain_name.split('@', 1)[-1]
+        if fix_protocol:
+            if (
+                not url.startswith('//')
+                and not (url.startswith('http://') or url.startswith('https://'))
+            ):
+                url = 'https://{}'.format(url)
+
+        # Get parsed URL as we might need it later
+        parsed_url = urlsplit(url)
+    else:
+        parsed_url = url
 
-    # Handling port
-    domain_name = domain_name.split(':', 1)[0]
+    # Get (sub) domain name
+    domain_name = parsed_url.hostname
 
     if not domain_name:
         if fail_silently:
@@ -350,7 +351,10 @@
         else:
             raise TldDomainNotFound(domain_name=domain_name)
 
-    non_zero_i = max(1, len(domain_parts) - tld_length)
+    if len(domain_parts) == tld_length:
+        non_zero_i = -1  # hostname = tld
+    else:
+        non_zero_i = max(1, len(domain_parts) - tld_length)
 
     return domain_parts, non_zero_i, parsed_url
 
@@ -402,6 +406,10 @@
     if domain_parts is None:
         return None
 
+    if non_zero_i < 0:
+        # hostname = tld
+        return text_type(parsed_url.hostname)
+
     return text_type(".").join(domain_parts[non_zero_i-1:])
 
 
@@ -450,13 +458,22 @@
         return None
 
     if not as_object:
+        if non_zero_i < 0:
+            # hostname = tld
+            return text_type(parsed_url.hostname)
         return text_type(".").join(domain_parts[non_zero_i:])
 
-    subdomain = text_type(".").join(domain_parts[:non_zero_i-1])
-    domain = text_type(".").join(
-        domain_parts[non_zero_i-1:non_zero_i]
-    )
-    _tld = text_type(".").join(domain_parts[non_zero_i:])
+    if non_zero_i < 0:
+        # hostname = tld
+        subdomain = text_type("")
+        domain = text_type("")
+        _tld = text_type(parsed_url.hostname)
+    else:
+        subdomain = text_type(".").join(domain_parts[:non_zero_i-1])
+        domain = text_type(".").join(
+            domain_parts[non_zero_i-1:non_zero_i]
+        )
+        _tld = text_type(".").join(domain_parts[non_zero_i:])
 
     return Result(
         subdomain=subdomain,
@@ -522,7 +539,7 @@
     :rtype: bool
     """
     _tld = get_tld(
-        url='www.{}'.format(value),
+        url=value,
         fail_silently=True,
         fix_protocol=True,
         search_public=search_public,
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/tld-0.9.6/src/tld.egg-info/PKG-INFO new/tld-0.9.8/src/tld.egg-info/PKG-INFO
--- old/tld-0.9.6/src/tld.egg-info/PKG-INFO 2019-09-13 01:12:20.000000000 +0200
+++ new/tld-0.9.8/src/tld.egg-info/PKG-INFO 2019-11-15 23:21:31.000000000 +0100
@@ -1,7 +1,7 @@
 Metadata-Version: 1.1
 Name: tld
-Version: 0.9.6
-Summary: Extract the top level domain (TLD) from the URL given.
+Version: 0.9.8
+Summary: Extract the top-level domain (TLD) from the URL given.
 Home-page: https://github.com/barseghyanartur/tld
 Author: Artur Barseghyan
 Author-email: [email protected]
@@ -38,7 +38,7 @@
 
         Prerequisites
         =============
-        - Python 2.7, 3.4, 3.5, 3.6, 3.7 and PyPy
+        - Python 2.7, 3.4, 3.5, 3.6, 3.7, 3.8 and PyPy
 
         Documentation
         =============
@@ -255,7 +255,7 @@
         ======
         Artur Barseghyan <[email protected]>
 
-Keywords: tld,top level domain names,python
+Keywords: tld,top-level domain names,python
 Platform: UNKNOWN
 Classifier: Programming Language :: Python
 Classifier: Programming Language :: Python :: 2.7
@@ -264,6 +264,7 @@
 Classifier: Programming Language :: Python :: 3.5
 Classifier: Programming Language :: Python :: 3.6
 Classifier: Programming Language :: Python :: 3.7
+Classifier: Programming Language :: Python :: 3.8
 Classifier: Environment :: Web Environment
 Classifier: Intended Audience :: Developers
 Classifier: Operating System :: OS Independent
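
Note on the utils.py changes: the sketch below is not the library's code. It is a minimal, standard-library-only illustration of the three behaviours the diff introduces: accepting an already split URL, reading the host from `hostname` instead of `netloc`, and handling a hostname that is itself a TLD. The names `SUFFIXES`, `parse` and `split_host` are hypothetical, and the tiny suffix set merely stands in for the bundled effective_tld_names.dat.txt.

```python
from urllib.parse import SplitResult, urlsplit

# Hypothetical, tiny stand-in for the public suffix list shipped with tld.
SUFFIXES = {"fr", "io", "github.io", "co.uk"}


def parse(url, fix_protocol=False):
    """Return a SplitResult for a string or an already split URL."""
    if isinstance(url, SplitResult):
        # Already split: use it as-is, as the patched code does.
        return url
    url = url.lower()
    if fix_protocol and not url.startswith(("//", "http://", "https://")):
        url = "https://{}".format(url)
    return urlsplit(url)


def split_host(url, fix_protocol=False):
    """Return (subdomain, domain, suffix), or None if nothing matches."""
    parsed = parse(url, fix_protocol=fix_protocol)
    # hostname already excludes credentials and port, so no manual
    # splitting on '@' or ':' is needed (unlike netloc).
    host = parsed.hostname
    if not host:
        return None
    parts = host.split(".")
    # Scan from the longest candidate suffix to the shortest.
    for i in range(len(parts)):
        if ".".join(parts[i:]) in SUFFIXES:
            break
    else:
        return None
    if i == 0:
        # hostname = tld: the domain falls back to the suffix itself,
        # mirroring "domain if domain != '' else tld" in the diff.
        return "", host, host
    return ".".join(parts[:i - 1]), parts[i - 1], ".".join(parts[i:])
```

Under these assumptions, `split_host(urlsplit('http://lemonde.fr/article.html'))` and `split_host('github.io', fix_protocol=True)` correspond to the two new test patterns in test_core.py that exercise already split URLs and TLD-only hostnames.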
