commit python-idna for openSUSE:Factory

root Sun, 12 Nov 2017 08:59:53 -0800

Hello community,

here is the log from the commit of package python-idna for openSUSE:Factory 
checked in at 2017-11-12 17:59:35
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-idna (Old)
 and      /work/SRC/openSUSE:Factory/.python-idna.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Package is "python-idna"

Sun Nov 12 17:59:35 2017 rev:4 rq:540455 version:2.6

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-idna/python-idna.changes  2017-09-23 21:32:48.574278873 +0200
+++ /work/SRC/openSUSE:Factory/.python-idna.new/python-idna.changes     2017-11-12 17:59:42.694577152 +0100
@@ -1,0 +2,15 @@
+Thu Nov  9 18:53:55 UTC 2017 - [email protected]
+
+- update to version 2.6:
+  * Allows generation of IDNA and UTS 46 table data for different
+    versions of Unicode, by deriving properties directly from Unicode
+    data.
+  * Ability to generate RFC 5892/IANA-style table data
+  * Diagnostic output of IDNA-related Unicode properties and derived
+    calculations for a given codepoint
+  * Support for idna.__version__ to report version
+  * Support for idna.idnadata.__version__ and
+    idna.uts46data.__version__ to report Unicode version of underlying
+    IDNA and UTS 46 data respectively.
+
+-------------------------------------------------------------------
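
The version attributes listed in the changelog entry above can be read with
plain attribute access. The following is only a sketch: report_versions and
the stand-in fake_idna object are hypothetical names used for illustration,
and a real check would pass the imported idna package instead. Submodules
that have not been imported yet may be absent from the package namespace,
so missing attributes are reported as None rather than raising.

```python
from types import SimpleNamespace


def report_versions(pkg):
    """Collect the version attributes named in the 2.6 changelog.

    Attributes that are missing are reported as None instead of raising.
    """
    idnadata = getattr(pkg, "idnadata", None)
    uts46data = getattr(pkg, "uts46data", None)
    return {
        "package": getattr(pkg, "__version__", None),
        "idnadata_unicode": getattr(idnadata, "__version__", None),
        "uts46data_unicode": getattr(uts46data, "__version__", None),
    }


# Hypothetical stand-in mirroring the values introduced by this commit.
fake_idna = SimpleNamespace(
    __version__="2.6",
    idnadata=SimpleNamespace(__version__="6.3.0"),
    uts46data=SimpleNamespace(__version__="6.3.0"),
)
print(report_versions(fake_idna))
```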

Old:
----
  idna-2.5.tar.gz

New:
----
  idna-2.6.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-idna.spec ++++++
--- /var/tmp/diff_new_pack.0TjuST/_old  2017-11-12 17:59:44.914496294 +0100
+++ /var/tmp/diff_new_pack.0TjuST/_new  2017-11-12 17:59:44.914496294 +0100
@@ -18,7 +18,7 @@
 
 %{?!python_module:%define python_module() python-%{**} python3-%{**}}
 Name:           python-idna
-Version:        2.5
+Version:        2.6
 Release:        0
 Summary:        Internationalized Domain Names in Applications (IDNA)
 License:        BSD-3-Clause

++++++ idna-2.5.tar.gz -> idna-2.6.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/HISTORY.rst new/idna-2.6/HISTORY.rst
--- old/idna-2.5/HISTORY.rst    2017-03-07 04:25:38.000000000 +0100
+++ new/idna-2.6/HISTORY.rst    2017-08-08 05:42:40.000000000 +0200
@@ -3,6 +3,20 @@
 History
 -------
 
+2.6 (2017-08-08)
+++++++++++++++++
+
+- Allows generation of IDNA and UTS 46 table data for different
+  versions of Unicode, by deriving properties directly from
+  Unicode data.
+- Ability to generate RFC 5892/IANA-style table data
+- Diagnostic output of IDNA-related Unicode properties and
+  derived calculations for a given codepoint
+- Support for idna.__version__ to report version
+- Support for idna.idnadata.__version__ and
+  idna.uts46data.__version__ to report Unicode version of
+  underlying IDNA and UTS 46 data respectively.
+
 2.5 (2017-03-07)
 ++++++++++++++++
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/PKG-INFO new/idna-2.6/PKG-INFO
--- old/idna-2.5/PKG-INFO       2017-03-07 04:27:22.000000000 +0100
+++ new/idna-2.6/PKG-INFO       2017-08-08 05:43:08.000000000 +0200
@@ -1,6 +1,6 @@
 Metadata-Version: 1.1
 Name: idna
-Version: 2.5
+Version: 2.6
 Summary: Internationalized Domain Names in Applications (IDNA)
 Home-page: https://github.com/kjd/idna
 Author: Kim Davies
@@ -171,6 +171,41 @@
         when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO
         or CONTEXTJ but the contextual requirements are not satisfied.)
         
+        Building and Diagnostics
+        ------------------------
+        
+        The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for
+        performance. These tables are derived from computing against eligibility criteria
+        in the respective standards. These tables are computed using the command-line
+        script ``tools/idna-data``.
+        
+        This tool will fetch relevant tables from the Unicode Consortium and perform the
+        required calculations to identify eligibility. It has three main modes:
+        
+        * ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
+          the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors
+          who wish to track this library against a different Unicode version may use this tool
+          to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
+          files.
+        
+        * ``idna-data make-table``. Generate a table of the IDNA disposition
+          (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
+          5892 and the pre-computed tables published by `IANA <http://iana.org/>`_.
+        
+        * ``idna-data U+0061``. Prints debugging output on the various properties
+          associated with an individual Unicode codepoint (in this case, U+0061), that are
+          used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
+          or analysis.
+        
+        The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
+        the ``--version`` argument allows the specification of the version of Unicode to use
+        in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
+        will generate library data against Unicode 9.0.0.
+        
+        Note that this script requires Python 3, but all generated library data will work
+        in Python 2.6+.
+        
+        
         Testing
         -------
         
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/README.rst new/idna-2.6/README.rst
--- old/idna-2.5/README.rst     2017-03-07 04:22:47.000000000 +0100
+++ new/idna-2.6/README.rst     2017-08-08 05:42:40.000000000 +0200
@@ -163,6 +163,41 @@
 when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO
 or CONTEXTJ but the contextual requirements are not satisfied.)
 
+Building and Diagnostics
+------------------------
+
+The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for
+performance. These tables are derived from computing against eligibility criteria
+in the respective standards. These tables are computed using the command-line
+script ``tools/idna-data``.
+
+This tool will fetch relevant tables from the Unicode Consortium and perform the
+required calculations to identify eligibility. It has three main modes:
+
+* ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
+  the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors
+  who wish to track this library against a different Unicode version may use this tool
+  to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
+  files.
+
+* ``idna-data make-table``. Generate a table of the IDNA disposition
+  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
+  5892 and the pre-computed tables published by `IANA <http://iana.org/>`_.
+
+* ``idna-data U+0061``. Prints debugging output on the various properties
+  associated with an individual Unicode codepoint (in this case, U+0061), that are
+  used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
+  or analysis.
+
+The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
+the ``--version`` argument allows the specification of the version of Unicode to use
+in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
+will generate library data against Unicode 9.0.0.
+
+Note that this script requires Python 3, but all generated library data will work
+in Python 2.6+.
+
+
 Testing
 -------
 
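
The README text added above describes a per-codepoint diagnostic mode
(``idna-data U+0061``). As a rough illustration of that kind of lookup, the
sketch below parses a ``U+XXXX`` argument and prints a few properties from
Python's bundled unicodedata module. It is not the tool's own code: the
names parse_codepoint and describe are hypothetical, and the properties shown
are only a small subset of what an IDNA assessment needs.

```python
import unicodedata


def parse_codepoint(text):
    """Parse 'U+0061' (or bare hex such as '0061') into an integer."""
    cleaned = text.upper()
    if cleaned.startswith("U+"):
        cleaned = cleaned[2:]
    value = int(cleaned, 16)
    if not 0 <= value <= 0x10FFFF:
        raise ValueError("codepoint out of Unicode range: {}".format(text))
    return value


def describe(text):
    """Return a few Unicode properties for one codepoint."""
    cp = parse_codepoint(text)
    ch = chr(cp)
    return {
        "codepoint": "U+{:04X}".format(cp),
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "combining": unicodedata.combining(ch),
    }


print(describe("U+0061"))
```

The values come from whichever Unicode version the running interpreter
ships, which may differ from the version the library tables were built with.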
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/idna/__init__.py new/idna-2.6/idna/__init__.py
--- old/idna-2.5/idna/__init__.py       2017-03-07 04:22:47.000000000 +0100
+++ new/idna-2.6/idna/__init__.py       2017-06-28 16:45:32.000000000 +0200
@@ -1 +1,2 @@
+from .package_data import __version__
 from .core import *
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/idna/idnadata.py new/idna-2.6/idna/idnadata.py
--- old/idna-2.5/idna/idnadata.py       2017-03-07 04:22:47.000000000 +0100
+++ new/idna-2.6/idna/idnadata.py       2017-08-08 05:42:40.000000000 +0200
@@ -1,5 +1,6 @@
-# This file is automatically generated by build-idnadata.py
+# This file is automatically generated by tools/idna-data
 
+__version__ = "6.3.0"
 scripts = {
     'Greek': (
         0x37000000374,
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/idna/package_data.py new/idna-2.6/idna/package_data.py
--- old/idna-2.5/idna/package_data.py   1970-01-01 01:00:00.000000000 +0100
+++ new/idna-2.6/idna/package_data.py   2017-06-28 16:26:38.000000000 +0200
@@ -0,0 +1,2 @@
+__version__ = '2.6'
+
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/idna/uts46data.py new/idna-2.6/idna/uts46data.py
--- old/idna-2.5/idna/uts46data.py      2017-03-07 04:22:47.000000000 +0100
+++ new/idna-2.6/idna/uts46data.py      2017-08-08 05:42:40.000000000 +0200
@@ -1,9 +1,10 @@
-# This file is automatically generated by tools/build-uts46data.py
+# This file is automatically generated by tools/idna-data
 # vim: set fileencoding=utf-8 :
 
 """IDNA Mapping Table from UTS46."""
 
 
+__version__ = "6.3.0"
 def _seg_0():
     return [
     (0x0, '3'),
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/idna.egg-info/PKG-INFO new/idna-2.6/idna.egg-info/PKG-INFO
--- old/idna-2.5/idna.egg-info/PKG-INFO 2017-03-07 04:27:22.000000000 +0100
+++ new/idna-2.6/idna.egg-info/PKG-INFO 2017-08-08 05:43:08.000000000 +0200
@@ -1,6 +1,6 @@
 Metadata-Version: 1.1
 Name: idna
-Version: 2.5
+Version: 2.6
 Summary: Internationalized Domain Names in Applications (IDNA)
 Home-page: https://github.com/kjd/idna
 Author: Kim Davies
@@ -171,6 +171,41 @@
         when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO
         or CONTEXTJ but the contextual requirements are not satisfied.)
         
+        Building and Diagnostics
+        ------------------------
+        
+        The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for
+        performance. These tables are derived from computing against eligibility criteria
+        in the respective standards. These tables are computed using the command-line
+        script ``tools/idna-data``.
+        
+        This tool will fetch relevant tables from the Unicode Consortium and perform the
+        required calculations to identify eligibility. It has three main modes:
+        
+        * ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
+          the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors
+          who wish to track this library against a different Unicode version may use this tool
+          to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
+          files.
+        
+        * ``idna-data make-table``. Generate a table of the IDNA disposition
+          (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
+          5892 and the pre-computed tables published by `IANA <http://iana.org/>`_.
+        
+        * ``idna-data U+0061``. Prints debugging output on the various properties
+          associated with an individual Unicode codepoint (in this case, U+0061), that are
+          used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
+          or analysis.
+        
+        The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
+        the ``--version`` argument allows the specification of the version of Unicode to use
+        in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
+        will generate library data against Unicode 9.0.0.
+        
+        Note that this script requires Python 3, but all generated library data will work
+        in Python 2.6+.
+        
+        
         Testing
         -------
         
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/idna.egg-info/SOURCES.txt new/idna-2.6/idna.egg-info/SOURCES.txt
--- old/idna-2.5/idna.egg-info/SOURCES.txt      2017-03-07 04:27:22.000000000 +0100
+++ new/idna-2.6/idna.egg-info/SOURCES.txt      2017-08-08 05:43:08.000000000 +0200
@@ -10,11 +10,11 @@
 idna/core.py
 idna/idnadata.py
 idna/intranges.py
+idna/package_data.py
 idna/uts46data.py
 idna.egg-info/PKG-INFO
 idna.egg-info/SOURCES.txt
 idna.egg-info/dependency_links.txt
-idna.egg-info/pbr.json
 idna.egg-info/top_level.txt
 tests/IdnaTest.txt.gz
 tests/__init__.py
@@ -23,6 +23,5 @@
 tests/test_idna_compat.py
 tests/test_idna_uts46.py
 tests/test_intranges.py
-tools/build-idnadata.py
-tools/build-uts46data.py
+tools/idna-data
 tools/intranges.py
\ No newline at end of file
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/idna.egg-info/pbr.json new/idna-2.6/idna.egg-info/pbr.json
--- old/idna-2.5/idna.egg-info/pbr.json 2017-03-07 04:27:22.000000000 +0100
+++ new/idna-2.6/idna.egg-info/pbr.json 1970-01-01 01:00:00.000000000 +0100
@@ -1 +0,0 @@
-{"is_release": true, "git_version": "0088bfc"}
\ No newline at end of file
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/setup.py new/idna-2.6/setup.py
--- old/idna-2.5/setup.py       2017-03-07 04:25:15.000000000 +0100
+++ new/idna-2.6/setup.py       2017-06-28 16:56:52.000000000 +0200
@@ -9,7 +9,6 @@
 import io, sys
 from setuptools import setup
 
-version = "2.5"
 
 def main():
 
@@ -17,10 +16,13 @@
     if python_version < (2,6):
         raise SystemExit("Sorry, Python 2.6 or newer required")
 
+    package_data = {}
+    exec(open('idna/package_data.py').read(), package_data)
+
     arguments = {
         'name': 'idna',
         'packages': ['idna'],
-        'version': version,
+        'version': package_data['__version__'],
         'description': 'Internationalized Domain Names in Applications (IDNA)',
         'long_description': io.open("README.rst", encoding="UTF-8").read(),
         'author': 'Kim Davies',
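
The setup.py hunk above takes the version from idna/package_data.py by
executing that file into a dictionary instead of hard-coding it. A minimal
sketch of the same pattern follows. The helper name read_version and the
temporary file are assumptions for illustration, and unlike the hunk it
closes the file with a context manager.

```python
import os
import tempfile


def read_version(path):
    """Execute a small metadata file and return its __version__ string."""
    namespace = {}
    with open(path) as handle:
        exec(handle.read(), namespace)
    return namespace["__version__"]


# Write a throwaway metadata file shaped like idna/package_data.py.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "package_data.py")
    with open(path, "w") as handle:
        handle.write("__version__ = '2.6'\n")
    version = read_version(path)

print(version)
```

Executing the file avoids importing the package during installation, so
setup.py does not depend on the package's other modules being importable.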
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/tools/build-idnadata.py new/idna-2.6/tools/build-idnadata.py
--- old/idna-2.5/tools/build-idnadata.py        2017-03-07 04:22:47.000000000 +0100
+++ new/idna-2.6/tools/build-idnadata.py        1970-01-01 01:00:00.000000000 +0100
@@ -1,110 +0,0 @@
-#!/usr/bin/env python
-
-from __future__ import print_function
-
-try:
-    from urllib.request import urlopen
-except ImportError:
-    from urllib2 import urlopen
-import xml.etree.ElementTree as etree
-
-from intranges import intranges_from_list
-
-UNICODE_VERSION = '6.3.0'
-
-SCRIPTS_URL = "http://www.unicode.org/Public/{version}/ucd/Scripts.txt"
-JOININGTYPES_URL = "http://www.unicode.org/Public/{version}/ucd/ArabicShaping.txt"
-IDNATABLES_URL = "http://www.iana.org/assignments/idna-tables-{version}/idna-tables-{version}.xml"
-IDNATABLES_NS = "http://www.iana.org/assignments"
-
-# These scripts are needed to compute IDNA contextual rules, see
-# https://www.iana.org/assignments/idna-tables-6.3.0#idna-tables-context
-
-SCRIPT_WHITELIST = sorted(['Greek', 'Han', 'Hebrew', 'Hiragana', 'Katakana'])
-
-
-def print_optimised_list(d):
-    print("(")
-    for value in intranges_from_list(d):
-        print("        {},".format(hex(value)))
-    print("    ),")
-
-
-def build_idnadata(version):
-
-    print("# This file is automatically generated by build-idnadata.py\n")
-
-    #
-    # Script classifications are used by some CONTEXTO rules in RFC 5891
-    #
-    print("scripts = {")
-    scripts = {}
-    for line in urlopen(SCRIPTS_URL.format(version=version)).readlines():
-        line = line.decode('utf-8')
-        line = line.strip()
-        if not line or line[0] == '#':
-            continue
-        if line.find('#'):
-            line = line.split('#')[0]
-        (codepoints, scriptname) = [x.strip() for x in line.split(';')]
-        if not scriptname in scripts:
-            scripts[scriptname] = set()
-        if codepoints.find('..') > 0:
-            (begin, end) = [int(x, 16) for x in codepoints.split('..')]
-            for cp in range(begin, end+1):
-                scripts[scriptname].add(cp)
-        else:
-            scripts[scriptname].add(int(codepoints, 16))
-
-    for script in SCRIPT_WHITELIST:
-        print("    '{0}':".format(script), end=' ')
-        print_optimised_list(scripts[script])
-
-    print("}")
-
-    #
-    # Joining types are used by CONTEXTJ rule A.1
-    #
-    print("joining_types = {")
-    scripts = {}
-    for line in urlopen(JOININGTYPES_URL.format(version=version)).readlines():
-        line = line.decode('utf-8')
-        line = line.strip()
-        if not line or line[0] == '#':
-            continue
-        (codepoint, name, joiningtype, group) = [x.strip() for x in line.split(';')]
-        print("    {0}: {1},".format(hex(int(codepoint, 16)), ord(joiningtype)))
-    print("}")
-
-    #
-    # These are the classification of codepoints into PVALID, CONTEXTO, CONTEXTJ, etc.
-    #
-    print("codepoint_classes = {")
-    classes = {}
-
-    namespace = "{{{0}}}".format(IDNATABLES_NS)
-    idntables_data = urlopen(IDNATABLES_URL.format(version=version)).read()
-    root = etree.fromstring(idntables_data)
-
-    for record in root.findall('{0}registry[@id="idna-tables-properties"]/{0}record'.format(namespace)):
-        codepoint = record.find("{0}codepoint".format(namespace)).text
-        prop = record.find("{0}property".format(namespace)).text
-        if prop in ('UNASSIGNED', 'DISALLOWED'):
-            continue
-        if not prop in classes:
-            classes[prop] = set()
-        if codepoint.find('-') > 0:
-            (begin, end) = [int(x, 16) for x in codepoint.split('-')]
-            for cp in range(begin, end+1):
-                classes[prop].add(cp)
-        else:
-            classes[prop].add(int(codepoint, 16))
-
-    for prop in classes:
-        print("    '{0}':".format(prop), end=' ')
-        print_optimised_list(classes[prop])
-
-    print("}")
-
-if __name__ == "__main__":
-    build_idnadata(UNICODE_VERSION)
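
The removed script above reads Unicode data files whose lines hold either a
single codepoint or a ``first..last`` range, then a semicolon-separated
value and an optional ``#`` comment. The sketch below shows one way to parse
such a line. It is an illustration under those format assumptions, not the
replacement tool's code, and parse_ucd_line is a hypothetical name.

```python
def parse_ucd_line(line):
    """Return (first, last, value) for a data line, or None if it has no data."""
    # Drop any trailing comment, then surrounding whitespace.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    codepoints, value = [field.strip() for field in data.split(";")[:2]]
    if ".." in codepoints:
        first, last = (int(part, 16) for part in codepoints.split(".."))
    else:
        first = last = int(codepoints, 16)
    return first, last, value


print(parse_ucd_line("0370..0373    ; Greek # L&   [4] example comment"))
print(parse_ucd_line("00B5          ; Common"))
print(parse_ucd_line("# a comment-only line"))
```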
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/tools/build-uts46data.py new/idna-2.6/tools/build-uts46data.py
--- old/idna-2.5/tools/build-uts46data.py       2017-03-07 04:22:47.000000000 +0100
+++ new/idna-2.6/tools/build-uts46data.py       1970-01-01 01:00:00.000000000 +0100
@@ -1,106 +0,0 @@
-#!/usr/bin/env python
-
-"""Create a Python version of the IDNA Mapping Table from UTS46."""
-
-import re
-import sys
-
-# pylint: disable=unused-import,import-error,undefined-variable
-if sys.version_info[0] == 3:
-    from urllib.request import urlopen
-    unichr = chr
-else:
-    from urllib2 import urlopen
-# pylint: enable=unused-import,import-error,undefined-variable
-
-UNICODE_VERSION = '6.3.0'
-SEGMENT_SIZE = 100
-
-DATA_URL = "http://www.unicode.org/Public/idna/{version}/IdnaMappingTable.txt"
-RE_CHAR_RANGE = re.compile(br"([0-9a-fA-F]{4,6})(?:\.\.([0-9a-fA-F]{4,6}))?$")
-STATUSES = {
-    b"valid": ("V", False),
-    b"ignored": ("I", False),
-    b"mapped": ("M", True),
-    b"deviation": ("D", True),
-    b"disallowed": ("X", False),
-    b"disallowed_STD3_valid": ("3", False),
-    b"disallowed_STD3_mapped": ("3", True)
-}
-
-
-def parse_idna_mapping_table(inputstream):
-    """Parse IdnaMappingTable.txt and return a list of tuples."""
-    ranges = []
-    last_code = -1
-    last = (None, None)
-    for line in inputstream:
-        line = line.strip()
-        if b"#" in line:
-            line = line.split(b"#", 1)[0]
-        if not line:
-            continue
-        fields = [field.strip() for field in line.split(b";")]
-        char_range = RE_CHAR_RANGE.match(fields[0])
-        if not char_range:
-            raise ValueError(
-                "Invalid character or range {!r}".format(fields[0]))
-        start = int(char_range.group(1), 16)
-        if start != last_code + 1:
-            raise ValueError(
-                "Code point {!r} is not continguous".format(fields[0]))
-        if char_range.lastindex == 2:
-            last_code = int(char_range.group(2), 16)
-        else:
-            last_code = start
-        status, mapping = STATUSES[fields[1]]
-        if mapping:
-            mapping = (u"".join(unichr(int(codepoint, 16))
-                for codepoint in fields[2].split()).
-                replace("\\", "\\\\").replace("'", "\\'"))
-        else:
-            mapping = None
-        if start > 255 and (status, mapping) == last:
-            continue
-        last = (status, mapping)
-        while True:
-            if mapping is not None:
-                ranges.append(u"(0x{0:X}, '{1}', u'{2}')".format(
-                    start, status, mapping))
-            else:
-                ranges.append(u"(0x{0:X}, '{1}')".format(start, status))
-            start += 1
-            if start > 255 or start > last_code:
-                break
-    return ranges
-
-
-def build_uts46data(version):
-    """Fetch the mapping table, parse it, and rewrite idna/uts46data.py."""
-    ranges = parse_idna_mapping_table(urlopen(DATA_URL.format(version=version)))
-    with open("idna/uts46data.py", "wb") as outputstream:
-        outputstream.write(b'''\
-# This file is automatically generated by tools/build-uts46data.py
-# vim: set fileencoding=utf-8 :
-
-"""IDNA Mapping Table from UTS46."""
-
-
-''')
-        for idx, row in enumerate(ranges):
-            if idx % SEGMENT_SIZE == 0:
-                if idx!=0:
-                    outputstream.write(b"    ]\n\n")
-                outputstream.write(u"def _seg_{0}():\n    return [\n".format(idx/SEGMENT_SIZE).encode("utf8"))
-            outputstream.write(u"    {0},\n".format(row).encode("utf8"))
-        outputstream.write(b"    ]\n\n")
-        outputstream.write(b"uts46data = tuple(\n")
-
-        outputstream.write(b"    _seg_0()\n")
-        for i in xrange(1, (len(ranges)-1)/SEGMENT_SIZE+1):
-            outputstream.write(u"    + _seg_{0}()\n".format(i).encode("utf8"))
-        outputstream.write(b")\n")
-
-
-if __name__ == "__main__":
-    build_uts46data(UNICODE_VERSION)
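
The removed script above writes the UTS 46 table as a series of ``_seg_N()``
functions of SEGMENT_SIZE rows each and then concatenates them into one
tuple. Its ``idx/SEGMENT_SIZE`` and ``xrange`` only behave as intended on
Python 2. The sketch below shows the same segmentation idea with integer
arithmetic on Python 3. It is an illustration only: segment and render are
hypothetical names and the three sample rows are made up.

```python
def segment(rows, size):
    """Split rows into consecutive chunks of at most ``size`` items."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def render(rows, size):
    """Render rows as _seg_N() functions plus a tuple that joins them."""
    chunks = segment(rows, size)
    out = []
    for index, chunk in enumerate(chunks):
        out.append("def _seg_{}():".format(index))
        out.append("    return [")
        out.extend("    {},".format(row) for row in chunk)
        out.append("    ]")
        out.append("")
    calls = " + ".join("_seg_{}()".format(i) for i in range(len(chunks)))
    out.append("uts46data = tuple({})".format(calls or "[]"))
    return "\n".join(out)


# Generate source for three sample rows, two per segment, and execute it.
source = render(["(0x0, '3')", "(0x1, '3')", "(0x2, '3')"], size=2)
namespace = {}
exec(source, namespace)
print(namespace["uts46data"])
```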
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/idna-2.5/tools/idna-data new/idna-2.6/tools/idna-data
--- old/idna-2.5/tools/idna-data        1970-01-01 01:00:00.000000000 +0100
+++ new/idna-2.6/tools/idna-data        2017-08-08 05:42:40.000000000 +0200
@@ -0,0 +1,671 @@
+#!/usr/bin/env python3
+
+import argparse, collections, datetime, os, re, sys, unicodedata
+from urllib.request import urlopen
+from intranges import intranges_from_list
+
+if sys.version_info[0] < 3:
+    print("Only Python 3 supported.")
+    sys.exit(2)
+
+# PREFERRED_VERSION = 'latest'   # https://github.com/kjd/idna/issues/8
+PREFERRED_VERSION = '6.3.0'
+UCD_URL = 'http://www.unicode.org/Public/{version}/ucd/{filename}'
+UTS46_URL = 'http://www.unicode.org/Public/idna/{version}/{filename}'
+
+DEFAULT_CACHE_DIR = '~/.cache/unidata'
+
+# Scripts affected by IDNA contextual rules
+SCRIPT_WHITELIST = sorted(['Greek', 'Han', 'Hebrew', 'Hiragana', 'Katakana'])
+
+# Used to piece apart UTS#46 data for Jython compatibility
+UTS46_SEGMENT_SIZE = 100
+
+UTS46_STATUSES = {
+    "valid": ("V", False),
+    "ignored": ("I", False),
+    "mapped": ("M", True),
+    "deviation": ("D", True),
+    "disallowed": ("X", False),
+    "disallowed_STD3_valid": ("3", False),
+    "disallowed_STD3_mapped": ("3", True)
+}
+
+# Exceptions are manually assigned in Section 2.6 of RFC 5892.
+exceptions = {
+    0x00DF: 'PVALID',      # LATIN SMALL LETTER SHARP S
+    0x03C2: 'PVALID',      # GREEK SMALL LETTER FINAL SIGMA
+    0x06FD: 'PVALID',      # ARABIC SIGN SINDHI AMPERSAND
+    0x06FE: 'PVALID',      # ARABIC SIGN SINDHI POSTPOSITION MEN
+    0x0F0B: 'PVALID',      # TIBETAN MARK INTERSYLLABIC TSHEG
+    0x3007: 'PVALID',      # IDEOGRAPHIC NUMBER ZERO
+    0x00B7: 'CONTEXTO',    # MIDDLE DOT
+    0x0375: 'CONTEXTO',    # GREEK LOWER NUMERAL SIGN (KERAIA)
+    0x05F3: 'CONTEXTO',    # HEBREW PUNCTUATION GERESH
+    0x05F4: 'CONTEXTO',    # HEBREW PUNCTUATION GERSHAYIM
+    0x30FB: 'CONTEXTO',    # KATAKANA MIDDLE DOT
+    0x0660: 'CONTEXTO',    # ARABIC-INDIC DIGIT ZERO
+    0x0661: 'CONTEXTO',    # ARABIC-INDIC DIGIT ONE
+    0x0662: 'CONTEXTO',    # ARABIC-INDIC DIGIT TWO
+    0x0663: 'CONTEXTO',    # ARABIC-INDIC DIGIT THREE
+    0x0664: 'CONTEXTO',    # ARABIC-INDIC DIGIT FOUR
+    0x0665: 'CONTEXTO',    # ARABIC-INDIC DIGIT FIVE
+    0x0666: 'CONTEXTO',    # ARABIC-INDIC DIGIT SIX
+    0x0667: 'CONTEXTO',    # ARABIC-INDIC DIGIT SEVEN
+    0x0668: 'CONTEXTO',    # ARABIC-INDIC DIGIT EIGHT
+    0x0669: 'CONTEXTO',    # ARABIC-INDIC DIGIT NINE
+    0x06F0: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT ZERO
+    0x06F1: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT ONE
+    0x06F2: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT TWO
+    0x06F3: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT THREE
+    0x06F4: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT FOUR
+    0x06F5: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT FIVE
+    0x06F6: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT SIX
+    0x06F7: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT SEVEN
+    0x06F8: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT EIGHT
+    0x06F9: 'CONTEXTO',    # EXTENDED ARABIC-INDIC DIGIT NINE
+    0x0640: 'DISALLOWED',  # ARABIC TATWEEL
+    0x07FA: 'DISALLOWED',  # NKO LAJANYALAN
+    0x302E: 'DISALLOWED',  # HANGUL SINGLE DOT TONE MARK
+    0x302F: 'DISALLOWED',  # HANGUL DOUBLE DOT TONE MARK
+    0x3031: 'DISALLOWED',  # VERTICAL KANA REPEAT MARK
+    0x3032: 'DISALLOWED',  # VERTICAL KANA REPEAT WITH VOICED SOUND MARK
+    0x3033: 'DISALLOWED',  # VERTICAL KANA REPEAT MARK UPPER HALF
+    0x3034: 'DISALLOWED',  # VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HA
+    0x3035: 'DISALLOWED',  # VERTICAL KANA REPEAT MARK LOWER HALF
+    0x303B: 'DISALLOWED',  # VERTICAL IDEOGRAPHIC ITERATION MARK
+}
+backwardscompatible = {}
+
+
+def hexrange(start, end):
+    return range(int(start, 16), int(end, 16) + 1)
+
+def hexvalue(value):
+    return int(value, 16)
+
+
+class UnicodeVersion(object):
+
+    def __init__(self, version):
+        result = re.match('^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$', version)
+        if result:
+            self.major = int(result.group('major'))
+            self.minor = int(result.group('minor'))
+            self.patch = int(result.group('patch'))
+            self.numerical = (self.major << 8) + (self.minor << 4) + self.patch
+            self.latest = False
+        elif version == 'latest':
+            self.latest = True
+        else:
+            raise ValueError('Unrecognized Unicode version')
+
+    def __repr__(self, with_date=True):
+        if self.latest:
+            if with_date:
+                return 'latest@{}'.format(datetime.datetime.now().strftime('%Y-%m-%d'))
+            else:
+                return 'latest'
+        else:
+            return "{}.{}.{}".format(self.major, self.minor, self.patch)
+
+    @property
+    def tag(self):
+        return self.__repr__(with_date=False)
+
+    def __gt__(self, other):
+        if self.latest:
+            return True
+        return self.numerical > other.numerical
+
+    def __eq__(self, other):
+        if self.latest:
+            return False
+        return self.numerical == other.numerical
+
+
+class UnicodeData(object):
+
+    def __init__(self, version, cache, args):
+        self.version = UnicodeVersion(version)
+        self.system_version = UnicodeVersion(unicodedata.unidata_version)
+        self.source = args.source
+        self.cache = cache
+        self.max = 0
+
+        if self.system_version < self.version:
+            print("Warning: Character stability not guaranteed as Python Unicode data {}"
+                   " older than requested {}".format(self.system_version, self.version))
+
+        self._load_unicodedata()
+        self._load_proplist()
+        self._load_derivedcoreprops()
+        self._load_blocks()
+        self._load_casefolding()
+        self._load_hangulst()
+        self._load_arabicshaping()
+        self._load_scripts()
+        self._load_uts46mapping()
+
+    def _load_unicodedata(self):
+
+        f_ud = self._ucdfile('UnicodeData.txt')
+        self.ucd_data = {}
+        range_begin = None
+        for line in f_ud.splitlines():
+            fields = line.split(';')
+            value = int(fields[0], 16)
+            start_marker = re.match('^<(?P<name>.*?), First>$', fields[1])
+            end_marker = re.match('^<(?P<name>.*?), Last>$', fields[1])
+            if start_marker:
+                range_begin = value
+            elif end_marker:
+                for i in range(range_begin, value+1):
+                    fields[1] = '<{}>'.format(end_marker.group('name'))
+                    self.ucd_data[i] = fields[1:]
+                range_begin = None
+            else:
+                self.ucd_data[value] = fields[1:]
+
+    def _load_proplist(self):
+
+        f_pl = self._ucdfile('PropList.txt')
+        self.ucd_props = collections.defaultdict(list)
+        for line in f_pl.splitlines():
+            result = re.match(
+                '^(?P<start>[0-9A-F]{4,6})(|\.\.(?P<end>[0-9A-F]{4,6}))\s*;\s*(?P<prop>\S+)\s*(|\#.*)$',
+                line)
+            if result:
+                if result.group('end'):
+                    for i in hexrange(result.group('start'), result.group('end')):
+                        self.ucd_props[i].append(result.group('prop'))
+                else:
+                    i = hexvalue(result.group('start'))
+                    self.ucd_props[i].append(result.group('prop'))
+
+    def _load_derivedcoreprops(self):
+
+        f_dcp = self._ucdfile('DerivedCoreProperties.txt')
+        for line in f_dcp.splitlines():
+            result = re.match(
+                '^(?P<start>[0-9A-F]{4,6})(|\.\.(?P<end>[0-9A-F]{4,6}))\s*;\s*(?P<prop>\S+)\s*(|\#.*)$',
+                line)
+            if result:
+                if result.group('end'):
+                    for i in hexrange(result.group('start'), result.group('end')):
+                        self.ucd_props[i].append(result.group('prop'))
+                else:
+                    i = hexvalue(result.group('start'))
+                    self.ucd_props[i].append(result.group('prop'))
+
+    def _load_blocks(self):
+
+        self.ucd_block = {}
+        f_b = self._ucdfile('Blocks.txt')
+        for line in f_b.splitlines():
+            result = re.match(
+                '^(?P<start>[0-9A-F]{4,6})\.\.(?P<end>[0-9A-F]{4,6})\s*;\s*(?P<block>.*)\s*$',
+                line)
+            if result:
+                for i in hexrange(result.group('start'), result.group('end')):
+                    self.ucd_block[i] = result.group('block')
+                    self.max = max(self.max, i)
+
+    def _load_casefolding(self):
+
+        self.ucd_cf = {}
+        f_cf = self._ucdfile('CaseFolding.txt')
+        for line in f_cf.splitlines():
+            result = re.match(
+                '^(?P<cp>[0-9A-F]{4,6})\s*;\s*(?P<type>\S+)\s*;\s*(?P<subst>[0-9A-F\s]+)\s*',
+                line)
+            if result:
+                if result.group('type') in ('C', 'F'):
+                    self.ucd_cf[int(result.group('cp'), 16)] = \
+                        ''.join([chr(int(x, 16)) for x in result.group('subst').split(' ')])
+
+    def _load_hangulst(self):
+
+        self.ucd_hst = {}
+        f_hst = self._ucdfile('HangulSyllableType.txt')
+        for line in f_hst.splitlines():
+            result = re.match(
+                '^(?P<start>[0-9A-F]{4,6})\.\.(?P<end>[0-9A-F]{4,6})\s*;\s*(?P<type>\S+)\s*(|\#.*)$',
+                line)
+            if result:
+                for i in hexrange(result.group('start'), result.group('end')):
+                    self.ucd_hst[i] = result.group('type')
+
+    def _load_arabicshaping(self):
+
+        self.ucd_as = {}
+        f_as = self._ucdfile('ArabicShaping.txt')
+        for line in f_as.splitlines():
+            result = re.match('^(?P<cp>[0-9A-F]{4,6})\s*;\s*.*?\s*;\s*(?P<jt>\S+)\s*;', line)
+            if result:
+                self.ucd_as[int(result.group('cp'), 16)] = result.group('jt')
+
+    def _load_scripts(self):
+
+        self.ucd_s = {}
+        f_s = self._ucdfile('Scripts.txt')
+        for line in f_s.splitlines():
+            result = re.match(
+                '^(?P<start>[0-9A-F]{4,6})(|\.\.(?P<end>[0-9A-F]{4,6}))\s*;\s*(?P<script>\S+)\s*(|\#.*)$',
+                line)
+            if result:
+                if not result.group('script') in self.ucd_s:
+                    self.ucd_s[result.group('script')] = set()
+                if result.group('end'):
+                    for i in hexrange(result.group('start'), result.group('end')):
+                        self.ucd_s[result.group('script')].add(i)
+                else:
+                    i = hexvalue(result.group('start'))
+                    self.ucd_s[result.group('script')].add(i)
+
+    def _load_uts46mapping(self):
+
+        self.ucd_idnamt = {}
+        f_idnamt = self._ucdfile('IdnaMappingTable.txt', urlbase=UTS46_URL)
+        for line in f_idnamt.splitlines():
+            result = re.match(
+                '^(?P<start>[0-9A-F]{4,6})(|\.\.(?P<end>[0-9A-F]{4,6}))\s*;\s*(?P<fields>[^#]+)',
+                line)
+            if result:
+                fields = [x.strip() for x in result.group('fields').split(';')]
+                if result.group('end'):
+                    for i in hexrange(result.group('start'), result.group('end')):
+                        self.ucd_idnamt[i] = fields
+                else:
+                    i = hexvalue(result.group('start'))
+                    self.ucd_idnamt[i] = fields
+
+    def _ucdfile(self, filename, urlbase=UCD_URL):
+        if self.source:
+            f = open("{}/{}".format(self.source, filename))
+            return f.read()
+        else:
+            cache_file = None
+            if self.cache:
+                cache_file = os.path.expanduser("{}/{}/{}".format(
+                    self.cache, self.version.tag, filename))
+                if os.path.isfile(cache_file):
+                    f = open(cache_file)
+                    return f.read()
+
+            version_path = self.version.tag
+            if version_path == 'latest':
+                version_path = 'UCD/latest'
+            url = urlbase.format(
+                version=version_path,
+                filename=filename,
+            )
+            content = urlopen(url).read()
+
+            if cache_file:
+                if not os.path.isdir(os.path.dirname(cache_file)):
+                    os.makedirs(os.path.dirname(cache_file))
+                f = open(cache_file, 'wb')
+                f.write(content)
+                f.close()
+
+            return str(content)
+
+    def codepoints(self):
+        for i in range(0, self.max + 1):
+            yield CodePoint(i, ucdata=self)
+
+
+class CodePoint:
+
+    def __init__(self, value=None, ucdata=None):
+        self.value = value
+        self.ucdata = ucdata
+
+    def _casefold(self, s):
+        r = ''
+        for c in s:
+            r += self.ucdata.ucd_cf.get(ord(c), c)
+        return r
+
+    @property
+    def exception_value(self):
+        return exceptions.get(self.value, False)
+
+    @property
+    def compat_value(self):
+        return backwardscompatible.get(self.value, False)
+
+    @property
+    def name(self):
+        if self.value in self.ucdata.ucd_data:
+            return self.ucdata.ucd_data[self.value][0]
+        elif 'Noncharacter_Code_Point' in self.ucdata.ucd_props[self.value]:
+            return '<noncharacter>'
+        else:
+            return '<reserved>'
+
+    @property
+    def general_category(self):
+        return self.ucdata.ucd_data.get(self.value, [None, None])[1]
+
+    @property
+    def unassigned(self):
+        return not ('Noncharacter_Code_Point' in self.ucdata.ucd_props[self.value] or \
+                    self.value in self.ucdata.ucd_data)
+
+    @property
+    def ldh(self):
+        if self.value == 0x002d or \
+           self.value in range(0x0030, 0x0039+1) or \
+           self.value in range(0x0061, 0x007a+1):
+            return True
+        return False
+
+    @property
+    def join_control(self):
+        return 'Join_Control' in self.ucdata.ucd_props[self.value]
+
+    @property
+    def joining_type(self):
+        return self.ucdata.ucd_as.get(self.value, None)
+
+    @property
+    def char(self):
+        return chr(self.value)
+
+    @property
+    def nfkc_cf(self):
+        return unicodedata.normalize('NFKC',
+                                     self._casefold(unicodedata.normalize('NFKC', self.char)))
+
+    @property
+    def unstable(self):
+        return self.char != self.nfkc_cf
+
+    @property
+    def in_ignorableproperties(self):
+        for prop in ['Default_Ignorable_Code_Point', 'White_Space', 'Noncharacter_Code_Point']:
+            if prop in self.ucdata.ucd_props[self.value]:
+                return True
+        return False
+
+    @property
+    def in_ignorableblocks(self):
+        return self.ucdata.ucd_block.get(self.value) in (
+            'Combining Diacritical Marks for Symbols', 'Musical Symbols',
+            'Ancient Greek Musical Notation'
+        )
+
+    @property
+    def oldhanguljamo(self):
+        return self.ucdata.ucd_hst.get(self.value) in ('L', 'V', 'T')
+
+    @property
+    def in_lettersdigits(self):
+        return self.general_category in ('Ll', 'Lu', 'Lo', 'Nd', 'Lm', 'Mn', 'Mc')
+
+    @property
+    def idna2008_status(self):
+        if self.exception_value:
+            return self.exception_value
+        elif self.compat_value:
+            return self.compat_value
+        elif self.unassigned:
+            return 'UNASSIGNED'
+        elif self.ldh:
+            return 'PVALID'
+        elif self.join_control:
+            return 'CONTEXTJ'
+        elif self.unstable:
+            return 'DISALLOWED'
+        elif self.in_ignorableproperties:
+            return 'DISALLOWED'
+        elif self.in_ignorableblocks:
+            return 'DISALLOWED'
+        elif self.oldhanguljamo:
+            return 'DISALLOWED'
+        elif self.in_lettersdigits:
+            return 'PVALID'
+        else:
+            return 'DISALLOWED'
+
+    @property
+    def uts46_data(self):
+        return self.ucdata.ucd_idnamt.get(self.value, None)
+
+    @property
+    def uts46_status(self):
+        return ' '.join(self.uts46_data)
+
+
+def diagnose_codepoint(codepoint, args, ucdata):
+
+    cp = CodePoint(codepoint, ucdata=ucdata)
+
+    print("U+{:04X}:".format(codepoint))
+    print("   Name:             {}".format(cp.name))
+    print("1  Exceptions:       {}".format(exceptions.get(codepoint, False)))
+    print("2  Backwards Compat: {}".format(backwardscompatible.get(codepoint, False)))
+    print("3  Unassigned:       {}".format(cp.unassigned))
+    print("4  LDH:              {}".format(cp.ldh))
+    print("   Properties:       {}".format(" ".join(sorted(ucdata.ucd_props.get(codepoint, ['None'])))))
+    print("5  .Join Control:    {}".format(cp.join_control))
+    print("   NFKC CF:          {}".format(" ".join(["U+{:04X}".format(ord(x)) for x in cp.nfkc_cf])))
+    print("6  .Unstable:        {}".format(cp.unstable))
+    print("7  .Ignorable Prop:  {}".format(cp.in_ignorableproperties))
+    print("   Block:            {}".format(ucdata.ucd_block.get(codepoint, None)))
+    print("8  .Ignorable Block: {}".format(cp.in_ignorableblocks))
+    print("   Hangul Syll Type: {}".format(ucdata.ucd_hst.get(codepoint, None)))
+    print("9  .Old Hangul Jamo: {}".format(cp.oldhanguljamo))
+    print("   General Category: {}".format(cp.general_category))
+    print("10 .Letters Digits:  {}".format(cp.in_lettersdigits))
+    print("== IDNA 2008:        {}".format(cp.idna2008_status))
+    print("== UTS 46:           {}".format(cp.uts46_status))
+    print("(Unicode {} [sys:{}])".format(ucdata.version, ucdata.system_version))
+
+def ucdrange(start, end):
+    if start == end:
+        return ("{:04X}".format(start.value), start.name)
+    else:
+        return ("{:04X}..{:04X}".format(start.value, end.value),
+                "{}..{}".format(start.name, end.name))
+
+def optimised_list(d):
+    yield '('
+    for value in intranges_from_list(d):
+        yield '        {},'.format(hex(value))
+    yield '    ),'
+
+def make_table(args, ucdata):
+
+    last_status = None
+    cps = []
+    table_data = []
+
+    for cp in ucdata.codepoints():
+        status = cp.idna2008_status
+        if (last_status and last_status != status):
+            (values, description) = ucdrange(cps[0], cps[-1])
+            table_data.append([values, last_status, description])
+            cps = []
+        last_status = status
+        cps.append(cp)
+    (values, description) = ucdrange(cps[0], cps[-1])
+    table_data.append([values, last_status, description])
+
+    if args.dir:
+
+        f = open("{}/idna-table-{}.txt".format(args.dir, ucdata.version), 'wb')
+        for row in table_data:
+            f.write("{:12}; {:12}# {:.44}\n".format(*row).encode('ascii'))
+        f.close()
+
+    else:
+
+        for row in table_data:
+            print("{:12}; {:12}# {:.44}".format(*row))
+
+def idna_libdata(ucdata):
+
+    yield "# This file is automatically generated by tools/idna-data\n"
+    yield "__version__ = \"{}\"".format(ucdata.version)
+
+    #
+    # Script classifications are used by some CONTEXTO rules in RFC 5891
+    #
+    yield "scripts = {"
+    for script in SCRIPT_WHITELIST:
+        prefix = "    '{0}': ".format(script)
+        for line in optimised_list(ucdata.ucd_s[script]):
+            yield prefix + line
+            prefix = ""
+    yield "}"
+
+    #
+    # Joining types are used by CONTEXTJ rule A.1
+    #
+    yield "joining_types = {"
+    for cp in ucdata.codepoints():
+        if cp.joining_type:
+            yield "    0x{0:x}: {1},".format(cp.value, ord(cp.joining_type))
+    yield "}"
+
+    #
+    # These are the classification of codepoints into PVALID, CONTEXTO, CONTEXTJ, etc.
+    #
+    yield "codepoint_classes = {"
+    classes = {}
+    for cp in ucdata.codepoints():
+        status = cp.idna2008_status
+        if status in ('UNASSIGNED', 'DISALLOWED'):
+            continue
+        if not status in classes:
+            classes[status] = set()
+        classes[status].add(cp.value)
+    for status in ['PVALID', 'CONTEXTJ', 'CONTEXTO']:
+        prefix = "    '{0}': ".format(status)
+        for line in optimised_list(classes[status]):
+            yield prefix + line
+            prefix = ""
+    yield "}"
+
+def uts46_ranges(ucdata):
+
+    last = (None, None)
+    for cp in ucdata.codepoints():
+        fields = cp.uts46_data
+        if not fields:
+            continue
+        status, mapping = UTS46_STATUSES[fields[0]]
+        if mapping:
+            mapping = "".join(chr(int(codepoint, 16)) for codepoint in fields[1].split())
+            mapping = mapping.replace("\\", "\\\\").replace("'", "\\'")
+        else:
+            mapping = None
+        if cp.value > 255 and (status, mapping) == last:
+            continue
+        last = (status, mapping)
+
+        if mapping is not None:
+            yield "(0x{0:X}, '{1}', u'{2}')".format(cp.value, status, mapping)
+        else:
+            yield "(0x{0:X}, '{1}')".format(cp.value, status)
+
+def uts46_libdata(ucdata):
+
+    yield "# This file is automatically generated by tools/idna-data"
+    yield "# vim: set fileencoding=utf-8 :\n"
+    yield '"""IDNA Mapping Table from UTS46."""\n\n'
+
+    yield "__version__ = \"{}\"".format(ucdata.version)
+
+    idx = -1
+    for row in uts46_ranges(ucdata):
+        idx += 1
+        if idx % UTS46_SEGMENT_SIZE == 0:
+            if idx != 0:
+                yield "    ]\n"
+            yield "def _seg_{0}():\n    return [".format(idx // UTS46_SEGMENT_SIZE)
+        yield "    {0},".format(row)
+    yield "    ]\n"
+
+    yield "uts46data = tuple("
+    yield "    _seg_0()"
+    for i in range(1, idx // UTS46_SEGMENT_SIZE + 1):
+        yield "    + _seg_{0}()".format(i)
+    yield ")"
+
+def make_libdata(args, ucdata):
+
+    dest_dir = args.dir or '.'
+
+    target_filename = os.path.join(dest_dir, 'idnadata.py')
+    with open(target_filename, 'wb') as target:
+        for line in idna_libdata(ucdata):
+            target.write((line + "\n").encode('utf-8'))
+
+    target_filename = os.path.join(dest_dir, 'uts46data.py')
+    with open(target_filename, 'wb') as target:
+        for line in uts46_libdata(ucdata):
+            target.write((line + "\n").encode('utf-8'))
+
+def arg_error(message, parser):
+
+    parser.print_usage()
+    print('{}: error: {}'.format(sys.argv[0], message))
+    sys.exit(2)
+
+def main():
+
+    parser = argparse.ArgumentParser(description='Determine IDNA code-point validity data')
+    parser.add_argument('action', type=str, default='preferred',
+                        help='Task to perform (make-libdata, make-tables, <codepoint>)')
+
+    parser.add_argument('--version', type=str, default='preferred',
+                        help='Unicode version to use (preferred, latest, <x.y.z>)')
+    parser.add_argument('--source', type=str, default=None,
+                        help='Where to fetch Unicode data (file path)')
+    parser.add_argument('--dir', type=str, default=None, help='Where to export the output')
+    parser.add_argument('--cache', type=str, default=None, help='Where to cache Unicode data')
+    parser.add_argument('--no-cache', action='store_true', help='Don\'t cache Unicode data')
+    libdata = parser.add_argument_group('make-libdata', 'Make module data for Python IDNA library')
+
+    tables = parser.add_argument_group('make-table', 'Make IANA-style reference table')
+
+    codepoint = parser.add_argument_group('codepoint',
+                                          'Display related data for given codepoint (e.g. U+0061)')
+
+    args = parser.parse_args()
+
+    if args.version == 'preferred':
+        target_version = PREFERRED_VERSION
+    else:
+        target_version = args.version
+
+    if args.cache and args.no_cache:
+        arg_error('I can\'t both --cache and --no-cache', parser)
+    cache = args.cache or DEFAULT_CACHE_DIR
+    if args.no_cache:
+        cache = None
+
+    ucdata = UnicodeData(target_version, cache, args)
+
+    if args.action == 'make-table':
+        make_table(args, ucdata)
+    elif args.action == 'make-libdata':
+        make_libdata(args, ucdata)
+    else:
+        result = re.match('^(?i)(U\+|)(?P<cp>[0-9A-F]{4,6})$', args.action)
+        if result:
+            codepoint = int(result.group('cp'), 16)
+            diagnose_codepoint(codepoint, args, ucdata)
+            sys.exit(0)
+        arg_error('Don\'t recognize action or codepoint value', parser)
+        
+
+if __name__ == '__main__':
+    main()
+
+
+
