[Python-ideas] PEP 634-636: Mapping patterns and extra keys

David Foster Sat, 14 Nov 2020 22:05:04 -0800

From PEP 636 (Structural Pattern Matching):

> Mapping patterns: {"bandwidth": b, "latency": l} captures the "bandwidth" and "latency" values from a dict. Unlike sequence patterns, extra keys are ignored.

It surprises me that ignoring extra keys would be the *default* behavior. This seems unsafe. I would think extra keys are best treated as suspicious by default.


* Ignoring extra keys loses data silently. In the current proposal:

    point = {'x': 1, 'y': 2, 'z': 3}
    match point:
        case {'x': x, 'y': y}:  # MATCHES, losing z  O_O
            pass
        case {'x': x, 'y': y, 'z': z}:  # will never match  O_O
            pass

* Ignoring extra keys is inconsistent with the handling of sequences: we don't allow extra items when using a destructuring assignment to a sequence:

    p = [1, 2]
    [x, y] = p
    [x, y, z] = p  # ERROR: ValueError: not enough values to unpack (expected 3, got 2)  :)

* Ignoring extra keys in mapping patterns is inconsistent with the current proposal for how sequence patterns match data:


    point = [1, 2, 3]
    match point:
        case [x, y]:  # notices extra value and does NOT match  :)
            pass
        case [x, y, z]:  # matches :)
            pass

* Ignoring extra keys is inconsistent with TypedDict's default "total" matching behavior:


    from typing import TypedDict

    class Point2D(TypedDict):
        x: int
        y: int

    p1: Point2D = {'x': 1, 'y': 2}
    p2: Point2D = {'x': 1, 'y': 2, 'z': 3}  # ERROR: Extra key 'z' for TypedDict "Point2D"  :)

* It is *possible* to force an exact key match with a pattern guard, but it's clumsy to do so.

  It should not be clumsy to parse strictly.

    point = {'x': 1, 'y': 2, 'z': 3}
    match point:
        # notices extra value and does NOT match, but requires ugly guard  :/
        case {'x': x, 'y': y, **rest} if rest == {}:
            pass
        case {'x': x, 'y': y, 'z': z, **rest} if rest == {}:
            pass

To avoid the above problems, **I'd advocate for disallowing extra keys in mapping patterns by default**. For cases where extra keys should be explicitly allowed and ignored, I propose allowing a **_ wildcard.

Some examples that illustrate behavior when *disallowing* extra keys in mapping patterns:


1. Strict parsing

    from typing import TypedDict, Union

    Point2D = TypedDict('Point2D', {'x': int, 'y': int})
    Point3D = TypedDict('Point3D', {'x': int, 'y': int, 'z': int})

    def parse_point(point_json: dict) -> Union[Point2D, Point3D]:
        match point_json:
            case {'x': int(x), 'y': int(y)}:
                return Point2D({'x': x, 'y': y})
            case {'x': int(x), 'y': int(y), 'z': int(z)}:
                return Point3D({'x': x, 'y': y, 'z': z})
            case _:
                raise ValueError(f'not a valid point: {point_json!r}')

2. Loose parsing, discarding unknown data.

Common when reading JSON-like data when it's not necessary to output it again later.


    from typing import Dict, TypedDict

    TodoItem_ReadOnly = TypedDict('TodoItem_ReadOnly', {'title': str, 'completed': bool})

    def parse_todo_item(todo_item_json: Dict) -> TodoItem_ReadOnly:
        match todo_item_json:
            case {'title': str(title), 'completed': bool(completed), **_}:
                return TodoItem_ReadOnly({'title': title, 'completed': completed})
            case _:
                raise ValueError()

    input = {'title': 'Buy groceries', 'completed': True, 'assigned_to': ['me']}
    print(parse_todo_item(input))  # prints: {'title': 'Buy groceries', 'completed': True}


3. Loose parsing, preserving unknown data.

Common when parsing JSON-like data when it needs to be round-tripped and output again later.


    from typing import Any, Dict, TypedDict

    TodoItem_ReadWrite = TypedDict(
        'TodoItem_ReadWrite',
        {'title': str, 'completed': bool, 'extra': Dict[str, Any]},
    )

    def parse_todo_item(todo_item_json: Dict) -> TodoItem_ReadWrite:
        match todo_item_json:
            case {'title': str(title), 'completed': bool(completed), **extra}:
                return TodoItem_ReadWrite({'title': title, 'completed': completed, 'extra': extra})
            case _:
                raise ValueError()

    def format_todo_item(item: TodoItem_ReadWrite) -> Dict:
        return {'title': item['title'], 'completed': item['completed'], **item['extra']}

    input = {'title': 'Buy groceries', 'completed': True, 'assigned_to': ['me']}
    output = format_todo_item(parse_todo_item(input))
    print(output)  # prints: {'title': 'Buy groceries', 'completed': True, 'assigned_to': ['me']}



Comments?

--
David Foster | Seattle, WA, USA
Contributor to TypedDict support for mypy
_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/ZPUVT7AF67VKNLSSGUHOBIM5F46ZEE77/
Code of Conduct: http://python.org/psf/codeofconduct/
