Take as an example a function designed to process a tree of nodes similar to 
that which might be output by a JSON parser. There are 4 types of node:

- A node representing JSON strings
- A node representing JSON numbers
- A node representing JSON arrays
- A node representing JSON dictionaries

The function transforms a tree of nodes, beginning at the root node, and 
proceeding recursively through each child node in turn. The result is a Python 
object, with the following transformation applied to each node type:

- A JSON string `->` Python `str`
- A JSON number `->` Python `float`
- A JSON array `->` Python `list`
- A JSON dictionary `->` Python `dict`

I have implemented this function using 3 different approaches:

- The visitor pattern
- `isinstance` checks against the node type
- Pattern matching

Here is the implementation using the visitor pattern:

```
from typing import List, Tuple

class NodeVisitor:
    def visit_string_node(self, node: StringNode):
        pass

    def visit_number_node(self, node: NumberNode):
        pass

    def visit_list_node(self, node: ListNode):
        pass

    def visit_dict_node(self, node: DictNode):
        pass


class Node:
    def visit(visitor: NodeVisitor):
        raise NotImplementedError()


class StringNode(Node):
    value: str

    def visit(self, visitor: NodeVisitor):
        visitor.visit_string_node(self)


class NumberNode(Node):
    value: str

    def visit(self, visitor: NodeVisitor):
        visitor.visit_number_node(self)


class ListNode(Node):
    children: List[Node]

    def visit(self, visitor: NodeVisitor):
        visitor.visit_list_node(self)


class DictNode(Node):
    children: List[Tuple[str, Node]]

    def visit(self, visitor: NodeVisitor):
        visitor.visit_dict_node(self)


class Processor(NodeVisitor):
    def process(root_node: Node):
        return root_node.visit(self)

    def visit_string_node(self, node: StringNode):
        return node.value

    def visit_number_node(self, node: NumberNode):
        return float(node.value)

    def visit_list_node(self, node: ListNode):
        return [child_node.visit(self) for child_node in node.children]

    def visit_dict_node(self, node: DictNode):
        return {key: child_node.visit(self) for key, child_node in 
node.children}


def process(root_node: Node):
    processor = Processor()
    return processor.process(root_node)
```

Here is the implementation using `isinstance` checks against the node type:

```
from typing import List, Tuple

class Node:
    pass


class StringNode(Node):
    value: str


class NumberNode(Node):
    value: str


class ListNode(Node):
    children: List[Node]


class DictNode(Node):
    children: List[Tuple[str, Node]]


def process(root_node: Node):
    def process_node(node: Node):
        if isinstance(node, StringNode):
            return node.value
        elif isinstance(node, NumberNode):
            return float(node.value)
        elif isinstance(node, ListNode):
            return [process_node(child_node) for child_node in node.children]
        elif isinstance(node, DictNode):
            return {key: process_node(child_node) for key, child_node in 
node.children}
        else:
            raise Exception('Unexpected node')

    return process_node(root_node)
```

Finally here is the implementation using pattern matching:

```
from typing import List, Tuple

class Node:
    pass


class StringNode(Node):
    value: str


class NumberNode(Node):
    value: str


class ListNode(Node):
    children: List[Node]


class DictNode(Node):
    children: List[Tuple[str, Node]]


def process(root_node: Node):
    def process_node(node: Node):
        match node:
            case StringNode(value=str_value):
                return str_value
            case NumberNode(value=number_value):
                return float(number_value)
            case ListNode(children=child_nodes):
                return [process_node(child_node) for child_node in child_nodes]
            case DictNode(children=child_nodes):
                return {key: process_node(child_node) for key, child_node in 
child_nodes}
            case _:
                raise Exception('Unexpected node')

    return process_node(root_node)
```

Here are the lengths of the different implementations:

- Pattern matching `->` 37 lines
- `isinstance` checks `->` 36 lines
- The visitor pattern `->` 69 lines

The visitor pattern implementation is by far the most verbose solution, 
weighing in at almost twice the length of the alternative implementations due 
to the large amount of boilerplate that is necessary to achieve double 
dispatch. The pattern matching and `isinstance` check implementations are very 
similar in length for this trivial example.

In each implementation, there are 2 operations performed on each node.

- Determine the type of the node
- Destructure the node to extract the desired data

The visitor pattern and `isinstance` check implementations separate these 2 
operations, whereas the pattern matching approach combines the operations 
together. I believe that it is the declarative nature of pattern matching, 
where the operations of determining the type of the node and destructuring the 
node are combined into a single clause, which allows pattern matching to 
express a concise solution to the problem. In this trivial example, the 
advantage of pattern matching over the alternative of using a sequence of 
`if`-`elif`-`else` statements is not as obvious as it would be when compared to 
a more complex example, where a sub-tree of nodes might be matched based on 
their type and be destructured in a single clause.

I have seen elsewhere an argument that pattern matching should not be accepted 
into Python as it introduces a pseudo-DSL that is separate from the rest of the 
language. I agree that pattern matching might be viewed as a pseudo-DSL, but I 
believe that it is a good thing, if it allows the solution to certain classes 
of problems to be expressed in a concise manner. People often raise similar 
objections to operator overloading in other languages, whereas the presence of 
operator overloading in Python allows mathematical expressions involving custom 
numeric types such as vectors to be expressed in a natural way. Furthermore, 
Python has a regular expression module which implements it's own DSL for the 
purpose of matching string patterns. Regular expressions, in a similar way to 
pattern matching, allow string patterns to be expressed in a concise and 
declarative manner.

I really hope that the Steering Council accepts pattern matching into Python. I 
think that it allows for processing of heterogeneous graphs of objects using 
recursion in a concise, declarative manner. I would like to thank the authors 
of the Structural Pattern Matching PEP for their hard work in designing this 
feature and developing an implementation of it. I believe that it will be a 
wonderful addition to the language that I am very much looking forward to using.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7FOJ75QWQ5DFZIHLZFDAZWHXAJXLU3NO/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to