[ https://issues.apache.org/jira/browse/ARROW-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735913#comment-16735913 ]
Antoine Pitrou commented on ARROW-4176:
---------------------------------------
Note that this is pretty much already available in the stdlib, via difflib:
{code:python}
>>> import difflib
>>> import pyarrow as pa
>>> schema1 = pa.schema([('changed_column', pa.int8())])
>>> schema2 = pa.schema([('changed_column', pa.float64()),
...                      ('additional_column', pa.float64())])
>>> # Diff the line-by-line string representations of the two schemas.
>>> list(difflib.unified_diff(str(schema1).splitlines(),
...                           str(schema2).splitlines()))
['--- \n',
 '+++ \n',
 '@@ -1 +1,2 @@\n',
 '-changed_column: int8',
 '+changed_column: double',
 '+additional_column: double']
{code}
I expect pytest and unittest to do the diffing automatically when an equality
comparison fails.
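
If a standalone helper is still wanted, the difflib approach could be wrapped roughly as below. This is only a sketch: the name schema_diff is hypothetical (it is not an existing pyarrow function), and it assumes the schema's str() output lists one field per line. Plain strings stand in for str(schema1) and str(schema2) so the snippet runs without pyarrow installed.

```python
import difflib
from itertools import islice


def schema_diff(left, right):
    """Return a compact line diff of the string forms of two objects.

    Hypothetical helper, not part of pyarrow. It accepts anything whose
    str() is line-oriented text, such as a pyarrow.Schema.
    """
    left_lines = str(left).splitlines()
    right_lines = str(right).splitlines()
    # lineterm="" keeps the generated header lines free of trailing
    # newlines; n=0 drops unchanged context so only differences remain.
    diff = difflib.unified_diff(left_lines, right_lines, lineterm="", n=0)
    # Skip the two file header lines ('---' and '+++') by position, then
    # drop the '@@' hunk markers. Content lines always start with '+',
    # '-' or ' ', so they can never be mistaken for a hunk marker.
    body = [line for line in islice(diff, 2, None)
            if not line.startswith("@@")]
    return "\n".join(body)


# Plain strings standing in for str(schema1) and str(schema2).
old = "changed_column: int8"
new = "changed_column: double\nadditional_column: double"
print(schema_diff(old, new))
```

Identical inputs produce an empty string, so a test can assert on the returned diff directly or include it in an assertion message.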
> [C++/Python] Human readable arrow schema comparison
> ---------------------------------------------------
>
> Key: ARROW-4176
> URL: https://issues.apache.org/jira/browse/ARROW-4176
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Florian Jetter
> Priority: Minor
>
> When working with Arrow schemas, it would be helpful to have a human-readable
> representation of the diff between two schemas.
> This could be either exposed as a function returning a string/diff object or
> via a function raising an Exception with this information.
> For instance:
> {code}
> schema_diff = get_schema_diff(schema1, schema2)
> expected_diff = """
> - col_changed: int8
> + col_changed: double
> + col_additional: int8
> """
> assert schema_diff == expected_diff
> {code}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)