[ https://issues.apache.org/jira/browse/ARROW-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248868#comment-16248868 ]

ASF GitHub Bot commented on ARROW-1763:
---------------------------------------

xhochy closed pull request #1308: ARROW-1763: [Python] Implement __hash__ for DataType
URL: https://github.com/apache/arrow/pull/1308

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/python/pyarrow/tests/test_types.py b/python/pyarrow/tests/test_types.py
index 0e3ea1fd4..9eefa33b6 100644
--- a/python/pyarrow/tests/test_types.py
+++ b/python/pyarrow/tests/test_types.py
@@ -137,3 +137,27 @@ def test_is_temporal_date_time_timestamp():
 def test_timestamp_type():
     # See ARROW-1683
     assert isinstance(pa.timestamp('ns'), pa.TimestampType)
+
+
+def test_types_hashable():
+    types = [
+        pa.null(),
+        pa.int32(),
+        pa.time32('s'),
+        pa.time64('us'),
+        pa.date32(),
+        pa.timestamp('us'),
+        pa.string(),
+        pa.binary(),
+        pa.binary(10),
+        pa.list_(pa.int32()),
+        pa.struct([pa.field('a', pa.int32()),
+                   pa.field('b', pa.int8()),
+                   pa.field('c', pa.string())])
+    ]
+
+    in_dict = {}
+    for i, type_ in enumerate(types):
+        assert hash(type_) == hash(type_)
+        in_dict[type_] = i
+        assert in_dict[type_] == i
diff --git a/python/pyarrow/types.pxi b/python/pyarrow/types.pxi
index d2e68ff79..edf0d8a30 100644
--- a/python/pyarrow/types.pxi
+++ b/python/pyarrow/types.pxi
@@ -69,6 +69,9 @@ cdef class DataType:
             )
         return frombytes(self.type.ToString())
 
+    def __hash__(self):
+        return hash(str(self))
+
     def __reduce__(self):
         return self.__class__, (), self.__getstate__()
 


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Python] DataType should be hashable
> ------------------------------------
>
>                 Key: ARROW-1763
>                 URL: https://issues.apache.org/jira/browse/ARROW-1763
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: Jeff Reback
>            Assignee: Wes McKinney
>              Labels: pull-request-available
>             Fix For: 0.8.0
>
>
> DataType objects could then be used as dictionary keys, for example. xref
> https://github.com/ibis-project/ibis/pull/1194#discussion_r148493472
> {code}
> In [1]: import pyarrow as pa
> In [2]: pa.__version__
> Out[2]: '0.7.1'
> In [3]: pa.int8()
> Out[3]: DataType(int8)
> In [4]: hash(pa.int8())
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-4-bca0e6e2f6af> in <module>()
> ----> 1 hash(pa.int8())
> TypeError: unhashable type: 'pyarrow.lib.DataType'
> {code}
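
For illustration, the string-based hashing used in the patch can be sketched
without pyarrow. The class StubType below is a hypothetical stand-in, not the
real DataType, and its __eq__ is an assumption added so that equality and
hashing agree; the patch itself only adds __hash__. In Python 3, a plain class
that defines __eq__ without __hash__ is unhashable, so the sketch defines both.

```python
class StubType:
    """Hypothetical stand-in for a data type object (not pyarrow's class)."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __eq__(self, other):
        # Assumption for this sketch: two types are equal when their
        # string forms match, so equal objects always hash equally.
        if not isinstance(other, StubType):
            return NotImplemented
        return str(self) == str(other)

    def __hash__(self):
        # Same idea as the patch: hash the string representation.
        return hash(str(self))


# Types can now serve as dictionary keys.
lookup = {StubType("int8"): "tinyint", StubType("string"): "varchar"}

# A freshly constructed but equal instance finds the same entry.
result = lookup[StubType("int8")]
```

Hashing the string form is simple, but it relies on the string being a
faithful description of the type: two types that print identically will
share a hash, and under the equality assumed here they would also be
treated as the same key.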



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
