Joris Van den Bossche created ARROW-6749:
--------------------------------------------

             Summary: [Python] Conversion of non-ns timestamp array to numpy gives wrong values
                 Key: ARROW-6749
                 URL: https://issues.apache.org/jira/browse/ARROW-6749
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
            Reporter: Joris Van den Bossche


{code}
In [25]: np_arr = np.arange("2012-01-01", "2012-01-06", int(1e6)*60*60*24, dtype="datetime64[us]")

In [26]: np_arr
Out[26]: 
array(['2012-01-01T00:00:00.000000', '2012-01-02T00:00:00.000000',
       '2012-01-03T00:00:00.000000', '2012-01-04T00:00:00.000000',
       '2012-01-05T00:00:00.000000'], dtype='datetime64[us]')

In [27]: arr = pa.array(np_arr)

In [28]: arr
Out[28]: 
<pyarrow.lib.TimestampArray object at 0x7f0b2ef07ee8>
[
  2012-01-01 00:00:00.000000,
  2012-01-02 00:00:00.000000,
  2012-01-03 00:00:00.000000,
  2012-01-04 00:00:00.000000,
  2012-01-05 00:00:00.000000
]

In [29]: arr.type
Out[29]: TimestampType(timestamp[us])

In [30]: arr.to_numpy()
Out[30]: 
array(['1970-01-16T08:09:36.000000000', '1970-01-16T08:11:02.400000000',
       '1970-01-16T08:12:28.800000000', '1970-01-16T08:13:55.200000000',
       '1970-01-16T08:15:21.600000000'], dtype='datetime64[ns]')
{code}

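So it seems that to_numpy() simply reinterprets the integer microsecond values
as nanoseconds when converting to numpy, without rescaling them by 1000. The
resulting dates therefore land in January 1970 instead of January 2012.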
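
The arithmetic behind the wrong values can be checked with numpy alone. The
following is only a sketch of the suspected mechanism, not the actual pyarrow
code path: it takes the raw int64 microsecond counts and views them as
datetime64[ns], which reproduces the first two values from Out[30]. The
variable names (us, raw, wrong, right) are made up for this illustration.

```python
import numpy as np

# Two of the microsecond-resolution timestamps from the report.
us = np.array(["2012-01-01", "2012-01-02"], dtype="datetime64[us]")

# Raw integer payload: microseconds since the Unix epoch.
raw = us.view("int64")

# Reinterpreting the same integers as nanoseconds (no rescaling) shrinks
# every value by a factor of 1000, which is what Out[30] appears to show.
wrong = raw.view("datetime64[ns]")

# A proper unit conversion multiplies the integers by 1000 and keeps the dates.
right = us.astype("datetime64[ns]")

print(wrong)
print(right)
```

The first entry works out as 1325376000000000 us read as ns, i.e. 1325376
seconds after the epoch, which is 15 days, 8 hours, 9 minutes and 36 seconds.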


