hi Bryan,

This is a display bug:

In [6]: arr = pa.array([0, 1, 2], type=pa.timestamp('ns',
'America/Los_Angeles'))

In [7]: arr.view('int64')
Out[7]:
<pyarrow.lib.Int64Array object at 0x7fd1b8aaef30>
[
  0,
  1,
  2
]

In [8]: arr
Out[8]:
<pyarrow.lib.TimestampArray object at 0x7fd1b8aae6e0>
[
  1970-01-01 00:00:00.000000000,
  1970-01-01 00:00:00.000000001,
  1970-01-01 00:00:00.000000002
]

In [9]: arr.to_pandas()
Out[9]:
0             1969-12-31 16:00:00-08:00
1   1969-12-31 16:00:00.000000001-08:00
2   1969-12-31 16:00:00.000000002-08:00
dtype: datetime64[ns, America/Los_Angeles]

The repr of TimestampArray doesn't take the timezone into account (the scalar repr does):

In [10]: arr[0]
Out[10]: <pyarrow.TimestampScalar: Timestamp('1969-12-31
16:00:00-0800', tz='America/Los_Angeles')>

So if the result is incorrect, the problem is happening somewhere before or
while the StructArray is being created. If I had to guess, it's caused by the
tzinfo of the datetime.datetime values not being handled the way it was
before.
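
To make that suspicion concrete, here's a minimal sketch of what I would check
(my own example, not from Bryan's run; the "expected" values are my assumption,
based on timestamps with a tz being stored as UTC and on naive datetimes
appearing to be accepted for tz-annotated types, as in Bryan's struct example):

```python
import datetime

import pytz
import pyarrow as pa

la = pytz.timezone('America/Los_Angeles')
typ = pa.timestamp('us', 'America/Los_Angeles')

# 2018-03-10 is still PST (UTC-8), so the aware value below corresponds to
# 2018-03-10 08:00:00 UTC.
dt_aware = la.localize(datetime.datetime(2018, 3, 10, 0, 0))
dt_naive = datetime.datetime(2018, 3, 10, 0, 0)

aware_arr = pa.array([dt_aware], type=typ)
naive_arr = pa.array([dt_naive], type=typ)

# The int64 storage should hold microseconds since the epoch in UTC; if the
# aware and naive inputs produce different storage values, that shows how
# (or whether) the tzinfo is applied during conversion.
print(aware_arr.view('int64'))
print(naive_arr.view('int64'))

# to_pandas() should localize back to America/Los_Angeles for display.
print(aware_arr.to_pandas())
print(naive_arr.to_pandas())
```

If the flat-array path here behaves differently from the StructArray path in
Bryan's udf output, that would narrow down where the tzinfo gets dropped.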

On Sun, Jul 19, 2020 at 5:19 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> Well this is not good and pretty disappointing given that we had nearly a 
> month to sort through the implications of Micah’s patch. We should try to 
> resolve this ASAP
>
> On Sun, Jul 19, 2020 at 5:10 PM Bryan Cutler <cutl...@gmail.com> wrote:
>>
>> +0 (non-binding)
>>
>> I ran the verification script for binaries and then source, as below, and
>> both look good:
>> ARROW_TMPDIR=/tmp/arrow-test TEST_DEFAULT=0 TEST_SOURCE=1 TEST_CPP=1
>> TEST_PYTHON=1 TEST_JAVA=1 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1
>> dev/release/verify-release-candidate.sh source 1.0.0 1
>>
>> I tried to patch Spark locally to verify the recent change in nested
>> timestamps and was not able to get things working quite right, but I'm not
>> sure if the problem is in Spark, Arrow or my patch - hence my vote of +0.
>>
>> Here is what I'm seeing
>>
>> ```
>> (Input as datetime)
>> datetime.datetime(2018, 3, 10, 0, 0)
>> datetime.datetime(2018, 3, 15, 0, 0)
>>
>> (Struct Array)
>> -- is_valid: all not null
>> -- child 0 type: timestamp[us, tz=America/Los_Angeles]
>>   [
>>     2018-03-10 00:00:00.000000,
>>     2018-03-10 00:00:00.000000
>>   ]
>> -- child 1 type: timestamp[us, tz=America/Los_Angeles]
>>   [
>>     2018-03-15 00:00:00.000000,
>>     2018-03-15 00:00:00.000000
>>   ]
>>
>> (Flattened Arrays)
>> types [TimestampType(timestamp[us, tz=America/Los_Angeles]),
>> TimestampType(timestamp[us, tz=America/Los_Angeles])]
>> [<pyarrow.lib.TimestampArray object at 0x7ffbbd88f520>
>> [
>>   2018-03-10 00:00:00.000000,
>>   2018-03-10 00:00:00.000000
>> ], <pyarrow.lib.TimestampArray object at 0x7ffba958be50>
>> [
>>   2018-03-15 00:00:00.000000,
>>   2018-03-15 00:00:00.000000
>> ]]
>>
>> (Pandas Conversion)
>> [
>> 0   2018-03-09 16:00:00-08:00
>> 1   2018-03-09 16:00:00-08:00
>> dtype: datetime64[ns, America/Los_Angeles],
>>
>> 0   2018-03-14 17:00:00-07:00
>> 1   2018-03-14 17:00:00-07:00
>> dtype: datetime64[ns, America/Los_Angeles]]
>> ```
>>
>> Based on the output of an existing, correct timestamp udf, it looks like
>> the pyarrow Struct Array values are wrong, and that's carried through the
>> flattened arrays, causing the Pandas values to have a negative offset.
>>
>> Here is the output from a working udf with a timestamp; the pyarrow Array
>> displays in UTC time, I believe.
>>
>> ```
>> (Timestamp Array)
>> type timestamp[us, tz=America/Los_Angeles]
>> [
>>   [
>>     1969-01-01 09:01:01.000000
>>   ]
>> ]
>>
>> (Pandas Conversion)
>> 0   1969-01-01 01:01:01-08:00
>> Name: _0, dtype: datetime64[ns, America/Los_Angeles]
>>
>> (Timezone Localized)
>> 0   1969-01-01 01:01:01
>> Name: _0, dtype: datetime64[ns]
>> ```
>>
>> I'll have to dig in further at another time and debug where the values go
>> wrong.
>>
>> On Sat, Jul 18, 2020 at 9:51 PM Micah Kornfield <emkornfi...@gmail.com>
>> wrote:
>>
>> > +1 (binding)
>> >
>> > Ran wheel and binary tests on ubuntu 19.04
>> >
>> > > On Fri, Jul 17, 2020 at 2:25 PM Neal Richardson
>> > > <neal.p.richard...@gmail.com> wrote:
>> >
>> > > +1 (binding)
>> > >
>> > > In addition to the usual verification on
>> > > https://github.com/apache/arrow/pull/7787, I've successfully staged the
>> > > R binary artifacts on Windows
>> > > (https://github.com/r-windows/rtools-packages/pull/126), macOS
>> > > (https://github.com/autobrew/homebrew-core/pull/12), and Linux
>> > > (https://github.com/ursa-labs/arrow-r-nightly/actions/runs/172977277)
>> > > using the release candidate.
>> > >
>> > > And I agree with the judgment about skipping a JS release artifact. Looks
>> > > like there hasn't been a code change since October so there's no point.
>> > >
>> > > Neal
>> > >
>> > > On Fri, Jul 17, 2020 at 10:37 AM Wes McKinney <wesmck...@gmail.com>
>> > wrote:
>> > >
>> > > > I see the JS failures as well. I think it is a failure localized to
>> > > > newer Node versions since our JavaScript CI works fine. I don't think
>> > > > it should block the release given the lack of development activity in
>> > > > JavaScript [1] -- if any JS devs are concerned about publishing an
>> > > > artifact then we can skip pushing it to NPM
>> > > >
>> > > > @Ryan it seems it may be something environment-related on your
>> > > > machine; I'm on Ubuntu 18.04 and have not seen this.
>> > > >
>> > > > On
>> > > >
>> > > > >   * The Python 3.8 wheel's tests failed. 3.5, 3.6 and 3.7
>> > > > >     passed. It seems that -larrow and -larrow_python for
>> > > > >     Cython failed.
>> > > >
>> > > > I suspect this is related to
>> > > > https://github.com/apache/arrow/commit/120c21f4bf66d2901b3a353a1f67bac3c3355924#diff-0f69784b44040448d17d0e4e8a641fe8,
>> > > > but I don't think it's a blocking issue
>> > > >
>> > > > [1]: https://github.com/apache/arrow/commits/master/js
>> > > >
>> > > > On Fri, Jul 17, 2020 at 9:42 AM Ryan Murray <rym...@dremio.com> wrote:
>> > > > >
>> > > > > I've tested Java and it looks good. However, the verify script keeps
>> > > > > on bailing with protobuf-related errors:
>> > > > > 'cpp/build/orc_ep-prefix/src/orc_ep-build/c++/src/orc_proto.pb.cc'
>> > > > > and friends can't find protobuf definitions. A bit odd, as cmake can
>> > > > > see the protobuf headers and builds directly off master work just
>> > > > > fine. Has anyone else experienced this? I am on Ubuntu 18.04.
>> > > > >
>> > > > > On Fri, Jul 17, 2020 at 10:49 AM Antoine Pitrou <anto...@python.org>
>> > > > wrote:
>> > > > >
>> > > > > >
>> > > > > > +1 (binding).  I tested on Ubuntu 18.04.
>> > > > > >
>> > > > > > * Wheels verification went fine.
>> > > > > > * Source verification went fine with CUDA enabled and
>> > > > > > TEST_INTEGRATION_JS=0 TEST_JS=0.
>> > > > > >
>> > > > > > I didn't test the binaries.
>> > > > > >
>> > > > > > Regards
>> > > > > >
>> > > > > > Antoine.
>> > > > > >
>> > > > > >
>> > > > > > On 17/07/2020 at 03:41, Krisztián Szűcs wrote:
>> > > > > > > Hi,
>> > > > > > >
>> > > > > > > I would like to propose the second release candidate (RC1) of
>> > > > > > > Apache Arrow version 1.0.0.
>> > > > > > > This is a major release consisting of 826 resolved JIRA issues[1].
>> > > > > > >
>> > > > > > > The verification of the first release candidate (RC0) has failed
>> > > > > > > [0], and the packaging scripts were unable to produce two wheels.
>> > > > > > > Compared to RC0 this release candidate includes additional patches
>> > > > > > > for the following bugs: ARROW-9506, ARROW-9504, ARROW-9497,
>> > > > > > > ARROW-9500, ARROW-9499.
>> > > > > > >
>> > > > > > > This release candidate is based on commit:
>> > > > > > > bc0649541859095ee77d03a7b891ea8d6e2fd641 [2]
>> > > > > > >
>> > > > > > > The source release rc1 is hosted at [3].
>> > > > > > > The binary artifacts are hosted at [4][5][6][7].
>> > > > > > > The changelog is located at [8].
>> > > > > > >
>> > > > > > > Please download, verify checksums and signatures, run the unit
>> > > > > > > tests, and vote on the release. See [9] for how to validate a
>> > > > > > > release candidate.
>> > > > > > >
>> > > > > > > The vote will be open for at least 72 hours.
>> > > > > > >
>> > > > > > > [ ] +1 Release this as Apache Arrow 1.0.0
>> > > > > > > [ ] +0
>> > > > > > > [ ] -1 Do not release this as Apache Arrow 1.0.0 because...
>> > > > > > >
>> > > > > > > [0]: https://github.com/apache/arrow/pull/7778#issuecomment-659065370
>> > > > > > > [1]: https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20%28Resolved%2C%20Closed%29%20AND%20fixVersion%20%3D%201.0.0
>> > > > > > > [2]: https://github.com/apache/arrow/tree/bc0649541859095ee77d03a7b891ea8d6e2fd641
>> > > > > > > [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-1.0.0-rc1
>> > > > > > > [4]: https://bintray.com/apache/arrow/centos-rc/1.0.0-rc1
>> > > > > > > [5]: https://bintray.com/apache/arrow/debian-rc/1.0.0-rc1
>> > > > > > > [6]: https://bintray.com/apache/arrow/python-rc/1.0.0-rc1
>> > > > > > > [7]: https://bintray.com/apache/arrow/ubuntu-rc/1.0.0-rc1
>> > > > > > > [8]: https://github.com/apache/arrow/blob/bc0649541859095ee77d03a7b891ea8d6e2fd641/CHANGELOG.md
>> > > > > > > [9]: https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
>> > > > > > >
>> > > > > >
>> > > >
>> > >
>> >
