hi Bryan,

This is a display bug:
In [6]: arr = pa.array([0, 1, 2], type=pa.timestamp('ns', 'America/Los_Angeles'))

In [7]: arr.view('int64')
Out[7]:
<pyarrow.lib.Int64Array object at 0x7fd1b8aaef30>
[
  0,
  1,
  2
]

In [8]: arr
Out[8]:
<pyarrow.lib.TimestampArray object at 0x7fd1b8aae6e0>
[
  1970-01-01 00:00:00.000000000,
  1970-01-01 00:00:00.000000001,
  1970-01-01 00:00:00.000000002
]

In [9]: arr.to_pandas()
Out[9]:
0      1969-12-31 16:00:00-08:00
1   1969-12-31 16:00:00.000000001-08:00
2   1969-12-31 16:00:00.000000002-08:00
dtype: datetime64[ns, America/Los_Angeles]

The repr of TimestampArray doesn't take the timezone into account:

In [10]: arr[0]
Out[10]: <pyarrow.TimestampScalar: Timestamp('1969-12-31 16:00:00-0800', tz='America/Los_Angeles')>

So if it's incorrect, the problem is happening somewhere before or while the
StructArray is being created. If I had to guess, it's caused by the tzinfo of
the datetime.datetime values not being handled the way it was before.

On Sun, Jul 19, 2020 at 5:19 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> Well this is not good and pretty disappointing given that we had nearly a
> month to sort through the implications of Micah's patch. We should try to
> resolve this ASAP.
>
> On Sun, Jul 19, 2020 at 5:10 PM Bryan Cutler <cutl...@gmail.com> wrote:
>>
>> +0 (non-binding)
>>
>> I ran the verification script for binaries and then source, as below, and
>> both look good:
>>
>> ARROW_TMPDIR=/tmp/arrow-test TEST_DEFAULT=0 TEST_SOURCE=1 TEST_CPP=1
>> TEST_PYTHON=1 TEST_JAVA=1 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1
>> dev/release/verify-release-candidate.sh source 1.0.0 1
>>
>> I tried to patch Spark locally to verify the recent change in nested
>> timestamps and was not able to get things working quite right, but I'm not
>> sure if the problem is in Spark, Arrow or my patch - hence my vote of +0.
>>
>> Here is what I'm seeing:
>>
>> ```
>> (Input as datetime)
>> datetime.datetime(2018, 3, 10, 0, 0)
>> datetime.datetime(2018, 3, 15, 0, 0)
>>
>> (Struct Array)
>> -- is_valid: all not null
>> -- child 0 type: timestamp[us, tz=America/Los_Angeles]
>>   [
>>     2018-03-10 00:00:00.000000,
>>     2018-03-10 00:00:00.000000
>>   ]
>> -- child 1 type: timestamp[us, tz=America/Los_Angeles]
>>   [
>>     2018-03-15 00:00:00.000000,
>>     2018-03-15 00:00:00.000000
>>   ]
>>
>> (Flattened Arrays)
>> types [TimestampType(timestamp[us, tz=America/Los_Angeles]),
>>        TimestampType(timestamp[us, tz=America/Los_Angeles])]
>> [<pyarrow.lib.TimestampArray object at 0x7ffbbd88f520>
>> [
>>   2018-03-10 00:00:00.000000,
>>   2018-03-10 00:00:00.000000
>> ], <pyarrow.lib.TimestampArray object at 0x7ffba958be50>
>> [
>>   2018-03-15 00:00:00.000000,
>>   2018-03-15 00:00:00.000000
>> ]]
>>
>> (Pandas Conversion)
>> [
>> 0   2018-03-09 16:00:00-08:00
>> 1   2018-03-09 16:00:00-08:00
>> dtype: datetime64[ns, America/Los_Angeles],
>>
>> 0   2018-03-14 17:00:00-07:00
>> 1   2018-03-14 17:00:00-07:00
>> dtype: datetime64[ns, America/Los_Angeles]]
>> ```
>>
>> Based on the output of an existing, correct timestamp udf, it looks like
>> the pyarrow StructArray values are wrong, and that's carried through the
>> flattened arrays, causing the Pandas values to have a negative offset.
>>
>> Here is output from a working udf with a timestamp; the pyarrow Array
>> displays in UTC time, I believe.
>>
>> ```
>> (Timestamp Array)
>> type timestamp[us, tz=America/Los_Angeles]
>> [
>>   [
>>     1969-01-01 09:01:01.000000
>>   ]
>> ]
>>
>> (Pandas Conversion)
>> 0   1969-01-01 01:01:01-08:00
>> Name: _0, dtype: datetime64[ns, America/Los_Angeles]
>>
>> (Timezone Localized)
>> 0   1969-01-01 01:01:01
>> Name: _0, dtype: datetime64[ns]
>> ```
>>
>> I'll have to dig in further at another time and debug where the values go
>> wrong.
>>
>> On Sat, Jul 18, 2020 at 9:51 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>
>> > +1 (binding)
>> >
>> > Ran wheel and binary tests on Ubuntu 19.04.
>> >
>> > On Fri, Jul 17, 2020 at 2:25 PM Neal Richardson <neal.p.richard...@gmail.com> wrote:
>> >
>> > > +1 (binding)
>> > >
>> > > In addition to the usual verification on
>> > > https://github.com/apache/arrow/pull/7787, I've successfully staged the
>> > > R binary artifacts on Windows
>> > > (https://github.com/r-windows/rtools-packages/pull/126), macOS
>> > > (https://github.com/autobrew/homebrew-core/pull/12), and Linux
>> > > (https://github.com/ursa-labs/arrow-r-nightly/actions/runs/172977277)
>> > > using the release candidate.
>> > >
>> > > And I agree with the judgment about skipping a JS release artifact. Looks
>> > > like there hasn't been a code change since October, so there's no point.
>> > >
>> > > Neal
>> > >
>> > > On Fri, Jul 17, 2020 at 10:37 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> > >
>> > > > I see the JS failures as well. I think it is a failure localized to
>> > > > newer Node versions, since our JavaScript CI works fine. I don't think
>> > > > it should block the release given the lack of development activity in
>> > > > JavaScript [1] -- if any JS devs are concerned about publishing an
>> > > > artifact then we can skip pushing it to NPM.
>> > > >
>> > > > @Ryan it seems it may be something environment-related on your
>> > > > machine; I'm on Ubuntu 18.04 and have not seen this.
>> > > >
>> > > > On
>> > > >
>> > > > > * Python 3.8 wheel's tests failed. 3.5, 3.6 and 3.7
>> > > > > passed. It seems that -larrow and -larrow_python for
>> > > > > Cython failed.
>> > > >
>> > > > I suspect this is related to
>> > > > https://github.com/apache/arrow/commit/120c21f4bf66d2901b3a353a1f67bac3c3355924#diff-0f69784b44040448d17d0e4e8a641fe8,
>> > > > but I don't think it's a blocking issue.
>> > > >
>> > > > [1]: https://github.com/apache/arrow/commits/master/js
>> > > >
>> > > > On Fri, Jul 17, 2020 at 9:42 AM Ryan Murray <rym...@dremio.com> wrote:
>> > > > >
>> > > > > I've tested Java and it looks good. However, the verify script keeps on
>> > > > > bailing with protobuf-related errors:
>> > > > > 'cpp/build/orc_ep-prefix/src/orc_ep-build/c++/src/orc_proto.pb.cc' and
>> > > > > friends can't find protobuf definitions. A bit odd, as cmake can see the
>> > > > > protobuf headers and builds directly off master work just fine. Has
>> > > > > anyone else experienced this? I am on Ubuntu 18.04.
>> > > > >
>> > > > > On Fri, Jul 17, 2020 at 10:49 AM Antoine Pitrou <anto...@python.org> wrote:
>> > > > >
>> > > > > >
>> > > > > > +1 (binding). I tested on Ubuntu 18.04.
>> > > > > >
>> > > > > > * Wheels verification went fine.
>> > > > > > * Source verification went fine with CUDA enabled and
>> > > > > >   TEST_INTEGRATION_JS=0 TEST_JS=0.
>> > > > > >
>> > > > > > I didn't test the binaries.
>> > > > > >
>> > > > > > Regards
>> > > > > >
>> > > > > > Antoine.
>> > > > > >
>> > > > > >
>> > > > > > On 17/07/2020 at 03:41, Krisztián Szűcs wrote:
>> > > > > > > Hi,
>> > > > > > >
>> > > > > > > I would like to propose the second release candidate (RC1) of Apache
>> > > > > > > Arrow version 1.0.0.
>> > > > > > > This is a major release consisting of 826 resolved JIRA issues [1].
>> > > > > > >
>> > > > > > > The verification of the first release candidate (RC0) failed [0], and
>> > > > > > > the packaging scripts were unable to produce two wheels. Compared
>> > > > > > > to RC0, this release candidate includes additional patches for the
>> > > > > > > following bugs: ARROW-9506, ARROW-9504, ARROW-9497,
>> > > > > > > ARROW-9500, ARROW-9499.
>> > > > > > >
>> > > > > > > This release candidate is based on commit:
>> > > > > > > bc0649541859095ee77d03a7b891ea8d6e2fd641 [2]
>> > > > > > >
>> > > > > > > The source release rc1 is hosted at [3].
>> > > > > > > The binary artifacts are hosted at [4][5][6][7].
>> > > > > > > The changelog is located at [8].
>> > > > > > >
>> > > > > > > Please download, verify checksums and signatures, run the unit tests,
>> > > > > > > and vote on the release. See [9] for how to validate a release candidate.
>> > > > > > >
>> > > > > > > The vote will be open for at least 72 hours.
>> > > > > > >
>> > > > > > > [ ] +1 Release this as Apache Arrow 1.0.0
>> > > > > > > [ ] +0
>> > > > > > > [ ] -1 Do not release this as Apache Arrow 1.0.0 because...
>> > > > > > >
>> > > > > > > [0]: https://github.com/apache/arrow/pull/7778#issuecomment-659065370
>> > > > > > > [1]: https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20%28Resolved%2C%20Closed%29%20AND%20fixVersion%20%3D%201.0.0
>> > > > > > > [2]: https://github.com/apache/arrow/tree/bc0649541859095ee77d03a7b891ea8d6e2fd641
>> > > > > > > [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-1.0.0-rc1
>> > > > > > > [4]: https://bintray.com/apache/arrow/centos-rc/1.0.0-rc1
>> > > > > > > [5]: https://bintray.com/apache/arrow/debian-rc/1.0.0-rc1
>> > > > > > > [6]: https://bintray.com/apache/arrow/python-rc/1.0.0-rc1
>> > > > > > > [7]: https://bintray.com/apache/arrow/ubuntu-rc/1.0.0-rc1
>> > > > > > > [8]: https://github.com/apache/arrow/blob/bc0649541859095ee77d03a7b891ea8d6e2fd641/CHANGELOG.md
>> > > > > > > [9]: https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
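To make it easier to poke at this outside of Spark, here's a minimal, untested
sketch of the kind of check I'd run against the RC (only pyarrow and the
standard library; the 'start'/'end' field names are made up for the example).
Per the Arrow format, a timestamp column with a timezone stores its values as
offsets from the UTC epoch, so the int64 view shows what actually got stored,
independent of how the repr prints it:

```python
import datetime
import pyarrow as pa

# Build a struct of tz-aware timestamp children directly from naive
# datetime.datetime values, roughly mirroring Bryan's input above
# (field names are invented for the example).
ts_type = pa.timestamp('us', tz='America/Los_Angeles')
struct = pa.array(
    [{'start': datetime.datetime(2018, 3, 10, 0, 0),
      'end': datetime.datetime(2018, 3, 15, 0, 0)}],
    type=pa.struct([('start', ts_type), ('end', ts_type)]))

for child in struct.flatten():
    print(child)                # repr: printed without applying the timezone
    print(child.view('int64'))  # raw storage: microseconds relative to the UTC epoch
    print(child.to_pandas())    # pandas: localized to America/Los_Angeles
```

If the int64 values match what the working, non-nested udf path produces, then
only the display/pandas conversion is at fault; if they differ, the naive
datetime -> tz-aware struct conversion is where the tzinfo handling changed.
Spark's path goes through pandas, I believe, so this may not exercise exactly
the same code as Bryan's test.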