date:20190921

[jira] [Commented] (ARROW-6429) [CI][Crossbow] Nightly spark integration job fails

2019-09-21 Thread Bryan Cutler (Jira)



[ 
https://issues.apache.org/jira/browse/ARROW-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935210#comment-16935210
 ] 

Bryan Cutler commented on ARROW-6429:
-

I believe I need to add a patch so Spark can compile with Arrow Java. I'm 
working on this now.

> [CI][Crossbow] Nightly spark integration job fails
> --
>
> Key: ARROW-6429
> URL: https://issues.apache.org/jira/browse/ARROW-6429
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration
>Reporter: Neal Richardson
>Assignee: Wes McKinney
>Priority: Blocker
>  Labels: nightly, pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> See https://circleci.com/gh/ursa-labs/crossbow/2310. Either fix the job, skip it and 
> create a follow-up Jira to unskip it, or delete the job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (ARROW-6641) [C++] Remove Deprecated WriteableFile warning

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney closed ARROW-6641.
---
Resolution: Duplicate

> [C++] Remove Deprecated WriteableFile warning
> -
>
> Key: ARROW-6641
> URL: https://issues.apache.org/jira/browse/ARROW-6641
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.14.0, 0.14.1
>Reporter: Karthikeyan Natarajan
>Priority: Major
>  Labels: newbie
> Fix For: 0.15.0
>
>
> The current version is 0.14.1. Per the TODO comment below, the deprecated 
> `WriteableFile` alias should be removed.
>  
> {code:java}
> // TODO(kszucs): remove this after 0.13
> #ifndef _MSC_VER
> using WriteableFile ARROW_DEPRECATED("Use WritableFile") = WritableFile;
> using ReadableFileInterface ARROW_DEPRECATED("Use RandomAccessFile") = 
> RandomAccessFile;
> #else
> // MSVC does not like using ARROW_DEPRECATED with using declarations
> using WriteableFile = WritableFile;
> using ReadableFileInterface = RandomAccessFile;
> #endif
> {code}
>  
>  




[jira] [Updated] (ARROW-6641) [C++] Remove Deprecated WriteableFile warning

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-6641:

Fix Version/s: 0.15.0


[jira] [Updated] (ARROW-6641) [C++] Remove Deprecated WriteableFile warning

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-6641:

Summary: [C++] Remove Deprecated WriteableFile warning  (was: Remove 
Deprecated WriteableFile warning)


[jira] [Updated] (ARROW-6648) [Go] Expose the bitutil package

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-6648:

Summary: [Go] Expose the bitutil package  (was: Go: Expose the bitutil 
package)

> [Go] Expose the bitutil package
> ---
>
> Key: ARROW-6648
> URL: https://issues.apache.org/jira/browse/ARROW-6648
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Go
>Reporter: Jonathan A Sternberg
>Priority: Minor
>
> Please allow the {{bitutil}} package to be exposed to external developers. 
> The package provides useful utilities for constructing a bitmap and it is 
> needed if you want to create an external builder implementation that handles 
> null values.
> Thank you.
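
To illustrate the kind of utility the request is about, here is a minimal sketch of a validity bitmap in plain Python. It assumes the usual Arrow convention of one bit per slot, least-significant bit first, with 1 meaning non-null. The function names `build_validity_bitmap` and `is_valid` are hypothetical and are not part of the Go {{bitutil}} package.

```python
def build_validity_bitmap(values):
    """Pack one validity bit per value (1 = non-null), least-significant bit first."""
    bitmap = bytearray((len(values) + 7) // 8)  # round up to whole bytes
    for i, value in enumerate(values):
        if value is not None:
            bitmap[i // 8] |= 1 << (i % 8)
    return bytes(bitmap)


def is_valid(bitmap, i):
    """Return True if slot i is marked non-null in the bitmap."""
    return bool(bitmap[i // 8] & (1 << (i % 8)))


# Slots 0, 2 and 4 hold values; slots 1 and 3 are null.
bitmap = build_validity_bitmap([1, None, 3, None, 5])
```

An external builder would keep such a bitmap next to its value buffer and set one bit per appended value.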




[jira] [Updated] (ARROW-6277) [C++][Parquet] Support reading/writing other Parquet primitive types to DictionaryArray

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-6277:

Fix Version/s: (was: 0.15.0)
   1.0.0

> [C++][Parquet] Support reading/writing other Parquet primitive types to 
> DictionaryArray
> ---
>
> Key: ARROW-6277
> URL: https://issues.apache.org/jira/browse/ARROW-6277
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Benjamin Kietzman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> As a follow-up to ARROW-3246, we should support direct read/write of the other 
> Parquet primitive types. Currently only BYTE_ARRAY is implemented, as it 
> provides the most performance benefit.




[jira] [Commented] (ARROW-6429) [CI][Crossbow] Nightly spark integration job fails

2019-09-21 Thread Wes McKinney (Jira)



[ 
https://issues.apache.org/jira/browse/ARROW-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935173#comment-16935173
 ] 

Wes McKinney commented on ARROW-6429:
-

This may be fixed in master now; I need to confirm.


[jira] [Created] (ARROW-6654) [Python] Consider adding some user-friendly conveniences to Filesystem API

2019-09-21 Thread Wes McKinney (Jira)

Wes McKinney created ARROW-6654:
---

 Summary: [Python] Consider adding some user-friendly conveniences 
to Filesystem API
 Key: ARROW-6654
 URL: https://issues.apache.org/jira/browse/ARROW-6654
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Wes McKinney
 Fix For: 1.0.0


For example, passing a single path string currently raises a TypeError:

{code}
In [12]: lfs.get_target_stats('/home/wesm')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-...> in <module>
----> 1 lfs.get_target_stats('/home/wesm')

~/code/arrow/python/pyarrow/_fs.pyx in pyarrow._fs.FileSystem.get_target_stats()
    239                 check_status(self.fs.GetTargetStats(paths, &stats))
    240         else:
--> 241             raise TypeError('Must pass either paths or a Selector')
    242
    243         return [FileStats.wrap(stat) for stat in stats]

TypeError: Must pass either paths or a Selector
{code}

Some conveniences, such as a {{listdir}} method, would be friendlier to the user.
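
As a rough sketch of what such conveniences could look like, the snippet below uses only the Python standard library. Both helpers are hypothetical illustrations, not pyarrow APIs: `as_path_list` accepts either one path or a list of paths, and `listdir` returns the entry names of a directory.

```python
import os
import tempfile


def as_path_list(path_or_paths):
    """Accept a single path string or a list/tuple of paths; always return a list."""
    if isinstance(path_or_paths, (str, bytes)):
        return [path_or_paths]
    if isinstance(path_or_paths, (list, tuple)):
        return list(path_or_paths)
    raise TypeError('Must pass a path or a list of paths')


def listdir(directory):
    """Return the sorted entry names directly inside `directory` (non-recursive)."""
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries)


# Usage: create a scratch directory with two empty files and list it.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('b.txt', 'a.txt'):
        open(os.path.join(tmp, name), 'w').close()
    names = listdir(tmp)
```

With a normalization step like `as_path_list`, a call with a bare string would be treated as a one-element list instead of raising.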




[jira] [Updated] (ARROW-6501) [C++] Remove non_zero_length field from SparseIndex

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-6501:

Fix Version/s: (was: 0.15.0)
   1.0.0

> [C++] Remove non_zero_length field from SparseIndex
> ---
>
> Key: ARROW-6501
> URL: https://issues.apache.org/jira/browse/ARROW-6501
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Kenta Murata
>Assignee: Kenta Murata
>Priority: Major
> Fix For: 1.0.0
>
>
> We can remove the non_zero_length field from SparseIndex because its value 
> can be derived from the shape of the indices tensor.
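
A small sketch of the reasoning, in plain Python rather than the C++ API. It assumes a COO-style index whose indices tensor has one row per non-zero value and one column per dimension; the helper `indices_shape` is hypothetical.

```python
# COO indices for a 3x4 matrix with non-zero values at (0, 1), (1, 3) and (2, 0).
indices = [
    [0, 1],
    [1, 3],
    [2, 0],
]


def indices_shape(rows):
    """Return (number of rows, number of columns) of a rectangular nested list."""
    return (len(rows), len(rows[0]) if rows else 0)


# The first dimension of the shape is the number of non-zero values,
# so a separately stored non_zero_length carries no extra information.
non_zero_length, ndim = indices_shape(indices)
```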




[jira] [Updated] (ARROW-6353) [Python] Allow user to select compression level in pyarrow.parquet.write_table

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-6353:

Fix Version/s: 0.15.0

> [Python] Allow user to select compression level in pyarrow.parquet.write_table
> --
>
> Key: ARROW-6353
> URL: https://issues.apache.org/jira/browse/ARROW-6353
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Igor Yastrebov
>Assignee: Martin Radev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This feature was introduced for C++ in 
> [ARROW-6216|https://issues.apache.org/jira/browse/ARROW-6216].




[jira] [Resolved] (ARROW-6642) [Python] chained access of ParquetDataset's metadata segfaults

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-6642.
-
Resolution: Fixed

Issue resolved by pull request 5455
[https://github.com/apache/arrow/pull/5455]

> [Python] chained access of ParquetDataset's metadata segfaults
> --
>
> Key: ARROW-6642
> URL: https://issues.apache.org/jira/browse/ARROW-6642
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Joris Van den Bossche
>Priority: Major
>  Labels: parquet, pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Creating and reading a parquet dataset:
> {code}
> import pyarrow as pa
> import pyarrow.parquet as pq
> table = pa.table({'a': [1, 2, 3]})
> pq.write_table(table, '__test_statistics_segfault.parquet')
> dataset = pq.ParquetDataset('__test_statistics_segfault.parquet')
> dataset_piece = dataset.pieces[0]
> {code}
> If you access the metadata and a column's statistics in steps, this works 
> fine:
> {code}
> meta = dataset_piece.get_metadata()
> row = meta.row_group(0)
> col = row.column(0)
> {code}
> but chaining the same calls in a single expression segfaults:
> {code}
> dataset_piece.get_metadata().row_group(0).column(0)
> {code}
> {{dataset_piece.get_metadata().row_group(0)}} still works, but appending 
> {{.column(0)}} to the chain segfaults.




[jira] [Resolved] (ARROW-6644) [JS] Amend NullType IPC protocol to append no buffers

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-6644.
-
Fix Version/s: (was: 1.0.0)
   0.15.0
   Resolution: Fixed

Issue resolved by pull request 5460
[https://github.com/apache/arrow/pull/5460]

> [JS] Amend NullType IPC protocol to append no buffers
> -
>
> Key: ARROW-6644
> URL: https://issues.apache.org/jira/browse/ARROW-6644
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Per ARROW-6379




[jira] [Assigned] (ARROW-6644) [JS] Amend NullType IPC protocol to append no buffers

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-6644:
---

Assignee: Paul Taylor

> [JS] Amend NullType IPC protocol to append no buffers
> -
>
> Key: ARROW-6644
> URL: https://issues.apache.org/jira/browse/ARROW-6644
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Wes McKinney
>Assignee: Paul Taylor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Per ARROW-6379




[jira] [Resolved] (ARROW-6652) [Python] to_pandas conversion removes timezone from type

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-6652.
-
Resolution: Fixed

Issue resolved by pull request 5462
[https://github.com/apache/arrow/pull/5462]

> [Python] to_pandas conversion removes timezone from type
> 
>
> Key: ARROW-6652
> URL: https://issues.apache.org/jira/browse/ARROW-6652
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Bryan Cutler
>Assignee: Joris Van den Bossche
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Calling {{to_pandas}} on a {{pyarrow.Array}} with a timezone-aware timestamp 
> type removes the timezone in the resulting {{pandas.Series}}.
> {code}
> >>> import pyarrow as pa
> >>> a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
> >>> a.to_pandas()
> 0   1970-01-01 00:00:00.000001
> dtype: datetime64[ns]
> {code}
> In 0.14.1, converting a {{pyarrow.Column}} with {{to_pandas}} retained the 
> timezone.
> {code}
> In [4]: import pyarrow as pa
>    ...: a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
>    ...: c = pa.Column.from_array('ts', a)
> In [5]: c.to_pandas()
> Out[5]:
> 0   1969-12-31 16:00:00.000001-08:00
> Name: ts, dtype: datetime64[ns, America/Los_Angeles]
> {code}
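
The two outputs above describe the same instant, one microsecond after the Unix epoch, with and without its timezone. The standard-library sketch below shows the difference. It is only an illustration and does not use pyarrow or pandas; as an assumption, a fixed UTC-08:00 offset stands in for America/Los_Angeles, which matches the offset shown in the 0.14.1 output.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-08:00 offset stands in for America/Los_Angeles.
pacific_standard = timezone(timedelta(hours=-8))

# One microsecond after the Unix epoch, as an absolute instant in UTC.
instant = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=1)

naive_utc = instant.replace(tzinfo=None)           # timezone information dropped
localized = instant.astimezone(pacific_standard)   # timezone information retained

print(naive_utc.isoformat(sep=' '))   # 1970-01-01 00:00:00.000001
print(localized.isoformat(sep=' '))   # 1969-12-31 16:00:00.000001-08:00
```

The naive value matches the reported output, and the localized value matches the 0.14.1 output.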




[jira] [Resolved] (ARROW-6647) [C++] Can't build with g++ 4.8.5 on CentOS 7 by member initializer for shared_ptr

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-6647.
-
Fix Version/s: 0.15.0
   Resolution: Fixed

Issue resolved by pull request 5456
[https://github.com/apache/arrow/pull/5456]

> [C++] Can't build with g++ 4.8.5 on CentOS 7 by member initializer for 
> shared_ptr
> -
>
> Key: ARROW-6647
> URL: https://issues.apache.org/jira/browse/ARROW-6647
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sutou Kouhei
>Assignee: Sutou Kouhei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {noformat}
> % g++ --version
> g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
> Copyright (C) 2015 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> {noformat}
> Error message:
> {noformat}
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:
>  In instantiation of 'arrow::Status arrow::py::GetConverterFlat(const 
> std::shared_ptr&, bool, 
> std::unique_ptr*) [with arrow::py::NullCoding 
> null_coding = (arrow::py::NullCoding)1]':
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:1001:5:
>required from here
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:864:7:
>  error: conversion from 'std::nullptr_t' to non-scalar type 
> 'std::shared_ptr' requested
>  class DecimalConverter
>^
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:894:10:
>  note: synthesized method 
> 'arrow::py::DecimalConverter<(arrow::py::NullCoding)1>::DecimalConverter()' 
> first required here 
>  *out = std::unique_ptr(new TYPE_CLASS); \
>   ^
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:915:5:
>  note: in expansion of macro 'SIMPLE_CONVERTER_CASE'
>  SIMPLE_CONVERTER_CASE(DECIMAL, DecimalConverter);
>  ^
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:
>  In instantiation of 'arrow::Status arrow::py::GetConverterFlat(const 
> std::shared_ptr&, bool, 
> std::unique_ptr*) [with arrow::py::NullCoding 
> null_coding = (arrow::py::NullCoding)0]':
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:1004:5:
>required from here
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:864:7:
>  error: conversion from 'std::nullptr_t' to non-scalar type 
> 'std::shared_ptr' requested
>  class DecimalConverter
>^
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:894:10:
>  note: synthesized method 
> 'arrow::py::DecimalConverter<(arrow::py::NullCoding)0>::DecimalConverter()' 
> first required here 
>  *out = std::unique_ptr(new TYPE_CLASS); \
>   ^
> /root/rpmbuild/BUILD/apache-arrow-0.15.0/cpp/src/arrow/python/python_to_arrow.cc:915:5:
>  note: in expansion of macro 'SIMPLE_CONVERTER_CASE'
>  SIMPLE_CONVERTER_CASE(DECIMAL, DecimalConverter);
>  ^
> {noformat}




[jira] [Resolved] (ARROW-6651) [R] Fix R conda job

2019-09-21 Thread Neal Richardson (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson resolved ARROW-6651.

Fix Version/s: 0.15.0
   Resolution: Fixed

Issue resolved by pull request 5461
[https://github.com/apache/arrow/pull/5461]

> [R] Fix R conda job
> ---
>
> Key: ARROW-6651
> URL: https://issues.apache.org/jira/browse/ARROW-6651
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> ARROW-6214 touched the build scripts it uses and now the nightly job is 
> failing.




[jira] [Updated] (ARROW-6634) [C++] Do not require flatbuffers or flatbuffers_ep to build

2019-09-21 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6634:
--
Labels: pull-request-available  (was: )

> [C++] Do not require flatbuffers or flatbuffers_ep to build
> ---
>
> Key: ARROW-6634
> URL: https://issues.apache.org/jira/browse/ARROW-6634
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> Flatbuffers is small enough that we can vendor {{flatbuffers/flatbuffers.h}} 
> and check in the compiled files to make flatbuffers_ep unneeded




[jira] [Assigned] (ARROW-6634) [C++] Do not require flatbuffers or flatbuffers_ep to build

2019-09-21 Thread Wes McKinney (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-6634:
---

Assignee: Wes McKinney

> [C++] Do not require flatbuffers or flatbuffers_ep to build
> ---
>
> Key: ARROW-6634
> URL: https://issues.apache.org/jira/browse/ARROW-6634
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> Flatbuffers is small enough that we can vendor {{flatbuffers/flatbuffers.h}} 
> and check in the compiled files to make flatbuffers_ep unneeded




[jira] [Commented] (ARROW-6653) [Developer] Add support for auto JIRA link on pull request

2019-09-21 Thread Sutou Kouhei (Jira)



[ 
https://issues.apache.org/jira/browse/ARROW-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935163#comment-16935163
 ] 

Sutou Kouhei commented on ARROW-6653:
-

It doesn't work with GITHUB_TOKEN in GitHub Actions: 
https://github.com/apache/arrow/pull/5463

> [Developer] Add support for auto JIRA link on pull request
> --
>
> Key: ARROW-6653
> URL: https://issues.apache.org/jira/browse/ARROW-6653
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Kenta Murata
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> https://lists.apache.org/thread.html/7bb9e646832390d207393f064d7934e54c6cb010e30ea9f39f3ed1ce@%3Cdev.arrow.apache.org%3E
> I frequently go through the following slightly bothersome steps to open a
> JIRA ticket while looking at a GitHub pull request:
> 1. Select the "ARROW-" text in the title and copy it
> 2. Open JIRA if I haven't opened it yet
> 3. Select a ticket to open it
> 4. Alter the URL by pasting the text copied in step 1
> 5. Hit the Enter key
> I think it would be better if these steps were easier.
> We already have a mechanism that injects a GitHub pull-request URL into
> the corresponding JIRA ticket. How about adding a similar mechanism
> for the reverse link? I guess GitHub Actions could automatically post a
> comment with the JIRA ticket URL on the pull request when the "ARROW-"
> text appears in the title field.
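
The core of the proposed reverse link, finding the issue key in a pull-request title and turning it into a JIRA URL, could be sketched as below. This is only an illustration of the matching step, not the actual GitHub Actions workflow. The function name `jira_links` is hypothetical, and the base URL is the one used by the issue links in this thread.

```python
import re

JIRA_BROWSE_URL = 'https://issues.apache.org/jira/browse/'
ISSUE_KEY = re.compile(r'\bARROW-\d+\b')


def jira_links(pr_title):
    """Return the JIRA URL for each distinct ARROW-<number> key in a pull-request title."""
    keys = dict.fromkeys(ISSUE_KEY.findall(pr_title))  # de-duplicate, keep order
    return [JIRA_BROWSE_URL + key for key in keys]


links = jira_links('ARROW-6653: [Developer] Add support for auto JIRA link on pull request')
```

A bot would then post each URL in `links` as a comment on the pull request, and do nothing when the list is empty.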




[jira] [Updated] (ARROW-6653) [Developer] Add support for auto JIRA link on pull request

2019-09-21 Thread Sutou Kouhei (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sutou Kouhei updated ARROW-6653:

Reporter: Kenta Murata  (was: Sutou Kouhei)


[jira] [Updated] (ARROW-6653) [Developer] Add support for auto JIRA link on pull request

2019-09-21 Thread Sutou Kouhei (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sutou Kouhei updated ARROW-6653:

Description: 
https://lists.apache.org/thread.html/7bb9e646832390d207393f064d7934e54c6cb010e30ea9f39f3ed1ce@%3Cdev.arrow.apache.org%3E

I frequently go through the following slightly bothersome steps to open a
JIRA ticket while looking at a GitHub pull request:

1. Select the "ARROW-" text in the title and copy it
2. Open JIRA if I haven't opened it yet
3. Select a ticket to open it
4. Alter the URL by pasting the text copied in step 1
5. Hit the Enter key

I think it would be better if these steps were easier.

We already have a mechanism that injects a GitHub pull-request URL into
the corresponding JIRA ticket. How about adding a similar mechanism
for the reverse link? I guess GitHub Actions could automatically post a
comment with the JIRA ticket URL on the pull request when the "ARROW-"
text appears in the title field.


[jira] [Assigned] (ARROW-6653) [Developer] Add support for auto JIRA link on pull request

2019-09-21 Thread Sutou Kouhei (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sutou Kouhei reassigned ARROW-6653:
---

Assignee: (was: Sutou Kouhei)

> [Developer] Add support for auto JIRA link on pull request
> --
>
> Key: ARROW-6653
> URL: https://issues.apache.org/jira/browse/ARROW-6653
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Sutou Kouhei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>





[jira] [Updated] (ARROW-6653) [Developer] Add support for auto JIRA link on pull request

2019-09-21 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6653:
--
Labels: pull-request-available  (was: )

> [Developer] Add support for auto JIRA link on pull request
> --
>
> Key: ARROW-6653
> URL: https://issues.apache.org/jira/browse/ARROW-6653
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Sutou Kouhei
>Assignee: Sutou Kouhei
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Created] (ARROW-6653) [Developer] Add support for auto JIRA link on pull request

2019-09-21 Thread Sutou Kouhei (Jira)

Sutou Kouhei created ARROW-6653:
---

 Summary: [Developer] Add support for auto JIRA link on pull request
 Key: ARROW-6653
 URL: https://issues.apache.org/jira/browse/ARROW-6653
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Developer Tools
Reporter: Sutou Kouhei
Assignee: Sutou Kouhei







[GitHub] [arrow-testing] nevi-me commented on issue #10: ARROW-5399: [Testing] Add Rust IPC files

2019-09-21 Thread GitBox

nevi-me commented on issue #10: ARROW-5399: [Testing] Add Rust IPC files
URL: https://github.com/apache/arrow-testing/pull/10#issuecomment-533821698
 
 
   I'll wait for 0.15; I think I'll focus on some groundwork to get us ready 
for integration testing. I've got my spare time back now, so I'll have enough 
capacity to complete Rust IPC and integration by 1.0.0.
   
   Thanks Micah and Wes


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [arrow-testing] wesm commented on issue #10: ARROW-5399: [Testing] Add Rust IPC files

2019-09-21 Thread GitBox

wesm commented on issue #10: ARROW-5399: [Testing] Add Rust IPC files
URL: https://github.com/apache/arrow-testing/pull/10#issuecomment-533820806
 
 
   Ideally a testing corpus would be generated on the fly rather than checking 
in the files, but I can understand that this might make things easier for 
developers. 
   
   Note that these files might be using the pre-0.15 message format. Do you 
want to wait until after 0.15.0 is released, or use a dev version to generate 
the test files?



[jira] [Commented] (ARROW-6652) [Python] to_pandas conversion removes timezone from type

2019-09-21 Thread Joris Van den Bossche (Jira)



[ 
https://issues.apache.org/jira/browse/ARROW-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935111#comment-16935111
 ] 

Joris Van den Bossche commented on ARROW-6652:
--

I quickly opened a PR (https://github.com/apache/arrow/pull/5462). Thanks for 
the catch, [~bryanc]!

> [Python] to_pandas conversion removes timezone from type
> 
>
> Key: ARROW-6652
> URL: https://issues.apache.org/jira/browse/ARROW-6652
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Bryan Cutler
>Assignee: Joris Van den Bossche
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Calling {{to_pandas}} on a {{pyarrow.Array}} with a timezone-aware timestamp 
> type removes the timezone from the resulting {{pandas.Series}}.
> {code}
> >>> import pyarrow as pa
> >>> a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
> >>> a.to_pandas()
> 0   1970-01-01 00:00:00.000001
> dtype: datetime64[ns]
> {code}
> In 0.14.1, converting a {{pyarrow.Column}} with {{to_pandas}} retained the 
> timezone.
> {code}
> In [4]: import pyarrow as pa
>    ...: a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
>    ...: c = pa.Column.from_array('ts', a)
> In [5]: c.to_pandas()
> Out[5]: 
> 0   1969-12-31 16:00:00.000001-08:00
> Name: ts, dtype: datetime64[ns, America/Los_Angeles]
> {code}




[jira] [Updated] (ARROW-6652) [Python] to_pandas conversion removes timezone from type

2019-09-21 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6652:
--
Labels: pull-request-available  (was: )

> [Python] to_pandas conversion removes timezone from type
> 
>
> Key: ARROW-6652
> URL: https://issues.apache.org/jira/browse/ARROW-6652
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Bryan Cutler
>Assignee: Joris Van den Bossche
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>




[jira] [Commented] (ARROW-6652) [Python] to_pandas conversion removes timezone from type

2019-09-21 Thread Joris Van den Bossche (Jira)



[ 
https://issues.apache.org/jira/browse/ARROW-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935106#comment-16935106
 ] 

Joris Van den Bossche commented on ARROW-6652:
--

This should be an easy fix. It seems that {{Column.to_pandas}} had a 
specific check for this case: 

https://github.com/apache/arrow/blob/5f564424c71cef12619522cdde59be5f69b31b68/python/pyarrow/table.pxi#L467-L478

that we can add back to {{Array.to_pandas}}.
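
To illustrate the kind of check being discussed, here is a minimal plain-Python sketch. It is not the pyarrow code behind the link above, and it does not use pyarrow or pandas. {{TimestampType}} and {{to_datetimes}} are hypothetical names introduced only for this example, and the fixed UTC-08:00 offset is an assumption standing in for America/Los_Angeles. The idea shown is only the general shape: convert the raw values to naive UTC first, then, if the type carries a time zone, localize to UTC and convert to that zone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone, tzinfo
from typing import List, Optional


@dataclass(frozen=True)
class TimestampType:
    """Hypothetical stand-in for a timestamp type; the unit is fixed to microseconds."""
    tz: Optional[tzinfo] = None


EPOCH = datetime(1970, 1, 1)  # naive value, interpreted as UTC


def to_datetimes(values_us: List[int], typ: TimestampType) -> List[datetime]:
    # Step 1: turn raw microsecond counts into naive UTC wall-clock values.
    naive = [EPOCH + timedelta(microseconds=v) for v in values_us]
    # Step 2: the tz check. If the type carries a time zone, mark the values
    # as UTC and convert them to that zone instead of returning naive values.
    if typ.tz is not None:
        return [d.replace(tzinfo=timezone.utc).astimezone(typ.tz) for d in naive]
    return naive


# Assumption: a fixed UTC-08:00 offset stands in for America/Los_Angeles.
PST = timezone(timedelta(hours=-8))
print(to_datetimes([1], TimestampType(tz=PST))[0])
print(to_datetimes([1], TimestampType())[0])
```

Skipping step 2 corresponds to the reported behavior: the values come back as naive UTC and the time zone is lost.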

> [Python] to_pandas conversion removes timezone from type
> 
>
> Key: ARROW-6652
> URL: https://issues.apache.org/jira/browse/ARROW-6652
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Bryan Cutler
>Priority: Critical
> Fix For: 0.15.0
>
>




[jira] [Assigned] (ARROW-6652) [Python] to_pandas conversion removes timezone from type

2019-09-21 Thread Joris Van den Bossche (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van den Bossche reassigned ARROW-6652:


Assignee: Joris Van den Bossche

> [Python] to_pandas conversion removes timezone from type
> 
>
> Key: ARROW-6652
> URL: https://issues.apache.org/jira/browse/ARROW-6652
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Bryan Cutler
>Assignee: Joris Van den Bossche
>Priority: Critical
> Fix For: 0.15.0
>
>




[jira] [Commented] (ARROW-6652) [Python] to_pandas conversion removes timezone from type

2019-09-21 Thread Antoine Pitrou (Jira)



[ 
https://issues.apache.org/jira/browse/ARROW-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935092#comment-16935092
 ] 

Antoine Pitrou commented on ARROW-6652:
---

Also cc [~jorisvandenbossche]

> [Python] to_pandas conversion removes timezone from type
> 
>
> Key: ARROW-6652
> URL: https://issues.apache.org/jira/browse/ARROW-6652
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Bryan Cutler
>Priority: Critical
> Fix For: 0.15.0
>
>




[jira] [Commented] (ARROW-6652) [Python] to_pandas conversion removes timezone from type

2019-09-21 Thread Bryan Cutler (Jira)



[ 
https://issues.apache.org/jira/browse/ARROW-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935089#comment-16935089
 ] 

Bryan Cutler commented on ARROW-6652:
-

[~wesm] or [~apitrou], would you be able to take a look at this?

> [Python] to_pandas conversion removes timezone from type
> 
>
> Key: ARROW-6652
> URL: https://issues.apache.org/jira/browse/ARROW-6652
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Bryan Cutler
>Priority: Critical
> Fix For: 0.15.0
>
>




[jira] [Comment Edited] (ARROW-6429) [CI][Crossbow] Nightly spark integration job fails

2019-09-21 Thread Bryan Cutler (Jira)



[ 
https://issues.apache.org/jira/browse/ARROW-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933714#comment-16933714
 ] 

Bryan Cutler edited comment on ARROW-6429 at 9/21/19 4:59 PM:
--

[~wesm] the timestamp test failures appear to be caused by {{to_pandas}}: 
calling it on a pyarrow ChunkedArray with a tz-aware timestamp type removes 
the tz from the resulting dtype. Previously, a pyarrow Column kept the tz, 
while a pyarrow Array dropped it when converting to a numpy array.

With Arrow 0.14.1
{code:java}
In [4]: import pyarrow as pa
   ...: a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
   ...: c = pa.Column.from_array('ts', a)

In [5]: c.to_pandas()
Out[5]: 
0   1969-12-31 16:00:00.000001-08:00
Name: ts, dtype: datetime64[ns, America/Los_Angeles]

In [6]: a.to_pandas()
Out[6]: array(['1970-01-01T00:00:00.000001'], dtype='datetime64[us]')
{code}
With current master
{code:java}
>>> import pyarrow as pa
>>> a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
>>> a.to_pandas()
0   1970-01-01 00:00:00.000001
dtype: datetime64[ns]
{code}
After manually adding the timezone back to the series dtype (and fixing the 
Java compilation), all tests passed and the Spark integration run finished. I 
wasn't able to look into why the timezone is being removed, though. Should I 
open a Jira for this?

edit: I made ARROW-6652 since it is not just a Spark issue.




> [CI][Crossbow] Nightly spark integration job fails
> --
>
> Key: ARROW-6429
> URL: https://issues.apache.org/jira/browse/ARROW-6429
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration
>Reporter: Neal Richardson
>Assignee: Wes McKinney
>Priority: Blocker
>  Labels: nightly, pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> See https://circleci.com/gh/ursa-labs/crossbow/2310. Either fix, skip job and 
> create followup Jira to unskip, or delete job.




[jira] [Created] (ARROW-6652) [Python] to_pandas conversion removes timezone from type

2019-09-21 Thread Bryan Cutler (Jira)

Bryan Cutler created ARROW-6652:
---

 Summary: [Python] to_pandas conversion removes timezone from type
 Key: ARROW-6652
 URL: https://issues.apache.org/jira/browse/ARROW-6652
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Bryan Cutler
 Fix For: 0.15.0


Calling {{to_pandas}} on a {{pyarrow.Array}} with a timezone-aware timestamp 
type removes the timezone from the resulting {{pandas.Series}}.

{code}
>>> import pyarrow as pa
>>> a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
>>> a.to_pandas()
0   1970-01-01 00:00:00.000001
dtype: datetime64[ns]
{code}

In 0.14.1, converting a {{pyarrow.Column}} with {{to_pandas}} retained the 
timezone.
{code}
In [4]: import pyarrow as pa
   ...: a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
   ...: c = pa.Column.from_array('ts', a)

In [5]: c.to_pandas()
Out[5]: 
0   1969-12-31 16:00:00.000001-08:00
Name: ts, dtype: datetime64[ns, America/Los_Angeles]
{code}
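
The two outputs in this report describe the same instant, one microsecond after the Unix epoch; they differ only in whether the time zone is attached. A small standard-library sketch of that arithmetic follows. It does not use pyarrow or pandas. The helper names are made up for illustration, and a fixed UTC-08:00 offset is assumed in place of America/Los_Angeles.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumption: a fixed UTC-08:00 offset stands in for America/Los_Angeles.
PST = timezone(timedelta(hours=-8))


def micros_to_aware(value_us, tz):
    """Interpret value_us as microseconds since the epoch and render it in tz."""
    return (EPOCH_UTC + timedelta(microseconds=value_us)).astimezone(tz)


def micros_to_naive_utc(value_us):
    """Same instant, but with the time zone information dropped (naive UTC)."""
    return (EPOCH_UTC + timedelta(microseconds=value_us)).replace(tzinfo=None)


aware = micros_to_aware(1, PST)
naive = micros_to_naive_utc(1)

print(aware.isoformat(sep=" "))  # 1969-12-31 16:00:00.000001-08:00
print(naive.isoformat(sep=" "))  # 1970-01-01 00:00:00.000001

# Re-attaching UTC to the naive value yields the same instant as the aware one.
assert naive.replace(tzinfo=timezone.utc) == aware
```

The naive value is still correct as a UTC wall-clock time, but code that relies on the dtype carrying the zone, such as the Spark timestamp tests mentioned in ARROW-6429, sees a different type.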




[jira] [Updated] (ARROW-6651) [R] Fix R conda job

2019-09-21 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/ARROW-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6651:
--
Labels: pull-request-available  (was: )

> [R] Fix R conda job
> ---
>
> Key: ARROW-6651
> URL: https://issues.apache.org/jira/browse/ARROW-6651
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Minor
>  Labels: pull-request-available
>
> ARROW-6214 touched the build scripts it uses and now the nightly job is 
> failing.




[jira] [Created] (ARROW-6651) [R] Fix R conda job

2019-09-21 Thread Neal Richardson (Jira)

Neal Richardson created ARROW-6651:
--

 Summary: [R] Fix R conda job
 Key: ARROW-6651
 URL: https://issues.apache.org/jira/browse/ARROW-6651
 Project: Apache Arrow
  Issue Type: Bug
  Components: Continuous Integration
Reporter: Neal Richardson
Assignee: Neal Richardson


ARROW-6214 touched the build scripts it uses and now the nightly job is failing.



