[jira] [Updated] (ARROW-8359) [C++/Python] Enable aarch64/ppc64le build in conda recipes

2020-04-14 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8359:
---
Fix Version/s: (was: 0.17.0)
   1.0.0

> [C++/Python] Enable aarch64/ppc64le build in conda recipes
> --
>
> Key: ARROW-8359
> URL: https://issues.apache.org/jira/browse/ARROW-8359
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Packaging, Python
>Reporter: Uwe Korn
>Priority: Major
> Fix For: 1.0.0
>
>
> These two new arches were added in the conda recipes, we should also build 
> them as nightlies.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8359) [C++/Python] Enable aarch64/ppc64le build in conda recipes

2020-04-14 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083344#comment-17083344
 ] 

Krisztian Szucs commented on ARROW-8359:


[~uwe] postponed it to 1.0.0

> [C++/Python] Enable aarch64/ppc64le build in conda recipes
> --
>
> Key: ARROW-8359
> URL: https://issues.apache.org/jira/browse/ARROW-8359
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Packaging, Python
>Reporter: Uwe Korn
>Priority: Major
> Fix For: 1.0.0
>
>
> These two new arches were added in the conda recipes, we should also build 
> them as nightlies.





[jira] [Created] (ARROW-8444) [Documentation] Fix spelling errors across the codebase

2020-04-14 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8444:
--

 Summary: [Documentation] Fix spelling errors across the codebase
 Key: ARROW-8444
 URL: https://issues.apache.org/jira/browse/ARROW-8444
 Project: Apache Arrow
  Issue Type: Task
  Components: Documentation
Reporter: Krisztian Szucs








[jira] [Resolved] (ARROW-8441) [C++] Fix crashes on invalid input (OSS-Fuzz)

2020-04-14 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8441.

Resolution: Fixed

Issue resolved by pull request 6928
[https://github.com/apache/arrow/pull/6928]

> [C++] Fix crashes on invalid input (OSS-Fuzz)
> -
>
> Key: ARROW-8441
> URL: https://issues.apache.org/jira/browse/ARROW-8441
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Assignee: Antoine Pitrou
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-8442) [Python] NullType.to_pandas_dtype inconsistent with dtype returned in to_pandas/to_numpy

2020-04-14 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8442.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6930
[https://github.com/apache/arrow/pull/6930]

> [Python] NullType.to_pandas_dtype inconsistent with dtype returned in 
> to_pandas/to_numpy
> ---
>
> Key: ARROW-8442
> URL: https://issues.apache.org/jira/browse/ARROW-8442
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Joris Van den Bossche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{to_pandas_dtype}} currently returns float, while all actual conversions to
> numpy or pandas use object dtype:
> {code}
> In [23]: pa.null().to_pandas_dtype()
> Out[23]: numpy.float64
> In [24]: pa.array([], pa.null()).to_pandas()
> Out[24]: Series([], dtype: object)
> In [25]: pa.array([], pa.null()).to_numpy(zero_copy_only=False)
> Out[25]: array([], dtype=object)
> {code}
> So we should probably fix {{NullType.to_pandas_dtype}} to return object,
> which is what is used in practice.





[jira] [Assigned] (ARROW-8442) [Python] NullType.to_pandas_dtype inconsistent with dtype returned in to_pandas/to_numpy

2020-04-14 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8442:
--

Assignee: Joris Van den Bossche

> [Python] NullType.to_pandas_dtype inconsistent with dtype returned in 
> to_pandas/to_numpy
> ---
>
> Key: ARROW-8442
> URL: https://issues.apache.org/jira/browse/ARROW-8442
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Joris Van den Bossche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{to_pandas_dtype}} currently returns float, while all actual conversions to
> numpy or pandas use object dtype:
> {code}
> In [23]: pa.null().to_pandas_dtype()
> Out[23]: numpy.float64
> In [24]: pa.array([], pa.null()).to_pandas()
> Out[24]: Series([], dtype: object)
> In [25]: pa.array([], pa.null()).to_numpy(zero_copy_only=False)
> Out[25]: array([], dtype=object)
> {code}
> So we should probably fix {{NullType.to_pandas_dtype}} to return object,
> which is what is used in practice.





[jira] [Resolved] (ARROW-8414) [Python] Non-deterministic row order failure in test_parquet.py

2020-04-14 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8414.

Resolution: Fixed

Issue resolved by pull request 6926
[https://github.com/apache/arrow/pull/6926]

> [Python] Non-deterministic row order failure in test_parquet.py
> ---
>
> Key: ARROW-8414
> URL: https://issues.apache.org/jira/browse/ARROW-8414
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Joris Van den Bossche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-8437) [C++] Remove std::move return value from MakeRandomNullBitmap test utility

2020-04-13 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8437.

Resolution: Fixed

Issue resolved by pull request 6924
[https://github.com/apache/arrow/pull/6924]

> [C++] Remove std::move return value from MakeRandomNullBitmap test utility
> --
>
> Key: ARROW-8437
> URL: https://issues.apache.org/jira/browse/ARROW-8437
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Introduced by #6910; the builds triggered on that PR did not catch the 
> compile error.





[jira] [Created] (ARROW-8437) [C++] Remove std::move return value from MakeRandomNullBitmap test utility

2020-04-13 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8437:
--

 Summary: [C++] Remove std::move return value from 
MakeRandomNullBitmap test utility
 Key: ARROW-8437
 URL: https://issues.apache.org/jira/browse/ARROW-8437
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
 Fix For: 0.17.0


Introduced by #6910; the builds triggered on that PR did not catch the 
compile error.





[jira] [Assigned] (ARROW-8432) [Python][CI] Failure to download Hadoop

2020-04-13 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8432:
--

Assignee: Ben Kietzman  (was: Krisztian Szucs)

> [Python][CI] Failure to download Hadoop
> ---
>
> Key: ARROW-8432
> URL: https://issues.apache.org/jira/browse/ARROW-8432
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration, Python
>Affects Versions: 0.16.0
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> https://circleci.com/gh/ursa-labs/crossbow/11128?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
> This is caused by an HTTP request failure at
> https://github.com/apache/arrow/blob/master/ci/docker/conda-python-hdfs.dockerfile#L36
> We should probably not rely on https://www.apache.org/dyn/mirrors/mirrors.cgi 
> to get tarballs. Current uses:
> {code}
> ci/docker/conda-python-hdfs.dockerfile
> 36:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" | tar -xzf - -C /opt
> ci/docker/linux-apt-docs.dockerfile
> 57:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=maven/maven-3/${maven}/binaries/apache-maven-${maven}-bin.tar.gz" | tar -xzf - -C /opt
> python/manylinux1/scripts/build_thrift.sh
> 22:  "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=${THRIFT_DOWNLOAD_PATH}" \
> python/manylinux201x/scripts/build_thrift.sh
> 20:wget https://archive.apache.org/dist/thrift/${THRIFT_VERSION}/thrift-${THRIFT_VERSION}.tar.gz
> {code}
> Factor these out into a reusable script for downloading Apache tarballs. It 
> should contain hard-coded Apache mirrors and retry when connections fail.




[jira] [Assigned] (ARROW-8432) [Python][CI] Failure to download Hadoop

2020-04-13 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8432:
--

Assignee: Krisztian Szucs  (was: Ben Kietzman)

> [Python][CI] Failure to download Hadoop
> ---
>
> Key: ARROW-8432
> URL: https://issues.apache.org/jira/browse/ARROW-8432
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration, Python
>Affects Versions: 0.16.0
>Reporter: Ben Kietzman
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> https://circleci.com/gh/ursa-labs/crossbow/11128?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
> This is caused by an HTTP request failure at
> https://github.com/apache/arrow/blob/master/ci/docker/conda-python-hdfs.dockerfile#L36
> We should probably not rely on https://www.apache.org/dyn/mirrors/mirrors.cgi 
> to get tarballs. Current uses:
> {code}
> ci/docker/conda-python-hdfs.dockerfile
> 36:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" | tar -xzf - -C /opt
> ci/docker/linux-apt-docs.dockerfile
> 57:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=maven/maven-3/${maven}/binaries/apache-maven-${maven}-bin.tar.gz" | tar -xzf - -C /opt
> python/manylinux1/scripts/build_thrift.sh
> 22:  "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=${THRIFT_DOWNLOAD_PATH}" \
> python/manylinux201x/scripts/build_thrift.sh
> 20:wget https://archive.apache.org/dist/thrift/${THRIFT_VERSION}/thrift-${THRIFT_VERSION}.tar.gz
> {code}
> Factor these out into a reusable script for downloading Apache tarballs. It 
> should contain hard-coded Apache mirrors and retry when connections fail.




[jira] [Resolved] (ARROW-8415) [C++][Packaging] fix gandiva linux job

2020-04-13 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8415.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6910
[https://github.com/apache/arrow/pull/6910]

> [C++][Packaging] fix gandiva linux job
> --
>
> Key: ARROW-8415
> URL: https://issues.apache.org/jira/browse/ARROW-8415
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Prudhvi Porandla
>Assignee: Prudhvi Porandla
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Created] (ARROW-8436) [Docs] Update NodeJS version in the documentation image because version 11 is no longer maintained

2020-04-13 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8436:
--

 Summary: [Docs] Update NodeJS version in the documentation image 
because version 11 is no longer maintained
 Key: ARROW-8436
 URL: https://issues.apache.org/jira/browse/ARROW-8436
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Documentation, JavaScript
Reporter: Krisztian Szucs


{code}


  DEPRECATION WARNING

  Node.js 11.x is no longer actively supported!

  You will not receive security or critical stability updates for this version.

  You should migrate to a supported version of Node.js as soon as possible.
  Use the installation script that corresponds to the version of Node.js you
  wish to install. e.g.

   * https://deb.nodesource.com/setup_10.x — Node.js 10 LTS "Dubnium" 
(recommended)
   * https://deb.nodesource.com/setup_12.x — Node.js 12 LTS "Erbium"

  Please see https://github.com/nodejs/Release for details about which
  version may be appropriate for you.

  The NodeSource Node.js distributions repository contains
  information both about supported versions of Node.js and supported Linux
  distributions. To learn more about usage, see the repository:
https://github.com/nodesource/distributions


{code}





[jira] [Resolved] (ARROW-8406) [Python] test_fs fails when run from a different drive on Windows

2020-04-13 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8406.

Resolution: Fixed

Issue resolved by pull request 6911
[https://github.com/apache/arrow/pull/6911]

> [Python] test_fs fails when run from a different drive on Windows
> -
>
> Key: ARROW-8406
> URL: https://issues.apache.org/jira/browse/ARROW-8406
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Reporter: Krisztian Szucs
>Assignee: Antoine Pitrou
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> {code:python}
> path = r"C:\Users\VssAdministrator\AppData\Local\Temp\pytest-of-VssAdministrator\pytest-0\test_construct_from_single_fil0\single-file"
> _, path = FileSystem.from_uri(path)
> path == "/Users/VssAdministrator/AppData/Local/Temp/pytest-of-VssAdministrator/pytest-0/test_construct_from_single_fil0/single-file"
> {code}





[jira] [Resolved] (ARROW-8290) [Python][Dataset] Improve ergonomy of the FileSystemDataset constructor

2020-04-13 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8290.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6913
[https://github.com/apache/arrow/pull/6913]

> [Python][Dataset] Improve ergonomy of the FileSystemDataset constructor
> ---
>
> Key: ARROW-8290
> URL: https://issues.apache.org/jira/browse/ARROW-8290
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Joris Van den Bossche
>Priority: Major
>  Labels: dataset, pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, to manually create a FileSystemDataset, you can do something like:
> {code}
> dataset = ds.FileSystemDataset(
> schema, None, ds.ParquetFileFormat(), pa.fs.LocalFileSystem(),
> ["data_file1.parquet", "data_file2.parquet"],
> [ds.field('file') == 1, ds.field('file') == 2])
> {code}
> There are some usability improvements we can make, though:
> - Allow passing the arguments by name to improve the readability of the 
> calling code (currently they all need to be passed positionally, because of 
> the way they are declared in Cython as {{not None}}).
> - Possibly change the order of the arguments (e.g. start with the paths; 
> we don't need to match the order of the C++ constructor).
> - Potentially allow {{partitions}} to be optional, in which case they would 
> be set to a list of ScalarExpression(True) values.
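
The three suggestions can be sketched together in plain Python. This is a hypothetical illustration, not the pyarrow API: DatasetSpec and make_dataset_spec are invented names, and the boolean True stands in for a ScalarExpression(True) value.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class DatasetSpec:
    """Hypothetical stand-in for the arguments of FileSystemDataset."""
    paths: List[str]
    schema: Any
    format: Any
    filesystem: Any
    partitions: List[Any]


def make_dataset_spec(paths, *, schema, format, filesystem=None,
                      partitions=None):
    """Paths come first; everything after `*` must be passed by name."""
    paths = list(paths)
    if partitions is None:
        # Stand-in for a list of ScalarExpression(True) values:
        # one always-true partition expression per file.
        partitions = [True] * len(paths)
    elif len(partitions) != len(paths):
        raise ValueError("need exactly one partition expression per path")
    return DatasetSpec(paths, schema, format, filesystem, list(partitions))


spec = make_dataset_spec(
    ["data_file1.parquet", "data_file2.parquet"],
    schema="schema-placeholder",
    format="parquet",
)
```

The bare `*` in the signature makes every later argument keyword-only, so a call site has to spell out schema= and format=, which is the readability gain the first bullet asks for.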





[jira] [Resolved] (ARROW-8428) [C++][NIGHTLY:gandiva-jar-trusty] GCC 4.8 failures in C++ unit tests

2020-04-13 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8428.

Resolution: Fixed

Issue resolved by pull request 6916
[https://github.com/apache/arrow/pull/6916]

> [C++][NIGHTLY:gandiva-jar-trusty] GCC 4.8 failures in C++ unit tests
> 
>
> Key: ARROW-8428
> URL: https://issues.apache.org/jira/browse/ARROW-8428
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.16.0
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/ARROW-8388
> Not reported by the CI job added in that issue since manylinux1 doesn't 
> currently build the c++ unit tests.





[jira] [Created] (ARROW-8431) [C++][CI] Configure a build to build and execute the C++ tests with GCC 4.8

2020-04-13 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8431:
--

 Summary: [C++][CI] Configure a build to build and execute the C++ 
tests with GCC 4.8
 Key: ARROW-8431
 URL: https://issues.apache.org/jira/browse/ARROW-8431
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Krisztian Szucs
 Fix For: 1.0.0


The gandiva jar nightly build and the manylinux1 wheels are built with GCC 
4.8.
We already have the manylinux1 build running on each commit, but it doesn't 
exercise the C++ tests.





[jira] [Created] (ARROW-8430) [CI] Configure self-hosted runners for Github Actions

2020-04-13 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8430:
--

 Summary: [CI] Configure self-hosted runners for Github Actions
 Key: ARROW-8430
 URL: https://issues.apache.org/jira/browse/ARROW-8430
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


Set up Ubuntu C++ ARMv8 builders and perhaps an AMD64 builder to run on 
self-hosted GitHub runners.





[jira] [Resolved] (ARROW-8407) [Rust] Add rustdoc for Dictionary type

2020-04-13 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8407.

Resolution: Fixed

Issue resolved by pull request 6904
[https://github.com/apache/arrow/pull/6904]

> [Rust] Add rustdoc for Dictionary type
> --
>
> Key: ARROW-8407
> URL: https://issues.apache.org/jira/browse/ARROW-8407
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add rustdoc for Dictionary type





[jira] [Created] (ARROW-8417) [Packaging] Move the manylinux crossbow wheel builds to GitHub Actions

2020-04-13 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8417:
--

 Summary: [Packaging] Move the manylinux crossbow wheel builds to 
GitHub Actions
 Key: ARROW-8417
 URL: https://issues.apache.org/jira/browse/ARROW-8417
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


To free up some bandwidth on Azure for the conda jobs.





[jira] [Commented] (ARROW-8406) [Python] test_fs fails when run from a different drive on Windows

2020-04-13 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082256#comment-17082256
 ] 

Krisztian Szucs commented on ARROW-8406:


{code:python}
>>> FileSystem.from_uri('D:/something.parquet')
(, 
'D:/something.parquet')
>>> FileSystem.from_uri('D:\something.parquet')
(, 
'D:/something.parquet')
>>> FileSystem.from_uri('file://D:/something.parquet')
(, 
'/something.parquet')
>>> FileSystem.from_uri('file://D:\something.parquet')
(, 
'/something.parquet')
{code}
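
One possible reading of the last two results, sketched with Python's standard urllib.parse rather than Arrow's own URI parser (so this is an assumption about the cause, not a description of the Arrow implementation): with only two slashes after the scheme, the text up to the next slash is parsed as the authority, so the drive letter ends up there instead of in the path. The three-slash form shown for comparison does not appear in the comment above.

```python
from urllib.parse import urlsplit

# Two slashes: "D:" is parsed as the authority (netloc), and only
# "/something.parquet" is left as the path.
two_slashes = urlsplit("file://D:/something.parquet")
print(repr(two_slashes.netloc), repr(two_slashes.path))

# Three slashes: the authority is empty and the drive stays in the path.
three_slashes = urlsplit("file:///D:/something.parquet")
print(repr(three_slashes.netloc), repr(three_slashes.path))
```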

> [Python] test_fs fails when run from a different drive on Windows
> -
>
> Key: ARROW-8406
> URL: https://issues.apache.org/jira/browse/ARROW-8406
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Reporter: Krisztian Szucs
>Assignee: Antoine Pitrou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> {code:python}
> path = r"C:\Users\VssAdministrator\AppData\Local\Temp\pytest-of-VssAdministrator\pytest-0\test_construct_from_single_fil0\single-file"
> _, path = FileSystem.from_uri(path)
> path == "/Users/VssAdministrator/AppData/Local/Temp/pytest-of-VssAdministrator/pytest-0/test_construct_from_single_fil0/single-file"
> {code}





[jira] [Commented] (ARROW-8406) [Python] test_fs fails when run from a different drive on Windows

2020-04-13 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082236#comment-17082236
 ] 

Krisztian Szucs commented on ARROW-8406:


[~apitrou] could you try with the path prefixed with the scheme? 
{{file://C:/Users/VssAdministrator...}}

> [Python] test_fs fails when run from a different drive on Windows
> -
>
> Key: ARROW-8406
> URL: https://issues.apache.org/jira/browse/ARROW-8406
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Reporter: Krisztian Szucs
>Assignee: Antoine Pitrou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {code:python}
> path = r"C:\Users\VssAdministrator\AppData\Local\Temp\pytest-of-VssAdministrator\pytest-0\test_construct_from_single_fil0\single-file"
> _, path = FileSystem.from_uri(path)
> path == "/Users/VssAdministrator/AppData/Local/Temp/pytest-of-VssAdministrator/pytest-0/test_construct_from_single_fil0/single-file"
> {code}





[jira] [Assigned] (ARROW-8406) [Python] FileSystem.from_uri erases the drive on Windows

2020-04-12 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8406:
--

Assignee: Antoine Pitrou

> [Python] FileSystem.from_uri erases the drive on Windows
> 
>
> Key: ARROW-8406
> URL: https://issues.apache.org/jira/browse/ARROW-8406
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Reporter: Krisztian Szucs
>Assignee: Antoine Pitrou
>Priority: Major
>
> {code:python}
> path = r"C:\Users\VssAdministrator\AppData\Local\Temp\pytest-of-VssAdministrator\pytest-0\test_construct_from_single_fil0\single-file"
> _, path = FileSystem.from_uri(path)
> path == "/Users/VssAdministrator/AppData/Local/Temp/pytest-of-VssAdministrator/pytest-0/test_construct_from_single_fil0/single-file"
> {code}





[jira] [Created] (ARROW-8406) [Python] FileSystem.from_uri erases the drive on Windows

2020-04-12 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8406:
--

 Summary: [Python] FileSystem.from_uri erases the drive on Windows
 Key: ARROW-8406
 URL: https://issues.apache.org/jira/browse/ARROW-8406
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Python
Reporter: Krisztian Szucs


{code:python}
path = r"C:\Users\VssAdministrator\AppData\Local\Temp\pytest-of-VssAdministrator\pytest-0\test_construct_from_single_fil0\single-file"
_, path = FileSystem.from_uri(path)
path == "/Users/VssAdministrator/AppData/Local/Temp/pytest-of-VssAdministrator/pytest-0/test_construct_from_single_fil0/single-file"
{code}





[jira] [Resolved] (ARROW-8397) [C++] Fail to compile aggregate_test.cc on Ubuntu 16.04

2020-04-11 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8397.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6895
[https://github.com/apache/arrow/pull/6895]

> [C++] Fail to compile aggregate_test.cc on Ubuntu 16.04
> ---
>
> Key: ARROW-8397
> URL: https://issues.apache.org/jira/browse/ARROW-8397
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> See build log 
> https://app.circleci.com/pipelines/github/ursa-labs/crossbow/31122/workflows/b250d378-52a8-4d15-9909-96474fa38482/jobs/10840





[jira] [Created] (ARROW-8400) [Python][Dataset] Infer the filesystem from the first path if multiple paths are passed to dataset()

2020-04-10 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8400:
--

 Summary: [Python][Dataset] Infer the filesystem from the first 
path if multiple paths are passed to dataset()
 Key: ARROW-8400
 URL: https://issues.apache.org/jira/browse/ARROW-8400
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Krisztian Szucs


See conversation https://github.com/apache/arrow/pull/6505#discussion_r406677317





[jira] [Created] (ARROW-8398) [Python] Remove deprecation warnings originating from python tests

2020-04-10 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8398:
--

 Summary: [Python] Remove deprecation warnings originating from 
python tests
 Key: ARROW-8398
 URL: https://issues.apache.org/jira/browse/ARROW-8398
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


See build log 
https://travis-ci.org/github/ursa-labs/crossbow/builds/673385834#L6846





[jira] [Resolved] (ARROW-8388) [C++] GCC 4.8 fails to move on return

2020-04-10 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8388.

Resolution: Fixed

Issue resolved by pull request 6894
[https://github.com/apache/arrow/pull/6894]

> [C++] GCC 4.8 fails to move on return
> -
>
> Key: ARROW-8388
> URL: https://issues.apache.org/jira/browse/ARROW-8388
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.16.0
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> See https://github.com/apache/arrow/pull/6883#issuecomment-611661733
> This is a recurring problem which usually shows up as a broken nightly (the 
> gandiva nightly jobs, specifically) along with similar issues due to gcc 
> 4.8's incomplete handling of c++11. As long as someone depends on these we 
> should probably have an every-commit CI job which checks we haven't 
> introduced such a breakage





[jira] [Created] (ARROW-8397) [C++] Fail to compile aggregate_test.cc on Ubuntu 16.04

2020-04-10 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8397:
--

 Summary: [C++] Fail to compile aggregate_test.cc on Ubuntu 16.04
 Key: ARROW-8397
 URL: https://issues.apache.org/jira/browse/ARROW-8397
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


See build log 
https://app.circleci.com/pipelines/github/ursa-labs/crossbow/31122/workflows/b250d378-52a8-4d15-9909-96474fa38482/jobs/10840





[jira] [Updated] (ARROW-7965) [Python] Refine higher level dataset API

2020-04-09 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-7965:
---
Summary: [Python] Refine higher level dataset API  (was: [Python] Hold a 
reference to the dataset factory for later reuse)

> [Python] Refine higher level dataset API
> 
>
> Key: ARROW-7965
> URL: https://issues.apache.org/jira/browse/ARROW-7965
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Provide a more intuitive way to construct a nested dataset:
> ```python
> # instead of using the confusing factory function
> dataset([
>     factory("s3://old-taxi-data", format="parquet"),
>     factory("local/path/to/new/data", format="csv")
> ])
> # let the user construct a new dataset directly from dataset objects
> dataset([
>     dataset("s3://old-taxi-data", format="parquet"),
>     dataset("local/path/to/new/data", format="csv")
> ])
> ```
> In the future we might want to introduce a new Dataset class which wraps the 
> functionality of both the dataset factory and the materialized dataset, 
> enabling optimizations over rediscovery of already materialized datasets. 





[jira] [Commented] (ARROW-8359) [C++/Python] Enable aarch64/ppc64le build in conda recipes

2020-04-09 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17079598#comment-17079598
 ] 

Krisztian Szucs commented on ARROW-8359:


For travis we do, not yet for drone. Does drone provide a free plan?

> [C++/Python] Enable aarch64/ppc64le build in conda recipes
> --
>
> Key: ARROW-8359
> URL: https://issues.apache.org/jira/browse/ARROW-8359
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Packaging, Python
>Reporter: Uwe Korn
>Priority: Major
> Fix For: 0.17.0
>
>
> These two new arches were added in the conda recipes; we should also build 
> them as nightlies.





[jira] [Comment Edited] (ARROW-8359) [C++/Python] Enable aarch64/ppc64le build in conda recipes

2020-04-09 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079598#comment-17079598
 ] 

Krisztian Szucs edited comment on ARROW-8359 at 4/9/20, 5:36 PM:
-

[~uwe] For travis we do, not yet for drone. Does drone provide a free plan?


was (Author: kszucs):
For travis we do, not yet for drone. Does drone provide a free plan?

> [C++/Python] Enable aarch64/ppc64le build in conda recipes
> --
>
> Key: ARROW-8359
> URL: https://issues.apache.org/jira/browse/ARROW-8359
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Packaging, Python
>Reporter: Uwe Korn
>Priority: Major
> Fix For: 0.17.0
>
>
> These two new arches were added in the conda recipes; we should also build 
> them as nightlies.





[jira] [Assigned] (ARROW-7794) [Rust] cargo publish fails for arrow-flight due to relative path to Flight.proto

2020-04-09 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-7794:
--

Assignee: Andy Grove  (was: Krisztian Szucs)

> [Rust] cargo publish fails for arrow-flight due to relative path to 
> Flight.proto
> 
>
> Key: ARROW-7794
> URL: https://issues.apache.org/jira/browse/ARROW-7794
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust
>Affects Versions: 0.16.0
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Running "cargo publish" for the arrow-flight crate resulted in this error:
> {code:java}
> error: failed to run custom build command for `arrow-flight v0.16.0 (/home/andy/apache-arrow-0.16.0/rust/target/package/arrow-flight-0.16.0)`
>
> Caused by:
>   process didn't exit successfully: `/home/andy/apache-arrow-0.16.0/rust/target/package/arrow-flight-0.16.0/target/debug/build/arrow-flight-1b2906a3933d2832/build-script-build` (exit code: 1)
> --- stderr
> Error: Custom { kind: Other, error: "protoc failed: ../../format: warning: directory does not exist.\nCould not make proto path relative: ../../format/Flight.proto: No such file or directory\n" }
> {code}
> The workaround was to edit the build.rs and make the path absolute and then 
> run "cargo publish --allow-dirty", but we should find a better solution 
> before the next release.





[jira] [Assigned] (ARROW-7794) [Rust] cargo publish fails for arrow-flight due to relative path to Flight.proto

2020-04-09 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-7794:
--

Assignee: Krisztian Szucs  (was: Andy Grove)

> [Rust] cargo publish fails for arrow-flight due to relative path to 
> Flight.proto
> 
>
> Key: ARROW-7794
> URL: https://issues.apache.org/jira/browse/ARROW-7794
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust
>Affects Versions: 0.16.0
>Reporter: Andy Grove
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Running "cargo publish" for the arrow-flight crate resulted in this error:
> {code:java}
> error: failed to run custom build command for `arrow-flight v0.16.0 (/home/andy/apache-arrow-0.16.0/rust/target/package/arrow-flight-0.16.0)`
>
> Caused by:
>   process didn't exit successfully: `/home/andy/apache-arrow-0.16.0/rust/target/package/arrow-flight-0.16.0/target/debug/build/arrow-flight-1b2906a3933d2832/build-script-build` (exit code: 1)
> --- stderr
> Error: Custom { kind: Other, error: "protoc failed: ../../format: warning: directory does not exist.\nCould not make proto path relative: ../../format/Flight.proto: No such file or directory\n" }
> {code}
> The workaround was to edit the build.rs and make the path absolute and then 
> run "cargo publish --allow-dirty", but we should find a better solution 
> before the next release.





[jira] [Assigned] (ARROW-7256) [C++] Remove ARROW_MEMORY_POOL_DEFAULT option

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-7256:
--

Assignee: Krisztian Szucs  (was: Francois Saint-Jacques)

> [C++] Remove ARROW_MEMORY_POOL_DEFAULT option
> -
>
> Key: ARROW-7256
> URL: https://issues.apache.org/jira/browse/ARROW-7256
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As mentioned elsewhere in a JIRA I recall, we aren't adequately testing the 
> CMake option for "no default memory pool", so it would be better either to 
> require explicit memory pools or to pass the default, rather than having a 
> build-time option to set whether a default will be passed.





[jira] [Updated] (ARROW-7256) [C++] Remove ARROW_MEMORY_POOL_DEFAULT macro

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-7256:
---
Summary: [C++] Remove ARROW_MEMORY_POOL_DEFAULT macro  (was: [C++] Remove 
ARROW_MEMORY_POOL_DEFAULT option)

> [C++] Remove ARROW_MEMORY_POOL_DEFAULT macro
> 
>
> Key: ARROW-7256
> URL: https://issues.apache.org/jira/browse/ARROW-7256
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As mentioned elsewhere in a JIRA I recall, we aren't adequately testing the 
> CMake option for "no default memory pool", so it would be better either to 
> require explicit memory pools or to pass the default, rather than having a 
> build-time option to set whether a default will be passed.





[jira] [Resolved] (ARROW-7256) [C++] Remove ARROW_MEMORY_POOL_DEFAULT option

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-7256.

Fix Version/s: (was: 0.16.0)
   0.17.0
   Resolution: Fixed

Issue resolved by pull request 6877
[https://github.com/apache/arrow/pull/6877]

> [C++] Remove ARROW_MEMORY_POOL_DEFAULT option
> -
>
> Key: ARROW-7256
> URL: https://issues.apache.org/jira/browse/ARROW-7256
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Francois Saint-Jacques
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As mentioned elsewhere in a JIRA I recall, we aren't adequately testing the 
> CMake option for "no default memory pool", so it would be better either to 
> require explicit memory pools or to pass the default, rather than having a 
> build-time option to set whether a default will be passed.





[jira] [Resolved] (ARROW-8316) [CI] Set docker-compose to use docker-cli instead of docker-py for building images

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8316.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6802
[https://github.com/apache/arrow/pull/6802]

> [CI] Set docker-compose to use docker-cli instead of docker-py for building 
> images
> --
>
> Key: ARROW-8316
> URL: https://issues.apache.org/jira/browse/ARROW-8316
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The images pushed from the master branch were sometimes producing reusable 
> layers, sometimes not. So the caching was working non-deterministically. 
> The underlying issue is https://github.com/docker/compose/issues/883





[jira] [Reopened] (ARROW-7256) [C++] Remove ARROW_MEMORY_POOL_DEFAULT option

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reopened ARROW-7256:


> [C++] Remove ARROW_MEMORY_POOL_DEFAULT option
> -
>
> Key: ARROW-7256
> URL: https://issues.apache.org/jira/browse/ARROW-7256
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Francois Saint-Jacques
>Priority: Major
> Fix For: 0.16.0
>
>
> As mentioned elsewhere in a JIRA I recall, we aren't adequately testing the 
> CMake option for "no default memory pool", so it would be better either to 
> require explicit memory pools or to pass the default, rather than having a 
> build-time option to set whether a default will be passed.





[jira] [Resolved] (ARROW-8369) [CI] Fix crossbow wildcard groups

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8369.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6868
[https://github.com/apache/arrow/pull/6868]

> [CI] Fix crossbow wildcard groups
> -
>
> Key: ARROW-8369
> URL: https://issues.apache.org/jira/browse/ARROW-8369
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> This was broken in ARROW-8356





[jira] [Resolved] (ARROW-8371) [Crossbow] Implement and exercise sanity checks for tasks.yml

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8371.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6875
[https://github.com/apache/arrow/pull/6875]

> [Crossbow] Implement and exercise sanity checks for tasks.yml 
> --
>
> Key: ARROW-8371
> URL: https://issues.apache.org/jira/browse/ARROW-8371
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See conversation at 
> https://github.com/apache/arrow/pull/6868#issuecomment-610721717





[jira] [Updated] (ARROW-8335) [Release] Add crossbow jobs to run release verification

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8335:
---
Fix Version/s: (was: 0.17.0)
   1.0.0

> [Release] Add crossbow jobs to run release verification
> ---
>
> Key: ARROW-8335
> URL: https://issues.apache.org/jira/browse/ARROW-8335
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Workflow: edit version number and rc number in template in 
> {{dev/release/github.verify.yml}}, make PR, and do 
> * {{@github-actions crossbow submit -g verify-rc}} to run everything
> * {{@github-actions crossbow submit -g verify-rc-wheel|source|binary}} to run 
> those groups
> * Other groups at {{verify-rc-wheel|source-macos|ubuntu|windows}}, 
> {{verify-rc-source-cpp|csharp|java|etc.}}
> * Individual workflows at e.g. {{verify-rc-wheel-windows}}, 
> {{verify-rc-source-macos-csharp}}. We could break out the wheel verification 
> by python version (maybe we should), but that requires changes to the 
> verification scripts themselves.
> Running the main {{verify-rc}} group will put a ton of workflow svg badges on 
> the PR so we can see at a glance what is passing and failing. If things fail 
> when running everything, we can push fixes for the verification script to the 
> branch and retry just those that failed.





[jira] [Updated] (ARROW-8371) [Crossbow] Implement and exercise sanity checks for tasks.yml

2020-04-08 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8371:
---
Issue Type: Improvement  (was: Task)

> [Crossbow] Implement and exercise sanity checks for tasks.yml 
> --
>
> Key: ARROW-8371
> URL: https://issues.apache.org/jira/browse/ARROW-8371
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>
> See conversation at 
> https://github.com/apache/arrow/pull/6868#issuecomment-610721717





[jira] [Created] (ARROW-8371) [Crossbow] Implement and exercise sanity checks for tasks.yml

2020-04-08 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8371:
--

 Summary: [Crossbow] Implement and exercise sanity checks for 
tasks.yml 
 Key: ARROW-8371
 URL: https://issues.apache.org/jira/browse/ARROW-8371
 Project: Apache Arrow
  Issue Type: Task
  Components: Continuous Integration, Packaging
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


See conversation at 
https://github.com/apache/arrow/pull/6868#issuecomment-610721717





[jira] [Commented] (ARROW-5497) [Release] Build and publish R/Java/JS docs

2020-04-07 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17077661#comment-17077661
 ] 

Krisztian Szucs commented on ARROW-5497:


[~npr]

 

[https://arrow.apache.org/docs/r/]

[https://arrow.apache.org/docs/java/]

[https://arrow.apache.org/docs/js/]

The release process generates them, but I'm not sure that we have proper 
references to them.

> [Release] Build and publish R/Java/JS docs
> --
>
> Key: ARROW-5497
> URL: https://issues.apache.org/jira/browse/ARROW-5497
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools, Documentation, Java, JavaScript, R
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Major
>
> Edit: this ticket was originally just about adding the R package docs, but it 
> seems that the JS and Java docs aren't getting built as part of the release 
> process anymore, so that needs to be fixed.
>  
> Original description:
> https://issues.apache.org/jira/browse/ARROW-5452 added the R pkgdown site 
> config. Adding the wiring into the apidocs build scripts was deferred because 
> there was some discussion about which workflow was supported and which was 
> deprecated.  
> Uwe says: "Have a look at 
> [https://github.com/apache/arrow/blob/master/docs/Dockerfile] and 
> [https://github.com/apache/arrow/blob/master/ci/docker_build_sphinx.sh] Add 
> that and a docs-r entry in the main {{docker-compose.yml}} should be 
> sufficient to get it running in the docker setup. But actually I would rather 
> like to see that we also add the R build to the above linked files."





[jira] [Resolved] (ARROW-8362) [Crossbow] Ensure that the locally generated version is used in the docker tasks

2020-04-07 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8362.

Resolution: Fixed

Issue resolved by pull request 6862
[https://github.com/apache/arrow/pull/6862]

> [Crossbow] Ensure that the locally generated version is used in the docker 
> tasks
> 
>
> Key: ARROW-8362
> URL: https://issues.apache.org/jira/browse/ARROW-8362
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> An Arrow fork might not have the version tags, so the SCM-based version 
> generation can't work. 
> Pass the locally detected version to the docker builds.





[jira] [Created] (ARROW-8363) [Archery] Comment bot should report any errors happening during crossbow submit

2020-04-07 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8363:
--

 Summary: [Archery] Comment bot should report any errors happening 
during crossbow submit
 Key: ARROW-8363
 URL: https://issues.apache.org/jira/browse/ARROW-8363
 Project: Apache Arrow
  Issue Type: Task
  Components: Archery
Reporter: Krisztian Szucs


We already get feedback on the GitHub comment, but no error message. 

 

Example failure 
https://github.com/apache/arrow/runs/567644496?check_suite_focus=true#step:5:42





[jira] [Created] (ARROW-8362) [Crossbow] Ensure that the locally generated version is used in the docker tasks

2020-04-07 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8362:
--

 Summary: [Crossbow] Ensure that the locally generated version is 
used in the docker tasks
 Key: ARROW-8362
 URL: https://issues.apache.org/jira/browse/ARROW-8362
 Project: Apache Arrow
  Issue Type: Task
  Components: Packaging
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
 Fix For: 0.17.0


An Arrow fork might not have the version tags, so the SCM-based version 
generation can't work. 

Pass the locally detected version to the docker builds.
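
One way to pass a locally detected version into a container can be sketched 
as below. This is only an illustration under stated assumptions, not the 
change made for this issue: the helper name, image name and version string are 
hypothetical, and the sketch assumes the build inside the container derives 
its version with setuptools_scm, which reads the 
SETUPTOOLS_SCM_PRETEND_VERSION environment variable instead of inspecting git 
tags.

```python
def docker_run_command(image, version, command):
    """Build a `docker run` argument list that forwards a precomputed version.

    Hypothetical helper: `image`, `version` and `command` are supplied by the
    caller, which is assumed to have detected the version locally.
    """
    return [
        "docker", "run", "--rm",
        # setuptools_scm uses this variable instead of looking up git tags,
        # so a fork without version tags still gets a usable version.
        "-e", "SETUPTOOLS_SCM_PRETEND_VERSION={}".format(version),
        image,
    ] + list(command)


# Example with made-up values; the resulting list could be handed to
# subprocess.run() on a machine where docker is installed.
cmd = docker_run_command(
    "example/python-build", "0.17.0.dev123", ["python", "setup.py", "bdist_wheel"]
)
print(" ".join(cmd))
```

The version is computed once, outside the container, so the build does not 
depend on tags being present in the checked-out fork.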





[jira] [Commented] (ARROW-8149) [C++/Python] Enable CUDA Support in conda recipes

2020-04-06 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076757#comment-17076757
 ] 

Krisztian Szucs commented on ARROW-8149:


[~uwe] what is the status of it? I assume we can postpone it.

> [C++/Python] Enable CUDA Support in conda recipes
> -
>
> Key: ARROW-8149
> URL: https://issues.apache.org/jira/browse/ARROW-8149
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Packaging
>Reporter: Uwe Korn
>Priority: Major
> Fix For: 0.17.0
>
>
> See the changes in 
> [https://github.com/conda-forge/arrow-cpp-feedstock/pull/123], we need to 
> copy this into the Arrow repository and also test CUDA in these recipes.





[jira] [Assigned] (ARROW-8213) [Python][Dataset] Opening a dataset with a local incorrect path gives confusing error message

2020-04-06 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8213:
--

Assignee: Krisztian Szucs

> [Python][Dataset] Opening a dataset with a local incorrect path gives 
> confusing error message
> -
>
> Key: ARROW-8213
> URL: https://issues.apache.org/jira/browse/ARROW-8213
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++ - Dataset, Python
>Reporter: Joris Van den Bossche
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> Even after the previous PRs related to local paths 
> (https://github.com/apache/arrow/pull/6643, 
> https://github.com/apache/arrow/pull/6655), I don't think the user experience 
> is optimal when you are working with local files and pass a wrong, 
> non-existent path (e.g. due to a typo).
> Currently, you get this error:
> {code}
> >>> dataset = ds.dataset("data_with_typo.parquet", format="parquet")
> ...
> ArrowInvalid: URI has empty scheme: 'data_with_typo.parquet'
> {code}
> where "URI has empty scheme" is rather confusing for the user in case of a 
> non-existent path.  I think ideally we should raise a "No such file or 
> directory" error.
> I am not fully sure what the best solution is, as {{FileSystem.from_uri}} can 
> also give other errors that we do want to propagate to the user. 
> The most straightforward approach I can think of is checking whether "URI has 
> empty scheme" is in the error message and then rewording it, but that's not 
> very clean.
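
The message-matching approach described in the report could look roughly like 
the sketch below. It is a dependency-free illustration, not pyarrow's actual 
code: ArrowInvalid here is a local stand-in for pyarrow.lib.ArrowInvalid, and 
resolve_filesystem and fake_from_uri are hypothetical names, with 
fake_from_uri only mimicking the failure of FileSystem.from_uri shown above.

```python
import os


class ArrowInvalid(ValueError):
    """Stand-in for pyarrow.lib.ArrowInvalid so the sketch has no dependencies."""


def resolve_filesystem(path_or_uri, from_uri):
    """Call `from_uri`, rewording the confusing error for missing local paths.

    `from_uri` is any callable playing the role of FileSystem.from_uri.
    """
    try:
        return from_uri(path_or_uri)
    except ArrowInvalid as exc:
        # Only reword the one message discussed in the report, and only when
        # the path really does not exist; every other error propagates as-is.
        if "URI has empty scheme" in str(exc) and not os.path.exists(path_or_uri):
            raise FileNotFoundError(
                "No such file or directory: {!r}".format(path_or_uri)
            ) from exc
        raise


def fake_from_uri(uri):
    # Hypothetical stand-in that reproduces the error text from the report.
    if "://" not in uri:
        raise ArrowInvalid("URI has empty scheme: '{}'".format(uri))
    return uri
```

Matching on the message text is brittle, which is the concern raised in the 
report; the extra os.path.exists check at least keeps the rewording limited to 
paths that are really missing.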





[jira] [Created] (ARROW-8355) [Python] Reduce the number of pandas dependent test cases in test_feather

2020-04-06 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8355:
--

 Summary: [Python] Reduce the number of pandas dependent test cases 
in test_feather
 Key: ARROW-8355
 URL: https://issues.apache.org/jira/browse/ARROW-8355
 Project: Apache Arrow
  Issue Type: Task
  Components: Python
Reporter: Krisztian Szucs
 Fix For: 1.0.0


See comment https://github.com/apache/arrow/pull/6849#discussion_r404160096





[jira] [Commented] (ARROW-8342) [Python] dask and kartothek integration tests are failing

2020-04-06 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076730#comment-17076730
 ] 

Krisztian Szucs commented on ARROW-8342:


[~wesm] you might want to disallow duplicate keys in the KeyValueMetadata 
constructor by initializing a dict from args and kwargs
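
A minimal sketch of that idea, assuming a hypothetical build_metadata helper 
rather than the real KeyValueMetadata constructor: passing the arguments 
through Python's built-in dict means each key can appear only once, with the 
last value given for a key winning.

```python
def build_metadata(*args, **kwargs):
    """Collect key/value pairs into a dict so each key appears at most once.

    Hypothetical helper, not pyarrow API. Accepts the same arguments as the
    built-in dict(): an optional mapping or iterable of pairs, plus keywords.
    """
    # dict() keeps one entry per key; a later value for the same key
    # overwrites the earlier one.
    return dict(*args, **kwargs)


meta = build_metadata([("pandas", "old"), ("pandas", "new")], other="x")
print(meta)  # {'pandas': 'new', 'other': 'x'}
```

Note that this collapses duplicates silently instead of raising an error; 
rejecting them outright would need an explicit check before building the dict.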

> [Python] dask and kartothek integration tests are failing
> -
>
> Key: ARROW-8342
> URL: https://issues.apache.org/jira/browse/ARROW-8342
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Wes McKinney
>Priority: Blocker
> Fix For: 0.17.0
>
>
> The integration tests for both dask and kartothek, and for both master and 
> the latest released version of each, started failing in the last few days.
> Dask latest: 
> https://circleci.com/gh/ursa-labs/crossbow/10629?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
>
> Kartothek latest: 
> https://circleci.com/gh/ursa-labs/crossbow/10604?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
> I think both are related to the KeyValueMetadata changes (ARROW-8079).
> The kartothek one is clearly related, as it gives: TypeError: 
> 'pyarrow.lib.KeyValueMetadata' object does not support item assignment
> And I think the dask one is related to the "pandas" key now being present 
> twice, and therefore it is using the "wrong" one.





[jira] [Resolved] (ARROW-7222) [Python][Release] Wipe any existing generated Python API documentation when updating website

2020-04-06 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-7222.

Resolution: Fixed

> [Python][Release] Wipe any existing generated Python API documentation when 
> updating website
> 
>
> Key: ARROW-7222
> URL: https://issues.apache.org/jira/browse/ARROW-7222
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> Removed APIs are persisting in Google searches, e.g.
> https://arrow.apache.org/docs/python/generated/pyarrow.Column.html





[jira] [Commented] (ARROW-7222) [Python][Release] Wipe any existing generated Python API documentation when updating website

2020-04-06 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076443#comment-17076443
 ] 

Krisztian Szucs commented on ARROW-7222:


The wiping is part of the post release script for docs 
[https://github.com/apache/arrow/blob/master/dev/release/post-09-docs.sh#L38]

We can close it, but in the future we want to keep more versions available 
(including one for master).

> [Python][Release] Wipe any existing generated Python API documentation when 
> updating website
> 
>
> Key: ARROW-7222
> URL: https://issues.apache.org/jira/browse/ARROW-7222
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.17.0
>
>
> Removed APIs are persisting in Google searches, e.g.
> https://arrow.apache.org/docs/python/generated/pyarrow.Column.html





[jira] [Assigned] (ARROW-7222) [Python][Release] Wipe any existing generated Python API documentation when updating website

2020-04-06 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-7222:
--

Assignee: Krisztian Szucs

> [Python][Release] Wipe any existing generated Python API documentation when 
> updating website
> 
>
> Key: ARROW-7222
> URL: https://issues.apache.org/jira/browse/ARROW-7222
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> Removed APIs are persisting in Google searches, e.g.
> https://arrow.apache.org/docs/python/generated/pyarrow.Column.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-2910) [Packaging] Build from official apache archive

2020-04-04 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075227#comment-17075227
 ] 

Krisztian Szucs commented on ARROW-2910:


Still valid: it would be nice to use the official source tarball, but the 
scripts are currently wired to use the repository.

> [Packaging] Build from official apache archive
> --
>
> Key: ARROW-2910
> URL: https://issues.apache.org/jira/browse/ARROW-2910
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>






[jira] [Closed] (ARROW-1271) [Packaging] Build scripts for creating nightly conda-forge-compatible package builds

2020-04-04 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs closed ARROW-1271.
--
Resolution: Duplicate

> [Packaging] Build scripts for creating nightly conda-forge-compatible package 
> builds
> 
>
> Key: ARROW-1271
> URL: https://issues.apache.org/jira/browse/ARROW-1271
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
>
> cc [~cpcloud]





[jira] [Commented] (ARROW-1271) [Packaging] Build scripts for creating nightly conda-forge-compatible package builds

2020-04-04 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075224#comment-17075224
 ] 

Krisztian Szucs commented on ARROW-1271:


Yes, it is resolved by 
https://github.com/apache/arrow/commit/a718b6c21084d6027b1a5ad4e921e34cce106d8c

> [Packaging] Build scripts for creating nightly conda-forge-compatible package 
> builds
> 
>
> Key: ARROW-1271
> URL: https://issues.apache.org/jira/browse/ARROW-1271
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
>
> cc [~cpcloud]





[jira] [Commented] (ARROW-1299) [Doc] Publish nightly documentation against master somewhere

2020-04-04 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075223#comment-17075223
 ] 

Krisztian Szucs commented on ARROW-1299:


My idea is to update arrow-site to host the documentation for the last three 
releases and one for master, which could be updated regularly by a GitHub 
Actions cron job.

> [Doc] Publish nightly documentation against master somewhere
> 
>
> Key: ARROW-1299
> URL: https://issues.apache.org/jira/browse/ARROW-1299
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Documentation
>Reporter: Wes McKinney
>Priority: Major
>
> This will help catch problems with the generated documentation prior to 
> release time, and also allow users to read the latest prose documentation.





[jira] [Resolved] (ARROW-1582) [Python] Set up + document nightly conda builds for macOS

2020-04-04 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-1582.

Fix Version/s: 0.17.0
 Assignee: Krisztian Szucs
   Resolution: Fixed

> [Python] Set up + document nightly conda builds for macOS
> -
>
> Key: ARROW-1582
> URL: https://issues.apache.org/jira/browse/ARROW-1582
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: nightly
> Fix For: 0.17.0
>
>
> It's already been great to be able to test the nightlies on Linux in conda; 
> it would be great to be able to do the same on macOS





[jira] [Commented] (ARROW-1582) [Python] Set up + document nightly conda builds for macOS

2020-04-04 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17075221#comment-17075221
 ] 

Krisztian Szucs commented on ARROW-1582:


Yes, resolved by 
https://github.com/apache/arrow/commit/b6842982f60c8af52da42bf7aebe278089514df4

> [Python] Set up + document nightly conda builds for macOS
> -
>
> Key: ARROW-1582
> URL: https://issues.apache.org/jira/browse/ARROW-1582
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
>  Labels: nightly
>
> It's already been great to be able to test the nightlies on Linux in conda; 
> it would be great to be able to do the same on macOS





[jira] [Created] (ARROW-8330) [Documentation] The post release script generates the documentation with a development version

2020-04-03 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8330:
--

 Summary: [Documentation] The post release script generates the 
documentation with a development version
 Key: ARROW-8330
 URL: https://issues.apache.org/jira/browse/ARROW-8330
 Project: Apache Arrow
  Issue Type: Task
  Components: Documentation
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
 Fix For: 0.17.0


See the current documentation page. Also regenerate the github page.





[jira] [Created] (ARROW-8329) [Documentation][C++] Undocumented FilterOptions argument in Filter kernel

2020-04-03 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8329:
--

 Summary: [Documentation][C++] Undocumented FilterOptions argument 
in Filter kernel
 Key: ARROW-8329
 URL: https://issues.apache.org/jira/browse/ARROW-8329
 Project: Apache Arrow
  Issue Type: Task
  Components: C++, Documentation
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
 Fix For: 0.17.0


The documentation build fails, see 
https://github.com/apache/arrow/runs/558617620#step:6:1186





[jira] [Resolved] (ARROW-8323) [C++] Pin gRPC at v1.27 to avoid compilation error in its headers

2020-04-03 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8323.

Resolution: Fixed

Issue resolved by pull request 6820
[https://github.com/apache/arrow/pull/6820]

> [C++] Pin gRPC at v1.27 to avoid compilation error in its headers
> -
>
> Key: ARROW-8323
> URL: https://issues.apache.org/jira/browse/ARROW-8323
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.16.0
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> [gRPC 1.28|https://github.com/grpc/grpc/releases/tag/v1.28.0] includes a 
> change which introduces an implicit size_t->int conversion in proto_utils.h: 
> https://github.com/grpc/grpc/commit/2748755a4ff9ed940356e78c105f55f839fdf38b
> Conversion warnings are treated as errors for example here: 
> https://ci.appveyor.com/project/BenjaminKietzman/arrow/build/job/9cl0vqa8e495knn3#L1126
> So IIUC we need to pin gRPC to 1.27 for now.
> Upstream PR: https://github.com/grpc/grpc/pull/22557





[jira] [Resolved] (ARROW-8321) [CI] Use bundled thrift in Fedora 30 build

2020-04-03 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8321.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6819
[https://github.com/apache/arrow/pull/6819]

> [CI] Use bundled thrift in Fedora 30 build
> --
>
> Key: ARROW-8321
> URL: https://issues.apache.org/jira/browse/ARROW-8321
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration
>Affects Versions: 0.17.0
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> After unsetting Thrift_SOURCE from AUTO, it surfaced that the Thrift available 
> on Fedora 30 (0.10) is older than the minimum required version (0.11).
> Build thrift_ep instead.





[jira] [Resolved] (ARROW-8326) [C++] Don't use deprecated TYPED_TEST_CASE

2020-04-03 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8326.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6823
[https://github.com/apache/arrow/pull/6823]

> [C++] Don't use deprecated TYPED_TEST_CASE
> --
>
> Key: ARROW-8326
> URL: https://issues.apache.org/jira/browse/ARROW-8326
> Project: Apache Arrow
>  Issue Type: Test
>  Components: C++
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (ARROW-8322) [CI] Fix C# workflow file syntax

2020-04-02 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8322:
--

Assignee: Krisztian Szucs

> [CI] Fix C# workflow file syntax
> 
>
> Key: ARROW-8322
> URL: https://issues.apache.org/jira/browse/ARROW-8322
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The GitHub Actions expression requires the enclosing "${{ }}".





[jira] [Resolved] (ARROW-8322) [CI] Fix C# workflow file syntax

2020-04-02 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8322.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6815
[https://github.com/apache/arrow/pull/6815]

> [CI] Fix C# workflow file syntax
> 
>
> Key: ARROW-8322
> URL: https://issues.apache.org/jira/browse/ARROW-8322
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Continuous Integration
>Reporter: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The GitHub Actions expression requires the enclosing "${{ }}".





[jira] [Created] (ARROW-8322) [CI] Fix C# workflow file syntax

2020-04-02 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8322:
--

 Summary: [CI] Fix C# workflow file syntax
 Key: ARROW-8322
 URL: https://issues.apache.org/jira/browse/ARROW-8322
 Project: Apache Arrow
  Issue Type: Task
  Components: Continuous Integration
Reporter: Krisztian Szucs


The GitHub Actions expression requires the enclosing "${{ }}".





[jira] [Created] (ARROW-8321) [CI] Use bundled thrift in Fedora 30 build

2020-04-02 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8321:
--

 Summary: [CI] Use bundled thrift in Fedora 30 build
 Key: ARROW-8321
 URL: https://issues.apache.org/jira/browse/ARROW-8321
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Affects Versions: 0.17.0
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


After unsetting Thrift_SOURCE from AUTO, it surfaced that the Thrift available 
on Fedora 30 (0.10) is older than the minimum required version (0.11).

Build thrift_ep instead.





[jira] [Created] (ARROW-8319) [CI] Install thrift compiler in the debian build

2020-04-02 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8319:
--

 Summary: [CI] Install thrift compiler in the debian build
 Key: ARROW-8319
 URL: https://issues.apache.org/jira/browse/ARROW-8319
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
 Fix For: 0.17.0


CMake is missing thrift compiler after setting Thrift_SOURCE to empty from 
AUTO, 
see build: 
https://github.com/apache/arrow/runs/555631125?check_suite_focus=true#step:6:143





[jira] [Created] (ARROW-8316) [CI] Set docker-compose to use docker-cli instead of docker-py for building images

2020-04-02 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8316:
--

 Summary: [CI] Set docker-compose to use docker-cli instead of 
docker-py for building images
 Key: ARROW-8316
 URL: https://issues.apache.org/jira/browse/ARROW-8316
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Reporter: Krisztian Szucs


The images pushed from the master branch were sometimes producing reusable 
layers, sometimes not. So the caching was working non-deterministically. 
The underlying issue is https://github.com/docker/compose/issues/883









[jira] [Assigned] (ARROW-8315) [Python][Dataset] Don't rely on ordered dict keys in test_dataset.py

2020-04-02 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8315:
--

Assignee: Krisztian Szucs

> [Python][Dataset] Don't rely on ordered dict keys in test_dataset.py
> 
>
> Key: ARROW-8315
> URL: https://issues.apache.org/jira/browse/ARROW-8315
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.16.0
>Reporter: Ben Kietzman
>Assignee: Krisztian Szucs
>Priority: Minor
>  Labels: dataset
> Fix For: 0.17.0
>
>
> Python 3.5 does not guarantee insertion order of dict keys, so we can't rely 
> on it when constructing tables in test_dataset.py
> https://github.com/apache/arrow/pull/6809/checks?check_run_id=554945477#step:6:2166
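
One way to avoid depending on dict ordering is to keep the column names and the column data in explicitly ordered containers. The sketch below is only an illustration under that assumption: the column names and values are made up, and it is not taken from test_dataset.py. The resulting lists could then be handed to something like pa.Table.from_arrays(arrays, names=names) after converting each list with pa.array.

```python
from collections import OrderedDict

# Hypothetical column data; the names and values are illustrative only.
columns = OrderedDict([
    ("i64", [0, 1, 2]),
    ("f64", [0.0, 1.0, 2.0]),
    ("str", ["a", "b", "c"]),
])

# Split into parallel lists so the column order is explicit and does not
# depend on how a plain dict happens to iterate on a given interpreter.
names = list(columns.keys())
arrays = list(columns.values())
```

OrderedDict preserves insertion order on every supported Python version, so the same column order comes out on Python 3.5 as on later releases.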





[jira] [Assigned] (ARROW-8079) [Python] Implement a wrapper for KeyValueMetadata, duck-typing dict where relevant

2020-03-31 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8079:
--

Assignee: Krisztian Szucs

> [Python] Implement a wrapper for KeyValueMetadata, duck-typing dict where 
> relevant
> --
>
> Key: ARROW-8079
> URL: https://issues.apache.org/jira/browse/ARROW-8079
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> Per mailing list discussion, it may be better to not return the metadata 
> always as a dict and instead wrap the KeyValueMetadata methods. We can make 
> {{__getitem__}} look up a key in it, of course.
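
A minimal sketch of what such a dict-like wrapper could look like in pure Python. The class name MetadataView and its constructor are hypothetical and are not pyarrow API. The sketch assumes keys and values are stored as byte strings and that a lookup returns the first matching key.

```python
from collections.abc import Mapping


class MetadataView(Mapping):
    """Hypothetical read-only, dict-like view over key/value metadata."""

    def __init__(self, pairs):
        # Store the pairs as (key, value) byte strings, keeping their order.
        self._pairs = [(self._to_bytes(k), self._to_bytes(v)) for k, v in pairs]

    @staticmethod
    def _to_bytes(value):
        # Accept str for convenience; everything is compared as bytes.
        return value.encode("utf8") if isinstance(value, str) else bytes(value)

    def __getitem__(self, key):
        # Linear scan that returns the value of the first matching key.
        key = self._to_bytes(key)
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

    def __iter__(self):
        # Yield each distinct key once, in first-seen order.
        seen = set()
        for k, _ in self._pairs:
            if k not in seen:
                seen.add(k)
                yield k

    def __len__(self):
        return len({k for k, _ in self._pairs})


meta = MetadataView([("pandas", "{}"), (b"origin", b"test")])
```

Deriving from collections.abc.Mapping supplies get, keys, items, values and the in operator on top of the three methods defined here, which is what lets the object duck-type a dict without being one.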





[jira] [Assigned] (ARROW-8221) [Python][Dataset] Expose schema inference / validation options in the factory

2020-03-31 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8221:
--

Assignee: Joris Van den Bossche  (was: Krisztian Szucs)

> [Python][Dataset] Expose schema inference / validation options in the factory
> -
>
> Key: ARROW-8221
> URL: https://issues.apache.org/jira/browse/ARROW-8221
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Joris Van den Bossche
>Priority: Major
> Fix For: 0.17.0
>
>
> ARROW-8058 added options related to schema inference / validation for the 
> Dataset factory. We should expose this in Python in the {{dataset(..)}} 
> factory function:
> - Add ability to pass a user-specified schema with a {{schema}} keyword, 
> instead of inferring the schema from (one of) the files (to be passed to the 
> factory finish method)
> - Add {{validate_schema}} option to toggle whether the schema is validated 
> against the actual files or not.
> - Expose in some way the number of fragments to be inspected when inferring 
> or validating the schema. Not sure yet what the best API for this would be. 





[jira] [Created] (ARROW-8291) [Packaging] Conda nightly builds can't locate Numpy

2020-03-31 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8291:
--

 Summary: [Packaging] Conda nightly builds can't locate Numpy
 Key: ARROW-8291
 URL: https://issues.apache.org/jira/browse/ARROW-8291
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Krisztian Szucs
 Fix For: 0.17.0


See build error 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-03-30-1-azure-conda-linux-gcc-py36





[jira] [Updated] (ARROW-8185) [Packaging] Document the available nightly wheels and conda packages

2020-03-31 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8185:
---
Summary: [Packaging] Document the available nightly wheels and conda 
packages  (was: [Packaging] Document the available nightly wheels, conda and R 
packages under the development section)

> [Packaging] Document the available nightly wheels and conda packages
> 
>
> Key: ARROW-8185
> URL: https://issues.apache.org/jira/browse/ARROW-8185
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> The packaging scripts are uploading the artifacts to package manager specific 
> hosting services like Anaconda and Gemfury. We should document this in a form 
> which conforms to the [ASF 
> Policy|https://www.apache.org/dev/release-distribution.html#unreleased].
> For more information see the conversation at 
> https://github.com/apache/arrow/pull/6669#issuecomment-601947006





[jira] [Resolved] (ARROW-8271) [Packaging] Allow wheel upload failures to gemfury

2020-03-31 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8271.

Fix Version/s: 0.17.0
   Resolution: Fixed

Issue resolved by pull request 6761
[https://github.com/apache/arrow/pull/6761]

> [Packaging] Allow wheel upload failures to gemfury
> --
>
> Key: ARROW-8271
> URL: https://issues.apache.org/jira/browse/ARROW-8271
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging, Python
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If we run multiple nightly/scheduled jobs per day for the same arrow commit 
> then gemfury's API will refuse the upload because of conflicting versions, 
> see 
> [build|https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=9053=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=b525c197-f769-5e52-d38a-e6301f5260f2=27].
> Sadly, gemfury doesn't have a force-update-like parameter.





[jira] [Closed] (ARROW-7870) [CI][Packaging] Host nightly wheels on Apache bintray

2020-03-30 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs closed ARROW-7870.
--
Fix Version/s: (was: 0.17.0)
   Resolution: Duplicate

> [CI][Packaging] Host nightly wheels on Apache bintray
> -
>
> Key: ARROW-7870
> URL: https://issues.apache.org/jira/browse/ARROW-7870
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging, Python
>Reporter: Neal Richardson
>Assignee: Kouhei Sutou
>Priority: Major
>
> See 
> https://lists.apache.org/thread.html/r86c46849d8fe77de12821834b12330f0f77c3e7d7d4e6302c9f634d3%40%3Cdev.arrow.apache.org%3E
> Investigate whether bintray is a good alternative, and if we use it, add a 
> note to our website about nightly builds.





[jira] [Commented] (ARROW-7870) [CI][Packaging] Host nightly wheels on Apache bintray

2020-03-30 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071129#comment-17071129
 ] 

Krisztian Szucs commented on ARROW-7870:


Superseded by https://issues.apache.org/jira/browse/ARROW-8165

> [CI][Packaging] Host nightly wheels on Apache bintray
> -
>
> Key: ARROW-7870
> URL: https://issues.apache.org/jira/browse/ARROW-7870
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging, Python
>Reporter: Neal Richardson
>Assignee: Kouhei Sutou
>Priority: Major
> Fix For: 0.17.0
>
>
> See 
> https://lists.apache.org/thread.html/r86c46849d8fe77de12821834b12330f0f77c3e7d7d4e6302c9f634d3%40%3Cdev.arrow.apache.org%3E
> Investigate whether bintray is a good alternative, and if we use it, add a 
> note to our website about nightly builds.





[jira] [Assigned] (ARROW-8272) [CI][Python] Test failure on Ubuntu 16.04

2020-03-30 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8272:
--

Assignee: Krisztian Szucs

> [CI][Python] Test failure on Ubuntu 16.04
> -
>
> Key: ARROW-8272
> URL: https://issues.apache.org/jira/browse/ARROW-8272
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration, Python
>Reporter: Antoine Pitrou
>Assignee: Krisztian Szucs
>Priority: Critical
>
> See https://github.com/pitrou/arrow/runs/545291564





[jira] [Resolved] (ARROW-8220) [Python] Make dataset FileFormat objects serializable

2020-03-30 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8220.

Resolution: Fixed

Issue resolved by pull request 6720
[https://github.com/apache/arrow/pull/6720]

> [Python] Make dataset FileFormat objects serializable
> -
>
> Key: ARROW-8220
> URL: https://issues.apache.org/jira/browse/ARROW-8220
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Similar to ARROW-8060, ARROW-8059, also the FileFormats need to be pickleable.





[jira] [Updated] (ARROW-8271) [Packaging] Allow wheel upload failures to gemfury

2020-03-30 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8271:
---
Description: 
If we run multiple nightly/scheduled jobs per day for the same arrow commit 
then gemfury's API will refuse the upload because of conflicting versions, see 
[build|https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=9053=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=b525c197-f769-5e52-d38a-e6301f5260f2=27].

Sadly, gemfury doesn't have a force-update-like parameter.

  was:If we run multiple nightly/scheduled jobs per day for the same arrow 
commit then gemfury's API will refuse the upload because of conflicting 
versions, see 
[build|https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=9053=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=b525c197-f769-5e52-d38a-e6301f5260f2=27].


> [Packaging] Allow wheel upload failures to gemfury
> --
>
> Key: ARROW-8271
> URL: https://issues.apache.org/jira/browse/ARROW-8271
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging, Python
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>
> If we run multiple nightly/scheduled jobs per day for the same arrow commit 
> then gemfury's API will refuse the upload because of conflicting versions, 
> see 
> [build|https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=9053=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=b525c197-f769-5e52-d38a-e6301f5260f2=27].
> Sadly, gemfury doesn't have a force-update-like parameter.





[jira] [Created] (ARROW-8271) [Packaging] Allow wheel upload failures to gemfury

2020-03-30 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8271:
--

 Summary: [Packaging] Allow wheel upload failures to gemfury
 Key: ARROW-8271
 URL: https://issues.apache.org/jira/browse/ARROW-8271
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging, Python
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


If we run multiple nightly/scheduled jobs per day for the same arrow commit 
then gemfury's API will refuse the upload because of conflicting versions, see 
[build|https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=9053=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=b525c197-f769-5e52-d38a-e6301f5260f2=27].





[jira] [Updated] (ARROW-8242) [C++] Flight fails to compile on GCC 4.8

2020-03-27 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8242:
---
Priority: Blocker  (was: Major)

> [C++] Flight fails to compile on GCC 4.8
> 
>
> Key: ARROW-8242
> URL: https://issues.apache.org/jira/browse/ARROW-8242
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> See recent build log 
> https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=8944=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=5b4cc83a-7bb0-5664-5bb1-588f7e4dc05b=2186





[jira] [Updated] (ARROW-8242) [C++] Flight fails to compile on GCC 4.8

2020-03-27 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8242:
---
Fix Version/s: 0.17.0

> [C++] Flight fails to compile on GCC 4.8
> 
>
> Key: ARROW-8242
> URL: https://issues.apache.org/jira/browse/ARROW-8242
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> See recent build log 
> https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=8944=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=5b4cc83a-7bb0-5664-5bb1-588f7e4dc05b=2186





[jira] [Updated] (ARROW-8242) [C++] Flight fails to compile on GCC 4.8

2020-03-27 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8242:
---
Summary: [C++] Flight fails to compile on GCC 4.8  (was: [C++] GCC 4.8 
fails to compileFlight)

> [C++] Flight fails to compile on GCC 4.8
> 
>
> Key: ARROW-8242
> URL: https://issues.apache.org/jira/browse/ARROW-8242
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>
> See recent build log 
> https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=8944=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=5b4cc83a-7bb0-5664-5bb1-588f7e4dc05b=2186



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8242) [C++] GCC 4.8 fails to compileFlight

2020-03-27 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8242:
---
Summary: [C++] GCC 4.8 fails to compileFlight  (was: [C++] GCC 4.8 fails to 
compile Flight)

> [C++] GCC 4.8 fails to compileFlight
> 
>
> Key: ARROW-8242
> URL: https://issues.apache.org/jira/browse/ARROW-8242
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>
> See recent build log 
> https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=8944=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=5b4cc83a-7bb0-5664-5bb1-588f7e4dc05b=2186





[jira] [Created] (ARROW-8242) [C++] GCC 4.8 fails to compile Flight

2020-03-27 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8242:
--

 Summary: [C++] GCC 4.8 fails to compile Flight
 Key: ARROW-8242
 URL: https://issues.apache.org/jira/browse/ARROW-8242
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


See recent build log 
https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=8944=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=5b4cc83a-7bb0-5664-5bb1-588f7e4dc05b=2186





[jira] [Updated] (ARROW-8070) [C++] Cast segfaults on unsupported cast from list to utf8

2020-03-27 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8070:
---
Summary: [C++] Cast segfaults on unsupported cast from list to utf8 
 (was: [Python] Array.cast segfaults on unsupported cast from list to 
utf8)

> [C++] Cast segfaults on unsupported cast from list to utf8
> --
>
> Key: ARROW-8070
> URL: https://issues.apache.org/jira/browse/ARROW-8070
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Daniel Nugent
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> Was messing around with some nested arrays and found a pretty easy to 
> reproduce segfault:
> {code:java}
> Python 3.7.6 | packaged by conda-forge | (default, Jan  7 2020, 22:33:48)
> [GCC 7.3.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np, pyarrow as pa
> >>> pa.__version__
> '0.16.0'
> >>> np.__version__
> '1.18.1'
> >>> x=[np.array([b'a',b'b'])]
> >>> a = pa.array(x,pa.list_(pa.binary()))
> >>> a
> 
> [
>   [
> 61,
> 62
>   ]
> ]
> >>> a.cast(pa.string())
> Segmentation fault
> {code}
> I don't know if that cast makes sense, but I left the checks on, so I would 
> not expect a segfault from it.





[jira] [Updated] (ARROW-8070) [C++] Cast segfaults on unsupported cast from list to utf8

2020-03-27 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8070:
---
Component/s: (was: Python)
 C++

> [C++] Cast segfaults on unsupported cast from list to utf8
> --
>
> Key: ARROW-8070
> URL: https://issues.apache.org/jira/browse/ARROW-8070
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Daniel Nugent
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> Was messing around with some nested arrays and found a pretty easy to 
> reproduce segfault:
> {code:java}
> Python 3.7.6 | packaged by conda-forge | (default, Jan  7 2020, 22:33:48)
> [GCC 7.3.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np, pyarrow as pa
> >>> pa.__version__
> '0.16.0'
> >>> np.__version__
> '1.18.1'
> >>> x=[np.array([b'a',b'b'])]
> >>> a = pa.array(x,pa.list_(pa.binary()))
> >>> a
> 
> [
>   [
> 61,
> 62
>   ]
> ]
> >>> a.cast(pa.string())
> Segmentation fault
> {code}
> I don't know if that cast makes sense, but I left the checks on, so I would 
> not expect a segfault from it.





[jira] [Resolved] (ARROW-8184) [Packaging] Use arrow-nightlies organization name on Anaconda and Gemfury to host the nightlies

2020-03-26 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8184.

Resolution: Fixed

Issue resolved by pull request 6717
[https://github.com/apache/arrow/pull/6717]

> [Packaging] Use arrow-nightlies organization name on Anaconda and Gemfury to 
> host the nightlies
> ---
>
> Key: ARROW-8184
> URL: https://issues.apache.org/jira/browse/ARROW-8184
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently I've set up the scripts to use Ursa Labs's accounts, but we should 
> prefer a more neutral org.





[jira] [Assigned] (ARROW-8220) [Python] Make dataset FileFormat objects serializable

2020-03-25 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-8220:
--

Assignee: Krisztian Szucs

> [Python] Make dataset FileFormat objects serializable
> -
>
> Key: ARROW-8220
> URL: https://issues.apache.org/jira/browse/ARROW-8220
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> Similar to ARROW-8060, ARROW-8059, also the FileFormats need to be pickleable.





[jira] [Resolved] (ARROW-8060) [Python] Make dataset Expression objects serializable

2020-03-25 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8060.

Resolution: Fixed

Issue resolved by pull request 6702
[https://github.com/apache/arrow/pull/6702]

> [Python] Make dataset Expression objects serializable
> -
>
> Key: ARROW-8060
> URL: https://issues.apache.org/jira/browse/ARROW-8060
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Joris Van den Bossche
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> It would be good to be able to pickle pyarrow.dataset.Expression objects (e.g. 
> for use in dask.distributed).
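
For context, one common way to make a Python object picklable is to define __reduce__ so that pickle can rebuild it from its constructor arguments. The sketch below shows that general mechanism only. FieldEquals is a hypothetical stand-in, not a pyarrow class, and nothing here is meant to describe how the fix was actually implemented.

```python
import pickle


class FieldEquals:
    """Hypothetical stand-in for an expression node; not a pyarrow class."""

    def __init__(self, field, value):
        self.field = field
        self.value = value

    def __reduce__(self):
        # Tell pickle how to rebuild the object: call the class again with
        # the same constructor arguments on the receiving side.
        return (FieldEquals, (self.field, self.value))

    def __eq__(self, other):
        return (
            isinstance(other, FieldEquals)
            and (self.field, self.value) == (other.field, other.value)
        )


expr = FieldEquals("year", 2020)
restored = pickle.loads(pickle.dumps(expr))
```

A round trip through pickle.dumps and pickle.loads like this is also a convenient shape for a unit test: the restored object should compare equal to the original without being the same object.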





[jira] [Resolved] (ARROW-7755) [Python] Windows wheel cannot be installed on Python 3.8

2020-03-25 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-7755.

Resolution: Fixed

> [Python] Windows wheel cannot be installed on Python 3.8
> 
>
> Key: ARROW-7755
> URL: https://issues.apache.org/jira/browse/ARROW-7755
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Critical
> Fix For: 0.17.0
>
>
> {code}
> λ pip install 
> C:\tmp\arrow-verify-release-wheels\pyarrow-0.16.0-cp38-cp38m-win_amd64.whl 
> ERROR: pyarrow-0.16.0-cp38-cp38m-win_amd64.whl is not a supported wheel on 
> this platform.
> {code}
> The wheel came from
> https://bintray.com/apache/arrow/download_file?file_path=python-rc%2F0.16.0-rc2%2Fpyarrow-0.16.0-cp38-cp38m-win_amd64.whl
> The "m" ABI tag appears to have been removed in Python 3.8
> https://github.com/pypa/setuptools/pull/1822
> Locally I have pip 20.0.2, wheel 0.34.1, and setuptools 45.1.0.post20200127
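
The mismatch described above can be detected from the file name alone. The sketch below, which uses only the standard library, splits a wheel file name into its tags and flags a CPython 3.8 or later wheel that still carries the old "m" ABI suffix. The function names are hypothetical, and the parsing assumes the usual `name-version[-build]-python-abi-platform.whl` layout.

```python
def split_wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: " + filename)
    return tuple(parts[-3:])


def has_stale_m_flag(filename):
    """True if a CPython >= 3.8 wheel still carries the old "m" ABI suffix."""
    python_tag, abi_tag, _ = split_wheel_tags(filename)
    digits = python_tag[2:]
    if not python_tag.startswith("cp") or len(digits) < 2 or not digits.isdigit():
        return False
    # "cp38" -> (3, 8); "cp310" -> (3, 10)
    major, minor = int(digits[0]), int(digits[1:])
    return (major, minor) >= (3, 8) and abi_tag == python_tag + "m"


print(has_stale_m_flag("pyarrow-0.16.0-cp38-cp38m-win_amd64.whl"))  # True
print(has_stale_m_flag("pyarrow-0.16.0-cp37-cp37m-win_amd64.whl"))  # False
```

A check like this could run before upload, so that a wheel tagged `cp38m` is caught at build time rather than at install time.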





[jira] [Commented] (ARROW-7755) [Python] Windows wheel cannot be installed on Python 3.8

2020-03-25 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066938#comment-17066938
 ] 

Krisztian Szucs commented on ARROW-7755:


I've removed the dirty {{pyarrow-0.16.0-cp38-cp38m-win_amd64.*}} files from the 
0.16 bintray release, but I left them under the 0.16.0-rc2 tag.

> [Python] Windows wheel cannot be installed on Python 3.8
> 
>
> Key: ARROW-7755
> URL: https://issues.apache.org/jira/browse/ARROW-7755
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Critical
> Fix For: 0.17.0
>
>
> {code}
> λ pip install 
> C:\tmp\arrow-verify-release-wheels\pyarrow-0.16.0-cp38-cp38m-win_amd64.whl 
> ERROR: pyarrow-0.16.0-cp38-cp38m-win_amd64.whl is not a supported wheel on 
> this platform.
> {code}
> The wheel came from
> https://bintray.com/apache/arrow/download_file?file_path=python-rc%2F0.16.0-rc2%2Fpyarrow-0.16.0-cp38-cp38m-win_amd64.whl
> The "m" ABI tag appears to have been removed in Python 3.8
> https://github.com/pypa/setuptools/pull/1822
> Locally I have pip 20.0.2, wheel 0.34.1, and setuptools 45.1.0.post20200127





[jira] [Updated] (ARROW-7771) [Developer] Use ARROW_TMPDIR environment variable in the verification scripts instead of TMPDIR

2020-03-25 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-7771:
---
Summary: [Developer] Use ARROW_TMPDIR environment variable in the 
verification scripts instead of TMPDIR  (was: [Release] Use ARROW_TMPDIR 
environment variable in the verification scripts instead of TMPDIR)

> [Developer] Use ARROW_TMPDIR environment variable in the verification scripts 
> instead of TMPDIR
> ---
>
> Key: ARROW-7771
> URL: https://issues.apache.org/jira/browse/ARROW-7771
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See discussion 
> https://github.com/apache/arrow/pull/6344#issuecomment-582128686
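
The idea of a dedicated variable can be sketched as follows: the working directory is taken from `ARROW_TMPDIR` when it is set, and a fresh temporary directory is created otherwise. This is an illustration in Python rather than the verification scripts themselves; the function name, the fallback behaviour, and the `arrow-verify-` prefix are assumptions, not details taken from the issue.

```python
import os
import tempfile


def resolve_tmpdir(environ=None):
    """Return the directory to use for verification artifacts."""
    environ = os.environ if environ is None else environ
    configured = environ.get("ARROW_TMPDIR")
    if configured:
        # Use the caller-provided location and create it if needed.
        os.makedirs(configured, exist_ok=True)
        return configured
    # Assumed fallback: a fresh directory in the system default location.
    return tempfile.mkdtemp(prefix="arrow-verify-")


print(resolve_tmpdir())
```

Reading a project-specific variable leaves `TMPDIR` free for other tools. Note that the fallback in this sketch still honours `TMPDIR` indirectly, because `tempfile.mkdtemp` consults it when choosing the default location.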





[jira] [Updated] (ARROW-7850) [Packaging][Python] Document how to install nightly built wheels

2020-03-25 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-7850:
---
Fix Version/s: (was: 0.17.0)

> [Packaging][Python] Document how to install nightly built wheels
> 
>
> Key: ARROW-7850
> URL: https://issues.apache.org/jira/browse/ARROW-7850
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>
> Follow-up work on https://github.com/apache/arrow/pull/6366#issue-371626256
> As per this comment:
> https://github.com/apache/arrow/pull/6366#issuecomment-585750794
> It would also be nice to resolve the version selection issue described in the
> comments above.





[jira] [Reopened] (ARROW-7850) [Packaging][Python] Document how to install nightly built wheels

2020-03-25 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reopened ARROW-7850:


> [Packaging][Python] Document how to install nightly built wheels
> 
>
> Key: ARROW-7850
> URL: https://issues.apache.org/jira/browse/ARROW-7850
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.17.0
>
>
> Follow-up work on https://github.com/apache/arrow/pull/6366#issue-371626256
> As per this comment:
> https://github.com/apache/arrow/pull/6366#issuecomment-585750794
> It would also be nice to resolve the version selection issue described in the
> comments above.





[jira] [Closed] (ARROW-7850) [Packaging][Python] Document how to install nightly built wheels

2020-03-25 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs closed ARROW-7850.
--
Resolution: Duplicate

> [Packaging][Python] Document how to install nightly built wheels
> 
>
> Key: ARROW-7850
> URL: https://issues.apache.org/jira/browse/ARROW-7850
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>
> Follow-up work on https://github.com/apache/arrow/pull/6366#issue-371626256
> As per this comment:
> https://github.com/apache/arrow/pull/6366#issuecomment-585750794
> It would also be nice to resolve the version selection issue described in the
> comments above.




