[jira] [Created] (ARROW-14501) [Python] Add StructType attribute to access all its fields

2021-10-28 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14501: - Summary: [Python] Add StructType attribute to access all its fields Key: ARROW-14501 URL: https://issues.apache.org/jira/browse/ARROW-14501 Project:

[jira] [Created] (ARROW-14584) [Python][CI] Python sdist installation fails with latest setuptools 58.5

2021-11-04 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14584: - Summary: [Python][CI] Python sdist installation fails with latest setuptools 58.5 Key: ARROW-14584 URL: https://issues.apache.org/jira/browse/ARROW-14584

[jira] [Created] (ARROW-14732) [Python] Improve error message in compute functions when passing wrong positional/keyword arguments

2021-11-17 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14732: - Summary: [Python] Improve error message in compute functions when passing wrong positional/keyword arguments Key: ARROW-14732 URL: https://issues.apache.org/jira

[jira] [Created] (ARROW-14751) [C++] The index_in_meta_binary/is_in_meta_binary kernels are missing a docstring

2021-11-18 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14751: - Summary: [C++] The index_in_meta_binary/is_in_meta_binary kernels are missing a docstring Key: ARROW-14751 URL: https://issues.apache.org/jira/browse/ARROW-14751

[jira] [Created] (ARROW-14798) [Python] Limit the size of the repr for large Tables

2021-11-23 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14798: - Summary: [Python] Limit the size of the repr for large Tables Key: ARROW-14798 URL: https://issues.apache.org/jira/browse/ARROW-14798 Project: Apache

[jira] [Created] (ARROW-14799) [C++] Adding tabular pretty printing of Table / RecordBatch

2021-11-23 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14799: - Summary: [C++] Adding tabular pretty printing of Table / RecordBatch Key: ARROW-14799 URL: https://issues.apache.org/jira/browse/ARROW-14799 Project

[jira] [Created] (ARROW-14926) [Docs] Fix CSS for visibility of the version dropdown

2021-11-30 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14926: - Summary: [Docs] Fix CSS for visibility of the version dropdown Key: ARROW-14926 URL: https://issues.apache.org/jira/browse/ARROW-14926 Project: Apach

[jira] [Created] (ARROW-14929) [CI][Python] Kartothek integration build due to installation issue

2021-11-30 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14929: - Summary: [CI][Python] Kartothek integration build due to installation issue Key: ARROW-14929 URL: https://issues.apache.org/jira/browse/ARROW-14929

[jira] [Created] (ARROW-14967) [CI][Python] Ability to include pip packages in the conda environments

2021-12-02 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14967: - Summary: [CI][Python] Ability to include pip packages in the conda environments Key: ARROW-14967 URL: https://issues.apache.org/jira/browse/ARROW-14967

[jira] [Created] (ARROW-14990) [CI] Nightly integration for dask is failing because of missing pandas dependency

2021-12-06 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-14990: - Summary: [CI] Nightly integration for dask is failing because of missing pandas dependency Key: ARROW-14990 URL: https://issues.apache.org/jira/browse/ARROW-1499

[jira] [Created] (ARROW-15042) [Python] Consolidate shared methods of RecordBatch and Table

2021-12-09 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15042: - Summary: [Python] Consolidate shared methods of RecordBatch and Table Key: ARROW-15042 URL: https://issues.apache.org/jira/browse/ARROW-15042 Projec

[jira] [Created] (ARROW-15043) [Python][Docs] Update type conversion table for pandas <-> arrow

2021-12-09 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15043: - Summary: [Python][Docs] Update type conversion table for pandas <-> arrow Key: ARROW-15043 URL: https://issues.apache.org/jira/browse/ARROW-15043 Pr

[jira] [Created] (ARROW-15077) [Python] Move Expression class from _dataset to _compute cython module

2021-12-13 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15077: - Summary: [Python] Move Expression class from _dataset to _compute cython module Key: ARROW-15077 URL: https://issues.apache.org/jira/browse/ARROW-15077

[jira] [Created] (ARROW-15117) [Docs] Splitting the sphinx-based Arrow docs into separate sphinx projects

2021-12-15 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15117: - Summary: [Docs] Splitting the sphinx-based Arrow docs into separate sphinx projects Key: ARROW-15117 URL: https://issues.apache.org/jira/browse/ARROW-15117

[jira] [Created] (ARROW-15131) [Python] Coerce value_set argument to array in "is_in" kernel

2021-12-16 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15131: - Summary: [Python] Coerce value_set argument to array in "is_in" kernel Key: ARROW-15131 URL: https://issues.apache.org/jira/browse/ARROW-15131 Proje

[jira] [Created] (ARROW-15137) [Dev] Update archery crossbow latest-prefix to work with nightly dates

2021-12-16 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15137: - Summary: [Dev] Update archery crossbow latest-prefix to work with nightly dates Key: ARROW-15137 URL: https://issues.apache.org/jira/browse/ARROW-15137

[jira] [Created] (ARROW-15307) [C++][Dataset] Provide more context in error message if cast fails during scanning

2022-01-12 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15307: - Summary: [C++][Dataset] Provide more context in error message if cast fails during scanning Key: ARROW-15307 URL: https://issues.apache.org/jira/browse/ARROW-153

[jira] [Created] (ARROW-15310) [C++][Python][Dataset] Detect (and warn?) when DirectoryPartitioning is parsing an actually hive-style file path?

2022-01-12 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15310: - Summary: [C++][Python][Dataset] Detect (and warn?) when DirectoryPartitioning is parsing an actually hive-style file path? Key: ARROW-15310 URL: https://issues.a

[jira] [Created] (ARROW-15321) [Dev][Archery] numpydoc validation doesn't check all class methods

2022-01-13 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15321: - Summary: [Dev][Archery] numpydoc validation doesn't check all class methods Key: ARROW-15321 URL: https://issues.apache.org/jira/browse/ARROW-15321

[jira] [Created] (ARROW-15323) [CI] Nightly spark integration builds are failing

2022-01-13 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15323: - Summary: [CI] Nightly spark integration builds are failing Key: ARROW-15323 URL: https://issues.apache.org/jira/browse/ARROW-15323 Project: Apache Ar

[jira] [Created] (ARROW-15324) [C++][CI] HDFS test build is failing with segfault (TestLibHdfs::test_mv_rename)

2022-01-13 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15324: - Summary: [C++][CI] HDFS test build is failing with segfault (TestLibHdfs::test_mv_rename) Key: ARROW-15324 URL: https://issues.apache.org/jira/browse/ARROW-15324

[jira] [Created] (ARROW-15326) [CI][Gandiva] Ubuntu release build is failing with failing Gandiva tests

2022-01-13 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15326: - Summary: [CI][Gandiva] Ubuntu release build is failing with failing Gandiva tests Key: ARROW-15326 URL: https://issues.apache.org/jira/browse/ARROW-15326

[jira] [Created] (ARROW-15364) [Python][Doc] Update filesystem entry in read docstrings

2022-01-19 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15364: - Summary: [Python][Doc] Update filesystem entry in read docstrings Key: ARROW-15364 URL: https://issues.apache.org/jira/browse/ARROW-15364 Project: Ap

[jira] [Created] (ARROW-15365) [Python] Expose full cast options in the pyarrow.compute.cast function

2022-01-19 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15365: - Summary: [Python] Expose full cast options in the pyarrow.compute.cast function Key: ARROW-15365 URL: https://issues.apache.org/jira/browse/ARROW-15365

[jira] [Created] (ARROW-15370) [Python] Regression in empty table to_pandas conversion

2022-01-19 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15370: - Summary: [Python] Regression in empty table to_pandas conversion Key: ARROW-15370 URL: https://issues.apache.org/jira/browse/ARROW-15370 Project: Apa

[jira] [Created] (ARROW-15394) [CI][Docs] Doxygen not run in the docs nightly build

2022-01-20 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15394: - Summary: [CI][Docs] Doxygen not run in the docs nightly build Key: ARROW-15394 URL: https://issues.apache.org/jira/browse/ARROW-15394 Project: Apache

[jira] [Created] (ARROW-15455) [C++] Cast between fixed size list type and variable size list

2022-01-25 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15455: - Summary: [C++] Cast between fixed size list type and variable size list Key: ARROW-15455 URL: https://issues.apache.org/jira/browse/ARROW-15455 Pro

[jira] [Created] (ARROW-15477) [C++][Python] Enable ListArray::FromArrays with custom list type (field names/nullability)

2022-01-27 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15477: - Summary: [C++][Python] Enable ListArray::FromArrays with custom list type (field names/nullability) Key: ARROW-15477 URL: https://issues.apache.org/jira/browse/A

[jira] [Created] (ARROW-15478) [C++] Creating (or casting to) list array with non-nullable field doesn't check nulls

2022-01-27 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15478: - Summary: [C++] Creating (or casting to) list array with non-nullable field doesn't check nulls Key: ARROW-15478 URL: https://issues.apache.org/jira/browse/ARROW-

[jira] [Created] (ARROW-15479) [C++] Cast to fixed size list with different field name

2022-01-27 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15479: - Summary: [C++] Cast to fixed size list with different field name Key: ARROW-15479 URL: https://issues.apache.org/jira/browse/ARROW-15479 Project: Apa

[jira] [Created] (ARROW-15545) [C++] Cast dictionary of extension type to extension type

2022-02-03 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15545: - Summary: [C++] Cast dictionary of extension type to extension type Key: ARROW-15545 URL: https://issues.apache.org/jira/browse/ARROW-15545 Project: A

[jira] [Created] (ARROW-15548) [C++][Parquet] Field-level metadata are not supported? (ColumnMetadata.key_value_metadata)

2022-02-03 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15548: - Summary: [C++][Parquet] Field-level metadata are not supported? (ColumnMetadata.key_value_metadata) Key: ARROW-15548 URL: https://issues.apache.org/jira/browse/A

[jira] [Created] (ARROW-15552) [Docs][Format] Unclear wording about base64 encoding requirement of metadata values

2022-02-03 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15552: - Summary: [Docs][Format] Unclear wording about base64 encoding requirement of metadata values Key: ARROW-15552 URL: https://issues.apache.org/jira/browse/ARROW-15

[jira] [Created] (ARROW-15564) [C++] Expose MergeOptions in Concatenate to unify types

2022-02-04 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15564: - Summary: [C++] Expose MergeOptions in Concatenate to unify types Key: ARROW-15564 URL: https://issues.apache.org/jira/browse/ARROW-15564 Project: Apa

[jira] [Created] (ARROW-15601) [Docs][Release] Update post release script to move stable docs to versioned + keep dev docs

2022-02-07 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15601: - Summary: [Docs][Release] Update post release script to move stable docs to versioned + keep dev docs Key: ARROW-15601 URL: https://issues.apache.org/jira/browse/

[jira] [Created] (ARROW-15643) [C++] Kernel to select subset of fields of a StructArray

2022-02-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15643: - Summary: [C++] Kernel to select subset of fields of a StructArray Key: ARROW-15643 URL: https://issues.apache.org/jira/browse/ARROW-15643 Project: Ap

[jira] [Created] (ARROW-15652) [C++] GDB plugin printer gives error with extension type

2022-02-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15652: - Summary: [C++] GDB plugin printer gives error with extension type Key: ARROW-15652 URL: https://issues.apache.org/jira/browse/ARROW-15652 Project: Ap

[jira] [Created] (ARROW-15711) [C++][Parquet] Extension types with nanosecond timestamp resolution don't roundtrip

2022-02-17 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15711: - Summary: [C++][Parquet] Extension types with nanosecond timestamp resolution don't roundtrip Key: ARROW-15711 URL: https://issues.apache.org/jira/browse/ARROW-15

[jira] [Created] (ARROW-15720) [CI] Nightly dask build is failing due to wrong usage of Array.to_pandas

2022-02-17 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15720: - Summary: [CI] Nightly dask build is failing due to wrong usage of Array.to_pandas Key: ARROW-15720 URL: https://issues.apache.org/jira/browse/ARROW-15720

[jira] [Created] (ARROW-15760) [C++] Avoid hard dependency on git in cmake (download tarballs from github instead)

2022-02-22 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15760: - Summary: [C++] Avoid hard dependency on git in cmake (download tarballs from github instead) Key: ARROW-15760 URL: https://issues.apache.org/jira/browse/ARROW-15

[jira] [Created] (ARROW-15761) [Python] Remove the deprecated pyarrow.filesystem legacy implementations

2022-02-23 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15761: - Summary: [Python] Remove the deprecated pyarrow.filesystem legacy implementations Key: ARROW-15761 URL: https://issues.apache.org/jira/browse/ARROW-15761

[jira] [Created] (ARROW-15847) [Python] Building with Parquet but without Parquet encryption fails

2022-03-04 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15847: - Summary: [Python] Building with Parquet but without Parquet encryption fails Key: ARROW-15847 URL: https://issues.apache.org/jira/browse/ARROW-15847

[jira] [Created] (ARROW-15867) [Python] Ignored exception printed when pandas is not installed

2022-03-08 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15867: - Summary: [Python] Ignored exception printed when pandas is not installed Key: ARROW-15867 URL: https://issues.apache.org/jira/browse/ARROW-15867 Pro

[jira] [Created] (ARROW-15868) [Python] Remove the legacy ParquetDataset custom python-based implementation

2022-03-08 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15868: - Summary: [Python] Remove the legacy ParquetDataset custom python-based implementation Key: ARROW-15868 URL: https://issues.apache.org/jira/browse/ARROW-15868

[jira] [Created] (ARROW-15870) [Python] Start to raise deprecation warnings when using use_legacy_dataset=True in parquet.py

2022-03-08 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15870: - Summary: [Python] Start to raise deprecation warnings when using use_legacy_dataset=True in parquet.py Key: ARROW-15870 URL: https://issues.apache.org/jira/brows

[jira] [Created] (ARROW-15871) [Python] Start raising deprecation warnings for ParquetDataset keywords that won't be supported with the new API

2022-03-08 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15871: - Summary: [Python] Start raising deprecation warnings for ParquetDataset keywords that won't be supported with the new API Key: ARROW-15871 URL: https://issues.ap

[jira] [Created] (ARROW-15882) [CI][Python] Nightly hypothesis build is not actually running the hypothesis tests

2022-03-09 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15882: - Summary: [CI][Python] Nightly hypothesis build is not actually running the hypothesis tests Key: ARROW-15882 URL: https://issues.apache.org/jira/browse/ARROW-158

[jira] [Created] (ARROW-15883) [C++] Support for fractional seconds in strptime() for ISO format?

2022-03-09 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15883: - Summary: [C++] Support for fractional seconds in strptime() for ISO format? Key: ARROW-15883 URL: https://issues.apache.org/jira/browse/ARROW-15883

[jira] [Created] (ARROW-15884) [C++][Doc] Document that the strptime kernel ignores %Z

2022-03-09 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15884: - Summary: [C++][Doc] Document that the strptime kernel ignores %Z Key: ARROW-15884 URL: https://issues.apache.org/jira/browse/ARROW-15884 Project: Apa

[jira] [Created] (ARROW-15960) [Python] Segfault constructing a fixed size list array of size 0 with dictionary values

2022-03-17 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15960: - Summary: [Python] Segfault constructing a fixed size list array of size 0 with dictionary values Key: ARROW-15960 URL: https://issues.apache.org/jira/browse/ARRO

[jira] [Created] (ARROW-15997) [CI] Nightly turbodbc build is failing (C++ compilation error)

2022-03-22 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-15997: - Summary: [CI] Nightly turbodbc build is failing (C++ compilation error) Key: ARROW-15997 URL: https://issues.apache.org/jira/browse/ARROW-15997 Proj

[jira] [Created] (ARROW-16018) [Doc][Python] Run doctests on Python docstring examples

2022-03-24 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16018: - Summary: [Doc][Python] Run doctests on Python docstring examples Key: ARROW-16018 URL: https://issues.apache.org/jira/browse/ARROW-16018 Project: Apa

[jira] [Created] (ARROW-16107) [CI][Archery] Fix archery crossbow query to get latest prefix

2022-04-04 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16107: - Summary: [CI][Archery] Fix archery crossbow query to get latest prefix Key: ARROW-16107 URL: https://issues.apache.org/jira/browse/ARROW-16107 Proje

[jira] [Created] (ARROW-16113) [Python] Partitioning.dictionaries in case of a subset of fields are dictionary encoded

2022-04-04 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16113: - Summary: [Python] Partitioning.dictionaries in case of a subset of fields are dictionary encoded Key: ARROW-16113 URL: https://issues.apache.org/jira/browse/ARRO

[jira] [Created] (ARROW-16119) [Python] Deprecate the legacy ParquetDataset custom python-based implementation

2022-04-05 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16119: - Summary: [Python] Deprecate the legacy ParquetDataset custom python-based implementation Key: ARROW-16119 URL: https://issues.apache.org/jira/browse/ARROW-16119

[jira] [Created] (ARROW-16120) [Python] ParquetDataset deprecation: change Deprecation to FutureWarnings

2022-04-05 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16120: - Summary: [Python] ParquetDataset deprecation: change Deprecation to FutureWarnings Key: ARROW-16120 URL: https://issues.apache.org/jira/browse/ARROW-16120

[jira] [Created] (ARROW-16121) [Python] Deprecate the (common_)metadata(_path) attributes of ParquetDataset

2022-04-05 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16121: - Summary: [Python] Deprecate the (common_)metadata(_path) attributes of ParquetDataset Key: ARROW-16121 URL: https://issues.apache.org/jira/browse/ARROW-16121

[jira] [Created] (ARROW-16122) [Python] Deprecate no-longer supported keywords in parquet.write_to_dataset

2022-04-05 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16122: - Summary: [Python] Deprecate no-longer supported keywords in parquet.write_to_dataset Key: ARROW-16122 URL: https://issues.apache.org/jira/browse/ARROW-16122

[jira] [Created] (ARROW-16123) [Python] Do not include __init__ in the API documentation

2022-04-05 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16123: - Summary: [Python] Do not include __init__ in the API documentation Key: ARROW-16123 URL: https://issues.apache.org/jira/browse/ARROW-16123 Project: Ap

[jira] [Created] (ARROW-16140) [Python] zoneinfo timezones failing during type inference

2022-04-07 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16140: - Summary: [Python] zoneinfo timezones failing during type inference Key: ARROW-16140 URL: https://issues.apache.org/jira/browse/ARROW-16140 Project: A

[jira] [Created] (ARROW-16204) [C++][Dataset] Default error existing_data_behaviour for writing dataset ignores "part-{i}.ext" files

2022-04-15 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16204: - Summary: [C++][Dataset] Default error existing_data_behaviour for writing dataset ignores "part-{i}.ext" files Key: ARROW-16204 URL: https://issues.apache.org/j

[jira] [Created] (ARROW-16231) [C++][Python] IPC failure for dictionary with extension type with struct storage type

2022-04-19 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16231: - Summary: [C++][Python] IPC failure for dictionary with extension type with struct storage type Key: ARROW-16231 URL: https://issues.apache.org/jira/browse/ARROW-

[jira] [Created] (ARROW-16262) [CI] Kartothek nightly integration build is failing because of Parquet statistics date change

2022-04-21 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16262: - Summary: [CI] Kartothek nightly integration build is failing because of Parquet statistics date change Key: ARROW-16262 URL: https://issues.apache.org/jira/brows

[jira] [Created] (ARROW-16336) [Python] Hide internal (common_)metadata related warnings from the user (ParquetDataset)

2022-04-26 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16336: - Summary: [Python] Hide internal (common_)metadata related warnings from the user (ParquetDataset) Key: ARROW-16336 URL: https://issues.apache.org/jira/browse/ARR

[jira] [Created] (ARROW-16337) [Python] Expose parameter that determines to store Arrow schema in Parquet metadata in Python

2022-04-26 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16337: - Summary: [Python] Expose parameter that determines to store Arrow schema in Parquet metadata in Python Key: ARROW-16337 URL: https://issues.apache.org/jira/brows

[jira] [Created] (ARROW-16339) [C++][Parquet] Parquet FileMetaData key_value_metadata not always mapped to Arrow Schema metadata

2022-04-26 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16339: - Summary: [C++][Parquet] Parquet FileMetaData key_value_metadata not always mapped to Arrow Schema metadata Key: ARROW-16339 URL: https://issues.apache.org/jira/b

[jira] [Created] (ARROW-16413) [C++][Python] FileFormat::GetReaderAsync hangs with an fsspec filesystem

2022-04-29 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16413: - Summary: [C++][Python] FileFormat::GetReaderAsync hangs with an fsspec filesystem Key: ARROW-16413 URL: https://issues.apache.org/jira/browse/ARROW-16413

[jira] [Created] (ARROW-16442) [Python] The fragments for ORC dataset return base Fragment instead of FileFragment

2022-05-03 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16442: - Summary: [Python] The fragments for ORC dataset return base Fragment instead of FileFragment Key: ARROW-16442 URL: https://issues.apache.org/jira/browse/ARROW-16

[jira] [Created] (ARROW-16458) [Python] Run S3 tests in the nightly dask integration build

2022-05-04 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16458: - Summary: [Python] Run S3 tests in the nightly dask integration build Key: ARROW-16458 URL: https://issues.apache.org/jira/browse/ARROW-16458 Project

[jira] [Created] (ARROW-16460) [Python] Some dataset tests using PyFileSystem are failing on Windows

2022-05-04 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16460: - Summary: [Python] Some dataset tests using PyFileSystem are failing on Windows Key: ARROW-16460 URL: https://issues.apache.org/jira/browse/ARROW-16460

[jira] [Created] (ARROW-16651) [Python] Casting Table to new schema ignores nullability of fields

2022-05-25 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16651: - Summary: [Python] Casting Table to new schema ignores nullability of fields Key: ARROW-16651 URL: https://issues.apache.org/jira/browse/ARROW-16651

[jira] [Created] (ARROW-16652) [Python][C++] Cast compute kernel segfaults when called with a Table

2022-05-25 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16652: - Summary: [Python][C++] Cast compute kernel segfaults when called with a Table Key: ARROW-16652 URL: https://issues.apache.org/jira/browse/ARROW-16652

[jira] [Created] (ARROW-16719) [Python] Add path/URI /+ filesystem handling to parquet.read_metadata

2022-06-01 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16719: - Summary: [Python] Add path/URI /+ filesystem handling to parquet.read_metadata Key: ARROW-16719 URL: https://issues.apache.org/jira/browse/ARROW-16719

[jira] [Created] (ARROW-16728) [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset

2022-06-02 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-16728: - Summary: [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset Key: ARROW-16728 URL: https://issues.apache.org/jira/browse/ARROW-16728

[jira] [Created] (ARROW-5201) [Python] Import ABCs from collections is deprecated in Python 3.7

2019-04-23 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-5201: Summary: [Python] Import ABCs from collections is deprecated in Python 3.7 Key: ARROW-5201 URL: https://issues.apache.org/jira/browse/ARROW-5201 Proje

[jira] [Assigned] (ARROW-5201) [Python] Import ABCs from collections is deprecated in Python 3.7

2019-04-23 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-5201: Assignee: Joris Van den Bossche > [Python] Import ABCs from collections is

[jira] [Commented] (ARROW-5165) [Python][Documentation] Build docs don't suggest assigning $ARROW_BUILD_TYPE

2019-04-23 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824183#comment-16824183 ] Joris Van den Bossche commented on ARROW-5165: -- Just ran into this as well.

[jira] [Updated] (ARROW-5125) [Python] Cannot roundtrip extreme dates through pyarrow

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5125: - Labels: parquet windows (was: parquet) > [Python] Cannot roundtrip extreme dates

[jira] [Updated] (ARROW-3176) [Python] Overflow in Date32 column conversion to pandas

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-3176: - Description: When converting an arrow column holding a {{Date32Array}} to {{panda

[jira] [Commented] (ARROW-3176) [Python] Overflow in Date32 column conversion to pandas

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824911#comment-16824911 ] Joris Van den Bossche commented on ARROW-3176: -- Note that the default type c

[jira] [Resolved] (ARROW-4934) [Python] Address deprecation notice that will be a bug in Python 3.8

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche resolved ARROW-4934. -- Resolution: Fixed Apparently https://issues.apache.org/jira/browse/ARROW-5201 (

[jira] [Created] (ARROW-5210) [Python] editable install (pip install -e .) is failing

2019-04-24 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-5210: Summary: [Python] editable install (pip install -e .) is failing Key: ARROW-5210 URL: https://issues.apache.org/jira/browse/ARROW-5210 Project: Apache

[jira] [Updated] (ARROW-5210) [Python] editable install (pip install -e .) is failing

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5210: - Description: Following the python development documentation on building arrow and

[jira] [Comment Edited] (ARROW-5210) [Python] editable install (pip install -e .) is failing

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825155#comment-16825155 ] Joris Van den Bossche edited comment on ARROW-5210 at 4/24/19 1:42 PM:

[jira] [Commented] (ARROW-5210) [Python] editable install (pip install -e .) is failing

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825155#comment-16825155 ] Joris Van den Bossche commented on ARROW-5210: -- With pip 19.1 (released yest

[jira] [Commented] (ARROW-5210) [Python] editable install (pip install -e .) is failing

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825170#comment-16825170 ] Joris Van den Bossche commented on ARROW-5210: -- The reason it is currently f

[jira] [Comment Edited] (ARROW-3176) [Python] Overflow in Date32 column conversion to pandas

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824911#comment-16824911 ] Joris Van den Bossche edited comment on ARROW-3176 at 4/24/19 2:02 PM:

[jira] [Commented] (ARROW-3176) [Python] Overflow in Date32 column conversion to pandas

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825407#comment-16825407 ] Joris Van den Bossche commented on ARROW-3176: -- Yes, I think, ideally, arrow

[jira] [Commented] (ARROW-3176) [Python] Overflow in Date32 column conversion to pandas

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825413#comment-16825413 ] Joris Van den Bossche commented on ARROW-3176: -- Actually, I take that back.

[jira] [Commented] (ARROW-3176) [Python] Overflow in Date32 column conversion to pandas

2019-04-24 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825426#comment-16825426 ] Joris Van den Bossche commented on ARROW-3176: -- This seems to be a pandas re

[jira] [Updated] (ARROW-5212) Array BinaryBuilder in Go library has no access to resize the values buffer

2019-04-25 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5212: - Component/s: Go > Array BinaryBuilder in Go library has no access to resize the v

[jira] [Updated] (ARROW-5212) [Go] Array BinaryBuilder in Go library has no access to resize the values buffer

2019-04-25 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5212: - Summary: [Go] Array BinaryBuilder in Go library has no access to resize the value

[jira] [Updated] (ARROW-5030) [Python] read_row_group fails with Nested data conversions not implemented for chunked array outputs

2019-04-25 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5030: - Labels: parquet (was: ) > [Python] read_row_group fails with Nested data convers

[jira] [Updated] (ARROW-3861) [Python] ParquetDataset().read columns argument always returns partition column

2019-04-26 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-3861: - Labels: parquet pyarrow python (was: pyarrow python) > [Python] ParquetDataset()

[jira] [Updated] (ARROW-3861) [Python] ParquetDataset().read columns argument always returns partition column

2019-04-26 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-3861: - Labels: parquet python (was: parquet pyarrow python) > [Python] ParquetDataset()

[jira] [Created] (ARROW-5220) [Python] index / unknown columns in specified schema in Table.from_pandas

2019-04-26 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-5220: Summary: [Python] index / unknown columns in specified schema in Table.from_pandas Key: ARROW-5220 URL: https://issues.apache.org/jira/browse/ARROW-5220

[jira] [Updated] (ARROW-5220) [Python] index / unknown columns in specified schema in Table.from_pandas

2019-04-26 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5220: - Description: The {{Table.from_pandas}} method allows to specify a schema ("This c

[jira] [Commented] (ARROW-3861) [Python] ParquetDataset().read columns argument always returns partition column

2019-04-26 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826772#comment-16826772 ] Joris Van den Bossche commented on ARROW-3861: -- [~cthi] note that the way yo

[jira] [Commented] (ARROW-5208) [Python] Inconsistent resulting type during casting in pa.array() when mask is present

2019-04-26 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826904#comment-16826904 ] Joris Van den Bossche commented on ARROW-5208: -- To get started, I think the

[jira] [Updated] (ARROW-5089) [C++/Python] Writing dictionary encoded columns to parquet is extremely slow when using chunk size

2019-04-26 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5089: - Labels: parquet performance (was: performance) > [C++/Python] Writing dictionary
