[GitHub] [arrow] assignUser closed issue #14826: write_dataset is crashing on my machine

2023-01-20 Thread GitBox


assignUser closed issue #14826: write_dataset is crashing on my machine
URL: https://github.com/apache/arrow/issues/14826


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [arrow] ablack3 opened a new issue, #33807: Using dplyr::tally with an Arrow FileSystemDataset crashes R

2023-01-20 Thread GitBox


ablack3 opened a new issue, #33807:
URL: https://github.com/apache/arrow/issues/33807

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   The following code snippet crashes R. I'm using arrow 10.0.1
   
   ```
   library(dplyr)
   arrow::write_dataset(cars, here::here("cars.feather"), format = "feather")
   a <- arrow::open_dataset(here::here("cars.feather"), format = "feather")
   a %>% tally()
   ```
   
   **Platform information**
   ```
   > sessionInfo()
   R version 4.2.2 (2022-10-31)
   Platform: x86_64-apple-darwin17.0 (64-bit)
   Running under: macOS Monterey 12.6
   
   Matrix products: default
   LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
   
   locale:
   [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
   
   attached base packages:
   [1] stats graphics  grDevices utils datasets  methods   base 
   
   other attached packages:
   [1] arrow_10.0.1   testthat_3.1.6
   
   loaded via a namespace (and not attached):
    [1] assertthat_0.2.1 brio_1.1.3       R6_2.5.1         lifecycle_1.0.3  magrittr_2.0.3   rlang_1.0.6
    [7] cli_3.5.0        rstudioapi_0.14  vctrs_0.5.1      tools_4.2.2      bit64_4.0.5      glue_1.6.2
   [13] purrr_1.0.0      bit_4.0.5        compiler_4.2.2   tidyselect_1.2.0
   ```
   
   ### Component(s)
   
   R





[GitHub] [arrow-adbc] lidavidm opened a new issue, #366: [Discuss] Is the conventional commit format working?

2023-01-20 Thread GitBox


lidavidm opened a new issue, #366:
URL: https://github.com/apache/arrow-adbc/issues/366

   I've found that it's easy to typo the 'component' and that it's not clear 
what to use for the component. (For instance: is a cross-language change 
`fix(c,python)`?) Maybe we should align with the Arrow project and just use the 
language as the 'component' (so `c`, `python`, `go`, etc.)? Or, we could 
improve the validation to check that the 'component' really is a subdirectory 
of the repo (that way we won't typo `go/adbc/flightsql` when we mean 
`go/adbc/driver/flightsql`).
   
   It doesn't help that GitHub defaults to the commit message, not the PR 
title/message, when merging - so we'll fix it in the PR, only to have GitHub 
merge using the original message. We can ask INFRA to change the setting to 
merge using the PR title/description, if people agree.
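The second option above (checking that the 'component' really is a subdirectory of the repo) could be sketched roughly as follows. This is a minimal illustration, not the project's actual validation: the header pattern, the function name `invalid_components`, and the comma-separated handling of `fix(c,python)` are all assumptions.

```python
import re
from pathlib import Path
from typing import List

# Hypothetical pattern for a "type(component): summary" header.
# The real project may accept a different grammar.
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<components>[^)]*)\))?!?: (?P<summary>.+)$"
)


def invalid_components(header: str, repo_root: Path) -> List[str]:
    """Return the components named in `header` that are not directories
    under `repo_root`. An empty list means every component checks out."""
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"not a conventional commit header: {header!r}")
    raw = match.group("components")
    if not raw:
        # No component given, e.g. "chore: bump version".
        return []
    # Assumption: several components are separated by commas.
    components = [part.strip() for part in raw.split(",") if part.strip()]
    return [name for name in components if not (repo_root / name).is_dir()]
```

With a check like this, `fix(go/adbc/flightsql): ...` would be rejected whenever only `go/adbc/driver/flightsql` exists in the repository.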





[GitHub] [arrow] thisisnic closed issue #33746: [R] Update NEWS for 11.0.0

2023-01-20 Thread GitBox


thisisnic closed issue #33746: [R] Update NEWS for 11.0.0
URL: https://github.com/apache/arrow/issues/33746





[GitHub] [arrow] sjperkins opened a new issue, #33804: Add support for manylinux_2_28 wheels

2023-01-20 Thread GitBox


sjperkins opened a new issue, #33804:
URL: https://github.com/apache/arrow/issues/33804

   ### Describe the enhancement requested
   
   This is low priority as I'm not on the PMC or a Committer. However, I 
thought I'd create it as I wanted to create a pyarrow wheel with the new C++ 
ABI: `_GLIBCXX_USE_CXX11_ABI=1`. In the process of doing so, I created a 
manylinux_2_28 wheel by adapting the existing manylinux2014 Dockerfile which 
may prove useful:
   
   Related:
   
   - https://pypackaging-native.github.io/key-issues/native-dependencies/cpp_deps/
   - https://github.com/apache/arrow/issues/32415
   
   I'll submit the manylinux_2_28 Dockerfile in a PR supporting this 
enhancement.
   
   ### Component(s)
   
   Python





[GitHub] [arrow] sjperkins opened a new issue, #33801: C++ Extension Types aren't correctly exposed in pyarrow

2023-01-20 Thread GitBox


sjperkins opened a new issue, #33801:
URL: https://github.com/apache/arrow/issues/33801

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Version: master branch (11.0.0)
   Platform: Ubuntu 20.04
   
   Neither `__arrow_ext_class__` nor `__arrow_ext_scalar_class__` is exposed on `BaseExtensionType`.
   
   This results in the following sort of errors when trying to access a C++ 
ExtensionArray/ExtensionType from pyarrow:
   
   ```
   AttributeError: 'pyarrow.lib.BaseExtensionType' object has no attribute '__arrow_ext_class__'
   ```
   
   See the following, for example:
   
   - https://github.com/apache/arrow/issues/32291
   - https://github.com/apache/arrow/pull/10565#issuecomment-890893166
   
   ### Component(s)
   
   Python





[GitHub] [arrow] kou opened a new issue, #33800: [Packaging] Drop support for Ubuntu 18.04

2023-01-19 Thread GitBox


kou opened a new issue, #33800:
URL: https://github.com/apache/arrow/issues/33800

   ### Describe the enhancement requested
   
   Ubuntu 18.04 will reach End of Standard Support in 2023-04: https://wiki.ubuntu.com/Releases
   
   > Version | Code name | Docs | Release | End of Standard Support | End of Life
   > -- | -- | -- | -- | -- | --
   > Ubuntu 18.04.6 LTS | Bionic Beaver | Changes | September 17, 2021 | April 2023 | April 2028
   
   We'll release 12.0.0 in 2023-04, so 12.0.0 doesn't need Ubuntu 18.04 support. We can drop support for Ubuntu 18.04 now because the maintenance branch for 11.0.0 has already been created.
   
   FYI: We can require CMake 3.16 or later after we drop support for Ubuntu 
18.04 because Ubuntu 20.04 ships CMake 3.16 and EPEL for CentOS 7 ships CMake 
3.17.
   
   ### Component(s)
   
   Packaging





[GitHub] [arrow] EpsilonPrime opened a new issue, #33798: [C++] Add decimal support for binary round kernel

2023-01-19 Thread GitBox


EpsilonPrime opened a new issue, #33798:
URL: https://github.com/apache/arrow/issues/33798

   ### Describe the enhancement requested
   
   As part of ARROW-18425, a binary version of the round kernel was added. However, it only supports int and float. Decimal support should also be added so that the binary and unary versions have equivalent functionality.
   
   
   ### Component(s)
   
   C++





[GitHub] [arrow] EpsilonPrime opened a new issue, #33797: [C++] Add decimal version of Round benchmarks

2023-01-19 Thread GitBox


EpsilonPrime opened a new issue, #33797:
URL: https://github.com/apache/arrow/issues/33797

   ### Describe the enhancement requested
   
   The Acero Round compute kernel currently has benchmarks only for integer and floating point types. Benchmarks for decimal types should be added as well.
   
   
   ### Component(s)
   
   C++





[GitHub] [arrow] zeroshade closed issue #32946: [Go] Implement RLE Array and Compare

2023-01-19 Thread GitBox


zeroshade closed issue #32946: [Go] Implement RLE Array and Compare
URL: https://github.com/apache/arrow/issues/32946





[GitHub] [arrow] zeroshade closed issue #33734: [Go] arrow library is not compatible with grpc < 1.45 due to use of reflection experimental interface

2023-01-19 Thread GitBox


zeroshade closed issue #33734: [Go] arrow library is not compatible with grpc < 
1.45 due to use of reflection experimental interface
URL: https://github.com/apache/arrow/issues/33734





[GitHub] [arrow-testing] andygrove merged pull request #86: Add gzip compressed version of file aggregate_test_100.csv to enable …

2023-01-19 Thread GitBox


andygrove merged PR #86:
URL: https://github.com/apache/arrow-testing/pull/86





[GitHub] [arrow] zeroshade closed issue #33789: [Go] RecordReader has no way to propagate errors

2023-01-19 Thread GitBox


zeroshade closed issue #33789: [Go] RecordReader has no way to propagate errors
URL: https://github.com/apache/arrow/issues/33789





[GitHub] [arrow] wjones127 closed issue #14476: How to define a StructArray from R?

2023-01-19 Thread GitBox


wjones127 closed issue #14476: How to define a StructArray from R?
URL: https://github.com/apache/arrow/issues/14476





[GitHub] [arrow] pleicht opened a new issue, #33796: arrow-testing.pc.in's Cflags are incorrectly set if gtest isn't built as part of the arrow build

2023-01-19 Thread GitBox


pleicht opened a new issue, #33796:
URL: https://github.com/apache/arrow/issues/33796

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   If we don't build gtest as part of the arrow build process (for example, because we get it pre-built from somewhere else), then the following variable is unset: 
   https://github.com/apache/arrow/blob/bf8780d0ff794c50312d799a9e877430e99dcf8b/cpp/src/arrow/arrow-testing.pc.in#L22
   
   It is currently only set in the `macro(build_gtest)` CMake macro found here: 
   https://github.com/apache/arrow/blob/359f28ba9d62a5e8456d92dfbe5b16b790019edd/cpp/cmake_modules/ThirdpartyToolchain.cmake#L2003
   
   As a result, the Cflags generated in 
   https://github.com/apache/arrow/blob/bf8780d0ff794c50312d799a9e877430e99dcf8b/cpp/src/arrow/arrow-testing.pc.in#L29
   end up being just `-I`. This causes a bare `-I` to appear in the compile command for users building against the arrow project, which in our case (and I assume all cases?) is invalid.
   As an example, here is a portion of our compile command that was generated with this issue:
   `-pthread -I -std=gnu++17`
   
   A solution here would be to not generate any Cflags when `GTEST_INCLUDE_DIR` isn't set. The `-I` in `Cflags: -I${gtest_includedir}` needs to be created conditionally. I'll try to add a PR in the next few days to address the issue.
   
   ### Component(s)
   
   C++





[GitHub] [arrow] lidavidm opened a new issue, #33794: [Go][FlightRPC] Add ability to bind a reader of parameters to Flight SQL prepared statement

2023-01-19 Thread GitBox


lidavidm opened a new issue, #33794:
URL: https://github.com/apache/arrow/issues/33794

   ### Describe the enhancement requested
   
   This will let us bind a stream of parameters, not just a single batch.
   
   This will be used to implement BindStream in the ADBC driver.
   
   ### Component(s)
   
   Go





[GitHub] [arrow] lidavidm closed issue #33767: [Go] Exported ArrowArrayStream.get_next doesn't handle uninitialized ArrowArrays well

2023-01-19 Thread GitBox


lidavidm closed issue #33767: [Go] Exported ArrowArrayStream.get_next doesn't 
handle uninitialized ArrowArrays well
URL: https://github.com/apache/arrow/issues/33767





[GitHub] [arrow] Oduig opened a new issue, #33790: Support for reading .csv files from a zip archive

2023-01-19 Thread GitBox


Oduig opened a new issue, #33790:
URL: https://github.com/apache/arrow/issues/33790

   ### Describe the enhancement requested
   
   I would like to read CSVs from *.zip archives. The supported compression formats include gzip and bz2, but not zip.
   Would it be possible to add this as an extension?
   
   Supporting zip archives would allow Airbyte to use pyarrow to read CSVs from compressed ZIP archives.
   
   I looked around to see if anything had been proposed about this before, but I couldn't find anything, and after browsing through the sources I have difficulty determining how easy or hard it would be to contribute a fix.
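Until something like this is supported natively, one possible workaround is to decompress the archive member with Python's standard `zipfile` module and hand the bytes to `pyarrow.csv.read_csv`, which accepts file-like objects. The sketch below is only an illustration under that assumption: the helper names are hypothetical, and the whole member is read into memory, so it is not suitable for very large files.

```python
import io
import zipfile


def read_zip_member(zip_source, member=None):
    """Return the decompressed bytes of one file inside a zip archive.

    `zip_source` is a path or a binary file-like object. If `member` is
    None, the archive must contain exactly one file.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # Skip directory entries, which end with a slash.
        names = [name for name in archive.namelist() if not name.endswith("/")]
        if member is None:
            if len(names) != 1:
                raise ValueError(
                    f"expected exactly one file in the archive, found {len(names)}"
                )
            member = names[0]
        return archive.read(member)


def read_csv_from_zip(zip_source, member=None):
    """Parse a zipped CSV into a pyarrow Table (requires pyarrow)."""
    # Imported lazily so that read_zip_member works without pyarrow installed.
    from pyarrow import csv

    return csv.read_csv(io.BytesIO(read_zip_member(zip_source, member)))
```

For example, `read_csv_from_zip("archive.zip", "data.csv")` would return a table for the hypothetical member `data.csv`.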
   
   ### Component(s)
   
   Python





[GitHub] [arrow] pitrou closed issue #15137: [C++][CI] ASAN error in streaming JSON reader tests

2023-01-19 Thread GitBox


pitrou closed issue #15137: [C++][CI] ASAN error in streaming JSON reader tests
URL: https://github.com/apache/arrow/issues/15137





[GitHub] [arrow] thisisnic closed issue #33777: [R] Nightly builds failing due to dataset test not being skipped on builds without datasets module

2023-01-19 Thread GitBox


thisisnic closed issue #33777: [R] Nightly builds failing due to dataset test 
not being skipped on builds without datasets module
URL: https://github.com/apache/arrow/issues/33777





[GitHub] [arrow] lidavidm opened a new issue, #33789: [Go] RecordReader has no way to propagate errors

2023-01-19 Thread GitBox


lidavidm opened a new issue, #33789:
URL: https://github.com/apache/arrow/issues/33789

   ### Describe the enhancement requested
   
   RecordReader's methods don't return `err`, so there's no way to propagate 
errors. For this reason, exported streams in the C Data Interface have no way 
of returning errors, either. 
   
   Changing the interface would of course be a breaking change. The alternative 
is to declare this:
   
   ```
   type ClosableRecordReader interface {
       RecordReader
       Closable
   }
   ```
   
   which gives us one place to report errors (at the end).
   
   ### Component(s)
   
   Go





[GitHub] [arrow] jorisvandenbossche closed issue #15109: [Python] Can't create a non empty StructArray with no field using `StructArray.from_array`

2023-01-19 Thread GitBox


jorisvandenbossche closed issue #15109: [Python] Can't create a non empty 
StructArray with no field using `StructArray.from_array`
URL: https://github.com/apache/arrow/issues/15109





[GitHub] [arrow] thisisnic closed issue #33779: [R] Nightly builds (R 3.5 and 3.6) failing due to field refs test

2023-01-19 Thread GitBox


thisisnic closed issue #33779: [R] Nightly builds (R 3.5 and 3.6) failing due 
to field refs test
URL: https://github.com/apache/arrow/issues/33779





[GitHub] [arrow] aliakseimakarau opened a new issue, #33787: Arrow under s390x: statement has no effect [-Werror=unused-value]

2023-01-19 Thread GitBox


aliakseimakarau opened a new issue, #33787:
URL: https://github.com/apache/arrow/issues/33787

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   Arrow is employed in CEPH software defined storage (https://github.com/ceph/ceph, https://github.com/ceph/ceph/blob/main/.gitmodules):
   ```
   [submodule "src/arrow"]
       path = src/arrow
       url = https://github.com/apache/arrow.git
   ```
   Building the whole system on s390x with `-Werror` triggers the following error: 
   arrow/cpp/src/arrow/util/cpu_info.cc:155:3: error: statement has no effect [-Werror=unused-value] (347a88ff9d20e2a4061eec0b455b8ea1aa8335dc).
   
   Should a "dummy" default element be inserted into `flag_mappings[]`?
   ```
   struct {
     std::string name;
     int64_t flag;
   } flag_mappings[] = {
   #if (defined(__i386) || defined(_M_IX86) || defined(__x86_64__) || defined(_M_X64))
       {"ssse3", CpuInfo::SSSE3},       {"sse4_1", CpuInfo::SSE4_1},
       {"sse4_2", CpuInfo::SSE4_2},     {"popcnt", CpuInfo::POPCNT},
       {"avx", CpuInfo::AVX},           {"avx2", CpuInfo::AVX2},
       {"avx512f", CpuInfo::AVX512F},   {"avx512cd", CpuInfo::AVX512CD},
       {"avx512vl", CpuInfo::AVX512VL}, {"avx512dq", CpuInfo::AVX512DQ},
       {"avx512bw", CpuInfo::AVX512BW}, {"bmi1", CpuInfo::BMI1},
       {"bmi2", CpuInfo::BMI2},
   #endif
   #if defined(__aarch64__)
       {"asimd", CpuInfo::ASIMD},
   #endif
   };
   ```
   Thank you!
   
   ### Component(s)
   
   C++





[GitHub] [arrow] raulcd opened a new issue, #33786: [Release][C++] Release verification tasks fail with libxsimd-dev installed on ubuntu 22.04

2023-01-19 Thread GitBox


raulcd opened a new issue, #33786:
URL: https://github.com/apache/arrow/issues/33786

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   As pointed out during the release verification for 11.0.0 RC 0, the build failed on Ubuntu 22.04 with:
   ```
   -- Building xsimd from source
   CMake Error at cmake_modules/ThirdpartyToolchain.cmake:2295 (add_library):
 add_library cannot create imported target "xsimd" because another target
 with the same name already exists.
   Call Stack (most recent call first):
 CMakeLists.txt:498 (include)
   ```
   Full log shared by @pitrou here: https://gist.github.com/pitrou/3fdca2460fa71bba731b0706703b70b2
   
   I have been able to reproduce this on my Ubuntu 22.04 after installing `libxsimd-dev`: `$ sudo apt install libxsimd-dev`
   
   Mail thread where the issue was raised: https://lists.apache.org/thread/bxkd8xb90pf83mp17xjv3gms46yzyz2q
   
   ### Component(s)
   
   C++, Release





[GitHub] [arrow] lidavidm closed issue #15203: [Java] ArrowFileWriter/ArrowStreamWriter lack compression support

2023-01-19 Thread GitBox


lidavidm closed issue #15203: [Java] ArrowFileWriter/ArrowStreamWriter lack 
compression support
URL: https://github.com/apache/arrow/issues/15203





[GitHub] [arrow] OfekShilon opened a new issue, #33784: [R] writing/reading a data.frame with column class 'list' changes column class

2023-01-19 Thread GitBox


OfekShilon opened a new issue, #33784:
URL: https://github.com/apache/arrow/issues/33784

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   (and in addition, adds a `ptype` attribute - as already detailed in #15248)
   
   ```r
   # One way to create column with class list:
   library(tibble)
   tb <- tibble(list_column = list(c(a = 1, b = 2)))
   df <- as.data.frame(tb)
   class(df$list_column)
   # [1] "list"
   
   # Write + read back
   tmpf <- tempfile()
   arrow::write_feather(df, tmpf)
   df2 <- arrow::read_feather(tmpf)
   class(df2$list_column)
   # [1] "arrow_list"    "vctrs_list_of" "vctrs_vctr"    "list"
   
   unlink(tmpf)
   ```
   
   
   
   
   ### Component(s)
   
   R





[GitHub] [arrow] raulcd opened a new issue, #33783: [Release][C#] Release verification tasks fail with new version of C# 7.0.x

2023-01-19 Thread GitBox


raulcd opened a new issue, #33783:
URL: https://github.com/apache/arrow/issues/33783

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   dotnet 7.0.0 was released in November and we never added new tasks for it: https://dotnet.microsoft.com/en-us/download/dotnet/7.0
   We currently only have verification jobs with 6.0.202. If I run verification locally on Ubuntu 22.04 with .NET 6.0.202, the C# jobs are successful:
   ```
   ===
   Build and test C# libraries
   ===
   └ Ensuring that C# is installed...
   └ Installed C# at  (.NET 6.0.202)
   You can invoke the tool using the following command: sourcelink
   Tool 'sourcelink' (version '3.1.1') was successfully installed.
   /tmp/arrow-11.0.0.ReaWC/apache-arrow-11.0.0/csharp /tmp/arrow-11.0.0.ReaWC/apache-arrow-11.0.0 ~/code/arrow
     Determining projects to restore...
   ```
   
   but it fails if I upgrade dotnet to `7.0.102`:
   ```
   ===
   Build and test C# libraries
   ===
   └ Ensuring that C# is installed...
   └ Found C# at  (.NET 7.0.102)
   
   Welcome to .NET 7.0!
   -
   SDK Version: 7.0.102
   ...
   dev/release/verify-release-candidate.sh: line 341: 129149 Segmentation fault      (core dumped) dotnet tool install --tool-path ${csharp_bin} sourcelink
   Failed to verify release candidate. See /tmp/arrow-11.0.0.lNQyX for details.
   ```
   
   ### Component(s)
   
   C#, Release





[GitHub] [arrow] raulcd opened a new issue, #33782: [Release] Vote email number of issues is querying JIRA and producing a wrong number

2023-01-19 Thread GitBox


raulcd opened a new issue, #33782:
URL: https://github.com/apache/arrow/issues/33782

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   When generating the vote email for 11.0.0 RC 0, I realised that the generated email contains the following:
   ```
   This is a release consisting of 274 resolved JIRA issues[1].
   ```
   This number is extracted from:
   ```
   jira_url="https://issues.apache.org/jira"
   jql="project%20%3D%20ARROW%20AND%20status%20in%20%28Resolved%2C%20Closed%29%20AND%20fixVersion%20%3D%20${version}"
   n_resolved_issues=$(curl "${jira_url}/rest/api/2/search/?jql=${jql}" | jq ".total")
   ```
   This is wrong now; we should extract this number from the GitHub milestone instead:
   https://github.com/apache/arrow/milestone/1?closed=1
   I've updated this manually for the current vote email, but we should fix it in the `dev/release/02-source.sh` script.
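Reading the count from the milestone could look roughly like the sketch below. It assumes the GitHub REST API milestone endpoint (`/repos/{owner}/{repo}/milestones/{number}`) and its `closed_issues` field; the function names are hypothetical, and this is not the change made to `dev/release/02-source.sh`. The field may also count pull requests attached to the milestone, so the wording of the email might need adjusting.

```python
import json
import urllib.request


def milestone_api_url(owner, repo, number):
    """Build the REST API URL for a single milestone."""
    return f"https://api.github.com/repos/{owner}/{repo}/milestones/{number}"


def closed_issue_count(payload):
    """Extract the closed-issue count from a milestone JSON payload
    (given as str or bytes)."""
    return int(json.loads(payload)["closed_issues"])


def fetch_closed_issue_count(owner, repo, number):
    """Fetch the milestone and return its closed-issue count.

    This performs a network request; unauthenticated requests to the
    GitHub API are rate limited.
    """
    request = urllib.request.Request(
        milestone_api_url(owner, repo, number),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return closed_issue_count(response.read())
```

For the milestone linked above, the call would be `fetch_closed_issue_count("apache", "arrow", 1)`.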
   
   ### Component(s)
   
   Release





[GitHub] [arrow] MMCMA closed issue #15153: [Python] OSError: Couldn't deserialize thrift: TProtocolException

2023-01-19 Thread GitBox


MMCMA closed issue #15153: [Python] OSError: Couldn't deserialize thrift: 
TProtocolException
URL: https://github.com/apache/arrow/issues/15153





[GitHub] [arrow] thisisnic opened a new issue, #33779: [R] Nightly builds (R 3.5 and 3.6) failing

2023-01-19 Thread GitBox


thisisnic opened a new issue, #33779:
URL: https://github.com/apache/arrow/issues/33779

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   The [test-r-versions](https://github.com/ursacomputing/crossbow/actions/runs/3954166164/jobs/6771218456) nightly build is failing on R 3.5 and 3.6 due to a test introduced in #19706 
   
   ```
   ══ Failed tests 

   ── Error ('test-expression.R:154'): Nested field from a non-field-ref (struct_field kernel) ──
   Error: field 'c' not found in struct>
   Backtrace:
   ▆
1. ├─testthat::expect_error(x$c, "field 'c' not found in struct") at test-expression.R:154:2
2. │ └─testthat:::expect_condition_matching(...)
3. │   └─testthat:::quasi_capture(...)
4. │ ├─testthat (local) .capture(...)
5. │ │ └─base::withCallingHandlers(...)
6. │ └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
7. ├─x$c
8. └─arrow:::`$.Expression`(x, c)
9.   └─arrow:::get_nested_field(x, name)
   ```
   
   ### Component(s)
   
   R





[GitHub] [arrow] thisisnic opened a new issue, #33777: [R] Nightly builds failing due to dataset test not being skipped on builds without datasets module

2023-01-19 Thread GitBox


thisisnic opened a new issue, #33777:
URL: https://github.com/apache/arrow/issues/33777

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Nightly builds where datasets aren't installed are failing due to a recently introduced test that uses datasets, e.g. 
   https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=42831&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181
   
   ```
   ══ Failed tests 

   ── Error ('test-dplyr-query.R:745'): Can use nested field refs 
─
   Error: This build of the arrow package does not support Datasets
   Backtrace:
   ▆
1. ├─arrow:::expect_equal(...) at test-dplyr-query.R:745:2
2. │ └─base::inherits(object, "ArrowObject") at 
tests/testthat/helper-expectation.R:34:2
3. ├─... %>% collect()
4. ├─dplyr::collect(.)
5. ├─dplyr::filter(., nested > 7)
6. ├─dplyr::mutate(., nested = df_col$a, times2 = df_col$a * 2)
7. └─InMemoryDataset$create(.)
8.   └─arrow:::stop_if_no_datasets()
   
   [ FAIL 1 | WARN 0 | SKIP 117 | PASS 6415 ]
   Error: Test failures
   Execution halted
   
   1 error ✖ | 0 warnings ✔ | 2 notes ✖
   Error: R CMD check found ERRORs
   Execution halted
   1
   ##[error]Bash exited with code '1'.
   
   
   ```
   
   ### Component(s)
   
   R





[GitHub] [arrow-testing] jdye64 commented on pull request #86: Add gzip compressed version of file aggregate_test_100.csv to enable …

2023-01-18 Thread GitBox


jdye64 commented on PR #86:
URL: https://github.com/apache/arrow-testing/pull/86#issuecomment-1396395501

   FYI, some more context on this PR. In `arrow-datafusion` we use this 
repo for test data. I am writing a test for a bug I found involving 
gzip-compressed CSV files and noticed that no such file existed. I simply 
compressed the existing `aggregate_test_100.csv` on an Ubuntu 20 machine using 
the command `gzip aggregate_test_100.csv`.
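
A comparable fixture can be produced without the `gzip` command-line tool. The following is a minimal Python sketch, not the procedure used for this PR: the helper names `gzip_file` and `gunzip_bytes` are hypothetical, and the compressed bytes will generally differ from the committed file because the gzip header can carry a file name and a timestamp. Only the decompressed content is expected to match.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(src: Path) -> Path:
    """Write a gzip-compressed copy of `src` next to it and return its path.

    Unlike the `gzip` command, this leaves the original file in place.
    """
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as f_in, gzip.open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


def gunzip_bytes(path: Path) -> bytes:
    """Return the decompressed content of a gzip file."""
    with gzip.open(path, "rb") as f:
        return f.read()
```

Comparing `gunzip_bytes(gzip_file(p))` with `p.read_bytes()` is a quick way to confirm that a decompression test reads back exactly what was compressed.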





[GitHub] [arrow-testing] jdye64 opened a new pull request, #86: Add gzip compressed version of file aggregate_test_100.csv to enable …

2023-01-18 Thread GitBox


jdye64 opened a new pull request, #86:
URL: https://github.com/apache/arrow-testing/pull/86

   …file decompression testing





[GitHub] [arrow] assignUser opened a new issue, #33773: [Docs][Release] Add vcpkg-port update script to release management guide

2023-01-18 Thread GitBox


assignUser opened a new issue, #33773:
URL: https://github.com/apache/arrow/issues/33773

   ### Describe the enhancement requested
   
   #14610/#33467 added a script to update the vcpkg port file as part of the 
release process. This should be documented.
   
   ### Component(s)
   
   Documentation, Release





[GitHub] [arrow] westonpace closed issue #33640: [C++] as-of-join backpressure for large sources

2023-01-18 Thread GitBox


westonpace closed issue #33640: [C++] as-of-join backpressure for large sources
URL: https://github.com/apache/arrow/issues/33640





[GitHub] [arrow] kou closed issue #33754: [CI][C++] macOS arm64 verification tasks fail due to missing grpc++ headers

2023-01-18 Thread GitBox


kou closed issue #33754: [CI][C++] macOS arm64 verification tasks fail due to 
missing grpc++ headers
URL: https://github.com/apache/arrow/issues/33754





[GitHub] [arrow] wjones127 opened a new issue, #33771: [C++][Benchmark] tpch benchmark fails DCHECK

2023-01-18 Thread GitBox


wjones127 opened a new issue, #33771:
URL: https://github.com/apache/arrow/issues/33771

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   I found this DCHECK is failing for me locally in the benchmark, even though 
the unit tests are passing:
   
   
   
https://github.com/apache/arrow/blob/fb264b770b95e776ac51172f4491be2a1f1ee519/cpp/src/arrow/compute/exec/tpch_node.cc#L1795
   
   ### Component(s)
   
   Benchmarking, C++





[GitHub] [arrow] jbrockmendel opened a new issue, #33769: ENH: support quantile for temporal dtypes

2023-01-18 Thread GitBox


jbrockmendel opened a new issue, #33769:
URL: https://github.com/apache/arrow/issues/33769

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   cc @jorisvandenbossche 
   
   For some methods (e.g., dictionary_encode xref #15226, mode, min_max) it is 
straightforward to cast to integer, compute, then cast back.  For quantile I've 
found doing this breaks a bunch of pandas tests (or more accurately, fails to 
fix existing xfails).  I speculate that this has to do with lossiness in 
int->float->int conversions.
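
The suspected lossiness is easy to demonstrate in isolation. The sketch below only illustrates the general effect with plain Python integers and 64-bit floats; it does not show where (or whether) such a conversion happens inside the quantile kernel, and the timestamp value is made up for illustration.

```python
def roundtrip_through_float(value: int) -> int:
    """Convert an integer to a 64-bit float and back."""
    return int(float(value))


# Integers up to 2**53 survive the round trip; the next one does not.
small = 2**53
first_lossy = 2**53 + 1

# Illustrative nanosecond timestamp (18 January 2023 plus a sub-second part).
ts_ns = 1_674_000_000_123_456_789

for value in (small, first_lossy, ts_ns):
    back = roundtrip_through_float(value)
    print(value, back, "exact" if back == value else "lossy")
```

Nanosecond timestamps for current dates are far above 2**53, so a cast to float and back can shift them by a small number of nanoseconds, which would be enough to break exact-equality checks.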
   
   ### Component(s)
   
   Python





[GitHub] [arrow] lidavidm opened a new issue, #33767: [Go] Exported ArrowArrayStream.get_next doesn't handle uninitialized ArrowArrays well

2023-01-18 Thread GitBox


lidavidm opened a new issue, #33767:
URL: https://github.com/apache/arrow/issues/33767

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   `get_next` should set `ArrowArray.release` to `NULL` when there are no more 
records. However, the current implementation instead tries to _release_ the 
out-parameter. This is harmless when the out-parameter is 0-initialized (the 
implementation will skip the call) but otherwise it'll crash (after jumping to 
a random garbage address).
   
   ### Component(s)
   
   Go





[GitHub] [arrow] lidavidm closed issue #32584: [C++][FlightRPC] Fix linking of Flight/gRPC example on MacOS

2023-01-18 Thread GitBox


lidavidm closed issue #32584: [C++][FlightRPC] Fix linking of Flight/gRPC 
example on MacOS
URL: https://github.com/apache/arrow/issues/32584





[GitHub] [arrow] vient opened a new issue, #33765: Multiple warnings and asserts triggered in debug CPython 3.11

2023-01-18 Thread GitBox


vient opened a new issue, #33765:
URL: https://github.com/apache/arrow/issues/33765

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   CPython can be built in debug mode to catch some maybe-fatal-maybe-not 
errors. We have such a python3.11 build, configured with `--with-pydebug`. Here 
is an example of a gc warning:
   ```
   >>> import pyarrow as pa
   >>> table = pa.table({'a': [1]})
   gc:0: ResourceWarning: Object of type pyarrow.lib.Int64Array is not 
untracked before destruction
   ```
   Similar code sometimes triggers an assertion:
   ```
   gc:0: ResourceWarning: Object of type pyarrow.lib.UInt16Array is not 
untracked before destruction
   Modules/gcmodule.c:442: update_refs: Assertion "gc_get_refs(gc) != 0" failed
   Enable tracemalloc to get the memory block allocation traceback
   
   object address  : 0x7f804e8762e0
   object refcount : 0
   object type : 0x7f80fcc3f5e0
   object type name: pyarrow.lib.UInt16Array
   object repr : 
   
   Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
   Python runtime state: initialized
   ```
   Another crash:
   ```
   >>> import pyarrow as pa
   >>> pa.table({0: []})
   python: Objects/typeobject.c:1068: type_call: Assertion 
`!_PyErr_Occurred(tstate)' failed.
   Aborted (core dumped)
   ```
   
   ### Component(s)
   
   Python





[GitHub] [arrow] assignUser closed issue #32920: [Dev] More descriptive error output in merge script

2023-01-18 Thread GitBox


assignUser closed issue #32920: [Dev] More descriptive error output in merge 
script
URL: https://github.com/apache/arrow/issues/32920





[GitHub] [arrow] sfc-gh-zpeng opened a new issue, #33763: (pyarrow) pa.map_() ignores field metadata

2023-01-18 Thread GitBox


sfc-gh-zpeng opened a new issue, #33763:
URL: https://github.com/apache/arrow/issues/33763

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   A map type can be created from a key field and an item field, and custom KV 
metadata can be attached to those fields. However, when creating such a type 
using pyarrow.map_(), the field-level metadata is dropped. For example:
   
   ```
   import pyarrow as pa

   map_type = pa.map_(
       pa.field("key", pa.string(), nullable=False, metadata={"abc": "1"}),
       pa.field("value", pa.int32(), metadata={"abc": "2"}))
   ```
   
   `map_type.key_field.metadata` is None, but it's expected to be `{"abc": 
"1"}`.
   
   I believe it's a bug in pyarrow. Specifically at this line: 
https://github.com/apache/arrow/blob/1d9366f19e4b9846b33cc0c7bd7941cb5f482d74/python/pyarrow/types.pxi#L2929
   
   A new field is created and used but without the metadata of the input field.
   
   Also see: 
https://colab.research.google.com/drive/1ixsRK02I0aItU9FlHQf14IArWwR5ugiA#scrollTo=mzkPfZ5h6Td6
   
   
   ### Component(s)
   
   Python





[GitHub] [arrow-adbc] lidavidm closed issue #348: [CI] Refactor CI jobs

2023-01-18 Thread GitBox


lidavidm closed issue #348: [CI] Refactor CI jobs
URL: https://github.com/apache/arrow-adbc/issues/348





[GitHub] [arrow] nealrichardson closed issue #33758: SparkR Arrow "Hello World" Error: 'write_arrow' is not an exported object from 'namespace:arrow'

2023-01-18 Thread GitBox


nealrichardson closed issue #33758: SparkR Arrow "Hello World" Error: 
'write_arrow' is not an exported object from 'namespace:arrow'
URL: https://github.com/apache/arrow/issues/33758





[GitHub] [arrow] nealrichardson closed issue #29743: [Dev] merge_arrow_pr.py script fails if head pointer can't be checked out

2023-01-18 Thread GitBox


nealrichardson closed issue #29743: [Dev] merge_arrow_pr.py script fails if 
head pointer can't be checked out
URL: https://github.com/apache/arrow/issues/29743





[GitHub] [arrow] nealrichardson opened a new issue, #33762: [Dev] Remove Jira support from merge script

2023-01-18 Thread GitBox


nealrichardson opened a new issue, #33762:
URL: https://github.com/apache/arrow/issues/33762

   ### Describe the enhancement requested
   
   Since we've migrated, we can drop all of that, right? Also include the jira 
token store in `dev/merge.conf.sample`.
   
   ### Component(s)
   
   Developer Tools





[GitHub] [arrow] wjones127 closed issue #33605: [Python] Parquet file writes incorrect booleans on large file with default write batch size

2023-01-18 Thread GitBox


wjones127 closed issue #33605: [Python] Parquet file writes incorrect booleans 
on large file with default write batch size
URL: https://github.com/apache/arrow/issues/33605





[GitHub] [arrow] nealrichardson closed issue #18818: [R] Create a field ref to a field in a struct

2023-01-18 Thread GitBox


nealrichardson closed issue #18818: [R] Create a field ref to a field in a 
struct
URL: https://github.com/apache/arrow/issues/18818





[GitHub] [arrow-adbc] lidavidm opened a new issue, #358: [CI] Enable CGO tests on Windows

2023-01-18 Thread GitBox


lidavidm opened a new issue, #358:
URL: https://github.com/apache/arrow-adbc/issues/358

   They currently fail in some way I can't reproduce locally.





[GitHub] [arrow] nealrichardson opened a new issue, #33760: [R] Push projection expressions into ScanNode

2023-01-18 Thread GitBox


nealrichardson opened a new issue, #33760:
URL: https://github.com/apache/arrow/issues/33760

   ### Describe the enhancement requested
   
   https://github.com/apache/arrow/pull/19706/files#r1073391100 pointed out 
that in creating the ScanNode, we're extracting field names from Expressions in 
order to pass them to C++, which then makes FieldRef Expressions again. We can 
probably eliminate that step. Doing so may mean we need to drop a following 
Project step (or not, we'll have to see), and if so that means our 
`show_query()` output would change too--but if the projection doesn't show up 
faithfully in the print method of the ScanNode, we may want to reconsider (or, 
better, improve the ScanNode print).
   
   ### Component(s)
   
   R





[GitHub] [arrow] Treize44 opened a new issue, #33759: How to limit the memory consumption of to_batches()

2023-01-18 Thread GitBox


Treize44 opened a new issue, #33759:
URL: https://github.com/apache/arrow/issues/33759

   ### Describe the usage question you have. Please include as many useful 
details as possible.
   
   
   In order to get the unique values of a column of a 500GB Parquet dataset 
(made of 13 000 fragments) on a computer with 12GB of memory, I chose to use 
to_batches() as follows:
   `
   import pyarrow as pa
   import pyarrow.dataset as ds
   
   # path, timestamp and column_name are placeholders not defined in this snippet
   partitioning = ds.partitioning(
       pa.schema([(timestamp, pa.timestamp("us"))]),
       flavor="hive",
   )
   unique_values = set()
   dataset = ds.dataset(path, format="parquet", partitioning=partitioning)
   batch_it = dataset.to_batches(columns=[column_name])
   for batch in batch_it:
       unique_values.update(batch.column(column_name).unique())
   `
   The problem is that the process quickly accumulates memory and exceeds the 
amount available.
   When I put a breakpoint on the line "for batch in batch_it", the process 
continues to accumulate memory until it crashes.
   
   I understand that to_batches reads ahead, but I thought I could limit it 
with the "fragment_readahead" parameter. Is there a way to limit readahead? Is 
there a way to "free" memory after a batch has been consumed?
   Is there another way to go? My first try was using to_table(), but it needs 
20GB of memory in that case. It seems that to_batches would also need 20GB.
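
To make the behaviour being asked for concrete, the sketch below models bounded readahead in plain Python. It is not how pyarrow implements `to_batches()`: `bounded_readahead` and `unique_values` are hypothetical names, and plain lists stand in for record batches. The point is only that a producer which blocks once a fixed number of batches is pending keeps memory proportional to that limit instead of to the dataset size.

```python
import queue
import threading


def bounded_readahead(batches, max_ahead):
    """Yield items from `batches`, buffering at most `max_ahead` of them."""
    pending = queue.Queue(maxsize=max_ahead)
    done = object()  # sentinel marking the end of the stream

    def produce():
        for batch in batches:
            pending.put(batch)  # blocks while the buffer is full
        pending.put(done)

    threading.Thread(target=produce, daemon=True).start()
    while True:
        item = pending.get()
        if item is done:
            return
        yield item


def unique_values(batches, max_ahead=2):
    """Collect distinct values while holding only a few batches in memory."""
    seen = set()
    for batch in bounded_readahead(batches, max_ahead):
        seen.update(batch)
    return seen
```

In this model at most `max_ahead` batches sit in the buffer, plus one held by the blocked producer and one being processed by the consumer. Memory that keeps growing while the consumer is paused would suggest that no such bound is in effect.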
   
   ### Component(s)
   
   Python





[GitHub] [arrow] cyborne100 opened a new issue, #33758: SparkR Arrow "Hello World" Error: 'write_arrow' is not an exported object from 'namespace:arrow'

2023-01-18 Thread GitBox


cyborne100 opened a new issue, #33758:
URL: https://github.com/apache/arrow/issues/33758

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Using Spark on Databricks runtime 10.4 LTS | Spark 3.2.1 | Scala 2.12. I am 
attempting to use the "hello world" instructions from [the SparkR 
pages](https://spark.apache.org/docs/latest/sparkr.html#apache-arrow-in-sparkr).
  Both SparkR and arrow are installed at the cluster level.  For some reason, 
Arrow & SparkR are trying to call write_arrow (which was deprecated in Arrow 
1.0).
   
   Running:
   
   ```
   library(SparkR)
   library(arrow)
   # Converts Spark DataFrame from an R DataFrame
   spark_df <- createDataFrame(mtcars)
   
   # Converts Spark DataFrame to an R DataFrame
   collect(spark_df)
   
   # Apply an R native function to each partition.
   collect(dapply(spark_df, function(rdf) { data.frame(rdf$gear + 1) }, structType("gear double")))
   
   # Apply an R native function to grouped data.
   collect(gapply(spark_df,
                  "gear",
                  function(key, group) {
                    data.frame(gear = key[[1]], disp = mean(group$disp) > group$disp)
                  },
                  structType("gear double, disp boolean")))
   ```
   
   The notebook error from
   `collect(dapply(spark_df, function(rdf) { data.frame(rdf$gear + 1) }, 
structType("gear double")))
   `
   is:
   
   > Error in readBin(con, raw(), as.integer(dataLen), endian = "big") : 
   >   invalid 'n' argument
   
   Digging further into the Spark job stderr, I get:
   
   > Job aborted due to stage failure: Task 0 in stage 5.0 failed 4 times, most 
recent failure: Lost task 0.3 in stage 5.0 (TID 14) (10.1.8.43 executor 2): 
org.apache.spark.SparkException: R unexpectedly exited.
   **R worker produced errors: Error: 'write_arrow' is not an exported object 
from 'namespace:arrow'
   Execution halted**
   > 
   >at 
org.apache.spark.api.r.BaseRRunner$ReaderIterator$$anonfun$1.applyOrElse(BaseRRunner.scala:169)
   >at 
org.apache.spark.api.r.BaseRRunner$ReaderIterator$$anonfun$1.applyOrElse(BaseRRunner.scala:162)
   >at 
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
   >at 
org.apache.spark.sql.execution.r.ArrowRRunner$$anon$2.read(ArrowRRunner.scala:194)
   >at 
org.apache.spark.sql.execution.r.ArrowRRunner$$anon$2.read(ArrowRRunner.scala:123)
   >at 
org.apache.spark.api.r.BaseRRunner$ReaderIterator.hasNext(BaseRRunner.scala:138)
   >at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
   >at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
   >at 
org.apache.spark.sql.execution.arrow.ArrowConverters$$anon$1.hasNext(ArrowConverters.scala:206)
   >at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
   >at scala.collection.Iterator.foreach(Iterator.scala:943)
   >at scala.collection.Iterator.foreach$(Iterator.scala:943)
   >at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
   >at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
   >at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
   >at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
   >at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
   >at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
   >at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
   >at scala.collection.AbstractIterator.to(Iterator.scala:1431)
   >at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358)
   >at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358)
   >at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1431)
   >at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:345)
   >at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:339)
   >at scala.collection.AbstractIterator.toArray(Iterator.scala:1431)
   >at 
org.apache.spark.sql.Dataset.$anonfun$collectAsArrowToR$3(Dataset.scala:3841)
   >at 
org.apache.spark.scheduler.ResultTask.$anonfun$runTask$3(ResultTask.scala:75)
   >at 
com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
   >at 
org.apache.spark.scheduler.ResultTask.$anonfun$runTask$1(ResultTask.scala:75)
   >at 
com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
   >at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:55)
   >at org.apache.spark.scheduler.Task.doRunTask(Task.scala:156)
   >at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:125)
   >at 
com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
   >at org.apache.spark.scheduler.Task.run(Task.scala:95)
   >at 

[GitHub] [arrow] nealrichardson opened a new issue, #33757: [R] Bindings for list_element and list_slice

2023-01-18 Thread GitBox


nealrichardson opened a new issue, #33757:
URL: https://github.com/apache/arrow/issues/33757

   ### Describe the enhancement requested
   
   #19706 added bindings for `[[` to the `struct_field` function. We could also 
do `list_element` with that if the expression is a list type, and map `[` to 
`list_slice` as well. 
   
   ### Component(s)
   
   R





[GitHub] [arrow] nealrichardson opened a new issue, #33756: [R] Support making FieldRef from integer

2023-01-18 Thread GitBox


nealrichardson opened a new issue, #33756:
URL: https://github.com/apache/arrow/issues/33756

   ### Describe the enhancement requested
   
   #19706 added support for creating nested field refs, and it uncovered that 
it is possible in C++ to create FieldRefs from integer positions but it is not 
supported in R. `Expression$field_ref(2)` is theoretically useable, but support 
for `struct_column[[2]]` in a dplyr pipeline would be more practically useful.
   
   ### Component(s)
   
   R





[GitHub] [arrow] kou closed issue #33752: [Packaging][Conda] Ubuntu: libarrow conda package fails to install on ecryptfs

2023-01-18 Thread GitBox


kou closed issue #33752: [Packaging][Conda] Ubuntu: libarrow conda package 
fails to install on ecryptfs
URL: https://github.com/apache/arrow/issues/33752





[GitHub] [arrow] kou closed issue #15139: [C++] arrow.pc is missing dependencies with Windows static builds

2023-01-18 Thread GitBox


kou closed issue #15139: [C++] arrow.pc is missing dependencies with Windows 
static builds
URL: https://github.com/apache/arrow/issues/15139





[GitHub] [arrow] pitrou closed issue #14923: [C++][Parquet] DeltaBitPackDecoder expects all miniblock bitwidths to be present for the last block

2023-01-18 Thread GitBox


pitrou closed issue #14923: [C++][Parquet] DeltaBitPackDecoder expects all 
miniblock bitwidths to be present for the last block
URL: https://github.com/apache/arrow/issues/14923





[GitHub] [arrow] raulcd opened a new issue, #33754: [CI][C++] macOS arm64 verification tasks fail due to missing grpc++ headers

2023-01-18 Thread GitBox


raulcd opened a new issue, #33754:
URL: https://github.com/apache/arrow/issues/33754

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Since yesterday the nightlies for macOS arm64:
   
[verify-rc-source-cpp-macos-arm64](https://github.com/ursacomputing/crossbow/actions/runs/3939430656/jobs/6739280311)
   
[verify-rc-source-integration-macos-arm64](https://github.com/ursacomputing/crossbow/actions/runs/3939434913/jobs/6739289762)
   
   have failed with:
   ```
-- Forcing gRPC_SOURCE to Protobuf_SOURCE (SYSTEM)
   CMake Warning at cmake_modules/FindgRPCAlt.cmake:25 (find_package):
 By not providing "FindgRPC.cmake" in CMAKE_MODULE_PATH this project has
 asked CMake to find a package configuration file provided by "gRPC", but
 CMake did not find one.
   
 Could not find a package configuration file provided by "gRPC" (requested
 version 1.17.0) with any of the following names:
   
   gRPCConfig.cmake
   grpc-config.cmake
   
 Add the installation prefix of "gRPC" to CMAKE_PREFIX_PATH or set
 "gRPC_DIR" to a directory containing one of the above files.  If "gRPC"
 provides a separate development package or SDK, be sure it has been
 installed.
   Call Stack (most recent call first):
 cmake_modules/ThirdpartyToolchain.cmake:280 (find_package)
 cmake_modules/ThirdpartyToolchain.cmake:3942 (resolve_dependency)
 CMakeLists.txt:498 (include)
   
   
   -- Checking for module 'grpc++'
   --   No package 'grpc++' found
   -- Providing CMake module for gRPCAlt as part of Arrow CMake package
   -- pkg-config package for grpc++ for static link isn't found
   CMake Error at cmake_modules/ThirdpartyToolchain.cmake:3957 
(get_target_property):
 get_target_property() called with non-existent target "gRPC::grpc++".
   Call Stack (most recent call first):
 CMakeLists.txt:498 (include)
   
   
   CMake Error at cmake_modules/ThirdpartyToolchain.cmake:3965 (message):
 Cannot find grpc++ headers in
   Call Stack (most recent call first):
 CMakeLists.txt:498 (include)
   
   
   -- Configuring incomplete, errors occurred!
   See also 
"/var/folders/dl/2sqc_b2s20vfy540jn97pz8hgn/T/arrow-HEAD.X.2O9roRLY/cpp-build/CMakeFiles/CMakeOutput.log".
   See also 
"/var/folders/dl/2sqc_b2s20vfy540jn97pz8hgn/T/arrow-HEAD.X.2O9roRLY/cpp-build/CMakeFiles/CMakeError.log".
   Failed to verify release candidate. See 
/var/folders/dl/2sqc_b2s20vfy540jn97pz8hgn/T/arrow-HEAD.X.2O9roRLY for 
details.
   ```
   This has also failed on the Release Candidate 0 verification tasks for 
11.0.0:
   https://github.com/apache/arrow/pull/33751#issuecomment-1387057497
   
   
   ### Component(s)
   
   C++, Continuous Integration





[GitHub] [arrow] crusaderky opened a new issue, #33752: Ubuntu: libarrow conda package fails to install on ecryptfs

2023-01-18 Thread GitBox


crusaderky opened a new issue, #33752:
URL: https://github.com/apache/arrow/issues/33752

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Ubuntu 22.04.1 x86-64
   conda 22.9.0 (defaults channel)
   
   My home directory was created on top of ecryptfs by the Ubuntu installer:
   
   > $ mount | grep home
   > /home/.ecryptfs/crusaderky/.Private on /home/crusaderky type ecryptfs 
(rw,nosuid,nodev,relatime,ecryptfs_fnek_sig=0ee7f63b0c91f840,ecryptfs_sig=c7f3e46a3b8390b1,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs)
   
   Trying to install libarrow with conda fails with `[Errno 36] File name too 
long`:
   > $ conda create -n test libarrow

 
   > InvalidArchiveError("Error with archive 
/home/crusaderky/miniconda3/pkgs/libarrow-10.0.1-h86614e7_4_cpu.conda.  You 
probably need to delete and re-download or re-create this file.  Message 
was:\n\nfailed with error: [Errno 36] File name too long: 
'/home/crusaderky/miniconda3/pkgs/libarrow-10.0.1-h86614e7_4_cpu/share/gdb/auto-load/home/conda/feedstock_root/build_artifacts/apache-arrow_1673819166020/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol'")
   
   # Workaround
   Move the conda root dir to ext4:
   ```bash
   $ sudo mkdir /home/$USER-nocrypt
   $ sudo chown $USER:users /home/$USER-nocrypt
   $ mv /home/$USER/miniconda3 /home/$USER-nocrypt/
   $ ln -s /home/$USER-nocrypt/miniconda3 /home/$USER/miniconda3
   $ conda create -n test libarrow
   ```
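
A rough way to spot the offending path component before installing, sketched in Python. The 143-byte limit is an assumption about eCryptfs with filename encryption enabled (most other Linux filesystems allow 255), and the example path is hypothetical, only shaped like the one in the error message above.

```python
from pathlib import PurePosixPath

# Assumption: eCryptfs with filename encryption accepts names of at most
# about 143 bytes; adjust the limit for other filesystems.
ECRYPTFS_NAME_MAX = 143


def too_long_components(path, limit=ECRYPTFS_NAME_MAX):
    """Return the components of `path` whose UTF-8 encoding exceeds `limit` bytes."""
    return [
        part
        for part in PurePosixPath(path).parts
        if len(part.encode("utf-8")) > limit
    ]


# Hypothetical path with a long conda build placeholder directory.
placeholder = "_h_env" + "_placehold" * 24
example = f"/home/user/miniconda3/pkgs/libarrow/share/gdb/auto-load/{placeholder}/lib"
offenders = too_long_components(example)
```

Any name returned by `too_long_components` would be rejected on the encrypted mount even though it is valid on ext4, which is why moving the conda root works around the problem.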
   
   ### Component(s)
   
   Python





[GitHub] [arrow] otegami opened a new issue, #33750: [GLib][Ruby] Add support for the option to set `chunksize` in TableBatchReader

2023-01-18 Thread GitBox


otegami opened a new issue, #33750:
URL: https://github.com/apache/arrow/issues/33750

   ### Describe the enhancement requested
   
   ## Target
   
   TableBatchReader's `chunksize`
   - ref: https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L258
   
   ## Proposed feature
   
   Add support for the option to set `chunksize` in TableBatchReader
   
   ## Impact of this request
   
   It allows the maximum number of records in each record batch to be specified
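
To illustrate what a maximum batch size means here, a simplified Python model follows. It assumes a single column whose chunk lengths are given as a list, and it is not the TableBatchReader implementation; `batch_lengths` is a hypothetical helper name.

```python
def batch_lengths(chunk_lengths, chunksize):
    """Yield the row count of each batch for one chunked column.

    A batch never crosses a chunk boundary and never holds more than
    `chunksize` rows.
    """
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    for length in chunk_lengths:
        remaining = length
        while remaining > 0:
            take = min(remaining, chunksize)
            yield take
            remaining -= take


# A column stored as chunks of 5 and 2 rows, read with chunksize=3.
lengths = list(batch_lengths([5, 2], 3))
```

With a large enough `chunksize` the batches simply follow the existing chunk layout, so the option only ever makes batches smaller.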
   
   ### Component(s)
   
   GLib, Ruby





[GitHub] [arrow] otegami opened a new issue, #33749: [Ruby] Add Arrow::RecordBatch#each_raw_record

2023-01-18 Thread GitBox


otegami opened a new issue, #33749:
URL: https://github.com/apache/arrow/issues/33749

   ### Describe the enhancement requested
   
   ## Target method
   
   Arrow::RecordBatch#raw_records
   
   ## Proposed feature
   
   Add an Arrow::RecordBatch#each_raw_record method that iterates over the 
records returned by Arrow::RecordBatch#raw_records.
   
   ## Impact of this request
   
   It can iterate over huge datasets, such as those using the Apache Parquet 
format.
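
The difference between the two methods can be sketched in Python. These functions are stand-ins for the Ruby methods, and representing a record batch as a list of column lists is an assumption made only for illustration.

```python
def raw_records(columns):
    """Materialize every row at once, as a list of lists."""
    return [list(row) for row in zip(*columns)]


def each_raw_record(columns):
    """Yield one row at a time, so only a single row is held in memory."""
    for row in zip(*columns):
        yield list(row)


# Two columns of a hypothetical record batch.
columns = [[1, 2, 3], ["a", "b", "c"]]
first = next(each_raw_record(columns))
```

The iterator produces the same rows as the eager version, but a caller can stop early or process rows as they arrive.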
   
   ### Component(s)
   
   Ruby





[GitHub] [arrow] ava6969 opened a new issue, #33747: Published new library panda-apache

2023-01-18 Thread GitBox


ava6969 opened a new issue, #33747:
URL: https://github.com/apache/arrow/issues/33747

   ### Describe the enhancement requested
   
   This library provides a pandas-like interface over Apache Arrow while 
maintaining Arrow's performance. If it would be useful to you, I am open to 
collaboration.
   
   https://github.com/ava6969/panda-arrow.git
   
   ### Component(s)
   
   C++





[GitHub] [arrow] thisisnic opened a new issue, #33746: Update NEWS for 11.0.0

2023-01-18 Thread GitBox


thisisnic opened a new issue, #33746:
URL: https://github.com/apache/arrow/issues/33746

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Update NEWS.md in R package
   
   ### Component(s)
   
   R





[GitHub] [arrow] jorisvandenbossche opened a new issue, #33745: [C++][Doc] Update "struct_field" kernel documentation about passing field names in addition to indices

2023-01-18 Thread GitBox


jorisvandenbossche opened a new issue, #33745:
URL: https://github.com/apache/arrow/issues/33745

   https://github.com/apache/arrow/pull/14495 updated the "struct_field" kernel, 
but the documentation at 
https://arrow.apache.org/docs/dev/cpp/compute.html#cpp-compute-vector-structural-transforms
 (note (6)) was not updated accordingly.





[GitHub] [arrow] zhztheplayer opened a new issue, #33743: [Java] Release outstanding buffers when BaseAllocator is being closed

2023-01-18 Thread GitBox


zhztheplayer opened a new issue, #33743:
URL: https://github.com/apache/arrow/issues/33743

   ### Describe the enhancement requested
   
   This mainly aims to enhance 
[BaseAllocator#close()](https://github.com/apache/arrow/blob/4e439f6a597180c5fc8ff1552c860cecd33736c5/java/memory/memory-core/src/main/java/org/apache/arrow/memory/BaseAllocator.java#L370-L454)
 so that it implements the original design of its super method 
`BufferAllocator#close()`: 
   
   
https://github.com/apache/arrow/blob/4e439f6a597180c5fc8ff1552c860cecd33736c5/java/memory/memory-core/src/main/java/org/apache/arrow/memory/BufferAllocator.java#L88-L95
   
   The implementation should be fast enough not to noticeably slow down the 
current allocation process. We should also include detailed information about 
this clean-up action in the allocator-close logs.
   
   ### Component(s)
   
   Java





[GitHub] [arrow] AlenkaF opened a new issue, #33742: [Python] Address docstrings in Data Types classes

2023-01-18 Thread GitBox


AlenkaF opened a new issue, #33742:
URL: https://github.com/apache/arrow/issues/33742

   ### Describe the enhancement requested
   
   Ensure docstrings for [Data Types 
Classes](https://arrow.apache.org/docs/python/api/datatypes.html#type-classes) 
have an Examples section.
   
   
   
   ### Component(s)
   
   Python





[GitHub] [arrow] AlenkaF opened a new issue, #33741: [Python] Address docstrings in Data Types Factory Functions

2023-01-18 Thread GitBox


AlenkaF opened a new issue, #33741:
URL: https://github.com/apache/arrow/issues/33741

   ### Describe the enhancement requested
   
   Ensure docstrings for [Data Types Factory 
Functions](https://arrow.apache.org/docs/python/api/datatypes.html#factory-functions)
 have an Examples section.
   
   
   ### Component(s)
   
   Python





[GitHub] [arrow] pitrou closed issue #33740: [C++] Flight build error with conda packages (requiring static linking)

2023-01-18 Thread GitBox


pitrou closed issue #33740: [C++] Flight build error with conda packages 
(requiring static linking)
URL: https://github.com/apache/arrow/issues/33740





[GitHub] [arrow] pitrou opened a new issue, #33740: [C++] Flight build error with conda packages (requiring static linking)

2023-01-18 Thread GitBox


pitrou opened a new issue, #33740:
URL: https://github.com/apache/arrow/issues/33740

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   I'm getting this error after a git pull:
   ```
   -- Linking Arrow Flight tests statically due to static Protobuf
   -- Linking Arrow Flight tests statically due to static gRPC
   -- If static Protobuf or gRPC are used, Arrow must be built statically
   -- (These libraries have global state, and linkage must be consistent)
   CMake Error at src/arrow/flight/CMakeLists.txt:48 (message):
 Must build Arrow statically to link Flight tests statically
   
   ```
   
   
   ### Component(s)
   
   C++





[GitHub] [arrow] raulcd closed issue #14997: [Release][Archery] Update archery release curate to support GitHub issues

2023-01-18 Thread GitBox


raulcd closed issue #14997: [Release][Archery] Update archery release curate to 
support GitHub issues
URL: https://github.com/apache/arrow/issues/14997





[GitHub] [arrow] raulcd closed issue #15002: [Release][Archery] Update archery release cherry-pick to support GitHub issues

2023-01-18 Thread GitBox


raulcd closed issue #15002: [Release][Archery] Update archery release 
cherry-pick to support GitHub issues
URL: https://github.com/apache/arrow/issues/15002





[GitHub] [arrow] raulcd closed issue #14999: [Release][Archery] Update archery release changelog to support GitHub issues

2023-01-18 Thread GitBox


raulcd closed issue #14999: [Release][Archery] Update archery release changelog 
to support GitHub issues
URL: https://github.com/apache/arrow/issues/14999





[GitHub] [arrow] westonpace opened a new issue, #33737: [C++] Simplify tracing in exec plan

2023-01-17 Thread GitBox


westonpace opened a new issue, #33737:
URL: https://github.com/apache/arrow/issues/33737

   ### Describe the enhancement requested
   
   The old tracing model starts a span when a node starts and ends the span 
when the node marks itself finished.  Some nodes start an additional 
InputReceived span with the above mentioned span as parent.  This makes it 
rather difficult to tell where time is actually being spent because large 
blocks of the span represent idle time.  It does not accurately reflect time 
spent.
   
   I've changed the model to use async scheduler tasks as spans.  In practice, 
this means that there is now a span per fragment.  It may have child spans for 
each of the nodes that run on the fragment (simple nodes may just mark their 
execution as an event).  This will also allow us to get rid of the 
ExecNode::finished_ futures, as they are no longer really necessary (they 
currently still show up as "waiting for finish" spans that don't really provide 
any useful information).
   
   ### Component(s)
   
   C++





[GitHub] [arrow] chrisirhc opened a new issue, #33734: [Go] arrow library is not compatible with grpc < 1.45 due to use of reflection experimental interface

2023-01-17 Thread GitBox


chrisirhc opened a new issue, #33734:
URL: https://github.com/apache/arrow/issues/33734

   ### Describe the enhancement requested
   
   The arrow library cannot be used in projects with grpc < 1.45, because it 
references a reflection interface that was only added in v1.45.0 via 
https://github.com/grpc/grpc-go/commit/18564ff61d5505d955c7bd1adc28e4f1ed96300c 
. 
   
   This is due to a single line that references an experimental interface in 
grpc.reflection package:
   
https://github.com/apache/arrow/blob/c8d6110a26c41966e539e9fa2f5cb8c31dc2f0fe/go/arrow/flight/server.go#L97-L99
   
   The interface is defined as:
   
https://github.com/grpc/grpc-go/blob/4c776ec01572d55249df309251900554b46adb41/reflection/serverreflection.go#L69-L83
   
   I propose to inline this interface so that the go arrow library can be used 
in projects with earlier versions of grpc which don't contain this experimental 
interface. This should maintain the reflection capabilities introduced in 
https://github.com/apache/arrow/commit/07e7009154dc64967543ccd6462841443a8586b7 
but make go arrow library compatible with grpc < 1.45.
   
   ### Component(s)
   
   Go





[GitHub] [arrow] cyb70289 closed issue #33655: [C++][Parquet] Write columns in parallel for parquet writer

2023-01-17 Thread GitBox


cyb70289 closed issue #33655: [C++][Parquet] Write columns in parallel for 
parquet writer
URL: https://github.com/apache/arrow/issues/33655





[GitHub] [arrow] ziggythehamster opened a new issue, #33733: Amazon Linux 2 RPMs - openssl-devel cannot coexist with openssl11-devel and breaks installing arrow-devel

2023-01-17 Thread GitBox


ziggythehamster opened a new issue, #33733:
URL: https://github.com/apache/arrow/issues/33733

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   The `arrow-devel` package depends on `openssl-devel` on RPM-based distros. 
On Amazon Linux 2, `openssl-devel` and `openssl11-devel` cannot coexist, thus 
you cannot install `arrow-devel` on a system that has `openssl11-devel` 
installed.
   
   Arrow seems to support OpenSSL 1.0 and 1.1, but is built with OpenSSL 1.0 on 
Amazon Linux 2, and would depend on the OpenSSL 1.0 headers installed by 
`openssl-devel` (so you couldn't simply make the requirement be either one). 
Perhaps there needs to be an `arrow-openssl11-devel` on Amazon Linux 2?
   
   ### Component(s)
   
   Release





[GitHub] [arrow] lidavidm closed issue #32901: [C++][Python][FlightRPC] Add Flight SQL ADBC driver and Python bindings

2023-01-17 Thread GitBox


lidavidm closed issue #32901: [C++][Python][FlightRPC] Add Flight SQL ADBC 
driver and Python bindings
URL: https://github.com/apache/arrow/issues/32901





[GitHub] [arrow] thisisnic closed issue #33526: [R] Implement new function open_dataset_csv with signature more closely matching read_csv_arrow

2023-01-17 Thread GitBox


thisisnic closed issue #33526: [R] Implement new function open_dataset_csv with 
signature more closely matching read_csv_arrow
URL: https://github.com/apache/arrow/issues/33526





[GitHub] [arrow] wjones127 closed issue #15212: ORC writer doesn't work on sliced list arrays

2023-01-17 Thread GitBox


wjones127 closed issue #15212: ORC writer doesn't work on sliced list arrays
URL: https://github.com/apache/arrow/issues/15212





[GitHub] [arrow] assignUser closed issue #20512: [Python] Quadratic memory usage of Table.to_pandas with nested data

2023-01-17 Thread GitBox


assignUser closed issue #20512: [Python] Quadratic memory usage of 
Table.to_pandas with nested data
URL: https://github.com/apache/arrow/issues/20512





[GitHub] [arrow] assignUser closed issue #33726: Set consistent host name in Go benchmarks in CI

2023-01-17 Thread GitBox


assignUser closed issue #33726: Set consistent host name in Go benchmarks in CI
URL: https://github.com/apache/arrow/issues/33726





[GitHub] [arrow] little-arhat opened a new issue, #33729: Support Python enums in pyarrow

2023-01-17 Thread GitBox


little-arhat opened a new issue, #33729:
URL: https://github.com/apache/arrow/issues/33729

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Hello! 
   
   Filing this as a bug, though it could be a feature request or even a usage 
question.
   
   Code:
   ```
   import pyarrow
   from enum import Enum
   import pandas as pd
   
   class Unit(Enum):
       A = "A"
       B = "B"
   
   df = pd.DataFrame({'x': [Unit.A, Unit.B]})
   
   print(pyarrow.Table.from_pandas(df))
   ```
   
   Expected:
   
   Something like
   ```
   pyarrow.Table
   x: dictionary
   
   x: [  -- dictionary:
   ["A","B"]  -- indices:
   [0,1]]
   ```
   
   Got:
   
   ```
   Traceback (most recent call last):
     File "x.py", line 12, in <module>
       print(pyarrow.Table.from_pandas(df))
     File "pyarrow/table.pxi", line 3475, in pyarrow.lib.Table.from_pandas
     File "/Users/a/venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 611, in dataframe_to_arrays
       arrays = [convert_column(c, f)
     File "/Users/a/venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 611, in <listcomp>
       arrays = [convert_column(c, f)
     File "/Users/a/venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 598, in convert_column
       raise e
     File "/Users/a/venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 592, in convert_column
       result = pa.array(col, type=type_, from_pandas=True, safe=safe)
     File "pyarrow/array.pxi", line 316, in pyarrow.lib.array
     File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
     File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
   pyarrow.lib.ArrowInvalid: ("Could not convert  with type Unit: did not recognize Python value type when inferring an Arrow data type", 'Conversion failed for column x with type object')
   ```
   
   Extracting `.name` from enum values and converting to `category` works as 
expected. 
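
For reference, the dictionary layout in the expected output can be reproduced from the enum names in plain Python. This is only an illustration of the encoding, not how pyarrow or pandas implement it; `dictionary_encode` is a hypothetical helper.

```python
from enum import Enum


class Unit(Enum):
    A = "A"
    B = "B"


def dictionary_encode(values):
    """Map enum members to (dictionary of names, list of indices)."""
    dictionary = []
    positions = {}
    indices = []
    for value in values:
        name = value.name
        if name not in positions:
            # First time this name is seen: append it to the dictionary.
            positions[name] = len(dictionary)
            dictionary.append(name)
        indices.append(positions[name])
    return dictionary, indices


dictionary, indices = dictionary_encode([Unit.A, Unit.B])
```

Extracting `.name` first is what makes the values plain strings, which is the step the conversion currently does not perform on its own.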
   
   ### Component(s)
   
   Python





[GitHub] [arrow] pitrou closed issue #14875: [Python][C++] C Data Interface incorrect validate failures

2023-01-17 Thread GitBox


pitrou closed issue #14875: [Python][C++] C Data Interface incorrect validate 
failures
URL: https://github.com/apache/arrow/issues/14875





[GitHub] [arrow] crusaderky opened a new issue, #33727: pandas string[pyarrow] -> category -> to_parquet fails

2023-01-17 Thread GitBox


crusaderky opened a new issue, #33727:
URL: https://github.com/apache/arrow/issues/33727

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   pandas 1.5.2
   pyarrow 10.0.1
   
   If you convert a pandas Series with dtype `string[pyarrow]` to `category`, 
the categories will be `string[pyarrow]`. So far, so good.
   However, when you try writing the resulting object to parquet, PyArrow fails 
as it does not recognize its own datatype.
   
   ## Reproducer
   ```python
   >>> import pandas as pd
   >>> df = pd.DataFrame({"x": ["foo", "bar", "foo"]}, dtype="string[pyarrow]")
   >>> df.dtypes.x
   string[pyarrow]
   >>> df = df.astype("category")
   >>> df.dtypes.x
   CategoricalDtype(categories=['bar', 'foo'], ordered=False)
   >>> df.dtypes.x.categories.dtype
   string[pyarrow]
   >>> df.to_parquet("foo.parquet")
   pyarrow.lib.ArrowInvalid: ("Could not convert  
with type pyarrow.lib.StringScalar: did not recognize Python value type when 
inferring an Arrow data type", 'Conversion failed for column x with type 
category')
   ```
   ## Workaround
   ```python
   df = df.astype(
       {
           k: pd.CategoricalDtype(v.categories.astype(object))
           for k, v in df.dtypes.items()
           if isinstance(v, pd.CategoricalDtype)
           and v.categories.dtype == "string[pyarrow]"
       }
   )
   ```
   
   
   ### Component(s)
   
   Python





[GitHub] [arrow] domoritz closed issue #33681: [JS] Update flatbuffers

2023-01-17 Thread GitBox


domoritz closed issue #33681: [JS] Update flatbuffers
URL: https://github.com/apache/arrow/issues/33681





[GitHub] [arrow] alistaire47 opened a new issue, #33726: Set consistent host name in Go benchmarks in CI

2023-01-17 Thread GitBox


alistaire47 opened a new issue, #33726:
URL: https://github.com/apache/arrow/issues/33726

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Go benchmarks currently run in CI, but the runner nodes vary and we do not 
set a consistent hostname, so we can't see the history of the benchmarks: the 
hostnames keep changing, e.g.
   
   - fv-az570-835 
https://conbench.ursa.dev/benchmarks/6948d6fcaa08438d9e3a31a05fb7a62a/
   - fv-az361-674 
https://conbench.ursa.dev/benchmarks/aeb05a213dac45e59f62dc2f6d9e888d/
   - Mac-1673971649551.local 
https://conbench.ursa.dev/benchmarks/68559e0a74a34327a130f9dc719e7778/
   - Mac-1672779927260.local 
https://conbench.ursa.dev/benchmarks/e05538be04f749f9b3d0f5299b1aeca9/
   
   The solution for this is to set an environment variable called 
`CONBENCH_MACHINE_INFO_NAME` to a consistent value so [this 
code](https://github.com/conbench/conbench/blob/main/benchadapt/python/benchadapt/_machine_info.py#L161)
 will pick it up and use it instead. We're running on two types of runners, so 
we'll need to insert the env var in 
https://github.com/apache/arrow/blob/master/.github/workflows/go.yml in both 
the envs on 
[L94-98](https://github.com/apache/arrow/blob/master/.github/workflows/go.yml#L94-L98)
 and 
[L264-268](https://github.com/apache/arrow/blob/master/.github/workflows/go.yml#L264-L268).
 We can hardcode values in those two locations with names for the types of 
runners they are, e.g. something like `amd64-debian-11` and `amd64-macos-11`, 
respectively. Some of our other host names, for reference: 
   
   - ec2-m5-4xlarge-us-east-2
   - arm64-t4g-linux-compute
   - ursa-i9-9960x
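
The override behaviour described above amounts to roughly the following. This is a minimal sketch, not the benchadapt code linked earlier; only the `CONBENCH_MACHINE_INFO_NAME` variable name comes from this issue, and `machine_name` is a hypothetical helper.

```python
import os
import platform


def machine_name(environ=None):
    """Return the configured machine name, falling back to the hostname."""
    environ = os.environ if environ is None else environ
    # An unset or empty variable falls through to the runner's hostname,
    # which is what produces names like "fv-az570-835" today.
    return environ.get("CONBENCH_MACHINE_INFO_NAME") or platform.node()


configured = machine_name({"CONBENCH_MACHINE_INFO_NAME": "amd64-debian-11"})
```

Setting the variable in the workflow's `env` block would therefore give every run on that runner type the same name, regardless of which node picks up the job.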
   
   Long-term, we plan to move these benchmarks out of Arrow's CI and together 
with the rest in 
[voltrondata-labs/arrow-benchmarks-ci](https://github.com/voltrondata-labs/arrow-benchmarks-ci),
 but there's work to do before we're ready for that, and in the meantime, 
cleaning up our naming will let us see the history we're generating for Go 
benchmarks.
   
   ### Component(s)
   
   Continuous Integration, Go





[GitHub] [arrow] westonpace opened a new issue, #33724: [Doc]: Update Acero Substrait conformance for 11.0.0 release

2023-01-17 Thread GitBox


westonpace opened a new issue, #33724:
URL: https://github.com/apache/arrow/issues/33724

   ### Describe the enhancement requested
   
   Since we are approaching the release we should update the doc to reflect the 
newly added capabilities & restrictions
   
   ### Component(s)
   
   Documentation





[GitHub] [arrow] zeroshade closed issue #33717: [Go] FlightSQL server elides errors in StreamChunks

2023-01-17 Thread GitBox


zeroshade closed issue #33717: [Go] FlightSQL server elides errors in 
StreamChunks
URL: https://github.com/apache/arrow/issues/33717





[GitHub] [arrow] kou opened a new issue, #33723: [C++] re2::RE2::RE2() result must be checked

2023-01-17 Thread GitBox


kou opened a new issue, #33723:
URL: https://github.com/apache/arrow/issues/33723

   ### Describe the enhancement requested
   
   `re2::RE2::RE2()` may fail. We should check the `re2::RE2::RE2()` 
result with `re2::RE2::ok()`.
   
   For example:
   
   ```diff
   diff --git a/cpp/src/arrow/compute/kernels/scalar_string_ascii.cc 
b/cpp/src/arrow/compute/kernels/scalar_string_ascii.cc
   index d3d0ac3201..b2b9d47c02 100644
   --- a/cpp/src/arrow/compute/kernels/scalar_string_ascii.cc
   +++ b/cpp/src/arrow/compute/kernels/scalar_string_ascii.cc
   @@ -1681,6 +1681,10 @@ struct FindSubstringRegex {

  template 
  OutValue Call(KernelContext*, std::string_view val, Status*) const {
   +if (!regex_match_->ok()) {
   +  // TODO: Report error
   +  return -1;
   +}
re2::StringPiece piece(val.data(), val.length());
re2::StringPiece match;
 if (RE2::PartialMatch(piece, *regex_match_, &match)) {
   ```
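
The same guard can be sketched in Python as an analogy. Python's `re` module raises `re.error` at compile time instead of exposing an `ok()` method, so this only illustrates the idea of validating the pattern once and reporting a clear error; `find_substring_regex` is a hypothetical helper, not an Arrow function.

```python
import re


def find_substring_regex(pattern, values):
    """Return the first match offset per value, or -1 when there is no match.

    The pattern is validated once, up front, and an invalid pattern is
    reported as an error instead of being used.
    """
    try:
        regex = re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"Invalid regular expression {pattern!r}: {exc}") from exc

    offsets = []
    for value in values:
        match = regex.search(value)
        offsets.append(match.start() if match else -1)
    return offsets


offsets = find_substring_regex("b+", ["abb", "xyz"])
```

Reporting the failure where the pattern is compiled keeps the per-value loop free of error handling, which is the direction the `TODO: Report error` comment in the diff points at.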
   
   Gandiva also doesn't check `re2::RE2::RE2()` result.
   
   If `re2::RE2::RE2()` fails, the program crashes, as in #25633.
   
   ### Component(s)
   
   C++, C++ - Gandiva


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [arrow-adbc] lidavidm closed issue #308: [C] NotSupportedError for postgres CHAR / VARCHAR columns

2023-01-17 Thread GitBox


lidavidm closed issue #308: [C] NotSupportedError for postgres CHAR / VARCHAR 
columns
URL: https://github.com/apache/arrow-adbc/issues/308





[GitHub] [arrow] domoritz closed issue #33679: [JS] Update Dependencies

2023-01-17 Thread GitBox


domoritz closed issue #33679: [JS] Update Dependencies
URL: https://github.com/apache/arrow/issues/33679





[GitHub] [arrow] paleolimbot opened a new issue, #33721: MacOS install from local job is failing

2023-01-17 Thread GitBox


paleolimbot opened a new issue, #33721:
URL: https://github.com/apache/arrow/issues/33721

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   The test-r-install-local job for MacOS is currently failing and has been for 
several days. It fails while installing Arrow (in particular, the GCS step) and 
there's something about sccache shutting down unexpectedly: 
https://github.com/ursacomputing/crossbow/actions/runs/3934942306/jobs/6730161860#step:7:1429
   
   ```
   [ 88%] Building C object 
src/arrow/CMakeFiles/arrow_objlib.dir/vendored/uriparser/UriShorten.c.o
   
/Applications/Xcode_14.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool:
 warning same member name (binary_data_as_debug_string.cc.o) in output file 
used for input files: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_rest_internal.a(binary_data_as_debug_string.cc.o)
 and: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_storage.a(binary_data_as_debug_string.cc.o)
 due to use of basename, truncation and blank padding
   
/Applications/Xcode_14.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool:
 warning same member name (compute_engine_util.cc.o) in output file used for 
input files: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_common.a(compute_engine_util.cc.o)
 and: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_storage.a(compute_engine_util.cc.o)
 due to use of basename, truncation and blank padding
   
/Applications/Xcode_14.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool:
 warning same member name (condition_variable.c.o) in output file used for 
input files: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/awssdk_ep-install/lib/libaws-c-common.a(condition_variable.c.o)
 and: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/awssdk_ep-install/lib/libaws-c-common.a(condition_variable.c.o)
 due to use of basename, truncation and blank padding
   
/Applications/Xcode_14.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool:
 warning same member name (cpuid.c.o) in output file used for input files: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/awssdk_ep-install/lib/libaws-c-common.a(cpuid.c.o)
 and: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/awssdk_ep-install/lib/libaws-c-common.a(cpuid.c.o)
 due to use of basename, truncation and blank padding
   
/Applications/Xcode_14.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool:
 warning same member name (credentials.cc.o) in output file used for input 
files: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_common.a(credentials.cc.o)
 and: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_storage.a(credentials.cc.o)
 due to use of basename, truncation and blank padding
   
/Applications/Xcode_14.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool:
 warning same member name (curl_handle.cc.o) in output file used for input 
files: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_storage.a(curl_handle.cc.o)
 and: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_rest_internal.a(curl_handle.cc.o)
 due to use of basename, truncation and blank padding
   
/Applications/Xcode_14.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool:
 warning same member name (curl_handle_factory.cc.o) in output file used for 
input files: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_storage.a(curl_handle_factory.cc.o)
 and: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_rest_internal.a(curl_handle_factory.cc.o)
 due to use of basename, truncation and blank padding
   
/Applications/Xcode_14.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool:
 warning same member name (curl_wrappers.cc.o) in output file used for input 
files: 
/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/RtmpcMQGvC/file1ad21678f1c9/google_cloud_cpp_ep-install/lib/libgoogle_cloud_cpp_storage.a(curl_wrappers.cc.o)
 and: 
   ```

[GitHub] [arrow-adbc] judahrand closed issue #344: [Question] `GetTableSchema` return schema expectation

2023-01-17 Thread GitBox


judahrand closed issue #344: [Question] `GetTableSchema` return schema 
expectation
URL: https://github.com/apache/arrow-adbc/issues/344





[GitHub] [arrow] pitrou closed issue #33688: [C++] Add custom codec make codec pluggable for IPC

2023-01-17 Thread GitBox


pitrou closed issue #33688: [C++] Add custom codec make codec pluggable for IPC
URL: https://github.com/apache/arrow/issues/33688





[GitHub] [arrow] mhilton opened a new issue, #33717: [Go] FlightSQL server elides errors in StreamChunks

2023-01-17 Thread GitBox


mhilton opened a new issue, #33717:
URL: https://github.com/apache/arrow/issues/33717

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   When sending a stream of results in response to a `DoGetStatement` (or 
indeed any other `DoGet` request), any error returned over the `StreamChunk` 
channel is silently dropped. The expected behaviour is for the error to 
propagate to the gRPC client.
   
   This happens because, when the `DoGet` handler detects that a 
`StreamChunk` contains an error, it returns the contents of the `err` variable, 
which is always `nil` on that code path 
(https://github.com/apache/arrow/blob/master/go/arrow/flight/flightsql/server.go#L635).
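   
   A minimal, self-contained sketch of the pattern described above, assuming 
the intended behaviour is to return the error carried by the chunk itself. 
The names `streamChunk`, `drain`, `runExample` and `errBackend` are 
hypothetical stand-ins for illustration and are not the actual Arrow Go API.
   
   ```go
   package main

   import (
   	"errors"
   	"fmt"
   )

   // streamChunk is a hypothetical stand-in for a chunk sent over the
   // result channel: it carries either a payload or an error.
   type streamChunk struct {
   	Data string
   	Err  error
   }

   // errBackend simulates a failure reported by the data producer.
   var errBackend = errors.New("backend failed")

   // drain forwards chunk payloads to send and stops at the first error.
   func drain(ch <-chan streamChunk, send func(string) error) error {
   	for chunk := range ch {
   		if chunk.Err != nil {
   			// Return the chunk's own error. Returning an outer err
   			// variable here would yield nil and hide the failure.
   			return chunk.Err
   		}
   		if err := send(chunk.Data); err != nil {
   			return err
   		}
   	}
   	return nil
   }

   // runExample feeds one good chunk, one error chunk and one more good
   // chunk through drain, and reports what was sent and the final error.
   func runExample() ([]string, error) {
   	ch := make(chan streamChunk, 3)
   	ch <- streamChunk{Data: "batch-1"}
   	ch <- streamChunk{Err: errBackend}
   	ch <- streamChunk{Data: "batch-2"}
   	close(ch)

   	var sent []string
   	err := drain(ch, func(s string) error {
   		sent = append(sent, s)
   		return nil
   	})
   	return sent, err
   }

   func main() {
   	sent, err := runExample()
   	fmt.Println(sent, err)
   }
   ```
   
   In this sketch only the first batch is sent before the error is returned, 
so a caller such as a gRPC handler would see the failure instead of a 
successful end of stream.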
   
   ### Component(s)
   
   Go





[GitHub] [arrow] AlenkaF opened a new issue, #33715: [Python] Remove --disable-warnings with newer version of pytest-cython

2023-01-17 Thread GitBox


AlenkaF opened a new issue, #33715:
URL: https://github.com/apache/arrow/issues/33715

   ### Describe the enhancement requested
   
   https://github.com/apache/arrow/pull/33609 adds `--disable-warnings` to the 
pytest-cython invocation in `conda-python-docs` (docker-compose.yml) to 
suppress a pytest deprecation warning. The flag should be removed once 
https://github.com/lgpage/pytest-cython/issues/24 is resolved and a new version 
of pytest-cython that includes the fix is released.
   
   ### Component(s)
   
   Python





[GitHub] [arrow] kou closed issue #15287: [Ruby] Add option to keep/merge join keys in Table#join

2023-01-16 Thread GitBox


kou closed issue #15287: [Ruby] Add option to keep/merge join keys in Table#join
URL: https://github.com/apache/arrow/issues/15287





[GitHub] [arrow] kou closed issue #31506: [Python] Address docstrings in Streams and File Access (Factory Functions)

2023-01-16 Thread GitBox


kou closed issue #31506: [Python] Address docstrings in Streams and File Access 
(Factory Functions)
URL: https://github.com/apache/arrow/issues/31506




