issues
Messages by Thread
[I] [Java][Dataset] OOM Killer triggered by unbounded native memory usage during Parquet read; NativeMemoryPool.createListenable crashes with JNI error [arrow]
via GitHub
Re: [I] [Java][Dataset] OOM Killer triggered by unbounded native memory usage during Parquet read; NativeMemoryPool.createListenable crashes with JNI error [arrow]
via GitHub
[I] [Java][Dataset] OOM Killer triggered by unbounded native memory usage during Parquet read; NativeMemoryPool.createListenable crashes with JNI error #49472 [arrow-java]
via GitHub
Re: [I] [C++][Parquet] Clarify the destructor implementation of `parquet::Page` [arrow]
via GitHub
[I] Functions substring_index and truncate crash with extreme integer values [arrow]
via GitHub
Re: [I] [Doc] Use sphinx-remove-toctrees to generated docstring pages from navigation (and reduce build time) [arrow]
via GitHub
Re: [I] How to Deterministically Release the Memory of a pyarrow.Table [arrow]
via GitHub
Re: [I] [C++] Split PushGenerator producer into separately tracked entity [arrow]
via GitHub
Re: [I] [Python] Resolve parquet version deprecation warnings when compiling pyarrow [arrow]
via GitHub
Re: [I] [C++][Python][Parquet] Uniform encryption [arrow]
via GitHub
Re: [I] [C++] Add integration tests for IPC format in Skyhook [arrow]
via GitHub
Re: [I] [CI][C++] Update linting from Clang 8 [arrow]
via GitHub
Re: [I] [C++][Python] Add some kind of user-visible signal when backpressure is applied [arrow]
via GitHub
Re: [I] [C++][Compute] Add UTF-8 non-ASCII tests to scalar string kernels [arrow]
via GitHub
Re: [I] [C++] Investigate reducing I/O thread pool size to avoid CPU wastage. [arrow]
via GitHub
[I] Improve robustness against corrupted Arrow buffers from Flight SQL [arrow-go]
via GitHub
Re: [I] [Python][Parquet] Pyarrow dataset repartitioning performance [arrow]
via GitHub
Re: [I] [C++] String comparison in between ternary kernel [arrow]
via GitHub
Re: [I] [C++] Rename type traits utilities to improve semantic consistency [arrow]
via GitHub
Re: [I] Benchmark should check for built components [arrow]
via GitHub
Re: [I] [C++] Change Scanner::Head to return a RecordBatchReader [arrow]
via GitHub
Re: [I] [R] Implement nrow on some collapsed queries [arrow]
via GitHub
Re: [I] [C++][Parquet][Doc] default output option for Parquet Scan Example [arrow]
via GitHub
Re: [I] [Python][Parquet] Allow to select columns of a list field without requiring the list component names [arrow]
via GitHub
[I] Replace Black, Flake8, and isort with Ruff [arrow-adbc]
via GitHub
Re: [I] Replace Black, Flake8, and isort with Ruff [arrow-adbc]
via GitHub
[I] [C++][FlightRPC] Arrow Flight timeout test failure on MSVC Windows [arrow]
via GitHub
Re: [I] [C++] Return a random sample of rows from a query [arrow]
via GitHub
Re: [I] [C++] Improve ExecPlan::ToString [arrow]
via GitHub
Re: [I] [C++] A more RAM-efficient top-k sink node [arrow]
via GitHub
Re: [I] [R] Allow multiple arguments to n_distinct() [arrow]
via GitHub
Re: [I] [C++][Dataset] Define appropriate abstractions for "fragments" that can handle compute [arrow]
via GitHub
Re: [I] [C++][Parquet] Statistics::num_values() is misleading [arrow]
via GitHub
Re: [I] [Python] Filename-based partitioning scheme [arrow]
via GitHub
Re: [I] [C++] Improve select_k_unstable performance [arrow]
via GitHub
Re: [I] [C++] Optimize dictionary support in kernels/Support nulls in DictionaryUnifier [arrow]
via GitHub
Re: [I] [C++][Dataset] Add more fine-grained error for existing data to dataset writer [arrow]
via GitHub
[I] [Flight client] Allow disabling the creation of a child allocator [arrow-java]
via GitHub
[I] [ARROW Java][HDFS] JVM hangs after reading HDFS files via Arrow Dataset API due to non-daemon native threads [arrow]
via GitHub
[I] [C++][FlightRPC][ODBC] Enable ODBC build on Linux [arrow]
via GitHub
[I] parquet/pqarrow: pre-allocate BinaryBuilder data buffer using column chunk metadata to eliminate resize overhead [arrow-go]
via GitHub
Re: [I] parquet/pqarrow: pre-allocate BinaryBuilder data buffer using column chunk metadata to eliminate resize overhead [arrow-go]
via GitHub
[I] Add Support for iOS [arrow-swift]
via GitHub
[I] [R][CI] Use Posit manylinux binaries in CI [arrow]
via GitHub
[I] feat(javascript): set up npm publishing for node.js ADBC Driver Manager [arrow-adbc]
via GitHub
Re: [I] [C++] use `std::atomic<std::shared_ptr>` instead of `std::atomic_load()`/`std::atomic_store()` [arrow]
via GitHub
[I] [R][CI] Use RHEL-9 binaries on Amazon Linux 2023 builds [arrow]
via GitHub
Re: [I] [R][CI] Use RHEL-9 binaries on Amazon Linux 2023 builds [arrow]
via GitHub
Re: [I] [R] Implement summary.ArrowTabular/Dataset [arrow]
via GitHub
Re: [I] [R][C++] Let the JSON reader accept a document which is an array at top level in addition to line delimited JSON [arrow]
via GitHub
Re: [I] [R] Unhelpful error message when creating a dataset from datasets of differing file types [arrow]
via GitHub
Re: [I] [R] Implement binding for cut() [arrow]
via GitHub
Re: [I] [R][C++] Calling bucket$ls on GCS bucket without `recursive = TRUE` doesn't list full contents [arrow]
via GitHub
Re: [I] [C++] Substrait plan with multiple aggregate fields returns incorrect results [arrow]
via GitHub
[I] [CI][C++] Meson build is broken: File fixed_shape_tensor_test.cc does not exist [arrow]
via GitHub
Re: [I] [CI][C++] Meson build is broken: File fixed_shape_tensor_test.cc does not exist [arrow]
via GitHub
[I] [C++] Map type should always use "key"/"value"/"entries" for field names [arrow]
via GitHub
Re: [I] [C++] Map type should always use "key"/"value"/"entries" for field names [arrow]
via GitHub
[I] castVARCHAR_timestamp_int64 produces negative milliseconds for pre-epoch timestamps (before 1970-01-01). [arrow]
via GitHub
Re: [I] Schema class is missing in API docs [arrow-js]
via GitHub
[I] python/adbc_driver_manager: unused imports [arrow-adbc]
via GitHub
Re: [I] python/adbc_driver_manager: unused imports [arrow-adbc]
via GitHub
Re: [I] Trim object memory for ArrowBuf [arrow-java]
via GitHub
[I] [Python] Reintroduce docstring injection once scikit-build-core is used as build backend [arrow]
via GitHub
[I] makeData for Struct with empty children silently defaults length to 0 [arrow-js]
via GitHub
[I] Zero-column RecordBatch loses numRows after IPC deserialization [arrow-js]
via GitHub
Re: [I] Zero-column RecordBatch loses numRows after IPC deserialization [arrow-js]
via GitHub
[I] [C++] Backport Neon xsimd fix [arrow]
via GitHub
[I] [C++][CI] Detect mismatching schema in differential IPC fuzzing [arrow]
via GitHub
Re: [I] [C++][R] Support a "modified" hive style directory naming scheme [arrow]
via GitHub
Re: [I] [Python] Wheels not built for 32-Bit Windows [arrow]
via GitHub
Re: [I] [R] Re-allow some multithreading on Windows [arrow]
via GitHub
Re: [I] [C++][Compute] Implement count distinct kernel using HyperLogLog [arrow]
via GitHub
Re: [I] [C++][Dataset] Enhance dataset writer to allow file-per-batch [arrow]
via GitHub
Re: [I] [R] update metadata when casting a record batch column [arrow]
via GitHub
Re: [I] [C++][Compute] Standardize generator dispatchers [arrow]
via GitHub
[I] [C++][Python] Investigate OpenTelemetry works as expected on Windows [arrow]
via GitHub
[I] Potential dereference of nullptr [arrow]
via GitHub
Re: [I] [Python] Unable to calculate pyarrow.Table.nbytes if column type is string_view [arrow]
via GitHub
[I] ci: cpp-clang-latest is failing [arrow-adbc]
via GitHub
Re: [I] ci: cpp-clang-latest is failing [arrow-adbc]
via GitHub
Re: [I] [JS] Data.setValid results in incorrect nullCount [arrow-js]
via GitHub
[I] [C++][FlightRPC][ODBC] Disable DSN default values on MacOS [arrow]
via GitHub
Re: [I] [Python] Can't convert from df column with type uuid.UUID to str [arrow]
via GitHub
Re: [I] [Python] Infer and convert uuid.UUID objects in the python->arrow conversion layer [arrow]
via GitHub
[I] Add rand_integer function [arrow]
via GitHub
Re: [I] [C++][Gandiva] Add rand_integer function [arrow]
via GitHub
[I] Optimize lpad/rpad UTF-8 functions: fix memory safety issue and improve performance [arrow]
via GitHub
[I] [Python][C++][GPU] Python Cuda jobs fail with 'cuda.bindings.driver.CUcontext' object has no attribute 'value' [arrow]
via GitHub
Re: [I] [C++] Reconcile type promotion rules between if_else, case_when, coalesce, select [arrow]
via GitHub
Re: [I] [C++] Reuse original offsets buffer in ASCII string kernels [arrow]
via GitHub
Re: [I] [C++] Simplify registration of scalar arithmetic/string functions [arrow]
via GitHub
Re: [I] [R] Expose null placement option through sort bindings [arrow]
via GitHub
Re: [I] [C++] Decimal promotion rules should consider inflating type [arrow]
via GitHub
Re: [I] [C++] Read DELTA_BYTE_ARRAY data written with bug PARQUET-246 [arrow]
via GitHub
Re: [I] [Website] Change home page heading text to match GitHub repo description [arrow]
via GitHub
[I] [C++][FlightRPC] Flight fails building with protobuf v34.0+ due to functions updated with nodiscard [arrow]
via GitHub
[I] [C++][CI] Add golden integration files to IPC file fuzz corpus [arrow]
via GitHub
Re: [I] [C++][CI] Add golden integration files to IPC file fuzz corpus [arrow]
via GitHub
[I] [C++] Multi-threaded logging can mix up messages [arrow]
via GitHub
Re: [I] [JS] Fully null column of type `Bool` produces incompatible IPC stream with JS package [arrow-js]
via GitHub
Re: [I] Migrate from Yarn 1 [arrow-js]
via GitHub
[I] [C++][Gandiva] Add support for LLVM 22.1.0 [arrow]
via GitHub
Re: [I] [C++][Gandiva] Add support for LLVM 22.1.0 [arrow]
via GitHub
[I] [CI][Python] Emscripten build fails building stubs [arrow]
via GitHub
Re: [I] [CI][Python] Emscripten build fails building stubs [arrow]
via GitHub
Re: [I] [R] Add error handling to C++ compute functions listed via list_compute_functions() which don't have bindings in R or options not supplied by user [arrow]
via GitHub
Re: [I] [Python] Allow Table.from_pydict to specify a type mapper for extension types [arrow]
via GitHub
Re: [I] [R] Support for .keep_all = TRUE with distinct() [arrow]
via GitHub
Re: [I] [Python] Update use of FunctionOptions scoped enums once Cython 3 is the default [arrow]
via GitHub
Re: [I] [C++] parquet with invalid utf8 does not error [arrow]
via GitHub
Re: [I] [R] Cast of NaN to integer should return NA_integer_ [arrow]
via GitHub
Re: [I] [C++] Improve performance on dictionaries for 'case_when' kernel [arrow]
via GitHub
[I] [CI][C++] test-ubuntu-22.04-cpp-emscripten fails with no member named 'log2p1' in namespace 'std' [arrow]
via GitHub
Re: [I] [CI][C++] test-ubuntu-22.04-cpp-emscripten fails with no member named 'log2p1' in namespace 'std' [arrow]
via GitHub
[I] [Release] Use release/KEYS not dev/KEYS in verification [arrow-go]
via GitHub
Re: [I] [Release] Use release/KEYS not dev/KEYS in verification [arrow-go]
via GitHub
[I] [CI][Integration][Ruby] Add the Ruby implementation to integration targets [arrow]
via GitHub
[I] [C++][Gandiva] Fix castVARCHAR memory inefficiencies, unused bool allocation, and missing len<=0 handling [arrow]
via GitHub
Re: [I] r/adbcdrivermanager: adbcdrivermanager package fails to compile on at least one CRAN check machine [arrow-adbc]
via GitHub
[I] [C++][FlightRPC][Packaging] ODBC MSI installer missing arrow_flight_sql_odbc.dll [arrow]
via GitHub
Re: [I] [C++][FlightRPC][Packaging] ODBC MSI installer missing arrow_flight_sql_odbc.dll [arrow]
via GitHub
Re: [I] [C++] Use cmake to build lz4 [arrow]
via GitHub
Re: [I] [R] Bind median() and quantile() to exact not approximate median and quantile [arrow]
via GitHub
Re: [I] [C++][Dataset] - Dataset write to accept mask array to include/exclude rows [arrow]
via GitHub
Re: [I] [C++] Add profiling / tracing for exec plan [arrow]
via GitHub
Re: [I] [python] read csv with different number of columns per row [arrow]
via GitHub
Re: [I] [C++][Compute] Improve top_k/bottom_k Selecters via CRTP [arrow]
via GitHub
Re: [I] [C++] Add page skipping to parquet reading [arrow]
via GitHub
[I] [GLib] Add `garrow_map_data_type_is_keys_sorted()` [arrow]
via GitHub
Re: [I] [GLib] Add `garrow_map_data_type_is_keys_sorted()` [arrow]
via GitHub
Re: [I] [Ruby] Add support for fixed size list array [arrow]
via GitHub
[I] [C++] Map type ignores key/item/value field names [arrow]
via GitHub
Re: [I] [C++] Map type ignores key/item/value field names [arrow]
via GitHub
Re: [I] [C++][Compute] Implement streaming version for SelectK [arrow]
via GitHub
[I] [Python] MacOs stuck on importing (23.0.1) while works on (15.0.1) [arrow]
via GitHub
[I] Support per-batch custom_metadata on DictionaryBatch (IPC Message field) [arrow-java]
via GitHub
[I] Filtering corrupts data in column containing an array [arrow]
via GitHub
Re: [I] [C++] Provide a cross platform helper for definition of library init code [arrow]
via GitHub
Re: [I] [C++][Python][Compute] Number to string hex conversion [arrow]
via GitHub
Re: [I] [Python][Docs] Update datasets user guide with more details on Partitioning(Factory) [arrow]
via GitHub
Re: [I] [C++] Allow Partitioning objects to be created with a vector of field names [arrow]
via GitHub
Re: [I] [C++] An ExecContext with no executor should not be valid [arrow]
via GitHub
Re: [I] Include datafusion's python binding in arrow website [arrow]
via GitHub
Re: [I] Add support for warnings / notices [arrow-adbc]
via GitHub
[I] [Ruby] Add support for custom metadata in field and schema [arrow]
via GitHub
[I] fatal error: 'span' file not found [arrow]
via GitHub
[I] Java compression codecs do not release compressed ArrowBuf in decompress, causing allocator leaks [arrow-java]
via GitHub
[I] [C++][FlightRPC][ODBC] Enable DSN default values on macOS [arrow]
via GitHub
[I] Parquet StreamReader should clarify its contract for parquet files without a schema. [arrow]
via GitHub
[I] Single chunk ChunkedArray doesn't correctly respect copy in __array__method [arrow]
via GitHub
[I] Support per-batch custom_metadata on RecordBatch (IPC Message field) [arrow-java]
via GitHub
Re: [I] [C++][Python] Add VariableShapeTensor implementation [arrow]
via GitHub
[I] [Python] Enable OpenTelemetry on PyArrow wheels [arrow]
via GitHub
Re: [I] [Python] Enable OpenTelemetry on PyArrow wheels [arrow]
via GitHub
Re: [I] [C++][Python] Failed to build pyarrow, missing Arrow C++ [arrow]
via GitHub
Re: [I] [C++][Python] Support binary_view in basic kernels [arrow]
via GitHub
[I] `pqarrow.SchemaField.IsLeaf()` unreliable because `ColIndex` is never set to -1 for non-leaves [arrow-go]
via GitHub
Re: [I] `pqarrow.SchemaField.IsLeaf()` unreliable because `ColIndex` is never set to -1 for non-leaves [arrow-go]
via GitHub
Re: [I] [Python] Pip install error for pyarrow 6.0.1 on Python 3.6.8 due to setuptools_scm transitive dependency [arrow]
via GitHub
Re: [I] Unable to load libhdfs [arrow]
via GitHub
Re: [I] [C++] CMake build of arrow libraries fails on Windows [arrow]
via GitHub
Re: [I] [C++] Vcpkg install error for abseil on windows when building Arrow C++ [arrow]
via GitHub
Re: [I] [Docs] Describe use of Jira Affects Version in Contributing docs [arrow]
via GitHub
Re: [I] [C++] Cannot install Arrow with Zstd on Windows [arrow]
via GitHub
Re: [I] [R] installation failure on R Studio Server [arrow]
via GitHub
Re: [I] [C++][Compute] Add Find method to Grouper [arrow]
via GitHub
Re: [I] [C++][Compute] Provide a default implementation of ExecNode::Pause/Resume [arrow]
via GitHub
Re: [I] [Python] Use IPC writing code for pickling RecordBatches [arrow]
via GitHub
Re: [I] [C++] Add an arrow::Table::GetFieldByName method [arrow]
via GitHub
Re: [I] [C++][Dataset] Remove UnionDataset in favor of UnionExecNode [arrow]
via GitHub
Re: [I] [Doc] Make main column width larger [arrow]
via GitHub
Re: [I] [C++] Improve performance of unpack64 [arrow]
via GitHub
[I] [R][CI] r-devdocs crossbow job fails during gap between C++ and R releases [arrow]
via GitHub
[I] [R] CRAN packaging checklist for version 23.0.1.1 [arrow]
via GitHub
[I] [Python][Parquet] Add options to control writing of Bloom filters to `parquet.write_table` [arrow]
via GitHub
Re: [I] [C++][Compute] Make a subset of compute:: available even if ARROW_COMPUTE=OFF [arrow]
via GitHub
Re: [I] [C++] Make index kernel work in exec plans [arrow]
via GitHub
Re: [I] [C++] [Dataset] Add optional scan type that tags batches with locational information [arrow]
via GitHub
Re: [I] [Gandiva] Support null data type for gandiva. [arrow]
via GitHub
Re: [I] [C++] Create utility for runtime warnings [arrow]
via GitHub
Re: [I] [R] Enable object name linter [arrow]
via GitHub
Re: [I] [R][CI] Clean up crossbow R templates [arrow]
via GitHub
Re: [I] format: support multiple result sets [arrow-adbc]
via GitHub
[I] CI: Python integration tests are being skipped in CI [arrow-dotnet]
via GitHub
Re: [I] CI: Python integration tests are being skipped in CI [arrow-dotnet]
via GitHub
[I] The IReadOnlyList indexer on top of BinaryArray doesn't return nulls [arrow-dotnet]
via GitHub
Re: [I] The IReadOnlyList indexer on top of BinaryArray doesn't return nulls [arrow-dotnet]
via GitHub
[I] [C++]: Work around `bit_width` not being available on MacOS's partially compatible C++20 build [arrow]
via GitHub
Re: [I] [C++]: Work around `bit_width` not being available on MacOS's partially compatible C++20 build [arrow]
via GitHub
Re: [I] [C++]: Work around `bit_width` not being available on MacOS's partially compatible C++20 build [arrow]
via GitHub
[I] [C++][R] More robust `libtool` checking [arrow]
via GitHub
Re: [I] [C++][R] More robust `libtool` checking [arrow]
via GitHub
[I] Basic compute/comparison kernels missing for string_view? [arrow]
via GitHub
Re: [I] [Python][C++] Basic compute/comparison kernels missing for string_view? [arrow]
via GitHub
Re: [I] [Java][Docs] Undocumented null return from CallHeaders.getAll() [arrow-java]
via GitHub
[I] [CI][C++] JNI build error: `'bit' file not found` [arrow]
via GitHub
Re: [I] [C++][Docs] Missing docs for many Datum members [arrow]
via GitHub
Re: [I] [C++] RecordBatch::Add/SetColumn w/ ArrayData [arrow]
via GitHub
Re: [I] Identify selected row when using filters [arrow]
via GitHub