issues
Messages by Thread
[I] [Java] Improve VectorSchemaRoot.getVector(String name) lookup performance [arrow-java]
via GitHub
[I] Explicitly providing CMAKE_LIBTOOL does not work on MacOS [arrow]
via GitHub
Re: [I] [C++] DictionaryArray::dictionary() is not thread safe [arrow]
via GitHub
Re: [I] [Document] Why int32() offset type is used for DenseUnionArray? [arrow]
via GitHub
Re: [I] [C++] CSV reader: Ability to not infer column types. [arrow]
via GitHub
[I] Fix remaining overflow and negative length handling issues in Gandiva string functions [arrow]
via GitHub
[I] Azure with SAS Keys [arrow]
via GitHub
[I] [GLib] Enable tests for custom extension data type [arrow]
via GitHub
[I] [Python][CI] Raise oldest NumPy wheel-test requirement to a patched release [arrow]
via GitHub
[I] [C++] IPC file fuzzer fails when footer schema has differing endianness [arrow]
via GitHub
[I] Question regarding Parquet Page Index: Why enable it during write if it's not utilized during read? [arrow-go]
via GitHub
Re: [I] [C++][Acero] Window Functions add helper classes for window aggregates and distinct aggregates [arrow]
via GitHub
Re: [I] [C++][Acero] Window Functions add helper classes for quantiles [arrow]
via GitHub
Re: [I] [C++][Acero] Add Window Functions exec node [arrow]
via GitHub
Re: [I] [C++/Python] Add support for S3 Bucket Versioning [arrow]
via GitHub
[I] [Avro] hamba/avro is abandoned [arrow-go]
via GitHub
[I] [Python] Improve Extension Types Support in PyArrow (umbrella issue) [arrow]
via GitHub
Re: [I] [Python] Subclassing the PyExtensionType and getting it's bit_width attribute returns Non-fixed width type ValueError [arrow]
via GitHub
[I] The annotation is incorrect. It should be 1M. [arrow-go]
via GitHub
[I] [C++][Parquet] Avoid unbounded temp alloc in BYTE_STREAM_SPLIT decoder [arrow]
via GitHub
Re: [I] [C++] Support optional arguments in aggregation function mapping in the Substrait consumer. [arrow]
via GitHub
Re: [I] [R] Differing results in log bindings [arrow]
via GitHub
Re: [I] [Python][Dev] Document the process to run numpydoc checks [arrow]
via GitHub
Re: [I] [R] Implement asof join [arrow]
via GitHub
Re: [I] Clean up how the CSV reader handles the first buffer [arrow]
via GitHub
Re: [I] [R] Tidy up the pkgdown articles site index [arrow]
via GitHub
Re: [I] [R] arrow_eval: do we need both nse_funcs and .cache$functions? [arrow]
via GitHub
Re: [I] [C++] [Python] Major performance improvements to CSV reading from S3 [arrow]
via GitHub
Re: [I] [R] Table viewer for knitr/notebooks [arrow]
via GitHub
[I] [C++][Dataset] std::bad_weak_ptr in multi-threaded writer tests on MinGW gcc-16 [arrow]
via GitHub
[I] Managing ownership in VectorSchemaRoot#addVector, recent changes miss the main fault. [arrow-java]
via GitHub
Re: [I] [R] [Docs] Improve (or really actually document) our Python bridge documentation [arrow]
via GitHub
Re: [I] [C++] Fetch Node Substrait Integration [arrow]
via GitHub
[I] [C++][Parquet] Reading dictionary encoded boolean throws NYI [arrow]
via GitHub
Re: [I] [C++] Substarit End-To-End Tests for Relations [arrow]
via GitHub
Re: [I] [R] Allow unrecognized R expressions to be callable as compute::Functions [arrow]
via GitHub
Re: [I] [R] Add vignette on ExecPlans and how they work [arrow]
via GitHub
Re: [I] [Python] Memory kept after del and pool.released_unused() [arrow]
via GitHub
Re: [I] Does arrow support access S3 based on 'path-style'? [arrow]
via GitHub
Re: [I] [C++] RecordBatch Make() with Arrow Arrays could infer length [arrow]
via GitHub
Re: [I] [C++][Parquet] Support nested data conversions for chunked array [arrow]
via GitHub
[I] [GLib] garrow_data_type_new_raw segfaults on arrow::extension::OpaqueType and any non-GLib ExtensionType (ADBC PostgreSQL NUMERIC) [arrow]
via GitHub
Re: [I] [GLib] garrow_data_type_new_raw segfaults on arrow::extension::OpaqueType and any non-GLib ExtensionType (ADBC PostgreSQL NUMERIC) [arrow]
via GitHub
[I] [C++] Uncontrolled Memory Allocation (OOM) in Parquet Delta decoders [arrow]
via GitHub
[I] [C++][Gandiva] Use timegm in date_time_test utilities to avoid DST-dependent behavior [arrow]
via GitHub
Re: [I] [Python] `compute.count_distinct` not implemented for `extension<arrow.uuid>` and `extension<arrow.json>` [arrow]
via GitHub
Re: [I] [Python] `compute.min_max` is not implemented for `extension<arrow.json>` [arrow]
via GitHub
[I] [Bug] NewIntXStatistics factories unconditionally set hasDistinctCount=true, causing distinct_count=0 to always appear in Parquet output [arrow-go]
via GitHub
[I] [C++] HeadBucket called in S3FS breaking IAM scoped prefixes [arrow]
via GitHub
Re: [I] [R] Implement typeof() in Arrow dplyr queries [arrow]
via GitHub
Re: [I] [R] Implement as.integer and as.numeric for timestamp types etc. in Arrow dplyr queries [arrow]
via GitHub
Re: [I] [R]: Lack of `assume_timezone` binding [arrow]
via GitHub
Re: [I] [C++] Move Parquet APIs to use Result instead of Status [arrow]
via GitHub
Re: [I] [C++][Python][Doc] Document that order is not preserved when writing dataset with use_threads=True [arrow]
via GitHub
Re: [I] [C++][Python] SEGFAULT when casting FixedSizeTensorArray to storage type then back to FixedSizeTensorArray [arrow]
via GitHub
Re: [I] [Python] ParquetWriter use_compliant_nested_type=True does not preserve ExtensionArray when reading back [arrow]
via GitHub
Re: [I] [Python] `pyarrow.Table.to_pandas` creates Index instead of PeriodIndex [arrow]
via GitHub
[I] [C++][CI] gcc-16 MinGW failures - remaining fixes (follow-up to #49930) [arrow]
via GitHub
[I] [Format] Better document IPC file and stream equivalence [arrow]
via GitHub
[I] [C++] Provide a default implementation for ExtensionType::ExtensionEquals [arrow]
via GitHub
Re: [I] [C++] Provide a default implementation for ExtensionType::ExtensionEquals [arrow]
via GitHub
Re: [I] Enhancement Request: Custom Operator Support for PyArrow Extension Types in Compute Functions [arrow]
via GitHub
Re: [I] [Python] For extension types, compute kernels should default to storage types? [arrow]
via GitHub
Re: [I] [Python] Enhance logical operator support and truth value handling for PyArrow Arrays [arrow]
via GitHub
Re: [I] [C++] The difference between namespace detail and internal [arrow]
via GitHub
Re: [I] [C++] Add support for %Z to strptime [arrow]
via GitHub
Re: [I] [C++][Compute] Add Cryptographic hash functions to Acero [arrow]
via GitHub
Re: [I] [C++] [Python] Tag record batches with start_byte and end_byte infromation [arrow]
via GitHub
Re: [I] [R] Update binding for add_filename() NSE function to error if used on Table [arrow]
via GitHub
Re: [I] [C++] Disable anonymous namespaces in debug mode [arrow]
via GitHub
Re: [I] [R] Additional dplyr functionality [arrow]
via GitHub
[I] Power BI - using UTF8_LCASE column returns error Unable to understand the type for column [arrow-adbc]
via GitHub
Re: [I] [Python] Dataset.to_batches() / ParquetFileFragment.to_batches() hang forever [arrow]
via GitHub
[I] [Go] RecordFromJSON does not handle integer values not representable by double [arrow-go]
via GitHub
Re: [I] [Go] RecordFromJSON does not handle integer values not representable by double [arrow-go]
via GitHub
[I] [FlightSQL] SQLite example pulls GPL-licensed modernc.org/ccorpus into all consumers' go.sum [arrow-go]
via GitHub
Re: [I] [FlightSQL] SQLite example pulls GPL-licensed modernc.org/ccorpus into all consumers' go.sum [arrow-go]
via GitHub
[I] [Python] Protect PyBuffer and NumPyBuffer destructors against interpreter finalization [arrow]
via GitHub
Re: [I] [Python] Protect PyBuffer and NumPyBuffer destructors against interpreter finalization [arrow]
via GitHub
[I] [R] Update macOS CRAN job SDK from 11.3 to 14.5 to match R 4.6.0 build environment [arrow]
via GitHub
Re: [I] [C++] Bump versions of bundled dependencies [arrow]
via GitHub
[I] [C++] Bump bundled c-ares [arrow]
via GitHub
Re: [I] [C++] Bump bundled c-ares [arrow]
via GitHub
Re: [I] [Python] Interchange object data buffer has the wrong dtype / `from_dataframe` incorrect [arrow]
via GitHub
Re: [I] [Python][Interchange protocol] Export boolean columns as bit-packed values [arrow]
via GitHub
Re: [I] [Python] DataFrame interchange protocol: NaNs are interchanged as null [arrow]
via GitHub
Re: [I] [Python] Cannot create RecordBatch with nested struct containing extension type [arrow]
via GitHub
Re: [I] [Parquet][Python] parquet arrow schema inconsistent for file with UUID [arrow]
via GitHub
Re: [I] [Python] Instantiating arrays with type ListType[ExtensionType] is not supported [arrow]
via GitHub
Re: [I] Extension types not fully supported in list arrays [arrow]
via GitHub
Re: [I] [C++][Compute] Support to initialize expression with a string [arrow]
via GitHub
Re: [I] [C++] Add hash_mode function [arrow]
via GitHub
Re: [I] [C++] Implement the round-shift for fixed size data type [arrow]
via GitHub
Re: [I] [C++][Docs] Describe limitations and alternatives for handling dependencies via package managers [arrow]
via GitHub
Re: [I] [Docs][C++] Add missing methods to ArrayBuilders API Reference [arrow]
via GitHub
Re: [I] [C++] Add Byte Range to CSV Reader ReadOptions [arrow]
via GitHub
Re: [I] [Python] Error using extension types in struct in PyArrow [arrow]
via GitHub
[I] BUG: Pandas BUG: DataFrame.fillna() with ArrowDtype(pa.null()) columns causes Arrow C++ assertion failure (core dump) [arrow]
via GitHub
Re: [I] [Python] BUG: Pandas BUG: DataFrame.fillna() with ArrowDtype(pa.null()) columns causes Arrow C++ assertion failure (core dump) [arrow]
via GitHub
Re: [I] [Python] Accessing parquet files with parquet.read_table in google cloud storage fails, but works with dataset, works in 16.1.0 fails in 17.0.0 [arrow]
via GitHub
[I] [Python][C++] Verification jobs for Python fail building PyArrow [arrow]
via GitHub
Re: [I] [Python][C++] Verification jobs for Python fail building PyArrow [arrow]
via GitHub
[I] [Python] Pandas integration TestConvertMetadata.test_table_column_subset_metadata fails with nightlies pandas [arrow]
via GitHub
Re: [I] [Python] Pandas integration TestConvertMetadata.test_table_column_subset_metadata fails with nightlies pandas [arrow]
via GitHub
[I] [CI][C++] MinGW jobs fail every run after MSYS2 toolchain updates [arrow]
via GitHub
Re: [I] [CI][C++] MinGW jobs fail every run after MSYS2 toolchain updates [arrow]
via GitHub
[I] UB in TypedColumnWriterImpl::UpdateLevelHistogram due to std::span construction from nullptr [arrow]
via GitHub
[I] [Python][Parquet] bloom_filter_offset not present in ColumnChunkMetaData.to_dict() output despite bloom filter being written [arrow]
via GitHub
Re: [I] [Python][Parquet] Expose bloom_filter_offset and bloom_filter_length to Python in column chunk metadata [arrow]
via GitHub
Re: [I] [R] fix max_rows_per_group must be a positive number [arrow]
via GitHub
[I] [Python] Inconsistent default values for Parquet `pre_buffer` [arrow]
via GitHub
Re: [I] [Parquet][Python] Inconsistent default values for Parquet `pre_buffer` [arrow]
via GitHub
Re: [I] [C++] Arrow test 'arrow-utility-test' contains heap-buffer-overflow error [arrow]
via GitHub
[I] [C++] Bump xsimd [arrow]
via GitHub
Re: [I] [Doc][Python] Timestamp with tz loses its time zone after `to_numpy` [arrow]
via GitHub
Re: [I] [Website][Dev] Update merge_pr script on arrow-site repository to use new github workflow [arrow]
via GitHub
Re: [I] [C++][Benchmarking] Add AsofJoin Ordering Assertion and Benchmark Fixes [arrow]
via GitHub
Re: [I] [C++] Unify KeyColumnArray and ArraySpan [arrow]
via GitHub
Re: [I] [C++] JSON kernels [arrow]
via GitHub
Re: [I] [C++] Use shared_ptr<DataType> less throughout arrow/compute [arrow]
via GitHub
Re: [I] [C++][Docs] Add Acero project example in Getting Started Section [arrow]
via GitHub
Re: [I] [R][CI] Use rcmdcheck feature to ignore inconsequential notes [arrow]
via GitHub
[I] [CI][Python] Pandas nightly tests failing due to DatetimeIndex missmatch [arrow]
via GitHub
Re: [I] [CI][Python] Pandas nightly tests failing due to DatetimeIndex missmatch [arrow]
via GitHub
[I] [C++][Parquet] Catch std::vector allocation errors in encoding fuzzer [arrow]
via GitHub
Re: [I] [C++][Parquet] Catch std::vector allocation errors in encoding fuzzer [arrow]
via GitHub
[I] Panic: interface conversion: *hashing.BinaryMemoTable is not hashing.TypedMemoTable[uint16] when using IsIn with FixedSizeBinary(2) [arrow-go]
via GitHub
Re: [I] Panic: interface conversion: *hashing.BinaryMemoTable is not hashing.TypedMemoTable[uint16] when using IsIn with FixedSizeBinary(2) [arrow-go]
via GitHub
Re: [I] [Python] Update timezones strategy to include fixed offsets [arrow]
via GitHub
[I] GH-49915: [numpy_convert] memory management Bugs for `PyList_SetItem` in `SparseCSFTensorToNdarray` [arrow]
via GitHub
Re: [I] [Python] numpy_convert memory management(Use-After-Free) Bugs for `PyList_SetItem` in `SparseCSFTensorToNdarray` [arrow]
via GitHub
[I] Reference Counting (memory management) for `PyList_SetItem` in `SparseCSFTensorToNdarray` [arrow]
via GitHub
Re: [I] Reference Counting (Use-After-Free) Bugs for `PyList_SetItem` in `SparseCSFTensorToNdarray` [arrow]
via GitHub
Re: [I] Reference Counting (Use-After-Free) Bugs for `PyList_SetItem` in `SparseCSFTensorToNdarray` [arrow]
via GitHub
[I] Consume COLUMN_DEF in JDBC Flight Driver [arrow-java]
via GitHub
Re: [I] [Docs] Filter website-only issues from changelogs [arrow]
via GitHub
Re: [I] [C++] Substrait aggregate rel has incorrect output order [arrow]
via GitHub
Re: [I] [C++] Calculate output type from aggregate to convert arrow aggregate to substrait [arrow]
via GitHub
Re: [I] [C++] ReadRel is translated to a source node that emits unexpected fields [arrow]
via GitHub
Re: [I] [Release][R] Add verification for R binaries to rc script [arrow]
via GitHub
Re: [I] [proposal] Arrow Intermediate Representation to facilitate the transformation of row-oriented data sources into Arrow columnar representation [arrow]
via GitHub
Re: [I] [C++][Docs] Substrait Usage in Acero [arrow]
via GitHub
Re: [I] [C++][Compute] Switch new (Swiss) hash join to use 64-bit hash [arrow]
via GitHub
Re: [I] [C++][R] Strptime should detect invalid formats [arrow]
via GitHub
[I] [Archery] C++ Benchmark preserve improvements [arrow]
via GitHub
Re: [I] [R] Should parquet/IPC writers detect compression from filename? [arrow]
via GitHub
[I] Helper method for RecordBuilder to drop last, possibly incomplete "row" [arrow-go]
via GitHub
Re: [I] [C++][Gandiva] Implement NextDay Function to case-insensitive [arrow]
via GitHub
Re: [I] [C++] Backpressure should resume as a new task, assuming executor is present [arrow]
via GitHub
Re: [I] [C++][Compute] Add dictionary support to new (Swiss) hash join [arrow]
via GitHub
Re: [I] [C++] Clarify lifecycle of a StopSource/StopToken [arrow]
via GitHub
Re: [I] [R] Better handling of calling string functions on dictionaries [arrow]
via GitHub
Re: [I] [R] Support more objects in as_schema() and use it in more places [arrow]
via GitHub
[I] FixedShapedTensor `to_pandas_dtype` is returning `NotImplementedError` [arrow]
via GitHub
Re: [I] [Gandiva] Remove usages of mutable reference out arguments [arrow]
via GitHub
Re: [I] [R] Allow statically linked libcurl in GCS when building libarrow DLL in RTools [arrow]
via GitHub
Re: [I] [R] Enable GCS tests for Windows [arrow]
via GitHub
Re: [I] [Python] Add python bindings to ExecuteScalarExpression [arrow]
via GitHub
Re: [I] [C++][Docs] Revise C++ Documentation [arrow]
via GitHub
Re: [I] [C++] Expose higher-level utility to execute a kernel [arrow]
via GitHub
Re: [I] [R] Improve evaluation of R functions from C++ [arrow]
via GitHub
[I] pqarrow.FileWriter.Close() leaks per-column-chunk buffers when the underlying io.Writer fails mid-flush [arrow-go]
via GitHub
Re: [I] pqarrow.FileWriter.Close() leaks per-column-chunk buffers when the underlying io.Writer fails mid-flush [arrow-go]
via GitHub
[I] pqarrow.FileWriter.Close() leaks per-column-chunk buffers when the underlying io.Writer fails mid-flush [arrow-go]
via GitHub
Re: [I] [Dev] Can we prepend "[COMPONENT]" to issue title automatically? [arrow]
via GitHub
Re: [I] [C++] Add `OptionalBitmapAnd` utility [arrow]
via GitHub
Re: [I] [R][C++] Support upcasting NULL columns in Dataset CSV reader [arrow]
via GitHub
Re: [I] [R][C++] Add ability to trim whitespace to CSV reading options [arrow]
via GitHub
Re: [I] [R] Any support for rolling windows functions? [arrow]
via GitHub
Re: [I] [Format] Add wording for alternative layouts [arrow]
via GitHub
Re: [I] [Python] Instantiate `pa.Table` from a `Generator`/`Iterator` [arrow]
via GitHub
Re: [I] [C++] Provide way for extension array to provide it's own value pretty printer [arrow]
via GitHub
Re: [I] [Python] Filter on `__row_index` [arrow]
via GitHub
Re: [I] [C++][Python] Fully support special fields in `Scanner`. [arrow]
via GitHub
Re: [I] [Python] Feature to append row groups to existing parquet file [arrow]
via GitHub
[I] [Archery] benchmark subcommand not supporting pandas >=3.0 [arrow]
via GitHub
Re: [I] [Archery] benchmark subcommand not supporting pandas >=3.0 [arrow]
via GitHub
[I] [C++] Deprecate `RandomAccessFile::Read{At,Async}` without `allow_short_read` [arrow]
via GitHub
[I] Floordiv compute kernel [arrow]
via GitHub
Re: [I] Floordiv compute kernel [arrow]
via GitHub
[I] [Python] Add timezone information when printing TimestampArray [arrow]
via GitHub
Re: [I] [Python] Add timezone information when printing TimestampArray [arrow]
via GitHub
Re: [I] [R] Add examples working with `tidyr::unnest`and `tidyr::unnest_longer` [arrow]
via GitHub
Re: [I] [R] parse_date_time should support quiet = FALSE [arrow]
via GitHub
Re: [I] [C++] "replace_with_mask" kernel to raise informative error if replacement array is too long? [arrow]
via GitHub
Re: [I] [C++] Enable multiple character delimiters in read_csv [arrow]
via GitHub
Re: [I] [R] parse_date_time should support locale parameter [arrow]
via GitHub
Re: [I] Enable a smaller build of just libparquet [arrow]
via GitHub
Re: [I] [C++][Gandiva] Enhance random data generation [arrow]
via GitHub
Re: [I] [Python][Parquet] Faster parquet partitioning scheme [arrow]
via GitHub
Re: [I] [C++][Parquet] Add SkipValues() to decoder, Refactor TypedColumnReader::Skip to use it. [arrow]
via GitHub
Re: [I] Add trademark symbol to Apache Arrow logo [arrow]
via GitHub
Re: [I] [C++][Python] Add option to include partitioning columns in basename_template's filename [arrow]
via GitHub
Re: [I] [C++][Parquet] Support read by row ranges [arrow]
via GitHub
Re: [I] [R] Without invalid_row_handler in CSV Parsing Options [arrow]
via GitHub
Re: [I] [Archery] Allow running external repetitions of C++ micro-benchmarks [arrow]
via GitHub
Re: [I] Remove ad-hoc substrait version after substrait#342 [arrow]
via GitHub
Re: [I] [Release] Add a post script to generate announce email [arrow]
via GitHub
Re: [I] [C++] Add an end-to-end fuzz test for the new scan node [arrow]
via GitHub
Re: [I] [Python] [Flight] Arrow Flight Server tells which line fails in Python [arrow]
via GitHub