[jira] [Created] (ARROW-17182) [C++][Docs] Show examples of multiple Acero-compatible language APIs
Ian Cook created ARROW-17182: Summary: [C++][Docs] Show examples of multiple Acero-compatible language APIs Key: ARROW-17182 URL: https://issues.apache.org/jira/browse/ARROW-17182 Project: Apache Arrow Issue Type: Sub-task Components: C++, Documentation Reporter: Ian Cook Today, there is really only one feature-complete high-level API wrapping the lower-level Acero ExecPlan API: the dplyr interface in the arrow R package. But in the future, when Acero has full capability to consume and execute Substrait plans, we will have more high-level APIs wrapping Acero, including Ibis. It would be nice to include a tabbed interface in a prominent place on the front page of the Acero docs showing several different Acero-compatible language APIs, similar to what's on the front page of the [Apache Spark website|https://spark.apache.org/]. Credit to [~willjones127] for suggesting this idea. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17181) [Python] Scalar UDF Experimental Documentation
Vibhatha Lakmal Abeykoon created ARROW-17181: Summary: [Python] Scalar UDF Experimental Documentation Key: ARROW-17181 URL: https://issues.apache.org/jira/browse/ARROW-17181 Project: Apache Arrow Issue Type: Sub-task Reporter: Vibhatha Lakmal Abeykoon Assignee: Vibhatha Lakmal Abeykoon At the moment the existing Scalar UDF usage is not documented. A final documentation update will follow once other features are integrated, but to support users and developers the existing content needs to be documented now.
[jira] [Created] (ARROW-17180) [C++] Backpressure should resume as a new task, assuming executor is present
Weston Pace created ARROW-17180: --- Summary: [C++] Backpressure should resume as a new task, assuming executor is present Key: ARROW-17180 URL: https://issues.apache.org/jira/browse/ARROW-17180 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Weston Pace As brought up on the mailing list: https://lists.apache.org/thread/hscjqyw7lpt95vlkzoslm6pyhy9x6wso When we resume producing in the source node, we should run the continuation (which starts scanning again) as a new task. This will avoid potential stack overflow (if we pause and resume many times) and help make for more understandable stack traces. The continuation is unlikely to have anything to do with the task that is calling resume, so there is not much downside to a context switch.
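The scheduling idea can be sketched in a few lines of Python (a toy model, not Arrow's actual C++ scheduler; the Source class and its pause()/resume() methods here are hypothetical names): resuming by submitting the continuation to an executor as a fresh task keeps the caller's stack flat, whereas invoking it inline would deepen the stack with every pause/resume cycle.

```python
from concurrent.futures import ThreadPoolExecutor

class Source:
    """Toy source node: pausing stores a continuation, resuming reschedules it."""

    def __init__(self, executor):
        self.executor = executor
        self.continuation = None

    def pause(self, continuation):
        # Remember what to do once producing may resume (e.g. restart the scan).
        self.continuation = continuation

    def resume(self):
        # Submit the continuation as a NEW task instead of calling it inline,
        # so the resuming task's stack does not grow and stack traces stay short.
        return self.executor.submit(self.continuation)

events = []
with ThreadPoolExecutor(max_workers=1) as pool:
    src = Source(pool)
    src.pause(lambda: events.append("scan restarted"))
    src.resume().result()  # wait for the rescheduled continuation
```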
[jira] [Created] (ARROW-17179) [R] Support more objects in as_schema() and use it in more places
Dewey Dunnington created ARROW-17179: Summary: [R] Support more objects in as_schema() and use it in more places Key: ARROW-17179 URL: https://issues.apache.org/jira/browse/ARROW-17179 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Dewey Dunnington Right now the {{as_schema()}} method isn't used in many places and doesn't support many object types. It is probably a good fit for sanitizing arguments where a schema is expected and likely has more classes that can be interpreted in this way (e.g., Field or DataType as identified in ARROW-16444). After ARROW-16444, the internal {{in_type_as_schema()}} function used to sanitize the {{in_type}} argument of {{register_scalar_function()}} may be a good candidate for this.
[jira] [Created] (ARROW-17178) [R] Support head() in arrow_dplyr_query with user-defined function
Dewey Dunnington created ARROW-17178: Summary: [R] Support head() in arrow_dplyr_query with user-defined function Key: ARROW-17178 URL: https://issues.apache.org/jira/browse/ARROW-17178 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Dewey Dunnington After ARROW-16444 and ARROW-16703 we will have some arrow_dplyr_query objects whose pipeline can't contain {{head()}} after the part that calls R code. This is a very big feature not to support and we need to find a workaround. The full-on solution is to make sure that we support an R-level RecordBatchReader, but there may be a workaround that we can support in the meantime.
[jira] [Created] (ARROW-17177) [C++][Docs] Re-Organize the Existing ACERO Streaming Engine Documentation
Vibhatha Lakmal Abeykoon created ARROW-17177: Summary: [C++][Docs] Re-Organize the Existing ACERO Streaming Engine Documentation Key: ARROW-17177 URL: https://issues.apache.org/jira/browse/ARROW-17177 Project: Apache Arrow Issue Type: Sub-task Reporter: Vibhatha Lakmal Abeykoon Assignee: Vibhatha Lakmal Abeykoon The current document is too long. Creating sub-pages for each example, explaining the code, and providing a better description would greatly improve readability and make the content easier to browse. The idea is to create a sub-folder in the examples called `acero` and include each example in a separate `.cc` file. This is the code change. Following this, the documentation page on the website can be split into sub-pages. This is the only change suggested for this sub-task. There is already a JIRA, https://issues.apache.org/jira/browse/ARROW-16802, to improve the internal content, so that one would be used for re-writing the content.
[jira] [Created] (ARROW-17176) [Rust] Activate generate_decimal256_case arrow integration test for rust
L. C. Hsieh created ARROW-17176: --- Summary: [Rust] Activate generate_decimal256_case arrow integration test for rust Key: ARROW-17176 URL: https://issues.apache.org/jira/browse/ARROW-17176 Project: Apache Arrow Issue Type: Bug Components: Archery Reporter: L. C. Hsieh
[jira] [Created] (ARROW-17175) [CI][macOS] macos-10.15 is deprecated and macos-latest is macos-11
Kouhei Sutou created ARROW-17175: Summary: [CI][macOS] macos-10.15 is deprecated and macos-latest is macos-11 Key: ARROW-17175 URL: https://issues.apache.org/jira/browse/ARROW-17175 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Kouhei Sutou Assignee: Kouhei Sutou https://github.com/actions/virtual-environments#available-environments
{quote}
macOS 11: macos-latest or macos-11
macOS 10.15 (deprecated): macos-10.15
{quote}
[jira] [Created] (ARROW-17174) FileSystemDataset FilenamePartitioning error - fsspec filesystem
Adam Kirby created ARROW-17174: -- Summary: FileSystemDataset FilenamePartitioning error - fsspec filesystem Key: ARROW-17174 URL: https://issues.apache.org/jira/browse/ARROW-17174 Project: Apache Arrow Issue Type: Bug Components: C++, Python Affects Versions: 8.0.1 Reporter: Adam Kirby Attachments: zip_of_csvs_test.py Unless this is user error (which it may well be!), it seems that Dataset FilenamePartitioning on read doesn't work with an fsspec filesystem. From what I can glean, the filenames can be parsed successfully when passed to the parse() method, but the fields do not seem to be extracted from the filenames passed to dataset() – instead, they appear as nulls. When trying to use the partitioning discover() method (assuming this is a reasonable thing to try), I get the below traceback. (Repro python script attached.)
{code}
Traceback (most recent call last):
  File "/zip_of_csvs_test.py", line 82, in <module>
    ds_partitioned = pds.dataset(
  File "/.pyenv/versions/3.8.2/lib/python3.8/site-packages/pyarrow/dataset.py", line 697, in dataset
    return _filesystem_dataset(source, **kwargs)
  File "/.pyenv/versions/3.8.2/lib/python3.8/site-packages/pyarrow/dataset.py", line 449, in _filesystem_dataset
    return factory.finish(schema)
  File "pyarrow/_dataset.pyx", line 1857, in pyarrow._dataset.DatasetFactory.finish
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: No non-null segments were available for field 'frequency'; couldn't infer type
{code}
[GitHub] [arrow-adbc] lidavidm opened a new pull request, #44: [C] Fix compatibility issues noticed with Ibis
lidavidm opened a new pull request, #44: URL: https://github.com/apache/arrow-adbc/pull/44 - The SQLite catalog is called "main" - Fix a couple bugs and note areas of improvement -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (ARROW-17173) [C++] Clarify lifecycle of a StopSource/StopToken
Dewey Dunnington created ARROW-17173: Summary: [C++] Clarify lifecycle of a StopSource/StopToken Key: ARROW-17173 URL: https://issues.apache.org/jira/browse/ARROW-17173 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Dewey Dunnington In ARROW-11841 we ran into an issue where a single cancellable operation (i.e., {{SetSignalStopSource()}}/{{ResetSignalStopSource()}}) was a poor fit: the {{StopToken}} must be assigned to an {{IOContext}} when a filesystem is created; however, the filesystem may be reused for more than one cancellable operation (e.g., reading a CSV). Following the instructions in the current API (in util/cancel.h) results in a situation where the lifecycle of the filesystem must match the lifecycle of the {{StopSource}}, which can be difficult to program around. A related problem arises where we load Python and R Arrow libraries that link to the same .so. After ARROW-11841, R will have the ability to register signal handlers to interrupt Arrow operations, and users that load pyarrow via reticulate must be careful to disable this or they will get an error along the lines of "StopSource already set up". From a purely R-centric point of view, we could provide our own {{StopToken}} implementation if we were allowed to, since R already implements the proper signal handler and the arrow R package implements the proper event loop to make this thread safe. Currently the {{StopToken}} is passed by value and thus a subclass is not an option. For R, anyway, this would eliminate any need to consider the lifecycle of another object.
[jira] [Created] (ARROW-17172) [C++][Python] test_cython_api fails on windows
Alenka Frim created ARROW-17172: --- Summary: [C++][Python] test_cython_api fails on windows Key: ARROW-17172 URL: https://issues.apache.org/jira/browse/ARROW-17172 Project: Apache Arrow Issue Type: Sub-task Components: C++, Python Reporter: Alenka Frim Assignee: Alenka Frim Fix For: 10.0.0 With the current change in https://github.com/apache/arrow/pull/13311, the second part of test_cython_api – which checks that the extension module is loadable from a subprocess without pyarrow imported first – is failing on Windows. Research the issue and make sure the test runs and passes in CI.
[GitHub] [arrow-testing] pitrou commented on pull request #80: ARROW-17100: Add example of Arrow 2.0 DataPageV2 compression issue
pitrou commented on PR #80: URL: https://github.com/apache/arrow-testing/pull/80#issuecomment-1191800075 Thanks for the update @wjones127 :-)
[GitHub] [arrow-testing] pitrou merged pull request #80: ARROW-17100: Add example of Arrow 2.0 DataPageV2 compression issue
pitrou merged PR #80: URL: https://github.com/apache/arrow-testing/pull/80
[jira] [Created] (ARROW-17171) [C++][Gandiva] Implement case-insensitive
Vinicius Souza Roque created ARROW-17171: Summary: [C++][Gandiva] Implement case-insensitive Key: ARROW-17171 URL: https://issues.apache.org/jira/browse/ARROW-17171 Project: Apache Arrow Issue Type: New Feature Components: C++ - Gandiva Reporter: Vinicius Souza Roque Implementing changes for the function to be case-insensitive
[jira] [Created] (ARROW-17170) Research Documentation Formats
Kae Suarez created ARROW-17170: -- Summary: Research Documentation Formats Key: ARROW-17170 URL: https://issues.apache.org/jira/browse/ARROW-17170 Project: Apache Arrow Issue Type: Sub-task Reporter: Kae Suarez Assignee: Kae Suarez In order to revise the documentation, some inspiration is needed to get the format right. This ticket provides a space for exploration of possible inspiration for the C++ documentation – once we have some good examples and/or agreement, we can move to some content creation.
[jira] [Created] (ARROW-17169) [Go] goPanicIndex in firstTimeBitmapWriter.Finish()
Robert Purdom created ARROW-17169: - Summary: [Go] goPanicIndex in firstTimeBitmapWriter.Finish() Key: ARROW-17169 URL: https://issues.apache.org/jira/browse/ARROW-17169 Project: Apache Arrow Issue Type: Bug Components: Go Affects Versions: 8.0.1, 9.0.0 Environment: go (1.18.3), Linux, AMD64 Reporter: Robert Purdom I'm working with complex parquet files with 500+ "root" columns where some fields are lists of structs, internally referred to as 'topics'. Some of these structs have hundreds of columns. When reading a particular topic, I get an index panic at the line indicated below. This error occurs when the value for the topic is null, as in, for this particular root record, this topic has no data. The root is household data, the topic is auto, so the error occurs when the household has no autos. The auto field is a nullable list of struct.
{code:go}
/* Finish() was called from defLevelsToBitmapInternal.
   Data values when the panic occurs:
   bw.length == 17531
   bw.bitMask == 1
   bw.pos == 3424
   len(bw.buf) == 428
   cap(bw.buf) == 448
   bw.byteOffset == 428
   bw.curByte == 0
*/
// bitmap_writer.go
func (bw *firstTimeBitmapWriter) Finish() {
	// store curByte into the bitmap
	if bw.length > 0 && bw.bitMask != 0x01 || bw.pos < bw.length {
		bw.buf[int(bw.byteOffset)] = bw.curByte // <- panic: index out of range
	}
}
{code}
In every case, when the panic occurs, bw.byteOffset == len(bw.buf). I tested the below modification and it does remedy the bug. However, it's probably only masking the actual bug.
{code:go}
// Test version: no panic
func (bw *firstTimeBitmapWriter) Finish() {
	// store curByte into the bitmap
	if bw.length > 0 && bw.bitMask != 0x01 || bw.pos < bw.length {
		if int(bw.byteOffset) == len(bw.buf) {
			bw.buf = append(bw.buf, bw.curByte)
		} else {
			bw.buf[int(bw.byteOffset)] = bw.curByte
		}
	}
}
{code}
[jira] [Created] (ARROW-17168) [C++][Docs] Expand C++ Cookbook
Kae Suarez created ARROW-17168: -- Summary: [C++][Docs] Expand C++ Cookbook Key: ARROW-17168 URL: https://issues.apache.org/jira/browse/ARROW-17168 Project: Apache Arrow Issue Type: Sub-task Components: C++, Documentation Reporter: Kae Suarez Currently, the C++ Cookbook has very few examples compared to the Python Cookbook, even though C++ Arrow can be used for the same tasks. Achieving parity between the cookbooks would be useful for developers to be able to use Arrow via C++ as a primary language. Requires creation of examples and some light supporting prose – Python Cookbook can likely be copied in structure and some prose to save time.
[jira] [Created] (ARROW-17167) [C++][Docs] Improve C++ Documentation
Kae Suarez created ARROW-17167: -- Summary: [C++][Docs] Improve C++ Documentation Key: ARROW-17167 URL: https://issues.apache.org/jira/browse/ARROW-17167 Project: Apache Arrow Issue Type: Improvement Components: C++, Documentation Reporter: Kae Suarez Parent ticket for tasks that aim to improve C++ Arrow documentation. A general goal could be parity with the Python documentation, so there's a baseline – open to further suggestions. Suggestions from new users would be especially valued, since their experience is most likely to have been affected by gaps in the documentation.
[GitHub] [arrow-testing] pitrou commented on pull request #80: ARROW-17100: Add example of Arrow 2.0 DataPageV2 compression issue
pitrou commented on PR #80: URL: https://github.com/apache/arrow-testing/pull/80#issuecomment-1191734043 Some nits: * avoid creating a subdir for a single file? * add a README.md in `data/parquet` to start describing the files being added? A bit like in https://github.com/apache/parquet-testing/blob/master/data/README.md
[GitHub] [arrow-testing] wjones127 commented on pull request #80: ARROW-17100: Add example of Arrow 2.0 DataPageV2 compression issue
wjones127 commented on PR #80: URL: https://github.com/apache/arrow-testing/pull/80#issuecomment-1191717690 cc @pitrou
[GitHub] [arrow-testing] wjones127 opened a new pull request, #80: ARROW-17100: Add example of Arrow 2.0 DataPageV2 compression issue
wjones127 opened a new pull request, #80: URL: https://github.com/apache/arrow-testing/pull/80 We used to always set `is_compressed=false` in page headers regardless of whether there was actual compression. Check you can read this file if you want to support files written by Arrow C++ 2.0.
[GitHub] [arrow-adbc] lidavidm opened a new issue, #43: ADBC/Ibis pain points
lidavidm opened a new issue, #43: URL: https://github.com/apache/arrow-adbc/issues/43 - Need way to query driver and server version - Need handling of multiple databases within a connection
[jira] [Created] (ARROW-17166) [R] [CI] Remove ENV TZ from docker files
Rok Mihevc created ARROW-17166: -- Summary: [R] [CI] Remove ENV TZ from docker files Key: ARROW-17166 URL: https://issues.apache.org/jira/browse/ARROW-17166 Project: Apache Arrow Issue Type: Bug Reporter: Rok Mihevc Fix For: 9.0.0 We have noticed an R CI job (AMD64 Ubuntu 20.04 R 4.2 Force-Tests true) failing on master: [1|https://github.com/apache/arrow/runs/7424773120?check_suite_focus=true#step:7:5547], [2|https://github.com/apache/arrow/runs/7431821192?check_suite_focus=true#step:7:5804], [3|https://github.com/apache/arrow/runs/7445803518?check_suite_focus=true#step:7:16305] with:
{code:java}
Start test: array uses local timezone for POSIXct without timezone
  test-Array.R:269:3 [success]
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
{code}
[jira] [Created] (ARROW-17165) [Python] Add python bindings to ExecuteScalarExpression
Weston Pace created ARROW-17165: --- Summary: [Python] Add python bindings to ExecuteScalarExpression Key: ARROW-17165 URL: https://issues.apache.org/jira/browse/ARROW-17165 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Weston Pace Currently, if a user wants to execute an expression, we require them to create an exec plan, add a project node, and then run the exec plan. However, for simple use cases, where a user already has a record batch in memory, we could probably expose ExecuteScalarExpression on its own.
[jira] [Created] (ARROW-17164) [C++] Expose higher-level utility to execute the kernel
Antoine Pitrou created ARROW-17164: -- Summary: [C++] Expose higher-level utility to execute the kernel Key: ARROW-17164 URL: https://issues.apache.org/jira/browse/ARROW-17164 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Antoine Pitrou Fix For: 9.0.0 Currently, the compute layer exposes several high-level facilities to execute a compute function: {{CallFunction}} and {{Function::Execute}}. However, if you'd favor a two-step approach of first resolving the {{Kernel}} for a given set of argument types and then executing the kernel, you're forced to deal with the rather cumbersome {{Kernel}} execution interface. It would be nice if the base {{Kernel}} class had something similar to the {{Function::Execute}} method.
[GitHub] [arrow-adbc] lidavidm merged pull request #42: [Docs] Update README
lidavidm merged PR #42: URL: https://github.com/apache/arrow-adbc/pull/42
[jira] [Created] (ARROW-17163) [C++] Don't install jni_util.h
David Li created ARROW-17163: Summary: [C++] Don't install jni_util.h Key: ARROW-17163 URL: https://issues.apache.org/jira/browse/ARROW-17163 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: David Li ARROW-17086 fixed some compiler warnings and restored the installation of jni_util.h to match prior behavior. But we never intended to expose this header, and the downstream Gluten project [no longer depends on it|https://github.com/apache/arrow/pull/13614#issuecomment-1191198106], so we can stop installing it.
[jira] [Created] (ARROW-17162) [C++] Bump version of bundled protobuf to include ABI mismatch fix on debug builds
Raúl Cumplido created ARROW-17162: - Summary: [C++] Bump version of bundled protobuf to include ABI mismatch fix on debug builds Key: ARROW-17162 URL: https://issues.apache.org/jira/browse/ARROW-17162 Project: Apache Arrow Issue Type: Improvement Reporter: Raúl Cumplido Assignee: Raúl Cumplido Fix For: 9.0.0 As part of the investigation for https://issues.apache.org/jira/browse/ARROW-17104 and the issue on https://issues.apache.org/jira/browse/ARROW-16520, we identified some missing symbols when compiling protobuf. See upstream fix: [https://github.com/protocolbuffers/protobuf/pull/10271] This ticket's purpose is only to update the version of the vendored protobuf defined for Arrow.
[jira] [Created] (ARROW-17161) [C++][Java] Dataset: Support reading from fixed offset of a file for Parquet format
Hongze Zhang created ARROW-17161: Summary: [C++][Java] Dataset: Support reading from fixed offset of a file for Parquet format Key: ARROW-17161 URL: https://issues.apache.org/jira/browse/ARROW-17161 Project: Apache Arrow Issue Type: Improvement Components: C++, Java Reporter: Hongze Zhang Fix For: 9.0.0 This adds properties *start_offset_* and *length_* to FileSource and should be functional for the Parquet dataset format. Supporting the Java and C++ dataset APIs at this time.
[jira] [Created] (ARROW-17160) [C++] Create a base directory for PyArrow CPP header files
Alenka Frim created ARROW-17160: --- Summary: [C++] Create a base directory for PyArrow CPP header files Key: ARROW-17160 URL: https://issues.apache.org/jira/browse/ARROW-17160 Project: Apache Arrow Issue Type: Sub-task Components: C++ Reporter: Alenka Frim Assignee: Alenka Frim Fix For: 10.0.0 See: https://github.com/apache/arrow/pull/13311#discussion_r925344753
[jira] [Created] (ARROW-17159) [C++][JAVA] Dataset: Support reading from fixed offset of a file for Parquet format
Jin Chengcheng created ARROW-17159: -- Summary: [C++][JAVA] Dataset: Support reading from fixed offset of a file for Parquet format Key: ARROW-17159 URL: https://issues.apache.org/jira/browse/ARROW-17159 Project: Apache Arrow Issue Type: Improvement Components: C++, Java Affects Versions: 9.0.0 Reporter: Jin Chengcheng Assignee: Jin Chengcheng With that, we can use the Substrait plan ReadRel_LocalFiles_FileOrFiles.start() and length() to push down the scan filter.
[jira] [Created] (ARROW-17158) [GLib][Flight] Add support for GetFlightInfo
Kouhei Sutou created ARROW-17158: Summary: [GLib][Flight] Add support for GetFlightInfo Key: ARROW-17158 URL: https://issues.apache.org/jira/browse/ARROW-17158 Project: Apache Arrow Issue Type: Improvement Components: FlightRPC, GLib Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Created] (ARROW-17157) [GLib][Ruby][Flight] Add support for headers to GAFlightCallOptions
Kouhei Sutou created ARROW-17157: Summary: [GLib][Ruby][Flight] Add support for headers to GAFlightCallOptions Key: ARROW-17157 URL: https://issues.apache.org/jira/browse/ARROW-17157 Project: Apache Arrow Issue Type: Improvement Components: FlightRPC, GLib, Ruby Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Created] (ARROW-17156) [GLib][Flight] Add GAFlightClientOptions::disable-server-verification
Kouhei Sutou created ARROW-17156: Summary: [GLib][Flight] Add GAFlightClientOptions::disable-server-verification Key: ARROW-17156 URL: https://issues.apache.org/jira/browse/ARROW-17156 Project: Apache Arrow Issue Type: Improvement Components: FlightRPC, GLib Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Created] (ARROW-17155) pyarrow compute module does not contain functions described in documentation
Volodymyr created ARROW-17155: - Summary: pyarrow compute module does not contain functions described in documentation Key: ARROW-17155 URL: https://issues.apache.org/jira/browse/ARROW-17155 Project: Apache Arrow Issue Type: Bug Components: Python Affects Versions: 8.0.0 Reporter: Volodymyr Looks like the pyarrow compute module (https://github.com/apache/arrow/blob/master/python/pyarrow/compute.py) has entirely different contents than described in the documentation: [https://arrow.apache.org/docs/python/api/compute.html]