[jira] [Updated] (ARROW-9904) [C++] Unroll the loop manually for CountSetBits
[ https://issues.apache.org/jira/browse/ARROW-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9904: -- Labels: pull-request-available (was: ) > [C++] Unroll the loop manually for CountSetBits > --- > > Key: ARROW-9904 > URL: https://issues.apache.org/jira/browse/ARROW-9904 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Frank Du >Assignee: Frank Du >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h
>
> The tight loop below can perform better if it is unrolled manually, which hints the compiler to generate better parallel instructions.
> for (auto iter = u64_data; iter < end; ++iter) {
>   count += BitUtil::PopCount(*iter);
> }
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-9904) [C++] Unroll the loop manually for CountSetBits
[ https://issues.apache.org/jira/browse/ARROW-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-9904: Assignee: Apache Arrow JIRA Bot (was: Frank Du) > [C++] Unroll the loop manually for CountSetBits > --- > > Key: ARROW-9904 > URL: https://issues.apache.org/jira/browse/ARROW-9904 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Frank Du >Assignee: Apache Arrow JIRA Bot >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h
>
> The tight loop below can perform better if it is unrolled manually, which hints the compiler to generate better parallel instructions.
> for (auto iter = u64_data; iter < end; ++iter) {
>   count += BitUtil::PopCount(*iter);
> }
[jira] [Assigned] (ARROW-9904) [C++] Unroll the loop manually for CountSetBits
[ https://issues.apache.org/jira/browse/ARROW-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-9904: Assignee: Frank Du (was: Apache Arrow JIRA Bot) > [C++] Unroll the loop manually for CountSetBits > --- > > Key: ARROW-9904 > URL: https://issues.apache.org/jira/browse/ARROW-9904 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Frank Du >Assignee: Frank Du >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h
>
> The tight loop below can perform better if it is unrolled manually, which hints the compiler to generate better parallel instructions.
> for (auto iter = u64_data; iter < end; ++iter) {
>   count += BitUtil::PopCount(*iter);
> }
[jira] [Created] (ARROW-9904) [C++] Unroll the loop manually for CountSetBits
Frank Du created ARROW-9904: --- Summary: [C++] Unroll the loop manually for CountSetBits Key: ARROW-9904 URL: https://issues.apache.org/jira/browse/ARROW-9904 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Frank Du Assignee: Frank Du
The tight loop below can perform better if it is unrolled manually, which hints the compiler to generate better parallel instructions.
for (auto iter = u64_data; iter < end; ++iter) {
  count += BitUtil::PopCount(*iter);
}
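The unrolling idea in ARROW-9904 can be sketched as follows. The issue concerns Arrow's C++ CountSetBits; this is only a Rust analogue of the same technique, and the function names, the unroll factor of four, and the use of four independent accumulators are assumptions made for illustration, not the change made in the pull request. No speedup is claimed here; whether unrolling helps depends on the compiler and the target CPU.

```rust
/// Baseline: one popcount per iteration, accumulated into a single counter.
pub fn count_set_bits_simple(words: &[u64]) -> u64 {
    words.iter().map(|w| u64::from(w.count_ones())).sum()
}

/// Manually unrolled by four (an assumed factor). Each lane has its own
/// accumulator, so the four additions in one iteration do not depend on
/// each other.
pub fn count_set_bits_unrolled(words: &[u64]) -> u64 {
    let mut counts = [0u64; 4];
    let mut chunks = words.chunks_exact(4);
    for chunk in &mut chunks {
        counts[0] += u64::from(chunk[0].count_ones());
        counts[1] += u64::from(chunk[1].count_ones());
        counts[2] += u64::from(chunk[2].count_ones());
        counts[3] += u64::from(chunk[3].count_ones());
    }
    // Fewer than four words may be left over; count them one at a time.
    let tail: u64 = chunks
        .remainder()
        .iter()
        .map(|w| u64::from(w.count_ones()))
        .sum();
    counts.iter().sum::<u64>() + tail
}
```

Both functions return the same total for any input; only the shape of the loop differs, so a benchmark on the target machine is the way to decide between them.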
[jira] [Assigned] (ARROW-7871) [Python] Expose more compute kernels
[ https://issues.apache.org/jira/browse/ARROW-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wieteska reassigned ARROW-7871: -- Assignee: (was: Andrew Wieteska) > [Python] Expose more compute kernels > > > Key: ARROW-7871 > URL: https://issues.apache.org/jira/browse/ARROW-7871 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Krisztian Szucs >Priority: Major > > Currently only the sum kernel is exposed. > Alternatively, consider deprecating or removing the pyarrow.compute module and binding the compute kernels as methods instead.
[jira] [Created] (ARROW-9903) [R] open_dataset freezes opening feather files
Sean Clement created ARROW-9903: --- Summary: [R] open_dataset freezes opening feather files Key: ARROW-9903 URL: https://issues.apache.org/jira/browse/ARROW-9903 Project: Apache Arrow Issue Type: Bug Environment: Rstudio Reporter: Sean Clement
Session info:
{code:java}
// R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] forcats_0.5.0 stringr_1.4.0 dplyr_1.0.1 purrr_0.3.4 readr_1.3.1 tidyr_1.1.1
[7] tibble_3.0.3 ggplot2_3.3.2 tidyverse_1.3.0 arrow_1.0.1

loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 cellranger_1.1.0 pillar_1.4.6 compiler_4.0.2 dbplyr_1.4.4 tools_4.0.2
[7] bit_1.1-15.2 lubridate_1.7.9 jsonlite_1.7.0 lifecycle_0.2.0 gtable_0.3.0 pkgconfig_2.0.3
[13] rlang_0.4.7 reprex_0.3.0 cli_2.0.2 DBI_1.1.0 rstudioapi_0.11 haven_2.3.1
[19] withr_2.2.0 xml2_1.3.2 httr_1.4.2 fs_1.4.1 generics_0.0.2 vctrs_0.3.2
[25] hms_0.5.3 bit64_0.9-7 grid_4.0.2 tidyselect_1.1.0 glue_1.4.1 R6_2.4.1
[31] fansi_0.4.1 readxl_1.3.1 modelr_0.1.8 blob_1.2.1 magrittr_1.5 backports_1.1.7
[37] scales_1.1.1 ellipsis_0.3.1 rvest_0.3.5 assertthat_0.2.1 colorspace_1.4-1 stringi_1.4.6
[43] munsell_0.5.0 broom_0.7.0 crayon_1.3.4
{code}
While cycling through and processing files using open_dataset(..., format = "feather") in R, the function hangs randomly and will not proceed to the next file. The freeze does not occur at the same file each time; additionally, the same function occasionally freezes when it is called just once. When open_dataset hangs, the only way to stop R is through Task Manager, because RStudio becomes totally unresponsive.
[jira] [Resolved] (ARROW-9885) [Rust] [DataFusion] Simplify code of type coercion for binary types
[ https://issues.apache.org/jira/browse/ARROW-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-9885. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8076 [https://github.com/apache/arrow/pull/8076] > [Rust] [DataFusion] Simplify code of type coercion for binary types > --- > > Key: ARROW-9885 > URL: https://issues.apache.org/jira/browse/ARROW-9885 > Project: Apache Arrow > Issue Type: Task > Components: Rust, Rust - DataFusion >Reporter: Jorge >Assignee: Jorge >Priority: Trivial > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > The function `numerical_coercion` only uses the operator `op` for its error > formatting. But the function's intent can be simply generalized to "coerce > two types to numerically equivalent types". -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-9888) [Rust] [DataFusion] ExecutionContext can not be shared between threads
[ https://issues.apache.org/jira/browse/ARROW-9888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-9888. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8082 [https://github.com/apache/arrow/pull/8082] > [Rust] [DataFusion] ExecutionContext can not be shared between threads > -- > > Key: ARROW-9888 > URL: https://issues.apache.org/jira/browse/ARROW-9888 > Project: Apache Arrow > Issue Type: Bug >Reporter: Andrew Lamb >Assignee: Andrew Lamb >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 40m > Remaining Estimate: 0h
>
> As suggested by Jorge on https://github.com/apache/arrow/pull/8079
> The high-level idea is to allow ExecutionContext to be used in multi-threaded environments such as Python.
> The two use cases:
> 1. When a project is planning a large and complex set of plans that depend on a common set of sources and UDFs, it would be nice to be able to multi-thread the planning. This is particularly important when planning requires reading remote metadata to formulate the plans (e.g. when the source is in s3 with many partitions). Metadata reading is often slow and network-bound, which makes threads suitable for these workloads. If multi-threading is not possible, either each plan needs to read the metadata independently (one context per plan) or planning must be sequential (with lots of network waiting).
> 2. When creating bindings to programming languages that support multi-threading, it would be nice for the ExecutionContext to be thread safe, so that we can more easily integrate with those languages.
> The code might look like:
> {code}
> alamb@MacBook-Pro rust % git diff
> diff --git a/rust/datafusion/src/execution/context.rs b/rust/datafusion/src/execution/context.rs
> index 5f8aa342e..7374b0a78 100644
> --- a/rust/datafusion/src/execution/context.rs
> +++ b/rust/datafusion/src/execution/context.rs
> @@ -460,7 +460,7 @@ mod tests {
>      use arrow::array::{ArrayRef, Int32Array};
>      use arrow::compute::add;
>      use std::fs::File;
> -    use std::io::prelude::*;
> +    use std::{sync::Mutex, io::prelude::*};
>      use tempdir::TempDir;
>      use test::*;
>
> @@ -928,6 +928,28 @@ mod tests {
>          Ok(())
>      }
>
> +    #[test]
> +    fn send_context_to_threads() -> Result<()> {
> +        // ensure that ExecutionContext's can be read by multiple threads concurrently
> +        let tmp_dir = TempDir::new("send_context_to_threads")?;
> +        let partition_count = 4;
> +        let mut ctx = Arc::new(Mutex::new(create_ctx(&tmp_dir, partition_count)?));
> +
> +        let threads: Vec<JoinHandle<Result<_>>> = (0..2)
> +            .map(|_| { ctx.clone() })
> +            .map(|ctx_clone| thread::spawn(move || {
> +                let ctx = ctx_clone.lock().expect("Locked context");
> +                // Ensure we can create logical plan code on a separate thread.
> +                ctx.create_logical_plan("SELECT c1, c2 FROM test WHERE c1 > 0 AND c1 < 3")
> +            }))
> +            .collect();
> +
> +        for thread in threads {
> +            thread.join().expect("Failed to join thread")?;
> +        }
> +        Ok(())
> +    }
> +
>      #[test]
>      fn scalar_udf() -> Result<()> {
>          let schema = Schema::new(vec![
> {code}
> At the moment, Rust refuses to compile this example (and also refuses to share ExecutionContexts between threads) due to the following error (namely, there are several `dyn` objects that are not marked as Send + Sync):
> {code}
>    Compiling datafusion v2.0.0-SNAPSHOT (/Users/alamb/Software/arrow/rust/datafusion)
> error[E0277]: `(dyn execution::physical_plan::PhysicalPlanner + 'static)` cannot be sent between threads safely
>    --> datafusion/src/execution/context.rs:940:30
>     |
> 940 |             .map(|ctx_clone| thread::spawn(move || {
>     |                              ^ `(dyn execution::physical_plan::PhysicalPlanner + 'static)` cannot be sent between threads safely
>     |
>    ::: /Users/alamb/.rustup/toolchains/nightly-2020-04-22-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/thread/mod.rs:616:8
>     |
> 616 |     F: Send + 'static,
>     |        required by this bound in `std::thread::spawn`
>     |
>     = help: the trait `std::marker::Send` is not implemented for `(dyn execution::physical_plan::PhysicalPlanner + 'static)`
>     = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<(dyn execution::physical_plan::PhysicalPlanner + 'static)>`
>     = note: required because it appears within the type `std::option::Option execution::physical_plan::PhysicalPlanner +
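The compiler error in ARROW-9888 comes from a `dyn` trait object that is not known to be `Send + Sync`. A minimal, standalone sketch of one way to make such a type shareable is below. `Planner`, `DefaultPlanner`, `Context` and `plan_on_threads` are hypothetical stand-ins, not DataFusion's types, and declaring `Send + Sync` as supertraits is an assumption about the shape of a fix, not a description of what pull request 8082 does.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical planner trait. The `Send + Sync` supertraits are what allow
/// `Arc<dyn Planner>` to cross thread boundaries.
pub trait Planner: Send + Sync {
    fn plan(&self, sql: &str) -> String;
}

pub struct DefaultPlanner;

impl Planner for DefaultPlanner {
    fn plan(&self, sql: &str) -> String {
        format!("plan({})", sql)
    }
}

/// Hypothetical context that owns a trait object, as in the error message.
pub struct Context {
    planner: Arc<dyn Planner>,
}

impl Context {
    pub fn new() -> Self {
        Context { planner: Arc::new(DefaultPlanner) }
    }

    pub fn create_logical_plan(&self, sql: &str) -> String {
        self.planner.plan(sql)
    }
}

/// Shares one context between `n` threads and collects each thread's plan.
pub fn plan_on_threads(n: usize, sql: &'static str) -> Vec<String> {
    let ctx = Arc::new(Mutex::new(Context::new()));
    let handles: Vec<thread::JoinHandle<String>> = (0..n)
        .map(|_| Arc::clone(&ctx))
        .map(|ctx_clone| {
            thread::spawn(move || {
                // Each thread locks the shared context before planning.
                let ctx = ctx_clone.lock().expect("lock poisoned");
                ctx.create_logical_plan(sql)
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("thread panicked"))
        .collect()
}
```

Removing `Send + Sync` from the `Planner` trait makes `thread::spawn` reject the closure with an E0277 error of the same kind as the one quoted in the issue.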
[jira] [Resolved] (ARROW-9583) [Rust] Offset is mishandled in arithmetic and boolean compute kernels
[ https://issues.apache.org/jira/browse/ARROW-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-9583. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 7854 [https://github.com/apache/arrow/pull/7854] > [Rust] Offset is mishandled in arithmetic and boolean compute kernels > - > > Key: ARROW-9583 > URL: https://issues.apache.org/jira/browse/ARROW-9583 > Project: Apache Arrow > Issue Type: Bug > Components: Rust >Affects Versions: 1.0.0 >Reporter: Jörn Horstmann >Assignee: Paddy Horan >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h
>
> Several compute kernels create the resulting ArrayData with the same offset as one of the operands. Instead, this offset should be 0, since the buffer is freshly constructed with the correct length.
> Example of one failing test:
>
> {code:java}
> #[test]
> fn test_primitive_array_add_sliced() {
>     let a = Int32Array::from(vec![0, 0, 0, 5, 6, 7, 8, 9, 0]);
>     let b = Int32Array::from(vec![0, 0, 0, 6, 7, 8, 9, 8, 0]);
>     let a = a.slice(3, 5);
>     let b = b.slice(3, 5);
>     let a = a.as_any().downcast_ref::<Int32Array>().unwrap();
>     let b = b.as_any().downcast_ref::<Int32Array>().unwrap();
>     assert_eq!(5, a.value(0));
>     assert_eq!(6, b.value(0));
>     let c = add(&a, &b).unwrap();
>     assert_eq!(5, c.len());
>     assert_eq!(11, c.value(0));
>     assert_eq!(13, c.value(1));
>     assert_eq!(15, c.value(2));
>     assert_eq!(17, c.value(3));
>     assert_eq!(17, c.value(4));
> }
> {code}
> Additionally, the boolean kernels seem to require that both operands have the same offset. This shouldn't be needed, but it seems that the SIMD implementation requires the offset to be a multiple of 8 (bits) so that the operation works correctly on whole bytes. The scalar implementation should be fine with any offset.
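The offset rule described in ARROW-9583 can be illustrated without the arrow crate. In the sketch below, `SlicedInts` is a hypothetical, much-simplified stand-in for an Arrow array (a shared buffer plus `offset` and `len`); it is not the real `ArrayData`. The point it shows is that a kernel which writes its output into a fresh buffer must give the result an offset of 0, whatever the offsets of its operands were.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow array: a shared buffer plus a logical
/// window into it.
#[derive(Debug, Clone)]
pub struct SlicedInts {
    buffer: Arc<Vec<i32>>,
    offset: usize,
    len: usize,
}

impl SlicedInts {
    pub fn from_vec(values: Vec<i32>) -> Self {
        let len = values.len();
        SlicedInts { buffer: Arc::new(values), offset: 0, len }
    }

    /// Zero-copy slice: shares the buffer and only shifts the offset.
    pub fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        SlicedInts {
            buffer: Arc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        }
    }

    pub fn value(&self, i: usize) -> i32 {
        assert!(i < self.len, "index out of bounds");
        self.buffer[self.offset + i]
    }

    pub fn len(&self) -> usize {
        self.len
    }

    pub fn offset(&self) -> usize {
        self.offset
    }
}

/// Element-wise addition. The output buffer holds exactly `len` values, so
/// the result starts at offset 0 rather than inheriting an operand's offset.
pub fn add(a: &SlicedInts, b: &SlicedInts) -> SlicedInts {
    assert_eq!(a.len(), b.len(), "operands must have the same length");
    let values: Vec<i32> = (0..a.len()).map(|i| a.value(i) + b.value(i)).collect();
    SlicedInts::from_vec(values)
}
```

Had `add` copied `a.offset()` into the result, `value(0)` would index past the start of the new five-element buffer, which is the kind of failure the test in the issue exposes.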
[jira] [Updated] (ARROW-9900) [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan
[ https://issues.apache.org/jira/browse/ARROW-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated ARROW-9900: -- Component/s: Rust - DataFusion Rust > [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan > > > Key: ARROW-9900 > URL: https://issues.apache.org/jira/browse/ARROW-9900 > Project: Apache Arrow > Issue Type: Sub-task > Components: Rust, Rust - DataFusion >Reporter: Andrew Lamb >Assignee: Andrew Lamb >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The idea is to continue to simplify the code and improve performance: the > inputs to nodes are often copied and using Box requires unnecessary deep > copies -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-9900) [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan
[ https://issues.apache.org/jira/browse/ARROW-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-9900. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8098 [https://github.com/apache/arrow/pull/8098] > [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan > > > Key: ARROW-9900 > URL: https://issues.apache.org/jira/browse/ARROW-9900 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Andrew Lamb >Assignee: Andrew Lamb >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The idea is to continue to simplify the code and improve performance: the > inputs to nodes are often copied and using Box requires unnecessary deep > copies -- This message was sent by Atlassian Jira (v8.3.4#803005)
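The difference between `Box` and `Arc` that motivates ARROW-9900 can be sketched with a toy plan tree. `Plan` and `with_filter` below are hypothetical and far simpler than DataFusion's `LogicalPlan`; they only show that reusing an `Arc` input bumps a reference count, whereas a `Box` input would have to be deep-copied for each parent.

```rust
use std::sync::Arc;

/// Hypothetical, minimal plan tree whose inputs are reference-counted.
#[derive(Debug)]
pub enum Plan {
    Scan { table: String },
    Filter { predicate: String, input: Arc<Plan> },
}

/// Wraps an existing plan in a filter. The input is shared, not copied:
/// `Arc::clone` only increments a reference count.
pub fn with_filter(input: &Arc<Plan>, predicate: &str) -> Plan {
    Plan::Filter {
        predicate: predicate.to_string(),
        input: Arc::clone(input),
    }
}

/// Returns the input of a filter node, if the plan is one.
pub fn filter_input(plan: &Plan) -> Option<&Arc<Plan>> {
    match plan {
        Plan::Filter { input, .. } => Some(input),
        Plan::Scan { .. } => None,
    }
}
```

Two filters built from the same scan point at the same allocation, so the cost of reusing a subtree does not grow with its size.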
[jira] [Resolved] (ARROW-9892) [Rust] [DataFusion] Add support for concat
[ https://issues.apache.org/jira/browse/ARROW-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-9892. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8090 [https://github.com/apache/arrow/pull/8090] > [Rust] [DataFusion] Add support for concat > -- > > Key: ARROW-9892 > URL: https://issues.apache.org/jira/browse/ARROW-9892 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust, Rust - DataFusion >Reporter: Jorge >Assignee: Jorge >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > So that we can concatenate strings together. > {{pub fn concat(args: Vec<Expr>) -> Expr}}
[jira] [Updated] (ARROW-9902) [Rust] [DataFusion] Add support for array()
[ https://issues.apache.org/jira/browse/ARROW-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge updated ARROW-9902: - Description: `array` is a function that takes an arbitrary number of columns and returns a fixed-size array with their values. This function is notoriously difficult to implement because it receives an arbitrary number of arguments of arbitrary but common types, but it is also useful for e.g. time-series data. was: `array` is a function that takes an arbitrary number of columns and returns a fixed-size list with their values. This function is notoriously difficult to implement because it receives an arbitrary number of arguments of arbitrary but common types, but it is also useful for e.g. time-series data. > [Rust] [DataFusion] Add support for array() > --- > > Key: ARROW-9902 > URL: https://issues.apache.org/jira/browse/ARROW-9902 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust, Rust - DataFusion >Reporter: Jorge >Assignee: Jorge >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > `array` is a function that takes an arbitrary number of columns and returns a fixed-size array with their values. > This function is notoriously difficult to implement because it receives an arbitrary number of arguments of arbitrary but common types, but it is also useful for e.g. time-series data.
[jira] [Assigned] (ARROW-9902) [Rust] [DataFusion] Add support for array()
[ https://issues.apache.org/jira/browse/ARROW-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-9902: Assignee: Jorge (was: Apache Arrow JIRA Bot) > [Rust] [DataFusion] Add support for array() > --- > > Key: ARROW-9902 > URL: https://issues.apache.org/jira/browse/ARROW-9902 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust, Rust - DataFusion >Reporter: Jorge >Assignee: Jorge >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > `array` is a function that takes an arbitrary number of columns and returns a fixed-size list with their values. > This function is notoriously difficult to implement because it receives an arbitrary number of arguments of arbitrary but common types, but it is also useful for e.g. time-series data.
[jira] [Updated] (ARROW-9902) [Rust] [DataFusion] Add support for array()
[ https://issues.apache.org/jira/browse/ARROW-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9902: -- Labels: pull-request-available (was: ) > [Rust] [DataFusion] Add support for array() > --- > > Key: ARROW-9902 > URL: https://issues.apache.org/jira/browse/ARROW-9902 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust, Rust - DataFusion >Reporter: Jorge >Assignee: Jorge >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > `array` is a function that takes an arbitrary number of columns and returns a fixed-size list with their values. > This function is notoriously difficult to implement because it receives an arbitrary number of arguments of arbitrary but common types, but it is also useful for e.g. time-series data.
[jira] [Assigned] (ARROW-9902) [Rust] [DataFusion] Add support for array()
[ https://issues.apache.org/jira/browse/ARROW-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-9902: Assignee: Apache Arrow JIRA Bot (was: Jorge) > [Rust] [DataFusion] Add support for array() > --- > > Key: ARROW-9902 > URL: https://issues.apache.org/jira/browse/ARROW-9902 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust, Rust - DataFusion >Reporter: Jorge >Assignee: Apache Arrow JIRA Bot >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > `array` is a function that takes an arbitrary number of columns and returns a fixed-size list with their values. > This function is notoriously difficult to implement because it receives an arbitrary number of arguments of arbitrary but common types, but it is also useful for e.g. time-series data.
[jira] [Created] (ARROW-9902) [Rust] [DataFusion] Add support for array()
Jorge created ARROW-9902: Summary: [Rust] [DataFusion] Add support for array() Key: ARROW-9902 URL: https://issues.apache.org/jira/browse/ARROW-9902 Project: Apache Arrow Issue Type: Improvement Components: Rust, Rust - DataFusion Reporter: Jorge Assignee: Jorge `array` is a function that takes an arbitrary number of columns and returns a fixed-size list with their values. This function is notoriously difficult to implement because it receives an arbitrary number of arguments of arbitrary but common types, but it is also useful for e.g. time-series data.
[jira] [Updated] (ARROW-9868) [C++] Provide utility for copying files between filesystems
[ https://issues.apache.org/jira/browse/ARROW-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9868: -- Labels: filesystem pull-request-available s3 (was: filesystem s3) > [C++] Provide utility for copying files between filesystems > --- > > Key: ARROW-9868 > URL: https://issues.apache.org/jira/browse/ARROW-9868 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Neal Richardson >Assignee: Ben Kietzman >Priority: Major > Labels: filesystem, pull-request-available, s3 > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > {{CopyStream}} in arrow/filesystem/util_internal.h does this, but we should > expose it, multithread it (can read in one thread while the other thread > writes), and further see if there are filesystem-specific optimizations (e.g. > S3 multipart uploading/downloading). We may also want a version that takes a > FileSelector or vector of paths and parallelizes the operations on them. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-9891) [Rust] [DataFusion] Make math functions support f32
[ https://issues.apache.org/jira/browse/ARROW-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-9891. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8089 [https://github.com/apache/arrow/pull/8089] > [Rust] [DataFusion] Make math functions support f32 > --- > > Key: ARROW-9891 > URL: https://issues.apache.org/jira/browse/ARROW-9891 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust, Rust - DataFusion >Reporter: Jorge >Assignee: Jorge >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Given a math function `g`, we compute g(f32) using g(cast(f32 AS f64)). > The goal of this issue is to make the operation be cast(g(f32) AS f64) > instead. > Since computations on f32 are faster than on f64, this is a simple > optimization. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
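The two evaluation orders named in ARROW-9891 can be written out for one concrete function. The sketch below takes `sqrt` as the math function `g`; the function names are hypothetical and plain slices stand in for Arrow arrays. The first version widens to f64 before applying `g`, the second applies `g` in f32 and widens only the result.

```rust
/// g(cast(f32 AS f64)): widen every value first, then compute in f64.
pub fn sqrt_via_f64(values: &[f32]) -> Vec<f64> {
    values.iter().map(|&v| f64::from(v).sqrt()).collect()
}

/// cast(g(f32) AS f64): compute in f32, then widen the result.
pub fn sqrt_in_f32(values: &[f32]) -> Vec<f64> {
    values.iter().map(|&v| f64::from(v.sqrt())).collect()
}
```

The two orders agree exactly on inputs such as 4.0 whose square root is representable in f32. For other inputs the f32 path carries only single precision, so the results can differ in the low-order digits.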
[jira] [Resolved] (ARROW-9845) [Rust] [Parquet] serde_json is only used in tests but isn't in dev-dependencies
[ https://issues.apache.org/jira/browse/ARROW-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-9845. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8087 [https://github.com/apache/arrow/pull/8087] > [Rust] [Parquet] serde_json is only used in tests but isn't in > dev-dependencies > --- > > Key: ARROW-9845 > URL: https://issues.apache.org/jira/browse/ARROW-9845 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust >Affects Versions: 1.0.0 >Reporter: Benjamin Kimock >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0, 1.0.1 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > This is resolved by moving the dependency out of dependencies and into dev-dependencies.
[jira] [Resolved] (ARROW-9873) [C++][Compute] Improve mode kernel for intergers within limited value range
[ https://issues.apache.org/jira/browse/ARROW-9873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-9873. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8091 [https://github.com/apache/arrow/pull/8091] > [C++][Compute] Improve mode kernel for intergers within limited value range > --- > > Key: ARROW-9873 > URL: https://issues.apache.org/jira/browse/ARROW-9873 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Yibo Cai >Assignee: Yibo Cai >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Attachments: mode-range-skylake.png > > Time Spent: 1.5h > Remaining Estimate: 0h > > It's possible to improve mode kernel performance for integers within a limited value range by using a value-indexed array instead of a general hash table. > A similar trick is used in the sorting kernel (ARROW-1571).
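The value-indexed counting trick from ARROW-9873 can be sketched briefly. The issue is about Arrow's C++ mode kernel; the code below is a Rust illustration of the general technique only. The function name, the explicit `min`/`max` parameters, and the tie-breaking rule (smallest value wins) are assumptions of this sketch, not behaviour taken from the pull request.

```rust
/// Returns `(mode, count)` for values known to lie in `[min, max]`, or `None`
/// for empty input or an empty range. Ties go to the smallest value (an
/// assumption of this sketch).
pub fn mode_in_range(values: &[i32], min: i32, max: i32) -> Option<(i32, u64)> {
    if values.is_empty() || min > max {
        return None;
    }
    // One counter per possible value: the value itself (minus `min`) is the
    // index, so no hashing is needed.
    let span = (i64::from(max) - i64::from(min) + 1) as usize;
    let mut counts = vec![0u64; span];
    for &v in values {
        assert!(v >= min && v <= max, "value outside the declared range");
        counts[(i64::from(v) - i64::from(min)) as usize] += 1;
    }
    // Scan the counters once; strict `>` keeps the first (smallest) value on ties.
    let mut best = 0usize;
    for (i, &c) in counts.iter().enumerate() {
        if c > counts[best] {
            best = i;
        }
    }
    Some(((i64::from(min) + best as i64) as i32, counts[best]))
}
```

The counter array costs memory proportional to `max - min + 1`, which is why the approach only pays off when the value range is small relative to the number of values.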
[jira] [Updated] (ARROW-9901) [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading
[ https://issues.apache.org/jira/browse/ARROW-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9901: -- Labels: pull-request-available (was: ) > [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading > -- > > Key: ARROW-9901 > URL: https://issues.apache.org/jira/browse/ARROW-9901 > Project: Apache Arrow > Issue Type: Sub-task > Components: C++ >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We should write tests where definition and repetition levels are explicitly written out for a particular Parquet schema, then read as an Arrow column. > Sketch here: > https://gist.github.com/pitrou/282dd790cac0eb2c1b59e8c9ab1941d8
[jira] [Assigned] (ARROW-9901) [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading
[ https://issues.apache.org/jira/browse/ARROW-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-9901: Assignee: Antoine Pitrou (was: Apache Arrow JIRA Bot) > [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading > -- > > Key: ARROW-9901 > URL: https://issues.apache.org/jira/browse/ARROW-9901 > Project: Apache Arrow > Issue Type: Sub-task > Components: C++ >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We should write tests where definition and repetition levels are explicitly written out for a particular Parquet schema, then read as an Arrow column. > Sketch here: > https://gist.github.com/pitrou/282dd790cac0eb2c1b59e8c9ab1941d8
[jira] [Assigned] (ARROW-9901) [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading
[ https://issues.apache.org/jira/browse/ARROW-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-9901: Assignee: Apache Arrow JIRA Bot (was: Antoine Pitrou) > [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading > -- > > Key: ARROW-9901 > URL: https://issues.apache.org/jira/browse/ARROW-9901 > Project: Apache Arrow > Issue Type: Sub-task > Components: C++ >Reporter: Antoine Pitrou >Assignee: Apache Arrow JIRA Bot >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We should write tests where definition and repetition levels are explicitly written out for a particular Parquet schema, then read as an Arrow column. > Sketch here: > https://gist.github.com/pitrou/282dd790cac0eb2c1b59e8c9ab1941d8
[jira] [Created] (ARROW-9901) [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading
Antoine Pitrou created ARROW-9901: - Summary: [C++] Add hand-crafted Parquet to Arrow reconstruction test for nested reading Key: ARROW-9901 URL: https://issues.apache.org/jira/browse/ARROW-9901 Project: Apache Arrow Issue Type: Sub-task Components: C++ Reporter: Antoine Pitrou Assignee: Antoine Pitrou We should write tests where definition and repetition levels are explicitly written out for a particular Parquet schema, then read as an Arrow column. Sketch here: https://gist.github.com/pitrou/282dd790cac0eb2c1b59e8c9ab1941d8
[jira] [Updated] (ARROW-9900) [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan
[ https://issues.apache.org/jira/browse/ARROW-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9900: -- Labels: pull-request-available (was: ) > [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan > > > Key: ARROW-9900 > URL: https://issues.apache.org/jira/browse/ARROW-9900 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Andrew Lamb >Assignee: Andrew Lamb >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The idea is to continue to simplify the code and improve performance: the > inputs to nodes are often copied and using Box requires unnecessary deep > copies -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9900) [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan
Andrew Lamb created ARROW-9900: -- Summary: [Rust][DataFusion] Use Arc<> instead of Box<> in LogicalPlan Key: ARROW-9900 URL: https://issues.apache.org/jira/browse/ARROW-9900 Project: Apache Arrow Issue Type: Sub-task Reporter: Andrew Lamb Assignee: Andrew Lamb The idea is to continue to simplify the code and improve performance: the inputs to nodes are often copied and using Box requires unnecessary deep copies -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-9899) [Rust] [DataFusion] Switch from Box<Schema> --> SchemaRef (Arc<Schema>) to be consistent with the rest of Arrow
[ https://issues.apache.org/jira/browse/ARROW-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9899: -- Labels: pull-request-available (was: ) > [Rust] [DataFusion] Switch from Box<Schema> --> SchemaRef (Arc<Schema>) to be consistent with the rest of Arrow > --- > > Key: ARROW-9899 > URL: https://issues.apache.org/jira/browse/ARROW-9899 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Andrew Lamb >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The idea is to use SchemaRef (which is an Arc<Schema>) instead of Box<Schema> inside DataFusion to be consistent with the rest of the Arrow implementation, avoid so many copies, and make the code simpler.
[jira] [Created] (ARROW-9899) [Rust] [DataFusion] Switch from Box<Schema> --> SchemaRef (Arc<Schema>) to be consistent with the rest of Arrow
Andrew Lamb created ARROW-9899: -- Summary: [Rust] [DataFusion] Switch from Box<Schema> --> SchemaRef (Arc<Schema>) to be consistent with the rest of Arrow Key: ARROW-9899 URL: https://issues.apache.org/jira/browse/ARROW-9899 Project: Apache Arrow Issue Type: Sub-task Reporter: Andrew Lamb The idea is to use SchemaRef (which is an Arc<Schema>) instead of Box<Schema> inside DataFusion to be consistent with the rest of the Arrow implementation, avoid so many copies, and make the code simpler.
[jira] [Resolved] (ARROW-9605) [C++] Optimize performance for aggregate min/max compute kernels
[ https://issues.apache.org/jira/browse/ARROW-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-9605. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 7871 [https://github.com/apache/arrow/pull/7871] > [C++] Optimize performance for aggregate min/max compute kernels > > > Key: ARROW-9605 > URL: https://issues.apache.org/jira/browse/ARROW-9605 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Frank Du >Assignee: Frank Du >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 4.5h > Remaining Estimate: 0h > > # Use BitBlockCounter to speed up performance for typical nullable data > with 0.01% nulls. > # Enable AVX compiler auto-vectorization for the no-nulls case on int types. > Float/Double use fmin/fmax to handle NaN, which the compiler can't vectorize. -- This message was sent by Atlassian Jira (v8.3.4#803005)
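The NaN remark can be illustrated outside of C++. The sketch below is written in Rust, not the kernel code from the pull request, and all three function names are hypothetical. It relies on the documented behavior of `f64::min`, which, like C's fmin, returns the other operand when exactly one argument is NaN.

```rust
// Integer minimum: a plain comparison reduction with no special cases.
fn min_i32(values: &[i32]) -> Option<i32> {
    values.iter().copied().min()
}

// Float minimum with fmin-like semantics: f64::min returns the non-NaN
// operand when exactly one argument is NaN, so NaN inputs are skipped.
// Returns NaN for an empty slice or a slice that contains only NaN.
fn min_f64_skip_nan(values: &[f64]) -> f64 {
    values.iter().copied().fold(f64::NAN, f64::min)
}

// Naive float minimum using `<`: every comparison against NaN is false,
// so a leading NaN is never replaced. Panics on an empty slice.
fn min_f64_naive(values: &[f64]) -> f64 {
    let mut acc = values[0];
    for &v in &values[1..] {
        if v < acc {
            acc = v;
        }
    }
    acc
}

fn main() {
    let data = [f64::NAN, 3.0, 1.5];
    println!("int min:        {:?}", min_i32(&[3, 1, 2]));
    println!("fmin-style min: {}", min_f64_skip_nan(&data));
    println!("naive min:      {}", min_f64_naive(&data));
}
```

The extra NaN handling in the float path is what distinguishes it from the integer reduction, which is the case the issue says is left to the compiler to vectorize.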
[jira] [Resolved] (ARROW-9794) [C++] Add functionality to cpu_info to discriminate between Intel vs AMD x86
[ https://issues.apache.org/jira/browse/ARROW-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-9794. --- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8093 [https://github.com/apache/arrow/pull/8093] > [C++] Add functionality to cpu_info to discriminate between Intel vs AMD x86 > > > Key: ARROW-9794 > URL: https://issues.apache.org/jira/browse/ARROW-9794 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Micah Kornfield >Assignee: Frank Du >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > This is needed to do runtime dispatches for places where pext/pdep can be > used. These perform poorly on AMD. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-9898) [C++][Gandiva] Error handling in castINT fails in some environments
[ https://issues.apache.org/jira/browse/ARROW-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-9898: Assignee: Projjal Chanda (was: Apache Arrow JIRA Bot) > [C++][Gandiva] Error handling in castINT fails in some environments > -- > > Key: ARROW-9898 > URL: https://issues.apache.org/jira/browse/ARROW-9898 > Project: Apache Arrow > Issue Type: Bug >Reporter: Projjal Chanda >Assignee: Projjal Chanda >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In some environments, the error path in castINT leads to a segfault. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-9898) [C++][Gandiva] Error handling in castINT fails in some environments
[ https://issues.apache.org/jira/browse/ARROW-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-9898: Assignee: Apache Arrow JIRA Bot (was: Projjal Chanda) > [C++][Gandiva] Error handling in castINT fails in some environments > -- > > Key: ARROW-9898 > URL: https://issues.apache.org/jira/browse/ARROW-9898 > Project: Apache Arrow > Issue Type: Bug >Reporter: Projjal Chanda >Assignee: Apache Arrow JIRA Bot >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In some environments, the error path in castINT leads to a segfault. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-9898) [C++][Gandiva] Error handling in castINT fails in some environments
[ https://issues.apache.org/jira/browse/ARROW-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9898: -- Labels: pull-request-available (was: ) > [C++][Gandiva] Error handling in castINT fails in some environments > -- > > Key: ARROW-9898 > URL: https://issues.apache.org/jira/browse/ARROW-9898 > Project: Apache Arrow > Issue Type: Bug >Reporter: Projjal Chanda >Assignee: Projjal Chanda >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In some environments, the error path in castINT leads to a segfault. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9898) [C++][Gandiva] Error handling in castINT fails in some environments
Projjal Chanda created ARROW-9898: - Summary: [C++][Gandiva] Error handling in castINT fails in some environments Key: ARROW-9898 URL: https://issues.apache.org/jira/browse/ARROW-9898 Project: Apache Arrow Issue Type: Bug Reporter: Projjal Chanda Assignee: Projjal Chanda In some environments, the error path in castINT leads to a segfault. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-9897) [C++][Gandiva] Add to_date() function from pattern
[ https://issues.apache.org/jira/browse/ARROW-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9897: -- Labels: pull-request-available (was: ) > [C++][Gandiva] Add to_date() function from pattern > -- > > Key: ARROW-9897 > URL: https://issues.apache.org/jira/browse/ARROW-9897 > Project: Apache Arrow > Issue Type: Bug >Reporter: Projjal Chanda >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Signature: date64 to_date(utf8, utf8) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-7226) [JSON][Python] Json loader fails on example in documentation.
[ https://issues.apache.org/jira/browse/ARROW-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche resolved ARROW-7226. -- Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8055 [https://github.com/apache/arrow/pull/8055] > [JSON][Python] Json loader fails on example in documentation. > - > > Key: ARROW-7226 > URL: https://issues.apache.org/jira/browse/ARROW-7226 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Rinke Hoekstra >Assignee: Andrew Wieteska >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > I was just trying this with the example found in the pyarrow docs at > [http://arrow.apache.org/docs/python/json.html] > The documented example does not work. Is this related to this issue, or is it > another matter? > It says to load the following JSON file: > {{{"a": [1, 2], "b": {"c": true, "d": "1991-02-03"}}}} > {{{"a": [3, 4, 5], "b": {"c": false, "d": "2019-04-01"}}}} > I fixed this to make it valid JSON (It is valid [JSON > Lines|[http://jsonlines.org/]], but that's another issue): > {{[{"a": [1, 2], "b": {"c": true, "d": "1991-02-03"}},}} > {{{"a": [3, 4, 5], "b": {"c": false, "d": "2019-04-01"}}]}} > Then reading the JSON from a file called `my_data.json`: > {{from pyarrow import json}} > {{table = json.read_json("my_data.json")}} > Gives the following error: > {code:java} > ---}} > ArrowInvalid Traceback (most recent call last) > in () > 1 from pyarrow import json > > 2 table = json.read_json('test.json') > ~/.local/share/virtualenvs/parquet-ifRxINoC/lib/python3.7/site-packages/pyarrow/_json.pyx > in pyarrow._json.read_json() > ~/.local/share/virtualenvs/parquet-ifRxINoC/lib/python3.7/site-packages/pyarrow/error.pxi > in pyarrow.lib.check_status() > ArrowInvalid: JSON parse error: A column changed from object to array > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9897) [C++][Gandiva] Add to_date() function from pattern
Projjal Chanda created ARROW-9897: - Summary: [C++][Gandiva] Add to_date() function from pattern Key: ARROW-9897 URL: https://issues.apache.org/jira/browse/ARROW-9897 Project: Apache Arrow Issue Type: Bug Reporter: Projjal Chanda Signature: date64 to_date(utf8, utf8) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-9858) [C++][Python][Docs] Expand user guide for FileSystem
[ https://issues.apache.org/jira/browse/ARROW-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-9858: - Summary: [C++][Python][Docs] Expand user guide for FileSystem (was: [C++][Python][Docs] User guide for S3FileSystem) > [C++][Python][Docs] Expand user guide for FileSystem > > > Key: ARROW-9858 > URL: https://issues.apache.org/jira/browse/ARROW-9858 > Project: Apache Arrow > Issue Type: New Feature > Components: C++, Documentation, Python >Reporter: Neal Richardson >Assignee: Joris Van den Bossche >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > https://arrow.apache.org/docs/python/filesystems.html is pretty thin > https://arrow.apache.org/docs/python/api/filesystems.html doesn't mention S3 > and in general there are some tricks to getting FileSystemFromUri to work -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-9858) [C++][Python][Docs] User guide for S3FileSystem
[ https://issues.apache.org/jira/browse/ARROW-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-9858. --- Resolution: Fixed Issue resolved by pull request 8065 [https://github.com/apache/arrow/pull/8065] > [C++][Python][Docs] User guide for S3FileSystem > --- > > Key: ARROW-9858 > URL: https://issues.apache.org/jira/browse/ARROW-9858 > Project: Apache Arrow > Issue Type: New Feature > Components: C++, Documentation, Python >Reporter: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > https://arrow.apache.org/docs/python/filesystems.html is pretty thin > https://arrow.apache.org/docs/python/api/filesystems.html doesn't mention S3 > and in general there are some tricks to getting FileSystemFromUri to work -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-9858) [C++][Python][Docs] User guide for S3FileSystem
[ https://issues.apache.org/jira/browse/ARROW-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-9858: - Assignee: Joris Van den Bossche > [C++][Python][Docs] User guide for S3FileSystem > --- > > Key: ARROW-9858 > URL: https://issues.apache.org/jira/browse/ARROW-9858 > Project: Apache Arrow > Issue Type: New Feature > Components: C++, Documentation, Python >Reporter: Neal Richardson >Assignee: Joris Van den Bossche >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 3h > Remaining Estimate: 0h > > https://arrow.apache.org/docs/python/filesystems.html is pretty thin > https://arrow.apache.org/docs/python/api/filesystems.html doesn't mention S3 > and in general there are some tricks to getting FileSystemFromUri to work -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-9821) [Rust][DataFusion] User Defined PlanNode / Operator API
[ https://issues.apache.org/jira/browse/ARROW-9821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189105#comment-17189105 ] Andrew Lamb commented on ARROW-9821: [~emkornfi...@gmail.com] no problem -- I will do so > [Rust][DataFusion] User Defined PlanNode / Operator API > --- > > Key: ARROW-9821 > URL: https://issues.apache.org/jira/browse/ARROW-9821 > Project: Apache Arrow > Issue Type: New Feature > Components: Rust, Rust - DataFusion >Reporter: Andrew Lamb >Assignee: Andrew Lamb >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > The basic goal is to allow users to implement their own PlanNodes. I will > provide a google doc opened for comments shortly. > Proposal: > https://docs.google.com/document/d/1IHCGkCuUvnE9BavkykPULn6Ugxgqc1JShT4nz1vMi7g/edit# > See also mailing list discussion here: > https://lists.apache.org/thread.html/rf8ae7d1147e93e3f6172bc2e4fa50a38abcb35f046cc5830e09da6cc%40%3Cdev.arrow.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-9896) [Go] Rename ipc tool from 'arrow-json-intergration-test' to 'arrow-json'
[ https://issues.apache.org/jira/browse/ARROW-9896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189099#comment-17189099 ] FredGan commented on ARROW-9896: [~sbinet] Hi, I tried to solve it but failed. It seems the name "arrow-json-intergration-test" is used in many languages, so I gave up. > [Go] Rename ipc tool from 'arrow-json-intergration-test' to 'arrow-json' > > > Key: ARROW-9896 > URL: https://issues.apache.org/jira/browse/ARROW-9896 > Project: Apache Arrow > Issue Type: Improvement > Components: Go >Affects Versions: 1.0.0 >Reporter: FredGan >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > There are five tools in the go/arrow/ipc/cmd directory. Four of them are > named the same as in the code, but 'arrow-json-intergration-test' is > different: the code names it "arrow-json". > So it may be better to rename this directory to 'arrow-json'. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-9896) [Go] Rename ipc tool from 'arrow-json-intergration-test' to 'arrow-json'
[ https://issues.apache.org/jira/browse/ARROW-9896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9896: -- Labels: pull-request-available (was: ) > [Go] Rename ipc tool from 'arrow-json-intergration-test' to 'arrow-json' > > > Key: ARROW-9896 > URL: https://issues.apache.org/jira/browse/ARROW-9896 > Project: Apache Arrow > Issue Type: Improvement > Components: Go >Affects Versions: 1.0.0 >Reporter: FredGan >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > There are five tools in the go/arrow/ipc/cmd directory. Four of them are > named the same as in the code, but 'arrow-json-intergration-test' is > different: the code names it "arrow-json". > So it may be better to rename this directory to 'arrow-json'. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9896) [Go] Rename ipc tool from 'arrow-json-intergration-test' to 'arrow-json'
FredGan created ARROW-9896: -- Summary: [Go] Rename ipc tool from 'arrow-json-intergration-test' to 'arrow-json' Key: ARROW-9896 URL: https://issues.apache.org/jira/browse/ARROW-9896 Project: Apache Arrow Issue Type: Improvement Components: Go Affects Versions: 1.0.0 Reporter: FredGan There are five tools in the go/arrow/ipc/cmd directory. Four of them are named the same as in the code, but 'arrow-json-intergration-test' is different: the code names it "arrow-json". So it may be better to rename this directory to 'arrow-json'. -- This message was sent by Atlassian Jira (v8.3.4#803005)