[jira] [Created] (ARROW-9935) New filesystem API unable to read empty S3 folders
Weston Pace created ARROW-9935: -- Summary: New filesystem API unable to read empty S3 folders Key: ARROW-9935 URL: https://issues.apache.org/jira/browse/ARROW-9935 Project: Apache Arrow Issue Type: Bug Affects Versions: 1.0.0 Reporter: Weston Pace Attachments: arrow_453.py, arrow_9935.py When an empty "folder" is created in S3 using the bucket explorer in the management console, the console creates a special empty file with the same name as the folder. (More details here: https://docs.aws.amazon.com/AmazonS3/latest/user-guide/using-folders.html) If parquet files are later loaded into one of these directories (with or without partitioning subdirectories), the dataset cannot be read by the new dataset API. The underlying s3fs `find` method returns a "file" object with size 0 that pyarrow then attempts to read. Since this file doesn't truly exist, a FileNotFoundError is thrown. Would it be safe to simply ignore all files with size 0? As a workaround I can wrap s3fs' `find` method and strip out these objects with size 0 myself. I've attached a script showing the issue and a workaround. It uses a public bucket that I'll leave up for a few months. -- This message was sent by Atlassian Jira (v8.3.4#803005)
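The workaround described in this report could look roughly like the sketch below. It is not the attached script. It assumes the listing has the shape that s3fs' `find(..., detail=True)` returns, a dict mapping each path to an info dict with a `size` key. The helper name `drop_zero_size_entries` and the sample paths are hypothetical, and no real bucket is contacted.

```python
def drop_zero_size_entries(listing):
    """Return a copy of a path -> info listing without zero-byte entries.

    Entries that record no size at all are kept, since only objects known
    to be empty are treated as console-created folder markers here.
    """
    return {
        path: info
        for path, info in listing.items()
        if info.get("size") != 0
    }


# Hypothetical listing: one zero-byte folder marker plus two data files.
listing = {
    "bucket/dataset": {"type": "file", "size": 0},
    "bucket/dataset/part-0.parquet": {"type": "file", "size": 1024},
    "bucket/dataset/year=2020/part-1.parquet": {"type": "file", "size": 2048},
}

readable = drop_zero_size_entries(listing)
print(sorted(readable))
```

Filtering on size alone also drops any zero-byte file that really exists, which is the open question the report raises.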
[jira] [Created] (ARROW-9934) [Rust] Shape and stride check in tensor
Fernando Herrera created ARROW-9934: --- Summary: [Rust] Shape and stride check in tensor Key: ARROW-9934 URL: https://issues.apache.org/jira/browse/ARROW-9934 Project: Apache Arrow Issue Type: Improvement Components: Rust Reporter: Fernando Herrera When creating a tensor there is no check for the supplied shape and stride. There should be a check before creating the tensor object.
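The issue does not say which checks are intended. As one possible reading, the Python sketch below validates that shape and strides have the same number of dimensions and that the last addressable element fits in the buffer. The function name `validate_tensor_layout`, its parameters, and the restriction to non-negative byte strides are assumptions made for illustration. They are not taken from the Rust implementation.

```python
def validate_tensor_layout(shape, strides, itemsize, buffer_size):
    """Raise ValueError if shape/strides cannot describe data in the buffer.

    Strides and sizes are in bytes. Negative strides are rejected because
    this sketch does not model them.
    """
    if len(shape) != len(strides):
        raise ValueError("shape and strides must have the same number of dimensions")
    if any(dim < 0 for dim in shape):
        raise ValueError("shape dimensions must be non-negative")
    if any(stride < 0 for stride in strides):
        raise ValueError("negative strides are not handled in this sketch")
    if any(dim == 0 for dim in shape):
        return  # an empty tensor addresses no bytes

    # Byte offset of the element with the largest index in every dimension.
    last_offset = sum((dim - 1) * stride for dim, stride in zip(shape, strides))
    if last_offset + itemsize > buffer_size:
        raise ValueError("shape and strides address bytes outside the buffer")


# A 2x3 tensor of 4-byte values in row-major order fits in 24 bytes.
validate_tensor_layout(shape=(2, 3), strides=(12, 4), itemsize=4, buffer_size=24)
```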
[jira] [Created] (ARROW-9933) [Developer] Add drone as a CI provider for crossbow
Uwe Korn created ARROW-9933: --- Summary: [Developer] Add drone as a CI provider for crossbow Key: ARROW-9933 URL: https://issues.apache.org/jira/browse/ARROW-9933 Project: Apache Arrow Issue Type: Improvement Components: Developer Tools Reporter: Uwe Korn Assignee: Uwe Korn
[jira] [Created] (ARROW-9932) R package fails to install on Ubuntu 14
Ofek Shilon created ARROW-9932: -- Summary: R package fails to install on Ubuntu 14 Key: ARROW-9932 URL: https://issues.apache.org/jira/browse/ARROW-9932 Project: Apache Arrow Issue Type: Bug Components: R Affects Versions: 1.0.1 Environment: R version 3.4.0 (2015-04-16) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 14.04.5 LTS Reporter: Ofek Shilon
1. From R (3.4) prompt, we run {{> install.packages("arrow")}} and it seems to succeed.
2. Next we run: {{> arrow::install_arrow()}}
This is the full output:
{{Installing package into '/opt/R-3.4.0.mkl/library'}}
{{(as 'lib' is unspecified)}}
{{trying URL 'https://cloud.r-project.org/src/contrib/arrow_1.0.1.tar.gz'}}
{{Content type 'application/x-gzip' length 274865 bytes (268 KB)}}
{{==}}
{{downloaded 268 KB}}
{{installing *source* package 'arrow' ...}}
{{** package 'arrow' successfully unpacked and MD5 sums checked}}
{{*** No C++ binaries found for ubuntu-14.04}}
{{*** Successfully retrieved C++ source}}
{{*** Building C++ libraries}}
{{ cmake}}
{{Error in dQuote(env_var_list, FALSE) : unused argument (FALSE)}}
{{Calls: build_libarrow -> paste}}
{{Execution halted}}
{{- NOTE ---}}
{{After installation, please run arrow::install_arrow()}}
{{for help installing required runtime libraries}}
{{-}}
{{** libs}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c array.cpp -o array.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c array_from_vector.cpp -o array_from_vector.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c array_to_vector.cpp -o array_to_vector.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c arraydata.cpp -o arraydata.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c arrowExports.cpp -o arrowExports.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c buffer.cpp -o buffer.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c chunkedarray.cpp -o chunkedarray.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c compression.cpp -o compression.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c compute.cpp -o compute.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c csv.cpp -o csv.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c dataset.cpp -o dataset.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c datatype.cpp -o datatype.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c expression.cpp -o expression.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c feather.cpp -o feather.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c field.cpp -o field.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c filesystem.cpp -o filesystem.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c imports.cpp -o imports.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG -I"/opt/R-3.4.0.mkl/library/Rcpp/include" -I/usr/local/include -fpic -march=x86-64 -O3 -c io.cpp -o io.o}}
{{g++ -std=gnu++0x -I/opt/R-3.4.0.mkl/lib64/R/include -DNDEBUG
[GitHub] [arrow-testing] pitrou merged pull request #46: ARROW-9931: Add IPC fuzz file
pitrou merged pull request #46: URL: https://github.com/apache/arrow-testing/pull/46 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-testing] pitrou opened a new pull request #46: ARROW-9931: Add IPC fuzz file
pitrou opened a new pull request #46: URL: https://github.com/apache/arrow-testing/pull/46
[jira] [Created] (ARROW-9931) [C++] Fix undefined behaviour on invalid IPC (OSS-Fuzz)
Antoine Pitrou created ARROW-9931: - Summary: [C++] Fix undefined behaviour on invalid IPC (OSS-Fuzz) Key: ARROW-9931 URL: https://issues.apache.org/jira/browse/ARROW-9931 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Antoine Pitrou Assignee: Antoine Pitrou
[jira] [Created] (ARROW-9930) [C++] Fix undefined behaviour on invalid IPC (OSS-Fuzz)
Antoine Pitrou created ARROW-9930: - Summary: [C++] Fix undefined behaviour on invalid IPC (OSS-Fuzz) Key: ARROW-9930 URL: https://issues.apache.org/jira/browse/ARROW-9930 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Antoine Pitrou Assignee: Antoine Pitrou
[jira] [Created] (ARROW-9929) [Developer] Autotune cmake-format
Uwe Korn created ARROW-9929: --- Summary: [Developer] Autotune cmake-format Key: ARROW-9929 URL: https://issues.apache.org/jira/browse/ARROW-9929 Project: Apache Arrow Issue Type: Improvement Components: Developer Tools Reporter: Uwe Korn Assignee: Uwe Korn
[jira] [Created] (ARROW-9928) [C++] Speed up integer parsing slightly
Antoine Pitrou created ARROW-9928: - Summary: [C++] Speed up integer parsing slightly Key: ARROW-9928 URL: https://issues.apache.org/jira/browse/ARROW-9928 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Antoine Pitrou By exiting early out of the parsing routine when the input is exhausted, we can save a little bit of processing time.
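To illustrate the idea, here is a minimal Python sketch of a fixed-step digit parser that returns as soon as the input is exhausted instead of running every step. It is not taken from the Arrow C++ code base. The function names `parse_uint8` and `_digit` and the choice of an 8-bit target are assumptions made for illustration, and Python itself would not show the speed-up.

```python
def _digit(ch):
    """Return the value of an ASCII decimal digit, or None if ch is not one."""
    value = ord(ch) - ord("0")
    return value if 0 <= value <= 9 else None


def parse_uint8(text):
    """Parse up to three decimal digits into 0..255; return None on failure."""
    length = len(text)
    if length == 0 or length > 3:
        return None

    result = 0
    for position in range(3):  # fixed upper bound, like an unrolled loop
        if position == length:
            # Input exhausted: skip the remaining steps and the range check,
            # since at most two digits cannot exceed 99.
            return result
        digit = _digit(text[position])
        if digit is None:
            return None
        result = result * 10 + digit
    # All three digits were consumed, so the value may exceed 255.
    return result if result <= 255 else None


print(parse_uint8("42"), parse_uint8("256"))
```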
[jira] [Created] (ARROW-9927) Add dplyr group_by, summarise and mutate support in function open_dataset R arrow package
Pal created ARROW-9927: -- Summary: Add dplyr group_by, summarise and mutate support in function open_dataset R arrow package Key: ARROW-9927 URL: https://issues.apache.org/jira/browse/ARROW-9927 Project: Apache Arrow Issue Type: Bug Reporter: Pal