[jira] [Created] (ARROW-7399) gandiva does not pick runtime cpu features
Pindikura Ravindra created ARROW-7399: - Summary: gandiva does not pick runtime cpu features Key: ARROW-7399 URL: https://issues.apache.org/jira/browse/ARROW-7399 Project: Apache Arrow Issue Type: Task Components: C++ - Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra [~yibo] reported that the IR code generated by gandiva uses 128-bit registers even though the test machine has a CPU with the avx2 feature. I was able to reproduce the same on a GCE host. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-7378) loop vectorization broken in gandiva
Pindikura Ravindra created ARROW-7378: - Summary: loop vectorization broken in gandiva Key: ARROW-7378 URL: https://issues.apache.org/jira/browse/ARROW-7378 Project: Apache Arrow Issue Type: Task Components: C++ - Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra [~yibo] pointed out in the mailing list that this is broken. I found that there is something in the last change to llvm_generator.cc that broke the auto vectorization. [https://github.com/apache/arrow/commit/165b02d2358e5c8c2039cf626ac7326d82e3ca90] If I undo this one patch, I can see the vectorization happen with Yibo Cai's test. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6491) [Java] fix master build failure caused by ErrorProne
Pindikura Ravindra created ARROW-6491: - Summary: [Java] fix master build failure caused by ErrorProne Key: ARROW-6491 URL: https://issues.apache.org/jira/browse/ARROW-6491 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Ji Liu -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (ARROW-6490) [Java] log error for leak in allocator close
Pindikura Ravindra created ARROW-6490: - Summary: [Java] log error for leak in allocator close Key: ARROW-6490 URL: https://issues.apache.org/jira/browse/ARROW-6490 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra Currently, the allocator close throws an exception that includes some details in case of memory leaks. However, if there is a hierarchy of allocators and they are all closed at different times, it's hard to find the cause of the original leak. If we also log a message when the leak occurs, it will be easier to correlate these. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (ARROW-6383) [Java] report outstanding child allocators on parent allocator close
Pindikura Ravindra created ARROW-6383: - Summary: [Java] report outstanding child allocators on parent allocator close Key: ARROW-6383 URL: https://issues.apache.org/jira/browse/ARROW-6383 Project: Apache Arrow Issue Type: Task Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra When a parent allocator is closed, we should report the child allocators if any are outstanding. This helps in debugging memory leaks: it tells whether the leak happened in the parent or the child.
[jira] [Created] (ARROW-6211) [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface
Pindikura Ravindra created ARROW-6211: - Summary: [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface Key: ARROW-6211 URL: https://issues.apache.org/jira/browse/ARROW-6211 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra This is a follow-up from [https://github.com/apache/arrow/pull/4933] public interface VectorVisitor {..} In ValueVector : public OUT accept(VectorVisitor visitor, IN value) throws EX;
[jira] [Created] (ARROW-6210) [Java] remove equals API from ValueVector
Pindikura Ravindra created ARROW-6210: - Summary: [Java] remove equals API from ValueVector Key: ARROW-6210 URL: https://issues.apache.org/jira/browse/ARROW-6210 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra This is a follow-up from [https://github.com/apache/arrow/pull/4933] The callers should be fixed to use the RangeEquals API instead. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (ARROW-6116) [C++][Gandiva] Fix bug in TimedTestFilterAdd2
Pindikura Ravindra created ARROW-6116: - Summary: [C++][Gandiva] Fix bug in TimedTestFilterAdd2 Key: ARROW-6116 URL: https://issues.apache.org/jira/browse/ARROW-6116 Project: Apache Arrow Issue Type: Bug Components: C++ - Gandiva Reporter: Pindikura Ravindra The test should evaluate f0 + f1 < f2; instead, it evaluates f1 + f2 < f2. This was reported via a PR [https://github.com/apache/arrow/pull/4976]
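To make the difference between the two conditions concrete, here is a minimal sketch in plain C++. It does not use Gandiva's API; `FilterAdd2` and `FilterAdd2Buggy` are hypothetical names, and the returned row indices only loosely stand in for a selection vector.

```cpp
#include <cstddef>
#include <vector>

// Intended filter: keep row i when f0[i] + f1[i] < f2[i].
// All three columns are assumed to have the same length.
std::vector<std::size_t> FilterAdd2(const std::vector<int>& f0,
                                    const std::vector<int>& f1,
                                    const std::vector<int>& f2) {
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < f0.size(); ++i) {
    if (f0[i] + f1[i] < f2[i]) selected.push_back(i);
  }
  return selected;
}

// The condition described in the report: f1 + f2 < f2. It never reads f0,
// so (barring overflow) it only tests whether f1 is negative.
std::vector<std::size_t> FilterAdd2Buggy(const std::vector<int>& /*f0*/,
                                         const std::vector<int>& f1,
                                         const std::vector<int>& f2) {
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < f1.size(); ++i) {
    if (f1[i] + f2[i] < f2[i]) selected.push_back(i);
  }
  return selected;
}
```

With non-negative inputs the second form selects nothing, so a benchmark built on it would exercise a different selectivity than intended.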
[jira] [Created] (ARROW-6093) [Java] reduce branches in algo for first match in VectorRangeSearcher
Pindikura Ravindra created ARROW-6093: - Summary: [Java] reduce branches in algo for first match in VectorRangeSearcher Key: ARROW-6093 URL: https://issues.apache.org/jira/browse/ARROW-6093 Project: Apache Arrow Issue Type: Improvement Components: Java Reporter: Pindikura Ravindra This is a follow up Jira for the improvement suggested by [~fsaintjacques] in the PR for [https://github.com/apache/arrow/pull/4925] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (ARROW-5964) [C++][Gandiva] Cast double to decimal with rounding returns 0
Pindikura Ravindra created ARROW-5964: - Summary: [C++][Gandiva] Cast double to decimal with rounding returns 0 Key: ARROW-5964 URL: https://issues.apache.org/jira/browse/ARROW-5964 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra Casting 1.15470053838 to decimal(18,0) gives 0; it should return 1. There is a bug in the overflow check after rounding.
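The expected behaviour can be sketched as follows. This is not Gandiva's implementation: `DoubleToDecimal` is a hypothetical helper that stores the unscaled value in an `int64_t`, which is only enough for precision up to 18, and it assumes the range check is applied to the already-rounded value.

```cpp
#include <cmath>
#include <cstdint>

// Converts `value` to a decimal(precision, scale), returned as an unscaled
// 64-bit integer. Only meaningful for 0 <= precision <= 18 in this sketch.
// *overflow is set when the rounded value needs more than `precision` digits.
int64_t DoubleToDecimal(double value, int precision, int scale, bool* overflow) {
  // Scale first, then round half away from zero.
  const double scaled = std::round(value * std::pow(10.0, scale));
  // The overflow check runs on the rounded value, not on the raw input.
  const double limit = std::pow(10.0, precision);
  *overflow = !(std::fabs(scaled) < limit);  // also true for NaN
  return *overflow ? 0 : static_cast<int64_t>(scaled);
}
```

Under these assumptions, `DoubleToDecimal(1.15470053838, 18, 0, &overflow)` yields 1 with `overflow` cleared, which is the result the report expects.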
[jira] [Created] (ARROW-5925) [Gandiva][C++] cast decimal to int should round up
Pindikura Ravindra created ARROW-5925: - Summary: [Gandiva][C++] cast decimal to int should round up Key: ARROW-5925 URL: https://issues.apache.org/jira/browse/ARROW-5925 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (ARROW-5903) [Java] Set methods in DecimalVector are slow
Pindikura Ravindra created ARROW-5903: - Summary: [Java] Set methods in DecimalVector are slow Key: ARROW-5903 URL: https://issues.apache.org/jira/browse/ARROW-5903 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra The methods are doing a bound check on each byte in the input buffer and each byte on the output buffer. Avoiding this repetitive work improves perf by a factor of 2x to 3x. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5867) [C++][Gandiva] Add support for cast int to decimal
Pindikura Ravindra created ARROW-5867: - Summary: [C++][Gandiva] Add support for cast int to decimal Key: ARROW-5867 URL: https://issues.apache.org/jira/browse/ARROW-5867 Project: Apache Arrow Issue Type: Task Components: C++ - Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5829) [Java] failure in TestServerOptions.domainSocket
Pindikura Ravindra created ARROW-5829: - Summary: [Java] failure in TestServerOptions.domainSocket Key: ARROW-5829 URL: https://issues.apache.org/jira/browse/ARROW-5829 Project: Apache Arrow Issue Type: Bug Components: FlightRPC, Java Reporter: Pindikura Ravindra I see this consistently with the 0.14.0 rc0 release candidate on mac mojave. java.io.IOException: Failed to bind at org.apache.arrow.flight.TestServerOptions.domainSocket(TestServerOptions.java:46) Caused by: io.netty.channel.unix.Errors$NativeIoException: bind(..) failed: Address already in use -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5818) [Java][Gandiva] support varlen output vectors
Pindikura Ravindra created ARROW-5818: - Summary: [Java][Gandiva] support varlen output vectors Key: ARROW-5818 URL: https://issues.apache.org/jira/browse/ARROW-5818 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5701) [C++][Gandiva] Build expressions only for the required selection vector types
Pindikura Ravindra created ARROW-5701: - Summary: [C++][Gandiva] Build expressions only for the required selection vector types Key: ARROW-5701 URL: https://issues.apache.org/jira/browse/ARROW-5701 Project: Apache Arrow Issue Type: Task Components: C++ - Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra We currently build the JIT for all known selection vector types (there are 4 supported types). For very long expressions, this increases the build time by 4x. Instead, we should build only for the required selection vector type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5636) [C++][Gandiva] Expression cache should not use ToString on data type
Pindikura Ravindra created ARROW-5636: - Summary: [C++][Gandiva] Expression cache should not use ToString on data type Key: ARROW-5636 URL: https://issues.apache.org/jira/browse/ARROW-5636 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra The expression cache in gandiva uses the ToString() method of arrow::DataType() for both hashing and equality. This is error-prone: we should have a visitor for generating the hash, and use the equality visitor for comparison. [~fsaintjacques] [~praveenbingo]
[jira] [Created] (ARROW-5626) [C
Pindikura Ravindra created ARROW-5626: - Summary: [C Key: ARROW-5626 URL: https://issues.apache.org/jira/browse/ARROW-5626 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5602) [Java][Gandiva] Add test for decimal round functions
Pindikura Ravindra created ARROW-5602: - Summary: [Java][Gandiva] Add test for decimal round functions Key: ARROW-5602 URL: https://issues.apache.org/jira/browse/ARROW-5602 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5579) [Java] shade flatbuffer dependency
Pindikura Ravindra created ARROW-5579: - Summary: [Java] shade flatbuffer dependency Key: ARROW-5579 URL: https://issues.apache.org/jira/browse/ARROW-5579 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Reported in a GitHub issue [https://github.com/apache/arrow/issues/4489]. After some discussion [https://github.com/google/flatbuffers/issues/5368] with the Flatbuffers maintainer, it appears that FB generated code is not guaranteed to be compatible with _any other_ version of the runtime library other than the exact same version of the flatc used to compile it. This makes depending on flatbuffers in a library (like arrow) quite risky: if an app depends on any other version of FB, either directly or transitively, the versions are likely to clash at some point, causing undefined behaviour at runtime. Shading the dependency looks to me the best way to avoid this.
[jira] [Created] (ARROW-5484) [Java] remove FieldReader from ValueVector
Pindikura Ravindra created ARROW-5484: - Summary: [Java] remove FieldReader from ValueVector Key: ARROW-5484 URL: https://issues.apache.org/jira/browse/ARROW-5484 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra Every implementation of ValueVector has an instance of FieldReader, which has an overhead of 28 bytes on the heap. This can be avoided by instantiating the object only when required.
[jira] [Created] (ARROW-5483) [Java] add ValueVector constructors that take a Field object
Pindikura Ravindra created ARROW-5483: - Summary: [Java] add ValueVector constructors that take a Field object Key: ARROW-5483 URL: https://issues.apache.org/jira/browse/ARROW-5483 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra Each instance of a ValueVector instantiates Field and FieldType objects, which consume 81 bytes of heap space. This duplication can be avoided in cases where all the ValueVectors belong to the same set of columns/schema.
[jira] [Created] (ARROW-5482) reduce heap footprint of ValueVectors
Pindikura Ravindra created ARROW-5482: - Summary: reduce heap footprint of ValueVectors Key: ARROW-5482 URL: https://issues.apache.org/jira/browse/ARROW-5482 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra In some scenarios, we hold lots of value vectors in memory, e.g. during join or aggregation. The heap analysis shows that the costs are as follows for a simple IntVector (used VisualVM on mac) :
IntVector : 80 bytes
vector.types.pojo.FieldType : 41 bytes
vector.types.pojo.Field : 40 bytes
IntReaderImpl : 28 bytes
I'll use this Jira to track ways to reduce the heap usage.
[jira] [Created] (ARROW-5451) [C++][Gandiva] Add round functions for decimals
Pindikura Ravindra created ARROW-5451: - Summary: [C++][Gandiva] Add round functions for decimals Key: ARROW-5451 URL: https://issues.apache.org/jira/browse/ARROW-5451 Project: Apache Arrow Issue Type: Task Components: C++ - Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra Will use this Jira to add support for :
 * round
 * truncate
 * ceil
 * floor
 * cast decimal to double, double to decimal
 * cast decimal to long, long to decimal
 * convert (modify precision/scale)
[jira] [Created] (ARROW-5321) [Gandiva][C++] add isnull and isnotnull for utf8 and binary types
Pindikura Ravindra created ARROW-5321: - Summary: [Gandiva][C++] add isnull and isnotnull for utf8 and binary types Key: ARROW-5321 URL: https://issues.apache.org/jira/browse/ARROW-5321 Project: Apache Arrow Issue Type: Task Components: C++ - Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5243) [Java][Gandiva] Add test for decimal compare functions
Pindikura Ravindra created ARROW-5243: - Summary: [Java][Gandiva] Add test for decimal compare functions Key: ARROW-5243 URL: https://issues.apache.org/jira/browse/ARROW-5243 Project: Apache Arrow Issue Type: Bug Components: C++ - Gandiva, Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5232) [Java] value vector size increases rapidly in case of clear/setSafe loop
Pindikura Ravindra created ARROW-5232: - Summary: [Java] value vector size increases rapidly in case of clear/setSafe loop Key: ARROW-5232 URL: https://issues.apache.org/jira/browse/ARROW-5232 Project: Apache Arrow Issue Type: Bug Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5226) [Gandiva] support compare operators for decimal
Pindikura Ravindra created ARROW-5226: - Summary: [Gandiva] support compare operators for decimal Key: ARROW-5226 URL: https://issues.apache.org/jira/browse/ARROW-5226 Project: Apache Arrow Issue Type: Task Components: C++ - Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4758) [Flight] Build fails on Mac due to missing Schema_generated.h
Pindikura Ravindra created ARROW-4758: - Summary: [Flight] Build fails on Mac due to missing Schema_generated.h Key: ARROW-4758 URL: https://issues.apache.org/jira/browse/ARROW-4758 Project: Apache Arrow Issue Type: Task Components: FlightRPC Reporter: Pindikura Ravindra I saw this on CI; a retrigger of the build fixed the issue, and I am not able to get the link to the previous build failure. The error happened for the file flight/client.cc, which includes ipc/metadata-internal.h, which includes arrow/ipc/Schema_generated.h
[jira] [Created] (ARROW-4756) [CI] document the procedure to update docker image for manylinux1 builds
Pindikura Ravindra created ARROW-4756: - Summary: [CI] document the procedure to update docker image for manylinux1 builds Key: ARROW-4756 URL: https://issues.apache.org/jira/browse/ARROW-4756 Project: Apache Arrow Issue Type: Task Components: Continuous Integration Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4693) [CI] Build boost library with multi precision
Pindikura Ravindra created ARROW-4693: - Summary: [CI] Build boost library with multi precision Key: ARROW-4693 URL: https://issues.apache.org/jira/browse/ARROW-4693 Project: Apache Arrow Issue Type: Task Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra This is required for ARROW-4205. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4653) [C++] decimal multiply broken when both args are negative
Pindikura Ravindra created ARROW-4653: - Summary: [C++] decimal multiply broken when both args are negative Key: ARROW-4653 URL: https://issues.apache.org/jira/browse/ARROW-4653 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4639) [CI] Crossbow build failing for Gandiva jars
Pindikura Ravindra created ARROW-4639: - Summary: [CI] Crossbow build failing for Gandiva jars Key: ARROW-4639 URL: https://issues.apache.org/jira/browse/ARROW-4639 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra All tests are failing. Seems to be related to gflags. [https://travis-ci.org/pravindra/arrow-build/jobs/495977029]
1: Test timeout computed to be: 1000
1: Running arrow-allocator-test, redirecting output into /Users/travis/build/pravindra/arrow-build/arrow/cpp/build/build/test-logs/arrow-allocator-test.txt (attempt 1/1)
1: dyld: Library not loaded: @rpath/libgflags.2.2.dylib
1: Referenced from: /Users/travis/build/pravindra/arrow-build/arrow/cpp/build/release/libarrow.13.dylib
1: Reason: image not found
1: /Users/travis/build/pravindra/arrow-build/arrow/cpp/build-support/run-test.sh: line 97: 8124 Abort trap: 6 $TEST_EXECUTABLE "$@" 2>&1
1: 8125 Done | $ROOT/build-support/asan_symbolize.py
1: 8126 Done | c++filt
1: 8127 Done | $ROOT/build-support/stacktrace_addr2line.pl $TEST_EXECUTABLE
1: 8128 Done | $pipe_cmd 2>&1
1: 8129 Done | tee $LOGFILE
[jira] [Created] (ARROW-4570) [Gandiva] Add overflow checks for decimals
Pindikura Ravindra created ARROW-4570: - Summary: [Gandiva] Add overflow checks for decimals Key: ARROW-4570 URL: https://issues.apache.org/jira/browse/ARROW-4570 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra For decimals, overflows can occur in two places :
 # The input array can have values that are outside the bound (e.g. > 38 digits).
 # An operation can result in an overflow, e.g. adding two decimals of (38, 6) can overflow if the input numbers are very large.
In both cases, merely checking whether an overflow occurred adds a perf overhead, so the checks should be controlled by a conf variable.
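The two checks, and the idea of gating them behind a configuration switch, could look roughly like this. It is a sketch under simplifying assumptions: values are held in an `int64_t`, so it only covers precision up to 18 rather than the 38 digits Gandiva decimals allow, and `DecimalConfig`, `FitsPrecision` and `AddDecimal` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical configuration switch; the issue only asks for "a conf variable".
struct DecimalConfig {
  bool check_overflow;
};

// Returns 10^precision for 0 <= precision <= 18 (the range that fits in int64_t).
inline int64_t PowerOfTen(int precision) {
  int64_t result = 1;
  for (int i = 0; i < precision; ++i) result *= 10;
  return result;
}

// Case 1: does an input value fit in `precision` decimal digits?
inline bool FitsPrecision(int64_t unscaled, int precision) {
  const int64_t limit = PowerOfTen(precision);
  return unscaled > -limit && unscaled < limit;
}

// Case 2: add two decimals that share precision and scale.
// With checks enabled, returns false if an input or the result is out of
// bounds. With checks disabled, always returns true and trusts the caller.
inline bool AddDecimal(int64_t x, int64_t y, int precision,
                       const DecimalConfig& config, int64_t* out) {
  if (config.check_overflow &&
      (!FitsPrecision(x, precision) || !FitsPrecision(y, precision))) {
    return false;
  }
  // Unsigned addition wraps around instead of invoking undefined behaviour.
  *out = static_cast<int64_t>(static_cast<uint64_t>(x) + static_cast<uint64_t>(y));
  return !config.check_overflow || FitsPrecision(*out, precision);
}
```

With checks disabled the function does a single addition per value, which is the cheap path the issue wants to keep available.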
[jira] [Created] (ARROW-4569) [Gandiva] validate that the precision/scale are within bounds
Pindikura Ravindra created ARROW-4569: - Summary: [Gandiva] validate that the precision/scale are within bounds Key: ARROW-4569 URL: https://issues.apache.org/jira/browse/ARROW-4569 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4532) varwidth vector buffer much larger than expected
Pindikura Ravindra created ARROW-4532: - Summary: varwidth vector buffer much larger than expected Key: ARROW-4532 URL: https://issues.apache.org/jira/browse/ARROW-4532 Project: Apache Arrow Issue Type: Bug Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra There's a bug in BaseVariableWidthVector.java::setSafe that's causing the value buffers to be much larger than expected. This causes memory wastage. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4496) [CI] CI failing for python Xcode 7.3
Pindikura Ravindra created ARROW-4496: - Summary: [CI] CI failing for python Xcode 7.3 Key: ARROW-4496 URL: https://issues.apache.org/jira/browse/ARROW-4496 Project: Apache Arrow Issue Type: Bug Components: Python Reporter: Pindikura Ravindra The last couple of PR-triggered builds have failed with this :
CMake Error at cmake_modules/FindNumPy.cmake:62 (message):
NumPy import failure:
Traceback (most recent call last):
  File "", line 1, in
  File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/__init__.py", line 142, in
    from . import add_newdocs
  File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/add_newdocs.py", line 13, in
    from numpy.lib import add_newdoc
  File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/lib/__init__.py", line 8, in
    from .type_check import *
  File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/lib/type_check.py", line 11, in
    import numpy.core.numeric as _nx
  File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/core/__init__.py", line 26, in
    raise ImportError(msg)
[https://travis-ci.org/apache/arrow/jobs/489917808]
[jira] [Created] (ARROW-4403) [Rust] CI fails due to formatting errors
Pindikura Ravindra created ARROW-4403: - Summary: [Rust] CI fails due to formatting errors Key: ARROW-4403 URL: https://issues.apache.org/jira/browse/ARROW-4403 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra [https://travis-ci.org/apache/arrow/jobs/485310770]
Diff in /home/travis/build/apache/arrow/rust/arrow/src/csv/reader.rs at line 545:
 Field::new("lng", DataType::Float64, false),
 ]);
- let file_with_headers = File::open("test/data/uk_cities_with_headers.csv").unwrap();
+ let file_with_headers =
+ File::open("test/data/uk_cities_with_headers.csv").unwrap();
 let file_without_headers = File::open("test/data/uk_cities.csv").unwrap();
 let both_files = file_with_headers
 .chain(Cursor::new("\n".to_string()))
[jira] [Created] (ARROW-4400) [CI] install of clang tools failing
Pindikura Ravindra created ARROW-4400: - Summary: [CI] install of clang tools failing Key: ARROW-4400 URL: https://issues.apache.org/jira/browse/ARROW-4400 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra
+sudo apt-add-repository -y 'deb http://llvm.org/apt/xenial/ llvm-toolchain-xenial-6.0 main'
+sudo apt-get update -qq
W: The repository 'http://llvm.org/apt/xenial llvm-toolchain-xenial-6.0 Release' does not have a Release file.
E: Failed to fetch https://llvm.org/apt/xenial/dists/llvm-toolchain-xenial-6.0/main/binary-amd64/Packages Protocol "http" not supported or disabled in libcurl
E: Some index files failed to download. They have been ignored, or old ones used instead.
The command "$TRAVIS_BUILD_DIR/ci/travis_install_clang_tools.sh" failed and exited with 100 during .
Your build has been stopped.
[jira] [Created] (ARROW-4357) arrow java build broken on trusty
Pindikura Ravindra created ARROW-4357: - Summary: arrow java build broken on trusty Key: ARROW-4357 URL: https://issues.apache.org/jira/browse/ARROW-4357 Project: Apache Arrow Issue Type: Bug Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra [https://travis-ci.com/dremio/arrow-build/builds/98435917] SLF4J: The requested version 1.5.6 by your slf4j binding is not compatible with [1.6, 1.7] SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4342) [Gandiva][Java] spurious failures in projector cache test
Pindikura Ravindra created ARROW-4342: - Summary: [Gandiva][Java] spurious failures in projector cache test Key: ARROW-4342 URL: https://issues.apache.org/jira/browse/ARROW-4342 Project: Apache Arrow Issue Type: Bug Components: Gandiva, Java Reporter: Pindikura Ravindra [ERROR] Tests run: 21, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.542 s <<< FAILURE! - in org.apache.arrow.gandiva.evaluator.ProjectorTest [ERROR] testMakeProjector(org.apache.arrow.gandiva.evaluator.ProjectorTest) Time elapsed: 0.079 s <<< FAILURE! java.lang.AssertionError at org.apache.arrow.gandiva.evaluator.ProjectorTest.testMakeProjector(ProjectorTest.java:164) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4274) [Gandiva] static jni library broken after decimal changes
Pindikura Ravindra created ARROW-4274: - Summary: [Gandiva] static jni library broken after decimal changes Key: ARROW-4274 URL: https://issues.apache.org/jira/browse/ARROW-4274 Project: Apache Arrow Issue Type: Bug Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra With the decimal changes, there can be cpp calls from the IR code. The symbols for these need to be visible in the gandiva cpp library. However, the jni library makes visible only a limited set of symbols from gandiva (the ones specified in src/gandiva/jni/symbols.map). This breaks if the jni library links with the static-libstdc++ (dremio builds the gandiva binary with stdc++ statically linked) for two reasons :
 # The cpp symbols like std::ios_base::init are not exported via symbols.map. This causes LLVM to complain that there are unresolved symbols.
 # There is also a problem with exceptions (string_view.hpp can throw exceptions). This also causes LLVM to complain that unwindResume is unresolved.
[jira] [Created] (ARROW-4209) [Gandiva] returning IR structs causes issues with windows
Pindikura Ravindra created ARROW-4209: - Summary: [Gandiva] returning IR structs causes issues with windows Key: ARROW-4209 URL: https://issues.apache.org/jira/browse/ARROW-4209 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra The decimal add fn returns a struct (of high/low values). This is known to be fragile due to ABI compatibility issues, so the fix is to switch to primitive types.
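One way to read "switch to primitive types" is to pass the high/low halves as plain integers and hand the result back through pointers, so that no struct crosses the boundary between generated IR and compiled C++. The sketch below illustrates that shape with a 128-bit addition; `AddInt128Parts` is a hypothetical name and signature, not the function Gandiva uses.

```cpp
#include <cstdint>

// Adds two 128-bit two's-complement values, each split into a signed high
// half and an unsigned low half. Inputs and outputs are all primitives, so
// the call does not depend on how a platform ABI returns structs by value.
void AddInt128Parts(int64_t x_high, uint64_t x_low,
                    int64_t y_high, uint64_t y_low,
                    int64_t* out_high, uint64_t* out_low) {
  // Low halves add modulo 2^64; a wrapped result signals a carry.
  const uint64_t low = x_low + y_low;
  const uint64_t carry = (low < x_low) ? 1 : 0;
  // Add the high halves as unsigned to avoid signed-overflow undefined behaviour.
  const uint64_t high =
      static_cast<uint64_t>(x_high) + static_cast<uint64_t>(y_high) + carry;
  *out_low = low;
  *out_high = static_cast<int64_t>(high);
}
```

The caller allocates the two output slots, and the callee only ever sees integer and pointer arguments.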
[jira] [Created] (ARROW-4206) [Gandiva] Implement decimal divide
Pindikura Ravindra created ARROW-4206: - Summary: [Gandiva] Implement decimal divide Key: ARROW-4206 URL: https://issues.apache.org/jira/browse/ARROW-4206 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4204) [Gandiva] implement decimal subtract
Pindikura Ravindra created ARROW-4204: - Summary: [Gandiva] implement decimal subtract Key: ARROW-4204 URL: https://issues.apache.org/jira/browse/ARROW-4204 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4205) [Gandiva] Implement decimal multiply
Pindikura Ravindra created ARROW-4205: - Summary: [Gandiva] Implement decimal multiply Key: ARROW-4205 URL: https://issues.apache.org/jira/browse/ARROW-4205 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4203) [Gandiva] use aliases when building expressions to simplify tests
Pindikura Ravindra created ARROW-4203: - Summary: [Gandiva] use aliases when building expressions to simplify tests Key: ARROW-4203 URL: https://issues.apache.org/jira/browse/ARROW-4203 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra
{code:java}
// code placeholder
auto node_c = TreeExprBuilder::MakeField(field_c);
auto if_node = TreeExprBuilder::MakeIf(node_c, node_a, node_b, decimal_type);
auto expr = TreeExprBuilder::MakeExpression(if_node, field_result);
{code}
@wesm suggested that code like the above can be simplified with aliases :
{code:java}
gandiva::expr(gandiva::if_(gandiva::field(...)){code}
[jira] [Created] (ARROW-4202) [Gandiva] use ArrayFromJson in tests
Pindikura Ravindra created ARROW-4202: - Summary: [Gandiva] use ArrayFromJson in tests Key: ARROW-4202 URL: https://issues.apache.org/jira/browse/ARROW-4202 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Most of the gandiva tests use wrappers over ArrayFromVector. These will become a lot more readable if we switch to ArrayFromJSON.
[jira] [Created] (ARROW-4201) [Gandiva] integrate test utils with arrow
Pindikura Ravindra created ARROW-4201: - Summary: [Gandiva] integrate test utils with arrow Key: ARROW-4201 URL: https://issues.apache.org/jira/browse/ARROW-4201 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra The following tasks are to be addressed as part of this Jira :
 # move (or consolidate) data generators in generate_data.h to arrow
 # move convenience fns in gandiva/tests/test_util.h to arrow
 # move (or consolidate) EXPECT_ARROW_* fns to arrow
[jira] [Created] (ARROW-4167) [Gandiva] switch to arrow/util/variant
Pindikura Ravindra created ARROW-4167: - Summary: [Gandiva] switch to arrow/util/variant Key: ARROW-4167 URL: https://issues.apache.org/jira/browse/ARROW-4167 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra gandiva cpp uses boost variant. It should switch to arrow/util/variant. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4147) [JAVA] Reduce heap usage for variable width vectors
Pindikura Ravindra created ARROW-4147: - Summary: [JAVA] Reduce heap usage for variable width vectors Key: ARROW-4147 URL: https://issues.apache.org/jira/browse/ARROW-4147 Project: Apache Arrow Issue Type: Bug Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra This is a follow up to ARROW-1807. The same changes need to be done for variable len vectors too. Also, the default value for initial allocations (4096) causes a lot of wastage, and needs to be changed (to 3970). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4115) [Gandiva] valgrind complains that boolean output data buffer has uninited data
Pindikura Ravindra created ARROW-4115: - Summary: [Gandiva] valgrind complains that boolean output data buffer has uninited data Key: ARROW-4115 URL: https://issues.apache.org/jira/browse/ARROW-4115 Project: Apache Arrow Issue Type: Bug Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4104) [Java] race in AllocationManager during release
Pindikura Ravindra created ARROW-4104: - Summary: [Java] race in AllocationManager during release Key: ARROW-4104 URL: https://issues.apache.org/jira/browse/ARROW-4104 Project: Apache Arrow Issue Type: Bug Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra This is caused by a bug in my changes for ARROW-1807. The synchronization happens on the BufferLedger instance instead of the AllocationManager instance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-4086) [Java] Add api to fetch summary of root allocator
Pindikura Ravindra created ARROW-4086: - Summary: [Java] Add api to fetch summary of root allocator Key: ARROW-4086 URL: https://issues.apache.org/jira/browse/ARROW-4086 Project: Apache Arrow Issue Type: Task Components: Java Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra On allocation failures, it's useful to know where the memory is being used in the tree of allocators (for debugging). One way to do this would be by adding APIs to: # get the root allocator from a given allocator # get a summary of usage/limit from an allocator up to N levels (N is an input arg) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
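The two APIs proposed in ARROW-4086 can be sketched roughly as follows. This is an illustration only: it is written in C++ although the issue targets the Java allocator, and the class name, fields and summary format are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an allocator in a parent/child tree.
class AllocatorNode {
 public:
  AllocatorNode(std::string name, int64_t used, int64_t limit,
                AllocatorNode* parent = nullptr)
      : name_(std::move(name)), used_(used), limit_(limit), parent_(parent) {}

  AllocatorNode* add_child(std::string name, int64_t used, int64_t limit) {
    children_.emplace_back(new AllocatorNode(std::move(name), used, limit, this));
    return children_.back().get();
  }

  // API 1: walk the parent links up to the root allocator.
  const AllocatorNode* root() const {
    const AllocatorNode* node = this;
    while (node->parent_ != nullptr) node = node->parent_;
    return node;
  }

  // API 2: usage/limit summary of this allocator and its descendants,
  // limited to `max_levels` levels (this allocator counts as level 1).
  std::string summary(int max_levels) const {
    std::ostringstream os;
    append_summary(os, 0, max_levels);
    return os.str();
  }

 private:
  void append_summary(std::ostringstream& os, int depth, int max_levels) const {
    if (depth >= max_levels) return;
    os << std::string(static_cast<std::size_t>(2 * depth), ' ') << name_
       << " used=" << used_ << " limit=" << limit_ << "\n";
    for (const auto& child : children_) {
      child->append_summary(os, depth + 1, max_levels);
    }
  }

  std::string name_;
  int64_t used_;
  int64_t limit_;
  AllocatorNode* parent_;
  std::vector<std::unique_ptr<AllocatorNode>> children_;
};
```

On an allocation failure, a caller could then log something like `failed_allocator->root()->summary(3)` to see which subtree holds the memory.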
[jira] [Created] (ARROW-4077) [Gandiva] fix CI if ctest doesn't run any tests
Pindikura Ravindra created ARROW-4077: - Summary: [Gandiva] fix CI if ctest doesn't run any tests Key: ARROW-4077 URL: https://issues.apache.org/jira/browse/ARROW-4077 Project: Apache Arrow Issue Type: Bug Components: Gandiva Reporter: Pindikura Ravindra This has happened a couple of times already due to changes in build/flags/labels, and it's hard to notice unless we look into the travis output carefully. Instead, travis_script_gandiva_cpp.sh should exit with a non-zero status if no tests are run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3991) [gandiva] floating point division shouldn't cause errors
Pindikura Ravindra created ARROW-3991: - Summary: [gandiva] floating point division shouldn't cause errors Key: ARROW-3991 URL: https://issues.apache.org/jira/browse/ARROW-3991 Project: Apache Arrow Issue Type: Bug Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra For division, gandiva explicitly checks if the divisor is zero and raises an error. This is correct for integer division. For floating-point division, it should just return the IEEE 754 result (infinity, or NaN for 0/0). https://www.gnu.org/software/libc/manual/html_node/Infinity-and-NaN.html#Infinity-and-NaN -- This message was sent by Atlassian JIRA (v7.6.3#76005)
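The behaviour requested in ARROW-3991 can be illustrated with a minimal sketch. The helper name is hypothetical and this is not Gandiva code; it assumes a platform where `double` follows IEEE 754 (checked at compile time) and a build without fast-math style flags.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// The results described below are only guaranteed when double is IEEE 754.
static_assert(std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE 754 doubles");

// Hypothetical helper: divides with no zero check and no error path.
// A zero divisor yields +/-infinity, and 0.0 / 0.0 yields NaN.
inline double divide_float64(double numerator, double divisor) {
  return numerator / divisor;
}
```

Callers that care about these cases can still detect them afterwards with `std::isinf` or `std::isnan` instead of relying on an error being raised.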
[jira] [Created] (ARROW-3979) [Gandiva] fix all valgrind reported errors
Pindikura Ravindra created ARROW-3979: - Summary: [Gandiva] fix all valgrind reported errors Key: ARROW-3979 URL: https://issues.apache.org/jira/browse/ARROW-3979 Project: Apache Arrow Issue Type: Bug Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra Travis reports lots of valgrind errors when running gandiva tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3977) [Gandiva] gandiva cpp tests not running in CI
Pindikura Ravindra created ARROW-3977: - Summary: [Gandiva] gandiva cpp tests not running in CI Key: ARROW-3977 URL: https://issues.apache.org/jira/browse/ARROW-3977 Project: Apache Arrow Issue Type: Bug Components: Gandiva Reporter: Pindikura Ravindra Saw this in the logs : Checking test dependency graph... Checking test dependency graph end No tests were found!!! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3805) [Gandiva] handle null validity bitmap in if-else expressions
Pindikura Ravindra created ARROW-3805: - Summary: [Gandiva] handle null validity bitmap in if-else expressions Key: ARROW-3805 URL: https://issues.apache.org/jira/browse/ARROW-3805 Project: Apache Arrow Issue Type: Bug Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra This is a follow-up to the changes in ARROW-3765 [~suquark] [~pcmoritz] [~praveenbingo] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3701) [Gandiva] Add support for decimal operations
Pindikura Ravindra created ARROW-3701: - Summary: [Gandiva] Add support for decimal operations Key: ARROW-3701 URL: https://issues.apache.org/jira/browse/ARROW-3701 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra To begin with, will add support for 128-bit decimals. There are two parts : # llvm_generator needs to understand decimal types (value, precision, scale) # code decimal operations : add/subtract/multiply/divide/mod/.. ** This will be c++ code that can be pre-compiled to emit IR code -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3655) [Gandiva] switch away from default_memory_pool
Pindikura Ravindra created ARROW-3655: - Summary: [Gandiva] switch away from default_memory_pool Key: ARROW-3655 URL: https://issues.apache.org/jira/browse/ARROW-3655 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra After the changes for ARROW-3519, Gandiva uses default_memory_pool for some allocations. This needs to be replaced with the pool passed in the Evaluate call. Also, change the signatures of all Evaluate APIs (both in project and filter) to take a pool argument. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3597) [Gandiva] gandiva should integrate with ADD_ARROW_TEST for tests
Pindikura Ravindra created ARROW-3597: - Summary: [Gandiva] gandiva should integrate with ADD_ARROW_TEST for tests Key: ARROW-3597 URL: https://issues.apache.org/jira/browse/ARROW-3597 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3519) Add support for functions that can return variable len output
Pindikura Ravindra created ARROW-3519: - Summary: Add support for functions that can return variable len output Key: ARROW-3519 URL: https://issues.apache.org/jira/browse/ARROW-3519 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra This is a pre-requisite for ARROW-3459. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3511) support input selection vectors for both projector and filter
Pindikura Ravindra created ARROW-3511: - Summary: support input selection vectors for both projector and filter Key: ARROW-3511 URL: https://issues.apache.org/jira/browse/ARROW-3511 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra The Gandiva filter module returns a selection vector representing the indices of records (in the batch) that matched the filter. We can connect this to other modules, by passing along this selection vector as an input argument to the downstream projector/filter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
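The chaining described in ARROW-3511 can be sketched as below. This is a simplified illustration on plain `std::vector` columns, not the Gandiva API; the function names and the use of 32-bit indices are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Filter step: returns the indices of rows for which a + b < limit.
inline std::vector<int32_t> filter_sum_less_than(const std::vector<int32_t>& a,
                                                 const std::vector<int32_t>& b,
                                                 int32_t limit) {
  assert(a.size() == b.size());
  std::vector<int32_t> selection;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] + b[i] < limit) selection.push_back(static_cast<int32_t>(i));
  }
  return selection;
}

// Downstream projector: evaluates a + b only for the rows named in the
// selection vector, so rows rejected by the filter are never touched.
inline std::vector<int32_t> project_sum(const std::vector<int32_t>& a,
                                        const std::vector<int32_t>& b,
                                        const std::vector<int32_t>& selection) {
  std::vector<int32_t> out;
  out.reserve(selection.size());
  for (int32_t row : selection) {
    out.push_back(a[static_cast<std::size_t>(row)] +
                  b[static_cast<std::size_t>(row)]);
  }
  return out;
}
```

The point of the design is that the batch itself is never copied or compacted between the two steps; only the list of surviving indices is passed along.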
[jira] [Created] (ARROW-3501) remove dependency of gcc 4.9 for gandiva
Pindikura Ravindra created ARROW-3501: - Summary: remove dependency of gcc 4.9 for gandiva Key: ARROW-3501 URL: https://issues.apache.org/jira/browse/ARROW-3501 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Gandiva has a dependency on gcc 4.9: building with gcc 4.8 causes a link error. Investigate and remove this dependency if possible. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3487) simplify NULL_IF_NULL functions that can return errors
Pindikura Ravindra created ARROW-3487: - Summary: simplify NULL_IF_NULL functions that can return errors Key: ARROW-3487 URL: https://issues.apache.org/jira/browse/ARROW-3487 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra NULL_IF_NULL functions that can return errors (e.g. divide) currently look at the validity bits in each function (to avoid returning spurious errors).
{code:java}
divide(TYPE in1, boolean is_valid1, TYPE in2, boolean is_valid2, ..) {
  // skip the zero check for null inputs to avoid a spurious error
  if (!is_valid1 || !is_valid2) {
    return 0;
  }
  if (in2 == 0) {
    /* set error */
    return 0;
  }
  return in1 / in2;
}
{code}
This validity check is duplicated across multiple functions and should be moved to the common layer (for all NULL_IF_NULL functions that can return an error). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
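One way to picture the "common layer" proposed in ARROW-3487 is a single wrapper that performs the validity check once, so that each function body only ever sees valid inputs. The sketch below is an assumption about the shape of such a layer, not Gandiva's generated code; `ErrorContext`, `divide_int32` and `call_null_if_null` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical error holder; the real execution context is not shown in the issue.
struct ErrorContext {
  std::string message;
  bool has_error() const { return !message.empty(); }
};

// Function body without any validity checks: it is only called on valid inputs.
// (Overflow for INT32_MIN / -1 is not handled in this sketch.)
inline int32_t divide_int32(int32_t in1, int32_t in2, ErrorContext* ctx) {
  if (in2 == 0) {
    ctx->message = "divide by zero error";
    return 0;
  }
  return in1 / in2;
}

// Common layer for NULL_IF_NULL functions: the output is null if any input is
// null, and the wrapped function is skipped, so it cannot raise a spurious error.
template <typename Fn>
int32_t call_null_if_null(Fn fn, int32_t in1, bool is_valid1, int32_t in2,
                          bool is_valid2, ErrorContext* ctx, bool* out_valid) {
  *out_valid = is_valid1 && is_valid2;
  if (!*out_valid) return 0;
  return fn(in1, in2, ctx);
}
```

With this split, a null divisor whose data slot happens to hold 0 produces a null result and no error, while a valid zero divisor still reports the error.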
[jira] [Created] (ARROW-3472) remove gandiva helpers library
Pindikura Ravindra created ARROW-3472: - Summary: remove gandiva helpers library Key: ARROW-3472 URL: https://issues.apache.org/jira/browse/ARROW-3472 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra Assignee: Pindikura Ravindra Gandiva has two native libraries: libgandiva.so and libgandiva_helpers.so. The helpers library is mostly a duplicate and was added to get around unresolved symbols with java/jni, but this is a hack and needs to be cleaned up. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3469) add travis entry for gandiva on OSX
Pindikura Ravindra created ARROW-3469: - Summary: add travis entry for gandiva on OSX Key: ARROW-3469 URL: https://issues.apache.org/jira/browse/ARROW-3469 Project: Apache Arrow Issue Type: Task Components: Gandiva Reporter: Pindikura Ravindra ARROW-3382 adds a travis job for gandiva on ubuntu. We need to do the same for OSX. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3459) Add support for variable length output vectors
Pindikura Ravindra created ARROW-3459: - Summary: Add support for variable length output vectors Key: ARROW-3459 URL: https://issues.apache.org/jira/browse/ARROW-3459 Project: Apache Arrow Issue Type: New Feature Components: Gandiva Reporter: Pindikura Ravindra Gandiva can currently handle variable length input vectors but requires the output vectors to be fixed-length. This is because we do not have a handle to allocate or resize arrow vectors from inside the LLVM code. Due to this limitation, we are not able to support a lot of utf8 related functions (convert-string-to-numeric, toupper, strstr, replace, ..). This needs to be fixed for both C++ and Java. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
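The missing "handle" mentioned in ARROW-3459 could take the form of a callback that generated code invokes to grow the output buffer. The sketch below shows that idea with plain C++ containers; it is not how Gandiva or Arrow implement it, all names are hypothetical, and the upper-casing handles ASCII only.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical variable-length output: offsets[i]..offsets[i+1] delimit entry i.
struct VarLenOutput {
  std::vector<int32_t> offsets{0};
  std::vector<uint8_t> data;
};

// Callback the generated code would use through an opaque handle: grows the
// data buffer by `extra_bytes` and returns a pointer to the new region.
// The pointer is only valid until the next call that resizes the buffer.
inline uint8_t* expand_output(void* handle, int32_t extra_bytes) {
  assert(handle != nullptr && extra_bytes >= 0);
  auto* out = static_cast<VarLenOutput*>(handle);
  const std::size_t old_size = out->data.size();
  out->data.resize(old_size + static_cast<std::size_t>(extra_bytes));
  return out->data.data() + old_size;
}

// Example function with variable-length output: ASCII-only upper-casing.
inline void upper_ascii(const char* in, int32_t len, void* handle) {
  uint8_t* dst = expand_output(handle, len);
  for (int32_t i = 0; i < len; ++i) {
    dst[i] = static_cast<uint8_t>(std::toupper(static_cast<unsigned char>(in[i])));
  }
  auto* out = static_cast<VarLenOutput*>(handle);
  out->offsets.push_back(static_cast<int32_t>(out->data.size()));
}
```

Because the function only sees an opaque pointer and a callback, the same shape could in principle be backed by a C++ or a Java-owned buffer, which is the part the issue says must work for both.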
[jira] [Created] (ARROW-3458) Add a string based expression parser
Pindikura Ravindra created ARROW-3458: - Summary: Add a string based expression parser Key: ARROW-3458 URL: https://issues.apache.org/jira/browse/ARROW-3458 Project: Apache Arrow Issue Type: New Feature Components: Gandiva Reporter: Pindikura Ravindra Gandiva currently supports a tree-based expression builder. This requires writing a lot of code for even simple expressions. For example, to build an expression for "a + b < 10", the code is:
{code:java}
// schema for input fields
auto field0 = field("a", int32());
auto field1 = field("b", int32());
auto schema = arrow::schema({field0, field1});

// output fields
auto field_result = field("res", boolean());

// Build expression
auto node_f0 = TreeExprBuilder::MakeField(field0);
auto node_f1 = TreeExprBuilder::MakeField(field1);
auto literal_10 = TreeExprBuilder::MakeLiteral(10);
auto sum_expr = TreeExprBuilder::MakeFunction("add", {node_f0, node_f1}, int32());
// the node-based MakeExpression overload takes a root node and a result field
auto lt_node = TreeExprBuilder::MakeFunction("less_than", {sum_expr, literal_10}, boolean());
auto lt_expr = TreeExprBuilder::MakeExpression(lt_node, field_result);
{code}
An alternate way to do this would be:
{code:java}
// Build expression
auto expr = StringExprBuilder::MakeExpression(schema, "a + b < 10", field_result);
{code}
The expression syntax should be close to that of SQL. To begin with, this will simplify writing tests, and it will provide an easier API for working with gandiva. -- This message was sent by Atlassian JIRA (v7.6.3#76005)