[jira] [Created] (ARROW-7399) gandiva does not pick runtime cpu features

2019-12-16 Thread Pindikura Ravindra (Jira)
Pindikura Ravindra created ARROW-7399:
-

 Summary: gandiva does not pick runtime cpu features
 Key: ARROW-7399
 URL: https://issues.apache.org/jira/browse/ARROW-7399
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


[~yibo] reported that the IR code generated by Gandiva uses 128-bit 
registers even though the test machine has a CPU with the AVX2 feature. I was able to 
reproduce this on a GCE host.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-7378) loop vectorization broken in gandiva

2019-12-11 Thread Pindikura Ravindra (Jira)
Pindikura Ravindra created ARROW-7378:
-

 Summary: loop vectorization broken in gandiva
 Key: ARROW-7378
 URL: https://issues.apache.org/jira/browse/ARROW-7378
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


[~yibo] pointed out on the mailing list that loop vectorization is broken.

 
I found that something in the last change to llvm_generator.cc 
broke the auto-vectorization:
 
[https://github.com/apache/arrow/commit/165b02d2358e5c8c2039cf626ac7326d82e3ca90]
 
If I undo this one patch, I can see the vectorization happen with Yibo Cai's 
test.
 





[jira] [Created] (ARROW-6491) [Java] fix master build failure caused by ErrorProne

2019-09-09 Thread Pindikura Ravindra (Jira)
Pindikura Ravindra created ARROW-6491:
-

 Summary: [Java] fix master build failure caused by ErrorProne
 Key: ARROW-6491
 URL: https://issues.apache.org/jira/browse/ARROW-6491
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Ji Liu








[jira] [Created] (ARROW-6490) [Java] log error for leak in allocator close

2019-09-09 Thread Pindikura Ravindra (Jira)
Pindikura Ravindra created ARROW-6490:
-

 Summary: [Java] log error for leak in allocator close
 Key: ARROW-6490
 URL: https://issues.apache.org/jira/browse/ARROW-6490
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


Currently, the allocator close throws an exception that includes some details 
in case of memory leaks. However, if there is a hierarchy of allocators and 
they are all closed at different times, it's hard to find the cause of the 
original leak.

 

If we also log a message when the leak occurs, it will be easier to correlate 
these.





[jira] [Created] (ARROW-6383) [Java] report outstanding child allocators on parent allocator close

2019-08-29 Thread Pindikura Ravindra (Jira)
Pindikura Ravindra created ARROW-6383:
-

 Summary: [Java] report outstanding child allocators on parent 
allocator close
 Key: ARROW-6383
 URL: https://issues.apache.org/jira/browse/ARROW-6383
 Project: Apache Arrow
  Issue Type: Task
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


When a parent allocator is closed, we should report any child allocators that 
are still outstanding. This helps in debugging memory leaks: it tells us whether the leak 
happened in the parent or in a child.





[jira] [Created] (ARROW-6211) [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface

2019-08-12 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-6211:
-

 Summary: [Java] Remove dependency on RangeEqualsVisitor from 
ValueVector interface
 Key: ARROW-6211
 URL: https://issues.apache.org/jira/browse/ARROW-6211
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra


This is a follow-up from [https://github.com/apache/arrow/pull/4933]

 

public interface VectorVisitor<OUT, IN, EX extends Exception> {..}

 

In ValueVector:

public <OUT, IN, EX extends Exception> OUT accept(VectorVisitor<OUT, IN, EX> visitor, IN value) throws EX;

 





[jira] [Created] (ARROW-6210) [Java] remove equals API from ValueVector

2019-08-12 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-6210:
-

 Summary: [Java] remove equals API from ValueVector
 Key: ARROW-6210
 URL: https://issues.apache.org/jira/browse/ARROW-6210
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra


This is a follow-up from [https://github.com/apache/arrow/pull/4933]

The callers should be fixed to use the RangeEquals API instead.





[jira] [Created] (ARROW-6116) [C++][Gandiva] Fix bug in TimedTestFilterAdd2

2019-08-02 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-6116:
-

 Summary: [C++][Gandiva] Fix bug in TimedTestFilterAdd2
 Key: ARROW-6116
 URL: https://issues.apache.org/jira/browse/ARROW-6116
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++ - Gandiva
Reporter: Pindikura Ravindra


The test should evaluate f0 + f1 < f2; instead it evaluates f1 + f2 < f2. This was 
reported via a PR:

 

[https://github.com/apache/arrow/pull/4976]
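
To make the intended predicate concrete, here is a minimal C++ sketch of a filter that selects the rows where f0 + f1 < f2. It is not the Gandiva test itself: the function name FilterAddLessThan and the plain std::vector columns are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the filter under test: returns the indices of
// the rows where f0 + f1 < f2. The buggy variant compared f1 + f2 < f2.
std::vector<size_t> FilterAddLessThan(const std::vector<int32_t>& f0,
                                      const std::vector<int32_t>& f1,
                                      const std::vector<int32_t>& f2) {
  // All three columns must have the same number of rows.
  assert(f1.size() == f0.size() && f2.size() == f0.size());
  std::vector<size_t> selected;
  for (size_t i = 0; i < f0.size(); ++i) {
    // Widen to 64 bits so the sum of two int32 values cannot overflow.
    if (static_cast<int64_t>(f0[i]) + f1[i] < f2[i]) {
      selected.push_back(i);
    }
  }
  return selected;
}
```

With f0 = {1, 5, -3}, f1 = {2, 5, -1} and f2 = {4, 9, -4}, the intended predicate selects only row 0, whereas f1 + f2 < f2 would select only row 2, so a test on such data can tell the two apart.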





[jira] [Created] (ARROW-6093) [Java] reduce branches in algo for first match in VectorRangeSearcher

2019-08-01 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-6093:
-

 Summary: [Java] reduce branches in algo for first match in 
VectorRangeSearcher
 Key: ARROW-6093
 URL: https://issues.apache.org/jira/browse/ARROW-6093
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Java
Reporter: Pindikura Ravindra


This is a follow-up Jira for the improvement suggested by [~fsaintjacques] in 
the PR:

[https://github.com/apache/arrow/pull/4925]

 





[jira] [Created] (ARROW-5964) [C++][Gandiva] Cast double to decimal with rounding returns 0

2019-07-17 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5964:
-

 Summary: [C++][Gandiva] Cast double to decimal with rounding 
returns 0
 Key: ARROW-5964
 URL: https://issues.apache.org/jira/browse/ARROW-5964
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


Casting 1.15470053838 to decimal(18,0) gives 0; it should return 1.

There is a bug in the overflow check after rounding.
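
As an illustration of the expected behaviour (not Gandiva's actual implementation), the sketch below rounds a double to a scaled integer and only then checks it against the precision bound. The helper name DoubleToDecimal and the use of a 64-bit value limited to precision 18 are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: converts `value` to a decimal(precision, scale) held
// as a scaled 64-bit integer. Returns false if the rounded value does not
// fit in `precision` digits. Only precision <= 18 is supported here.
bool DoubleToDecimal(double value, int precision, int scale, int64_t* out) {
  assert(precision >= 1 && precision <= 18);
  assert(scale >= 0 && scale <= precision);
  // Round first (half away from zero) ...
  const double scaled = std::round(value * std::pow(10.0, scale));
  // ... then run the overflow check on the rounded value.
  const double limit = std::pow(10.0, precision);
  if (!std::isfinite(scaled) || std::fabs(scaled) >= limit) {
    return false;
  }
  *out = static_cast<int64_t>(scaled);
  return true;
}
```

Under these assumptions, 1.15470053838 cast to decimal(18,0) yields 1, while 9.5 cast to decimal(1,0) is reported as an overflow because it rounds to 10, which is why the check has to run on the rounded value.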





[jira] [Created] (ARROW-5925) [Gandiva][C++] cast decimal to int should round up

2019-07-12 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5925:
-

 Summary: [Gandiva][C++] cast decimal to int should round up
 Key: ARROW-5925
 URL: https://issues.apache.org/jira/browse/ARROW-5925
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-5903) [Java] Set methods in DecimalVector are slow

2019-07-10 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5903:
-

 Summary: [Java] Set methods in DecimalVector are slow
 Key: ARROW-5903
 URL: https://issues.apache.org/jira/browse/ARROW-5903
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


The set methods do a bounds check on each byte of the input buffer and each 
byte of the output buffer. Avoiding this repetitive work improves performance by a 
factor of 2 to 3.
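
The difference between the two patterns can be sketched as follows. This is a C++ analogue written for illustration only, not the Java DecimalVector code; the function names and the std::vector buffers are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Slow pattern: every byte read and every byte written is bounds-checked.
void CopyCheckedPerByte(const std::vector<uint8_t>& src, size_t src_off,
                        std::vector<uint8_t>& dst, size_t dst_off, size_t len) {
  for (size_t i = 0; i < len; ++i) {
    dst.at(dst_off + i) = src.at(src_off + i);  // at() checks each access
  }
}

// Faster pattern: validate both ranges once, then copy without checks.
void CopyCheckedOnce(const std::vector<uint8_t>& src, size_t src_off,
                     std::vector<uint8_t>& dst, size_t dst_off, size_t len) {
  if (src_off > src.size() || len > src.size() - src_off ||
      dst_off > dst.size() || len > dst.size() - dst_off) {
    throw std::out_of_range("copy range exceeds buffer bounds");
  }
  if (len > 0) {
    std::memcpy(dst.data() + dst_off, src.data() + src_off, len);
  }
}
```

Both functions produce the same bytes for valid ranges; the second one simply hoists the validation out of the loop. The 2 to 3 times figure quoted above is from the issue and is not something this sketch measures.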





[jira] [Created] (ARROW-5867) [C++][Gandiva] Add support for cast int to decimal

2019-07-06 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5867:
-

 Summary: [C++][Gandiva] Add support for cast int to decimal
 Key: ARROW-5867
 URL: https://issues.apache.org/jira/browse/ARROW-5867
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-5829) [Java] failure in TestServerOptions.domainSocket

2019-07-02 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5829:
-

 Summary: [Java] failure in TestServerOptions.domainSocket
 Key: ARROW-5829
 URL: https://issues.apache.org/jira/browse/ARROW-5829
 Project: Apache Arrow
  Issue Type: Bug
  Components: FlightRPC, Java
Reporter: Pindikura Ravindra


I see this consistently with the 0.14.0 rc0 release candidate on macOS Mojave.

java.io.IOException: Failed to bind
 at org.apache.arrow.flight.TestServerOptions.domainSocket(TestServerOptions.java:46)
Caused by: io.netty.channel.unix.Errors$NativeIoException: bind(..) failed: Address already in use

 





[jira] [Created] (ARROW-5818) [Java][Gandiva] support varlen output vectors

2019-07-01 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5818:
-

 Summary: [Java][Gandiva] support varlen output vectors
 Key: ARROW-5818
 URL: https://issues.apache.org/jira/browse/ARROW-5818
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-5701) [C++][Gandiva] Build expressions only for the required selection vector types

2019-06-23 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5701:
-

 Summary: [C++][Gandiva] Build expressions only for the required 
selection vector types
 Key: ARROW-5701
 URL: https://issues.apache.org/jira/browse/ARROW-5701
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


We currently build the JIT for all known selection vector types (there are 4 
supported types). For very long expressions, this increases the build time by 
4x.

 

Instead, we should build only for the required selection vector type.
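
A rough sketch of the idea, with a counter standing in for the expensive JIT step. The enum values and the ExpressionBuilder class are hypothetical and do not mirror Gandiva's classes; the issue only states that four selection vector types are supported.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical names for the four selection vector types.
enum class SelectionMode { kNone, kUint16, kUint32, kUint64 };

class ExpressionBuilder {
 public:
  // Compiles `expr` once per requested mode. Storing the string is a
  // stand-in for generating and optimising code for that mode.
  void Build(const std::string& expr, const std::vector<SelectionMode>& modes) {
    assert(!modes.empty());
    for (SelectionMode mode : modes) {
      compiled_[mode] = expr;
      ++compile_count_;
    }
  }

  bool IsBuilt(SelectionMode mode) const { return compiled_.count(mode) > 0; }
  std::size_t compile_count() const { return compile_count_; }

 private:
  std::map<SelectionMode, std::string> compiled_;
  std::size_t compile_count_ = 0;
};
```

Passing all four modes costs four compile steps, while passing only the mode the caller needs costs one, which is the saving the issue is after for very long expressions.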





[jira] [Created] (ARROW-5636) [C++][Gandiva] Expression cache should not use ToString on data type

2019-06-17 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5636:
-

 Summary: [C++][Gandiva] Expression cache should not use ToString 
on data type
 Key: ARROW-5636
 URL: https://issues.apache.org/jira/browse/ARROW-5636
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra


The expression cache in Gandiva uses the ToString() method of 
arrow::DataType for both hashing and equality.

This is error-prone. We should have a visitor for generating the hash, and use the 
equality visitor for comparison.

[~fsaintjacques] [~praveenbingo] 





[jira] [Created] (ARROW-5626) [C

2019-06-17 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5626:
-

 Summary: [C
 Key: ARROW-5626
 URL: https://issues.apache.org/jira/browse/ARROW-5626
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra








[jira] [Created] (ARROW-5602) [Java][Gandiva] Add test for decimal round functions

2019-06-13 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5602:
-

 Summary: [Java][Gandiva] Add test for decimal round functions
 Key: ARROW-5602
 URL: https://issues.apache.org/jira/browse/ARROW-5602
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-5579) [Java] shade flatbuffer dependency

2019-06-12 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5579:
-

 Summary: [Java] shade flatbuffer dependency
 Key: ARROW-5579
 URL: https://issues.apache.org/jira/browse/ARROW-5579
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra


Reported in a [GitHub issue|https://github.com/apache/arrow/issues/4489].

 

After some [discussion|https://github.com/google/flatbuffers/issues/5368] with 
the Flatbuffers maintainer, it appears that FB-generated code is not guaranteed 
to be compatible with _any_ version of the runtime library other than the 
exact version of the flatc used to compile it.

This makes depending on flatbuffers in a library (like arrow) quite risky, as 
if an app depends on any other version of FB, either directly or transitively, 
it's likely the versions will clash at some point and you'll see undefined 
behaviour at runtime.

Shading the dependency looks to me like the best way to avoid this.





[jira] [Created] (ARROW-5484) [Java] remove FieldReader from ValueVector

2019-06-02 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5484:
-

 Summary: [Java] remove FieldReader from ValueVector
 Key: ARROW-5484
 URL: https://issues.apache.org/jira/browse/ARROW-5484
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


Every implementation of ValueVector holds an instance of FieldReader, which has 
an overhead of 28 bytes on the heap. This can be avoided by instantiating the 
object only when required.





[jira] [Created] (ARROW-5483) [Java] add ValueVector constructors that take a Field object

2019-06-02 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5483:
-

 Summary: [Java] add ValueVector constructors that take a Field 
object
 Key: ARROW-5483
 URL: https://issues.apache.org/jira/browse/ARROW-5483
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


Each instance of a ValueVector instantiates a Field and a FieldType object, which 
together consume 81 bytes of heap space. This duplication can be avoided in cases where all 
the ValueVectors belong to the same set of columns/schema.





[jira] [Created] (ARROW-5482) reduce heap footprint of ValueVectors

2019-06-02 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5482:
-

 Summary: reduce heap footprint of ValueVectors
 Key: ARROW-5482
 URL: https://issues.apache.org/jira/browse/ARROW-5482
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


In some scenarios, we hold lots of value vectors in memory, e.g. during joins and 
aggregations. Heap analysis (using VisualVM on a Mac) shows the following costs for a simple 
IntVector:

 

IntVector : 80 bytes

vector.types.pojo.FieldType : 41 bytes

vector.types.pojo.Field : 40 bytes

IntReaderImpl : 28 bytes

 

I'll use this Jira to track ways to reduce the heap usage.

 





[jira] [Created] (ARROW-5451) [C++][Gandiva] Add round functions for decimals

2019-05-30 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5451:
-

 Summary: [C++][Gandiva] Add round functions for decimals
 Key: ARROW-5451
 URL: https://issues.apache.org/jira/browse/ARROW-5451
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


Will use this Jira to add support for :
 * round
 * truncate
 * ceil
 * floor
 * cast decimal to double, double to decimal
 * cast decimal to long, long to decimal
 * convert (modify precision/scale)





[jira] [Created] (ARROW-5321) [Gandiva][C++] add isnull and isnotnull for utf8 and binary types

2019-05-15 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5321:
-

 Summary: [Gandiva][C++] add isnull and isnotnull for utf8 and 
binary types
 Key: ARROW-5321
 URL: https://issues.apache.org/jira/browse/ARROW-5321
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-5243) [Java][Gandiva] Add test for decimal compare functions

2019-04-30 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5243:
-

 Summary: [Java][Gandiva] Add test for decimal compare functions
 Key: ARROW-5243
 URL: https://issues.apache.org/jira/browse/ARROW-5243
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++ - Gandiva, Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-5232) [Java] value vector size increases rapidly in case of clear/setSafe loop

2019-04-29 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5232:
-

 Summary: [Java] value vector size increases rapidly in case of 
clear/setSafe loop
 Key: ARROW-5232
 URL: https://issues.apache.org/jira/browse/ARROW-5232
 Project: Apache Arrow
  Issue Type: Bug
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-5226) [Gandiva] support compare operators for decimal

2019-04-28 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5226:
-

 Summary: [Gandiva] support compare operators for decimal
 Key: ARROW-5226
 URL: https://issues.apache.org/jira/browse/ARROW-5226
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-4758) [Flight] Build fails on Mac due to missing Schema_generated.h

2019-03-04 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4758:
-

 Summary: [Flight] Build fails on Mac due to missing 
Schema_generated.h
 Key: ARROW-4758
 URL: https://issues.apache.org/jira/browse/ARROW-4758
 Project: Apache Arrow
  Issue Type: Task
  Components: FlightRPC
Reporter: Pindikura Ravindra


I saw this on CI. Retriggering the build fixed the issue, and I am not able to 
get the link to the previous build failure.

The error happened for the file flight/client.cc, which includes 
ipc/metadata-internal.h, which includes arrow/ipc/Schema_generated.h.





[jira] [Created] (ARROW-4756) [CI] document the procedure to update docker image for manylinux1 builds

2019-03-03 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4756:
-

 Summary: [CI] document the procedure to update docker image for 
manylinux1 builds
 Key: ARROW-4756
 URL: https://issues.apache.org/jira/browse/ARROW-4756
 Project: Apache Arrow
  Issue Type: Task
  Components: Continuous Integration
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-4693) [CI] Build boost library with multi precision

2019-02-27 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4693:
-

 Summary: [CI] Build boost library with multi precision  
 Key: ARROW-4693
 URL: https://issues.apache.org/jira/browse/ARROW-4693
 Project: Apache Arrow
  Issue Type: Task
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


This is required for ARROW-4205.





[jira] [Created] (ARROW-4653) [C++] decimal multiply broken when both args are negative

2019-02-21 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4653:
-

 Summary: [C++] decimal multiply broken when both args are negative
 Key: ARROW-4653
 URL: https://issues.apache.org/jira/browse/ARROW-4653
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-4639) [CI] Crossbow build failing for Gandiva jars

2019-02-20 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4639:
-

 Summary: [CI] Crossbow build failing for Gandiva jars
 Key: ARROW-4639
 URL: https://issues.apache.org/jira/browse/ARROW-4639
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


All tests are failing. Seems to be related to gflags.

 

[https://travis-ci.org/pravindra/arrow-build/jobs/495977029]

 
1: Test timeout computed to be: 1000
1: Running arrow-allocator-test, redirecting output into 
/Users/travis/build/pravindra/arrow-build/arrow/cpp/build/build/test-logs/arrow-allocator-test.txt
 (attempt 1/1)
1: dyld: Library not loaded: @rpath/libgflags.2.2.dylib
1: Referenced from: 
/Users/travis/build/pravindra/arrow-build/arrow/cpp/build/release/libarrow.13.dylib
1: Reason: image not found
1: 
/Users/travis/build/pravindra/arrow-build/arrow/cpp/build-support/run-test.sh: 
line 97: 8124 Abort trap: 6 $TEST_EXECUTABLE "$@" 2>&1
1: 8125 Done | $ROOT/build-support/asan_symbolize.py
1: 8126 Done | c++filt
1: 8127 Done | $ROOT/build-support/stacktrace_addr2line.pl $TEST_EXECUTABLE
1: 8128 Done | $pipe_cmd 2>&1
1: 8129 Done | tee $LOGFILE





[jira] [Created] (ARROW-4570) [Gandiva] Add overflow checks for decimals

2019-02-14 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4570:
-

 Summary: [Gandiva] Add overflow checks for decimals
 Key: ARROW-4570
 URL: https://issues.apache.org/jira/browse/ARROW-4570
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


For decimals, overflows can occur in two places:
 # The input array can have values that are outside the bound (e.g. > 38 digits).
 # An operation can result in an overflow, e.g. adding two decimals of (38, 
6) can overflow if the input numbers are very large.

In both cases, just checking whether an overflow occurred can be a performance 
overhead, so we should do this based on a conf variable.
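
The two cases and the optional check can be sketched as below. To stay portable, the sketch uses 64-bit values with a maximum precision of 18 instead of 128-bit decimals with 38 digits; the function names and the check_overflow flag (standing in for the conf variable) are assumptions.

```cpp
#include <cassert>
#include <cstdint>

const int kMaxPrecision = 18;  // assumption: 64-bit sketch, not 38 digits

// Returns 10^exp for 0 <= exp <= kMaxPrecision.
int64_t PowerOfTen(int exp) {
  assert(exp >= 0 && exp <= kMaxPrecision);
  int64_t result = 1;
  for (int i = 0; i < exp; ++i) result *= 10;
  return result;
}

// Case 1: does an input value fit in `precision` digits?
bool FitsInPrecision(int64_t value, int precision) {
  const int64_t limit = PowerOfTen(precision);
  return value > -limit && value < limit;
}

// Case 2: add two decimals of the same precision and scale. The checks run
// only when check_overflow is true. Returns false if a check fails.
bool AddDecimals(int64_t x, int64_t y, int precision, bool check_overflow,
                 int64_t* out) {
  if (check_overflow &&
      (!FitsInPrecision(x, precision) || !FitsInPrecision(y, precision))) {
    return false;
  }
  // Unsigned addition wraps instead of invoking undefined behaviour, which
  // matters when the checks are disabled and the inputs are arbitrary.
  const int64_t sum =
      static_cast<int64_t>(static_cast<uint64_t>(x) + static_cast<uint64_t>(y));
  if (check_overflow && !FitsInPrecision(sum, precision)) {
    return false;
  }
  *out = sum;
  return true;
}
```

With precision 2, adding 60 and 50 is rejected when the flag is on, since 110 needs three digits, and silently returns 110 when the flag is off, which is the trade-off between safety and speed that the conf variable would control.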





[jira] [Created] (ARROW-4569) [Gandiva] validate that the precision/scale are within bounds

2019-02-14 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4569:
-

 Summary: [Gandiva] validate that the precision/scale are within 
bounds
 Key: ARROW-4569
 URL: https://issues.apache.org/jira/browse/ARROW-4569
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-4532) varwidth vector buffer much larger than expected

2019-02-11 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4532:
-

 Summary: varwidth vector buffer much larger than expected
 Key: ARROW-4532
 URL: https://issues.apache.org/jira/browse/ARROW-4532
 Project: Apache Arrow
  Issue Type: Bug
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


There's a bug in BaseVariableWidthVector.java::setSafe that's causing the value 
buffers to be much larger than expected. This causes memory wastage.

 

 





[jira] [Created] (ARROW-4496) [CI] CI failing for python Xcode 7.3

2019-02-07 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4496:
-

 Summary: [CI] CI failing for python Xcode 7.3
 Key: ARROW-4496
 URL: https://issues.apache.org/jira/browse/ARROW-4496
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Pindikura Ravindra


The last couple of PR-triggered builds have failed with this:

CMake Error at cmake_modules/FindNumPy.cmake:62 (message):
NumPy import failure:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/__init__.py", line 142, in <module>
from . import add_newdocs
File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/add_newdocs.py", line 13, in <module>
from numpy.lib import add_newdoc
File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/lib/__init__.py", line 8, in <module>
from .type_check import *
File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/lib/type_check.py", line 11, in <module>
import numpy.core.numeric as _nx
File "/Users/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages/numpy/core/__init__.py", line 26, in <module>
raise ImportError(msg)

[https://travis-ci.org/apache/arrow/jobs/489917808]





[jira] [Created] (ARROW-4403) [Rust] CI fails due to formatting errors

2019-01-28 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4403:
-

 Summary: [Rust] CI fails due to formatting errors
 Key: ARROW-4403
 URL: https://issues.apache.org/jira/browse/ARROW-4403
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra


[https://travis-ci.org/apache/arrow/jobs/485310770]

 
Diff in /home/travis/build/apache/arrow/rust/arrow/src/csv/reader.rs at line 545:
 Field::new("lng", DataType::Float64, false),
 ]);
 
- let file_with_headers = File::open("test/data/uk_cities_with_headers.csv").unwrap();
+ let file_with_headers =
+ File::open("test/data/uk_cities_with_headers.csv").unwrap();
 let file_without_headers = File::open("test/data/uk_cities.csv").unwrap();
 let both_files = file_with_headers
 .chain(Cursor::new("\n".to_string()))





[jira] [Created] (ARROW-4400) [CI] install of clang tools failing

2019-01-27 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4400:
-

 Summary: [CI] install of clang tools failing
 Key: ARROW-4400
 URL: https://issues.apache.org/jira/browse/ARROW-4400
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra


+sudo apt-add-repository -y 'deb http://llvm.org/apt/xenial/ 
llvm-toolchain-xenial-6.0 main'
+sudo apt-get update -qq
W: The repository 'http://llvm.org/apt/xenial llvm-toolchain-xenial-6.0 
Release' does not have a Release file.
E: Failed to fetch 
https://llvm.org/apt/xenial/dists/llvm-toolchain-xenial-6.0/main/binary-amd64/Packages
 Protocol "http" not supported or disabled in libcurl
E: Some index files failed to download. They have been ignored, or old ones 
used instead.
The command "$TRAVIS_BUILD_DIR/ci/travis_install_clang_tools.sh" failed and 
exited with 100 during .
 
Your build has been stopped.





[jira] [Created] (ARROW-4357) arrow java build broken on trusty

2019-01-24 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4357:
-

 Summary: arrow java build broken on trusty
 Key: ARROW-4357
 URL: https://issues.apache.org/jira/browse/ARROW-4357
 Project: Apache Arrow
  Issue Type: Bug
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


[https://travis-ci.com/dremio/arrow-build/builds/98435917]

 
SLF4J: The requested version 1.5.6 by your slf4j binding is not compatible with 
[1.6, 1.7]
SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details.





[jira] [Created] (ARROW-4342) [Gandiva][Java] spurious failures in projector cache test

2019-01-23 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4342:
-

 Summary: [Gandiva][Java] spurious failures in projector cache test
 Key: ARROW-4342
 URL: https://issues.apache.org/jira/browse/ARROW-4342
 Project: Apache Arrow
  Issue Type: Bug
  Components: Gandiva, Java
Reporter: Pindikura Ravindra


[ERROR] Tests run: 21, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.542 
s <<< FAILURE! - in org.apache.arrow.gandiva.evaluator.ProjectorTest

[ERROR] testMakeProjector(org.apache.arrow.gandiva.evaluator.ProjectorTest) 
Time elapsed: 0.079 s <<< FAILURE! java.lang.AssertionError at 
org.apache.arrow.gandiva.evaluator.ProjectorTest.testMakeProjector(ProjectorTest.java:164)





[jira] [Created] (ARROW-4274) [Gandiva] static jni library broken after decimal changes

2019-01-16 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4274:
-

 Summary: [Gandiva] static jni library broken after decimal changes
 Key: ARROW-4274
 URL: https://issues.apache.org/jira/browse/ARROW-4274
 Project: Apache Arrow
  Issue Type: Bug
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


With the decimal changes, there can be C++ calls from the IR code. The symbols 
for these need to be visible in the Gandiva C++ library, but the JNI library 
makes visible only a limited set of symbols from Gandiva (the ones specified in 
src/gandiva/jni/symbols.map).

This breaks if the JNI library links libstdc++ statically (Dremio builds 
the Gandiva binary with libstdc++ statically linked), for two reasons:
 # C++ symbols like std::ios_base::init are not exported via symbols.map. 
This causes LLVM to complain that there are unresolved symbols.
 # There is also a problem with exceptions (string_view.hpp can throw 
exceptions). This also causes LLVM to complain that unwindResume is unresolved.





[jira] [Created] (ARROW-4209) [Gandiva] returning IR structs causes issues with windows

2019-01-09 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4209:
-

 Summary: [Gandiva] returning IR structs causes issues with windows
 Key: ARROW-4209
 URL: https://issues.apache.org/jira/browse/ARROW-4209
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


The decimal add function returns a struct (of high/low values). This is known to be 
fragile due to ABI compatibility issues, so this change switches it to 
primitive types.
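
A minimal sketch of the primitive-types approach: the two 64-bit halves of each operand are passed as scalars and the result is written through out-pointers, so no struct crosses the call boundary. The function name add_int128_parts and its exact signature are assumptions, not the signature used in Gandiva.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit add on (high, low) pairs. Only primitive types and
// pointers appear in the signature, which sidesteps the platform-specific
// rules for returning structs by value.
extern "C" void add_int128_parts(int64_t x_high, uint64_t x_low,
                                 int64_t y_high, uint64_t y_low,
                                 int64_t* out_high, uint64_t* out_low) {
  assert(out_high != nullptr && out_low != nullptr);
  const uint64_t low = x_low + y_low;            // wraps modulo 2^64
  const uint64_t carry = (low < x_low) ? 1 : 0;  // wrapped => carry out
  // Add the high halves as unsigned values to avoid signed-overflow
  // undefined behaviour; the bit pattern is the two's complement result.
  const uint64_t high = static_cast<uint64_t>(x_high) +
                        static_cast<uint64_t>(y_high) + carry;
  *out_high = static_cast<int64_t>(high);
  *out_low = low;
}
```

For example, adding 1 to the value whose low half is all ones and whose high half is 0 produces high = 1 and low = 0, showing the carry between the halves.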





[jira] [Created] (ARROW-4206) [Gandiva] Implement decimal divide

2019-01-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4206:
-

 Summary: [Gandiva] Implement decimal divide
 Key: ARROW-4206
 URL: https://issues.apache.org/jira/browse/ARROW-4206
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-4204) [Gandiva] implement decimal subtract

2019-01-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4204:
-

 Summary: [Gandiva] implement decimal subtract
 Key: ARROW-4204
 URL: https://issues.apache.org/jira/browse/ARROW-4204
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4205) [Gandiva] Implement decimal multiply

2019-01-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4205:
-

 Summary: [Gandiva] Implement decimal multiply
 Key: ARROW-4205
 URL: https://issues.apache.org/jira/browse/ARROW-4205
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4203) [Gandiva] use aliases when building expressions to simplify tests

2019-01-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4203:
-

 Summary: [Gandiva] use aliases when building expressions to 
simplify tests
 Key: ARROW-4203
 URL: https://issues.apache.org/jira/browse/ARROW-4203
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra


{code:java}
auto node_c = TreeExprBuilder::MakeField(field_c);
auto if_node = TreeExprBuilder::MakeIf(node_c, node_a, node_b, decimal_type);

auto expr = TreeExprBuilder::MakeExpression(if_node, field_result);
{code}
@wesm suggested that code like the above can be simplified with aliases:
{code:java}
gandiva::expr(gandiva::if_(gandiva::field(...), ...), ...)
{code}





[jira] [Created] (ARROW-4202) [Gandiva] use ArrayFromJson in tests

2019-01-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4202:
-

 Summary: [Gandiva] use ArrayFromJson in tests
 Key: ARROW-4202
 URL: https://issues.apache.org/jira/browse/ARROW-4202
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra


Most of the Gandiva tests use wrappers over ArrayFromVector. These will become 
a lot more readable if we switch to ArrayFromJSON.





[jira] [Created] (ARROW-4201) [Gandiva] integrate test utils with arrow

2019-01-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4201:
-

 Summary: [Gandiva] integrate test utils with arrow
 Key: ARROW-4201
 URL: https://issues.apache.org/jira/browse/ARROW-4201
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra


The following tasks are to be addressed as part of this Jira:
 # move (or consolidate) data generators in generate_data.h to arrow
 # move convenience functions in gandiva/tests/test_util.h to arrow
 # move (or consolidate) EXPECT_ARROW_* functions to arrow





[jira] [Created] (ARROW-4167) [Gandiva] switch to arrow/util/variant

2019-01-06 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4167:
-

 Summary: [Gandiva] switch to arrow/util/variant
 Key: ARROW-4167
 URL: https://issues.apache.org/jira/browse/ARROW-4167
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


The Gandiva C++ code uses Boost variant. It should switch to arrow/util/variant.





[jira] [Created] (ARROW-4147) [JAVA] Reduce heap usage for variable width vectors

2019-01-02 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4147:
-

 Summary: [JAVA] Reduce heap usage for variable width vectors
 Key: ARROW-4147
 URL: https://issues.apache.org/jira/browse/ARROW-4147
 Project: Apache Arrow
  Issue Type: Bug
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


This is a follow-up to ARROW-1807. The same changes need to be made for 
variable-length vectors too.

Also, the default value for initial allocations (4096) wastes a lot of memory 
and needs to be changed (to 3970).
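
The issue does not spell out why 4096 is wasteful or why 3970 was picked. One plausible reading, and it is only an assumption here, is that the validity bitmap and the 4-byte data (or offset) entries are carved from a single allocation that gets rounded up to a power of two. Under that assumed model the arithmetic can be sketched as follows (all function names are hypothetical):

```cpp
#include <cstdint>

// Round n up to the next power of two (n >= 1).
uint64_t next_pow2(uint64_t n) {
  uint64_t p = 1;
  while (p < n) p <<= 1;
  return p;
}

// Bytes needed for a validity bitmap of `count` bits,
// padded to a multiple of 8 bytes.
uint64_t validity_bytes(uint64_t count) {
  uint64_t bytes = (count + 7) / 8;
  return (bytes + 7) / 8 * 8;
}

// Assumed model: one allocation holds `count` 4-byte entries plus the
// validity bitmap, and the allocator rounds it up to a power of two.
uint64_t combined_allocation(uint64_t count) {
  return next_pow2(count * 4 + validity_bytes(count));
}
```

With this model, 4096 entries need 16384 + 512 bytes and get rounded up to 32768, while 3970 entries need 15880 + 504 = 16384 bytes, which is already a power of two.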





[jira] [Created] (ARROW-4115) [Gandiva] valgrind complains that boolean output data buffer has uninited data

2018-12-26 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4115:
-

 Summary: [Gandiva] valgrind complains that boolean output data 
buffer has uninited data
 Key: ARROW-4115
 URL: https://issues.apache.org/jira/browse/ARROW-4115
 Project: Apache Arrow
  Issue Type: Bug
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra








[jira] [Created] (ARROW-4104) [Java] race in AllocationManager during release

2018-12-22 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4104:
-

 Summary: [Java] race in AllocationManager during release
 Key: ARROW-4104
 URL: https://issues.apache.org/jira/browse/ARROW-4104
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


This is caused by a bug in my changes for ARROW-1807. The synchronization 
happens on the BufferLedger instance instead of the AllocationManager 
instance.
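
To illustrate the kind of bug described, here is a small C++ sketch (the real code is Java, and these classes are hypothetical stand-ins). State shared by all ledgers lives in the manager, so the lock has to be the manager's; a lock owned by each ledger would let two ledgers update the shared state at the same time.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-in: owns the state shared by all of its ledgers.
class AllocationManager {
 public:
  std::mutex mutex;          // guards ledger_count
  int64_t ledger_count = 0;  // shared across every ledger of this manager
};

// Hypothetical stand-in for a per-buffer ledger.
class BufferLedger {
 public:
  explicit BufferLedger(AllocationManager* manager) : manager_(manager) {
    std::lock_guard<std::mutex> lock(manager_->mutex);
    ++manager_->ledger_count;
  }

  // Correct: lock the manager, because the manager's state is modified.
  // The buggy shape would lock a mutex that belongs to this ledger only,
  // which does not exclude other ledgers of the same manager.
  void Release() {
    std::lock_guard<std::mutex> lock(manager_->mutex);
    --manager_->ledger_count;
  }

 private:
  AllocationManager* manager_;
};
```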





[jira] [Created] (ARROW-4086) [Java] Add api to fetch summary of root allocator

2018-12-20 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4086:
-

 Summary: [Java] Add api to fetch summary of root allocator
 Key: ARROW-4086
 URL: https://issues.apache.org/jira/browse/ARROW-4086
 Project: Apache Arrow
  Issue Type: Task
  Components: Java
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


On allocation failures, it's useful to know where the memory is being used in 
the tree of allocators (for debugging). One way to do this would be by adding 
APIs to:
 # get the root allocator from a given allocator
 # get a summary of usage/limit from an allocator up to N levels (N is an input 
arg)
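
The two APIs above could look roughly like the following sketch. It is written in C++ for illustration only; the Allocator class, its method names, and the summary format are all hypothetical and not the Arrow Java API. Usage is tracked per node here and not aggregated into parents.

```cpp
#include <cstdint>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a node in the tree of allocators.
class Allocator {
 public:
  Allocator(std::string name, int64_t limit, Allocator* parent = nullptr)
      : name_(std::move(name)), limit_(limit), parent_(parent) {}

  Allocator* NewChild(const std::string& name, int64_t limit) {
    children_.push_back(std::make_unique<Allocator>(name, limit, this));
    return children_.back().get();
  }

  void SetUsed(int64_t bytes) { used_ = bytes; }

  // (1) Follow parent links until the root is reached.
  Allocator* GetRoot() {
    Allocator* cur = this;
    while (cur->parent_ != nullptr) cur = cur->parent_;
    return cur;
  }

  // (2) One "name used/limit" line per allocator, at most num_levels deep.
  std::string Summary(int num_levels) const {
    std::ostringstream out;
    Append(out, 0, num_levels);
    return out.str();
  }

 private:
  void Append(std::ostringstream& out, int depth, int num_levels) const {
    if (depth >= num_levels) return;
    out << std::string(2 * depth, ' ') << name_ << " " << used_ << "/"
        << limit_ << "\n";
    for (const auto& child : children_) child->Append(out, depth + 1, num_levels);
  }

  std::string name_;
  int64_t limit_;
  Allocator* parent_;
  int64_t used_ = 0;
  std::vector<std::unique_ptr<Allocator>> children_;
};
```

On an allocation failure, a caller could log GetRoot()->Summary(N) to see where the memory is held.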





[jira] [Created] (ARROW-4077) [Gandiva] fix CI if ctest doesn't run any tests

2018-12-19 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-4077:
-

 Summary: [Gandiva] fix CI if ctest doesn't run any tests
 Key: ARROW-4077
 URL: https://issues.apache.org/jira/browse/ARROW-4077
 Project: Apache Arrow
  Issue Type: Bug
  Components: Gandiva
Reporter: Pindikura Ravindra


This has happened a couple of times already due to changes in 
build/flags/labels, and it's hard to notice unless we look into the Travis 
output carefully.

Instead, travis_script_gandiva_cpp.sh should terminate with a non-zero exit 
code if no tests are run.





[jira] [Created] (ARROW-3991) [gandiva] floating point division shouldn't cause errors

2018-12-10 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3991:
-

 Summary: [gandiva] floating point division shouldn't cause errors
 Key: ARROW-3991
 URL: https://issues.apache.org/jira/browse/ARROW-3991
 Project: Apache Arrow
  Issue Type: Bug
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


For division, Gandiva explicitly checks whether the divisor is zero and raises 
an error.

This is correct for integer division. For floating-point division, it should 
just return infinity.

https://www.gnu.org/software/libc/manual/html_node/Infinity-and-NaN.html#Infinity-and-NaN
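
A minimal sketch of the intended behaviour, assuming IEEE 754 doubles (the function names and the has_error out-parameter are hypothetical, not Gandiva's signatures). Integer division keeps the zero check; floating-point division has none and lets the hardware produce +/-infinity, or NaN for 0/0.

```cpp
#include <cstdint>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE 754 doubles");

// Integer division: a zero divisor has no representable result,
// so report an error. (INT64_MIN / -1 is not handled in this sketch.)
int64_t divide_int64(int64_t x, int64_t y, bool* has_error) {
  if (y == 0) {
    *has_error = true;
    return 0;
  }
  *has_error = false;
  return x / y;
}

// Floating-point division: no zero check. Under IEEE 754, x / 0.0 is
// +inf or -inf for non-zero x, and 0.0 / 0.0 is NaN.
double divide_float64(double x, double y) { return x / y; }
```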





[jira] [Created] (ARROW-3979) [Gandiva] fix all valgrind reported errors

2018-12-10 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3979:
-

 Summary: [Gandiva] fix all valgrind reported errors
 Key: ARROW-3979
 URL: https://issues.apache.org/jira/browse/ARROW-3979
 Project: Apache Arrow
  Issue Type: Bug
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


Travis reports lots of valgrind errors when running gandiva tests.





[jira] [Created] (ARROW-3977) [Gandiva] gandiva cpp tests not running in CI

2018-12-09 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3977:
-

 Summary: [Gandiva] gandiva cpp tests not running in CI
 Key: ARROW-3977
 URL: https://issues.apache.org/jira/browse/ARROW-3977
 Project: Apache Arrow
  Issue Type: Bug
  Components: Gandiva
Reporter: Pindikura Ravindra


Saw this in the logs:
 
Checking test dependency graph...
Checking test dependency graph end
No tests were found!!!





[jira] [Created] (ARROW-3805) [Gandiva] handle null validity bitmap in if-else expressions

2018-11-16 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3805:
-

 Summary: [Gandiva] handle null validity bitmap in if-else 
expressions
 Key: ARROW-3805
 URL: https://issues.apache.org/jira/browse/ARROW-3805
 Project: Apache Arrow
  Issue Type: Bug
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


This is a follow-up to the changes in ARROW-3765

[~suquark] [~pcmoritz] [~praveenbingo]





[jira] [Created] (ARROW-3701) [Gandiva] Add support for decimal operations

2018-11-04 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3701:
-

 Summary: [Gandiva] Add support for decimal operations
 Key: ARROW-3701
 URL: https://issues.apache.org/jira/browse/ARROW-3701
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


To begin with, we will add support for 128-bit decimals. There are two parts:
 # llvm_generator needs to understand decimal types (value, precision, scale)
 # code the decimal operations: add/subtract/multiply/divide/mod/..
 ** This will be C++ code that can be pre-compiled to emit IR code





[jira] [Created] (ARROW-3655) [Gandiva] switch away from default_memory_pool

2018-10-30 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3655:
-

 Summary: [Gandiva] switch away from default_memory_pool
 Key: ARROW-3655
 URL: https://issues.apache.org/jira/browse/ARROW-3655
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra


After the changes for ARROW-3519, Gandiva uses default_memory_pool for some 
allocations. This needs to be replaced with the pool passed in the Evaluate 
call.

 

Also, change signatures of all Evaluate APIs (both in project and filter) to 
take a pool argument.





[jira] [Created] (ARROW-3597) [Gandiva] gandiva should integrate with ADD_ARROW_TEST for tests

2018-10-23 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3597:
-

 Summary: [Gandiva] gandiva should integrate with ADD_ARROW_TEST 
for tests
 Key: ARROW-3597
 URL: https://issues.apache.org/jira/browse/ARROW-3597
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra








[jira] [Created] (ARROW-3519) Add support for functions that can return variable len output

2018-10-15 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3519:
-

 Summary: Add support for functions that can return variable len 
output
 Key: ARROW-3519
 URL: https://issues.apache.org/jira/browse/ARROW-3519
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


This is a pre-requisite for ARROW-3459.





[jira] [Created] (ARROW-3511) support input selection vectors for both projector and filter

2018-10-14 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3511:
-

 Summary: support input selection vectors for both projector and 
filter
 Key: ARROW-3511
 URL: https://issues.apache.org/jira/browse/ARROW-3511
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra


The Gandiva filter module returns a selection vector representing the indices 
of records (in the batch) that matched the filter. We can connect this to other 
modules by passing this selection vector along as an input argument to the 
downstream projector/filter.
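
The chaining idea can be sketched with plain vectors. Filter and Project below are hypothetical stand-ins, not the Gandiva classes: the filter optionally takes an input selection vector and only looks at those rows, and the projector computes values only for the selected rows.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

using SelectionVector = std::vector<uint32_t>;

// Hypothetical filter: returns the indices of rows that match `predicate`.
// If an input selection is given, only those rows are considered.
SelectionVector Filter(const std::vector<int32_t>& column,
                       const std::function<bool(int32_t)>& predicate,
                       const SelectionVector* input_selection = nullptr) {
  SelectionVector out;
  if (input_selection == nullptr) {
    for (uint32_t i = 0; i < column.size(); ++i) {
      if (predicate(column[i])) out.push_back(i);
    }
  } else {
    for (uint32_t i : *input_selection) {
      if (predicate(column[i])) out.push_back(i);
    }
  }
  return out;
}

// Hypothetical projector: evaluates `fn` only for the selected rows.
std::vector<int32_t> Project(const std::vector<int32_t>& column,
                             const std::function<int32_t(int32_t)>& fn,
                             const SelectionVector& selection) {
  std::vector<int32_t> out;
  out.reserve(selection.size());
  for (uint32_t i : selection) out.push_back(fn(column[i]));
  return out;
}
```

The output of one Filter call can be handed to a second Filter or to Project, so rows rejected upstream are never evaluated downstream.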

 





[jira] [Created] (ARROW-3501) remove dependency of gcc 4.9 for gandiva

2018-10-12 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3501:
-

 Summary: remove dependency of gcc 4.9 for gandiva
 Key: ARROW-3501
 URL: https://issues.apache.org/jira/browse/ARROW-3501
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra


Gandiva has a dependency on gcc 4.9: building with gcc 4.8 causes a link error. 
Investigate and remove this dependency if possible.





[jira] [Created] (ARROW-3487) simplify NULL_IF_NULL functions that can return errors

2018-10-11 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3487:
-

 Summary: simplify NULL_IF_NULL functions that can return errors
 Key: ARROW-3487
 URL: https://issues.apache.org/jira/browse/ARROW-3487
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra


NULL_IF_NULL functions that can return errors (e.g., divide) currently look at 
the validity bits in each function (to avoid returning spurious errors).

 
{code:java}
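divide(TYPE in1, boolean is_valid1, TYPE in2, boolean is_valid2, ..) {
  // a null input gives a null output, so skip the error check
  if (!is_valid1 || !is_valid2) {
    return 0;
  }
  // both inputs are valid, so a zero divisor is a real error
  if (in2 == 0) { /* set error */ }
}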
{code}
 

This validity check is duplicated across multiple functions and should be moved 
to the common layer (for all NULL_IF_NULL functions that can return errors).
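
One way the common layer could look, as a rough sketch: a single wrapper checks the validity bits once and calls the function only when all inputs are valid, so the function body keeps just its own error check. CallNullIfNull and divide_int32 are hypothetical names, and the real common layer in Gandiva is generated LLVM IR, not a C++ template.

```cpp
#include <cstdint>

// Per-function logic only: no validity handling in here any more.
int32_t divide_int32(int32_t in1, int32_t in2, bool* has_error) {
  if (in2 == 0) {
    *has_error = true;  // real error: both inputs are known to be valid
    return 0;
  }
  return in1 / in2;
}

// Hypothetical common layer for NULL_IF_NULL functions: if any input is
// null, the output is null and the function is never called, so it cannot
// raise a spurious error from garbage in a null slot.
template <typename T, typename Fn>
T CallNullIfNull(Fn fn, T in1, bool is_valid1, T in2, bool is_valid2,
                 bool* out_valid, bool* has_error) {
  *out_valid = is_valid1 && is_valid2;
  if (!*out_valid) return T{};
  return fn(in1, in2, has_error);
}
```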

 

 





[jira] [Created] (ARROW-3472) remove gandiva helpers library

2018-10-09 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3472:
-

 Summary: remove gandiva helpers library
 Key: ARROW-3472
 URL: https://issues.apache.org/jira/browse/ARROW-3472
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra
Assignee: Pindikura Ravindra


Gandiva has two native libraries, libgandiva.so and libgandiva_helpers.so. 
The helpers one is mostly a duplicate and was added to get around unresolved 
symbols with Java/JNI, but this is a hack and needs to be cleaned up.





[jira] [Created] (ARROW-3469) add travis entry for gandiva on OSX

2018-10-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3469:
-

 Summary: add travis entry for gandiva on OSX
 Key: ARROW-3469
 URL: https://issues.apache.org/jira/browse/ARROW-3469
 Project: Apache Arrow
  Issue Type: Task
  Components: Gandiva
Reporter: Pindikura Ravindra


ARROW-3382 adds a travis job for gandiva on ubuntu. We need to do the same for 
OSX.





[jira] [Created] (ARROW-3459) Add support for variable length output vectors

2018-10-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3459:
-

 Summary: Add support for variable length output vectors
 Key: ARROW-3459
 URL: https://issues.apache.org/jira/browse/ARROW-3459
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Gandiva
Reporter: Pindikura Ravindra


Gandiva can currently handle variable length input vectors but requires the 
output vectors to be fixed-length. This is because we do not have a handle to 
allocate or resize arrow vectors from inside the LLVM code. Due to this 
limitation, we are not able to support a lot of utf8 related functions 
(convert-string-to-numeric, toupper, strstr, replace, ..).

 

This needs to be fixed for both C++ and Java.

 





[jira] [Created] (ARROW-3458) Add a string based expression parser

2018-10-08 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3458:
-

 Summary: Add a string based expression parser
 Key: ARROW-3458
 URL: https://issues.apache.org/jira/browse/ARROW-3458
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Gandiva
Reporter: Pindikura Ravindra


Gandiva currently supports a tree-based expression builder. This requires 
writing a lot of code for even simple expressions.

For example, to build an expression for "a + b < 10", the code is:
{code:java}
  // schema for input fields
  auto field0 = field("a", int32());
  auto field1 = field("b", int32());
  auto schema = arrow::schema({field0, field1});

  // output fields
  auto field_result = field("res", boolean());

  // Build expression
  auto node_f0 = TreeExprBuilder::MakeField(field0);
  auto node_f1 = TreeExprBuilder::MakeField(field1);
  auto literal_10 = TreeExprBuilder::MakeLiteral(10);
  auto sum_expr =
      TreeExprBuilder::MakeFunction("add", {node_f0, node_f1}, int32());
  // less_than operates on nodes, so build it with MakeFunction and then
  // wrap the root node into an expression
  auto lt_node = TreeExprBuilder::MakeFunction(
      "less_than", {sum_expr, literal_10}, boolean());
  auto lt_expr = TreeExprBuilder::MakeExpression(lt_node, field_result);
{code}
An alternative way to do this would be:

 
{code:java}
// Build expression
auto expr =
    StringExprBuilder::MakeExpression(schema, "a + b < 10", field_result);
{code}
The expression syntax should be close to that of SQL.

 

To begin with, this will simplify writing tests. It will also provide an easier 
API for working with Gandiva.
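
As a rough illustration of the parsing step only, here is a tiny recursive-descent parser that turns a string like "a + b < 10" into a function-call form. TinyExprParser is hypothetical and is not a Gandiva API; it handles just field names, integer literals, '+' and '<', and it emits a string where a real implementation would emit TreeExprBuilder nodes and resolve types against the schema.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical sketch of a string-based expression parser.
// Grammar: comparison := sum ('<' sum)?
//          sum        := operand ('+' operand)*
//          operand    := identifier | integer literal
class TinyExprParser {
 public:
  explicit TinyExprParser(std::string text) : text_(std::move(text)) {}

  std::string Parse() {
    std::string left = ParseSum();
    SkipSpaces();
    if (pos_ < text_.size() && text_[pos_] == '<') {
      ++pos_;
      left = "less_than(" + left + ", " + ParseSum() + ")";
    }
    SkipSpaces();
    if (pos_ != text_.size()) {
      throw std::invalid_argument("unexpected trailing input");
    }
    return left;
  }

 private:
  std::string ParseSum() {
    std::string left = ParseOperand();
    SkipSpaces();
    while (pos_ < text_.size() && text_[pos_] == '+') {
      ++pos_;
      left = "add(" + left + ", " + ParseOperand() + ")";  // left-associative
      SkipSpaces();
    }
    return left;
  }

  std::string ParseOperand() {
    SkipSpaces();
    std::size_t start = pos_;
    while (pos_ < text_.size() &&
           (std::isalnum(static_cast<unsigned char>(text_[pos_])) ||
            text_[pos_] == '_')) {
      ++pos_;
    }
    if (pos_ == start) {
      throw std::invalid_argument("expected a field name or literal");
    }
    return text_.substr(start, pos_ - start);
  }

  void SkipSpaces() {
    while (pos_ < text_.size() &&
           std::isspace(static_cast<unsigned char>(text_[pos_]))) {
      ++pos_;
    }
  }

  std::string text_;
  std::size_t pos_ = 0;
};
```

Because '+' binds tighter than '<' in this grammar, "a + b < 10" parses as less_than(add(a, b), 10), which mirrors the tree built by hand above.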


