[GitHub] [arrow-site] wesm merged pull request #34: Update docs for 0.15.0

2019-10-12 Thread GitBox
wesm merged pull request #34: Update docs for 0.15.0
URL: https://github.com/apache/arrow-site/pull/34
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [arrow-site] wesm opened a new pull request #34: Update docs for 0.15.0

2019-10-12 Thread GitBox
wesm opened a new pull request #34: Update docs for 0.15.0
URL: https://github.com/apache/arrow-site/pull/34
 
 
   




[arrow] branch master updated: ARROW-6860: [Python][C++] Do not link shared libraries monolithically to pyarrow.lib, add libarrow_python_flight.so

2019-10-12 Thread wesm
This is an automated email from the ASF dual-hosted git repository.

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 102acc4  ARROW-6860: [Python][C++] Do not link shared libraries monolithically to pyarrow.lib, add libarrow_python_flight.so
102acc4 is described below

commit 102acc47287c37a01ac11a5cb6bd1da3f1f0712d
Author: Wes McKinney 
AuthorDate: Sat Oct 12 14:42:04 2019 -0500

ARROW-6860: [Python][C++] Do not link shared libraries monolithically to pyarrow.lib, add libarrow_python_flight.so

This adds a new shared library, libarrow_python_flight.so, which allows us
to link libarrow_flight and this new library to the Cython _flight
extension. I initially tried moving the Flight Python bindings directly to
libarrow_flight but realized this would create a transitive dependency on
libpython, which is not desirable. Any shared library that uses Python C
APIs is expected to be loaded into a running Python interpreter and not
linked explicitly to libpython.

Because Apache ORC also needs to statically link Protocol Buffers, I have
disabled it in the manylinux wheels. Hopefully we can come up with a
solution where projects like Apache Beam, TensorFlow, and others can all
use Protocol Buffers together and not have these problems.

Closes #5627 from wesm/ARROW-6860 and squashes the following commits:

d5d67f81c  Revert libarrow_flight.pxd changes
b31fbdf32  Build libarrow_python_flight that links to libarrow_python and libarrow_flight. Do not link all shared libraries to Cython "lib" extension

Authored-by: Wes McKinney 
Signed-off-by: Wes McKinney 
---
 cpp/cmake_modules/FindArrowFlight.cmake | 41 +--
 cpp/src/arrow/python/CMakeLists.txt | 48 +++
 cpp/src/arrow/python/flight.h   | 59 +++--
 python/CMakeLists.txt   | 46 -
 python/manylinux1/build_arrow.sh| 11 +++---
 python/manylinux2010/build_arrow.sh | 11 +++---
 6 files changed, 166 insertions(+), 50 deletions(-)

diff --git a/cpp/cmake_modules/FindArrowFlight.cmake b/cpp/cmake_modules/FindArrowFlight.cmake
index 193361a..048e153 100644
--- a/cpp/cmake_modules/FindArrowFlight.cmake
+++ b/cpp/cmake_modules/FindArrowFlight.cmake
@@ -66,7 +66,13 @@ find_library(ARROW_FLIGHT_LIB_PATH
  PATHS ${ARROW_SEARCH_LIB_PATH}
  PATH_SUFFIXES ${LIB_PATH_SUFFIXES}
  NO_DEFAULT_PATH)
+find_library(ARROW_PYTHON_FLIGHT_LIB_PATH
+ NAMES arrow_python_flight
+ PATHS ${ARROW_SEARCH_LIB_PATH}
+ PATH_SUFFIXES ${LIB_PATH_SUFFIXES}
+ NO_DEFAULT_PATH)
 get_filename_component(ARROW_FLIGHT_LIBS ${ARROW_FLIGHT_LIB_PATH} DIRECTORY)
+get_filename_component(ARROW_PYTHON_FLIGHT_LIBS ${ARROW_PYTHON_FLIGHT_LIB_PATH} DIRECTORY)
 
 if(MSVC)
   # Prioritize "/bin" over LIB_PATH_SUFFIXES - DLL files are installed
@@ -77,7 +83,15 @@ if(MSVC)
PATHS ${ARROW_HOME}
PATH_SUFFIXES "bin" ${LIB_PATH_SUFFIXES}
NO_DEFAULT_PATH)
+  find_library(ARROW_PYTHON_FLIGHT_SHARED_LIBRARIES
+   NAMES arrow_flight
+   PATHS ${ARROW_HOME}
+   PATH_SUFFIXES "bin" ${LIB_PATH_SUFFIXES}
+   NO_DEFAULT_PATH)
+
   get_filename_component(ARROW_FLIGHT_SHARED_LIBS ${ARROW_FLIGHT_SHARED_LIBRARIES} DIRECTORY)
+  get_filename_component(ARROW_PYTHON_FLIGHT_SHARED_LIBS
+${ARROW_PYTHON_FLIGHT_SHARED_LIBRARIES} DIRECTORY)
 endif()
 
 if(ARROW_FLIGHT_INCLUDE_DIR AND ARROW_FLIGHT_LIBS)
@@ -117,9 +131,32 @@ else()
   set(ARROW_FLIGHT_FOUND FALSE)
 endif()
 
+if(ARROW_PYTHON_FLIGHT_LIBS)
+  set(ARROW_PYTHON_FLIGHT_FOUND TRUE)
+  set(ARROW_PYTHON_FLIGHT_LIB_NAME arrow_python_flight)
+  if(MSVC)
+    set(
+      ARROW_PYTHON_FLIGHT_STATIC_LIB
+      ${ARROW_PYTHON_FLIGHT_LIBS}/${ARROW_PYTHON_FLIGHT_LIB_NAME}${ARROW_MSVC_STATIC_LIB_SUFFIX}${CMAKE_STATIC_LIBRARY_SUFFIX}
+      )
+    set(ARROW_PYTHON_FLIGHT_SHARED_LIB
+        ${ARROW_PYTHON_FLIGHT_SHARED_LIBS}/${ARROW_PYTHON_FLIGHT_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
+    set(ARROW_PYTHON_FLIGHT_SHARED_IMP_LIB ${ARROW_PYTHON_FLIGHT_LIBS}/${ARROW_PYTHON_FLIGHT_LIB_NAME}.lib)
+  else()
+    set(ARROW_PYTHON_FLIGHT_STATIC_LIB ${ARROW_LIBS}/lib${ARROW_PYTHON_FLIGHT_LIB_NAME}.a)
+    set(ARROW_PYTHON_FLIGHT_SHARED_LIB
+        ${ARROW_LIBS}/lib${ARROW_PYTHON_FLIGHT_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
+  endif()
+
+  message(STATUS "Found the Arrow Flight Python library: ${ARROW_PYTHON_FLIGHT_LIB_PATH}")
+endif()
+
 if(MSVC)
   mark_as_advanced(ARROW_FLIGHT_INCLUDE_DIR ARROW_FLIGHT_STATIC_LIB ARROW_FLIGHT_SHARED_LIB
-   ARROW_FLIGHT_SHARED_IMP_LIB)
+ARROW_FLIGHT_SHARED_IMP_LIB
+ARROW_PYTHON_FLIGHT_STATIC_LIB ARROW_PYTHON_FLIGHT_SHARED_LIB
+   

[arrow] branch master updated: ARROW-5680: [Rust] [DataFusion] GROUP BY sql tests are now deterministic

2019-10-12 Thread agrove

agrove pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new d47a40e  ARROW-5680: [Rust] [DataFusion] GROUP BY sql tests are now deterministic
d47a40e is described below

commit d47a40e88292840b9c73bec59af1828b273445e8
Author: Andy Grove 
AuthorDate: Sat Oct 12 10:37:11 2019 -0600

ARROW-5680: [Rust] [DataFusion] GROUP BY sql tests are now deterministic

DataFusion doesn't support `ORDER BY` yet, so I modified the aggregate
tests to collect a `Vec<String>` and sort it before comparing against the
expected results, making these tests deterministic.

The PR is a little noisy because I removed the final `\n` from the expected
results in most of the tests due to the new approach of collecting a
`Vec<String>` and then performing a `.join("\n")` to create the string.
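
The sort-then-join approach described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `normalize` helper and the sample rows are hypothetical, and it assumes each result row is already rendered as a tab-separated `String`.

```rust
/// Hypothetical helper: sort the rendered rows, then join them with "\n".
/// Sorting removes any dependence on the order in which groups are produced,
/// and `join` emits no trailing newline after the last row.
fn normalize(mut rows: Vec<String>) -> String {
    rows.sort();
    rows.join("\n")
}

#[test]
fn sorted_rows_compare_equal() {
    // Made-up rows standing in for an unordered GROUP BY result.
    let rows = vec![
        "4\t0.02".to_string(),
        "1\t0.05".to_string(),
        "3\t0.04".to_string(),
    ];
    let expected = "1\t0.05\n3\t0.04\n4\t0.02".to_string();
    assert_eq!(expected, normalize(rows));
}
```

Note that `sort()` on strings is lexicographic, so a row starting with "10" would sort before one starting with "2". That is fine for a stable comparison as long as the expected string is written in the same order.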

Closes #5622 from andygrove/ARROW-5680 and squashes the following commits:

a3083c6cb  GROUP BY sql tests are now deterministic

Authored-by: Andy Grove 
Signed-off-by: Andy Grove 
---
 rust/datafusion/tests/sql.rs | 141 +++
 1 file changed, 47 insertions(+), 94 deletions(-)

diff --git a/rust/datafusion/tests/sql.rs b/rust/datafusion/tests/sql.rs
index bb60603..b9741ce 100644
--- a/rust/datafusion/tests/sql.rs
+++ b/rust/datafusion/tests/sql.rs
@@ -87,8 +87,10 @@ fn parquet_query() {
     let mut ctx = ExecutionContext::new();
     register_alltypes_parquet(&mut ctx);
     let sql = "SELECT id, string_col FROM alltypes_plain";
-    let actual = execute(&mut ctx, sql);
-    let expected = "4\t\"0\"\n5\t\"1\"\n6\t\"0\"\n7\t\"1\"\n2\t\"0\"\n3\t\"1\"\n0\t\"0\"\n1\t\"1\"\n".to_string();
+    let actual = execute(&mut ctx, sql).join("\n");
+    let expected =
+        "4\t\"0\"\n5\t\"1\"\n6\t\"0\"\n7\t\"1\"\n2\t\"0\"\n3\t\"1\"\n0\t\"0\"\n1\t\"1\""
+            .to_string();
     assert_eq!(expected, actual);
 }
 
@@ -114,8 +116,8 @@ fn csv_count_star() {
     let mut ctx = ExecutionContext::new();
     register_aggregate_csv(&mut ctx);
     let sql = "SELECT COUNT(*), COUNT(1), COUNT(c1) FROM aggregate_test_100";
-    let actual = execute(&mut ctx, sql);
-    let expected = "100\t100\t100\n".to_string();
+    let actual = execute(&mut ctx, sql).join("\n");
+    let expected = "100\t100\t100".to_string();
     assert_eq!(expected, actual);
 }
 
@@ -124,8 +126,8 @@ fn csv_query_with_predicate() {
     let mut ctx = ExecutionContext::new();
     register_aggregate_csv(&mut ctx);
     let sql = "SELECT c1, c12 FROM aggregate_test_100 WHERE c12 > 0.376 AND c12 < 0.4";
-    let actual = execute(&mut ctx, sql);
-    let expected = "\"e\"\t0.39144436569161134\n\"d\"\t0.38870280983958583\n".to_string();
+    let actual = execute(&mut ctx, sql).join("\n");
+    let expected = "\"e\"\t0.39144436569161134\n\"d\"\t0.38870280983958583".to_string();
     assert_eq!(expected, actual);
 }
 
@@ -133,40 +135,39 @@ fn csv_query_with_predicate() {
 fn csv_query_group_by_int_min_max() {
     let mut ctx = ExecutionContext::new();
     register_aggregate_csv(&mut ctx);
-    //TODO add ORDER BY once supported, to make this test determistic
     let sql = "SELECT c2, MIN(c12), MAX(c12) FROM aggregate_test_100 GROUP BY c2";
-    let actual = execute(&mut ctx, sql);
-    let expected = "4\t0.02182578039211991\t0.9237877978193884\n5\t0.0147930530301\t0.9723580396501548\n2\t0.16301110515739792\t0.991517828651004\n3\t0.047343434291126085\t0.9293883502480845\n1\t0.05636955101974106\t0.9965400387585364\n".to_string();
-    assert_eq!(expected, actual);
+    let mut actual = execute(&mut ctx, sql);
+    actual.sort();
+    let expected = "1\t0.05636955101974106\t0.9965400387585364\n2\t0.16301110515739792\t0.991517828651004\n3\t0.047343434291126085\t0.9293883502480845\n4\t0.02182578039211991\t0.9237877978193884\n5\t0.0147930530301\t0.9723580396501548".to_string();
+    assert_eq!(expected, actual.join("\n"));
 }
 
 #[test]
 fn csv_query_avg() {
     let mut ctx = ExecutionContext::new();
     register_aggregate_csv(&mut ctx);
-    //TODO add ORDER BY once supported, to make this test determistic
     let sql = "SELECT avg(c12) FROM aggregate_test_100";
-    let actual = execute(&mut ctx, sql);
-    let expected = "0.5089725099127211\n".to_string();
-    assert_eq!(expected, actual);
+    let mut actual = execute(&mut ctx, sql);
+    actual.sort();
+    let expected = "0.5089725099127211".to_string();
+    assert_eq!(expected, actual.join("\n"));
 }
 
 #[test]
 fn csv_query_group_by_avg() {
     let mut ctx = ExecutionContext::new();
     register_aggregate_csv(&mut ctx);
-    //TODO add ORDER BY once supported, to make this test determistic
     let sql = "SELECT c1, avg(c12) FROM aggregate_test_100 GROUP BY c1";
-    let actual = execute(&mut ctx, sql);
-    let expected = 

[arrow] branch master updated (1fc1015 -> b1d5d0d)

2019-10-12 Thread agrove

agrove pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.


from 1fc1015  ARROW-6859: [CI][Nightly] Disable docker layer caching for CircleCI tasks
 add b1d5d0d  ARROW-6690: [Rust] [DataFusion] Optimize aggregates without GROUP BY to use SIMD

No new revisions were added by this update.

Summary of changes:
 .../src/execution/physical_plan/common.rs  |  90 ++-
 .../src/execution/physical_plan/expressions.rs | 652 +
 .../src/execution/physical_plan/hash_aggregate.rs  |  25 +-
 rust/datafusion/src/execution/physical_plan/mod.rs |  11 +-
 4 files changed, 521 insertions(+), 257 deletions(-)



[arrow] branch master updated (8621a5c -> 1fc1015)

2019-10-12 Thread npr

npr pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.


from 8621a5c  ARROW-6464: [Java] Refactor FixedSizeListVector#splitAndTransfer with slice API (#5293)
 add 1fc1015  ARROW-6859: [CI][Nightly] Disable docker layer caching for CircleCI tasks

No new revisions were added by this update.

Summary of changes:
 dev/tasks/docker-tests/circle.linux.yml | 1 -
 1 file changed, 1 deletion(-)



[arrow] branch master updated (ad85b11 -> 8621a5c)

2019-10-12 Thread siddteotia

siddteotia pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.


from ad85b11  ARROW-6661: [Java] Implement APIs like slice to enhance VectorSchemaRoot (#5470)
 add 8621a5c  ARROW-6464: [Java] Refactor FixedSizeListVector#splitAndTransfer with slice API (#5293)

No new revisions were added by this update.

Summary of changes:
 .../src/main/codegen/templates/UnionVector.java|  1 +
 .../apache/arrow/vector/BaseFixedWidthVector.java  |  2 +-
 .../arrow/vector/complex/FixedSizeListVector.java  | 69 +-
 .../apache/arrow/vector/complex/ListVector.java|  2 +-
 .../apache/arrow/vector/complex/StructVector.java  |  3 +-
 .../arrow/vector/TestFixedSizeListVector.java  | 31 ++
 6 files changed, 102 insertions(+), 6 deletions(-)



[arrow] branch master updated (b9203a9 -> ad85b11)

2019-10-12 Thread siddteotia

siddteotia pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.


from b9203a9  ARROW-6074: [FlightRPC][Java] Middleware
 add ad85b11  ARROW-6661: [Java] Implement APIs like slice to enhance VectorSchemaRoot (#5470)

No new revisions were added by this update.

Summary of changes:
 .../org/apache/arrow/vector/VectorSchemaRoot.java  | 91 +-
 .../apache/arrow/vector/TestVectorSchemaRoot.java  | 85 
 2 files changed, 174 insertions(+), 2 deletions(-)



[arrow] branch master updated (c8bcd70 -> b9203a9)

2019-10-12 Thread emkornfield

emkornfield pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git.


from c8bcd70  ARROW-6732: [Java] Implement quick sort in a non-recursive way to avoid stack overflow
 add b9203a9  ARROW-6074: [FlightRPC][Java] Middleware

No new revisions were added by this update.

Summary of changes:
 .../org/apache/arrow/flight/FlightGrpcUtils.java   |   3 +-
 ...lightRuntimeException.java => CallHeaders.java} |  39 ++-
 .../flight/{FlightConstants.java => CallInfo.java} |  12 +-
 .../java/org/apache/arrow/flight/CallStatus.java   |   9 +
 .../java/org/apache/arrow/flight/FlightClient.java |  34 +-
 .../arrow/flight/FlightClientMiddleware.java   |  52 +++
 .../java/org/apache/arrow/flight/FlightMethod.java |  61 
 .../org/apache/arrow/flight/FlightProducer.java|  12 +
 .../arrow/flight/FlightRuntimeException.java   |   6 +
 .../java/org/apache/arrow/flight/FlightServer.java |  33 +-
 .../arrow/flight/FlightServerMiddleware.java   |  99 ++
 .../org/apache/arrow/flight/FlightService.java | 147 ++---
 .../org/apache/arrow/flight/FlightStatusCode.java  |   4 +
 .../java/org/apache/arrow/flight/FlightStream.java |   9 +
 .../java/org/apache/arrow/flight/StreamPipe.java   |  30 +-
 .../flight/grpc/ClientInterceptorAdapter.java  | 149 +
 .../grpc/ContextPropagatingExecutorService.java| 117 +++
 .../apache/arrow/flight/grpc/MetadataAdapter.java  |  72 +
 .../flight/grpc/ServerInterceptorAdapter.java  | 142 
 .../org/apache/arrow/flight/grpc/StatusUtils.java  |   4 +
 .../apache/arrow/flight/TestClientMiddleware.java  | 211 
 .../apache/arrow/flight/TestServerMiddleware.java  | 360 +
 .../org/apache/arrow/flight/perf/TestPerf.java |   7 +-
 23 files changed, 1538 insertions(+), 74 deletions(-)
 copy java/flight/src/main/java/org/apache/arrow/flight/{FlightRuntimeException.java => CallHeaders.java} (56%)
 copy java/flight/src/main/java/org/apache/arrow/flight/{FlightConstants.java => CallInfo.java} (76%)
 create mode 100644 java/flight/src/main/java/org/apache/arrow/flight/FlightClientMiddleware.java
 create mode 100644 java/flight/src/main/java/org/apache/arrow/flight/FlightMethod.java
 create mode 100644 java/flight/src/main/java/org/apache/arrow/flight/FlightServerMiddleware.java
 create mode 100644 java/flight/src/main/java/org/apache/arrow/flight/grpc/ClientInterceptorAdapter.java
 create mode 100644 java/flight/src/main/java/org/apache/arrow/flight/grpc/ContextPropagatingExecutorService.java
 create mode 100644 java/flight/src/main/java/org/apache/arrow/flight/grpc/MetadataAdapter.java
 create mode 100644 java/flight/src/main/java/org/apache/arrow/flight/grpc/ServerInterceptorAdapter.java
 create mode 100644 java/flight/src/test/java/org/apache/arrow/flight/TestClientMiddleware.java
 create mode 100644 java/flight/src/test/java/org/apache/arrow/flight/TestServerMiddleware.java