[arrow] branch master updated: ARROW-3788: [Ruby] Add support for CSV parser written in C++

2018-11-15 Thread kou
This is an automated email from the ASF dual-hosted git repository.

kou pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new f83454c  ARROW-3788: [Ruby] Add support for CSV parser written in C++
f83454c is described below

commit f83454ca64c5b90598ccd88e004275bcbae39e75
Author: Kouhei Sutou 
AuthorDate: Fri Nov 16 16:31:37 2018 +0900

ARROW-3788: [Ruby] Add support for CSV parser written in C++

This is disabled by default because the value conversion feature isn't 
sufficient yet.
We can enable it by specifying the `:use_threads` option explicitly:

```ruby
Arrow::Table.load("xxx.csv", use_threads: true)
```
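
When the C++ reader is not used, the `read_csv` fallback in the diff below transposes parsed rows into per-column value lists and names the columns from the CSV headers, or by index when there are none. A minimal Python sketch of that transposition follows; `rows_to_columns` is a hypothetical name, and plain lists stand in for Arrow arrays:

```python
def rows_to_columns(rows, headers=None):
    """Transpose row-oriented values into a {column name: values} mapping.

    Hypothetical helper for illustration; it is not part of red-arrow.
    """
    columns = []
    for row in rows:
        for i, value in enumerate(row):
            if i == len(columns):
                columns.append([])  # first value seen for this column
            columns[i].append(value)
    if not columns:
        return None  # no rows were read
    # Fall back to "0", "1", ... when the CSV has no header row.
    names = headers if headers else [str(i) for i in range(len(columns))]
    return dict(zip(names, columns))


print(rows_to_columns([[1, "a"], [2, "b"]], headers=["n", "s"]))
```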

Author: Kouhei Sutou 

Closes #2961 from kou/ruby-csv and squashes the following commits:

d59646b9  Support Ruby < 2.5
cafe5d1e  Add support for column type options
71a99c75   Add support for CSV parser written in C++
---
 ruby/red-arrow/lib/arrow/csv-loader.rb | 88 +-
 .../arrow/csv-read-options.rb} | 19 ++---
 .../lib/arrow/{csv-reader.rb => data-type.rb}  | 47 
 ruby/red-arrow/lib/arrow/loader.rb |  3 +-
 ruby/red-arrow/test/test-csv-loader.rb | 27 +++
 ruby/red-arrow/test/test-table.rb  |  5 +-
 6 files changed, 140 insertions(+), 49 deletions(-)

diff --git a/ruby/red-arrow/lib/arrow/csv-loader.rb b/ruby/red-arrow/lib/arrow/csv-loader.rb
index f3ad6ce..3aa85bf 100644
--- a/ruby/red-arrow/lib/arrow/csv-loader.rb
+++ b/ruby/red-arrow/lib/arrow/csv-loader.rb
@@ -60,11 +60,72 @@ module Arrow
 end
 
 def read_csv(csv)
-  reader = CSVReader.new(csv)
-  reader.read
+  values_set = []
+  csv.each do |row|
+if row.is_a?(CSV::Row)
+  row = row.collect(&:last)
+end
+row.each_with_index do |value, i|
+  values = (values_set[i] ||= [])
+  values << value
+end
+  end
+  return nil if values_set.empty?
+
+  arrays = values_set.collect.with_index do |values, i|
+ArrayBuilder.build(values)
+  end
+  if csv.headers
+names = csv.headers
+  else
+names = arrays.size.times.collect(&:to_s)
+  end
+  raw_table = {}
+  names.each_with_index do |name, i|
+raw_table[name] = arrays[i]
+  end
+  Table.new(raw_table)
+end
+
+def reader_options
+  options = CSVReadOptions.new
+  @options.each do |key, value|
+case key
+when :headers
+  if value
+options.n_header_rows = 1
+  else
+options.n_header_rows = 0
+  end
+when :column_types
+  value.each do |name, type|
+options.add_column_type(name, type)
+  end
+when :schema
+  options.add_schema(value)
+else
+  setter = "#{key}="
+  if options.respond_to?(setter)
+options.__send__(setter, value)
+  else
+return nil
+  end
+end
+  end
+  options
 end
 
 def load_from_path(path)
+  options = reader_options
+  if options
+begin
+  MemoryMappedInputStream.open(path.to_s) do |input|
+return CSVReader.new(input, options).read
+  end
+rescue Arrow::Error::Invalid
+end
+  end
+
   options = update_csv_parse_options(@options, :open_csv, path)
   open_csv(path, **options) do |csv|
 read_csv(csv)
@@ -72,6 +133,16 @@ module Arrow
 end
 
 def load_data(data)
+  options = reader_options
+  if options
+begin
+  BufferInputStream.open(Buffer.new(data)) do |input|
+return CSVReader.new(input, options).read
+  end
+rescue Arrow::Error::Invalid
+end
+  end
+
   options = update_csv_parse_options(@options, :parse_csv_data, data)
   parse_csv_data(data, **options) do |csv|
 read_csv(csv)
@@ -119,6 +190,11 @@ module Arrow
   end
 end
 
+AVAILABLE_CSV_PARSE_OPTIONS = {}
+CSV.instance_method(:initialize).parameters.each do |type, name|
+  AVAILABLE_CSV_PARSE_OPTIONS[name] = true if type == :key
+end
+
 def update_csv_parse_options(options, create_csv, *args)
   if options.key?(:converters)
 new_options = options.dup
@@ -127,6 +203,14 @@ module Arrow
 new_options = options.merge(converters: converters)
   end
 
+  # TODO: Support :schema and :column_types
+
+  unless AVAILABLE_CSV_PARSE_OPTIONS.empty?
+new_options.select! do |key, value|
+  AVAILABLE_CSV_PARSE_OPTIONS.key?(key)
+end
+  end
+
   unless options.key?(:headers)
 __send__(create_csv, *args, **new_options) do |csv|
   new_options[:headers] = have_header?(csv)
diff --git a/ruby/red-arrow/test/test-csv-reader.rb 

[arrow] branch master updated: ARROW-3765: [Gandiva] Segfault when the validity bitmap has not been allocated

2018-11-15 Thread pcmoritz

pcmoritz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 5874af5  ARROW-3765: [Gandiva] Segfault when the validity bitmap has 
not been allocated
5874af5 is described below

commit 5874af553f035244713b687b50e57dce81204433
Author: suquark 
AuthorDate: Thu Nov 15 22:10:27 2018 -0800

ARROW-3765: [Gandiva] Segfault when the validity bitmap has not been 
allocated

https://issues.apache.org/jira/browse/ARROW-3765
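
The fix treats a missing validity buffer as "every value is valid". The sketch below illustrates that rule in Python, with lists of booleans standing in for packed bitmaps; `combined_validity` and `matching_indices` are hypothetical helpers, and the data mirrors the new `TestNullValidityBuffer` test (condition `f0 + f1 < 10`):

```python
def combined_validity(bitmaps, num_records):
    """AND together the validity bitmaps that exist; None means all valid."""
    result = [True] * num_records
    for bitmap in bitmaps:
        if bitmap is None:  # no validity buffer allocated: nothing to apply
            continue
        result = [a and b for a, b in zip(result, bitmap)]
    return result


def matching_indices(f0, f1, validity0, validity1):
    """Indices where both inputs are valid and f0 + f1 < 10."""
    valid = combined_validity([validity0, validity1], len(f0))
    return [i for i in range(len(f0)) if valid[i] and f0[i] + f1[i] < 10]


# f0 has no validity buffer; f1 has a null at index 2.
indices = matching_indices(
    [1, 2, 3, 4, 6],
    [5, 9, 6, 17, 3],
    None,
    [True, True, False, True, True],
)
print(indices)  # [0, 4]
```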

Author: suquark 

Closes #2967 from suquark/gandiva and squashes the following commits:

6d09068d0  lint
4b3ea9d32  lint
76b7e7f1e  fix bug
efff64a4c  combine tests to reduce build time
5e4dda518  lint
b509b0573  rename test
4e2528bdb  lint
bdf08f9ff  fix bugs & add new tests
de2061330  Gandiva null validity buffer support.
---
 cpp/src/gandiva/annotator.cc | 12 ++
 cpp/src/gandiva/bitmap_accumulator.h |  4 +++-
 cpp/src/gandiva/tests/filter_test.cc | 46 
 3 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/cpp/src/gandiva/annotator.cc b/cpp/src/gandiva/annotator.cc
index 0fe9fc8..3c8585c 100644
--- a/cpp/src/gandiva/annotator.cc
+++ b/cpp/src/gandiva/annotator.cc
@@ -59,11 +59,13 @@ void Annotator::PrepareBuffersForField(const FieldDescriptor& desc,
EvalBatch* eval_batch) {
   int buffer_idx = 0;
 
-  // TODO:
-  // - validity is optional
-
-  uint8_t* validity_buf = const_cast<uint8_t*>(array_data.buffers[buffer_idx]->data());
-  eval_batch->SetBuffer(desc.validity_idx(), validity_buf);
+  // The validity buffer is optional. Use nullptr if it does not have one.
+  if (array_data.buffers[buffer_idx]) {
+    uint8_t* validity_buf = const_cast<uint8_t*>(array_data.buffers[buffer_idx]->data());
+    eval_batch->SetBuffer(desc.validity_idx(), validity_buf);
+  } else {
+    eval_batch->SetBuffer(desc.validity_idx(), nullptr);
+  }
   ++buffer_idx;
 
   if (desc.HasOffsetsIdx()) {
diff --git a/cpp/src/gandiva/bitmap_accumulator.h b/cpp/src/gandiva/bitmap_accumulator.h
index 31b6609..157405d 100644
--- a/cpp/src/gandiva/bitmap_accumulator.h
+++ b/cpp/src/gandiva/bitmap_accumulator.h
@@ -20,6 +20,7 @@
 
 #include 
 
+#include "arrow/util/macros.h"
 #include "gandiva/dex.h"
 #include "gandiva/dex_visitor.h"
 #include "gandiva/eval_batch.h"
@@ -36,7 +37,8 @@ class BitMapAccumulator : public DexDefaultVisitor {
   void Visit(const VectorReadValidityDex& dex) {
 int idx = dex.ValidityIdx();
 auto bitmap = eval_batch_.GetBuffer(idx);
-src_maps_.push_back(bitmap);
+// The bitmap could be null. Ignore it in this case.
+if (bitmap != NULLPTR) src_maps_.push_back(bitmap);
   }
 
   void Visit(const LocalBitMapValidityDex& dex) {
diff --git a/cpp/src/gandiva/tests/filter_test.cc b/cpp/src/gandiva/tests/filter_test.cc
index f95cdcc..f63899a 100644
--- a/cpp/src/gandiva/tests/filter_test.cc
+++ b/cpp/src/gandiva/tests/filter_test.cc
@@ -290,4 +290,50 @@ TEST_F(TestFilter, TestSimpleSVInt32) {
   EXPECT_ARROW_ARRAY_EQUALS(exp, selection_vector->ToArray());
 }
 
+TEST_F(TestFilter, TestNullValidityBuffer) {
+  // schema for input fields
+  auto field0 = field("f0", int32());
+  auto field1 = field("f1", int32());
+  auto schema = arrow::schema({field0, field1});
+
+  // Build condition f0 + f1 < 10
+  auto node_f0 = TreeExprBuilder::MakeField(field0);
+  auto node_f1 = TreeExprBuilder::MakeField(field1);
+  auto sum_func =
+      TreeExprBuilder::MakeFunction("add", {node_f0, node_f1}, arrow::int32());
+  auto literal_10 = TreeExprBuilder::MakeLiteral((int32_t)10);
+  auto less_than_10 = TreeExprBuilder::MakeFunction("less_than", {sum_func, literal_10},
+                                                    arrow::boolean());
+  auto condition = TreeExprBuilder::MakeCondition(less_than_10);
+
+  std::shared_ptr<Filter> filter;
+  Status status = Filter::Make(schema, condition, &filter);
+  EXPECT_TRUE(status.ok());
+
+  // Create a row-batch with some sample data
+  int num_records = 5;
+
+  auto array_ = MakeArrowArrayInt32({1, 2, 3, 4, 6}, {true, true, true, false, true});
+  // Create an array without a validity buffer.
+  auto array0 =
+      std::make_shared<arrow::Int32Array>(5, array_->data()->buffers[1], nullptr, 0);
+  auto array1 = MakeArrowArrayInt32({5, 9, 6, 17, 3}, {true, true, false, true, true});
+  // expected output (indices for which condition matches)
+  auto exp = MakeArrowArrayUint16({0, 4});
+
+  // prepare input record batch
+  auto in_batch = arrow::RecordBatch::Make(schema, num_records, {array0, array1});
+
+  std::shared_ptr<SelectionVector> selection_vector;
+  status = SelectionVector::MakeInt16(num_records, pool_, &selection_vector);
+  EXPECT_TRUE(status.ok());
+
+  // Evaluate expression
+  status = filter->Evaluate(*in_batch, selection_vector);
+  EXPECT_TRUE(status.ok());
+
+  

[arrow] branch master updated: ARROW-3784: [R] Array with type fails with x is not a vector

2018-11-15 Thread wesm

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new cc5b445  ARROW-3784: [R] Array with type fails with x is not a vector
cc5b445 is described below

commit cc5b44563002474ffa69c2fb7db7d0b3564e10b1
Author: Romain Francois 
AuthorDate: Thu Nov 15 13:31:09 2018 -0500

ARROW-3784: [R] Array with type fails with x is not a vector

Closes #2956 using a better approach. This reserves the `type` argument 
until it is actually used.

```r
> library(arrow)
> array(1:10, type = int16())
arrow::Array
[
  1,
  2,
  3,
  4,
  5,
  6,
  7,
  8,
  9,
  10
]
Warning message:
The `type` argument is currently ignored
```
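
The reserve-and-warn pattern shown above is not specific to R. As a rough illustration only, the Python sketch below accepts a `type` argument, warns that it is ignored, and returns the same result either way; `make_array` and the sentinel are hypothetical and unrelated to the arrow R package's API:

```python
import warnings

_MISSING = object()  # sentinel, so an explicit type=None is still detected


def make_array(*values, type=_MISSING):
    """Build a list from the values; `type` is reserved but not used yet."""
    if type is not _MISSING:
        warnings.warn("The `type` argument is currently ignored")
    return list(values)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    a = make_array(1, 2, 3, type="int16")  # triggers the warning
    b = make_array(1, 2, 3)                # no warning
```

Both calls return the same values, which is the property the new `test-Array.R` and `test-chunkedarray.R` tests check.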

Author: Romain Francois 

Closes #2960 from romainfrancois/ARROW-3784/array-type-arg and squashes the 
following commits:

10a763232  "document" the type argument
92d57a8e4  reserve the `type` argument, but just warn 
about it for now.
---
 r/NAMESPACE  | 2 ++
 r/R/ChunkedArray.R   | 9 +++--
 r/R/array.R  | 7 ++-
 r/man/array.Rd   | 4 +++-
 r/man/chunked_array.Rd   | 4 +++-
 r/tests/testthat/test-Array.R| 6 ++
 r/tests/testthat/test-chunkedarray.R | 6 ++
 7 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/r/NAMESPACE b/r/NAMESPACE
index 46d40e1..93df8ff 100644
--- a/r/NAMESPACE
+++ b/r/NAMESPACE
@@ -148,6 +148,8 @@ importFrom(purrr,map)
 importFrom(purrr,map2)
 importFrom(purrr,map_int)
 importFrom(rlang,dots_n)
+importFrom(rlang,list2)
+importFrom(rlang,warn)
 importFrom(tibble,as_tibble)
 importFrom(withr,defer_parent)
 useDynLib(arrow, .registration = TRUE)
diff --git a/r/R/ChunkedArray.R b/r/R/ChunkedArray.R
index 69c98ba..c681fc3 100644
--- a/r/R/ChunkedArray.R
+++ b/r/R/ChunkedArray.R
@@ -39,8 +39,13 @@
 #' create an arrow::Array from an R vector
 #'
 #' @param \dots Vectors to coerce
+#' @param type currently ignored
 #'
+#' @importFrom rlang list2
 #' @export
-chunked_array <- function(...){
-  shared_ptr(`arrow::ChunkedArray`, ChunkedArray__from_list(rlang::list2(...)))
+chunked_array <- function(..., type){
+  if (!missing(type)) {
+warn("The `type` argument is currently ignored")
+  }
+  shared_ptr(`arrow::ChunkedArray`, ChunkedArray__from_list(list2(...)))
 }
diff --git a/r/R/array.R b/r/R/array.R
index 869479b..bd949dc 100644
--- a/r/R/array.R
+++ b/r/R/array.R
@@ -63,9 +63,14 @@
 #' create an arrow::Array from an R vector
 #'
 #' @param \dots Vectors to coerce
+#' @param type currently ignored
 #'
+#' @importFrom rlang warn
 #' @export
-array <- function(...){
+array <- function(..., type){
+  if (!missing(type)) {
+warn("The `type` argument is currently ignored")
+  }
   `arrow::Array`$dispatch(Array__from_vector(vctrs::vec_c(...)))
 }
 
diff --git a/r/man/array.Rd b/r/man/array.Rd
index ed81d0c..38bd773 100644
--- a/r/man/array.Rd
+++ b/r/man/array.Rd
@@ -4,10 +4,12 @@
 \alias{array}
 \title{create an arrow::Array from an R vector}
 \usage{
-array(...)
+array(..., type)
 }
 \arguments{
 \item{\dots}{Vectors to coerce}
+
+\item{type}{currently ignored}
 }
 \description{
 create an arrow::Array from an R vector
diff --git a/r/man/chunked_array.Rd b/r/man/chunked_array.Rd
index 27b91cf..1f4fb83 100644
--- a/r/man/chunked_array.Rd
+++ b/r/man/chunked_array.Rd
@@ -4,10 +4,12 @@
 \alias{chunked_array}
 \title{create an arrow::Array from an R vector}
 \usage{
-chunked_array(...)
+chunked_array(..., type)
 }
 \arguments{
 \item{\dots}{Vectors to coerce}
+
+\item{type}{currently ignored}
 }
 \description{
 create an arrow::Array from an R vector
diff --git a/r/tests/testthat/test-Array.R b/r/tests/testthat/test-Array.R
index d06f88f..a3e5134 100644
--- a/r/tests/testthat/test-Array.R
+++ b/r/tests/testthat/test-Array.R
@@ -280,3 +280,9 @@ test_that("support for NaN (ARROW-3615)", {
   expect_true(y$IsValid(2))
   expect_equal(y$null_count(), 1L)
 })
+
+test_that("array ignores the type argument (ARROW-3784)", {
+  a <- expect_warning(array(1:10, type = int16()))
+  b <- array(1:10)
+  expect_equal(a, b)
+})
diff --git a/r/tests/testthat/test-chunkedarray.R b/r/tests/testthat/test-chunkedarray.R
index 088367a..fb45c99 100644
--- a/r/tests/testthat/test-chunkedarray.R
+++ b/r/tests/testthat/test-chunkedarray.R
@@ -159,3 +159,9 @@ test_that("ChunkedArray supports difftime", {
   expect_equal(a$length(), 2L)
   expect_equal(a$as_vector(), c(time, time))
 })
+
+test_that("chunked_array ignores the type argument (ARROW-3784)", {
+  a <- expect_warning(chunked_array(1:10, type = int16()))
+  b <- chunked_array(1:10)
+  expect_equal(a, b)
+})



[arrow] branch master updated: ARROW-3800: [C++] Vendor a string_view backport

2018-11-15 Thread wesm

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 948e0fb  ARROW-3800: [C++] Vendor a string_view backport
948e0fb is described below

commit 948e0fbf9ba94b63d98159ff599585af2841a069
Author: Antoine Pitrou 
AuthorDate: Thu Nov 15 13:25:12 2018 -0500

ARROW-3800: [C++] Vendor a string_view backport

Vendor the `std::string_view` backport from 
https://github.com/martinmoene/string-view-lite

Author: Antoine Pitrou 

Closes #2974 from pitrou/ARROW-3800-string-view-backport and squashes the 
following commits:

4353414b6  ARROW-3800:  Vendor a string_view backport
---
 LICENSE.txt|   28 +
 cpp/CMakeLists.txt |1 +
 cpp/build-support/clang_format_exclusions.txt  |1 +
 cpp/build-support/lint_cpp_cli.py  |3 +-
 cpp/src/arrow/array-test.cc|   10 +-
 cpp/src/arrow/array.h  |   39 +-
 cpp/src/arrow/builder.cc   |   73 +-
 cpp/src/arrow/builder.h|   47 +-
 cpp/src/arrow/compute/kernels/cast.cc  |   20 +-
 cpp/src/arrow/pretty_print.cc  |   16 +-
 cpp/src/arrow/python/CMakeLists.txt|2 +-
 cpp/src/arrow/python/arrow_to_pandas.cc|   62 +-
 cpp/src/arrow/python/deserialize.cc|   10 +-
 cpp/src/arrow/util/CMakeLists.txt  |2 +
 cpp/src/arrow/util/string.h|   13 +-
 cpp/src/arrow/util/string_view.h   |   31 +
 cpp/src/arrow/util/string_view/CMakeLists.txt  |   20 +
 cpp/src/arrow/util/string_view/string_view.hpp | 1292 
 dev/release/rat_exclude_files.txt  |1 +
 19 files changed, 1505 insertions(+), 166 deletions(-)

diff --git a/LICENSE.txt b/LICENSE.txt
index 85a9bbd..2651a13 100644
--- a/LICENSE.txt
+++ b/LICENSE.txt
@@ -769,3 +769,31 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
+
+
+
+The file cpp/src/util/string_view/string_view.hpp has the following license
+
+Boost Software License - Version 1.0 - August 17th, 2003
+
+Permission is hereby granted, free of charge, to any person or organization
+obtaining a copy of the software and accompanying documentation covered by
+this license (the "Software") to use, reproduce, display, distribute,
+execute, and transmit the Software, and to prepare derivative works of the
+Software, and to permit third-parties to whom the Software is furnished to
+do so, all subject to the following:
+
+The copyright notices in the Software and this entire statement, including
+the above license grant, this restriction and the following disclaimer,
+must be included in all copies of the Software, in whole or in part, and
+all derivative works of the Software, unless such copies or derivative
+works are solely in the form of machine-executable object code generated by
+a source language processor.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
+SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
+FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
+ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+DEALINGS IN THE SOFTWARE.
diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt
index 72edd2f..a32ac0f 100644
--- a/cpp/CMakeLists.txt
+++ b/cpp/CMakeLists.txt
@@ -348,6 +348,7 @@ if (UNIX)
 (item MATCHES "xxhash.h") OR
 (item MATCHES "xxhash.cc") OR
 (item MATCHES "config.h") OR
+(item MATCHES "util/string_view/") OR
 (item MATCHES "util/variant") OR
 (item MATCHES "zmalloc.h") OR
 (item MATCHES "gandiva/precompiled/date.h") OR
diff --git a/cpp/build-support/clang_format_exclusions.txt b/cpp/build-support/clang_format_exclusions.txt
index 03caa00..1aeecfa 100644
--- a/cpp/build-support/clang_format_exclusions.txt
+++ b/cpp/build-support/clang_format_exclusions.txt
@@ -4,6 +4,7 @@
 *pyarrow_lib.h
 *python/config.h
 *python/platform.h
+*util/string_view/*
 *util/variant.h
 *util/variant/*
 *thirdparty/ae/*
diff --git a/cpp/build-support/lint_cpp_cli.py b/cpp/build-support/lint_cpp_cli.py
index 00a453a..993ea2f 100644
--- a/cpp/build-support/lint_cpp_cli.py
+++ b/cpp/build-support/lint_cpp_cli.py
@@ -69,9 +69,10 @@ def lint_file(path):
 
 
 EXCLUSIONS 

[arrow] branch master updated: [Gandiva] Add link to Gandiva codebase in top level README

2018-11-15 Thread wesm

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 3e84f99  [Gandiva] Add link to Gandiva codebase in top level README
3e84f99 is described below

commit 3e84f9970b3841867d7094fdd23387b6b0434142
Author: Wes McKinney 
AuthorDate: Thu Nov 15 12:35:13 2018 -0500

[Gandiva] Add link to Gandiva codebase in top level README

Change-Id: I80e2993438518429f47daf1cc42097169e4dada5
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 24d75f9..de60608 100644
--- a/README.md
+++ b/README.md
@@ -47,6 +47,7 @@ Major components of the project include:
  - [C++ libraries](https://github.com/apache/arrow/tree/master/cpp)
  - [C bindings using GLib](https://github.com/apache/arrow/tree/master/c_glib)
  - [C# .NET libraries](https://github.com/apache/arrow/tree/master/csharp)
+ - [Gandiva](https://github.com/apache/arrow/tree/master/cpp/src/gandiva): an [LLVM](https://llvm.org)-based Arrow expression compiler, part of the C++ codebase
  - [Go libraries](https://github.com/apache/arrow/tree/master/go)
  - [Java libraries](https://github.com/apache/arrow/tree/master/java)
  - [JavaScript libraries](https://github.com/apache/arrow/tree/master/js)



[arrow] branch master updated: ARROW-3186: [GLib][CI] Use the latest Meson again

2018-11-15 Thread wesm

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new d873261  ARROW-3186: [GLib][CI] Use the latest Meson again
d873261 is described below

commit d873261d636aaf2a1a8c78ec2c1841d8f1ad9c73
Author: Kouhei Sutou 
AuthorDate: Thu Nov 15 12:27:41 2018 -0500

ARROW-3186: [GLib][CI] Use the latest Meson again

Author: Kouhei Sutou 

Closes #2972 from kou/glib-use-the-latest-meson and squashes the following 
commits:

da7b87944   Use the latest Meson again
---
 ci/travis_before_script_c_glib.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/ci/travis_before_script_c_glib.sh b/ci/travis_before_script_c_glib.sh
index d9aec19..7cd1c2a 100755
--- a/ci/travis_before_script_c_glib.sh
+++ b/ci/travis_before_script_c_glib.sh
@@ -26,8 +26,7 @@ source $TRAVIS_BUILD_DIR/ci/travis_install_conda.sh
 conda create -n meson -y -q python=3.6
 conda activate meson
 
-# ARROW-3186: meson 0.47.2 issues
-pip install meson==0.47.1
+pip install meson
 
 if [ $TRAVIS_OS_NAME = "osx" ]; then
   export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/opt/libffi/lib/pkgconfig



[arrow] branch master updated: ARROW-3797: [Rust] BinaryArray::value_offset incorrect in offset case

2018-11-15 Thread wesm

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 1a00fe5  ARROW-3797: [Rust] BinaryArray::value_offset incorrect in 
offset case
1a00fe5 is described below

commit 1a00fe5385efbd63ae77f71c63dc101e6cbd26d2
Author: Brent Kerby 
AuthorDate: Thu Nov 15 09:40:17 2018 -0500

ARROW-3797: [Rust] BinaryArray::value_offset incorrect in offset case

Fixes a bug in `BinaryArray::value_offset`, which ignored the offset of the 
underlying ArrayData. Also adds test cases covering this method and 
`BinaryArray::value_length` when the underlying ArrayData has a nonzero offset.
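
To see why the array offset matters, the Python sketch below models a sliced binary array as a values buffer, an offsets buffer, and a logical start offset. `BinarySlice` is a hypothetical stand-in rather than the Rust API, and the sample buffers are assumed so that they reproduce the values asserted in the new test cases:

```python
class BinarySlice:
    """Hypothetical model of a binary array viewed through a logical offset."""

    def __init__(self, values, offsets, offset, length):
        self.values = values    # concatenated bytes of all elements
        self.offsets = offsets  # one more entry than the element count
        self.offset = offset    # logical start of this slice
        self.length = length    # number of elements in this slice

    def value_offset(self, i):
        # The fix: index the offsets buffer relative to the slice offset.
        return self.offsets[self.offset + i]

    def value_length(self, i):
        pos = self.offset + i
        return self.offsets[pos + 1] - self.offsets[pos]

    def value(self, i):
        start = self.value_offset(i)
        return self.values[start:start + self.value_length(i)]


# Three stored elements ("hello", "", "parquet"), sliced to skip the first.
sliced = BinarySlice(b"helloparquet", [0, 5, 5, 12], offset=1, length=2)
print(sliced.value_offset(0), sliced.value_length(0))  # 5 0
print(sliced.value_offset(1), sliced.value_length(1))  # 5 7
```

Without adding `self.offset`, `value_offset(0)` would return 0, the start of the skipped first element.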

Author: Brent Kerby 

Closes #2971 from blkerby/BinaryArray_offset_fix and squashes the following 
commits:

fea0730cf  Fix argument order in assert_eq in new test cases
8193a265d  Add test for BinaryArray::value_offset and 
value_length for offset case
31cc4527e  Fix bug in BinaryArray::value_offset
---
 rust/src/array.rs | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/rust/src/array.rs b/rust/src/array.rs
index 0144c64..9157897 100644
--- a/rust/src/array.rs
+++ b/rust/src/array.rs
@@ -526,7 +526,7 @@ impl BinaryArray {
 /// Note this doesn't do any bound checking, for performance reason.
 #[inline]
 pub fn value_offset(&self, i: i64) -> i32 {
-self.value_offset_at(i)
+self.value_offset_at(self.data.offset() + i)
 }
 
 /// Returns the length for the element at index `i`.
@@ -981,6 +981,10 @@ mod tests {
 binary_array.get_value(1)
 );
 assert_eq!("parquet", binary_array.get_string(1));
+assert_eq!(5, binary_array.value_offset(0));
+assert_eq!(0, binary_array.value_length(0));
+assert_eq!(5, binary_array.value_offset(1));
+assert_eq!(7, binary_array.value_length(1));
 }
 
 #[test]



[arrow] branch master updated: ARROW-3703: [Python] DataFrame.to_parquet crashes if datetime column has time zones

2018-11-15 Thread wesm

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 6e46bdc  ARROW-3703: [Python] DataFrame.to_parquet crashes if datetime 
column has time zones
6e46bdc is described below

commit 6e46bdc9a354ebb15644e99a80f6cc07bb440b21
Author: Krisztián Szűcs 
AuthorDate: Thu Nov 15 08:47:11 2018 -0500

ARROW-3703: [Python] DataFrame.to_parquet crashes if datetime column has 
time zones
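
The fixed-offset branch of the reworked `tzinfo_to_string` formats a UTC offset as `+HH:MM`. The sketch below lifts that helper from the diff into standalone Python and exercises it with `datetime.timezone`; it covers only fixed offsets, not named zones such as `US/Eastern`:

```python
from datetime import timedelta, timezone


def fixed_offset_to_string(offset):
    """Format a fixed-offset tzinfo as '+HH:MM' or '-HH:MM'."""
    seconds = int(offset.utcoffset(None).total_seconds())
    sign = '+' if seconds >= 0 else '-'
    minutes, seconds = divmod(abs(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    if seconds > 0:
        raise ValueError('Offset must represent whole number of minutes')
    return '{}{:02d}:{:02d}'.format(sign, hours, minutes)


print(fixed_offset_to_string(timezone(timedelta(hours=1))))
print(fixed_offset_to_string(timezone(timedelta(hours=-5, minutes=-30))))
```

Working from `utcoffset()` rather than private attributes such as `tz._minutes` is what lets the same code handle both `pytz` fixed offsets and standard-library `timezone` objects.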

Author: Krisztián Szűcs 

Closes #2975 from kszucs/ARROW-3703 and squashes the following commits:

dba35f267  more robust timezone to string conversion
---
 python/pyarrow/tests/test_convert_pandas.py | 28 
 python/pyarrow/tests/test_parquet.py| 11 +++
 python/pyarrow/types.pxi| 26 ++
 3 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/python/pyarrow/tests/test_convert_pandas.py b/python/pyarrow/tests/test_convert_pandas.py
index 0a0a524..7f672ea 100644
--- a/python/pyarrow/tests/test_convert_pandas.py
+++ b/python/pyarrow/tests/test_convert_pandas.py
@@ -15,6 +15,8 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+
+import six
 import decimal
 import json
 import multiprocessing as mp
@@ -26,6 +28,7 @@ import numpy.testing as npt
 import pandas as pd
 import pandas.util.testing as tm
 import pytest
+import pytz
 
 import pyarrow as pa
 import pyarrow.types as patypes
@@ -823,6 +826,31 @@ class TestConvertDateTimeLikeTypes(object):
 })
 tm.assert_frame_equal(expected_df, result)
 
+def test_python_datetime_with_pytz_tzinfo(self):
+for tz in [pytz.utc, pytz.timezone('US/Eastern'), pytz.FixedOffset(1)]:
+values = [datetime(2018, 1, 1, 12, 23, 45, tzinfo=tz)]
+df = pd.DataFrame({'datetime': values})
+_check_pandas_roundtrip(df)
+
+@pytest.mark.skipif(six.PY2, reason='datetime.timezone is available since '
+'python version 3.2')
+def test_python_datetime_with_timezone_tzinfo(self):
+from datetime import timezone
+
+values = [datetime(2018, 1, 1, 12, 23, 45, tzinfo=pytz.utc)]
+df = pd.DataFrame({'datetime': values})
+_check_pandas_roundtrip(df)
+
+# datetime.timezone is going to be pytz.FixedOffset
+hours = 1
+tz_timezone = timezone(timedelta(hours=hours))
+tz_pytz = pytz.FixedOffset(hours * 60)
+values = [datetime(2018, 1, 1, 12, 23, 45, tzinfo=tz_timezone)]
+values_exp = [datetime(2018, 1, 1, 12, 23, 45, tzinfo=tz_pytz)]
+df = pd.DataFrame({'datetime': values})
+df_exp = pd.DataFrame({'datetime': values_exp})
+_check_pandas_roundtrip(df, expected=df_exp)
+
 def test_python_datetime_subclass(self):
 
 class MyDatetime(datetime):
diff --git a/python/pyarrow/tests/test_parquet.py b/python/pyarrow/tests/test_parquet.py
index bacffdf..8217dd3 100644
--- a/python/pyarrow/tests/test_parquet.py
+++ b/python/pyarrow/tests/test_parquet.py
@@ -20,6 +20,7 @@ import decimal
 import io
 import json
 import os
+import six
 import pytest
 
 import numpy as np
@@ -244,6 +245,16 @@ def test_pandas_parquet_datetime_tz():
 tm.assert_frame_equal(df, df_read)
 
 
+@pytest.mark.skipif(six.PY2, reason='datetime.timezone is available since '
+'python version 3.2')
+def test_datetime_timezone_tzinfo():
+value = datetime.datetime(2018, 1, 1, 1, 23, 45,
+  tzinfo=datetime.timezone.utc)
+df = pd.DataFrame({'foo': [value]})
+
+_roundtrip_pandas_dataframe(df, write_kwargs={})
+
+
 def test_pandas_parquet_custom_metadata(tempdir):
 df = alltypes_sample(size=1)
 
diff --git a/python/pyarrow/types.pxi b/python/pyarrow/types.pxi
index fb7d081..399f15e 100644
--- a/python/pyarrow/types.pxi
+++ b/python/pyarrow/types.pxi
@@ -962,12 +962,30 @@ def tzinfo_to_string(tz):
   name : string
 Time zone name
 """
-if tz.zone is None:
-sign = '+' if tz._minutes >= 0 else '-'
-hours, minutes = divmod(abs(tz._minutes), 60)
+import pytz
+import datetime
+
+def fixed_offset_to_string(offset):
+seconds = int(offset.utcoffset(None).total_seconds())
+sign = '+' if seconds >= 0 else '-'
+minutes, seconds = divmod(abs(seconds), 60)
+hours, minutes = divmod(minutes, 60)
+if seconds > 0:
+raise ValueError('Offset must represent whole number of minutes')
 return '{}{:02d}:{:02d}'.format(sign, hours, minutes)
-else:
+
+if isinstance(tz, pytz.tzinfo.BaseTzInfo):
 return tz.zone
+elif isinstance(tz, pytz._FixedOffset):
+return 

[arrow] branch master updated: ARROW-3754: [C++] Enable Zstandard by default only when CMake is 3.7 or later

2018-11-15 Thread wesm

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new d5cfabf  ARROW-3754: [C++] Enable Zstandard by default only when CMake 
is 3.7 or later
d5cfabf is described below

commit d5cfabf6bfa064e9e12ad892f510c0ca2556632e
Author: Kouhei Sutou 
AuthorDate: Thu Nov 15 08:34:26 2018 -0500

ARROW-3754: [C++] Enable Zstandard by default only when CMake is 3.7 or 
later

ExternalProject_Add(SOURCE_SUBDIR) is available since CMake 3.7.

Author: Kouhei Sutou 

Closes #2970 from kou/cpp-zstd and squashes the following commits:

b3d9646ab   Enable Zstandard by default only when CMake is 
3.7 or later
---
 cpp/CMakeLists.txt  | 8 +++-
 cpp/cmake_modules/ThirdpartyToolchain.cmake | 1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt
index 997421c..72edd2f 100644
--- a/cpp/CMakeLists.txt
+++ b/cpp/CMakeLists.txt
@@ -248,9 +248,15 @@ Pass multiple labels by dividing with semicolons")
 "Build with zlib compression"
 ON)
 
+  if(CMAKE_VERSION VERSION_LESS 3.7)
+set(ARROW_WITH_ZSTD_DEFAULT OFF)
+  else()
+# ExternalProject_Add(SOURCE_SUBDIR) is available since CMake 3.7.
+set(ARROW_WITH_ZSTD_DEFAULT ON)
+  endif()
   option(ARROW_WITH_ZSTD
 "Build with zstd compression"
-ON)
+${ARROW_WITH_ZSTD_DEFAULT})
 
   option(ARROW_GENERATE_COVERAGE
 "Build with C++ code coverage enabled"
diff --git a/cpp/cmake_modules/ThirdpartyToolchain.cmake b/cpp/cmake_modules/ThirdpartyToolchain.cmake
index 76a65b7..224ea1c 100644
--- a/cpp/cmake_modules/ThirdpartyToolchain.cmake
+++ b/cpp/cmake_modules/ThirdpartyToolchain.cmake
@@ -1069,6 +1069,7 @@ if (ARROW_WITH_ZSTD)
 set(ZSTD_CMAKE_ARGS
 "-DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}"
 "-DCMAKE_INSTALL_PREFIX=${ZSTD_PREFIX}"
+"-DCMAKE_INSTALL_LIBDIR=${CMAKE_INSTALL_LIBDIR}"
 "-DZSTD_BUILD_PROGRAMS=off"
 "-DZSTD_BUILD_SHARED=off"
 "-DZSTD_BUILD_STATIC=on"



[arrow] branch master updated: ARROW-3672 & ARROW-3673: [Go] add support for time32 and time64 array

2018-11-15 Thread wesm

wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new c604adb  ARROW-3672 & ARROW-3673: [Go] add support for time32 and 
time64 array
c604adb is described below

commit c604adbb3ba9eefa9f24093a9b652b9bc66fcd72
Author: alexandreyc 
AuthorDate: Thu Nov 15 08:27:21 2018 -0500

ARROW-3672 & ARROW-3673: [Go] add support for time32 and time64 array

Hello everyone,

This is my attempt at adding support for time32 and time64 arrays. It needs 
review because I'm not sure I added everything that is needed.

Thanks,

Alexandre

Author: alexandreyc 

Closes #2944 from alexandreyc/master and squashes the following commits:

e686b19b5  update mu to u for microseconds
dd2a5d0d1  add tests for time 64 array and fix bug
ea68cd50f  add tests for time 32 array
83eb1310f  add tests for time32 and time64 builders
e79db3348  add time32 and time64 array
3d6ddcf64  Fix missing import in numeric builder template
---
 go/arrow/array/array.go   |   4 +-
 go/arrow/array/numeric.gen.go |  90 ++
 go/arrow/array/numeric_test.go| 260 
 go/arrow/array/numericbuilder.gen.go  | 272 ++
 go/arrow/array/numericbuilder.gen.go.tmpl |   1 +
 go/arrow/array/numericbuilder_test.go | 229 +
 go/arrow/datatype_fixedwidth.go   |  34 +++-
 go/arrow/datatype_fixedwidth_test.go  |   2 +-
 go/arrow/numeric.tmpldata |  24 +++
 go/arrow/type_traits_numeric.gen.go   |  98 +++
 10 files changed, 1008 insertions(+), 6 deletions(-)

diff --git a/go/arrow/array/array.go b/go/arrow/array/array.go
index d1dd31d..a225693 100644
--- a/go/arrow/array/array.go
+++ b/go/arrow/array/array.go
@@ -183,8 +183,8 @@ func init() {
arrow.DATE32:unsupportedArrayType,
arrow.DATE64:unsupportedArrayType,
arrow.TIMESTAMP: func(data *Data) Interface { return NewTimestampData(data) },
-   arrow.TIME32:unsupportedArrayType,
-   arrow.TIME64:unsupportedArrayType,
+   arrow.TIME32:func(data *Data) Interface { return NewTime32Data(data) },
+   arrow.TIME64:func(data *Data) Interface { return NewTime64Data(data) },
arrow.INTERVAL:  unsupportedArrayType,
arrow.DECIMAL:   unsupportedArrayType,
arrow.LIST:  func(data *Data) Interface { return NewListData(data) },
diff --git a/go/arrow/array/numeric.gen.go b/go/arrow/array/numeric.gen.go
index 6f633ea..1f734c0 100644
--- a/go/arrow/array/numeric.gen.go
+++ b/go/arrow/array/numeric.gen.go
@@ -519,3 +519,93 @@ func (a *Timestamp) setData(data *Data) {
a.values = a.values[beg:end]
}
 }
+
+// A type which represents an immutable sequence of arrow.Time32 values.
+type Time32 struct {
+   array
+   values []arrow.Time32
+}
+
+func NewTime32Data(data *Data) *Time32 {
+   a := &Time32{}
+   a.refCount = 1
+   a.setData(data)
+   return a
+}
+
+func (a *Time32) Value(i int) arrow.Time32 { return a.values[i] }
+func (a *Time32) Time32Values() []arrow.Time32 { return a.values }
+
+func (a *Time32) String() string {
+   o := new(strings.Builder)
+   o.WriteString("[")
+   for i, v := range a.values {
+   if i > 0 {
+   fmt.Fprintf(o, " ")
+   }
+   switch {
+   case a.IsNull(i):
+   o.WriteString("(null)")
+   default:
+   fmt.Fprintf(o, "%v", v)
+   }
+   }
+   o.WriteString("]")
+   return o.String()
+}
+
+func (a *Time32) setData(data *Data) {
+   a.array.setData(data)
+   vals := data.buffers[1]
+   if vals != nil {
+   a.values = arrow.Time32Traits.CastFromBytes(vals.Bytes())
+   beg := a.array.data.offset
+   end := beg + a.array.data.length
+   a.values = a.values[beg:end]
+   }
+}
+
+// A type which represents an immutable sequence of arrow.Time64 values.
+type Time64 struct {
+   array
+   values []arrow.Time64
+}
+
+func NewTime64Data(data *Data) *Time64 {
+   a := &Time64{}
+   a.refCount = 1
+   a.setData(data)
+   return a
+}
+
+func (a *Time64) Value(i int) arrow.Time64 { return a.values[i] }
+func (a *Time64) Time64Values() []arrow.Time64 { return a.values }
+
+func (a *Time64) String() string {
+   o := new(strings.Builder)
+   o.WriteString("[")
+   for i, v := range a.values {
+   if i > 0 {
+   fmt.Fprintf(o, " ")
+   }
+  

[arrow] branch master updated: ARROW-3798: [GLib] Add support for column type CSV read option

2018-11-15 Thread shiro
This is an automated email from the ASF dual-hosted git repository.

shiro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 281eb22  ARROW-3798: [GLib] Add support for column type CSV read option
281eb22 is described below

commit 281eb22f8cb7f17afcdabf3795177c25063f4888
Author: Kouhei Sutou 
AuthorDate: Thu Nov 15 21:21:15 2018 +0900

ARROW-3798: [GLib] Add support for column type CSV read option

Author: Kouhei Sutou 

Closes #2973 from kou/glib-csv-type and squashes the following commits:

3cb0d078   Add column type CSV read option
---
 c_glib/arrow-glib/reader.cpp   | 68 ++
 c_glib/arrow-glib/reader.h |  9 ++
 c_glib/test/test-csv-reader.rb | 64 ++-
 3 files changed, 127 insertions(+), 14 deletions(-)

diff --git a/c_glib/arrow-glib/reader.cpp b/c_glib/arrow-glib/reader.cpp
index 5253a45..b4b5c08 100644
--- a/c_glib/arrow-glib/reader.cpp
+++ b/c_glib/arrow-glib/reader.cpp
@@ -22,6 +22,7 @@
 #endif
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1276,6 +1277,73 @@ garrow_csv_read_options_new(void)
   return GARROW_CSV_READ_OPTIONS(csv_read_options);
 }
 
+/**
+ * garrow_csv_read_options_add_column_type:
+ * @options: A #GArrowCSVReadOptions.
+ * @name: The name of the target column.
+ * @data_type: The #GArrowDataType for the column.
+ *
+ * Add value type of a column.
+ *
+ * Since: 0.12.0
+ */
+void
+garrow_csv_read_options_add_column_type(GArrowCSVReadOptions *options,
+const gchar *name,
+GArrowDataType *data_type)
+{
+  auto priv = GARROW_CSV_READ_OPTIONS_GET_PRIVATE(options);
+  auto arrow_data_type = garrow_data_type_get_raw(data_type);
+  priv->convert_options.column_types[name] = arrow_data_type;
+}
+
+/**
+ * garrow_csv_read_options_add_schema:
+ * @options: A #GArrowCSVReadOptions.
+ * @schema: The #GArrowSchema that specifies columns and their types.
+ *
+ * Add value types for columns in the schema.
+ *
+ * Since: 0.12.0
+ */
+void
+garrow_csv_read_options_add_schema(GArrowCSVReadOptions *options,
+   GArrowSchema *schema)
+{
+  auto priv = GARROW_CSV_READ_OPTIONS_GET_PRIVATE(options);
+  auto arrow_schema = garrow_schema_get_raw(schema);
+  for (const auto field : arrow_schema->fields()) {
+priv->convert_options.column_types[field->name()] = field->type();
+  }
+}
+
+/**
+ * garrow_csv_read_options_get_column_types:
+ * @options: A #GArrowCSVReadOptions.
+ *
+ * Returns: (transfer full) (element-type gchar* GArrowDataType):
+ *   The column name and value type mapping of the options.
+ *
+ * Since: 0.12.0
+ */
+GHashTable *
+garrow_csv_read_options_get_column_types(GArrowCSVReadOptions *options)
+{
+  auto priv = GARROW_CSV_READ_OPTIONS_GET_PRIVATE(options);
+  GHashTable *types = g_hash_table_new_full(g_str_hash,
+g_str_equal,
+g_free,
+g_object_unref);
+  for (const auto iter : priv->convert_options.column_types) {
+auto arrow_name = iter.first;
+auto arrow_data_type = iter.second;
+g_hash_table_insert(types,
+g_strdup(arrow_name.c_str()),
+garrow_data_type_new_raw(&arrow_data_type));
+  }
+  return types;
+}
+
 
 typedef struct GArrowCSVReaderPrivate_ {
   std::shared_ptr reader;
diff --git a/c_glib/arrow-glib/reader.h b/c_glib/arrow-glib/reader.h
index d1a3947..de33a79 100644
--- a/c_glib/arrow-glib/reader.h
+++ b/c_glib/arrow-glib/reader.h
@@ -255,6 +255,15 @@ struct _GArrowCSVReadOptionsClass
 };
 
 GArrowCSVReadOptions *garrow_csv_read_options_new(void);
+void
+garrow_csv_read_options_add_column_type(GArrowCSVReadOptions *options,
+const gchar *name,
+GArrowDataType *data_type);
+void
+garrow_csv_read_options_add_schema(GArrowCSVReadOptions *options,
+   GArrowSchema *schema);
+GHashTable *
+garrow_csv_read_options_get_column_types(GArrowCSVReadOptions *options);
 
 #define GARROW_TYPE_CSV_READER (garrow_csv_reader_get_type())
 G_DECLARE_DERIVABLE_TYPE(GArrowCSVReader,
diff --git a/c_glib/test/test-csv-reader.rb b/c_glib/test/test-csv-reader.rb
index 12897a8..3cae103 100644
--- a/c_glib/test/test-csv-reader.rb
+++ b/c_glib/test/test-csv-reader.rb
@@ -40,20 +40,56 @@ message,count
table.read)
 end
 
-def test_options
-  options = Arrow::CSVReadOptions.new
-  options.quoted = false
-  table = Arrow::CSVReader.new(open_input(<<-CSV), options)
-message,count
-"Start",2
-"Shutdown",9
-  CSV
-  columns = {
-"message" => build_string_array(["\"Start\"", "\"Shutdown\""]),
-

[arrow] branch master updated: ARROW-3796: [Rust] Add Example for PrimitiveArrayBuilder

2018-11-15 Thread kszucs
This is an automated email from the ASF dual-hosted git repository.

kszucs pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 1c8f8fc  ARROW-3796: [Rust] Add Example for PrimitiveArrayBuilder
1c8f8fc is described below

commit 1c8f8fc8e3e4f2267f1e88707d0fc370ea94cf72
Author: Paddy Horan 
AuthorDate: Thu Nov 15 12:46:52 2018 +0100

ARROW-3796: [Rust] Add Example for PrimitiveArrayBuilder

I will follow up with examples of `ListArrayBuilder` and `BinaryBuilder`
once this is merged. The info in the readme keeps going out of date, so it's
probably better to build up the examples (which are tested by CI) and redirect
new users there.

Author: Paddy Horan 

Closes #2969 from paddyhoran/ARROW-3796 and squashes the following commits:

46699f97  Fixed lint and comment
26cec305  Updated CI to run new example
72d6b3ee  Updated readme.
2a0fe8ae  Added example
---
 ci/rust-build-main.bat|  1 +
 ci/travis_script_rust.sh  |  1 +
 rust/README.md| 21 ++---
 rust/examples/builders.rs | 43 +++
 4 files changed, 51 insertions(+), 15 deletions(-)

diff --git a/ci/rust-build-main.bat b/ci/rust-build-main.bat
index 7b50d9f..ea040e0 100644
--- a/ci/rust-build-main.bat
+++ b/ci/rust-build-main.bat
@@ -53,6 +53,7 @@ cargo test --target %TARGET% --release
 @echo
 @echo Run example (release)
 @echo -
+cargo run --example builders --target %TARGET% --release
 cargo run --example dynamic_types --target %TARGET% --release
 
 popd
diff --git a/ci/travis_script_rust.sh b/ci/travis_script_rust.sh
index 1cd179f..d0889e1 100755
--- a/ci/travis_script_rust.sh
+++ b/ci/travis_script_rust.sh
@@ -36,6 +36,7 @@ cargo rustc -- -D warnings
 cargo build
 cargo test
 cargo bench
+cargo run --example builders
 cargo run --example dynamic_types
 
 popd
diff --git a/rust/README.md b/rust/README.md
index 5f545aa..131c7d9 100644
--- a/rust/README.md
+++ b/rust/README.md
@@ -37,25 +37,16 @@ let array = PrimitiveArray::from(vec![1, 2, 3, 4, 5]);
 println!("array contents: {:?}", array.iter().collect::>());
 ```
 
-## Creating an Array from a Builder
-
-```rust
-let mut builder: Builder = Builder::new();
-for i in 0..10 {
-builder.push(i);
-}
-let buffer = builder.finish();
-let array = PrimitiveArray::from(buffer);
-
-println!("array contents: {:?}", array.iter().collect::>());
-```
-
 ## Run Examples
 
+The examples folder shows how to construct some different types of Arrow
+arrays, including dynamic arrays created at runtime.
+
 Examples can be run using the `cargo run --example` command. For example:
 
 ```bash
-cargo run --example array_from_builder
+cargo run --example builders
+cargo run --example dynamic_types
 ```
 
 ## Run Tests
@@ -74,7 +65,7 @@ instructions](https://doc.rust-lang.org/cargo/reference/publishing.html) to
 create an account and login to crates.io before asking to be added as an owner
 of the [arrow crate](https://crates.io/crates/arrow).
 
-Checkout the tag for the version to be releases. For example:
+Checkout the tag for the version to be released. For example:
 
 ```bash
 git checkout apache-arrow-0.11.0
diff --git a/rust/examples/builders.rs b/rust/examples/builders.rs
new file mode 100644
index 000..d88370b
--- /dev/null
+++ b/rust/examples/builders.rs
@@ -0,0 +1,43 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+///! Many builders are available to easily create different types of arrow arrays
+extern crate arrow;
+
+use arrow::builder::*;
+
+fn main() {
+// Primitive Arrays
+//
+// Primitive arrays are arrays of fixed-width primitive types (bool, u8, u16, u32, u64, i8, i16,
+// i32, i64, f32, f64)
+
+// Create a new builder with a capacity of 100
+let mut primitive_array_builder = PrimitiveArrayBuildernew(100);
+
+// Push an individual primitive value
+primitive_array_builder.push(55).unwrap();
+
+// Push a null value
+primitive_array_builder.push_null().unwrap();
+
+// Push a slice of primitive values
+primitive_array_builder.push_slice(&[39, 89, 

[arrow] branch master updated: ARROW-912: [Python] Recommend that Python developers use -DCMAKE_INSTALL_LIBDIR=lib when building Arrow C++ libraries

2018-11-15 Thread kszucs
This is an automated email from the ASF dual-hosted git repository.

kszucs pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
 new 0d092e0  ARROW-912: [Python] Recommend that Python developers use -DCMAKE_INSTALL_LIBDIR=lib when building Arrow C++ libraries
0d092e0 is described below

commit 0d092e015931e889815b0913e8018ba624d45494
Author: Wes McKinney 
AuthorDate: Thu Nov 15 12:43:53 2018 +0100

ARROW-912: [Python] Recommend that Python developers use -DCMAKE_INSTALL_LIBDIR=lib when building Arrow C++ libraries

This was a rough edge on some multiarch-enabled systems. Our wheel builds
are not multiarch aware:
https://github.com/apache/arrow/blob/master/python/setup.py#L235

Author: Wes McKinney 

Closes #2964 from wesm/ARROW-912 and squashes the following commits:

18c7edad  Recommend that Python developers use -DCMAKE_INSTALL_LIBDIR=lib when building
---
 python/doc/source/development.rst | 8 
 1 file changed, 8 insertions(+)

diff --git a/python/doc/source/development.rst b/python/doc/source/development.rst
index eefd976..3bd6689 100644
--- a/python/doc/source/development.rst
+++ b/python/doc/source/development.rst
@@ -162,6 +162,7 @@ Now build and install the Arrow C++ libraries:
 
cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
  -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
+ -DCMAKE_INSTALL_LIBDIR=lib \
  -DARROW_PARQUET=on \
  -DARROW_PYTHON=on \
  -DARROW_PLASMA=on \
@@ -174,6 +175,13 @@ Now build and install the Arrow C++ libraries:
 If you don't want to build and install the Plasma in-memory object store,
 you can omit the ``-DARROW_PLASMA=on`` flag.
 
+.. note::
+
+   On Linux systems with support for building on multiple architectures,
+   ``make`` may install libraries in the ``lib64`` directory by default. For
+   this reason we recommend passing ``-DCMAKE_INSTALL_LIBDIR=lib`` because the
+   Python build scripts assume the library directory is ``lib``
+
 Now, build pyarrow:
 
 .. code-block:: shell