arrow git commit: ARROW-60: [C++] Struct type builder API

2016-06-07 Thread wesm
rimitive.h" +#include "arrow/types/struct.h" +#include "arrow/types/test-common.h" +#include "arrow/util/status.h" using std::shared_ptr; using std::string; @@ -52,4 +61,327 @@ TEST(TestStructType, Basics) { // TODO(wesm): out of bounds for field(...) } +void Va

arrow git commit: ARROW-211: [Format] Fixed typos in layout examples

2016-06-07 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 65740950c -> ce2fe7a78 ARROW-211: [Format] Fixed typos in layout examples Just a few typo fixes according to the ticket. Author: Smyatkin Maxim Closes #86 from Smyatkin-Maxim/ARROW-211 and squashes the following

arrow git commit: ARROW-212: Change contract of PrimitiveArray to reflect its abstractness

2016-06-08 Thread wesm
Repository: arrow Updated Branches: refs/heads/master bc6c4c88f -> 8197f246d ARROW-212: Change contract of PrimitiveArray to reflect its abstractness Follow-up based on #80 Author: Micah Kornfield Closes #87 from emkornfield/emk_clarify_primitive and squashes the

arrow git commit: ARROW-200: [C++/Python] Return error status on string initialization failure

2016-06-08 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 9ce13a067 -> bc6c4c88f ARROW-200: [C++/Python] Return error status on string initialization failure Author: Micah Kornfield Closes #88 from emkornfield/emk_arrow_200 and squashes the following commits: 37e23be

arrow git commit: ARROW-203: Python: Basic filename based Parquet read/write

2016-06-10 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 8197f246d -> ec66ddd1f ARROW-203: Python: Basic filename based Parquet read/write Author: Uwe L. Korn Closes #83 from xhochy/arrow-203 and squashes the following commits: 405f85d [Uwe L. Korn] Remove FindParquet

arrow git commit: ARROW-223: Do not link against libpython

2016-06-21 Thread wesm
Repository: arrow Updated Branches: refs/heads/master a3e3849cd -> f7ade7bfe ARROW-223: Do not link against libpython Author: Uwe L. Korn Closes #95 from xhochy/arrow-223 and squashes the following commits: 4fdf1e7 [Uwe L. Korn] ARROW-223: Do not link against libpython

arrow git commit: ARROW-217: Fix Travis w.r.t conda 4.1.0 changes

2016-06-15 Thread wesm
Repository: arrow Updated Branches: refs/heads/master ec66ddd1f -> b4e0e93d5 ARROW-217: Fix Travis w.r.t conda 4.1.0 changes Travis is happy, fixes the problems we see with Travis in #85 Author: Uwe L. Korn Closes #90 from xhochy/fix-conda-show-channel-urls and squashes

arrow git commit: ARROW-218: Add optional API token authentication option to PR merge tool

2016-06-16 Thread wesm
ing issues on shared outbound IP addresses. Author: Wes McKinney <w...@apache.org> Closes #91 from wesm/ARROW-218 and squashes the following commits: f45808c [Wes McKinney] Add optional GitHub API token to patch tool (to avoid rate limiting issues with unauthenticated requests) Project: h

arrow git commit: ARROW-210: Cleanup of the string related types in C++ code base

2016-06-16 Thread wesm
ng.h index d2d3c5b..b3c00d2 100644 --- a/cpp/src/arrow/types/string.h +++ b/cpp/src/arrow/types/string.h @@ -34,87 +34,99 @@ namespace arrow { class Buffer; class MemoryPool; -struct CharType : public DataType { - int size; - - explicit CharType(int size) : DataType(Type::CHAR), size(size) {}

[2/2] arrow git commit: ARROW-222: Prototyping an IO interface for Arrow, with initial HDFS target

2016-06-24 Thread wesm
+ if (client_->Exists(path)) { ASSERT_OK(client_->Delete(path, true)); } + + ASSERT_OK(client_->CreateDirectory(path)); + ASSERT_TRUE(client_->Exists(path)); + EXPECT_OK(client_->Delete(path, true)); + ASSERT_FALSE(client_->Exists(path)); +} + +TEST_F(TestHdfsClient, GetCapacityUsed)

[1/2] arrow git commit: ARROW-222: Prototyping an IO interface for Arrow, with initial HDFS target

2016-06-24 Thread wesm
Repository: arrow Updated Branches: refs/heads/master f7ade7bfe -> ef9083029 http://git-wip-us.apache.org/repos/asf/arrow/blob/ef908302/cpp/thirdparty/hadoop/include/hdfs.h -- diff --git a/cpp/thirdparty/hadoop/include/hdfs.h

arrow git commit: ARROW-8: Add .travis.yml and test script for Arrow C++. OS X build fixes

2016-03-01 Thread wesm
Repository: arrow Updated Branches: refs/heads/master e6905effb -> a3856222d ARROW-8: Add .travis.yml and test script for Arrow C++. OS X build fixes Project: http://git-wip-us.apache.org/repos/asf/arrow/repo Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/a3856222 Tree:

arrow git commit: ARROW-42: Add Python tests to Travis CI build

2016-03-08 Thread wesm
Repository: arrow Updated Branches: refs/heads/master e822ea758 -> 83675273b ARROW-42: Add Python tests to Travis CI build Author: Wes McKinney <w...@apache.org> Closes #22 from wesm/ARROW-42 and squashes the following commits: 3b056a1 [Wes McKinney] Modularize Travis CI buil

[1/2] arrow git commit: ARROW-54: [Python] Rename package to "pyarrow"

2016-03-09 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 83675273b -> 6fdcd4943 http://git-wip-us.apache.org/repos/asf/arrow/blob/6fdcd494/python/pyarrow/includes/libarrow.pxd -- diff --git a/python/pyarrow/includes/libarrow.pxd

arrow git commit: ARROW-68: Better error handling for not fully setup systems

2016-03-19 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 5881aacef -> c99661069 ARROW-68: Better error handling for not fully setup systems Author: Micah Kornfield Closes #27 from emkornfield/emk_add_nice_errors_PR and squashes the following commits: c0b9d78 [Micah

arrow git commit: ARROW-55: [Python] Fix unit tests in 2.7

2016-03-19 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 6fdcd4943 -> 883c62bdd ARROW-55: [Python] Fix unit tests in 2.7 Fixing the #define check for Python 2 makes all unit tests pass in Python 2.7. Author: Dan Robinson Closes #25 from danrobinson/ARROW-55 and

arrow git commit: ARROW-72: Search for alternative parquet-cpp header

2016-03-21 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 3a99f39d6 -> 016b92bcc ARROW-72: Search for alternative parquet-cpp header Author: Uwe L. Korn Closes #30 from xhochy/arrow-72 and squashes the following commits: 5b6b328 [Uwe L. Korn] ARROW-72: Search for alternative

arrow git commit: ARROW-28: Adding google's benchmark library to the toolchain

2016-03-22 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 016b92bcc -> 4ec034bbe ARROW-28: Adding google's benchmark library to the toolchain This isn't yet complete, but before I go further I think it's worth asking some questions on people's preferences: 1. It seems that the build third-party

arrow git commit: ARROW-75: Fix handling of empty strings

2016-03-22 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 4ec034bbe -> 093f9bd8c ARROW-75: Fix handling of empty strings Fixes [ARROW-75](https://issues.apache.org/jira/browse/ARROW-75) (and changes Python tests to verify that behavior). Author: Dan Robinson Closes

arrow git commit: ARROW-70: Add adapt 'lite' DCHECK macros from Kudu as also used in Parquet

2016-03-23 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 65db0da80 -> a4002c6e2 ARROW-70: Add adapt 'lite' DCHECK macros from Kudu as also used in Parquet Also added a null pointer DCHECK to show that it works. cc @emkornfield Author: Wes McKinney <w...@apache.org> Closes #33 from w

arrow git commit: ARROW-77: [C++] Conform bitmap interpretation to ARROW-62; 1 for nulls, 0 for non-nulls

2016-03-24 Thread wesm
Repository: arrow Updated Branches: refs/heads/master a4002c6e2 -> fbbee3d2d ARROW-77: [C++] Conform bitmap interpretation to ARROW-62; 1 for nulls, 0 for non-nulls Author: Wes McKinney <w...@apache.org> Closes #35 from wesm/ARROW-77 and squashes the following commits: 848

[1/3] arrow git commit: ARROW-67: C++ metadata flatbuffer serialization and data movement to memory maps

2016-03-22 Thread wesm
elative offset into the shared memory page where the bytes for this + /// buffer starts + offset: long; + + /// The absolute length (in bytes) of the memory buffer. The memory is found + /// from offset (inclusive) to offset + length (non-inclusive). + length: long; +} + +/// Metadata about a fi
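The offset/length contract described in this fragment (bytes run from offset, inclusive, to offset + length, exclusive) can be illustrated with a minimal Python sketch. `BufferMeta` and `slice_buffer` are hypothetical names used only for illustration; they are not part of the Arrow schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferMeta:
    """Hypothetical stand-in for the (offset, length) buffer metadata."""
    offset: int  # relative offset into the shared memory page
    length: int  # absolute length of the buffer in bytes


def slice_buffer(page: bytes, meta: BufferMeta) -> memoryview:
    """Return a zero-copy view of page[offset : offset + length]."""
    end = meta.offset + meta.length  # exclusive upper bound
    if meta.offset < 0 or meta.length < 0 or end > len(page):
        raise ValueError("buffer metadata points outside the page")
    return memoryview(page)[meta.offset:end]


page = bytes(range(16))
view = slice_buffer(page, BufferMeta(offset=4, length=3))
```

Here `view` covers bytes 4, 5 and 6 of the page; byte 7 is excluded because the upper bound is not inclusive.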

[3/3] arrow git commit: ARROW-67: C++ metadata flatbuffer serialization and data movement to memory maps

2016-03-22 Thread wesm
and consolidation as part of this. For example, List types are now internally equivalent to a nested type with 1 named child field (versus a struct, which can have any number of child fields). Associated JIRAs: ARROW-48, ARROW-57, ARROW-58 Author: Wes McKinney <w...@apache.org> Closes #28 from we

[2/3] arrow git commit: ARROW-67: C++ metadata flatbuffer serialization and data movement to memory maps

2016-03-22 Thread wesm
// The buffer is prefixed by its size as int32_t + const uint8_t* fb_head = buffer->data() + sizeof(int32_t); + const flatbuf::Message* message = flatbuf::GetMessage(fb_head); + + // TODO(wesm): verify message + result->impl_.reset(new Impl(buffer, message)); + *out = result; +
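The size-prefix layout mentioned in the comment above (an int32 length followed by the message bytes) can be sketched in Python as follows. This is only an illustration: the little-endian byte order and the `split_prefixed` helper are assumptions, since the C++ fragment does not show how the prefix is written.

```python
import struct

# Assumption: the int32 size prefix is little-endian; the diff does not say.
PREFIX = struct.Struct("<i")


def split_prefixed(buf: bytes):
    """Split a size-prefixed buffer into (size, payload)."""
    if len(buf) < PREFIX.size:
        raise ValueError("buffer too small to hold the size prefix")
    (size,) = PREFIX.unpack_from(buf, 0)
    if size < 0:
        raise ValueError("negative size prefix")
    # The payload starts right after the prefix, like data() + sizeof(int32_t).
    payload = buf[PREFIX.size:PREFIX.size + size]
    if len(payload) != size:
        raise ValueError("buffer is shorter than its size prefix claims")
    return size, payload


framed = PREFIX.pack(len(b"metadata")) + b"metadata"
size, payload = split_prefixed(framed)
```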

arrow git commit: ARROW-22: [C++] Convert flat Parquet schemas to Arrow schemas

2016-03-26 Thread wesm
dence between repetition and definition levels so that the right null bits can be set easily during reassembly. Closes #37. Closes #38. Closes #39 Author: Wes McKinney <w...@apache.org> Author: Uwe L. Korn <uw...@xhochy.com> Closes #41 from wesm/ARROW-22 and squashes the followi

arrow git commit: ARROW-44: Python: prototype object model for array slot values ("scalars")

2016-03-07 Thread wesm
rr[2]) Out[10]: 0 In [11]: arr.type Out[11]: DataType(list) ``` Author: Wes McKinney <w...@apache.org> Closes #20 from wesm/ARROW-44 and squashes the following commits: df06ba1 [Wes McKinney] Add tests for scalars proxying implemented Python list type conversions, fix associated bugs 20

arrow git commit: ARROW-26: Add instructions for enabling Arrow C++ Parquet adapter build

2016-03-03 Thread wesm
ild it in Arrow's thirdparty, but it immediately results in a dependency-hell situation (Parquet requires Thrift, Boost, snappy, lz4, zlib) Author: Wes McKinney <w...@apache.org> Closes #12 from wesm/ARROW-26 and squashes the following commits: b28fd75 [Wes McKinney] Add instructions for

arrow git commit: ARROW-20: Add null_count_ member to array containers, remove nullable_ member

2016-03-03 Thread wesm
gorithms code. If it is deemed useful we can validate (cheaply) that physical data meets the metadata requirements (e.g. non-nullable type metadata cannot be associated with data containers having nulls). Author: Wes McKinney <w...@apache.org> Closes #9 from wesm/ARROW-20 and squashes th

arrow git commit: ARROW-36: Remove fixVersions from JIRA resolve code path

2016-03-03 Thread wesm
<w...@apache.org> Closes #11 from wesm/ARROW-36 and squashes the following commits: 432c17c [Wes McKinney] Remove fixVersions from JIRA resolve code path Project: http://git-wip-us.apache.org/repos/asf/arrow/repo Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/1000d110 Tree: http:

arrow git commit: ARROW-13: Add PR merge tool from parquet-mr, suitably modified

2016-03-03 Thread wesm
Repository: arrow Updated Branches: refs/heads/master a3856222d -> 8f2ca246b ARROW-13: Add PR merge tool from parquet-mr, suitably modified Author: Wes McKinney <w...@apache.org> Closes #7 from wesm/ARROW-13 and squashes the following commits: 7a58712 [Wes McKinney] Add PR merge

arrow git commit: ARROW-7: Add barebones Python library build toolchain

2016-03-07 Thread wesm
for example: enabling C++ code with no knowledge of Python to invoke Python functions). Let's see how this goes: there are other options, like Boost::Python, but Cython + shim code is a more lightweight and flexible solution for the moment. Author: Wes McKinney <w...@apache.org>

arrow git commit: ARROW-9: Rename some unchanged "Drill" to "Arrow" (follow-up)

2016-03-07 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 8caa28726 -> 571343bbe ARROW-9: Rename some unchanged "Drill" to "Arrow" (follow-up) https://issues.apache.org/jira/browse/ARROW-9 There is an unchanged one from "Drill" to "Arrow" at `ValueVector` and minor typos are fixed. Author:

arrow git commit: ARROW-35: Add a short call-to-action in the top level README.md

2016-03-07 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 572cdf22e -> 8caa28726 ARROW-35: Add a short call-to-action in the top level README.md Author: Wes McKinney <w...@apache.org> Closes #13 from wesm/ARROW-35 and squashes the following commits: e10bfc3 [Wes McKinney] Add a prop

[1/2] arrow git commit: ARROW-31: Python: prototype user object model, add PyList conversion path with type inference

2016-03-07 Thread wesm
bool_count_(0), + int_count_(0), + float_count_(0), + string_count_(0) {} + + void Visit(PyObject* obj) { +++total_count_; +if (obj == Py_None) { + ++none_count_; +} else if (PyFloat_Check(obj)) { + ++float_count_; +} else if (IsPyInteger(obj)) { +
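The counting visitor in this fragment can be approximated in pure Python. The sketch below is not the committed C++ code: the `SeqVisitor` class, the `infer` method, and in particular the precedence used to pick a type (float over int over bool over string) are assumptions made for illustration.

```python
class SeqVisitor:
    """Count the kinds of values in a Python sequence (illustrative only)."""

    def __init__(self):
        self.total_count = 0
        self.none_count = 0
        self.bool_count = 0
        self.int_count = 0
        self.float_count = 0
        self.string_count = 0

    def visit(self, obj):
        self.total_count += 1
        if obj is None:
            self.none_count += 1
        elif isinstance(obj, bool):
            # bool is a subclass of int in Python, so test it first.
            self.bool_count += 1
        elif isinstance(obj, float):
            self.float_count += 1
        elif isinstance(obj, int):
            self.int_count += 1
        elif isinstance(obj, str):
            self.string_count += 1

    def infer(self):
        # Assumed precedence; the real rules are not visible in the fragment.
        if self.float_count:
            return "double"
        if self.int_count:
            return "int64"
        if self.bool_count:
            return "bool"
        if self.string_count:
            return "string"
        return "null"


visitor = SeqVisitor()
for value in [1, None, 2.5]:
    visitor.visit(value)
inferred = visitor.infer()
```

With one integer, one `None` and one float, the assumed precedence picks the floating-point type and records a single null.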

[2/2] arrow git commit: ARROW-31: Python: prototype user object model, add PyList conversion path with type inference

2016-03-07 Thread wesm
merging. Author: Wes McKinney <w...@apache.org> Closes #19 from wesm/ARROW-31 and squashes the following commits: 2345541 [Wes McKinney] Test basic conversion of nested lists 1d4618b [Wes McKinney] Prototype string and double converters b02b296 [Wes McKinney] Type inference for lists and

arrow git commit: ARROW-23: Add a logical Column data structure

2016-03-04 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 3b777c7f4 -> 9c2b95446 ARROW-23: Add a logical Column data structure I also added global const instances of common primitive types Author: Wes McKinney <w...@apache.org> Closes #15 from wesm/ARROW-23 and squashes the followin

arrow git commit: ARROW-24: C++: Implement a logical Table container type

2016-03-04 Thread wesm
n the road. Author: Wes McKinney <w...@apache.org> Closes #16 from wesm/ARROW-24 and squashes the following commits: b701c76 [Wes McKinney] Test case for wrong number of columns passed 5faa5ac [Wes McKinney] cpplint 9a651cb [Wes McKinney] Basic table prototype. Move Schema code unde

arrow git commit: ARROW-43: Python: format array values to in __repr__ for interactive computing

2016-03-08 Thread wesm
Repository: arrow Updated Branches: refs/heads/master ae95dbd18 -> 45cd9fd8d ARROW-43: Python: format array values to in __repr__ for interactive computing Author: Wes McKinney <w...@apache.org> Closes #21 from wesm/ARROW-43 and squashes the following commits: dee6ba2 [Wes McKinn

arrow git commit: ARROW-90: [C++] Check for SIMD instruction set support

2016-03-31 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 6d31d5928 -> 79fddd113 ARROW-90: [C++] Check for SIMD instruction set support This also adds an option to disable the usage of a specific instruction set, e.g. you compile on a machine that supports SSE3 but you want to use the binary also

arrow git commit: ARROW-88: [C++] Refactor usages of parquet_cpp namespace

2016-03-28 Thread wesm
loses #49 from wesm/ARROW-88 and squashes the following commits: c4d81dc [Wes McKinney] Refactor usages of parquet_cpp namespace Project: http://git-wip-us.apache.org/repos/asf/arrow/repo Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/df7726d4 Tree: http://git-wip-us.apache.org/repos/

arrow git commit: ARROW-193: typos "int his" fix to "in this"

2016-05-08 Thread wesm
Repository: arrow Updated Branches: refs/heads/master c9ffe546b -> 1f04f7ff9 ARROW-193: typos "int his" fix to "in this" Project: http://git-wip-us.apache.org/repos/asf/arrow/repo Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/1f04f7ff Tree:

arrow git commit: ARROW-91: Basic Parquet read support

2016-05-10 Thread wesm
less required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and li

arrow git commit: ARROW-199: [C++] Refine third party dependency

2016-05-14 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 68b80a838 -> 6968ec01d ARROW-199: [C++] Refine third party dependency To generate the makefile, running download_thirdparty.sh and build_thirdparty.sh is not enough; sourcing setup_build_env.sh is also necessary, since FLATBUFFERS_HOME must be set.

arrow git commit: ARROW-204: Add Travis CI builds that post conda artifacts for Linux and OS X

2016-05-18 Thread wesm
ing issues won't fail the build. Author: Wes McKinney <w...@apache.org> Closes #79 from wesm/ARROW-204 and squashes the following commits: afd0582 [Wes McKinney] Change encrypted token to apache/arrow, only upload on commits to master 58955e5 [Wes McKinney] Draft of automated conda builds for

arrow git commit: ARROW-201: [C++] Initial ParquetWriter implementation

2016-05-18 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 978de1a94 -> e0fb3698e ARROW-201: [C++] Initial ParquetWriter implementation Author: Uwe L. Korn Closes #78 from xhochy/arrow-201 and squashes the following commits: 5d95099 [Uwe L. Korn] Add check for flat column

arrow git commit: ARROW-82: Initial IPC support for ListArray

2016-04-18 Thread wesm
mmy zero-length buffer, not to be copied +buffers->push_back(std::make_shared(nullptr, 0)); + } -if (prim_arr->null_count() > 0) { - buffers->push_back(prim_arr->null_bitmap()); -} else { - // Push a dummy zero-length buffer, not to be copied - buffers->pus

arrow git commit: ARROW-103: Add files to gitignore

2016-04-17 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 37f727168 -> 5843e6872 ARROW-103: Add files to gitignore Patches [ARROW-103](https://issues.apache.org/jira/browse/ARROW-103), though perhaps it would make sense to leave that issue open to cover any future .gitignore-related pull

arrow git commit: ARROW-237: Implement parquet-cpp's abstract IO interfaces for memory allocation and file reading

2016-07-11 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 77598fa59 -> ff6132f8a ARROW-237: Implement parquet-cpp's abstract IO interfaces for memory allocation and file reading Part of ARROW-227 and ARROW-236 Author: Wes McKinney <w...@apache.org> Closes #101 from wesm/ARROW-237 and

arrow git commit: ARROW-244: Some global APIs of IPC module should be visible to the outside

2016-08-01 Thread wesm
Repository: arrow Updated Branches: refs/heads/master a2fb756a4 -> dc79ceb05 ARROW-244: Some global APIs of IPC module should be visible to the outside Author: Jihoon Son Closes #109 from jihoonson/ARROW-244 and squashes the following commits: 51d9a87 [Jihoon Son]

arrow git commit: ARROW-240: Provide more detailed installation instructions for pyarrow. Closes

2016-08-01 Thread wesm
Repository: arrow Updated Branches: refs/heads/master dc79ceb05 -> 356d015bb ARROW-240: Provide more detailed installation instructions for pyarrow. Closes Project: http://git-wip-us.apache.org/repos/asf/arrow/repo Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/356d015b Tree:

arrow git commit: ARROW-101: Fix java compiler warnings

2016-08-01 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 356d015bb -> 3a2dfba59 ARROW-101: Fix java compiler warnings Fixes several warnings emitted by java compiler regarding the use of raw types and unclosed resources. Author: Laurent Goujon Closes #60 from

arrow git commit: ARROW-106: [C++] Add IPC to binary/string types

2016-07-12 Thread wesm
Repository: arrow Updated Branches: refs/heads/master ff6132f8a -> 62390d842 ARROW-106: [C++] Add IPC to binary/string types Author: Micah Kornfield Closes #103 from emkornfield/emk_add_string_rpc and squashes the following commits: 9c563fe [Micah Kornfield]

arrow git commit: ARROW-247: Missing explicit destructor in RowBatchReader causes an incomplete type error

2016-08-04 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 56835c338 -> 5df7d4dee ARROW-247: Missing explicit destructor in RowBatchReader causes an incomplete type error Author: Jihoon Son Closes #111 from jihoonson/ARROW-247 and squashes the following commits: cc7281c

arrow git commit: ARROW-215: Support other integer types and strings in Parquet I/O

2016-06-29 Thread wesm
Repository: arrow Updated Branches: refs/heads/master ef9083029 -> 2f52cf4ee ARROW-215: Support other integer types and strings in Parquet I/O Change-Id: I72c6c82bc38c895a04172531bebbc78d4fb08732 Project: http://git-wip-us.apache.org/repos/asf/arrow/repo Commit:

arrow git commit: ARROW-234: Build libhdfs IO extension in conda artifacts

2016-07-01 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 2f52cf4ee -> fab4c82d2 ARROW-234: Build libhdfs IO extension in conda artifacts Author: Wes McKinney <w...@apache.org> Closes #97 from wesm/ARROW-234 and squashes the following commits: 3edb8d1 [Wes McKinney] Enable ARROW_HDFS

arrow git commit: ARROW-107: [C++] Implement IPC for structs

2016-08-16 Thread wesm
w/util/buffer.h" #include "arrow/util/logging.h" #include "arrow/util/status.h" @@ -118,8 +119,11 @@ Status VisitArray(const Array* arr, std::vector* field_nodes RETURN_NOT_OK(VisitArray( list_arr->values().get(), field_nodes, buffers, max_recursion_depth -

arrow git commit: ARROW-251: Expose APIs for getting code and message of the status

2016-08-15 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 689cd270e -> 268e108c2 ARROW-251: Expose APIs for getting code and message of the status Author: Jihoon Son Closes #114 from jihoonson/ARROW-251 and squashes the following commits: d1186bf [Jihoon Son] Fix

arrow git commit: ARROW-523: Python: Account for changes in PARQUET-834

2017-02-02 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 0ae4d86e5 -> c05292faf ARROW-523: Python: Account for changes in PARQUET-834 Author: Uwe L. Korn Closes #313 from xhochy/ARROW-523 and squashes the following commits: ff699ea [Uwe L. Korn] Use relative import e36dcc8

arrow git commit: ARROW-477: [Java] Add support for second/microsecond/nanosecond timestamps in-memory and in IPC/JSON layer

2017-02-03 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 720d422fa -> 08f38d979 ARROW-477: [Java] Add support for second/microsecond/nanosecond timestamps in-memory and in IPC/JSON layer Changes include: - add support for TimeStamp data type with second/microsecond/nanosecond time units - add

arrow git commit: ARROW-410: [C++] Add virtual Writeable::Flush

2017-01-31 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 7ac320bde -> be5d73f2c ARROW-410: [C++] Add virtual Writeable::Flush Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #310 from wesm/ARROW-410 and squashes the following commits: 7352f0a [Wes McKinney] Add virtual Writeab

arrow git commit: ARROW-381: [C++] Simplify primitive array type builders to use a default type singleton

2017-02-04 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 5b35d6bda -> 84f16624b ARROW-381: [C++] Simplify primitive array type builders to use a default type singleton Author: Uwe L. Korn Closes #316 from xhochy/ARROW-381 and squashes the following commits: 7061d9a [Uwe L.

arrow git commit: ARROW-527: Remove drill-module.conf file

2017-02-04 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 84f16624b -> c45c3b3e1 ARROW-527: Remove drill-module.conf file Remove drill-module.conf file as it is not used by the project. Author: Laurent Goujon Closes #318 from laurentgo/laurent/ARROW-527 and squashes the

arrow git commit: ARROW-531: Python: Document jemalloc, extend Pandas section, add Getting Involved

2017-02-07 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 4c3481ea5 -> e97fbe640 ARROW-531: Python: Document jemalloc, extend Pandas section, add Getting Involved Author: Uwe L. Korn Closes #321 from xhochy/ARROW-531 and squashes the following commits: 55da9dc [Uwe L. Korn]

arrow git commit: ARROW-535: [Python] Add type mapping for NPY_LONGLONG

2017-02-07 Thread wesm
Repository: arrow Updated Branches: refs/heads/master f268e927a -> 4c3481ea5 ARROW-535: [Python] Add type mapping for NPY_LONGLONG Based on https://github.com/wesm/feather/pull/107 Author: Uwe L. Korn <uw...@xhochy.com> Closes #323 from xhochy/ARROW-535 and squashes the followin

arrow git commit: ARROW-351: Time type has no unit

2017-02-08 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 1407abfc9 -> b99d049c3 ARROW-351: Time type has no unit Author: Julien Le Dem Closes #328 from julienledem/arrow_351 and squashes the following commits: 2497ee3 [Julien Le Dem] ARROW-351: Time type has no unit

arrow git commit: ARROW-366 Java Dictionary Vector

2017-02-07 Thread wesm
Repository: arrow Updated Branches: refs/heads/master e97fbe640 -> c322cbf22 ARROW-366 Java Dictionary Vector I've added a dictionary type, and a partial implementation of a dictionary vector that just wraps an index vector and has a reference to a lookup vector. The spec seems to indicate

arrow git commit: ARROW-538: [C++] Set up AddressSanitizer (ASAN) builds

2017-02-08 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 4440e4011 -> 0bdfd5efb ARROW-538: [C++] Set up AddressSanitizer (ASAN) builds Most of the infrastructure was already in place, only needed to fix the gtest build. We will now build with AddressSanitizer activated on OSX. Author: Uwe L.

arrow git commit: ARROW-543: C++: Lazily computed null_counts counts number of non-null entries

2017-02-08 Thread wesm
Repository: arrow Updated Branches: refs/heads/master b99d049c3 -> 4440e4011 ARROW-543: C++: Lazily computed null_counts counts number of non-null entries Author: Uwe L. Korn Closes #329 from xhochy/ARROW-543 and squashes the following commits: 191792b [Uwe L. Korn]

arrow git commit: ARROW-529: Python: Add jemalloc and Python 3.6 to manylinux1 build

2017-02-05 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 70c05be21 -> 5bee596ca ARROW-529: Python: Add jemalloc and Python 3.6 to manylinux1 build Author: Uwe L. Korn Closes #319 from xhochy/ARROW-529 and squashes the following commits: 48893a2 [Uwe L. Korn] ARROW-529:

[1/3] arrow git commit: ARROW-33: [C++] Implement zero-copy array slicing, integrate with IPC code paths

2017-02-06 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 74bc4dd48 -> 5439b7158 http://git-wip-us.apache.org/repos/asf/arrow/blob/5439b715/python/src/pyarrow/adapters/pandas.cc -- diff --git

[2/3] arrow git commit: ARROW-33: [C++] Implement zero-copy array slicing, integrate with IPC code paths

2017-02-06 Thread wesm
); + const int right_abs_index = o_i + right.offset(); + // TODO(wesm): really we should be comparing stretches of non-null data // rather than looking at one value at a time. if (union_mode == UnionMode::SPARSE) { -if (!left.child(child_num)->RangeEq

[3/3] arrow git commit: ARROW-33: [C++] Implement zero-copy array slicing, integrate with IPC code paths

2017-02-06 Thread wesm
to do to polish things up Closes #56. Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #322 from wesm/ARROW-33 and squashes the following commits: 61afe42 [Wes McKinney] Some API cleaning in builder.h 86511a3 [Wes McKinney] Python fixes, clang warning fixes 9a00870 [Wes McKinney

arrow git commit: ARROW-525: Python: Add more documentation to the package

2017-02-04 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 08f38d979 -> e881f1155 ARROW-525: Python: Add more documentation to the package Author: Uwe L. Korn Closes #317 from xhochy/ARROW-525 and squashes the following commits: d213e63 [Uwe L. Korn] ARROW-525: Python: Add

arrow git commit: ARROW-457: Python: Better control over memory pool

2017-02-04 Thread wesm
Repository: arrow Updated Branches: refs/heads/master e881f1155 -> 5b35d6bda ARROW-457: Python: Better control over memory pool Author: Uwe L. Korn Closes #315 from xhochy/ARROW-457 and squashes the following commits: dc5abdb [Uwe L. Korn] Use aligned deallocator 20c8505

arrow git commit: ARROW-505: [C++] Fix compiler warning in gcc in release mode

2017-01-22 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 5888e10cf -> 5a161ebc1 ARROW-505: [C++] Fix compiler warning in gcc in release mode Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #294 from wesm/fix-release-compile-warning and squashes the following commits: 418

arrow git commit: ARROW-495: [C++] Implement streaming binary format, refactoring

2017-01-21 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 8ca7033fc -> 5888e10cf ARROW-495: [C++] Implement streaming binary format, refactoring cc @nongli Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #293 from wesm/ARROW-495 and squashes the following commits: 279583b [Wes

arrow git commit: ARROW-494: [C++] Extend lifetime of memory mapped data if any buffers reference it

2017-01-23 Thread wesm
ory was being unmapped even if there were `arrow::Buffer` objects referencing it. Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #298 from wesm/ARROW-494 and squashes the following commits: 60222e3 [Wes McKinney] clang-format 2960d17 [Wes McKinney] Add C++ unit test d7d776a [Wes McKi

arrow git commit: ARROW-506: Java: Implement echo server for integration testing.

2017-01-23 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 69cdbd8ce -> c327b5fd2 ARROW-506: Java: Implement echo server for integration testing. While implementing this, it became clear it made sense for the stream writer to have an API to indicate EOS without closing the stream. The current

arrow git commit: ARROW-475: [Python] Add support for reading multiple Parquet files as a single pyarrow.Table

2017-01-23 Thread wesm
lso implements ARROW-470 Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #296 from wesm/ARROW-475 and squashes the following commits: 894d2a2 [Wes McKinney] Implement Filesystem abstraction, add Filesystem.read_parquet. Implement rudimentary shim on local filesystem 3927c2c [Wes McKin

arrow git commit: ARROW-503: [Python] Implement Python interface to streaming file format

2017-01-23 Thread wesm
ged. Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #299 from wesm/ARROW-503 and squashes the following commits: e9d918e [Wes McKinney] Close BufferOutputStream after completing file or stream writes 31e519f [Wes McKinney] Add function alias to preserve backwards compatibility faa

arrow git commit: ARROW-508: [C++] Add basic threadsafety to normal files and memory maps

2017-01-23 Thread wesm
ion in esoteric circumstances. I'm going to report a bug to change these to `ReadAt` which can be more easily made threadsafe Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #300 from wesm/ARROW-508 and squashes the following commits: e57156c [Wes McKinney] Make base ReadableFile

arrow git commit: ARROW-81: [Format] Augment dictionary encoding metadata to accommodate additional use cases

2017-01-23 Thread wesm
ort, and in general for statistical computing applications. Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #297 from wesm/ARROW-81 and squashes the following commits: c960bac [Wes McKinney] Augment dictionary encoding metadata to accommodate additional use cases Project: http:

arrow git commit: ARROW-378: Python: Respect timezone on conversion of Pandas datetime columns

2017-01-23 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 085c8754b -> c90ca60c1 ARROW-378: Python: Respect timezone on conversion of Pandas datetime columns arrow is now pandas datetime timezone aware Author: ahnj Closes #287 from ahnj/timestamp-aware and squashes the

arrow git commit: ARROW-512: C++: Add method to check for primitive types

2017-01-26 Thread wesm
Repository: arrow Updated Branches: refs/heads/master a68af9d16 -> a90b5f363 ARROW-512: C++: Add method to check for primitive types Also includes some documentation updates. Author: Uwe L. Korn Closes #304 from xhochy/ARROW-512 and squashes the following commits:

arrow git commit: ARROW-514: [Python] Automatically wrap pyarrow.io.Buffer in BufferReader

2017-01-26 Thread wesm
Repository: arrow Updated Branches: refs/heads/master aac2e70c1 -> 30bb0d97d ARROW-514: [Python] Automatically wrap pyarrow.io.Buffer in BufferReader Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #306 from wesm/ARROW-514 and squashes the following commits: d5e3235 [Wes

arrow git commit: ARROW-519: [C++] Refactor array comparison code into a compare.h / compare.cc in part to resolve Xcode 6.1 linker issue

2017-01-29 Thread wesm
arrays not equal" per ARROW-517 Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #308 from wesm/ARROW-519 and squashes the following commits: 85b0bf8 [Wes McKinney] Fix invalid memory access when doing RangeEquals on BinaryArray with all empty strings f5f4593 [Wes McKinney] Rem

arrow git commit: ARROW-513: [C++] Fixing Appveyor / MSVC build

2017-01-26 Thread wesm
ter` in the `FileWriter` implementation. This is not consistent with Microsoft's Modern C++ support matrix https://msdn.microsoft.com/en-us/library/hh567368.aspx, so perhaps they now support inheriting *public* constructors. Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #305 from wesm/

arrow git commit: ARROW-498 [C++] Add command line utilities that convert between stream and file.

2017-01-25 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 61a54f8a6 -> a68af9d16 ARROW-498 [C++] Add command line utilities that convert between stream and file. These are in the style of unix utilities using stdin/stdout for argument passing. This makes it easy to chain them together and I

arrow git commit: ARROW-563: Support non-standard gcc version strings

2017-02-20 Thread wesm
Repository: arrow Updated Branches: refs/heads/master ab15e01c7 -> ef6b46557 ARROW-563: Support non-standard gcc version strings Author: Uwe L. Korn Closes #343 from xhochy/ARROW-563 and squashes the following commits: 64d1c93 [Uwe L. Korn] ARROW-563: Support non-standard

[2/2] arrow git commit: ARROW-459: [C++] Dictionary IPC support in file and stream formats

2017-02-24 Thread wesm
ARROW-459: [C++] Dictionary IPC support in file and stream formats Also fixes ARROW-565 Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #347 from wesm/ARROW-459 and squashes the following commits: 6a987b7 [Wes McKinney] Fix clang warning with forward declaration 8e0e6fb [Wes Mc

arrow git commit: ARROW-580: C++: Also provide jemalloc_X targets if only a static or shared version is found

2017-02-25 Thread wesm
Repository: arrow Updated Branches: refs/heads/master d28f1c1e0 -> 89dc55789 ARROW-580: C++: Also provide jemalloc_X targets if only a static or shared version is found Author: Uwe L. Korn Closes #349 from xhochy/ARROW-580 and squashes the following commits: 6cdeef2 [Uwe

arrow git commit: ARROW-578: [C++] Add -DARROW_CXXFLAGS=... option to make CMake more consistent

2017-02-25 Thread wesm
ork properly in our Travis CI setup, so go figure. Some Google searches seem to confirm this is a known issue, and having a specific "user flags" option is a way around it. We just did the same thing in parquet-cpp. Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #348

arrow git commit: ARROW-551: C++: Construction of Column with nullptr Array segfaults

2017-02-12 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 42b55d98c -> e4845c447 ARROW-551: C++: Construction of Column with nullptr Array segfaults Author: Uwe L. Korn Closes #335 from xhochy/ARROW-551 and squashes the following commits: 440d4a9 [Uwe L. Korn] ARROW-551: C++:

arrow git commit: ARROW-556: [Integration] Configure C++ integration test executable with a single environment variable. Update README

2017-02-13 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 66f650cd3 -> 69cf69238 ARROW-556: [Integration] Configure C++ integration test executable with a single environment variable. Update README Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #340 from wesm/ARROW-556 and

arrow git commit: ARROW-558: Add KEYS files

2017-02-14 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 69cf69238 -> d50f1525a ARROW-558: Add KEYS files Author: Uwe L. Korn Closes #341 from xhochy/ARROW-558 and squashes the following commits: ea5327b [Uwe L. Korn] ARROW-558: Add KEYS files Project:

svn commit: r18328 - /release/arrow/KEYS

2017-02-14 Thread wesm
Author: wesm Date: Tue Feb 14 13:29:44 2017 New Revision: 18328 Log: [Arrow] Update KEYS file Modified: release/arrow/KEYS Modified: release/arrow/KEYS == --- release/arrow/KEYS (original) +++ release/arrow/KEYS

arrow git commit: ARROW-544: [C++] Test writing zero-length record batches, zero-length BinaryArray fixes

2017-02-10 Thread wesm
ges to verify. cc @BryanCutler Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #333 from wesm/ARROW-544 and squashes the following commits: f80d58f [Wes McKinney] Protect zero-length record batches from incomplete buffer metadata f876dce [Wes McKinney] Test with null value_offsets to

arrow git commit: ARROW-561:[JAVA][PYTHON] Update java & python dependencies to improve downstream packaging experience

2017-02-15 Thread wesm
Repository: arrow Updated Branches: refs/heads/master d50f1525a -> fa8d27f31 ARROW-561:[JAVA][PYTHON] Update java & python dependencies to improve downstream packaging experience The current build for arrow uses an interesting workaround for the hamcrest conflict between JUNIT and mockito which

arrow git commit: ARROW-553: C++: Faster valid bitmap building

2017-02-13 Thread wesm
Repository: arrow Updated Branches: refs/heads/master 1f26040f5 -> ad0157547 ARROW-553: C++: Faster valid bitmap building Author: Uwe L. Korn Closes #338 from xhochy/ARROW-553 and squashes the following commits: 1c1ee3d [Uwe L. Korn] ARROW-553: C++: Faster valid bitmap

arrow git commit: ARROW-547: [Python] Add zero-copy slice methods to Array, RecordBatch

2017-02-13 Thread wesm
Repository: arrow Updated Branches: refs/heads/master ad0157547 -> 66f650cd3 ARROW-547: [Python] Add zero-copy slice methods to Array, RecordBatch Author: Wes McKinney <wes.mckin...@twosigma.com> Closes #336 from wesm/ARROW-547 and squashes the following commits: 42037c2 [Wes
