[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-17 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r441565270



##
File path: cpp/src/arrow/compute/kernels/scalar_validity.cc
##
@@ -0,0 +1,107 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/kernels/common.h"
+
+#include "arrow/util/bit_util.h"
+#include "arrow/util/bitmap_ops.h"
+
+namespace arrow {
+
+using internal::CopyBitmap;
+using internal::InvertBitmap;
+
+namespace compute {
+namespace {
+
+struct IsValidOperator {
+  static void Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+checked_cast(out)->value = in.is_valid;
+  }
+
+  static void Call(KernelContext* ctx, const ArrayData& arr, ArrayData* out) {
+DCHECK_EQ(out->offset, 0);
+DCHECK_LE(out->length, arr.length);
+if (arr.buffers[0] != nullptr) {
+  out->buffers[1] = arr.offset == 0
+? arr.buffers[0]
+: SliceBuffer(arr.buffers[0], arr.offset / 8, 
arr.length / 8);
+  out->offset = arr.offset % 8;

Review comment:
   In general I'm not clear on what the kernel is/isn't allowed to modify 
in the out data. I've added `can_write_into_slices=false` , so IIUC the kernel 
should always exclusively own the out data.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r441096806



##
File path: cpp/src/arrow/compute/kernels/scalar_validity.cc
##
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/kernels/common.h"
+
+#include "arrow/util/bit_util.h"
+#include "arrow/util/bitmap_ops.h"
+
+namespace arrow {
+
+using internal::CopyBitmap;
+using internal::InvertBitmap;
+
+namespace compute {
+namespace {
+
+struct IsValidOperator {
+  static void Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+checked_cast(out)->value = in.is_valid;
+  }
+
+  static void Call(KernelContext* ctx, const ArrayData& arr, ArrayData* out) {
+if (arr.buffers[0] != nullptr && out->offset == arr.offset &&
+out->length == arr.length) {

Review comment:
   PTAL





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r441019388



##
File path: cpp/src/arrow/compute/kernels/scalar_validity.cc
##
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/kernels/common.h"
+
+#include "arrow/util/bit_util.h"
+#include "arrow/util/bitmap_ops.h"
+
+namespace arrow {
+
+using internal::CopyBitmap;
+using internal::InvertBitmap;
+
+namespace compute {
+namespace {
+
+struct IsValidOperator {
+  static void Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+checked_cast(out)->value = in.is_valid;
+  }
+
+  static void Call(KernelContext* ctx, const ArrayData& arr, ArrayData* out) {
+if (arr.buffers[0] != nullptr && out->offset == arr.offset &&
+out->length == arr.length) {

Review comment:
   As it happens this fails for sliced input. I'll restore the checks. For 
your interest, one approach i tried for `is_valid` was a `MetaFunction` which 
invoked a no-op ScalarFunction with INTERSECTION null handling then yanked the 
null bitmap from that. Unfortunately INTERSECTION doesn't currently support the 
zero copy case and I wasn't sure the approach would be acceptable but it did 
avoid repetition of null bitmap handling logic. What do you think?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r441021056



##
File path: cpp/src/arrow/testing/random.h
##
@@ -250,6 +250,21 @@ class ARROW_EXPORT RandomArrayGenerator {
int32_t min_length, int32_t 
max_length,
double null_probability = 0);
 
+  /// \brief Randomly generate an Array of the specified type, size, and 
null_probability.
+  ///
+  /// Generation parameters other than size and null_probability are 
determined based on
+  /// the type of Array to be generated.
+  /// If boolean the probabilities of true,false values are 0.25,0.75 
respectively.
+  /// If numeric min,max will be the least and greatest representable values.
+  /// If string min_length,max_length will be 0,sqrt(size) respectively.
+  ///
+  /// \param[in] type the type of Array to generate
+  /// \param[in] size the size of the Array to generate
+  /// \param[in] null_probability the probability of a slot being null
+  /// \return a generated Array
+  std::shared_ptr ArrayOf(std::shared_ptr type, int64_t size,
+ double null_probability);

Review comment:
   https://issues.apache.org/jira/browse/ARROW-9149





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r441019388



##
File path: cpp/src/arrow/compute/kernels/scalar_validity.cc
##
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/kernels/common.h"
+
+#include "arrow/util/bit_util.h"
+#include "arrow/util/bitmap_ops.h"
+
+namespace arrow {
+
+using internal::CopyBitmap;
+using internal::InvertBitmap;
+
+namespace compute {
+namespace {
+
+struct IsValidOperator {
+  static void Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+checked_cast(out)->value = in.is_valid;
+  }
+
+  static void Call(KernelContext* ctx, const ArrayData& arr, ArrayData* out) {
+if (arr.buffers[0] != nullptr && out->offset == arr.offset &&
+out->length == arr.length) {

Review comment:
   As it happens this fails for sliced input. I'll restore the checks. For 
your interest, one approach i tried for `is_valid` was a `MetaFunction` which 
invoked a no-op ScalarFunction with INTERSECTION null handling then yanked the 
null bitmap from that. Unfortunately INTERSECTION doesn't currently support the 
zero copy case and I wasn't sure the approach would be acceptable but it did 
avoid repetition of null bitmap handling logic. What do you thinks?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r441013409



##
File path: cpp/src/arrow/compute/kernels/scalar_validity_test.cc
##
@@ -0,0 +1,78 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include 
+
+#include "arrow/array.h"
+#include "arrow/compute/api.h"
+#include "arrow/compute/kernels/test_util.h"
+#include "arrow/testing/gtest_common.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/type.h"
+#include "arrow/type_traits.h"
+#include "arrow/util/bitmap_reader.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace compute {
+
+class TestValidityKernels : public ::testing::Test {
+ protected:
+  // XXX Since IsValid and IsNull don't touch any buffers but the null bitmap
+  // testing multiple types seems redundant.
+  using ArrowType = BooleanType;
+
+  static std::shared_ptr type_singleton() {
+return TypeTraits::type_singleton();
+  }
+};
+
+TEST_F(TestValidityKernels, ArrayIsValid) {
+  CheckScalarUnary("is_valid", type_singleton(), "[]", type_singleton(), "[]");
+  CheckScalarUnary("is_valid", type_singleton(), "[null]", type_singleton(), 
"[false]");
+  CheckScalarUnary("is_valid", type_singleton(), "[1]", type_singleton(), 
"[true]");
+  CheckScalarUnary("is_valid", type_singleton(), "[null, 1, 0, null]", 
type_singleton(),
+   "[false, true, true, false]");
+}
+
+TEST_F(TestValidityKernels, ArrayIsValidBufferPassthruOptimization) {
+  Datum arg = ArrayFromJSON(boolean(), "[null, 1, 0, null]");
+  ASSERT_OK_AND_ASSIGN(auto validity, arrow::compute::IsValid(arg));
+  ASSERT_EQ(validity.array()->buffers[1], arg.array()->buffers[0]);
+}
+
+TEST_F(TestValidityKernels, ScalarIsValid) {
+  CheckScalarUnary("is_valid", MakeScalar(19.7), MakeScalar(true));
+  CheckScalarUnary("is_valid", MakeNullScalar(float64()), MakeScalar(false));
+}
+
+TEST_F(TestValidityKernels, ArrayIsNull) {
+  CheckScalarUnary("is_null", type_singleton(), "[]", type_singleton(), "[]");
+  CheckScalarUnary("is_null", type_singleton(), "[null]", type_singleton(), 
"[true]");
+  CheckScalarUnary("is_null", type_singleton(), "[1]", type_singleton(), 
"[false]");
+  CheckScalarUnary("is_null", type_singleton(), "[null, 1, 0, null]", 
type_singleton(),
+   "[true, false, false, true]");
+}
+
+TEST_F(TestValidityKernels, ScalarIsNull) {
+  CheckScalarUnary("is_null", MakeScalar(19.7), MakeScalar(false));
+  CheckScalarUnary("is_null", MakeNullScalar(float64()), MakeScalar(true));
+}

Review comment:
   will do





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r441005251



##
File path: cpp/src/arrow/testing/random.cc
##
@@ -262,5 +262,62 @@ std::shared_ptr 
RandomArrayGenerator::Offsets(int64_t size, int32_t first
   return std::make_shared(array_data);
 }
 
+struct RandomArrayGeneratorOfImpl {
+  Status Visit(const NullType&) {
+DCHECK_NE(null_probability_, 0.0);

Review comment:
   alright





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-16 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r441004600



##
File path: cpp/src/arrow/compute/kernels/scalar_validity.cc
##
@@ -0,0 +1,110 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/kernels/common.h"
+
+#include "arrow/util/bit_util.h"
+#include "arrow/util/bitmap_ops.h"
+
+namespace arrow {
+
+using internal::CopyBitmap;
+using internal::InvertBitmap;
+
+namespace compute {
+namespace {
+
+struct IsValidOperator {
+  static void Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+checked_cast(out)->value = in.is_valid;
+  }
+
+  static void Call(KernelContext* ctx, const ArrayData& arr, ArrayData* out) {
+if (arr.buffers[0] != nullptr && out->offset == arr.offset &&
+out->length == arr.length) {

Review comment:
   will do





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440239175



##
File path: cpp/src/arrow/compute/kernels/scalar_validity_test.cc
##
@@ -0,0 +1,151 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include 
+
+#include "arrow/array.h"
+#include "arrow/compute/api.h"
+#include "arrow/compute/kernels/test_util.h"
+#include "arrow/testing/gtest_common.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/type.h"
+#include "arrow/type_traits.h"
+#include "arrow/util/bitmap_reader.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace compute {
+
+class TestValidityKernels : public ::testing::Test {
+ protected:
+  // XXX Since IsValid and IsNull don't touch any buffers but the null bitmap
+  // testing multiple types seems redundant.
+  using ArrowType = BooleanType;
+
+  using CType = typename ArrowType::c_type;
+
+  static std::shared_ptr type_singleton() {
+return TypeTraits::type_singleton();
+  }
+
+  void AssertUnary(Datum arg, Datum expected) {
+ASSERT_OK_AND_ASSIGN(auto actual, func_(arg, nullptr));
+ASSERT_EQ(actual.kind(), expected.kind());
+if (actual.kind() == Datum::ARRAY) {
+  ASSERT_OK(actual.make_array()->ValidateFull());
+  AssertArraysApproxEqual(*expected.make_array(), *actual.make_array());
+} else {
+  AssertScalarsEqual(*expected.scalar(), *actual.scalar());
+}
+  }
+
+  void AssertUnary(const std::string& arg_json, const std::string& 
expected_json) {
+AssertUnary(ArrayFromJSON(type_singleton(), arg_json),
+ArrayFromJSON(type_singleton(), expected_json));
+  }
+
+  using UnaryFunction = std::function(const Datum&, 
ExecContext*)>;
+  UnaryFunction func_;
+};
+
+TEST_F(TestValidityKernels, ArrayIsValid) {
+  func_ = arrow::compute::IsValid;
+
+  this->AssertUnary("[]", "[]");
+  this->AssertUnary("[null]", "[false]");
+  this->AssertUnary("[1]", "[true]");
+  this->AssertUnary("[null, 1, 0, null]", "[false, true, true, false]");
+}
+
+TEST_F(TestValidityKernels, ArrayIsValidBufferPassthruOptimization) {
+  Datum arg = ArrayFromJSON(boolean(), "[null, 1, 0, null]");
+  ASSERT_OK_AND_ASSIGN(auto validity, arrow::compute::IsValid(arg));
+  ASSERT_EQ(validity.array()->buffers[1], arg.array()->buffers[0]);
+}
+
+TEST_F(TestValidityKernels, ScalarIsValid) {
+  func_ = arrow::compute::IsValid;
+
+  AssertUnary(Datum(19.7), Datum(true));
+  AssertUnary(MakeNullScalar(float64()), Datum(false));
+}
+
+TEST_F(TestValidityKernels, ArrayIsNull) {
+  func_ = arrow::compute::IsNull;
+
+  this->AssertUnary("[]", "[]");
+  this->AssertUnary("[null]", "[true]");
+  this->AssertUnary("[1]", "[false]");
+  this->AssertUnary("[null, 1, 0, null]", "[true, false, false, true]");
+}
+
+TEST_F(TestValidityKernels, DISABLED_ScalarIsNull) {
+  func_ = arrow::compute::IsNull;
+
+  AssertUnary(Datum(19.7), Datum(false));
+  AssertUnary(MakeNullScalar(float64()), Datum(true));
+}
+
+class IsValidProperty : public ScalarFunctionPropertyMixin {
+ public:
+  std::shared_ptr GetFunction() override {
+return internal::checked_pointer_cast(
+*GetFunctionRegistry()->GetFunction("is_valid"));
+  }
+
+  Result> Contract(const ScalarVector& args,
+   const FunctionOptions*) override {
+return std::make_shared(args[0]->is_valid);
+  }
+};
+
+TEST_P(IsValidProperty, TestIsValidProperty) { Validate(); }
+
+INSTANTIATE_TEST_SUITE_P(IsValidPropertyTest, IsValidProperty,
+ ScalarFunctionPropertyTestParam::Values({
+ {0, 0.0},
+ {1, 0.0},
+ {2, 0.0},
+ {1024, 0.25},
+ }));
+
+class IsNullProperty : public ScalarFunctionPropertyMixin {
+ public:
+  std::shared_ptr GetFunction() override {
+return internal::checked_pointer_cast(
+*GetFunctionRegistry()->GetFunction("is_null"));
+  }
+
+  Result> Contract(const ScalarVector& args,
+   const FunctionOptions*) override {
+return std::make_shared(!args[0]->is_valid);
+  }

[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440231899



##
File path: cpp/src/arrow/compute/kernels/scalar_validity_test.cc
##
@@ -0,0 +1,151 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include 
+
+#include "arrow/array.h"
+#include "arrow/compute/api.h"
+#include "arrow/compute/kernels/test_util.h"
+#include "arrow/testing/gtest_common.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/type.h"
+#include "arrow/type_traits.h"
+#include "arrow/util/bitmap_reader.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace compute {
+
+class TestValidityKernels : public ::testing::Test {
+ protected:
+  // XXX Since IsValid and IsNull don't touch any buffers but the null bitmap
+  // testing multiple types seems redundant.
+  using ArrowType = BooleanType;
+
+  using CType = typename ArrowType::c_type;
+
+  static std::shared_ptr type_singleton() {
+return TypeTraits::type_singleton();
+  }
+
+  void AssertUnary(Datum arg, Datum expected) {
+ASSERT_OK_AND_ASSIGN(auto actual, func_(arg, nullptr));
+ASSERT_EQ(actual.kind(), expected.kind());
+if (actual.kind() == Datum::ARRAY) {
+  ASSERT_OK(actual.make_array()->ValidateFull());
+  AssertArraysApproxEqual(*expected.make_array(), *actual.make_array());
+} else {
+  AssertScalarsEqual(*expected.scalar(), *actual.scalar());
+}
+  }
+
+  void AssertUnary(const std::string& arg_json, const std::string& 
expected_json) {
+AssertUnary(ArrayFromJSON(type_singleton(), arg_json),
+ArrayFromJSON(type_singleton(), expected_json));
+  }
+
+  using UnaryFunction = std::function(const Datum&, 
ExecContext*)>;
+  UnaryFunction func_;
+};
+
+TEST_F(TestValidityKernels, ArrayIsValid) {
+  func_ = arrow::compute::IsValid;
+
+  this->AssertUnary("[]", "[]");
+  this->AssertUnary("[null]", "[false]");
+  this->AssertUnary("[1]", "[true]");
+  this->AssertUnary("[null, 1, 0, null]", "[false, true, true, false]");
+}
+
+TEST_F(TestValidityKernels, ArrayIsValidBufferPassthruOptimization) {
+  Datum arg = ArrayFromJSON(boolean(), "[null, 1, 0, null]");
+  ASSERT_OK_AND_ASSIGN(auto validity, arrow::compute::IsValid(arg));
+  ASSERT_EQ(validity.array()->buffers[1], arg.array()->buffers[0]);
+}
+
+TEST_F(TestValidityKernels, ScalarIsValid) {
+  func_ = arrow::compute::IsValid;
+
+  AssertUnary(Datum(19.7), Datum(true));
+  AssertUnary(MakeNullScalar(float64()), Datum(false));
+}
+
+TEST_F(TestValidityKernels, ArrayIsNull) {
+  func_ = arrow::compute::IsNull;
+
+  this->AssertUnary("[]", "[]");
+  this->AssertUnary("[null]", "[true]");
+  this->AssertUnary("[1]", "[false]");
+  this->AssertUnary("[null, 1, 0, null]", "[true, false, false, true]");
+}
+
+TEST_F(TestValidityKernels, DISABLED_ScalarIsNull) {
+  func_ = arrow::compute::IsNull;
+
+  AssertUnary(Datum(19.7), Datum(false));
+  AssertUnary(MakeNullScalar(float64()), Datum(true));
+}
+
+class IsValidProperty : public ScalarFunctionPropertyMixin {
+ public:
+  std::shared_ptr GetFunction() override {
+return internal::checked_pointer_cast(
+*GetFunctionRegistry()->GetFunction("is_valid"));
+  }
+
+  Result> Contract(const ScalarVector& args,
+   const FunctionOptions*) override {
+return std::make_shared(args[0]->is_valid);
+  }
+};
+
+TEST_P(IsValidProperty, TestIsValidProperty) { Validate(); }
+
+INSTANTIATE_TEST_SUITE_P(IsValidPropertyTest, IsValidProperty,
+ ScalarFunctionPropertyTestParam::Values({
+ {0, 0.0},
+ {1, 0.0},
+ {2, 0.0},
+ {1024, 0.25},
+ }));
+
+class IsNullProperty : public ScalarFunctionPropertyMixin {
+ public:
+  std::shared_ptr GetFunction() override {
+return internal::checked_pointer_cast(
+*GetFunctionRegistry()->GetFunction("is_null"));
+  }
+
+  Result> Contract(const ScalarVector& args,
+   const FunctionOptions*) override {
+return std::make_shared(!args[0]->is_valid);
+  }

[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440220181



##
File path: cpp/src/arrow/compute/kernels/scalar_validity_test.cc
##
@@ -0,0 +1,151 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include 
+
+#include "arrow/array.h"
+#include "arrow/compute/api.h"
+#include "arrow/compute/kernels/test_util.h"
+#include "arrow/testing/gtest_common.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/type.h"
+#include "arrow/type_traits.h"
+#include "arrow/util/bitmap_reader.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace compute {
+
+class TestValidityKernels : public ::testing::Test {
+ protected:
+  // XXX Since IsValid and IsNull don't touch any buffers but the null bitmap
+  // testing multiple types seems redundant.
+  using ArrowType = BooleanType;
+
+  using CType = typename ArrowType::c_type;
+
+  static std::shared_ptr type_singleton() {
+return TypeTraits::type_singleton();
+  }
+
+  void AssertUnary(Datum arg, Datum expected) {
+ASSERT_OK_AND_ASSIGN(auto actual, func_(arg, nullptr));
+ASSERT_EQ(actual.kind(), expected.kind());
+if (actual.kind() == Datum::ARRAY) {
+  ASSERT_OK(actual.make_array()->ValidateFull());
+  AssertArraysApproxEqual(*expected.make_array(), *actual.make_array());
+} else {
+  AssertScalarsEqual(*expected.scalar(), *actual.scalar());
+}
+  }
+
+  void AssertUnary(const std::string& arg_json, const std::string& 
expected_json) {
+AssertUnary(ArrayFromJSON(type_singleton(), arg_json),
+ArrayFromJSON(type_singleton(), expected_json));
+  }
+
+  using UnaryFunction = std::function(const Datum&, 
ExecContext*)>;
+  UnaryFunction func_;
+};
+
+TEST_F(TestValidityKernels, ArrayIsValid) {
+  func_ = arrow::compute::IsValid;
+
+  this->AssertUnary("[]", "[]");
+  this->AssertUnary("[null]", "[false]");
+  this->AssertUnary("[1]", "[true]");
+  this->AssertUnary("[null, 1, 0, null]", "[false, true, true, false]");
+}
+
+TEST_F(TestValidityKernels, ArrayIsValidBufferPassthruOptimization) {
+  Datum arg = ArrayFromJSON(boolean(), "[null, 1, 0, null]");
+  ASSERT_OK_AND_ASSIGN(auto validity, arrow::compute::IsValid(arg));
+  ASSERT_EQ(validity.array()->buffers[1], arg.array()->buffers[0]);
+}
+
+TEST_F(TestValidityKernels, ScalarIsValid) {
+  func_ = arrow::compute::IsValid;
+
+  AssertUnary(Datum(19.7), Datum(true));
+  AssertUnary(MakeNullScalar(float64()), Datum(false));
+}
+
+TEST_F(TestValidityKernels, ArrayIsNull) {
+  func_ = arrow::compute::IsNull;
+
+  this->AssertUnary("[]", "[]");
+  this->AssertUnary("[null]", "[true]");
+  this->AssertUnary("[1]", "[false]");
+  this->AssertUnary("[null, 1, 0, null]", "[true, false, false, true]");
+}
+
+TEST_F(TestValidityKernels, DISABLED_ScalarIsNull) {
+  func_ = arrow::compute::IsNull;
+
+  AssertUnary(Datum(19.7), Datum(false));
+  AssertUnary(MakeNullScalar(float64()), Datum(true));
+}
+
+class IsValidProperty : public ScalarFunctionPropertyMixin {
+ public:
+  std::shared_ptr GetFunction() override {
+return internal::checked_pointer_cast(
+*GetFunctionRegistry()->GetFunction("is_valid"));
+  }
+
+  Result> Contract(const ScalarVector& args,
+   const FunctionOptions*) override {
+return std::make_shared(args[0]->is_valid);
+  }
+};
+
+TEST_P(IsValidProperty, TestIsValidProperty) { Validate(); }
+
+INSTANTIATE_TEST_SUITE_P(IsValidPropertyTest, IsValidProperty,
+ ScalarFunctionPropertyTestParam::Values({
+ {0, 0.0},
+ {1, 0.0},
+ {2, 0.0},
+ {1024, 0.25},
+ }));
+
+class IsNullProperty : public ScalarFunctionPropertyMixin {
+ public:
+  std::shared_ptr GetFunction() override {
+return internal::checked_pointer_cast(
+*GetFunctionRegistry()->GetFunction("is_null"));
+  }
+
+  Result> Contract(const ScalarVector& args,
+   const FunctionOptions*) override {
+return std::make_shared(!args[0]->is_valid);
+  }

[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440206424



##
File path: cpp/src/arrow/compute/kernels/test_util.h
##
@@ -97,5 +99,63 @@ using TestingStringTypes =
 
 static constexpr random::SeedType kRandomSeed = 0x0ff1ce;
 
+struct ScalarFunctionPropertyTestParam {

Review comment:
   https://en.wikipedia.org/wiki/Property_testing





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440205745



##
File path: cpp/src/arrow/compute/kernels/scalar_validity_test.cc
##
@@ -0,0 +1,151 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include 
+
+#include "arrow/array.h"
+#include "arrow/compute/api.h"
+#include "arrow/compute/kernels/test_util.h"
+#include "arrow/testing/gtest_common.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/type.h"
+#include "arrow/type_traits.h"
+#include "arrow/util/bitmap_reader.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace compute {
+
+class TestValidityKernels : public ::testing::Test {
+ protected:
+  // XXX Since IsValid and IsNull don't touch any buffers but the null bitmap
+  // testing multiple types seems redundant.
+  using ArrowType = BooleanType;
+
+  using CType = typename ArrowType::c_type;
+
+  static std::shared_ptr type_singleton() {
+return TypeTraits::type_singleton();
+  }
+
+  void AssertUnary(Datum arg, Datum expected) {
+ASSERT_OK_AND_ASSIGN(auto actual, func_(arg, nullptr));
+ASSERT_EQ(actual.kind(), expected.kind());
+if (actual.kind() == Datum::ARRAY) {
+  ASSERT_OK(actual.make_array()->ValidateFull());
+  AssertArraysApproxEqual(*expected.make_array(), *actual.make_array());
+} else {
+  AssertScalarsEqual(*expected.scalar(), *actual.scalar());
+}
+  }
+
+  void AssertUnary(const std::string& arg_json, const std::string& 
expected_json) {
+AssertUnary(ArrayFromJSON(type_singleton(), arg_json),
+ArrayFromJSON(type_singleton(), expected_json));
+  }
+
+  using UnaryFunction = std::function(const Datum&, 
ExecContext*)>;
+  UnaryFunction func_;
+};
+
+TEST_F(TestValidityKernels, ArrayIsValid) {
+  func_ = arrow::compute::IsValid;
+
+  this->AssertUnary("[]", "[]");
+  this->AssertUnary("[null]", "[false]");
+  this->AssertUnary("[1]", "[true]");
+  this->AssertUnary("[null, 1, 0, null]", "[false, true, true, false]");
+}
+
+TEST_F(TestValidityKernels, ArrayIsValidBufferPassthruOptimization) {
+  Datum arg = ArrayFromJSON(boolean(), "[null, 1, 0, null]");
+  ASSERT_OK_AND_ASSIGN(auto validity, arrow::compute::IsValid(arg));
+  ASSERT_EQ(validity.array()->buffers[1], arg.array()->buffers[0]);
+}
+
+TEST_F(TestValidityKernels, ScalarIsValid) {
+  func_ = arrow::compute::IsValid;
+
+  AssertUnary(Datum(19.7), Datum(true));
+  AssertUnary(MakeNullScalar(float64()), Datum(false));
+}
+
+TEST_F(TestValidityKernels, ArrayIsNull) {
+  func_ = arrow::compute::IsNull;
+
+  this->AssertUnary("[]", "[]");
+  this->AssertUnary("[null]", "[true]");
+  this->AssertUnary("[1]", "[false]");
+  this->AssertUnary("[null, 1, 0, null]", "[true, false, false, true]");
+}
+
+TEST_F(TestValidityKernels, DISABLED_ScalarIsNull) {
+  func_ = arrow::compute::IsNull;
+
+  AssertUnary(Datum(19.7), Datum(false));
+  AssertUnary(MakeNullScalar(float64()), Datum(true));
+}
+
+class IsValidProperty : public ScalarFunctionPropertyMixin {
+ public:
+  std::shared_ptr GetFunction() override {
+return internal::checked_pointer_cast(
+*GetFunctionRegistry()->GetFunction("is_valid"));
+  }
+
+  Result> Contract(const ScalarVector& args,
+   const FunctionOptions*) override {
+return std::make_shared(args[0]->is_valid);
+  }
+};
+
+TEST_P(IsValidProperty, TestIsValidProperty) { Validate(); }
+
+INSTANTIATE_TEST_SUITE_P(IsValidPropertyTest, IsValidProperty,
+ ScalarFunctionPropertyTestParam::Values({
+ {0, 0.0},
+ {1, 0.0},
+ {2, 0.0},
+ {1024, 0.25},
+ }));
+
+class IsNullProperty : public ScalarFunctionPropertyMixin {
+ public:
+  std::shared_ptr GetFunction() override {
+return internal::checked_pointer_cast(
+*GetFunctionRegistry()->GetFunction("is_null"));
+  }
+
+  Result> Contract(const ScalarVector& args,
+   const FunctionOptions*) override {
+return std::make_shared(!args[0]->is_valid);
+  }

[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440202771



##
File path: cpp/src/arrow/compute/kernels/scalar_validity.cc
##
@@ -0,0 +1,97 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/kernels/common.h"
+
+#include "arrow/util/bit_util.h"
+#include "arrow/util/bitmap_ops.h"
+
+namespace arrow {
+namespace compute {
+
+struct IsValid {
+  static void Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+checked_cast(out)->value = in.is_valid;
+  }
+
+  static void Call(KernelContext* ctx, const ArrayData& arr, ArrayData* out) {
+if (arr.buffers[0] != nullptr && out->offset == arr.offset &&
+out->length == arr.length) {
+  out->buffers[1] = arr.buffers[0];
+  return;
+}

Review comment:
   My thought was that this optimization won't be available frequently 
enough to sacrifice the benefits of preallocated memory when zero copy isn't 
possible The intent here was to allow immediate reclamation of memory when we 
can determine we don't need it. I'll refactor to assume zero copy is usually 
possible





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440197092



##
File path: cpp/src/arrow/testing/random.h
##
@@ -250,6 +250,9 @@ class ARROW_EXPORT RandomArrayGenerator {
int32_t min_length, int32_t 
max_length,
double null_probability = 0);
 
+  std::shared_ptr Of(std::shared_ptr type, int64_t size,

Review comment:
   will do





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440185720



##
File path: cpp/src/arrow/compute/kernels/scalar_validity_test.cc
##
@@ -0,0 +1,151 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include 
+
+#include "arrow/array.h"
+#include "arrow/compute/api.h"
+#include "arrow/compute/kernels/test_util.h"
+#include "arrow/testing/gtest_common.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/type.h"
+#include "arrow/type_traits.h"
+#include "arrow/util/bitmap_reader.h"
+#include "arrow/util/checked_cast.h"
+
+namespace arrow {
+namespace compute {
+
+class TestValidityKernels : public ::testing::Test {
+ protected:
+  // XXX Since IsValid and IsNull don't touch any buffers but the null bitmap
+  // testing multiple types seems redundant.
+  using ArrowType = BooleanType;
+
+  using CType = typename ArrowType::c_type;
+
+  static std::shared_ptr type_singleton() {
+return TypeTraits::type_singleton();
+  }
+
+  void AssertUnary(Datum arg, Datum expected) {
+ASSERT_OK_AND_ASSIGN(auto actual, func_(arg, nullptr));
+ASSERT_EQ(actual.kind(), expected.kind());
+if (actual.kind() == Datum::ARRAY) {
+  ASSERT_OK(actual.make_array()->ValidateFull());
+  AssertArraysApproxEqual(*expected.make_array(), *actual.make_array());
+} else {
+  AssertScalarsEqual(*expected.scalar(), *actual.scalar());
+}
+  }
+
+  void AssertUnary(const std::string& arg_json, const std::string& 
expected_json) {
+AssertUnary(ArrayFromJSON(type_singleton(), arg_json),
+ArrayFromJSON(type_singleton(), expected_json));
+  }
+
+  using UnaryFunction = std::function(const Datum&, 
ExecContext*)>;
+  UnaryFunction func_;

Review comment:
   It's a bit less boilerplate but probably not enough to justify the 
different function. I'll remove it





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-15 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r440182319



##
File path: cpp/src/arrow/compute/kernels/scalar_validity.cc
##
@@ -0,0 +1,97 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "arrow/compute/kernels/common.h"
+
+#include "arrow/util/bit_util.h"
+#include "arrow/util/bitmap_ops.h"
+
+namespace arrow {
+namespace compute {
+
+struct IsValid {
+  static void Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+checked_cast(out)->value = in.is_valid;
+  }
+
+  static void Call(KernelContext* ctx, const ArrayData& arr, ArrayData* out) {
+if (arr.buffers[0] != nullptr && out->offset == arr.offset &&
+out->length == arr.length) {
+  out->buffers[1] = arr.buffers[0];
+  return;
+}
+
+if (arr.null_count == 0 || arr.buffers[0] == nullptr) {
+  BitUtil::SetBitsTo(out->buffers[1]->mutable_data(), out->offset, 
out->length, true);
+  return;
+}
+
+internal::CopyBitmap(arr.buffers[0]->data(), arr.offset, arr.length,
+ out->buffers[1]->mutable_data(), out->offset);
+  }
+};
+
+struct IsNull {
+  static void Call(KernelContext* ctx, const Scalar& in, Scalar* out) {
+checked_cast(out)->value = !in.is_valid;
+  }
+
+  static void Call(KernelContext* ctx, const ArrayData& arr, ArrayData* out) {
+if (arr.null_count == 0 || arr.buffers[0] == nullptr) {
+  BitUtil::SetBitsTo(out->buffers[1]->mutable_data(), out->offset, 
out->length,
+ false);
+  return;
+}
+
+internal::InvertBitmap(arr.buffers[0]->data(), arr.offset, arr.length,
+   out->buffers[1]->mutable_data(), out->offset);
+  }
+};
+
+namespace codegen {
+
+void MakeFunction(std::string name, std::vector in_types, 
OutputType out_type,
+  ArrayKernelExec exec, FunctionRegistry* registry,
+  NullHandling::type null_handling) {
+  Arity arity{static_cast(in_types.size())};
+  auto func = std::make_shared(name, arity);
+
+  ScalarKernel kernel(std::move(in_types), out_type, exec);
+  kernel.null_handling = null_handling;
+  kernel.can_write_into_slices = true;
+
+  DCHECK_OK(func->AddKernel(std::move(kernel)));
+  DCHECK_OK(registry->AddFunction(std::move(func)));
+}
+
+}  // namespace codegen

Review comment:
   okay





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-12 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r439696674



##
File path: cpp/src/arrow/compute/kernels/codegen_internal.h
##
@@ -181,8 +180,7 @@ struct GetOutputType> {
 };
 
 template 
-struct GetOutputType<
-Type, enable_if_t::value>> {
+struct GetOutputType::value>> {
   using T = std::string;
 };

Review comment:
   https://issues.apache.org/jira/browse/ARROW-9120





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-12 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r439676908



##
File path: cpp/src/arrow/compute/kernels/codegen_internal.h
##
@@ -181,8 +180,7 @@ struct GetOutputType> {
 };
 
 template 
-struct GetOutputType<
-Type, enable_if_t::value>> {
+struct GetOutputType::value>> {
   using T = std::string;
 };

Review comment:
   @wesm found it: `codegen**_internal**.h` is excluded from the `format` 
target, so it hasn't been touched by clang-format until now. We need a JIRA for 
this





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-12 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r439445910



##
File path: cpp/src/arrow/compute/kernels/codegen_internal.h
##
@@ -181,8 +180,7 @@ struct GetOutputType> {
 };
 
 template 
-struct GetOutputType<
-Type, enable_if_t::value>> {
+struct GetOutputType::value>> {
   using T = std::string;
 };

Review comment:
   I don't know; very odd. It appears to have only affected this file. 
`clang-format-8` is being used both locally and in CI





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-12 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r439442386



##
File path: cpp/src/arrow/compute/kernels/test_util.h
##
@@ -33,7 +33,9 @@
 #include "arrow/testing/random.h"
 #include "arrow/testing/util.h"
 #include "arrow/type.h"
+#include "arrow/util/iterator.h"

Review comment:
   ```suggestion
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-12 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r439442116



##
File path: cpp/src/arrow/util/iterator_test.cc
##
@@ -27,6 +27,9 @@
 #include 
 #include 
 
+#include 
+#include 
+

Review comment:
   ```suggestion
   ```
   no longer in use





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] bkietz commented on a change in pull request #7410: ARROW-971: [C++][Compute] IsValid, IsNull kernels

2020-06-12 Thread GitBox


bkietz commented on a change in pull request #7410:
URL: https://github.com/apache/arrow/pull/7410#discussion_r439440106



##
File path: cpp/src/arrow/compute/exec.cc
##
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 

Review comment:
   ```suggestion
   ```
   debug code





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org