[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361882#comment-16361882
 ] 

ASF GitHub Bot commented on ARROW-969:
--

trxcllnt commented on issue #1574: ARROW-969: [C++] Add add/remove field 
functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#issuecomment-365163973
 
 
   @wesm looks like noise, but ping me if you see it again.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361874#comment-16361874
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on issue #1574: ARROW-969: [C++] Add add/remove field functions 
for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#issuecomment-365162629
 
 
   Merging, thanks @xuepanchen! @trxcllnt @TheNeuralBit odd unrelated JS 
failure here, in case you want to have a look


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361876#comment-16361876
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm closed pull request #1574: ARROW-969: [C++] Add add/remove field functions 
for RecordBatch
URL: https://github.com/apache/arrow/pull/1574
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/cpp/src/arrow/record_batch.cc b/cpp/src/arrow/record_batch.cc
index d418cc4a2..f295b864c 100644
--- a/cpp/src/arrow/record_batch.cc
+++ b/cpp/src/arrow/record_batch.cc
@@ -21,15 +21,24 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "arrow/array.h"
 #include "arrow/status.h"
 #include "arrow/type.h"
 #include "arrow/util/logging.h"
+#include "arrow/util/stl.h"
 
 namespace arrow {
 
+Status RecordBatch::AddColumn(int i, const std::string& field_name,
+  const std::shared_ptr& column,
+  std::shared_ptr* out) const {
+  auto field = ::arrow::field(field_name, column->type());
+  return AddColumn(i, field, column, out);
+}
+
 /// \class SimpleRecordBatch
 /// \brief A basic, non-lazy in-memory record batch
 class SimpleRecordBatch : public RecordBatch {
@@ -78,6 +87,42 @@ class SimpleRecordBatch : public RecordBatch {
 
   std::shared_ptr column_data(int i) const override { return 
columns_[i]; }
 
+  Status AddColumn(int i, const std::shared_ptr& field,
+   const std::shared_ptr& column,
+   std::shared_ptr* out) const override {
+DCHECK(field != nullptr);
+DCHECK(column != nullptr);
+
+if (!field->type()->Equals(column->type())) {
+  std::stringstream ss;
+  ss << "Column data type " << field->type()->name()
+ << " does not match field data type " << column->type()->name();
+  return Status::Invalid(ss.str());
+}
+if (column->length() != num_rows_) {
+  std::stringstream ss;
+  ss << "Added column's length must match record batch's length. Expected 
length "
+ << num_rows_ << " but got length " << column->length();
+  return Status::Invalid(ss.str());
+}
+
+std::shared_ptr new_schema;
+RETURN_NOT_OK(schema_->AddField(i, field, &new_schema));
+
+*out = RecordBatch::Make(new_schema, num_rows_,
+ internal::AddVectorElement(columns_, i, 
column->data()));
+return Status::OK();
+  }
+
+  Status RemoveColumn(int i, std::shared_ptr* out) const override 
{
+std::shared_ptr new_schema;
+RETURN_NOT_OK(schema_->RemoveField(i, &new_schema));
+
+*out = RecordBatch::Make(new_schema, num_rows_,
+ internal::DeleteVectorElement(columns_, i));
+return Status::OK();
+  }
+
   std::shared_ptr ReplaceSchemaMetadata(
   const std::shared_ptr& metadata) const override {
 auto new_schema = schema_->AddMetadata(metadata);
diff --git a/cpp/src/arrow/record_batch.h b/cpp/src/arrow/record_batch.h
index b2c4c76b3..6fb747c40 100644
--- a/cpp/src/arrow/record_batch.h
+++ b/cpp/src/arrow/record_batch.h
@@ -96,6 +96,35 @@ class ARROW_EXPORT RecordBatch {
   /// \return an internal ArrayData object
   virtual std::shared_ptr column_data(int i) const = 0;
 
+  /// \brief Add column to the record batch, producing a new RecordBatch
+  ///
+  /// \param[in] i field index, which will be boundschecked
+  /// \param[in] field field to be added
+  /// \param[in] column column to be added
+  /// \param[out] out record batch with column added
+  virtual Status AddColumn(int i, const std::shared_ptr& field,
+   const std::shared_ptr& column,
+   std::shared_ptr* out) const = 0;
+
+  /// \brief Add new nullable column to the record batch, producing a new
+  /// RecordBatch.
+  ///
+  /// For non-nullable columns, use the Field-based version of this method.
+  ///
+  /// \param[in] i field index, which will be boundschecked
+  /// \param[in] field_name name of field to be added
+  /// \param[in] column column to be added
+  /// \param[out] out record batch with column added
+  virtual Status AddColumn(int i, const std::string& field_name,
+   const std::shared_ptr& column,
+   std::shared_ptr* out) const;
+
+  /// \brief Remove column from the record batch, producing a new RecordBatch
+  ///
+  /// \param[in] i field index, does boundscheck
+  /// \param[out] out record batch with column removed
+  virtual Status RemoveColumn(int i, std::shared_ptr* out) const 
= 0;
+
   virtual std::shared_ptr ReplaceSchemaMetadata(
   const std::shared_ptr& metadata) const = 0;
 
diff --git a/cpp/src/arrow/table-test.cc b/cpp/src/

[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357439#comment-16357439
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on a change in pull request #1574: ARROW-969: [C++] Add 
add/remove field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#discussion_r167038477
 
 

 ##
 File path: cpp/src/arrow/record_batch.cc
 ##
 @@ -78,6 +79,52 @@ class SimpleRecordBatch : public RecordBatch {
 
   std::shared_ptr column_data(int i) const override { return 
columns_[i]; }
 
+  Status AddColumn(int i, const std::shared_ptr& field,
+   const std::shared_ptr& column,
+   std::shared_ptr* out) const override {
+if (i < 0 || i > num_columns() + 1) {
+  return Status::Invalid("Invalid column index");
+}
+if (field == nullptr) {
+  std::stringstream ss;
+  ss << "Field " << i << " was null";
+  return Status::Invalid(ss.str());
+}
+if (column == nullptr) {
+  std::stringstream ss;
+  ss << "Column " << i << " was null";
+  return Status::Invalid(ss.str());
+}
 
 Review comment:
   I took a look at `SimpleTable::AddColumn`; there `col` is being null-checked 
-- I think that should also be a DCHECK


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357438#comment-16357438
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on a change in pull request #1574: ARROW-969: [C++] Add 
add/remove field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#discussion_r167037896
 
 

 ##
 File path: cpp/src/arrow/table-test.cc
 ##
 @@ -588,6 +588,101 @@ TEST_F(TestRecordBatch, Slice) {
   }
 }
 
+TEST_F(TestRecordBatch, AddColumn) {
+  const int length = 10;
+
+  auto field1 = field("f1", int32());
+  auto field2 = field("f2", uint8());
+  auto field3 = field("f3", int16());
+
+  auto schema1 = ::arrow::schema({field1, field2});
+  auto schema2 = ::arrow::schema({field2, field3});
+  auto schema3 = ::arrow::schema({field2});
+
+  auto array1 = MakeRandomArray(length);
+  auto array2 = MakeRandomArray(length);
+  auto array3 = MakeRandomArray(length);
+
+  auto batch1 = RecordBatch::Make(schema1, length, {array1, array2});
+  auto batch2 = RecordBatch::Make(schema2, length, {array2, array3});
+  auto batch3 = RecordBatch::Make(schema3, length, {array2});
+
+  const RecordBatch& batch = *batch3;
+  std::shared_ptr result;
+
+  // Negative tests with invalid index
+  Status status = batch.AddColumn(5, field1, array1->data(), &result);
 
 Review comment:
   Add a test for `batch.AddColumn(2, ...)` to address the edge case in the 
implementation. We probably need a corresponding test for `Table` (and maybe 
also `Schema`). 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357435#comment-16357435
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on a change in pull request #1574: ARROW-969: [C++] Add 
add/remove field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#discussion_r167037500
 
 

 ##
 File path: cpp/src/arrow/record_batch.h
 ##
 @@ -96,6 +96,14 @@ class ARROW_EXPORT RecordBatch {
   /// \return an internal ArrayData object
   virtual std::shared_ptr column_data(int i) const = 0;
 
+  /// \brief Add column to the record batch, producing a new RecordBatch
+  virtual Status AddColumn(int i, const std::shared_ptr& field,
+   const std::shared_ptr& column,
+   std::shared_ptr* out) const = 0;
+
+  /// \brief Remove column from the record batch, producing a new RecordBatch
+  virtual Status RemoveColumn(int i, std::shared_ptr* out) const 
= 0;
 
 Review comment:
   Can you document the parameters for these functions?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357432#comment-16357432
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on a change in pull request #1574: ARROW-969: [C++] Add 
add/remove field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#discussion_r167036916
 
 

 ##
 File path: cpp/src/arrow/record_batch.cc
 ##
 @@ -78,6 +79,52 @@ class SimpleRecordBatch : public RecordBatch {
 
   std::shared_ptr column_data(int i) const override { return 
columns_[i]; }
 
+  Status AddColumn(int i, const std::shared_ptr& field,
+   const std::shared_ptr& column,
+   std::shared_ptr* out) const override {
+if (i < 0 || i > num_columns() + 1) {
+  return Status::Invalid("Invalid column index");
+}
+if (field == nullptr) {
+  std::stringstream ss;
+  ss << "Field " << i << " was null";
+  return Status::Invalid(ss.str());
+}
+if (column == nullptr) {
+  std::stringstream ss;
+  ss << "Column " << i << " was null";
+  return Status::Invalid(ss.str());
+}
 
 Review comment:
   I think these should both be `DCHECK`, since null would indicate a problem 
with application logic, so should be a "can't fail"


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357431#comment-16357431
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on a change in pull request #1574: ARROW-969: [C++] Add 
add/remove field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#discussion_r167036590
 
 

 ##
 File path: cpp/src/arrow/record_batch.cc
 ##
 @@ -78,6 +79,52 @@ class SimpleRecordBatch : public RecordBatch {
 
   std::shared_ptr column_data(int i) const override { return 
columns_[i]; }
 
+  Status AddColumn(int i, const std::shared_ptr& field,
+   const std::shared_ptr& column,
+   std::shared_ptr* out) const override {
+if (i < 0 || i > num_columns() + 1) {
 
 Review comment:
   I think this should be `i > num_columns()`. This is also a bug in 
`SimpleTable::AddColumn`. Can you add a test where `i == num_columns()`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357436#comment-16357436
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on a change in pull request #1574: ARROW-969: [C++] Add 
add/remove field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#discussion_r167037985
 
 

 ##
 File path: cpp/src/arrow/table-test.cc
 ##
 @@ -588,6 +588,101 @@ TEST_F(TestRecordBatch, Slice) {
   }
 }
 
+TEST_F(TestRecordBatch, AddColumn) {
+  const int length = 10;
+
+  auto field1 = field("f1", int32());
+  auto field2 = field("f2", uint8());
+  auto field3 = field("f3", int16());
+
+  auto schema1 = ::arrow::schema({field1, field2});
+  auto schema2 = ::arrow::schema({field2, field3});
+  auto schema3 = ::arrow::schema({field2});
+
+  auto array1 = MakeRandomArray(length);
+  auto array2 = MakeRandomArray(length);
+  auto array3 = MakeRandomArray(length);
+
+  auto batch1 = RecordBatch::Make(schema1, length, {array1, array2});
+  auto batch2 = RecordBatch::Make(schema2, length, {array2, array3});
+  auto batch3 = RecordBatch::Make(schema3, length, {array2});
+
+  const RecordBatch& batch = *batch3;
+  std::shared_ptr result;
+
+  // Negative tests with invalid index
+  Status status = batch.AddColumn(5, field1, array1->data(), &result);
+  ASSERT_TRUE(status.IsInvalid());
+  status = batch.AddColumn(-1, field1, array1->data(), &result);
+  ASSERT_TRUE(status.IsInvalid());
+
+  // Negative test with wrong length
+  auto longer_col = MakeRandomArray(length + 1);
+  status = batch.AddColumn(0, field1, longer_col->data(), &result);
+  ASSERT_TRUE(status.IsInvalid());
+
+  // Negative test with mismatch type
+  status = batch.AddColumn(0, field1, array2->data(), &result);
+  ASSERT_TRUE(status.IsInvalid());
+
+  ASSERT_OK(batch.AddColumn(0, field1, array1->data(), &result));
+  ASSERT_TRUE(result->Equals(*batch1));
+
+  ASSERT_OK(batch.AddColumn(1, field3, array3->data(), &result));
+  ASSERT_TRUE(result->Equals(*batch2));
+}
+
+TEST_F(TestRecordBatch, RemoveColumn) {
+  const int length = 10;
+
+  auto field1 = field("f1", int32());
+  auto field2 = field("f2", uint8());
+  auto field3 = field("f3", int16());
+
+  auto schema1 = ::arrow::schema({field1, field2, field3});
+  auto schema2 = ::arrow::schema({field2, field3});
+  auto schema3 = ::arrow::schema({field1, field3});
+  auto schema4 = ::arrow::schema({field1, field2});
+
+  auto array1 = MakeRandomArray(length);
+  auto array2 = MakeRandomArray(length);
+  auto array3 = MakeRandomArray(length);
+
+  auto batch1 = RecordBatch::Make(schema1, length, {array1, array2, array3});
+  auto batch2 = RecordBatch::Make(schema2, length, {array2, array3});
+  auto batch3 = RecordBatch::Make(schema3, length, {array1, array3});
+  auto batch4 = RecordBatch::Make(schema4, length, {array1, array2});
+
+  const RecordBatch& batch = *batch1;
+  std::shared_ptr result;
+
+  ASSERT_OK(batch.RemoveColumn(0, &result));
+  ASSERT_TRUE(result->Equals(*batch2));
+
+  ASSERT_OK(batch.RemoveColumn(1, &result));
+  ASSERT_TRUE(result->Equals(*batch3));
+
+  ASSERT_OK(batch.RemoveColumn(2, &result));
+  ASSERT_TRUE(result->Equals(*batch4));
 
 Review comment:
   Add a test for removing an out of bounds index


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357437#comment-16357437
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on a change in pull request #1574: ARROW-969: [C++] Add 
add/remove field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#discussion_r167037093
 
 

 ##
 File path: cpp/src/arrow/record_batch.cc
 ##
 @@ -78,6 +79,52 @@ class SimpleRecordBatch : public RecordBatch {
 
   std::shared_ptr column_data(int i) const override { return 
columns_[i]; }
 
+  Status AddColumn(int i, const std::shared_ptr& field,
+   const std::shared_ptr& column,
 
 Review comment:
   Pass `Array` here instead, since that's more likely to be what the user has 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357434#comment-16357434
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on a change in pull request #1574: ARROW-969: [C++] Add 
add/remove field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#discussion_r167037375
 
 

 ##
 File path: cpp/src/arrow/record_batch.cc
 ##
 @@ -78,6 +79,52 @@ class SimpleRecordBatch : public RecordBatch {
 
   std::shared_ptr column_data(int i) const override { return 
columns_[i]; }
 
+  Status AddColumn(int i, const std::shared_ptr& field,
+   const std::shared_ptr& column,
+   std::shared_ptr* out) const override {
+if (i < 0 || i > num_columns() + 1) {
+  return Status::Invalid("Invalid column index");
+}
+if (field == nullptr) {
+  std::stringstream ss;
+  ss << "Field " << i << " was null";
+  return Status::Invalid(ss.str());
+}
+if (column == nullptr) {
+  std::stringstream ss;
+  ss << "Column " << i << " was null";
+  return Status::Invalid(ss.str());
+}
+if (!field->type()->Equals(column->type)) {
+  std::stringstream ss;
+  ss << "Column data type " << field->type()->name()
+ << " does not match field data type " << column->type->name();
+  return Status::Invalid(ss.str());
+}
+if (column->length != num_rows_) {
+  std::stringstream ss;
+  ss << "Added column's length must match record batch's length. Expected 
length "
+ << num_rows_ << " but got length " << column->length;
+  return Status::Invalid(ss.str());
+}
+
+std::shared_ptr new_schema;
+RETURN_NOT_OK(schema_->AddField(i, field, &new_schema));
 
 Review comment:
   We could leave the boundschecking above to `Schema::AddField` -- could you 
also check whether that function has the issues described above?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355969#comment-16355969
 ] 

ASF GitHub Bot commented on ARROW-969:
--

wesm commented on issue #1574: ARROW-969: [C++] Add add/remove field functions 
for RecordBatch
URL: https://github.com/apache/arrow/pull/1574#issuecomment-363890696
 
 
   Will review when I can, thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-969) [C++/Python] Add add/remove field functions for RecordBatch

2018-02-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355847#comment-16355847
 ] 

ASF GitHub Bot commented on ARROW-969:
--

xuepanchen opened a new pull request #1574: ARROW-969: [C++] Add add/remove 
field functions for RecordBatch
URL: https://github.com/apache/arrow/pull/1574
 
 
   Add AddColumn and RemoveColumn methods for RecordBatch, as well as test cases


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++/Python] Add add/remove field functions for RecordBatch
> ---
>
> Key: ARROW-969
> URL: https://issues.apache.org/jira/browse/ARROW-969
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++, Python
>Reporter: Wes McKinney
>Assignee: Panchen Xue
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Analogous to the Table equivalents



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)