[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16430786#comment-16430786
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on issue #1784: ARROW-2328: [C++] Fixed and unit tested 
feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#issuecomment-379806635
 
 
   Thank you! I will merge once the AppVeyor build passes (the Travis-CI 
failures in the Rust and glib builds are unrelated).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16430523#comment-16430523
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r180093019
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -223,15 +223,17 @@ Status MakeRandomBinaryArray(int64_t length, bool 
include_nulls, MemoryPool* poo
 if (include_nulls && values_index == 0) {
   RETURN_NOT_OK(builder.AppendNull());
 } else {
-  const std::string& value = values[values_index];
+  const std::string value =
+  i < int64_t(values.size()) ? values[values_index] : 
std::to_string(i);
 
 Review comment:
   Not knowing the history of this particular function, I don't know what would 
be "better" in there. For my test I pretty much just want the consecutive 
numbers, otherwise it is very difficult to see what has gone wrong (I like the 
tests to give me the answer to that question if possible). I made a change to 
add a second function that implements that, but turns out this function is only 
used from MakeStringTypesRecordBatch, so we can't really have two 
implementations.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16430304#comment-16430304
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r180034330
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -223,15 +223,17 @@ Status MakeRandomBinaryArray(int64_t length, bool 
include_nulls, MemoryPool* poo
 if (include_nulls && values_index == 0) {
   RETURN_NOT_OK(builder.AppendNull());
 } else {
-  const std::string& value = values[values_index];
+  const std::string value =
+  i < int64_t(values.size()) ? values[values_index] : 
std::to_string(i);
 
 Review comment:
   I'd rather revert this change. If we want better random generation of binary 
arrays we should do it more thoroughly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16430299#comment-16430299
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r180032970
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -223,15 +223,17 @@ Status MakeRandomBinaryArray(int64_t length, bool 
include_nulls, MemoryPool* poo
 if (include_nulls && values_index == 0) {
   RETURN_NOT_OK(builder.AppendNull());
 } else {
-  const std::string& value = values[values_index];
+  const std::string value =
+  i < int64_t(values.size()) ? values[values_index] : 
std::to_string(i);
 
 Review comment:
   The original random strings were not very random, just a repetition of the 
same few strings. I admit the new ones aren't very random either, in particular 
there are no repetitions which you would expect in a in real data. I probably 
should have added a new method (MakeRangeBinaryArray?).
   Of course you don't want true random data either in a test.
   If I revert the change I wouldn't be confident that my tests are still 
testing anything, the repetition might just align in a lucky pattern.
   You ok with leaving it, I think it is better than the previous code?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16430241#comment-16430241
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r180016562
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -223,15 +223,17 @@ Status MakeRandomBinaryArray(int64_t length, bool 
include_nulls, MemoryPool* poo
 if (include_nulls && values_index == 0) {
   RETURN_NOT_OK(builder.AppendNull());
 } else {
-  const std::string& value = values[values_index];
+  const std::string value =
+  i < int64_t(values.size()) ? values[values_index] : 
std::to_string(i);
 
 Review comment:
   Why did you need this change? Is this just a debugging leftover?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429308#comment-16429308
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179914221
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,89 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+class TestTableWriterSlice : public TestTableWriter,
+ public 
::testing::WithParamInterface> {
+};
+
+TEST_P(TestTableWriterSlice, SliceRoundTrip) {
+  auto p = GetParam();
+  auto start = std::get<0>(p);
+  auto size = std::get<1>(p);
+
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(start * 2, ));
+  batch = batch->Slice(start, size);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_P(TestTableWriterSlice, SliceStringsRoundTrip) {
+  auto p = GetParam();
+  auto start = std::get<0>(p);
+  auto size = std::get<1>(p);
+  auto with_nulls = start % 2 == 0;
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatch(, with_nulls));
+  batch = batch->Slice(start, size);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_P(TestTableWriterSlice, SliceBooleanRoundTrip) {
+  auto p = GetParam();
+  auto start = std::get<0>(p);
+  auto size = std::get<1>(p);
+  std::shared_ptr batch;
+  ASSERT_OK(MakeBooleanBatchSized(600, ));
+  batch = batch->Slice(start, size);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+INSTANTIATE_TEST_CASE_P(
+TestTableWriterSliceOffsets, TestTableWriterSlice,
+::testing::Values(std::make_tuple(300, 30), std::make_tuple(301, 30),
 
 Review comment:
   Yes it does, yesterday I was just too lazy to find out how.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429306#comment-16429306
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179914007
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -244,18 +246,22 @@ Status 
MakeStringTypesRecordBatch(std::shared_ptr* out) {
 
   // Quirk with RETURN_NOT_OK macro and templated functions
   {
-auto s = MakeRandomBinaryArray(length, true, pool, 
);
+auto s = MakeRandomBinaryArray(length, with_nulls, 
pool, );
 RETURN_NOT_OK(s);
   }
 
   {
-auto s = MakeRandomBinaryArray(length, true, pool, 
);
+auto s = MakeRandomBinaryArray(length, with_nulls, 
pool, );
 RETURN_NOT_OK(s);
   }
   *out = RecordBatch::Make(schema, length, {a0, a1});
   return Status::OK();
 }
 
+Status MakeStringTypesRecordBatchWithoutNulls(std::shared_ptr* 
out) {
 
 Review comment:
   Oops.. I'll rename it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429305#comment-16429305
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179913956
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,89 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+class TestTableWriterSlice : public TestTableWriter,
+ public 
::testing::WithParamInterface> {
+};
+
+TEST_P(TestTableWriterSlice, SliceRoundTrip) {
+  auto p = GetParam();
+  auto start = std::get<0>(p);
+  auto size = std::get<1>(p);
+
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(start * 2, ));
+  batch = batch->Slice(start, size);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
 
 Review comment:
   You are right the duplication can be removed without adding any complexity.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429296#comment-16429296
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179913300
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -244,18 +246,22 @@ Status 
MakeStringTypesRecordBatch(std::shared_ptr* out) {
 
   // Quirk with RETURN_NOT_OK macro and templated functions
   {
-auto s = MakeRandomBinaryArray(length, true, pool, 
);
+auto s = MakeRandomBinaryArray(length, with_nulls, 
pool, );
 RETURN_NOT_OK(s);
   }
 
   {
-auto s = MakeRandomBinaryArray(length, true, pool, 
);
+auto s = MakeRandomBinaryArray(length, with_nulls, 
pool, );
 RETURN_NOT_OK(s);
   }
   *out = RecordBatch::Make(schema, length, {a0, a1});
   return Status::OK();
 }
 
+Status MakeStringTypesRecordBatchWithoutNulls(std::shared_ptr* 
out) {
 
 Review comment:
   Why is it called "...WithoutNulls" if it passes `with_nulls = true`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429295#comment-16429295
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179913262
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,89 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+class TestTableWriterSlice : public TestTableWriter,
+ public 
::testing::WithParamInterface> {
+};
+
+TEST_P(TestTableWriterSlice, SliceRoundTrip) {
+  auto p = GetParam();
+  auto start = std::get<0>(p);
+  auto size = std::get<1>(p);
+
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(start * 2, ));
+  batch = batch->Slice(start, size);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
 
 Review comment:
   This part and below is still duplicated in the other two test cases. Do you 
think you can factor that out, for example as a template function?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429294#comment-16429294
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179913230
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,89 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+class TestTableWriterSlice : public TestTableWriter,
+ public 
::testing::WithParamInterface> {
+};
+
+TEST_P(TestTableWriterSlice, SliceRoundTrip) {
+  auto p = GetParam();
+  auto start = std::get<0>(p);
+  auto size = std::get<1>(p);
+
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(start * 2, ));
+  batch = batch->Slice(start, size);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_P(TestTableWriterSlice, SliceStringsRoundTrip) {
+  auto p = GetParam();
+  auto start = std::get<0>(p);
+  auto size = std::get<1>(p);
+  auto with_nulls = start % 2 == 0;
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatch(, with_nulls));
+  batch = batch->Slice(start, size);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_P(TestTableWriterSlice, SliceBooleanRoundTrip) {
+  auto p = GetParam();
+  auto start = std::get<0>(p);
+  auto size = std::get<1>(p);
+  std::shared_ptr batch;
+  ASSERT_OK(MakeBooleanBatchSized(600, ));
+  batch = batch->Slice(start, size);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+INSTANTIATE_TEST_CASE_P(
+TestTableWriterSliceOffsets, TestTableWriterSlice,
+::testing::Values(std::make_tuple(300, 30), std::make_tuple(301, 30),
 
 Review comment:
   Hmm... Does gtest allow making a Cartesian product here?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428677#comment-16428677
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on issue #1784: ARROW-2328: [C++] Fixed and unit tested 
feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#issuecomment-379329455
 
 
   Thanks for the review comments. I’ve rewritten the tests as parameter unit 
tests and did most of the rest. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428613#comment-16428613
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179819483
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,125 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+TEST_F(TestTableWriter, SliceRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(300, ));
+  batch = batch->Slice(100, 100);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, false));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsWithNullsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, true));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceAtNonEightOffsetStringsWithNullsRoundTrip) {
 
 Review comment:
   Cool :-) Though I don't think using a loop is a big deal either...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428592#comment-16428592
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179816648
 
 

 ##
 File path: cpp/src/arrow/ipc/feather.cc
 ##
 @@ -75,6 +75,43 @@ static Status WritePadded(io::OutputStream* stream, const 
uint8_t* data, int64_t
   return Status::OK();
 }
 
+static Status WritePaddedWithOffset(io::OutputStream* stream, const uint8_t* 
data,
+int64_t bit_offset, const int64_t length,
+int64_t* bytes_written) {
+  data = data + bit_offset / 8;
+  uint8_t bit_shift = static_cast(bit_offset % 8);
 
 Review comment:
   Not really, but I think I got some warning, maybe with clang.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428582#comment-16428582
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179815275
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,125 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+TEST_F(TestTableWriter, SliceRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(300, ));
+  batch = batch->Slice(100, 100);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, false));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsWithNullsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, true));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceAtNonEightOffsetStringsWithNullsRoundTrip) {
 
 Review comment:
   I think the problem with loops in unit tests is that you only see the first 
failure. That said I should be able to convert these tests to one parameterised 
test, which hopefully also answers the above concerns..


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428577#comment-16428577
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179814535
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,125 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+TEST_F(TestTableWriter, SliceRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(300, ));
+  batch = batch->Slice(100, 100);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, false));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
 
 Review comment:
   Yes(ish) It helps give a helpfull message when the test fails.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428576#comment-16428576
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1784: ARROW-2328: [C++] Fixed 
and unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179814300
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -223,15 +223,17 @@ Status MakeRandomBinaryArray(int64_t length, bool 
include_nulls, MemoryPool* poo
 if (include_nulls && values_index == 0) {
   RETURN_NOT_OK(builder.AppendNull());
 } else {
-  const std::string& value = values[values_index];
+  const std::string value =
+  i < int64_t(values.size()) ? values[values_index] : 
std::to_string(i);
   RETURN_NOT_OK(builder.Append(reinterpret_cast(value.data()),
static_cast(value.size(;
 }
   }
   return builder.Finish(out);
 }
 
-Status MakeStringTypesRecordBatch(std::shared_ptr* out) {
+Status MakeStringTypesRecordBatchWithNulls(std::shared_ptr* out,
 
 Review comment:
   Adding the default argument didn't work, because MakeStringTypesRecordBatch 
is used as a function pointer in other tests. I'll rename is to 
MakeStringTypesRecordBatchWithoutNulls abd this one to 
MakeStringTypesRecordBatch.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427024#comment-16427024
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179482956
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -223,15 +223,17 @@ Status MakeRandomBinaryArray(int64_t length, bool 
include_nulls, MemoryPool* poo
 if (include_nulls && values_index == 0) {
   RETURN_NOT_OK(builder.AppendNull());
 } else {
-  const std::string& value = values[values_index];
+  const std::string value =
+  i < int64_t(values.size()) ? values[values_index] : 
std::to_string(i);
   RETURN_NOT_OK(builder.Append(reinterpret_cast(value.data()),
static_cast(value.size(;
 }
   }
   return builder.Finish(out);
 }
 
-Status MakeStringTypesRecordBatch(std::shared_ptr* out) {
+Status MakeStringTypesRecordBatchWithNulls(std::shared_ptr* out,
 
 Review comment:
   Or simply add a `bool with_nulls = true` optional argument to 
`MakeStringTypesRecordBatch`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427020#comment-16427020
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179481731
 
 

 ##
 File path: cpp/src/arrow/ipc/feather.cc
 ##
 @@ -558,15 +595,18 @@ class TableWriter::TableWriterImpl : public ArrayVisitor 
{
   // byte boundary, and we write this much data into the stream
 
 Review comment:
   This comment needs updating.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427009#comment-16427009
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179479281
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,125 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+TEST_F(TestTableWriter, SliceRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(300, ));
+  batch = batch->Slice(100, 100);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, false));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsWithNullsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, true));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceAtNonEightOffsetStringsWithNullsRoundTrip) {
 
 Review comment:
   There's no need writing independent methods here, just iterate over 
different slice starts / lengths.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427005#comment-16427005
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179478971
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,125 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+TEST_F(TestTableWriter, SliceRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(300, ));
+  batch = batch->Slice(100, 100);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, false));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  SCOPED_TRACE(col->data()->chunk(0)->ToString() + "\n" + 
batch->column(0)->ToString());
 
 Review comment:
   Is this required?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427006#comment-16427006
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179478878
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,125 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+TEST_F(TestTableWriter, SliceRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(300, ));
+  batch = batch->Slice(100, 100);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, false));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
 
 Review comment:
   There's a lot of duplicate code here with `SliceRoundTrip`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427004#comment-16427004
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179478878
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,125 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+TEST_F(TestTableWriter, SliceRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(300, ));
+  batch = batch->Slice(100, 100);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
+  ASSERT_OK(writer_->Append("f1", *batch->column(1)));
+  Finish();
+
+  std::shared_ptr col;
+  ASSERT_OK(reader_->GetColumn(0, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(0)));
+  ASSERT_EQ("f0", col->name());
+
+  ASSERT_OK(reader_->GetColumn(1, ));
+  ASSERT_TRUE(col->data()->chunk(0)->Equals(batch->column(1)));
+  ASSERT_EQ("f1", col->name());
+}
+
+TEST_F(TestTableWriter, SliceStringsRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeStringTypesRecordBatchWithNulls(, false));
+  batch = batch->Slice(320, 30);
+
+  ASSERT_OK(writer_->Append("f0", *batch->column(0)));
 
 Review comment:
   There's a lot of duplicate here with `SliceRoundTrip`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426997#comment-16426997
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179478300
 
 

 ##
 File path: cpp/src/arrow/ipc/feather-test.cc
 ##
 @@ -406,6 +406,125 @@ TEST_F(TestTableWriter, PrimitiveNullRoundTrip) {
   }
 }
 
+TEST_F(TestTableWriter, SliceRoundTrip) {
+  std::shared_ptr batch;
+  ASSERT_OK(MakeIntBatchSized(300, ));
+  batch = batch->Slice(100, 100);
 
 Review comment:
   It would be nice if you could iterate over a couple of different slice 
starts / lengths, especially non-multiples of 8.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426995#comment-16426995
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

pitrou commented on a change in pull request #1784: ARROW-2328: [C++] Fixed and 
unit tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#discussion_r179477752
 
 

 ##
 File path: cpp/src/arrow/ipc/test-common.h
 ##
 @@ -223,15 +223,17 @@ Status MakeRandomBinaryArray(int64_t length, bool 
include_nulls, MemoryPool* poo
 if (include_nulls && values_index == 0) {
   RETURN_NOT_OK(builder.AppendNull());
 } else {
-  const std::string& value = values[values_index];
+  const std::string value =
+  i < int64_t(values.size()) ? values[values_index] : 
std::to_string(i);
   RETURN_NOT_OK(builder.Append(reinterpret_cast(value.data()),
static_cast(value.size(;
 }
   }
   return builder.Finish(out);
 }
 
-Status MakeStringTypesRecordBatch(std::shared_ptr* out) {
+Status MakeStringTypesRecordBatchWithNulls(std::shared_ptr* out,
 
 Review comment:
   Style note: if it's called `MakeStringTypesRecordBatchWithNulls`, it 
shouldn't take an argument indicating whether you want to include nulls. 
Consider renaming or using a private helper function.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412964#comment-16412964
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on issue #1784: ARROW-2328: [C++] Fixed and unit tested 
feather writing with slice
URL: https://github.com/apache/arrow/pull/1784#issuecomment-375958714
 
 
   Will do, thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412857#comment-16412857
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

wesm commented on issue #1784: ARROW-2328: [C++] Fixed and unit tested feather 
writing with slice
URL: https://github.com/apache/arrow/pull/1784#issuecomment-375939284
 
 
   BTW, feel no need to put these "ARROW-2328: [C++]" messages in your commits. 
The only one that matters is in the PR title
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411547#comment-16411547
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

wesm commented on issue #1784: ARROW-2328: [C++] Fixed and unit tested feather 
writing with slice
URL: https://github.com/apache/arrow/pull/1784#issuecomment-375699243
 
 
   I will review this when I can, it has been a very busy week


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411293#comment-16411293
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on issue #1766: ARROW-2328: [C++] Writing a slice with 
feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#issuecomment-375644923
 
 
   I've redone the changes in a new branch and pull request 
(https://github.com/apache/arrow/pull/1784). Hopefully that is a little cleaner.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411292#comment-16411292
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr opened a new pull request #1784: (ARROW-2328: [C++] Fixed and unit 
tested feather writing with slice
URL: https://github.com/apache/arrow/pull/1784
 
 
   When writing a slice of a table to feather, both the scalar data and the 
null bitmap were written out wrongly, i.e. from the beginning of the original 
table instead of from the offset.
   - scalar data is written from the offset
   - null bitmap is written from the offset.
   
   I messed up my original pull request 
https://github.com/apache/arrow/pull/1766, hence this replaces it. Hope that 
helps.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410594#comment-16410594
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

wesm commented on issue #1766: ARROW-2328: [C++] Writing a slice with feather 
ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#issuecomment-375496335
 
 
   Would like to review this more closely, will look when I can


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410591#comment-16410591
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1766: ARROW-2328: [C++] 
Writing a slice with feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#discussion_r176607118
 
 

 ##
 File path: cpp/src/arrow/ipc/feather.cc
 ##
 @@ -558,8 +577,18 @@ class TableWriter::TableWriterImpl : public ArrayVisitor {
   // byte boundary, and we write this much data into the stream
   int64_t null_bitmap_size = 
GetOutputLength(BitUtil::BytesForBits(values.length()));
   if (values.null_bitmap()) {
-RETURN_NOT_OK(WritePadded(stream_.get(), values.null_bitmap()->data(),
-  null_bitmap_size, _written));
+auto null_bitmap = values.null_bitmap();
+if (values.offset() > 0) {
+  if (!pool_) {
+return Status::Invalid(
+"Require memory pol to write array slice (i.e. offset > 0) 
with nulls.");
+  }
+  RETURN_NOT_OK(GetTruncatedBitmap(values.offset(), values.length(), 
null_bitmap,
+   pool_, _bitmap));
+}
+
+RETURN_NOT_OK(WritePadded(stream_.get(), null_bitmap->data(), 
null_bitmap_size,
 
 Review comment:
   I've pushed a change to do as you suggested and added a test.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410246#comment-16410246
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1766: ARROW-2328: [C++] 
Writing a slice with feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#discussion_r176562102
 
 

 ##
 File path: cpp/src/arrow/ipc/feather.cc
 ##
 @@ -558,8 +577,18 @@ class TableWriter::TableWriterImpl : public ArrayVisitor {
   // byte boundary, and we write this much data into the stream
   int64_t null_bitmap_size = 
GetOutputLength(BitUtil::BytesForBits(values.length()));
   if (values.null_bitmap()) {
-RETURN_NOT_OK(WritePadded(stream_.get(), values.null_bitmap()->data(),
-  null_bitmap_size, _written));
+auto null_bitmap = values.null_bitmap();
+if (values.offset() > 0) {
+  if (!pool_) {
+return Status::Invalid(
+"Require memory pol to write array slice (i.e. offset > 0) 
with nulls.");
+  }
+  RETURN_NOT_OK(GetTruncatedBitmap(values.offset(), values.length(), 
null_bitmap,
+   pool_, _bitmap));
+}
+
+RETURN_NOT_OK(WritePadded(stream_.get(), null_bitmap->data(), 
null_bitmap_size,
 
 Review comment:
   I agree that would be better if it is possible. Same goes for the arrow 
writing code of course. I can give it a try tomorrow.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410081#comment-16410081
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

xhochy commented on a change in pull request #1766: ARROW-2328: [C++] Writing a 
slice with feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#discussion_r176529987
 
 

 ##
 File path: cpp/src/arrow/ipc/feather.cc
 ##
 @@ -558,8 +577,18 @@ class TableWriter::TableWriterImpl : public ArrayVisitor {
   // byte boundary, and we write this much data into the stream
   int64_t null_bitmap_size = 
GetOutputLength(BitUtil::BytesForBits(values.length()));
   if (values.null_bitmap()) {
-RETURN_NOT_OK(WritePadded(stream_.get(), values.null_bitmap()->data(),
-  null_bitmap_size, _written));
+auto null_bitmap = values.null_bitmap();
+if (values.offset() > 0) {
+  if (!pool_) {
+return Status::Invalid(
+"Require memory pol to write array slice (i.e. offset > 0) 
with nulls.");
+  }
+  RETURN_NOT_OK(GetTruncatedBitmap(values.offset(), values.length(), 
null_bitmap,
+   pool_, _bitmap));
+}
+
+RETURN_NOT_OK(WritePadded(stream_.get(), null_bitmap->data(), 
null_bitmap_size,
 
 Review comment:
   Wouldn't it be better to extend `WritePadded` or write a second version of 
it that can handle bit-offsets? Using that way we could avoid the memory 
allocation for `GetTruncatedBitmap` above.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408021#comment-16408021
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1766: ARROW-2328: [C++] 
Writing a slice with feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#discussion_r176107139
 
 

 ##
 File path: cpp/src/arrow/ipc/feather.cc
 ##
 @@ -603,7 +602,12 @@ class TableWriter::TableWriterImpl : public ArrayVisitor {
   }
 
   if (prim_values.values()) {
-values_buffer = prim_values.values()->data();
+if (prim_values.offset() != 0 && (fw_type.bit_width() % 8 != 0)) {
+  return arrow::Status::Invalid(
 
 Review comment:
   I've now made these changes and more.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407661#comment-16407661
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on issue #1766: ARROW-2328: [C++] Writing a slice with 
feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#issuecomment-374875952
 
 
   In the latest commit I have unit tested and fixed the null bitmap bug. I 
ended up adding the memory pool argument with a nullptr default and fail if an 
array is sliced and no pool was passed in.
   I also copy and pasted GetTruncatedBitmap from writer.cpp, I wasn't sure how 
to share it without it leaking into the public api.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> The null bitmap also ends up misaligned. Also tested and fixed in the pull 
> request below.
>  I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407264#comment-16407264
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on issue #1766: ARROW-2328: [C++] Writing a slice with 
feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#issuecomment-374799805
 
 
   On further investigation, there is a second bug, in that the null bitmap 
isn't aligned with the offset either. Hence the SliceStringsRoundTrip test 
fails.
   Looking at the TableWriter code, to fix this will require creating a new 
null bitmap, which will require a MemoryPool and hence a change in the api. :(. 
   Without the ability to have nulls, writing sliced tables to feather seems to 
be mostly broken. I could add the pool as an optional argument, or what would 
be a preferred approach? I'll have a think tomorrow. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407192#comment-16407192
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

Adriandorr commented on a change in pull request #1766: ARROW-2328: [C++] 
Writing a slice with feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#discussion_r175948619
 
 

 ##
 File path: cpp/src/arrow/ipc/feather.cc
 ##
 @@ -603,7 +602,12 @@ class TableWriter::TableWriterImpl : public ArrayVisitor {
   }
 
   if (prim_values.values()) {
-values_buffer = prim_values.values()->data();
+if (prim_values.offset() != 0 && (fw_type.bit_width() % 8 != 0)) {
+  return arrow::Status::Invalid(
 
 Review comment:
   Thanks for your comments. I've changed the status return code. But I'm 
having two problems:
   * when I run 'make format' it only changes a whole bunch of files I didn't 
change. Was master broken when I forked?
   * SliceStringsRoundTrip fails and I'm struggling to understand which of the 
various offsets needs to be adjusted. I can probably figure that out tomorrow.
   Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2328) Writing a slice with feather ignores the offset

2018-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406917#comment-16406917
 ] 

ASF GitHub Bot commented on ARROW-2328:
---

xhochy commented on a change in pull request #1766: ARROW-2328: [C++] Writing a 
slice with feather ignores the offset.
URL: https://github.com/apache/arrow/pull/1766#discussion_r175899893
 
 

 ##
 File path: cpp/src/arrow/ipc/feather.cc
 ##
 @@ -603,7 +602,12 @@ class TableWriter::TableWriterImpl : public ArrayVisitor {
   }
 
   if (prim_values.values()) {
-values_buffer = prim_values.values()->data();
+if (prim_values.offset() != 0 && (fw_type.bit_width() % 8 != 0)) {
+  return arrow::Status::Invalid(
 
 Review comment:
   This should be NotImplemented. We should be able to write them someday.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Writing a slice with feather ignores the offset
> ---
>
> Key: ARROW-2328
> URL: https://issues.apache.org/jira/browse/ARROW-2328
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Adrian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Writing a slice from row n of length m of an array to feather would write the 
> first m rows, instead of the rows starting at n.
> I've created a pull request with tests and fix here: 
> [Pullrequest#1766|https://github.com/apache/arrow/pull/1766]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)