Re: [PR] GH-40297: [C++] Add TensorFromJSON helper function [arrow]

via GitHub Thu, 07 Mar 2024 05:29:07 -0800


bkietz commented on code in PR #40365:
URL: https://github.com/apache/arrow/pull/40365#discussion_r1516146997



##########
cpp/src/arrow/record_batch_test.cc:
##########
@@ -705,17 +705,12 @@ TEST_F(TestRecordBatch, ToTensorSupportedNaN) {
   std::vector<int64_t> shape = {9, 2};
   const int64_t f32_size = sizeof(float);
   std::vector<int64_t> f_strides = {f32_size, f32_size * shape[0]};
-  std::vector<float> f_values = {
-      static_cast<float>(NAN), 2,  3,  4,  5, 6, 7, 8, 9, 10, 20, 30, 40,
-      static_cast<float>(NAN), 60, 70, 80, 90};
-  auto data = Buffer::Wrap(f_values);
-
-  std::shared_ptr<Tensor> tensor_expected;
-  ASSERT_OK_AND_ASSIGN(tensor_expected, Tensor::Make(float32(), data, shape, 
f_strides));
+  std::shared_ptr<Tensor> tensor_expected = TensorFromJSON(
+      float32(), shape,
+      "[NaN, 2,  3,  4,  5, 6, 7, 8, 9, 10, 20, 30, 40, NaN, 60, 70, 80, 90]", 
f_strides);

Review Comment:
   > it might be better to very explicitly create the values as they are stored 
in the tensor
   
   ArrayFromJSON has no way to specify many valid arrow arrays, for example 
arrays with non-zero/empty values under null bits or the much larger ambiguity 
of offsets in Utf8. Constructing these directly is of course still possible 
when necessary using other functions, as in any test written before the 
addition of a FromJSON factory. However in the majority of test cases such 
flexibility would be unnecessary and distracting as we only need an obvious, 
self-contained, and maximally terse literal value. IMO writing a nested Json 
array in order to represent a nested Arrow array is clearly preferable and 
doing otherwise misses an opportunity for clarity in the most common cases; 
explicit shape/strides are still easy enough to specify using existing 
functions including ArrayFromJSON.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] GH-40297: [C++] Add TensorFromJSON helper function [arrow]

Reply via email to