bkietz commented on a change in pull request #9532:
URL: https://github.com/apache/arrow/pull/9532#discussion_r583924317



##########
File path: cpp/src/arrow/dataset/expression_test.cc
##########
@@ -1135,5 +1175,36 @@ TEST(Expression, SerializationRoundTrips) {
                          equal(field_ref("beta"), literal(3.25f))}));
 }
 
+TEST(Projection, AugmentWithNull) {
+  auto just_i32 = ArrayFromJSON(struct_({kBoringSchema->GetFieldByName("i32")}),
+                                R"([{"i32": 0}, {"i32": 1}, {"i32": 2}])");
+
+  {
+    ASSERT_OK_AND_ASSIGN(auto proj, project({field_ref("f64"), field_ref("i32")},
+                                            {"projected double", "projected int"})
+                                        .Bind(*kBoringSchema));
+
+    auto expected = ArrayFromJSON(
+        struct_({field("projected double", float64()), field("projected int", int32())}),
+        R"([[null, 0], [null, 1], [null, 2]])");
+    ASSERT_OK_AND_ASSIGN(auto actual, ExecuteScalarExpression(proj, just_i32));
+
+    AssertDatumsEqual(Datum(expected), actual);
+  }
+
+  {
+    ASSERT_OK_AND_ASSIGN(
+        auto proj,
+        project({field_ref("f64")}, {"projected double"}).Bind(*kBoringSchema));
+
+    // NB: only a scalar was projected, this is *not* automatically broadcast to an array.
+    ASSERT_OK_AND_ASSIGN(auto expected, StructScalar::Make({MakeNullScalar(float64())},

Review comment:
       Ah, I see your concern. Individual calls to project deliberately do not
broadcast scalars, so that subsequent steps in the pipeline can do something
more efficient with them. FilterAndProjectScanTask broadcasts scalars to the
correct length before yielding the batch:
https://github.com/apache/arrow/pull/9532/files?file-filters%5B%5D=.cc&file-filters%5B%5D=.h&file-filters%5B%5D=.java&file-filters%5B%5D=.pxd&file-filters%5B%5D=.py#diff-25b1bd283e8242f8384b24a0f1e8b61fbca0c2784ab679f9a2a00b03450487aaR72-R76
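
To make the split concrete, here is a minimal stand-alone sketch of the idea, not the code at the link above. `Scalar`, `Array`, `Column`, and `BroadcastScalars` are hypothetical stand-ins written against the standard library only; they are not Arrow types or functions. Projected columns may stay scalar, and a final step expands each scalar to the batch length just before the batch is yielded. In Arrow itself I would expect that expansion to go through a helper such as `MakeArrayFromScalar`, but that is an assumption about the linked code rather than something checked here.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <variant>
#include <vector>

// Hypothetical stand-ins (NOT Arrow's API): a nullable double scalar,
// an array of nullable doubles, and a column that may be either one.
using Scalar = std::optional<double>;  // std::nullopt models a null scalar
using Array = std::vector<std::optional<double>>;
using Column = std::variant<Scalar, Array>;

// Models the last step before a batch is yielded: every scalar column is
// repeated `length` times, and array columns are passed through unchanged.
std::vector<Array> BroadcastScalars(const std::vector<Column>& columns,
                                    std::size_t length) {
  std::vector<Array> out;
  out.reserve(columns.size());
  for (const Column& col : columns) {
    if (const Scalar* scalar = std::get_if<Scalar>(&col)) {
      out.emplace_back(length, *scalar);  // `length` copies of the scalar
    } else {
      const Array& arr = std::get<Array>(col);
      assert(arr.size() == length);  // array columns must already match
      out.push_back(arr);
    }
  }
  return out;
}
```

With a null scalar for "projected double" and the array `[0, 1, 2]` for "projected int", calling `BroadcastScalars(columns, 3)` gives two length-3 columns, the first all null. Steps that run before this point still see the scalar and can avoid materializing the repeated values.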



