bkietz commented on code in PR #14402:
URL: https://github.com/apache/arrow/pull/14402#discussion_r994822479


##########
cpp/src/arrow/engine/substrait/expression_internal.cc:
##########
@@ -295,7 +295,8 @@ Result<compute::Expression> FromProto(const 
substrait::Expression& expr,
 Result<Datum> FromProto(const substrait::Expression::Literal& lit,
                         const ExtensionSet& ext_set,
                         const ConversionOptions& conversion_options) {
-  if (lit.nullable()) {

Review Comment:
   `Expression.Literal.{null, empty_list, empty_map}` are all `Type`s, which 
can independently express nullability. There's [a 
comment](https://github.com/substrait-io/substrait/blob/f3f6bdc947e689e800279666ff33f118e42d2146/proto/substrait/algebra.proto#L565-L567)
 for `Expression.Literal.null` which indicates that Literal.nullable should be 
left false since its type should carry that property. In light of this I'd 
propose we validate that if `lit` contains one of `Expression.Literal.{null, 
empty_list, empty_map}` then `lit.nullable()` must be false regardless of the 
conversion options



##########
cpp/src/arrow/engine/substrait/serde_test.cc:
##########
@@ -402,22 +402,36 @@ TEST(Substrait, SupportedLiterals) {
   auto ExpectEq = [](std::string_view json, Datum expected_value) {
     ARROW_SCOPED_TRACE(json);
 
-    ASSERT_OK_AND_ASSIGN(
-        auto buf, internal::SubstraitFromJSON("Expression",
-                                              "{\"literal\":" + 
std::string(json) + "}"));
-    ExtensionSet ext_set;
-    ASSERT_OK_AND_ASSIGN(auto expr, DeserializeExpression(*buf, ext_set));
+    for (bool nullable : {false, true}) {
+      std::string json_with_nullable;
+      std::string nullable_json = (nullable) ? ", \"nullable\": true" : "";
+      if (nullable) {
+        auto final_closing_brace = json.find_last_of('}');
+        ASSERT_NE(std::string_view::npos, final_closing_brace);
+        json_with_nullable =
+            std::string(json.substr(0, final_closing_brace)) + ", 
\"nullable\": true}";
+        json = json_with_nullable;
+      }

Review Comment:
   IIUC, this results in
   ```json
   {"literal": {"boolean": true, "nullable": true}, "nullable": true}
   ```
   but literal_type a `oneof` so it should only have a single key-value pair, 
where the key names the member of the union:
   ```json
   {"literal": {"boolean": true}, "nullable": true}
   ```
   If this isn't being caught in tests maybe there's a flag "error on unused 
fields" we should set.
   ```suggestion
   ```



##########
cpp/src/arrow/engine/substrait/serde_test.cc:
##########
@@ -504,6 +518,8 @@ TEST(Substrait, SupportedLiterals) {
     ASSERT_OK_AND_ASSIGN(auto json, internal::SubstraitToJSON("Type", *buf));
     ExpectEq("{\"null\": " + json + "}", MakeNullScalar(type));
   }
+
+  ConversionOptions conversion_options;

Review Comment:
   This seems unused, did you mean to put it in `ExpectEq` with 
   `conversion_options.strictness = ConversionStrictness::BEST_EFFORT`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to