rangadi commented on code in PR #40011:
URL: https://github.com/apache/spark/pull/40011#discussion_r1105466945


##########
connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufFunctionsSuite.scala:
##########
@@ -385,34 +387,12 @@ class ProtobufFunctionsSuite extends QueryTest with 
SharedSparkSession with Prot
     }
   }
 
-  test("Handle recursive fields in Protobuf schema, A->B->A") {
-    val schemaA = ProtobufUtils.buildDescriptor(testFileDesc, "recursiveA")

Review Comment:
   This extra code to create actual records is not required since we are 
testing an AnalysisException. It does not need real data.



##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala:
##########
@@ -118,17 +118,17 @@ object SchemaConverters {
         // discarded.
         // SQL Schema for the protobuf message `message Person { string name = 
1; Person bff = 2}`
         // will vary based on the value of "recursive.fields.max.depth".
-        // 0: struct<name: string, bff: null>
-        // 1: struct<name string, bff: <name: string, bff: null>>
-        // 2: struct<name string, bff: <name: string, bff: struct<name: 
string, bff: null>>> ...
+        // 1: struct<name: string, bff: null>
+        // 2: struct<name string, bff: <name: string, bff: null>>
+        // 3: struct<name string, bff: <name: string, bff: struct<name: 
string, bff: null>>> ...
         val recordName = fd.getMessageType.getFullName
         val recursiveDepth = existingRecordNames.getOrElse(recordName, 0)
         val recursiveFieldMaxDepth = protobufOptions.recursiveFieldMaxDepth
-        if (existingRecordNames.contains(recordName) && 
(recursiveFieldMaxDepth < 0 ||
+        if (existingRecordNames.contains(recordName) && 
(recursiveFieldMaxDepth <= 0 ||
           recursiveFieldMaxDepth > 10)) {
           throw 
QueryCompilationErrors.foundRecursionInProtobufSchema(fd.toString())
         } else if (existingRecordNames.contains(recordName) &&
-          recursiveDepth > recursiveFieldMaxDepth) {
+          recursiveDepth >= recursiveFieldMaxDepth) {

Review Comment:
   The main bug fix: 
   We allowed recursive depth go over by one. This makes the setting of '0' 
does not work as documented (earlier documentation stated even the first 
occurrence would be removed it it was set to 0). 
   Removing the first occurrence does not make sense since we don't know if 
this field is recursive until the second occurrence. 



##########
connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufFunctionsSuite.scala:
##########
@@ -853,7 +824,7 @@ class ProtobufFunctionsSuite extends QueryTest with 
SharedSparkSession with Prot
 
     val df = Seq(employee.toByteArray).toDF("protoEvent")
     val options = new java.util.HashMap[String, String]()
-    options.put("recursive.fields.max.depth", "1")
+    options.put("recursive.fields.max.depth", "2")

Review Comment:
   We need to increase the setting because this PR fixes off by one with 
enforcement. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to