SandishKumarHN commented on code in PR #38922:
URL: https://github.com/apache/spark/pull/38922#discussion_r1053690877
##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala:
##########
@@ -92,17 +108,35 @@ object SchemaConverters {
MapType(keyType, valueType, valueContainsNull = false).defaultConcreteType,
nullable = false))
case MESSAGE =>
- if (existingRecordNames.contains(fd.getFullName)) {
+ // If the `recursive.fields.max.depth` value is not specified, it will default to -1;
+ // recursive fields are not permitted. Setting it to 0 drops all recursive fields,
+ // 1 allows it to be recursed once, and 2 allows it to be recursed twice and so on.
+ // A value greater than 10 is not allowed, and if a protobuf record has more depth for
+ // recursive fields than the allowed value, it will be truncated and some fields may be
+ // discarded.
+ // SQL Schema for the protobuf message `message Person { string name = 1; Person bff = 2}`
+ // will vary based on the value of "recursive.fields.max.depth".
+ // 0: struct<name: string, bff: null>
+ // 1: struct<name string, bff: <name: string, bff: null>>
+ // 2: struct<name string, bff: <name: string, bff: struct<name: string, bff: null>>> ...
+ val recordName = fd.getMessageType.getFullName
Review Comment:
@cloud-fan `fd.getFullName` returns the fully qualified name of the field
(it ends with the field name), but here we need the fully qualified name of the
field's message type, so we use `fd.getMessageType.getFullName`. We made this
decision above. Here is the difference:
```
println(s"${fd.getFullName} : ${fd.getMessageType.getFullName}")
org.apache.spark.sql.protobuf.protos.Employee.ic : org.apache.spark.sql.protobuf.protos.IC
org.apache.spark.sql.protobuf.protos.IC.icManager : org.apache.spark.sql.protobuf.protos.Employee
org.apache.spark.sql.protobuf.protos.Employee.ic : org.apache.spark.sql.protobuf.protos.IC
org.apache.spark.sql.protobuf.protos.IC.icManager : org.apache.spark.sql.protobuf.protos.Employee
org.apache.spark.sql.protobuf.protos.Employee.em : org.apache.spark.sql.protobuf.protos.EM
```
@rangadi The previous code used `fd.getFullName` (the fully qualified name
including the field name), and that was enough to detect recursion, because
before this change we simply threw an error on any recursive field.
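
To illustrate why the depth check is keyed on the message type name, here is a
minimal, self-contained sketch. It is not the code in this PR and it does not
use the protobuf-java API: `Field`, `Msg`, `registry`, and `toSchema` are
hypothetical names in a toy model, and every scalar is rendered as `string`.
It assumes the semantics described in the diff comment above (-1 rejects
recursion, 0 drops recursive fields, 1 allows one level, and so on).

```scala
// Toy stand-ins for protobuf descriptors (hypothetical, not protobuf-java classes).
// typeName is Some(fullTypeName) for MESSAGE fields and None for scalar fields.
final case class Field(name: String, typeName: Option[String])
final case class Msg(fullName: String, fields: List[Field])

object RecursionSketch {
  // Person refers to itself; Employee and IC refer to each other.
  val registry: Map[String, Msg] = List(
    Msg("Person", List(Field("name", None), Field("bff", Some("Person")))),
    Msg("Employee", List(Field("name", None), Field("ic", Some("IC")))),
    Msg("IC", List(Field("icManager", Some("Employee"))))
  ).map(m => m.fullName -> m).toMap

  // `seen` counts how often each message TYPE already occurs on the current path.
  def toSchema(typeName: String, maxDepth: Int, seen: Map[String, Int] = Map.empty): String = {
    val onPath = seen.updated(typeName, seen.getOrElse(typeName, 0) + 1)
    val cols = registry(typeName).fields.map {
      case Field(name, None) => s"$name: string"
      case Field(name, Some(child)) =>
        val childVisits = onPath.getOrElse(child, 0)
        if (childVisits > 0 && maxDepth < 0) {
          // Recursion is not permitted at all when the limit is negative.
          throw new IllegalStateException(s"Recursive field: $typeName.$name")
        } else if (childVisits > maxDepth && childVisits > 0) {
          s"$name: null" // limit reached: drop the recursive field
        } else {
          s"$name: ${toSchema(child, maxDepth, onPath)}"
        }
    }
    cols.mkString("struct<", ", ", ">")
  }
}
```

In this toy model, `RecursionSketch.toSchema("Person", 0)` yields
`struct<name: string, bff: null>` and `RecursionSketch.toSchema("Person", 1)`
yields `struct<name: string, bff: struct<name: string, bff: null>>`. Counting
by type name means the `Employee` -> `IC` -> `Employee` cycle is counted as a
revisit of `Employee`, regardless of which field leads back to it.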