Yuchen Liu created SPARK-48964:
----------------------------------
Summary: Fix the discrepancy between implementation, comment and
documentation of option recursive.fields.max.depth in ProtoBuf connector
Key: SPARK-48964
URL: https://issues.apache.org/jira/browse/SPARK-48964
Project: Spark
Issue Type: Documentation
Components: Connect
Affects Versions: 3.5.1, 3.5.0, 4.0.0, 3.5.2, 3.5.3
Reporter: Yuchen Liu
Fix For: 4.0.0, 3.5.2, 3.5.3, 3.5.1, 3.5.0
After the three PRs (https://github.com/apache/spark/pull/38922,
https://github.com/apache/spark/pull/40011, and
https://github.com/apache/spark/pull/40141) that worked on the same option,
some legacy comments and documentation have not been updated to match the
latest implementation. This task should consolidate them. Below is the correct
description of the behavior.
The `recursive.fields.max.depth` parameter can be specified in the
from_protobuf options to control the maximum allowed recursion depth for a
field. Setting `recursive.fields.max.depth` to 1 drops all recursive fields,
setting it to 2 allows a field to be recursed once, and setting it to 3 allows
a field to be recursed twice. Setting `recursive.fields.max.depth` to a value
greater than 10 is not allowed. If `recursive.fields.max.depth` is set to a
value smaller than 1, recursive fields are not permitted. The default value of
the option is -1, so recursive fields are not permitted by default. If a
protobuf record has more depth for recursive fields than the allowed value, it
will be truncated and some fields may be discarded. This check is based on the
fully qualified field type.
The SQL schema for the protobuf message
{code:java}
message Person { string name = 1; Person bff = 2; }
{code}
will vary based on the value of `recursive.fields.max.depth`.
{code:java}
1: struct<name: string>
2: struct<name: string, bff: struct<name: string>>
3: struct<name: string, bff: struct<name: string, bff: struct<name: string>>> ...
{code}
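The depth rule above can be checked with a small standalone sketch. This is a toy model of the described behavior for the `Person` message only, not the connector's implementation: the helper name `person_schema`, the constant `MAX_ALLOWED_DEPTH`, and the use of `ValueError` for rejected values are all assumptions made for illustration.

```python
# Toy model of the documented behavior of recursive.fields.max.depth for
#   message Person { string name = 1; Person bff = 2; }
# The names below are hypothetical; Spark's connector does not expose them.

MAX_ALLOWED_DEPTH = 10  # per the description, values greater than 10 are not allowed


def person_schema(max_depth: int) -> str:
    """Render the SQL schema string of Person for a given max depth."""
    if max_depth > MAX_ALLOWED_DEPTH:
        raise ValueError("recursive.fields.max.depth must not exceed 10")
    if max_depth < 1:
        # Assumption: "not permitted" is modeled here as an error.
        raise ValueError("recursive fields are not permitted")
    # Depth 1 keeps only the non-recursive field.
    schema = "struct<name: string>"
    # Each additional level wraps the previous schema in one more `bff` field.
    for _ in range(max_depth - 1):
        schema = f"struct<name: string, bff: {schema}>"
    return schema


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(f"{depth}: {person_schema(depth)}")
```

Each increment of the option adds exactly one nested `bff` level, which matches the three schemas listed above.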