On Thu, 14 May 2026 at 15:25, Antoine Pitrou <[email protected]> wrote:

>
> I haven't really followed Variant development, but it's extremely
> reasonable for implementations to choose reasonable nesting limits (say,
> 64 levels).
>

Yes, some limit is needed. The JSON one is 500, so I took that.
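
To make that concrete, here is a rough sketch of what I have in mind (made-up
type names, not the actual Parquet C++ API): a recursive walk over a decoded
value that bails out past a fixed depth.

#include <stdexcept>
#include <vector>

// Illustrative stand-in for a decoded Variant value; not the real type.
struct VariantValue {
  bool is_container = false;            // object or array
  std::vector<VariantValue> children;   // nested values, if any
};

// Same default as the 500-level JSON limit mentioned above.
constexpr int kMaxNestingDepth = 500;

void ValidateNesting(const VariantValue& v, int depth = 0) {
  if (depth > kMaxNestingDepth) {
    throw std::runtime_error("Variant nesting depth exceeds limit");
  }
  for (const auto& child : v.children) {
    ValidateNesting(child, depth + 1);
  }
}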


>
> I would point out that we already have somehow similar limits in Parquet
> C++ for Thrift decoding:
>
> https://github.com/apache/arrow/blob/c1036681b099c5f9b0684a710be04bb7619e926f/cpp/src/parquet/properties.h#L105-L121
>
> I'll add that parsing Variants is a natural target for fuzz testing.
>

Less so than the compression stuff. The challenge with Variants is that you
don't want validation so rigorous that it hurts performance. Arrow's Rust
Parquet implementation has on-demand strict validation.

In my PR, metadata content validation ("monotonically increasing offsets into
the data") stays as it is today: on demand, when you do a lookup. With
Neelesh's cached metadata there's only one lookup per key, which is a real
performance killer right now.
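
For illustration, the on-demand idea looks roughly like this (names made up,
not the PR's actual code): the offsets check only runs for the entry a lookup
actually touches, instead of validating the whole metadata dictionary up
front.

#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string_view>
#include <vector>

// Illustrative stand-in for the Variant metadata dictionary.
struct VariantMetadata {
  std::vector<std::uint32_t> offsets;  // offsets[i]..offsets[i+1] delimit key i
  std::string_view data;               // concatenated dictionary key bytes

  std::string_view KeyAt(std::size_t i) const {
    // On-demand check: validate only the pair of offsets this lookup uses.
    if (i + 1 >= offsets.size() || offsets[i] > offsets[i + 1] ||
        offsets[i + 1] > data.size()) {
      throw std::runtime_error("invalid Variant metadata offsets");
    }
    return data.substr(offsets[i], offsets[i + 1] - offsets[i]);
  }
};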
