Hi everyone,

I've been working on #3225 to extend field-major processing to
nested struct fields in Comet. My implementation focuses on separating
validity extraction from child field processing to improve cache locality
and reduce type-dispatch overhead.

As I finalize this PR, I have a question regarding the broader roadmap for
complex types:

Now that we have a recursive strategy for Struct fields, what are the
community's thoughts on applying a similar vectorized approach to List and
Map types? Specifically, what is the preferred pattern for handling
variable-length offsets in these collection types while staying within the
optimized field-major traversal path in the shuffle kernels?
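
To make the question concrete, here is a minimal sketch of the kind of
two-pass pattern I am asking about. It is not code from #3225 and not
Comet's API: `ListColumn`, `row_lengths`, and `child_range` are hypothetical
names, and it uses plain Rust vectors as a stand-in for an Arrow-style list
layout (offsets, validity, flattened child values).

```rust
use std::ops::Range;

/// Hypothetical, simplified stand-in for an Arrow-style list column.
/// `offsets` has one more entry than there are rows; row `i` owns
/// `values[offsets[i]..offsets[i + 1]]`.
struct ListColumn {
    offsets: Vec<usize>,
    validity: Vec<bool>,
    values: Vec<i32>,
}

/// Pass 1: derive per-row validity and length from the offsets alone,
/// without touching the child values.
fn row_lengths(col: &ListColumn) -> Vec<Option<usize>> {
    col.validity
        .iter()
        .zip(col.offsets.windows(2))
        .map(|(&valid, w)| if valid { Some(w[1] - w[0]) } else { None })
        .collect()
}

/// Pass 2 input: the single contiguous child range covered by all rows,
/// so the child column can be processed in one sweep with one type dispatch.
fn child_range(col: &ListColumn) -> Range<usize> {
    let start = col.offsets.first().copied().unwrap_or(0);
    let end = col.offsets.last().copied().unwrap_or(0);
    start..end
}
```

The idea is that the offsets buffer is handled like the validity buffer in
the struct case (one pass, no child access), and the child values are then
handled as an ordinary column over `child_range`. Whether this fits the
shuffle kernels is exactly what I am unsure about.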

I am a student contributor preparing for GSoC 2026 and would love to align
my current work with the long-term architectural goals for complex type
optimization in DataFusion Comet.

Best regards, Vignesh.

GitHub: vigneshsiva11
