[Discussion] Extending vectorized field-major processing for nested complex types (Issue #3225)

Vignesh Siva Thu, 22 Jan 2026 04:03:45 -0800

Hi everyone, I've been working on #3225 to extend field-major processing to
nested struct fields in Comet. My implementation focuses on separating
validity extraction from child field processing to improve cache locality
and reduce type-dispatch overhead.
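
To make the idea concrete, here is a minimal sketch of the two-pass shape
described above. It uses plain Rust vectors, not Comet's or Arrow's actual
types. Row, StructColumn, and build_field_major are hypothetical names chosen
only for illustration, and the real PR may be organized differently.

```rust
// Hypothetical stand-in types for illustration; not Comet's or Arrow's API.
struct Row {
    is_null: bool, // true when the whole struct value is null
    id: i32,
    score: i64,
}

// Columnar output: one validity vector plus one vector per child field.
struct StructColumn {
    validity: Vec<bool>,
    ids: Vec<i32>,
    scores: Vec<i64>,
}

fn build_field_major(rows: &[Row]) -> StructColumn {
    // Pass 1: extract validity for every row, touching nothing else.
    let validity: Vec<bool> = rows.iter().map(|r| !r.is_null).collect();

    // Pass 2: one tight loop per child field, so the field's type is
    // decided once per column rather than once per row.
    // Null rows get a placeholder 0 that the validity vector masks out.
    let ids: Vec<i32> = rows
        .iter()
        .map(|r| if r.is_null { 0 } else { r.id })
        .collect();
    let scores: Vec<i64> = rows
        .iter()
        .map(|r| if r.is_null { 0 } else { r.score })
        .collect();

    StructColumn { validity, ids, scores }
}
```

In this sketch, every loop writes to exactly one output vector, which is the
property the cache-locality argument relies on.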


As I finalize this PR, I have a question regarding the broader roadmap for
complex types:

Now that we have a recursive strategy for Struct fields, what are the
community's thoughts on applying a similar vectorized approach to List and
Map types? Specifically, what is the preferred pattern for handling
variable-length offsets in these collection types while staying within the
optimized field-major traversal path in the shuffle kernels?
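
As one possible starting point for discussion (an assumption, not an existing
Comet pattern), a list column could be built in three passes: validity, then
offsets, then flattened child values. The sketch below follows the Arrow
convention of an offsets buffer with one more entry than there are rows, and
gives null lists zero length. ListColumn and build_list_column are
hypothetical names, and plain vectors stand in for Arrow buffers.

```rust
// Hypothetical stand-in for a list<i32> column; not Comet's or Arrow's API.
struct ListColumn {
    validity: Vec<bool>, // one entry per row
    offsets: Vec<i32>,   // rows + 1 entries; row i spans offsets[i]..offsets[i + 1]
    values: Vec<i32>,    // all child elements, flattened in row order
}

fn build_list_column(rows: &[Option<Vec<i32>>]) -> ListColumn {
    // Pass 1: validity only.
    let validity: Vec<bool> = rows.iter().map(|r| r.is_some()).collect();

    // Pass 2: offsets as a running sum of list lengths.
    // A null list contributes zero elements, so its offset repeats.
    let mut offsets = Vec::with_capacity(rows.len() + 1);
    let mut total = 0i32;
    offsets.push(total);
    for r in rows {
        total += r.as_ref().map_or(0, |v| v.len() as i32);
        offsets.push(total);
    }

    // Pass 3: copy child values in one sequential sweep.
    // The final offset gives the exact capacity needed up front.
    let mut values = Vec::with_capacity(total as usize);
    for list in rows.iter().flatten() {
        values.extend_from_slice(list);
    }

    ListColumn { validity, offsets, values }
}
```

Computing offsets before copying values means the child buffer can be sized
once, at the cost of visiting each row twice. Whether that trade-off suits the
shuffle kernels is part of what I am asking.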

I am a student contributor preparing for GSoC 2026 and would love to align
my current work with the long-term architectural goals for complex type
optimization in DataFusion Comet.

Best regards, Vignesh.

GitHub: vigneshsiva11
