wombatu-kun commented on PR #18497:
URL: https://github.com/apache/hudi/pull/18497#issuecomment-4242014596

   Minor code-quality nits
   
   - `HoodieSparkLanceWriter#enrichSparkSchemaForLanceVectors`: the local 
`DataType dt = field.dataType()` is assigned before the `ArrayType` check, then 
unused once the cast is done. Can collapse to `if (!(field.dataType() 
instanceof ArrayType))`.
   - `VectorConversionUtils#buildVectorColumnsMetadataValue` duplicates the 
format produced by `HoodieSchema#buildVectorColumnsMetadataValue` for Avro 
schemas. Consider adding a Javadoc cross-reference and asserting the two 
produce identical strings for equivalent schemas (could even delegate).
   - `VectorConversionUtils#restoreVectorMetadataFromArrowSchema` walks only 
top-level fields. If a nested struct ever carries a VECTOR child (unlikely 
today but possible), it would be missed. Worth a Javadoc note: "Top-level 
VECTORs only; nested struct children are not recursed into."
   - `HoodieBaseLanceWriter#close`: the new `additionalSchemaMetadata` hook is 
called after bloom filter metadata — fine — but the nested `if (writer != 
null)` check is redundant (already inside the outer `writer != null` for bloom 
filter); can collapse.
   - `buildVectorColumnsMetadataValue` returns `""` for schemas without 
vectors, and the override early-returns `Collections.emptyMap()` in that case — 
so no footer entry is emitted. Correct, just worth a comment explaining that 
the hook is called unconditionally but is a no-op when there are no vectors 
(otherwise a future reader maintaining this code might wonder why a non-VECTOR 
Lance file has no `hoodie.vector.columns`).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to