I think this is probably because the schema is transformed for
dictionary-encoded fields [1].  Something could probably be done to expose
the schema separately, but the library and readers are mostly designed
around populating and repopulating VectorSchemaRoots, so I don't think the
cost of the extra allocation was considered.  What kind of bottleneck is
this causing for you?
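
To make the transformation a bit more concrete, here is a rough sketch in
plain Java.  It is a simplified model, not Arrow's code: the Field class and
the toMemoryFormat and transform methods below are hypothetical stand-ins.
It assumes that a dictionary-encoded field is described by its value type in
the serialized schema and by its index type in the schema the reader hands
back, which is my understanding of what the code at [1] does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaTransformSketch {

    /** Hypothetical stand-in for a schema field; not Arrow's Field class. */
    static final class Field {
        final String name;
        final String type;          // type as written in the schema, e.g. "utf8"
        final String dictIndexType; // index type if dictionary-encoded, else null

        Field(String name, String type, String dictIndexType) {
            this.name = name;
            this.type = type;
            this.dictIndexType = dictIndexType;
        }
    }

    /**
     * Assumed behavior: a dictionary-encoded field is rewritten so that its
     * type becomes the index type; all other fields pass through unchanged.
     */
    static Field toMemoryFormat(Field field) {
        if (field.dictIndexType == null) {
            return field;
        }
        return new Field(field.name, field.dictIndexType, field.dictIndexType);
    }

    /** Applies the per-field rewrite to every field of the serialized schema. */
    static List<Field> transform(List<Field> serializedSchema) {
        List<Field> result = new ArrayList<>();
        for (Field field : serializedSchema) {
            result.add(toMemoryFormat(field));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Field> serialized = Arrays.asList(
            new Field("id", "int64", null),
            new Field("city", "utf8", "int32")); // dictionary-encoded column

        for (Field field : transform(serialized)) {
            System.out.println(field.name + ": " + field.type);
        }
    }
}
```

In this model the schema as serialized and the schema a caller sees differ
for the "city" column, which is why handing out the raw result of readSchema
could be misleading without some extra design work.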

If you would like to open a PR or discuss this further, dev@ might be a
better place to talk through the use case and design of the feature you are
looking for.

Thanks,
Micah

[1]
https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/ipc/ArrowReader.java#L183

On Wed, Oct 7, 2020 at 12:34 PM Michael Mior <[email protected]> wrote:

> Why is the Schema object not exposed in ArrowReader? (e.g. readSchema
> is protected). Instead, I need to call
> getVectorSchemaRoot().getSchema() which unnecessarily allocates a
> VectorSchemaRoot that I don't immediately need.
>
> --
> Michael Mior
> [email protected]
>
