albertlockett opened a new issue, #9195:
URL: https://github.com/apache/arrow-rs/issues/9195

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   We're working on a 
[protocol](https://github.com/open-telemetry/otel-arrow/blob/main/docs/otap_basics.md#transport)
 that sends/receives a stream of record batches as an encapsulated Arrow IPC 
stream. The stream of record batches has a dynamic schema, so each time the 
schema changes we begin a new IPC Stream.
   
   Rather than throwing away and recreating an 
`arrow_ipc::writer::StreamWriter` each time this happens, which is 
wasteful/expensive, we created our own [type for producing the IPC 
stream](https://github.com/open-telemetry/otel-arrow/blob/4b646461dc3070dbe85c5cbc3051ddd08d7331f3/rust/otap-dataflow/crates/pdata/src/encode/producer.rs#L23-L40)
 that tries to reuse as much as it can. However, for each stream it's still 
[creating a new 
`DictionaryTracker`](https://github.com/open-telemetry/otel-arrow/blob/4b646461dc3070dbe85c5cbc3051ddd08d7331f3/rust/otap-dataflow/crates/pdata/src/encode/producer.rs#L136-L139).
   
   **Describe the solution you'd like**
   I'd like for there to be a `reset` method on the `DictionaryTracker` that 
clears any internal state so I can reuse it without the allocation cost of 
creating a new one.
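
To illustrate the idea, here is a minimal sketch of what such a `reset` could look like. `SketchTracker` and its fields are hypothetical stand-ins, not the actual `DictionaryTracker` internals. The sketch only assumes the tracker keeps its state in standard collections, whose `clear` methods drop the contents but keep the allocated memory.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dictionary tracker (not the arrow-rs type).
#[derive(Debug, Default)]
pub struct SketchTracker {
    /// Last dictionary payload written for each dictionary id.
    written: HashMap<i64, Vec<u8>>,
    /// Dictionary ids in the order they were first seen.
    dict_ids: Vec<i64>,
}

impl SketchTracker {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records a dictionary and returns `true` if it is new or changed,
    /// meaning it would have to be written to the stream.
    pub fn insert(&mut self, dict_id: i64, payload: Vec<u8>) -> bool {
        if self.written.get(&dict_id) == Some(&payload) {
            return false;
        }
        if !self.written.contains_key(&dict_id) {
            self.dict_ids.push(dict_id);
        }
        self.written.insert(dict_id, payload);
        true
    }

    /// Clears all per-stream state while keeping the allocations,
    /// so the tracker can be reused for the next IPC stream.
    pub fn reset(&mut self) {
        self.written.clear();
        self.dict_ids.clear();
    }

    /// Number of dictionaries currently tracked.
    pub fn tracked(&self) -> usize {
        self.written.len()
    }

    /// Capacity of the underlying map, kept across `reset`.
    pub fn capacity(&self) -> usize {
        self.written.capacity()
    }
}
```

With something along these lines, the producer would call `reset()` whenever the schema changes and a new stream begins, instead of constructing a fresh tracker each time.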
   
   **Describe alternatives you've considered**
   I considered adding a `reset` method to `StreamWriter`. I might do that 
anyway in a follow-up PR if we're OK with it. But that method on `StreamWriter` 
would need to reset the `DictionaryTracker` anyway, so the solution described 
in this issue seemed like a good first step.
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
