alihan-synnada opened a new pull request, #13412:
URL: https://github.com/apache/datafusion/pull/13412

   ## Which issue does this PR close?
   
   None
   
   ## Rationale for this change
   
   Part of #13411
   
   This PR implements a common `Decoder` trait, the `BatchDeserializer` trait, 
and the `DecoderDeserializer` struct as described in the issue. It also adds 
`CsvDecoder` and `JsonDecoder`, since `arrow-csv` and `arrow-json` already 
provide `Decoder`s.
   
   ## What changes are included in this PR?
   
   Note: About 290 of the added lines are new tests, which leaves about 250 
lines of actual code.
   
   - Add `BatchDeserializer` as a common interface.
     - `digest` consumes the input in chunks
     - `next` attempts to deserialize the digested data and returns a 
`DeserializerOutput`, which is one of `RecordBatch`, `RequiresMoreData`, or 
`InputExhausted`
     - `finish` signals the end of the input stream
   - Add `Decoder` trait
     - Mimics arrow-json and arrow-csv's `Decoder`s
   - Implement `Decoder` for `CsvDecoder` and `JsonDecoder` by forwarding 
methods
   - Add `DecoderDeserializer` and implement `BatchDeserializer` for formats 
that have a `Decoder` implementation.
   - Add `deserialize_stream` function to deduplicate the deserialization logic
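
   The `digest` / `next` / `finish` flow listed above can be illustrated with a 
small, std-only sketch. This is not the code in this PR: `Batch` is a 
`Vec<String>` stand-in for an Arrow `RecordBatch`, `LineDeserializer` and 
`deserialize_all` are hypothetical names, and the method signatures are 
simplified assumptions (no error handling, no generics).

```rust
use std::collections::VecDeque;

/// Stand-in for an Arrow `RecordBatch` (assumption): a group of text lines.
type Batch = Vec<String>;

#[derive(Debug, PartialEq)]
enum DeserializerOutput {
    RecordBatch(Batch),
    RequiresMoreData,
    InputExhausted,
}

/// Simplified version of the interface described in the list above.
trait BatchDeserializer {
    /// Consume one chunk of input; returns the number of bytes taken.
    fn digest(&mut self, chunk: &[u8]) -> usize;
    /// Try to produce a batch from what has been digested so far.
    fn next(&mut self) -> DeserializerOutput;
    /// Signal that no more input will arrive.
    fn finish(&mut self);
}

/// Hypothetical toy deserializer: emits batches of `batch_size` lines.
struct LineDeserializer {
    lines: VecDeque<String>, // complete lines not yet emitted
    partial: Vec<u8>,        // bytes of a line still waiting for '\n'
    batch_size: usize,
    finished: bool,
}

impl LineDeserializer {
    fn new(batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        Self { lines: VecDeque::new(), partial: Vec::new(), batch_size, finished: false }
    }
}

impl BatchDeserializer for LineDeserializer {
    fn digest(&mut self, chunk: &[u8]) -> usize {
        for &byte in chunk {
            if byte == b'\n' {
                let line = String::from_utf8_lossy(&self.partial).into_owned();
                self.partial.clear();
                self.lines.push_back(line);
            } else {
                self.partial.push(byte);
            }
        }
        chunk.len()
    }

    fn next(&mut self) -> DeserializerOutput {
        if self.lines.len() >= self.batch_size {
            let batch = self.lines.drain(..self.batch_size).collect();
            return DeserializerOutput::RecordBatch(batch);
        }
        if !self.finished {
            return DeserializerOutput::RequiresMoreData;
        }
        // End of input: an unterminated trailing line still counts as a line.
        if !self.partial.is_empty() {
            let line = String::from_utf8_lossy(&self.partial).into_owned();
            self.partial.clear();
            self.lines.push_back(line);
        }
        if self.lines.is_empty() {
            DeserializerOutput::InputExhausted
        } else {
            DeserializerOutput::RecordBatch(self.lines.drain(..).collect())
        }
    }

    fn finish(&mut self) {
        self.finished = true;
    }
}

/// Hypothetical driver loop in the spirit of `deserialize_stream`,
/// written over in-memory chunks instead of an async stream.
fn deserialize_all(chunks: &[&[u8]], batch_size: usize) -> Vec<Batch> {
    let mut de = LineDeserializer::new(batch_size);
    let mut out = Vec::new();
    for chunk in chunks {
        de.digest(chunk);
        while let DeserializerOutput::RecordBatch(batch) = de.next() {
            out.push(batch);
        }
    }
    de.finish();
    while let DeserializerOutput::RecordBatch(batch) = de.next() {
        out.push(batch);
    }
    out
}
```

   In this sketch a line split across two chunks is only emitted once its 
terminating newline arrives, and any leftover data is flushed after `finish`.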
   
   ## Are these changes tested?
   
   Yes, the changes are covered by new tests added to the CSV and JSON modules.
   
   ## Are there any user-facing changes?
   
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

