Ilia Khaustov created AVRO-2105:
-----------------------------------
Summary: Using DataFileWriter in append mode with write-only file IO
Key: AVRO-2105
URL: https://issues.apache.org/jira/browse/AVRO-2105
Project: Avro
Issue Type: Improvement
Components: python
Environment: Python 2/3
Reporter: Ilia Khaustov
Priority: Minor
*Problem*: DataFileWriter supports "create" and "append" modes. "Append" mode
is triggered by passing None as the schema to the constructor. In this mode the
given writer file object must also be readable, because the internal logic
reads the metadata from that same file. A file opened in "ab+" mode works, but
one opened in "ab" mode raises IOError.
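
The difference between the two modes can be reproduced without Avro at all. The
following is a minimal sketch using only the standard library; the helper name
can_read and the placeholder bytes are illustrative assumptions, not part of
Avro.

```python
import os
import tempfile


def can_read(path, mode):
    """Return True if a file opened with `mode` allows read()."""
    with open(path, mode) as f:
        try:
            f.seek(0)
            f.read(4)
            return True
        except IOError:
            # Python 2 raises IOError here; on Python 3 the error is
            # io.UnsupportedOperation, a subclass of OSError (alias IOError).
            return False


# Create a small file holding placeholder bytes (hypothetical content).
fd, path = tempfile.mkstemp()
os.write(fd, b"placeholder header")
os.close(fd)

append_plus_readable = can_read(path, "ab+")  # append and read
append_only_readable = can_read(path, "ab")   # append only
os.remove(path)
```

Any code that tries to read existing metadata through an "ab" handle fails the
same way.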
*Practical example*: I use Avro serialization in Python with LZMA compression
for the serialized files. The LZMA library provides a file-like class, LZMAFile,
that either compresses data as it is written to disk or decompresses a file as
it is read. It does not support "+" modes: a given LZMAFile does compression or
decompression, not both. This blocks a straightforward implementation of
appending to compressed Avro files, even though both LZMAFile and
DataFileWriter support appending.
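
The LZMAFile behaviour described above can be checked with a short sketch. It
assumes the Python 3 standard-library lzma module (a Python 2 backport may
differ), and the file contents are made up for illustration.

```python
import lzma
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".xz")
os.close(fd)

# Create the file, then reopen it in append mode and add more data.
with lzma.LZMAFile(path, "wb") as f:
    f.write(b"first;")
with lzma.LZMAFile(path, "ab") as f:
    f.write(b"second;")
    try:
        f.read(1)
        append_handle_readable = True
    except IOError:
        # A handle opened for writing cannot be read from.
        append_handle_readable = False

# "+" modes are rejected outright when the object is constructed.
try:
    lzma.LZMAFile(path, "ab+")
    plus_mode_accepted = True
except ValueError:
    plus_mode_accepted = False

# A separate read handle sees both writes as one logical stream.
with lzma.LZMAFile(path, "rb") as f:
    roundtrip = f.read()
os.remove(path)
```

Appending works and the result reads back as one stream, but no single handle
can both read and write.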
*Possible solution*: Add a "reader" kwarg to the DataFileWriter constructor that
would be used instead of "writer" for reading metadata in "append" mode. If it
is not given, "reader" defaults to "writer" for compatibility.
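
A rough sketch of how the proposed kwarg could behave is shown below.
SketchWriter is a toy stand-in and not the real DataFileWriter: its one-line
header is a made-up simplification of the Avro file metadata, and the Python 3
lzma module is assumed.

```python
import lzma
import os
import tempfile


class SketchWriter:
    """Toy stand-in for DataFileWriter (hypothetical, not the Avro class)."""

    def __init__(self, writer, schema=None, reader=None):
        self.writer = writer
        if schema is None:
            # "Append" mode: read the metadata through `reader`, falling
            # back to `writer` so existing "ab+" callers keep working.
            source = reader if reader is not None else writer
            source.seek(0)
            self.header = source.readline()
        else:
            # "Create" mode: write a simplified one-line header.
            self.header = schema.encode("utf-8") + b"\n"
            writer.write(self.header)

    def append(self, record):
        self.writer.write(record)


fd, path = tempfile.mkstemp(suffix=".xz")
os.close(fd)

# Create the compressed file with a write-only handle.
with lzma.LZMAFile(path, "wb") as out:
    SketchWriter(out, schema='"string"').append(b"a\n")

# Append with a write-only handle, reading metadata from a second handle.
with lzma.LZMAFile(path, "rb") as src, lzma.LZMAFile(path, "ab") as out:
    w = SketchWriter(out, reader=src)
    w.append(b"b\n")
    recovered_header = w.header

with lzma.LZMAFile(path, "rb") as f:
    contents = f.read()
os.remove(path)
```

Defaulting "reader" to "writer" leaves current callers untouched, and
write-only file objects such as LZMAFile can supply a separate read handle.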
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)