[ 
https://issues.apache.org/jira/browse/AVRO-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilia Khaustov updated AVRO-2105:
--------------------------------
    Description: 
*Problem*: DataFileWriter supports "create" and "append" modes. "Append" mode 
is triggered by passing None as the schema to the constructor. In this case, 
the given file writer must also allow reading, because the internal logic 
reads meta information from the given file. If the file was opened in "ab+" 
mode, this works, but in "ab" mode it raises IOError.

*Practical example*: I use Avro serialization in Python with LZMA compression 
for serialized files. The LZMA library provides a file-like class, LZMAFile, 
for compressing data from memory to disk, or for reading a compressed file as 
a decompressed stream. It doesn't support "+" modes: only compression or 
decompression, not both. This looks like a blocker for a straightforward 
implementation of appending to compressed Avro objects, even though both 
LZMAFile and DataFileWriter support appending.

*Possible solution*: Add a "reader" kwarg to the DataFileWriter constructor 
that would be used instead of "writer" in "append" mode for reading metadata. 
If not given, "reader" would default to "writer" for compatibility.
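
To illustrate the failure mode described above, here is a minimal sketch that 
uses only the Python 3 standard library (no Avro involved). The helper name 
try_read and the temporary file names are assumptions made for this example. 
It shows that a handle opened in "ab" mode rejects reads, that "ab+" accepts 
them, and that LZMAFile in append mode is write-only and has no "+" variant.

```python
import io
import lzma
import os
import tempfile


def try_read(handle):
    """Return None if a 1-byte read succeeds, else the exception raised."""
    try:
        handle.read(1)
    except (OSError, ValueError) as exc:
        return exc
    return None


tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "plain.bin")
xz_path = os.path.join(tmpdir, "data.xz")

# Create both files with some initial content.
with open(plain_path, "wb") as f:
    f.write(b"header")
with lzma.open(xz_path, "wb") as f:
    f.write(b"header")

# "ab" is write-only, so any attempt to read fails.
with open(plain_path, "ab") as f:
    plain_ab_error = try_read(f)

# "ab+" allows reading and appending on the same handle.
with open(plain_path, "ab+") as f:
    f.seek(0)
    plain_abplus_error = try_read(f)

# LZMAFile in append mode can only compress; it cannot be read from.
with lzma.LZMAFile(xz_path, "a") as f:
    lzma_readable = f.readable()
    lzma_error = try_read(f)

# There is no "+" mode to fall back on.
try:
    lzma.LZMAFile(xz_path, "a+")
    lzma_plus_error = None
except ValueError as exc:
    lzma_plus_error = exc
```

io.UnsupportedOperation inherits from OSError, which is the same class as 
IOError in Python 3, so this matches the error named in the problem statement.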

  was:
*Problem*: DataFileWriter supports "create" and "append" modes. "Append" mode 
can be triggered by passing schema as None to constructor. In this case, it is 
required from given file writer to allow reading as well - internal logic 
relies on reading meta information from given file. If it was opened in "ab+" 
mode it works, but in "ab" it will raise IOError.

*Practical example*: I use Avro serialization in Python with LZMA compression 
for serialized files. LZMA library provides a file-like class LZMAFile for 
writing uncompressed data from memory to disk, or reading compressed file to 
decompressed stream. It doesn't support "+" modes - only compression or 
decompression, not both. This looks like a blocker for straight-forward 
implementation of appending to compressed Avro objects. However, LZMAFile 
supports appending, so does DataFileWriter.

*Possible solution*: Add "reader" kwarg to DataFileWriter constructor that 
would be used instead of "writer" in "append" mode for reading metadata. If not 
given, "reader' set to "writer" for compatibility.


> Using DataFileWriter in append mode with write-only file IO 
> ------------------------------------------------------------
>
>                 Key: AVRO-2105
>                 URL: https://issues.apache.org/jira/browse/AVRO-2105
>             Project: Avro
>          Issue Type: Improvement
>          Components: python
>         Environment: Python 2/3
>            Reporter: Ilia Khaustov
>            Priority: Minor
>              Labels: python
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> *Problem*: DataFileWriter supports "create" and "append" modes. "Append" mode 
> is triggered by passing None as the schema to the constructor. In this case, 
> the given file writer must also allow reading, because the internal logic 
> reads meta information from the given file. If the file was opened in "ab+" 
> mode, this works, but in "ab" mode it raises IOError.
> *Practical example*: I use Avro serialization in Python with LZMA compression 
> for serialized files. The LZMA library provides a file-like class, LZMAFile, 
> for compressing data from memory to disk, or for reading a compressed file as 
> a decompressed stream. It doesn't support "+" modes: only compression or 
> decompression, not both. This looks like a blocker for a straightforward 
> implementation of appending to compressed Avro objects, even though both 
> LZMAFile and DataFileWriter support appending.
> *Possible solution*: Add a "reader" kwarg to the DataFileWriter constructor 
> that would be used instead of "writer" in "append" mode for reading metadata. 
> If not given, "reader" would default to "writer" for compatibility.
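
The proposed split between a reader and a writer can be sketched as follows. 
This is not Avro code: ToyAppendWriter is a hypothetical stand-in, and its 
one-line JSON header is an assumed toy format that only mimics the idea of 
metadata stored at the start of a container file. The sketch shows the 
suggested fallback (reader defaults to writer) and how a separate read-only 
handle lets a write-only LZMAFile be used for appending.

```python
import json
import lzma
import os
import tempfile


class ToyAppendWriter:
    """Hypothetical stand-in for a container writer; not Avro's DataFileWriter."""

    def __init__(self, writer, meta=None, reader=None):
        self.writer = writer
        if meta is not None:
            # "Create" mode: write the header (toy format: one JSON line).
            self.meta = meta
            writer.write(json.dumps(meta).encode("utf-8") + b"\n")
        else:
            # "Append" mode: read the header from `reader`, falling back to
            # `writer` so callers that pass a read/write handle keep working.
            source = writer if reader is None else reader
            self.meta = json.loads(source.readline().decode("utf-8"))

    def append(self, record):
        # Records are JSON lines in this toy format.
        self.writer.write(json.dumps(record).encode("utf-8") + b"\n")


path = os.path.join(tempfile.mkdtemp(), "records.xz")

# Create the file with a header and one record.
with lzma.open(path, "wb") as w:
    ToyAppendWriter(w, meta={"schema": "toy-v1"}).append({"n": 1})

# Append using two handles: read-only for the header, write-only for data.
with lzma.open(path, "rb") as r, lzma.open(path, "ab") as w:
    appender = ToyAppendWriter(w, reader=r)
    appender.append({"n": 2})

# Read everything back; LZMAFile decodes the concatenated streams in order.
with lzma.open(path, "rb") as r:
    lines = [json.loads(line) for line in r]
```

Defaulting reader to writer keeps existing call sites unchanged, which is the 
compatibility behaviour the proposal asks for. Whether the real constructor 
could adopt this shape depends on how it reads the existing file's metadata.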



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
