[ 
https://issues.apache.org/jira/browse/AVRO-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340963#comment-14340963
 ] 

Alexander Hasha commented on AVRO-1286:
---------------------------------------

Has anyone thought any more about this recently?  I'm looking at this issue for 
my own purposes.  As far as I can tell so far, the calls to `seek` are not 
inherently necessary for parsing the data stream.  There is one seek to 
determine the file length, but that looks like a convenience for detecting 
when the end of the file has been reached.  (On a stream you can detect that 
fairly easily: a read returns no data.)  You do need to seek backwards by 
`SYNC_SIZE`, but it seems like this could be accomplished by buffering a whole 
number of blocks in memory, not necessarily the whole file.

I'd like to give this a shot, but am worried I'm failing to understand an 
important detail.  
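
To make the idea concrete, here is a rough sketch of the buffering approach, 
not the actual avro reader code.  The class `LookbackReader` and its methods 
`rewind` and `at_eof` are hypothetical names invented for illustration.  The 
sketch keeps only the last `SYNC_SIZE` bytes rather than whole blocks, and it 
assumes the reader never needs to step back further than that; `SYNC_SIZE` is 
set to 16 here, matching the length of an Avro sync marker.

```python
import io

SYNC_SIZE = 16  # length of an Avro sync marker, in bytes


class LookbackReader:
    """Hypothetical wrapper: bounded rewind and EOF checks without seek()."""

    def __init__(self, raw, lookback=SYNC_SIZE):
        self._raw = raw            # any binary stream, seekable or not
        self._lookback = lookback  # how far back rewind() may go
        self._tail = b""           # most recently consumed bytes
        self._pending = b""        # bytes pushed back, served before raw data

    def read(self, n):
        # Serve pushed-back bytes first, then pull from the underlying stream.
        out = self._pending[:n]
        self._pending = self._pending[n:]
        while len(out) < n:
            chunk = self._raw.read(n - len(out))
            if not chunk:          # an empty read means end of stream
                break
            out += chunk
        # Remember at most `lookback` trailing bytes for a later rewind.
        combined = self._tail + out
        self._tail = combined[max(0, len(combined) - self._lookback):]
        return out

    def rewind(self, n):
        # Replaces a backwards seek of n bytes relative to the current position.
        if n > len(self._tail):
            raise ValueError("cannot rewind past the lookback buffer")
        cut = len(self._tail) - n
        self._pending = self._tail[cut:] + self._pending
        self._tail = self._tail[:cut]

    def at_eof(self):
        # Peek one byte instead of comparing tell() against the file length.
        if self._pending:
            return False
        byte = self._raw.read(1)
        if not byte:
            return True
        self._pending = byte
        return False


if __name__ == "__main__":
    reader = LookbackReader(io.BytesIO(bytes(range(40))))
    reader.read(20)
    reader.rewind(SYNC_SIZE)       # step back over the last 16 bytes
    print(reader.read(SYNC_SIZE) == bytes(range(4, 20)))
```

On Python 3 the same wrapper could be placed around `sys.stdin.buffer`.  
Whether the real reader's seeks all fit this pattern is exactly the detail I 
have not verified.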

> Python script avro cat should be able to read from stdin
> --------------------------------------------------------
>
>                 Key: AVRO-1286
>                 URL: https://issues.apache.org/jira/browse/AVRO-1286
>             Project: Avro
>          Issue Type: Bug
>          Components: python
>            Reporter: Uri Laserson
>            Priority: Minor
>
> Currently, you have to specify a target file on the command line.  But it 
> would be nice to be able to stream data through avro cat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
