[ 
https://issues.apache.org/jira/browse/AVRO-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716018#action_12716018
 ] 

Doug Cutting commented on AVRO-25:
----------------------------------

This is getting really close.  A few more things:
 - Some of the files use CRLF as EOL instead of just LF, which confuses diff 
(e.g., ValueReader).  Can you please fix them to all use LF?  Thanks!
 - I'd rather not make ValueReader/Writer abstract in this patch, since it's 
not required.  Let's hold off on that for a later patch which can separately 
motivate it, and keep this patch focused on the addition of blocking.
 - If you're moving files in subversion, it's best to provide a shell script 
that does the 'svn mv' and is run before the patch is applied.  And, if you 
use 'svn mv', the patch should then show changes to the moved files.  (FWIW, 
these files were originally in the io package, but Raymie argued they should go 
in ipc so I moved them there.)  It might be simplest if we didn't move these at 
all in this patch, but rather left that to a separate patch.
 - Finally, you've changed GenericDatumReader/Writer's API so that these no 
longer permit arbitrary classes to represent strings and bytes.  The idea is 
that, as with records, one can use arbitrary base classes to represent these by 
overriding {read/write}{String/Bytes}, since some applications may not want to 
use Utf8 and ByteBuffer.  ValueReader/Writer take Utf8 and ByteBuffer, but an 
application can override these GenericDatumReader/Writer methods to convert 
between Utf8 and ByteBuffer and its own chosen representation.  Is there a 
reason that the addition of blocking should change this?
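
To illustrate the override pattern from the last point, here is a minimal 
sketch.  SketchDatumReader and StringDatumReader are hypothetical stand-ins, 
and the readString signature is an assumption made for illustration, not the 
actual GenericDatumReader API.  A raw ByteBuffer plays the role of Utf8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a generic datum reader; not Avro's actual class.
class SketchDatumReader {
    // Default hook: keep the string as raw UTF-8 bytes (the role Utf8 plays).
    protected Object readString(ByteBuffer utf8Bytes) {
        return utf8Bytes;
    }

    // Reads one string datum by delegating to the overridable hook.
    public Object read(ByteBuffer utf8Bytes) {
        return readString(utf8Bytes.duplicate());
    }
}

// An application that prefers java.lang.String overrides only the hook.
class StringDatumReader extends SketchDatumReader {
    @Override
    protected Object readString(ByteBuffer utf8Bytes) {
        return StandardCharsets.UTF_8.decode(utf8Bytes).toString();
    }
}
```

The low-level reader keeps a single wire-level type, and only the subclass 
decides which in-memory representation the application sees.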

Thanks for bearing with me!


> Blocking for value output (with API change)
> -------------------------------------------
>
>                 Key: AVRO-25
>                 URL: https://issues.apache.org/jira/browse/AVRO-25
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Raymie Stata
>            Assignee: Thiruvalluvan M. G.
>         Attachments: AVRO-25.patch, AVRO-25.patch, AVRO-25.patch, 
> AVRO-25.patch
>
>
> The Avro specification has provisions for decomposing very large arrays and 
> maps into "blocks."  These provisions allow for streaming implementations 
> that would allow one to, for example, write the contents of a file out as an 
> Avro array w/out knowing in advance how many records are in the file.
> The current Java implementation of Avro does not support this provision.  My 
> colleague Thiru will be attaching a patch which implements blocking.  It 
> turns out that the buffering required to do blocking is non-trivial, so it 
> seems beneficial to include a standard implementation of blocking as part of 
> the reference Avro implementation.
> This is an early version of the code.  We are still working on testing and 
> performance tuning.  But we wanted early feedback.
> This patch also includes a new set of classes called ValueInput and 
> ValueOutput, which are meant to replace ValueReader and ValueWriter.  These 
> classes have largely the same API as ValueReader/Writer, but they include a 
> few more methods to "bracket" items that appear inside of arrays and maps.  
> Shortly, we'll be posting a separate patch which implements further 
> subclasses of ValueInput/Output that do "validation" of input and output 
> against a schema (and also do automatic schema resolution for readers).
> We're implementing these classes separate from ValueInput/Output to allow you 
> to kick our tires w/out causing too much disruption to your source trees.  
> Let's validate the basic idea behind these patches first, and then determine 
> the details of integrating them into the rest of Avro.
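
For readers unfamiliar with the blocked encoding the description refers to, 
here is a rough sketch of the idea, not the code in the attached patch.  Per 
the Avro specification, an array is written as a series of blocks, each a long 
item count followed by that many items, and a block with count zero ends the 
array.  BlockedArraySketch, writeArray and blockSize are hypothetical names.  
Only the positive-count form is shown; the specification also allows a negative 
count followed by the block's size in bytes.

```java
import java.io.ByteArrayOutputStream;
import java.util.Iterator;

// Hypothetical sketch of blocked array output; not Avro's implementation.
class BlockedArraySketch {
    // Writes a long using zig-zag, variable-length encoding.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Encodes an array of longs without knowing its length in advance,
    // buffering at most blockSize items before emitting a block.
    static byte[] writeArray(Iterator<Long> items, int blockSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream block = new ByteArrayOutputStream();
        int count = 0;
        while (items.hasNext()) {
            writeLong(block, items.next());
            if (++count == blockSize) {
                flushBlock(out, block, count);
                count = 0;
            }
        }
        if (count > 0) {
            flushBlock(out, block, count);
        }
        writeLong(out, 0); // a zero count terminates the array
        return out.toByteArray();
    }

    // Emits the item count, then the buffered items, and clears the buffer.
    private static void flushBlock(ByteArrayOutputStream out,
                                   ByteArrayOutputStream block, int count) {
        writeLong(out, count);
        out.write(block.toByteArray(), 0, block.size());
        block.reset();
    }
}
```

The buffering is needed because each block's count precedes its items, so the 
writer has to hold a block's worth of encoded items before it can emit them.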

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
