[ 
https://issues.apache.org/jira/browse/AVRO-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168735#comment-13168735
 ] 

Douglas Creager commented on AVRO-957:
--------------------------------------

I've had a chance to look through this, and I think it's very good.  I only 
have a couple of questions/comments, and then I can commit this.

First, there are a couple of places where it looks like you're calling 
avro_realloc once for each block in a file.  (In codec.c:encode_deflate and 
datafile.c:file_read_block_count.)  You should first check that the buffer 
isn't already big enough to hold the requested space.  (avro_realloc doesn't do 
this for you; it does a realloc each time you call it.)  That will save some 
memory allocation overhead when reading from huge files.

Second, in codec.c:avro_codec, can you add an avro_set_error call in the 
switch's default block?  Something like "Unknown codec %s".

Third, CMake actually has some built-in logic for finding zlib, and we can use 
pkg-config to find liblzma.  I've created a patch that does this, which I'll 
attach after I finish typing in this comment.

Lastly, there are some nit-picky whitespace inconsistencies; since you're using 
git, you can use the --color option to "git diff" or "git show" to see some of 
them.  (It'll show trailing whitespace in super-bright red.)  And there are 
some places where things aren't lining up when you've got multiple lines inside 
of some parentheses; make sure you're using real tabs, and that your tab stops 
are at 8 spaces, when you line things up.

Other than that, this looks great.  As we mentioned on the dev mailing list, 
we'll need to have an actual patch file uploaded to this ticket, marked that 
you grant permission to the ASF to include it in the codebase, before we can 
merge the changes into SVN.  (We can't use the git branch or github Pull 
Request directly — the patches on this JIRA ticket are how ASF keeps track of 
the copyright assignment for legal purposes.)  It looks like the best way to do 
this will be to change your github Pull Request to use the codec branch.  Then 
you can use the https://github.com/apache/avro/pull/5.patch link to obtain a 
patch file that you can upload.
                
> Added Codec Support to Avro-C
> -----------------------------
>
>                 Key: AVRO-957
>                 URL: https://issues.apache.org/jira/browse/AVRO-957
>             Project: Avro
>          Issue Type: New Feature
>          Components: c
>            Reporter: Lucas Martin-King
>              Labels: c, codec, deflate
>
> We (Experian Hitwise) have added codec support to avro-c (as per the avro 
> spec), with preliminary deflate/inflate support (as well as null codec 
> support).
> This changes the way blocks are written to the file, with the block data 
> being passed through the codec interface, before being written to file.
> This also changes the way blocks are read from a file, with the block data 
> first being read into a buffer, before being passed through the codec 
> interface, then a memory reader is set to the decoded data, which is read by 
> the user calls avro_file_reader_read_value() and avro_file_reader_read()
> Please feel free to make changes, as although I did try to emulate the coding 
> style of the rest of the avro c library, there may be things I've done 
> "differently" :-)
> Code is available from Github:  https://github.com/hitwise/avro  (branch: 
> codec)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


Reply via email to