[
https://issues.apache.org/jira/browse/AVRO-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715358#comment-16715358
]
ASF subversion and git services commented on AVRO-1167:
-------------------------------------------------------
Commit 1f00b1a7a2f76921325ba073de3212a4a22524de in avro's branch
refs/heads/branch-1.8 from John Gill
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=1f00b1a ]
AVRO-1167, AVRO-766: Fix c AVRO_LINK memory leaks (#217)
* AVRO-1167: Enhance avro_schema_copy for AVRO_LINK
- Add hash of named schemas found during copy
- Find saved named schema for copy of AVRO_LINK
* AVRO-766: Correct memory leaks in AVRO_LINK copy
- Adds test cases for AVRO-766 & AVRO-1167
- Corrects reference counting for avro_schema_copy
* Enable TEST_AVRO_1167 in test_avro_766
This ensures that both fixes work together and that no valgrind errors are
produced from a recrusive schema.
> Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.
> -------------------------------------------------------------
>
> Key: AVRO-1167
> URL: https://issues.apache.org/jira/browse/AVRO-1167
> Project: Apache Avro
> Issue Type: Bug
> Components: c
> Affects Versions: 1.7.1
> Environment: Ubuntu Linux 11.10
> Reporter: Vivek Nadkarni
> Priority: Major
> Attachments: AVRO-1167-TEST.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> When avro_schema_copy() encounters an AVRO_LINK from an old_schema to a
> new_schema, it sets the target of the new_link to the target of the old_link
> in the old_schema. Thus, the AVRO_LINK in the new_schema points to an element
> in the old_schema.
> While this is currently safe, since the reference count of the target in the
> old_schema is incremented, we are not really making a "copy" of the schema.
> There is a "TODO" in the code, which says that we should make a
> avro_schema_copy() of the target in old_schema instead of linking directly to
> it. However, this solution of making a copy would result in a few problems:
> 1. Avro schemas are intended to be self-contained. That implies that
> AVRO_LINKs are intended to be internal links inside a self-contained schema.
> The code introduces unnecessary (and potentially disallowed) external
> dependencies in an Avro schema.
> 2. The purpose of copying a schema that we want to decouple the old_schema
> from the new_schema. The two copies may have different owners, we may want to
> deallocate old schema etc.
> 3. If the schema is recursive, then the code would enter an infinite
> recursion loop.
> It appears to me that the "correct" solution would be to replicate the entire
> structure of the current schema, including the internal links. This means
> that if old_link_A points to old_target_B, then new_link_A should point to
> new_target_B in the new schema. Note that there should only be one copy of
> new_target_B in the new schema, even if there are multiple links pointing to
> new_target_B - i.e. we should not make a new copy for each link.
> In order to implement this proper copying of links, we would need to keep a
> lookup table of pairs of old and new schemas as they are being created, as
> well as a list of all the AVRO_LINKs that are copied. Then as a post-copy
> step, we would go and fix up all the AVRO_LINKs to point to the appropriate
> targets. This is the way the schema is constructed in the first place in
> avro_schema_from_json().
> An inefficient way to obtain the correct result from avro_schema_copy() would
> be to perform an avro_schema_to_json() followed by an avro_schema_from_json().
> Note: I have not implemented a fix for this issue, but I am documenting this
> issue in AVRO-JIRA because this issue needs to be resolved before AVRO-766
> can be fixed.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)