[
https://issues.apache.org/jira/browse/ZOOKEEPER-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587520#comment-13587520
]
Daniel Lescohier commented on ZOOKEEPER-1519:
---------------------------------------------
You're correct. I didn't have time to test the patch, but I wanted to get it
out there for discussion.
I looked further, and the callers get void* data from the public API with no
length parameter. So, the public API does not allow us to copy the data.
In order to fix it, it looks like a public API change is required. Either:
1. Document in the API that the caller must not free that memory until the
zookeeper library is done with it (which also means it can't be a pointer to
memory on the stack). I assume that the library is done with it once it calls
the completion callback? If so, the program can free it once it gets the same
pointer back in a callback (or when the zookeeper connection is closed). I
think this would make it hard to integrate with scripting languages like
Python, because the scripting language's C interface would have to copy the
memory, track it in some global structure, and free it once it sees that
pointer again in a callback or when the zookeeper connection is closed.
2. Document in the API that the void * must be malloc'ed memory, and the
ownership is passed to the library (which means the caller copies it, and the
library frees it). That's also a difficult API.
3. Add a data length parameter to the API, so the library can copy it.
4. Don't use a void * for the 'data' parameter, use something else.
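
For illustration, the bookkeeping that option 1 would push onto a
scripting-language binding could look roughly like the sketch below. It is
written at the Python level rather than in the binding's C code, and every
name in it (PendingPayloads, submit_async, the send argument) is hypothetical,
not part of ZooKeeper or ZKPython. It also assumes the library is done with
the buffer once the completion callback runs, which is still an open question.

```python
import itertools
import threading


class PendingPayloads:
    """Hypothetical registry that keeps payloads alive until completion."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}
        self._tokens = itertools.count(1)

    def pin(self, payload):
        # Hold a reference so the payload cannot be garbage collected.
        with self._lock:
            token = next(self._tokens)
            self._pending[token] = payload
        return token

    def release(self, token):
        # Drop the reference; harmless if the token was already released.
        with self._lock:
            return self._pending.pop(token, None)

    def release_all(self):
        # Intended for when the connection is closed.
        with self._lock:
            self._pending.clear()

    def __len__(self):
        with self._lock:
            return len(self._pending)


def submit_async(registry, send, payload, on_complete):
    """Pin payload, then call send(payload, callback).

    `send` is a stand-in for an asynchronous call; it is an assumption of
    this sketch, not a real library function.
    """
    token = registry.pin(payload)

    def wrapped(*args):
        # Assumption: the library no longer reads the payload at this point.
        registry.release(token)
        return on_complete(*args)

    try:
        return send(payload, wrapped)
    except Exception:
        # The request was never queued, so nothing will call wrapped().
        registry.release(token)
        raise
```

Here send would be a thin wrapper around an asynchronous call such as the
acreate() in the report quoted below. A caller that gets a failure status back
instead of an exception would also need to release the payload itself, and
release_all() covers the connection-closed case.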
> Zookeeper Async calls can reference free()'d memory
> ---------------------------------------------------
>
> Key: ZOOKEEPER-1519
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1519
> Project: ZooKeeper
> Issue Type: Bug
> Components: c client
> Affects Versions: 3.3.3, 3.3.6
> Environment: Ubuntu 11.10, Ubuntu packaged Zookeeper 3.3.3 with some
> backported fixes.
> Reporter: Mark Gius
> Attachments: zookeeper-1519.patch
>
>
> zoo_acreate() and zoo_aset() take a char * argument for data and prepare a
> call to zookeeper. This char * doesn't seem to be duplicated at any point,
> so the caller of the asynchronous function may free() the char * argument
> before the zookeeper library completes its request. This is unlikely to
> present a real problem unless the freed memory is re-used before zookeeper
> consumes it, which is why I've been unable to reproduce this issue using
> pure C.
> However, ZKPython is a whole different story. Consider this snippet:
>     ok = zookeeper.acreate(handle, path, json.dumps(value),
>                            acl, flags, callback)
>     assert ok == zookeeper.OK
> In this snippet, json.dumps() allocates a string which is passed into the
> acreate(). When acreate() returns, the zookeeper request has been
> constructed with a pointer to the string allocated by json.dumps(). Also
> when acreate() returns, that string is now referenced by 0 things (ZKPython
> doesn't bump the refcount) and the string is eligible for garbage collection
> and re-use. The Zookeeper request now holds a dangling pointer to freed
> memory.
> I've been seeing odd behavior in our development environments for some time
> now, where it appeared as though two separate JSON payloads had been joined
> together. Python has been allocating a new JSON string in the middle of the
> old string that an incomplete zookeeper async call had not yet processed.
> I am not sure if this is a behavior that should be documented, or if the C
> binding implementation needs to be updated to create copies of the data
> payload provided for aset and acreate.