[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user alanconway commented on the pull request: https://github.com/apache/qpid-proton/pull/11#issuecomment-82366307 On Mon, 2015-03-16 at 12:31 -0700, Andrew Stitcher wrote: I think you new pn_data_size() possibly has the wrong side effects - it seems to lose the pn_data cursor position, so that merely querying the size will be hard to intermix with appending to the pn_data. It is true that this is not something the current code ever does does, but I think making the query function have side effects (except caching the size itself perhaps) is unexpected. â Reply to this email directly or view it on GitHub. Fixed commit 973bad033ba3a1b700ab80ab4edee209ab81f05a Author: Alan Conway acon...@redhat.com Commit: Alan Conway acon...@redhat.com NO-JIRA: Restore data position when measuring size of encoded data. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user alanconway commented on the pull request: https://github.com/apache/qpid-proton/pull/11#issuecomment-81791421 Comitted the suggested API upstream. Doesn't do the suggested caching yet. commit fac7c86c8bc818ea845d6426fd85095a189522d6 Author: Alan Conway acon...@redhat.com Commit: Alan Conway acon...@redhat.com NO-JIRA: Measure size of encoded data. Introduce pn_data_size() to calculate the buffer space required to encode a data object. - encoder.c measures full size of data. - interop.py verifies computed size matches actual decode size. - update java/jython compatilbility layer. Code review at https://github.com/apache/qpid-proton/pull/11. Did not (yet) implement size caching as suggested in the review as invalidating the cache is a little involved. Leaving it for another day. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user alanconway closed the pull request at: https://github.com/apache/qpid-proton/pull/11 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user astitcher commented on the pull request: https://github.com/apache/qpid-proton/pull/11#issuecomment-81887757 I think you new pn_data_size() possibly has the wrong side effects - it seems to lose the pn_data cursor position, so that merely querying the size will be hard to intermix with appending to the pn_data. It is true that this is not something the current code ever does does, but I think making the query function have side effects (except caching the size itself perhaps) is unexpected. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user astitcher commented on the pull request: https://github.com/apache/qpid-proton/pull/11#issuecomment-78436876 I'd love to hear some details. Are you suggesting we accumulate the encoded size every time you modify the data? Not especially, but possibly - Iobserve that when you need to resize your buffer you always end up doing at least 2 passes of the data. So in this case it isn't really a big deal to ask for the size after the first attempt failed. That leads me to caching the required size in the pn_data_t whenever we do encode even if (especially if) it fails. So the flow would look like (in python) size = 1024 cd, enc = pn_data_encode(self._data, size) if cd == PN_OVERFLOW: size = pn_data_encoded_size(self._data) cd, enc = pn_data_encode(self._data, size) assert(cd!=PN_OVERFLOW) So no dance any more (at least only the two step, not round -and-round). And if this is not fast enough we can cache away the size during pn_data_encode() for use in pn_data_encoded_size(), remembering to invalidate it if the struct is further changed (not actually likely). Just in case it isn't clear this is better than calculating the size first because there is no wasted work assuming a cached size (at least compared to the existing and your proposed scheme), whereas always asking for the size and calculating it is always extra work if 80% of the time the buffer would have been big enough anyway. This isn't so far from your proposal, I just think the API doesn't suck ;-) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user alanconway commented on the pull request: https://github.com/apache/qpid-proton/pull/11#issuecomment-78482514 On Thu, 2015-03-12 at 00:52 -0700, Andrew Stitcher wrote: I'd love to hear some details. Are you suggesting we accumulate the encoded size every time you modify the data? Not especially, but possibly - Iobserve that when you need to resize your buffer you always end up doing at least 2 passes of the data. So in this case it isn't really a big deal to ask for the size after the first attempt failed. That leads me to caching the required size in the pn_data_t whenever we do encode even if (especially if) it fails. So the flow would look like (in python) size = 1024 cd, enc = pn_data_encode(self._data, size) if cd == PN_OVERFLOW: size = pn_data_encoded_size(self._data) cd, enc = pn_data_encode(self._data, size) assert(cd!=PN_OVERFLOW) So no dance any more (at least only the two step, not round -and-round). And if this is not fast enough we can cache away the size during pn_data_encode() for use in pn_data_encoded_size(), remembering to invalidate it if the struct is further changed (not actually likely). Just in case it isn't clear this is better than calculating the size first because there is no wasted work assuming a cached size (at least compared to the existing and your proposed scheme), whereas always asking for the size and calculating it is always extra work if 80% of the time the buffer would have been big enough anyway. This isn't so far from your proposal, I just think the API doesn't suck ;-) Agreed, same behavior nicer API. Will rework re-post. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user astitcher commented on the pull request: https://github.com/apache/qpid-proton/pull/11#issuecomment-78280044 Thinking about it a bit more, I think I particularly object to the new public API. What you really want at an API level here is simply a function to tell you how many bytes you need to allocate, not a strange combined encode, but if you can't tell how many bytes I would need. FWIW, I think that there are some simple strategies that can get you the needed size out of a pn_data_t with a little extra being held in that structure, withotu having to do the full encode. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user alanconway commented on the pull request: https://github.com/apache/qpid-proton/pull/11#issuecomment-78340239 On Wed, 2015-03-11 at 14:26 -0400, Alan Conway wrote: On Wed, 2015-03-11 at 08:00 -0700, Andrew Stitcher wrote: Thinking about it a bit more, I think I particularly object to the new public API. What you really want at an API level here is simply a function to tell you how many bytes you need to allocate, not a strange combined encode, but if you can't tell how many bytes I would need. Not that strange BTW, this is what snprintf does. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
Github user alanconway commented on the pull request: https://github.com/apache/qpid-proton/pull/11#issuecomment-78339832 On Wed, 2015-03-11 at 08:00 -0700, Andrew Stitcher wrote: Thinking about it a bit more, I think I particularly object to the new public API. What you really want at an API level here is simply a function to tell you how many bytes you need to allocate, not a strange combined encode, but if you can't tell how many bytes I would need. A simpler pn_data_encoded_size() API would be easy to do but potentially requires scanning the data structure twice to encode... FWIW, I think that there are some simple strategies that can get you the needed size out of a pn_data_t with a little extra being held in that structure, withotu having to do the full encode. I'd love to hear some details. Are you suggesting we accumulate the encoded size every time you modify the data? The motivation for all this is to avoid the guess and double allocation dance that the bindings currently do, e.g. in python: size = 1024 while True: cd, enc = pn_data_encode(self._data, size) if cd == PN_OVERFLOW: size *= 2 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] qpid-proton pull request: NO-JIRA: Measure size of encoded data.
GitHub user alanconway opened a pull request: https://github.com/apache/qpid-proton/pull/11 NO-JIRA: Measure size of encoded data. Introduce pn_data_encode2 which allows you to both encode data and, if you don't have enough space, find out how much space you need. You can also use this to simply find out how much space is required to the data. This supports two patterns of use: 1. Check how much space you need, allocate the space and encode. 2. Try to encode into a buffer you already have. If that doesn't work, re-allocate to the correct size and encode. Eliminates the need for guessing and buffer doubling strategies, which are inefficient given that we know exactly what is in the data. The original pn_data_encode is unmodified for compatibility. You can merge this pull request into a Git repository by running: $ git pull https://github.com/alanconway/qpid-proton encode-size Alternatively you can review and apply these changes as the patch at: https://github.com/apache/qpid-proton/pull/11.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11 commit f117ef5dee718ee6ea6af49e32750eecd81a1d12 Author: Alan Conway acon...@redhat.com Date: 2015-03-10T16:16:32Z NO-JIRA: Measure size of encoded data. Introduce pn_data_encode2 which allows you to both encode data and, if you don't have enough space, find out how much space you need. You can also use this to simply find out how much space is required to the data. This supports two patterns of use: 1. Check how much space you need, allocate the space and encode. 2. Try to encode into a buffer you already have. If that doesn't work, re-allocate to the correct size and encode. Eliminates the need for guessing and buffer doubling strategies, which are inefficient given that we know exactly what is in the data. The original pn_data_encode is unmodified for compatibility. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---