[
https://issues.apache.org/jira/browse/COUCHDB-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838738#comment-13838738
]
Nick North commented on COUCHDB-1953:
-------------------------------------
Forgot to link the pull request: it's
https://github.com/apache/couchdb/pull/115.
Further to Alexander's question about how much faster the new code is: with
4KB blocks of data and a 40-character MIME boundary, for each block that does
not contain the boundary the existing code scans almost the whole block 40
times. The new code scans the block once and then scans the last 40 characters
once (plus possible further scans of those last 40 characters if the first
character of the boundary appears there). Furthermore, the existing code
records a partial match if any prefix of the boundary occurs almost anywhere
in the block. That restarts the scanning process on the remainder of the
block with a new 4KB block tacked onto the end, which becomes very slow if
"-" occurs frequently in an attachment.
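
To make the scanning strategy described above concrete, here is a minimal
sketch in Python. It is an illustration only, not the Erlang code in the pull
request: the function name find_boundary and its return values are
hypothetical, chosen for this example. It assumes the approach is "search the
whole block once, then check only the tail for a prefix of the boundary".

```python
def find_boundary(data: bytes, boundary: bytes):
    """Locate a MIME boundary in one block of data (illustrative sketch).

    Returns ("exact", pos) if the whole boundary occurs in the block,
    ("partial", pos) if the block ends with a proper prefix of the boundary
    starting at pos, and ("not_found", None) otherwise.
    """
    # One pass over the whole block looking for the complete boundary.
    pos = data.find(boundary)
    if pos != -1:
        return ("exact", pos)

    # A boundary split across two blocks can only start in the final
    # len(boundary) - 1 bytes, so only that tail needs a second look.
    tail_start = max(0, len(data) - len(boundary) + 1)
    first = boundary[:1]

    # Try each occurrence of the boundary's first byte within the tail.
    i = data.find(first, tail_start)
    while i != -1:
        if boundary.startswith(data[i:]):
            return ("partial", i)
        i = data.find(first, i + 1)
    return ("not_found", None)
```

With this shape, a "-" in the middle of a block never produces a partial
match; only a "-" in the tail triggers the extra prefix check.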
> Speed up parsing of multipart/related requests
> ----------------------------------------------
>
> Key: COUCHDB-1953
> URL: https://issues.apache.org/jira/browse/COUCHDB-1953
> Project: CouchDB
> Issue Type: Improvement
> Components: HTTP Interface
> Reporter: Nick North
>
> Parsing of multipart/related requests searches for the MIME boundary string
> using the couch_httpd:find_in_binary/2 function, which can be made more
> efficient.
> When the boundary string is not found in its entirety in the search data, the
> function should then look to see if the data ends with a prefix of the
> string, but it currently looks for any prefix of the string almost anywhere
> in the search data.
> A pull request to fix this will be submitted shortly.