[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries
On Fri, 14 Dec 2012, david at tethera.net wrote:
> From: David Bremner
>
> The query is split into tokens, with ' ' and ':' as delimiters. Any
> token containing some hex-escaped character is quoted according to
> Xapian rules. This maps id:foo%20%22bar to id:"foo ""bar".
> This intentionally does not quote prefixes, so they still work as prefixes.
> ---
>  tag-util.c | 50 ++
>  1 file changed, 50 insertions(+)
>
> diff --git a/tag-util.c b/tag-util.c
> index f89669a..e1181f8 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -56,6 +56,56 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
>      return NULL;
>  }
>
> +static tag_parse_status_t
> +quote_and_decode_query (void *ctx, char *encoded, const char *line_for_error,
> +                        char **query_string)
> +{
> +    char *tok = encoded;
> +    size_t tok_len = 0;
> +    char *buf = NULL;
> +    size_t buf_len = 0;
> +    tag_parse_status_t ret = TAG_PARSE_SUCCESS;
> +
> +    *query_string = talloc_strdup (ctx, "");
> +
> +    while (*query_string &&
> +           (tok = strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {

strtok_len() will eat all the leading delimiters at each call, and will
not return a zero-length token if you have multiple consecutive
delimiters. Which means you may end up losing stuff here. Whether that
matters or not I'm too tired to tell...

BR,
Jani.

> +        char delim = tok[tok_len];
> +
> +        *(tok + tok_len++) = '\0';
> +
> +        if (strcspn (tok, "%") < tok_len - 1) {
> +            /* something to decode */
> +            if (hex_decode_inplace (tok) != HEX_SUCCESS) {
> +                ret = line_error (TAG_PARSE_INVALID, line_for_error,
> +                                  "hex decoding of token '%s' failed", tok);
> +                goto DONE;
> +            }
> +
> +            if (double_quote_str (ctx, tok, &buf, &buf_len)) {
> +                ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> +                                  line_for_error, "aborting");
> +                goto DONE;
> +            }
> +            *query_string = talloc_asprintf_append_buffer (
> +                *query_string, "%s%c", buf, delim);
> +
> +        } else {
> +            /* This is not just an optimization, but used to preserve
> +             * prefixes like id:, which cannot be quoted.
> +             */
> +            *query_string = talloc_asprintf_append_buffer (
> +                *query_string, "%s%c", tok, delim);
> +        }
> +
> +    }
> +
> +  DONE:
> +    if (ret != TAG_PARSE_SUCCESS && *query_string)
> +        talloc_free (*query_string);
> +    return ret;
> +}
> +
>  tag_parse_status_t
>  parse_tag_line (void *ctx, char *line,
>                  tag_op_flag_t flags,
> --
> 1.7.10.4
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
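The delimiter-eating behaviour described above can be reproduced outside notmuch. The following is only a sketch: `sketch_strtok_len` is a hypothetical stand-in written from the description in this message (skip leading delimiters, never return an empty token), not the real notmuch helper, and `rejoin_like_patch` is an illustrative name for a loop that, like the one in the patch, emits each token followed by the single character after it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strtok_len(), modelled on the behaviour
 * described in the review: skip all leading delimiters, and return
 * NULL rather than a zero-length token. */
static const char *
sketch_strtok_len (const char *s, const char *delim, size_t *len)
{
    s += strspn (s, delim);     /* eat every leading delimiter */
    *len = strcspn (s, delim);  /* token runs up to the next delimiter or NUL */
    return *len ? s : NULL;
}

/* Rebuild the input as token + one delimiter character per token,
 * which is what the "%s%c" appends in the patch amount to. */
static void
rejoin_like_patch (const char *in, char *out, size_t out_size)
{
    const char *tok = in;
    size_t tok_len = 0, used = 0;

    /* The output is never longer than the input. */
    assert (out_size > strlen (in));

    while ((tok = sketch_strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {
        memcpy (out + used, tok, tok_len);
        used += tok_len;
        if (tok[tok_len] != '\0')
            out[used++] = tok[tok_len];  /* only ONE delimiter survives */
    }
    out[used] = '\0';
}
```

With a sufficiently large buffer `out`, calling `rejoin_like_patch ("id::foo", out, sizeof out)` leaves `id:foo` in `out`: the second ':' is silently dropped, which is the loss being pointed out.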
[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries
Jani Nikula writes:

> strtok_len() will eat all the leading delimiters at each call, and will
> not return a zero-length token if you have multiple consecutive
> delimiters. Which means you may end up losing stuff here.

Right, I think for ':' it does matter, but it should be fixable with a
little loop to copy ':'s to the query string after the (possibly quoted)
token.
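One way the "little loop" might look is sketched below. This is an assumption about the shape of the fix, not the code that was eventually committed: `sketch_strtok_len` is again a hypothetical stand-in for the notmuch tokenizer, `rejoin_keeping_colons` is an illustrative name, and quoting and decoding are left out so that only the delimiter handling is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strtok_len(): skips leading delimiters
 * and never returns a zero-length token. */
static const char *
sketch_strtok_len (const char *s, const char *delim, size_t *len)
{
    s += strspn (s, delim);
    *len = strcspn (s, delim);
    return *len ? s : NULL;
}

/* Copy each token, then every ':' that directly follows it, so that
 * runs of colons after a token are preserved. A following run of
 * spaces is still collapsed to a single space. */
static void
rejoin_keeping_colons (const char *in, char *out, size_t out_size)
{
    const char *tok = in;
    size_t tok_len = 0, used = 0;

    /* Only characters from the input are copied, so this bound suffices. */
    assert (out_size > strlen (in));

    while ((tok = sketch_strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {
        const char *p = tok + tok_len;

        memcpy (out + used, tok, tok_len);
        used += tok_len;

        while (*p == ':')       /* the "little loop": keep all colons */
            out[used++] = *p++;
        if (*p == ' ')          /* one separator is enough between terms */
            out[used++] = ' ';
    }
    out[used] = '\0';
}
```

Under these assumptions `id::foo` comes back unchanged, while `tag:a  tag:b` is normalised to `tag:a tag:b`. Colons that precede the first token, or that follow a space, are still dropped by the tokenizer; the sketch does not try to address that.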
[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries
On Fri, 14 Dec 2012, david at tethera.net wrote:
> From: David Bremner
>
> The query is split into tokens, with ' ' and ':' as delimiters. Any
> token containing some hex-escaped character is quoted according to
> Xapian rules. This maps id:foo%20%22bar to id:"foo ""bar".
> This intentionally does not quote prefixes, so they still work as prefixes.
> ---
>  tag-util.c | 50 ++
>  1 file changed, 50 insertions(+)
>
> diff --git a/tag-util.c b/tag-util.c
> index f89669a..e1181f8 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -56,6 +56,56 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
>      return NULL;
>  }
>
> +static tag_parse_status_t
> +quote_and_decode_query (void *ctx, char *encoded, const char *line_for_error,
> +                        char **query_string)
> +{

Would decode_and_quote_query be a better name given the order these two
happen? Also a comment describing the function would be nice.

> +    char *tok = encoded;
> +    size_t tok_len = 0;
> +    char *buf = NULL;
> +    size_t buf_len = 0;
> +    tag_parse_status_t ret = TAG_PARSE_SUCCESS;
> +
> +    *query_string = talloc_strdup (ctx, "");
> +
> +    while (*query_string &&
> +           (tok = strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {
> +        char delim = tok[tok_len];
> +
> +        *(tok + tok_len++) = '\0';

These two look a little odd: I would prefer either array or pointer in
both cases.

> +
> +        if (strcspn (tok, "%") < tok_len - 1) {
> +            /* something to decode */
> +            if (hex_decode_inplace (tok) != HEX_SUCCESS) {
> +                ret = line_error (TAG_PARSE_INVALID, line_for_error,
> +                                  "hex decoding of token '%s' failed", tok);
> +                goto DONE;
> +            }
> +
> +            if (double_quote_str (ctx, tok, &buf, &buf_len)) {
> +                ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> +                                  line_for_error, "aborting");
> +                goto DONE;
> +            }
> +            *query_string = talloc_asprintf_append_buffer (
> +                *query_string, "%s%c", buf, delim);
> +
> +        } else {
> +            /* This is not just an optimization, but used to preserve
> +             * prefixes like id:, which cannot be quoted.
> +             */
> +            *query_string = talloc_asprintf_append_buffer (
> +                *query_string, "%s%c", tok, delim);
> +        }

What happens if a message id (for example) contains a ':'? Is a query of
the form

    id:stuff"encoded_stuff"

acceptable? (As far as I can see from the man page ':' does not need to
be in hex.)

Best wishes

Mark

> +
> +    }
> +
> +  DONE:
> +    if (ret != TAG_PARSE_SUCCESS && *query_string)
> +        talloc_free (*query_string);
> +    return ret;
> +}
> +
>  tag_parse_status_t
>  parse_tag_line (void *ctx, char *line,
>                  tag_op_flag_t flags,
> --
> 1.7.10.4
[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries
Mark Walters writes:

> What happens if a message id (for example) contains a ':'? Is a query of
> the form
>
> id:stuff"encoded_stuff"
>
> acceptable? (As far as I can see from the man page ':' does not need to
> be in hex.)

The updated version of the notmuch-dump man page does say that : will be
hex encoded, so I think the fix here is to update the notmuch tag man
page.
[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries
From: David Bremner

The query is split into tokens, with ' ' and ':' as delimiters. Any
token containing some hex-escaped character is quoted according to
Xapian rules. This maps id:foo%20%22bar to id:"foo ""bar".
This intentionally does not quote prefixes, so they still work as prefixes.
---
 tag-util.c | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/tag-util.c b/tag-util.c
index f89669a..e1181f8 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -56,6 +56,56 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
     return NULL;
 }

+static tag_parse_status_t
+quote_and_decode_query (void *ctx, char *encoded, const char *line_for_error,
+                        char **query_string)
+{
+    char *tok = encoded;
+    size_t tok_len = 0;
+    char *buf = NULL;
+    size_t buf_len = 0;
+    tag_parse_status_t ret = TAG_PARSE_SUCCESS;
+
+    *query_string = talloc_strdup (ctx, "");
+
+    while (*query_string &&
+           (tok = strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {
+        char delim = tok[tok_len];
+
+        *(tok + tok_len++) = '\0';
+
+        if (strcspn (tok, "%") < tok_len - 1) {
+            /* something to decode */
+            if (hex_decode_inplace (tok) != HEX_SUCCESS) {
+                ret = line_error (TAG_PARSE_INVALID, line_for_error,
+                                  "hex decoding of token '%s' failed", tok);
+                goto DONE;
+            }
+
+            if (double_quote_str (ctx, tok, &buf, &buf_len)) {
+                ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
+                                  line_for_error, "aborting");
+                goto DONE;
+            }
+            *query_string = talloc_asprintf_append_buffer (
+                *query_string, "%s%c", buf, delim);
+
+        } else {
+            /* This is not just an optimization, but used to preserve
+             * prefixes like id:, which cannot be quoted.
+             */
+            *query_string = talloc_asprintf_append_buffer (
+                *query_string, "%s%c", tok, delim);
+        }
+
+    }
+
+  DONE:
+    if (ret != TAG_PARSE_SUCCESS && *query_string)
+        talloc_free (*query_string);
+    return ret;
+}
+
 tag_parse_status_t
 parse_tag_line (void *ctx, char *line,
                 tag_op_flag_t flags,
--
1.7.10.4
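The mapping given in the commit message, id:foo%20%22bar to id:"foo ""bar", can be checked with a small standalone sketch. Everything below is an assumption made for illustration: the `sketch_*` functions are hypothetical, use plain malloc instead of talloc, and handle a single prefix:value term rather than tokenizing a whole query as the patch does. They are not the notmuch hex_decode_inplace or double_quote_str implementations.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode %XX escapes in place. Returns 0 on success, -1 on a
 * malformed escape. (Hypothetical helper, not the notmuch one.) */
static int
sketch_hex_decode_inplace (char *s)
{
    char *out = s;

    while (*s) {
        if (*s == '%') {
            if (! isxdigit ((unsigned char) s[1]) ||
                ! isxdigit ((unsigned char) s[2]))
                return -1;
            char hex[3] = { s[1], s[2], '\0' };
            *out++ = (char) strtoul (hex, NULL, 16);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
    return 0;
}

/* Return a malloc'd copy of s wrapped in double quotes, with every
 * embedded '"' doubled, as in the commit message's example. */
static char *
sketch_double_quote (const char *s)
{
    char *quoted = malloc (2 * strlen (s) + 3);
    char *out = quoted;

    if (quoted == NULL)
        return NULL;
    *out++ = '"';
    for (; *s; s++) {
        if (*s == '"')
            *out++ = '"';
        *out++ = *s;
    }
    *out++ = '"';
    *out = '\0';
    return quoted;
}

/* Decode and quote the value of one "prefix:value" term, leaving the
 * prefix (up to and including the first ':') untouched. Returns a
 * malloc'd string, or NULL on a bad escape or allocation failure. */
static char *
sketch_decode_and_quote_term (const char *term)
{
    const char *colon = strchr (term, ':');
    size_t prefix_len = colon ? (size_t) (colon - term) + 1 : 0;
    char *value = malloc (strlen (term) + 1);
    char *quoted = NULL, *result = NULL;

    if (value == NULL)
        return NULL;
    if (strchr (term + prefix_len, '%') == NULL) {
        strcpy (value, term);   /* nothing to decode: return a plain copy */
        return value;
    }
    strcpy (value, term + prefix_len);
    if (sketch_hex_decode_inplace (value) == 0 &&
        (quoted = sketch_double_quote (value)) != NULL) {
        result = malloc (prefix_len + strlen (quoted) + 1);
        if (result != NULL) {
            memcpy (result, term, prefix_len);
            strcpy (result + prefix_len, quoted);
        }
    }
    free (quoted);
    free (value);
    return result;
}
```

Passing `id:foo%20%22bar` to `sketch_decode_and_quote_term` yields `id:"foo ""bar"`, matching the commit message, and a term without any `%` such as `tag:inbox` is returned as an unmodified copy. The caller is expected to free the result.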