[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries

2012-12-16 Thread Jani Nikula
On Fri, 14 Dec 2012, david at tethera.net wrote:
> From: David Bremner 
>
> The query is split into tokens, with ' ' and ':' as delimiters.  Any
> token containing some hex-escaped character is quoted according to
> Xapian rules.  This maps id:foo%20%22bar to id:"foo ""bar".
> This intentionally does not quote prefixes, so they still work as prefixes.
> ---
>  tag-util.c |   50 ++
>  1 file changed, 50 insertions(+)
>
> diff --git a/tag-util.c b/tag-util.c
> index f89669a..e1181f8 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -56,6 +56,56 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
>  return NULL;
>  }
>  
> +static tag_parse_status_t
> +quote_and_decode_query (void *ctx, char *encoded, const char *line_for_error,
> + char **query_string)
> +{
> +char *tok = encoded;
> +size_t tok_len = 0;
> +char *buf = NULL;
> +size_t buf_len = 0;
> +tag_parse_status_t ret = TAG_PARSE_SUCCESS;
> +
> +*query_string = talloc_strdup (ctx, "");
> +
> +while (*query_string &&
> +(tok = strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {

strtok_len() will eat all the leading delimiters at each call, and will
not return a zero-length token if you have multiple consecutive
delimiters. Which means you may end up losing stuff here. Whether that
matters or not I'm too tired to tell...

BR,
Jani.
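
A minimal sketch of the behaviour described above. It assumes strtok_len()
skips every leading delimiter and never returns an empty token;
sketch_strtok_len() and rebuild_one_delim() are hypothetical names written
for this illustration, not the notmuch functions. The rebuild loop mirrors
the patch in appending each token plus the single delimiter that follows it.

```c
#include <stddef.h>
#include <string.h>

/* Assumed behaviour of strtok_len(), reimplemented for illustration:
 * skip leading delimiters, then report the length of the next token. */
static const char *
sketch_strtok_len (const char *s, const char *delim, size_t *len)
{
    s += strspn (s, delim);     /* eats ALL leading delimiters */
    *len = strcspn (s, delim);  /* length of the token that follows */
    return *len ? s : NULL;     /* an empty token is never returned */
}

/* Hypothetical helper: rebuild the input the way the loop in the patch
 * does, i.e. token + the one delimiter after it.
 * 'out' must have room for strlen (in) + 1 bytes. */
static void
rebuild_one_delim (const char *in, char *out)
{
    const char *tok = in;
    size_t tok_len = 0;
    size_t n = 0;

    while ((tok = sketch_strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {
        char delim = tok[tok_len];

        memcpy (out + n, tok, tok_len);
        n += tok_len;
        if (delim != '\0')
            out[n++] = delim;   /* only ONE delimiter survives */
    }
    out[n] = '\0';
}
```

With this sketch, rebuilding "id::foo" gives "id:foo": the second ':' is
skipped as a leading delimiter of the next token and never copied.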


> + char delim = tok[tok_len];
> +
> + *(tok + tok_len++) = '\0';
> +
> + if (strcspn (tok, "%") < tok_len - 1) {
> + /* something to decode */
> + if (hex_decode_inplace (tok) != HEX_SUCCESS) {
> + ret = line_error (TAG_PARSE_INVALID, line_for_error,
> +   "hex decoding of token '%s' failed", tok);
> + goto DONE;
> + }
> +
> + if (double_quote_str (ctx, tok, &buf, &buf_len)) {
> + ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> +   line_for_error, "aborting");
> + goto DONE;
> + }
> + *query_string = talloc_asprintf_append_buffer (
> + *query_string, "%s%c", buf, delim);
> +
> + } else {
> + /* This is not just an optimization, but used to preserve
> +  * prefixes like id:, which cannot be quoted.
> +  */
> + *query_string = talloc_asprintf_append_buffer (
> + *query_string, "%s%c", tok, delim);
> + }
> +
> +}
> +
> +  DONE:
> +if (ret != TAG_PARSE_SUCCESS && *query_string)
> + talloc_free (*query_string);
> +return ret;
> +}
> +
>  tag_parse_status_t
>  parse_tag_line (void *ctx, char *line,
>   tag_op_flag_t flags,
> -- 
> 1.7.10.4
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries

2012-12-15 Thread David Bremner
Jani Nikula  writes:

> strtok_len() will eat all the leading delimiters at each call, and will
> not return a zero-length token if you have multiple consecutive
> delimiters. Which means you may end up losing stuff here.

Right, I think for ':' it does matter, but it should be fixable with a
little loop to copy ':'s to the query string after the (possibly quoted)
token.
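
One possible shape for that loop, as a sketch only. sketch_strtok_len() and
rebuild_keep_delims() are hypothetical names, and the tokenizer is assumed to
skip leading delimiters as discussed. Note that this version copies the whole
run of delimiters (spaces included), which goes slightly beyond the ':'-only
loop suggested above; that choice is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Assumed behaviour of strtok_len(), reimplemented for illustration. */
static const char *
sketch_strtok_len (const char *s, const char *delim, size_t *len)
{
    s += strspn (s, delim);
    *len = strcspn (s, delim);
    return *len ? s : NULL;
}

/* Hypothetical helper: copy each token followed by every delimiter
 * that comes after it. 'out' must have room for strlen (in) + 1 bytes. */
static void
rebuild_keep_delims (const char *in, char *out)
{
    const char *tok = in;
    size_t tok_len = 0;
    size_t n = 0;

    while ((tok = sketch_strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {
        const char *p = tok + tok_len;

        /* In the real function the token would be decoded and quoted here. */
        memcpy (out + n, tok, tok_len);
        n += tok_len;

        /* The "little loop": copy the run of delimiters after the token. */
        while (*p == ':' || *p == ' ')
            out[n++] = *p++;
    }
    out[n] = '\0';
}
```

Delimiters that precede the first token are still dropped by this sketch,
since the tokenizer skips them before any token has been seen.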


[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries

2012-12-15 Thread Mark Walters

On Fri, 14 Dec 2012, david at tethera.net wrote:
> From: David Bremner 
>
> The query is split into tokens, with ' ' and ':' as delimiters.  Any
> token containing some hex-escaped character is quoted according to
> Xapian rules.  This maps id:foo%20%22bar to id:"foo ""bar".
> This intentionally does not quote prefixes, so they still work as prefixes.
> ---
>  tag-util.c |   50 ++
>  1 file changed, 50 insertions(+)
>
> diff --git a/tag-util.c b/tag-util.c
> index f89669a..e1181f8 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -56,6 +56,56 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
>  return NULL;
>  }
>  
> +static tag_parse_status_t
> +quote_and_decode_query (void *ctx, char *encoded, const char *line_for_error,
> + char **query_string)
> +{

Would decode_and_quote_query be a better name given the order these two
happen? Also a comment describing the function would be nice.

> +char *tok = encoded;
> +size_t tok_len = 0;
> +char *buf = NULL;
> +size_t buf_len = 0;
> +tag_parse_status_t ret = TAG_PARSE_SUCCESS;
> +
> +*query_string = talloc_strdup (ctx, "");
> +
> +while (*query_string &&
> +(tok = strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {
> + char delim = tok[tok_len];
> +
> + *(tok + tok_len++) = '\0';

These two look a little odd: I would prefer either array or pointer in
both cases.

> +
> + if (strcspn (tok, "%") < tok_len - 1) {
> + /* something to decode */
> + if (hex_decode_inplace (tok) != HEX_SUCCESS) {
> + ret = line_error (TAG_PARSE_INVALID, line_for_error,
> +   "hex decoding of token '%s' failed", tok);
> + goto DONE;
> + }
> +
> + if (double_quote_str (ctx, tok, &buf, &buf_len)) {
> + ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
> +   line_for_error, "aborting");
> + goto DONE;
> + }
> + *query_string = talloc_asprintf_append_buffer (
> + *query_string, "%s%c", buf, delim);
> +
> + } else {
> + /* This is not just an optimization, but used to preserve
> +  * prefixes like id:, which cannot be quoted.
> +  */
> + *query_string = talloc_asprintf_append_buffer (
> + *query_string, "%s%c", tok, delim);
> + }

What happens if a message id (for example) contains a ':'? Is a query of
the form 

id:stuff"encoded_stuff" 

acceptable? (As far as I can see from the man page ':' does not need to
be in hex.)

Best wishes

Mark


> +
> +}
> +
> +  DONE:
> +if (ret != TAG_PARSE_SUCCESS && *query_string)
> + talloc_free (*query_string);
> +return ret;
> +}
> +
>  tag_parse_status_t
>  parse_tag_line (void *ctx, char *line,
>   tag_op_flag_t flags,
> -- 
> 1.7.10.4
>


[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries

2012-12-15 Thread David Bremner
Mark Walters  writes:


> What happens if a message id (for example) contains a ':'? Is a query of
> the form 
>
> id:stuff"encoded_stuff" 
>
> acceptable? (As far as I can see from the man page ':' does not need to
> be in hex.)

The updated version of the notmuch-dump man page does say that : will be
hex encoded, so I think the fix here is to update the notmuch tag man
page.


[Patch v7 05/14] quote_and_decode_query: new function to quote hex-decoded queries

2012-12-14 Thread da...@tethera.net
From: David Bremner 

The query is split into tokens, with ' ' and ':' as delimiters.  Any
token containing some hex-escaped character is quoted according to
Xapian rules.  This maps id:foo%20%22bar to id:"foo ""bar".
This intentionally does not quote prefixes, so they still work as prefixes.
---
 tag-util.c |   50 ++
 1 file changed, 50 insertions(+)

diff --git a/tag-util.c b/tag-util.c
index f89669a..e1181f8 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -56,6 +56,56 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
 return NULL;
 }

+static tag_parse_status_t
+quote_and_decode_query (void *ctx, char *encoded, const char *line_for_error,
+   char **query_string)
+{
+char *tok = encoded;
+size_t tok_len = 0;
+char *buf = NULL;
+size_t buf_len = 0;
+tag_parse_status_t ret = TAG_PARSE_SUCCESS;
+
+*query_string = talloc_strdup (ctx, "");
+
+while (*query_string &&
+  (tok = strtok_len (tok + tok_len, ": ", &tok_len)) != NULL) {
+   char delim = tok[tok_len];
+
+   *(tok + tok_len++) = '\0';
+
+   if (strcspn (tok, "%") < tok_len - 1) {
+   /* something to decode */
+   if (hex_decode_inplace (tok) != HEX_SUCCESS) {
+   ret = line_error (TAG_PARSE_INVALID, line_for_error,
+ "hex decoding of token '%s' failed", tok);
+   goto DONE;
+   }
+
+   if (double_quote_str (ctx, tok, &buf, &buf_len)) {
+   ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
+ line_for_error, "aborting");
+   goto DONE;
+   }
+   *query_string = talloc_asprintf_append_buffer (
+   *query_string, "%s%c", buf, delim);
+
+   } else {
+   /* This is not just an optimization, but used to preserve
+* prefixes like id:, which cannot be quoted.
+*/
+   *query_string = talloc_asprintf_append_buffer (
+   *query_string, "%s%c", tok, delim);
+   }
+
+}
+
+  DONE:
+if (ret != TAG_PARSE_SUCCESS && *query_string)
+   talloc_free (*query_string);
+return ret;
+}
+
 tag_parse_status_t
 parse_tag_line (void *ctx, char *line,
tag_op_flag_t flags,
-- 
1.7.10.4
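
The per-token mapping described in the commit message (foo%20%22bar becoming
"foo ""bar", with tokens that contain no '%' passed through unquoted) can be
sketched as below. All three functions are hypothetical stand-ins written for
this illustration; they use fixed caller-supplied buffers instead of talloc
and are not the hex_decode_inplace() or double_quote_str() used by the patch.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for hex decoding: turn each %XX into one byte, in place.
 * Returns 0 on success, -1 on a malformed escape. */
static int
sketch_hex_decode_inplace (char *s)
{
    char *r = s, *w = s;

    while (*r) {
        if (*r == '%') {
            if (! isxdigit ((unsigned char) r[1]) ||
                ! isxdigit ((unsigned char) r[2]))
                return -1;
            char hex[3] = { r[1], r[2], '\0' };
            *w++ = (char) strtol (hex, NULL, 16);
            r += 3;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
    return 0;
}

/* Stand-in for Xapian-style quoting: wrap in double quotes and double
 * every embedded double quote. Returns -1 if 'out' may be too small. */
static int
sketch_double_quote (const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    /* Worst case: every byte doubled, plus two quotes and the NUL. */
    if (out_size < 2 * strlen (in) + 3)
        return -1;

    out[n++] = '"';
    for (; *in; in++) {
        if (*in == '"')
            out[n++] = '"';
        out[n++] = *in;
    }
    out[n++] = '"';
    out[n] = '\0';
    return 0;
}

/* Decode first, then quote; tokens without '%' are copied unchanged so
 * that prefixes such as "id" stay usable as prefixes. */
static int
sketch_decode_and_quote (char *tok, char *out, size_t out_size)
{
    if (strchr (tok, '%') == NULL) {
        if (out_size < strlen (tok) + 1)
            return -1;
        strcpy (out, tok);
        return 0;
    }
    if (sketch_hex_decode_inplace (tok) != 0)
        return -1;
    return sketch_double_quote (tok, out, out_size);
}
```

Calling sketch_decode_and_quote() on the token foo%20%22bar yields
"foo ""bar", while the token id is returned unchanged.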


