Hi David, thanks for the review! On Sun, 05 Aug 2012, David Bremner <da...@tethera.net> wrote: > Jani Nikula <j...@nikula.org> writes: > >> + >> +static enum field >> +abs_to_rel_field (enum field field) >> +{ >> + assert (field <= TM_ABS_YEAR); >> + >> + /* note: depends on the enum ordering */ >> + return field + (TM_REL_SEC - TM_ABS_SEC); >> +} >> + > > I wonder if this would be slightly nicer of you defined a TM_FIRST_REL > or so as a synonym like TM_NONE and TM_SIZE
Good idea. >> +/* get zero value for field */ >> +static int >> +field_zero (enum field field) >> +{ >> + if (field == TM_ABS_MDAY || field == TM_ABS_MON) >> + return 1; >> + else if (field == TM_ABS_YEAR) >> + return 1970; >> + else >> + return 0; >> +} > > what do you think about using the word "epoch" instead of zero here? As a non-native speaker, I'll just take your word for it if you think it would be better. :) >> +static bool >> +get_postponed_number (struct state *state, int *v, int *n, char *d) >> +{ > > I found the 1 letter names not quite obvious here. True. I think v and n are used fairly consistently throughout the source file, so I'll have to consider longer names vs. documentation comment for them. > At this point reading the code, I have not trouble understanding each > line/function, but I feel like I'm missing the big picture a bit. Yeah, perhaps the code reads better if you follow from the top level parse_time_string() down. Which is at the very end. A "big picture" documentation comment would do no harm. > What is a postponed number? I see that you grasped that later, but I'll describe anyway. Parsing is done from left to right, in a greedy fashion (i.e. match the longest possible known expression). If after that we encounter a number that we don't know what to do with yet, postpone it until we move on to the next expression. The parser for that might eat the preceding "postponed" number (for example "5 August", 5 would be postponed because we don't know what to do with yet, but then "August" would be the context to make it day of month). If that is not the case, the number will be parsed by parse_postponed_number() as a lonely, single number between there. (Yes, this should be documented better with the "big picture".) > >> + /* >> + * REVISIT: There could be a "next_field" that would be set from >> + * "field" for the duration of the handle_postponed_number() call, >> + * so it has more information to work with. >> + */ > > The notmuch convention seems to be to use XXX: for this. I'm not sure > I'd bother changing, especially if we can't decide how to package this. Okay, can be changed if needed. > >> +/* Time set helper. No input checking. Use UNSET (-1) to leave unset. */ >> +static int >> +set_abs_time (struct state *state, int hour, int min, int sec) >> +{ >> + int r; >> + >> + if (hour != UNSET) { >> + if ((r = set_field (state, TM_ABS_HOUR, hour))) >> + return r; >> + } > > So for this function and the next, the first match wins? I don't really > see the motivation for this, maybe you can explain a bit. The whole parser tries to be as unambiguous as possible. If the input leads to a situation in which any absolute time field (see enum field) is attempted to set twice, it means it appears twice in the input. For example, "2012-08-06 August", where the month is set twice. By design, that is not allowed, and set_field() fails, even if it's the same value. The only semi-exception is having redundant weekday there, for example "Monday, August 6". > > >> + /* timezone codes: offset in minutes. FIXME: add more codes. */ > > Did you think about trying to delegate the list of timezones to the > system? No. :) I'll look into it. > >> + * Compare strings s and keyword. Return number of matching chars on >> + * match, 0 for no match. Match must be at least n chars (n == 0 all >> + * of keyword), otherwise it's not a match. Use match_case for case >> + * sensitive matching. >> + */ > > I guess that's fine, and it is internal, but maybe -1 for whole string > would be slightly nicer (although I can't imagine what good matching 0 > length strings is at the moment). Can be changed. > >> + /* Minimum match length. */ >> + p = strchr (keyword, '|'); >> + if (p) { >> + minlen = p - keyword; >> + memmove (p, p + 1, strlen (p + 1) + 1); >> + } > > Something about that memmove creeps me out, but I trust you that it's > correct. Alternatively I guess you could represent keywords as pairs of > strings, which is probably more of a pain. I didn't bother to double check it now, but I remember thinking it over very carefully. :) I agree it could use more clarity to be more obviously correct. (Initially the minlen was coded as an int in the table, but the above allows the localization to decide how long the match must be.) > > >> + >> +/* Parse a single number. Typically postpone parsing until later. */ > > OK, so I finally start to understand what a postponed number is :) > I understand the compiler likes bottom up declarations, but some > top down declarations/comments are needed I think. I agree more comments would be in order. > >> +static int >> +parse_date (struct state *state, char sep, >> + unsigned long v1, unsigned long v2, unsigned long v3, >> + size_t n1, size_t n2, size_t n3) >> +{ >> + int year = UNSET, mon = UNSET, mday = UNSET; >> + >> + assert (is_date_sep (sep)); >> + >> + switch (sep) { >> + case '/': /* Date: M[M]/D[D][/YY[YY]] or M[M]/YYYY */ > > If I understand correctly, this chooses between American (?) month, day, > year ordering and "sensible" day, month, year ordering by delimiter. I > never thought about this as a way to tell (I often write D/M/Y), but > that could be just me. I agree it's fine as a convention. You understand correctly. It's obviously not a reliable way to tell unless you make it the convention, which I chose to do. Some parsers, notably the one in git, also look at the values to see if it could be D/M/Y if M/D/Y is not possible, but I don't like the ambiguity that introduces. > >> +/* >> + * Parse delimiter(s). Return < 0 on error, number of parsed chars on >> + * success. >> + */ > > So 1:-2 will parse as 1-2 ?, i.e. last delimiter wins? Maybe better to > say so explicitly. Agreed. It just throws out any extra delimiters. Perhaps this should also be more strict about the allowed delimiters (now anything is allowed). > >> +/* Combine absolute and relative fields, and round. */ >> +static int >> +create_output (struct state *state, time_t *t_out, const time_t *tnow, >> + int round) >> +{ > > It seems like most of non-obvious logic like (when is "wednesday") is > encoded here. From a maintenence point of view, it would be nice to be > able to seperate out the heuristic stuff from the mechanical, to the > degree that it is possible. Agreed. Basically this is step 2, turning all information parsed in step 1 (function parse_input()) into a sensible result. Figuring out the current time is also done here. BR, Jani. _______________________________________________ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch