Karl Berry wrote:
> If the info file is not OK, why did makeinfo not warn about it?
>
> I suppose makeinfo should have something like -Wall that warns about
> such things.
>
> I tried warning about periods and colons and such in node names, and
> there are far too many complaints in existing Texinfo files. In
> practice, the punctuation often does not cause trouble (unfortunately,
> in a way, or the misfeature would never have lasted this long).
"We have a phenomenon which sometimes causes entirely broken .info
files but only sometimes. Since normally it's harmless, let's not
warn about it."
Huh? Hey? What kind of philosophy is this? Would you be happy as a
GCC user if GCC behaved like this?
Since search might not work across the entire .info file, this warning
should be enabled by default.
> I'm a little surprised it does in this case.
Please, can you find out why it is so harmful in this case, and make
the warning conditional on this reason?
If not, then I'm in favour of an unconditional warning. It helps the user
understand
1. that the problem is with the node names (this is not clear at all
when you try incremental search and the incremental search finds no
occurrences),
2. which node names need to be fixed.
See attached <<patchv1>>.
Actually, when looking at info/search.c, the comment of function
skip_node_characters appears to have the answer. It says:
Note that
this function contains quite a bit of hair to ignore periods in some
special cases. This is because we here at GNU ship some info files which
contain nodenames that contain periods. No such nodename can start with
a period, or continue with whitespace, newline, or ')' immediately following
the period.
With this info - unsurprisingly - there are no false positives any more, only
one warning:
gnulib-intro.texi:1: Warnung: Node name `Library vs. Reusable Code' contains a
period followed by whitespace or a closing parenthesis.
See attached <<patchv2>>.
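
For illustration, the rule from that comment can be written as a small
standalone check. This is only a sketch: the function name
node_name_has_dangerous_period is made up for this mail and does not exist
in texinfo. It follows the two cases from the comment (a period at the start
of the name, or a period immediately followed by whitespace, newline, or ')').

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of texinfo.  Return 1 if NODE contains a
   period in one of the positions that the comment in info/search.c
   describes as unparseable, 0 otherwise.  */
int
node_name_has_dangerous_period (const char *node)
{
  const char *np;

  assert (node != NULL);

  /* Case 1: the node name starts with a period.  */
  if (*node == '.')
    return 1;

  /* Case 2: a period immediately followed by whitespace, newline, or ')'.
     A period followed by any other character (as in "gcc-4.2"), or a
     period at the very end of the name, is not flagged.  */
  for (np = node; *np != '\0'; np++)
    if (*np == '.'
        && (np[1] == ' ' || np[1] == '\t' || np[1] == '\n'
            || np[1] == ')'))
      return 1;

  return 0;
}
```

With this check, "Library vs. Reusable Code" is flagged, while a name such
as "gcc-4.2 notes" is not, which is why the false positives go away.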
> The only good answer I've been able to see to this longstanding problem
> is to invent a general escaping mechanism in Info files.
That would be the solution to the general problem. But when you look at the
bug report from 5 years ago and this one, you see that it's only about a
node containing the word "vs.". So please follow me into debugging 'info'.
The 'info' program tries to parse the menu line
* Library vs. Reusable Code::
and while doing this, the function skip_node_characters wants to end parsing
the node name when it sees the period followed by a space. This is pointless!
In a menu, all node names are terminated by a colon. The only places where
a node name is followed by a period and a space or closing parenthesis are
the expansion of @xref and @pxref references inline in text. But in this
situation, the second argument to skip_node_characters will be SKIP_NEWLINES,
i.e. newlines_okay=true. So, if newlines_okay is false, there is no need
to test for a period - just wait for the colon. This patch does it, and
it fixes the problem!
<<patchv3>>
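
To make the reasoning above concrete, here is a deliberately simplified model
of the decision. It is NOT the real skip_node_characters (which also handles
parenthesized file names and more); the function name and the exact set of
terminating characters are assumptions made for this sketch. The only point
it shows is that the period test applies solely when newlines_okay is true.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model, not the code in info/search.c.  Return the index of
   the first character in STRING that is not part of the node name.  */
int
model_skip_node_characters (const char *string, int newlines_okay)
{
  int i;

  assert (string != NULL);

  for (i = 0; string[i] != '\0'; i++)
    {
      char c = string[i];
      char next = string[i + 1];

      /* Assumed terminators for this sketch: colon, comma, tab.  */
      if (c == ':' || c == ',' || c == '\t')
        break;
      /* In a menu line, a newline ends the node name.  */
      if (!newlines_okay && c == '\n')
        break;
      /* Only inside running text (newlines_okay) does a period followed
         by whitespace or ')' end the node name.  */
      if (newlines_okay && c == '.'
          && (next == ' ' || next == '\t' || next == '\n' || next == ')'))
        break;
    }
  return i;
}
```

For the string "Library vs. Reusable Code::", this model returns 25 (the
position of the first colon) when newlines_okay is 0, and 10 (the position
of the period) when newlines_okay is 1.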
Bruno
2007-12-29 Bruno Haible <[EMAIL PROTECTED]>
* makeinfo/node.c (remember_node): Give a warning when the node name
does not satisfy the documented constraints.
*** texinfo-4.11/makeinfo/node.c.bak 2007-07-08 15:11:48.000000000 +0200
--- texinfo-4.11/makeinfo/node.c 2007-12-30 02:15:34.000000000 +0100
***************
*** 280,285 ****
--- 280,303 ----
node, tag->line_no);
return;
}
+ /* The documentation says: "You cannot use periods, commas, colons or
+ parentheses within a node name." */
+ if (strchr (node, '(') != NULL || strchr (node, ')') != NULL)
+ {
+ warning (_("Node name `%s' contains a parenthesis"), node);
+ }
+ if (strchr (node, ':') != NULL)
+ {
+ warning (_("Node name `%s' contains a colon"), node);
+ }
+ if (strchr (node, ',') != NULL)
+ {
+ warning (_("Node name `%s' contains a comma"), node);
+ }
+ if (strchr (node, '.') != NULL)
+ {
+ warning (_("Node name `%s' contains a period"), node);
+ }
}
if (!(flags & TAG_FLAG_ANCHOR))
2007-12-29 Bruno Haible <[EMAIL PROTECTED]>
* makeinfo/node.c (remember_node): Give a warning when the node name
does not satisfy the documented constraints.
*** texinfo-4.11/makeinfo/node.c.bak 2007-07-08 15:11:48.000000000 +0200
--- texinfo-4.11/makeinfo/node.c 2007-12-30 02:45:33.000000000 +0100
***************
*** 280,285 ****
--- 280,319 ----
node, tag->line_no);
return;
}
+ /* The documentation says: "You cannot use periods, commas, colons or
+ parentheses within a node name." */
+ if (strchr (node, '(') != NULL || strchr (node, ')') != NULL)
+ {
+ warning (_("Node name `%s' contains a parenthesis"), node);
+ }
+ if (strchr (node, ':') != NULL)
+ {
+ warning (_("Node name `%s' contains a colon"), node);
+ }
+ if (strchr (node, ',') != NULL)
+ {
+ warning (_("Node name `%s' contains a comma"), node);
+ }
+ /* periods are only dangerous - see info/search.c function
+ skip_node_characters - when at the start of the node name or
+ when followed by whitespace, newline, or ')'. */
+ if (*node == '.')
+ {
+ warning (_("Node name `%s' starts with a period"), node);
+ }
+ {
+ const char *np;
+
+ for (np = node; *np != '\0'; np++)
+ if (*np == '.'
+ && (np[1] == ' ' || np[1] == '\t' || np[1] == '\n'
+ || np[1] == ')'))
+ {
+ warning (_("Node name `%s' contains a period followed by whitespace or a closing parenthesis"),
+ node);
+ break;
+ }
+ }
}
if (!(flags & TAG_FLAG_ANCHOR))
2007-12-29 Bruno Haible <[EMAIL PROTECTED]>
* info/search.c (skip_node_characters): When newlines_okay is false,
continue parsing a node name past ". " or ".)".
* makeinfo/node.c (remember_node): Give a warning when the node name
does not satisfy the documented constraints.
*** texinfo-4.11/info/search.c.bak 2007-07-01 23:20:31.000000000 +0200
--- texinfo-4.11/info/search.c 2007-12-30 12:50:16.000000000 +0100
***************
*** 290,302 ****
return (i);
}
! /* Return the index of the first non-node character in STRING. Note that
! this function contains quite a bit of hair to ignore periods in some
! special cases. This is because we here at GNU ship some info files which
! contain nodenames that contain periods. No such nodename can start with
! a period, or continue with whitespace, newline, or ')' immediately following
! the period. If second argument NEWLINES_OKAY is non-zero, newlines should
! be skipped while parsing out the nodename specification. */
int
skip_node_characters (char *string, int newlines_okay)
{
--- 290,306 ----
return (i);
}
! /* Return the index of the first non-node character in STRING.
! If second argument NEWLINES_OKAY is non-zero, we are parsing a reference
! to a node inside normal text; in this case, newlines should be skipped
! while parsing out the nodename specification.
! Note that this function contains quite a bit of hair to ignore periods
! in some special cases. This is because we here at GNU ship some info
! files which contain nodenames that contain periods.
! 1. No such nodename can start with a period.
! 2. When a nodename contains a period immediately followed by whitespace,
! newline, or ')', it cannot be used in references, but is otherwise
! ok. */
int
skip_node_characters (char *string, int newlines_okay)
{
***************
*** 338,344 ****
((!newlines_okay) && (c == '\n')) ||
((paren_seen && string[i - 1] == ')') &&
(c == ' ' || c == '.')) ||
! (c == '.' &&
(
#if 0
/* This test causes a node name ending in a period, like `This.', not to
--- 342,352 ----
((!newlines_okay) && (c == '\n')) ||
((paren_seen && string[i - 1] == ')') &&
(c == ' ' || c == '.')) ||
! /* When we see "Foo vs. Bar", we assume the node name ends at the
! period when parsing a reference (newlines_okay), but we assume
! that the node name continues past the period when parsing a
! node definition (!newlines_okay). */
! (newlines_okay && c == '.' &&
(
#if 0
/* This test causes a node name ending in a period, like `This.', not to
*** texinfo-4.11/makeinfo/node.c.bak 2007-07-08 15:11:48.000000000 +0200
--- texinfo-4.11/makeinfo/node.c 2007-12-30 12:51:24.000000000 +0100
***************
*** 280,285 ****
--- 280,319 ----
node, tag->line_no);
return;
}
+ /* The documentation says: "You cannot use periods, commas, colons or
+ parentheses within a node name." */
+ if (strchr (node, '(') != NULL || strchr (node, ')') != NULL)
+ {
+ warning (_("Node name `%s' contains a parenthesis"), node);
+ }
+ if (strchr (node, ':') != NULL)
+ {
+ warning (_("Node name `%s' contains a colon"), node);
+ }
+ if (strchr (node, ',') != NULL)
+ {
+ warning (_("Node name `%s' contains a comma"), node);
+ }
+ /* periods are only dangerous - see info/search.c function
+ skip_node_characters - when at the start of the node name or
+ when followed by whitespace, newline, or ')'. */
+ if (*node == '.')
+ {
+ warning (_("Node name `%s' starts with a period"), node);
+ }
+ {
+ const char *np;
+
+ for (np = node; *np != '\0'; np++)
+ if (*np == '.'
+ && (np[1] == ' ' || np[1] == '\t' || np[1] == '\n'
+ || np[1] == ')'))
+ {
+ warning (_("Node name `%s' contains a period followed by whitespace or a closing parenthesis; references to this node will not work"),
+ node);
+ break;
+ }
+ }
}
if (!(flags & TAG_FLAG_ANCHOR))