Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-28 Thread William Allen Simpson

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

Madeline Book wrote:
 Ha! So that is how it posts duplicate messages. After I submit my
 reply if I refresh the resulting page it posts the message again.
 Bleh RT tries hard to be disliked...
 
Firefox tells me that I cannot refresh with post data.  I always click on
the #ticket number to get back again

Anyway, I've posted a simpler patch on the new ticket, it should show up
in your mailbox soon.



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread William Allen Simpson

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

I put the workaround into the doc/README -- and just (re-)discovered that's
the *only* method described at:

   http://freeciv.wikia.com/wiki/Install-MacOSX#Localization_Support

There's a different procedure at:

   http://freeciv.wikia.com/wiki/Install-Windows#Other_Languages

There's nothing about running with LANG at:

   http://freeciv.wikia.com/wiki/Install

Somebody needs to update these



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread Jason Dorje Short

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

William Allen Simpson wrote:
 URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 
 
 Jason Short wrote:
 
 # ... I thought I had written this up somewhere in the code, but I can find
 # nothing.  Where should it be documented?
 
 Documenting the code and README.* files.

Good, except, all rulesets must be (and to my knowledge are) in utf-8. 
Anything in latin1 will not be converted by the server and will be 
treated as utf-8.  To my knowledge the only text files that can be 
something other than utf-8 are the PO files which give their own encoding.

-jason



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


[Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread Jason Short

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

Here is a quick and partial fix.  I assume that strerror() is one of the
most common offending functions, so I quickly went through and converted
all mystrerror users in client/ and server/ directories to use the
newly-written L_().  The only real issue with it is the use of
fixed-sized buffers which are wasteful if too large and broken if too
small, but as I said in my previous message I see no real alternative here.

Madeline, if you can test this and verify it fixes those strings then at
least we'll know we're getting somewhere.  It would also be helpful if
you'd point out to us what other strings are being handled wrongly.

It is important to recognize which strings go in which encoding. If you
use printf() directly your strings must be in the local encoding.  More
commonly freelog or fc_fprintf is used and these functions expect
strings in the internal encoding.  Generally all freeciv functions
should expect the internal encoding unless specifically documented
otherwise.

Also as a side note there's another library we use, lua.  Strings from
LUA are probably in the local encoding so should use L_ too.

-jason

Index: utility/fciconv.h
===
--- utility/fciconv.h	(revision 14336)
+++ utility/fciconv.h	(working copy)
@@ -19,6 +19,8 @@
 
 #define FC_DEFAULT_DATA_ENCODING UTF-8
 
+#define L_(s, b) local_to_internal_string_buffer((s), (b), sizeof(b))
+
 void init_character_encodings(const char *internal_encoding,
 			  bool use_transliteration);
 
Index: server/meta.c
===
--- server/meta.c	(revision 14336)
+++ server/meta.c	(working copy)
@@ -44,6 +44,7 @@
 #include connection.h
 #include dataio.h
 #include fcintl.h
+#include fciconv.h
 #include log.h
 #include mem.h
 #include netintf.h
@@ -200,6 +201,7 @@
 {
   static char msg[8192];
   static char str[8192];
+  char buf[512];
   int rest = sizeof(str);
   int n = 0;
   char *s = str;
@@ -213,13 +215,14 @@
 
   if ((sock = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
 freelog(LOG_ERROR, Metaserver: can't open stream socket: %s,
-	mystrerror());
+	L_(mystrerror(), buf));
 metaserver_failed();
 return FALSE;
   }
 
   if (my_connect(sock, (struct sockaddr *) meta_addr, sizeof(meta_addr)) == -1) {
-freelog(LOG_ERROR, Metaserver: connect failed: %s, mystrerror());
+freelog(LOG_ERROR, Metaserver: connect failed: %s,
+	L_(mystrerror(), buf));
 metaserver_failed();
 my_closesocket(sock);
 return FALSE;
Index: server/sernet.c
===
--- server/sernet.c	(revision 14336)
+++ server/sernet.c	(working copy)
@@ -806,7 +806,9 @@
   fromlen = sizeof(fromend);
 
   if ((new_sock = accept(sockfd, fromend.sockaddr, fromlen)) == -1) {
-freelog(LOG_ERROR, accept failed: %s, mystrerror());
+char buf[512];
+
+freelog(LOG_ERROR, accept failed: %s, L_(mystrerror(), buf));
 return -1;
   }
 
@@ -885,16 +887,17 @@
   struct ip_mreq mreq;
   const char *group;
   int opt;
+  char buf[512];
 
   /* Create socket for client connections. */
   if((sock = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
-die(socket failed: %s, mystrerror());
+die(socket failed: %s, L_(mystrerror(), buf));
   }
 
   opt=1; 
   if(setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, 
 		(char *)opt, sizeof(opt)) == -1) {
-freelog(LOG_ERROR, SO_REUSEADDR failed: %s, mystrerror());
+freelog(LOG_ERROR, SO_REUSEADDR failed: %s, L_(mystrerror(), buf));
   }
 
   if (!net_lookup_service(srvarg.bind_addr, srvarg.port, src)) {
@@ -904,23 +907,23 @@
   }
 
   if(bind(sock, src.sockaddr, sizeof (src)) == -1) {
-freelog(LOG_FATAL, bind failed: %s, mystrerror());
+freelog(LOG_FATAL, bind failed: %s, L_(mystrerror(), buf));
 exit(EXIT_FAILURE);
   }
 
   if(listen(sock, MAX_NUM_CONNECTIONS) == -1) {
-freelog(LOG_FATAL, listen failed: %s, mystrerror());
+freelog(LOG_FATAL, listen failed: %s, L_(mystrerror(), buf));
 exit(EXIT_FAILURE);
   }
 
   /* Create socket for server LAN announcements */
   if ((socklan = socket(AF_INET, SOCK_DGRAM, 0))  0) {
- freelog(LOG_ERROR, socket failed: %s, mystrerror());
+freelog(LOG_ERROR, socket failed: %s, L_(mystrerror(), buf));
   }
 
   if (setsockopt(socklan, SOL_SOCKET, SO_REUSEADDR,
  (char *)opt, sizeof(opt)) == -1) {
-freelog(LOG_ERROR, SO_REUSEADDR failed: %s, mystrerror());
+freelog(LOG_ERROR, SO_REUSEADDR failed: %s, L_(mystrerror(), buf));
   }
 
   my_nonblock(socklan);
@@ -933,7 +936,7 @@
   addr.sockaddr_in.sin_port = htons(SERVER_LAN_PORT);
 
   if (bind(socklan, addr.sockaddr, sizeof(addr))  0) {
-freelog(LOG_ERROR, bind failed: %s, mystrerror());
+freelog(LOG_ERROR, bind failed: %s, L_(mystrerror(), buf));
   }
 
   mreq.imr_multiaddr.s_addr = inet_addr(group);
@@ -941,7 +944,7 @@
 
   if 

[Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread Madeline Book

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [jdorje - Sun Jan 27 22:16:24 2008]:
 
 Here is a quick and partial fix.  I assume that strerror() is one of 
the
 most common offending functions, so I quickly went through and 
converted
 all mystrerror users in client/ and server/ directories to use the
 newly-written L_().  The only real issue with it is the use of
 fixed-sized buffers which are wasteful if too large and broken if too
 small, but as I said in my previous message I see no real alternative 
here.

Might I suggest that instead of wrapping every call to mystrerror with
L, you instead call L inside mysterror (and of course document that
mystrerror does the conversion)? Or make a mystrerror_l that does
this (to keep mysterror for the cases that shouldn't be converted).
It would make your patch much simpler. I agree in general with your
reservations about fixed size buffers, but I think that in this case
it is acceptable to used them because:

1. It is very unlikey that any strerror message is longer than 256
characters (I have looked at _sys_errlist_internal in libc and that
is the case, I assume it is very probably so on other platforms).

2. Strerror is not re-entrant (e.g. thread-safe) anyway (we would
need to use strerror_r for that).

3. It is very unlikely that someone will need to call mysterror twice
in the same expression (e.g. more than once in a printf-like call).

Hence a fixed static buffer inside mystrerror (or mysterror_l) to
hold the converted text would be an acceptable solution in this
case, in my opinion.

 Madeline, if you can test this and verify it fixes those strings then 
at
 least we'll know we're getting somewhere.  It would also be helpful if
 you'd point out to us what other strings are being handled wrongly.

All the corrupted strings that I had found previously now display
correctly (and no gtk warnings appear) with your patch applied. I
cannot of course guarantee that that is all of them. ;)
 
 It is important to recognize which strings go in which encoding. If 
you
 use printf() directly your strings must be in the local encoding.  
More
 commonly freelog or fc_fprintf is used and these functions expect
 strings in the internal encoding.  Generally all freeciv functions
 should expect the internal encoding unless specifically documented
 otherwise.
 
 Also as a side note there's another library we use, lua.  Strings from
 LUA are probably in the local encoding so should use L_ too.

I agree with you on those points.

___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread William Allen Simpson

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

Jason Short wrote:
 Here is a quick and partial fix.  I assume that strerror() is one of the
 most common offending functions, so I quickly went through and converted
 all mystrerror users in client/ and server/ directories to use the
 newly-written L_(). 

Now, I find this irritating on a number of levels.

1) This ticket was already patched and resolved.  Another new ticket would
be better.  I'll open one.

2) Madeline reported the problem was in mystrerror(), so that's a good
target, but the solution would be to patch mystrerror(), not repeatedly
patch every invocation  This is *all* users!

3) By definition, mystrerror() can only be called once, so there's no
technical reason to use a dynamic buffer.

4) I really hate L_() hiding a simple single function.  There's no excuse.
This makes the code harder to read and harder to grep.

5) The parameter order doesn't match either cat_snprintf() or strlcat(),
or any standard C invocation.  By convention, the destination buffer is
always the first parameter.







___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread William Allen Simpson

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

Jason Dorje Short wrote:
 Good, except, all rulesets must be (and to my knowledge are) in utf-8.

Perhaps, but that's not what the data currently said, and I'd prefer to
document reality rather than hope.

Maybe there should be a concerted effort to scan every data file and
ensure they are all UTF-8?



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


[Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread Madeline Book

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [jdorje - Sun Jan 27 22:16:24 2008]:
 
 Here is a quick and partial fix.  I assume that strerror() is one of 
the
 most common offending functions, so I quickly went through and 
converted
 all mystrerror users in client/ and server/ directories to use the
 newly-written L_().  The only real issue with it is the use of
 fixed-sized buffers which are wasteful if too large and broken if too
 small, but as I said in my previous message I see no real alternative 
here.

Might I suggest that instead of wrapping every call to mystrerror with
L, you instead call L inside mysterror (and of course document that
mystrerror does the conversion)? Or make a mystrerror_l that does
this (to keep mysterror for the cases that shouldn't be converted).
It would make your patch much simpler. I agree in general with your
reservations about fixed size buffers, but I think that in this case
it is acceptable to used them because:

1. It is very unlikey that any strerror message is longer than 256
characters (I have looked at _sys_errlist_internal in libc and that
is the case, I assume it is very probably so on other platforms).

2. Strerror is not re-entrant (e.g. thread-safe) anyway (we would
need to use strerror_r for that).

3. It is very unlikely that someone will need to call mysterror twice
in the same expression (e.g. more than once in a printf-like call).

Hence a fixed static buffer inside mystrerror (or mysterror_l) to
hold the converted text would be an acceptable solution in this
case, in my opinion.

 Madeline, if you can test this and verify it fixes those strings then 
at
 least we'll know we're getting somewhere.  It would also be helpful if
 you'd point out to us what other strings are being handled wrongly.

All the corrupted strings that I had found previously now display
correctly (and no gtk warnings appear) with your patch applied. I
cannot of course guarantee that that is all of them. ;)
 
 It is important to recognize which strings go in which encoding. If 
you
 use printf() directly your strings must be in the local encoding.  
More
 commonly freelog or fc_fprintf is used and these functions expect
 strings in the internal encoding.  Generally all freeciv functions
 should expect the internal encoding unless specifically documented
 otherwise.
 
 Also as a side note there's another library we use, lua.  Strings from
 LUA are probably in the local encoding so should use L_ too.

I agree with you on those points.

___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread Jason Dorje Short

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

William Allen Simpson wrote:
 URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 
 
 Jason Short wrote:
 Here is a quick and partial fix.  I assume that strerror() is one of the
 most common offending functions, so I quickly went through and converted
 all mystrerror users in client/ and server/ directories to use the
 newly-written L_(). 
 
 Now, I find this irritating on a number of levels.
 
 1) This ticket was already patched and resolved.  Another new ticket would
 be better.  I'll open one.

What would possibly motivate you to resolve a bug that's still unfixed, 
mere hours after making a rant about how writing off bugs as acceptable 
is sloppy bugfixing?

 2) Madeline reported the problem was in mystrerror(), so that's a good
 target,

Ahh yes, I see that now.  Obviously I had only remembered it 
subconsciously before :).  Speaking of which, Madeline, what is the 
mystrsocketerror function you mention?

 but the solution would be to patch mystrerror(), not repeatedly
 patch every invocation  This is *all* users!
 
 3) By definition, mystrerror() can only be called once, so there's no
 technical reason to use a dynamic buffer.

Yes.  Naturally the first thing I considered was making the change 
directly within mystrerror, as that is far far easier. But, if one 
mystrerror call is done before the previous one's return value has been 
discarded the result will be erroneous.  Given that some of the returned 
strings are passed off into arbitrary callbacks this is a rather risky 
thing to assume IMO.  However the error if this happened would be 
non-fatal so I suppose if we want to accept that bug it's okay.

 4) I really hate L_() hiding a simple single function.  There's no excuse.
 This makes the code harder to read and harder to grep.
 
 5) The parameter order doesn't match either cat_snprintf() or strlcat(),
 or any standard C invocation.  By convention, the destination buffer is
 always the first parameter.

It aims at consistency with _, N_, PL_, and Q_, with which it shares 
similar naming.  All of these are just shortened wrappers for single 
functions that allow easy and common use, and all put the text parameter 
first.  Of course L_ does no translation, it just changes encoding (as 
do the other _ functions).

-jason



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


[Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation

2008-01-27 Thread Madeline Book

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [EMAIL PROTECTED] - Mon Jan 28 05:22:53 2008]:
 
 William Allen Simpson wrote:
 ...
  2) Madeline reported the problem was in mystrerror(), so that's a
  good target,
 
 Ahh yes, I see that now.  Obviously I had only remembered it 
 subconsciously before :).  Speaking of which, Madeline, what is the 
 mystrsocketerror function you mention?

It is something added in the warclient codebase to deal with the
fact that on win32 strerror/GetLastError does not return error
codes for socket operation errors. It is identical to mystrerror
except that the call to GetLastError is replaced by a call to
WSAGetLastError. It's not really a pressing issue for the network
code in the freeciv codebase; I had concocted a characteristically
over-done asynchronous networking subsystem for metaserver
communications for the client for which to debug (plagued by
all manner of difficult to track down heisenbugs) on win32 it
was necessary to have precise error reporting. In the end it
was a nice learning experience, but not terribly worthwhile for
the project as a whole. ;)

  3) By definition, mystrerror() can only be called once, so
  there's no
  technical reason to use a dynamic buffer.
 
 Yes.  Naturally the first thing I considered was making the change 
 directly within mystrerror, as that is far far easier. But, if one 
 mystrerror call is done before the previous one's return value
 has been 
 discarded the result will be erroneous.  Given that some of the
 returned 
 strings are passed off into arbitrary callbacks this is a rather
 risky 
 thing to assume IMO.  However the error if this happened would be 
 non-fatal so I suppose if we want to accept that bug it's okay.

I don't see how an invocation of mystrerror could trigger another
invocation of itself (I certainly wouldn't expect that to happen).
As far as I can tell mystrerror is only used in calls to printf
like functions. So the return value of mystrerror is immediately
copied to the buffers in those functions and once there, it is safe
from being corrupted by further calls to mystrerror. But if there
are indeed complicated callbacks that just store the pointer returned
by strerror and use it later on after other operations have been
performed, then indeed, that is a dangerous situation and should be
avoided. I would always put a big warning in the comment header
for functions returning pointers to static data that the programmer
should take care not to abuse the return value in this way.
 

___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev