Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 Madeline Book wrote: Ha! So that is how it posts duplicate messages. After I submit my reply if I refresh the resulting page it posts the message again. Bleh RT tries hard to be disliked... Firefox tells me that I cannot refresh with post data. I always click on the #ticket number to get back again Anyway, I've posted a simpler patch on the new ticket, it should show up in your mailbox soon. ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 I put the workaround into the doc/README -- and just (re-)discovered that's the *only* method described at: http://freeciv.wikia.com/wiki/Install-MacOSX#Localization_Support There's a different procedure at: http://freeciv.wikia.com/wiki/Install-Windows#Other_Languages There's nothing about running with LANG at: http://freeciv.wikia.com/wiki/Install Somebody needs to update these ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 William Allen Simpson wrote: URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 Jason Short wrote: # ... I thought I had written this up somewhere in the code, but I can find # nothing. Where should it be documented? Documenting the code and README.* files. Good, except, all rulesets must be (and to my knowledge are) in utf-8. Anything in latin1 will not be converted by the server and will be treated as utf-8. To my knowledge the only text files that can be something other than utf-8 are the PO files which give their own encoding. -jason ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
[Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 Here is a quick and partial fix. I assume that strerror() is one of the most common offending functions, so I quickly went through and converted all mystrerror users in client/ and server/ directories to use the newly-written L_(). The only real issue with it is the use of fixed-sized buffers which are wasteful if too large and broken if too small, but as I said in my previous message I see no real alternative here. Madeline, if you can test this and verify it fixes those strings then at least we'll know we're getting somewhere. It would also be helpful if you'd point out to us what other strings are being handled wrongly. It is important to recognize which strings go in which encoding. If you use printf() directly your strings must be in the local encoding. More commonly freelog or fc_fprintf is used and these functions expect strings in the internal encoding. Generally all freeciv functions should expect the internal encoding unless specifically documented otherwise. Also as a side note there's another library we use, lua. Strings from LUA are probably in the local encoding so should use L_ too. -jason Index: utility/fciconv.h === --- utility/fciconv.h (revision 14336) +++ utility/fciconv.h (working copy) @@ -19,6 +19,8 @@ #define FC_DEFAULT_DATA_ENCODING UTF-8 +#define L_(s, b) local_to_internal_string_buffer((s), (b), sizeof(b)) + void init_character_encodings(const char *internal_encoding, bool use_transliteration); Index: server/meta.c === --- server/meta.c (revision 14336) +++ server/meta.c (working copy) @@ -44,6 +44,7 @@ #include connection.h #include dataio.h #include fcintl.h +#include fciconv.h #include log.h #include mem.h #include netintf.h @@ -200,6 +201,7 @@ { static char msg[8192]; static char str[8192]; + char buf[512]; int rest = sizeof(str); int n = 0; char *s = str; @@ -213,13 +215,14 @@ if ((sock = socket(AF_INET, SOCK_STREAM, 0)) == -1) { freelog(LOG_ERROR, Metaserver: can't open stream socket: %s, - mystrerror()); + L_(mystrerror(), buf)); metaserver_failed(); return FALSE; } if (my_connect(sock, (struct sockaddr *) meta_addr, sizeof(meta_addr)) == -1) { -freelog(LOG_ERROR, Metaserver: connect failed: %s, mystrerror()); +freelog(LOG_ERROR, Metaserver: connect failed: %s, + L_(mystrerror(), buf)); metaserver_failed(); my_closesocket(sock); return FALSE; Index: server/sernet.c === --- server/sernet.c (revision 14336) +++ server/sernet.c (working copy) @@ -806,7 +806,9 @@ fromlen = sizeof(fromend); if ((new_sock = accept(sockfd, fromend.sockaddr, fromlen)) == -1) { -freelog(LOG_ERROR, accept failed: %s, mystrerror()); +char buf[512]; + +freelog(LOG_ERROR, accept failed: %s, L_(mystrerror(), buf)); return -1; } @@ -885,16 +887,17 @@ struct ip_mreq mreq; const char *group; int opt; + char buf[512]; /* Create socket for client connections. */ if((sock = socket(AF_INET, SOCK_STREAM, 0)) == -1) { -die(socket failed: %s, mystrerror()); +die(socket failed: %s, L_(mystrerror(), buf)); } opt=1; if(setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, (char *)opt, sizeof(opt)) == -1) { -freelog(LOG_ERROR, SO_REUSEADDR failed: %s, mystrerror()); +freelog(LOG_ERROR, SO_REUSEADDR failed: %s, L_(mystrerror(), buf)); } if (!net_lookup_service(srvarg.bind_addr, srvarg.port, src)) { @@ -904,23 +907,23 @@ } if(bind(sock, src.sockaddr, sizeof (src)) == -1) { -freelog(LOG_FATAL, bind failed: %s, mystrerror()); +freelog(LOG_FATAL, bind failed: %s, L_(mystrerror(), buf)); exit(EXIT_FAILURE); } if(listen(sock, MAX_NUM_CONNECTIONS) == -1) { -freelog(LOG_FATAL, listen failed: %s, mystrerror()); +freelog(LOG_FATAL, listen failed: %s, L_(mystrerror(), buf)); exit(EXIT_FAILURE); } /* Create socket for server LAN announcements */ if ((socklan = socket(AF_INET, SOCK_DGRAM, 0)) 0) { - freelog(LOG_ERROR, socket failed: %s, mystrerror()); +freelog(LOG_ERROR, socket failed: %s, L_(mystrerror(), buf)); } if (setsockopt(socklan, SOL_SOCKET, SO_REUSEADDR, (char *)opt, sizeof(opt)) == -1) { -freelog(LOG_ERROR, SO_REUSEADDR failed: %s, mystrerror()); +freelog(LOG_ERROR, SO_REUSEADDR failed: %s, L_(mystrerror(), buf)); } my_nonblock(socklan); @@ -933,7 +936,7 @@ addr.sockaddr_in.sin_port = htons(SERVER_LAN_PORT); if (bind(socklan, addr.sockaddr, sizeof(addr)) 0) { -freelog(LOG_ERROR, bind failed: %s, mystrerror()); +freelog(LOG_ERROR, bind failed: %s, L_(mystrerror(), buf)); } mreq.imr_multiaddr.s_addr = inet_addr(group); @@ -941,7 +944,7 @@ if
[Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 [jdorje - Sun Jan 27 22:16:24 2008]: Here is a quick and partial fix. I assume that strerror() is one of the most common offending functions, so I quickly went through and converted all mystrerror users in client/ and server/ directories to use the newly-written L_(). The only real issue with it is the use of fixed-sized buffers which are wasteful if too large and broken if too small, but as I said in my previous message I see no real alternative here. Might I suggest that instead of wrapping every call to mystrerror with L, you instead call L inside mysterror (and of course document that mystrerror does the conversion)? Or make a mystrerror_l that does this (to keep mysterror for the cases that shouldn't be converted). It would make your patch much simpler. I agree in general with your reservations about fixed size buffers, but I think that in this case it is acceptable to used them because: 1. It is very unlikey that any strerror message is longer than 256 characters (I have looked at _sys_errlist_internal in libc and that is the case, I assume it is very probably so on other platforms). 2. Strerror is not re-entrant (e.g. thread-safe) anyway (we would need to use strerror_r for that). 3. It is very unlikely that someone will need to call mysterror twice in the same expression (e.g. more than once in a printf-like call). Hence a fixed static buffer inside mystrerror (or mysterror_l) to hold the converted text would be an acceptable solution in this case, in my opinion. Madeline, if you can test this and verify it fixes those strings then at least we'll know we're getting somewhere. It would also be helpful if you'd point out to us what other strings are being handled wrongly. All the corrupted strings that I had found previously now display correctly (and no gtk warnings appear) with your patch applied. I cannot of course guarantee that that is all of them. ;) It is important to recognize which strings go in which encoding. If you use printf() directly your strings must be in the local encoding. More commonly freelog or fc_fprintf is used and these functions expect strings in the internal encoding. Generally all freeciv functions should expect the internal encoding unless specifically documented otherwise. Also as a side note there's another library we use, lua. Strings from LUA are probably in the local encoding so should use L_ too. I agree with you on those points. ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 Jason Short wrote: Here is a quick and partial fix. I assume that strerror() is one of the most common offending functions, so I quickly went through and converted all mystrerror users in client/ and server/ directories to use the newly-written L_(). Now, I find this irritating on a number of levels. 1) This ticket was already patched and resolved. Another new ticket would be better. I'll open one. 2) Madeline reported the problem was in mystrerror(), so that's a good target, but the solution would be to patch mystrerror(), not repeatedly patch every invocation This is *all* users! 3) By definition, mystrerror() can only be called once, so there's no technical reason to use a dynamic buffer. 4) I really hate L_() hiding a simple single function. There's no excuse. This makes the code harder to read and harder to grep. 5) The parameter order doesn't match either cat_snprintf() or strlcat(), or any standard C invocation. By convention, the destination buffer is always the first parameter. ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 Jason Dorje Short wrote: Good, except, all rulesets must be (and to my knowledge are) in utf-8. Perhaps, but that's not what the data currently said, and I'd prefer to document reality rather than hope. Maybe there should be a concerted effort to scan every data file and ensure they are all UTF-8? ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
[Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 [jdorje - Sun Jan 27 22:16:24 2008]: Here is a quick and partial fix. I assume that strerror() is one of the most common offending functions, so I quickly went through and converted all mystrerror users in client/ and server/ directories to use the newly-written L_(). The only real issue with it is the use of fixed-sized buffers which are wasteful if too large and broken if too small, but as I said in my previous message I see no real alternative here. Might I suggest that instead of wrapping every call to mystrerror with L, you instead call L inside mysterror (and of course document that mystrerror does the conversion)? Or make a mystrerror_l that does this (to keep mysterror for the cases that shouldn't be converted). It would make your patch much simpler. I agree in general with your reservations about fixed size buffers, but I think that in this case it is acceptable to used them because: 1. It is very unlikey that any strerror message is longer than 256 characters (I have looked at _sys_errlist_internal in libc and that is the case, I assume it is very probably so on other platforms). 2. Strerror is not re-entrant (e.g. thread-safe) anyway (we would need to use strerror_r for that). 3. It is very unlikely that someone will need to call mysterror twice in the same expression (e.g. more than once in a printf-like call). Hence a fixed static buffer inside mystrerror (or mysterror_l) to hold the converted text would be an acceptable solution in this case, in my opinion. Madeline, if you can test this and verify it fixes those strings then at least we'll know we're getting somewhere. It would also be helpful if you'd point out to us what other strings are being handled wrongly. All the corrupted strings that I had found previously now display correctly (and no gtk warnings appear) with your patch applied. I cannot of course guarantee that that is all of them. ;) It is important to recognize which strings go in which encoding. If you use printf() directly your strings must be in the local encoding. More commonly freelog or fc_fprintf is used and these functions expect strings in the internal encoding. Generally all freeciv functions should expect the internal encoding unless specifically documented otherwise. Also as a side note there's another library we use, lua. Strings from LUA are probably in the local encoding so should use L_ too. I agree with you on those points. ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 William Allen Simpson wrote: URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 Jason Short wrote: Here is a quick and partial fix. I assume that strerror() is one of the most common offending functions, so I quickly went through and converted all mystrerror users in client/ and server/ directories to use the newly-written L_(). Now, I find this irritating on a number of levels. 1) This ticket was already patched and resolved. Another new ticket would be better. I'll open one. What would possibly motivate you to resolve a bug that's still unfixed, mere hours after making a rant about how writing off bugs as acceptable is sloppy bugfixing? 2) Madeline reported the problem was in mystrerror(), so that's a good target, Ahh yes, I see that now. Obviously I had only remembered it subconsciously before :). Speaking of which, Madeline, what is the mystrsocketerror function you mention? but the solution would be to patch mystrerror(), not repeatedly patch every invocation This is *all* users! 3) By definition, mystrerror() can only be called once, so there's no technical reason to use a dynamic buffer. Yes. Naturally the first thing I considered was making the change directly within mystrerror, as that is far far easier. But, if one mystrerror call is done before the previous one's return value has been discarded the result will be erroneous. Given that some of the returned strings are passed off into arbitrary callbacks this is a rather risky thing to assume IMO. However the error if this happened would be non-fatal so I suppose if we want to accept that bug it's okay. 4) I really hate L_() hiding a simple single function. There's no excuse. This makes the code harder to read and harder to grep. 5) The parameter order doesn't match either cat_snprintf() or strlcat(), or any standard C invocation. By convention, the destination buffer is always the first parameter. It aims at consistency with _, N_, PL_, and Q_, with which it shares similar naming. All of these are just shortened wrappers for single functions that allow easy and common use, and all put the text parameter first. Of course L_ does no translation, it just changes encoding (as do the other _ functions). -jason ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev
[Freeciv-Dev] (PR#40028) gtk/pango invalid UTF-8 warning, fix documentation
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 [EMAIL PROTECTED] - Mon Jan 28 05:22:53 2008]: William Allen Simpson wrote: ... 2) Madeline reported the problem was in mystrerror(), so that's a good target, Ahh yes, I see that now. Obviously I had only remembered it subconsciously before :). Speaking of which, Madeline, what is the mystrsocketerror function you mention? It is something added in the warclient codebase to deal with the fact that on win32 strerror/GetLastError does not return error codes for socket operation errors. It is identical to mystrerror except that the call to GetLastError is replaced by a call to WSAGetLastError. It's not really a pressing issue for the network code in the freeciv codebase; I had concocted a characteristically over-done asynchronous networking subsystem for metaserver communications for the client for which to debug (plagued by all manner of difficult to track down heisenbugs) on win32 it was necessary to have precise error reporting. In the end it was a nice learning experience, but not terribly worthwhile for the project as a whole. ;) 3) By definition, mystrerror() can only be called once, so there's no technical reason to use a dynamic buffer. Yes. Naturally the first thing I considered was making the change directly within mystrerror, as that is far far easier. But, if one mystrerror call is done before the previous one's return value has been discarded the result will be erroneous. Given that some of the returned strings are passed off into arbitrary callbacks this is a rather risky thing to assume IMO. However the error if this happened would be non-fatal so I suppose if we want to accept that bug it's okay. I don't see how an invocation of mystrerror could trigger another invocation of itself (I certainly wouldn't expect that to happen). As far as I can tell mystrerror is only used in calls to printf like functions. So the return value of mystrerror is immediately copied to the buffers in those functions and once there, it is safe from being corrupted by further calls to mystrerror. But if there are indeed complicated callbacks that just store the pointer returned by strerror and use it later on after other operations have been performed, then indeed, that is a dangerous situation and should be avoided. I would always put a big warning in the comment header for functions returning pointers to static data that the programmer should take care not to abuse the return value in this way. ___ Freeciv-dev mailing list Freeciv-dev@gna.org https://mail.gna.org/listinfo/freeciv-dev