Re: [fossil-users] Broken alignment in side-by-side diffs
On Wed, Jun 20, 2012 at 1:24 AM, Александр Орефков oref...@gmail.com wrote:

> Hi. If the text of a UTF-8 file contains non-Latin characters, the
> alignment in the side-by-side diff web page is broken. It seems to count
> bytes, not characters, for alignment.
>
> WBR, Alexander Orefkov

I have a note of your problem. I will address it when I get a chance.

--
D. Richard Hipp
d...@sqlite.org
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
Re: [fossil-users] Broken alignment in side-by-side diffs
Hi.

For now I am using a simple crutch in sbsWriteText in diff.c:

    ...
    }else if( c=='>' && p->escHtml ){
      memcpy(&z[j], "&gt;", 4);
      j += 4;
    }else{
      z[j++] = c;
      /* fix for Russian UTF-8: 2 bytes per symbol */
      if( c<0 ){
        z[j++] = zIn[++i];
      }
    }

It works, at least for Russian characters.

WBR, Alexander Orefkov
Re: [fossil-users] Broken alignment in side-by-side diffs
2012/6/20 Martin Gagnon eme...@gmail.com:

> On 2012-06-20 at 04:49, Александр Орефков oref...@gmail.com wrote:
>
>> Hi. For now I am using a simple crutch in sbsWriteText in diff.c:
>>
>>     ...
>>     }else if( c=='>' && p->escHtml ){
>>       memcpy(&z[j], "&gt;", 4);
>>       j += 4;
>>     }else{
>>       z[j++] = c;
>>       /* fix for Russian UTF-8: 2 bytes per symbol */
>>       if( c<0 ){
>>         z[j++] = zIn[++i];
>>       }
>>     }
>>
>> It works, at least for Russian
>
> Good. I have the same problem with Chinese. But since a UTF-8 character
> can take from 1 to 4 bytes (up to 6 in the original definition), it will
> not work in all cases.
>
> http://en.wikipedia.org/w/index.php?title=UTF-8
>
> So your if( c<0 ) ... could be a while loop, which would need some
> protection so that it does not run past the end of the input buffer.

It's a temporary crutch. I hope DRH makes everything excellent :)

WBR, Alexander Orefkov.
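The bounded loop suggested above could look roughly like the sketch below. It is an illustration only, not the change that went into Fossil, and the helper names `utf8_seq_len` and `copy_utf8_char` are hypothetical. The sequence length is read from the lead byte and clamped to the bytes that remain, so a truncated sequence at the end of the buffer cannot cause an over-read.

```c
#include <stddef.h>

/* Hypothetical helper: length in bytes of the UTF-8 sequence that starts
** with lead byte c.  RFC 3629 limits UTF-8 to 1..4 bytes.  Stray
** continuation bytes and invalid lead bytes are treated as 1 byte. */
size_t utf8_seq_len(unsigned char c){
  if( c<0x80 ) return 1;
  if( (c & 0xe0)==0xc0 ) return 2;
  if( (c & 0xf0)==0xe0 ) return 3;
  if( (c & 0xf8)==0xf0 ) return 4;
  return 1;
}

/* Hypothetical helper: copy one whole character from zIn[i..nIn) into
** zOut without reading past nIn.  Copying stops early if a byte that
** should be a continuation byte (10xxxxxx) is not one.  Returns the
** number of bytes copied (0 if i is already at the end). */
size_t copy_utf8_char(const char *zIn, size_t i, size_t nIn, char *zOut){
  size_t n, k;
  if( i>=nIn ) return 0;
  n = utf8_seq_len((unsigned char)zIn[i]);
  if( n>nIn-i ) n = nIn-i;          /* clamp to the bytes that remain */
  zOut[0] = zIn[i];
  for(k=1; k<n; k++){
    if( ((unsigned char)zIn[i+k] & 0xc0)!=0x80 ) break;
    zOut[k] = zIn[i+k];
  }
  return k;
}
```

A caller would advance both the input index and the output index by the returned count, and bump the column counter by one per call.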
Re: [fossil-users] Broken alignment in side-by-side diffs
On Wed, Jun 20, 2012 at 09:24:53AM +0400, Александр Орефков wrote:

> Hi. If the text of a UTF-8 file contains non-Latin characters, the
> alignment in the side-by-side diff web page is broken. It seems to count
> bytes, not characters, for alignment.

Yes, that has been reported before. It's quite easy to count UTF-8 characters, but maybe not everyone uses UTF-8. Should we add a 'setting' for 8-bit or UTF-8 characters?
Re: [fossil-users] Broken alignment in side-by-side diffs
2012/6/20 Lluís Batlle i Rossell vi...@viric.name:

> Yes, that has been reported before. It's quite easy to count UTF-8
> characters, but maybe not everyone uses UTF-8. Should we add a 'setting'
> for 8-bit or UTF-8 characters?

Fossil declares the UTF-8 code page in the header of its web pages, so file text that is not UTF-8 and contains non-ASCII characters will be displayed broken anyway, and you have to set the page's code page in the browser by hand to view it correctly. So it would be better to be able to specify the encoding for the entire site.
Re: [fossil-users] Broken alignment in side-by-side diffs
On Wed, Jun 20, 2012 at 02:16:19PM +0400, Александр Орефков wrote:

> 2012/6/20 Lluís Batlle i Rossell vi...@viric.name:
>
>> Yes, that has been reported before. It's quite easy to count UTF-8
>> characters, but maybe not everyone uses UTF-8. Should we add a 'setting'
>> for 8-bit or UTF-8 characters?
>
> Fossil declares the UTF-8 code page in the header of its web pages, so
> file text that is not UTF-8 and contains non-ASCII characters will be
> displayed broken anyway, and you have to set the page's code page in the
> browser by hand to view it correctly. So it would be better to be able
> to specify the encoding for the entire site.

OK, fine with me. Here is some quick code to count UTF-8 characters:

    int my_strlen_utf8_c(char *s) {
      int i = 0, j = 0;
      while (s[i]) {
        /* count every byte that is not a continuation byte (10xxxxxx) */
        if ((s[i] & 0xc0) != 0x80) j++;
        i++;
      }
      return j;
    }

It comes from here: http://www.canonical.org/~kragen/strlen-utf8.html

Regards,
Lluís.
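To tie the character count back to the alignment problem, here is a small sketch of how padding a diff column differs when it is computed from bytes versus from characters. The function names are hypothetical and this is not Fossil's code; it only illustrates why byte-based padding comes up short for multibyte text.

```c
#include <stddef.h>
#include <string.h>

/* Count characters by skipping UTF-8 continuation bytes (10xxxxxx). */
size_t utf8_char_count(const char *s){
  size_t n = 0;
  for(; *s; s++){
    if( ((unsigned char)*s & 0xc0)!=0x80 ) n++;
  }
  return n;
}

/* Hypothetical helper: spaces needed to pad s to `width` columns when
** the length is measured in characters. */
size_t pad_by_chars(const char *s, size_t width){
  size_t n = utf8_char_count(s);
  return n<width ? width-n : 0;
}

/* Hypothetical helper: the same computation measured in bytes, which is
** the behaviour described in the original report. */
size_t pad_by_bytes(const char *s, size_t width){
  size_t n = strlen(s);
  return n<width ? width-n : 0;
}
```

For a three-letter Cyrillic word (6 bytes) and a 10-column field, the byte-based version emits 4 spaces where 7 are needed, so the right-hand column shifts left by 3.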
Re: [fossil-users] Broken alignment in side-by-side diffs
On Wed, Jun 20, 2012 at 1:24 AM, Александр Орефков oref...@gmail.com wrote:

> Hi. If the text of a UTF-8 file contains non-Latin characters, the
> alignment in the side-by-side diff web page is broken. It seems to count
> bytes, not characters, for alignment.
>
> WBR, Alexander Orefkov

Fixed here: http://www.fossil-scm.org/fossil/info/484f8d29af

Observe that the first change-block in the diff above includes a change to a line that contains multibyte Unicode characters, to prove that the change does in fact work.

However: the change does not take into account zero-width Unicode characters. Zero-width characters (usually diacritics expressed as separate characters) will still throw off the columns. But as zero-width characters are the exception in western languages, perhaps this won't be too bad. Side-by-side diffs of text containing Hebrew, Thai, or Korean will probably still be very funky, however.

--
D. Richard Hipp
d...@sqlite.org
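As a rough illustration of the zero-width caveat, the sketch below counts display columns while skipping a small, hard-coded set of zero-width code points. Everything here is an assumption made for the example: the helper names are hypothetical, only the Combining Diacritical Marks block (U+0300..U+036F) and U+200B..U+200D are treated as zero-width, and double-width East Asian characters are not handled. A real fix would need proper Unicode width tables.

```c
#include <stddef.h>

/* Hypothetical helper: decode the code point starting at z[*pi] and
** advance *pi past it.  Assumes mostly well-formed UTF-8; a stray or
** invalid byte is returned as-is and consumes one byte. */
unsigned int utf8_next(const unsigned char *z, size_t n, size_t *pi){
  size_t i = *pi;
  unsigned int c = z[i++];
  int extra = 0;
  if( (c & 0xe0)==0xc0 ){ c &= 0x1f; extra = 1; }
  else if( (c & 0xf0)==0xe0 ){ c &= 0x0f; extra = 2; }
  else if( (c & 0xf8)==0xf0 ){ c &= 0x07; extra = 3; }
  while( extra-- > 0 && i<n && (z[i] & 0xc0)==0x80 ){
    c = (c<<6) | (z[i++] & 0x3f);
  }
  *pi = i;
  return c;
}

/* Illustrative subset only: combining diacritical marks plus the
** zero-width space, non-joiner and joiner. */
int is_zero_width(unsigned int c){
  return (c>=0x0300 && c<=0x036f) || (c>=0x200b && c<=0x200d);
}

/* Number of columns occupied by z[0..n), one per non-zero-width
** code point. */
size_t display_columns(const char *z, size_t n){
  size_t i = 0, cols = 0;
  while( i<n ){
    unsigned int c = utf8_next((const unsigned char*)z, n, &i);
    if( !is_zero_width(c) ) cols++;
  }
  return cols;
}
```

With this, "e" followed by a combining acute accent (U+0301) occupies one column rather than two, which is the case plain character counting gets wrong.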
Re: [fossil-users] Broken alignment in side-by-side diffs
On 12-06-20 06:16, Александр Орефков wrote:

> 2012/6/20 Lluís Batlle i Rossell vi...@viric.name:
>
>> Yes, that has been reported before. It's quite easy to count UTF-8
>> characters, but maybe not everyone uses UTF-8. Should we add a 'setting'
>> for 8-bit or UTF-8 characters?
>
> Fossil declares the UTF-8 code page in the header of its web pages, so
> file text that is not UTF-8 and contains non-ASCII characters will be
> displayed broken anyway, and you have to set the page's code page in the
> browser by hand to view it correctly. So it would be better to be able
> to specify the encoding for the entire site.

But I guess this would not simplify the sbsdiff alignment code. It would need to support multiple encodings in order to count characters. And what if a repo contains files in different encodings?

I think keeping the site UTF-8 only makes things simpler. ASCII is compatible with UTF-8 anyway. If there is a need to support more encodings, the text could be converted to UTF-8 first, so the same sbsdiff code could be used when generating the diff page.

A UTF-8 on/off flag could speed things up (a little) for people who only use plain ASCII, but it would not be necessary.

That's just my opinion; I'm not an expert in encodings.

--
Martin G.
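The "convert to UTF-8 first" idea can be sketched for the simplest case, ISO-8859-1 (Latin-1), where every input byte maps directly to the code point of the same value. The function name `latin1_to_utf8` is hypothetical and nothing here is taken from Fossil; other encodings would need real conversion tables or a library such as iconv.

```c
#include <stddef.h>

/* Hypothetical helper: convert nIn bytes of ISO-8859-1 text to UTF-8.
** zOut must have room for 2*nIn bytes, the worst case.  Returns the
** number of bytes written; no terminating NUL is added. */
size_t latin1_to_utf8(const unsigned char *zIn, size_t nIn,
                      unsigned char *zOut){
  size_t i, j = 0;
  for(i=0; i<nIn; i++){
    unsigned char c = zIn[i];
    if( c<0x80 ){
      zOut[j++] = c;                                 /* ASCII: unchanged */
    }else{
      zOut[j++] = (unsigned char)(0xc0 | (c>>6));    /* lead byte */
      zOut[j++] = (unsigned char)(0x80 | (c & 0x3f));/* continuation */
    }
  }
  return j;
}
```

After such a step the diff code only ever sees UTF-8, so a single character-counting routine is enough regardless of how the file was stored.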