Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Krunose

On 08.04.2017 at 21:54, Krunose wrote:

On 08.04.2017 at 21:21, Michael Wolf wrote:

Krunose wrote:


Yes, e.g. the Sorbian languages, Polish, Czech, Slovak.

Michael



Let's wait and see what happens :D


I filed a bug:

https://github.com/translate/pootle/issues/6238

Michael




I'll probably leave a comment later to bring the heat.

Now that I think about it, it's not just about passing strings 
incorrectly to the URL from search. It's more complicated than that, so 
I'll stop playing Sherlock Holmes here.


But I kinda doubt it's easy to fix. We'll see...

And thanks for the quick reaction! :)

Kruno




As I suspected, it seems the search query _is_ percent-encoded. 
For 'čitati', Firebug in Firefox shows


...search=%C4%8Ditati

as what is passed to GET, and %C4%8D is the percent encoding for 'č', so 
something else is wrong.
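
The encoding claim above can be checked with a small Python sketch. It uses only the standard library and assumes the browser encodes the query as UTF-8, which matches the %C4%8D seen in Firebug. The variable names are made up for illustration.

```python
from urllib.parse import quote, unquote

# Percent-encode the search term the way a UTF-8 query string would carry it.
encoded = quote("čitati")
print(encoded)  # %C4%8Ditati

# Decoding gives the original word back, diacritic included.
decoded = unquote(encoded)
print(decoded)  # čitati

# The two escaped bytes are simply the UTF-8 encoding of 'č'.
c_caron_bytes = "č".encode("utf-8")
print(c_caron_bytes)  # b'\xc4\x8d'
```

So the request itself still distinguishes 'č' from 'c'. Any folding would have to happen after the server decodes the query.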


I'll definitely leave a comment on that bug report.

Kruno


--
To unsubscribe e-mail to: l10n+unsubscr...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/l10n/
All messages sent to this list will be publicly archived and cannot be deleted


Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Krunose

On 08.04.2017 at 21:21, Michael Wolf wrote:

Krunose wrote:


Yes, e.g. the Sorbian languages, Polish, Czech, Slovak.

Michael



Let's wait and see what happens :D


I filed a bug:

https://github.com/translate/pootle/issues/6238

Michael




I'll probably leave a comment later to bring the heat.

Now that I think about it, it's not just about passing strings 
incorrectly to the URL from search. It's more complicated than that, so 
I'll stop playing Sherlock Holmes here.


But I kinda doubt it's easy to fix. We'll see...

And thanks for the quick reaction! :)

Kruno



Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Michael Wolf

Krunose wrote:


Yes, e.g. the Sorbian languages, Polish, Czech, Slovak.

Michael



Let's wait and see what happens :D


I filed a bug:

https://github.com/translate/pootle/issues/6238

Michael







Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Krunose

On 08.04.2017 at 20:11, Michael Wolf wrote:

Krunose wrote:

On 08.04.2017 at 19:42, Michael Wolf wrote:

Michael Wolf wrote:

Krunose wrote:

Hi,

I'm facing a rather strange bug (?) in Pootle. If I put 'citat' in the 
search box, Pootle returns words like 'čitati', and for 
'čitati' it returns 'citat' as well. It happens with 'š', 'ž', 'č' 
and 'ć'. For the non-existent word 'moze' it returns 'može', which 
is actually a word, but that's not what I searched for.


Seems like it's converting diacritics to 'c', 'z', 's' internally.


Yes, it's true. I translate into Upper and Lower Sorbian, they are 
Slavic languages as well.



Then it affects Serbian, Bosnian, Montenegrin and Slovenian, and 
possibly some other Slavic languages.


Yes, e.g. the Sorbian languages, Polish, Czech, Slovak.

Michael



Let's wait and see what happens :D

Kruno



Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Michael Wolf

Krunose wrote:

On 08.04.2017 at 19:42, Michael Wolf wrote:

Michael Wolf wrote:

Krunose wrote:

Hi,

I'm facing a rather strange bug (?) in Pootle. If I put 'citat' in the search 
box, Pootle returns words like 'čitati', and for 'čitati' it 
returns 'citat' as well. It happens with 'š', 'ž', 'č' and 'ć'. For the 
non-existent word 'moze' it returns 'može', which is actually a 
word, but that's not what I searched for.


Seems like it's converting diacritics to 'c', 'z', 's' internally.


Yes, it's true. I translate into Upper and Lower Sorbian, they are 
Slavic languages as well.



Then it affects Serbian, Bosnian, Montenegrin and Slovenian, and possibly 
some other Slavic languages.


Yes, e.g. the Sorbian languages, Polish, Czech, Slovak.

Michael



Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Michael Wolf

Krunose wrote:
No, HTML entities probably wouldn't work. The letter 'đ' is not passed like 
that. I don't know if they can fix that easily.


This letter works for me with the Alt+numeric 240 method (208 for upper case) 
on Windows 10 in three Pootle projects: Mozilla, LO and Pootle 2.8.0. I 
tested it with Icelandic.


Michael



On 08.04.2017 at 19:38, Krunose wrote:
Does that mean Pootle can't be configured to make this work? I don't think 
Chinese characters are ASCII, but I guess they can use their script. I think 
it's related to what is passed to the URL.


These characters should be passed to the URL as HTML entities, and I think 
that would fix it. That's what happens to 'đ' when I search for that 
letter.
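
To make the two notations in this discussion concrete, here is a small Python sketch contrasting an HTML numeric character reference with percent encoding for 'đ'. It is only an illustration. The thread does not show the actual request Pootle sends for 'đ', and the variable names are hypothetical.

```python
from urllib.parse import quote

ch = "đ"  # LATIN SMALL LETTER D WITH STROKE, U+0111

# HTML numeric character reference (what "HTML entity" usually means here).
html_ref = ch.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(html_ref)  # &#273;

# Percent encoding of the UTF-8 bytes, the usual form inside a URL.
percent = quote(ch)
print(percent)  # %C4%91

# An HTML reference placed in a query string is itself percent-encoded,
# so the server would receive the literal text "&#273;", not the letter.
ref_in_url = quote(html_ref)
print(ref_in_url)  # %26%23273%3B
```

Both forms carry the letter unambiguously, but only percent encoding is a URL mechanism. HTML references belong to markup.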


Can you confirm that on the mailing list? Do you think they can fix that?

Kruno


On 08.04.2017 at 19:33, Michael Wolf wrote:

Krunose wrote:
And it seems that 'đ' is recognized correctly. Maybe because that 
letter is used in other languages?




Yes, it's in the Latin-1 range (strictly speaking not 7-bit ASCII). It 
exists in Icelandic and Faroese. \u00D0 and \u00F0 (hexadecimal).










Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Krunose

On 08.04.2017 at 19:42, Michael Wolf wrote:

Michael Wolf wrote:

Krunose wrote:

Hi,

I'm facing a rather strange bug (?) in Pootle. If I put 'citat' in the search 
box, Pootle returns words like 'čitati', and for 'čitati' it 
returns 'citat' as well. It happens with 'š', 'ž', 'č' and 'ć'. For the 
non-existent word 'moze' it returns 'može', which is actually a 
word, but that's not what I searched for.


Seems like it's converting diacritics to 'c', 'z', 's' internally.


Yes, it's true. I translate into Upper and Lower Sorbian, they are 
Slavic languages as well.



Then it affects Serbian, Bosnian, Montenegrin and Slovenian, and possibly 
some other Slavic languages.


Kruno



Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Krunose

On 08.04.2017 at 19:42, Michael Wolf wrote:

Michael Wolf wrote:

Krunose wrote:

Hi,

I'm facing a rather strange bug (?) in Pootle. If I put 'citat' in the search 
box, Pootle returns words like 'čitati', and for 'čitati' it 
returns 'citat' as well. It happens with 'š', 'ž', 'č' and 'ć'. For the 
non-existent word 'moze' it returns 'može', which is actually a 
word, but that's not what I searched for.


Seems like it's converting diacritics to 'c', 'z', 's' internally.


Yes, it's true. I translate into Upper and Lower Sorbian, they are 
Slavic languages as well.





Thanks for confirming this.

Those characters are passed to the URL as 'c', 's' and 'z'. Maybe it can be 
fixed with percent encoding or something?


Kruno




Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Michael Wolf

Michael Wolf wrote:

Krunose wrote:

Hi,

I'm facing a rather strange bug (?) in Pootle. If I put 'citat' in the search 
box, Pootle returns words like 'čitati', and for 'čitati' it 
returns 'citat' as well. It happens with 'š', 'ž', 'č' and 'ć'. For the 
non-existent word 'moze' it returns 'može', which is actually a 
word, but that's not what I searched for.


Seems like it's converting diacritics to 'c', 'z', 's' internally.


Yes, it's true. I translate into Upper and Lower Sorbian, they are 
Slavic languages as well.









Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Michael Wolf

Michael Wolf wrote:

Krunose wrote:
And it seems that 'đ' is recognized correctly. Maybe because that letter 
is used in other languages?



Yes, it's in the Latin-1 range (strictly speaking not 7-bit ASCII). It 
exists in Icelandic and Faroese. \u00D0 and \u00F0 (hexadecimal).
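
For reference, the code points quoted above can be looked up with Python's standard `unicodedata` module. The sketch below is only a lookup aid. It shows that \u00D0 and \u00F0 are the Icelandic/Faroese eth, while the Croatian 'đ' is a separate letter at U+0111 that merely looks similar in upper case. The helper names `in_ascii` and `in_latin1` are made up for this example.

```python
import unicodedata

def in_ascii(ch: str) -> bool:
    """True if the character fits in 7-bit ASCII."""
    return ord(ch) < 0x80

def in_latin1(ch: str) -> bool:
    """True if the character fits in ISO 8859-1 (Latin-1)."""
    return ord(ch) < 0x100

# Croatian d-with-stroke next to Icelandic/Faroese eth.
for ch in "đĐðÐ":
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  "
          f"ascii={in_ascii(ch)}  latin1={in_latin1(ch)}")
```

Neither letter is ASCII. Eth falls inside Latin-1, whereas 'đ' and 'Đ' sit in the Latin Extended-A block alongside 'č', 'š' and 'ž'.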





Re: [libreoffice-l10n] Pootle doesn't recognize difference between c, z, s and Croatian diacritics č, ž, š

2017-04-08 Thread Krunose
And it seems that 'đ' is recognized correctly. Maybe because that letter 
is used in other languages?


Kruno



On 08.04.2017 at 18:16, Krunose wrote:

Hi,

I'm facing a rather strange bug (?) in Pootle. If I put 'citat' in the search 
box, Pootle returns words like 'čitati', and for 'čitati' it 
returns 'citat' as well. It happens with 'š', 'ž', 'č' and 'ć'. For the 
non-existent word 'moze' it returns 'može', which is actually a 
word, but that's not what I searched for.


Seems like it's converting diacritics to 'c', 'z', 's' internally.
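
As an illustration of what such an internal conversion could look like, here is a minimal Python sketch of accent folding via Unicode decomposition. This is an assumption for demonstration only. The thread does not establish whether Pootle, its search backend, or its database does anything of the kind, and the names `fold_accents` and `naive_search` are hypothetical.

```python
import unicodedata

def fold_accents(text: str) -> str:
    """Decompose to NFD and drop combining marks, so 'č' becomes 'c'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def naive_search(query: str, corpus: list[str]) -> list[str]:
    """Substring search that compares accent-folded, lower-cased text."""
    folded_query = fold_accents(query).lower()
    return [word for word in corpus if folded_query in fold_accents(word).lower()]

words = ["čitati", "citat", "može", "rođen"]
print(naive_search("citat", words))  # ['čitati', 'citat']
print(naive_search("moze", words))   # ['može']

# 'đ' has no Unicode decomposition, so this kind of folding leaves it alone.
print(fold_accents("đ"))             # đ
```

A search built this way would show the reported symptoms for 'č', 'š', 'ž' and 'ć' while still treating 'đ' as distinct, but that match is suggestive rather than proof of the cause.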

I hadn't noticed this before, but it might not be new. Usually I 
search for two or three words, so I wasn't really able to 
notice it because of the additional context.


It's not a big deal, but instead of two or three results, I'm getting fifty.

Does it happen with other languages?

I guess it's not easy to make Pootle cope well with every existing 
language, as it might get resource intensive?


Thanks,

Kruno




