Danilo Šegan wrote:
> Hi Simos,
>
> Yesterday at 15:02, Simos Xenitellis wrote:
>
>   
>>> I'd like to see some real numbers on the memory usage instead of
>>> numbers being thrown around.
>>>       
>> In Ubuntu 7.10, the MO files for en_GB are
>> $ du -h /usr/share/locale/en_GB/LC_MESSAGES /usr/share/locale-langpack/en_GB/LC_MESSAGES/
>> 2.3M    /usr/share/locale/en_GB/LC_MESSAGES
>> 17M     /usr/share/locale-langpack/en_GB/LC_MESSAGES/
>> $_ 
>>
>> In Ubuntu 8.04 (alpha 6), the MO files for en_GB are
>> $ du -h /usr/share/locale/en_GB/LC_MESSAGES /usr/share/locale-langpack/en_GB/LC_MESSAGES/
>> 84K    /usr/share/locale/en_GB/LC_MESSAGES
>> 2.2M     /usr/share/locale-langpack/en_GB/LC_MESSAGES/
>> $_ 
>>
>> What I am missing here is that I do not know when/how Ubuntu adds this
>> functionality. It would benefit other distros as well. Did Debian
>> introduce this feature? Danilo, any links?
>>     
>
> I am not handling Ubuntu packaging stuff—it'd be worth checking with
> Ubuntu guys instead.  Martin Pitt is probably the right person to ask
> about it, but looking at the language pack source package should give a
> clue as well.
>
> However, I'd note that en_GB is not really the right locale to do
> the metrics on.
>   
Hi Danilo,
Why would en_GB not be the right locale to do metrics on?
>   
>> From the 2.3M + 17M MO files in Ubuntu 7.10, a typical GNOME session
>> loads up a subset of the MO files,
>>
>> # lsof | grep \.mo\$ | awk '{print $7,$9}' | sort -n | uniq
>>
>> At this moment, my 7.10 is a bit messed up (I have en_GB.UTF-8 but most
>> apps have en_US?!?). The figures for 8.04 with el_GR should be
>> comparable to what you get now with 7.10 and en_GB:
>>     
>
> They wouldn't be. A majority of el_GR probably uses two-byte UTF-8
> sequences, while en_GB would use a majority of single byte UTF-8
> sequences (i.e. ASCII).
>   
Good point. In the other email I provided figures from the same locale.
Halving the figures from "el" should give a very rough estimate.
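
To put a rough number on the single-byte vs. two-byte point, here is a small Python sketch. The sample strings are illustrative picks of mine, not entries from any real MO file. Real catalogs also contain ASCII punctuation, format specifiers and the MO header and tables, so the ratio for a whole file is likely to be below 2.

```python
# Compare the UTF-8 encoded size of an ASCII string and a Greek string.
# The sample strings are illustrative assumptions, not real catalog entries.

def utf8_bytes_per_char(text: str) -> float:
    """Return the average number of UTF-8 bytes per character of text."""
    return len(text.encode("utf-8")) / len(text)

english = "File"    # ASCII letters only
greek = "Αρχείο"    # Greek letters only

for label, sample in (("en", english), ("el", greek)):
    size = len(sample.encode("utf-8"))
    print(label, size, "bytes,", utf8_bytes_per_char(sample), "bytes/char")
```

This is why halving the "el" figures is only a very rough estimate: it assumes every character in the Greek catalogs is a two-byte one.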
>   
>> # lsof | grep \.mo\$ | awk '{print $7,$9}' | sort -n | uniq | awk '{printf "%d+",$1}' > /tmp/bc_sums
>>
>> Using "bc" with /tmp/bc_sums gives the figure
>> 3.6M (3624412) for a standard session. This figure is a bit
>> conservative, because en_GB probably did more work than el.
>>
>> With Ubuntu 8.04 (alpha6) and en_GB, the figure for the MO files is
>> less than 600K (585375).
>> Bastien, could you provide the proper figure for your system?
>>
>> That is a saving of at least 3M in memory.
>>     
>
> As Bastien explained, mmap() doesn't read the entire file into memory,
> but only reads it as needed.
>
>   
>> The stripping of "unneeded" messages is good, and should happen at the
>> package generation level (not in GNOME, or when creating tarballs). 
>>     
>
> Technically, I've opposed introducing this in intltool because of
> one incompatible difference:
>
>   gettext("Something") today != gettext("Something") after stripping
>
> i.e. if "Something" is present in the MO file, (un)translated as
> "Something", gettext returns a pointer to its own static copy of the
> string "Something".  If the entry is missing, it returns the pointer
> that was passed in.
>
> That can be, and has been, used to detect whether there is a
> translation in some programs (I've seen it done), so, until gains are
> proven to be big enough to warrant breaking a few programs in strange
> ways, I wouldn't do it at packaging/build time.
>   
I do not know whether GNOME applications do (or have the need to do)
such a check.
Can you give one example, so that we can see why they need to find out
whether a translation exists?

A valid concern I have seen (and this has to do with correctness) is
when people manually configure the LANGUAGE variable, with something
like "es:fr:en". That is, pick the Spanish translation; if it is not
available for a message, pick French, else pick English. If the Spanish
translation for a specific message is identical to the English original
(and so has been stripped) but the French one is not, then the user
will see the French translation (she should have seen the Spanish
string, which equals the English one, instead).
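
The fallback problem can be sketched in a few lines of Python. This is only a toy model of LANGUAGE-style lookup, not how libintl is implemented. The catalogs are plain dicts with made-up contents, and the names `lookup` and `strip_identical` are hypothetical helpers.

```python
# Toy model of LANGUAGE-style fallback; NOT the libintl implementation.
# Catalogs are plain dicts (msgid -> translation) with made-up contents.

def lookup(msgid, languages, catalogs):
    """Return the first translation found, walking languages in order."""
    for lang in languages:
        translation = catalogs.get(lang, {}).get(msgid)
        if translation is not None:
            return translation
    return msgid  # nothing found: fall back to the untranslated msgid

def strip_identical(catalog):
    """Drop entries whose translation is identical to the msgid."""
    return {msgid: text for msgid, text in catalog.items() if msgid != text}

full = {
    "es": {"Error": "Error", "File": "Archivo"},
    "fr": {"Error": "Erreur", "File": "Fichier"},
}
stripped = {lang: strip_identical(cat) for lang, cat in full.items()}

languages = "es:fr:en".split(":")
print(lookup("Error", languages, full))      # Spanish entry is found first
print(lookup("Error", languages, stripped))  # Spanish entry is gone, French wins
```

With the full catalogs the user gets the Spanish "Error"; with the stripped ones the lookup falls through to the French "Erreur", which is the incorrect result described above.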

Danilo also gave an example with Serbian, if a user chooses something 
like "serbian_cyrillic:serbian_latin:en".

As far as I know, there is no UI tool (at least in GNOME) to set a 
triple LANGUAGE option.

For a general purpose system one may make the assumption that a single 
language is expected.
> Of course, providing numbers to show what the gains are would help
> make the decision.
>   
Assuming my memory figures are correct (previous e-mail), I have 
provided file size and memory figures.
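
For anyone who wants to redo the arithmetic without bc, here is a rough Python sketch. `total_size` is a hypothetical helper that sums "size name" lines as printed by the awk '{print $7,$9}' step, and the two byte counts are the ones quoted in this thread. They are file sizes, so given the mmap() point they are an upper bound on what actually ends up resident in memory.

```python
# Sum "size name" lines (as printed by awk '{print $7,$9}') and compare
# the two totals quoted in this thread. total_size is a hypothetical helper.

def total_size(lines):
    """Sum the leading size field over unique, non-empty lines."""
    unique = {line.strip() for line in lines if line.strip()}
    return sum(int(entry.split()[0]) for entry in unique)

def human(n):
    """Format a byte count in decimal megabytes, e.g. 3.0M."""
    return f"{n / 1e6:.1f}M"

unstripped = 3624412   # "standard session" figure quoted above
stripped = 585375      # Ubuntu 8.04 (alpha 6), en_GB, quoted above

saving = unstripped - stripped
# Very rough variant: halve the larger figure first, on the assumption
# that it was measured with a mostly two-byte locale such as el.
halved_saving = unstripped // 2 - stripped

print("saving:", saving, human(saving))
print("saving after halving:", halved_saving, human(halved_saving))
```

Under the halving assumption the saving is smaller, but still well over 1M.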

Simos

_______________________________________________
desktop-devel-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/desktop-devel-list
