Danilo Šegan wrote:
> Hi Simos,
>
> Yesterday at 15:02, Simos Xenitellis wrote:
>
>
>>> I'd like to see some real numbers on the memory usage instead of
>>> numbers being thrown around.
>>>
>> In Ubuntu 7.10, the MO files for en_GB are
>> $ du -h /usr/share/locale/en_GB/LC_MESSAGES \
>>      /usr/share/locale-langpack/en_GB/LC_MESSAGES/
>> 2.3M /usr/share/locale/en_GB/LC_MESSAGES
>> 17M /usr/share/locale-langpack/en_GB/LC_MESSAGES/
>> $_
>>
>> In Ubuntu 8.04 (alpha 6), the MO files for en_GB are
>> $ du -h /usr/share/locale/en_GB/LC_MESSAGES \
>>      /usr/share/locale-langpack/en_GB/LC_MESSAGES/
>> 84K /usr/share/locale/en_GB/LC_MESSAGES
>> 2.2M /usr/share/locale-langpack/en_GB/LC_MESSAGES/
>> $_
>>
>> What I am missing here is that I do not know when/how Ubuntu adds this
>> functionality. It would benefit other distros as well. Did Debian
>> introduce this feature? Danilo, any links?
>>
>
> I am not handling Ubuntu packaging stuff—it'd be worth checking with
> Ubuntu guys instead. Martin Pitt is probably the right person to ask
> about it, but looking at the language pack source package should give a
> clue as well.
>
> However, I'd note that en_GB is not really the right locale to do
> the metrics on.
>
Hi Danilo,
Why would en_GB not be the right locale to do metrics on?
>
>> From the 2.3M + 17M of MO files in Ubuntu 7.10, a typical GNOME session
>> loads up a subset of the MO files,
>>
>> # lsof | grep \.mo\$ | awk '{print $7,$9}' | sort -n | uniq
>>
>> At this moment, my 7.10 is a bit messed up (I have en_GB.UTF-8 but most
>> apps have en_US?!?). The figures for 8.04 with el_GR should be
>> comparable to what you get now with 7.10 and en_GB:
>>
>
> They wouldn't be. A majority of el_GR probably uses two-byte UTF-8
> sequences, while en_GB would use a majority of single byte UTF-8
> sequences (i.e. ASCII).
>
Good point. In the other email I provided figures from the same locale.
Halving the figures for "el" should give a very rough estimate.
>
>> # lsof | grep \.mo\$ | awk '{print $7,$9}' | sort -n | uniq |
>>     awk '{printf "%d+",$1}' > /tmp/bc_sums
>>
>> Using "bc" with /tmp/bc_sums gives the figure
>> 3.6M (3624412) for a standard session. This figure is a bit
>> conservative, because en_GB probably did more work than el.
>>
>> With Ubuntu 8.04 (alpha6) and en_GB, the figure for the MO files is
>> less than 600K (585375).
>> Bastien, could you provide the proper figure for your system?
>>
>> That is a saving of at least 3M in memory.
>>
>
> As Bastien explained, mmap() doesn't read the entire file into memory,
> but only reads it as needed.
>
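To make the mmap() point concrete, here is a minimal Python sketch. It is not taken from gettext; the helper name read_slice_mmap and the file contents are made up for illustration. It maps a file and touches only a small byte range, which is why summing the sizes of the open MO files gives an upper bound rather than a measurement of resident memory. How many pages the kernel actually brings in depends on the OS and is not measured here.

```python
import mmap
import os
import tempfile


def read_slice_mmap(path, offset, length):
    """Map a file read-only and copy out only the requested byte range."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are faulted in on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]


# Illustrative file: about 2 MB of padding with a short marker in the middle.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1_000_000 + b"msgstr" + b"\0" * 1_000_000)
    demo_path = tmp.name

chunk = read_slice_mmap(demo_path, 1_000_000, 6)
os.unlink(demo_path)
print(chunk)  # b'msgstr'
```

Only the six requested bytes are copied into the Python process here, even though the whole file is mapped.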
>
>> The stripping of "unneeded" messages is good, and should happen at the
>> package generation level (not in GNOME, or when creating tarballs).
>>
>
> Technically, I've opposed introducing this in intltool because of
> one incompatible difference:
>
> current gettext("Something") != stripped gettext("Something")
>
> i.e. if "Something" was (un)translated as "Something" in the MO file,
> gettext would return a static pointer with the string "Something". If
> it was untranslated, it would return the passed pointer.
>
> That can be, and has been, used to detect whether there is a translation
> in some programs (I've seen it done), so, until gains are proven to be big
> enough to warrant breaking a few programs in strange ways, I wouldn't
> do it at packaging/build time.
>
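The difference described above can be modelled in a few lines of Python. This is only a simplified model under my own assumptions: lookup and strip_identical are hypothetical helpers, the catalog is a plain dict, and the from_catalog flag stands in for the C pointer comparison against the passed msgid. It is not how gettext is implemented.

```python
def lookup(catalog, msgid):
    """Return (text, from_catalog), mimicking gettext's fall-through.

    from_catalog is True when the string came out of the catalog and
    False when the msgid itself was handed back.
    """
    if msgid in catalog:
        return catalog[msgid], True
    return msgid, False


def strip_identical(catalog):
    """Drop entries whose translation is identical to the msgid."""
    return {msgid: text for msgid, text in catalog.items() if msgid != text}


# Illustrative en_GB-style catalog; the entries are made up.
full = {"Something": "Something", "Color": "Colour"}
stripped = strip_identical(full)

print(lookup(full, "Something"))      # ('Something', True)
print(lookup(stripped, "Something"))  # ('Something', False)
```

The visible text is the same in both cases; only a program that checks where the string came from would notice the change.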
I do not know whether GNOME applications do (or need to do) such a
check.
Can you give one example, so we can see why they need to find out
whether a translation exists?

A valid concern I have seen (and this has to do with correctness) is
when people manually configure the LANGUAGE variable with something
like "es:fr:en". That is, pick the Spanish translation; if none is
available for a message, pick French; otherwise pick English. If the
Spanish translation of a specific message is identical to the English
original (and so has been stripped), but the French one is not, then
the user will see the French translation (she should have seen the
Spanish string, which happens to match the English).

Danilo also gave an example with Serbian, where a user chooses
something like "serbian_cyrillic:serbian_latin:en".

As far as I know, there is no UI tool (at least in GNOME) to set a
triple LANGUAGE option.
For a general purpose system one may assume that a single language is
expected.
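
The "es:fr:en" case can be sketched as follows. This is a toy model, not gettext itself: resolve is a hypothetical helper, the catalogs are plain dicts, and the single message is made up for illustration.

```python
def resolve(msgid, language, catalogs):
    """Return the first translation found along a LANGUAGE-style list."""
    for lang in language.split(":"):
        catalog = catalogs.get(lang, {})
        if msgid in catalog:
            return catalog[msgid]
    # No catalog had the message: fall back to the untranslated msgid.
    return msgid


# Spanish "Error" is identical to the English msgid; French differs.
unstripped = {"es": {"Error": "Error"}, "fr": {"Error": "Erreur"}}
stripped = {"es": {}, "fr": {"Error": "Erreur"}}

print(resolve("Error", "es:fr:en", unstripped))  # Error
print(resolve("Error", "es:fr:en", stripped))    # Erreur
```

With a single language in LANGUAGE (for example "es" alone) both catalogs give the same result, which is the assumption I make above for a general purpose system.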
> Of course, providing numbers to show what the gains are would help
> make the decision.
>
Assuming my memory figures are correct (previous e-mail), I have
provided file size and memory figures.
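
For anyone who wants to repeat the sum without bc, a rough Python equivalent of the pipeline is sketched below. It assumes the classic lsof column layout (size in field 7, name in field 9, as in the awk command); sum_mo_sizes and the sample lines are made up for illustration and are not real measurements.

```python
def sum_mo_sizes(lsof_lines):
    """Sum the sizes of the distinct .mo files listed in lsof output."""
    sizes = {}
    for line in lsof_lines:
        fields = line.split()
        # Same columns as awk's $7 (SIZE/OFF) and $9 (NAME).
        if len(fields) < 9 or not fields[8].endswith(".mo"):
            continue
        if not fields[6].isdigit():
            continue
        # Keyed by path, so a file opened by several processes counts once.
        sizes[fields[8]] = int(fields[6])
    return sum(sizes.values())


# Made-up sample lines in lsof's layout.
sample = [
    "gnome-pan 6012 user mem REG 8,1 1000 111 /usr/share/locale-langpack/el/LC_MESSAGES/gtk20.mo",
    "nautilus 6020 user mem REG 8,1 1000 111 /usr/share/locale-langpack/el/LC_MESSAGES/gtk20.mo",
    "nautilus 6020 user mem REG 8,1 2500 112 /usr/share/locale-langpack/el/LC_MESSAGES/nautilus.mo",
    "nautilus 6020 user mem REG 8,1 9999 113 /usr/lib/libgtk-x11-2.0.so.0",
]

print(sum_mo_sizes(sample))  # 3500
```

As with the shell version, the result is the total size of the files that are open, not the memory actually resident.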
Simos
_______________________________________________
desktop-devel-list mailing list
http://mail.gnome.org/mailman/listinfo/desktop-devel-list