Re: Policing dead/zombie code in m-c

2014-06-04 Thread Henri Sivonen
On Tue, Jun 3, 2014 at 9:13 PM, Trevor Saunders trev.saund...@gmail.com wrote:
 On Tue, Jun 03, 2014 at 12:08:52PM -0400, Ehsan Akhgari wrote:
 On 2014-06-03, 5:57 AM, Henri Sivonen wrote:
 On Wed, May 28, 2014 at 5:45 PM, Nathan Froyd froy...@mozilla.com wrote:
 Assuming that ICU is already compiled with the moral equivalent of GCC's 
 -ffunction-sections -fdata-sections or MSVC's /Gy, then statically linking 
 ICU into libxul should already strip out all the un-needed ICU bits (when 
 using the appropriate linker option).
 
 Disabling ICU IDNA code made libxul smaller, so it seems practice
 doesn't match the above theory. (I.e. clearly the linker wasn't
 automatically dropping ICU IDNA code.)

 Well, was this tested in a local build, or ones done in automation?

Tested with tryserver builds.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-06-03 Thread Henri Sivonen
On Wed, May 28, 2014 at 5:45 PM, Nathan Froyd froy...@mozilla.com wrote:
 Assuming that ICU is already compiled with the moral equivalent of GCC's 
 -ffunction-sections -fdata-sections or MSVC's /Gy, then statically linking 
 ICU into libxul should already strip out all the un-needed ICU bits (when 
 using the appropriate linker option).

Disabling ICU IDNA code made libxul smaller, so it seems practice
doesn't match the above theory. (I.e. clearly the linker wasn't
automatically dropping ICU IDNA code.)

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-06-03 Thread Ehsan Akhgari

On 2014-06-03, 5:57 AM, Henri Sivonen wrote:

On Wed, May 28, 2014 at 5:45 PM, Nathan Froyd froy...@mozilla.com wrote:

Assuming that ICU is already compiled with the moral equivalent of GCC's 
-ffunction-sections -fdata-sections or MSVC's /Gy, then statically linking ICU 
into libxul should already strip out all the un-needed ICU bits (when using the 
appropriate linker option).


Disabling ICU IDNA code made libxul smaller, so it seems practice
doesn't match the above theory. (I.e. clearly the linker wasn't
automatically dropping ICU IDNA code.)


The linker can only eliminate symbols that are not referenced by 
anything else.  Basically the linker creates a graph and preserves 
everything that is referenced by some roots, for example, main, static 
initializers, and very importantly, vtables.  So, anything that is a 
virtual function or is referenced by one will remain in the binary even 
if there is no code _calling_ that virtual function anywhere.


Cheers,
Ehsan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-06-03 Thread Trevor Saunders
On Tue, Jun 03, 2014 at 12:08:52PM -0400, Ehsan Akhgari wrote:
 On 2014-06-03, 5:57 AM, Henri Sivonen wrote:
 On Wed, May 28, 2014 at 5:45 PM, Nathan Froyd froy...@mozilla.com wrote:
 Assuming that ICU is already compiled with the moral equivalent of GCC's 
 -ffunction-sections -fdata-sections or MSVC's /Gy, then statically linking 
 ICU into libxul should already strip out all the un-needed ICU bits (when 
 using the appropriate linker option).
 
 Disabling ICU IDNA code made libxul smaller, so it seems practice
 doesn't match the above theory. (I.e. clearly the linker wasn't
 automatically dropping ICU IDNA code.)

Well, was this tested in a local build, or ones done in automation? We
don't seem to pass --gc-sections to the linker on local builds
presumably to make builds faster.

 The linker can only eliminate symbols that are not referenced by anything
 else.  Basically the linker creates a graph and preserves everything that is
 referenced by some roots, for example, main, static initializers, and very

I'd expect static initializers don't get explicitly added to the set of
roots since they should be rooted through .init_array / .ctors.

This also misses that function exported from the library are obviously
roots.  and ICU exports a *ton* of symbols from libxul which I'm trying to fix
for ELF targets in bug 1019744.

 importantly, vtables.  So, anything that is a virtual function or is
 referenced by one will remain in the binary even if there is no code
 _calling_ that virtual function anywhere.

I'd expect that vtables aren't special if there class is hidden and you
can gc all the references in the library to the vtable then you can gc
the vtable too because nobody can get at it.  If the class isn't local
then you have to keep the vtable, but that would be true anyway because
its a global symbol.

Trev

 
 Cheers,
 Ehsan
 
 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform


signature.asc
Description: Digital signature
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-05-28 Thread Trevor Saunders
  This is at least in part due to bug 915735, you can go and read the bug to 
  see that I did the best I could there, given the constraints that I had 
  (not understanding how the magic of the interaction of ICU and our build 
  system worked, not understanding which parts of ICU we actually want to 
  use, the performance and code quality limitations of MSVC's PGO compiler, 
  the fact that increasing the amount of time we allow our builds to progress 
  is rocket science, etc.
 
 I don’t have numbers for this change specifically, but during my 
 investigation I found that on Mac statically linking ICU into libmozjs 
 reduced the binary size by almost 600KB, compared to building ICU as a 
 separate DLL.

 We should probably be able to make that difference smaller with a bit
 of work.  We just need to get ICU to build in a configuration where
 the library only exports what we need, of course it would be better to
 just not build what we don't need.

 On Apr 25, 2014, at 0:31 , Henri Sivonen hsivo...@hsivonen.fi wrote:
 
  Therefore, it looks like we should turn off (if we haven't already):
  * The ICU LayoutEngine.
  * Ustdio
  * ICU encoding converters and their mapping tables.
  * ICU break iterators and their data.
  * ICU transliterators and their data.
 
 All done, except for a few remaining encoding converters:
 
  However, that flag is misdesigned in the sense that it considers
  US-ASCII, ISO-8859-1, UTF-7, UTF-32, CESU-8, SCSU and BOCU-1 as
  non-legacy, even though, frankly, those are legacy, too. (UTF-16 is
  legacy also, but it's legacy we need, since both ICU and Gecko are
  UTF-16 legacy code bases!)
  http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#267
 
 UTF-16 may be legacy from the point of view of encoding HTML pages, but 
 otherwise it’s an essential part of most of today’s computing environments – 
 and the alternative in many cases would be UTF-32. But at least four of the 
 above are clearly obsolete.

for talking to  win32 API sure, but win32 is a crazy beast, and we
aren't using ICU for that anyway.

  Also, If the ICU build system is an configurable enough, I think we
  should consider identifying what parts of ICU we can delete even
  though the build system doesn't let us to and then automate the
  deletion as a script so that it can be repeated with future imports of
  ICU.
 
 That seems worth investigating, but getting the build system to strip out as 
 much as possible might be more effective – see the next item.

easier probably, better no think of the hundreds of people who are stuck
building it all the time.

 On Apr 25, 2014, at 8:12 , Trevor Saunders trev.saund...@gmail.com wrote:
 
  Well, it really depends how you approach the problem of finding code
  thats dead.  One way is to decide code really should be dead by reading
  it and knowing what it accomplishes doesn't matter.  Another is to
  prove to your self that code is unreachable and so can be removed
  without effecting anything.  Both of these require a pretty good bit of
  knowledge and skill.  On the other hand the below question prompts
  another approach.
 
 Is there a way to give the linker a list of functions that you want to have 
 as public entry points of a dynamically linked library, and have it strip out 
 everything that can’t be reached from these functions? That’s essentially 
 what happens when you statically link a library into another, and the list of 
 ICU functions that Mozilla code calls wouldn’t be excessively long. The bulk 
 of them can be found in this stubs section:
 http://mxr.mozilla.org/mozilla-central/source/js/src/builtin/Intl.cpp#62

yes, though how hard they are to use with the ICU build system is left
as an exercise for the reader.

Trev

 On Apr 28, 2014, at 5:59 , Henri Sivonen hsivo...@hsivonen.fi wrote:
 
  I haven't looked into Ustdio.
 
 That’s disabled by the --enable-icuio=no flag in
 http://mxr.mozilla.org/mozilla-central/source/build/autoconf/icu.m4#213
 
  However, I noticed that we aren't turning off ICU IDNA but we should:
  https://bugzilla.mozilla.org/show_bug.cgi?id=1002435
 
 Thanks for finding and fixing this issue, as well as 1002437!
 
 
 On Apr 25, 2014, at 0:31 , Henri Sivonen hsivo...@hsivonen.fi wrote:
 
  Using system ICU seems wrong in terms of correctness. That's the
  reason why we don't use system ICU on Mac and desktop Linux, right?
 
 Correctness from the user’s point of view – quality of locale and time zone 
 data – is clearly an issue (conformance with the Ecma standard is not, as 
 Jeff pointed out). In addition, a system ICU isn’t always available to us: On 
 Mac, it’s not public API; on some Linux versions, it’s built with version 
 numbers baked into function names, which we can’t link against.
 
 
 On May 1, 2014, at 12:25 , Jeff Walden jwalden+...@mit.edu wrote:
 
  On 04/28/2014 05:59 AM, Henri Sivonen wrote:
  Hopefully we didn't remove collation rules, since that's the part we
  are supposed to 

Re: Policing dead/zombie code in m-c

2014-05-28 Thread Nathan Froyd
- Original Message -
 Is there a way to give the linker a list of functions that you want to have
 as public entry points of a dynamically linked library, and have it strip
 out everything that can’t be reached from these functions? That’s
 essentially what happens when you statically link a library into another,
 and the list of ICU functions that Mozilla code calls wouldn’t be
 excessively long. The bulk of them can be found in this stubs section:
 http://mxr.mozilla.org/mozilla-central/source/js/src/builtin/Intl.cpp#62

Assuming that ICU is already compiled with the moral equivalent of GCC's 
-ffunction-sections -fdata-sections or MSVC's /Gy, then statically linking ICU 
into libxul should already strip out all the un-needed ICU bits (when using the 
appropriate linker option).  I see that intl/icu/source/configure.ac has bits 
to turn those on for Linux, but not for Mac or Windows.  I can't tell whether 
we enable all the flags to convince ICU to use those options, though we might 
be passing some of Gecko's compilation flags into ICU's build process, in which 
case this is a moot point.

For the dynamic library case, if you're using symbol maps (Linux) or export 
lists (Windows), using -ffunction-sections -fdata-sections or equivalent should 
given you the same effect, assuming you trim down your mapfile appropriately.

-Nathan
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-05-27 Thread Norbert Lindenberg
Sorry, I only now discovered this thread. Below are replies to many of the 
issues raised with regards to the work I did adding ICU and the ECMAScript 
Internationalization API early last year.

A lot of background information is in this document (no longer fully up to 
date):
http://lindenbergsoftware.com/mozilla/implementation.html
and in Jeff’s more current one:
https://wiki.mozilla.org/User:Waldo/Internationalization_API


On Apr 24, 2014, at 5:31 , Henri Sivonen hsivo...@hsivonen.fi wrote:

 * Are we building and shipping dead code in ICU on B2G?

B2G doesn’t have ICU and the internationalization API yet:
https://bugzilla.mozilla.org/show_bug.cgi?id=866301
…but of course that’s only a short-term fix.


On Apr 24, 2014, at 11:31 , ISHIKAWA,chiaki ishik...@yk.rim.or.jp wrote:

 There is at least some code in ICU that is
 - used by FF, but
 - not used by TB.

As far as I know, the ECMAScript Internationalization API is so far only 
enabled for desktop Firefox.


On Apr 24, 2014, at 5:49 , Till Schneidereit t...@tillschneidereit.net wrote:

 One idea was to slim down the ICU build and get rid of some
 things not needed for Intl. I might very well be mistaken, but I'm not
 aware of this having happened.

It did happen. ICU has a number of build flags that can be used to reduce code 
and data size. I used the applicable subset to shrink the binary size to less 
than half of the size of a full ICU build. Mozilla doesn’t even have the full 
set of ICU source files – the import script drops the sources for encoding 
mappings, break iterators, language names and much else.
http://mxr.mozilla.org/mozilla-central/source/build/autoconf/icu.m4#145
http://mxr.mozilla.org/mozilla-central/source/build/autoconf/icu.m4#213
http://mxr.mozilla.org/mozilla-central/source/intl/update-icu.sh#32


On Apr 24, 2014, at 16:03 , Ehsan Akhgari ehsan.akhg...@gmail.com wrote:

 ... We build and ship *all* of ICU, and presumably ship a tiny portion of it.

We don’t – see above.

 This is at least in part due to bug 915735, you can go and read the bug to 
 see that I did the best I could there, given the constraints that I had (not 
 understanding how the magic of the interaction of ICU and our build system 
 worked, not understanding which parts of ICU we actually want to use, the 
 performance and code quality limitations of MSVC's PGO compiler, the fact 
 that increasing the amount of time we allow our builds to progress is rocket 
 science, etc.

I don’t have numbers for this change specifically, but during my investigation 
I found that on Mac statically linking ICU into libmozjs reduced the binary 
size by almost 600KB, compared to building ICU as a separate DLL.


On Apr 25, 2014, at 0:31 , Henri Sivonen hsivo...@hsivonen.fi wrote:

 Therefore, it looks like we should turn off (if we haven't already):
 * The ICU LayoutEngine.
 * Ustdio
 * ICU encoding converters and their mapping tables.
 * ICU break iterators and their data.
 * ICU transliterators and their data.

All done, except for a few remaining encoding converters:

 However, that flag is misdesigned in the sense that it considers
 US-ASCII, ISO-8859-1, UTF-7, UTF-32, CESU-8, SCSU and BOCU-1 as
 non-legacy, even though, frankly, those are legacy, too. (UTF-16 is
 legacy also, but it's legacy we need, since both ICU and Gecko are
 UTF-16 legacy code bases!)
 http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#267

UTF-16 may be legacy from the point of view of encoding HTML pages, but 
otherwise it’s an essential part of most of today’s computing environments – 
and the alternative in many cases would be UTF-32. But at least four of the 
above are clearly obsolete.

 Also, If the ICU build system is an configurable enough, I think we
 should consider identifying what parts of ICU we can delete even
 though the build system doesn't let us to and then automate the
 deletion as a script so that it can be repeated with future imports of
 ICU.

That seems worth investigating, but getting the build system to strip out as 
much as possible might be more effective – see the next item.


On Apr 25, 2014, at 8:12 , Trevor Saunders trev.saund...@gmail.com wrote:

 Well, it really depends how you approach the problem of finding code
 thats dead.  One way is to decide code really should be dead by reading
 it and knowing what it accomplishes doesn't matter.  Another is to
 prove to your self that code is unreachable and so can be removed
 without effecting anything.  Both of these require a pretty good bit of
 knowledge and skill.  On the other hand the below question prompts
 another approach.

Is there a way to give the linker a list of functions that you want to have as 
public entry points of a dynamically linked library, and have it strip out 
everything that can’t be reached from these functions? That’s essentially what 
happens when you statically link a library into another, and the list of ICU 
functions that Mozilla code calls wouldn’t be 

Re: Policing dead/zombie code in m-c

2014-05-01 Thread Jeff Walden
On 04/28/2014 05:59 AM, Henri Sivonen wrote:
 Hopefully we didn't remove collation rules, since that's the part we
 are supposed to be using ICU for! :-)

I'm uncertain exactly what the terminology means necessarily for us, but 
intl/icu-patches/genrb-omitCollationRules.diff exists because a flag passed in 
the ICU build process to omit collation rules was broken in 52, and we had to 
fix it ourselves.  (Well, technically I wrote a patch-fix, submitted it 
upstream, then they rewrote it to be sane and I backported it to our tree.  :-) 
)  So we do omit collation-related data, that was claimed in the code and 
elsewhere to be collation rules, that's not being used by us.

Jeff
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-30 Thread Ehsan Akhgari

On 2014-04-29, 4:46 AM, Henri Sivonen wrote:

On Mon, Apr 28, 2014 at 5:23 PM, Ehsan Akhgari ehsan.akhg...@gmail.com wrote:

On 2014-04-28, 10:17 AM, Henri Sivonen wrote:


On Mon, Apr 28, 2014 at 4:36 PM, Ehsan Akhgari ehsan.akhg...@gmail.com
wrote:


On 2014-04-28, 8:59 AM, Henri Sivonen wrote:

New question:

We have various scriptable nsIFoo stuff (e.g. nsIParserService,
nsIScriptableUConv) on the fringes of Gecko for use by mailnews, the
Firefox UI or extensions. Since Gaia doesn't use XPCOM, those things
are dead code in B2G, right? Would it make sense to invest time in
#ifdefing them out for B2G? Or are they already left out by some
mechanism not obvious to me?


No, there is no such mechanism.  We should turn these off where we can.


OK. Is
if CONFIG['MOZ_WIDGET_TOOLKIT'] != 'gonk':
the right check to use in moz.build?


Depends on what you mean.  That check would be true for b2g desktop for
example, since it doesn't use gonk.  If you want to turn something off for
b2g entirely, use CONFIG['MOZ_B2G'].


Thanks. I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1003028 .

When attempting to file dependencies, I discovered that some cruft
that I thought could be turned off for B2G is used more broadly than I
thought, because previously I was looking at C++ callers forgetting
that so .jsm files in m-c are still B2G-relevant. Is there an easy way
to see what JS code in mozilla-central is used by B2G?


Not that I know of.  You need to look at the moz.build/jar.mn file and 
see if the file is included based on any conditionals.


Cheers,
Ehsan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-29 Thread Henri Sivonen
On Mon, Apr 28, 2014 at 5:23 PM, Ehsan Akhgari ehsan.akhg...@gmail.com wrote:
 On 2014-04-28, 10:17 AM, Henri Sivonen wrote:

 On Mon, Apr 28, 2014 at 4:36 PM, Ehsan Akhgari ehsan.akhg...@gmail.com
 wrote:

 On 2014-04-28, 8:59 AM, Henri Sivonen wrote:
 New question:

 We have various scriptable nsIFoo stuff (e.g. nsIParserService,
 nsIScriptableUConv) on the fringes of Gecko for use by mailnews, the
 Firefox UI or extensions. Since Gaia doesn't use XPCOM, those things
 are dead code in B2G, right? Would it make sense to invest time in
 #ifdefing them out for B2G? Or are they already left out by some
 mechanism not obvious to me?

 No, there is no such mechanism.  We should turn these off where we can.

 OK. Is
 if CONFIG['MOZ_WIDGET_TOOLKIT'] != 'gonk':
 the right check to use in moz.build?

 Depends on what you mean.  That check would be true for b2g desktop for
 example, since it doesn't use gonk.  If you want to turn something off for
 b2g entirely, use CONFIG['MOZ_B2G'].

Thanks. I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1003028 .

When attempting to file dependencies, I discovered that some cruft
that I thought could be turned off for B2G is used more broadly than I
thought, because previously I was looking at C++ callers forgetting
that so .jsm files in m-c are still B2G-relevant. Is there an easy way
to see what JS code in mozilla-central is used by B2G?

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-28 Thread Ehsan Akhgari

On 2014-04-28, 8:59 AM, Henri Sivonen wrote:

New question:

We have various scriptable nsIFoo stuff (e.g. nsIParserService,
nsIScriptableUConv) on the fringes of Gecko for use by mailnews, the
Firefox UI or extensions. Since Gaia doesn't use XPCOM, those things
are dead code in B2G, right? Would it make sense to invest time in
#ifdefing them out for B2G? Or are they already left out by some
mechanism not obvious to me?


No, there is no such mechanism.  We should turn these off where we can. 
 With bug 996061, even marking them as non-scriptable would help.


Cheers,
Ehsan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-26 Thread Jeff Walden
On 04/24/2014 05:49 AM, Till Schneidereit wrote:
 Questions:
  * Are we building and shipping dead code in ICU on B2G?
 
 I don't know the state of ICU on B2G, but if we have it enabled there, then 
 almost certainly, yes.

There's doubtless *some* dead code in the ICU we build, yes.  (I don't remember 
whether we build ICU on b2g yet, tho.)  Various ICU components were removed as 
part of building it into Mozilla (I'd have to check Makefile.in/moz.build to 
get a complete list).  It's possible there are more, but I thought Norbert 
picked off the low-hanging fruit at the very least.  Someone might be able to 
further split up ICU into subcomponents and pick off more, with enough 
investment of time in it.

  * The bug about building useless code in ICU has been open since
 November. Whose responsibility is it to make sure we stop building
 stuff that we don't use in ICU?
 
 I guess that this isn't entirely clear.

In practice it's largely no one's.  I've been the one to update ICU the one 
time it's happened so far (we're due for another at this point, I just need to 
find time to do it).  But that's just porting our patch set forward, plus 
debugging upstream issues.  I have no particular expertise (beyond that of any 
random Mozillian) in tackling this sort of thing in ICU.

  * Do we have any mechanisms in place for preventing stuff like the
 ICU encoding converters becoming part of the building the future? When
 people propose to import third-party code, do reviewers typically ask
 if we are importing more than we intend to use? Clearly, considering
 that it is hard to get people to remove unused code from the build
 after the code has landed, we shouldn't have allowed code like the ICU
 encoding converters to become part of the build in the first place?

I agree in spirit, but this is kind of pedantic.  I really don't think 
clearly is so obviously true, given that by nature of how we use ICU, we're 
only feeding it UChar/jschar/char16_t UTF-16 strings, so it's totally 
non-obvious how ICU encoders would ever be triggered.  Not ideal, sure.  But as 
we use it, encoders simply don't come into things, at least not in any 
user-visible way.  (Conceivably they might be invoked in the back end when 
reading locale data from the file system, but that's presumably correctly 
encoded not as UTF-7 or so.  It only really matters if the encoders are visible 
on the web, which I don't believe they are.)

 I remember that back when Norbert first presented his Intl implementation 
 work at the JS work week in 2012, we discussed the impact ICU would have on 
 download size. One idea was to slim down the ICU build and get rid of some 
 things not needed for Intl. I might very well be mistaken, but I'm not aware 
 of this having happened.

It happened.  We removed a number of things like collation rules, some of the 
ICU components, and more beyond that.

 We looked at what other vendors were doing and saw that Google essentially 
 just bundle all of ICU and swallow the increase in download size.

Well, that was justification for shipping ICU.  Not so much justification for 
shipping more of it than needed.

 Also, properly integrating ICU into our build system was a major undertaking 
 in itself, so getting to a point where we could ship it at all might have 
 taken precedence.

That's reasonably accurate, combined with the basic steps to prune size and 
built code having already been taken.

If there's more of ICU that's easily pruned, I'm all for it.  If pruning harder 
requires splitting ICU up into more components, that's fine, too.  But someone 
has to do it, and my skills and knowledge lie elsewhere.

Jeff
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-26 Thread Jeff Walden
On 04/25/2014 02:10 AM, Henri Sivonen wrote:
 Different means old, right? Having an old version is a correctness
 problem, if the CLDR has changed since.

Depends what's considered correctness.  The ECMA Internationalization API 
doesn't specify behavior, so older just means not-as-good, not incorrect.  
(Technically you could expose the Intl API without *any* locale data.  It'd 
defeat the purpose of Intl, but it would not be incorrect.)  I don't know to 
what extent older data is a correctness issue for anything else using ICU.  I'd 
be kind of astonished if slightly-old data ever constituted a black-letter 
correctness issue.  Specs don't/can't move fast enough for the difference 
between ICU/CLDR 51/52 to matter, correctness-wise.

Jeff
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-25 Thread Henri Sivonen
On Thu, Apr 24, 2014 at 4:20 PM, Benoit Jacob jacob.benoi...@gmail.com wrote:
 2014-04-24 8:31 GMT-04:00 Henri Sivonen hsivo...@hsivonen.fi:

 I have prepared a queue of patches that removes Netscape-era (circa
 1999) internationalization code that efforts to implement the Encoding
 Standard have shown unnecessary to have in Firefox. This makes libxul
 on ARMv7 smaller by 181 KB, so that's a win.

 Have we measured the impact of this change on actual memory usage (as
 opposed to virtual address space size) ?

No, we haven't. I don't have a B2G phone, but I could give my whole
patch queue in one diff to someone who wants to try.

 Have we explored how much this problem could be automatically helped by the
 linker being smart about locality?

Not to my knowledge, but I'm very skeptical about getting these
benefits by having the linker be smart so that the dead code ends up
on memory pages that  aren't actually mapped to real RAM.

The code that is no longer in use is sufficiently intermingled with
code that's still is in use. Useful and useless plain old C data is
included side-by-side. Useful and useless classes are included next to
each other in unified compilation units. Since the classes are
instantiated via XPCOM, a linker that's unaware of XPCOM couldn't tell
that some classes are in use and some aren't via static analysis. All
of them would look equally dead or alive depending on what we do you
take on the root of the caller chain being function pointers in a
contract ID table.

Using PGO to determine what's dead code and what's not wouldn't work,
either, if the profiling run was load mozilla.org, because the run
would exercise too little code, or if the profiling run was all the
unit tests, because the profiling run would exercise too much code.

On Fri, Apr 25, 2014 at 2:03 AM, Ehsan Akhgari ehsan.akhg...@gmail.com wrote:
 * Are we building and shipping dead code in ICU on B2G?

 No.  That is at least partly covered by bug 864843.

Using system ICU seems wrong in terms of correctness. That's the
reason why we don't use system ICU on Mac and desktop Linux, right?

For a given phone, the Android base system practically never updates,
so for a given Firefox version, the Web-exposed APIs would have as
many behaviors as there are differing ICU snapshots on different
Android versions out there.

As for B2G, considering that Gonk is supposed to update less often
than Gecko, it seems like a bad idea to have ICU be part of Gonk
rather than part of Gecko on B2G.

 In my experience, ICU is unfortunately a hot potato. :(  The real blocker
 there is finding someone who can tell us what bits of ICU _are_ used in the
 JS engine.

Apart from ICU initialization/shutdown, the callers seem to be
http://mxr.mozilla.org/mozilla-central/source/js/src/builtin/Intl.cpp
and http://mxr.mozilla.org/mozilla-central/source/js/src/jsstr.cpp#852
.

So the JS engine uses:
 * Collation
 * Number formatting
 * Date and time formatting
 * Normalization

It looks like the JS engine has its own copy of the Unicode database
for other purposes. It seems like that should be unified with ICU so
that there'd be only one copy of the Unicode database.

Additionally, we should probably rewrite nsCollation users to use ICU
collation and delete nsCollation.

Therefore, it looks like we should turn off (if we haven't already):
 * The ICU LayoutEngine.
 * Ustdio
 * ICU encoding converters and their mapping tables.
 * ICU break iterators and their data.
 * ICU transliterators and their data.

http://apps.icu-project.org/datacustom/ gives a good idea of what
there is to turn off.

 The parts used in Gecko for input type=number are pretty
 small.  And of course someone needs to figure out the black magic of
 conveying the information to the ICU build system.

So it looks like we already build with UCONFIG_NO_LEGACY_CONVERSION:
http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#264

However, that flag is misdesigned in the sense that it considers
US-ASCII, ISO-8859-1, UTF-7, UTF-32, CESU-8, SCSU and BOCU-1 as
non-legacy, even though, frankly, those are legacy, too. (UTF-16 is
legacy also, but it's legacy we need, since both ICU and Gecko are
UTF-16 legacy code bases!)
http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#267

So I guess the situation isn't quite as bad as I thought.

We should probably set UCONFIG_NO_CONVERSION to 1 and
U_CHARSET_IS_UTF8 to 1 per:
http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#248
After all, we should easily be able to make sure that we don't use
non-UTF-8 encodings when passing char* to ICU.

Also, If the ICU build system is an configurable enough, I think we
should consider identifying what parts of ICU we can delete even
though the build system doesn't let us to and then automate the
deletion as a script so that it can be repeated with future imports of
ICU.

   * Do we have any mechanisms in place for 

Re: Policing dead/zombie code in m-c

2014-04-25 Thread Henri Sivonen
On Fri, Apr 25, 2014 at 10:31 AM, Henri Sivonen hsivo...@hsivonen.fi wrote:
 So it looks like we already build with UCONFIG_NO_LEGACY_CONVERSION:
 http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#264

Oops. No. I misread.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-25 Thread Mike Hommey
On Fri, Apr 25, 2014 at 10:31:44AM +0300, Henri Sivonen wrote:
 Using system ICU seems wrong in terms of correctness. That's the
 reason why we don't use system ICU on Mac and desktop Linux, right?

Actually the main reason is that every version of desktop linux has a
different version of ICU, which has a different ABI. I /guess/ some
linux distros are actually using system ICU.

Mike
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-25 Thread Henri Sivonen
On Fri, Apr 25, 2014 at 10:31 AM, Henri Sivonen hsivo...@hsivonen.fi wrote:
 Therefore, it looks like we should turn off (if we haven't already):
  * The ICU LayoutEngine.

I don't find this code in the tree at all.

  * Ustdio

I don't yet know what we are doing about this one.

  * ICU encoding converters and their mapping tables.

It looks like we are actually  turning of the converters that need
conversion tables.
http://mxr.mozilla.org/mozilla-central/source/build/autoconf/icu.m4#146

(Previously, I thought we had customized uconfig.h, but then I found icu.m4.)

We are still building algorithmic converters, though. I'll try to turn
those off.

  * ICU break iterators and their data.
  * ICU transliterators and their data.

Looking at the icu.m4 script, it looks like we already turn off the
code for these.

It's not yet clear to me what part of the build system is responsible
for omitting the associated data, but considering that the word as
that ICU added around 3 MB, we have to be subsetting the data already.

Additionally, I noticed that we are not using ICU IDNA support and
IDNA support can be turned off with a define.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-25 Thread Henri Sivonen
On Fri, Apr 25, 2014 at 11:26 AM, Mike Hommey m...@glandium.org wrote:
 On Fri, Apr 25, 2014 at 10:31:44AM +0300, Henri Sivonen wrote:
 Using system ICU seems wrong in terms of correctness. That's the
 reason why we don't use system ICU on Mac and desktop Linux, right?

 Actually the main reason is that every version of desktop linux has a
 different version of ICU

Different means old, right? Having an old version is a correctness
problem, if the CLDR has changed since.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-25 Thread Benoit Jacob
2014-04-25 3:31 GMT-04:00 Henri Sivonen hsivo...@hsivonen.fi:

 On Thu, Apr 24, 2014 at 4:20 PM, Benoit Jacob jacob.benoi...@gmail.com
 wrote:
  2014-04-24 8:31 GMT-04:00 Henri Sivonen hsivo...@hsivonen.fi:
 
  I have prepared a queue of patches that removes Netscape-era (circa
  1999) internationalization code that efforts to implement the Encoding
  Standard have shown unnecessary to have in Firefox. This makes libxul
  on ARMv7 smaller by 181 KB, so that's a win.
 
  Have we measured the impact of this change on actual memory usage (as
  opposed to virtual address space size) ?

 No, we haven't. I don't have a B2G phone, but I could give my whole
 patch queue in one diff to someone who wants to try.

  Have we explored how much this problem could be automatically helped by
 the
  linker being smart about locality?

 Not to my knowledge, but I'm very skeptical about getting these
 benefits by having the linker be smart so that the dead code ends up
 on memory pages that  aren't actually mapped to real RAM.

 The code that is no longer in use is sufficiently intermingled with
 code that's still is in use. Useful and useless plain old C data is
 included side-by-side. Useful and useless classes are included next to
 each other in unified compilation units. Since the classes are
 instantiated via XPCOM, a linker that's unaware of XPCOM couldn't tell
 that some classes are in use and some aren't via static analysis. All
 of them would look equally dead or alive depending on what we do you
 take on the root of the caller chain being function pointers in a
 contract ID table.

 Using PGO to determine what's dead code and what's not wouldn't work,
 either, if the profiling run was load mozilla.org, because the run
 would exercise too little code, or if the profiling run was all the
 unit tests, because the profiling run would exercise too much code.


Thanks for this answer, it does totally make sense (and shed light on the
specifics here that make this hard to solve automatically).

Benoit




 On Fri, Apr 25, 2014 at 2:03 AM, Ehsan Akhgari ehsan.akhg...@gmail.com
 wrote:
  * Are we building and shipping dead code in ICU on B2G?
 
  No.  That is at least partly covered by bug 864843.

 Using system ICU seems wrong in terms of correctness. That's the
 reason why we don't use system ICU on Mac and desktop Linux, right?

 For a given phone, the Android base system practically never updates,
 so for a given Firefox version, the Web-exposed APIs would have as
 many behaviors as there are differing ICU snapshots on different
 Android versions out there.

 As for B2G, considering that Gonk is supposed to update less often
 than Gecko, it seems like a bad idea to have ICU be part of Gonk
 rather than part of Gecko on B2G.

  In my experience, ICU is unfortunately a hot potato. :(  The real blocker
  there is finding someone who can tell us what bits of ICU _are_ used in
 the
  JS engine.

 Apart from ICU initialization/shutdown, the callers seem to be
 http://mxr.mozilla.org/mozilla-central/source/js/src/builtin/Intl.cpp
 and http://mxr.mozilla.org/mozilla-central/source/js/src/jsstr.cpp#852
 .

 So the JS engine uses:
  * Collation
  * Number formatting
  * Date and time formatting
  * Normalization

 It looks like the JS engine has its own copy of the Unicode database
 for other purposes. It seems like that should be unified with ICU so
 that there'd be only one copy of the Unicode database.

 Additionally, we should probably rewrite nsCollation users to use ICU
 collation and delete nsCollation.

 Therefore, it looks like we should turn off (if we haven't already):
  * The ICU LayoutEngine.
  * Ustdio
  * ICU encoding converters and their mapping tables.
  * ICU break iterators and their data.
  * ICU transliterators and their data.

 http://apps.icu-project.org/datacustom/ gives a good idea of what
 there is to turn off.

  The parts used in Gecko for input type=number are pretty
  small.  And of course someone needs to figure out the black magic of
  conveying the information to the ICU build system.

 So it looks like we already build with UCONFIG_NO_LEGACY_CONVERSION:

 http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#264

 However, that flag is misdesigned in the sense that it considers
 US-ASCII, ISO-8859-1, UTF-7, UTF-32, CESU-8, SCSU and BOCU-1 as
 non-legacy, even though, frankly, those are legacy, too. (UTF-16 is
 legacy also, but it's legacy we need, since both ICU and Gecko are
 UTF-16 legacy code bases!)

 http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#267

 So I guess the situation isn't quite as bad as I thought.

 We should probably set UCONFIG_NO_CONVERSION to 1 and
 U_CHARSET_IS_UTF8 to 1 per:

 http://mxr.mozilla.org/mozilla-central/source/intl/icu/source/common/unicode/uconfig.h#248
 After all, we should easily be able to make sure that we don't use
 non-UTF-8 encodings when passing char* to ICU.

 Also, If the ICU 

Re: Policing dead/zombie code in m-c

2014-04-25 Thread Mike Hommey
On Fri, Apr 25, 2014 at 12:10:24PM +0300, Henri Sivonen wrote:
 On Fri, Apr 25, 2014 at 11:26 AM, Mike Hommey m...@glandium.org wrote:
  On Fri, Apr 25, 2014 at 10:31:44AM +0300, Henri Sivonen wrote:
  Using system ICU seems wrong in terms of correctness. That's the
  reason why we don't use system ICU on Mac and desktop Linux, right?
 
  Actually the main reason is that every version of desktop linux has a
  different version of ICU
 
 Different means old, right? Having an old version is a correctness
 problem, if the CLDR has changed since.

Can be both ways. Before we upgraded to 52, the version on my system was
newer than the one in nightly. In all likeliness, the version on my
system is still newer than the one in one or several or release, beta
and aurora. ICU changes soname for each version, so it's never backwards
compatible.

Mike
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-25 Thread Trevor Saunders
On Fri, Apr 25, 2014 at 10:42:43AM -0400, Mike Hoye wrote:
 On 2014-04-25, 3:31 AM, Henri Sivonen wrote:
 On Thu, Apr 24, 2014 at 4:20 PM, Benoit Jacob jacob.benoi...@gmail.com 
 wrote:
* How should we identify code that we build but that isn't used
 anywhere?
 I'm afraid we need humans for that.
 Yeah, but how do we get humans to do that?
 
 We ask them! There are thousands of people out there who want to help us do
 stuff, if we can give them the right stuff to do.
 
 Noob question here: how accessible is the profiling process, and can
 we/would it be useful to make that participatory?

 Well, it really depends how you approach the problem of finding code
 thats dead.  One way is to decide code really should be dead by reading
 it and knowing what it accomplishes doesn't matter.  Another is to
 prove to your self that code is unreachable and so can be removed
 without effecting anything.  Both of these require a pretty good bit of
 knowledge and skill.  On the other hand the below question prompts
 another approach.

 If we ask a thousand people to run a nightly under a profiler for a few days
 and aggregated the results, would that be valuable information?

If the users are a good distribution and we collect a list of every
function we run we should be able to compare that to a list of things
that go into libxul, and maybe we'd find something useful.

Trev

 
 - mhoye
 
 
 
 
 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform


signature.asc
Description: Digital signature
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-25 Thread Mike Hoye

On 2014-04-25, 11:12 AM, Trevor Saunders wrote:

On Fri, Apr 25, 2014 at 10:42:43AM -0400, Mike Hoye wrote:

If we ask a thousand people to run a nightly under a profiler for a few days
and aggregated the results, would that be valuable information?

If the users are a good distribution and we collect a list of every
function we run we should be able to compare that to a list of things
that go into libxul, and maybe we'd find something useful.


I guess the question becomes how practical it is to set this up and run 
it, vs. the anticipated value of the results.


(Whether it's a good fit for this task or not, community-assisted 
tooling is something we can do that not a lot of other orgs can, and I 
think it's worth remembering that we've got that potentially very heavy 
club in the bag.)



- mhoye
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-25 Thread Ehsan Akhgari

On 2014-04-25, 3:31 AM, Henri Sivonen wrote:

On Fri, Apr 25, 2014 at 2:03 AM, Ehsan Akhgari ehsan.akhg...@gmail.com wrote:

* Are we building and shipping dead code in ICU on B2G?


No.  That is at least partly covered by bug 864843.


Using system ICU seems wrong in terms of correctness. That's the
reason why we don't use system ICU on Mac and desktop Linux, right?


If I'm reading that bug correctly, we're not trying to use the system 
ICU there.



In my experience, ICU is unfortunately a hot potato. :(  The real blocker
there is finding someone who can tell us what bits of ICU _are_ used in the
JS engine.


Apart from ICU initialization/shutdown, the callers seem to be
http://mxr.mozilla.org/mozilla-central/source/js/src/builtin/Intl.cpp
and http://mxr.mozilla.org/mozilla-central/source/js/src/jsstr.cpp#852
.

So the JS engine uses:
  * Collation
  * Number formatting
  * Date and time formatting
  * Normalization

It looks like the JS engine has its own copy of the Unicode database
for other purposes. It seems like that should be unified with ICU so
that there'd be only one copy of the Unicode database.

Additionally, we should probably rewrite nsCollation users to use ICU
collation and delete nsCollation.

Therefore, it looks like we should turn off (if we haven't already):
  * The ICU LayoutEngine.
  * Ustdio
  * ICU encoding converters and their mapping tables.
  * ICU break iterators and their data.
  * ICU transliterators and their data.

http://apps.icu-project.org/datacustom/ gives a good idea of what
there is to turn off.


Are you looking into these?  They all seem like great things to investigate.


Also, If the ICU build system is an configurable enough, I think we
should consider identifying what parts of ICU we can delete even
though the build system doesn't let us to and then automate the
deletion as a script so that it can be repeated with future imports of
ICU.


You need to ask Waldo about that, I think, since he is the person who 
did the latest ICU update.



   * Do we have any mechanisms in place for preventing stuff like the
ICU encoding converters becoming part of the building the future?


No, that is not possible to automate.


I was thinking of policy / review solutions.


Oh I see.  I think this should be part of the normal code review process 
already.  But perhaps we just need to act on it better.  I can't think 
of any policy changes which would be of a substantial help, the best 
that I can think of is reminding people through these kinds of 
discussions (and thanks for starting this one!)



   * How should we identify code that we build but that isn't used
anywhere?


I'm afraid we need humans for that.


Yeah, but how do we get humans to do that?


It's a hard question.  Not everybody is going to care much about this. 
In the case of something like ICU it's actually relatively easy for 
someone to detect the dead code here even without intimate knowledge of 
the code being used, but in many cases determining dead code really 
requires a bit of domain expertise, or the willingness to read and learn 
some new code.


I think it would be a great idea if module owners filed mentored bugs on 
deleting the dead code that they already know of, or even to investigate 
whether something that they suspect is actually dead code.


Cheers,
Ehsan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-24 Thread Till Schneidereit
(CC'ing people who have worked on the ICU integration)

On Thu, Apr 24, 2014 at 2:31 PM, Henri Sivonen hsivo...@hsivonen.fi wrote:

 However, especially in the context of slimming down our own set of
 encoding converters, it's rather demotivating to see that at least on
 desktop, we are building ICU encoding converters that we don't use.
 See https://bugzilla.mozilla.org/show_bug.cgi?id=944348 . This isn't
 even a matter of building code that some might argue we maybe should
 use (https://bugzilla.mozilla.org/show_bug.cgi?id=724540). We are even
 building ICU encoding converters that we should never use even if we
 gave up on our own converters. We're building stuff like BOCU-1 that's
 explicitly banned by the HTML spec! (In general, it's uncool that
 abandoned researchy stuff like BOCU-1 is included by default in a
 widely-used production library like ICU.)

 Questions:
  * Are we building and shipping dead code in ICU on B2G?


I don't know the state of ICU on B2G, but if we have it enabled there, then
almost certainly, yes.


  * The bug about building useless code in ICU has been open since
 November. Whose responsibility is it to make sure we stop building
 stuff that we don't use in ICU?


I guess that this isn't entirely clear.


  * Do we have any mechanisms in place for preventing stuff like the
 ICU encoding converters becoming part of the building the future? When
 people propose to import third-party code, do reviewers typically ask
 if we are importing more than we intend to use? Clearly, considering
 that it is hard to get people to remove unused code from the build
 after the code has landed, we shouldn't have allowed code like the ICU
 encoding converters to become part of the build in the first place?


I remember that back when Norbert first presented his Intl implementation
work at the JS work week in 2012, we discussed the impact ICU would have on
download size. One idea was to slim down the ICU build and get rid of some
things not needed for Intl. I might very well be mistaken, but I'm not
aware of this having happened. We looked at what other vendors were doing
and saw that Google essentially just bundle all of ICU and swallow the
increase in download size. Also, properly integrating ICU into our build
system was a major undertaking in itself, so getting to a point where we
could ship it at all might have taken precedence.


--
till
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-24 Thread Benoit Jacob
2014-04-24 8:31 GMT-04:00 Henri Sivonen hsivo...@hsivonen.fi:

 I have prepared a queue of patches that removes Netscape-era (circa
 1999) internationalization code that efforts to implement the Encoding
 Standard have shown unnecessary to have in Firefox. This makes libxul
 on ARMv7 smaller by 181 KB, so that's a win.


Have we measured the impact of this change on actual memory usage (as
opposed to virtual address space size) ?

Have we explored how much this problem could be automatically helped by the
linker being smart about locality?

I totally agree about the value of removing dead code (if only to make the
codebase easier to read and maintain), I just wonder if there might be a
shortcut to get the short-term memory usage benefits that you mention.

Benoit




 However, especially in the context of slimming down our own set of
 encoding converters, it's rather demotivating to see that at least on
 desktop, we are building ICU encoding converters that we don't use.
 See https://bugzilla.mozilla.org/show_bug.cgi?id=944348 . This isn't
 even a matter of building code that some might argue we maybe should
 use (https://bugzilla.mozilla.org/show_bug.cgi?id=724540). We are even
 building ICU encoding converters that we should never use even if we
 gave up on our own converters. We're building stuff like BOCU-1 that's
 explicitly banned by the HTML spec! (In general, it's uncool that
 abandoned researchy stuff like BOCU-1 is included by default in a
 widely-used production library like ICU.)

 Questions:
  * Are we building and shipping dead code in ICU on B2G?
  * The bug about building useless code in ICU has been open since
 November. Whose responsibility is it to make sure we stop building
 stuff that we don't use in ICU?
  * Do we have any mechanisms in place for preventing stuff like the
 ICU encoding converters becoming part of the building the future? When
 people propose to import third-party code, do reviewers typically ask
 if we are importing more than we intend to use? Clearly, considering
 that it is hard to get people to remove unused code from the build
 after the code has landed, we shouldn't have allowed code like the ICU
 encoding converters to become part of the build in the first place?
  * How should we identify code that we build but that isn't used anywhere?
  * How should we identify code that we build as part of Firefox but is
 used only in other apps (Thunderbird, SeaMonkey, etc.)?
  * Are there obvious places that people should inspect for code that's
 being built but not used? Some libs that got imported for WebRTC
 maybe?

 --
 Henri Sivonen
 hsivo...@hsivonen.fi
 https://hsivonen.fi/
 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-24 Thread Ben Kelly

On 4/24/2014 9:20 AM, Benoit Jacob wrote:

2014-04-24 8:31 GMT-04:00 Henri Sivonen hsivo...@hsivonen.fi:

I have prepared a queue of patches that removes Netscape-era (circa
1999) internationalization code that efforts to implement the Encoding
Standard have shown unnecessary to have in Firefox. This makes libxul
on ARMv7 smaller by 181 KB, so that's a win.


Have we measured the impact of this change on actual memory usage (as
opposed to virtual address space size) ?


We are also limited by disk space in the system partition on devices 
like tarako.  So reducing lib size has a benefit even if the code is 
usually paged out.


Ben
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-24 Thread Trevor Saunders
On Thu, Apr 24, 2014 at 09:20:06AM -0400, Benoit Jacob wrote:
 2014-04-24 8:31 GMT-04:00 Henri Sivonen hsivo...@hsivonen.fi:
 
  I have prepared a queue of patches that removes Netscape-era (circa
  1999) internationalization code that efforts to implement the Encoding
  Standard have shown unnecessary to have in Firefox. This makes libxul
  on ARMv7 smaller by 181 KB, so that's a win.
 
 
 Have we measured the impact of this change on actual memory usage (as
 opposed to virtual address space size) ?

that's probably harder, and might depend on how much spare RAM you have
on the test machine.

 Have we explored how much this problem could be automatically helped by the
 linker being smart about locality?

people have and are working on that, so to some degree it probably is
already happening.

Trev

 
 I totally agree about the value of removing dead code (if only to make the
 codebase easier to read and maintain), I just wonder if there might be a
 shortcut to get the short-term memory usage benefits that you mention.
 
 Benoit
 
 
 
 
  However, especially in the context of slimming down our own set of
  encoding converters, it's rather demotivating to see that at least on
  desktop, we are building ICU encoding converters that we don't use.
  See https://bugzilla.mozilla.org/show_bug.cgi?id=944348 . This isn't
  even a matter of building code that some might argue we maybe should
  use (https://bugzilla.mozilla.org/show_bug.cgi?id=724540). We are even
  building ICU encoding converters that we should never use even if we
  gave up on our own converters. We're building stuff like BOCU-1 that's
  explicitly banned by the HTML spec! (In general, it's uncool that
  abandoned researchy stuff like BOCU-1 is included by default in a
  widely-used production library like ICU.)
 
  Questions:
   * Are we building and shipping dead code in ICU on B2G?
   * The bug about building useless code in ICU has been open since
  November. Whose responsibility is it to make sure we stop building
  stuff that we don't use in ICU?
   * Do we have any mechanisms in place for preventing stuff like the
  ICU encoding converters becoming part of the building the future? When
  people propose to import third-party code, do reviewers typically ask
  if we are importing more than we intend to use? Clearly, considering
  that it is hard to get people to remove unused code from the build
  after the code has landed, we shouldn't have allowed code like the ICU
  encoding converters to become part of the build in the first place?
   * How should we identify code that we build but that isn't used anywhere?
   * How should we identify code that we build as part of Firefox but is
  used only in other apps (Thunderbird, SeaMonkey, etc.)?
   * Are there obvious places that people should inspect for code that's
  being built but not used? Some libs that got imported for WebRTC
  maybe?
 
  --
  Henri Sivonen
  hsivo...@hsivonen.fi
  https://hsivonen.fi/
  ___
  dev-platform mailing list
  dev-platform@lists.mozilla.org
  https://lists.mozilla.org/listinfo/dev-platform
 
 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform


signature.asc
Description: Digital signature
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-24 Thread ISHIKAWA,chiaki

(2014/04/24 21:49), Till Schneidereit wrote:

(CC'ing people who have worked on the ICU integration)

On Thu, Apr 24, 2014 at 2:31 PM, Henri Sivonen hsivo...@hsivonen.fi wrote:


However, especially in the context of slimming down our own set of
encoding converters, it's rather demotivating to see that at least on
desktop, we are building ICU encoding converters that we don't use.
See https://bugzilla.mozilla.org/show_bug.cgi?id=944348 . This isn't
even a matter of building code that some might argue we maybe should
use (https://bugzilla.mozilla.org/show_bug.cgi?id=724540). We are even
building ICU encoding converters that we should never use even if we
gave up on our own converters. We're building stuff like BOCU-1 that's
explicitly banned by the HTML spec! (In general, it's uncool that
abandoned researchy stuff like BOCU-1 is included by default in a
widely-used production library like ICU.)

Questions:
  * Are we building and shipping dead code in ICU on B2G?



I don't know the state of ICU on B2G, but if we have it enabled there, then
almost certainly, yes.



  * The bug about building useless code in ICU has been open since
November. Whose responsibility is it to make sure we stop building
stuff that we don't use in ICU?



I guess that this isn't entirely clear.



  * Do we have any mechanisms in place for preventing stuff like the
ICU encoding converters becoming part of the building the future? When
people propose to import third-party code, do reviewers typically ask
if we are importing more than we intend to use? Clearly, considering
that it is hard to get people to remove unused code from the build
after the code has landed, we shouldn't have allowed code like the ICU
encoding converters to become part of the build in the first place?



I remember that back when Norbert first presented his Intl implementation
work at the JS work week in 2012, we discussed the impact ICU would have on
download size. One idea was to slim down the ICU build and get rid of some
things not needed for Intl. I might very well be mistaken, but I'm not
aware of this having happened. We looked at what other vendors were doing
and saw that Google essentially just bundle all of ICU and swallow the
increase in download size. Also, properly integrating ICU into our build
system was a major undertaking in itself, so getting to a point where we
could ship it at all might have taken precedence.


--
till


This is just a random observation, not answering the original questions.

There is at least some code in ICU that is
 - used by FF, but
 - not used by TB.

I noticed this when I began compiling FF ( under ./mozilla subdirectory 
of C-C, which is basically a copy of M-C [could be outdated by a few 
patches behind.].


There are a few files which did not compile if I enabled -DDEBUG=1 on my 
compilation command line due to broken code within #ifdef DEBUG, 
#endif. But this compilation failure was only seen during FF build.


TB build using the same tree did not complain about this.
So obviously TB does not use such files.

Just an observation I noticed recently.
Bug 963090 - Compilation error while compiling DEBUG BUILD of Firefox 
(C-C) rbnf.cpp, ucbuf.c

https://bugzilla.mozilla.org/show_bug.cgi?id=963090



___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-24 Thread Brian Smith
On Thu, Apr 24, 2014 at 4:03 PM, Ehsan Akhgari ehsan.akhg...@gmail.comwrote:

   * Are there obvious places that people should inspect for code that's

 being built but not used? Some libs that got imported for WebRTC
 maybe?


 Nothing big comes to my mind.  Perhaps hunspell on b2g?


https://bugzilla.mozilla.org/show_bug.cgi?id=611781: Project to reduce the
size of NSS libraries included in Firefox distributions. IIRC, this would
cut 500KB of object code.

Cheers,
Brian
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-24 Thread Ehsan Akhgari

On 2014-04-24, 7:24 PM, Mike Hommey wrote:

On Thu, Apr 24, 2014 at 07:03:09PM -0400, Ehsan Akhgari wrote:

On 2014-04-24, 8:31 AM, Henri Sivonen wrote:

However, especially in the context of slimming down our own set of
encoding converters, it's rather demotivating to see that at least on
desktop, we are building ICU encoding converters that we don't use.
See https://bugzilla.mozilla.org/show_bug.cgi?id=944348 .


Please allow me to upset you even further!  We build and ship *all* of ICU,
and presumably ship a tiny portion of it.  This is at least in part due to
bug 915735, you can go and read the bug to see that I did the best I could
there, given the constraints that I had (not understanding how the magic of
the interaction of ICU and our build system worked, not understanding which
parts of ICU we actually want to use, the performance and code quality
limitations of MSVC's PGO compiler, the fact that increasing the amount of
time we allow our builds to progress is rocket science, etc.  None of this
was fun, but please understand that it wasn't like somebody sat down and
decided that it's a great idea to build and ship all of ICU.


I'm still amazed that we allowed ICU to land that way at all. And there
I don't mean your changes, which don't particularly change how much of
ICU is built.


I'm pretty sure my changes did make things worse, since they effectively 
made the linker unable to eliminate anything assuming that it did back 
when ICU was built into mozjs.dll...


Cheers,
Ehsan

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Policing dead/zombie code in m-c

2014-04-24 Thread Martin Thomson
On 2014-04-24, at 05:31, Henri Sivonen hsivo...@hsivonen.fi wrote:

 * Are there obvious places that people should inspect for code that's
 being built but not used? Some libs that got imported for WebRTC
 maybe?

https://bugzilla.mozilla.org/show_bug.cgi?id=1001114

I’m told that sipcc is 3M stripped.  We certainly don’t use all of it, but I’m 
not sure how much can be safely removed.
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform