Intent to ship: Honoring bogo-XML declaration for character encoding in text/html

2021-03-24 Thread Henri Sivonen
This has now landed and is expected to ride the trains in Firefox 89.

For added historical context:

Prior to HTML parsing getting specified, Gecko and Presto implemented
this in addition to WebKit. At the time, the specification
process paid too much attention to IE behavior as a presumed indicator
of Web-compatibility instead of looking at engine quorum.

Like WebKit, Presto kept this behavior when implementing
HTML5-compliant tokenization and tree building. That is, I was the
only browser implementor fooled into removing this behavior as part of
re-implementing parsing from the spec--not just the tokenization and
tree building layers but also the input stream layer.

What can we learn? Instead of trusting the spec and trusting other
implementors to loudly object to the parts of the spec they don't
intend to follow, proactively check what the others are doing and
adjust sooner.

On Wed, Mar 10, 2021 at 5:56 PM Henri Sivonen  wrote:
>
> # Summary
>
> For compatibility with WebKit and Blink, honor the character encoding
> declared using the XML declaration syntax in text/html.
>
> For reasons explained in https://hsivonen.fi/utf-8-detection/ , unlike
> other encodings, UTF-8 isn't detected from content, so with the demise
> of Trident and EdgeHTML (which don't honor the XML declaration syntax
> in text/html), <?xml version="1.0" encoding="..."?> has become a
> more notable Web compat problem for us. With non-Latin scripts, the
> failure mode is particularly bad for a Web compat problem: The text is
> completely unreadable.
>
> That is, this isn't a feature for Web authors to use. This is to
> address a push factor for users when authors do use this feature.
>
> # Bug
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=673087
>
> # Standard
>
> https://github.com/whatwg/html/pull/1752
>
> # Platform coverage
>
> All
>
> # Preference
>
> To be enabled unconditionally.
>
> # DevTools bug
>
> No integration needed.
>
> # Other browsers
>
> WebKit has had this behavior for a very long time and didn't remove it
> when HTML parsing was standardized.
>
> Blink inherited this from WebKit upon forking.
>
> Trident and EdgeHTML don't have this; their demise changed the balance
> for this feature.
>
> # web-platform-tests
>
> https://hsivonen.com/test/moz/xml-decl/ contains tests which are
> wrapped for WPT as part of the Gecko patch.
>
> --
> Henri Sivonen
> hsivo...@mozilla.com



-- 
Henri Sivonen
hsivo...@mozilla.com
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Intent to prototype: Honoring bogo-XML declaration for character encoding in text/html

2021-03-10 Thread Henri Sivonen
# Summary

For compatibility with WebKit and Blink, honor the character encoding
declared using the XML declaration syntax in text/html.

For reasons explained in https://hsivonen.fi/utf-8-detection/ , unlike
other encodings, UTF-8 isn't detected from content, so with the demise
of Trident and EdgeHTML (which don't honor the XML declaration syntax
in text/html), <?xml version="1.0" encoding="..."?> has become a
more notable Web compat problem for us. With non-Latin scripts, the
failure mode is particularly bad for a Web compat problem: The text is
completely unreadable.

That is, this isn't a feature for Web authors to use. This is to
address a push factor for users when authors do use this feature.
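
As a rough illustration (not the exact algorithm in the spec PR linked
below), honoring the bogo-XML declaration amounts to sniffing the start
of the byte stream for an encoding pseudo-attribute. The Python sketch
below is a hypothetical approximation; the regex and function name are
mine, not Gecko's:

```python
import re

# Hypothetical sketch: extract the encoding label from an
# XML-declaration-like prolog at the start of a text/html byte stream.
# The real parser change follows the spec PR's precise rules; this
# regex is only illustrative.
BOGO_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._:-]+)["\']')

def encoding_from_bogo_xml_decl(prefix: bytes):
    m = BOGO_XML_DECL.match(prefix)
    return m.group(1).decode("ascii") if m else None
```

For example, `encoding_from_bogo_xml_decl(b'<?xml version="1.0"
encoding="Shift_JIS"?><!DOCTYPE html>')` yields "Shift_JIS", while a
stream without the prolog yields None.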

# Bug

https://bugzilla.mozilla.org/show_bug.cgi?id=673087

# Standard

https://github.com/whatwg/html/pull/1752

# Platform coverage

All

# Preference

To be enabled unconditionally.

# DevTools bug

No integration needed.

# Other browsers

WebKit has had this behavior for a very long time and didn't remove it
when HTML parsing was standardized.

Blink inherited this from WebKit upon forking.

Trident and EdgeHTML don't have this; their demise changed the balance
for this feature.

# web-platform-tests

https://hsivonen.com/test/moz/xml-decl/ contains tests which are
wrapped for WPT as part of the Gecko patch.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: User-facing benefits from UA exposure of Android version and Linux CPU architecture

2021-02-18 Thread Henri Sivonen
On Thu, Feb 18, 2021 at 11:26 PM Mike Hommey  wrote:
>
> On Thu, Feb 18, 2021 at 01:51:07PM +0200, Henri Sivonen wrote:
> > Does reporting "Linux aarch64" have significant concrete benefits to
> > users? Would actual presently-existing app download pages break if,
> > for privacy, we always reported "Linux x86_64" on Linux regardless of
> > the actual CPU architecture (or reported it on anything but 32-bit
> > x86)?
>
> Would not exposing the CPU architecture be an option? Are UA sniffers
> expecting the UA format to include the CPU architecture?

In general, changing the format of the UA string is always riskier
than freezing parts to some value that has been common in the past. I
think finding out whether removal would be Web compatible is not worth
the risk, churn, cost, and time investment.

The attempt to take away the Gecko date was a costly episode of churn
that left us with weird divergence between the desktop and mobile
Gecko tokens.

As far as removals from the UA string go, the main success is the
removal of the crypto level token, but that removed an entire
between-semicolons item from the middle of the list.

-- 
Henri Sivonen
hsivo...@mozilla.com


User-facing benefits from UA exposure of Android version and Linux CPU architecture

2021-02-18 Thread Henri Sivonen
We currently expose the major version of Android and the CPU
architecture on Linux in every HTTP request.

That is, these are exposed to passive fingerprinting that doesn't
involve running JS. (On Android, the CPU architecture is exposed via
JavaScript but not in the HTTP User-Agent string. Exposing it probably
doesn't help users, but the exposure probably doesn't contribute
JS-reachable entropy on top of WebGL-exposed values.)

Previously, we've had problems from exposing the Android version. Back
when Firefox still ran on Android versions lower than 4.4, we ended up
reporting 4.4 for those to avoid discrimination by Web sites.

I'm aware of one use case for the Android version that could be
justified from the perspective of what users might want: Deciding if
intent links work. This only requires checking < 6 vs. >= 6, and Fenix
runs on 5 still.

Assuming that we accept the level of breakage that a device currently
running Android 9 or 10 experiences relative to what Android 8 devices
experience (9 and 10 have potential breakage from not having a dot and
a minor version), what would break if we reported 5.0 for anything
below 6 and the latest (currently 10) for anything 6 or above?
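
The proposed capping could be sketched as follows (a hypothetical
helper of my own naming, not actual Gecko code):

```python
def reported_android_version(real_major: int, latest_major: int = 10) -> str:
    # Sketch of the proposal: preserve only the < 6 vs. >= 6 distinction
    # that intent link handling needs, and otherwise report the latest
    # major version (10 at the time of writing).
    return "5.0" if real_major < 6 else str(latest_major)
```

So a device on Android 5 keeps reporting "5.0", while devices on 6
through 10 would all report "10".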

As for the CPU architecture on Linux, on Mac and Windows we don't
expose aarch64 separately. (On Windows, consistent with Edge, aarch64
looks like x86. On Mac, aarch64 looks like x86_64 which itself doesn't
differ from what x86 looked like.)

The software download situation for each desktop platform is
different: On Windows, an x86 stub installer can decide whether to
install the x86, x86_64, or aarch64 app. On Mac, Universal Binaries
make it irrelevant to know the CPU architecture at download time.
On Linux, downloads outside the distro's package manager typically
involve the user having to choose from various options anyway due to
tarball vs. .deb vs. .rpm vs. Flatpak vs. Snap, etc. OTOH, unlike on
Windows and Mac, x86 or x86_64 emulation isn't typically automatically
ready to work on Linux.

We don't ship official builds for aarch64 Linux (do we have plans
to?), but we do have it as an option on try and the configuration is
becoming increasingly relevant in distro-shipped form. However, if we
wanted to avoid the fingerprintability here, it would be good to take
action before Linux on aarch64 is popular enough for us to ship
official builds.

Does reporting "Linux aarch64" have significant concrete benefits to
users? Would actual presently-existing app download pages break if,
for privacy, we always reported "Linux x86_64" on Linux regardless of
the actual CPU architecture (or reported it on anything but 32-bit
x86)?

Historically, distros and BSDs have wanted to draw attention to
themselves instead of letting their users hide in the crowd for
privacy, which is unfortunate considering that the user cohort is
already smaller than the Windows and Mac cohorts. Does that dynamic
apply to ISAs? I.e. should we expect distro maintainers to undo the
change if we made mozilla-central say "Linux x86_64" regardless of the
actual ISA on Linux?

-- 
Henri Sivonen
hsivo...@mozilla.com


Intent to unship: Exposure of 11.x macOS versions in the User-Agent string

2021-02-17 Thread Henri Sivonen
https://bugzilla.mozilla.org/show_bug.cgi?id=1679929 caps the macOS
version exposed in the User-Agent string to 10.15.

The motivation that set this in motion is the Web compat impact of
sites not expecting to see a version starting with 11. The reason for
not exposing Big Sur as 10.16 is that Safari capped the version and
capping the version has a privacy benefit going forward while the
utility of exposing the macOS version is questionable especially after
Safari capped it.
https://bugs.webkit.org/show_bug.cgi?id=217364

As for why not do even better for privacy and report 10.15 on older
versions of macOS, capping the version is a more prudent change.
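
In other words, the chosen behavior caps rather than pins the version.
A hypothetical sketch:

```python
def reported_macos_version(major: int, minor: int) -> str:
    # Cap at 10.15: Big Sur and later (10.16 / 11.x) report as 10.15,
    # while older versions keep reporting their real value (capping,
    # not pinning everything to 10.15).
    if (major, minor) >= (10, 16):
        return "10.15"
    return f"{major}.{minor}"
```

For example, macOS 11.0 reports as "10.15" while macOS 10.12 still
reports as "10.12".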

Chrome will also cap the version like Safari in its UA string, but
Chrome will expose the real version via Sec-CH-UA-Platform-Version,
which neither Safari nor Firefox supports. (Client Hints in general
and the Sec-CH-UA-* parts in particular are beyond the scope of this
email.)
https://groups.google.com/a/chromium.org/g/blink-dev/c/hAI4QoX6rEo/m/qQNPThr0AAAJ

So far, the known use case for Web developers wanting to distinguish
Safari on Catalina vs. Safari on Big Sur relates to WebP support, but
the issue is moot for both Firefox and Chrome, as Firefox and Chrome
advertise WebP via the Accept header and don't rely on the OS decoder.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Gecko performance with newer x86_64 levels

2021-02-10 Thread Henri Sivonen
On Tue, Feb 9, 2021 at 5:35 PM Gian-Carlo Pascutto  wrote:
>
> On 3/02/2021 10:51, Henri Sivonen wrote:
> > I came across 
> > https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/
> > . Previously, when microbenchmarking Rust code that used count_ones()
> > in an inner loop (can't recall what code this was), I noticed 4x
> > runtime speed when compiling for target_cpu=nehalem and running on a
> > much later CPU.
>
> That's an extreme edge case though.

It is an extreme edge case but it's also a case where run-time
dispatch doesn't make sense. The interesting thing is how much these
plus LLVM using newer instructions on its own would add up around the
code base.

> > I'm wondering:
> >
> > Have we done benchmark comparisons with libxul compiled for the
> > newly-defined x86_64 levels?
>
> No. Should be easy to do

In that case, it seems worth trying.

> but I don't expect much to come of it. The
> main change (that is broadly applicable, unlike POPCNT) in recent years
> would be AVX. Do we have much floating point code in critical paths? I
> was wondering about the JS engine's usage of double for value storage -
> but it's what comes out of the JIT that matters, right?

AVX is much more recent than what's available after SSE2, which is our
current baseline.

Chrome is moving to SSE3 as the unconditional baseline, which I
personally find surprising:
https://docs.google.com/document/d/1QUzL4MGNqX4wiLvukUwBf6FdCL35kCDoEJTm2wMkahw/edit#

A quick and very unscientific look at Searchfox suggests that
unconditional SSE3 would mainly eliminate conditional/dynamic dispatch
on YUV conversion code paths when it comes to explicit SSE3 usage. No
idea how LLVM would insert SSE3 usage on its own.

> Media codecs don't count - they should detect at runtime. Same applies
> to crypto code, that - I really hope - would be using runtime detection
> for their SIMD implementations or even hardware AES/SHA routines.
>
> > For macOS and Android, do we actively track the baseline CPU age that
> > Firefox-compatible OS versions run on and adjust the compiler options
> > accordingly when we drop compatibility for older OS versions?
>
> Android only recently added 64-bit builds, and 32-bit would be limited
> to ARMv7-A. There used to be people on non-NEON devices, but those are
> probably gone by now. Google says "For NDK r21 and newer Neon is enabled
> by default for all API levels." - note that should be the NDK used for
> 64-bit builds.
>
> So it's possible Android could now assume NEON even on 32-bit, if it
> isn't already. Most of the code that cares (i.e. media) will already be
> doing runtime detection though.

I meant tracking baseline CPU age on the x86/x86_64 Android side. We
have required NEON on Android ARMv7 for quite a while already.

> For macOS Apple Silicon is a hard break. For macOS on x86, I guess AVX
> is also a breaking point. There was an open question whether any non-AVX
> hardware is still supported on Big Sur because Rosetta doesn't support
> AVX code, but given that we support (much) older macOS releases I don't
> think we can assume AVX presence regardless. We support back to macOS
> 10.12, which runs on "MacBook Late 2009", which was a Core 2 Duo. Guess
> we could assume SSSE3 but nothing more.

That's older than I expected, but it still seems worthwhile to make
our compiler settings for Mac reflect that if they don't already.
Also, doesn't the whole Core 2 Duo family have SSE 4.1?


--
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to unship: FTP protocol implementation

2021-02-10 Thread Henri Sivonen
On Wed, Feb 10, 2021 at 10:37 AM Valentin Gosu  wrote:
> FTP support is currently disabled on Nightly.
> Our current plan is for the pref flip to ride the trains with Firefox 88 to
> beta and release [1], meaning we would be disabling FTP a week after Chrome
> [2]

Are we also stopping advertising the capability to act as an ftp: URL
handler to operating systems? Currently, if I try to follow an ftp:
URL in Gnome Terminal, it tries to launch Firefox. Is that something
we advertise to Gnome or something that Gnome just knows and needs to
be patched to stop knowing?

-- 
Henri Sivonen
hsivo...@mozilla.com


Gecko performance with newer x86_64 levels

2021-02-03 Thread Henri Sivonen
I came across 
https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/
. Previously, when microbenchmarking Rust code that used count_ones()
in an inner loop (can't recall what code this was), I noticed 4x
runtime speed when compiling for target_cpu=nehalem and running on a
much later CPU.

I'm wondering:

Have we done benchmark comparisons with libxul compiled for the
newly-defined x86_64 levels?

How feasible would it be, considering CI cost, to compile for multiple
x86_64 levels and make the Windows installer / updater pick the right
one and to use the new glibc-hwcaps mechanism on Linux?

For macOS and Android, do we actively track the baseline CPU age that
Firefox-compatible OS versions run on and adjust the compiler options
accordingly when we drop compatibility for older OS versions?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Status of Ubuntu 20.04 as a development platform

2020-11-17 Thread Henri Sivonen
On Tue, Nov 10, 2020 at 4:39 PM James Graham  wrote:
>
> On 10/11/2020 14:17, Kyle Huey wrote:
> > On Tue, Nov 10, 2020 at 3:48 AM Henri Sivonen  wrote:
> >>
> >> Does Ubuntu 20.04 work properly as a platform for Firefox development?
> >> That is, does rr work with the provided kernel and do our tools work
> >> with the provided Python versions?
> >
> > rr works. I use 20.04 personally.
>
> I've also been using 20.04 and all the Python bits have worked fine.

Thanks. I upgraded, and both rr and Python-based tools work.


--
Henri Sivonen
hsivo...@mozilla.com


Re: Enabled CRLite in Nightly

2020-11-16 Thread Henri Sivonen
On Fri, Nov 13, 2020 at 6:19 AM J.C. Jones  wrote:
> Not yet, no. Neither this nor Intermediate Preloading (which CRLite depends
> on) are enabled in Fenix yet, as we have outstanding bugs about "only
> download this stuff when on WiFi + Power" and "that, but configurable."

If the delta updates are averaging 66 KB, do we really need to avoid
the updates over cellular data even when that's assumed to be metered?

-- 
Henri Sivonen
hsivo...@mozilla.com


Status of Ubuntu 20.04 as a development platform

2020-11-10 Thread Henri Sivonen
Does Ubuntu 20.04 work properly as a platform for Firefox development?
That is, does rr work with the provided kernel and do our tools work
with the provided Python versions?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Please don't use functions from ctype.h and strings.h

2020-06-25 Thread Henri Sivonen
On Wed, Jun 24, 2020 at 10:35 PM Chris Peterson  wrote:
>
> On 8/27/2018 7:00 AM, Henri Sivonen wrote:
> > I think it's worthwhile to have a lint, but regexps are likely to have
> > false positives, so using clang-tidy is probably better.
> >
> > A bug is on file:https://bugzilla.mozilla.org/show_bug.cgi?id=1485588
> >
> > On Mon, Aug 27, 2018 at 4:06 PM, Tom Ritter  wrote:
> >> Is this something worth making a lint over?  It's pretty easy to make
> >> regex-based lints, e.g.
> >>
> >> yml-only based lint:
> >> https://searchfox.org/mozilla-central/source/tools/lint/cpp-virtual-final.yml
> >>
> >> yml+python for slightly more complicated regexing:
> >> https://searchfox.org/mozilla-central/source/tools/lint/mingw-capitalization.yml
> >> https://searchfox.org/mozilla-central/source/tools/lint/cpp/mingw-capitalization.py
>
>
> Bug 1642825 recently added a "rejected words" lint. It was intended to
> warn about words like "blacklist" and "whitelist", but dangerous C
> function names could easily be added to the list:
>
> https://searchfox.org/mozilla-central/source/tools/lint/rejected-words.yml
>
> A "good enough" solution that can find real bugs now is preferable to a
> cleaner clang-tidy solution someday, maybe. (The clang-tidy lint bug
> 1485588 was filed two years ago.)

Thanks. Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1648390

-- 
Henri Sivonen
hsivo...@mozilla.com


Please don't use locale-dependent C standard library functions (was: Re: Please don't use functions from ctype.h and strings.h)

2020-06-12 Thread Henri Sivonen
This is an occasional re-reminder that anything in the C standard
library that is locale-sensitive is fundamentally broken and should
not be used.

Today's example is strerror(), which returns a string that is meant to
be rendered to the user, but the string isn't guaranteed to be UTF-8.

On Mon, Aug 27, 2018 at 3:04 PM Henri Sivonen  wrote:
>
> Please don't use the functions from ctype.h and strings.h.
>
> See:
> https://daniel.haxx.se/blog/2018/01/30/isalnum-is-not-my-friend/
> https://daniel.haxx.se/blog/2008/10/15/strcasecmp-in-turkish/
> https://stackoverflow.com/questions/2898228/can-isdigit-legitimately-be-locale-dependent-in-c
>
> In addition to these being locale-sensitive, the functions from
> ctype.h are defined to take (signed) int with the value space of
> *unsigned* char or EOF and other argument values are Undefined
> Behavior. Therefore, on platforms where char is signed, passing a char
> sign-extends to int and invokes UB if the most-significant bit of the
> char was set! Bug filed 15 years ago!
> https://bugzilla.mozilla.org/show_bug.cgi?id=216952 (I'm not aware of
> implementations doing anything surprising with this UB but there
> exists precedent for *compiler* writers looking at the standard
> *library* UB language and taking calls into standard library functions
> as optimization-guiding assertions about the values of their
> arguments, so better not risk it.)
>
> For isfoo(), please use mozilla::IsAsciiFoo() from mozilla/TextUtils.h.
>
> For tolower() and toupper(), please use ToLowerCaseASCII() and
> ToUpperCaseASCII() from nsUnicharUtils.h
>
> For strcasecmp() and strncasecmp(), please use their nsCRT::-prefixed
> versions from nsCRT.h.
>
> (Ideally, we should scrub these from vendored C code, too, since being
> in third-party code doesn't really make the above problems go away.)
>
> --
> Henri Sivonen
> hsivo...@mozilla.com
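
The ASCII-only semantics of the helpers listed above can be sketched in
Python (the actual Gecko helpers are C++; this function is only an
illustration of why they are locale-independent):

```python
def ascii_lower(s: str) -> str:
    # Lowercase only A-Z; leave everything else, including non-ASCII
    # letters such as the Turkish dotted capital I, untouched, so the
    # result never depends on the user's locale.
    return "".join(
        chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)
```

So ascii_lower("Content-TYPE") yields "content-type", and "İstanbul"
passes through unchanged, whereas a locale-sensitive tolower() of "I"
under a Turkish locale yields dotless ı.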



-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Proposal: remove support for running desktop Firefox in single-process mode (e10s disabled) anywhere but in tests

2020-06-12 Thread Henri Sivonen
On Wed, Jun 10, 2020 at 11:13 PM James Teh  wrote:
> In general, this obviously makes a lot of sense. However, because there is
> so much extra complication for accessibility when e10s is enabled, I find
> myself disabling e10s in local opt/debug builds to isolate problems to the
> core a11y engine (vs the a11y e10s stuff).

This is also relevant to other debugging scenarios, especially when
not being able to use Pernosco to search for the right process.

What does this proposal mean for ./mach run --disable-e10s ?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to unship: FTP protocol implementation

2020-03-19 Thread Henri Sivonen
On Thu, Mar 19, 2020 at 2:24 AM Michal Novotny  wrote:
> We plan to remove FTP protocol implementation from our code.

Chrome's status dashboard says "deprecated" and
https://textslashplain.com/2019/11/04/bye-ftp-support-is-going-away/
said the plan was to turn FTP off by default in version 80. Yet, I
just successfully loaded ftp://ftp.funet.fi in Chrome 80 on Mac and in
Edge 82 (Canary) on Windows 10, and I'm certain I haven't touched the
flag in either. (The location bar kept showing the ftp:// URL, so it
doesn't appear to be a case of automatically trying HTTP.)

Do we know why Chrome didn't proceed as planned? Do we know what their
current plan is?

Do we know if Edge intends to track Chrome on this feature or to make
an effort to patch a different outcome?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to ship: Autodiscovery of WebExtension search engines

2020-02-26 Thread Henri Sivonen
On Tue, Feb 25, 2020 at 10:04 PM Dale Harvey  wrote:
> Yes,  extensions that only define a new search engine will be permitted,
> the extension will not be able to do anything else.

What capabilities do search engine-only WebExtensions have that
OpenSearch doesn't provide?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to prototype: Character encoding detector

2019-12-16 Thread Henri Sivonen
On Mon, Dec 2, 2019 at 2:42 PM Henri Sivonen  wrote:
> 1. On _unlabeled_ text/html and text/plain pages, autodetect _legacy_
> encoding, excluding UTF-8, for non-file: URLs and autodetect the
> encoding, including UTF-8, for file: URLs.
>
> Elevator pitch: Chrome already did this unilaterally. The motivation
> is to avoid a situation where a user switches to a Chromium-based as a
> result of browsing the legacy Web or local files.

Feature #1 is now on autoland.

> # Preference

For file: URLs, I ended up not putting the new detector behind a pref,
because the file: detection code is messy enough even without
alternative code paths, and I'm pretty confident that the new detector
is an improvement for our file: URL handling behavior.

For non-file: URLs, the new detector is overall controlled by
intl.charset.detector.ng.enabled, which defaults to true, i.e.
detector enabled. When the detector is enabled, various old
intl.charset.* are ignored in various ways.

The detector is, however, disabled by default for three TLDs: .jp,
.in, and .lk. This can be overridden via the prefs
intl.charset.detector.ng.jp.enabled,
intl.charset.detector.ng.in.enabled, and
intl.charset.detector.ng.lk.enabled all three of which default to
false. (These prefs cannot enable the detector if
intl.charset.detector.ng.enabled is false)
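
The pref gating described above amounts to the following (an
illustrative sketch, not actual Gecko code; the prefs dict stands in
for the real pref service):

```python
def detector_enabled(prefs: dict, tld: str) -> bool:
    # The overall pref is a hard off-switch; the per-TLD prefs for
    # .jp/.in/.lk (defaulting to false) can only re-enable detection
    # when the overall pref is true.
    if not prefs.get("intl.charset.detector.ng.enabled", True):
        return False
    if tld in ("jp", "in", "lk"):
        return prefs.get(f"intl.charset.detector.ng.{tld}.enabled", False)
    return True
```

With default prefs, detection is on for .com but off for .jp, and
setting the .jp pref to true has no effect if the overall pref is false.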

In the case of .jp, the pre-existing Japanese-specific detector is
used. This avoids regressing how soon we start reloading if we detect
EUC-JP.

The detector detects encodings that are actually part of the Web
Platform. However, this can cause problems when a site expects the
page to be decoded as windows-1252 _as a matter of undeclared
fallback_ and expects the user to have an _intentionally mis-encoded_
font that assigns non-Latin glyphs to the windows-1252 code points.
(Note that if the site says <meta charset=windows-1252>, that
continues to be undisturbed:
https://searchfox.org/mozilla-central/rev/62a130ba0ac80f75175e4b65536290b52391f116/parser/html/nsHtml5StreamParser.cpp#1512
)

Chrome has detection for three windows-1252-misusing Devanagari font
encodings and nine Tamil ones. (Nine looks like a lot, but a Python
tool in this space is documented to handle 25 Tamil legacy encodings!)
There is no indication that the Chrome developers found it necessary
to have these detections. Actively-maintained newspaper sites that,
according to old Bugzilla items, previously used these font hacks have
migrated to Unicode. Rather, it looks like Chrome inherited them from
Google search engine code. Still, this leaves the possibility that
there are sites that presently work (if the user has the appropriate
fonts installed) in Chrome thanks to this detection and in Firefox
thanks to Firefox mapping the .in TLD to windows-1252 and mapping .com
to windows-1252 in the English localizations as well as in the
localizations for the Brahmic-script languages of India.

Not enabling the new detector on .in, at least for now, avoids
disrupting sites that intentionally misuse windows-1252 without
declaring it, if such sites are still used by users (at the expense of
out-of-locale usage of .in as a generic TLD; data disclosed by Google
as part of Chrome's detector suggests e.g. Japanese use of .in). To the
extent the phenomenon of relying on intentionally misencoded fonts
still exists but on .com, the new detector will likely disrupt it
(likely by guessing some Cyrillic encoding). However, I think it
doesn't make sense to let that possibility derail this whole
project/feature.

Although I believe this phenomenon to be mostly a Tamil in Tamil Nadu
thing rather than a general Tamil language thing, I disabled the
detector on .lk just in case to have more time to research the issue.

If reports of legacy Tamil sites breaking show up, please needinfo me
on Bugzilla.

I didn't disable the detector for .am, because Chrome doesn't appear
to have detections for Armenian intentional misuse of windows-1252.

If intl.charset.detector.ng.enabled is false, Japanese detection
behaves like previously, except that encoding inheritance from a
same-origin parent frame now takes precedence over the detector. (This
was a spec compliance bug that had previously gone unnoticed because
we hadn't run the full test suite with a detector enabled. It turns
out that tests both semi-intentionally and accidentally depend on
same-origin inheritance taking precedence as the spec says.)

In the interest of binary size, I removed the old Cyrillic detector at
the same time as landing the new one. If the new detector is disabled
but the old Cyrillic detector is enabled, the new detector runs in the
situations where the old Cyrillic detector would have run in a mode
that approximates the old Cyrillic detector. (This approximation can,
however, result in some non-Cyrillic outcomes that were impossible
with the old Cyrillic detector.)

> # web-platform-tests

I added tests as tentative WPTs.

-- 
Henri Sivonen
hsivo...@mozilla.com

Re: Intent to prototype: Character encoding detector

2019-12-09 Thread Henri Sivonen
On Thu, Dec 5, 2019 at 8:08 PM Boris Zbarsky  wrote:
>
> On 12/2/19 7:42 AM, Henri Sivonen wrote:
> > Since there isn't a spec and Safari doesn't implement the feature,
> > there are no cross-vendor tests.
>
> Could .tentative tests be created here, on the off chance that we do
> create a spec for this at some point?

Good point. I'll use tentative WPTs for end-to-end automated tests. Thanks.

-- 
Henri Sivonen
hsivo...@mozilla.com


Intent to prototype: Character encoding detector

2019-12-03 Thread Henri Sivonen
the browser
already containing an implementation of the Encoding Standard. This
cuts the binary size impact to less than one fourth compared to
adopting the detector from Chrome, which doesn't benefit from any data
tables that a browser already has to have anyway.)

I've gone with demonstrating feasibility before further cross-vendor
discussion, because this is a user retention measure in response to a
unilateral move on Chrome's part and Safari on iOS doesn't face
pressure from switching to browsers with a different Web engine.

# Platform coverage

All platforms.

# Preference

There will probably be one for an initial testing period, but I
haven't picked a name yet.

# DevTools bug

There is no new DevTool surface for this. The HTML parser already
complains in a DevTool-visible way about unlabeled pages, and this
change will not remove those messages.

# Other browsers

Chromium-based browsers: Already shipping feature #1 (not shipping feature #2)

IE: Off-by-default (not precisely feature #1 or #2 but a kind of
combination of the two).

Safari: Not shipping either feature but, like Firefox and unlike
Chrome, provides a menu for addressing the use cases that feature #2
is meant to address.

# web-platform-tests

Since there isn't a spec and Safari doesn't implement the feature,
there are no cross-vendor tests.

# Secure contexts

Since this pair of features is about compatibility with legacy
content, both features apply to insecure contexts.

# Sandboxed iframes

The feature applies to sandboxed iframes.

For feature #1, the feature applies only to different-origin frames
and the situation is the same as for the pre-existing Japanese
detection: The framer cannot turn off the feature for the framee. Both
the framer or the framee can turn off the feature for itself by
adhering to the HTML authoring conformance requirements, i.e. by
declaring its own encoding.

For feature #2, the situation is the same as for the pre-existing
menu: The top-level page can turn off the feature for the whole
hierarchy by using UTF-8, not having any UTF-8 errors, and declaring
UTF-8, or, alternatively, by using the UTF-8 BOM (even if there are
subsequent errors). The framee can turn off the feature for itself by
using the UTF-8 BOM.

--
Henri Sivonen
hsivo...@mozilla.com
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: C++ standards proposal for an embedding library

2019-10-24 Thread Henri Sivonen
On Thu, Oct 24, 2019 at 12:30 PM Gijs Kruitbosch
 wrote:
>  From experience, people seriously underestimate how hard this is -
> things like "I want a URL bar" or "I want tabs / multiple navigation
> contexts and want them to interact correctly" or "users should be able
> to download files and/or open them in helper apps cross-platform" are
> considerably less trivial than most people seem to assume, and even as
> Mozilla we have (perhaps embarrassingly) repeatedly made the same /
> similar mistakes in these areas when creating new "embedders" from
> scratch (Firefox for iOS, FirefoxOS, the various Android browsers), or
> have had to go over all our existing separate gecko consumers to adjust
> them to new web specs (most recent example I can think of is same site
> cookies, for instance, which requires passing along origin-of-link
> information for context menu or similar affordances), which is
> non-trivial and of course cannot happen without embedding API
> adjustments.

The Mozilla cases are harder, because the applications we build around
Web engines are Web browsers. My understanding (which may be wrong!)
is that the purpose of the C++ proposal isn't to enable creating Web
browsers around the API but to use the API to render the GUI for a
local C++ app whose primary purpose isn't to browse the Web, so I
assume "I want a URL bar" is the opposite of what the proposal is
after.

But even so, the proposal is inadequate in addressing questions like
multiple windows and various issues related to loading content from
the network as opposed to content from the app's own URL scheme that
maps to a stream produced by the C++ app internals. And it's inadequate
for even the app-internal URL scheme: It looks like the app-internal
URL scheme is expected to map a URL to a stream of bytes as opposed to
a Content-Type and a stream of bytes.

--
Henri Sivonen
hsivo...@mozilla.com


Re: C++ standards proposal for an embedding library

2019-10-23 Thread Henri Sivonen
On Tue, Oct 22, 2019 at 11:55 PM Botond Ballo  wrote:
> Given that, would anyone be interested in reviewing the proposed API
> and providing feedback on its design? I feel like the committee would
> be receptive to constructive technical feedback, and as a group with
> experience in developing embedding APIs, we are in a particularly good
> position to provide such feedback.
>
> The latest draft of the proposal can be found here:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1108r4.html

These comments are still more on the theme of this API being a bad
idea for the standard library on the high level as opposed to being
design comments on the specifics.

Section 4.2 refers to Gecko's XPCOM embedding API, which hasn't been
supported in quite a while. It also refers to the Trident embedding
API. Putting aside for the moment that it's bad from a Mozilla
perspective to treat a Web engine as a generic platform capability as
opposed to it being a thing that one chooses from a number of
competitive options, it seems bad to build on the assumption that on
Windows the platform capability is Trident, which isn't getting new
Web features anymore.

Further on the Trident point, it doesn't really work to both say that
this is just exposing a platform capability and to go on to say that
implementations are encouraged to support [list of Web specs]. This
sort of thinking failed for the various Web on TV initiatives that got
whatever Presto and WebKit had. If the mechanism here is the Trident
embedding API on Windows, then you get whatever Trident has.

I'm curious how developers of libstdc++ and libc++ view the notion of
WebKitGTK+ as a platform capability. The C++ standard libraries are
arguably for Linux while the "platform capability" named (WebKitGTK+)
is arguably a Gnome thing rather than a Linux-level thing. That WPE
WebKit exists separately from WebKitGTK+ seems like a data point of
some kind: https://wpewebkit.org/

Section 4.4 argues against providing a localhost Web server
capability. I understand that the goal here is a user experience that
differs from the experience of launching a localhost HTTP server and
telling a Web browser to navigate to a localhost URL. However, the
additional arguments about the HTTP/2 and HTTP/3 landscape evolving
quickly and requiring TLS seem bad.

For localhost communication, HTTP/1.1 should address the kind of use
cases presented (other than the launch UX). Without actual network
latencies, HTTP/2 and HTTP/3 optimizations over HTTP/1.1 aren't _that_
relevant. Also, the point about TLS doesn't apply to HTTP/1.1 to
localhost. The notion that creating a multi-engine Web engine
embedding API that allows the embedder to feed the Web engine
pseudonetwork data is simpler than creating a localhost HTTP/1.1
server seems like a huge misestimation of the relative complexities.

Moreover, for either this API or a local HTTP server to work well, a
better way of dynamically generating HTML than is presented in the
example is required. It doesn't make sense to me to argue that
everything belongs in the standard library to the point of putting a
Web engine API there if a mechanism for generating the HTML to talk
with the Web engine doesn't belong in the standard library. If the
mechanism for generating the HTML is something you pull from GitHub
instead of something you get in the standard library, why can't
https://github.com/hfinkel/web_view be pulled from GitHub, too?

Finally, this seems very hand-wavy in terms of the security aspects of
loading remote content in a Web engine launched like this. My
understanding is that this has been an area of security problems for
Electron apps. It seems irresponsible not to cover this in detail, but
it also seems potentially impossible to do so, given the need to work
with whatever Web engine APIs that already exist as "platform
capabilities".

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to prototype: Web Speech API

2019-10-15 Thread Henri Sivonen
On Tue, Oct 15, 2019 at 2:56 AM Andre Natal  wrote:
> Regarding the UI, yes, the experience will be exactly the same in our case: 
> the user will get a prompt asking for permission to open the microphone (I've 
> attached a screenshot below [3])
...
> [3] 
> https://www.dropbox.com/s/fkyymiyryjjbix5/Screenshot%202019-10-14%2016.13.49.png?dl=0

Since the UI is the same as for getUserMedia(), is the permission bit
that gets stored the same as for getUserMedia()? I.e. if a site
obtains the permission for one, can it also use the other without
another prompt?

If a user understands how WebRTC works and what this piece of UI meant
for WebRTC, this UI now represents a different trust decision on the
level of principle. How intentional or incidental is it that this
looks like a getUserMedia() use (audio goes to where the site named in
the dialog decides to route it) instead of surfacing to the user that
this is different (audio goes to where the browser vendor decides to
route it)?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to ship: Web Speech API

2019-10-14 Thread Henri Sivonen
On Sat, Oct 12, 2019 at 12:29 PM Andre Natal  wrote:
> We tried to capture everything here [1], so please if you don't see your
> question addressed in this document, just give us a shout either here in
> the thread or directly.
...
> [1]
> https://docs.google.com/document/d/1BE90kgbwE37fWoQ8vqnsQ3YMiJCKJSvqQwa463yCN1Y/edit?ts=5da0f63f#

Thanks. It doesn't address the question of what the UI in Firefox is
like. Following the links for experimenting with the UI on one's own
leads to https://mdn.github.io/web-speech-api/speech-color-changer/ ,
which doesn't work in Nightly even with prefs flipped.

(Trying that example in Chrome shows that Chrome presents the
permission prompt as a matter of sharing the microphone with
mdn.github.io as if this was WebRTC, which suggests that mdn.github.io
decides where the audio goes. Chrome does not surface that, if I
understand correctly how this API works in Chrome, the audio is
instead sent to a destination of Chrome's choosing and not to a
destination of mdn.github.io's choosing. The example didn't work for
me in Safari.)

--
Henri Sivonen
hsivo...@mozilla.com


Re: Passing UniquePtr by value is more expensive than by rref

2019-10-14 Thread Henri Sivonen
On Mon, Oct 14, 2019 at 9:05 AM Gerald Squelart  wrote:
>
> I'm in the middle of watching Chandler Carruth's CppCon talk "There Are No 
> Zero-Cost Abstractions" and there's this interesting insight:
> https://youtu.be/rHIkrotSwcc?t=1041
>
> The spoiler is already in the title (sorry!), which is that passing 
> std::unique_ptr by value is more expensive than passing it by rvalue 
> reference, even with no exceptions!
>
> I wrote the same example using our own mozilla::UniquePtr, and got the same 
> result: https://godbolt.org/z/-FVMcV (by-value on the left, by-rref on the 
> right.)
> So I certainly need to recalibrate my gutfeelometer.

The discussion in the talk about what is needed to fix this strongly
suggested (without uttering "Rust") that Rust might be getting this
right. With panic=abort, Rust gets this right (
https://rust.godbolt.org/z/SZQaAS ) which really makes one appreciate
both Rust-style move semantics and the explicitly not-committal ABI.

(I had to put a side-effectful println! in bar to make sure a call to
bar is generated, since #[inline(never)] isn't enough to prevent the
compiler from eliding calls to functions it can see do nothing.)

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to ship: Web Speech API

2019-10-07 Thread Henri Sivonen
On Mon, Oct 7, 2019 at 5:00 AM Marcos Caceres  wrote:
>  - The updated implementation more closely aligns with Chrome's 
> implementation - meaning we get better interop across significant sites.

What site can one try to get an idea of what the user interface is like?

>  - speech is processed in our cloud servers, not on device.

What should one read to understand the issues that lead to this change?

-- 
Henri Sivonen
hsivo...@mozilla.com


Non-XPCOM in-RAM Unicode representation conversions have moved

2019-09-20 Thread Henri Sivonen
Unicode representation conversions (and range queries like "is this in
the ASCII range?") for data we already have inside the engine (i.e.
we're not doing IO with an external source/destination) for the cases
where the conversion doesn't need to be able to resize a target XPCOM
string have moved. They are no longer in nsReadableUtils.h. They are
now in mozilla/TextUtils.h, mozilla/Utf8.h, and mozilla/Latin1.h. As a
result, the functions have moved to the mozilla:: namespace and now
follow MfbtCase (IsAscii and IsUtf8 instead of the old IsASCII and
IsUTF8).

Support for external encodings and streaming continues to be in
mozilla/Encoding.h. Conversions where the target is an XPCOM string
remain in nsReadableUtils.h.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to unship: TLS 1.0 and TLS 1.1

2019-09-13 Thread Henri Sivonen
On Fri, Sep 13, 2019 at 3:09 AM Martin Thomson  wrote:
>
> On Thu, Sep 12, 2019 at 5:50 PM Henri Sivonen  wrote:
>>
>> Do we know what the situation looks like for connections to RFC 1918 
>> addresses?
>
> That's a hard one to even speculate about, and that's all we really have 
> there.  Our telemetry doesn't really allow us to gain insight into that.

I see.

> The big question being enterprise uses, where there is some chance of having 
> names on servers in private address space.  Most use of 1918 outside of 
> enterprise is likely still unsecured entirely.

I was thinking of home printer, NAS and router config UIs that are
unsecured in the sense of using self-signed certificates but that
still use TLS, so that TLS matters for practical compatibility. I
don't know of real examples of devices that both use TLS exclusively
and don't support TLS 1.2. (My printer redirects http to https with
self-signed cert but supports TLS 1.2.)

--
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to unship: TLS 1.0 and TLS 1.1

2019-09-12 Thread Henri Sivonen
On Thu, Sep 12, 2019 at 7:03 AM Martin Thomson  wrote:
> Telemetry shows that TLS 1.0 usage is much higher
> than we would ordinarily tolerate for this sort of deprecation

Do we know what the situation looks like for connections to RFC 1918 addresses?

> Finally, we will disable TLS 1.0 and 1.1 for all people using the Release
> channel of Firefox in March 2020.  Exact plans for how and when this will
> happen are not yet settled.

What expectations are there for being able to remove the code from NSS?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Proposed W3C Charter: Timed Text (TT) Working Group

2019-08-29 Thread Henri Sivonen
On Thu, Aug 29, 2019 at 1:41 AM L. David Baron  wrote:
>
> The W3C is proposing a revised charter for:
>
>   Timed Text (TT) Working Group
>   https://www.w3.org/2019/08/ttwg-proposed-charter.html
>   https://lists.w3.org/Archives/Public/public-new-work/2019Aug/0004.html
>
> The comparison to the group's previous charter is:
>   
> https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwww.w3.org%2F2018%2F05%2Ftimed-text-charter.html=https%3A%2F%2Fwww.w3.org%2F2019%2F08%2Fttwg-proposed-charter.html

What should one read to understand what perceived need there is for
further development on TTML and WebVTT? (That is, what's currently
missing such that these aren't considered "done"?)

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: JS testing functions and compartments in mochitest-plain

2019-08-26 Thread Henri Sivonen
On Mon, Aug 26, 2019 at 1:37 PM Jan de Mooij  wrote:
>
> On Mon, Aug 26, 2019 at 12:25 PM Henri Sivonen  wrote:
>>
>> Thanks. Since SpecialPowers doesn't exist in xpcshell tests, is there
>> another way to reach JS testing functions from there?
>
>
> I think just Cu.getJSTestingFunctions() should work.

Thanks. This worked.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: JS testing functions and compartments in mochitest-plain

2019-08-26 Thread Henri Sivonen
On Mon, Aug 26, 2019 at 11:27 AM Jan de Mooij  wrote:
>
> On Mon, Aug 26, 2019 at 9:02 AM Henri Sivonen  wrote:
>>
>> In what type of test does
>> SpecialPowers.Cu.getJSTestingFunctions().newRope() actually return a
>> rope within the calling compartment such that passing the rope to a
>> WebIDL API really makes the rope enter the WebIDL bindings instead of
>> getting intercepted by a cross-compartment wrapper first?
>
>
> An xpcshell test or mochitest-chrome is probably easiest

Thanks. Since SpecialPowers doesn't exist in xpcshell tests, is there
another way to reach JS testing functions from there?

-- 
Henri Sivonen
hsivo...@mozilla.com


JS testing functions and compartments in mochitest-plain

2019-08-26 Thread Henri Sivonen
If in a plain mochitest I do
  var rope =
SpecialPowers.Cu.getJSTestingFunctions().newRope(t.head, t.tail);
  var encoded = (new TextEncoder()).encode(rope);
the encode() method doesn't see the rope. Instead, the call to
encode() sees a linear string that was materialized by a copy in a
cross-compartment wrapper.

Does SpecialPowers always introduce a compartment boundary in a plain
mochitest? In what type of test does
SpecialPowers.Cu.getJSTestingFunctions().newRope() actually return a
rope within the calling compartment such that passing the rope to a
WebIDL API really makes the rope enter the WebIDL bindings instead of
getting intercepted by a cross-compartment wrapper first?

Alternatively: What kind of string lengths should I use with normal JS
string concatenation to be sure that I get a rope instead of the right
operand getting copied into an extensible left operand?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Structured bindings and minimum GCC & clang versions

2019-08-16 Thread Henri Sivonen
On Fri, Aug 16, 2019 at 9:51 AM Eric Rahm  wrote:
>
> We are actively working on this. Unfortunately, as expected, its never as
> simple as we'd like. Updating the minimum gcc version (
> https://bugzilla.mozilla.org/show_bug.cgi?id=1536848) is blocked on getting
> our hazard builds updated, updating to c++17 has some of it's own quirks.

Thanks. The dependencies indeed look tricky. :-(

I take it that doing what Chromium does and shipping a statically
linked symbol-swapped copy of libc++ instead of depending on the
system C++ standard library would have its own set of issues. :-(

-- 
Henri Sivonen
hsivo...@mozilla.com


Structured bindings and minimum GCC & clang versions

2019-08-16 Thread Henri Sivonen
This week, I wrote some code that made me wish we already had support
for structured bindings and return by initializer list (both from
C++17) for mozilla::Tuple.

That is, if we have
mozilla::Tuple Foo()
it would be nice to be able to call it via
auto [a, b] = Foo();
and within Foo to write returns as
return { a, b };

It appears that our minimum GCC and minimum clang documented at
https://developer.mozilla.org/en-US/docs/Mozilla/Using_CXX_in_Mozilla_code
are pretty old.

What's the current outlook for increasing the minimum GCC and clang
versions such that we could start using structured bindings and return
by initializer list for tuples (either by making sure mozilla::Tuple
support these or by migrating from mozilla::Tuple to std::tuple) and
thereby get ergonomic multiple return values in C++?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Removing --enable-shared-js [Was: Rust and --enable-shared-js]

2019-08-14 Thread Henri Sivonen
On Tue, Aug 13, 2019 at 2:18 PM Lars Hansen  wrote:
> Cranelift should be genuinely optional until further notice; to my 
> knowledge, no near-term product work in Firefox or SpiderMonkey depends on 
> Cranelift.  Cranelift is present in Nightly but (so far as I can tell) not in 
> Release. It can be disabled in the JS shell by configuring with 
> --disable-cranelift, and I just tested that this works.  To the extent there 
> is other Rust code in SpiderMonkey it should not, so far as I know, depend on 
> the presence of Cranelift.  It also seems to me that we should be able to use 
> Rust in SpiderMonkey independently of whether Cranelift is there, so if that 
> does not work it ought to be fixed.

Thanks. That makes sense to me.

The present state (now that
https://bugzilla.mozilla.org/show_bug.cgi?id=1572364 has landed) is
that when built as part of libxul, SpiderMonkey can use Rust code
(jsrust_shared gets built) regardless of whether Cranelift is enabled.
However, when SpiderMonkey is built outside libxul, SpiderMonkey can
use Rust code (jsrust_shared gets built) only if Cranelift is enabled.
I've filed https://bugzilla.mozilla.org/show_bug.cgi?id=1573098 to
change that.

(The actual addition of non-Cranelift Rust code of interest to
jsrust_shared hasn't landed yet.)

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Non-header-only headers shared between SpiderMonkey and the rest of Gecko

2019-08-12 Thread Henri Sivonen
On Tue, Aug 6, 2019 at 8:54 PM Kris Maglione  wrote:
>
> On Tue, Aug 06, 2019 at 10:56:55AM +0300, Henri Sivonen wrote:
> > Do we have some #ifdef for excluding parts of mfbt/ when mfbt/ is being used
> > in a non-SpiderMonkey/Gecko context?
>
> #ifdef MOZ_HAS_MOZGLUE

Thanks. This appears to be undefined for gtests that call in to libxul
code. (Is that intentional?) It appears that MOZILLA_INTERNAL_API is
needed for covering gtests that call into libxul code.

As far as I can tell, after
https://bugzilla.mozilla.org/show_bug.cgi?id=1572364 , ".cpp that
links with jsrust_shared" is detected by:
#if defined(MOZILLA_INTERNAL_API) || \
(defined(MOZ_HAS_MOZGLUE) && defined(ENABLE_WASM_CRANELIFT))

Does that look right? I verified this experimentally by checking that
the above evaluates to false when compiling
--enable-application=tools/update-packaging or
--enable-application=memory.

(Despite advice to the contrary, I still think it's important for
discoverability to put my code in mfbt/TextUtils.h and having it
disabled in contexts that don't link with jsrust_shared. It would be
bad if e.g. mozilla::IsAscii taking a single char and mozilla::IsAscii
taking mozilla::Span weren't discoverable in the same
place. Or even worse if mozilla::IsAscii taking '\0'-terminated const
char* and mozilla::IsAscii taking mozilla::Span weren't
discoverable in the same place.)


--
Henri Sivonen
hsivo...@mozilla.com


Re: Removing --enable-shared-js [Was: Rust and --enable-shared-js]

2019-08-07 Thread Henri Sivonen
On Tue, May 28, 2019 at 3:16 AM Mike Hommey  wrote:
>
> On Tue, May 21, 2019 at 10:32:20PM -0400, Boris Zbarsky wrote:
> > On 5/21/19 9:55 PM, Mike Hommey wrote:
> > > Considering this has apparently been broken for so long, I guess nobody
> > > will object to me removing the option for Gecko builds?
> >
> > It's probably fine, yeah...
>
> Now removed on autoland via bug 1554056.

Thanks.

It appears that building jsrust_shared is still conditional on
ENABLE_WASM_CRANELIFT. How optional is ENABLE_WASM_CRANELIFT in
practice these days? Is it genuinely optional for Firefox? Is it
genuinely optional for standalone SpiderMonkey? If it is, are we OK
with building without ENABLE_WASM_CRANELIFT having other non-Cranelift
effects on SpiderMonkey performance (e.g. turning off SIMD for some
operations) or on whether a particular string conversion is available
in jsapi.h?

I'm trying to understand the implication of Cranelift being optional
for other Rust code in SpiderMonkey. I'd like to add Rust-backed
SIMD-accelerated Latin1ness checking and UTF-8 validity checking to
SpiderMonkey and Rust-backed conversion from JSString to UTF-8 in
jsapi.h, and my understanding from All Hands was that adding these
things would be OK, since SpiderMonkey already depends on Rust.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Non-header-only headers shared between SpiderMonkey and the rest of Gecko

2019-08-06 Thread Henri Sivonen
On Tue, Aug 6, 2019 at 10:15 AM Henri Sivonen  wrote:
> In general, it seems problematic to organize headers based on whether
> they have associated .cpp or crate code. I'd expect developers to look
> for stuff under mfbt/ instead of some place else, since developers
> using the header shouldn't have to know if the implementation is
> header-only or not. :-(

Notably, in my case the functions would logically belong in Utf8.h
(which already has Utf8.cpp under mfbt/) and in TextUtils.h. What
should my r+ expectations be if I put the entry points there despite
the code requiring linking with Rust crates that SpiderMonkey (and,
therefore, Gecko) depends on? Do we have some #ifdef for excluding
parts of mfbt/ when mfbt/ is being used in a non-SpiderMonkey/Gecko
context?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Non-header-only headers shared between SpiderMonkey and the rest of Gecko

2019-08-06 Thread Henri Sivonen
On Mon, Aug 5, 2019 at 4:14 PM Gabriele Svelto  wrote:
> On 05/08/19 12:04, Henri Sivonen wrote:
> > It has come to my attention that putting non-header-only code
> > under mfbt/ is something we're trying to get away from:
> > https://bugzilla.mozilla.org/show_bug.cgi?id=1554062
> >
> > Do we have an appropriate place for headers that declare entry points
> > for non-header-only functionality (in my case, backed by Rust code)
> > and that depend on types declared in headers that live under mfbt/ and
> > that need to be available both to SpiderMonkey and the rest of Gecko?
>
> IIRC we have some stuff like that under mozglue/misc. The TimeStamp
> class for example is used in both Gecko and SpiderMonky, has
> platform-dependent C++ implementations (also linking to external
> libraries) and uses MFBT headers.

Is mozglue only for Gecko and SpiderMonkey and not anything else?
(I.e. not crash reporter or anything else that doesn't link the Rust
crates that SpiderMonkey links?)

In general, it seems problematic to organize headers based on whether
they have associated .cpp or crate code. I'd expect developers to look
for stuff under mfbt/ instead of some place else, since developers
using the header shouldn't have to know if the implementation is
header-only or not. :-(

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: NotNull and pointer parameters that may or may not be null

2019-08-05 Thread Henri Sivonen
On Mon, Jul 22, 2019 at 10:00 AM Karl Tomlinson  wrote:
> Google style requires pointers for parameters that may be mutated
> by the callee, which provides that the potential mutation is
> visible at the call site.  Pointers to `const` types are
> permitted, but recommended when "input is somehow treated
> differently" [1], such as when a null value may be passed.
>
> Comments at function declarations should mention
> "Whether any of the arguments can be a null pointer." [2]

I understand that there's value in adopting Google rules without
modification in order to avoid having to discuss which parts to adopt
and to avoid having to adapt tooling in accordance to the results of
such discussion. However, what should we think of Google disagreeing
with the C++ Core Guidelines, which can be expected to also have
larger ecosystem value as a tooling target and that in some sense
could be considered to come from higher C++ authority than Google's
rules?

On this particular topic, the Core Guidelines have:
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-inout
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#f60-prefer-t-over-t-when-no-argument-is-a-valid-option
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#f23-use-a-not_nullt-to-indicate-that-null-is-not-a-valid-value

(FWIW, I have relatively recently r+ed code that used non-const
references to indicate non-nullable modifiable arguments on the
grounds that it was endorsed by the Core Guidelines.)

-- 
Henri Sivonen
hsivo...@mozilla.com


Non-header-only headers shared between SpiderMonkey and the rest of Gecko

2019-08-05 Thread Henri Sivonen
I has come to my attention that that putting non-header-only code
under mfbt/ is something we're trying to get away from:
https://bugzilla.mozilla.org/show_bug.cgi?id=1554062

Do we have an appropriate place for headers that declare entry points
for non-header-only functionality (in my case, backed by Rust code)
and that depend on types declared in headers that live under mfbt/ and
that need to be available both to SpiderMonkey and the rest of Gecko?

(So far, shipping headers that depend on types that come from mfbt/
inside the related crates.io crate has been suggested, but it seems
weird to ship Gecko-specific code via crates.io and Gecko developers
probably aren't looking for mfbt type-aware C++ API headers under
third_party/rust/.)

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Coding style  : `int` vs `intX_t` vs `unsigned/uintX_t`

2019-07-05 Thread Henri Sivonen
On Fri, Jul 5, 2019 at 1:28 PM Nathan Froyd  wrote:
>
> On Fri, Jul 5, 2019 at 2:48 AM Jeff Gilbert  wrote:
> > It is, however, super poignant to me that uint32_t-indexing-on-x64 is
> > pessimal, as that's precisely what our ns* containers (nsTArray) use
> > for size, /unlike/ their std::vector counterparts, which will be using
> > the more-optimal size_t.
>
> nsTArray uses size_t for indexing since bug 1004098.

We should probably endorse the use of size_t more explicitly in our
guidelines. Apart from the issue of object size motivating especially
strings using uint32_t for _fields_, it seems to me that a significant
part of our uint32_t (originally PRUint32, of course) habit comes from
the days when Tru64 Unix was the main 64-bit Gecko platform and,
therefore, we lacked proper 64-bit testing coverage.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Coding style  : `int` vs `intX_t` vs `unsigned/uintX_t`

2019-07-04 Thread Henri Sivonen
On Thu, Jul 4, 2019 at 9:55 AM Boris Zbarsky  wrote:
> > never use any unsigned type unless you work with bitfields or need 2^N 
> > overflow (in particular, don't use unsigned for always-positive numbers, 
> > use signed and assertions instead).
>
> Do you happen to know why?  Is this due to worries about underflow or
> odd behavior on subtraction or something?

I don't _know_, but most likely they want to benefit from optimizations
based on overflow being UB.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to unship:

2019-06-14 Thread Henri Sivonen
On Fri, Jun 14, 2019 at 1:24 PM Jonathan Kingston  wrote:
> Most of the use cases are resolved by web crypto or u2f.

Thanks for the removal. Do we have enterprise Web developer-facing
documentation on 1) how TLS client cert enrollment should work now or
2) if there is no in-browser client cert enrollment path anymore, what
concretely should be used instead? (To be clear: I'm not a fan of
client certs, and I'm not requesting that there be an enrollment
path.)

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Running C++ early in shutdown without an observer

2019-06-10 Thread Henri Sivonen
On Fri, Jun 7, 2019 at 9:45 PM Chris Peterson  wrote:
>
> On 6/7/2019 9:36 AM, Kris Maglione wrote:
> > On Fri, Jun 07, 2019 at 09:18:38AM +0300, Henri Sivonen wrote:
> >> For late shutdown cleanup, we have nsLayoutStatics::Shutdown(). Do we
> >> have a similar method for running things as soon as we've decided that
> >> the application is going to shut down?
> >>
> >> (I know there are observer topics, but I'm trying to avoid having to
> >> create an observer object and to make sure that _it_ gets cleaned up
> >> properly.)
> >
> > Observers are automatically cleaned up at XPCOM shutdown, so you
> > generally don't need to worry too much about them. That said,
> > nsIAsyncShutdown is really the way to go when possible. But it currently
> > requires an unfortunate amount of boilerplate.

Thanks. (nsIAsyncShutdown indeed looks like it involves a lot of boilerplate.)

> Note that on Android, you may never get an opportunity for a clean
> shutdown because the OS can kill your app at any time.

My use case relates to Windows only.

-- 
Henri Sivonen
hsivo...@mozilla.com


Running C++ early in shutdown without an observer

2019-06-07 Thread Henri Sivonen
For late shutdown cleanup, we have nsLayoutStatics::Shutdown(). Do we
have a similar method for running things as soon as we've decided that
the application is going to shut down?

(I know there are observer topics, but I'm trying to avoid having to
create an observer object and to make sure that _it_ gets cleaned up
properly.)

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Remove browser and OS architecture from Firefox's User-Agent string?

2019-05-20 Thread Henri Sivonen
On Tue, May 21, 2019 at 12:40 AM Randell Jesup  wrote:
>
> >On Fri, May 10, 2019 at 11:40 PM Chris Peterson  
> >wrote:
> >> I propose that Win64 and WOW64 use the unadorned Windows UA already used
> >> by Firefox on x86 and AArch64 Windows:
> >>
> >> < "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101
> >> Firefox/66.0"
> >> > "Mozilla/5.0 (Windows NT 10.0; rv:66.0) Gecko/20100101 Firefox/66.0"
> >
> >Would there be significant downsides to hard-coding the Windows
> >version to "10.0" in order to put Windows 7 and 8.x users in the same
> >anonymity set with Windows 10 users?
> >
> >(We could still publish statistics of Windows version adoption at
> >https://data.firefox.com/dashboard/hardware )
>
> I wonder if any sites distributing windows executables might key off the
> OS version to default to the correct exe for your version of windows?

Are there known examples of apps like that? AFAICT, even Skype.com
provides the Windows 7-compatible .exe as the download even with a
Windows 10 UA string, and you need to know to go to the Store if you
want the Windows 10-only version.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Remove browser and OS architecture from Firefox's User-Agent string?

2019-05-11 Thread Henri Sivonen
On Fri, May 10, 2019 at 11:40 PM Chris Peterson  wrote:
> I propose that Win64 and WOW64 use the unadorned Windows UA already used
> by Firefox on x86 and AArch64 Windows:
>
> < "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101
> Firefox/66.0"
> > "Mozilla/5.0 (Windows NT 10.0; rv:66.0) Gecko/20100101 Firefox/66.0"

Would there be significant downsides to hard-coding the Windows
version to "10.0" in order to put Windows 7 and 8.x users in the same
anonymity set with Windows 10 users?

(We could still publish statistics of Windows version adoption at
https://data.firefox.com/dashboard/hardware )

> And that Linux omit the OS architecture entirely (like Firefox on
> Android or always spoof "i686" if an architecture token is needed for UA
> parsing webcompat):

Do we have any anecdata of the Web compat impact of not having
anything between "Linux" and the next semicolon? Is there any evidence
that " i686" would be a better single filler for everyone than "
x86_64" if something is needed there for Web compat?

Do we have indications if "Linux" is needed for Web compat? According
to 
https://docs.google.com/spreadsheets/d/1I--o6uYWUkBw05IP964Ee2aZCf67P9E3TxpuDawH4_I/edit#gid=0
FreeBSD currently does not say "Linux". (Chrome on Chrome OS does not
say Linux, either, but does say "X11; ".) That is, could "X11; " alone
be sufficient for Web compat? (I'm happy to see that running Firefox
in Wayland mode still says "X11; ". Let's keep it that way!)

Do we have an idea if distros would counteract Mozilla and restore the
CPU architecture if we removed it? Previous evidence suggests that
distros are willing to split the anonymity set for self-promotional
reasons by adding "; Ubuntu" or "; Fedora". Is there a similar distro
interest in exposing the CPU architecture?

https://docs.google.com/spreadsheets/d/1I--o6uYWUkBw05IP964Ee2aZCf67P9E3TxpuDawH4_I/edit#gid=0
suggests making Firefox on FreeBSD say "Linux". Are there indications
that the self-promotion interests of FreeBSD wouldn't override privacy
or Web compat benefits of saying "Linux"?

> I propose no change to the macOS UA string at this time. Removing
> "Intel" now would not reduce any fingerprinting entropy (all modern Macs
> are x86_64) and might risk confusing some UA string parsers. If AArch64
> MacBooks become a real platform, I propose we then remove "Intel" so
> x86_64 and AArch64 macOS would have the same UA string:
>
> < "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101
> Firefox/66.0"
>  > "Mozilla/5.0 (Macintosh; Mac OS X 10.14; rv:66.0) Gecko/20100101
> Firefox/66.0".

Or they could have the same UA string by Aarch64 saying "Intel"...

Meanwhile, could we make the system version number "10.14" (or
whatever is latest at a given point in time) regardless of actual
version number to put all macOS users in the same anonymity set?
(Curiously, despite Apple's privacy efforts, Safari exposes the third
component of the OS version number. Also, it uses underscores instead
of periods as the separator.)

> Here is a spreadsheet comparing UA strings of different browser and OS
> architectures:
>
> https://docs.google.com/spreadsheets/d/1I--o6uYWUkBw05IP964Ee2aZCf67P9E3TxpuDawH4_I/edit#gid=0

The reference there to
https://bugzilla.mozilla.org/show_bug.cgi?id=1169772 about exposing
_some_ Android version number for Web compat says the reason not to
make Firefox claim the same Android version for all users regardless
of actual system version is that doing so would require bumping the
version later:
https://bugzilla.mozilla.org/show_bug.cgi?id=1169772#c36

It seems that for privacy reasons, we should claim the latest Android
version for everyone even if it means introducing the recurring task
of incrementing the number annually or so.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Personally opining that new C++ library facilities should not support non-UTF encodings

2019-04-24 Thread Henri Sivonen
I sent this to SG16 (after tweaking it a bit to downplay UTF-16 and UTF-32).

On Mon, Apr 22, 2019 at 1:19 PM Henri Sivonen  wrote:
>
> (If you don't care about what I say about character encoding issues in
> the context of C++ standardization outside my Mozilla activities, you
> can save time by skipping the rest of this email.)
>
> On my own time outside my Mozilla activities, I was invited to read
> and comment on the C++ standardization papers under the purview of the
> Unicode Study Group (SG16) of the C++ committee.
>
> I've drafted a reply in which I opine that _new_ text processing
> features (other than character encoding conversion facilities) in the
> C++ standard library should only support UTF-8/16/32 and should not
> seek to support non-UTF execution encodings and that conversion
> facilities should have an API similar to encoding_rs's API:
> https://hsivonen.fi/p/non-unicode-in-cpp.html
>
> Based on prior advice from Botond, I'm sending this heads-up here just
> in case: If you have a reason why it would be bad from the Mozilla
> perspective for me to send the above-linked document to SG16 even with
> the personal-capacity disclaimer, please let me know.
>
> (I expect this to be non-controversial from the Mozilla perspective,
> since we already treat the concept of C++ "execution encoding" as a
> C++ design bug that we route around.)
>
> --
> Henri Sivonen
> hsivo...@hsivonen.fi
> https://hsivonen.fi/



-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Personally opining that new C++ library facilities should not support non-UTF encodings

2019-04-22 Thread Henri Sivonen
(If you don't care about what I say about character encoding issues in
the context of C++ standardization outside my Mozilla activities, you
can save time by skipping the rest of this email.)

On my own time outside my Mozilla activities, I was invited to read
and comment on the C++ standardization papers under the purview of the
Unicode Study Group (SG16) of the C++ committee.

I've drafted a reply in which I opine that _new_ text processing
features (other than character encoding conversion facilities) in the
C++ standard library should only support UTF-8/16/32 and should not
seek to support non-UTF execution encodings and that conversion
facilities should have an API similar to encoding_rs's API:
https://hsivonen.fi/p/non-unicode-in-cpp.html

Based on prior advice from Botond, I'm sending this heads-up here just
in case: If you have a reason why it would be bad from the Mozilla
perspective for me to send the above-linked document to SG16 even with
the personal-capacity disclaimer, please let me know.

(I expect this to be non-controversial from the Mozilla perspective,
since we already treat the concept of C++ "execution encoding" as a
C++ design bug that we route around.)

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: Intent to deprecate - linux32 tests starting with Firefox 69

2019-04-10 Thread Henri Sivonen
On Tue, Apr 9, 2019 at 6:05 PM Gian-Carlo Pascutto  wrote:

> On top of that, we know that not all distros have telemetry enabled and
> so we won't be counting those either (Debian is the largest).

We get telemetry from Ubuntu, Fedora, and Arch (subject to choices
made by the user). I'm not particularly worried about Debian, because
x86_64 (as amd64) has had prominent (unlike Ubuntu; see below)
availability from Debian pretty much for as long as x86_64 hardware
has been available. But in any case, not enabling telemetry means not
having representation in data-driven decision making.

> At least current Ubuntu and Ubuntu LTS are still available in 32-bit:
> https://www.ubuntu.com/download/alternative-downloads
>
> Ubuntu is our largest userbase (with telemetry...)

The latest and the latest LTS are offered for 32-bit x86 only as
netboot installers, and the latest LTS also as an upgrade from an
earlier version, so Ubuntu now positions 32-bit x86 as more obscure
than s390x, ppc64el, aarch64, and armv7hf. Ubuntu has already disabled
automatic updates to 18.10 for 32-bit x86, because the expectation is
that 18.04 will be the longest-supported release for 32-bit x86.

I think the main problem with Ubuntu is that Ubuntu promoted 32-bit
x86 downloads to users who didn't have the expertise to seek the
x86_64 version for way longer than they, in my opinion, should have
(though, in fairness, at that time, Mozilla also was promoting 32-bit
x86 downloads over 64-bit). Users who installed Ubuntu at that time
may have gotten a 32-bit distro even if the hardware would work just
fine with the 64-bit version, and there are no upgrades from 32-bit to
64-bit other than reinstall. Probably more Canonical's job than ours
to communicate to those users that they should do a 64-bit reinstall.

(In terms of security support from the Ubuntu repos, I find the
situation for architectures that aren't tier-1 for us a bit
concerning. Compare
https://packages.ubuntu.com/search?suite=disco=names=firefox
with 
https://packages.ubuntu.com/search?suite=cosmic=names=firefox
. Despite 32-bit x86 having been demoted below s390x, ppc64el,
aarch64, and armv7hf in terms of distribution images, 32-bit x86 gets
better security support in the Ubuntu repos than s390x, ppc64el,
aarch64, and armv7hf. So does security support in Ubuntu repos depend
on our tier positioning or on Ubuntu's own positioning?)

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Proposed W3C Charters: Internationalization (i18n) Working Group and Interest Group

2019-04-10 Thread Henri Sivonen
On Tue, Apr 9, 2019 at 11:13 PM L. David Baron  wrote:
>
> On Tuesday 2019-04-09 13:55 +0300, Henri Sivonen wrote:
> > On Mon, Apr 8, 2019 at 11:32 PM L. David Baron  wrote:
> > >
> > > The W3C is proposing revised charters for:
> > >
> > >   Internationalization (i18n) Working Group
> > >   https://www.w3.org/2019/04/proposed-i18n-wg-charter.html
> > >   https://lists.w3.org/Archives/Public/public-new-work/2019Apr/0004.html
> > >   diff from previous charter: 
> > > https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwww.w3.org%2FInternational%2Fcore%2Fcharter-2016.html=https%3A%2F%2Fwww.w3.org%2F2019%2F04%2Fproposed-i18n-wg-charter.html
> > >
> > >   Internationalization (i18n) Interest Group
> > >   https://www.w3.org/2019/04/proposed-i18n-ig-charter.html
> > >   https://lists.w3.org/Archives/Public/public-new-work/2019Apr/0004.html
> > >   diff from previous charter: 
> > > https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwww.w3.org%2FInternational%2Fig%2Fcharter-2016.html=https%3A%2F%2Fwww.w3.org%2F2019%2F04%2Fproposed-i18n-ig-charter.html
> > >
> > > Mozilla has the opportunity to send comments or objections through
> > > Friday, May 3.
> > >
> > > Please reply to this thread if you think there's something we should
> > > say as part of this charter review, or if you think we should
> > > support or oppose it.  (Absent specific comments, I'd be inclined to
> > > support the charters, because I think the i18n work at W3C has been
> > > generally effective.)
> >
> > Is the WG expected to continue to issue revisions to
> > https://www.w3.org/TR/charmod-norm/ ? The document itself suggests
> > sending comments via GitHub issues. I don't see a charter item that
> > clearly covers the maintenance of this document. Should we ask for an
> > item that ensures that the group is explicitly chartered to continue
> > to maintain this document?
>
> I expect the wording at the start of section 2.1 (Normative
> Specifications) that says:
>
>   The formal documents produced by the Working Group are guidelines,
>   best practices, requirements, and the like. These are best
>   published as Working Group Notes.
>
> probably covers this.  Or does this document not fit within that
> description?

Oops. Sorry. Yes, those sentences cover it. Thanks. (I searched for
the case-insensitive string "note", but somehow I managed to miss the
second sentence you quoted.)

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Proposed W3C Charters: Internationalization (i18n) Working Group and Interest Group

2019-04-09 Thread Henri Sivonen
On Mon, Apr 8, 2019 at 11:32 PM L. David Baron  wrote:
>
> The W3C is proposing revised charters for:
>
>   Internationalization (i18n) Working Group
>   https://www.w3.org/2019/04/proposed-i18n-wg-charter.html
>   https://lists.w3.org/Archives/Public/public-new-work/2019Apr/0004.html
>   diff from previous charter: 
> https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwww.w3.org%2FInternational%2Fcore%2Fcharter-2016.html=https%3A%2F%2Fwww.w3.org%2F2019%2F04%2Fproposed-i18n-wg-charter.html
>
>   Internationalization (i18n) Interest Group
>   https://www.w3.org/2019/04/proposed-i18n-ig-charter.html
>   https://lists.w3.org/Archives/Public/public-new-work/2019Apr/0004.html
>   diff from previous charter: 
> https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwww.w3.org%2FInternational%2Fig%2Fcharter-2016.html=https%3A%2F%2Fwww.w3.org%2F2019%2F04%2Fproposed-i18n-ig-charter.html
>
> Mozilla has the opportunity to send comments or objections through
> Friday, May 3.
>
> Please reply to this thread if you think there's something we should
> say as part of this charter review, or if you think we should
> support or oppose it.  (Absent specific comments, I'd be inclined to
> support the charters, because I think the i18n work at W3C has been
> generally effective.)

Is the WG expected to continue to issue revisions to
https://www.w3.org/TR/charmod-norm/ ? The document itself suggests
sending comments via GitHub issues. I don't see a charter item that
clearly covers the maintenance of this document. Should we ask for an
item that ensures that the group is explicitly chartered to continue
to maintain this document?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Proposed W3C Charter: Web & Networks Interest Group

2019-04-09 Thread Henri Sivonen
On Mon, Apr 8, 2019 at 11:11 PM L. David Baron  wrote:
>
> The W3C is proposing a new charter for:
>
>   Web & Networks Interest Group
>   https://www.w3.org/2019/03/web-networks-charter-draft.html
>   https://lists.w3.org/Archives/Public/public-new-work/2019Mar/0010.html
>
> Mozilla has the opportunity to send comments or objections through
> Friday, April 26.
>
> Please reply to this thread if you think there's something we should
> say as part of this charter review, or if you think we should
> support or oppose it.

The phrasing of the "Application hints to the network" part of the charter
suggests that the IG envisions the browser declaring preferences to the
*network* rather than the other end point of the connection. Am I
reading that part right? That seems contrary to the general trend,
including Mozilla efforts, to encrypt things so that things aren't
visible to the network between the end points and the tendency to
consider it unwanted for the network to take actions other than making
the packets travel between the end points.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to implement and experiment: Require user interaction for notification permission prompts

2019-04-02 Thread Henri Sivonen
On Tue, Mar 19, 2019 at 3:15 PM Johann Hofmann  wrote:
>
> In bug 1524619 <https://bugzilla.mozilla.org/show_bug.cgi?id=1524619> I
> plan to implement support for requiring a user gesture when calling
> Notification.requestPermission() [0] and PushManager.subscribe() [1].

What's the current status of getting a cross-browser definition for
something being invoked in response to a user gesture?

Does scrolling count as a user gesture?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent-to-Ship: Backward-Compatibility FIDO U2F support for Google Accounts

2019-03-21 Thread Henri Sivonen
On Thu, Mar 14, 2019 at 8:12 PM J.C. Jones  wrote:
> It appears that if we want full security key support for Google
> Accounts in Firefox in the near term, we need to graduate our FIDO U2F
> API support from “experimental and behind a pref”

I think it's problematic to describe something as "experimental" if
it's not on path to getting enabled. "Experimental and behind a pref"
sounds like it's on track to getting enabled, so simultaneously 1)
sites have a reason to believe they don't need to do anything for
Firefox, since for now users can flip a pref and the feature is coming
anyway and 2) the feature still doesn't actually work by default for
users. For this particular feature, the penalty of relying on an
experimental feature whose experiment fails is getting locked out of
an account.

So I think it's especially important to move *somewhere* from the
"experimental and behind a pref" state: Either to interop with Chrome
to the extent required by actual sites (regardless of what's de jure
standard) or to clear removal so that the feature doesn't look like
sites should just wait for it to get enabled and that the sites expect
the user to flip a pref.

As a user, I'd prefer the "interop with Chrome" option.

> to either “enabled
> by default” or “enabled for specific domains by default.” I am
> proposing the latter.

Why not the former? Won't the latter still make other sites wait in
the hope that if they don't change, they'll get onto the list
eventually anyway?

> First, we only implemented the optional Javascript version of the API,
> not the required MessagePort implementation [3]. This is mostly
> semantics, because everyone actually uses the JS API via a
> Google-supplied polyfill called u2f-api.js.

Do I understand correctly that the part that is actually needed for
interop is implemented?

> As I’ve tried to establish, I’ve had reasons to resist shipping the
> FIDO U2F API in Firefox, and I believe those reasons to be valid.
> However, a multi-year delay for the largest security key-enabled web
> property is, I think, unreasonable to push upon our users. We should
> do what’s necessary to enable full security key support on Google
> Accounts as quickly as is practical.

This concern seems to apply to other services as well.

> I’ve proposed here making the FIDO U2F API whitelist a pref. I can’t
> say whether I would welcome adding more domains to it by default; I
> think we’re going to have to take them on a case-by-case basis.

What user-relevant problem is solved by having to add domains to a
list compared to making the feature available to all domains?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Type-based alias analysis and Gecko C++

2019-03-07 Thread Henri Sivonen
On Wed, Feb 27, 2019 at 10:20 AM Henri Sivonen  wrote:
> Given the replies to this thread and especially the one I quoted
> above, I suggest appending the following paragraph after the first
> paragraph of 
> https://developer.mozilla.org/en-US/docs/Mozilla/Using_CXX_in_Mozilla_code

I've made the edit after checking with Ehsan.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: What is dom/browser-element/ used for?

2019-02-28 Thread Henri Sivonen
I think I found a user in dev tools:
https://searchfox.org/mozilla-central/rev/2a6f3dde00801374d3b2a704232de54a132af389/devtools/client/responsive.html/components/Browser.js#140

On Thu, Feb 28, 2019 at 11:45 AM Henri Sivonen  wrote:
>
> It appears dom/browser-element/ was created for Gaia. Is it used for
> something still? WebExtensions perhaps?
>
> --
> Henri Sivonen
> hsivo...@mozilla.com



-- 
Henri Sivonen
hsivo...@mozilla.com


What is dom/browser-element/ used for?

2019-02-28 Thread Henri Sivonen
It appears dom/browser-element/ was created for Gaia. Is it used for
something still? WebExtensions perhaps?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Type-based alias analysis and Gecko C++

2019-02-27 Thread Henri Sivonen
On Tue, Feb 19, 2019 at 10:17 PM Gabriele Svelto  wrote:
> On the reverse I've seen performance regressions from using
> -fno-strict-aliasing only in tight loops where the inability to move
> accesses around was lengthening the critical path through the loop.
> However this was on processors with very limited memory reordering
> capabilities; my guess is that on today's hardware
> -fno-strict-aliasing's impact is lost in the noise.

Given the replies to this thread and especially the one I quoted
above, I suggest appending the following paragraph after the first
paragraph of 
https://developer.mozilla.org/en-US/docs/Mozilla/Using_CXX_in_Mozilla_code
:

On the side of extending C++, we compile with -fno-strict-aliasing.
This means that when reinterpreting a pointer as a differently-typed
pointer, you don't need to adhere to the "effective type" (of the
pointee) rule from the standard when dereferencing the reinterpreted
pointer. You still need to make sure that you don't violate alignment
requirements and need to make sure that the data at the memory
location pointed to forms a valid value when interpreted according to
the type of the pointer when dereferencing the pointer for reading.
Likewise, if you write by dereferencing the reinterpreted pointer and
the originally-typed pointer might still be dereferenced for reading,
you need to make sure that the values you write are valid according to
the original type. This issue is moot for e.g. primitive integers for
which all bit patterns of their size are valid values.

--
Henri Sivonen
hsivo...@mozilla.com


Re: Type-based alias analysis and Gecko C++

2019-02-22 Thread Henri Sivonen
On Fri, Feb 22, 2019 at 1:00 AM Jeff Walden  wrote:
>
> On 2/17/19 11:40 PM, Henri Sivonen wrote:
> > Rust, which combines the
> > perf benefits of -fstrict-aliasing with the understandability of
> > -fno-strict-aliasing?
>
> This is not really true of Rust.  Rust's memory model is not really defined 
> yet https://doc.rust-lang.org/reference/memory-model.html but from what I've 
> been able to read as to how you're "supposed" to and "should" use the 
> language in unsafe code and through FFI, it *does* require the same sorts of 
> things as C++ in "you can't dereference a pointer/reference unless it 
> contains a well-formed value of the type of the pointer/reference".  Just, 
> Rust has somewhat more tools that hide away this unsafety so you don't often 
> manually bash on memory yourself in that manner.

Requiring a dereferenced pointer to point to a value that is
well-formed according to the type of the pointer is *very* different
from having requirements on how the value was written ("effective
type"). E.g. all possible bit patterns of f64 are well-formed bit
patterns for u64, so in Rust it's permissible to use a u64-typed
pointer to access a value that was created as f64. However, in C++
(without -fno-strict-aliasing, of course), if the "effective type" of
a pointee is double, i.e. it was written as double, it's not
permissible to access the value via a uint64_t-type pointer.

In fact, the Rust standard library even provides an API for such viewing:
https://doc.rust-lang.org/std/primitive.slice.html#method.align_to

The unsafety remark is not in terms of aliasing but in terms of
*value* transmutability. The method is fully safe when U is a type for
which all bit patterns of U's size are valid values. (I'm a bit
disappointed that there isn't a safe method to that effect with a
trait bound to a trait that says that all bit patterns are valid. Then
primitive integers and SIMD vectors of integer lanes could implement
that marker trait.)

> As a practical matter, I don't honestly see how Rust can avoid having a 
> memory model very similar to C++'s, including with respect to aliasing, even 
> if they're not there *formally* yet.

As far as I'm aware, Ralf Jung, who is working on the formalization,
is against introducing *type-based* alias analysis to Rust. Unsafe
Rust has informal and will have formal aliasing rules, but all
indications are that they won't be *type-based*.

See
https://www.ralfj.de/blog/2018/08/07/stacked-borrows.html
https://www.ralfj.de/blog/2018/11/16/stacked-borrows-implementation.html
https://www.ralfj.de/blog/2018/12/26/stacked-borrows-barriers.html
for the aliasing rule formulation that does not involve type-based
alias analysis.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Type-based alias analysis and Gecko C++

2019-02-17 Thread Henri Sivonen
On Fri, Feb 15, 2019 at 6:47 PM Ted Mielczarek  wrote:
>
> On Fri, Feb 15, 2019, at 4:00 AM, Henri Sivonen wrote:
> > How committed are we to -fno-strict-aliasing?
>
> FWIW, some work was done to investigate re-enabling strict aliasing a while 
> ago but it proved untenable at the time:
> https://bugzilla.mozilla.org/show_bug.cgi?id=414641

The bug was closed with "Realistically, this is WONTFIX.  Life is too
short to figure out why -O3 breaks -fstrict-aliasing." That conclusion
makes sense to me.

Is there any reason to believe that strict-aliasing in clang would
yield the kind of performance benefits that would outweigh the trouble
of writing strict-aliasing-conformant code and the performance
penalties of additional memcpy() required for
strict-aliasing-conformant code?

Out of curiosity: Do we know if WebKit and Chromium compile with or
without strict-aliasing?

On Fri, Feb 15, 2019 at 4:43 PM David Major  wrote:
>  If we can easily remove (or reduce) uses of this flag, I think
> that would be pretty uncontroversial.

What are the UB implications of using it for some parts of the code
but not for others in the context of LTO?

If we have specific places where we'd need strict-aliasing for
performance, shouldn't we write those bits in Rust, which combines the
perf benefits of -fstrict-aliasing with the understandability of
-fno-strict-aliasing?

-- 
Henri Sivonen
hsivo...@mozilla.com


Type-based alias analysis and Gecko C++

2019-02-15 Thread Henri Sivonen
I happened to have a reason to run our build system under strace, and
I noticed that we pass -fno-strict-aliasing to clang.

How committed are we to -fno-strict-aliasing?

If we have no intention of getting rid of -fno-strict-aliasing, it
would make sense to document this at
https://developer.mozilla.org/en-US/docs/Mozilla/Using_CXX_in_Mozilla_code
and make it explicitly OK for Gecko developers not to worry about
type-based alias analysis UB--just like we don't worry about writing
exception-safe code.

Debating in design/review or making an effort to avoid type-based
alias analysis UB is not a good use of time if we're committed to not
having type-based alias analysis.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Proposed W3C Charter: SVG Working Group

2019-01-23 Thread Henri Sivonen
On Wed, Jan 23, 2019 at 11:23 AM Cameron McCormack  wrote:
>
> On Thu, Jan 10, 2019, at 12:38 AM, Henri Sivonen wrote:
> > A (non-changed) part of the charter says under SVG2: "This
> > specification updates SVG 1.1 to include HTML5-compatible parsing". Is
> > that in reference to
> > https://svgwg.org/svg2-draft/single-page.html#embedded-HTMLElements or
> > something else? I.e. does it mean the SVG WG wants to change the HTML
> > parsing algorithm to put  and  with
> > non-integration-point SVG parent into the HTML namespace in the HTML
> > parser?
>
> I see the note in that section you link that says:
>
> > Currently, within an SVG subtree, these tagnames are not recognized by the 
> > HTML parser to
> > be HTML-namespaced elements, although this may change in the future. 
> > Therefore, in order
> > to include these elements within SVG, one of the following must be used:
> > ...
>
> The "this may change in the future" part sounds like someone thought that it 
> might be the case in the future.  Saying that SVG 2 "includes 
> HTML5-compatible parsing" is a bit odd, though, since that behavior is 
> defined in the HTML spec.  In any case, given the group's intended focus on 
> stabilizing and documenting what is currently implemented and interoperable, 
> I doubt that making such a change would be in scope.

Thanks. I think it would be prudent for Mozilla to request that "
updates SVG 1.1 to include HTML5-compatible parsing," be struck from
the charter, so that changes to the HTML parsing algorithm can't be
justified using an argument from a charter that we approved.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to ship: TextEncoder.encodeInto() - UTF-8 encode into caller-provided buffer

2019-01-15 Thread Henri Sivonen
On Mon, Jan 14, 2019 at 4:45 PM Boris Zbarsky  wrote:
> On 1/14/19 4:28 AM, Henri Sivonen wrote:
> > This is now an "intent to ship". The feature landed in the spec, in
> > WPT and in Gecko (targeting 66 release).
>
> Where do other browsers stand on this feature, do you know?

I see active involvement in spec review and in test case reviewing and
test validation using a polyfill from a Chromium developer and the bug
(https://bugs.chromium.org/p/chromium/issues/detail?id=920107)
indicates interest from another person to do the implementation.

No signals from WebKit. https://bugs.webkit.org/show_bug.cgi?id=193274

-- 
Henri Sivonen
hsivo...@mozilla.com


Intent to ship: TextEncoder.encodeInto() - UTF-8 encode into caller-provided buffer

2019-01-14 Thread Henri Sivonen
This is now an "intent to ship". The feature landed in the spec, in
WPT and in Gecko (targeting 66 release).

On Mon, Dec 17, 2018 at 9:12 AM Henri Sivonen  wrote:
>
> # Summary
>
> TextEncoder.encodeInto() adds support for encoding JavaScript strings
> into UTF-8 in caller-provided byte buffers. This is a performance
> optimization for JavaScript/Wasm interop that allows the encoder
> output to be written directly into Wasm memory without extra copies.
>
> # Details
>
> TextEncoder.encode() returns a DOM implementation-created buffer. This
> involves copying internally in the implementation (in principle, this
> copy could be optimized away with internal-only API changes) and then
> yet another copy from the returned buffer into Wasm memory (optimizing
> away this copy needs a Web-visible change anyway).
>
> TextEncoder.encodeInto() avoids both copies.
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=1449849 combined with
> TextEncoder.encodeInto() is expected to avoid yet more copying.
>
> The expectation is that passing strings from JS to Wasm is / is going
> to be common enough to be worthwhile to optimize.
>
> # Bug
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=1514664
>
> # Link to standard
>
> https://github.com/whatwg/encoding/pull/166
>
> # Platform coverage
>
> All
>
> # Estimated or target release
>
> 67
>
> # Preference behind which this will be implemented
>
> Not planning to have a pref for this.
>
> # Is this feature enabled by default in sandboxed iframes?
>
> Yes.
>
> # DevTools bug
>
> No need for new DevTools integration.
>
> # Do other browser engines implement this
>
> No, but Chrome developers have been active in the spec discussion.
>
> # web-platform-tests
>
> https://github.com/web-platform-tests/wpt/pull/14505
>
> # Is this feature restricted to secure contexts?
>
> No. This is a new method on an interface that predates restricting
> features to secure contexts.
>
> --
> Henri Sivonen
> hsivo...@mozilla.com



-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Proposed W3C Charter: SVG Working Group

2019-01-09 Thread Henri Sivonen
On Sun, Dec 23, 2018 at 7:59 PM L. David Baron  wrote:
>
> The W3C is proposing a revised charter for:
>
>   Scalable Vector Graphics (SVG) Working Group
>   https://www.w3.org/Graphics/SVG/svg-2019-ac.html
>   https://lists.w3.org/Archives/Public/public-new-work/2018Dec/0006.html

(Not a charter comment yet. At this point a question.)

A (non-changed) part of the charter says under SVG2: "This
specification updates SVG 1.1 to include HTML5-compatible parsing". Is
that in reference to
https://svgwg.org/svg2-draft/single-page.html#embedded-HTMLElements or
something else? I.e. does it mean the SVG WG wants to change the HTML
parsing algorithm to put  and  with
non-integration-point SVG parent into the HTML namespace in the HTML
parser?

(Even with evergreen browsers, changing the HTML parsing algorithm
poses the problem that, if the algorithm is ever-changing, server-side
software cannot make proper security decisions on the assumption that
their implementation of the HTML parsing algorithm from some point in
time matches the behavior of browsers. )

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Rust code coverage

2019-01-07 Thread Henri Sivonen
On Fri, Jan 4, 2019 at 2:54 PM Marco Castelluccio
 wrote:
> Hi everyone,
> we have recently enabled collecting code coverage for Rust code too,

Nice!

> running Rust tests in coverage builds.

Does this mean running cargo test for each crate under
third_party/rust, running Firefox test suites or both?

As for trying to make sense of what the numbers mean:

Is the coverage ratio computed over the lines that are attributed at
all in the ELF debug info, as opposed to the total number of lines in
the source files?

What kind of expectations should one have about how the system
measures coverage for code that gets inlined?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Pointer to the stack limit

2018-12-19 Thread Henri Sivonen
On Wed, Dec 19, 2018 at 2:37 PM David Major  wrote:
> You'll need platform-specific code, but on Windows there's 
> https://searchfox.org/mozilla-central/rev/13788edbabb04d004e4a1ceff41d4de68a8320a2/js/xpconnect/src/XPCJSContext.cpp#986.
>
> And, to get a sense of caution, have a look at the ifdef madness surrounding 
> the caller -- 
> https://searchfox.org/mozilla-central/rev/13788edbabb04d004e4a1ceff41d4de68a8320a2/js/xpconnect/src/XPCJSContext.cpp#1125
>  -- to see the number of hoops we have to jump through to accommodate various 
> build configs.

Thanks. It looks like the Android case just hard-codes a limit that
works for Dalvik instead of querying from the OS or ever querying for
the API level to decide between a limit that works for Dalvik and a
limit that works for ART.

(On IRC, I was pointed to the code that uses the limit:
https://searchfox.org/mozilla-central/search?q=symbol:_ZN2js19CheckRecursionLimitEP9JSContext=false
)

-- 
Henri Sivonen
hsivo...@mozilla.com


Pointer to the stack limit

2018-12-19 Thread Henri Sivonen
Is it possible to dynamically, at run time, obtain a pointer to the
call stack limit? I mean the address that is the lowest address that the
run-time stack can grow into without the process getting terminated
with a stack overflow.

I'm particularly interested in a solution that'd work on 32-bit
Windows and on Dalvik. (On ART, desktop Linux, and 64-bit platforms we
can make the stack "large enough" anyway.)

Use case: Implementing a dynamic recursion limit.
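As a sketch of what such a query can look like, glibc exposes the current thread's stack bounds via pthread_getattr_np (a non-portable glibc extension; the assumption here is a Linux/glibc build — Windows 8+ has GetCurrentThreadStackLimits, and other platforms need their own code):

```cpp
#include <pthread.h>

// Returns the lowest valid address of the current thread's stack on
// glibc. Letting the stack grow below this address (minus any guard
// gap) is what produces a stack-overflow crash.
void* StackLimit() {
  pthread_attr_t attr;
  pthread_getattr_np(pthread_self(), &attr);  // glibc-only extension
  void* lowest = nullptr;
  size_t size = 0;
  pthread_attr_getstack(&attr, &lowest, &size);
  pthread_attr_destroy(&attr);
  return lowest;  // stack grows downward toward this address
}
```

A dynamic recursion limit can then compare the address of a local variable against this limit plus a safety margin before recursing.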

-- 
Henri Sivonen
hsivo...@mozilla.com


Intent to implement: TextEncoder.encodeInto() - UTF-8 encode into caller-provided buffer

2018-12-16 Thread Henri Sivonen
# Summary

TextEncoder.encodeInto() adds support for encoding JavaScript strings
into UTF-8 in caller-provided byte buffers. This is a performance
optimization for JavaScript/Wasm interop that allows the encoder
output to be written directly into Wasm memory without extra copies.

# Details

TextEncoder.encode() returns a DOM implementation-created buffer. This
involves copying internally in the implementation (in principle, this
copy could be optimized away with internal-only API changes) and then
yet another copy from the returned buffer into Wasm memory (optimizing
away this copy needs a Web-visible change anyway).

TextEncoder.encodeInto() avoids both copies.

https://bugzilla.mozilla.org/show_bug.cgi?id=1449849 combined with
TextEncoder.encodeInto() is expected to avoid yet more copying.

The expectation is that passing strings from JS to Wasm is / is going
to be common enough to be worthwhile to optimize.
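To make the contract concrete, here is a hypothetical C++ sketch of the semantics (not the Gecko implementation; the real API is JavaScript's encoder.encodeInto(string, uint8Array), which returns a {read, written} pair): encode UTF-16 code units into a caller-provided byte buffer, stop when either side runs out, and report how much was consumed and produced.

```cpp
#include <cstddef>
#include <cstdint>

// How much was consumed (UTF-16 code units) and produced (UTF-8 bytes).
struct EncodeIntoResult {
  size_t read;
  size_t written;
};

// Encodes as much of src as fits into dst. Unpaired surrogates become
// U+FFFD, matching TextEncoder behavior.
EncodeIntoResult EncodeInto(const char16_t* src, size_t srcLen,
                            uint8_t* dst, size_t dstLen) {
  size_t r = 0;
  size_t w = 0;
  while (r < srcLen) {
    uint32_t c = src[r];
    size_t consume = 1;
    if (c >= 0xD800 && c <= 0xDBFF && r + 1 < srcLen &&
        src[r + 1] >= 0xDC00 && src[r + 1] <= 0xDFFF) {
      c = 0x10000 + ((c - 0xD800) << 10) + (src[r + 1] - 0xDC00);
      consume = 2;  // a surrogate pair is one scalar value
    } else if (c >= 0xD800 && c <= 0xDFFF) {
      c = 0xFFFD;  // lone surrogate -> replacement character
    }
    size_t need = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
    if (w + need > dstLen) {
      break;  // out of output space; caller sees partial progress
    }
    if (need == 1) {
      dst[w++] = static_cast<uint8_t>(c);
    } else if (need == 2) {
      dst[w++] = static_cast<uint8_t>(0xC0 | (c >> 6));
      dst[w++] = static_cast<uint8_t>(0x80 | (c & 0x3F));
    } else if (need == 3) {
      dst[w++] = static_cast<uint8_t>(0xE0 | (c >> 12));
      dst[w++] = static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F));
      dst[w++] = static_cast<uint8_t>(0x80 | (c & 0x3F));
    } else {
      dst[w++] = static_cast<uint8_t>(0xF0 | (c >> 18));
      dst[w++] = static_cast<uint8_t>(0x80 | ((c >> 12) & 0x3F));
      dst[w++] = static_cast<uint8_t>(0x80 | ((c >> 6) & 0x3F));
      dst[w++] = static_cast<uint8_t>(0x80 | (c & 0x3F));
    }
    r += consume;
  }
  return {r, w};
}
```

With a 3-byte buffer and the input u"a€", only 'a' fits (the euro sign needs three bytes), so the result is {read: 1, written: 1}; a caller can grow the buffer and continue from src + read.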

# Bug

https://bugzilla.mozilla.org/show_bug.cgi?id=1514664

# Link to standard

https://github.com/whatwg/encoding/pull/166

# Platform coverage

All

# Estimated or target release

67

# Preference behind which this will be implemented

Not planning to have a pref for this.

# Is this feature enabled by default in sandboxed iframes?

Yes.

# DevTools bug

No need for new DevTools integration.

# Do other browser engines implement this

No, but Chrome developers have been active in the spec discussion.

# web-platform-tests

https://github.com/web-platform-tests/wpt/pull/14505

# Is this feature restricted to secure contexts?

No. This is a new method on an interface that predates restricting
features to secure contexts.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to implement and ship: UTF-8 autodetection for HTML and plain text loaded from file: URLs

2018-12-12 Thread Henri Sivonen
On Tue, Dec 11, 2018 at 10:08 AM Henri Sivonen  wrote:
> How about I change it to 5 MB on the assumption that that's still very
> large relative to pre-UTF-8-era HTML and text file sizes?

I changed the limit to 4 MB.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Intent to implement and ship: UTF-8 autodetection for HTML and plain text loaded from file: URLs

2018-12-11 Thread Henri Sivonen
On Tue, Dec 11, 2018 at 2:24 AM Martin Thomson  wrote:
> This seems reasonable, but 50M is a pretty large number.  Given the
> odds of UTF-8 detection failing, I would have thought that this could
> be much lower.

Consider the case of a document of ASCII text with a copyright sign in
the footer. I'd rather not make anyone puzzle over why the behavior of
the footer depends on how much text comes before the footer.
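The check itself is mechanical: a file is treated as UTF-8 only if every byte sequence in the examined prefix is well-formed UTF-8. A minimal validator sketch follows (Gecko's actual implementation is the considerably more optimized one in encoding_rs); note how a bare Latin-1 copyright sign (0xA9) fails the check while its UTF-8 form (0xC2 0xA9) passes.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true iff the buffer is well-formed UTF-8: no stray
// continuation bytes, no overlong forms, no surrogates, no code
// points above U+10FFFF, no truncated sequences.
bool LooksLikeUtf8(const uint8_t* p, size_t len) {
  size_t i = 0;
  while (i < len) {
    uint8_t b = p[i++];
    if (b < 0x80) continue;          // ASCII
    uint8_t lo = 0x80, hi = 0xBF;    // bounds for the first trail byte
    size_t trail;
    if (b >= 0xC2 && b <= 0xDF)      trail = 1;
    else if (b == 0xE0)              { trail = 2; lo = 0xA0; }  // no overlongs
    else if (b >= 0xE1 && b <= 0xEC) trail = 2;
    else if (b == 0xED)              { trail = 2; hi = 0x9F; }  // no surrogates
    else if (b >= 0xEE && b <= 0xEF) trail = 2;
    else if (b == 0xF0)              { trail = 3; lo = 0x90; }  // no overlongs
    else if (b >= 0xF1 && b <= 0xF3) trail = 3;
    else if (b == 0xF4)              { trail = 3; hi = 0x8F; }  // <= U+10FFFF
    else return false;               // 0x80..0xC1, 0xF5..0xFF
    for (size_t t = 0; t < trail; ++t) {
      if (i >= len) return false;    // truncated sequence
      uint8_t n = p[i++];
      if (n < lo || n > hi) return false;
      lo = 0x80; hi = 0xBF;          // only the first trail is constrained
    }
  }
  return true;
}
```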

50 MB is intentionally extremely large relative to "normal" HTML and
text files so that the limit is reached approximately "never" unless
you open *huge* log files.

The HTML spec is about 11 MB these days, so that's existence proof
that a non-log-file HTML document can exceed 10 MB. Of course, the
limit doesn't need to be larger than present-day UTF-8 files but
larger than "normal"-sized *legacy* non-UTF-8 files.

It is quite possible that 50 MB is *too* large considering 32-bit
systems and what *other* allocations are proportional to the buffer
size, and I'm open to changing the limit to something smaller than 50
MB as long as it's still larger than "normal" non-UTF-8 HTML and text
files.

How about I change it to 5 MB on the assumption that that's still very
large relative to pre-UTF-8-era HTML and text file sizes?

> What is the number in Chrome?

It depends. It's unclear to me what exactly it depends on. Based on
https://github.com/whatwg/encoding/issues/68#issuecomment-272993181 ,
I expect it to depend on some combination of file system, OS kernel
and Chromium IO library internals.

On Ubuntu 18.04 with ext4 on an SSD, the number is 64 KB. On Windows
10 1803 with NTFS on an SSD, it's something smaller.

I think making the limit depend on the internals of file IO buffering
instead of a constant in the HTML parser is a really bad idea. Also 64
KB or something less than 64 KB seems way too small for the purpose of
making it so that the user approximately never needs to puzzle over
why things are different based on the length of the ASCII prefix of a
file with non-ASCII later in the file.

> I assume that other local sources like chrome: are expected to be
> annotated properly.

From source inspection, it seems that chrome: URLs already get
hard-coded to UTF-8 on the channel level:
https://searchfox.org/mozilla-central/source/chrome/nsChromeProtocolHandler.cpp#187

As part of developing the patch, I saw only resource: URLs showing up
as file: URLs to the HTML parser, so only resource: URLs got a special
check that fast-tracks them to UTF-8 instead of buffering for
detection like normal file: URLs.

> On Mon, Dec 10, 2018 at 11:28 PM Henri Sivonen  wrote:
> >
> > (Note: This isn't really a Web-exposed feature, but this is a Web
> > developer-exposed feature.)
> >
> > # Summary
> >
> > Autodetect UTF-8 when loading HTML or plain text from file: URLs (only!).
> >
> > Some Web developers like to develop locally from file: URLs (as
> > opposed to local HTTP server) and then deploy using a Web server that
> > declares charset=UTF-8. To get the same convenience as when developing
> > with Chrome, they want the files loaded from file: URLs be treated as
> > UTF-8 even though the HTTP header isn't there.
> >
> > Non-developer users save files from the Web verbatim without the HTTP
> > headers and open the files from file: URLs. These days, those files
> > are most often in UTF-8 and lack the BOM, and sometimes they lack
> > <meta charset=utf-8>, and plain text files can't even use <meta
> > charset=utf-8>. These users, too, would like a Chrome-like convenience
> > when opening these files from file: URLs in Firefox.
> >
> > # Details
> >
> > If a HTML or plain text file loaded from a file: URL does not contain
> > a UTF-8 error in the first 50 MB, assume it is UTF-8. (It is extremely
> > improbable for text intended to be in a non-UTF-8 encoding to look
> > like valid UTF-8 on the byte level.) Otherwise, behave like at
> > present: assume the fallback legacy encoding, whose default depends on
> > the Firefox UI locale.
> >
> > The 50 MB limit exists to avoid buffering everything when loading a
> > log file whose size is on the order of a gigabyte. 50 MB is an
> > arbitrary size that is significantly larger than "normal" HTML or text
> > files, so that "normal"-sized files are examined with 100% confidence
> > (i.e. the whole file is examined) but can be assumed to fit in RAM
> > even on computers that only have a couple of gigabytes of RAM.
> >
> > The limit, despite being arbitrary, is checked exactly to avoid
> > visible behavior changes depending on how Necko chooses buffer
> > boundaries.
> >
> > The limit is a number of bytes instead of a timeout in order to avoid
> > reintroduci

Intent to implement and ship: UTF-8 autodetection for HTML and plain text loaded from file: URLs

2018-12-10 Thread Henri Sivonen
t
for user-perceived performance reasons.) This is what the solution for
file: URLs does on the assumption that it's OK, because the data in
its entirety is (approximately) immediately available.

* Causing reloads. This is the mode of badness that applies when our
Japanese detector is in use and the first 1024 bytes aren't enough to make
the decision.

All of these are bad. It's better to make the failure to declare UTF-8
in the http/https case something that the Web developer obviously has
to fix (by adding <meta charset=utf-8>, the HTTP header or the BOM)
than to make it
appear that things work when actually at least one of the above forms
of badness applies.

# Bug

https://bugzilla.mozilla.org/show_bug.cgi?id=1071816

# Link to standard

https://html.spec.whatwg.org/#determining-the-character-encoding step
7 is basically an "anything goes" step for legacy reasons--mainly to
allow Japanese encoding detection that IE, WebKit and Gecko had before
the spec was written. Chrome started detecting more without prior
standard-setting discussion. See
https://github.com/whatwg/encoding/issues/68 for after-the-fact
discussion.

# Platform coverage

All

# Estimated or target release

66

# Preference behind which this will be implemented

Not planning to have a pref for this.

# Is this feature enabled by default in sandboxed iframes?

This is implemented to apply to all non-resource:-URL-derived file:
URLs, but since same-origin inheritance to child frames takes
precedence, this isn't expected to apply to sandboxed iframes in
practice.

# DevTools bug

No new dev tools integration. The pre-existing console warning about
undeclared character encoding will still be shown in the
autodetection case.

# Do other browser engines implement this

Chrome does, but not with the same number of bytes examined.

Safari as of El Capitan (my Mac is stuck on El Capitan) doesn't.

Edge as of Windows 10 1803 doesn't.

# web-platform-tests

As far as I'm aware, WPT doesn't cover file: URL behavior, and there
isn't a proper spec for this. Hence, unit tests use mochitest-chrome.

# Is this feature restricted to secure contexts?

Restricted to file: URLs.

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Checking if an nsIURI came from a resource: URL

2018-12-07 Thread Henri Sivonen
On Fri, Dec 7, 2018 at 5:05 PM Dave Townsend  wrote:
>
> This suggests that channel.originalURI should help: 
> https://searchfox.org/mozilla-central/source/netwerk/base/nsIChannel.idl#37

Indeed, getting both nsIURIs from the channel works. Thanks!

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Checking if an nsIURI came from a resource: URL

2018-12-07 Thread Henri Sivonen
On Fri, Dec 7, 2018 at 3:23 PM Daniel Veditz  wrote:
>
> I'm afraid to ask why you want to treat these differently.

I'd like to make our resource: URLs default to UTF-8 and skip
(upcoming) detection between UTF-8 and the locale-affiliated legacy
encoding.

> Do you have a channel or a principal?

I have a channel at a later point, so I could reverse the decision
made from the nsIURI by looking at the channel before the initial
decision is acted upon in a meaningful way.

-- 
Henri Sivonen
hsivo...@mozilla.com


Checking if an nsIURI came from a resource: URL

2018-12-07 Thread Henri Sivonen
It appears that by the time resource: URLs reach the HTML parser,
their scheme is reported as "file" (at least in debug builds).

Is there a way to tell from an nsIURI that it was expanded from a resource: URL?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Rust and --enable-shared-js

2018-10-02 Thread Henri Sivonen
On Mon, Sep 24, 2018 at 3:24 PM, Boris Zbarsky  wrote:
> On 9/24/18 4:04 AM, Henri Sivonen wrote:
>>
>> How important is --enable-shared-js? I gather its use case is making
>> builds faster for SpiderMonkey developers.
>
>
> My use case for it is to be able to use the "exclude samples from library X"
> or "collapse library X" tools in profilers (like Instruments) to more easily
> break down profiles into "page JS" and "Gecko things".

OK.

On Mon, Sep 24, 2018 at 1:24 PM, Mike Hommey  wrote:
>> How important is --enable-shared-js? I gather its use case is making
>> builds faster for SpiderMonkey developers. Is that the only use case?
>
> for _Gecko_ developers.

This surprises me. Doesn't the build system take care of not
rebuilding SpiderMonkey if it hasn't been edited? Is this only about
the link time?

What's the conclusion regarding next steps? Should I introduce
js_-prefixed copies of the four Rust FFI functions that I want to make
available to SpiderMonkey?

-- 
Henri Sivonen
hsivo...@mozilla.com


Rust and --enable-shared-js

2018-09-24 Thread Henri Sivonen
There's an effort to add Rust code to SpiderMonkey:
https://bugzilla.mozilla.org/show_bug.cgi?id=1490948

This will introduce a jsrust_shared crate that will just depend on all
the Rust crates that SpiderMonkey needs, just as gkrust_shared depends
on the crates the rest of Gecko needs.

This is fine both for building standalone SpiderMonkey (a top-level
jsrust will produce a .a and depend on jsrust_shared) and SpiderMonkey
as part of libxul (gkrust_shared will depend on jsrust_shared).

However, there exists a third configuration: --enable-shared-js. With
this option, SpiderMonkey is linked dynamically instead of being baked
into libxul. This is fine as long as the set of FFI-exposing crates that
SpiderMonkey depends on and the set of FFI-exposing crates that the
rest of Gecko depends on are disjoint. If they aren't disjoint, a
symbol conflict is expected.

AFAICT, this could be solved in at least three ways:

 1) Keeping the sets disjoint. If both SpiderMonkey and the rest of
Gecko want to call the same Rust code, introduce a differently-named
FFI binding for SpiderMonkey.

 2) Making FFI symbols .so-internal so that they don't conflict
between shared libraries. Per
https://bugzilla.mozilla.org/show_bug.cgi?id=1490603 , it seems that
this would require rustc changes that don't exist yet.

 3) Dropping support for --enable-shared-js

For my immediate use case, I want to make 4 functions available both
to SpiderMonkey and the rest of Gecko, so option #1 is feasible, but
it won't scale. Maybe #2 becomes feasible before scaling up #1 becomes
a problem.

But still, I'm curious:

How important is --enable-shared-js? I gather its use case is making
builds faster for SpiderMonkey developers. Is that the only use case?
Is it being used that way in practice?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Extending the length of an XPCOM string when writing to it via a raw pointer

2018-08-31 Thread Henri Sivonen
On Fri, Aug 31, 2018 at 8:43 AM, Henri Sivonen  wrote:
> At this point, it's probably relevant to mention that SetCapacity() in
> situations other that ahead of a sequence of Append()s is most likely
> wrong (and has been so since at least 2004; I didn't bother doing code
> archeology further back than that).

I wrote some SetCapacity() docs at:
https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Guide/Internal_strings#Sequence_of_appends_without_reallocating

(I caused SetLength, AppendUTF16toUTF8 and AppendUTF8toUTF16 to move
from the first list to the second, but, other than that, XPCOM strings
have been this way long before I took a look at this code and I'm just
the messenger here.)

Also filed a static analysis request for this:
https://bugzilla.mozilla.org/show_bug.cgi?id=1487612

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Extending the length of an XPCOM string when writing to it via a raw pointer

2018-08-30 Thread Henri Sivonen
On Thu, Aug 30, 2018 at 7:43 PM, Henri Sivonen  wrote:
>> What is then the point of SetCapacity anymore?
>
> To avoid multiple allocations during a sequence of Append()s. (This is
> documented on the header.)

At this point, it's probably relevant to mention that SetCapacity() in
situations other that ahead of a sequence of Append()s is most likely
wrong (and has been so since at least 2004; I didn't bother doing code
archeology further back than that).

SetCapacity() followed immediately by Truncate() is bad. SetCapacity()
allocates a buffer. Truncate() releases the buffer.

SetCapacity() followed immediately by AssignLiteral() of the
compatible character type ("" literal with nsACString and u"" literal
with nsAString) is bad. SetCapacity() allocates a buffer.
AssignLiteral() releases the buffer and makes the string point to the
literal in POD.

SetCapacity() followed immediately by Adopt() is bad. SetCapacity()
allocates a buffer. Adopt() releases the buffer and makes the string
point to the buffer passed to Adopt().

SetCapacity() followed immediately by Assign() is likely bad. If the
string that gets assigned points to a shareable buffer and doesn't
need to be copied, Assign() releases the buffer allocated by
SetCapacity().

Allocating an nsAuto[C]String and immediately calling SetCapacity()
with a constant argument is bad. If the requested capacity is smaller
than the inline buffer, it's a no-op. If the requested capacity is
larger, the inline buffer is wasted stack space. Instead of
SetCapacity(N), it makes sense to declare nsAuto[C]StringN (with
awareness that a very large N may be a problem in terms of overflowing
the run-time stack).

(I've seen all of the above in our code base and have a patch coming up.)
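The one intended use, ahead of a sequence of appends, has the same shape as std::string::reserve. The following is an analogy only (hypothetical std::string code, not XPCOM strings), but the allocation behavior it illustrates is the point of SetCapacity():

```cpp
#include <cstddef>
#include <string>

// Reserve once, then append repeatedly: the whole sequence of appends
// is served by a single up-front allocation, which is the same reason
// SetCapacity() exists for XPCOM strings.
std::string JoinLines(const std::string* items, size_t count) {
  size_t total = 0;
  for (size_t i = 0; i < count; ++i) {
    total += items[i].size() + 1;  // +1 for the newline
  }
  std::string out;
  out.reserve(total);  // like SetCapacity(): one allocation up front
  for (size_t i = 0; i < count; ++i) {
    out += items[i];   // like Append(): no reallocation here
    out += '\n';
  }
  return out;
}
```

Reserving and then immediately assigning, truncating, or adopting throws that up-front allocation away, which is exactly the class of waste listed above.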

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Extending the length of an XPCOM string when writing to it via a raw pointer

2018-08-30 Thread Henri Sivonen
On Thu, Aug 30, 2018 at 6:00 PM, smaug  wrote:
> On 08/30/2018 11:21 AM, Henri Sivonen wrote:
>>
>> We have the following pattern in our code base:
>>
>>   1) SetCapacity(newCapacity) is called on an XPCOM string.
>>   2) A pointer obtained from BeginWriting() is used for writing more
>> than Length() but no more than newCapacity code units to the XPCOM
>> string.
>>   3) SetLength(actuallyWritten)  is called on the XPCOM string such
>> that actuallyWritten > Length() but actuallyWritten <= newCapacity.
>>
>> This is an encapsulation violation, because the string implementation
>> doesn't know that the content past Length() is there.
>
> How so? The whole point of capacity is that the string has that much
> capacity.

It has that much capacity for the use of the string implementation so
that you can do a sequence of Append()s without reallocating multiple
times. It doesn't mean that the caller is eligible to write into the
internal structure of the string in an undocumented way.

>  The caller
>>
>> assumes that step #3 will not reallocate and will only write a
>> zero-terminator at actuallyWritten and set mLength to actuallyWritten.
>> (The pattern is common enough that clearly people have believed it to
>> be a valid pattern. However, I haven't seen any in-tree or on-wiki
>> string documentation endorsing the pattern.)
>>
>> It should be non-controversial that this is an encapsulation
>> violation,
>
> Well, I'm not seeing any encapsulation violation ;)
>
>
>> but does the encapsulation violation matter? It matters if
>> we want SetLength() to be able to conserve memory by allocating a
>> smaller buffer when actuallyWritten code units would fit in a smaller
>> mozjemalloc bucket.
>
>
> Please be very very careful when doing allocations and deallocations. They
> are very slow, showing
> up all the time in performance profiles.

There is a threshold so that we don't reallocate from small to even
smaller. There's a good chance that the threshold should be higher
than it is now.

>  In order for the above pattern to work if
>>
>> SetLength() can reallocate in such case, SetLength() has to memcpy the
>> whole buffer in case someone has written content that the string
>> implementation is unaware of instead of just memcpying the part of the
>> buffer that the string implementation knows to be in use. Pessimizing
>> the number of code units to memcpy is bad for performance.
>>
>> It's unclear if trying to use a smaller mozjemalloc bucket is a
>> worthwhile thing. It obviously is for large long-lived strings and it
>> obviously isn't for short-lived strings even if they are large.
>> SetLength() doesn't know what the future holds for the string. :-( But
>> in any case, it's bad if we can't make changes that are sound from the
>> perspective of the string encapsulation, because we have other code
>> violating the encapsulation.
>>
>> After the soft freeze, I'd like to change things so that we memcpy
>> only the part of the buffer that the string implementation knows is in
>> use. To that end, we should stop using the above pattern that is an
>> encapsulation violation.
>>
> What is then the point of SetCapacity anymore?

To avoid multiple allocations during a sequence of Append()s. (This is
documented on the header.)

>> For m-c, I've filed
>> https://bugzilla.mozilla.org/show_bug.cgi?id=1472113 and worked on
>> fixing the bugs that block it. For c-c, I've filed
>> https://bugzilla.mozilla.org/show_bug.cgi?id=1486706 but don't intend
>> to do the work of investigating or fixing the string usage in c-c.
>>
>> As for fixing the above pattern, there are two alternatives. The first one
>> is:
>>
>>   1) SetLength(newCapacity)
>>   2) BeginWriting()
>>   3) Truncate(actuallyWritten) (or SetLength(actuallyWritten), Truncate
>> simply tells to the reader that the string isn't being made longer)
>>
>> With this pattern, writing happens to the part of the buffer that the
>> string implementation believes to be in use. This has the downside
>> that the first SetLength() call (like, counter-intuitively,
>> SetCapacity() currently!) writes the zero terminator, which from the
>> point of view of CPU caches is an out-of-the-blue write that's not
>> part of a well-behaved forward-only linear write pattern and not
>> necessarily near recently-accessed locations.
>>
>> The second alternative is BulkWrite() in C++ and bulk_write() in Rust.
>
> The API doesn't seem to be too user friendly.

Zero-termination is sad. An API that doesn't zero-terminate eagerly

Extending the length of an XPCOM string when writing to it via a raw pointer

2018-08-30 Thread Henri Sivonen
We have the following pattern in our code base:

 1) SetCapacity(newCapacity) is called on an XPCOM string.
 2) A pointer obtained from BeginWriting() is used for writing more
than Length() but no more than newCapacity code units to the XPCOM
string.
 3) SetLength(actuallyWritten)  is called on the XPCOM string such
that actuallyWritten > Length() but actuallyWritten <= newCapacity.

This is an encapsulation violation, because the string implementation
doesn't know that the content past Length() is there. The caller
assumes that step #3 will not reallocate and will only write a
zero-terminator at actuallyWritten and set mLength to actuallyWritten.
(The pattern is common enough that clearly people have believed it to
be a valid pattern. However, I haven't seen any in-tree or on-wiki
string documentation endorsing the pattern.)

It should be non-controversial that this is an encapsulation
violation, but does the encapsulation violation matter? It matters if
we want SetLength() to be able to conserve memory by allocating a
smaller buffer when actuallyWritten code units would fit in a smaller
mozjemalloc bucket. In order for the above pattern to work if
SetLength() can reallocate in such case, SetLength() has to memcpy the
whole buffer in case someone has written content that the string
implementation is unaware of instead of just memcpying the part of the
buffer that the string implementation knows to be in use. Pessimizing
the number of code units to memcpy is bad for performance.

It's unclear if trying to use a smaller mozjemalloc bucket is a
worthwhile thing. It obviously is for large long-lived strings and it
obviously isn't for short-lived strings even if they are large.
SetLength() doesn't know what the future holds for the string. :-( But
in any case, it's bad if we can't make changes that are sound from the
perspective of the string encapsulation, because we have other code
violating the encapsulation.

After the soft freeze, I'd like to change things so that we memcpy
only the part of the buffer that the string implementation knows is in
use. To that end, we should stop using the above pattern that is an
encapsulation violation.

For m-c, I've filed
https://bugzilla.mozilla.org/show_bug.cgi?id=1472113 and worked on
fixing the bugs that block it. For c-c, I've filed
https://bugzilla.mozilla.org/show_bug.cgi?id=1486706 but don't intend
to do the work of investigating or fixing the string usage in c-c.

As for fixing the above pattern, there are two alternatives. The first one is:

 1) SetLength(newCapacity)
 2) BeginWriting()
 3) Truncate(actuallyWritten) (or SetLength(actuallyWritten); Truncate
simply tells the reader that the string isn't being made longer)

With this pattern, writing happens to the part of the buffer that the
string implementation believes to be in use. This has the downside
that the first SetLength() call (like, counter-intuitively,
SetCapacity() currently!) writes the zero terminator, which from the
point of view of CPU caches is an out-of-the-blue write that's not
part of a well-behaved forward-only linear write pattern and not
necessarily near recently-accessed locations.

The second alternative is BulkWrite() in C++ and bulk_write() in Rust.
This is new API that is well-behaved in terms of the cache access
pattern and is also more versatile in the sense that it lets the
caller know how newCapacity was rounded up, which is relevant to
callers that ask for best-case capacity and then ask more capacity if
there turns out to be more to write. When the caller is made aware of
the rounding, a second request for added capacity can be avoided if
the amount that actually needs to be written exceeds the best case
estimate but fits within the rounded-up capacity.
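The shape of such a bulk write can be sketched with Rust's standard
Vec (a hedged analogue of the idea, not the actual
BulkWrite()/bulk_write() API): reserve capacity, write forward-only
into the uninitialized tail, then commit the actually-written length
once.

```rust
fn fill(buf: &mut Vec<u8>, src: &[u8]) {
    buf.clear();
    buf.reserve(src.len());
    // Forward-only linear writes into the uninitialized tail; no
    // zero terminator or length update until the write is complete.
    for (dst, &b) in buf.spare_capacity_mut().iter_mut().zip(src) {
        dst.write(b);
    }
    // SAFETY: exactly src.len() elements were just initialized above.
    unsafe { buf.set_len(src.len()) };
}

fn main() {
    let mut v = Vec::new();
    fill(&mut v, b"hello");
    assert_eq!(v, b"hello");
}
```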

In Rust, bulk_write() is rather nicely misuse-resistant. However, on
the C++ side the lack of a borrow checker as well as mozilla::Result
not working with move-only types
(https://bugzilla.mozilla.org/show_bug.cgi?id=1418624) pushes more
things to documentation. The documentation can be found at
https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Guide/Internal_strings#Bulk_Write
https://searchfox.org/mozilla-central/rev/2fe43133dbc774dda668d4597174d73e3969181a/xpcom/string/nsTSubstring.h#1190
https://searchfox.org/mozilla-central/rev/2fe43133dbc774dda668d4597174d73e3969181a/xpcom/string/nsTSubstring.h#32

P.S. GetMutableData() is redundant with BeginWriting() and
SetLength(). It's used very rarely and I'd like to remove it, so
please don't use GetMutableData().

-- 
Henri Sivonen
hsivo...@mozilla.com
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Dead-code removal of unused Rust FFI exports

2018-08-29 Thread Henri Sivonen
On Tue, Aug 28, 2018 at 4:56 PM, Till Schneidereit
 wrote:
> On Tue, Aug 28, 2018 at 3:20 PM Mike Hommey  wrote:
>> We don't LTO across languages on any platform yet. Rust is LTOed on all
>> platforms, which removes a bunch of its symbols. Everything that is
>> exposed for C/C++ from Rust, though, is left alone. That's likely to
>> stay true even with cross-language LTO, because as far as the linker is
>> concerned, those FFI symbols might be used by code that link against
>> libxul, so it would still export them. We'd essentially need the
>> equivalent to -fvisibility=hidden for Rust for that to stop being true.

Exporting Rust FFI symbols from libxul seems bad not just for binary
size but also in terms of giving contact surface to invasive
third-party Windows software. Do we have a bug on file tracking the
hiding of FFI symbols from the outside of libxul?

-- 
Henri Sivonen
hsivo...@mozilla.com


Dead-code removal of unused Rust FFI exports

2018-08-28 Thread Henri Sivonen
Does some lld mechanism successfully remove dead code when gkrust
exports some FFI function that the rest of Gecko never ends up
calling?

I.e. in terms of code size, is it OK to vendor an FFI-exposing Rust
crate where not every FFI function is used (at least right away)?

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Please don't use functions from ctype.h and strings.h

2018-08-27 Thread Henri Sivonen
I think it's worthwhile to have a lint, but regexps are likely to have
false positives, so using clang-tidy is probably better.

A bug is on file: https://bugzilla.mozilla.org/show_bug.cgi?id=1485588

On Mon, Aug 27, 2018 at 4:06 PM, Tom Ritter  wrote:
> Is this something worth making a lint over?  It's pretty easy to make
> regex-based lints, e.g.
>
> yml-only based lint:
> https://searchfox.org/mozilla-central/source/tools/lint/cpp-virtual-final.yml
>
> yml+python for slightly more complicated regexing:
> https://searchfox.org/mozilla-central/source/tools/lint/mingw-capitalization.yml
> https://searchfox.org/mozilla-central/source/tools/lint/cpp/mingw-capitalization.py
>
> -tom
>
> On Mon, Aug 27, 2018 at 7:04 AM, Henri Sivonen  wrote:
>>
>> Please don't use the functions from ctype.h and strings.h.
>>
>> See:
>> https://daniel.haxx.se/blog/2018/01/30/isalnum-is-not-my-friend/
>> https://daniel.haxx.se/blog/2008/10/15/strcasecmp-in-turkish/
>>
>> https://stackoverflow.com/questions/2898228/can-isdigit-legitimately-be-locale-dependent-in-c
>>
>> In addition to these being locale-sensitive, the functions from
>> ctype.h are defined to take (signed) int with the value space of
>> *unsigned* char or EOF and other argument values are Undefined
>> Behavior. Therefore, on platforms where char is signed, passing a char
>> sign-extends to int and invokes UB if the most-significant bit of the
>> char was set! Bug filed 15 years ago!
>> https://bugzilla.mozilla.org/show_bug.cgi?id=216952 (I'm not aware of
>> implementations doing anything surprising with this UB but there
>> exists precedent for *compiler* writers looking at the standard
>> *library* UB language and taking calls into standard library functions
>> as optimization-guiding assertions about the values of their
>> arguments, so better not risk it.)
>>
>> For isfoo(), please use mozilla::IsAsciiFoo() from mozilla/TextUtils.h.
>>
>> For tolower() and toupper(), please use ToLowerCaseASCII() and
>> ToUpperCaseASCII() from nsUnicharUtils.h
>>
>> For strcasecmp() and strncasecmp(), please use their nsCRT::-prefixed
>> versions from nsCRT.h.
>>
>> (Ideally, we should scrub these from vendored C code, too, since being
>> in third-party code doesn't really make the above problems go away.)
>>
>> --
>> Henri Sivonen
>> hsivo...@mozilla.com
>
>



-- 
Henri Sivonen
hsivo...@mozilla.com


Please don't use functions from ctype.h and strings.h

2018-08-27 Thread Henri Sivonen
Please don't use the functions from ctype.h and strings.h.

See:
https://daniel.haxx.se/blog/2018/01/30/isalnum-is-not-my-friend/
https://daniel.haxx.se/blog/2008/10/15/strcasecmp-in-turkish/
https://stackoverflow.com/questions/2898228/can-isdigit-legitimately-be-locale-dependent-in-c

In addition to these being locale-sensitive, the functions from
ctype.h are defined to take (signed) int with the value space of
*unsigned* char or EOF and other argument values are Undefined
Behavior. Therefore, on platforms where char is signed, passing a char
sign-extends to int and invokes UB if the most-significant bit of the
char was set! Bug filed 15 years ago!
https://bugzilla.mozilla.org/show_bug.cgi?id=216952 (I'm not aware of
implementations doing anything surprising with this UB but there
exists precedent for *compiler* writers looking at the standard
*library* UB language and taking calls into standard library functions
as optimization-guiding assertions about the values of their
arguments, so better not risk it.)

For isfoo(), please use mozilla::IsAsciiFoo() from mozilla/TextUtils.h.

For tolower() and toupper(), please use ToLowerCaseASCII() and
ToUpperCaseASCII() from nsUnicharUtils.h

For strcasecmp() and strncasecmp(), please use their nsCRT::-prefixed
versions from nsCRT.h.

(Ideally, we should scrub these from vendored C code, too, since being
in third-party code doesn't really make the above problems go away.)
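The Gecko helpers above aren't available outside the tree, but Rust's
standard library exhibits the same locale-independent, UB-free ASCII
semantics; this sketch is purely illustrative of why the ASCII-only
functions are the right default:

```rust
fn main() {
    // ASCII-only classification is locale-independent by construction.
    assert!('i'.is_ascii_alphabetic());
    assert!(!'ı'.is_ascii_alphabetic()); // Turkish dotless i: not ASCII

    // Case-insensitive ASCII comparison without strcasecmp's locale surprises.
    assert!("Content-Type".eq_ignore_ascii_case("content-type"));

    // A byte with the high bit set is classified without any UB hazard,
    // unlike passing a negative char to the ctype.h functions.
    assert!(!0xE4u8.is_ascii_alphanumeric());
}
```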

-- 
Henri Sivonen
hsivo...@mozilla.com


Changes in XPCOM string encoding conversions

2018-08-16 Thread Henri Sivonen
I've made changes to encoding conversions between XPCOM strings. Here
are the important bits of info:

 * The conversions are now generally faster, so processing text as
UTF-8 should be considered more appropriate than before even if there
exists a case where the data needs to be passed to a UTF-16 consumer.

 * There are now faster paths in Rust for appending or assigning &str
to nsAString. If you have an &str, please use the *_str() methods
instead of the *_utf8() methods on nsAString in Rust. (That is, the
*_str() methods make use of the knowledge that the input is guaranteed
to be valid UTF-8.)

 * I'd like to make the UTF-16 to Latin1 (the function names say ASCII
instead of Latin1 for legacy reasons) conversion assert in debug
builds if the input isn't in the Latin1 range, so if you have a
mostly-ASCII UTF-16 string that you want to printf for logging and
don't care about what happens to non-ASCII, please convert to UTF-8 as
your "I don't care about non-ASCII" conversion.

 * The conversions between UTF-16 and UTF-8 in both directions now
implement spec-compliant REPLACEMENT CHARACTER generation if the
input isn't valid UTF-16 or UTF-8. Previously, the output got
truncated. This is not to
say that it's now OK to be less diligent about UTF validity but to say
that you can't rely on the old truncation behavior.
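Rust's standard library exhibits the same spec-compliant behavior and
can serve as a reference for what the conversions now do (illustrative
only; the XPCOM entry points themselves are C++):

```rust
fn main() {
    // An unpaired surrogate (0xD800) is invalid UTF-16; a spec-compliant
    // conversion replaces it with U+FFFD rather than truncating the output.
    let utf16 = [0x0061, 0xD800, 0x0062]; // "a", lone surrogate, "b"
    assert_eq!(String::from_utf16_lossy(&utf16), "a\u{FFFD}b");

    // Likewise for an invalid byte in the UTF-8 -> UTF-16 direction.
    let bytes = [0x61, 0xFF, 0x62];
    assert_eq!(String::from_utf8_lossy(&bytes), "a\u{FFFD}b");
}
```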

 * There are now conversions between UTF-8 and Latin1 to allow for
more efficient interaction with UTF-8 and SpiderMonkey strings and DOM
text nodes going forward.

 * The conversions no longer accept zero-terminated C-style strings.
The cost of strlen() is now made visible to the caller by requiring
the caller to wrap C-style strings with mozilla::MakeStringSpan().
(Please avoid C-style strings. Strings that know their length are
nice. This change wasn't made in order to make the use of C-style
strings hard, though, but in order to avoid clang complaining about
ambiguous overloads.)
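Rust makes the same cost explicit: wrapping zero-terminated bytes
requires an up-front scan, whereas a &str already carries its length.
This is a standard-library analogue of the MakeStringSpan() idea, not
the XPCOM API:

```rust
fn main() {
    // Validating zero-terminated bytes scans for the terminator up front
    // (with a raw pointer, CStr::from_ptr does the strlen-like scan);
    // either way, the cost is visible at the call site, not hidden.
    let c = std::ffi::CStr::from_bytes_with_nul(b"hello\0").unwrap();
    assert_eq!(c.to_bytes().len(), 5);

    // A &str already knows its length: no scan needed.
    assert_eq!("hello".len(), 5);
}
```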

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Rust crate approval

2018-07-05 Thread Henri Sivonen
On Wed, Jun 27, 2018 at 5:02 AM, Adam Gashlin  wrote:
> * Already vendored crates
> Can I assume any crates we have already in mozilla-central are ok to use?

Seems like a reasonable assumption.

> * Updates
> I need winapi 0.3.5 for BITS support, currently third_party/rust/winapi is
> 0.3.4. There should be no problem updating it, but should I have this
> reviewed by the folks who originally vendored it into mozilla-central?

In my opinion, it should be enough for _someone_ qualified to review
code in the area of Windows integration to review the diff.

> * New crates
> I'd like to use the windows-service crate, which seems well written and has
> few dependencies, but the first 0.1.0 release was just a few weeks ago. I'd
> like to have that reviewed at least as carefully as my own code,
> particularly given how much unsafety there is, but where do I draw the
> line? For instance, it depends on "widestring", which is small and has been
> around for a while but isn't widely used, should I have that reviewed
> internally as well? Is popularity a reasonable measure?

In principle, all code landing in m-c needs to be reviewed, but
sometimes the reviewer may rubber-stamp code instead of truly
reviewing it carefully. All the newly-vendored code should be part of
the review request and then it's up to the reviewer to decide if it's
appropriate to look at some code less carefully because there are
other indicators of quality.

As for widestring specifically, a cursory look at the code suggests
that it's a quality crate and should have no trouble passing review.
It is also small enough that it should be actually feasible to review
it instead of rubber-stamping it.

(For Mozilla-developed code that is on a performance-sensitive path,
there exists encoding_rs::mem (particularly
https://docs.rs/encoding_rs/0.8.4/encoding_rs/mem/fn.convert_str_to_utf16.html
and 
https://docs.rs/encoding_rs/0.8.4/encoding_rs/mem/fn.convert_utf16_to_str.html),
which doesn't provide the ergonomics that widestring provides but
provides faster (SIMD-accelerated on our tier-1 CPU architectures and
aarch64, which is on path to tier-1) conversions for long (16 code
units or longer) strings containing mostly ASCII code points. An
update service probably isn't performance-sensitive in this way. I'm
mentioning this to generate awareness generally on the topic of UTF-16
conversions in m-c Rust code.)
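For orientation, the standard-library baseline for such conversions
looks like the following; encoding_rs::mem provides the accelerated
equivalents, while this sketch uses only std:

```rust
fn main() {
    // UTF-8 -> UTF-16 and back with the standard library; encoding_rs::mem
    // offers SIMD-accelerated versions for long, mostly-ASCII inputs.
    let utf16: Vec<u16> = "naïve".encode_utf16().collect();
    assert_eq!(utf16.len(), 5); // ï is a single code unit in UTF-16
    assert_eq!(String::from_utf16(&utf16).unwrap(), "naïve");
}
```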

-- 
Henri Sivonen
hsivo...@mozilla.com


Re: Update on rustc/clang goodness

2018-06-04 Thread Henri Sivonen
On Wed, May 30, 2018 at 5:16 PM, Mike Hommey  wrote:
> On Wed, May 30, 2018 at 02:40:01PM +0300, Henri Sivonen wrote:
>> The Linux distro case is
>> trickier than Mozilla's compiler choice. For CPUs that are tier-3 for
>> Mozilla, we already tolerate less great performance attributes in
>> order to enable availability, so distros keeping using GCC for tier-3
>> probably isn't a problem. x86_64 could be a problem, though. If
>> Firefox's performance becomes significantly dependent on having
>> cross-language inlining, and I expect it will, having a substantial
>> portion of the user base run without it while thinking they have a
>> top-tier build could be bad. I hope we can get x86_64 Linux distros to
>> track our compiler configuration closely.
>
> That part might end up more difficult than one could expect.
> Cross-language inlining is going to require rustc and clang having a
> compatible llvm ir, and that's pretty much guaranteed to be a problem,
> even for Mozilla.

I thought the rustc codebase supported building with unpatched LLVM in
order to let distros maintain one copy of LLVM source (if not .so). Is
that not the case?

Why couldn't Mozilla build clang with Rust's LLVM fork and use that
for building releases? (And move Rust's fork forward as needed.)

(Also, what Ehsan said about IR compat suggests these might not even
need to be as closely synced.)

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: Update on rustc/clang goodness

2018-05-30 Thread Henri Sivonen
On Wed, May 30, 2018 at 8:06 AM, Dave Townsend  wrote:
> On Tue, May 29, 2018 at 10:03 PM Jeff Gilbert  wrote:
>> I get that, but it reminds me of the reasons people give for "our
>> website works best in $browser".
>
> I was concerned by this too but found myself swayed by the arguments in
> https://blog.mozilla.org/nfroyd/2018/05/29/when-implementation-monoculture-right-thing/
> and in particular the first comment there.

Indeed, the first comment there (by roc) gets to the point.

Additionally, the reasons for not supporting multiple browsers tend to
be closer to the "didn't bother" kind whereas we're looking to get a
substantial benefit from clang that MSVC and GCC don't offer to us but
clang likely will: Cross-language inlining across code compiled with
clang and code compiled with rustc.

To the extent Mozilla runs the compiler, it makes sense to go for the
Open Source choice that allows us to deliver better on "performance as
a feature". We still have at least one static analysis running on GCC,
so I wouldn't expect GCC-compatibility to be dropped even if the app
wouldn't be "best compiled with" GCC. The Linux distro case is
trickier than Mozilla's compiler choice. For CPUs that are tier-3 for
Mozilla, we already tolerate less great performance attributes in
order to enable availability, so distros keeping using GCC for tier-3
probably isn't a problem. x86_64 could be a problem, though. If
Firefox's performance becomes significantly dependent on having
cross-language inlining, and I expect it will, having a substantial
portion of the user base run without it while thinking they have a
top-tier build could be bad. I hope we can get x86_64 Linux distros to
track our compiler configuration closely.

I do feel bad for the GCC devs, but it's worth noting that this is a
result of a deliberate decision not to modularize GCC for licensing
strategy reasons, while LLVM has been designed modularly, with solid
technical reasons behind the demand for that. The modularity meant it made
more sense to build rustc on LLVM than on GCC and now that technical
design leads to better synergy with clang.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: License of test data?

2018-05-18 Thread Henri Sivonen
On Thu, May 17, 2018 at 8:31 PM, mhoye <mh...@mozilla.com> wrote:
> Well, more than a day or two. The MIT license is fine to include, and we
> have a pile of MIT-licensed code in-tree already.
>
> Other already-in-tree MPL-2.0 compatible licenses - the "just do it" set,
> basically - include Apache 2.0, BSD 2- and 3-clause, LGPL 2.1 and 3.0, GPL
> 3.0 and the Unicode Consortium's ICU.

Does "just do it" imply that it's now OK to import that stuff without
an analog of the previous r+ from Gerv?

> For anything not on that list a legal bug is def. the next step.

For test files, i.e. stuff that doesn't get linked into libxul, we
also have precedent for the MPL-incompatible CC-by and CC-by-sa. I
hope we can add these to the above list.

On Fri, May 18, 2018 at 12:33 AM, Mike Hommey <m...@glandium.org> wrote:
> The above list is for tests. For things that go in Firefox, it's more
> complicated. LGPL have requirements that makes us have to put all LGPL
> libraries in a separate dynamic library (liblgpllibs), and GPL can't be
> used at all.

For stuff that goes into Firefox, MIT and BSD (and, I'm guessing,
Apache with NOTICE file) involve editing
https://searchfox.org/mozilla-central/source/toolkit/content/license.html
, too.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: Proposed W3C Charter: JSON-LD Working Group

2018-04-29 Thread Henri Sivonen
On Sun, Apr 29, 2018, 19:35 L. David Baron  wrote:

> OK, here's a draft of an explicit abtension that I can submit later
> today.  Does this seem reasonable?
>

This looks good to me. Thank you.



>
> One concern that we've had over the past few years about JSON-LD
> is that some people have been advocating that formats adopt
> JSON-LD semantics, but at the same time allow processing as
> regular JSON, as a way to make the adoption of JSON-LD
> lighter-weight for producers and consumers who (like us) don't
> want to have to implement full JSON-LD semantics.  This yields a
> format with two classes of processors that will produce different
> results on many inputs, which is bad for interoperability.  And
> full JSON-LD implementation is often more complexity than needed
> for both producers and consumers of content.  We don't want
> people who produce Web sites or maintain Web browsers to have to
> deal with this complexity.  For more details on this issue, see
> https://hsivonen.fi/no-json-ns/ .
>
> This leads us to be concerned about the Coordination section in
> the charter, which suggests that some W3C Groups that we are
> involved in or may end up implementing the work of (Web of
> Things, Publishing) will use JSON-LD.  We would prefer that the
> work of these groups not use JSON-LD for the reasons described
> above, but this charter seems to imply that they will.
>
> While in general we support doing maintenance (and thus aren't
> objecting), we're also concerned that the charter is quite
> open-ended about what new features will be included (e.g.,
> referring to "requests for new features" and "take into account
> new features and desired simplifications that have become
> apparent since its publication").  As the guidance in
> https://www.w3.org/Guide/standards-track/ suggests, new features
> should be limited to those already incubated in the CG.  (If we
> were planning to implement, we might be objecting on these
> grounds.)
>
>
> -David
>
> --
> 𝄞   L. David Baron http://dbaron.org/   𝄂
> 𝄢   Mozilla  https://www.mozilla.org/   𝄂
>  Before I built a wall I'd ask to know
>  What I was walling in or walling out,
>  And to whom I was like to give offense.
>- Robert Frost, Mending Wall (1914)
>


Re: Proposed W3C Charter: JSON-LD Working Group

2018-04-27 Thread Henri Sivonen
On Fri, Apr 27, 2018 at 1:04 AM, L. David Baron <dba...@dbaron.org> wrote:
> The W3C is proposing a charter for:
>
>   JSON-LD Working Group
>   https://www.w3.org/2018/03/jsonld-wg-charter.html
>   https://lists.w3.org/Archives/Public/public-new-work/2018Mar/0004.html
>
> Mozilla has the opportunity to send comments or objections through
> Sunday, April 29.  (Sorry for failing to send this out sooner!)
>
> This is a charter to produce JSON-LD 1.1 (A JSON-based Serialization
> for Linked Data), which is a revision of JSON-LD 1.0, which was
> developed under the now-closed RDF Working Group.  The
> specifications proposed in a charter have been developed in a
> community group.
>
> Please reply to this thread if you think there's something we should
> say as part of this charter review, or if you think we should
> support or oppose it.

JSON-LD's compatibility with the RDF data model and the fundamental
principle that identifiers expand to URIs means that JSON-LD
perpetuates the fundamental ergonomic problem of RDF. All
serializations of RDF, JSON-LD included, try to take steps to
alleviate the visibility of the problem on the syntax level but none
can git rid of the problem on the data model layer (since they
subscribe to the fundamental principle that is the source of the
problem and don't break data model compatibility). Thus, code that
processes the formats either has to be unergonomic or incorrect. It's
bad to have specs that promote unergonomic or incorrect
implementations and especially the mutually-incompatible co-existence
of the two.

See https://hsivonen.fi/no-json-ns/ for an elaboration--especially the
epilog. JSON-LD evangelism, including the slides linked to from the
charter (slide 3), tends to be about selling the format by claiming
that developers can ignore the RDF/URI stuff (i.e. write code that's
incorrect in terms of the full processing model). The very last
section of https://hsivonen.fi/no-json-ns/ addresses this based on
experience from Web-scale formats.

For this reason, I think we should resist introducing dependencies on
JSON-LD in formats and APIs that are relevant to the Web Platform. I
think it follows that we should not support this charter. I expect
this charter to pass in any case, so I'm not sure us saying something
changes anything, but it might still be worth a try to register
displeasure about the prospect of JSON-LD coming into contact with
stuff that Web engines or people who make Web apps or sites need to
deal with and to register displeasure with designing formats whose
full processing model differs from how the format is evangelized to
developers (previously: serving XHTML as text/html while pretending to
get benefits of the XML processing model that way).

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: Default Rust optimization level decreased from 2 to 1

2018-04-26 Thread Henri Sivonen
On Wed, Apr 25, 2018 at 7:11 PM, Gregory Szorc <gsz...@mozilla.com> wrote:
> The build peers have long thought about adding the concept of “build 
> profiles” to the build system. Think of them as in-tree mozconfigs for 
> common, supported scenarios.

This would be good to have. It would also help if such mozconfigs had
comprehensive comments explaining how our release builds differ from
tooling defaults especially now that we have areas of code that are
developed outside a full Firefox build. For example, I A/B tested code
for performance using cargo bench and mistakenly thought that cargo's
"release" mode meant the same thing as Firefox "release" mode. I only
later learned that I had developed with opt-level=3 (cargo's default
for "release") and we ship with opt-level=2.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: Intent To Require Manifests For Vendored Code In mozilla-central

2018-04-10 Thread Henri Sivonen
On Tue, Apr 10, 2018 at 7:33 AM, Byron Jones <g...@mozilla.com> wrote:
> glob wrote:
>>
>> The plan is to create a YAML file for each library containing metadata
>> such as the homepage url, vendored version, bugzilla component, etc. See
>> https://goo.gl/QZyz4xfor the full specification.
>
>
> this should be: https://goo.gl/QZyz4x for the full specification.

This proposal makes sense to me when it comes to libraries that are
not vendored from crates.io. However, this seems very heavyweight and
only adds the Bugzilla metadata for crates.io crates. It seems to me
that declaring the Bugzilla component isn't worth the trouble of
having another metadata file in addition to Cargo.toml.

Additonally, the examples suggest that this invents new ad hoc license
identifiers. I suggest we not do that but instead use
https://spdx.org/licenses/ and have a script to enforce that bogus
values don't creep in.
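A hedged sketch of such an enforcement check follows; the allowlist
contents here are hypothetical, and a real script would use the full
SPDX license list:

```rust
// Hypothetical allowlist of SPDX identifiers; a vendoring lint could
// reject any manifest whose license field isn't on it.
fn is_known_spdx(id: &str) -> bool {
    const ALLOWED: &[&str] = &["MIT", "Apache-2.0", "BSD-3-Clause", "MPL-2.0"];
    ALLOWED.contains(&id)
}

fn main() {
    assert!(is_known_spdx("MPL-2.0"));
    assert!(!is_known_spdx("my-custom-license")); // bogus value rejected
}
```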

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: Is realloc() between bucket sizes worthwhile with jemalloc?

2018-04-10 Thread Henri Sivonen
On Mon, Apr 9, 2018 at 10:30 PM, Eric Rahm <er...@mozilla.com> wrote:
>> Upon superficial code reading, it seems to me that currently changing
>> the capacity of an nsA[C]STring might uselessly use realloc to copy
>> data that's not semantically live data from the string's point of view
>> and wouldn't really need to be preserved. Have I actually discovered
>> useless copying or am I misunderstanding?
>
>
> In this case I think you're right. In the string code we use a doubling
> strategy up to 8MiB so they'll always be in a new bucket/chunk. After 8MiB
> we grow by 1.125 [2], but always round up to the nearest MiB. Our
> HugeRealloc logic always makes a new allocation if the difference is greater
> than or equal to 1MiB [3] so that's always going to get hit. I should note
> that on OSX we use some sort of 'pages_copy' when the realloc is large
> enough, this is probably more efficient than memcpy.

Thanks. Being able to avoid useless copying for most strings probably
outweighs the loss of the pages_copy optimization for huge strings on
Mac.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Re: Editing a vendored crate for a try push

2018-04-10 Thread Henri Sivonen
On Mon, Apr 9, 2018 at 10:32 PM,  <twisniew...@mozilla.com> wrote:
> On Monday, April 9, 2018 at 11:39:35 AM UTC-4, Henri Sivonen wrote:
>> How do I waive .cargo-checksum.json checking for a crate?
>
> In bug 1449613 (part 12) I just hand-edited the .cargo-checksum.json in 
> question, and updated the sha256 values for the modified files. That was 
> enough to get try runs going (though the debug builds do show related 
> failures, they still built and ran my tests).

This is what I've done, but it shouldn't have to be like this.

On Mon, Apr 9, 2018 at 8:48 PM, Andreas Tolfsen <a...@sny.no> wrote:
> I don’t know the exact answer to your question, but it might be
> possible to temporarily depend on the crate by a its path?
>
> There are a couple of examples of this in
> https://searchfox.org/mozilla-central/source/testing/geckodriver/Cargo.toml.
>
> Remember to run "cargo update -p ".

This seems more complicated than editing the crate in place and
manually editing the sha256 values.

What I'm looking for is a simple way to skip the sha256 editing.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Is realloc() between bucket sizes worthwhile with jemalloc?

2018-04-09 Thread Henri Sivonen
My understanding is that under some "huge" size, jemalloc returns
allocations from particularly-sized buckets.

This makes me expect that realloc() between bucket sizes is going to
always copy the data instead of just adjusting allocated metadata,
because to do otherwise would mess up the bucketing.

Is this so? Specifically, is it actually useful that nsStringBuffer
uses realloc() as opposed to malloc(), memcpy() with actually
semantically filled amount and free()?

Upon superficial code reading, it seems to me that currently changing
the capacity of an nsA[C]STring might uselessly use realloc to copy
data that's not semantically live data from the string's point of view
and wouldn't really need to be preserved. Have I actually discovered
useless copying or am I misunderstanding?
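The bucketing intuition can be sketched with a toy size-class function
(hypothetical power-of-two rounding; real jemalloc classes are spaced
more finely):

```rust
// Toy size-class rounding: requests round up to a power of two, with a
// 16-byte minimum. Real jemalloc uses a finer class spacing, but the
// consequence is the same: a realloc whose old and new sizes round to
// the same class can reuse the allocation, while crossing classes
// forces a copy of the data.
fn size_class(request: usize) -> usize {
    request.next_power_of_two().max(16)
}

fn main() {
    assert_eq!(size_class(17), 32);
    assert_eq!(size_class(30), 32); // same class: growth could be in place
    assert_eq!(size_class(33), 64); // new class: data must be copied
}
```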

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/


Editing a vendored crate for a try push

2018-04-09 Thread Henri Sivonen
What's the current status of tooling for editing vendored crates for
local testing and try pushes?

It looks like our toml setup is too complex for cargo edit-locally to
handle (or, alternatively, I'm holding it wrong). It also seems that
mach edit-crate never happened.

How do I waive .cargo-checksum.json checking for a crate?

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/

