Hannes,
That's fine. I see you've pushed the changeset. :-)
For your future consideration, even if you don't split the map, you
could overload the put method with variants that take a char and a
String. Then, it just becomes an impl decision within the methods
whether to use one or two maps. Also, by using character constants
instead of String constants, you'll avoid interning the strings.
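
As a rough sketch of the overloading idea (the class name, method names, and sample entries below are hypothetical and not taken from the webrev), callers use a char or a String as appropriate, and the storage choice stays private to the class:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; not the class from the webrev.
final class EntityTable {
    // Implementation detail: a single map here, but this could be split
    // into two maps later without touching any put(...) call site.
    private static final Map<String, String> NAMES = new HashMap<>();

    // Variant for entities that map to a single char.
    private static void put(String name, char value) {
        NAMES.put(name, String.valueOf(value));
    }

    // Variant for entities that need more than one char.
    private static void put(String name, String value) {
        NAMES.put(name, value);
    }

    static {
        put("amp", '&');                          // char constant
        put("lt", '<');                           // char constant
        put("NotEqualTilde", "\u2242\u0338");     // two code points
    }

    // Returns the replacement text, or null if the name is unknown.
    static String lookup(String name) {
        return NAMES.get(name);
    }
}
```

Whether put(String, char) stores a Character in a second map or converts to a String, as above, is then a decision local to those two methods.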
-- Jon
On 06/12/2019 01:57 PM, Hannes Wallnöfer wrote:
Thanks, Jon.
I’m not sure what the savings from splitting the Map would be, but I’d rather
go with this version given the time constraints.
I can run some experiments tomorrow to see whether splitting is worthwhile, and
file an issue if it is.
Hannes
On 12.06.2019 at 22:40, Jonathan Gibbons <[email protected]> wrote:
Hannes,
Since time is growing short, this version is OK/approved if you do not think it
is worthwhile to compress the space any further.
-- Jon
On 6/12/19 1:33 PM, Jonathan Gibbons wrote:
Hannes,
A more compact representation would be two tables, one for single-character
entities and the other for multi-character entities.
Is that worth considering? I guess that until we have value types, we would
still have to box the single-character ones, but a Character should still be
smaller than a String, right?
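
A minimal sketch of the two-table layout, assuming hypothetical names and a handful of sample entries (none of this is from the webrev):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; not the class from the webrev.
final class SplitEntityTable {
    // Entities whose value is a single char, stored boxed as Character.
    private static final Map<String, Character> SINGLE = new HashMap<>();
    // Entities whose value needs more than one char.
    private static final Map<String, String> MULTI = new HashMap<>();

    static {
        SINGLE.put("amp", '&');
        SINGLE.put("gt", '>');
        MULTI.put("NotEqualTilde", "\u2242\u0338");
    }

    // Checks the single-character table first, then the multi-character one.
    // Returns null if the name is in neither table.
    static String lookup(String name) {
        Character c = SINGLE.get(name);
        if (c != null) {
            return c.toString();
        }
        return MULTI.get(name);
    }
}
```

Note that this lookup allocates a String for each single-character hit, so any space saved in the tables would have to be weighed against that cost.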
-- Jon
On 6/12/19 1:10 PM, Hannes Wallnöfer wrote:
Please review:
JBS: https://bugs.openjdk.java.net/browse/JDK-8225671
Webrev: http://cr.openjdk.java.net/~hannesw/8225671/webrev.00/
This is the second attempt at supporting HTML 5 entities after JDK-8222318 had
to be reverted.
Fortunately, I didn’t have to keep the HTML 4 entities around after all, as I had
assumed. I had just been thoroughly confused by the test output.
Given the huge increase in the number of entities, I decided to switch from an
enum to a plain class with a static Map. Entity values are now stored as strings
since some entities require two code points. Also, we no longer need the
reverse table to look up numeric entities, as HTML 5 has a concise
definition of valid numeric entities [1].
[1]: https://www.w3.org/TR/html52/syntax.html#character-references
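
For illustration, here is one reading of that definition as a standalone check. It is a sketch with hypothetical names, not the code in the webrev, and it assumes the rule is: any code point except U+0000, U+000D, surrogates, noncharacters, and control characters other than tab, line feed, and form feed.

```java
// Hypothetical sketch only; not the check from the webrev.
final class NumericEntityCheck {
    // Returns true if a numeric character reference may name this code point,
    // under the reading of the HTML 5 rule stated in the lead-in.
    static boolean isValid(int cp) {
        if (cp <= 0 || cp > Character.MAX_CODE_POINT) {
            return false;                       // U+0000 or out of range
        }
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return false;                       // surrogates
        }
        if (cp >= 0xFDD0 && cp <= 0xFDEF) {
            return false;                       // noncharacter block
        }
        if ((cp & 0xFFFE) == 0xFFFE) {
            return false;                       // U+nFFFE and U+nFFFF
        }
        boolean control = cp <= 0x1F || (cp >= 0x7F && cp <= 0x9F);
        boolean allowedWhitespace = cp == 0x09 || cp == 0x0A || cp == 0x0C;
        return !control || allowedWhitespace;   // U+000D is rejected here
    }
}
```

A check like this works directly on the numeric value, which is why no reverse lookup from code point to entity name is needed.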
I updated the test with entities from all relevant groups (new valid named and
numeric entities, invalid entities from control characters, surrogates, and
non-characters). I also tested these manually using the W3 HTML validator [2].
Mach4 tier 1 tests also pass.
[2]: https://validator.w3.org/
Hannes