Thanks, Jon. I’m not sure what the savings from splitting the Map would be, but I’d rather go with this version given the time constraints.
I can do some experiments tomorrow to see if the splitting is worth doing, and file an issue in case it is.

Hannes

> On 12.06.2019 at 22:40, Jonathan Gibbons wrote:
>
> Hannes,
>
> Since time is growing short, this version is OK/approved if you do not think
> it worthwhile to compress the space any.
>
> -- Jon
>
> On 6/12/19 1:33 PM, Jonathan Gibbons wrote:
>> Hannes,
>>
>> A more compact representation would be two tables, one for single-character
>> entities and the other for multi-character entities?
>>
>> Is that worth considering? I guess that until we have value types, we would
>> still have to box the single-character ones, but a Character should still be
>> smaller than a String, right?
>>
>> -- Jon
>>
>> On 6/12/19 1:10 PM, Hannes Wallnöfer wrote:
>>> Please review:
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8225671
>>> Webrev: http://cr.openjdk.java.net/~hannesw/8225671/webrev.00/
>>>
>>> This is the second attempt at supporting HTML 5 entities after JDK-8222318
>>> had to be reverted.
>>>
>>> Fortunately I didn’t have to keep the HTML 4 entities around after all as I
>>> had assumed, I just got confused very thoroughly by the test output.
>>>
>>> Given the huge increase in number of entities I decided to switch from an
>>> enum to a plain class with a static Map. Entity values are now stored as
>>> strings since some entities require dual codepoints. Also, we do not need
>>> to use the reverse table anymore for lookup of numeric entities, as HTML 5
>>> has a concise definition of valid numeric entities [1].
>>>
>>> [1]: https://www.w3.org/TR/html52/syntax.html#character-references
>>>
>>> I updated the test with entities from all relevant groups (new valid named
>>> and numeric entities, invalid entities from control characters, surrogates,
>>> and non-characters). I also tested these manually using the W3 HTML
>>> validator [2]. Mach4 tier 1 tests also do pass.
>>>
>>> [2]: https://validator.w3.org/
>>>
>>> Hannes
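
For readers following the thread, here is a minimal sketch of the approach described in the review request: a plain class with a static Map of String values, plus a range check for numeric entities instead of a reverse table. It is not the code in the webrev. The class name EntitySketch, its method names, and the four sample entries are hypothetical, and the validity rules are one reading of the HTML 5.2 character-references section linked as [1].

```java
import java.util.Map;

final class EntitySketch {
    // Hypothetical sample entries; a real table would hold the full HTML 5 list.
    // Values are Strings because some entities expand to two code points.
    private static final Map<String, String> NAMES = Map.of(
            "amp", "&",
            "lt", "<",
            "gt", ">",
            "NotEqualTilde", "\u2242\u0338");

    // Returns the expansion of a named entity, or null if the name is unknown.
    static String getValue(String name) {
        return NAMES.get(name);
    }

    // Assumed reading of HTML 5.2: reject out-of-range values, surrogates,
    // noncharacters, and control characters other than tab, LF, FF and space.
    static boolean isValidCodePoint(int cp) {
        if (cp < 0 || cp > 0x10FFFF) return false;            // outside Unicode
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;       // surrogates
        if (cp >= 0xFDD0 && cp <= 0xFDEF) return false;       // noncharacters
        if ((cp & 0xFFFE) == 0xFFFE) return false;            // U+xxFFFE, U+xxFFFF
        if (cp == 0x09 || cp == 0x0A || cp == 0x0C || cp == 0x20) return true;
        if (cp <= 0x1F || (cp >= 0x7F && cp <= 0x9F)) return false; // controls
        return true;
    }

    // Returns the character(s) for a numeric entity, or null if it is invalid.
    static String numericValue(int cp) {
        return isValidCodePoint(cp) ? new String(Character.toChars(cp)) : null;
    }
}
```

Splitting the table as Jon suggests would mean a second map with Character values for the single-character entries; whether that saves a meaningful amount of space is exactly what the experiments mentioned above would have to show.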
