Thanks, Jon. 

I’m not sure what the savings from splitting the Map would be, but I’d rather 
go with this version given the time constraints.

I can do some experiments tomorrow to see whether the splitting is worth doing, 
and file an issue if it is.

Hannes
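
For reference, a minimal sketch of the two-table split discussed in this 
thread. The class name SplitEntityTable, the map names, and the three sample 
entries are hypothetical and are not taken from the webrev. A Java char is a 
single UTF-16 code unit, so this sketch assumes that any value needing more 
than one char (including code points outside the BMP) goes into the String 
table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the split representation; not the code under review.
final class SplitEntityTable {
    // Entities whose value fits in a single UTF-16 code unit (boxed as Character).
    private static final Map<String, Character> SINGLE = new HashMap<>();
    // Entities whose value needs more than one char.
    private static final Map<String, String> MULTI = new HashMap<>();

    static {
        // Tiny illustrative excerpt; a real table would hold every HTML 5 entity.
        SINGLE.put("amp", '&');
        SINGLE.put("lt", '<');
        MULTI.put("fjlig", "fj");
    }

    // Returns the entity's value as a String, or null if the name is unknown.
    static String getValue(String name) {
        Character c = SINGLE.get(name);
        if (c != null) {
            return String.valueOf(c);
        }
        return MULTI.get(name);
    }
}
```

Whether this saves enough space to justify the second lookup is exactly what 
the experiments mentioned above would have to show.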



> On 12.06.2019 at 22:40, Jonathan Gibbons <[email protected]> wrote:
> 
> Hannes,
> 
> Since time is growing short, this version is OK/approved if you do not think 
> it worthwhile to compress the space.
> 
> -- Jon
> 
> On 6/12/19 1:33 PM, Jonathan Gibbons wrote:
>> Hannes,
>> 
>> A more compact representation would be two tables, one for single-character 
>> entities and the other for multi-character entities?
>> 
>> Is that worth considering? I guess that until we have value types, we would 
>> still have to box the single-character ones, but a Character should still be 
>> smaller than a String, right?
>> 
>> -- Jon
>> 
>> On 6/12/19 1:10 PM, Hannes Wallnöfer wrote:
>>> Please review:
>>> 
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8225671
>>> Webrev: http://cr.openjdk.java.net/~hannesw/8225671/webrev.00/
>>> 
>>> This is the second attempt at supporting HTML 5 entities after JDK-8222318 
>>> had to be reverted.
>>> 
>>> Fortunately, I didn’t have to keep the HTML 4 entities around after all, as I 
>>> had assumed; I was just thoroughly confused by the test output.
>>> 
>>> Given the huge increase in number of entities I decided to switch from an 
>>> enum to a plain class with a static Map. Entity values are now stored as 
>>> strings since some entities require dual codepoints. Also, we do not need 
>>> to use the reverse table anymore for lookup of numeric entities, as HTML 5 
>>> has a concise definition of valid numeric entities [1].
>>> 
>>> [1]: https://www.w3.org/TR/html52/syntax.html#character-references
>>> 
>>> I updated the test with entities from all relevant groups (new valid named 
>>> and numeric entities, invalid entities from control characters, surrogates, 
>>> and non-characters). I also tested these manually using the W3 HTML 
>>> validator [2]. Mach4 tier 1 tests also pass.
>>> 
>>> [2]: https://validator.w3.org/
>>> 
>>> Hannes
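
The design described in the original review request (a plain class with a 
static Map of String values, plus a rule-based check for numeric entities 
instead of a reverse table) could be sketched roughly as follows. This is not 
the code from the webrev: the class name EntitySketch, the method names, and 
the sample entries are hypothetical, and the validity check is one reading of 
the character reference rules in the HTML 5.2 section cited as [1].

```java
import java.util.Map;

// Hypothetical sketch; names and entries are illustrative only.
final class EntitySketch {
    // Values are Strings because some entities expand to two code points.
    private static final Map<String, String> NAMES = Map.of(
            "amp", "&",
            "lt", "<",
            "gt", ">",
            "nbsp", "\u00A0",
            "fjlig", "fj");   // a two-character value

    // Returns the value of a named entity, or null if the name is unknown.
    static String getValue(String name) {
        return NAMES.get(name);
    }

    // Assumed rule set: reject U+0000, U+000D, surrogates, noncharacters,
    // and control characters other than space characters.
    static boolean isValidCodePoint(int cp) {
        if (cp <= 0 || cp > Character.MAX_CODE_POINT) return false;
        if (cp == 0x0D) return false;                     // carriage return
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogates
        if (cp >= 0xFDD0 && cp <= 0xFDEF) return false;   // noncharacters
        if ((cp & 0xFFFE) == 0xFFFE) return false;        // U+nFFFE and U+nFFFF
        boolean control = cp <= 0x1F || (cp >= 0x7F && cp <= 0x9F);
        boolean space = cp == 0x20 || cp == 0x09 || cp == 0x0A || cp == 0x0C;
        return !control || space;
    }
}
```

With a check like this, a numeric reference such as &#65; can be validated 
directly from its code point, without looking it up in a table of known values.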
