On Wed, 22 Jun 2022 14:24:05 GMT, Daniel Jeliński <[email protected]> wrote:
> This PR improves the performance of deduplication done by
> ResourceBundleGenerator.
>
> The original implementation compared every pair of values, requiring O(n^2)
> time. The new implementation uses a HashMap to find duplicates, trading off
> some extra memory consumption for O(n) computational complexity. In practice
> the time to generate the jdk.localedata files on my Linux VM dropped from 14
> to 8 seconds.
>
> The resulting files (under build/support/gensrc/java.base and jdk.localedata)
> have different contents; map iteration order depends on the insertion order,
> and the insertion order of the new implementation is different from the
> original.
> The files generated before and after this change have the same size.
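For readers following along, the hash-based approach described above can be sketched as follows. This is a minimal illustration, not the code in the PR: the class name `DedupSketch`, the method `findDuplicates`, and the use of plain `String` values are assumptions made for the example. It does one pass over the map and one expected constant-time lookup per entry, instead of comparing every pair of values.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class DedupSketch {
    /**
     * Hypothetical helper: for every key whose value duplicates the value of
     * an earlier key, maps that key to the first key that held the value.
     * Values are assumed to be non-null.
     */
    static Map<String, String> findDuplicates(Map<String, String> map) {
        // value -> first key seen with that value
        Map<String, String> firstKeyForValue = new HashMap<>(map.size());
        Map<String, String> aliases = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : map.entrySet()) {
            // putIfAbsent returns the existing key if the value was seen before
            String previous = firstKeyForValue.putIfAbsent(e.getValue(), e.getKey());
            if (previous != null) {
                aliases.put(e.getKey(), previous);
            }
        }
        return aliases;
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("a", "x");
        input.put("b", "y");
        input.put("c", "x");
        System.out.println(findDuplicates(input)); // prints {c=a}
    }
}
```

The extra map is the memory cost mentioned in the description; the O(n) bound assumes the values have well-distributed hash codes.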
make/jdk/src/classes/build/tools/cldrconverter/ResourceBundleGenerator.java
line 146:
> 144: // generic reduction of duplicated values
> 145: Map<String, Object> newMap = new HashMap<>(map);
> 146: Map<BundleEntryValue, BundleEntryValue> dedup = new HashMap<>(map.size());
LinkedHashMap could be used to keep the iteration order equal to the
insertion order, or TreeMap if a deterministic sorted order is desirable.
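To illustrate the difference, here is a small sketch. The class and method names are made up for the example and are not part of ResourceBundleGenerator. LinkedHashMap iterates in insertion order, while TreeMap iterates in the natural (sorted) order of its keys regardless of how they were inserted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapOrderSketch {
    // Fills the given map with the keys in list order and returns the
    // order in which the map then iterates over them.
    private static List<String> iterationOrder(Map<String, Integer> map, List<String> keys) {
        for (int i = 0; i < keys.size(); i++) {
            map.put(keys.get(i), i);
        }
        return new ArrayList<>(map.keySet());
    }

    /** LinkedHashMap: iteration order is the insertion order. */
    static List<String> insertionOrder(List<String> keys) {
        return iterationOrder(new LinkedHashMap<>(), keys);
    }

    /** TreeMap: iteration order is the natural ordering of the keys. */
    static List<String> sortedOrder(List<String> keys) {
        return iterationOrder(new TreeMap<>(), keys);
    }

    public static void main(String[] args) {
        List<String> keys = List.of("zeta", "alpha", "mid");
        System.out.println(insertionOrder(keys)); // prints [zeta, alpha, mid]
        System.out.println(sortedOrder(keys));    // prints [alpha, mid, zeta]
    }
}
```

Either choice would make the generated files reproducible from run to run; TreeMap additionally makes them independent of the order in which entries are added.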
make/jdk/src/classes/build/tools/cldrconverter/ResourceBundleGenerator.java
line 157:
> 155: fmt = new Formatter();
> 156: }
> 157: String metaVal = oldEntry.metaKey();
The new instanceof pattern matching could be used to avoid the cast below.
make/jdk/src/classes/build/tools/cldrconverter/ResourceBundleGenerator.java
line 270:
> 268: if (value instanceof String s) {
> 269: return s.equals(entry.value);
> 270: } else if (!(entry.value instanceof String[])) {
Could be rewritten to use an instanceof pattern and save a cast.
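Something along these lines is what I have in mind. This is only a sketch: `PatternSketch` and `sameValue` are hypothetical names, and the real equals method in the PR works on different fields. A negated instanceof pattern binds its variable in the else branch, so no explicit `(String[])` cast is needed (Java 16 or later).

```java
import java.util.Arrays;

public class PatternSketch {
    /**
     * Hypothetical comparison of two values that are each expected to be
     * either a String or a String[].
     */
    static boolean sameValue(Object value, Object other) {
        if (value instanceof String s) {
            // s is already typed as String, no cast needed
            return s.equals(other);
        } else if (!(other instanceof String[] otherArr)) {
            return false;
        } else if (value instanceof String[] arr) {
            // otherArr is in scope here because the negated test above failed
            return Arrays.equals(arr, otherArr);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(sameValue("a", "a"));
        System.out.println(sameValue(new String[] {"a", "b"}, new String[] {"a", "b"}));
        System.out.println(sameValue(new String[] {"a"}, "a"));
    }
}
```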
-------------
PR: https://git.openjdk.org/jdk/pull/9243