@k-yle commented on this pull request.


> +TAG2LINK = lambda {
+  # the JSON data is an array with duplicate entries, which is not efficient for lookups.
+  # So, convert it to a hash and only keep the item with the best rank.
+  array = JSON.parse(Rails.root.join("node_modules/tag2link/index.json").read)
+
+  ranks = %w[deprecated normal preferred].freeze
+
+  output = {}
+
+  all_keys = array.map { |item| item["key"] }.uniq
+
+  all_keys.each do |key|
+    # for each key, find the item with the best rank
+    best_definition = array
+                      .select { |item| item["key"] == key }
+                      .max_by { |item| ranks.index(item["rank"]) }

> [...] in some cases there is more than one that is `preferred`.

Wikidata considers this an error:

<img width="414" height="210" alt="image" src="https://github.com/user-attachments/assets/64aeea5b-a287-4d51-95a9-200d78e5b13e" />


After excluding [`wikidata:P3303`](https://wikidata.org/wiki/Property:P3303) and 
cleaning up a few cases where the only difference was `www.` or `http[s]`, 
there are only 4 keys left which are ambiguous:

`de:amtlicher_gemeindeschluessel`, `ref:INEP`, `ref:ruian`, `woeid`

So... I think we should just exclude any keys where there isn't a single best URL.

<details>
<summary>source</summary>

```js
const { default: uniqBy } = await import("https://esm.sh/lodash.uniqby");
const { default: list } = await import("https://esm.sh/tag2link");

const RANKS = ["normal", "preferred"];

Object.fromEntries(
  Object.values(
    Object.groupBy(
      list.filter(
        (item) => item.rank !== "deprecated" && item.source !== "wikidata:P3303"
      ),
      (item) => item.key
    )
  )
    .map((sublist) =>
      // remove items with the same URL and sort by rank
      uniqBy(
        sublist.sort((a, b) => RANKS.indexOf(b.rank) - RANKS.indexOf(a.rank)),
        (item) => item.url
      )
    )
    .filter((sublist) => {
      // keep only those where the best & second-best have the same rank
      const [best, secondBest] = sublist;
      return best && secondBest && best.rank == secondBest.rank;
    })
    .map((sublist) => [
      // print the ones with the best rank
      sublist[0].key,
      sublist.filter((item) => item.rank === sublist[0].rank),
    ])
);
```


</details>
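
For what it's worth, here is a rough Ruby sketch of how that exclusion could look on the Rails side. It is not the code from this PR: `tag2link_lookup` and `TAG2LINK_RANKS` are made-up names, the filters for `deprecated` and `wikidata:P3303` just mirror the JS above, and it assumes every entry has `key`, `url`, `rank` and `source` fields.

```ruby
# Hypothetical constant: ranks ordered from worst to best.
TAG2LINK_RANKS = %w[deprecated normal preferred].freeze

# Hypothetical helper: turns the tag2link array into { key => url },
# skipping any key that has no single best URL.
def tag2link_lookup(array)
  array
    # assumption: same filters as the JS snippet above
    .reject { |item| item["rank"] == "deprecated" || item["source"] == "wikidata:P3303" }
    .group_by { |item| item["key"] }
    .each_with_object({}) do |(key, items), output|
      # unknown ranks fall back to 0, i.e. the lowest rank
      best_rank = items.map { |item| TAG2LINK_RANKS.index(item["rank"]).to_i }.max

      # distinct URLs that share the best rank
      best_urls = items
                  .select { |item| TAG2LINK_RANKS.index(item["rank"]).to_i == best_rank }
                  .map { |item| item["url"] }
                  .uniq

      # ambiguous key: more than one distinct URL at the best rank, so skip it
      next unless best_urls.size == 1

      output[key] = best_urls.first
    end
end

# Made-up sample data, only to show the behaviour.
sample = [
  { "key" => "a", "url" => "https://example.com/a/$1", "rank" => "preferred", "source" => "wikidata:P1" },
  { "key" => "a", "url" => "https://example.org/a/$1", "rank" => "normal", "source" => "wikidata:P1" },
  { "key" => "b", "url" => "https://example.com/b/$1", "rank" => "preferred", "source" => "wikidata:P2" },
  { "key" => "b", "url" => "https://example.org/b/$1", "rank" => "preferred", "source" => "wikidata:P2" }
]

p tag2link_lookup(sample)
```

With this sample only `a` ends up in the hash, because `b` has two different `preferred` URLs. Duplicate entries that share the same URL are collapsed by `uniq`, so they don't count as ambiguous.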

-- 
Reply to this email directly or view it on GitHub:
https://github.com/openstreetmap/openstreetmap-website/pull/6197#discussion_r2218556704