Re: [NTG-context] Ligature suppression word list

2021-04-12 Thread denis.maier
of: pdftotext book.pdf | showIncorrectLigatures.py > incorrect-ligatures.txt Denis From: ntg-context On behalf of rh...@t-online.de Sent: Wednesday, 7 April 2021 20:20 To: ntg-context@ntg.nl Subject: Re: [NTG-context] Ligature suppression word list Message: 2 Date: Tue, 6 Apr 2021 15
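The script `showIncorrectLigatures.py` named in the pipeline above is not reproduced in this archive. As a rough sketch only, a filter of that kind might look like the following. It assumes that `pdftotext` emits ligatures as Unicode presentation-form characters (for example U+FB02 for "fl") and that the suppression list uses `|` to mark boundaries a ligature must not cross, as in `Auf|lagefläche`. The function names `load_boundaries` and `incorrect_ligatures` are hypothetical and not taken from the original script.

```python
import re

# Unicode presentation-form ligatures and their letter expansions.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def load_boundaries(lines):
    """Map each plain word to the offsets where '|' forbids a ligature.

    Hypothetical helper: expects one entry per line, e.g. 'Auf|lagefläche'.
    """
    table = {}
    for entry in lines:
        entry = entry.strip()
        if not entry:
            continue
        offsets, plain = set(), []
        for ch in entry:
            if ch == "|":
                offsets.add(len(plain))  # boundary sits before the next letter
            else:
                plain.append(ch)
        table["".join(plain)] = offsets
    return table


def incorrect_ligatures(text, table):
    """Yield words in which a ligature glyph spans a forbidden boundary."""
    for token in re.findall(r"\w+", text):
        plain, spans = [], []
        for ch in token:
            expansion = LIGATURES.get(ch)
            if expansion:
                # Record which letter offsets this ligature glyph covers.
                spans.append((len(plain), len(plain) + len(expansion)))
                plain.extend(expansion)
            else:
                plain.append(ch)
        word = "".join(plain)
        forbidden = table.get(word, set())
        if any(start < cut < end for start, end in spans for cut in forbidden):
            yield word
```

To use it in a pipeline like the one quoted, a small wrapper would read the extracted text from standard input, load the word list from a file, and print each word yielded by `incorrect_ligatures`.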

Re: [NTG-context] Ligature suppression word list

2021-04-08 Thread Hans Hagen
On 4/8/2021 9:37 PM, Arthur Rosendahl wrote: Dutch, by contrast, does not seem so well served: the OpenTaal group is dormant and no longer offers the hyphenated word list that was once available (that was already the case five years ago). The most relevant page I find:

Re: [NTG-context] Ligature suppression word list

2021-04-08 Thread Arthur Rosendahl
On Sat, Apr 03, 2021 at 06:02:10PM +0200, Hans Hagen wrote: > german is just an example, dutch has some specific things, and i bet other > languages have their demands so my aim is some general mechanism I appreciate that, but if you want to have data of sufficiently good quality to use this

Re: [NTG-context] Ligature suppression word list

2021-04-08 Thread denis.maier
From: ntg-context On behalf of rh...@t-online.de Sent: Wednesday, 7 April 2021 20:20 To: ntg-context@ntg.nl Subject: Re: [NTG-context] Ligature suppression word list A lot of corpora can be found here: https://wortschatz.uni-leipzig.de/de especially here: https://wortschatz.uni-leipzig.de

Re: [NTG-context] Ligature suppression word list

2021-04-07 Thread rha17
> Message: 2 > Date: Tue, 6 Apr 2021 15:03:54 + > From: denis.ma...@ub.unibe.ch > To: j.ha...@xs4all.nl, ntg-context@ntg.nl > Subject: Re: [NTG-context] Ligature suppression word list > Message-ID: <41e6530172b54

Re: [NTG-context] Ligature suppression word list

2021-04-06 Thread denis.maier
> -Original Message- > From: Hans Hagen > Sent: Saturday, 3 April 2021 17:58 > To: mailing list for ConTeXt users ; Maier, Denis > Christian (UB) > Subject: Re: [NTG-context] Ligature suppression word list > > On 4/3/2021 5:06 PM, denis.ma...@ub.unibe.c

Re: [NTG-context] Ligature suppression word list

2021-04-06 Thread denis.maier
> -Original Message- > From: Hans Hagen > Sent: Saturday, 3 April 2021 17:58 > To: mailing list for ConTeXt users ; Maier, Denis > Christian (UB) > Subject: Re: [NTG-context] Ligature suppression word list > > On 4/3/2021 5:06 PM, denis.ma...@u

Re: [NTG-context] Ligature suppression word list

2021-04-03 Thread Thangalin
Untested. Lists are not subject to copyright, so public domain should be legal, even though SE posts are CC-BY-SA. When a word has a single suffix or prefix (e.g., safflower/s), the two words are listed together, rather than using an explicit suffix/prefix section. return { name =

Re: [NTG-context] Ligature suppression word list

2021-04-03 Thread Hans Hagen
On 4/3/2021 6:30 PM, Thangalin wrote: A starting list of English non-ligatures: https://english.stackexchange.com/a/50957/22099 The entire SE thread has additional resources and is quite informative. So can you make a file from that like we

Re: [NTG-context] Ligature suppression word list

2021-04-03 Thread Hans Hagen
On 4/3/2021 5:06 PM, denis.ma...@ub.unibe.ch wrote: For those interested, that file only has ligature prevention definitions. { actions = { ["|"] = "noligature" }, words = [[ Auf|lagefläche Auf|lageflächen
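The fragment quoted above marks the forbidden ligature position with `|` inside each word. As an illustration only (not the method used to build the actual list), the sketch below shows how such entries could be produced from words that have already been split into parts. The set of ligature letter sequences is an assumption, since which ligatures exist depends on the font, and `mark_boundary` is a hypothetical name.

```python
# Letter sequences that fonts commonly replace by a ligature.
# This set is an assumption; the real set depends on the font in use.
LIGATURE_PAIRS = ("ffi", "ffl", "ff", "fi", "fl")


def mark_boundary(parts):
    """Join word parts, inserting '|' where a ligature would straddle a junction.

    Only the part immediately before each junction is inspected, so a
    ligature spread over three very short parts is not detected.
    """
    result = parts[0]
    for previous, part in zip(parts, parts[1:]):
        straddles = any(
            previous.endswith(pair[:k]) and part.startswith(pair[k:])
            for pair in LIGATURE_PAIRS
            for k in range(1, len(pair))  # every way to split the sequence
        )
        result += ("|" if straddles else "") + part
    return result
```

For example, `mark_boundary(["Auf", "lagefläche"])` gives `Auf|lagefläche`, matching the form of the entries in the quoted file, while a compound with no ligature at the junction is joined without a marker.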

Re: [NTG-context] Ligature suppression word list

2021-04-03 Thread Thangalin
A starting list of English non-ligatures: https://english.stackexchange.com/a/50957/22099 The entire SE thread has additional resources and is quite informative.

Re: [NTG-context] Ligature suppression word list

2021-04-03 Thread Hans Hagen
On 4/3/2021 5:06 PM, denis.ma...@ub.unibe.ch wrote: 1. The new language options features include a tracker that allows for tracking for which words in a given document ligature prevention happened, and which words haven’t been touched by the mechanism. It should be possible to

Re: [NTG-context] Ligature suppression word list

2021-04-03 Thread Hans Hagen
On 4/3/2021 5:20 PM, Arthur Rosendahl wrote: On Sat, Apr 03, 2021 at 03:06:22PM +, denis.ma...@ub.unibe.ch wrote: What do you think? I think you should collaborate with the group of volunteers working on German hyphenation and related topics. They have a mailing list (in German):

Re: [NTG-context] Ligature suppression word list

2021-04-03 Thread Hans Hagen
On 4/3/2021 5:06 PM, denis.ma...@ub.unibe.ch wrote: Hi everyone Now that Hans has implemented the new ligature suppression mechanism via language goodies – thanks again Hans! – we now need to come up with wordlists. I’ve started working on a list of German words with ligatures that should

Re: [NTG-context] Ligature suppression word list

2021-04-03 Thread Arthur Rosendahl
On Sat, Apr 03, 2021 at 03:06:22PM +, denis.ma...@ub.unibe.ch wrote: > What do you think? I think you should collaborate with the group of volunteers working on German hyphenation and related topics. They have a mailing list (in German): https://lists.dante.de/mailman/listinfo/trennmuster

[NTG-context] Ligature suppression word list

2021-04-03 Thread denis.maier
Hi everyone Now that Hans has implemented the new ligature suppression mechanism via language goodies - thanks again Hans! - we now need to come up with wordlists. I've started working on a list of German words with ligatures that should be suppressed. The list is derived from the word list