Re: [NTG-context] Set luatex cache directory path

2021-04-06 Thread Hans Hagen

On 4/6/2021 8:59 PM, Thangalin wrote:

> Thanks Aditya.
>
> What do you think of changing the default luatex-cache directory to the
> system's temporary directory? Consider:
>
>   * The $HOME directory is sacrosanct (4784 people agree:
>     https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1575053)
>   * The temp directory is cleared on Linux (Unix?) system reboots;
>     purged during regular Windows upkeep
>   * The temp directory is writable by default
>   * Changing the location requires calling an additional program, which
>     isn't obvious (principle of least astonishment)

Do you really want to recache fonts so often?

> My text editor invokes ConTeXt like:
>
>     if( TYPESETTER.canRun() ) {
>       env.put( "TEXMFCACHE", System.getProperty( "java.io.tmpdir" ) );
>
>       mArgs.add( TYPESETTER.getName() );
>       mArgs.add( .. --path .. --purge .. --batch .. --result ..
>                  --environment .. etc. );
>
>       mArgs.add( inputFilename );
>     }

--batch only makes sense for an unattended run
--purging every time can lead to extra runs

> The first line ensures that "context" is an executable located in a PATH
> directory. The second line attempts to change the luatex-cache
> directory. The remaining lines configure the command-line arguments
> prior to running ConTeXt.
>
> Fearing flaming wrath from users, an additional mtxrun call is required,
> which incurs overhead:
>
>   * Check for mtxrun executable
>   * Run mtxrun each time

see aditya's reply ... the --autogenerate is clever enough not to do
redundant things (and context knows when it has been updated so ...)

> This would work but feels like a leaky abstraction (i.e., the context
> executable should honour TEXMFCACHE without needing to invoke mtxrun
> because context creates the luatex-cache directory).

see aditya's reply ... quite some effort has gone into making sure
context starts up fast so i'm not going to advocate a different practice
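
For an editor integration like the one quoted above, a minimal Java sketch of
the mtxrun-based call from Aditya's reply might look as follows. The class and
method names (ContextLauncher, command, prepare) are hypothetical, and the only
flags used are the ones named in this thread; --batch and --purge are left out
in line with the remarks above. It only prepares the process and prints the
command, so it can be tried without ConTeXt installed.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of ConTeXt or of the editor in this thread. */
public final class ContextLauncher {

  /** Builds the argument list: mtxrun --autogenerate --script mtx-context FILE. */
  public static List<String> command( final String inputFilename ) {
    final List<String> args = new ArrayList<>();
    args.add( "mtxrun" );
    args.add( "--autogenerate" );
    args.add( "--script" );
    args.add( "mtx-context" );
    args.add( inputFilename );
    return args;
  }

  /** Prepares (but does not start) a process whose environment sets TEXMFCACHE. */
  public static ProcessBuilder prepare( final String inputFilename, final Path cacheDir ) {
    final ProcessBuilder builder = new ProcessBuilder( command( inputFilename ) );
    builder.environment().put( "TEXMFCACHE", cacheDir.toString() );
    builder.inheritIO();
    return builder;
  }

  public static void main( final String[] args ) {
    // Assumption: the system temp directory is used as the cache root.
    final Path cache = Paths.get( System.getProperty( "java.io.tmpdir" ) );
    final String input = args.length > 0 ? args[ 0 ] : "test.tex";
    final ProcessBuilder builder = prepare( input, cache );

    System.out.println( String.join( " ", builder.command() ) );
    // To actually typeset (requires mtxrun on the PATH), call:
    // builder.start().waitFor();
  }
}
```

Keeping the environment change on the ProcessBuilder means TEXMFCACHE only
affects the child process, not the editor itself.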


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-
___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki : http://contextgarden.net
___


Re: [NTG-context] Set luatex cache directory path

2021-04-06 Thread Thangalin
Perfect, thank you.

Will be Wikified.


Re: [NTG-context] Set luatex cache directory path

2021-04-06 Thread Aditya Mahajan
On Tue, 6 Apr 2021, Thangalin wrote:

> Thanks Aditya.
> 
> What do you think of changing the default luatex-cache directory to the
> system's temporary directory? Consider:
> 
>    - The $HOME directory is sacrosanct (4784 people agree:
>    https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1575053)
>    - The temp directory is cleared on Linux (Unix?) system reboots; purged
>    during regular Windows upkeep
>    - The temp directory is writable by default
>    - Changing the location requires calling an additional program, which
>    isn't obvious (principle of least astonishment)

Use:

mtxrun --autogenerate --script mtx-context ...

Aditya


Re: [NTG-context] Set luatex cache directory path

2021-04-06 Thread Thangalin
Thanks Aditya.

What do you think of changing the default luatex-cache directory to the
system's temporary directory? Consider:

   - The $HOME directory is sacrosanct (4784 people agree:
   https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1575053)
   - The temp directory is cleared on Linux (Unix?) system reboots; purged
   during regular Windows upkeep
   - The temp directory is writable by default
   - Changing the location requires calling an additional program, which
   isn't obvious (principle of least astonishment)

My text editor invokes ConTeXt like:

if( TYPESETTER.canRun() ) {
  env.put( "TEXMFCACHE", System.getProperty( "java.io.tmpdir" ) );

  mArgs.add( TYPESETTER.getName() );
  mArgs.add( .. --path .. --purge .. --batch .. --result ..
--environment .. etc. );
  mArgs.add( inputFilename );
}

The first line ensures that "context" is an executable located in a PATH
directory. The second line attempts to change the luatex-cache directory.
The remaining lines configure the command-line arguments prior to running
ConTeXt.

Fearing flaming wrath from users, an additional mtxrun call is required,
which incurs overhead:

   - Check for mtxrun executable
   - Run mtxrun each time

This would work but feels like a leaky abstraction (i.e., the context
executable should honour TEXMFCACHE without needing to invoke mtxrun
because context creates the luatex-cache directory).

Thoughts?


Re: [NTG-context] Ligature suppression word list

2021-04-06 Thread denis.maier
> -Original Message-
> From: Hans Hagen
> Sent: Saturday, 3 April 2021 17:58
> To: mailing list for ConTeXt users; Maier, Denis Christian (UB)
> Subject: Re: [NTG-context] Ligature suppression word list
> 
> On 4/3/2021 5:06 PM, denis.ma...@ub.unibe.ch wrote:
> > Hi everyone
> >
> > Now that Hans has implemented the new ligature suppression mechanism
> > via language goodies - thanks again Hans! - we now need to come up
> > with wordlists.
> >
> > I've started working on a list of German words with ligatures that
> > should be suppressed. The list is derived from the word list that
> > comes with the lualatex selnolig package:
> > https://github.com/micoloretan/selnolig/blob/master/selnolig-german-wordlist.tex
> >
> > You can find the current list here :
> > https://github.com/denismaier/context-nolig-wordlist
> > 
> >
> > The list is currently organized as follows:
> >
> >  1. L. 25-35: This specifies words where automatic pattern matching is
> > more difficult than usual because the words contain multiple
> > ligatures, some of which must be suppressed while others must be
> > preserved. In the case of « Auflagefläche » it's even the same
> > combination of letters. So here, we use the bar | to manually
> > indicate points where no ligature must occur.
> >  2. L. 36ff.: The vast majority of words are currently in the list that
> > specifies words where an ff, fl, fi, ffi, or ffl ligature has to be
> > broken up after the first f.
> >  3. L. 1804ff. contain words where ffi, ffl, or fff ligatures have to be
> > prevented after the second f, so the first two fs form a ligature.
> >  4. The remaining blocks starting at l. 1900, l. 2073, l. 2157, l. 2225,
> > and l. 2277 suppress ligatures for « ft » and « fft », « fb » and
> > « ffb », « fh » and « ffh », « fj » and « ffj », and « fk » and « ffk ».
> >
> > Obviously, that list is far from being complete, and the question is
> > if it ever can be. Please have a look and feel free to propose more
> > words to be included - either via mail or directly on github.
> >
> > More generally, there's the question of how such a list should be enhanced.
> > I was thinking about two options:
> >
> >  1. The new language options features include a tracker that allows for
> > tracking for which words in a given document ligature prevention
> > happened, and which words haven't been touched by the mechanism. It
> > should be possible to analyze the log file and to create lists of
> > words with ligatures. Should be a rather simple step to derive new
> > words for the ligature-suppression wordlist.
> >  2. A bigger solution might be to use selnolig's patterns in a script
> > that can be run over a large corpus, such as the DWDS (Digitales
> > Wörterbuch der deutschen Sprache). That should give us a more
> > complete list of words where ligatures must be suppressed.
> 
> where is that DWDS ... i can write some code to deal with it (i'd rather start
> from the source than from some interpretation; who knows what more there
> is to uncover)

As it turns out, the linguists who helped with the selnolig package used
another corpus: Stuttgart "Deutsch" Web as Corpus.
They describe their approach in this paper:
https://raw.githubusercontent.com/SHildebrandt/selnolig-check/master/selnolig-check-documentation.pdf

Denis



Re: [NTG-context] Ligature suppression word list

2021-04-06 Thread denis.maier


> -Original Message-
> From: Hans Hagen
> Sent: Saturday, 3 April 2021 17:58
> To: mailing list for ConTeXt users; Maier, Denis Christian (UB)
> Subject: Re: [NTG-context] Ligature suppression word list
> 
> where is that DWDS ... i can write some code to deal with it (i'd rather start
> from the source than from some interpretation; who know what more there
> is to uncover)

The DWDS is here: https://www.dwds.de/
But I still need to check how we can extract the words from there...
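
Once a plain word list has been extracted from a corpus, a first pass could
simply collect the words that contain one of the letter sequences named in this
thread (ff, fi, fl, ft, fb, fh, fj, fk and their ff-prefixed variants). The
sketch below is only that: the class name LigatureCandidates and its method are
hypothetical, and it does not decide whether a ligature must actually be
suppressed, which still needs morpheme boundaries or selnolig-style patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical pre-filter; not part of ConTeXt or selnolig. */
public final class LigatureCandidates {

  // An "f" followed by any letter it may form a ligature with, per the
  // sequences listed in the thread. "ff" also covers ffi, ffl, fff, etc.
  private static final Pattern F_SEQUENCE = Pattern.compile( "f[filtbhjk]" );

  /** Returns the words containing at least one candidate letter sequence. */
  public static List<String> candidates( final List<String> words ) {
    final List<String> result = new ArrayList<>();

    for( final String word : words ) {
      // Lowercase first so that capitalised nouns are matched as well.
      if( F_SEQUENCE.matcher( word.toLowerCase( Locale.ROOT ) ).find() ) {
        result.add( word );
      }
    }

    return result;
  }

  public static void main( final String[] args ) {
    final List<String> words = List.of( "Auflagefläche", "Haus", "Schifffahrt", "Baum" );
    System.out.println( candidates( words ) );
  }
}
```

The resulting candidates would then have to be checked by hand or by patterns
before being added to the suppression list.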

Denis


Re: [NTG-context] Set luatex cache directory path

2021-04-06 Thread Aditya Mahajan
On Mon, 5 Apr 2021, Thangalin wrote:

> Peter Münster once asked:
> 
> > What should I do please, to prevent ConTeXt from creating
> > $HOME/luatex-cache?
> 
> I'd like to do the same:
> 
> $ cd $HOME
> $ ls luatex-cache
> ls: cannot access 'luatex-cache': No such file or directory
> $ context test.tex
> $ ls luatex-cache/
> context
> $ rm -rf luatex-cache
> $ export TEXMFCACHE=/tmp

Add

$ mtxrun --generate

> $ context test.tex
> mtxrun  | unknown script 'mtx-context.lua' or 'mtx-mtx-context.lua'
> $ export TEXMFCACHE=
> $ context --version
> ...
> mtx-context | current version: 2021.03.31 18:04
> 
> What environment variable must change to set the luatex-cache directory?

Aditya