Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Jose Augusto
Hi all,

I think I solved the problem, at least for my current errors.
I read the following online article about string encoding in ruby 1.9 and up:

  http://blog.grayproductions.net/articles/ruby_19s_string

With that info at hand, I made two brute-force trial patches (read the
above article to see why I call them brute force :-) ) in the two ConTeXt
ruby files where problems were arising (original line numbers are shown):

### .../scripts/context/ruby/base/tex.rb ###
(946)  case str.chomp
   ===>
   str = str.force_encoding('ISO-8859-1') if RUBY_VERSION >= '1.9'
   case str.chomp

### .../scripts/context/ruby/base/texutil.rb ###
(1033)  case line.chomp
   ===>
   line = line.force_encoding('ISO-8859-1') if RUBY_VERSION >= '1.9'
   case line.chomp

The error is due to the fact that, by default, ruby 1.9 considers strings as
US-ASCII and complains when finding chars not in that encoding.
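
To make the failure and the patch concrete, here is a minimal sketch for
ruby 1.9 or later. The sample line and the helper name retag_latin1 are made
up for illustration, and the input is assumed to be ISO-8859-1 bytes that
ruby has tagged as US-ASCII:

```ruby
# Hypothetical helper, not part of the ConTeXt scripts: retag a line as
# ISO-8859-1 when the running Ruby has String#force_encoding (1.9 and later).
def retag_latin1(line)
  return line unless line.respond_to?(:force_encoding)
  line.dup.force_encoding('ISO-8859-1')
end

# A made-up .tui-like line holding bytes 0xE7 and 0xE3 (ISO-8859-1 accents),
# tagged as US-ASCII to imitate what the scripts saw.
raw = "c \\listentry{Ma\xE7\xE3}\n".dup.force_encoding('US-ASCII')

# Matching the mis-tagged line raises ArgumentError, as in the reported trace.
begin
  raw.chomp =~ /^c (.*)$/
  raised = false
rescue ArgumentError
  raised = true
end

# After retagging, the same pattern matches and $1 holds the payload.
payload = retag_latin1(raw).chomp =~ /^c (.*)$/ ? $1 : nil
```

The respond_to? guard is an alternative to comparing RUBY_VERSION; either way
ruby 1.8 never calls force_encoding.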

I don't know how to solve the problem for people writing in an
encoding other than ISO-8859-1. I tried the above with UTF-8
instead of ISO-8859-1 and it didn't work.
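
A possible explanation, assuming the input files really are stored as
ISO-8859-1: force_encoding only changes the tag on the bytes, and bytes that
are fine as ISO-8859-1 are usually not well-formed UTF-8. A small sketch
(ruby 2.0 or later because of String#b; the variable names are illustrative):

```ruby
# Bytes for "Ãçê" as ISO-8859-1 would store them (assumed input encoding).
latin1_bytes = "\xC3\xE7\xEA".b

as_utf8   = latin1_bytes.dup.force_encoding('UTF-8')
as_latin1 = latin1_bytes.dup.force_encoding('ISO-8859-1')

utf8_valid   = as_utf8.valid_encoding?    # these bytes are not well-formed UTF-8
latin1_valid = as_latin1.valid_encoding?  # every byte is a valid ISO-8859-1 char

# Transcoding (not just retagging) is what produces real UTF-8 text.
transcoded = as_latin1.encode('UTF-8')
```

So retagging as UTF-8 would only help for files that are actually saved as
UTF-8.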

Finally, I don't know if there are any other places (at least case
expressions) in the ConTeXt ruby scripts where the problem might also show up.

Kind regards,

J. Augusto


On Mon, Aug 10, 2009 at 4:15 AM, Jose Augusto jasaugu...@gmail.com wrote:

 Hi all,

 Ok, here it goes. Attached are the files used in the test.

 The problem as reported in the  previous email  used
 the file with the offending chars wrapped in a main file, which was just:

 \starttext
 \input zzz.tex
 \stoptext

 That is, the offending chars were in zzz.tex.

 In that example I noticed the error because the cross-refs
 in the equation numbering were not working.
 The parsing of the .tui file by ruby 1.9.1 failed. Then I saw the errors.

 But then I made a single ConTeXt file, which is attached together
 with the correct results (tui, tuo, pdf) obtained with ruby 1.8.7.

 However, when I run ruby 1.9.1 with patch 129 (the latest one) on this
 single tex file (attached), now even the first pass doesn't work!
 Here is the result (in Windows XP), which proves ruby 1.9.1 doesn't like
 the non-US-ASCII chars :-)

 F:\ANOS\ano09-10-pen\NotasProcSinal>texexec test1.tex
 TeXExec | processing document 'test1.tex'
 TeXExec | no ctx file found
 D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:946:in `===': invalid byte sequence in US-ASCII (ArgumentError)
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:946:in `scantexcontent'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1907:in `processfile'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1143:in `block (2 levels) in processtex'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1133:in `timedrun'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1142:in `block in processtex'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1139:in `each'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1139:in `processtex'
   from D:/Context/tex/texmf-context/scripts/context/ruby/texexec.rb:63:in `process'
   from D:/Context/tex/texmf-context/scripts/context/ruby/texexec.rb:53:in `main'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/switch.rb:133:in `execute'
   from D:/Context/tex/texmf-context/scripts/context/ruby/texexec.rb:787:in `main'

 Thanks all for your interest,

 Kind regards

 J. Augusto


 On Sun, Aug 9, 2009 at 8:57 PM, Hans Hagen pra...@wxs.nl wrote:

 Jose Augusto wrote:

 when /^c (.*)$/o then @plugins.reader('MyCommands', [$1])


 what if you remove the o (/o)

 can you find out what changed between 1.8 and 1.9 ... actually 1.9 is the
 stepping stone to 2.0 and 2 versions can be incompatible to 1 versions

 also, can you make a test file so that we can see if there's a platform
 dependency?


 -
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
 tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
 | www.pragma-pod.nl
 -

 ___
 If your question is of interest to others as well, please add an entry to
 the Wiki!

 maillist : ntg-context@ntg.nl /
 http://www.ntg.nl/mailman/listinfo/ntg-context
 webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
 archive  : https://foundry.supelec.fr/projects/contextrev/
 wiki : http://contextgarden.net

 ___




Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Jose Augusto
Hi Hans,

I just sent a mail with a possible patch before I read this answer from you :-)
As I say there, the patches work (at least for me). I updated ConTeXt
mkii a few hours ago, so I don't know whether the betas you mentioned have
already been installed...

I hope the proposed patches are helpful...
Thanks very much for your answer.

J. Augusto

On Mon, Aug 10, 2009 at 2:10 PM, Hans Hagen pra...@wxs.nl wrote:

 Jose Augusto wrote:

 Hi all,

 Ok, here it goes. Attached are the files used in the test.

 The problem as reported in the  previous email  used
 the file with the offending chars wrapped in a main file, which was just:

 \starttext
 \input zzz.tex
 \stoptext

 That is, the offending chars were in zzz.tex.

 In that example I noticed the error because the cross-refs
 in the equation numbering were not working.
 The parsing of the .tui file by ruby 1.9.1 failed. Then I saw the errors.


 ruby 1.9 internally is no longer 8 bit clean i.e. there is always an
 encoding (file as well as internal); there is no way to enforce this (there
 is the -E option but that is useless for 1.8)

 i now open some files explicitly in binary mode; maybe that helps; i have
 no clue what happens with string manipulations later on
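
 The effect of opening a file in binary mode can be sketched as follows
 (ruby 1.9 or later). The temporary file and its contents are invented for
 the example; this is not the actual texexec.rb change:

```ruby
require 'tempfile'

# Write a made-up .tui-like line containing ISO-8859-1 accent bytes.
tmp = Tempfile.new(['demo', '.tui'])
tmp.binmode
tmp.write("c \\listentry{Title: \xC3\xE7\xEA}\n".b)
tmp.close

# Lines read in 'rb' mode are tagged ASCII-8BIT (binary), for which any byte
# is valid, so the regexp in the case expression no longer raises.
commands = []
File.open(tmp.path, 'rb') do |f|
  f.each_line do |line|
    case line.chomp
    when /^c (.*)$/ then commands << $1
    end
  end
end
```

 The captured strings stay binary, so any later code that mixes them with
 text in another encoding may still need care.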

 i always liked ruby but such fundamental changes (encoding, dropping
 functions etc) without renaming the program are a showstopper for me as one
 cannot predict what will be on the user's system

 it looks like i have to convert the texutil part to lua (takes a few days
 and since i mostly use luatex it has a low priority)

 i uploaded a beta for testing

 Hans




Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Jose Augusto
Hi Hans,

The patch I proposed also works with ruby older than 1.9 (e.g. ruby 1.8.7)!
The force_encoding() method is called only if RUBY_VERSION >= '1.9'.
If the scripts are executed by ruby 1.8 or an earlier version, no change is
made to the current line of code (e.g. 'case line.chomp').

Also, I verified the patch with ruby 1.8.7 and with 1.9.1, and it worked in
both cases.

However, the patch has the drawback of slowing processing (the 'if' is
executed for each line of the files parsed, and this could probably be
optimized...)
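
One way the per-line test could be avoided, sketched under the assumption
that checking for String#force_encoding is an acceptable substitute for the
RUBY_VERSION comparison (retag_line is a hypothetical name, not a function in
the ConTeXt scripts):

```ruby
# Decide once, at load time, which definition of retag_line to install,
# so that no version test runs inside the per-line loop.
if ''.respond_to?(:force_encoding)
  def retag_line(line)
    line.force_encoding('ISO-8859-1')   # Ruby 1.9+: changes the encoding tag in place
  end
else
  def retag_line(line)
    line                                # Ruby 1.8: strings carry no encoding tag
  end
end

sample = "c \\foo{\xE7}\n".dup
result = retag_line(sample).chomp
```

In the patched scripts the call would then read 'case retag_line(line).chomp'.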

Meanwhile, I don't think that the magic comment
# encoding: ASCII-8BIT
solves the problem. This comment declares that the script itself is written in
ASCII-8BIT, but when ruby 1.9.1 reads strings from the .tex or .tui files it
considers them US-ASCII regardless of the encoding declared in # encoding: ...
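
In ruby 1.9 and later, the encoding tag of data read from a file comes from
the IO object's external encoding (given in the open mode, or else the
process default), while the magic comment sets the encoding of literals in
that source file. A small sketch with a throwaway file (all names here are
illustrative):

```ruby
require 'tempfile'

# A one-line file holding byte 0xE7 (c-cedilla in ISO-8859-1).
tmp = Tempfile.new('encdemo')
tmp.binmode
tmp.write("\xE7\n".b)
tmp.close

# The same bytes get a different tag depending on how the file is opened.
ascii_line  = File.open(tmp.path, 'r:US-ASCII')   { |f| f.gets }
latin1_line = File.open(tmp.path, 'r:ISO-8859-1') { |f| f.gets }

# The source encoding of this script is a separate setting.
source_encoding = __ENCODING__
```

Here ascii_line is tagged US-ASCII and is not a valid string, whatever
source_encoding happens to be.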

I introduced  # encoding: ASCII-8BIT  in texmfstart.rb, tex.rb and texutil.rb
and the problem didn't disappear :-(

Of course I may be wrong, but the experiments I did make me think this way.
Also, I don't have Linux at my disposal (I mean, with ConTeXt installed), and
the behavior there may be different...

Kind regards and thank you very much.

J. Augusto






On Mon, Aug 10, 2009 at 5:27 PM, Hans Hagen pra...@wxs.nl wrote:

 Jose Augusto wrote:

 Hi Hans,

 I just sent a mail with a possible patch, before I read this answer from
 you :-)
 As I say there, the patches work (at least for me) and I had updated
 context mkii a few hours ago, so I don't know if the betas you mentioned
 have already been installed...

 Hope the proposed patches be helpful...


 your patch will not work with ruby < 1.9, so if my patch (opening files in
 'rb' mode) works ok that's more robust;

 another option is to patch texmfstart.rb

 #!/usr/bin/env ruby
 #encoding: ASCII-8BIT





Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Jose Augusto
Hi Hans,

I just ran ruby 1.8.6 and the force_encoding() patch worked well.

I have also just updated with --context=current. The banner in texexec.rb is
banner = ['TeXExec', 'version 6.2.1', '1997-2009', 'PRAGMA ADE/POD']
and the date of this script (after updating) is 10-04-2009 (it's April...).

I'm running mkii. How do I get the mkii beta scripts, such as the texexec.rb
you mention?

All my rubies are compiled on the box with mingw in Windows
(2000 or XP, on 3 different machines). Of course the encoding
thing is different in Linux, Windows (and DOS prompts, for that matter),
so there is probably different behavior in the ruby/context/tex interaction
with chars on Linux and Windows boxes...

Thx

Jose

On Mon, Aug 10, 2009 at 6:39 PM, Hans Hagen pra...@wxs.nl wrote:

 Jose Augusto wrote:

  Meanwhile I don't think that the magic string
 # encoding: ASCII-8BIT
 solves the problem. This string indicates that the script is written in
 ASCII-8BIT,
 but when is reading the strings from the .tex or .tui files ruby 1.9.1
 considers
 them as US-ASCII regardless of the encoding declared in # encoding: ...


 not when opened as 'rb' (which i do in the latest texexec.rb) so i wonder
 why that does not work at your place

 (http://blog.nuclearsquid.com/writings/ruby-1-9-encodings)

 i run ruby 1.8.6 (and on a couple of servers even older versions and i'm
 not going to touch ruby on these machines (i don't want to patch scripts
 that are supposed to run another 5-10 years) but i might update context and
 texexec)

  I introduced  # encoding: ASCII-8BIT   in texmfstart.rb, tex.rb and
 texutil.rb
 and the problem didn't disapeer :-(


 hm, it worked here

  Of course I may be wrong. But the experiments I did make me think this
 way.
 Also, I don't have Linux at my disposal (I mean, with context installed)
 and
 there
 the behavior perhaps is different...


 that's my biggest fear ... introducing more problems


 Hans




[NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-08 Thread Jose Augusto
Hi all,

A few weeks ago I reported a problem with ruby 1.9.1, which
was solved by removing the offending .tui line (Mojca and Hans
AFAIR). The problem was related with the existence of non-ascii
chars in the .tui file. Sadly it strikes again, now when chars with
accents appear in titles (sections, subsections, etc...).

The parsing of the line marked at the end of this message, from a .tui file,
fails in ruby 1.9.1, but not in ruby 1.8.7. The error that is returned is
also shown. If I remove the chars with accents from the section title all goes
well. I'm using Mkii (--context=current).

One of the advantages of ruby 1.9 over 1.8 is that it is 3 times faster...
However, ruby made lots of changes in string manipulation and storage
when moving from 1.8 to 1.9, and that must be the source of the problem.

I tracked the error to texutil.rb, line 1035:

when /^c (.*)$/o then @plugins.reader('MyCommands', [$1])

but then I got lost in the Classes/Modules jungle :-) in that script.
Perhaps it is this procedure, at line 403 of texutil.rb, which triggers
the error?

def MyCommands::reader(logger,data)
    @@commands.push(data.shift+data.collect do |d| "\{#{d}\}" end.join)
end
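
For readers unfamiliar with that expression, here is a stand-alone imitation
of what it computes. build_command is a hypothetical name used only in this
sketch, and the sample arguments are invented:

```ruby
# Imitates the expression inside MyCommands::reader: the first element is
# kept as-is and every remaining element is wrapped in braces and appended.
def build_command(data)
  data = data.dup                      # keep the caller's array intact
  head = data.shift                    # first element, unchanged
  head + data.collect { |d| "{#{d}}" }.join
end

single = build_command(['\mainreference{}{a}'])
multi  = build_command(['\foo', 'a', 'b'])
```

Since the 'when /^c (.*)$/o' branch passes the one-element array [$1], the
method would simply store the captured text unchanged in that case.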

Thanks in advance for your support.

If I can help in solving the problem, please direct me in the task. I have some
experience with ruby (I started using it when the 1st pickaxe book edition was
published, around 2001) and with perl. But not with Lua :-)... Although I have
already read quite a lot of Roberto's Lua book, I haven't started coding in Lua
yet :-)

Kind Regards

J. Augusto

### TUI file and triggered error ###

## .tui snippet

c \mainreference{}{a}{2--0-1-1-0-0-0-0--1}{1}{1.1}
c \listentry{subsection}{3}{1.1.1}{Title with accents: Ãçê}{2--0-1-1-1-0-0-0--1}{1}
c \mainreference{}{b}{2--0-1-1-1-0-0-0--1}{1}{1.2}

### error

pdfTeX warning: pdftex.exe: no GlyphToUnicode entry has been inserted yet!
Output written on test1.pdf (1 page, 72793 bytes).
Transcript written on test1.log.
TeXUtil | parsing file test1.tui
TeXUtil | fatal error in parsing test1.tui
TeXUtil | check loading of file 'test1', begin/end problem
TeXUtil | shortcuts : 169
TeXUtil | expansions: 308
#


Re: [NTG-context] ConTeXt Minimals parsing of .tui broken with ruby 1.9.1 in Windows

2009-07-15 Thread Jose Augusto
Hi all,

Thanks for the patch. I just updated ConTeXt Minimals and retried.
Here is the GOOD result; it is working now:

-
F:\ANOS\TeXes>ruby -v
ruby 1.9.1p129 (2009-05-12 revision 23412) [i386-mingw32]

F:\ANOS\TeXes>texexec con-hello1.tex
TeXExec | processing document 'con-hello1.tex'

Output written on con-hello1.pdf (1 page, 21759 bytes).
Transcript written on con-hello1.log.
TeXUtil | parsing file con-hello1.tui
TeXUtil | shortcuts : 169
TeXUtil | expansions: 308
TeXUtil | reductions: 0
TeXUtil | divisions : 0
TeXUtil | loaded files: 1
TeXUtil | temporary files: 0
TeXUtil | commands: 20
TeXUtil | programs: 0
TeXUtil | tuo file saved
TeXExec | runtime: 4.578125
-

Meanwhile, the evil line with ctrl chars is no longer in the .tui file.
I want to thank Hans and Mojca for the patch and their kindness.

Jose.


On Tue, Jul 14, 2009 at 1:56 PM, Mojca Miklavec 
mojca.miklavec.li...@gmail.com wrote:

  After installing ConTeXt Minimals (the devel version) yesterday,
  I ran the above example with ruby 1.9.1-p129 in Windows
  (both Win 2000 and XP show the problem).
 
  (maybe mojca can patch this in core-uti.mkii: ):
 
 
  % \appendtoks
  %     \immediatewriteutilitycommand{\thisisbytesequence{\testbytesequence}}%
  % \to \everyopenutilities
 
  \let\testbytesequence  \empty % keep this
  \let\thisisbytesequence\gobbleoneargument % keep this

 Done, but untested.

 Mojca



[NTG-context] ConTeXt Minimals parsing of .tui broken with ruby 1.9.1 in Windows

2009-07-14 Thread Jose Augusto
Hello all,

I want to report a problem that is either in ConTeXt or in ruby 1.9.1
(the latest version of ruby). Most probably, the problem has to do with how
ruby handles non-ASCII characters. I have no means of trying Linux, Solaris,
etc... Anyone using ConTeXt with ruby 1.9.1 will probably face it (at least in
Windows :-) )

The problem happens with all files, even with the simple Hello:

\starttext
Hello World
\stoptext

After installing ConTeXt Minimals (the devel version) yesterday,
I ran the above example with ruby 1.9.1-p129 in Windows
(both Win 2000 and XP show the problem).

Meanwhile I compiled and tried several versions of Ruby, and found the
following pattern of problems:

ruby 1.9.1p129 (2009-05-12 revision 23412) [i386-mingw32]   PROBLEM
ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mingw32]     PROBLEM
ruby 1.9.0 (2008-10-04 revision 19669) [i386-mingw32]       No problem
ruby 1.8.7 (2009-06-12 patchlevel 174) [i386-mingw32]       No problem
ruby 1.8.6 (2009-06-08 patchlevel 369) [i386-mingw32]       No problem

So, whatever it is, it is broken with ruby 1.9.1.
All the versions of ruby were compiled in Windows using the mingw toolchain,
with GCC 3.4.5.


 Here is the description of what happens   

When texutil parses the .tui file, I get the following (see comments after
this text output):

-
...
Output written on con-hello1.pdf (1 page, 21759 bytes).
Transcript written on con-hello1.log.
TeXUtil | parsing file con-hello1.tui
TeXUtil | debug 1 jasa #File:0x1271d18
xxx c \thisissectionseparator{-}
xxx c \thisisutilityversion{2008.10.14}
xxx c \thisisbytesequence{?+Ç}
TeXUtil | fatal error in parsing con-hello1.tui
TeXUtil | shortcuts : 0
TeXUtil | expansions: 0
TeXUtil | reductions: 0
TeXUtil | divisions : 0
TeXUtil | loaded files: 0
TeXUtil | temporary files: 0
TeXUtil | commands: 2
TeXUtil | programs: 0
TeXUtil | tuo file saved
TeXExec | runtime: 2.703125


The lines with "debug 1 jasa" and those starting with "xxx" result from
the simple debug code I inserted in texutil.rb to find the
problematic line. The error happens when the following ruby code
is executed (the extra debug lines are marked with # jasa):

-- texutil.rb (snippet, around line 1025) --

    def loaded(filename)
        begin
            tuifile = File.suffixed(filename,'tui')
            if FileTest.file?(tuifile) then
                report("parsing file #{tuifile}")
                if f = open(tuifile) then
                    report("debug 1 jasa #{f}")  # jasa
                    f.each do |line|
                        print "xxx #{line}"  # jasa
                        case line.chomp
                            when /^f (.*)$/o then @plugins.reader('MyFiles',    $1.splitdata)
                            when /^c (.*)$/o then @plugins.reader('MyCommands', [$1])
                            when /^e (.*)$/o then @plugins.reader('MyExtras',   $1.splitdata)
                            when /^s (.*)$/o then @plugins.reader('MySynonyms', $1.splitdata)
                            when /^r (.*)$/o then @plugins.reader('MyRegisters',$1.splitdata)
                            when /^p (.*)$/o then @plugins.reader('MyPlugins',  $1.splitdata)
                            when /^x (.*)$/o then @plugins.reader('MyKeys',     $1.splitdata)
                            when /^r (.*)$/o then # nothing, not handled here
                            else
                                # report("unknown entry #{line[0,1]} in line #{line.chomp}")
                        end
                    end
                    f.close
                end
            else
                report("unable to locate #{tuifile}")
            end
        rescue
            report("fatal error in parsing #{tuifile}")
            @filename = 'texutil'
        else
            @filename = filename
        end
    end

---

From the debugging lines that are printed, it is clear that the line in the
.tui file that triggers the problem is:

c \thisisbytesequence{ ...non-ASCII codes... }

and, precisely, it is the second line of the 'case':

when /^c (.*)$/o then @plugins.reader('MyCommands', [$1])

which processes the .tui line and triggers the 'rescue' clause. So I think
the problem lies in the handling of non-ASCII characters by the latest
version of Ruby.
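
Because the bare 'rescue' in loaded() reports only "fatal error in parsing",
the underlying exception stays hidden. Here is a sketch of the same dispatch
pattern that keeps the exception class and message (parse_lines is a
hypothetical stand-in, not the texutil.rb method; the US-ASCII tag on the
second line is an assumption that imitates the reported situation; ruby 1.9
or later):

```ruby
# Collects the payload of "c ..." lines; on failure returns what was parsed
# so far together with a description of the exception.
def parse_lines(lines)
  commands = []
  lines.each do |line|
    case line.chomp
    when /^c (.*)$/ then commands << $1
    end
  end
  [commands, nil]
rescue StandardError => e
  [commands, "#{e.class}: #{e.message}"]
end

good = "c \\thisisutilityversion{2008.10.14}\n"
bad  = "c \\thisisbytesequence{\xC7}\n".dup.force_encoding('US-ASCII')

commands, error = parse_lines([good, bad])
```

Printing the error value would point at the regexp match itself instead of
leaving only the generic "fatal error" message.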

I don't know the meaning of the \thisisbytesequence line in ConTeXt
or the meaning of those non-ASCII chars. I followed the
@plugins.reader('MyCommands', [$1])
call and figured out that what raises the exception happens before the
@plugins.reader method,
since it is never reached when the  \thisisbytesequence line is