Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Hans Hagen


Jose Augusto wrote:

Hi all,

Ok, here it goes. Attached are the files used in the test.

The problem as reported in the previous email used
the file with the offending chars wrapped in a main file, which was just:

\starttext
\input zzz.tex
\stoptext

That is, the offending chars were in zzz.tex.

In that example I noticed the error because the cross-refs
in the equation numbering were not working.
The parsing of the .tui file by ruby 1.9.1 failed. Then I saw the errors.


ruby 1.9 internally is no longer 8 bit clean i.e. there is always an 
encoding (file as well as internal); there is no way to enforce this 
(there is the -E option but that is useless for 1.8)


i now open some files explicitly in binary mode; maybe that helps; i 
have no clue what happens with string manipulations later on
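A minimal sketch of the binary-mode approach, assuming a made-up sample line and hypothetical helper names (read_lines_binary, command_payloads); the thread only states that some files are now opened in binary mode, not how the scripts do it. Strings read this way are tagged as binary, so the regex match never has to validate the bytes against a character encoding:

```ruby
require "tempfile"

# Hypothetical helper: read a file so that every line is tagged as binary.
def read_lines_binary(path)
  # "rb" gives ASCII-8BIT (binary) strings; no byte can be "invalid".
  File.open(path, "rb") { |f| f.readlines }
end

# Collect the payload of every "c ..." line, similar to texutil's case/when.
def command_payloads(lines)
  lines.map { |line| line.chomp[/^c (.*)$/, 1] }.compact
end

# Made-up sample data: 0xC3 0xE7 0xEA are the ISO-8859-1 bytes for "Ãçê".
sample = "c \\listentry{subsection}{Title: \xC3\xE7\xEA}\nx ignored\n".b

Tempfile.create("sample-tui") do |f|
  f.binmode
  f.write(sample)
  f.flush
  lines = read_lines_binary(f.path)
  puts lines.first.encoding == Encoding::BINARY  # true
  puts command_payloads(lines).length            # 1
end
```

The trade-off is that later string operations see raw bytes rather than characters, which is the uncertainty mentioned above.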


i always liked ruby but such fundamental changes (encoding, dropping 
functions etc) without renaming the program are a showstopper for me as 
one cannot predict what will be on the user's system


it looks like i have to convert the texutil part to lua (takes a few 
days and since i mostly use luatex it has a low priority)


i uploaded a beta for testing

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
 tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-
___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___

Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Jose Augusto

Hi all,

I think I solved the problem. At least for my current errors...
I read the following web article about string encoding in ruby 1.9 and up:

  http://blog.grayproductions.net/articles/ruby_19s_string

With that info at hand, I made two brute-force trial patches (read the
above article to see why I call them brute force :-) ) in two of the ruby
context files where problems were arising (original line numbers are shown):

### .../scripts/context/ruby/base/tex.rb ###
(946)  case str.chomp
===>
       str = str.force_encoding("ISO-8859-1") if RUBY_VERSION >= "1.9"
       case str.chomp

### .../scripts/context/ruby/base/texutil.rb ###
(1033) case line.chomp
===>
       line = line.force_encoding("ISO-8859-1") if RUBY_VERSION >= "1.9"
       case line.chomp

The error is due to the fact that, by default, ruby 1.9 considers strings as
US-ASCII and complains when finding chars not in that encoding.
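The failure and the effect of the patch can be sketched as follows. The sample line and the helper names (us_ascii_line, try_match) are made up for illustration, and the US-ASCII tag is applied by hand here to mimic what the traceback suggests ruby 1.9.1 did on that system. Whether the match raises can depend on the Ruby version, so the helper reports either outcome:

```ruby
# 0xC3 0xE7 0xEA are the ISO-8859-1 bytes for "Ãçê"; the US-ASCII tag is
# an assumption made to imitate the situation in the traceback.
def us_ascii_line
  "c Title: \xC3\xE7\xEA".b.force_encoding("US-ASCII")
end

# Returns the captured payload, or the error message if matching raises.
def try_match(str)
  str =~ /^c (.*)$/ ? $1 : nil
rescue ArgumentError => e
  "error: #{e.message}"
end

broken = us_ascii_line
puts broken.valid_encoding?   # false: bytes above 0x7F are not US-ASCII
puts try_match(broken).inspect

# force_encoding only changes the label, never the bytes.
fixed = us_ascii_line.force_encoding("ISO-8859-1")
puts fixed.valid_encoding?    # true: every byte is a valid ISO-8859-1 char
puts try_match(fixed).encoding
```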

I don't know how to solve the problem for people writing in an
encoding other than ISO-8859-1. I tried the above with UTF-8
instead of ISO-8859-1 and it didn't work.
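One likely reason the UTF-8 variant failed is that ISO-8859-1 bytes above 0x7F rarely form valid UTF-8 sequences, so relabelling them as UTF-8 leaves the string invalid. A heuristic sketch for mixed inputs, assuming a hypothetical helper label_bytes that is not part of the context scripts: try UTF-8 first and fall back to ISO-8859-1 only when the bytes are not valid UTF-8.

```ruby
# Hypothetical helper: pick a label for raw bytes without converting them.
def label_bytes(raw, fallback = Encoding::ISO_8859_1)
  candidate = raw.dup.force_encoding(Encoding::UTF_8)
  return candidate if candidate.valid_encoding?
  raw.dup.force_encoding(fallback)
end

latin1 = "\xC3\xE7\xEA".b            # "Ãçê" as ISO-8859-1 bytes
utf8   = "\u00C3\u00E7\u00EA".b      # "Ãçê" as UTF-8 bytes

puts label_bytes(latin1).encoding    # ISO-8859-1
puts label_bytes(utf8).encoding      # UTF-8
puts label_bytes(latin1).encode("UTF-8") == label_bytes(utf8)  # true
```

This is only a guess at the author's intent for other encodings; a byte sequence that happens to be valid in both encodings will be labelled UTF-8.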

Finally, I don't know if there are any other places (at least case
expressions) in the context ruby scripts where the problem might also show up.

Kind regards,

J. Augusto


On Mon, Aug 10, 2009 at 4:15 AM, Jose Augusto jasaugu...@gmail.com wrote:

 Hi all,

 Ok, here it goes. Attached are the files used in the test.

 The problem as reported in the previous email used
 the file with the offending chars wrapped in a main file, which was just:

 \starttext
 \input zzz.tex
 \stoptext

 That is, the offending chars were in zzz.tex.

 In that example I noticed the error because the cross-refs
 in the equation numbering were not working.
 The parsing of the .tui file by ruby 1.9.1 failed. Then I saw the errors.

 But then I made a single context file, which is attached
 with the correct results (tui, tuo, pdf), obtained with ruby 1.8.7.

 However, when I run ruby 1.9.1 at patchlevel 129 (the latest one) on this
 single tex file (attached), even the first pass doesn't work!
 Here is the result (on Windows XP), which proves ruby 1.9.1 doesn't like
 the non-US-ASCII chars :-)

 F:\ANOS\ano09-10-pen\NotasProcSinal>texexec test1.tex
 TeXExec | processing document 'test1.tex'
 TeXExec | no ctx file found
 D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:946:in `===': invalid byte sequence in US-ASCII (ArgumentError)
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:946:in `scantexcontent'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1907:in `processfile'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1143:in `block (2 levels) in processtex'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1133:in `timedrun'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1142:in `block in processtex'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1139:in `each'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/tex.rb:1139:in `processtex'
   from D:/Context/tex/texmf-context/scripts/context/ruby/texexec.rb:63:in `process'
   from D:/Context/tex/texmf-context/scripts/context/ruby/texexec.rb:53:in `main'
   from D:/Context/tex/texmf-context/SCRIPTS/CONTEXT/ruby/base/switch.rb:133:in `execute'
   from D:/Context/tex/texmf-context/scripts/context/ruby/texexec.rb:787:in `main'

 Thanks all for your interest,

 Kind regards

 J. Augusto


 On Sun, Aug 9, 2009 at 8:57 PM, Hans Hagen pra...@wxs.nl wrote:

 Jose Augusto wrote:

 when /^c (.*)$/o then @plugins.reader('MyCommands', [$1])


 what if you remove the o (/o)

 can you find out what changed between 1.8 and 1.9 ... actually 1.9 is the
 stepping stone to 2.0 and 2 versions can be incompatible to 1 versions

 also, can you make a test file so that we can see if there's a platform
 dependency?



Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Jose Augusto

Hi Hans,

I just sent a mail with a possible patch, before I read this answer from you :-)
As I say there, the patches work (at least for me). I updated context
mkii a few hours ago, so I don't know if the betas you mentioned have
already been installed...

I hope the proposed patches are helpful...
Thanks very much for your answer.

J. Augusto


Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Hans Hagen


Jose Augusto wrote:

Hi Hans,

I just sent a mail with a possible patch, before I read this answer from you :-)
As I say there, the patches work (at least for me). I updated context
mkii a few hours ago, so I don't know if the betas you mentioned have
already been installed...

I hope the proposed patches are helpful...


your patch will not work with ruby < 1.9 so if my patch (opening files 
in rb mode) works ok that's more robust;


another option is to patch texmfstart.rb

#!/usr/bin/env ruby
#encoding: ASCII-8BIT



Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Jose Augusto

Hi Hans,

The patch I proposed also works with ruby versions below 1.9 (e.g. ruby 1.8.7)!
The force_encoding() method is used only if RUBY_VERSION >= "1.9".
If the scripts are executed by ruby 1.8 or an earlier version, the current
line of code (e.g. 'case line.chomp') is left unchanged.

Also, I verified the patch with ruby 1.8.7 and with 1.9.1, and it worked in
both cases.

The patch does, however, slow down processing (the if is evaluated
for every line of the files that is parsed, and this could probably be
optimized...)
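One possible way to avoid the per-line version test, sketched with a hypothetical helper name (relabel_latin1) that does not exist in the context scripts: decide once, at load time, whether strings support force_encoding, and define the helper accordingly. Feature detection also sidesteps comparing version strings.

```ruby
# Hypothetical helper, defined once at load time instead of testing
# RUBY_VERSION for every line.
if String.method_defined?(:force_encoding)
  # Ruby 1.9 and later: relabel the bytes without converting them.
  def relabel_latin1(str)
    str.force_encoding("ISO-8859-1")
  end
else
  # Ruby 1.8: strings carry no encoding, so there is nothing to do.
  def relabel_latin1(str)
    str
  end
end

# Made-up sample line (String#b needs Ruby 2.0+; it is used only to build
# the sample bytes here).
line = "c Title: \xC3\xE7\xEA\n".b
case relabel_latin1(line).chomp
when /^c (.*)$/ then puts "command line"
else puts "other line"
end
```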

Meanwhile I don't think that the magic string
# encoding: ASCII-8BIT
solves the problem. This string indicates that the script is written in
ASCII-8BIT, but when reading the strings from the .tex or .tui files ruby 1.9.1
considers them as US-ASCII regardless of the encoding declared in # encoding: ...

I introduced # encoding: ASCII-8BIT in texmfstart.rb, tex.rb and texutil.rb
and the problem didn't disappear :-(
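The distinction argued here can be checked with a small sketch. As far as the Ruby documentation describes it, the magic comment sets the encoding of literals in that source file, while text read from a file in the default mode is tagged with the default external encoding (or the default internal one, if set). The helper name default_read_encoding is made up for illustration:

```ruby
require "tempfile"

# Hypothetical helper: report the encoding tag of text read in default mode.
def default_read_encoding(path)
  File.read(path).encoding
end

Tempfile.create("sample") do |f|
  f.write("plain ascii\n")
  f.flush
  puts "source literals:  #{__ENCODING__}"               # set by the magic comment, if any
  puts "default external: #{Encoding.default_external}"  # set by locale or -E
  puts "file contents:    #{default_read_encoding(f.path)}"
end
```

The three values need not agree, which would explain why the same magic comment behaves differently on two systems with different locales.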

Of course I may be wrong. But the experiments I did make me think this way.
Also, I don't have Linux at my disposal (I mean, with context installed) and
the behavior there is perhaps different...

Kind regards and thank you very much.

J. Augusto







Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Hans Hagen


Jose Augusto wrote:


Meanwhile I don't think that the magic string
# encoding: ASCII-8BIT
solves the problem. This string indicates that the script is written in
ASCII-8BIT, but when reading the strings from the .tex or .tui files ruby 1.9.1
considers them as US-ASCII regardless of the encoding declared in # encoding: ...


not when opened as 'rb' (which i do in the latest texexec.rb) so i 
wonder why that does not work at your place


(http://blog.nuclearsquid.com/writings/ruby-1-9-encodings)

i run ruby 1.8.6 (and on a couple of servers even older versions and i'm 
not going to touch ruby on these machines (i don't want to patch scripts 
that are supposed to run another 5-10 years) but i might update context 
and texexec)



I introduced # encoding: ASCII-8BIT in texmfstart.rb, tex.rb and texutil.rb
and the problem didn't disappear :-(


hm, it worked here


Of course I may be wrong. But the experiments I did make me think this way.
Also, I don't have Linux at my disposal (I mean, with context installed) and
the behavior there is perhaps different...


that's my biggest fear ... introducing more problems


Hans


Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Jose Augusto

Hi Hans,

I just ran ruby 1.8.6 and the force_encoding() patch worked well.

I have just upgraded with --context=current. The banner in texexec.rb is
banner = ['TeXExec', 'version 6.2.1', '1997-2009', 'PRAGMA ADE/POD']
and the date of this script (after updating) is 10-04-2009 (that's April...)

I'm running mkii. How do I get the mkii beta scripts, such as the texexec.rb you mention?

All my rubys are compiled from the box with mingw in windows
(2000 or XP, on 3 different machines). Of course the encoding
handling is different in Linux, Windows (and DOS prompts, for that matter),
so there is probably different behavior in the ruby/context/tex interaction with
chars on Linux and Windows boxes...

Thx

Jose


Re: [NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-10 Thread Hans Hagen


Jose Augusto wrote:

Hi Hans,

I ran just now ruby 1.8.6 and the force_encoding() patch worked well.


yes, but if we can avoid adapting all those strings ... i'm pretty sure 
that if we follow that route we have to patch a lot


also keep in mind that in 1.9 there are several encodings (external and 
internal) so setting up a roundtrip using the string properties involves 
more patches
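The external/internal pair can be illustrated with a short sketch. It assumes, purely for the example, that the file holds ISO-8859-1 bytes and that UTF-8 is wanted in memory; read_transcoded is a hypothetical helper, not something in the context scripts. The open mode "r:EXTERNAL:INTERNAL" asks Ruby to decode from the first encoding and hand back strings in the second:

```ruby
require "tempfile"

# Hypothetical helper: decode `external` bytes into `internal` strings on read.
def read_transcoded(path, external = "ISO-8859-1", internal = "UTF-8")
  File.open(path, "r:#{external}:#{internal}") { |f| f.read }
end

Tempfile.create("latin1") do |f|
  f.binmode
  f.write("Title: \xC3\xE7\xEA\n".b)   # ISO-8859-1 bytes for "Ãçê"
  f.flush
  text = read_transcoded(f.path)
  puts text.encoding          # UTF-8
  puts text.valid_encoding?   # true
end
```

A full roundtrip would need the matching conversion on every write as well, which is the extra patching referred to above.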



I have just upgraded with --context=current. The banner in texexec.rb is
banner = ['TeXExec', 'version 6.2.1', '1997-2009', 'PRAGMA ADE/POD']
and the date of this script (after updating) is 10-04-2009 (that's April...)

I'm running mkii. How do I get the mkii beta scripts, such as the texexec.rb you mention?


it depends: if (on linux) texexec is a big file then you need to copy 
texexec.rb to texexec, else if it's a stub it should just work (in that 
case texmfstart will start texexec.rb)



All my rubys are compiled from the box with mingw in windows
(2000 or XP, in 3 different machines). Of course the encoding
thing is different in Linux, Windows (and DOS prompts, for the matter),
so there is probably different behavior in ruby/context/tex interaction with


on windows there should be a stub (something like texexec.cmd == ruby 
texmfstart texexec ...)


Hans


[NTG-context] Ruby 1.9.1 and non-ascii char parsing in .tui file

2009-08-08 Thread Jose Augusto

Hi all,

A few weeks ago I reported a problem with ruby 1.9.1, which
was solved by removing the offending .tui line (Mojca and Hans,
AFAIR). The problem was related to the existence of non-ascii
chars in the .tui file. Sadly it strikes again, now when chars with
accents appear in titles (sections, subsections, etc...).

The parsing of the line marked at the end of this message, from a .tui file,
fails in ruby 1.9.1, but not in ruby 1.8.7. The error which is returned is
also shown. If I remove the chars with accents from the section title all goes
well. I'm using Mkii (--context=current).

One of the advantages of ruby 1.9 over 1.8 is that it is 3 times faster...
However, ruby made lots of changes in string manipulation and storage
when moving from 1.8 to 1.9, and that must be the source of the problem.

I tracked the error to texutil.rb, line 1035:

when /^c (.*)$/o then @plugins.reader('MyCommands', [$1])

but then I got lost in the Classes/Modules jungle :-) in that script.
Perhaps it is this procedure, at line 403 of texutil.rb, which triggers
the error?

def MyCommands::reader(logger,data)
    @@commands.push(data.shift+data.collect do |d| "\{#{d}\}" end.join)
end
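To see what this procedure does with its input, here is a stand-alone sketch that reads the block body as the double-quoted string "\{#{d}\}" and drops the class variable; build_command is a made-up name. It only concatenates strings: the first element is kept as is and every further element is wrapped in braces, so with the single-element array [$1] passed from line 1035 it returns the captured text unchanged.

```ruby
# Stand-in for the body of MyCommands::reader, minus the @@commands push.
def build_command(data)
  data = data.dup   # the original shifts its argument in place
  # In a double-quoted string "\{" is just "{", so each d becomes "{d}".
  data.shift + data.collect { |d| "\{#{d}\}" }.join
end

puts build_command(["\\mainreference{}{a}"])   # \mainreference{}{a}
puts build_command(["\\cmd", "x", "y"])        # \cmd{x}{y}
```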

Thanks in advance for your support.

If I can help in solving the problem, please point me to the task. I have some
experience with ruby (I started using it when the 1st edition of the pickaxe
book was published, around 2001) and with perl. But not with Lua :-)... Although
I have already read quite a lot of Roberto's Lua book, I haven't started coding
in Lua yet :-)

Kind Regards

J. Augusto

## TUI file and triggered error ##

## .tui snippet

c \mainreference{}{a}{2--0-1-1-0-0-0-0--1}{1}{1.1}
c \listentry{subsection}{3}{1.1.1}{Title with accents: Ãçê}{2--0-1-1-1-0-0-0--1}{1}
c \mainreference{}{b}{2--0-1-1-1-0-0-0--1}{1}{1.2}

### error

pdfTeX warning: pdftex.exe: no GlyphToUnicode entry has been inserted yet!
Output written on test1.pdf (1 page, 72793 bytes).
Transcript written on test1.log.
TeXUtil | parsing file test1.tui
TeXUtil | fatal error in parsing test1.tui
TeXUtil | check loading of file 'test1', begin/end problem
TeXUtil | shortcuts : 169
TeXUtil | expansions: 308
#
