Re: Forgot the code again!

2009-05-07 Thread Tomas Hlavaty
Hi Alex,

> I would suggest to keep a separate version for other purposes. Is
> this acceptable?

yes, no problem;-)

Thank you,

Tomas
-- 
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe


Re: Forgot the code again!

2009-05-07 Thread Alexander Burger
Hi Tomas,

> In the long-term, wouldn't it be better to put the glyph
> related stuff into @lib/glyph.l and keep access to the full glyph <->
> codepoint mapping?

I see, you are thinking of the PDF generator. At the moment, though,
this involves only a few lines of code, and I don't want to blow up
"@lib/ps.l" too much, as it is mission-critical.

I would suggest to keep a separate version for other purposes. Is this
acceptable?

Cheers,
- Alex


Re: Forgot the code again!

2009-05-06 Thread Tomas Hlavaty
Hi Alex,

> the usage of 'idx' in my implementation had a serious flaw: Because the
> Unicode values in "glyphlist.txt" are partially sorted, we get a highly
> imbalanced tree:

Why does inserting and removing using 'idx' require manual rebalancing
using 'balance' in the first place?  I would expect insert and remove
to keep the tree balanced which should be more efficient than
rebalancing the whole tree.

It seems to me that to keep key/value pairs in the tree, the key must
be a symbol holding that value.  What if I want the keys to be numbers
or pairs?
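
One possible answer to the last question, as a minimal sketch: 'idx' compares whole items, so cons pairs with numeric keys can be stored directly and then looked up by their CAR with the built-in 'lup'. The global *Pairs is a hypothetical name used only here, and the sketch assumes 'lup' is available in the release in use.

```picolisp
# *Pairs is a hypothetical name for this illustration
(off *Pairs)

# Store (key . value) pairs; a list with a number in its CAR
# evaluates to itself, so no quoting is needed
(idx '*Pairs (3 . "three") T)
(idx '*Pairs (1 . "one") T)
(idx '*Pairs (2 . "two") T)

# 'lup' searches the CAR elements of the stored pairs
(println (lup *Pairs 2))
```

This avoids the need for a symbol as the key, at the cost of using 'lup' instead of 'idx' for the lookup.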

> Then, to avoid the conflict with multiple values, I simply ignored all
> lines with multiple values, by checking with (member " " L). Does this
> sound reasonable? In this way the first entry in "lib/glyphlist.txt"
> will "win". At least this is the simplest solution ;-)
>
>> I would actually prefer keeping the complete mapping as in the
>> original code I sent.  That would allow searching through the fonts
>> for the right glyph.
>
> I see. For now I would like to stick with the simple version to see how
> it works out. I have to be very careful, as "lib/ps.l" is used heavily
> in several business applications.

Fair enough.  In the long-term, wouldn't it be better to put the glyph
related stuff into @lib/glyph.l and keep access to the full glyph <->
codepoint mapping?

# *Glyph *Codepoint

(in "lib/glyphlist.txt"
   (use (L N)
      (while (setq L (line))
         (unless (= '"#" (car L))
            (setq L (split L '";") N (pack (car L)))
            (for C (mapcar '((X) (char (hex (pack X)))) (split (cadr L) " "))
               (if (idx '*Glyph C)
                  (push (car @) N)
                  (set C (list N))
                  (idx '*Glyph C T) )
               (if (idx '*Codepoint N)
                  (push (car @) C)
                  (set N (list C))
                  (idx '*Codepoint N T) ) ) ) ) ) )

(balance '*Glyph (sort (idx '*Glyph)))
(balance '*Codepoint (sort (idx '*Codepoint)))

(de glyph (C)
   (val (car (idx '*Glyph C))) )

(de codepoint (C)
   (val (car (idx '*Codepoint C))) )

>> I prefer 'glyph' returning NIL instead of ".notdef".
>
> OK, me too. Is it not necessary?

It would be good to keep .notdef in the output postscript file so we
could use (or (glyph X) ".notdef") or (or (car (glyph X)) ".notdef")
in case 'glyph' returns a list.  (I don't think I want to output
".notdef" in pdf.)
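
A minimal sketch of that fallback, kept outside of 'glyph' itself. Both 'demoGlyph' (a stand-in for the real lookup, backed by a one-entry table) and 'psGlyph' are hypothetical names introduced only for this illustration.

```picolisp
# Hypothetical stand-in for the 'glyph' lookup: returns a list of
# glyph names, or NIL for an unknown character
(de demoGlyph (C)
   (cdr (assoc C '(("£" "sterling")))) )

# Hypothetical helper: substitute ".notdef" only when emitting
# PostScript, so the lookup itself can keep returning NIL
(de psGlyph (C)
   (or (car (demoGlyph C)) ".notdef") )

(println (psGlyph "£"))
(println (psGlyph "?"))
```

A PDF generator could then call the lookup directly and handle NIL in its own way.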

Thank you,

Tomas


Re: Forgot the code again!

2009-05-05 Thread Alexander Burger
Hi Tomas,

the usage of 'idx' in my implementation had a serious flaw: Because the
Unicode values in "glyphlist.txt" are partially sorted, we get a highly
imbalanced tree:

   : (depth *PsGlyph)  
   -> 71

I have now rewritten it using 'balance' (and made *Glyph a local
transient symbol).

> I also removed the "Ps" prefix of *PsGlyph as this unicode/glyph
> mapping is not specific to postscript.

   (balance '"*Glyph"
      (sort
         (make
            (in "lib/glyphlist.txt"
               (use (L C)
                  (while (setq L (line))
                     (unless (or (= "#" (car L)) (member " " L))
                        (setq
                           L (split L ";")
                           C (char (hex (pack (cadr L)))) )
                        (set (link C) (pack (car L))) ) ) ) ) ) ) )

With that, the depth is optimal now:

   : (depth (val (loc "*Glyph" glyph)))
   -> 13
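
The effect can be reproduced on a small scale. This is a minimal sketch using only the built-ins 'idx', 'balance', 'depth' and 'range'; the globals *Skew and *Even are hypothetical names that exist only for this illustration.

```picolisp
# *Skew and *Even are hypothetical names, not part of "lib/ps.l"
(off *Skew *Even)

# Ascending inserts: each new key is greater than all previous
# ones, so the tree degenerates into a linked list
(for N (range 1 100)
   (idx '*Skew N T) )

# 'balance' builds a balanced tree from an already sorted list
(balance '*Even (range 1 100))

# Both trees hold the same keys; only their shape differs
(println (depth *Skew))
(println (depth *Even))
```

The first depth should grow with the number of keys, while the second should stay close to the base-2 logarithm of that number.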

Then, to avoid the conflict with multiple values, I simply ignored all
lines with multiple values, by checking with (member " " L). Does this
sound reasonable? In this way the first entry in "lib/glyphlist.txt"
will "win". At least this is the simplest solution ;-)


> I would actually prefer keeping the complete mapping as in the
> original code I sent.  That would allow searching through the fonts
> for the right glyph.

I see. For now I would like to stick with the simple version to see how
it works out. I have to be very careful, as "lib/ps.l" is used heavily
in several business applications.

> I prefer 'glyph' returning NIL instead of ".notdef".

OK, me too. Is it not necessary?


> I see that you define your own encoding and add the euro sign there.

Is this also not necessary? I took this from the PostScript book. I never
understood if and why this font-definition stuff is really needed. How
else could we do it?

> all the other examples with utf strings worked only because
> ghostscript managed to find suitable font with the requested glyph.
> Unfortunately, that did not work for Cyrillic characters.  Did it ever
> work for you with something other than ISOLatin1Encoding?

No. As you know, I explicitly converted it all to IsoLatin1 anyway,
using

   (out (list "@bin/lat1" ...

I have removed this now, and the Euro is probably no longer needed
either.


> it is used in '_ps'.  I think the stringwidth computation won't work
> as expected if there are some glyphs in the string.  What do you
> think?

Yes, this is the main problem.

Practically, it is not critical for me now, because the alignments are
used in my applications only for numeric values, and these won't change.
But it is very unsatisfactory.

I have now released the current changes to the testing version.

Cheers,
- Alex


Re: Forgot the code again!

2009-05-04 Thread Tomas Hlavaty
Hi Alex,

> # *PsGlyph
>
> (in "lib/glyphlist.txt"
>    (use (L C)
>       (while (setq L (line))
>          (unless (= '"#" (car L))
>             (setq
>                L (split L '";")
>                C (char (hex (pack (cadr L)))) )
>             (set C (pack (car L)))
>             (idx '*PsGlyph C T) ) ) ) )
>
> (de glyph (C)
>    (if (idx '*PsGlyph C)
>       (val (car @))
>       ".notdef" ) )
>
> (====)

I see, that's nice.

It seems to me that it still does not work as intended:

lammeemjeeminitialarabic;FEDF FEE4 FEA0
jeemmedialarabic;FEA0

: (setq L (split (chop "lammeemjeeminitialarabic;FEDF FEE4 FEA0") ";"))
-> (("l" "a" "m" "m" "e" "e" "m" "j" "e" "e" "m"
     "i" "n" "i" "t" "i" "a" "l" "a" "r" "a" "b" "i" "c")
    ("F" "E" "D" "F" " " "F" "E" "E" "4" " " "F" "E" "A" "0"))
: (pack (cadr L))
-> "FEDF FEE4 FEA0"
: (hex (pack (cadr L)))
-> 71738804086570656
: (glyph (char (hex "FEDF FEE4 FEA0")))
-> "jeemmedialarabic"

but we should split the cadr and take the first number only:

: (pack (car (split (cadr L) " ")))
-> "FEDF"
: (hex (pack (car (split (cadr L) " "))))
-> 65247

(in "lib/glyphlist.txt"
   (use (L C)
      (while (setq L (line))
         (unless (= '"#" (car L))
            (setq
               L (split L '";")
               C (char (hex (pack (car (split (cadr L) " "))))) )
            (set C (pack (car L)))
            (idx '*PsGlyph C T) ) ) ) )

Also, there is the problem of which value to take for the mapping:

laminitialarabic;FEDF
lammeemjeeminitialarabic;FEDF FEE4 FEA0
lammeemkhahinitialarabic;FEDF FEE4 FEA8

Maybe we should set the value of the transient symbol only if it is
not defined yet or if there is only one codepoint?
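
A minimal sketch of the "only if not defined yet" variant, run against the three sample lines above instead of the real file. The names *Sample and *First are hypothetical and used only for this illustration; the single-codepoint condition is not covered here.

```picolisp
# Sample lines quoted from "lib/glyphlist.txt" in this message
(setq *Sample
   (quote
      "laminitialarabic;FEDF"
      "lammeemjeeminitialarabic;FEDF FEE4 FEA0"
      "lammeemkhahinitialarabic;FEDF FEE4 FEA8" ) )

# *Sample and *First are hypothetical names for this illustration
(off *First)

(for Line *Sample
   (let L (split (chop Line) ";")
      (let C (char (hex (pack (car (split (cadr L) " ")))))
         (unless (idx '*First C)  # Keep the first entry only
            (set C (pack (car L)))
            (idx '*First C T) ) ) ) )

(println (val (car (idx '*First (char (hex "FEDF"))))))
```

With these three lines, only "laminitialarabic" should end up stored for FEDF, because the later entries find the key already present and are skipped.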

I would actually prefer keeping the complete mapping as in the
original code I sent.  That would allow searching through the fonts
for the right glyph.

(in "lib/glyphlist.txt"
   (use (L N)
      (while (setq L (line))
         (unless (= '"#" (car L))
            (setq L (split L '";") N (pack (car L)))
            (for C (mapcar '((X) (char (hex (pack X)))) (split (cadr L) " "))
               (if (idx '*PsGlyph C)
                  (push (car @) N)
                  (set C (list N))
                  (idx '*PsGlyph C T) ) ) ) ) ) )

: (glyph "ů")
-> ("uring")
: (glyph "£")
-> ("sterling")

Or even better:

(in "lib/glyphlist.txt"
   (use (L N)
      (while (setq L (line))
         (unless (= '"#" (car L))
            (setq L (split L '";") N (pack (car L)))
            (for C (mapcar '((X) (char (hex (pack X)))) (split (cadr L) " "))
               (if (idx '*Glyph C)
                  (push (car @) N)
                  (set C (list N))
                  (idx '*Glyph C T) )
               (if (idx '*Codepoint N)
                  (push (car @) C)
                  (set N (list C))
                  (idx '*Codepoint N T) ) ) ) ) ) )

(de glyph (C)
   (val (car (idx '*Glyph C))) )

(de codepoint (C)
   (val (car (idx '*Codepoint C))) )

: (glyph "£")
-> ("sterling")
: (codepoint "sterling")
-> ("£")
: (codepoint "lammeemjeeminitialarabic")
-> ("ﺠ" "ﻤ" "ﻟ")
: (glyph (car (codepoint "lammeemjeeminitialarabic")))
-> ("lammeemjeeminitialarabic" "jeemmedialarabic")

I prefer 'glyph' returning NIL instead of ".notdef".

I also removed the "Ps" prefix of *PsGlyph as this unicode/glyph
mapping is not specific to postscript.

>    (prinl "/PicoEncoding")
>    (prinl "   ISOLatin1Encoding  dup length array  copy")
>    (prinl "   dup 164  /Euro  put")
>    (prinl "def")
>    (prinl "/isoLatin1 {")
>    (prinl "   dup dup findfont  dup length  dict begin")
>    (prinl "  {1 index /FID ne {def} {pop pop} ifelse} forall")
>    (prinl "  /Encoding PicoEncoding def  currentdict")
>    (prinl "   end  definefont")
>    (prinl "} def")

I see that you define your own encoding and add the euro sign there.  I
think all the other examples with utf strings worked only because
ghostscript managed to find a suitable font with the requested glyph.
Unfortunately, that did not work for Cyrillic characters.  Did it ever
work for you with something other than ISOLatin1Encoding?

> (de _ps ("L")
>    ("?ff")
>    (setq "L" (escPs "L"))
>    (cond
>       ((not "H")
>          (prin 0) )
>       ((=0 "H")
>          (prin "*DX" " (" "L" ") stringwidth pop sub 2 div") )
>       (T (prin "*DX" " (" "L" ") stringwidth pop sub")) )
>    (prin
>       " "
>       (-
>          "*PgY"
>          (cond
>             ((not "V")
>                (inc '"*Pos" "*Size") )
>             ((=0 "V")
>                (setq "*Pos" (+ (/ "*Size" 4) (/ "*DY" 2))) )
>             (T (setq "*Pos" "*DY")) ) ) )
>    (prin " moveto ")
>    (?ul1)
>    (prinPs "L")
>    (?ul2) )
>
> (de escPs (L)
>    (mapcan
>       '((C)
>          (if (sub? C "\\()")
>             (list "\\" C)
>             (list C) ) )
>       L ) )
>
> (de prinPs (Lst)
>    (while Lst
>       (if (>= `(char 127) (car Lst))
>          (prog
>             (prin "(" (pop 'Lst)