[fricas-devel] DParse draft

Serge D. Mechveliani Tue, 21 Feb 2012 04:42:03 -0800

Dear Axiom/FriCAS developers,

The archive   http://botik.ru/pub/local/Mechveliani/axiQuest/DParse.zip


contains a small  draft  Spad program for the project  DParse  of a  
general fast parsing. 
It was tested under  FriCAS-1.1.5  built from source under  
GNU CLISP 2.48,  Debian Linux:

  unzip DParse.zip
  )compile dPrelude
  )compile uPol
  )compile dParse
  )compile testParse

  )time on
  )clear all
  test1   3000
  )clear all
  test1Fr 3000
  )clear all
  test2   3000
  )clear all
  test2Fr 3000

(it also has timing).

This parser, -- the function  dParseL, --
(1) is for the  String Interface  of  Foo --> Axiom --> Foo
(2) aims to become much faster than the current  parse * interpret
    by FriCAS.

Conserning (1)
--------------
Foo initiates the exchange, it knows everything about the target domain
in Axiom, and can put to the string all the info in terms of Axiom,
while the Axiom part parser  dParseL  initially knows nothing, and has 
only the input string.

Concerning (2)
--------------
a) The performance must be achieved by providing a special syntax in 
  which a data is described closely to the internal Axiom form, so that
most of an usual parsing is not needed. 
* This syntax is not for human reading, and it is not for manual input. 
* On the other hand, the syntax should be safe: not to be based on the 
  Axiom implementation features that are likely to change. 
* Also the parser needs not to do any expensive check for errors: 
  the input is supposed to be correct.

b) If the parser is fast, it can be used in Axiom independently from Foo
   (so far, I do not know of a possible example!).

As the Axiom part does not know the target domain before analysing
the input string, I was forced to set `Type' and `Any' in the argument 
and result of the parser function respectively: 

    LexList   ==> List String
    MaybeType ==> Union(Type, "failed")
    ParseRes  ==> Record(dDom : Type, val : Any, remLexs : LexList)

        dParseL (mbDom : MaybeType, lexemes LexList) : ParseRes

(before this,  String  is broken to lexemes by  lexLots).
`Any' is used as an exception, because the aim is very particular:  
I need from Axiom only a single function  dParseL,  
(this is expected to support the interface usage of several selected 
functions from the Axiom mathematical library).

`Any' is used because   val : T,   for  T = mbDom :: Type,
and  T  is not known initially.
  
Method
------
It is as in
Example.     x^3 + (3/4)*x^2 + 2/5   :: UP(x, Fraction Integer)
is input as
         "(UP x (Fr (I 1) (I 1)) 
                [(Fr (I 1) (I 1)) (Fr (I 3) (I 4)) (Fr (I 2) (I 5))] 
                [3 2 0]
          )" 

-- ( UP  <sampleCoefficient>  <coefficients>  <exponents> ).
dParse  sees the tag lexeme  "UP",  then -- "x",  and puts for the final 
result:  d : UP(x, D),  where  
D : CommutativeRing  needs to be defined further
(dParseL is ready to deal with  D[x]  only for D : CommutativeRing).

Furher, it sees in the sample coefficient the tag  "Fr"  and puts  
D := Fraction D2,  where  D2 : IntegralDomain   
needs to be defined further.
It sees in the sample coefficient  "(" "I" "1" ")"  for the numerator.
And it puts  D2 := Integer,  by the "I" tag.  
After this, the full target concrete domain is defined:
                                            UPol(x, Fraction Integer).
The further denominator is parsed by  dParseL(D2 := Integer, ...),  
because  D2  is already found. Similarly, each coefficient is parsed by 
dParseL(D := Fraction Integer, ...).

Why domain is in the result
---------------------------
To build a fraction coefficient from the found numerator  n  and 
denominator  d,  the target   D := Fraction D2  needs to be set in the 
Spad program for  dParseL:  `(n/d) :: Fraction D2'  
-- or something similar.
A similar construction is expected for other type constructors. 
I understand this as a nature of the Spad language.
At least, I do not know how to avoid this explicit coercion, or a package 
calling. And for this coercion, the domain must be found recursively, 
hence it must be in the result:  the domains are computed dynamically.

Further, each exponent is parsed by repeated applying of  
                          parseInteger = READ_-FROM_-STRING(str) $ Lisp,
because its domain is defined from the "UP" tag.

Currently, only the constructors  Integer, Fraction, UP  

are possible in a composition defining a target domain.
If the parser proves all right, then more constructors will need to 
join.

`Any', `retract', `pretend'
---------------------------
To build a  fraction,  a polynomial,  etc.,  dParseL  needs to use 
operations from a more special categories than  Type.
This contradicts to the signature of  dParse.
So, it applies  coerce/retract  between  Any  and some currently found 
domain  D,  and applies  pretend SuchAndSuchCategory  to  D  
(see the source).
I
a) have seen in the book the words about dynamic types, 
b) observed `Any', `retract' and `pretend' in some materials,
c) noticed that the compiler is often calm, when the program 
   _operates with types as with data_ 
   (the feature new for me, and totally impossible in Haskell). 
So, I wrote this, with an intuitive hope, still being almost sure that 
this building will be ruined. Because such extremal features usually 
need to obey some subtle restrictions, of which retrictions I am not 
currently aware.
For 3-4 days I could not make it working. But then it has occured that 
this particular fail was not due to the domain extremism, but it was 
due to  a) misused Spad syntax, beginner errors,  
        b) may be, some compiler bug,  c) enigmatic compiler messages.
Now it seems to work.

Before this, I have been healthy and was running skies up to hills. 
And with dParse, I have been cussing like a good Russian navy boatswain, 
got ill of radiculitis of sitting with tension. Now starting to recover, 
yesterday tried skiing. 
Programming in an unknown and rich language is harmful.

Waldek  wrote that this my usage of domains will lead to troubles, 
because  T : Type  contradicts to the used functionality of more 
special domains.

Question:  as the examples in  testParse.spad  work,
           where are the troubles?  

(apart from the boatswain ones). Is this program safe?

I shall be grateful to people, if they look into  testParse.spad,
into all the (small) source, and give comments and advice.
First, I need to know where there are unsafe things in  dParse.spad.

If it occurs  "All right, it is valid  (even may be ugly)",
then I shall need to consider optimizations -- many optimizations are 
possible.

Several beginner questions on details are marked in the source by the 
key word  "debug".

Thanks to people for explanations,

------
Sergei
[email protected]

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/fricas-devel?hl=en.

[fricas-devel] DParse draft

Reply via email to