Re: Grammar - Class creation

2001-05-29 Thread Leon Brocard

Paul Makepeace sent the following bits through the ether:

> Are there modules/frameworks that exist to create classes from a
> grammar spec (e.g. EBNF)?

Well, Parse::RecDescent[1] probably does what you want. Check out the
autotree directive.

Parsing is fun. Let's try and parse everything!

[1] Or Parse::Yapp, but I betcha it'll be more work
-- 
Leon Brocard.............................http://www.astray.com/
Iterative Software...........http://www.iterative-software.com/

... Careful. We don't want to learn from this. - Calvin



Re: Grammar - Class creation

2001-05-29 Thread Simon Wistow

Paul Makepeace wrote:
 
> Are there modules/frameworks that exist to create classes from a
> grammar spec
> (e.g. EBNF)? Restating, I'm envisaging something where the input is a
> grammar and the output is a class or set of classes that provides
> parsing capabilities and validating accessor methods.
>
> Immediate application is feeding the PDF spec

I started looking into this when I first started doing the SWF stuff ...
a kind of YACC for file formats. Describe it in a BNF-a-like language
and then run a program over it et voila - you have a library for reading
and creating that file format (he says, glossing over lots of
complications and gotchas). Write that program for each different
language and lots of different languages/systems have access to lots of
different file formats and every time a format changes the spec gets
updated and everyone runs their grammar-library programs again and
everybody's got full functionality again.
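
A minimal sketch of the "describe the format once, get a reader and a writer" idea, in Python purely for illustration. The header layout, `HEADER_SPEC` and `make_codec` are all made up for this example; it only handles fixed-width fields and none of the gotchas mentioned above.

```python
import struct

# Hypothetical field spec for a made-up header: (name, struct format code).
HEADER_SPEC = [
    ("magic", "4s"),    # 4 raw bytes
    ("version", "B"),   # unsigned 8-bit integer
    ("length", "I"),    # unsigned 32-bit integer
]


def make_codec(spec, byte_order="<"):
    """Build a (parse, build) pair of functions from one field spec."""
    layout = struct.Struct(byte_order + "".join(code for _, code in spec))
    names = [name for name, _ in spec]

    def parse(data):
        # Unpack the leading bytes into a name -> value mapping.
        return dict(zip(names, layout.unpack(data[:layout.size])))

    def build(record):
        # Pack the values back in the order the spec declares them.
        return layout.pack(*(record[name] for name in names))

    return parse, build


parse_header, build_header = make_codec(HEADER_SPEC)
raw = build_header({"magic": b"DEMO", "version": 5, "length": 1234})
record = parse_header(raw)
```

When the format changes, only `HEADER_SPEC` needs editing; both directions are regenerated from it.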

I think this article talks about it some
http://advogato.org/article/59.html

Like I said, I looked into it and didn't find anything and didn't have the
time/experience/inclination to start doing something myself - too many
gotchas :(

-- 
simon wistow                           wireless systems coder
i think, i said i think this is our fault.



Re: Grammar - Class creation

2001-05-29 Thread Marcel Grunauer

On Tuesday, May 29, 2001, at 11:18  AM, Simon Wistow wrote:

> I started looking into this when I first started doing the SWF stuff ...
> a kind of YACC for file formats. Describe it in a BNF-a-like language
> and then run a program over it et voila - you have a library for reading
> and creating that file format (he says, glossing over lots of
> complications and gotchas). Write that program for each different
> language and lots of different languages/systems have access to lots of
> different file formats and every time a format changes the spec gets
> updated and everyone runs their grammar-library programs again and
> everybody's got full functionality again.

As Leon points out, Parse::RecDescent is One Way To Do It. However, it's
mostly used to parse some input according to some grammar and to construct
the desired result directly. If you need a different result from the same
grammar, you have to specify the grammar and actions again.

It might be an idea to have grammars packed up in modules (i.e., reusable)
and make the actions callbacks (some sort of autoaction might do that),
much like HTML and XML parsers do it. I imagine lots of little Parse::*
modules (Parse::Regex, Parse::PDF, Parse::RPN etc.).

Is that a) a good idea, b) a bad idea, c) common practice anyway and I just
haven't found it?
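
To make the "grammar in a module, actions as callbacks" idea concrete, here is a rough sketch in Python (not Perl, and not how any existing Parse::* module works). The tiny RPN grammar lives in `parse_rpn`, a hypothetical function; two different callers get different results from it without restating the grammar.

```python
import operator
import re

# The "grammar": whitespace-separated numbers and the four basic operators.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([-+*/]))")


def parse_rpn(text, on_number, on_operator):
    """Walk an RPN expression, reporting each token to the caller's callbacks."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        number, symbol = match.groups()
        if number is not None:
            on_number(float(number))
        else:
            on_operator(symbol)
        pos = match.end()


OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(text):
    """First set of actions: compute the value of the expression."""
    stack = []

    def apply(symbol):
        right, left = stack.pop(), stack.pop()
        stack.append(OPS[symbol](left, right))

    parse_rpn(text, stack.append, apply)
    return stack.pop()


def to_infix(text):
    """Second set of actions: rewrite the same input as an infix string."""
    stack = []

    def apply(symbol):
        right, left = stack.pop(), stack.pop()
        stack.append(f"({left} {symbol} {right})")

    parse_rpn(text, lambda n: stack.append(f"{n:g}"), apply)
    return stack.pop()
```

Both `evaluate("3 4 + 2 *")` and `to_infix("3 4 + 2 *")` reuse the same tokenising rules; only the callbacks differ.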

Marcel

--
$ perl -we time
Useless use of time in void context at -e line 1.



Re: Grammar - Class creation

2001-05-29 Thread Leon Brocard

Marcel Grunauer sent the following bits through the ether:

> Is that a) a good idea, b) a bad idea, c) common practice anyway and
> I just haven't found it?

japhy's apparently kinda doing this:
http://search.cpan.org/doc/PINYAN/YAPE-Regex-3.01/extra/YAPE.pm

  The YAPE hierarchy of modules is an attempt at a unified means of
  parsing and extracting content. It attempts to maintain a generic
  interface, to promote simplicity and reusability. The API is
  powerful, yet simple. The modules do tokenization (which can be
  intercepted) and build trees, so that extraction of specific nodes
  is doable.

Other programming languages need code generators to spit out
libraries. Perl doesn't need to do this as it's dynamic, baby. This is
why Parse::RecDescent / Template Toolkit are so groovy, yeah.

[Of course, the reason nobody's done this before is that everyone
wants a slightly different interface...]
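
The "no code generator needed in a dynamic language" point can be sketched in Python, which is dynamic in the same sense. This is only an illustration under assumptions: `make_class` and the field-to-validator spec format are invented here, and a real grammar-driven version would derive the spec from the grammar rather than from a hand-written dict.

```python
def make_class(name, fields):
    """Create a class at runtime with one validating property per field.

    `fields` maps a field name to a validator callable (hypothetical format).
    """

    def make_property(field, validator):
        slot = "_" + field  # where the validated value is actually stored

        def getter(self):
            return getattr(self, slot)

        def setter(self, value):
            if not validator(value):
                raise ValueError(f"invalid value for {field}: {value!r}")
            setattr(self, slot, value)

        return property(getter, setter)

    def init(self, **values):
        # Assigning through the properties validates every initial value.
        for field in fields:
            setattr(self, field, values[field])

    namespace = {"__init__": init}
    for field, validator in fields.items():
        namespace[field] = make_property(field, validator)
    return type(name, (object,), namespace)


def non_negative_int(value):
    return isinstance(value, int) and value >= 0


Rectangle = make_class("Rectangle", {
    "width": non_negative_int,
    "height": non_negative_int,
})
box = Rectangle(width=3, height=4)
```

No source file is written out; the class exists only because the spec was interpreted at run time.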

Leon
-- 
... We're not worthy! We're not worthy!



Re: Grammar - Class creation

2001-05-29 Thread Matthew Byng-Maddick

On Tue, May 29, 2001 at 02:27:40AM -0700, Paul Makepeace wrote:
> Anyway, PDF is easier re: packing/endianness since it's a text format!
> The only time you get binary data is for unencoded streams (which they advise
> against, although it's permitted, for example PDFlib generates it)
> like a << /Filter /FlateDecode >> stream <zlib-deflated-data> endstream

Not quite, it's a human-readable binary format. All the indexes rely on
offsets in the file, and the various fun with the newline conventions
mean that in my book, it's a binary format, you can't just go along and
edit it with a text editor, because it won't work anymore. Sure, the only
bits that are non-*ASCII* are the streams, but even so...
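
A small demonstration of why byte offsets make the file fragile, sketched in Python. The file below is a toy stand-in, not valid PDF; `assemble` and `offsets_valid` are hypothetical helpers. It records where each object starts, then simulates an editor that re-saves the file with different line endings.

```python
LINES = [b"%TOY-1.0", b"1 0 obj", b"(hello)", b"endobj",
         b"2 0 obj", b"(world)", b"endobj"]


def assemble(lines, eol):
    """Join lines with `eol`; return the bytes and each object's byte offset."""
    data = bytearray()
    offsets = []
    for line in lines:
        if line.endswith(b" obj"):
            offsets.append(len(data))  # offset of the "N 0 obj" keyword line
        data += line + eol
    return bytes(data), offsets


def offsets_valid(data, offsets):
    """Check that every recorded offset still points at its object header."""
    for number, offset in enumerate(offsets, start=1):
        header = f"{number} 0 obj".encode("ascii")
        if data[offset:offset + len(header)] != header:
            return False
    return True


original, offsets = assemble(LINES, b"\n")
# What an editor that "normalises" line endings to CRLF would write back:
resaved = original.replace(b"\n", b"\r\n")
```

`offsets_valid(original, offsets)` holds, while the same offsets applied to `resaved` land in the wrong place because every earlier line grew by one byte.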

MBM




Re: Grammar - Class creation

2001-05-29 Thread Paul Makepeace

On Tue, May 29, 2001 at 10:45:59AM +0100, Leon Brocard wrote:
> [Of course, the reason nobody's done this before is that everyone
> wants a slightly different interface...]

Surely it should be possible to specify the underlying *functionality*
of the system and then have a perl source filter (or other component of
perl's mind-addling n-tier parsing architecture) that
rewrites/re-presents the interface in the API style du jour...

Paul



Re: Grammar - Class creation

2001-05-29 Thread Marcel Grunauer

On Tuesday, May 29, 2001, at 11:49  AM, Paul Makepeace wrote:

> Surely it should be possible to specify the underlying *functionality*
> of the system and then have a perl source filter (or other component of
> perl's mind-addling n-tier parsing architecture) that
> rewrites/re-presents the interface in the API style du jour...

Separate the end-user API from the parser's action API. I.e., each parser
module specifies the grammar and uses a fixed style of action that creates
something like an abstract syntax tree. Then have several end-user APIs to
access that AST. I.e., one that traverses the tree and makes callbacks for
each node. Or one that uses an XPath-like syntax to get at certain nodes in
the tree. Or, in the spirit of the model-view controller, make it a tree
model and have various tree viewers (Leon will recognise this idea; as I've
been going on about this since before the German Perl Workshop...) so you
could also 'view' the tree as an XML document, as a Data::Dumper-like
output, as a directory hierarchy (where nodes are directories and leaf
entries are files), or whatever one's sick mind can come up with.
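
A rough Python sketch of that separation: one fixed tree model, several independent ways of getting at it. The `Node` shape and the three view functions (`walk`, `find`, `to_xml`) are assumptions made for this example, and `find` understands only plain `a/b/c` paths, nothing like real XPath.

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape


@dataclass
class Node:
    """The one tree model every parser would produce (hypothetical shape)."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


def walk(node, callback, depth=0):
    """Callback view: call callback(node, depth) for each node, parents first."""
    callback(node, depth)
    for child in node.children:
        walk(child, callback, depth + 1)


def find(node, path):
    """Path view: 'sum/num' returns the num children of a sum root."""
    head, _, rest = path.partition("/")
    if node.kind != head:
        return []
    if not rest:
        return [node]
    found = []
    for child in node.children:
        found.extend(find(child, rest))
    return found


def to_xml(node):
    """XML view: render the same tree as nested elements."""
    inner = escape(node.text) + "".join(to_xml(c) for c in node.children)
    return f"<{node.kind}>{inner}</{node.kind}>"


tree = Node("sum", children=[Node("num", "1"), Node("num", "2")])
kinds = []
walk(tree, lambda node, depth: kinds.append((depth, node.kind)))
```

Adding another view (a dumper, a directory layout) means adding another function over `Node`; the parser side never changes.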

Marcel

--
my int ($x, $y, $z, $n); $x**$n + $y**$n = $z**$n is insoluble if $n > 2;
I have discovered a truly remarkable proof which this signature is too
short to contain.  (20 Aug 2001: Pierre de Fermat's 400th birthday)



Re: Grammar - Class creation

2001-05-29 Thread Paul Makepeace

On Tue, May 29, 2001 at 10:48:54AM +0100, Matthew Byng-Maddick wrote:
> Not quite, it's a human-readable binary format. All the indexes rely on
> offsets in the file, and the various fun with the newline conventions
> mean that in my book, it's a binary format, you can't just go along and
> edit it with a text editor, because it won't work anymore. Sure, the only

Yeah, a dumb editor for sure -- if the editor can figure out the
line-ending conventions from the magic number I don't see a problem.

Hmm, I suppose if a pathological editor mixed cr/lfs then that scheme
wouldn't work.

But yeah, the line-ending (esp. in the xref table[1]) is a bit bizarre.
PDF seems optimised for the PDF writing application, placing the burden on
the reader.

What possible point is there in specifying the length of a stream object
as an indirect reference? The spec says (in essence) so the writing
application can write out the length of variable data without knowing a
priori its length (to use an
e.g.[2] from before, a deflated object). Now, surely the point of having a
length spec at all is so that the reader can a) allocate a chunk of
memory in advance b) know when the endstream isn't part of the
stream itself. What c) am I missing where the length is useful?

Hmm, I suppose using the xref it could seek() for the length object,
read it and then seek() back to the stream. Like I say, no regard for
the woes of the reader, or something being piped from HTTP...

[1] [For those that don't know, the cross-ref table consists mostly of
object IDs and byte offsets. Each xref line is 20 bytes, in which a
space precedes the line-ending if the line-ending isn't a full crlf
(i.e. is either(!) of cr or lf). *toke, toke*]

[2] Bollocks, autoformat did it again.
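
The fixed 20-byte xref line from footnote [1] can be checked with a short Python sketch. The layout assumed here is the usual one: a 10-digit offset, a space, a 5-digit generation number, a space, `n` or `f`, then a two-byte line ending (space + cr, space + lf, or crlf). `parse_xref_entry` is a hypothetical helper, not part of any PDF library.

```python
import re

# 10-digit offset, 5-digit generation, in-use flag, then a 2-byte line ending.
XREF_ENTRY = re.compile(rb"(\d{10}) (\d{5}) ([nf])( \r| \n|\r\n)")


def parse_xref_entry(entry):
    """Return (offset, generation, in_use) for one 20-byte xref entry."""
    if len(entry) != 20:
        raise ValueError("an xref entry must be exactly 20 bytes")
    match = XREF_ENTRY.fullmatch(entry)
    if match is None:
        raise ValueError(f"malformed xref entry: {entry!r}")
    offset, generation, kind, _eol = match.groups()
    return int(offset), int(generation), kind == b"n"
```

Because every entry is the same width, a reader can seek straight to entry N without scanning; that is the payoff for the odd padding rule.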

> bits that are non-*ASCII* are the streams, but even so...
>
> MBM
>



Re: Grammar - Class creation

2001-05-29 Thread Robin Szemeti

On Tue, 29 May 2001, Leon Brocard wrote:

> Other programming languages need code generators to spit out
> libraries. Perl doesn't need to do this as it's dynamic, baby. This is
> why Parse::RecDescent / Template Toolkit are so groovy, yeah.

I propose a new convention : we all shout 'CAMEL' if Leon uses the words
qw/ baby groovy yeah/ in the same mail. ..  this could be quite often.

-- 
Robin Szemeti   

Redpoint Consulting Limited
Real Solutions For A Virtual World 



Re: Grammar - Class creation

2001-05-29 Thread Simon Wistow

Paul Makepeace wrote:

> > Like I said, I looked into it and didn't find anything and didn't have the
> > time/experience/inclination to start doing something myself - too many
> > gotchas :(
>
> Like what kind of gotchas, besides the padding/endianity stuff?

Well, Parse::RecDescent didn't do binary (I looked at patching it, hence
File::Binary, but didn't get round to it) plus some files store things
as binary offsets in files and do nasty optimisation tricks like ..

1. $x = next 5 bits as a UINT
2. if ($x == 31) $x = next 8 bits as a UINT
3. next 5 variables are all SINTs and are $x bits long

Doing it (i.e. a BNF that described binary files) wouldn't be impossible,
but I don't have enough experience with other file formats to think of
other gotchas and/or the time to write this sort of stuff. I'd have to
think of some sort of IDL as well I suppose.
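
The three numbered steps above can be sketched in Python as a small bit reader. Several things here are assumptions: bits are taken most significant first, signed values are two's complement, and the escape value is a parameter defaulting to 31 (the largest value five bits can hold). `BitReader` and `read_record` are names invented for this example.

```python
class BitReader:
    """Read bit fields from bytes, most significant bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read_uint(self, nbits):
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def read_sint(self, nbits):
        # Interpret the field as two's complement.
        value = self.read_uint(nbits)
        if nbits and value >= 1 << (nbits - 1):
            value -= 1 << nbits
        return value


def read_record(reader, escape=31, count=5):
    """Step 1: 5-bit width. Step 2: optional 8-bit width. Step 3: the values."""
    width = reader.read_uint(5)
    if width == escape:
        width = reader.read_uint(8)
    return [reader.read_sint(width) for _ in range(count)]


# Width 4, then the 4-bit signed values 1, -1, 7, -8, 0 (padded to 4 bytes).
values = read_record(BitReader(bytes([0x20, 0xFB, 0xC0, 0x00])))
```

The awkward part for a BNF-style description is step 3: the width of later fields depends on a value read earlier, so the grammar needs some way to carry state.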



Re: Grammar - Class creation

2001-05-29 Thread Nicholas Clark

On Tue, May 29, 2001 at 11:59:48AM +0100, Robin Szemeti wrote:
> On Tue, 29 May 2001, Leon Brocard wrote:
>
> > Other programming languages need code generators to spit out
> > libraries. Perl doesn't need to do this as it's dynamic, baby. This is
> > why Parse::RecDescent / Template Toolkit are so groovy, yeah.
>
> I propose a new convention : we all shout 'CAMEL' if Leon uses the words
> qw/ baby groovy yeah/ in the same mail. ..  this could be quite often.

There doesn't appear to be any drinking involved in this new convention.
Is this the new bit? Not sure if it will catch on.

Nicholas Clark



Re: Grammar - Class creation

2001-05-29 Thread Robin Szemeti

On Tue, 29 May 2001, Nicholas Clark wrote:
> On Tue, May 29, 2001 at 11:59:48AM +0100, Robin Szemeti wrote:
> > On Tue, 29 May 2001, Leon Brocard wrote:
> >
> > > Other programming languages need code generators to spit out
> > > libraries. Perl doesn't need to do this as it's dynamic, baby. This is
> > > why Parse::RecDescent / Template Toolkit are so groovy, yeah.
> >
> > I propose a new convention : we all shout 'CAMEL' if Leon uses the words
> > qw/ baby groovy yeah/ in the same mail. ..  this could be quite often.
>
> There doesn't appear to be any drinking involved in this new convention.
> Is this the new bit? Not sure if it will catch on.

uhh .. you're *so* right .. I really must remember to compile my mails
with a -w option .. I'd probably have got a :

'Buffy' not declared or used in this mail,
Useless use of 'CAMEL' in a void context, did you mean 'Drink!' instead?


note: Drink! is to be pronounced loudly and in a southern Irish accent as
per Father Jack  [ see 'Father Ted'-{'Optician'}  passim ]

-- 
Robin Szemeti   

Redpoint Consulting Limited
Real Solutions For A Virtual World