Re: Grammar -> Class creation
On Tue, 29 May 2001, Nicholas Clark wrote:
> On Tue, May 29, 2001 at 11:59:48AM +0100, Robin Szemeti wrote:
> > On Tue, 29 May 2001, Leon Brocard wrote:
> >
> > > Other programming languages need code generators to spit out
> > > libraries. Perl doesn't need to do this as it's dynamic, baby. This is
> > > why Parse::RecDescent / Template Toolkit are so groovy, yeah.
> >
> > I propose a new convention : we all shout 'CAMEL' if Leon uses the words
> > qw/ baby groovy yeah/ in the same mail. .. this could be quite often.
>
> There doesn't appear to be any drinking involved in this new convention.
> Is this the "new" bit? Not sure if it will catch on.

uhh .. you're *so* right .. I really must remember to compile my mails
with a -w option .. I'd probably have got a :

'Buffy' not declared or used in this mail,
Useless use of 'CAMEL' in a void context, did you mean 'Drink!' instead?

note: Drink! is to be pronounced loudly and in a southern Irish accent as
per Father Jack [ see 'Father Ted'->{'Optician'} & passim ]

--
Robin Szemeti
Redpoint Consulting Limited
Real Solutions For A Virtual World
Re: Grammar -> Class creation
On Tue, May 29, 2001 at 11:59:48AM +0100, Robin Szemeti wrote:
> On Tue, 29 May 2001, Leon Brocard wrote:
>
> > Other programming languages need code generators to spit out
> > libraries. Perl doesn't need to do this as it's dynamic, baby. This is
> > why Parse::RecDescent / Template Toolkit are so groovy, yeah.
>
> I propose a new convention : we all shout 'CAMEL' if Leon uses the words
> qw/ baby groovy yeah/ in the same mail. .. this could be quite often.

There doesn't appear to be any drinking involved in this new convention.
Is this the "new" bit? Not sure if it will catch on.

Nicholas Clark
Re: Grammar -> Class creation
Paul Makepeace wrote:
> > Like I said, I looked into it and didn't find anything and didn't have the
> > time/experience/inclination to start doing something myself - too many
> > gotchas :(
>
> Like what kind of gotchas, besides the padding/endianity stuff?

Well, Parse::RecDescent didn't do binary (I looked at patching it, hence
File::Binary, but didn't get round to it) plus some files store things as
binary offsets in files and do nasty optimisation tricks like ..

1. $x = next 5 bits as an UINT
2. if ($x==32) $x = next 8 bits as an UINT
3. next 5 variables are all SINTS and are $x bits long

Doing it (i.e. a BNF that described binary files) wouldn't be impossible,
but I don't have enough experience with other file formats to think of
other gotchas and/or the time to write this sort of stuff.

I'd have to think of some sort of IDL as well I suppose.
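The three-step trick in the message above can be sketched roughly as follows. This is an illustration in Python, not code from the thread: `BitReader`, `read_record` and `ESCAPE` are made-up names, the bit order is assumed to be most-significant-bit first, and because a 5-bit field can only hold 0..31, the sketch assumes 31 as the escape value rather than the 32 written in step 2.

```python
class BitReader:
    """Reads MSB-first bit fields from a bytes object (hypothetical helper)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read_uint(self, nbits):
        """Return the next nbits as an unsigned integer."""
        if self.pos + nbits > len(self.data) * 8:
            raise EOFError("not enough bits left")
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def read_sint(self, nbits):
        """Return the next nbits as a two's-complement signed integer."""
        value = self.read_uint(nbits)
        if nbits and value >= 1 << (nbits - 1):
            value -= 1 << nbits
        return value


ESCAPE = 31  # assumption: the largest value a 5-bit field can hold


def read_record(reader, count=5):
    """Read one record laid out as in steps 1-3 of the message."""
    width = reader.read_uint(5)        # step 1: 5-bit width field
    if width == ESCAPE:                # step 2: escape to an 8-bit width
        width = reader.read_uint(8)
    # step 3: `count` signed values, each `width` bits long
    return [reader.read_sint(width) for _ in range(count)]
```

The awkward part for a BNF-style description is visible in `read_record`: the width of later fields depends on the value of an earlier one, so the grammar needs some way to carry parsed values forward.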
Re: Grammar -> Class creation
On Tue, 29 May 2001, Leon Brocard wrote:

> Other programming languages need code generators to spit out
> libraries. Perl doesn't need to do this as it's dynamic, baby. This is
> why Parse::RecDescent / Template Toolkit are so groovy, yeah.

I propose a new convention : we all shout 'CAMEL' if Leon uses the words
qw/ baby groovy yeah/ in the same mail. .. this could be quite often.

--
Robin Szemeti
Redpoint Consulting Limited
Real Solutions For A Virtual World
Re: Grammar -> Class creation
On Tue, May 29, 2001 at 10:48:54AM +0100, Matthew Byng-Maddick wrote:
> Not quite, it's a human-readable binary format. All the indexes rely on
> offsets in the file, and the various fun with the newline conventions
> mean that in my book, it's a binary format, you can't just go along and
> edit it with a text editor, because it won't work anymore. Sure, the only

Yeah, a dumb editor for sure -- if the editor can figure out the
line-ending conventions from the magic number I don't see a problem. Hmm,
I suppose if a pathological editor mixed cr/lfs then that scheme wouldn't
work.

But yeah, the line-ending (esp. in the xref table[1]) is a bit bizarre.

PDF seems optimised for the PDF writing application, placing the burden
on the reader. What possible point is there in specifying the length of a
stream object as an indirect reference? The spec says (in essence) "so
the writing application can write out the length of variable data without
knowing a priori its length" (to use an e.g.[2] from before, a deflated
object). Now, surely the point of having a length spec at all is so that
the reader can a) allocate a chunk of memory in advance b) know when the
"endstream" isn't part of the stream itself. What c) am I missing where
the length is useful?

Hmm, I suppose using the xref it could seek() for the length object, read
it and then seek() back to the stream. Like I say, no regard for the woes
of the reader, or something being piped from HTTP...

[1] [For those that don't know, the cross-ref table consists mostly of
object IDs and byte offsets. Each xref line is 20 bytes, in which the
final char is a space if the line-ending isn't a full <CR><LF> (i.e. is
either(!) of <CR> or <LF>). *toke, toke*]

[2] Bollocks, autoformat did it again.

> bits that are non-*ASCII* are the streams, but even so...
>
> MBM
>
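For readers who have not met the xref table, the fixed-width line described in footnote [1] can be sketched like this. It is an illustration in Python, not anything posted in the thread. The layout assumed is the one given in the PDF reference: a 10-digit field, a space, a 5-digit generation number, a space, `n` (in use) or `f` (free), then a two-byte line ending that is space+CR, space+LF or CR+LF. `XrefEntry` and `parse_xref_entry` are made-up names.

```python
import re
from typing import NamedTuple

# One xref entry: always exactly 20 bytes, whichever line ending is used.
XREF_ENTRY = re.compile(rb"(\d{10}) (\d{5}) ([nf])( \r| \n|\r\n)")


class XrefEntry(NamedTuple):
    offset: int      # byte offset for 'n' entries; next free object for 'f'
    generation: int
    in_use: bool


def parse_xref_entry(raw):
    """Parse a single 20-byte cross-reference entry."""
    if len(raw) != 20:
        raise ValueError("an xref entry must be exactly 20 bytes")
    match = XREF_ENTRY.fullmatch(raw)
    if match is None:
        raise ValueError("malformed xref entry: %r" % (raw,))
    return XrefEntry(
        offset=int(match.group(1)),
        generation=int(match.group(2)),
        in_use=match.group(3) == b"n",
    )
```

Because every entry is the same size, a reader can seek straight to entry N without scanning, which is presumably why the padding space exists when the line ending is a single character.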
Re: Grammar -> Class creation
On Tuesday, May 29, 2001, at 11:49 AM, Paul Makepeace wrote:

> Surely it should be possible to specify the underlying *functionality*
> of the system and then have a perl source filter (or other component of
> perl's mind-addling n-tier parsing architecture) that
> rewrites/re-presents the interface in the API style du jour...

Separate the end-user API from the parser's action API. I.e., each parser
module specifies the grammar and uses a fixed style of action that creates
something like an abstract syntax tree.

Then have several end-user APIs to access that AST. I.e., one that
traverses the tree and makes callbacks for each node. Or one that uses an
XPath-like syntax to get at certain nodes in the tree.

Or, in the spirit of model-view-controller, make it a tree model and
have various tree viewers (Leon will recognise this idea, as I've been
going on about this since before the German Perl Workshop...) so you could
also 'view' the tree as an XML document, as a Data::Dumper-like output, as
a directory hierarchy (where nodes are directories and leaf entries are
files), or whatever one's sick mind can come up with.

Marcel

--
my int ($x, $y, $z, $n); $x**$n + $y**$n = $z**$n is insoluble if $n > 2;
I have discovered a truly remarkable proof which this signature is too
short to contain. (20 Aug 2001: Pierre de Fermat's 400th birthday)
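A rough sketch of that separation, in Python rather than Perl and with every name (`Node`, `walk`, `find`, `to_xml`) invented for the illustration: one tree model, then three independent ways of getting at it. The path syntax here is a plain slash-separated list of node names, far simpler than real XPath.

```python
from dataclasses import dataclass, field
from typing import List
from xml.sax.saxutils import escape


@dataclass
class Node:
    """The shared tree model that every parser would produce."""
    name: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def walk(node, callback, depth=0):
    """Callback-style API: call callback(node, depth), parents first."""
    callback(node, depth)
    for child in node.children:
        walk(child, callback, depth + 1)


def find(node, path):
    """Path-style API: yield nodes matching e.g. 'expr/term/number'."""
    head, _, rest = path.partition("/")
    if node.name != head:
        return
    if not rest:
        yield node
        return
    for child in node.children:
        yield from find(child, rest)


def to_xml(node):
    """Viewer-style API: render the same tree as an XML string."""
    inner = escape(node.text) + "".join(to_xml(c) for c in node.children)
    return "<%s>%s</%s>" % (node.name, inner, node.name)


# A hand-built tree standing in for what a parser of "1 + 2" might emit.
tree = Node("expr", children=[
    Node("term", children=[Node("number", "1")]),
    Node("op", "+"),
    Node("term", children=[Node("number", "2")]),
])
```

None of the three access functions knows anything about the grammar that produced the tree, which is the point of the proposal: new viewers can be added without touching any parser module.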
Re: Grammar -> Class creation
On Tue, May 29, 2001 at 10:45:59AM +0100, Leon Brocard wrote:
> [Of course, the reason nobody's done this before is that everyone
> wants a slightly different interface...]

Surely it should be possible to specify the underlying *functionality*
of the system and then have a perl source filter (or other component of
perl's mind-addling n-tier parsing architecture) that
rewrites/re-presents the interface in the API style du jour...

Paul
Re: Grammar -> Class creation
On Tue, May 29, 2001 at 02:27:40AM -0700, Paul Makepeace wrote:
> Anyway, PDF is easier re: packing/endianness since it's a text format!
> The only time you get binary data is for unencoded streams (which they advise
> against, although it's permitted, for example PDFlib generates it)
> like a << /Filter /FlateDecode stream zlib-deflated-data endstream >>

Not quite, it's a human-readable binary format. All the indexes rely on
offsets in the file, and the various fun with the newline conventions
mean that in my book, it's a binary format, you can't just go along and
edit it with a text editor, because it won't work anymore. Sure, the only
bits that are non-*ASCII* are the streams, but even so...

MBM
Re: Grammar -> Class creation
Marcel Grunauer sent the following bits through the ether:

> Is that a) a good idea, b) a bad idea, c) common practice anyway and
> I just haven't found it?

japhy's apparently kinda doing this:

  http://search.cpan.org/doc/PINYAN/YAPE-Regex-3.01/extra/YAPE.pm

  The YAPE hierarchy of modules is an attempt at a unified means of
  parsing and extracting content. It attempts to maintain a generic
  interface, to promote simplicity and reusability. The API is
  powerful, yet simple. The modules do tokenization (which can be
  intercepted) and build trees, so that extraction of specific nodes
  is doable.

Other programming languages need code generators to spit out
libraries. Perl doesn't need to do this as it's dynamic, baby. This is
why Parse::RecDescent / Template Toolkit are so groovy, yeah.

[Of course, the reason nobody's done this before is that everyone
wants a slightly different interface...]

Leon
--
... We're not worthy! We're not worthy!
Re: Grammar -> Class creation
On Tuesday, May 29, 2001, at 11:18 AM, Simon Wistow wrote:

> I started looking into this when I first started doing the SWF stuff ...
> a kind of YACC for file formats. Describe it in a BNF-a-like language
> and then run a program over it et voila - you have a library for reading
> and creating that file format (he says, glossing over lots of
> complications and gotchas). Write that program for each different
> language and lots of different languages/systems have access to lots of
> different file formats and every time a format changes the spec gets
> updated and everyone runs their grammar->library programs again and
> everybody's got full functionality again.

As Leon points out, Parse::RecDescent is One Way To Do It. However, it's
mostly used to parse some input according to some grammar and to construct
the desired result directly. If you need a different result from the same
grammar, you have to specify the grammar and actions again.

It might be an idea to have grammars packed up in modules (i.e., reusable)
and make the actions callbacks (some sort of autoaction might do that),
much like HTML and XML parsers do it.

I imagine lots of little Parse::* modules (Parse::Regex, Parse::PDF,
Parse::RPN etc.).

Is that a) a good idea, b) a bad idea, c) common practice anyway and
I just haven't found it?

Marcel

--
$ perl -we time
Useless use of time in void context at -e line 1.
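To make the "grammar in a module, actions as callbacks" idea concrete, here is a small sketch in Python using RPN as the toy grammar. Nothing here corresponds to a real Parse::RPN module; `parse_rpn`, `evaluate` and `to_infix` are names invented for the example, and the "grammar" is just whitespace-separated numbers and the four operators.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def parse_rpn(source, on_number, on_operator):
    """The reusable part: recognise tokens and hand them to callbacks."""
    for token in source.split():
        if token in OPERATORS:
            on_operator(token)
        else:
            on_number(float(token))  # raises ValueError on a bad token


def evaluate(source):
    """First set of actions: compute the value of the expression."""
    stack = []

    def apply(op):
        right = stack.pop()
        left = stack.pop()
        stack.append(OPERATORS[op](left, right))

    parse_rpn(source, stack.append, apply)
    return stack.pop()


def to_infix(source):
    """Second set of actions: same grammar, but build an infix string."""
    stack = []

    def join(op):
        right = stack.pop()
        left = stack.pop()
        stack.append("(%s %s %s)" % (left, op, right))

    parse_rpn(source, lambda n: stack.append("%g" % n), join)
    return stack.pop()
```

The grammar is written once in `parse_rpn`; getting a different result from the same input means supplying different callbacks, not restating the grammar.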
Re: Grammar -> Class creation
On Tue, May 29, 2001 at 10:18:53AM +0100, Simon Wistow wrote:
> > Immediate application is feeding the PDF spec
>
> I started looking into this when I first started doing the SWF stuff ...
> a kind of YACC for file formats. Describe it in a BNF-a-like language
> and then run a program over it et voila - you have a library for reading
> and creating that file format (he says, glossing over lots of
> complications and gotchas).

Oddly enough, I was going to write a Flash parser/generator in '98 and my
idea at the time was to actually generate the grammar direct from their
HTML spec :-) Unfortunately tuit.com tanked and I had to pay rent, (etc).
Oh, and then you did it..

Anyway, PDF is easier re: packing/endianness since it's a text format!
The only time you get binary data is for unencoded streams (which they advise
against, although it's permitted, for example PDFlib generates it)
like a << /Filter /FlateDecode stream zlib-deflated-data endstream >>

> I think this article talks about it some
> http://advogato.org/article/59.html

It's surprising to me the original poster didn't discuss XDR/NDR and
network byte ordering, or perhaps I was missing the point (it is 2:26am
here after all)

> Like I said, I looked into it and didn't find anything and didn't have the
> time/experience/inclination to start doing something myself - too many
> gotchas :(

Like what kind of gotchas, besides the padding/endianity stuff?

Paul
Re: Grammar -> Class creation
Paul Makepeace wrote:
>
> Are there modules/frameworks that exist to create classes from a
> grammar spec
> (e.g. EBNF)? Restating, I'm envisaging something where the input is a
> grammar and the output is a class or set of classes that provides
> parsing capabilities and validating accessor methods.
>
> Immediate application is feeding the PDF spec

I started looking into this when I first started doing the SWF stuff ...
a kind of YACC for file formats. Describe it in a BNF-a-like language
and then run a program over it et voila - you have a library for reading
and creating that file format (he says, glossing over lots of
complications and gotchas). Write that program for each different
language and lots of different languages/systems have access to lots of
different file formats and every time a format changes the spec gets
updated and everyone runs their grammar->library programs again and
everybody's got full functionality again.

I think this article talks about it some
http://advogato.org/article/59.html

Like I said, I looked into it and didn't find anything and didn't have the
time/experience/inclination to start doing something myself - too many
gotchas :(

--
simon wistow                              wireless systems coder
"i think," i said "i think this is our fault."
Re: Grammar -> Class creation
Paul Makepeace sent the following bits through the ether:

> Are there modules/frameworks that exist to create classes from a
> grammar spec (e.g. EBNF)?

Well, Parse::RecDescent[1] probably does what you want. Check out the
directive.

Parsing is fun. Let's try and parse everything!

[1] Or Parse::Yapp, but I betcha it'll be more work
--
Leon Brocard ..... http://www.astray.com/
Iterative Software ..... http://www.iterative-software.com/
... "Careful. We don't want to learn from this." - Calvin
Grammar -> Class creation
Are there modules/frameworks that exist to create classes from a
grammar spec
(e.g. EBNF)? Restating, I'm envisaging something where the input is a
grammar and the output is a class or set of classes that provides
parsing capabilities and validating accessor methods.

Immediate application is feeding the PDF spec (which admittedly doesn't
quite exist in EBNF) and having something I can programmatically generate
PDF from. PDF::* isn't terribly capable at anything and the two sets of
authors haven't rev'ed it for a while and Text::PDF::* has almost no
documentation (= useless, IMO).

Paul

(PS Any ideas why autoformat creates that gap on line 2?)
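As a very reduced illustration of "spec in, class with validating accessors out", here is a sketch in Python. It is not an EBNF compiler and not an existing module: the "spec" is assumed to be just a mapping from field name to a regular expression, and `make_class` and `Version` are hypothetical names. It only shows that in a dynamic language the class can be built at run time instead of being generated as source code.

```python
import re


def make_class(name, fields):
    """Build a class with one validating property per entry in `fields`.

    `fields` maps an attribute name to a regular expression that the
    string form of every assigned value must match in full.
    """

    def make_property(attr, pattern):
        regex = re.compile(pattern)

        def getter(self):
            return self.__dict__[attr]

        def setter(self, value):
            if not regex.fullmatch(str(value)):
                raise ValueError(
                    "%s=%r does not match %r" % (attr, value, pattern))
            self.__dict__[attr] = value

        return property(getter, setter)

    def init(self, **values):
        # Route every initial value through the validating setter.
        for attr in fields:
            setattr(self, attr, values[attr])

    namespace = {attr: make_property(attr, pattern)
                 for attr, pattern in fields.items()}
    namespace["__init__"] = init
    return type(name, (object,), namespace)


# Hypothetical spec: a version number made of two single digits.
Version = make_class("Version", {"major": r"\d", "minor": r"\d"})
```

With this, `Version(major=1, minor=3)` succeeds, and assigning `"x"` to `minor` afterwards raises `ValueError`. A real grammar-driven version would also have to derive the parsing side from the same spec, which is where the hard part lies.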