Re: Compile time regex matching

2014-07-15 Thread Jason den Dulk via Digitalmars-d-learn
On Monday, 14 July 2014 at 11:43:01 UTC, Philippe Sigaud via 
Digitalmars-d-learn wrote:


> You can try Pegged, a parser generator that works at compile-time
> (both the generator and the generated parser).


I did, and I got it to work. Unfortunately, the code used in CTFE
is left in the final executable even though it is not used at
runtime. So now the question is, is there a way to get rid of the
excess baggage?


BTW Here is the code I am playing with.

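import std.stdio;

string get_match()
{
  import pegged.grammar;
  mixin(grammar(`
MyRegex:
    foo <- "abc"* "def"?
`));

  // compile-time parsing of the imported file (string imports need the -J flag)
  auto result = MyRegex(import("config-file.txt"));

  // return a statement, as source text, for main() to mix in
  return "writeln(\"" ~ result.matches[0] ~ "\");";
}

void main()
{
  mixin(get_match());
}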



Re: Compile time regex matching

2014-07-15 Thread Philippe Sigaud via Digitalmars-d-learn
> I did, and I got it to work. Unfortunately, the code used in CTFE is
> left in the final executable even though it is not used at runtime. So now
> the question is, is there a way to get rid of the excess baggage?

Not that I know of. Once code is injected, it's compiled into the executable.

>   auto result = MyRegex(import("config-file.txt")); // compile-time parsing
>   return "writeln(\"" ~ result.matches[0] ~ "\");";
>
>   mixin(get_match());

I never tried that, I'm happy that works.
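
The pattern itself (a function that is evaluated during compilation and
returns source text for mixin) can be sketched without Pegged. This is
only an illustration: makePrintStatement is a hypothetical helper, not
something from Pegged or from the code above.

```d
import std.stdio : writeln;

// Hypothetical helper: when called inside mixin(), it runs during
// compilation (CTFE) and returns D source text.
string makePrintStatement(string text)
{
    return "writeln(\"" ~ text ~ "\");";
}

// The generated statement can be checked at compile time.
static assert(makePrintStatement("abc") == `writeln("abc");`);

void main()
{
    mixin(makePrintStatement("abc")); // compiled as: writeln("abc");
}
```

The function is an ordinary function, so it is also compiled into the
binary like any other, which is the baggage being discussed.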

Another solution would be to move these actions to runtime, by running a
small script in place of your compilation command. This script can be
written in D.

- The script takes a file name as input.
- It opens the file.
- It uses regex to parse it.
- It extracts the values you want and writes them to a temporary file.
- It invokes the compiler (with std.process) on your main file, with the
-Jpath flag pointing at the directory that holds the temporary file.
Inside your real code, you can thus use mixin(import("temp file")) happily.
- It deletes the temporary file once the previous step is finished.

Compile the script once and for all; it should execute quite rapidly.
It's an unusual pre-processor, in a way.
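
A rough sketch of such a script, under several assumptions: the config
format (one `name = value` pair per line), the generated file name
generated.d, the directory name prebuild-imports and the helper
generateCode are all made up for illustration, and dmd is assumed to be
on the PATH.

```d
import std.file : mkdirRecurse, readText, remove, tempDir, write;
import std.path : buildPath;
import std.process : execute;
import std.regex : matchAll, regex;
import std.stdio : writeln;

// Hypothetical helper: turn each `name = value` line into a D declaration.
string generateCode(string configText)
{
    string code;
    // "m" makes ^ and $ match at line boundaries.
    foreach (m; matchAll(configText, regex(`^(\w+)\s*=\s*(\w+)\s*$`, "m")))
        code ~= "enum " ~ m[1] ~ " = \"" ~ m[2] ~ "\";\n";
    return code;
}

int main(string[] args)
{
    if (args.length < 3)
    {
        writeln("usage: prebuild <config file> <main file>");
        return 1;
    }

    // Write the generated code into a directory that -J will point at.
    string dir = buildPath(tempDir(), "prebuild-imports");
    mkdirRecurse(dir);
    string tmp = buildPath(dir, "generated.d");
    write(tmp, generateCode(readText(args[1])));
    scope (exit) remove(tmp); // delete the temporary file afterwards

    // Invoke the compiler on the real program.
    auto result = execute(["dmd", "-J" ~ dir, args[2]]);
    writeln(result.output);
    return result.status;
}
```

With this layout the real program would contain
mixin(import("generated.d")); and no parsing code at all.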


Re: Compile time regex matching

2014-07-14 Thread Philippe Sigaud via Digitalmars-d-learn
> I am trying to write some code that matches regular expressions
> at compile time, but the compiler won't let me because matchFirst and
> matchAll make use of malloc().
>
> Is there an alternative that I can use that can be run at compile time?

You can try Pegged, a parser generator that works at compile-time
(both the generator and the generated parser).

https://github.com/PhilippeSigaud/Pegged

docs:

https://github.com/PhilippeSigaud/Pegged/wiki/Pegged-Tutorial

It's also on dub:

http://code.dlang.org/packages/pegged

It takes a grammar as input, not a single regular expression, but the
syntax is not too different.


  import pegged.grammar;

  mixin(grammar(`
  MyRegex:
      foo <- "abc"* "def"?
  `));

  void main()
  {
      enum result = MyRegex("abcabcdefFOOBAR"); // compile-time parsing

      // everything can be queried and tested at compile-time, if need be.
      static assert(result.matches == ["abc", "abc", "def"]);
      static assert(result.begin == 0);
      static assert(result.end == 9);

      pragma(msg, result.toString()); // parse tree
  }


It probably does not implement all those nifty regex features, but it
has all the usual powers of Parsing Expression Grammars. It gives you an
entire parse result, though: matches, children, subchildren, etc. As
you can see, matches are accessible at the top level.

One thing to keep in mind, which comes from the language and not from this
library: in the previous code, since 'result' is an enum, it will be
'pasted' in place every time it's used in code, so all those static
asserts get an entire copy of the parse tree. That's a bit wasteful.
Using 'immutable' directly does not work here, but this is OK:

enum res = MyRegex("abcabcdefFOOBAR"); // compile-time parsing
immutable result = res; // to avoid copying the enum value everywhere

The static asserts then work (though not the toString). Maybe
someone more knowledgeable than me on DMD internals could confirm that it
indeed avoids re-allocating those parse results.
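
The difference between a pasted enum and a stored value can be seen with
plain arrays, without Pegged. This is only a sketch of the language
behaviour; the names pasted and stored are made up, and it uses a
module-level immutable rather than the local one above.

```d
// Manifest constant: the array literal is pasted at every use site.
enum int[] pasted = [1, 2, 3];

// Stored constant: a single copy, still readable at compile time.
static immutable int[] stored = [1, 2, 3];

static assert(pasted.length == 3);
static assert(stored.length == 3);

bool enumUsesAreDistinct()
{
    auto a = pasted; // a fresh array is allocated here
    auto b = pasted; // and another one here
    return a.ptr !is b.ptr;
}

bool storedUsesShareData()
{
    auto a = stored; // both slices refer to the same data
    auto b = stored;
    return a.ptr is b.ptr;
}

void main()
{
    assert(enumUsesAreDistinct());
    assert(storedUsesShareData());
}
```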


Re: Compile time regex matching

2014-07-14 Thread Artur Skawina via Digitalmars-d-learn
On 07/14/14 13:42, Philippe Sigaud via Digitalmars-d-learn wrote:
> asserts get an entire copy of the parse tree. It's a bit wasteful, but
> using 'immutable' directly does not work here, but this is OK:
>
> enum res = MyRegex("abcabcdefFOOBAR"); // compile-time parsing
> immutable result = res; // to avoid copying the enum value everywhere

   static immutable result = MyRegex("abcabcdefFOOBAR"); // compile-time parsing

> The static asserts then works (not the toString, though). Maybe

diff --git a/pegged/peg.d b/pegged/peg.d
index 98959294c40e..307e8a14b1dd 100644
--- a/pegged/peg.d
+++ b/pegged/peg.d
@@ -55,7 +55,7 @@ struct ParseTree
 /**
 Basic toString for easy pretty-printing.
 */
-string toString(string tabs = "")
+string toString(string tabs = "") const
 {
 string result = name;
 
@@ -262,7 +262,7 @@ Position position(string s)
 /**
 Same as previous overload, but from the begin of P.input to p.end
 */
-Position position(ParseTree p)
+Position position(const ParseTree p)
 {
 return position(p.input[0..p.end]);
 }

[completely untested; just did a git clone and fixed the two
 errors the compiler was whining about. Hmm, did pegged get
 faster? Last time i tried (years ago) it was unusably slow;
 right now, compiling your example, i didn't notice the extra
 multi-second delay that was there then.]

artur


Re: Compile time regex matching

2014-07-14 Thread Philippe Sigaud via Digitalmars-d-learn
On Mon, Jul 14, 2014 at 3:19 PM, Artur Skawina via Digitalmars-d-learn
<digitalmars-d-learn@puremagic.com> wrote:
> On 07/14/14 13:42, Philippe Sigaud via Digitalmars-d-learn wrote:
>> asserts get an entire copy of the parse tree. It's a bit wasteful, but
>> using 'immutable' directly does not work here, but this is OK:
>>
>> enum res = MyRegex("abcabcdefFOOBAR"); // compile-time parsing
>> immutable result = res; // to avoid copying the enum value everywhere
>
> static immutable result = MyRegex("abcabcdefFOOBAR"); // compile-time parsing

Ah, static!



 The static asserts then works (not the toString, though). Maybe
(snip diff)

I'll push that to the repo, thanks! I should sprinkle some const and
pure everywhere...

> [completely untested; just did a git clone and fixed the two
>  errors the compiler was whining about. Hmm, did pegged get
>  faster? Last time i tried (years ago) it was unusably slow;
>  right now, compiling your example, i didn't notice the extra
>  multi-second delay that was there then.]

It's still slower than some handcrafted parsers. At one point, I could
get it close to std.regex (between 1.3 and 1.8 times slower), but
that meant losing some other properties. I have other parsing engines
partially implemented, with either a wider range of grammars or
better speed (but not both!). I hope the coming holidays will let me
go back to it.


Compile time regex matching

2014-07-13 Thread Jason den Dulk via Digitalmars-d-learn

Hi

I am trying to write some code that matches regular expressions
at compile time, but the compiler won't let me because matchFirst
and matchAll make use of malloc().


Is there an alternative that I can use that can be run at compile 
time?


Thanks in advance.
Jason