Hello Peter,
On Thursday, November 6, 2003 at 23:26 GMT +0100, something compelled
Peter Fjelsten [PF] to inscribe:
PF> It's the strangest thing: for some orders, it works 85% on for others
PF> only 10% - although there are more that works 10% than 85%.
That's because the address regexps can't handle the colon after the
labels in the second example. There are a couple of other errors too.
Try the following (note: I haven't tested this much, so it may need
polishing):
=====[Begin template fragment]=====
%Rem=" Get Name and address from Billing Address "%-
%-
%SetPattRegexp="(?im-s)^(Billing\sAddress|Fakturaadresse)[^\n]*?\n-{50,}\s*?\n(.*?)\s*?\n((.*?)\s*?\n)?((.*?)\s*?\n)?((.*?)\s*?\n)?((\D{1,2}.\d{3,6})|\d{3,6})\s(.*?)\s*\n\s*(.*?)\s*?\n"
%RegexpBlindMatch="%Text"%-
NAVN=%SubPatt("2")
ADRESSE1=%SubPatt("4")
ADRESSE2=%SubPatt("6")
ADRESSE3=%SubPatt("8")
POSTNR=%SubPatt("10")
BYNAVN=%SubPatt("11")
LAND=%SubPatt("12")
%Rem=" Get Name and address from Delivery Address "%-
%-
%SetPattRegexp="(?im-s)^(Delivery\sAddress|Leveringsadresse)[^\n]*?\n-{50,}\s*?\n(.*?)\s*?\n((.*?)\s*?\n)?((.*?)\s*?\n)?((.*?)\s*?\n)?((\D{1,2}.\d{3,6})|\d{3,6})\s(.*?)\s*\n\s*(.*?)\s*?\n"
%RegexpBlindMatch="%Text"%-
LEV:NAVN=%SubPatt("2")
LEV:ADRESSE1=%SubPatt("4")
LEV:ADRESSE2=%SubPatt("6")
LEV:ADRESSE3=%SubPatt("8")
LEV:POSTNR=%SubPatt("10")
LEV:BYNAVN=%SubPatt("11")
LEV:LAND=%SubPatt("12")
MOMSBEREGNING=DK
VEDR0REND=%SetPattRegexp="(?ism)^(Date\sOrdered|Ordre\smodtaget)[^\n]*\n\s*(.*?)\s*\n(Products|Produkter):"%-
%_________%RegexpBlindMatch="%Text"%-
%_________%SubPatt("2")
LEVERING=%RegexpText="(?im-s)^Sub-?total[^\n]*?\n\s*?(.*)\s\(.*?\n"
BETALINGSMETODE=For
%_Shipping='%SetPattRegexp="(?im-s)shipping.*?(\d*([\.\,]\d*)?)dkk\s*\n"%-
%___________%RegexpMatch="%Text"'%-
%-
%If:'%RegexpText="(?im-s)^Total:.*\n\s*?(Moms|DK\smoms\/VAT)\n\n"'<>'':'%-
FRAGTMOMSPLIGTIGT=%Calc="%_Shipping*0.8"dkk':'%-
FRAGTMOMSFRI=%_Shipping'
=====[ End template fragment]=====
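If you want to see why the colon no longer matters, here is a rough
sketch in Python rather than in The Bat!'s macro language. Python's re
module is close to The Bat!'s regexps but not identical, so I've
written the options as (?im) instead of (?im-s). The sample text is
made up, and the pattern is a cut-down version of the one above, not
the full address regexp.

```python
import re

# Assumed label lines for illustration; the real order mails may differ.
with_colon = "Billing Address:\n-----\nJane Doe\n"
without_colon = "Billing Address\n-----\nJane Doe\n"

# [^\n]* swallows whatever is left on the label line, colon included,
# so both variants reach the dashed line and then the name.
label = re.compile(
    r"(?im)^(Billing\sAddress|Fakturaadresse)[^\n]*\n-{5,}\n(.*)\n"
)

names = [label.search(t).group(2) for t in (with_colon, without_colon)]
print(names)
```

Both inputs give the same name, because the [^\n]* after the label
takes up anything between the label and the end of that line.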
PF> As I don't really understand how you make sub-patterns and variables
PF> it's a bit hard for me to change your code.
Subpatterns are simply the parts of the regexp enclosed in round
brackets. Counting them is easy: count the opening brackets from left
to right, skipping option groups such as (?im-s) and escaped brackets
such as \(. Sometimes a subpattern exists so you can apply a repeat
operator (*, +, {x,y}, ?) to a whole string rather than to a single
character. Sometimes the brackets, and therefore the subpattern, are
there *only* to capture a specific part of the string and play no role
in the actual matching. There are more uses, but those two are the
main ones.
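To make the counting concrete, here is a small sketch, again in Python
as a stand-in for The Bat!'s regexp engine (an assumption on my part;
the two behave alike for this purpose). The sample text and the
shortened pattern are invented for illustration. Note how the optional
address line needs two pairs of brackets: an outer pair so the ? can
apply to the whole line, and an inner pair to capture just the text.
That is why the template reads subpatterns 2, 4, 6 and 8 rather than
2, 3, 4 and 5.

```python
import re

# Hypothetical sample text, not taken from a real order mail.
text = (
    "Billing Address:\n"
    "-----\n"
    "Jane Doe\n"
    "Main Street 1\n"
    "2100 Copenhagen\n"
)

pattern = re.compile(
    r"(?im)"                                      # options only, not a subpattern
    r"^(Billing\sAddress|Fakturaadresse)[^\n]*\n" # 1: the label
    r"-{5,}\n"
    r"(.*)\n"                                     # 2: name
    r"((.*)\n)?"                                  # 3: optional line, 4: its text
    r"(\d{3,6})\s(.*)$"                           # 5: postal code, 6: city
)

m = pattern.search(text)
print(pattern.groups)  # number of capturing subpatterns
for number in range(1, pattern.groups + 1):
    print(number, repr(m.group(number)))
```

Subpattern 3 includes the trailing line break, which is why the
template ignores the odd-numbered ones and only uses the inner
captures.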
Variables are also not too bad. Any macro that starts with %_ is a
variable. You don't have to declare a new variable, so you can just
start using them. Just remember that they have global scope, so if
you have complex, nested templates, you need to give them unique names
unless you want to share data between templates.
PF> You use completely different (and probably much better) RegExp
PF> then me but it's at a different level then me which makes it hard
PF> to understand.
I actually used a fair bit of what you wrote, just combined it and
cleaned some of it up a bit. But you had done quite a bit of the
logic. So hang in there, look for things that look similar to yours
and go from there.
--
Thanks for writing,
Januk Aggarwal
________________________________________________________
http://www.silverstones.com/thebat/TBUDLInfo.html