TheBat-techies,
Please bear with me, as this may be a long message and a bit difficult
to understand.
Executive summary: I want to be able to extract text from an e-mail and
reformat it and save it in a text file so I can import it into our
books. Maybe I need a third-party application for this (as I cannot do
RegEx), or maybe someone here can provide me with the RegEx?
I co-own a small web shop based on the open source system OSCommerce.
When an order is placed, I get sent an e-mail containing the order
details. However, this e-mail does not look the same each time: the shop
is in two languages, and the number of goods the client has ordered, the
shipping method, etc. vary from order to order.
I think TheBat can do all this without "outside help". Export is in the
filters and here you may call different QTs.
Below I will give an example of such a mail with the info I want to
extract in [ ]'s. { } denotes a note for that field.
This is the form (in the English version; where the fixed text differs
in the Danish version, a note such as "#1" has been added):
=========================================================================
dk.tech.gear
------------------------------------------------------
Order Number: (Order number)
Detailed Invoice#1: (Url to our web site with order details)
Date Ordered#2: [The order date] {1}
[Multi-line comment field, including blank lines, that may or may not be
there]
Products#3
------------------------------------------------------
[Number ordered] x Single tank adapter ([Item model name]) = 280dkk
[Number ordered] x Transport bag ([Item model name]) = 280dkk
(And so on as above. It looks like below:)
2 x Camband (s/s buckle) (DTU-Camband) = 320dkk
1 x Backplate & doubles wing (DTD-PakD) = 2.640dkk
1 x Regulator hose: deco/stage (DZ-Reg40) = 120dkk
------------------------------------------------------
Sub-Total#4: 3.640dkk
[Shipping method] (Shipping (5-7 days) to NO : 9.72 kg): [Shipping
Price] {2}
Total: 3.865dkk
{3}
Delivery Address#5
------------------------------------------------------
[Delivery Name]
[Delivery Address]
[Delivery Address2, if applicable]
[Delivery Address3, if applicable]
[Delivery Post code]{4} [Delivery City]
[Delivery country]
Billing Address#6
------------------------------------------------------
[Name]
[Address]
[Address2, if applicable]
[Address3, if applicable]
[Post code]{4} [City]
[Country]
Payment Method
------------------------------------------------------
Money Order/Cheque (in DKK)
(Some text that can be disregarded)
EOF
=========================================================================
{1} It is in the form "Friday 24 October, 2003", but should be changed
to "20031024". In the corresponding Danish version that would be "fredag
24 oktober, 2003".
{2} This is a number ending with dkk, "160dkk". See below as this
probably presents the largest issue with formatting after extracting it.
{3} There may be a text string "Moms" or "DK moms/VAT" here. It is
always followed by 1 empty line. Whether this line is here is very
important for {2} above and the formatting of the output.
{4} Typical format is 4 digits, but may also be (country code-a number
of digits): S-123 34. It is always followed by a space and then the
city.
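As an illustration of notes {1} and {4}, here is a minimal Python sketch
(not a TheBat QT) of the two conversions. The function names order_date
and split_post_code are made up for this example, and the Danish month
names are my assumption about what the shop sends. The weekday is simply
skipped, so its language does not matter.

```python
import re

# Month names in both shop languages, lower-cased (Danish list is an assumption).
EN = "january february march april may june july august september october november december".split()
DA = "januar februar marts april maj juni juli august september oktober november december".split()
MONTHS = {name: i for names in (EN, DA) for i, name in enumerate(names, start=1)}

# "<weekday> <day> <month>, <year>", e.g. "Friday 24 October, 2003"
DATE_RE = re.compile(r"^\s*\w+\s+(\d{1,2})\s+(\w+),\s*(\d{4})\s*$")

# Post code: optional country prefix ("S-"), then digit groups separated by
# single spaces, then one space and the city.
POST_RE = re.compile(r"^((?:[A-Za-z]{1,3}-)?\d+(?: \d+)*) (.+)$")


def order_date(text):
    """Convert the order date from the mail to YYYYMMDD."""
    m = DATE_RE.match(text)
    if m is None:
        raise ValueError(f"unrecognised date: {text!r}")
    day, month, year = m.groups()
    return f"{int(year):04d}{MONTHS[month.lower()]:02d}{int(day):02d}"


def split_post_code(line):
    """Split a '[Post code] [City]' line into (post code, city)."""
    m = POST_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised post code line: {line!r}")
    return m.group(1), m.group(2)
```

With this, order_date("Friday 24 October, 2003") gives "20031024" and
split_post_code("S-123 34 Stockholm") gives ("S-123 34", "Stockholm").
A city name that itself starts with digits would be split wrongly.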
#1 Detaljeret faktura:
#2 Ordre modtaget:
#3 Produkter:
#4 Subtotal:
#5 Leveringsadresse:
#6 Fakturaadresse:
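To show what the matching could look like, here is a rough Python sketch
that maps the Danish fixed text from #1 to #6 onto the English labels
and pulls the quantity and model name out of a product line. The names
LABELS, canonical_label and parse_item are invented for the example, and
it assumes the model name is always the last parenthesised part before
" = " and contains no parentheses itself.

```python
import re

# Danish fixed text (#1-#6) mapped to the English labels used in the form.
LABELS = {
    "Detaljeret faktura": "Detailed Invoice",
    "Ordre modtaget": "Date Ordered",
    "Produkter": "Products",
    "Subtotal": "Sub-Total",
    "Leveringsadresse": "Delivery Address",
    "Fakturaadresse": "Billing Address",
}

# "<number> x <item name> (<model>) = <price>"; the item name may itself
# contain parentheses, so only the last pair before " = " is the model.
ITEM_RE = re.compile(r"^(\d+) x (.+) \(([^()]+)\) = (\S+)$")


def canonical_label(line):
    """Return the English label a line starts with, or None if it has none."""
    head = line.split(":", 1)[0].strip()
    head = LABELS.get(head, head)
    return head if head in LABELS.values() else None


def parse_item(line):
    """Return the ANTAL/VARENUMMER fields for one product line, or None."""
    m = ITEM_RE.match(line.strip())
    if m is None:
        return None
    quantity, _name, model, _price = m.groups()
    return {"ANTAL": int(quantity), "VARENUMMER": model}
```

For "2 x Camband (s/s buckle) (DTU-Camband) = 320dkk" this returns
{"ANTAL": 2, "VARENUMMER": "DTU-Camband"}; lines that are not product
lines return None and can be skipped.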
I would like to format the extracted stuff thus (there should be no
[ ]'s):
Notes in { }
=========================================================================
<ORDRE>
ORDREDATO=[The order date, in the form YYYYMMDD]
KUNDENR=[Random generated string, max 10 characters, may be left out]
NAVN=[Name]
ADRESSE1=[Address]
ADRESSE2=[Address2, if applicable]
ADRESSE3=[Address3, if applicable]
POSTNR=[Post code]
BYNAVN=[City]
LAND=[Country]
LEV:NAVN=[Delivery Name]
LEV:ADRESSE1=[Delivery Address]
LEV:ADRESSE2=[Delivery Address2, if applicable]
LEV:ADRESSE3=[Delivery Address3, if applicable]
LEV:POSTNR=[Delivery Post code]
LEV:BYNAVN=[Delivery City]
LEV:LAND=[Delivery country]
MOMSBEREGNING=DK {This should be fixed as "DK"}
VEDRØRENDE=[Multi-line comment field, including blank lines, that
may or may not be there]
LEVERING=[Shipping method]
BETALINGSMETODE=For {This should be fixed as "For".}
FRAGTMOMSFRI=[Shipping Price]
FRAGTMOMSPLIGTIGT=[Shipping Price]
{_Only_ one of these two lines. If the strings in {3} above are
present, then [Shipping Price] should be preceded by
"FRAGTMOMSPLIGTIGT", except the [Shipping Price] number should be the
_extracted_ [Shipping Price] number multiplied by 0.8! If not present,
it should be preceded by "FRAGTMOMSFRI" without recalculation.}
<VARE>
ANTAL=[Number ordered]
VARENUMMER=[Item model name]
<VARE>
ANTAL=[Number ordered]
VARENUMMER=[Item model name]
{and so on for all lines of goods}
=========================================================================
The output should be saved as a file.
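The shipping rule is the part that needs arithmetic, so here is a small
Python sketch of it. Everything in it is an assumption on my side: the
function names, that "." in a price such as "3.640dkk" is a thousands
separator (and "," a decimal comma), that the "Moms" line can be found
by looking at how a line starts, and that the books accept amounts
written with two decimals.

```python
from decimal import Decimal

# Strings from note {3}; their presence means shipping is subject to VAT.
VAT_MARKERS = ("moms", "dk moms/vat")


def parse_dkk(text):
    """Turn '160dkk' or '3.640dkk' into a Decimal amount."""
    number = text.strip()
    if number.lower().endswith("dkk"):
        number = number[:-3]
    # Assumption: '.' groups thousands and ',' is the decimal separator.
    return Decimal(number.replace(".", "").replace(",", "."))


def has_vat_line(lines):
    """True if any line starts with one of the {3} strings (case-insensitive)."""
    return any(line.strip().lower().startswith(VAT_MARKERS) for line in lines)


def shipping_line(price_text, vat_present):
    """Return exactly one of the two FRAGT... lines for the output file."""
    price = parse_dkk(price_text)
    if vat_present:
        # Shipping price multiplied by 0.8 when the VAT line is present.
        return f"FRAGTMOMSPLIGTIGT={price * Decimal('0.8'):.2f}"
    return f"FRAGTMOMSFRI={price:.2f}"


def vare_block(quantity, model):
    """Return the <VARE> block for one line of goods."""
    return f"<VARE>\nANTAL={quantity}\nVARENUMMER={model}"
```

For example, shipping_line("225dkk", True) gives
"FRAGTMOMSPLIGTIGT=180.00" and shipping_line("160dkk", False) gives
"FRAGTMOMSFRI=160.00". In practice has_vat_line should only be given the
lines after "Total:", so that a customer comment starting with "Moms"
does not trigger it.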
Phew!
This was complicated!
OK, now for the fairly simple question: Are any of you RegEx gurus
willing to do that for me (I know I am pushing it, but I would be
eternally grateful!)? Or does anyone know of some software (preferably
freeware, as this is only a small business) that will allow me to do
this (without it being too technical), without knowing
RegEx/programming? Some sort of "VisualRegExp"?
I suspect it's a question of making a number of QTs that are finally
merged together and called from the template of the Export function in
the Sorting office filters.
BTW, I am trying to get OSC to send the phone number also, because this
is what we use for customer IDs.
--
<greeting> Best regards </greeting>
<author> Peter Fjelsten </author>
<thebat version> 2.01.15 </thebat version>
<os> Windows XP 5.1 Build 2600 Service Pack 1 </os>