Here is some very simple code using a regexp. It is easy to change the
pattern to specify the limits of each field, so you can very quickly
determine what is correct and what is not. Powerful!
ACTIVATE SCREEN
CLEAR
TEXT TO lcFile NOSHOW
"123","Anna","Manchester United","London, UK",""
"1234","Ann","Gas House Gorillas","New York, NY",""
"12345","Anna Maria","Brooklyn Dodgers","Brooklyn, NY","This is a really long string"
"678901","Santa Anna","New York Rangers","New York City, NY",""
ENDTEXT
lnLines = ALINES(laLines, m.lcFile)
*-----------------------------------------------
loRegExp = CREATEOBJECT("vbscript.regexp")
loRegExp.global = .T.
loRegExp.pattern = '"([^"]*)"'
FOR EACH lcLine IN laLines
   loResults = loRegExp.Execute(m.lcLine)
   ? "Fields " + TRANSFORM(m.loResults.Count)
   FOR lnFld = 0 TO m.loResults.Count - 1 && zero based
      * Match length minus 2 for the quotes around each element
      ? " (" + PADL(m.loResults.Item(m.lnFld).Length - 2, 3, "0") + ") - " + ;
         m.loResults.Item(m.lnFld).SubMatches(0)
   ENDFOR
   ?
ENDFOR
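
The loop above prints the width of every field on every line. To get the
maximum width per field across the whole file, which is what the original
question asks for, something like the following might work. This is only a
sketch: laMax, lnMaxFlds and lnLen are names made up for illustration, and it
assumes laLines and loRegExp are still set up as above and that no field
contains an embedded line break.

```foxpro
* Sketch only: laMax, lnMaxFlds and lnLen are illustrative names.
* Assumes laLines and loRegExp already exist as created above.
DIMENSION laMax[1]
laMax[1] = 0
lnMaxFlds = 0
FOR EACH lcLine IN laLines
   loResults = loRegExp.Execute(m.lcLine)
   IF m.loResults.Count > m.lnMaxFlds
      * Grow the array; new elements start as .F., so zero them
      DIMENSION laMax[m.loResults.Count]
      FOR lnFld = m.lnMaxFlds + 1 TO m.loResults.Count
         laMax[m.lnFld] = 0
      ENDFOR
      lnMaxFlds = m.loResults.Count
   ENDIF
   FOR lnFld = 0 TO m.loResults.Count - 1 && matches are zero based
      * Match length minus 2 for the surrounding quotes
      lnLen = m.loResults.Item(m.lnFld).Length - 2
      * VFP arrays are one based, hence the + 1
      laMax[m.lnFld + 1] = MAX(laMax[m.lnFld + 1], m.lnLen)
   ENDFOR
ENDFOR
FOR lnFld = 1 TO m.lnMaxFlds
   ? "Field" + TRANSFORM(m.lnFld) + " max width: " + TRANSFORM(laMax[m.lnFld])
ENDFOR
```

Comparing each laMax element with the declared column width on the MySQL
side should then show which fields are being truncated.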
-----Original Message-----
From: ProfoxTech [mailto:[email protected]] On Behalf Of MB
Software Solutions General Account
Sent: Friday, 3 January 2014 8:26 AM
To: [email protected]
Subject: Determining maximum field widths in text file
VFP9SP2
Goal: to determine maximum field widths for entire input/text file.
Input file is Sample.txt. It contains 'n' fields, let's say 5 for this
example: Field1, Field2, Field3, Field4, Field5.
Example of file:
"123","Anna","Manchester United","London, UK",""
"1234","Ann","Gas House Gorillas","New York, NY",""
"12345","Anna Maria","Brooklyn Dodgers","Brooklyn, NY","This is a really long string"
"678901","Santa Anna","New York Rangers","New York City, NY",""
* Notice some commas may appear INSIDE fields.
I want my output array to basically have the maximum widths for each field.
I'm trying to build an import tool and this will help me when the provider
sends me data without field layouts. I figure I'll grab the line with
FGetsEx and then analyze each field from there. Off the top of my head I
can create a stub cursor with each field type being Memo, import the line
read into that cursor and then cycle through each and calculate the max via
LEN(ALLTRIM(Fieldn)), but that seems like an awfully big hammer (in other
words, it seems very 'kludgey').
Mind you, it's not ideal, but it's something to give me an idea of where my
truncations are occurring. I'm importing into a MySQL database and getting
a ton of truncations. I want to see where he's sending me something
different than officially expected. The SHOW WARNINGS command in MySQL
isn't the solution either. I just want to know what fields have longer data
values than he's declaring.
tia,
--Mike