Yes. Open the PDF in Adobe Acrobat, select all, and copy. Open write.exe (Start, Run, 
Write, Enter) and paste. Word would also work, but Write is simpler and faster. Save 
the new document as an MS-DOS text file. Then view the PDF in Acrobat and note three 
things: the beginning page number, the ending page number, and the text that denotes 
page numbers. For the beginning I use one page before the ST segment. For the ending 
I use one page after the SE segment. The text that denotes page numbers is whatever 
appears next to the page number, such as "MAY 2000".
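
As a rough illustration of how the marker text and page range could be used (this is 
not my actual program), here is a minimal Python sketch. It assumes the page number 
sits directly before or after the marker text on its own footer line, and that the 
lines above a footer belong to that page. The name split_pages and the sample text 
are made up for the example.

```python
import re


def split_pages(text, marker, first_page, last_page):
    """Group scraped lines by the page number found next to `marker`.

    Assumption: each page ends with a footer line holding the marker text
    with the page number directly before or after it.
    """
    escaped = re.escape(marker)
    pattern = re.compile(r"(\d+)\s+" + escaped + r"|" + escaped + r"\s+(\d+)")
    pages = {}
    buffer = []
    for line in text.splitlines():
        match = pattern.search(line)
        if match:
            number = int(match.group(1) or match.group(2))
            # Keep only the pages inside the range noted from the PDF.
            if first_page <= number <= last_page:
                pages[number] = buffer
            buffer = []
        else:
            buffer.append(line)
    return pages


if __name__ == "__main__":
    sample = "ST line\nMAY 2000 61\nBHT line\n62 MAY 2000\nSE line\nMAY 2000 63\n"
    for number, lines in split_pages(sample, "MAY 2000", 61, 62).items():
        print(number, lines)
```

Real guides may place the footer differently, so the pattern would need adjusting to 
whatever the pasted text actually looks like.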

In my program I enter the scraped file name and the other information. The program 
reads, parses, and interprets the scraped text, creating one data item in a table for 
each segment element, and for each composite if any exist. Segment name, values, data 
name, generic name, industry name, alias, loop, min length, and max length are loaded 
as well. Rows in the table then have a one-to-one correspondence with the 
implementation guide. Anyone who has worked with this data is aware of the stupidity 
with which it was created, and how difficult it is to identify data items when the 
same names are used repeatedly. So for uniqueness we use the page number combined 
with the segment and element number.
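
To show why the page number matters, here is a small Python sketch of that kind of 
key. The key format, the function names, and the page numbers are invented for the 
example and are not taken from my tables or from any guide.

```python
from collections import Counter


def make_key(page, segment, element):
    """Build a key from page number, segment ID, and element position."""
    return f"{page:04d}-{segment}-{element:02d}"


def find_collisions(keys):
    """Return the keys that occur more than once, in sorted order."""
    counts = Counter(keys)
    return sorted(key for key, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Hypothetical rows: the same segment and element on two different pages.
    rows = [(84, "NM1", 3), (108, "NM1", 3)]

    with_page = [make_key(page, seg, elem) for page, seg, elem in rows]
    without_page = [f"{seg}-{elem:02d}" for _, seg, elem in rows]

    print(find_collisions(with_page))     # no duplicates when the page is included
    print(find_collisions(without_page))  # the two rows collide without the page
```

Without the page number the two rows cannot be told apart, which is exactly the 
situation described next.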

Unfortunately, because of the addenda, our one method of establishing uniqueness is 
gone. There is no longer an explicit page number to use, and there will not be one 
until a new implementation guide is printed.

--
dgm, MMIS Enhancement, 557-9780

>>> [EMAIL PROTECTED] 12/05/02 10:33AM >>>
I would be interested in how the scraping works.  Was the PDF file used?

-----Original Message-----
From: David McDivitt [mailto:[EMAIL PROTECTED]] 
Sent: Thursday, December 05, 2002 11:10 AM
To: WEDI SNIP Transactions Workgroup List
Subject: addenda


A program was written to scrape text from implementation guides and populate
a table with page, segment, element, values, and other information. The
program was made over a year ago and continues to work well. This data is
used in a mapping tool to create mainframe COBOL file layouts. Unfortunately
each implementation guide must be changed pursuant to addenda. Text from
addenda cannot be scraped and processed the same as implementation guides,
and created tables must be updated manually. Some addenda are quite large.
I am wondering if anything exists to help me. Are there any consolidated
implementation guides I can scrape? Does the data I need already exist
somewhere so I can import it? Thanks


--
dgm, MMIS Enhancement, 557-9780



---
The WEDI SNIP listserv to which you are subscribed is not moderated. The
discussions on this listserv therefore represent the views of the individual
participants, and do not necessarily represent the views of the WEDI Board
of Directors nor WEDI SNIP. If you wish to receive an official opinion, post
your question to the WEDI SNIP Issues Database at
http://snip.wedi.org/tracking/.   These listservs should not be used for
commercial marketing purposes or discussion of specific vendor products and
services.  They also are not intended to be used as a forum for personal
disagreements or unprofessional communication at any time.

You are currently subscribed to wedi-transactions as:
[EMAIL PROTECTED] 
To unsubscribe from this list, go to the Subscribe/Unsubscribe form at
http://subscribe.wedi.org or send a blank email to
[EMAIL PROTECTED] 
If you need to unsubscribe but your current email address is not the same as
the address subscribed to the list, please use the Subscribe/Unsubscribe
form at http://subscribe.wedi.org 
