OK, I took the advice from your first post, converted the PDF to text, and dumped it just to see what I would be dealing with. I understand what you are saying about pulling structured data from an unstructured source. I guess my thought was that if the document was created from a form, there might be a way to pull the form fields out once the document had been created... and the answer is no.
Thanks again for the help. I downloaded the suggested utilities and played with them some, and it will obviously be much easier if we can get the form submission and work from there in CF8. Otherwise, it looks like I'll be dealing with the beast of converting to text and parsing, which I guess can still be done in CF7 if the accounting department saves the PDF as text before uploading. That was the main question I was trying to answer. I'll present those options to them and let them decide.

Thanks again.

On Wed, Aug 6, 2008 at 2:35 PM, Josh Adams <[EMAIL PROTECTED]> wrote:

> If you can get the process changed such that what goes out is a PDF form
> that comes back to you populated, it will be very easy for you to work with
> that with CF 8.
>
> Here's a question for you: if the data came back in a Word document, how
> would you get the data out? You would have to pull out all the text and
> parse it, right? A PDF document is no different. We have implemented a lot
> of great functionality for working with PDF documents in CF 8, including the
> ability for you to process DDX that can do things like pulling all the text
> out of a document, but there is no functionality for pulling structured data
> out of an unstructured document because doing such is impossible (or,
> technically, it's document-dependent, which makes it impossible for us to
> build the functionality; however, you can still build the functionality
> yourself for your own documents).
>
> Josh
>
> *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] *On Behalf Of *Jeff
> Howard
> *Sent:* Wednesday, August 06, 2008 2:22 PM
> *To:* [email protected]
> *Subject:* Re: [ACFUG Discuss] CF8 & PDFs
>
> I'm assuming that at some point in the process the PO is a PDF form,
> since the PDF document is fairly static, but when we get it, it is just a
> PDF document. That is kind of what made me ask the question. It is
> definitely in a static format (which is also why I thought about converting
> to text and using some regular expressions to extract the data). So, I
> thought that with that being the case, CF8 may have some new PDF utilities
> to help in the process.
>
> I'll have to consult with my team and see if maybe we can get a copy of the
> PDF form rather than the document, because obviously there are different
> approaches to handling this based on which format we receive the data in.
>
> Thanks.
>
> On Wed, Aug 6, 2008 at 1:22 PM, Josh Adams <[EMAIL PROTECTED]> wrote:
>
> > Are you receiving PDF documents or PDF forms? Tonight I will cover how to
> > use CF 8 to get data from PDF forms; however, I will not cover how to get
> > data out of PDF documents because PDF documents don't have a data structure
> > from which to get data.
> >
> > You can, however, obtain the text of a PDF using CF 8. It's not a built-in
> > tag option, but Raymond Camden's PDFUtils CFC (see
> > http://pdfutils.riaforge.org; note that rather than using the GetPage
> > method, I recommend you use <cfpdf action="merge"> and use only one sub
> > <cfpdfparam> tag, which has the net result of extracting pages from the
> > specified PDF) has a GetText method that uses DDX to extract all the text of
> > a PDF document.
> >
> > Josh
> >
> > *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] *On Behalf Of *Jeff
> > Howard
> > *Sent:* Wednesday, August 06, 2008 12:06 PM
> > *To:* [email protected]
> > *Subject:* [ACFUG Discuss] CF8 & PDFs
> >
> > I just got out of a meeting and came here to ask the ACFUG while I did some
> > research on my own, and coincidentally enough, tonight's meeting looks like
> > it is going to cover my exact question. The problem is that there is no way
> > I will be able to attend the meeting tonight.
> >
> > I also saw mention of it being recorded, so I am going to try to watch
> > that once the details are posted.
> >
> > I figured that it couldn't hurt to still ask my questions, since I am still
> > researching this anyway to report back this afternoon on whether what we
> > need to do is possible and easier in CF8 than CF7.
> >
> > Basically, the situation is this:
> >
> > We get POs from a customer that come in the form of a PDF. Until this
> > point, accounting has been entering the information from the PO into a db by
> > hand. I know that CF8 added many PDF enhancements and was wondering if you
> > can extract information from a PDF to import into a db. The point is to
> > improve accuracy to 100%.
> >
> > Currently we use CF7, and when I presented that as a possibility, I was told
> > we could have CF8 by the end of the week if this were possible. Otherwise,
> > I think the approach I am going to have to take is converting to a text file
> > and then using regular expressions to pull the needed info out and save
> > that to the db.
> >
> > I am not trying to play spoiler to tonight's meeting, but can anyone give
> > some feedback on the feasibility of CF8 to do what I need? If it can't,
> > does anyone have a better suggestion than converting to text and then using
> > regular expressions to accomplish this goal with CF7?
> >
> > Thanks,
> > Jeff

-------------------------------------------------------------
To unsubscribe from this list, manage your profile @
http://www.acfug.org?fa=login.edituserform

For more info, see http://www.acfug.org/mailinglists
Archive @ http://www.mail-archive.com/discussion%40acfug.org/
List hosted by http://www.fusionlink.com
-------------------------------------------------------------
