Re: [PHP] extract data from html
>>> 1. Open the html file in read only mode
>>> 2. Start reading the html file till I encounter a <td> tag (I don't know how to do this)
>>> 3. Grab that data after the <td> tag (and then what?)
>>
>> See http://php.net/manual/en/function.fopen.php and
>> http://php.net/manual/en/function.fgetss.php plus the chapter for
>> whatever DBMS you want to drop the file contents into.
>
> Thanks. One thing just reading the manual without the idea of how the
> function works is of no use. Some examples would help. In fact I did use
> fopen, fgets, fgetss but the problem is that the html tag that I am
> looking for is <td>. Now this is easy, but <td width=25%> or
> <td colspan=7> would give a problem.

(grimace) The PHP manual is actually very well written; I can usually find exactly what I need in 10s. I think your complaint just covers sloppy thinking.

I'd think you should be able to find screen-scraper code around; if not, try this:

- search for '<td'. NOTE: use a case-insensitive search!
- search for the first trailing '>'. Save (this character position + 1).
- search for the first trailing '</td'. Again, case-insensitive!
- store everything between the two; strip all HTML tags, add slashes, and store it.
- increment your file position by 5 characters and repeat.

I'd give you actual code, but I think you could use some manual practice (smirk).

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]
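For anyone following along, the search-and-slice steps above might look roughly like this in PHP. It is only a sketch: `extract_cells` and the sample `$html` string are made-up names, not something from the thread, and it assumes the whole file fits in one string and that tables are not nested. The "add slashes" step is left out on purpose, since escaping is usually better handled where the data goes into the database (for example with prepared statements).

```php
<?php
// Hypothetical helper: collect the text of every <td> cell in an HTML string
// by searching for '<td', the next '>', and the next '</td'.
function extract_cells(string $html): array
{
    $cells = [];
    $pos = 0;
    // stripos() is the case-insensitive search, so <TD> and <td> both match.
    while (($open = stripos($html, '<td', $pos)) !== false) {
        // The first '>' after '<td' ends the opening tag, which skips any
        // attributes such as width=25% or colspan=7.
        $gt = strpos($html, '>', $open);
        if ($gt === false) {
            break;
        }
        $start = $gt + 1;
        $close = stripos($html, '</td', $start);
        if ($close === false) {
            break;
        }
        // Everything between the two positions, with inner markup removed.
        $cells[] = trim(strip_tags(substr($html, $start, $close - $start)));
        // Move past the closing tag: strlen('</td>') is 5.
        $pos = $close + 5;
    }
    return $cells;
}

$html = '<table><tr><TD width=25%><b>Arion</b></td>'
      . '<td colspan=7>Fr.1349</TD></tr></table>';
print_r(extract_cells($html));
```

With the sample string, the function should return the two cell texts `Arion` and `Fr.1349`.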
Re: [PHP] extract data from html
On Thu, 28 Jun 2001, CC Zona wrote:

> In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Adrian D'Costa) wrote:
>
>> 1. Open the html file in read only mode
>> 2. Start reading the html file till I encounter a <td> tag (I don't know how to do this)
>> 3. Grab that data after the <td> tag (and then what?)
>
> See http://php.net/manual/en/function.fopen.php and
> http://php.net/manual/en/function.fgetss.php plus the chapter for
> whatever DBMS you want to drop the file contents into.

Thanks. One thing just reading the manual without the idea of how the function works is of no use. Some examples would help. In fact I did use fopen, fgets, fgetss but the problem is that the html tag that I am looking for is <td>. Now this is easy, but <td width=25%> or <td colspan=7> would give a problem.

Adrian
Re: [PHP] extract data from html
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Adrian D'Costa) wrote:

>>> 1. Open the html file in read only mode
>>> 2. Start reading the html file till I encounter a <td> tag (I don't know how to do this)
>>> 3. Grab that data after the <td> tag (and then what?)
>>
>> See http://php.net/manual/en/function.fopen.php and
>> http://php.net/manual/en/function.fgetss.php plus the chapter for
>> whatever DBMS you want to drop the file contents into.
>
> Thanks. One thing just reading the manual without the idea of how the
> function works is of no use.

It should be exactly of use. Explaining how a function works, and how to use it, is the point of a manual. If there are finer points that you need clarification on *after reading the manual entry*, that's understandable. But for the most part the manual is clear and accessible, even to a newbie. I learned PHP, as a complete novice to that language and programming in general, by simply reading the manual through.

> Some examples would help.

...which is why most manual entries include 1-2 official examples, plus several others in the user annotations. If you're not reading the annotated version of the manual (as in the links provided), then you're only scratching the surface of what the manual has to offer.

> In fact I did use fopen, fgets, fgetss but the problem is that the html
> tag that I am looking for is <td>. Now this is easy, but <td width=25%>
> or <td colspan=7> would give a problem.

See http://php.net/manual/en/function.strip_tags.php (as is noted in the "see also" note and user annotation for fgetss, BTW; did you bother to read the links provided before proclaiming them of no use?)

-- CC
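As a rough illustration of the strip_tags suggestion, here is a sketch that reads a file line by line and drops the markup, so attributes like width=25% stop mattering. `read_text_lines` and the temporary sample file are hypothetical, not from the thread. It pairs fgets with strip_tags rather than calling fgetss, because fgetss was deprecated in PHP 7.3 and removed in PHP 8.0. It also assumes no tag is split across two lines.

```php
<?php
// Hypothetical helper: return the non-empty lines of a file with all
// HTML tags removed.
function read_text_lines(string $path): array
{
    $handle = fopen($path, 'r'); // 'r' opens the file read-only
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $lines = [];
    while (($line = fgets($handle)) !== false) {
        // strip_tags() removes <td width=25%> just as it removes <td>.
        $text = trim(strip_tags($line));
        if ($text !== '') {
            $lines[] = $text;
        }
    }
    fclose($handle);
    return $lines;
}

// Usage with a throwaway sample file (made up for this example).
$path = tempnam(sys_get_temp_dir(), 'cells');
file_put_contents(
    $path,
    "<tr>\n<td width=25%>Arion</td>\n<td colspan=7>Fr.1349</td>\n</tr>\n"
);
print_r(read_text_lines($path));
unlink($path);
```

Lines that contain only tags come out empty and are skipped, so what remains is the cell text in document order.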
[PHP] extract data from html
Hi,

I keep receiving a lot of word documents that I need to extract and put into a mysql table. As of now, I do a cut and paste manually, using a html form to a php script to dump it into the mysql table. Since the format is usually the same (most of the time), I am sure there should be another way to do this.

I converted the word doc into a html file. Now the question is that I need to write a php script to open that file and read thru its contents. I am not that experienced in php but can understand the basics. The steps that I would take are:

1. Open the html file in read only mode
2. Start reading the html file till I encounter a <td> tag (I don't know how to do this)
3. Grab that data after the <td> tag (and then what?)
4. Repeat this process till I encounter the </table> tag.

If anyone could give me the steps in hard code, or any other alternative, I would be happy.

A sample of the word doc (too big to attach):

+-------------------------------------------+
| COS Zurich                                |
+------+------+-----+-------------+---------+
| June | July | Aug | Hotel       | Price   |
+------+------+-----+-------------+---------+
| 28   | 2    | 8   | Arion       | Fr.1349 |
|      |      |     | Inc Village | Fr.1099 |
+------+------+-----+-------------+---------+

This is a table. Each table belongs to a different city. Each row has more than 2 hotels.

What would be the best solution?

TIA

Adrian
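One possible alternative to hunting for tags by hand, assuming PHP's DOM extension is available, is to let DOMDocument parse the converted file and then walk tables, rows and cells. The sketch below is only an illustration: `tables_to_rows` is a made-up name, the sample markup is a guess at what the Word export might contain (using the city, hotel and price values from the post), and nested tables are not handled.

```php
<?php
// Hypothetical helper: return each table as a list of rows, and each row
// as a list of trimmed cell texts.
function tables_to_rows(string $html): array
{
    $doc = new DOMDocument();
    // Word-generated HTML is rarely valid, so collect libxml's parse
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $tables = [];
    foreach ($doc->getElementsByTagName('table') as $table) {
        $rows = [];
        foreach ($table->getElementsByTagName('tr') as $tr) {
            $cells = [];
            foreach ($tr->getElementsByTagName('td') as $td) {
                // textContent ignores attributes and any inner markup.
                $cells[] = trim($td->textContent);
            }
            $rows[] = $cells;
        }
        $tables[] = $rows;
    }
    return $tables;
}

// Assumed sample markup, one table per city.
$html = '<table><tr><td colspan=7>COS Zurich</td></tr>'
      . '<tr><td>Arion</td><td>Fr.1349</td></tr>'
      . '<tr><td>Inc Village</td><td>Fr.1099</td></tr></table>';
print_r(tables_to_rows($html));
```

Because each table comes back as its own array, the first row can be treated as the city name and the remaining rows inserted into the database one hotel at a time.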
Re: [PHP] extract data from html
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Adrian D'Costa) wrote:

> 1. Open the html file in read only mode
> 2. Start reading the html file till I encounter a <td> tag (I don't know how to do this)
> 3. Grab that data after the <td> tag (and then what?)

See http://php.net/manual/en/function.fopen.php and http://php.net/manual/en/function.fgetss.php plus the chapter for whatever DBMS you want to drop the file contents into.

-- CC