Re: [PHP] Re: Extract specific div element from page

2007-06-15 Thread Anthony Hiscox

Oops, I accidentally sent this directly to CK, my apologies.

Thank you for your replies. I didn't explore the JS route because this will
be running in the background, and I didn't want to have to visit the page in
any way. I went looking for an easy way to accomplish this in PHP, but due
to malformed HTML on some sites (not WordPress, as far as I am aware) it
wasn't going to be so easy. Someone in ##php on irc.freenode.net pointed me
to BeautifulSoup, a Python module for scraping pages even if they have bad
HTML. Within a minute I had a script that grabbed the parts I wanted and
even removed the parts I didn't (such as comments). Now I have a Python
script that runs whenever I update the docs on my Palm: it grabs the
page(s), strips out the unimportant stuff, and saves the result to a local
directory, and then I have Sunrise parse that into Plucker document format.
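
For anyone following along, here is a rough sketch of the extraction step.
It is not the BeautifulSoup script itself; it uses only Python's standard
html.parser module and counts nested divs, as discussed in the original
question. The id "content" and the names DivExtractor and extract_div are
made up for illustration, and the sketch assumes the divs inside the target
are balanced, so it will not cope with badly broken pages the way
BeautifulSoup does.

```python
from html.parser import HTMLParser


class DivExtractor(HTMLParser):
    """Collect the inner HTML of the first <div> whose id is target_id."""

    def __init__(self, target_id):
        # convert_charrefs=False keeps entities such as &amp; intact, so
        # they can be written back out unchanged.
        super().__init__(convert_charrefs=False)
        self.target_id = target_id  # hypothetical id, e.g. "content"
        self.depth = 0              # 0 means we are outside the target div
        self.found = False          # True once the target div was opened
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            # Only the first matching div starts a capture.
            if (not self.found and tag == "div"
                    and dict(attrs).get("id") == self.target_id):
                self.found = True
                self.depth = 1
            return
        if tag == "div":
            self.depth += 1  # nested div: remember to skip its closing tag
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> do not change the depth.
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0:
            return
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:
                return  # this closes the target div itself; stop capturing
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append("&#%s;" % name)


def extract_div(html_text, target_id):
    """Return the inner HTML of the target div, or None if it is absent."""
    parser = DivExtractor(target_id)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts) if parser.found else None


if __name__ == "__main__":
    sample = ('<div id="menu">Menu</div>'
              '<div id="content"><p>Post</p><div>Nested</div></div>')
    print(extract_div(sample, "content"))
```

Fetching the page and writing the result to a local file are left out on
purpose; the point is only that the depth counter keeps the capture from
ending at the first nested closing div.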

Once again, thank you for the responses.



On 6/15/07, Dan <[EMAIL PROTECTED]> wrote:


Or you could combine JavaScript with PHP. In JavaScript, something like
document.getElementById('tagId').innerHTML will give you the HTML contents
of the tag you specify. Then do something like
document.form.value = document.getElementById('tagId').innerHTML.
Basically, you're setting a hidden form element to the contents of the div;
then, when you submit the page, you have the content as
$_POST['formYouSetTo']. You could have the JS execute in the submit
button's onclick.

It should be relatively easy if you look up the exact syntax of the
javascript.

- Daniel

"Anthony Hiscox" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Hey folks,
>
> I need to pull the contents inside of a specific div out of a page, and
> write it to a separate file. In this instance I am taking everything
> inside of the relevant <div> tags from a WordPress blog, which will give
> me only the content and not the menus or other stuff. I need to do this
> because the final document will be converted for viewing on a Palm Pilot.
>
> Is anyone aware of a simple solution to this problem, short of parsing the
> entire page and starting when I hit that div opening tag, and stopping
> when I hit the closing tag? One problem I can see with this method is that
> I would have to count divs inside of that div, otherwise I would end too
> early on.
>
> Any advice would be greatly appreciated.
>
> Peace and Love,
> distatica.
>
> --
> -
> Anthony Hiscox
>
> Video Watch Group
> Public Site Currently Under Development
> Group Members Site Fully Operational
> -
>

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php





--
-
Anthony Hiscox

Video Watch Group
Public Site Currently Under Development
Group Members Site Fully Operational
-


[PHP] Re: Extract specific div element from page

2007-06-15 Thread Dan
Or you could combine JavaScript with PHP. In JavaScript, something like
document.getElementById('tagId').innerHTML will give you the HTML contents
of the tag you specify. Then do something like
document.form.value = document.getElementById('tagId').innerHTML.
Basically, you're setting a hidden form element to the contents of the div;
then, when you submit the page, you have the content as
$_POST['formYouSetTo']. You could have the JS execute in the submit
button's onclick.


It should be relatively easy if you look up the exact syntax of the 
javascript.


- Daniel

"Anthony Hiscox" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]

Hey folks,

I need to pull the contents inside of a specific div out of a page, and
write it to a separate file. In this instance I am taking everything
inside of the relevant <div> tags from a WordPress blog, which will give
me only the content and not the menus or other stuff. I need to do this
because the final document will be converted for viewing on a Palm Pilot.

Is anyone aware of a simple solution to this problem, short of parsing the
entire page and starting when I hit that div opening tag, and stopping
when I hit the closing tag? One problem I can see with this method is that
I would have to count divs inside of that div, otherwise I would end too
early on.

Any advice would be greatly appreciated.

Peace and Love,
distatica.

--
-
Anthony Hiscox

Video Watch Group
Public Site Currently Under Development
Group Members Site Fully Operational
-


