On Tue, 2003-06-24 at 16:21, David A. Desrosiers wrote:
>       Using javascript to navigate between links is not HTML, and Plucker
> shouldn't be expected to handle that.

I agree; that's why I try to transform the page before plucking it.

>  Plucker also doesn't handle forms, so
> converting the Javascript to forms won't work either. Convert the links back
> to their native HTML equivalents, and you'll see better results.

What native HTML equivalent do you have in mind here?
The URL is always the same; I only get the additional pages
with a POST request that carries the correct parameters.
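
A minimal sketch of the kind of transformation meant here, assuming the
links look like javascript:gotoPage(N) and the server expects a form field
named "page". Both names are hypothetical; the real site's function and
parameters will differ. Each page is fetched with a POST, stored under its
own file name, and the JavaScript links are rewritten to point at those
local copies:

```python
import re
import urllib.parse
import urllib.request

# Hypothetical link form: <a href="javascript:gotoPage(3)">.
# The real site's function name and arguments will differ.
JS_LINK = re.compile(r'href="javascript:gotoPage\((\d+)\)"')


def local_name(page):
    """File name under which the page fetched via POST is stored."""
    return "page_%d.html" % page


def rewrite_links(html):
    """Replace the JavaScript links with plain links to the local copies."""
    return JS_LINK.sub(
        lambda m: 'href="%s"' % local_name(int(m.group(1))), html)


def fetch_page(url, page):
    """POST the assumed form field 'page' and return the decoded response."""
    data = urllib.parse.urlencode({"page": page}).encode("ascii")
    # Passing a data argument makes urlopen send a POST instead of a GET.
    with urllib.request.urlopen(url, data) as response:
        return response.read().decode("utf-8")


def mirror(url, pages):
    """Save each page as a static file with ordinary HTML links."""
    for page in pages:
        html = rewrite_links(fetch_page(url, page))
        with open(local_name(page), "w", encoding="utf-8") as out:
            out.write(html)
```

The saved files contain only ordinary links, so they no longer depend on
JavaScript or on POST requests.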

>  Perhaps the
> --url-pattern options would help here. Apply a regex to convert the invalid
> Javascript code to HTML, and parse again.

At the moment I am trying to do it with JPluck, because the Python 
parser does not support the UTF-8 encoding used by this site.

-- 
Freundliche Gruesse / Best Regards

Patrick Ohly
Senior Software Engineer
--------------------------------------------------------------------
//// pallas 
Pallas GmbH / Hermuelheimer Str. 10 / 50321 Bruehl / Germany
[EMAIL PROTECTED] / www.pallas.com
Tel +49-2232-1896-30 / Fax +49-2232-1896-29
--------------------------------------------------------------------

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
