I also care about HTTP redirects, JavaScript execution, and CSS lookup (e.g. removing hidden elements from the DOM).
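
As a rough illustration of the hidden-element case, here is a minimal Python sketch that drops subtrees hidden through inline attributes and keeps only the visible text. It is not WebKit code and is no substitute for a real engine: it only checks the `hidden` attribute and inline `style` values, ignores external stylesheets and anything set by JavaScript, and assumes well-formed markup with matching closing tags. The names `is_hidden`, `VisibleTextExtractor`, and `visible_text` are hypothetical, made up for this example.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Elements whose text content is never rendered as page text.
NON_RENDERED_TAGS = {"script", "style"}


def is_hidden(attrs):
    """Return True if the inline attributes mark the element as hidden.

    Assumption: only the `hidden` attribute and inline `style` are checked;
    rules from stylesheets or scripts are not resolved here.
    """
    attr_map = dict(attrs)
    if "hidden" in attr_map:
        return True
    style = (attr_map.get("style") or "").replace(" ", "").lower()
    return "display:none" in style or "visibility:hidden" in style


class VisibleTextExtractor(HTMLParser):
    """Collect text that is not inside a hidden subtree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.hidden_depth = 0  # > 0 while inside a hidden subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Once inside a hidden subtree, every nested element deepens it.
        if self.hidden_depth or tag in NON_RENDERED_TAGS or is_hidden(attrs):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.hidden_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the visible text of `html`, joined by single spaces."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

For example, `visible_text('<p>Shown</p><p style="display: none">Secret</p>')` keeps only the first paragraph's text. Handling hidden elements correctly in general needs the computed style of each element, which is why a full engine is attractive for this purpose.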

On Tue, Sep 1, 2009 at 5:15 AM, tonikitoo (Antonio Gomes)<[email protected]> wrote:
> Are you interested in fetching web page content only, or does how WebKit
> would lay it out also matter for your "crawler"?
>
> I am asking because, even if you have no UI, WebKit would not just get
> the page source (and its associated resources) but also parse,
> decode, render, and perform all the other steps involved. These could be a
> potential performance bottleneck for you if you just care about
> fetching web page source/content (which is usually what a crawler cares
> about).
>
> Please be more specific about your needs ...
>
> On Fri, Aug 28, 2009 at 9:02 PM, n179911<[email protected]> wrote:
>> On Tue, Aug 25, 2009 at 10:53 PM, Nevo<[email protected]> wrote:
>>>
>>> 2009/8/26 Dan <[email protected]>
>>>>
>>>> Hi list,
>>>> Just posted this to webkit-dev, and was advised that this is a better list
>>>> for the question. Sorry if this is a little vague... but, does anyone have
>>>> any general guidance as to where I'd start with webkit if I wanted to build
>>>> a headless web client, along the lines of a crawler / bot, on top of it?
>>>> Would I be best to use individual parts of the code, or implement a browser
>>>> and hide the UI side of it?
>>>> I'm not much of a C++/ObjC developer, so I can't begin to expect to be
>>>> able to do this immediately, but any tips you can give would be greatly
>>>> appreciated.
>>>
>>> You might take a look at WebKit's WebInspector, which helps you view the DOM
>>> hierarchy in a tree style, so you could get a good sense of how WebCore
>>> manipulates/traverses a web page.
>>>
>>
>> I have a related question on this kind of WebKit usage as well. How
>> can we run WebKit without any display (e.g. X server on Linux)? For
>> web crawler purposes, it does not need to display anything on screen.
>> Is there a WebKit configuration for this kind of thing?
>>
>> Thank you.
>>
>>> Nevo
>
> --
> --Antonio Gomes
_______________________________________________
webkit-help mailing list
[email protected]
http://lists.webkit.org/mailman/listinfo.cgi/webkit-help
