Thanks for the great answer; Scrapinghub looks really promising, by the way. Generating Parsley sounds interesting, but I feel you've basically got that covered with slybot and a UI on top of it.
I'm currently back to looking in the direction of an HTTP API, yet I feel the project as we discussed it before is a bit immature on its own. If anyone has used an HTTP API for their Scrapy spiders before and needed more intricate functionality, please get back to me so we can discuss how such an HTTP API could be extended beyond communicating with a simple spider. In the meantime, I'll keep looking into it on my own.

On Monday, 17 February 2014 16:16:33 UTC+1, shane wrote:
>
>> The ideas page lists an intermediate task about an HTTP API for Scrapy
>> Spiders, a task that probably fits me best. The mentor listed is Shane
>> Evans, which is just my bad luck as he seems to be a busy guy. I've got
>> some ideas around this project, so if anyone would be willing to
>> (informally) talk them over it would be greatly appreciated.
>
> I got in touch with Ruben and we're discussing it. We plan to put our IRC
> details on the wiki, which should help.
>
>> As soon as I started using Scrapy I had this short but vivid dream of
>> simply anyone having access to Scrapy through a browser plugin that would
>> interactively and visually construct spiders for the user, without the
>> user ever having to touch any Python code. Later I thought I'd found
>> exactly this idea on one of your ideas pages, but I can't seem to find it
>> again. This is a project I would love to work on even more than the
>> aforementioned one, but I'm still investigating the feasibility. Let's be
>> honest, it would be really cool if you could just select some text on a
>> page as a certain 'thing', click links for the crawler to investigate,
>> and have it crawl all that for you. This is even more of a shout-out to
>> any interested users or developers to discuss this subject with me,
>> because this is what I'd love to focus on. Be it technical or simply
>> talking ideas, just give me a holla.
> The idea that was there was a browser extension that would help with
> spider generation. The reason we pulled it was that it wasn't well
> developed enough in time for the proposal. We should (and probably will)
> put it back on the draft ideas in some form.
>
> Did you see the Scrapinghub autoscraping tool?
> http://scrapinghub.com/autoscraping
> We're currently working on an open source version of the UI (the back end
> is https://github.com/scrapy/slybot ), which will be available soon. The
> UI will generate slybot spiders, but we also had an idea for the GSoC to
> generate python code or files for parslepy (
> https://github.com/redapple/parslepy ). If this area is your biggest
> interest, we can try and come up with something interesting here for you.
>
>> For the next couple of weeks you'll see me at #scrapy as Randomaniac and
>> around the mailing lists. I'm planning on delving into the scrapy code
>> to familiarise myself and hopefully manage a couple of patches where
>> needed at the same time.
>
> great
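To make the "communicating with a simple spider" baseline concrete, here is a rough sketch of the minimal kind of HTTP API I have in mind. Everything here is made up for illustration (the endpoint, the JSON payload shape, the port); it isn't an existing Scrapy or scrapyd interface, and it just shells out to the `scrapy` CLI, which is assumed to be on PATH:

```python
# Hypothetical sketch: a minimal HTTP API that schedules Scrapy spider runs.
# The POST payload shape {"spider": ..., "args": {...}} is invented for this
# example; a real design would also need job tracking, results, and auth.
import json
import re
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_crawl_command(spider, args=None):
    """Build a `scrapy crawl` command line, passing key=value pairs via -a.

    Names are restricted to word characters and dashes so that untrusted
    HTTP input can't smuggle extra shell-ish tokens into the command.
    """
    if not re.fullmatch(r"[\w-]+", spider):
        raise ValueError("invalid spider name: %r" % spider)
    cmd = ["scrapy", "crawl", spider]
    for key, value in (args or {}).items():
        if not re.fullmatch(r"[\w-]+", key):
            raise ValueError("invalid argument name: %r" % key)
        cmd += ["-a", "%s=%s" % (key, value)]
    return cmd


class CrawlHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Expect JSON like {"spider": "books", "args": {"category": "fiction"}}
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        try:
            cmd = build_crawl_command(payload["spider"], payload.get("args"))
        except (KeyError, ValueError) as exc:
            self.send_response(400)
            self.end_headers()
            self.wfile.write(str(exc).encode())
            return
        subprocess.Popen(cmd)  # fire and forget; no job state is tracked
        self.send_response(202)
        self.end_headers()
        self.wfile.write(b"scheduled")


# To serve: HTTPServer(("127.0.0.1", 6800), CrawlHandler).serve_forever()
```

Even this toy version shows why I think the project needs more than "start a crawl": anything beyond fire-and-forget (job status, streaming scraped items back, cancelling runs) is where the interesting design questions are.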
