Hi Parth,

Mikhail actually requested to be removed from the mentors list for the
benchmarking suite idea. (He may be a backup mentor in the future if needed,
but he does not have enough insight into the idea at the moment.)
The other mentor to contact would be Daniel Grana.

Best,
Paul.

On Tue, Mar 7, 2017 at 11:07 AM, Parth Verma <vermapart...@gmail.com> wrote:

> Hi Paul,
>
> I am currently looking at the issues that you mentioned and have studied
> how benchmarking works. I am also going through the various issues
> related to memory leaks and have opened an issue (
> https://github.com/scrapy/scrapy/issues/2629).
>
> How do I get in contact with Mikhail?
>
> Parth
>
> On Friday, 3 March 2017 22:16:55 UTC+5:30, Paul Tremberth wrote:
>>
>> Hello Parth,
>>
>> Sorry we did not reply to your first message in February.
>> It's great that you're interested in participating in GSoC with a Scrapy
>> project!
>>
>> For the "Scrapy benchmarking suite" idea, you may want to get in touch with
>> Daniel and Mikhail, who are listed as potential mentors for the project.
>>
>> A few pointers in the meantime:
>> Scrapy currently has a `scrapy bench` command that tries to fetch pages
>> at maximum speed:
>> https://docs.scrapy.org/en/latest/topics/benchmarking.html#benchmarking
>> You can check how that is implemented and what it does and does not do.
>> It's quite naive and may not represent a realistic use case with large or
>> broken HTML files, or broad crawls with lots of domains visited.
>>
>> Scrapy commands also have an (undocumented?) --profile option to write
>> cProfile stats.
>> You can try it out to see what you can get out of it.
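
To make the cProfile pointer above concrete, here is a minimal sketch of how a cProfile stats file can be inspected with the standard-library `pstats` module. It assumes only that the file is in the usual cProfile dump format; `busy_work`, `write_stats` and `top_functions` are hypothetical names used for illustration, and the stand-in workload replaces a real crawl so the example runs without Scrapy.

```python
import cProfile
import io
import os
import pstats
import tempfile


def busy_work(n):
    """Stand-in workload (hypothetical); a real run would be a Scrapy crawl."""
    return sum(i * i for i in range(n))


def write_stats(path):
    """Profile the stand-in workload and dump cProfile stats to `path`."""
    profiler = cProfile.Profile()
    profiler.enable()
    busy_work(100000)
    profiler.disable()
    profiler.dump_stats(path)


def top_functions(path, limit=10):
    """Return a text report of up to `limit` entries sorted by cumulative time."""
    out = io.StringIO()
    stats = pstats.Stats(path, stream=out)
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    # Use a temporary file so the sketch leaves nothing behind.
    fd, stats_path = tempfile.mkstemp(suffix=".cprofile")
    os.close(fd)
    try:
        write_stats(stats_path)
        print(top_functions(stats_path))
    finally:
        os.remove(stats_path)
```

With a stats file produced by an actual Scrapy run, only `top_functions` would be needed; sorting by cumulative time is one reasonable starting point for spotting hot spots, not the only one.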
>>
>> There are (at least) a couple of issues about potential memory leaks:
>> - https://github.com/scrapy/scrapy/issues/482
>> - https://github.com/scrapy/scrapy/issues/482
>>
>> Another question: maybe Python 2 and Python 3 show differences in terms
>> of CPU and memory usage?
>>
>> I would assume a successful project for GSoC would allow investigating
>> such issues and finding the root causes (if not fixing them).
>>
>> Hope this helps,
>> Paul.
>>
>>
>> On Fri, Mar 3, 2017 at 11:51 AM, Parth Verma <vermap...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm interested in the "Scrapy benchmarking suite" idea in the ideas list for
>>> GSoC '17.
>>> Please let me know what the prerequisites are for it.
>>>
>>> Thanks.
>>>
>>>
>>> On Saturday, 11 February 2017 21:20:19 UTC+5:30, Parth Verma wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am Parth Verma, a second-year undergraduate pursuing an MSc in
>>>> Mathematics and Computing at IIT Kharagpur, India.
>>>> I have been doing open-source programming for a year. My GitHub profile
>>>> is https://github.com/Parth-Vader.
>>>> My programming knowledge includes Python (intermediate), C
>>>> (intermediate), C++ (intermediate), HTML/CSS (basic) and Bash. I use Ubuntu
>>>> 16.04 as my main operating system and Windows 8 for gaming.
>>>> I have been doing Data Analytics, and for that, I need to collect data
>>>> from various online sources and that's why I used Scrapy.
>>>>
>>>> I am interested in the Scrapy benchmarking suite, since I have prior
>>>> knowledge of various algorithms and I want to learn memory management in
>>>> CPUs. What should be my next steps?
>>>>
>>>> Furthermore, I would like to suggest an idea.
>>>>
>>>> A new section in the official documentation could be added where people
>>>> could share the configuration files that they used to successfully scrape
>>>> data from a specific website (by successful, I mean not getting banned and
>>>> getting a good speed). This way, I believe, people without any prior
>>>> knowledge of HTML, Python or Shell could easily use Scrapy to get data
>>>> from those specific sites.
>>>> In addition, we could create benchmarks for those sites as well.
>>>>
>>>> Thanks.
>>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "scrapy-users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to scrapy-users...@googlegroups.com.
>>> To post to this group, send email to scrapy...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/scrapy-users.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
