Hi Anuj,

I understand your situation - exams can be very stressful - but 
unfortunately a contribution to Scrapy or a related project (e.g. w3lib) is 
a hard requirement. It is the best way for us to understand how well we can 
work with a student, and the best way for a student to find out whether 
they enjoy working with us. We can't accept a proposal without this 
information.

On Thursday, March 26, 2015 at 5:16:22 PM UTC+5, Anuj Bansal wrote:
>
> Sir,
>
> My exams finished just yesterday, so I can finally get back to work on 
> Scrapy. I have submitted my GSoC proposal. I know I'm late, but I will 
> surely make up for the lost time.
> I have created a blog where I will be posting my work with scrapy (
> http://ahhda.blogspot.in/).
> The proposal, however, requires a link to a contribution, which I don't 
> have as I was busy with college. I have, though, contributed to SymPy (
> https://github.com/sympy/sympy/pull/9121) and have given this link in the 
> proposal. I hope this is acceptable.
>
> I have also created a copy at (
> https://docs.google.com/document/d/1FUg1fhdIWS5HLh8zjbPTpR6kXwsG60pRdJ6QsF4m3u0/edit).
>  
> Do tell me if you find something missing or wrong in the proposal.
>
> The results will be announced on 27th April. Till then I will continue to 
> work on Scrapy and fix some bugs.
>
> Looking forward to a great summer :)
>
> Regards,
> Anuj
>
>
> On Thursday, March 19, 2015 at 1:13:46 AM UTC+5:30, Mikhail Korobov wrote:
>>
>> Hi,
>>
>> On Wednesday, March 18, 2015 at 11:52:19 PM UTC+5, Anuj Bansal wrote:
>>>
>>> Sir,
>>>
>>> I have learned the differences between Python 2 and Python 3. I have 
>>> created a Google Doc (
>>> https://docs.google.com/document/d/1xf7OtuyB5b6npCOLalZ-yjPZEcoKNb19iimfElyDino/edit)
>>>  
>>> in which I have written the common porting errors I could find after 
>>> going through various blogs and projects, and their corresponding syntax 
>>> corrections. You can add your suggestions, or anything I have missed, by 
>>> going to the link and editing the document directly. Do tell me if you 
>>> find something wrong with the approach.
>>>  
>>>
>>>> The recommended way is to use the "six" Python module. Some parts of Scrapy 
>>>> are already ported to Python 3 - see e.g. 
>>>> https://travis-ci.org/scrapy/scrapy/jobs/54761340 - 235 tests pass on 
>>>> Python 3.3. To get started, try cloning Scrapy and running some tests using 
>>>> tox (as described in the docs). 
>>>>
>>>
>>> I got some errors while setting up Scrapy and found out that I had to 
>>> install libssl-dev, libffi-dev, python-dev and libxml2-dev, as mentioned 
>>> at (
>>> http://stackoverflow.com/questions/17611324/error-when-installing-scrapy-on-ubuntu-13-04
>>> ).
>>> Shouldn't these be added to the Scrapy requirements? Should I create an 
>>> issue for this? I'm currently working on Ubuntu 14.04.
>>>
>>
>> Scrapy's requirements.txt lists Python packages (not system packages). 
>> There are some install notes here: 
>> http://doc.scrapy.org/en/latest/intro/install.html
>> libffi-dev is a dependency of PyOpenSSL; libxml2-dev is a dependency of 
>> lxml. I'm not sure - maybe we can document all of this. It would be 
>> documenting the requirements of our requirements, though.
>>  
>>
>>>  
>>>
>>>> You can also check the 
>>>> https://github.com/scrapy/scrapy/blob/master/tests/py3-ignores.txt 
>>>> file - try uncommenting something and running the tests again to see what's 
>>>> not ported. We can't rely only on tests when porting, but they are a good 
>>>> start.
>>>>
>>>
>>> This is great! It would really help me plan my strategy. 
>>>  
>>>
>>>> This URL encoding thing is where we stopped. Without having a solid 
>>>> solution we can't port scrapy.Request, and without scrapy.Request most 
>>>> other Scrapy components don't work.
>>>>
>>>  
>>> Handling binary data is the trickiest issue people face when supporting 
>>> Python 2 and Python 3. So the first thing to do would be to find the best 
>>> solution for URL encoding; only then will we be able to port the other 
>>> Scrapy components.
>>> So I should first take a look at the w3lib project.
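The bytes-vs-text divergence discussed above can be sketched as follows. This is a minimal illustration, not Scrapy's or w3lib's actual code; `safe_url` is a hypothetical helper, and the version branching mirrors what the "six" package provides:

```python
import sys

# On Python 2, urllib.quote expects a byte string; on Python 3,
# urllib.parse.quote accepts both text and bytes.  Branching on the
# version (as six does) papers over the moved import and the type split.
if sys.version_info[0] >= 3:
    text_type = str
    from urllib.parse import quote
else:
    text_type = unicode  # noqa: F821 - defined on Python 2 only
    from urllib import quote

def safe_url(url, encoding="utf-8"):
    """Hypothetical helper: encode text to bytes before quoting so the
    result is identical on both Python versions."""
    if isinstance(url, text_type):
        url = url.encode(encoding)
    return quote(url, safe=b"/:?&=#")

# Non-ASCII characters are percent-encoded as UTF-8 on both versions:
# safe_url(u"http://example.com/caf\xe9") -> "http://example.com/caf%C3%A9"
```

The key point is that the encoding decision (which charset to use for non-ASCII URLs) has to be made explicitly once text and bytes are distinct types; that is the open design question the thread refers to.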
>>>
>>> As quoted in the book (
>>> http://python3porting.com/strategies.html#python-2-and-python-3-without-conversion
>>> ):
>>>
>>> "My recommendation for the development workflow if you want to support 
>>> Python 3 without using 2to3 is to run 2to3 on the code once and then 
>>> fix it up until it works on Python 3. Only then introduce Python 2 support 
>>> into the Python 3 code, using six where needed. Add support for Python 
>>> 2.7 first, and then Python 2.6. Doing it this way can sometimes result in a 
>>> very quick and painless process."
>>>
>>> Is this the recommended method?
>>>
>>
>> Usually I just start with the existing code and add Python 3 support to 
>> it using the "six" package and common sense :) The method from the book 
>> sounds OK, but you need to be very careful not to break the existing 
>> Python 2.x code. __future__ imports can also be helpful (2to3 doesn't add 
>> them). We don't need Python 2.6 support. 
>>  
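The __future__ imports mentioned above can be sketched like this. It is a minimal illustration of what those imports do, not a statement of Scrapy's porting policy (whether to enable unicode_literals, in particular, is a per-project choice):

```python
# Placed at the top of a module, these make Python 2 behave like Python 3
# for the features listed.  2to3 does not insert these lines for you.
from __future__ import absolute_import, division, print_function, unicode_literals

# print is a function on both versions now:
print("porting", "in", "progress")

# / is true division on both versions (use // for floor division):
assert 3 / 2 == 1.5
assert 3 // 2 == 1

# with unicode_literals, plain string literals are text on both versions:
assert isinstance("url", type(u""))
```

Starting from the Python 2 code and layering these imports on (rather than running 2to3 once and back-porting) is the approach described in the reply above.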
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"scrapy-users" group.
Visit this group at http://groups.google.com/group/scrapy-users.