Hi,

This week I:
- Implemented crawling logic for mapping vulnerability identifiers to equivalent vulnerabilities that the patch-finder can parse. This mainly involved implementing a spider that parses various content types; so far the logic is in place for three: text/html, application/json and text/plain (a rough sketch of the dispatch follows after this list).
- Wrote utilities used by the above-mentioned spider.
- Manually tested the spider on DSAs[1], RHSAs[2] and GLSAs[3].
- Wrote unit tests for most of the above functionality.
- Looked into Scrapy's Crawler API[4], its interaction with the networking framework Twisted[5], and Scrapyd[6]. This was to analyse what a multiprocessing implementation would look like and how it would function with Scrapy (see the second sketch below).
- Learnt about database terminology and concepts such as Object Relational Mapping, Data Access Objects and Database Abstraction Layers.
- Looked into various Python DAL and ORM projects such as PonyORM[7], SQLAlchemy[8] and PyDAL[9] (a small ORM illustration is included below as well).
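
To make the content-type handling concrete, here is a minimal sketch of the dispatch idea. It is not the actual patch-finder spider; all the names here (VulnSpider, parse_html, parse_json, parse_plain) are illustrative.

import json

import scrapy


class VulnSpider(scrapy.Spider):
    """Illustrative spider: branch on the Content-Type header and delegate."""

    name = "vuln_sketch"

    def parse(self, response):
        # Scrapy header values are bytes; strip any "; charset=..." suffix.
        content_type = (
            response.headers.get("Content-Type", b"").decode().split(";")[0].strip()
        )
        if content_type == "application/json":
            yield from self.parse_json(response)
        elif content_type == "text/plain":
            yield from self.parse_plain(response)
        else:
            # Treat text/html (and anything unrecognised) as HTML.
            yield from self.parse_html(response)

    def parse_html(self, response):
        # Pull links out of the page with CSS selectors.
        for href in response.css("a::attr(href)").getall():
            yield {"link": response.urljoin(href)}

    def parse_json(self, response):
        # Hand the decoded document on for identifier extraction.
        yield {"data": json.loads(response.text)}

    def parse_plain(self, response):
        # Scan raw text line by line for identifiers of interest.
        for line in response.text.splitlines():
            if "CVE-" in line:
                yield {"line": line.strip()}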
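
On the Crawler API / multiprocessing front, the pattern I was looking at is roughly the following (a sketch only, with a made-up placeholder spider). Twisted's reactor cannot be restarted within a process, which is one reason to isolate each crawl in its own process.

import multiprocessing

import scrapy
from scrapy.crawler import CrawlerProcess


class DummySpider(scrapy.Spider):
    # Placeholder spider, used only for this illustration.
    name = "dummy"
    start_urls = ["https://www.debian.org/security/"]

    def parse(self, response):
        yield {"url": response.url}


def run_crawl():
    # CrawlerProcess starts and stops the Twisted reactor for us.
    process = CrawlerProcess(settings={"LOG_ENABLED": False})
    process.crawl(DummySpider)
    process.start()  # blocks until the crawl finishes


if __name__ == "__main__":
    # Each crawl gets a fresh process, and therefore a fresh reactor.
    worker = multiprocessing.Process(target=run_crawl)
    worker.start()
    worker.join()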
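
And to illustrate what the ORM concept buys us, a tiny SQLAlchemy example; the Patch model and its columns are made up for this email and are not a proposed schema.

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()


class Patch(Base):
    # A Python class mapped to a database table: the "object relational mapping".
    __tablename__ = "patches"

    id = Column(Integer, primary_key=True)
    vuln_id = Column(String)    # e.g. a CVE identifier
    patch_url = Column(String)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()

# Rows are created and queried through Python objects, not hand-written SQL.
session.add(Patch(vuln_id="CVE-2019-0001", patch_url="https://example.com/fix.patch"))
session.commit()
print(session.query(Patch).filter_by(vuln_id="CVE-2019-0001").one().patch_url)
session.close()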

My working repository: https://github.com/jajajasalu2/patch-finder

Cheers,
Jaskaran

[1] https://www.debian.org/security/
[2] https://access.redhat.com/security/security-updates/#/security-advisories
[3] https://security.gentoo.org/glsa
[4] https://docs.scrapy.org/en/latest/topics/api.html
[5] https://github.com/twisted/twisted
[6] https://github.com/scrapy/scrapyd
[7] https://github.com/ponyorm/pony/
[8] https://github.com/sqlalchemy/sqlalchemy
[9] https://github.com/web2py/pydal