Hi, I have updated the timeline of the work during GSoC, based on our discussion:

TIMELINE:
----------------

April - May 24:
-> Community Bonding Period
-> Familiarization with codebase
-> Familiarization with UDD
-> Finalization of working plan and skeleton code

Week 1 (May 25 - June 1):
-> Add details of Blends like Description, Remarks, Responsible to UDD.
-> Formalize Registration and Donation into the debian/upstream/metadata
   file of the corresponding packages.
-> Extract information from this file into UDD.

Weeks 2-3-4 (June 2 - June 25):
-> Rewrite tasks.py to use UDD exclusively.
-> Rewrite blendstasktools.py (the module tasks.py depends on) to use
   only UDD.

Week 5 (June 26 - July 3):
-> Mid-term evaluation
-> Writing tests
-> Documentation work

Weeks 6-7 (July 4 - July 18):
-> Complete refactoring of tasks.py & blendstasktools.py.
-> Implement additional features:
   - Enable a Blend to read the metapackages of other relevant Blends
     and their dependencies
   - Issue a warning in case of duplicated prospective package entries
     in a tasks file
-> Write tests to check implemented functionalities

Weeks 8-9 (July 19 - August 3):
-> Co-ordinate with NeuroDebian; get their inputs on whether they need
   JSON export of all data
-> Modify the way data is rendered if needed
-> Work on improving the time needed to execute bugs_udd.py

Weeks 10-11-12 (August 4 - August 25):
-> Finalize work
-> Finish writing tests
-> Complete Documentation
-> Cleanup Code
-> Final pep8 draft revision

Beyond August 25:
-> Keep contributing to Blends Web Sentinels
-> Get associated with one of the Debian Pure Blends

Can you please review it and give me your feedback?

Thanks in advance.

On Thu, Mar 26, 2015 at 7:52 AM, Akshita Jha <[email protected]> wrote:

> Hi,
>
> On Wed, Mar 25, 2015 at 1:42 PM, Andreas Tille <[email protected]> wrote:
>
>> Hi Akshita,
>>
>> On Wed, Mar 25, 2015 at 11:07:52AM +0530, Akshita Jha wrote:
>> > The GSoC task is to rewrite tasks.py to exclusively use UDD. I was thinking
>> > of coming up with a more specific timeline to get a clearer picture.
>> > While going through the code - tasks.py and blendstasktools.py, I made a few
>> > observations, which I wanted to discuss with you:
>>
>> thanks for your engaged work on this topic.
>>
>
> Thank You.
>
>> > 2) Is the additional information (descriptions and remarks) from all blends
>> > to be added to UDD? Or, are there only some blends whose information is to
>> > be updated?
>>
>> Yes. I intend to add the descriptions and remarks as well as
>> Responsible to another blends related table in UDD. We also need to
>> work on formalising Registration and Donation into the
>> debian/upstream/metadata file of the according packages and need to
>> extract the information from there into UDD as we do with the
>> bibliographic information. Regarding the Published-* fields: Since a
>> long time I'm telling the people who injected them that we will drop
>> this interface and debian/upstream/metadata will be the only way to
>> inject publications. We need to push harder on this front. I did some
>> overhaul of most of debian/upstream/metadata in the last year but some
>> might be missing and most of the remaining entries do not even have a
>> packaging skeleton in VCS (which would be a precondition). I think we
>> need to do a short ping to the "Responsible" person if the entry is
>> needed any more and whether help might be needed to create a packaging
>> skeleton in VCS. BTW, do you have some basic Debian packaging skills?
>>
>
> I am sorry but I cannot say that I have basic Debian packaging skills.
> However, I am more than willing to learn. Is there something relevant I can
> work on?
>
>> > 3) I am assuming rewriting tasks.py also involves some changes to
>> > blendstasktools.py. I noticed a few things in blendstasktools.py, which I
>> > feel could be improved upon:
>>
>> I think it is *mainly* rewriting blendstasktools.py. :-)
>>
>> > i) In blendstasktools.py, some variable names are python keywords.
>> > I think it is better if we use some different variable names. Changing the
>> > names here might involve changing these names in other dependant modules
>> > also.
>> >
>> > ii) Also, I feel that we could rewrite some portions of blendstasktools.py
>> > using the DRY (Don't repeat yourself) principle, so that it is easier to
>> > maintain.
>>
>> Well, these are really kind words for "blendstasktools.py is a dirty
>> hack."
>> :-)
>>
>> > iii) Do we plan on replacing svn with git finally? Or is it good this way?
>> > I feel it is preferable to use git, because I don't think svn handles proxy
>> > servers very well (I faced this issue).
>>
>> Once we have injected all data into UDD this question becomes orthogonal
>> since blendstasks.py will not have to deal with any VCS at all any more.
>> However, for the UDD importer we might need to stick onto this but the
>> VCS interface was implemented previously so there is no need to worry
>> about this. Generally speaking I'd say: While I tend to prefer Git
>> over SVN some active people just stick to SVN and I do not want to force
>> them to something else. So SVN support should remain if we do not have
>> really strong reasons to drop it.
>>
>> > iv) In the class DependantPackage, there is a comment which says:
>> >     self.PrintedName = None # Only for Meta package names - no use for a real dependant package
>> >                             # FIXME -> object model
>> > Can you please brief me about this?
>>
>> The tasks file has a field "Task" (which no normal package has). While
>> the final package name is created via $BLENDNAME-$TASKFILENAME (for
>> instance med-bio) the Task has some "human readable name" (which would
>> be a better word for PrintedName). I do not remember what I wanted to
>> say with "object model", sorry. I do not think that you need to care
>> about this.
>> You get the value from the title field in
>>
>> udd=# select blend, task, title from blends_tasks where task = 'bio';
>>    blend    | task |  title
>> ------------+------+---------
>>  debian-med | bio  | Biology
>>
>> (to stick to the example above).
>>
>> > v) Similarly, in the class TaskDependencies(), there's a comment:
>> >     # If a Blend just bases on the meta package of an other Blend (this is the
>> >     # case in Debian Science which bases on med-bio for biology and gis-workstation
>> >     # for geography it makes no sense to build an own sentinel page but read
>> >     # meta package information of other meta packages and include the content
>> >     # of these while enabling to add further Dependencies as well
>> >     #
>> >     # metadepends should be a SVN URL
>> >     #
>> >     # This is NOT YET implemented
>>
>> This is a really long wanted missing feature which IMHO will be pretty
>> easy to implement when basing on UDD. Just have a look at
>>
>> http://blends.debian.org/science/tasks/biology
>>
>> It pretty much contains no packages except from some non-microbiology
>> packages which are not used in medical microbiology. In addition it has
>> med-bio as Recommends and med-bio-dev as Suggests. However a user does
>> not want to see the metapackages here but rather the list of packages
>> coming with the metapackage. So what we need to approach is to
>> "resolve" these Dependencies for rendering the tasks page. Is this a
>> sensible explanation or do I need to explain it more verbosely?
>>
>
> Thanks a lot. After your explanation, things are much clearer.
>
>> > vi) Also, in GetTaskDependencies():
>> >     # TODO: warn about possibly duplicated prospective package entries in tasks files
>>
>> This would be something for the UDD importer. I guess it will be a
>> requirement since we have put a primary key on the package name (if I'm
>> not misleaded) and thus we need to check first whether a package with
>> this name is just in the table.
>> I also think this is simple to do.
>>
>> > Can we try implementing the above features during GSoC?
>>
>> Yes, I think the rewrite will simplify a lot of these things and some of
>> them are simply solved by design.
>>
>> > 4) We can write tests to check the functionalities, instead of making minor
>> > changes and running the code on the entire data set. It'll save a lot of
>> > time, but the issue here might be to include all possible boundary
>> > conditions. What is your take on this?
>>
>> I'm in favour of sensible testing. So if you see any chance to
>> implement tests I'd value it higher than implementing more and more
>> features.
>>
>> Hope this answers all your questions. Feel free to bother me about
>> more details.
>>
>
> Thanks again for answering all my questions. This has really helped me get
> a clearer picture of the task to be accomplished.
>
> Regards,
> Akshita Jha
>

--
Akshita Jha
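
A rough sketch of the metapackage "resolving" described under point v) in the quoted discussion, in case it helps to make the idea concrete. It is only an illustration under stated assumptions: the dictionaries stand in for data that would really come from UDD, and the names METAPACKAGES, BIOLOGY_TASK and resolve_task_dependencies, as well as every package name other than med-bio and med-bio-dev, are hypothetical. Only one level of metapackages is expanded.

```python
# Hypothetical sample data standing in for rows that would be read from UDD.
# Maps a metapackage name to the packages it pulls in.
METAPACKAGES = {
    "med-bio": ["example-aligner", "example-viewer"],
    "med-bio-dev": ["libexample-dev"],
}

# Direct dependencies of one task as (relation, package name) pairs.
BIOLOGY_TASK = [
    ("Depends", "example-ecology-tool"),
    ("Recommends", "med-bio"),
    ("Suggests", "med-bio-dev"),
]


def resolve_task_dependencies(task_deps, metapackages):
    """Replace each metapackage by the packages it contains.

    The relation of the metapackage (Recommends, Suggests, ...) is kept
    for the packages it expands to. A package is listed only once; the
    first occurrence wins. Nested metapackages are not followed.
    """
    resolved = []
    seen = set()
    for relation, name in task_deps:
        # A name that is not a known metapackage stands for itself.
        members = metapackages.get(name, [name])
        for pkg in members:
            if pkg not in seen:
                seen.add(pkg)
                resolved.append((relation, pkg))
    return resolved


if __name__ == "__main__":
    for relation, pkg in resolve_task_dependencies(BIOLOGY_TASK, METAPACKAGES):
        print(f"{relation}: {pkg}")
```

With this shape, the tasks page would list the packages behind med-bio and med-bio-dev instead of the metapackages themselves, which is the behaviour asked for in the discussion.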

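Similarly, a minimal sketch of the duplicate warning from point vi). It assumes the prospective package names of one tasks file are already available as a plain list; find_duplicate_packages and warn_about_duplicates are hypothetical helper names, not functions from blendstasktools.py or the UDD importer.

```python
import warnings
from collections import Counter


def find_duplicate_packages(entries):
    """Return the names that occur more than once, in order of first appearance."""
    counts = Counter(entries)
    # dict.fromkeys keeps the first-seen order while dropping repeats.
    return [name for name in dict.fromkeys(entries) if counts[name] > 1]


def warn_about_duplicates(entries, task="(unknown task)"):
    """Emit one warning per duplicated name and return the duplicated names."""
    duplicates = find_duplicate_packages(entries)
    for name in duplicates:
        warnings.warn(
            f"{task}: prospective package '{name}' is listed more than once"
        )
    return duplicates


if __name__ == "__main__":
    warn_about_duplicates(["foo", "bar", "foo"], task="bio")
```

In an importer with a primary key on the package name, a check like this could run before the insert so that the duplicate is reported instead of failing on the key constraint.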