Hi,

On Sun, Nov 25, 2012 at 12:24 AM, anatoly techtonik <techto...@gmail.com> wrote:

> On Sat, Nov 24, 2012 at 7:04 AM, Ezio Melotti <ezio.melo...@gmail.com>
>  wrote:
>
>> Thanks for your work!
>>
>> I played a bit with the code tonight and used it to create a json that
>> maps filename -> list of issues with a patch that affects that filename.
>>
>
> Nice. Is it possible to add this lookup to the post-commit hook script so
> that it reports the number of patches available for the files that were
> just committed? It is much easier to review code that's already in your mind.
>

I'm not sure I understand what you are asking here.  Are you suggesting
adding a Mercurial hook that, once you commit/push something, suggests other
issues with patches that affect the same file(s)?
This could be done, but I think it's better to make the data available in
the tracker so that developers can search for other issues themselves.


>
>> I made a simple page to filter the results and uploaded it here for now:
>> http://wolfprojects.altervista.org/issues.html
>> It requires javascript and it's a bit slow (at least on my pc), but it
>> allows you to enter a module name or path and it will list all the issues
>> related to the files that match the search (regex search should also work).
>> This is still a work in progress though.
>> If you want you can find the json at
>> http://wolfprojects.altervista.org/files.json
>
>
> Good work. I've pulled all the changes, but now I am getting:
>
> Traceback (most recent call last):
>   File "modstats.py", line 142, in <module>
>     print('#%s: %s' % (issuen, issue['title']))
>   File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\u2019' in
> position 65: character maps to <undefined>
>

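That's probably due to the limitations of the Windows console: the cp437
code page it uses here has no character for u'\u2019' (a right single
quotation mark), so print fails when it tries to encode the issue title.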
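
A minimal sketch of one possible workaround, assuming it is acceptable to
show unencodable characters as "?" on the console. The helper name
console_safe is hypothetical and not part of modstats.py, and the sketch is
written for Python 3 even though the traceback above comes from Python 2.7:

```python
def console_safe(text, encoding="cp437"):
    """Return text with characters the console encoding lacks replaced by '?'."""
    # Round-trip through the console encoding; errors="replace" substitutes
    # '?' for anything that cannot be encoded instead of raising.
    return text.encode(encoding, errors="replace").decode(encoding)


# Hypothetical issue title containing U+2019 (right single quotation mark).
title = "Don\u2019t crash on curly quotes"
print(console_safe(title))
```

The same round trip works with any encoding name, so the value could be
taken from sys.stdout.encoding instead of hard-coding cp437.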


>
> Path cleaning is a good thing. Good auto classification also needs some
> rules that need additional data:
>   - detect full path from just filename
>

The cleanup function I wrote just removes extraneous things from the path.
It doesn't verify if the file exists in the Python codebase.
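
To illustrate the kind of cleanup meant here, a rough sketch follows. It is
not the actual function from the script: the name clean_path and the specific
rules (dropping the tab-separated timestamp, the a/ and b/ diff prefixes, and
a leading ./) are assumptions chosen for the example.

```python
import re


def clean_path(raw):
    """Strip common diff decoration from a path taken from a patch header.

    Purely textual: it does not check that the file exists anywhere.
    """
    # Drop anything after the first tab (timestamps or revision info).
    path = raw.split("\t", 1)[0].strip()
    # Drop the a/ or b/ prefix that hg and git diffs add.
    path = re.sub(r"^(?:a|b)/", "", path)
    # Drop any leading ./ components.
    path = re.sub(r"^(?:\./)+", "", path)
    return path
```

One known limitation of this sketch: a real top-level directory named "a" or
"b" would be stripped as if it were a diff prefix.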


>     - if path is unknown, analyse filename
>       - if filename is unique in Python source tree, return its path
>       - if filename is not unique, compare parent path components
> recursively
>         - if not successful, try context match patch
>         - if everything fails, choose the first one
>           - if it fails also, maintain manual connection patch <---> file
>   For that to work we need an index of Python source code directory tree.
>

I don't think all this is necessary.  Once we have the list of file names
extracted from the patches, it's enough to search for keyword or module
name to find all the related issues.
For example if you try searching for 'json' on
http://wolfprojects.altervista.org/issues.html you will find all the
json-related files, including the ones in the Python package, the C
acceleration module, the documentation, the tests, and even files like
"doc\json.rst" that don't exist in the Python codebase or got renamed at
some point.
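
A minimal sketch of that kind of search, assuming the data has the shape
described earlier (filename -> list of issues). The function name
search_issues is hypothetical, and the file names and issue numbers in the
sample data are made up for the example:

```python
import re


def search_issues(files, pattern):
    """Return the sorted issue numbers whose patched files match pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    found = set()
    for filename, issues in files.items():
        # A plain keyword such as "json" is also a valid regex.
        if rx.search(filename):
            found.update(issues)
    return sorted(found)


# Made-up sample data in the filename -> list of issues shape.
sample = {
    "Lib/json/decoder.py": [1001, 1002],
    "Modules/_json.c": [1002],
    "doc\\json.rst": [1003],
    "Lib/os.py": [1004],
}
print(search_issues(sample, "json"))
```

Because the match is against whatever path the patch used, stale or misspelled
paths are still found without any mapping to the real source tree.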

> And add % of recognized patches.
> With manual classification (triaging) it is possible to keep this
> percentage at 100.
>

Trying to establish a mapping between the patches and the actual files is
cumbersome (especially if it requires manual classification), and it might
end up missing some of the patches if they specify an incorrect path, add
new files, or affect files that got renamed or deleted.

I'm considering adding a way to search for modules to the tracker, in a way
similar to what I did on http://wolfprojects.altervista.org/issues.html.
The tracker has direct access to the files and issues, so analyzing the
patches and keeping the database updated as new patches are attached should
be easier.  Once we have the equivalent of files.json (maybe in a db table),
it's just a matter of adding a search form.
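
As a rough sketch of how an equivalent of files.json could be built from the
attached patches, assuming they are unified diffs: the names files_in_patch
and build_index are hypothetical, and the paths are kept exactly as they
appear in the "+++" header lines, without any cleanup.

```python
import re
from collections import defaultdict

# "+++ <path>" header lines of a unified diff; \S+ stops before a timestamp.
FILE_HEADER = re.compile(r"^\+\+\+ (\S+)", re.MULTILINE)


def files_in_patch(patch_text):
    """Return the set of target paths named in a unified diff."""
    return {p for p in FILE_HEADER.findall(patch_text) if p != "/dev/null"}


def build_index(patches):
    """Map filename -> sorted issue numbers from (issue, patch_text) pairs."""
    index = defaultdict(set)
    for issue, text in patches:
        for name in files_in_patch(text):
            index[name].add(issue)
    return {name: sorted(issues) for name, issues in index.items()}
```

An added line whose content itself starts with "++ " would be mistaken for a
header by this regex, so a real implementation would need a proper diff
parser.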


> BTW, what kind of license do you prefer? I didn't mention it first, but
> it's better to set this early. I am for public domain with some credits
> file at the top.
>

SGTM.

Best Regards,
Ezio Melotti


-- 
> anatoly t.
>
>
_______________________________________________
Tracker-discuss mailing list
Tracker-discuss@python.org
http://mail.python.org/mailman/listinfo/tracker-discuss
