On 5/19/20 5:53 PM, Joseph Myers wrote:
On Tue, 19 May 2020, Martin Liška wrote:

On 5/19/20 10:11 AM, Martin Liška wrote:
Can you please share how you do it? It would be easy to add it.

I added the feature via the --fill-up-bug-titles option. It uses the common
requests and beautifulsoup packages.

The REST interface is much better to use for extracting bug data than
screen scraping of HTML output.  Fetch e.g.
https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=12345&include_fields=summary
to get JSON bug data (change or omit include_fields if you want more than
just the summary).
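
For illustration, a minimal sketch of that fetch using only the Python
standard library. The helper names (bug_url, summary_from_payload,
fetch_summary) are made up for this example and are not part of mklog.py;
the response shape assumed here ({"bugs": [{"summary": ...}]}) is the one
the patch further down relies on.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the URL quoted above.
BUGZILLA_REST = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug'


def bug_url(bug_id, fields=('summary',)):
    """Build the REST URL for one bug; an empty `fields` omits include_fields."""
    params = {'id': bug_id}
    if fields:
        params['include_fields'] = ','.join(fields)
    return BUGZILLA_REST + '?' + urllib.parse.urlencode(params)


def summary_from_payload(payload):
    """Return the summary if exactly one bug came back, else None."""
    bugs = payload.get('bugs', [])
    if len(bugs) == 1:
        return bugs[0].get('summary')
    return None


def fetch_summary(bug_id, timeout=10):
    """Fetch and decode the JSON for one bug (performs a network request)."""
    with urllib.request.urlopen(bug_url(bug_id), timeout=timeout) as resp:
        return summary_from_payload(json.load(resp))
```

Calling fetch_summary(12345) would request the URL shown above and return
the summary string, or None if the server does not return exactly one bug.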


You are right; here's a patch I'm going to install.

Martin

From b5a89069a074aff0ae94176c676eda069ff0a1c3 Mon Sep 17 00:00:00 2001
From: Martin Liska <mli...@suse.cz>
Date: Tue, 19 May 2020 21:14:36 +0200
Subject: [PATCH] Use REST API for bug titles in mklog.

contrib/ChangeLog:

2020-05-19  Martin Liska  <mli...@suse.cz>

	* mklog.py: Use REST API for bug title downloading.
---
 contrib/mklog.py | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 45559afbe6b..b27fad0ca2e 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -31,8 +31,6 @@ import os
 import re
 import sys
 
-import bs4
-
 import requests
 
 from unidiff import PatchSet
@@ -46,6 +44,8 @@ macro_regex = re.compile(r'#\s*(define|undef)\s+([a-zA-Z0-9_]+)')
 super_macro_regex = re.compile(r'^DEF[A-Z0-9_]+\s*\(([a-zA-Z0-9_]+)')
 fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
 template_and_param_regex = re.compile(r'<[^<>]*>')
+bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \
+               'include_fields=summary'
 
 function_extensions = set(['.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def'])
 
@@ -106,18 +106,16 @@ def sort_changelog_files(changed_file):
 
 
 def get_pr_titles(prs):
-    if not prs:
-        return ''
-
     output = ''
     for pr in prs:
         id = pr.split('/')[-1]
-        r = requests.get('https://gcc.gnu.org/PR%s' % id)
-        html = bs4.BeautifulSoup(r.text, features='lxml')
-        title = html.title.text
-        title = title[title.find('–') + 1:].strip()
-        output += '%s - %s\n' % (pr, title)
-    output += '\n'
+        r = requests.get(bugzilla_url % id)
+        bugs = r.json()['bugs']
+        if len(bugs) == 1:
+            output += '%s - %s\n' % (pr, bugs[0]['summary'])
+            print(output)
+    if output:
+        output += '\n'
     return output
 
 
-- 
2.26.2
