Hello community,

here is the log from the commit of package urlscan for openSUSE:Factory checked in at 2017-07-04 11:58:20
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/urlscan (Old)
 and /work/SRC/openSUSE:Factory/.urlscan.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "urlscan"

Tue Jul 4 11:58:20 2017 rev:3 rq:507984 version:0.8.6

Changes:
--------
--- /work/SRC/openSUSE:Factory/urlscan/urlscan.changes 2017-03-20 17:08:37.397829629 +0100
+++ /work/SRC/openSUSE:Factory/.urlscan.new/urlscan.changes 2017-07-04 11:58:24.333274250 +0200
@@ -1,0 +2,16 @@
+Tue Jul 4 06:19:46 UTC 2017 - wer...@suse.de
+
+- Update to version 0.8.6
+ * Fix tag mismatch in setup.py
+ * Fix #27 (URLs in markdown links)
+ * Tweak email address recognition
+ * Add ability to toggle context view
+ * Cleanup, commenting, add keyboard hints in the header
+ * Add shortening and toggling shortening of URLs
+ * Restructure URLChooser for current urwid best practices
+ * Update tlds list
+ * Replace AttrWrap (deprecated) with AttrMap
+ * Highlight selected URL. Fix #17
+ * Implement #21 (Option to remove duplicate URLs)
+
+-------------------------------------------------------------------

Old:
----
 urlscan-0.8.3.tar.gz

New:
----
 urlscan-0.8.6.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ urlscan.spec ++++++
--- /var/tmp/diff_new_pack.W2Dj1H/_old 2017-07-04 11:58:25.853060376 +0200
+++ /var/tmp/diff_new_pack.W2Dj1H/_new 2017-07-04 11:58:25.857059812 +0200
@@ -17,7 +17,7 @@

 Name: urlscan
-Version: 0.8.3
+Version: 0.8.6
 Release: 0
 Summary: An other URL extractor/viewer
 License: GPL-2.0

++++++ urlscan-0.8.3.tar.gz -> urlscan-0.8.6.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlscan-0.8.3/README.rst new/urlscan-0.8.6/README.rst
--- old/urlscan-0.8.3/README.rst 2016-07-11 19:36:42.000000000 +0200
+++ new/urlscan-0.8.6/README.rst 2017-07-03 22:35:10.000000000 +0200
@@ -26,7 +26,9 @@

 - Support for emails in quoted-printable and base64 encodings. No more stripping out =40D from URLs by hand!

-- The context of each URL is provided along with the URL. For HTML mails, a crude parser is used to render the HTML into text.
+- The context of each URL is provided along with the URL. For HTML mails, a crude parser is used to render the HTML into text. Context view can be toggled on/off with `c`.
+
+- URLs are shortened by default to fit on one line. Viewing full URL (for one or all) is toggled with `s` or `S`.

 Installation and setup
 ----------------------
@@ -55,9 +57,9 @@

 ::

-    urlscan [-n] <file>
+    urlscan [-n, --no-browser] [-c, --compact] [-d, --dedupe] <file>

-Urlscan can extract URLs and email addresses from emails, or any text file. Calling without the '-n' flag will start the curses browser. Calling with '-n' will just output a list of URLs/email addressess to stdout. Files can also be piped to urlscan using normal shell pipe mechanisms: `cat <something> | urlscan` or `urlscan < <something>`
+Urlscan can extract URLs and email addresses from emails or any text file. Calling with no flags will start the curses browser. Calling with '-n' will just output a list of URLs/email addressess to stdout. The '-c' flag removes the context from around the URLs in the curses browser, and the '-d' flag removes duplicate URLs. Files can also be piped to urlscan using normal shell pipe mechanisms: `cat <something> | urlscan` or `urlscan < <something>`

 Known bugs and limitations
 --------------------------
@@ -68,7 +70,7 @@

 - The HTML message handling is a bit kludgy in general.

-- multipart/alternative sections are handled by descending into all the sub-parts, rather than just picking one, which may lead to URLs and context appearing twice.
+- multipart/alternative sections are handled by descending into all the sub-parts, rather than just picking one, which may lead to URLs and context appearing twice. (Bypass this by selecting the '--dedupe' option)

 - Configurability is more than a little bit lacking.
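
Editorial note on the README change above: it says URLs are now shortened by default to fit on one line. The sketch below restates that idea in Python, following the arithmetic of the `shorten_url` helper that this same diff adds to urlscan/urlchoose.py. The function name `shorten_for_display` and the sample URL are illustrative assumptions, not part of urlscan.

```python
def shorten_for_display(url, cols, shorten=True):
    """Trim a long URL so it roughly fits a terminal `cols` columns wide."""
    # Reserve 6 columns for the "[n] " reference and use 85% of the rest,
    # the same arithmetic as shorten_url in this diff.
    usable = (cols - 6) * 0.85
    if not shorten or len(url) < usable:
        return url
    half = int(usable * 0.5)
    # Keep the head and the tail of the URL and mark the cut with "...".
    return url[:half] + "..." + url[-half:]


if __name__ == "__main__":
    long_url = "http://example.com/" + "a" * 100  # hypothetical sample URL
    print(shorten_for_display(long_url, 80))
    print(shorten_for_display(long_url, 80, shorten=False))
```

Because both halves are taken from the same usable width, the shortened string can come out a few characters longer than that width; this appears to be acceptable here since only 85% of the line is budgeted in the first place.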
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlscan-0.8.3/bin/urlscan new/urlscan-0.8.6/bin/urlscan --- old/urlscan-0.8.3/bin/urlscan 2016-07-11 19:36:42.000000000 +0200 +++ new/urlscan-0.8.6/bin/urlscan 2017-07-03 22:35:10.000000000 +0200 @@ -4,7 +4,7 @@ # properly. aka "urlview minus teh suck" # # Copyright (C) 2006-2007 Daniel Burrows -# Copyright (C) 2016 Scott Hansen +# Copyright (C) 2017 Scott Hansen # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License @@ -18,7 +18,7 @@ # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software -# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA from __future__ import unicode_literals import argparse @@ -46,6 +46,9 @@ arg_parse.add_argument('--no-browser', '-n', dest="nobrowser", action='store_true', default=False, help="Pipe URLs to stdout") + arg_parse.add_argument('--dedupe', '-d', dest="dedupe", + action='store_true', default=False, + help="Remove duplicate URLs from list") arg_parse.add_argument('message', nargs='?', default=sys.stdin, help="Filename of the message to parse") args = arg_parse.parse_args() @@ -158,9 +161,11 @@ msg = process_input(args.message) if args.nobrowser is False: ui = urlchoose.URLChooser(urlscan.msgurls(msg), - compact_mode=args.compact) + compact=args.compact, + dedupe=args.dedupe) ui.main() else: - out = urlchoose.process_urls(urlscan.msgurls(msg), - nobrowser=True) - print("\n".join(out)) + out = urlchoose.URLChooser(urlscan.msgurls(msg), + dedupe=args.dedupe, + shorten=False) + print("\n".join(out.urls)) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlscan-0.8.3/setup.py new/urlscan-0.8.6/setup.py --- old/urlscan-0.8.3/setup.py 2016-07-11 
19:36:42.000000000 +0200 +++ new/urlscan-0.8.6/setup.py 2017-07-03 22:35:10.000000000 +0200 @@ -3,12 +3,12 @@ from setuptools import setup setup(name="urlscan", - version="0.8.3", + version="0.8.6", description="View/select the URLs in an email message or file", author="Scott Hansen", author_email="firecat4...@gmail.com", url="https://github.com/firecat53/urlscan", - download_url="https://github.com/firecat53/urlscan/archive/0.8.3.zip", + download_url="https://github.com/firecat53/urlscan/archive/0.8.6.zip", packages=['urlscan'], scripts=['bin/urlscan'], package_data={'urlscan': ['assets/*']}, diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlscan-0.8.3/urlscan/assets/tlds-alpha-by-domain.txt new/urlscan-0.8.6/urlscan/assets/tlds-alpha-by-domain.txt --- old/urlscan-0.8.3/urlscan/assets/tlds-alpha-by-domain.txt 2016-07-11 19:36:42.000000000 +0200 +++ new/urlscan-0.8.6/urlscan/assets/tlds-alpha-by-domain.txt 2017-07-03 22:35:10.000000000 +0200 @@ -1,9 +1,11 @@ -# Version 2016070600, Last Updated Wed Jul 6 07:07:01 2016 UTC +# Version 2017022000, Last Updated Mon Feb 20 07:07:01 2017 UTC AAA AARP +ABARTH ABB ABBOTT ABBVIE +ABC ABLE ABOGADO ABUDHABI @@ -24,24 +26,33 @@ AERO AETNA AF +AFAMILYCOMPANY AFL +AFRICA AG AGAKHAN AGENCY AI AIG +AIGO AIRBUS AIRFORCE AIRTEL AKDN AL +ALFAROMEO ALIBABA ALIPAY ALLFINANZ +ALLSTATE ALLY ALSACE ALSTOM AM +AMERICANEXPRESS +AMERICANFAMILY +AMEX +AMFAM AMICA AMSTERDAM ANALYTICS @@ -49,6 +60,7 @@ ANQUAN ANZ AO +AOL APARTMENTS APP APPLE @@ -62,15 +74,18 @@ ART ARTE AS +ASDA ASIA ASSOCIATES AT +ATHLETA ATTORNEY AU AUCTION AUDI AUDIBLE AUDIO +AUSPOST AUTHOR AUTO AUTOS @@ -84,6 +99,8 @@ BA BABY BAIDU +BANAMEX +BANANAREPUBLIC BAND BANK BAR @@ -92,20 +109,25 @@ BARCLAYS BAREFOOT BARGAINS +BASEBALL +BASKETBALL BAUHAUS BAYERN BB BBC +BBT BBVA BCG BCN BD BE BEATS +BEAUTY BEER BENTLEY BERLIN BEST +BESTBUY BET BF BG @@ -123,6 +145,7 @@ BLACK BLACKFRIDAY BLANCO +BLOCKBUSTER BLOG BLOOMBERG BLUE @@ -135,15 
+158,19 @@ BO BOATS BOEHRINGER +BOFA BOM BOND BOO BOOK +BOOKING BOOTS BOSCH BOSTIK +BOSTON BOT BOUTIQUE +BOX BR BRADESCO BRIDGESTONE @@ -170,6 +197,7 @@ CAFE CAL CALL +CALVINKLEIN CAM CAMERA CAMP @@ -177,6 +205,7 @@ CANON CAPETOWN CAPITAL +CAPITALONE CAR CARAVAN CARDS @@ -186,13 +215,17 @@ CARS CARTIER CASA +CASE +CASEIH CASH CASINO CAT CATERING +CATHOLIC CBA CBN CBRE +CBS CC CD CEB @@ -213,11 +246,14 @@ CHLOE CHRISTMAS CHROME +CHRYSLER CHURCH CI CIPRIANI CIRCLE CISCO +CITADEL +CITI CITIC CITY CITYEATS @@ -241,6 +277,7 @@ COLLEGE COLOGNE COM +COMCAST COMMBANK COMMUNITY COMPANY @@ -268,6 +305,7 @@ CRICKET CROWN CRS +CRUISE CRUISES CSC CU @@ -282,6 +320,7 @@ DABUR DAD DANCE +DATA DATE DATING DATSUN @@ -310,12 +349,17 @@ DIRECT DIRECTORY DISCOUNT +DISCOVER +DISH +DIY DJ DK DM DNP DO DOCS +DOCTOR +DODGE DOG DOHA DOMAINS @@ -324,14 +368,18 @@ DRIVE DTV DUBAI +DUCK DUNLOP +DUNS DUPONT DURBAN DVAG +DVR DZ EARTH EAT EC +ECO EDEKA EDU EDUCATION @@ -352,6 +400,7 @@ ES ESQ ESTATE +ESURANCE ET EU EUROVISION @@ -376,8 +425,12 @@ FAST FEDEX FEEDBACK +FERRARI FERRERO FI +FIAT +FIDELITY +FIDO FILM FINAL FINANCE @@ -396,11 +449,11 @@ FLIR FLORIST FLOWERS -FLSMIDTH FLY FM FO FOO +FOOD FOODNETWORK FOOTBALL FORD @@ -410,12 +463,16 @@ FOUNDATION FOX FR +FREE FRESENIUS FRL FROGANS FRONTDOOR FRONTIER FTR +FUJITSU +FUJIXEROX +FUN FUND FURNITURE FUTBOL @@ -427,6 +484,7 @@ GALLUP GAME GAMES +GAP GARDEN GB GBIZ @@ -436,6 +494,7 @@ GEA GENT GENTING +GEORGE GF GG GGEE @@ -446,6 +505,7 @@ GIVES GIVING GL +GLADE GLASS GLE GLOBAL @@ -456,10 +516,12 @@ GMO GMX GN +GODADDY GOLD GOLDPOINT GOLF GOO +GOODHANDS GOODYEAR GOOG GOOGLE @@ -486,9 +548,12 @@ GURU GW GY +HAIR HAMBURG HANGOUT HAUS +HBO +HDFC HDFCBANK HEALTH HEALTHCARE @@ -509,11 +574,16 @@ HOLDINGS HOLIDAY HOMEDEPOT +HOMEGOODS HOMES +HOMESENSE HONDA +HONEYWELL HORSE +HOSPITAL HOST HOSTING +HOT HOTELES HOTMAIL HOUSE @@ -523,6 +593,8 @@ HT HTC HU +HUGHES +HYATT HYUNDAI IBM ICBC @@ -530,8 +602,8 @@ ICU ID IE +IEEE IFM -IINET IKANO IL IM @@ -549,7 
+621,9 @@ INSURANCE INSURE INT +INTEL INTERNATIONAL +INTUIT INVESTMENTS IO IPIRANGA @@ -564,14 +638,17 @@ IT ITAU ITV +IVECO IWC JAGUAR JAVA JCB JCP JE +JEEP JETZT JEWELRY +JIO JLC JLL JM @@ -586,6 +663,7 @@ JPMORGAN JPRS JUEGOS +JUNIPER KAUFEN KDDI KE @@ -620,14 +698,18 @@ KZ LA LACAIXA +LADBROKES LAMBORGHINI LAMER LANCASTER +LANCIA +LANCOME LAND LANDROVER LANXESS LASALLE LAT +LATINO LATROBE LAW LAWYER @@ -636,6 +718,7 @@ LDS LEASE LECLERC +LEFRAK LEGAL LEGO LEXUS @@ -648,6 +731,7 @@ LIFESTYLE LIGHTING LIKE +LILLY LIMITED LIMO LINCOLN @@ -662,23 +746,28 @@ LOANS LOCKER LOCUS +LOFT LOL LONDON LOTTE LOTTO LOVE +LPL +LPLFINANCIAL LR LS LT LTD LTDA LU +LUNDBECK LUPIN LUXE LUXURY LV LY MA +MACYS MADRID MAIF MAISON @@ -690,9 +779,14 @@ MARKETING MARKETS MARRIOTT +MARSHALLS +MASERATI MATTEL MBA MC +MCD +MCDONALDS +MCKINSEY MD ME MED @@ -711,6 +805,9 @@ MICROSOFT MIL MINI +MINT +MIT +MITSUBISHI MK ML MLB @@ -720,6 +817,7 @@ MN MO MOBI +MOBILE MOBILY MODA MOE @@ -727,10 +825,13 @@ MOM MONASH MONEY +MONSTER MONTBLANC +MOPAR MORMON MORTGAGE MOSCOW +MOTO MOTORCYCLES MOV MOVIE @@ -739,6 +840,7 @@ MQ MR MS +MSD MT MTN MTPC @@ -746,18 +848,20 @@ MU MUSEUM MUTUAL -MUTUELLE MV MW MX MY MZ NA +NAB NADEX NAGOYA NAME +NATIONWIDE NATURA NAVY +NBA NC NE NEC @@ -767,6 +871,7 @@ NETWORK NEUSTAR NEW +NEWHOLLAND NEWS NEXT NEXTDIRECT @@ -778,6 +883,7 @@ NHK NI NICO +NIKE NIKON NINJA NISSAN @@ -799,10 +905,13 @@ NYC NZ OBI +OBSERVER +OFF OFFICE OKINAWA OLAYAN OLAYANGROUP +OLDNAVY OLLO OM OMEGA @@ -810,7 +919,9 @@ ONG ONL ONLINE +ONYOURSIDE OOO +OPEN ORACLE ORANGE ORG @@ -824,6 +935,7 @@ PA PAGE PAMPEREDCHEF +PANASONIC PANERAI PARIS PARS @@ -831,14 +943,17 @@ PARTS PARTY PASSAGENS +PAY PCCW PE PET PF +PFIZER PG PH PHARMACY PHILIPS +PHONE PHOTO PHOTOGRAPHY PHOTOS @@ -869,6 +984,7 @@ PORN POST PR +PRAMERICA PRAXI PRESS PRIME @@ -881,6 +997,8 @@ PROPERTIES PROPERTY PROTECTION +PRU +PRUDENTIAL PS PT PUB @@ -891,7 +1009,10 @@ QPON QUEBEC QUEST +QVC RACING +RADIO +RAID RE READ REALESTATE @@ -905,6 
+1026,7 @@ REISE REISEN REIT +RELIANCE REN RENT RENTALS @@ -919,12 +1041,16 @@ RICH RICHARDLI RICOH +RIGHTATHOME +RIL RIO RIP +RMIT RO ROCHER ROCKS RODEO +ROGERS ROOM RS RSVP @@ -941,6 +1067,7 @@ SAKURA SALE SALON +SAMSCLUB SAMSUNG SANDVIK SANDVIKCOROMANT @@ -964,16 +1091,19 @@ SCHULE SCHWARZ SCIENCE +SCJOHNSON SCOR SCOT SD SE SEAT +SECURE SECURITY SEEK SELECT SENER SERVICES +SES SEVEN SEW SEX @@ -992,6 +1122,7 @@ SHOPPING SHOUJI SHOW +SHOWTIME SHRIRAM SI SILK @@ -1005,7 +1136,9 @@ SKY SKYPE SL +SLING SM +SMART SMILE SN SNCF @@ -1026,8 +1159,10 @@ SPREADBETTING SR SRL +SRT ST STADA +STAPLES STAR STARHUB STATEBANK @@ -1052,6 +1187,7 @@ SUZUKI SV SWATCH +SWIFTCOVER SWISS SX SY @@ -1063,6 +1199,7 @@ TAIPEI TALK TAOBAO +TARGET TATAMOTORS TATAR TATTOO @@ -1087,6 +1224,7 @@ THD THEATER THEATRE +TIAA TICKETS TIENDA TIFFANY @@ -1094,7 +1232,10 @@ TIRES TIROL TJ +TJMAXX +TJX TK +TKMAXX TL TM TMALL @@ -1131,7 +1272,9 @@ TW TZ UA +UBANK UBS +UCONNECT UG UK UNICOM @@ -1145,6 +1288,7 @@ VA VACATIONS VANA +VANGUARD VC VE VEGAS @@ -1162,14 +1306,17 @@ VIN VIP VIRGIN +VISA VISION VISTA VISTAPRINT VIVA +VIVO VLAANDEREN VN VODKA VOLKSWAGEN +VOLVO VOTE VOTING VOTO @@ -1177,6 +1324,7 @@ VU VUELOS WALES +WALMART WALTER WANG WANGGOU @@ -1200,17 +1348,20 @@ WIN WINDOWS WINE +WINNERS WME WOLTERSKLUWER WOODSIDE WORK WORKS WORLD +WOW WS WTC WTF XBOX XEROX +XFINITY XIHUAN XIN XN--11B4C3D @@ -1220,11 +1371,13 @@ XN--3BST00M XN--3DS443G XN--3E0B707E +XN--3OQ18VL8PN36A XN--3PXU8K XN--42C2D9A XN--45BRJ9C XN--45Q11C XN--4GBRIM +XN--54B7FTA0CC XN--55QW42G XN--55QX5D XN--5SU34J936BGSG @@ -1233,6 +1386,7 @@ XN--6QQ986B3XL XN--80ADXHKS XN--80AO21A +XN--80AQECDR1A XN--80ASEHDB XN--80ASWG XN--8Y0A063A @@ -1272,6 +1426,7 @@ XN--G2XX48C XN--GCKR3F0F XN--GECRJ9C +XN--GK3AT1E XN--H2BRJ9C XN--HXT814E XN--I1B6B1A6A2E @@ -1295,12 +1450,14 @@ XN--MGBA7C0BBN0A XN--MGBAAM7A8H XN--MGBAB2BD +XN--MGBAI9AZGQP6J XN--MGBAYH7GPA XN--MGBB9FBPOB XN--MGBBH1A71E XN--MGBC0A9AZCG XN--MGBCA7DZDO XN--MGBERP4A5D4AR 
+XN--MGBI4ECEXP XN--MGBPL2FH XN--MGBT3DHD XN--MGBTX2B @@ -1330,6 +1487,7 @@ XN--SES554G XN--T60B56A XN--TCKWE +XN--TIQ49XQYJ XN--UNUP4Y XN--VERMGENSBERATER-CTB XN--VERMGENSBERATUNG-PWB diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlscan-0.8.3/urlscan/urlchoose.py new/urlscan-0.8.6/urlscan/urlchoose.py --- old/urlscan-0.8.3/urlscan/urlchoose.py 2016-07-11 19:36:42.000000000 +0200 +++ new/urlscan-0.8.6/urlscan/urlchoose.py 2017-07-03 22:35:10.000000000 +0200 @@ -1,5 +1,5 @@ # Copyright (C) 2006-2007 Daniel Burrows -# Copyright (C) 2016 Scott Hansen +# Copyright (C) 2017 Scott Hansen # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License @@ -13,13 +13,14 @@ # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software -# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA """An urwid listview-based widget that lets you choose a URL from a list of URLs.""" import urwid import urwid.curses_display +import urwid.raw_display import webbrowser from threading import Thread from time import sleep @@ -34,114 +35,123 @@ return browse -def process_urls(extractedurls, compact_mode=False, nobrowser=False): +def shorten_url(url, cols, shorten): + """Shorten long URLs to fit on one line. + + """ + cols = ((cols - 6) * .85) # 6 cols for urlref and don't use while line + if shorten is False or len(url) < cols: + return url + split = int(cols * .5) + return url[:split] + "..." 
+ url[-split:] + + +def process_urls(extractedurls, dedupe, shorten): """Process the 'extractedurls' and ready them for either the curses browser or non-interactive output Args: extractedurls - compact_mode - True/False (Default False) - nobrowser - True/False (Default False) + dedupe - Remove duplicate URLs from list - Returns: items - urls - firstbutton - Number of first URL button - if nobrowser, then _only_ return urls + Returns: items - List of widgets for the ListBox + urls - List of all URLs """ + cols, _ = urwid.raw_display.Screen().get_cols_rows() items = [] urls = [] first = True - firstbutton = 0 - if nobrowser is True: - compact_mode = True for group, usedfirst, usedlast in extractedurls: if first: first = False - elif not compact_mode: - items.append(urwid.Divider(div_char='-', top=1, bottom=1)) + items.append(urwid.Divider(div_char='-', top=1, bottom=1)) + if dedupe is True: + # If no unique URLs exist, then skip the group completely + if not [chunk for chunks in group for chunk in chunks + if chunk.url is not None and chunk.url not in urls]: + continue groupurls = [] markup = [] - if compact_mode: - lasturl = None - for chunks in group: - for chunk in chunks: - if chunk.url and chunk.url != lasturl: - groupurls.append(chunk.url) - urls.append(chunk.url) - lasturl = chunk.url - else: - if not usedfirst: - markup.append(('msgtext:ellipses', '...\n')) - for chunks in group: - i = 0 - while i < len(chunks): - chunk = chunks[i] - i += 1 - if chunk.url is None: - markup.append(chunk.markup) - else: - urls.append(chunk.url) - groupurls.append(chunk.url) - # Collect all immediately adjacent - # chunks with the same URL. 
- tmpmarkup = [] - if chunk.markup: - tmpmarkup.append(chunk.markup) - while i < len(chunks) and \ - chunks[i].url == chunk.url: - if chunks[i].markup: - tmpmarkup.append(chunks[i].markup) - i += 1 - markup += [tmpmarkup or '<URL>', - ('urlref:number:braces', ' ['), - ('urlref:number', repr(len(urls))), - ('urlref:number:braces', ']')] - markup += '\n' - if not usedlast: - markup += [('msgtext:ellipses', '...\n\n')] - - items.append(urwid.Text(markup)) + if not usedfirst: + markup.append(('msgtext:ellipses', '...\n')) + for chunks in group: + i = 0 + while i < len(chunks): + chunk = chunks[i] + i += 1 + if chunk.url is None: + markup.append(chunk.markup) + elif (dedupe is True and chunk.url not in urls) \ + or dedupe is False: + urls.append(chunk.url) + groupurls.append(chunk.url) + # Collect all immediately adjacent + # chunks with the same URL. + tmpmarkup = [] + if chunk.markup: + tmpmarkup.append(chunk.markup) + while i < len(chunks) and \ + chunks[i].url == chunk.url: + if chunks[i].markup: + tmpmarkup.append(chunks[i].markup) + i += 1 + markup += [tmpmarkup or '<URL>', + ('urlref:number:braces', ' ['), + ('urlref:number', repr(len(urls))), + ('urlref:number:braces', ']')] + markup += '\n' + if not usedlast: + markup += [('msgtext:ellipses', '...\n\n')] + items.append(urwid.Text(markup)) i = len(urls) - len(groupurls) for url in groupurls: - if firstbutton == 0 and not compact_mode: - firstbutton = len(items) i += 1 - markup = [('urlref:number:braces', '['), - ('urlref:number', repr(i)), - ('urlref:number:braces', ']'), - ' ', - ('urlref:url', url)] - items.append(urwid.Button(markup, - mkbrowseto(url), - user_data=url)) - - if not items: - items.append(urwid.Text("No URLs found")) - firstbutton = 1 - if nobrowser is True: - return urls - else: - return items, urls, firstbutton + markup = [(6, urwid.Text([('urlref:number:braces', '['), + ('urlref:number', repr(i)), + ('urlref:number:braces', ']'), + ' '])), + urwid.AttrMap(urwid.Button(shorten_url(url, + cols, 
+ shorten), + mkbrowseto(url), + user_data=url), + 'urlref:url', 'url:sel')] + items.append(urwid.Columns(markup)) + + return items, urls -# Based on urwid examples. class URLChooser: - def __init__(self, extractedurls, compact_mode=False): - items, urls, firstbutton = process_urls(extractedurls, - compact_mode) - self.listbox = urwid.ListBox(items) - self.listbox.set_focus(firstbutton) - if len(urls) == 1: + def __init__(self, extractedurls, compact=False, dedupe=False, + shorten=True): + self.shorten = shorten + self.compact = compact + self.items, self.urls = process_urls(extractedurls, + dedupe=dedupe, + shorten=self.shorten) + # Store 'compact' mode items + self.items_com = [i for i in self.items if + isinstance(i, urwid.Columns) is True] + if self.compact is True: + self.items, self.items_com = self.items_com, self.items + self.contents = urwid.SimpleFocusListWalker(self.items) + listbox = urwid.ListBox(self.contents) + if len(self.urls) == 1: header = 'Found 1 url.' else: - header = 'Found %d urls.' % len(urls) - headerwid = urwid.AttrWrap(urwid.Text(header), 'header') - self.top = urwid.Frame(self.listbox, headerwid) - - def main(self): + header = 'Found %d urls.' 
% len(self.urls) + header = "{} :: {}".format(header, "q - Quit :: " + "c - context :: " + "s - URL short :: " + "S - all URL short :: ") + headerwid = urwid.AttrMap(urwid.Text(header), 'header') + self.top = urwid.Frame(listbox, headerwid) + if self.urls: + self.top.body.focus_position = \ + (2 if self.compact is False else 0) self.ui = urwid.curses_display.Screen() - self.ui.register_palette([ + self.palette = [ ('header', 'white', 'dark blue', 'standout'), ('footer', 'white', 'dark red', 'standout'), ('msgtext', 'light gray', 'black'), @@ -156,43 +166,76 @@ ('msgtext:ellipses', 'light gray', 'black'), ('urlref:number:braces', 'light gray', 'black'), ('urlref:number', 'yellow', 'black', 'standout'), - ('urlref:url', 'white', 'black', 'standout') - ]) - return self.ui.run_wrapper(self.run) + ('urlref:url', 'white', 'black', 'standout'), + ('url:sel', 'white', 'dark blue', 'bold') + ] - def run(self): - size = self.ui.get_cols_rows() + def main(self): + loop = urwid.MainLoop(self.top, self.palette, screen=self.ui, + input_filter=self.handle_keys, + unhandled_input=self.unhandled) + loop.run() + + def handle_keys(self, keys, raw): + """Handle the enter or space key to trigger the 'loading' footer + + """ + for k in keys: + if (k == 'enter' or k == ' ') and self.urls: + footerwid = urwid.AttrMap(urwid.Text("Loading URL..."), + 'footer') + self.top.footer = footerwid + load_thread = Thread(target=self._loading_thread) + load_thread.daemon = True + load_thread.start() + return keys + + def unhandled(self, keys): + """Add other keyboard actions (q, j, k, s, S, c) not handled by the + ListBox widget. 
- try: - while True: - self.ui.s.erase() + """ + size = self.ui.get_cols_rows() + for k in keys: + if k == 'q' or k == 'Q': + raise urwid.ExitMainLoop() + elif not self.urls: + continue # No other actions are useful with no URLs + elif k == 'ctrl l': self.draw_screen(size) - keys = self.ui.get_input() - for k in keys: - if k == 'window resize': - size = self.ui.get_cols_rows() - elif k == 'q': - return None - elif k == 'ctrl l': - self.ui.s.clear() - elif k == 'j': - self.top.keypress(size, "down") - elif k == 'k': - self.top.keypress(size, "up") - elif k == 'enter' or k == ' ': - footer = "loading URL" - footerwid = urwid.AttrWrap(urwid.Text(footer), - 'footer') - self.top.set_footer(footerwid) - self.top.keypress(size, k) - load_thread = Thread(target=self._loading_thread) - load_thread.daemon = True - load_thread.start() - self.ui.s.clear() - else: - self.top.keypress(size, k) - except KeyboardInterrupt: - return None + elif k == 'j': + self.top.keypress(size, "down") + elif k == 'k': + self.top.keypress(size, "up") + elif k == 's': + # Toggle shortened URL for selected item + fp = self.top.body.focus_position + url_idx = len([i for i in self.items[:fp + 1] + if isinstance(i, urwid.Columns)]) - 1 + url = self.urls[url_idx] + short = False if "..." in self.items[fp][1].label else True + self.items[fp][1].set_label(shorten_url(url, size[0], short)) + elif k == 'S': + # Toggle all shortened URLs + self.shorten = False if self.shorten is True else True + urls = iter(self.urls) + columns_idx = 1 + for item in self.items: + # Each Column has (Text, Button). 
Update the Button label + if isinstance(item, urwid.Columns): + item[1].set_label(shorten_url(next(urls), + size[0], + self.shorten)) + columns_idx += 1 + elif k == 'c': + # Show/hide context + fp = self.top.body.focus_position + self.items, self.items_com = self.items_com, self.items + self.top.body = urwid.ListBox(self.items) + self.top.body.focus_position = self._cur_focus(fp) + self.compact = False if self.compact is True else True + else: + self.top.keypress(size, k) def _loading_thread(self): """Simple thread to wait 5 seconds after launching a URL, @@ -200,11 +243,22 @@ """ sleep(5) - footerwid = urwid.AttrWrap(urwid.Text(""), "default") - self.top.set_footer(footerwid) + footerwid = urwid.AttrMap(urwid.Text(""), "default") + self.top.footer = footerwid size = self.ui.get_cols_rows() self.draw_screen(size) + def _cur_focus(self, fp=0): + # Return correct focus when toggling 'show context' + if self.compact is False: + idx = len([i for i in self.items_com[:fp + 1] + if isinstance(i, urwid.Columns)]) - 1 + elif self.compact is True: + idx = [i for i in enumerate(self.items) + if isinstance(i[1], urwid.Columns)][fp][0] + return idx + def draw_screen(self, size): + self.ui.clear() canvas = self.top.render(size, focus=True) self.ui.draw_screen(size, canvas) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlscan-0.8.3/urlscan/urlscan.py new/urlscan-0.8.6/urlscan/urlscan.py --- old/urlscan-0.8.3/urlscan/urlscan.py 2016-07-11 19:36:42.000000000 +0200 +++ new/urlscan-0.8.6/urlscan/urlscan.py 2017-07-03 22:35:10.000000000 +0200 @@ -1,6 +1,6 @@ # -*- coding: utf-8 -*- # Copyright (C) 2006-2007 Daniel Burrows -# Copyright (C) 2016 Scott Hansen +# Copyright (C) 2017 Scott Hansen # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License @@ -244,8 +244,9 @@ # added above. 
self.handle_data('&%s;' % name) + urlinternalpattern = r'[{}()@\w/\-%?!&.=:;+,#~]' -urltrailingpattern = r'[{}()@\w/\-%&=+#]' +urltrailingpattern = r'[{}(@\w/\-%&=+#]' httpurlpattern = (r'(?:(https?|file|ftps?)://' + urlinternalpattern + r'*' + urltrailingpattern + r')') # Used to guess that blah.blah.blah.TLD is a URL. @@ -259,12 +260,13 @@ return [elem for elem in f.read().lower().splitlines()[1:] if "--" not in elem] + tlds = load_tlds() guessedurlpattern = (r'(?:[\w\-%]+(?:\.[\w\-%]+)*\.(?:' + '|'.join(tlds) + ')$)') urlre = re.compile(r'(?:<(?:URL:)?)?(' + httpurlpattern + '|' + guessedurlpattern + - '|(?P<email>(mailto:)?[\w\-.]*@[\w\-.]*[\w\-]))>?', + '|(?P<email>(mailto:)?[\w\-.]+@[\w\-.]*[\w\-]))>?', flags=re.U) # Poor man's test cases. diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlscan-0.8.3/urlscan.1 new/urlscan-0.8.6/urlscan.1 --- old/urlscan-0.8.3/urlscan.1 2016-07-11 19:36:42.000000000 +0200 +++ new/urlscan-0.8.6/urlscan.1 2017-07-03 22:35:10.000000000 +0200 @@ -1,6 +1,6 @@ .\" Hey, EMACS: -*- nroff -*- -.TH URLSCAN 1 "October 26, 2015" +.TH URLSCAN 1 "February 26, 2017" .SH NAME urlscan \- browse the URLs in an email message from a terminal @@ -29,13 +29,20 @@ \fB1.\fR Support for more message encodings, such as quoted-printable and base64. -\fB2.\fR Extraction and display of the context surrounding each URL. +\fB2.\fR Extraction and display of the context surrounding each URL. Toggle +context view on/off with `c`. + +\fB3.\fR URLs are shortened by default to fit on one line. Toggle one or all +shortened URLs with `s` or `S`. .SH OPTIONS .TP .B \-c, \-\-compact Display a simple list of the extracted URLs, instead of showing the -context of each URL. +context of each URL. Also toggle with `c` from within the viewer. +.TP +.B \-d, \-\-dedupe +Remove duplicated URLs from the list of URLs. .TP .B \-n, \-\-no-browser Disables the selection interface and print the links to standard output. 
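
Editorial note on the urlscan.py hunk earlier in this diff: the email alternative of `urlre` changes its local part from `[\w\-.]*` to `[\w\-.]+`, which presumably is the "Tweak email address recognition" item in the changelog. The sketch below isolates just that alternative to show the practical difference. The helper `find_email` and the sample strings are illustrative assumptions, and the full `urlre` has further alternatives that are not reproduced here.

```python
import re

# Email alternatives copied from the urlre change in this diff,
# without the surrounding URL alternatives.
OLD_EMAIL = re.compile(r'(mailto:)?[\w\-.]*@[\w\-.]*[\w\-]', flags=re.U)
NEW_EMAIL = re.compile(r'(mailto:)?[\w\-.]+@[\w\-.]*[\w\-]', flags=re.U)


def find_email(pattern, text):
    """Return the first match of `pattern` in `text`, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    for sample in ("user@example.com", "ping @alice"):
        print(sample, find_email(OLD_EMAIL, sample), find_email(NEW_EMAIL, sample))
```

With `*`, a bare "@alice" mention with nothing before the "@" is picked up as an address; with `+`, at least one local-part character is required, so such mentions are skipped while ordinary addresses still match.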
@@ -58,6 +65,9 @@ Control-b will allow you to browse and open the URLs in the currently selected message. +Alternately, you can pipe a message into urlscan using the '|' operator. This +can be useful for applying a different flag (such as the '-d' or '-c' options). + .SH SEE ALSO \fI/usr/share/doc/urlscan/README\fR, \fBurlview\fR(1),