Wikinaut has submitted this change and it was merged.

Change subject: version 2.18 + squashed commit RSS changes from SVN
......................................................................


version 2.18 + squashed commit RSS changes from SVN

bump version number from 2.17 to 2.18 for the release version

removed 4 white spaces tabs.

followed the advice of the code reviewer. removed an unwanted switch(true)
structure
replaced switch case by an assoc array, removed unneeded http factory comments
Wikinaut 2013-01-04

removed unneeded INSTALL text file
new version 2.17 incl. code cosmetics. rebased on master 
bea4447d24ad33c115e64385ef8fc5a308b58188  2012-12-22
bear with me! It's my first real-life commit to gerrit. Wikinaut, 2012-12-30

Catrope squashed these together per Wikinaut's request. List of commit
summaries:

adding the long-wanted date format attribute.
implemented a date format equalising function,
so that dates of RSS feeds are rendered in a common format.

follow-up r111347 : adding escapeTemplateParameter around the user supplied 
optional date attribute

fix for bug30377 : add a new parameter to limit the number of characters when 
rendering the channel item <description>

follow-up r111350 . check if the optional parameter isset and is_numeric,
otherwise limit to the built-in default (30000)

removed a wrong comment regarding PHP 5.3 function date_create_from_format,
which is not suited to auto-detect a time string in any format - only
strtotime() can do it.
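
The point about strtotime() can be illustrated with a minimal sketch. This is
not the extension's actual code; the function name normaliseFeedDate and its
default format are assumptions made for illustration only.

```php
<?php
// Hypothetical helper, for illustration only: render differently
// formatted feed dates in one common format.
function normaliseFeedDate( $rawDate, $format = 'Y-m-d H:i:s' ) {
	// strtotime() auto-detects many textual date formats,
	// e.g. an RFC 822 pubDate or an ISO 8601 dc:date.
	$timestamp = strtotime( $rawDate );
	if ( $timestamp === false ) {
		// Unparseable input is passed through unchanged.
		return $rawDate;
	}
	// gmdate() formats the timestamp in UTC.
	return gmdate( $format, $timestamp );
}

// An RSS-style and a dc:date-style value for the same moment.
echo normaliseFeedDate( 'Mon, 18 Oct 2010 18:07:00 GMT' ), "\n";
echo normaliseFeedDate( '2010-10-18T18:07:00Z' ), "\n";
```

Both calls are intended to produce the same normalised string, which is the
effect the date attribute is meant to have on rendered feed items.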

follow-up r111350 r111351 . switch replaced by if elseif construct.

name and behaviour change of wgRSSAllowedFeeds to wgRSSUrlWhitelist.
The wgRSSUrlWhitelist is _now_ empty by default which was not the case until 
this version.
Admins who want to allow their users to insert arbitrary feed urls must now 
denote this expressly
with an asterisk in quotes as whitelist array element.
This is harmonised to the same method as recently introduced in E:EtherpadLite.
The RELEASE NOTES file has been updated, updates to the MediaWiki manual page 
will follow soon.
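
As a sketch of the whitelist setting described above, a LocalSettings.php
fragment might look as follows. The example.org URL is a placeholder and not
taken from the extension.

```php
<?php
// LocalSettings.php, after loading the RSS extension.

// Allow only specific feeds. URLs are compared case-sensitively and
// must match exactly, including any trailing "/".
// The URL below is a placeholder for illustration.
$wgRSSUrlWhitelist = array(
	"http://example.org/feed.xml",
);

// Or, to expressly allow arbitrary feed URLs (may be a security concern):
// $wgRSSUrlWhitelist = array( "*" );
```

Leaving the array empty, which is the new default, means that no feed URL is
allowed at all.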

increased wgRSSFetchTimeout default from 5 to 15 seconds - many sites are too 
slow.

v2.00 can parse ATOM feeds, at least some.
This is a major improvement over pre-2.00 versions which could only read and
parse RSS feeds but no ATOM feeds.
Version 2.00 begins to take care of namespaces in the XML.
The parser still leaves room for further improvements.
At least, E:RSS can now read E:WikiArticleFeeds generated RSS _and_ ATOM feeds.

v2.01 fixed: the ATOM summary element was not parsed.
Added handling of basic HTML layout tags (p br b u i s) in feed descriptions,
they are preserved in the wiki output after sanitizing.

improved code legibility of function namespacePrefixedQuery

fix for ultra bug 30028 .
The RSS extension can parse RSS and ATOM feeds of different flavours.
The PHP XML DOM XPath query now uses a namespace-safe method to find all
elements like item (RSS, RDF)
or entry (ATOM).
Further fixed a hidden problem when the feed url was redirecting,
this threw the Cannot parse RSS for XML error, which is now history.
Introduced a new parameter wgRSSUrlNumberOfAllowedRedirects which defaults to 
zero,
i.e. no redirects are allowed by default. See Manual page
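
The namespace-safe lookup can be sketched as below. The inline ATOM document
is made up for illustration and stands in for a fetched feed; the two queries
follow the same item-then-entry order as the change in RSSData.php.

```php
<?php
// Illustration only: a tiny inline ATOM document replaces a fetched feed.
$xml = new DOMDocument();
$xml->loadXML(
	'<feed xmlns="http://www.w3.org/2005/Atom">'
	. '<entry><title>First</title></entry>'
	. '<entry><title>Second</title></entry>'
	. '</feed>'
);
$xpath = new DOMXPath( $xml );

// local-name() ignores namespace prefixes and default namespaces,
// so the same query style works for RSS, RDF and ATOM documents.
$items = $xpath->query( "//*[local-name() = 'item']" );
if ( $items->length === 0 ) {
	// No RSS/RDF items found: fall back to ATOM entries.
	$items = $xpath->query( "//*[local-name() = 'entry']" );
}

// Number of feed items found by the namespace-safe queries.
echo $items->length, "\n";
```

A plain path such as /rss/channel/item would not match elements that live in
a default namespace, which is why the queries compare local names instead.
Redirects remain disabled unless $wgRSSUrlNumberOfAllowedRedirects is raised,
for example to 1, in LocalSettings.php.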

removed superfluous code for setting userAgent since r112466

function name typo correction. Version number update

fix for bug34763 'RSS feed items (HTML) are not rendered as HTML but 
htmlescaped';
tolerated controlled regression bug30377 'feed item length limitation',
because this now becomes very tricky when we allow some tags in order to close 
bug 34763.

add tracking category feature (enabled by default). Each page using this
extension automatically gets
the tracking category with MediaWiki:Rss-tracking-category name
(= RSS).
Tracking-Cat-Feature can be disabled, or a different MediaWiki message text can 
be assigned.
Documentation of the switch is inline and follows on MediaWiki.

follow up r113508 : escaped html tag brackets to make translators happy

beautifying the tracking category name

adding casts. better ?

removed the redundant code for handling tracking categories.
By using '-' for the message text rss-tracking-category , this can be disabled 
easily.

+ Patchset 11

rebased on master

+ Patchset 12

wrapped commit message text lines
version number bumped to 2.18

+ Patchset 13

improved and updated README
added history of the present version 2.18

+ Patchset 14

white space fixes
version number fixes

Change-Id: I2d9724314f94c216650370071b31390c5c2c97fc
---
D INSTALL
M RELEASE-NOTES
M RSS.i18n.php
M RSS.php
M RSSData.php
M RSSHooks.php
M RSSParser.php
7 files changed, 407 insertions(+), 116 deletions(-)

Approvals:
  Wikinaut: Verified; Looks good to me, approved



diff --git a/INSTALL b/INSTALL
deleted file mode 100644
index 5dfb109..0000000
--- a/INSTALL
+++ /dev/null
@@ -1,11 +0,0 @@
-[[RSS 1.7]]
-
-== Installation ==
-
-# Save the [[#Source|source]] in your /extensions directory.
-# Place the following text in your [[Manual:LocalSettings.php|LocalSettings.php]] file: <tt>require_once("$IP/extensions/RSS/RSS.php");</tt><br />(Make sure there's a semicolon (;) at the end of that line)
-# Finally, load your wiki, and have fun with RSS feeds!
-
-More information can be found at [0].
-
-[0] http://www.mediawiki.org/wiki/Extension:RSS#Installation
\ No newline at end of file
diff --git a/RELEASE-NOTES b/RELEASE-NOTES
index a2b1540..89c430d 100644
--- a/RELEASE-NOTES
+++ b/RELEASE-NOTES
@@ -1,10 +1,17 @@
 RELEASE NOTES of the MediaWiki extension RSS
-http://www.mediawiki.org/wiki/Extension:RSS
+
+Version 2.18 20130220
+
+Manual http://www.mediawiki.org/wiki/Extension:RSS
+
+Browser view
+https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/extensions/RSS.git;a=tree
+
+git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/RSS.git
 
 === TO DO ===
-* bug 30377 add a new parameter to limit the number of characters when rendering
-  the channel item <description>
-* set an upper default limit for HttpRequest request size when fetching feeds
+
+- set an upper default limit for HttpRequest request size when fetching feeds
   doing a HEAD request first to ask for the size but that value may not be
   available. Check how much data is returned as its coming back
   in which case you'd override the content fetch callback and count data as it
@@ -12,8 +19,78 @@
   coming in. Then you could abort cleanly once it's gotten too much
   (otherwise using the defaults - PHP will abort the entire program when your
   memory usage gets too high)
-* bug 30028 "Error parsing XML for RSS" - improve and harden Extension:RSS when
-  parsing differently flavoured RSS feeds
+
+- This is currently an open issue:
+  Suggestion: add a new parameter to limit the number of characters when
+  rendering the channel item <description>.
+  https://bugzilla.wikimedia.org/show_bug.cgi?id=30377#c5
+  The length limitation must be HTML tag-safe, but it is not at the moment.
+  Length limitation is disabled by default.
+
+=== Version 2.18 2013-02-20 ===
+* release version
+
+  Version 2.18 submits the unported and reverted changes from the former SVN
+  repository version. In the transition phase from SVN to Git/Gerrit,
+  formerly unreviewed changes stayed in the Gerrit limbus from April 2012 to
+  February 2013 and were reviewed there and further improved.
+
+  With version 2.18, all these pending changes were reviewed and become now
+  available in the Git repository.
+
+=== Version 2.17 2012-12-30 ===
+* code cosmetics
+
+=== Version 2.12 2012-03-07 ===
+* bug fix 34763 "RSS feed items (HTML) are not rendered as HTML but htmlescaped"
+* regression bug 30377 "Add a new parameter to limit the number of characters
+  when rendering the channel item <description>". Feed item string length
+  limitation is difficult when we allow HTML <a> or <img> tags, because a mere
+  content-unaware limitation breaks (can break) tags which results in disastrous
+  rendering results.
+
+=== Version 2.11 2012-02-29 ===
+* function name typo correction
+
+=== Version 2.10 2012-02-27 ===
+* final solution of bug 30028 "Error parsing XML for RSS" - improve and harden
+  Extension:RSS when parsing differently flavoured RSS feeds and ATOM feeds
+* new parameter $wgRSSUrlNumberOfAllowedRedirects (default = 0)
+  Some feed urls redirect. The new RSS version can deal with redirects,
+  but it must be expressly enabled. For example, you can set
+  $wgRSSUrlNumberOfAllowedRedirects = 1;
+
+=== Version 2.01 2012-02-24 ===
+* "summary" element of ATOM feed items are shown
+  which is handled like "description" element of RSS
+* handling of basic HTML layout tags <p> <br> <b> <i> <u> <s> in item description
+
+=== Version 2.00 2012-02-24 ===
+* first version which can parse RSS and at least some ATOM feeds
+  partial solution of bug 30028 "Error parsing XML for RSS" - improve and harden
+  Extension:RSS when parsing differently flavoured RSS feeds and ATOM feeds
+
+=== Version 1.94 2012-02-23 ===
+* changed white list definition and behaviour:
+
+  1. changed the name from $wgRSSAllowedFeeds to $wgRSSUrlWhitelist
+  2. behaviour has been changed
+
+  the new behaviour is:
+  $wgRSSUrlWhitelist is empty by default. Since version 1.94 it must be
+  expressly set to an array( list-of-comma-separated-allowed-RSS-urls-strings )
+  or set to array( "*" ) if you want to allow any url
+
+  the old behaviour was:
+  $wgRSSAllowedFeeds was empty by default and empty meant that every Url
+  was allowed by default. This has been changed, see new behaviour.
+
+=== Version 1.92 2012-02-13 ===
+* added optional date= attribute and $wgRSSDateDefaultFormat parameter
+* added optional item-max-length= attribute and $wgRSSItemMaxLength parameter
+  fixes bug 30377 add a new parameter to limit the number of characters when
+  rendering the channel item <description>
+  (update: this fix is reverted, the bug 30377 re-opened with version 2.17)
 
 === Version 1.90 2011-08-15 ===
 * removed parsing of each single channel subelement (item)
diff --git a/RSS.i18n.php b/RSS.i18n.php
index ba39306..efdd141 100644
--- a/RSS.i18n.php
+++ b/RSS.i18n.php
@@ -13,13 +13,16 @@
  */
 $messages['en'] = array(
        'rss-desc' => 'Displays RSS feeds on MediaWiki pages in a standard or 
in user-definable formats using template pages',
+       'rss-tracking-category' => 'Pages with RSS feeds',
        'rss-error' => 'Failed to load RSS feed from $1: $2',
        'rss-empty' => 'Failed to load RSS feed from $1!',
        'rss-fetch-nourl' => 'Fetch called without a URL!',
        'rss-invalid-url' => 'Not a valid URL: $1',
        'rss-parse-error' => 'Error parsing XML for RSS',
        'rss-ns-permission' => 'RSS is not allowed in this namespace',
-       'rss-url-permission' => 'This URL is not allowed to be included',
+       'rss-url-is-not-whitelisted' => '"$1" is not in the whitelist of allowed feeds. {{PLURAL:$3|$2 is the only allowed feed|The allowed feeds are as follows: $2}}.',
+       'rss-empty-whitelist' => '"$1" is not in the whitelist of allowed feeds. There are no allowed feed URLs in the whitelist.',
+       'rss-deprecated-wgrssallowedfeeds-found' => 'The deprecated variable $wgRSSAllowedFeeds has been detected. Since RSS version 2.0 this variable has to be replaced by $wgRSSUrlWhitelist as described in the manual page Extension:RSS.',
        'rss-item' => '{{$1 | title = {{{title}}} | link = {{{link}}} | date = 
{{{date}}} | author = {{{author}}} | description = {{{description}}} }}',
        'rss-feed' => "<!--  the following are two alternative templates. The 
first is the basic default template for feeds -->; '''<span 
class='plainlinks'>[{{{link}}} {{{title}}}]</span>'''
 : {{{description}}}
@@ -36,6 +39,7 @@
 $messages['qqq'] = array(
        'rss-desc' => 
'{{desc|name=RSS|url=http://www.mediawiki.org/wiki/Extension:RSS}}',
        'rss-invalid-url' => '$1 is the invalid URL for the RSS feed',
+       'rss-tracking-category' => 'The name of a category for all pages which 
use the &lt;rss&gt; parser extension tag. The category is automatically added 
unless the feature is disabled.',
        'rss-item' => '{{notranslate}}',
        'rss-feed' => "; $1
 : ''not to be localised''
diff --git a/RSS.php b/RSS.php
index 1fe406d..a1bde4e 100644
--- a/RSS.php
+++ b/RSS.php
@@ -4,7 +4,7 @@
  *
  * @file
  * @ingroup Extensions
- * @version 1.90
+ * @version 2.18
  * @author mutante, Daniel Kinzler, Rdb, Mafs, Thomas Gries, Alxndr, Chris 
Reigrut, K001
  * @author Kellan Elliott-McCrea <[email protected]> -- author of MagpieRSS
  * @author Jeroen De Dauw
@@ -13,6 +13,8 @@
  * @copyright © mutante, Daniel Kinzler, Rdb, Mafs, Thomas Gries, Alxndr, 
Chris Reigrut, K001
  * @link http://www.mediawiki.org/wiki/Extension:RSS Documentation
  */
+
+define( "EXTENSION_RSS_VERSION", "2.18 20130220" );
 
 if ( !defined( 'MEDIAWIKI' ) ) {
        die( "This is not a valid entry point.\n" );
@@ -26,7 +28,7 @@
                'Rdb', 'Mafs', 'Alxndr', 'Thomas Gries', 'Chris Reigrut',
                'K001', 'Jack Phoenix', 'Jeroen De Dauw', 'Mark A. Hershberger'
        ),
-       'version' => '1.90 20110815',
+       'version' => EXTENSION_RSS_VERSION,
        'url' => 'https://www.mediawiki.org/wiki/Extension:RSS',
        'descriptionmsg' => 'rss-desc',
 );
@@ -36,32 +38,63 @@
 $wgExtensionMessagesFiles['RSS'] = $dir . 'RSS.i18n.php';
 $wgAutoloadClasses['RSSHooks'] = $dir . 'RSSHooks.php';
 $wgAutoloadClasses['RSSParser'] = $dir . 'RSSParser.php';
+$wgAutoloadClasses['RSSUtils'] = $dir . 'RSSParser.php';
 $wgAutoloadClasses['RSSData'] = $dir . 'RSSData.php';
 
 $wgHooks['ParserFirstCallInit'][] = 'RSSHooks::parserInit';
 
- // one hour
- $wgRSSCacheAge = 3600;
+// one hour
+$wgRSSCacheAge = 3600;
 
 // Check cached content, if available, against remote.
 // $wgRSSCacheCompare should be set to false or a timeout
 // (less than $wgRSSCacheAge) after which a comparison will be made.
+// for debugging set $wgRSSCacheCompare = 1;
 $wgRSSCacheCompare = false;
 
-// 5 second timeout
-$wgRSSFetchTimeout = 5;
+// 15 second timeout
+$wgRSSFetchTimeout = 15;
 
 // Ignore the RSS tag in all but the namespaces listed here.
 // null (the default) means the <rss> tag can be used anywhere.
 $wgRSSNamespaces = null;
 
-// URL whitelist of RSS Feeds:
-// if there are items in the array, and the used URL isn't in the array,
-// it will not be allowed (originally proposed in bug 27768)
-$wgRSSAllowedFeeds = array();
+// Whitelist of allowed RSS Urls
+//
+// If there are items in the array, and the user supplied URL is not in the array,
+// the url will not be allowed
+//
+// Urls are case-sensitively tested against values in the array.
+// They must exactly match including any trailing "/" character.
+//
+// Warning: Allowing all urls (not setting a whitelist)
+// may be a security concern.
+//
+// an empty or non-existent array means: no whitelist defined
+// this is the default: an empty whitelist. No servers are allowed by default.
+$wgRSSUrlWhitelist = array();
+
+// include "*" if you expressly want to allow all urls (you should not do this)
+// $wgRSSUrlWhitelist = array( "*" );
+
+// Maximum number of redirects to follow (defaults to 0)
+// Note: this should only be used when the target URLs are trusted,
+// to avoid attacks on intranet services accessible by HTTP.
+$wgRSSUrlNumberOfAllowedRedirects = 0;
 
 // Agent to use for fetching feeds
-$wgRSSUserAgent = 'MediaWikiRSS/0.02 (+http://www.mediawiki.org/wiki/Extension:RSS) / MediaWiki RSS extension';
+$wgRSSUserAgent = "MediaWikiRSS/" . strtok( EXTENSION_RSS_VERSION, " " ) . " (+http://www.mediawiki.org/wiki/Extension:RSS) / MediaWiki RSS extension";
 
 // Proxy server to use for fetching feeds
 $wgRSSProxy = false;
+
+// default date format of item publication dates see http://www.php.net/date
+$wgRSSDateDefaultFormat = "(Y-m-d H:i:s)";
+
+// limit the number of characters in the item description
+// or set to false for unlimited length.
+// THIS IS CURRENTLY NOT WORKING (bug 30377)
+$wgRSSItemMaxLength = false;
+
+// You can choose to allow active links in feed items; default: false
+$wgRSSAllowLinkTag = false;
diff --git a/RSSData.php b/RSSData.php
index 82581f3..96f0ea3 100644
--- a/RSSData.php
+++ b/RSSData.php
@@ -15,30 +15,36 @@
                        return;
                }
                $xpath = new DOMXPath( $xml );
-               $items = $xpath->query( '/rss/channel/item' );
 
-               if( $items->length !== 0 ) {
-                       foreach ( $items as $item ) {
-                               $bit = array();
-                               foreach ( $item->childNodes as $n ) {
-                                       $name = $this->rssTokenToName( 
$n->nodeName );
-                                       if ( $name != null ) {
-                                               /**
-                                                * Because for DOMElements the 
nodeValue is just
-                                                * the text of the containing 
element, without any
-                                                * tags, it makes this a safe, 
if unattractive,
-                                                * value to use. If you want to 
allow people to
-                                                * mark up their RSS, some more 
precautions are
-                                                * needed.
-                                                */
-                                               $bit[$name] = $n->nodeValue;
-                                       }
-                               }
-                               $this->items[] = $bit;
-                       }
-               } else {
-                       $this->error = 'No RSS items found.';
+               // namespace-safe method to find all elements
+               $items = $xpath->query( "//*[local-name() = 'item']" );
+
+               if ( $items->length === 0 ) {
+                       $items = $xpath->query( "//*[local-name() = 'entry']" );
+               }
+
+               if ( $items->length === 0 ) {
+                       $this->error = 'No RSS/ATOM items found.';
                        return;
+               }
+
+               foreach ( $items as $item ) {
+                       $bit = array();
+                       foreach ( $item->childNodes as $n ) {
+                               $name = $this->rssTokenToName( $n->nodeName );
+                               if ( $name != null ) {
+                                       /**
+                                        * Because for DOMElements the 
nodeValue is just
+                                        * the text of the containing element, 
without any
+                                        * tags, it makes this a safe, if 
unattractive,
+                                        * value to use. If you want to allow 
people to
+                                        * mark up their RSS, some more 
precautions are
+                                        * needed.
+                                        */
+                                       $bit[$name] = $n->nodeValue;
+                               }
+                       }
+                       $this->items[] = $bit;
                }
        }
 
@@ -52,33 +58,31 @@
         * @param $n String: name of the element we have
         * @return String Name to map it to
         */
-       protected function rssTokenToName( $n ) {
-               switch( $n ) {
-                       case 'dc:date':
-                               return 'date';
-                               # parse "2010-10-18T18:07:00Z"
-                       case 'pubDate':
-                               return 'date';
-                               # parse RFC date
-                       case 'dc:creator':
-                               return 'author';
-                       case 'title':
-                               return 'title';
-                       case 'content:encoded':
-                               return 'encodedContent';
+       protected function rssTokenToName( $name ) {
 
-                       case 'slash:comments':
-                       case 'slash:department':
-                       case 'slash:section':
-                       case 'slash:hit_parade':
-                       case 'feedburner:origLink':
-                       case 'wfw:commentRss':
-                       case 'comments':
-                       case 'category':
-                               return null;
+               $tokenNames = array(
+                       'dc:date' => 'date',
+                       'pubDate' => 'date',
+                       'updated' => 'date',
+                       'dc:creator' => 'author',
+                       'summary' => 'description',
+                       'content:encoded' => 'encodedContent',
+                       'category' => null,
+                       'comments' => null,
+                       'feedburner:origLink' => null,
+                       'slash:comments' => null,
+                       'slash:department' => null,
+                       'slash:hit_parade' => null,
+                       'slash:section' => null,
+                       'wfw:commentRss' => null,
+               );
 
-                       default:
-                               return $n;
+               if ( array_key_exists( $name, $tokenNames ) ) {
+                       return $tokenNames[ $name ];
                }
+
+       return $name;
+
        }
+
 }
diff --git a/RSSHooks.php b/RSSHooks.php
index 2f7ec28..5b7921b 100644
--- a/RSSHooks.php
+++ b/RSSHooks.php
@@ -1,6 +1,7 @@
 <?php
 
 class RSSHooks {
+
        /**
         * Tell the parser how to handle <rss> elements
         * @param $parser Parser Object
@@ -21,25 +22,56 @@
         * @param $frame PPFrame parser context
         * @return string
         */
-       static function renderRss( $input, $args, $parser, $frame ) {
-               global $wgRSSCacheAge, $wgRSSCacheCompare, $wgRSSNamespaces, $wgRSSAllowedFeeds;
+       static function renderRss( $input, array $args, Parser $parser, PPFrame $frame ) {
+               global $wgRSSCacheAge, $wgRSSCacheCompare, $wgRSSNamespaces,
+                       $wgRSSUrlWhitelist,$wgRSSAllowedFeeds;
 
                if ( is_array( $wgRSSNamespaces ) && count( $wgRSSNamespaces ) 
) {
+
                        $ns = $parser->getTitle()->getNamespace();
                        $checkNS = array_flip( $wgRSSNamespaces );
 
                        if( !isset( $checkNS[$ns] ) ) {
-                               return wfMessage( 'rss-ns-permission' )->text();
+                               return RSSUtils::RSSError( 'rss-ns-permission' 
);
                        }
                }
 
-               if ( count( $wgRSSAllowedFeeds ) && !in_array( $input, 
$wgRSSAllowedFeeds ) ) {
-                       return wfMessage( 'rss-url-permission' )->text();
+               if ( isset( $wgRSSAllowedFeeds ) ) {
+                       return RSSUtils::RSSError( 
'rss-deprecated-wgrssallowedfeeds-found' );
+               }
+
+               # disallow because there is no whitelist at all or an empty 
whitelist
+
+               if ( !isset( $wgRSSUrlWhitelist )
+                       || !is_array( $wgRSSUrlWhitelist )
+                       || ( count( $wgRSSUrlWhitelist ) === 0 ) ) {
+
+                       return RSSUtils::RSSError( 'rss-empty-whitelist',
+                               $input
+                       );
+
+               }
+
+               # disallow the feed url because the url is not whitelisted;  or
+               # disallow because the wildcard joker is not present to allow 
any feed url
+               # which can be dangerous
+
+               if ( !( in_array( $input, $wgRSSUrlWhitelist ) )
+                       && !( in_array( "*", $wgRSSUrlWhitelist ) ) ) {
+
+                       $listOfAllowed = 
$parser->getFunctionLang()->listToText( $wgRSSUrlWhitelist );
+                       $numberAllowed = $parser->getFunctionLang()->formatNum( 
count( $wgRSSUrlWhitelist ) );
+
+                       return RSSUtils::RSSError( 'rss-url-is-not-whitelisted',
+                               array( $input, $listOfAllowed, $numberAllowed )
+                       );
+
                }
 
                if ( !Http::isValidURI( $input ) ) {
-                       return wfMessage( 'rss-invalid-url', htmlspecialchars( 
$input ) )->text();
+                       return RSSUtils::RSSError( 'rss-invalid-url', 
htmlspecialchars( $input ) );
                }
+
                if ( $wgRSSCacheCompare ) {
                        $timeout = $wgRSSCacheCompare;
                } else {
@@ -58,9 +90,10 @@
                }
 
                if ( !is_object( $rss->rss ) || !is_array( $rss->rss->items ) ) 
{
-                       return wfMessage( 'rss-empty', htmlspecialchars( $input 
) )->text();
+                       return RSSUtils::RSSError( 'rss-empty', 
htmlspecialchars( $input ) );
                }
 
                return $rss->renderFeed( $parser, $frame );
        }
+
 }
diff --git a/RSSParser.php b/RSSParser.php
index a28f34b..33a85ca 100644
--- a/RSSParser.php
+++ b/RSSParser.php
@@ -2,6 +2,8 @@
 
 class RSSParser {
        protected $maxheads = 32;
+       protected $date = "Y-m-d H:i:s";
+       protected $ItemMaxLength = 200;
        protected $reversed = false;
        protected $highlight = array();
        protected $filter = array();
@@ -40,6 +42,8 @@
         * and return an object that can produce rendered output.
         */
        function __construct( $url, $args ) {
+               global $wgRSSDateDefaultFormat,$wgRSSItemMaxLength;
+
                $this->url = $url;
 
                # Get max number of headlines from argument-array
@@ -53,9 +57,12 @@
                }
 
                # Get date format from argument array
+               # or use a default value
                # @todo FIXME: not used yet
                if ( isset( $args['date'] ) ) {
                        $this->date = $args['date'];
+               } elseif ( isset( $wgRSSDateDefaultFormat ) ) {
+                       $this->date = $wgRSSDateDefaultFormat;
                }
 
                # Get highlight terms from argument array
@@ -69,6 +76,13 @@
                        $this->filter = self::explodeOnSpaces( $args['filter'] 
);
                }
 
+               # Get a maximal length for item texts
+               if ( isset( $args['item-max-length'] ) ) {
+                       $this->ItemMaxLength = $args['item-max-length'];
+               } elseif ( is_numeric( $wgRSSItemMaxLength ) ) {
+                       $this->ItemMaxLength = $wgRSSItemMaxLength;
+               }
+
                if ( isset( $args['filterout'] ) ) {
                        $this->filterOut = self::explodeOnSpaces( 
$args['filterout'] );
                }
@@ -77,7 +91,7 @@
                // a further pagename for the feedTemplate
                // In that way everything is handled via these two pages
                // and no default pages or templates are used.
-               
+
                // 'templatename' is an optional pagename of a user's 
feedTemplate
                // In that way it substitutes $1 (default: RSSPost) in 
MediaWiki:Rss-item
 
@@ -91,7 +105,7 @@
                        } else {
 
                                // compatibility patch for rss extension
-                               
+
                                $feedTemplatePagename = 'RSSPost';
                                $feedTemplateTitleObject = Title::newFromText( 
$feedTemplatePagename, NS_TEMPLATE );
 
@@ -208,7 +222,8 @@
         * @return Status object
         */
        protected function fetchRemote( $key, array $headers = array()) {
-               global $wgRSSFetchTimeout, $wgRSSUserAgent, $wgRSSProxy;
+               global $wgRSSFetchTimeout, $wgRSSUserAgent, $wgRSSProxy,
+                       $wgRSSUrlNumberOfAllowedRedirects;
 
                if ( $this->etag ) {
                        wfDebugLog( 'RSS', 'Used etag: ' . $this->etag );
@@ -220,12 +235,45 @@
                        $headers['If-Modified-Since'] = $lm;
                }
 
-               $client = HttpRequest::factory( $this->url, array( 
-                       'timeout' => $wgRSSFetchTimeout,
-                       'proxy' => $wgRSSProxy
+               /**
+                * 'noProxy' can conditionally be set as shown in the commented
+                * example below; in HttpRequest 'noProxy' takes precedence over
+                * any value of 'proxy' and disables the use of a proxy.
+                *
+                * This is useful if you run the wiki in an intranet and need to
+                * access external feed urls through a proxy but internal feed
+                * urls must be accessed without a proxy.
+                *
+                * The general handling of such cases will be subject of a
+                * forthcoming version.
+                */
 
-               ) );
-               $client->setUserAgent( $wgRSSUserAgent );
+               $url = $this->url;
+               $noProxy = !isset( $wgRSSProxy );
+
+               // Example for disabling proxy use for certain urls
+               // $noProxy = preg_match( '!\.internal\.example\.com$!i', 
parse_url( $url, PHP_URL_HOST ) );
+
+               if ( isset( $wgRSSUrlNumberOfAllowedRedirects )
+                       && is_numeric( $wgRSSUrlNumberOfAllowedRedirects ) ) {
+                       $maxRedirects = $wgRSSUrlNumberOfAllowedRedirects;
+               } else {
+                       $maxRedirects = 0;
+               }
+
+               // we set followRedirects intentionally to true to see error 
messages
+               // in cases where the maximum number of redirects is reached
+               $client = HttpRequest::factory( $url,
+                       array(
+                               'timeout'         => $wgRSSFetchTimeout,
+                               'followRedirects' => true,
+                               'maxRedirects'    => $maxRedirects,
+                               'proxy'           => $wgRSSProxy,
+                               'noProxy'         => $noProxy,
+                               'userAgent'       => $wgRSSUserAgent,
+                       )
+               );
+
                foreach ( $headers as $header => $value ) {
                        $client->setHeader( $header, $value );
                }
@@ -242,6 +290,14 @@
                return $ret;
        }
 
+       function sandboxParse($wikiText) {
+               global $wgTitle, $wgUser;
+               $myParser = new Parser();
+               $myParserOptions = ParserOptions::newFromUser($wgUser);
+               $result = $myParser->parse($wikiText, $wgTitle, 
$myParserOptions);
+               return $result->getText();
+       }
+
        /**
         * Render the entire feed so that each item is passed to the
         * template which the MediaWiki then displays.
@@ -251,9 +307,11 @@
         * @return string
         */
        function renderFeed( $parser, $frame ) {
+
                $renderedFeed = '';
-               
+
                if ( isset( $this->itemTemplate ) && isset( $parser ) && isset( 
$frame ) ) {
+
                        $headcnt = 0;
                        if ( $this->reversed ) {
                                $this->rss->items = array_reverse( 
$this->rss->items );
@@ -265,14 +323,17 @@
                                }
 
                                if ( $this->canDisplay( $item ) ) {
-                                       $renderedFeed .= $this->renderItem( 
$item ) . "\n";
+                                       $renderedFeed .= $this->renderItem( 
$item, $parser ) . "\n";
                                        $headcnt++;
                                }
                        }
 
-                       $renderedFeed = $parser->recursiveTagParse( $renderedFeed, $frame );
-        }
-               
+                       $renderedFeed = $this->sandboxParse( $renderedFeed );
+
+               }
+
+               $parser->addTrackingCategory( 'rss-tracking-category' );
+
                return $renderedFeed;
        }
 
@@ -282,7 +343,8 @@
         * @param $item Array: an array produced by RSSData where keys are the names of the RSS elements
         * @return mixed
         */
-       protected function renderItem( $item ) {
+       protected function renderItem( $item, $parser ) {
+
                $renderedItem = $this->itemTemplate;
 
                // $info will only be an XML element name, so we're safe using it.
@@ -290,14 +352,35 @@
                // and that means bad RSS with stuff like
                // <description><script>alert("hi")</script></description> will find its
                // rogue <script> tags neutered.
+               // use the overloaded multi byte wrapper functions in GlobalFunctions.php
 
                foreach ( array_keys( $item ) as $info ) {
-                       if ( $info != 'link' ) {
-                               $txt = $this->highlightTerms( $this->escapeTemplateParameter( $item[ $info ] ) );
-                       } else {
-                               $txt = $this->sanitizeUrl( $item[ $info ] );
+                       switch ( $info ) {
+                       // ATOM <id> elements and RSS <link> elements are item link urls
+                       case 'id':
+                               $txt = $this->sanitizeUrl( $item['id'] );
+                               $renderedItem = str_replace( '{{{link}}}', $txt, $renderedItem );
+                               break;
+                       case 'link':
+                               if ( !isset( $item['id'] ) ) {
+                                       $txt = $this->sanitizeUrl( $item['link'] );
+                               }
+                               $renderedItem = str_replace( '{{{link}}}', $txt, $renderedItem );
+                               break;
+                       case 'date':
+                               $tempTimezone = date_default_timezone_get();
+                               date_default_timezone_set( 'UTC' );
+                               $txt = date( $this->date, strtotime( $this->escapeTemplateParameter( $item['date'] ) ) );
+                               date_default_timezone_set( $tempTimezone );
+                               $renderedItem = str_replace( '{{{date}}}', $txt, $renderedItem );
+                               break;
+                       default:
+                               $str = $this->escapeTemplateParameter( $item[$info] );
+                               global $wgLang;
+                               $str = $wgLang->truncate( $str, $this->ItemMaxLength );
+                               $str = $this->highlightTerms( $str );
+                               $renderedItem = str_replace( '{{{' . $info . '}}}', $parser->insertStripItem( $str ), $renderedItem );
                        }
-                       $renderedItem = str_replace( '{{{' . $info . '}}}', $txt, $renderedItem );
                }
 
                // nullify all remaining info items in the template
@@ -309,7 +392,7 @@
        }
 
        /**
-        * Sanitize a URL for inclusion in wikitext. Escapes characters that have 
+        * Sanitize a URL for inclusion in wikitext. Escapes characters that have
         * a special meaning in wikitext, replacing them with URL escape codes, so
         * that arbitrary input can be included as a free or bracketed external
         * link and both work and be safe.
@@ -334,18 +417,65 @@
 
        /**
         * Sanitize user input for inclusion as a template parameter.
+        *
         * Unlike in wfEscapeWikiText() as of r77127, this escapes }} in addition
-        * to the other kinds of markup, to avoid user input ending a template 
+        * to the other kinds of markup, to avoid user input ending a template
         * invocation.
+        *
+        * If you want to allow clickable link Urls (HTML <a> tag) in RSS feeds:
+        * $wgRSSAllowLinkTag = true;
+        *
+        * If you want to allow images (HTML <img> tag) in RSS feeds:
+        * $wgAllowImageTag = true;
+        *
         */
        protected function escapeTemplateParameter( $text ) {
-               return str_replace(
-                       array( '[',     '|',      ']',     '\'',    'ISBN ',
-                               'RFC ',     '://',     "\n=",     '{{',           '}}' ),
-                       array( '&#91;', '&#124;', '&#93;', '&#39;', 'ISBN&#32;',
-                               'RFC&#32;', '&#58;//', "\n&#61;", '&#123;&#123;', '&#125;&#125;' ),
-                       htmlspecialchars( $text )
-               );
+               global $wgRSSAllowLinkTag, $wgAllowImageTag;
+
+               if ( isset( $wgRSSAllowLinkTag ) && $wgRSSAllowLinkTag ) {
+                       $extra = array( "a" );
+               } else {
+                       $extra = array();
+               }
+
+               if ( ( isset( $wgRSSAllowLinkTag ) && $wgRSSAllowLinkTag )
+                       || ( isset( $wgAllowImageTag ) && $wgAllowImageTag ) ) {
+
+                       $ret = Sanitizer::removeHTMLtags( $text, null, array(), $extra, array( "iframe" ) );
+
+               } else { // use the old escape method for a while
+
+                       $text = str_replace(
+                               array( '[',     '|',      ']',     '\'',    'ISBN ',
+                                       'RFC ',     '://',     "\n=",     '{{',           '}}',
+                               ),
+                               array( '&#91;', '&#124;', '&#93;', '&#39;', 'ISBN&#32;',
+                                       'RFC&#32;', '&#58;//', "\n&#61;", '&#123;&#123;', '&#125;&#125;',
+                               ),
+                               htmlspecialchars( str_replace( "\n", "", $text ) )
+                       );
+
+                       // keep some basic layout tags
+                       $ret = str_replace(
+                               array( '&lt;p&gt;', '&lt;/p&gt;',
+                                       '&lt;br/&gt;', '&lt;br&gt;', '&lt;/br&gt;',
+                                       '&lt;b&gt;', '&lt;/b&gt;',
+                                       '&lt;i&gt;', '&lt;/i&gt;',
+                                       '&lt;u&gt;', '&lt;/u&gt;',
+                                       '&lt;s&gt;', '&lt;/s&gt;',
+                               ),
+                               array( "", "<br/>",
+                                       "<br/>", "<br/>", "<br/>",
+                                       "'''", "'''",
+                                       "''", "''",
+                                       "<u>", "</u>",
+                                       "<s>", "</s>",
+                               ),
+                               $text
+                       );
+               }
+
+               return $ret;
        }
 
        /**
@@ -420,9 +550,8 @@
         * Filters items in or out if they match a string we're looking for.
         *
         * @param $text String: the text to examine
-        * @param $filterType String: "filterOut" to check for matches in the
-        *                                                              filterOut member list.
-        *                                                              Otherwise, uses the filter member list.
+        * @param $filterType String: "filterOut" to check for matches in the filterOut member list.
+        *      Otherwise, uses the filter member list.
         * @return Boolean: decision to filter or not.
         */
        protected function filter( $text, $filterType ) {
@@ -475,8 +604,8 @@
        /**
         * Actually replace the supplied list of words with HTML code to highlight the words.
         * @param $match Array: list of matched words to highlight.
-        *                                              The words are assigned colors based upon the order
-        *                                              they were supplied in setTerms()
+        *      The words are assigned colors based upon the order
+        *      they were supplied in setTerms()
         * @return String word wrapped in HTML code.
         */
        static function highlightThis( $match ) {
@@ -496,3 +625,25 @@
                return sprintf( $styleStart, $bgcolor[$index], $color[$index] ) . $match[0] . $styleEnd;
        }
 }
+
+class RSSUtils {
+
+       /**
+       * Output an error message, all wrapped up nicely.
+       * @param String $errorMessageName The system message that this error is
+       * @param String|Array $param Error parameter (or parameters)
+       * @return String Html that is the error.
+       */
+       public static function RSSError( $errorMessageName, $param = false ) {
+
+               // Anything from a parser tag should use the content language for its message,
+               // since the cache doesn't vary by user language: use wfMessage()->inContentLanguage().
+               // The ->parse() part makes everything safe from an escaping standpoint.
+
+               return Html::rawElement( 'span', array( 'class' => 'error' ),
+                       "Extension:RSS -- Error: " . wfMessage( $errorMessageName )->inContentLanguage()->params( $param )->parse()
+               );
+
+       }
+
+}
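
For readers following the 'date' case in renderItem() above, here is a minimal standalone sketch of the same idea: parse whatever date string the feed supplies with strtotime() and re-render it in one common format, in UTC. The function name normalizeFeedDate() and the empty-string fallback for unparseable input are assumptions made for illustration; they are not part of the merged change, which passes the strtotime() result straight to date().

```php
<?php
// Hypothetical helper, not part of Extension:RSS. It mirrors the 'date'
// case of renderItem(): switch to UTC, parse, format, restore the timezone.
function normalizeFeedDate( $rawDate, $format ) {
	$previousTimezone = date_default_timezone_get();
	date_default_timezone_set( 'UTC' );

	// strtotime() auto-detects many common date formats and returns
	// false when it cannot parse the string.
	$timestamp = strtotime( $rawDate );

	// Assumption: render unparseable dates as an empty string.
	$formatted = ( $timestamp === false ) ? '' : date( $format, $timestamp );

	// Always restore the timezone that was active before the call.
	date_default_timezone_set( $previousTimezone );

	return $formatted;
}

// An RFC 2822 date with a +01:00 offset is rendered in UTC.
echo normalizeFeedDate( 'Fri, 04 Jan 2013 12:00:00 +0100', 'Y-m-d H:i' ), "\n";
// prints "2013-01-04 11:00"
```

Restoring the previous default timezone keeps the conversion from leaking into other code that runs in the same request.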

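The fallback branch of escapeTemplateParameter() above escapes everything first and then re-enables a few layout tags. The sketch below shows that escape-then-allowlist pattern in isolation. The function name escapeKeepingBasicTags() and the reduced tag list are assumptions for illustration only, and the sketch leaves out the wikitext-specific replacements ('[', '|', '{{' and so on) that the real method performs.

```php
<?php
// Hypothetical helper, not part of Extension:RSS: escape all HTML, then
// map a small allowlist of escaped tags back to wikitext or safe HTML.
function escapeKeepingBasicTags( $text ) {
	// ENT_COMPAT is passed explicitly so the result does not depend on
	// the default flags of the running PHP version.
	$escaped = htmlspecialchars( str_replace( "\n", '', $text ), ENT_COMPAT, 'UTF-8' );

	return str_replace(
		array( '&lt;b&gt;', '&lt;/b&gt;', '&lt;i&gt;', '&lt;/i&gt;', '&lt;br&gt;', '&lt;br/&gt;' ),
		array( "'''", "'''", "''", "''", '<br/>', '<br/>' ),
		$escaped
	);
}

// The <b> pair becomes wikitext bold; the <script> tag stays escaped.
echo escapeKeepingBasicTags( '<b>News</b><script>alert(1)</script>' ), "\n";
```

Because the allowlist is applied to already-escaped text, any tag that is not listed stays inert, which is the property the comment about rogue script tags in renderItem() relies on.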
-- 
To view, visit https://gerrit.wikimedia.org/r/3925
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I2d9724314f94c216650370071b31390c5c2c97fc
Gerrit-PatchSet: 14
Gerrit-Project: mediawiki/extensions/RSS
Gerrit-Branch: master
Gerrit-Owner: Catrope <[email protected]>
Gerrit-Reviewer: Reedy <[email protected]>
Gerrit-Reviewer: Wikinaut <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
