Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The following page has been changed by FergusMcMenemie:
http://wiki.apache.org/solr/DataImportHandler

The comment on the change is:
further enhancements to wikipedia example, adding use of DateFormatTransformer...

------------------------------------------------------------------------------
              <field column="user" xpath="/mediawiki/page/revision/contributor/username" />
              <field column="userId" xpath="/mediawiki/page/revision/contributor/id" />
              <field column="text" xpath="/mediawiki/page/revision/text" />
-             <field column="timestamp" xpath="/mediawiki/page/revision/timestamp" />
+             <field column="timestamp" xpath="/mediawiki/page/revision/timestamp" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
              <field column="$skipDoc" regex="^#REDIRECT .*" replaceWith="true" sourceColName="text"/>
          </entity>
      </document>
@@ -455, +455 @@
  <copyField source="title" dest="titleText"/>
  }}}
- Time taken was around 2 hours 40 minutes to index 7278241 articles with peak memory usage at around 4GB. Note that many articles are merely redirects to other articles. The use of $skipDoc allows those articles to be ignored.
+ Time taken was around 2 hours 40 minutes to index 7278241 articles with peak memory usage at around 4GB. Note that many wikipedia articles are merely redirects to other articles, the use of $skipDoc allows those articles to be ignored. Also, the column '''$skipDoc''' is only defined when the regexp matches.

  == Using delta-import command ==
  The only !EntityProcessor which supports delta is !SqlEntityProcessor! The X!PathEntityProcessor has not implemented it yet. So, unfortunately, there is no delta support for XML at this time.
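
A note on the dateTimeFormat value added in the change above: DateFormatTransformer is documented as using java.text.SimpleDateFormat patterns, and in that syntax "hh" is the 12-hour clock field while "HH" is the 24-hour one. The sketch below is not taken from the wiki page; it only compares the two patterns in plain Java. The class name, the helper method, and the sample timestamp are hypothetical, and UTC is pinned by hand as an assumption, because the quoted 'Z' is matched as a literal character rather than parsed as a time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of Solr or the DataImportHandler.
class TimestampPatternDemo {

    // Parses the timestamp with the given SimpleDateFormat pattern,
    // treating it as UTC (an assumption of this sketch), and returns
    // the resulting hour of day in the range 0-23.
    static int parseHourUtc(String pattern, String timestamp) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(utc);
        try {
            Calendar calendar = Calendar.getInstance(utc, Locale.ROOT);
            calendar.setTime(format.parse(timestamp));
            return calendar.get(Calendar.HOUR_OF_DAY);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + timestamp, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical timestamp in the shape of a MediaWiki revision timestamp.
        String sample = "2008-10-15T12:30:00Z";

        // Pattern as written in the diff: "hh" is hour in AM/PM (1-12).
        System.out.println("hh pattern, hour of day: "
                + parseHourUtc("yyyy-MM-dd'T'hh:mm:ss'Z'", sample));

        // Alternative with "HH": hour in day (0-23).
        System.out.println("HH pattern, hour of day: "
                + parseHourUtc("yyyy-MM-dd'T'HH:mm:ss'Z'", sample));
    }
}
```

With the default lenient parser, a timestamp in the 12:xx hour is read as 00:xx by the "hh" pattern, since no AM/PM marker is present, whereas the "HH" pattern keeps it at 12:xx. If that matters for the indexed dates, "HH" may be the safer choice for 24-hour timestamps such as those in the Wikipedia dump.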
