All, I am ingesting a lot of RSS feeds as part of my application and I keep getting the same error.

WARNING: Could not parse a Date field
java.text.ParseException: Unparseable date: "Mon, 06 Dec 2010 23:31:38 +0000"
    at java.text.DateFormat.parse(Unknown Source)
    at org.apache.solr.handler.dataimport.DateFormatTransformer.process(DateFormatTransformer.java:89)
    at org.apache.solr.handler.dataimport.DateFormatTransformer.transformRow(DateFormatTransformer.java:69)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:195)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:241)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:383)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
Dec 11, 2010 6:25:47 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
Dec 11, 2010 6:25:47 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)

Are there any tips or tricks to getting standard RSS <update> fields to import correctly?

An example for a DIH config XML file is as follows:

<entity name="CBS"
        pk="link"
        datasource="filedatasource"
        url="http://feeds.cbsnews.com/CBSNewsMain?format=xml"
        processor="XPathEntityProcessor"
        forEach="/rss/channel | /rss/channel/item"
        transformer="DateFormatTransformer,HTMLStripTransformer">
  <field column="source" xpath="/rss/channel/title" commonField="true" />
  <field column="source-link" xpath="/rss/channel/link" commonField="true" />
  <field column="subject" xpath="/rss/channel/description" commonField="true" />
  <field column="title" xpath="/rss/channel/item/title" />
  <field column="link" xpath="/rss/channel/item/link" />
  <field column="description" xpath="/rss/channel/item/description" stripHTML="true" />
  <field column="creator" xpath="/rss/channel/item/creator" />
  <field column="item-subject" xpath="/rss/channel/item/subject" />
  <field column="author" xpath="/rss/channel/item/author" />
  <field column="comments" xpath="/rss/channel/item/comments" />
  <field column="pubdate" xpath="/rss/channel/item/pubDate" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
</entity>

Any tips on this would be really appreciated, as I need to query based on the date the article was published.

Thanks,
Adam
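
For reference, the mismatch between the logged value and the configured dateTimeFormat can be reproduced outside Solr. The sketch below is a minimal standalone check, assuming DateFormatTransformer hands the dateTimeFormat string to java.text.SimpleDateFormat (as its documentation describes). The class name PubDateParseSketch, the helper tryParse, and the RFC 822 style pattern are illustrative assumptions, not something taken from the DIH config or from Solr itself.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class, for illustration only (not part of Solr).
public class PubDateParseSketch {

    // The exact value reported as unparseable in the log above.
    static final String SAMPLE = "Mon, 06 Dec 2010 23:31:38 +0000";

    // The pattern currently set on the pubdate field in the DIH config.
    static final String CONFIGURED_PATTERN = "yyyy-MM-dd'T'hh:mm:ss'Z'";

    // Assumption: a pattern shaped like the RFC 822 dates RSS feeds emit.
    static final String RFC822_PATTERN = "EEE, dd MMM yyyy HH:mm:ss Z";

    // Returns the parsed Date, or null when the pattern rejects the value.
    static Date tryParse(String pattern, String value) {
        // Locale.US so that "Mon" and "Dec" are read as English names
        // regardless of the JVM's default locale.
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
        try {
            return format.parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("configured pattern parses sample: "
                + (tryParse(CONFIGURED_PATTERN, SAMPLE) != null));
        System.out.println("RFC 822 style pattern parses sample: "
                + (tryParse(RFC822_PATTERN, SAMPLE) != null));
    }
}
```

If the second pattern accepts the sample, a candidate to try on the pubdate field would be dateTimeFormat="EEE, dd MMM yyyy HH:mm:ss Z". That is untested against Solr here, and it may behave differently if the server's default locale is not English.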