Hello,

I'm trying to get syslog data into a SOLR index. I'm using James Keating's 
omsolr output module (thanks James!) with a binary I built from rsyslog 
5.8.7, and it's outputting documents as specified by the template in my 
rsyslog.conf:

$Template SolrLog, "<add><doc><field name='from'>%fromhost%</field><field 
name='facility'>%syslogfacility-text%</field><field 
name='hostname'>%hostname%</field><field name='tag'>%syslogtag%</field><field 
name='program'>%programname%</field><field 
name='severity'>%syslogseverity-text%</field><field 
name='priority'>%syslogpriority-text%</field><field 
name='msg'><![CDATA[%msg%]]></field><field 
name='generated'>%timegenerated:::date-rfc3339%</field><field 
name='timestamp'>%timereported:::date-rfc3339%</field></doc></add>"

The documents are added to the SOLR index just fine so long as each field is 
defined as a "text" type in the SOLR schema. I'd like to define the "generated" 
and "timestamp" fields with type "DateField" or "TrieDateField" so that the 
index is searchable by date (vs string pattern matching).

The problem arises when I try to add documents to a SOLR index whose fields 
are defined as native solr.DateField or solr.TrieDateField types in the SOLR 
schema. According to the documentation, 
http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html :

A date field shall be of the form 1995-12-31T23:59:59Z. The trailing "Z" 
designates UTC time and is mandatory (See below for an explanation of UTC). 
Optional fractional seconds are allowed, as long as they do not end in a 
trailing 0 (but any precision beyond milliseconds will be ignored). All other 
parts are mandatory.

I sniffed the wire to see what rsyslog is sending to SOLR and I see that the 
output (RFC-3339) is formatted like so:

2012-01-25T21:46:13.102571+00:00

When I attempt to insert the document using this format I get an error:

The request sent by the client was syntactically incorrect (ERROR: [doc=null] 
Error adding field 'generated'='2012-01-25T21:46:13.102571+00:00').

I see 3 possible workarounds:


1) Add another property option to format the timestamp so that it's 
compliant with ISO-8601 / Java DateField.

2) Format the timestamp using a [regex] property replacer in rsyslog.conf. 
The time zone suffix would be stripped off and replaced with a trailing 'Z'. 
I'm not sure whether this is possible; if it is, it seems ugly and perhaps 
costly from a performance perspective.

3) Ditch the omsolr plugin altogether and pipe the output to an external 
script that formats the field properly. Yuck - I'd much rather use the 
compiled module.
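
For reference, here is a minimal Python sketch of the conversion that any of 
these workarounds would have to perform: normalize the RFC 3339 timestamp to 
UTC, truncate to milliseconds, drop trailing zeros from the fraction, and end 
with 'Z'. The function name rfc3339_to_solr is my own invention, and how such 
a script would be hooked up to rsyslog is not shown. It only illustrates the 
target format, not a tested part of any pipeline.

```python
import re
from datetime import datetime, timedelta, timezone

# RFC 3339 timestamp: date, 'T', time, optional fraction, then 'Z' or +hh:mm / -hh:mm
_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?"
    r"(?:([Zz])|([+-])(\d{2}):(\d{2}))$"
)


def rfc3339_to_solr(ts: str) -> str:
    """Convert an RFC 3339 timestamp to the UTC 'Z' form described in the
    DateField documentation quoted above (hypothetical helper)."""
    m = _RFC3339.match(ts)
    if m is None:
        raise ValueError(f"not an RFC 3339 timestamp: {ts!r}")

    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])

    # Pad or truncate the fraction to microseconds for datetime()
    frac = m.group(7) or ""
    micro = int(frac[:6].ljust(6, "0")) if frac else 0

    # 'Z' means UTC; otherwise build the offset from sign, hours and minutes
    if m.group(8):
        offset = timedelta(0)
    else:
        sign = 1 if m.group(9) == "+" else -1
        offset = sign * timedelta(hours=int(m.group(10)), minutes=int(m.group(11)))

    dt = datetime(year, month, day, hour, minute, second, micro,
                  tzinfo=timezone(offset)).astimezone(timezone.utc)

    base = dt.strftime("%Y-%m-%dT%H:%M:%S")

    # Precision beyond milliseconds is ignored, and the fraction must not
    # end in a trailing 0, so truncate to milliseconds and strip zeros
    millis = dt.microsecond // 1000
    if millis:
        return f"{base}.{f'{millis:03d}'.rstrip('0')}Z"
    return f"{base}Z"


if __name__ == "__main__":
    print(rfc3339_to_solr("2012-01-25T21:46:13.102571+00:00"))
```

Non-zero offsets are shifted to UTC rather than simply replaced with 'Z', 
since stripping the suffix alone would only be correct when the offset is 
already +00:00.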

Any suggestions?

And thanks to Rainer & James for sharing these tools!

Regards,

Lars Peterson


_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
