To fix up invalid JSON, you can try readClob (or maybe readLine) to get the 
body as a string, followed by findReplace or grok to strip the leading text, 
followed by toByteArray, followed by setValues { 
_attachment_body : "@{message}" }, followed by readJson.
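
A minimal sketch of that chain as a morphline config, not tested against this 
setup. Assumptions: the id "fixJson" and the regex are illustrative only; the 
importCommands package is inferred from the com.cloudera.cdk class name in the 
log output below; the findReplace and toByteArray parameter names are as I 
recall them from the morphlines reference guide, so check them against your 
version. The regex drops everything before the first "{".

```
morphlines : [
  {
    id : fixJson                                # hypothetical name
    importCommands : ["com.cloudera.cdk.**"]    # assumed from the log line
    commands : [
      # Decode the bytes in _attachment_body into a string in the "message" field.
      { readClob { charset : UTF-8 } }

      # Strip the leading "x  - -  " by removing everything before the first "{".
      {
        findReplace {
          field : message
          pattern : "^[^{]*"
          isRegex : true
          replacement : ""
          replaceFirst : true
        }
      }

      # Turn the cleaned string back into bytes.
      { toByteArray { field : message } }

      # Put the cleaned bytes where readJson expects its input.
      { setValues { _attachment_body : "@{message}" } }

      # Parse the now well-formed JSON.
      { readJson {} }
    ]
  }
]
```

A logDebug command between steps, like the one that produced the output quoted 
below, is a cheap way to confirm what each stage leaves in the record.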

Wolfgang.

On Mar 26, 2014, at 8:59 PM, Andrew Sammut <[email protected]> wrote:

> 
> Hi all
> 
> I'm a relative beginner to Flume and morphlines, but I will try to 
> explain myself.
> 
> I am generating logs (with the message in JSON) from PHP and pumping them to 
> rsyslog, which in turn pumps the logs into a Flume syslog TCP source.
> 
> An example message would be the following:
> 
> <12>1 2014-03-27T03:46:56.648886+00:00 x x  - -  
> {"time":1395892016,"level":4,"body":"HTTP_Exception_404 [404]:  \/ 
> KIX-53339f309cafb6.72508132 - Unable to find a route to match the URI:  in 
> MODPATH\/patches\/classes\/Request.php on line 254"}
> 
> And this translates into the following on entry to my morphlines command:
> 
> 27 Mar 2014 03:46:56,890 DEBUG [pool-9-thread-1] 
> (com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug.log:63)  - begin: 
> [{Facility=[1], Severity=[4], _attachment_body=[[B@2d913d11], 
> environment=[x], host=[x], hostname=[x], pop=[x], product=[x], 
> timestamp=[1395892664886]}]
> 
> That's all good. However, when you look at the actual string representation of 
> _attachment_body (using readLine), it looks like this:
> 
> x  - -  {"time":1395892016,"level":4,"body":"HTTP_Exception_404 [404]:  \/ 
> KIX-53339f309cafb6.72508132 - Unable to find a route to match the URI:  in 
> MODPATH\/patches\/classes\/Request.php on line 254"}
> 
> Now, if I run readJson on that, it fails completely as it's improperly 
> formatted. The question is, how can one process the attachment body to remove 
> the leading 'x  - -  ' so that readJson would work? Am I missing something 
> completely?
> 
> Regards,
> Andrew S 
