[
https://issues.apache.org/jira/browse/FLUME-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
BitsOfInfo updated FLUME-1988:
------------------------------
Comment: was deleted
(was: FYI, I am in fact trying to use the morphline interceptor in combination
with this new RegexDelimiterDeSerializer, and when doing so, morphline barfs
with the error below. Note that the file that lives in the directory is a
modsec audit log file, where each "event" spans multiple lines in the source
log file. My RegexDeserializer handles this and sends this new "event" (which
is one larger string with newlines in it) off to morphline.
2013-11-06 14:24:34,642 (pool-3-thread-1) [ERROR -
org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:201)]
FATAL: Spool Directory source src1: { spoolDir:
/Users/me/Documents/mm/flume/mylogs }: Uncaught exception in
SpoolDirectorySource thread. Restart or reconfigure Flume to continue
processing.
org.apache.flume.FlumeException:
org.apache.flume.sink.solr.morphline.MorphlineInterceptor$LocalMorphlineInterceptor
must not generate more than one output record per input event
at org.apache.flume.sink.solr.morphline.MorphlineInterceptor$LocalMorphlineInterceptor.intercept(MorphlineInterceptor.java:173)
at org.apache.flume.sink.solr.morphline.MorphlineInterceptor$LocalMorphlineInterceptor.intercept(MorphlineInterceptor.java:156)
at org.apache.flume.sink.solr.morphline.MorphlineInterceptor.intercept(MorphlineInterceptor.java:74)
at org.apache.flume.interceptor.InterceptorChain.intercept(InterceptorChain.java:62)
at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:146)
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:195)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
)
> Add Support for Additional Deserializers for SpoolingDirectorySource
> --------------------------------------------------------------------
>
> Key: FLUME-1988
> URL: https://issues.apache.org/jira/browse/FLUME-1988
> Project: Flume
> Issue Type: New Feature
> Components: Docs, Sinks+Sources
> Affects Versions: v1.4.0
> Reporter: Israel Ekpo
> Assignee: Israel Ekpo
> Labels: serializers
> Attachments: EventDeserializerType.java,
> RegexDelimiterDeSerializer.java, ResettableTestStringInputStream.java,
> TestRegexDelimiterDeSerializer.java
>
>
> There are certain use cases for SpoolingDirectorySource where the events in
> the log file are not delimited with newline characters.
> Certain log files that contain stack traces, XML documents and pretty-printed
> JSON strings contain multiple newline characters within each event.
> We can use alternative logic such as specific characters, strings or regular
> expressions to determine when the event is complete.
> Hence I am proposing the following new deserializers based on
> org.apache.flume.serialization.LineDeserializer
> # org.apache.flume.serialization.RegexDelimiterDeSerializer
> Allows the user to specify a regular expression that is a delimiter for
> events within the log file
> # org.apache.flume.serialization.CharSequenceDelimiterDeSerializer
> Allows the user to specify a comma-separated character sequence that is a
> delimiter for events within the log file
> The user will specify the integer ASCII code of each character and we will
> use that sequence as the delimiter.
> For example, support for \r\n could be specified as 13,10
> A list of codes is available at http://www.asciitable.com/
> We will also need to update the user guide with examples on how to configure
> and specify a custom deserializer.
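
For illustration only, the two delimiter ideas in the description above could
be sketched roughly as below. This is not the code from the attached
RegexDelimiterDeSerializer.java and it does not use the Flume
EventDeserializer API; the class name DelimiterSketch and both method names
are hypothetical. It only shows how a regular expression can mark event
boundaries in text that contains embedded newlines, and how a list such as
13,10 can be turned into a literal delimiter string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of Flume and not the attached patch.
public class DelimiterSketch {

    // Split raw text into events wherever the delimiter regex matches.
    // The matched delimiter is dropped and empty events are skipped.
    public static List<String> splitEvents(String text, Pattern delimiter) {
        List<String> events = new ArrayList<>();
        Matcher m = delimiter.matcher(text);
        int start = 0;
        while (m.find()) {
            if (m.end() == m.start()) {
                continue; // ignore zero-length matches
            }
            if (m.start() > start) {
                events.add(text.substring(start, m.start()));
            }
            start = m.end();
        }
        if (start < text.length()) {
            events.add(text.substring(start)); // trailing event without delimiter
        }
        return events;
    }

    // Turn a comma-separated list of ASCII codes (e.g. "13,10") into the
    // character sequence those codes represent.
    public static String delimiterFromAsciiCodes(String codes) {
        StringBuilder sb = new StringBuilder();
        for (String part : codes.split(",")) {
            int code = Integer.parseInt(part.trim());
            if (code < 0 || code > 127) {
                throw new IllegalArgumentException("Not an ASCII code: " + code);
            }
            sb.append((char) code);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two events separated by \r\n; the first contains a bare \n.
        String log = "event one\nline two\r\nevent two\r\n";
        String delim = delimiterFromAsciiCodes("13,10");
        List<String> events = splitEvents(log, Pattern.compile(Pattern.quote(delim)));
        for (String event : events) {
            System.out.println("[" + event + "]");
        }
    }
}
```

A real deserializer would read incrementally from the spooled file and track
its position for resets instead of holding the whole text in memory; the
sketch leaves that out to keep the delimiter logic visible.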
--
This message was sent by Atlassian JIRA
(v6.1#6144)