Glad to hear you got it worked out!

On Mon, Mar 9, 2015 at 4:09 PM, Marina <[email protected]> wrote:
> Hi, Jeff,
> Thank you for your quick response!
> I could not easily find the exact log entry that had the issue - all I
> had were 30M input log files :).
> After further debugging, I figured out what the issue was. Here is what
> happened.
>
> For production, we use the exec source with 'tail -f'. For my local
> testing I use a spooling directory. The issue happened when I was using
> the spooldir source and a log file had non-UTF-8 characters.
> However, the exception that I posted did not come from processing the
> log file! The flow was as follows:
> 1. Flume is started with the spooldir source
> 2. a log file with non-UTF-8 chars is moved into the spooldir
> 3. Flume starts processing, encounters a "bad" character and stops (no
>    errors or anything)
> 4. I kill Flume manually and restart - without cleaning out its
>    .flumespool dir
> 5. Flume starts up and now chokes while processing its own .flumespool
>    dir and the left-over file in there! - this is where the
>    MalformedInputException came from
>
> When I processed the same file via the exec source and a
> 'tail -n 10000 ..' command, it was processed successfully - which told
> me the issue is specific to the spooldir source.
>
> The solution was to add this parameter to the spooldir source:
> a1.sources.r1.inputCharset = ISO8859-1
>
> Thanks!
> Marina
>
> ------------------------------
> *From:* Jeff Lord <[email protected]>
> *To:* "[email protected]" <[email protected]>; Marina <[email protected]>
> *Sent:* Monday, March 9, 2015 11:17 AM
> *Subject:* Re: MalformedInputException processing logs from Varnish server
>
> Hi Marina,
>
> Do you have a sample of the characters/data which you believe to be
> causing this?
> Can you confirm whether you are using the Apache version of Flume or a
> specific distro?
> Also, in your message you mention that you are using tail -f, which
> would be the exec source, but the stack trace looks like you are
> actually using the spooldir source.
>
> Best,
>
> Jeff
>
> On Mon, Mar 9, 2015 at 10:26 AM, Marina <[email protected]> wrote:
>
> Hi,
> I have configured Flume to "tail -f" logs from my Varnish server - pretty
> much standard Apache HTTP logs.
> However, sometimes Flume chokes on some special characters and dies -
> it stops processing new log entries.
>
> See below for a stack trace.
>
> It seems like this exact issue was reported as a Flume bug in version
> 1.4.x:
> https://issues.apache.org/jira/browse/FLUME-2052
> and it was marked as resolved in version 1.5.0.
> The version I am using is Flume 1.5.2 - and I am still seeing this issue...
>
> Could somebody confirm/deny whether what I am seeing is the same issue
> and should have been fixed? Or is this completely different?
>
> Thank you!
> Marina
>
> 06 Mar 2015 18:16:57,820 ERROR [pool-3-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:256) - FATAL: Spool Directory source r1: { spoolDir: /data1/varnish-logs-active }: *Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.*
> *java.nio.charset.MalformedInputException: Input length = 1*
>     at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
>     at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:195)
>     at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
>     at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
>     at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
>     at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:238)
>     at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:227)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
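
For anyone who lands on this thread with the same exception: below is a minimal Java sketch, independent of Flume, of why switching the charset makes the error go away. It only uses the JDK's java.nio.charset classes; the class name CharsetDemo, the helper decodeStrict, and the sample bytes are made up for illustration and are not taken from Flume's code. A strict UTF-8 decoder rejects a lone 0xE9 byte, while ISO-8859-1 maps every possible byte to a character and therefore cannot fail this way.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Flume.
class CharsetDemo {

    // Decode strictly: report malformed bytes instead of silently replacing them.
    static String decodeStrict(byte[] bytes, Charset charset) throws CharacterCodingException {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: 0xE9 is "e with acute accent" in ISO-8859-1,
        // but on its own it is not a valid UTF-8 sequence.
        byte[] line = {'c', 'a', 'f', (byte) 0xE9, ' ', 'x'};

        try {
            decodeStrict(line, StandardCharsets.UTF_8);
            System.out.println("UTF-8 decode succeeded");
        } catch (MalformedInputException e) {
            System.out.println("UTF-8 decode failed: " + e);
        }

        // ISO-8859-1 assigns a character to every byte value, so this cannot throw.
        System.out.println("ISO-8859-1 decode: " + decodeStrict(line, StandardCharsets.ISO_8859_1));
    }
}
```

One caveat on the design choice: ISO-8859-1 never fails, but if the logs are actually in some other encoding, the non-ASCII characters will come through as the wrong characters rather than raising an error.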
