Hi, Jeff, 
Thank you for your quick response! I could not easily find the exact log entry 
that had the issue - all I had were 30M input log files :).
After further debugging, I figured out what the issue was. Here is what 
happened.
For production, we use the exec source with 'tail -f'. For my local testing I use a 
spooling directory. The issue happened when I was using the spooldir source, when a log 
file had non-UTF-8 characters.
However, the exception that I posted did not come from processing the log file itself! 
The flow was as follows:
1. Flume is started with the spooldir source.
2. A log file with non-UTF-8 chars is moved into the spooldir.
3. Flume starts processing, encounters a "bad" character and stops (no errors or anything).
4. I kill Flume manually and restart it - without cleaning out its .flumespool dir.
5. Flume starts up and now chokes processing its own .flumespool dir and the 
left-over file in there! - this is where the MalformedInputException came from.
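In case it helps anyone hitting the same thing: the spooldir source keeps its 
between-restarts metadata in a tracker directory, which defaults to .flumespool 
inside the spool directory. A minimal sketch of the relevant settings - agent/source 
names a1/r1 match my config below, channel wiring omitted, and the spoolDir path is 
from my setup:

a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /data1/varnish-logs-active
# tracker metadata lives here between restarts; resolved relative to
# spoolDir when not absolute, and defaults to .flumespool
a1.sources.r1.trackerDir = .flumespool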
When I processed the same file via the exec source and a 'tail -n 10000 ..' command, 
it was processed successfully - which told me the issue is specific to the 
spooldir source.
The solution was to add this parameter to the spooldir source:
a1.sources.r1.inputCharset = ISO8859-1
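For reference, here is roughly what the source definition looks like with that fix - 
a sketch of my config, channel wiring omitted. Flume 1.5+ also has a decodeErrorPolicy 
parameter on the spooldir source (FAIL / REPLACE / IGNORE, added by FLUME-2052), which 
should be an alternative to changing the charset, though I have not tried it myself:

a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /data1/varnish-logs-active
# read input files as Latin-1 instead of the default UTF-8
a1.sources.r1.inputCharset = ISO8859-1
# untested alternative: substitute undecodable bytes instead of failing
# a1.sources.r1.decodeErrorPolicy = REPLACE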
Thanks!
Marina



      From: Jeff Lord <[email protected]>
 To: "[email protected]" <[email protected]>; Marina <[email protected]> 
 Sent: Monday, March 9, 2015 11:17 AM
 Subject: Re: MalformedInputException processing logs from Varnish server
   
Hi Marina,
Do you have a sample of the characters/data which you believe to be causing 
this? Can you confirm whether you are using the Apache version of Flume or a specific 
distro? Also, in your message you mention that you are using tail -f, which would 
be the exec source, but the stack trace looks like you are actually using the 
spooldir source.
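To make the distinction concrete in config terms - a minimal sketch, with example 
paths and channel wiring omitted:

# exec source: runs a command and consumes its stdout
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /var/log/varnish/access.log

# spooldir source: watches a directory for completed, immutable files
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /data1/varnish-logs-active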
Best,
Jeff


On Mon, Mar 9, 2015 at 10:26 AM, Marina <[email protected]> wrote:

Hi,
I have configured Flume to "tail -f" logs from my Varnish server - pretty 
much standard Apache HTTP logs. However, sometimes Flume chokes on some special 
characters and dies - it stops processing new log entries.
See below for a stack trace.
It seems like this exact issue was reported as a Flume bug in version 1.4.x:
https://issues.apache.org/jira/browse/FLUME-2052
and it was marked as resolved in version 1.5.0. The version I am using is Flume 1.5.2 - and I am 
still seeing this issue...
Could somebody confirm or deny whether what I am seeing is the same issue that should 
have been fixed? Or is this completely different?
Thank you!
Marina


06 Mar 2015 18:16:57,820 ERROR [pool-3-thread-1] 
(org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:256)  
- FATAL: Spool Directory source r1: { spoolDir: /data1/varnish-logs-active }: 
Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume 
to continue processing.
java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
    at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:195)
    at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
    at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
    at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
    at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:238)
    at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:227)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)