I had the same problem too (in Flume 1.4).
We checked that the input data is actually UTF-8.
When we set the input charset to 'unicode', it worked.
By "worked" I mean that it no longer threw this exception,
but the data that arrived at the destination was garbage.

Is this a known issue, or are we missing something?
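
One possible explanation for the garbage, sketched below: in Java, the charset name 'unicode' is (as far as I know) an alias for UTF-16, so UTF-8 bytes get paired up into unrelated 16-bit characters, usually without any decoding error. This assumes Flume resolves the configured name through Java's Charset.forName; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Flume.
class CharsetMismatchDemo {

    /** Decodes the bytes using the charset with the given name. */
    static String decodeAs(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] utf8 = "hello!".getBytes(StandardCharsets.UTF_8);

        // Decoded with the right charset, the text round-trips.
        System.out.println(decodeAs(utf8, "UTF-8"));

        // Assumption: "unicode" resolves to UTF-16 on this JVM. Each pair
        // of UTF-8 bytes is then read as one 16-bit code unit, so no
        // exception is thrown but the text is unrecognizable.
        System.out.println(decodeAs(utf8, "unicode"));
    }
}
```

If this is what happened, the exception disappears only because almost any even-length byte sequence is acceptable UTF-16, not because the input was decoded correctly.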




On 08/04/2013 12:26 PM, Anat Rozenzon wrote:
Hi,

I'm trying to read a directory with the spooler, and at some point I start getting these errors:

01 Aug 2013 10:10:17,892 ERROR [pool-6-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:173) - Uncaught exception in Runnable
java.nio.charset.MalformedInputException: Input length = 1
        at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
        at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:169)
        at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
        at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
        at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
        at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
        at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:160)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)


I can see that this is a character set issue; however, the files are supposed to be UTF-8 files.
Still, some characters are invalid. Is there any way to ignore these lines?

Also, is there a way to know which file/line is causing the exception?

Thanks
Anat
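
On the question of which file and line triggers the exception: one way to find out, independent of Flume, is to scan a suspect file before it goes into the spool directory. The sketch below is a standalone checker, not a Flume feature; the class and method names are hypothetical. It splits the raw bytes on newlines (safe for UTF-8, because the byte 0x0A never occurs inside a multi-byte sequence) and strictly decodes each line.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Flume.
class Utf8LineCheck {

    /** Returns the 1-based numbers of the lines that are not valid UTF-8. */
    static List<Integer> findBadLines(byte[] data) {
        List<Integer> bad = new ArrayList<>();
        int lineNo = 1;
        int start = 0;
        for (int i = 0; i <= data.length; i++) {
            // A line ends at a newline byte or at the end of the data.
            if (i == data.length || data[i] == '\n') {
                if (i > start && !isValidUtf8(data, start, i - start)) {
                    bad.add(lineNo);
                }
                start = i + 1;
                lineNo++;
            }
        }
        return bad;
    }

    /** Strictly decodes the given byte range; false if it is malformed. */
    static boolean isValidUtf8(byte[] data, int offset, int length) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data, offset, length));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass of this exception.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java Utf8LineCheck <file> [<file> ...]
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            for (int lineNo : findBadLines(data)) {
                System.out.println(name + ": line " + lineNo + " is not valid UTF-8");
            }
        }
    }
}
```

This reads each file fully into memory, which is fine for a one-off check but would need a streaming variant for very large files. Lines it reports could then be fixed or filtered out before the spooler sees them.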
