Hello,

Thank you for the hint to use the new spoolDir feature in the freshly released 1.3.0 version of Flume.
Unfortunately, I am not getting the expected result. Here is my configuration:

agent1.channels = MemoryChannel-2
agent1.channels.MemoryChannel-2.type = memory

agent1.sources = spooldir-1
agent1.sources.spooldir-1.type = spooldir
agent1.sources.spooldir-1.spoolDir = /opt/apache2/logs/flumeSpool
agent1.sources.spooldir-1.fileHeader = true

agent1.sinks = HDFS
agent1.sinks.HDFS.channel = MemoryChannel-2
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000
agent1.sinks.HDFS.hdfs.writeFormat = Text

Upon start I am getting the following warning:

2012-12-05 11:05:19,216 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:571)] Removed spooldir-1 due to No Channels configured for spooldir-1

Questions:

1) Is something wrong in the above config?
2) How are the files gathered from the spool directory? Is a file picked up every time I drop (copy, etc.) one into it?
3) What happens to the files that were already in the spool directory before I start the Flume agent?

I would appreciate any help!

Cheers,
Emile

-------- Original Message --------
> Date: Tue, 4 Dec 2012 06:48:46 -0800
> From: Mike Percy <[email protected]>
> To: [email protected]
> Subject: Re: A customer use case

> Hi Emile,
>
> On Tue, Dec 4, 2012 at 2:04 AM, Emile Kao <[email protected]> wrote:
>
> > 1. Which is the best way to implement such a scenario using Flume/Hadoop?
>
> You could use the file spooling client / source to stream these files back
> in the latest trunk and upcoming Flume 1.3.0 builds, along with hdfs sink.
>
> > 2. The customer would like to keep the log files in their original state
> > (file name, size, etc.). Is it practicable using Flume?
>
> Not recommended. Flume is an event streaming system, not a file copying
> mechanism. If you want to do that, just use some scripts with hadoop fs
> -put instead of Flume. Flume provides a bunch of stream-oriented features
> on top of its event streaming architecture, such as data enrichment
> capabilities, event routing, and configurable file rolling on HDFS, to name
> a few.
>
> Regards,
> Mike
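
A possible explanation for the warning quoted at the top of this message: in Flume 1.x a source is bound to its channels through the plural "channels" property, while a sink uses the singular "channel". The posted configuration sets the sink's channel but no channels property for spooldir-1. A minimal sketch of the missing line, assuming the agent, source, and channel names stay exactly as posted:

```properties
# Bind the spooling-directory source to the memory channel.
# Sources take "channels" (plural, space-separated list of channel names);
# sinks take "channel" (singular), as the HDFS sink above already does.
agent1.sources.spooldir-1.channels = MemoryChannel-2
```

With this line in place, the validator should find a channel for spooldir-1 and would no longer be expected to remove the source at startup. Whether the rest of the pipeline then behaves as intended has not been verified here.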
