Hi,
I am moving logs from a local machine to an HDFS server using Flume with a
spooling directory source. Each log file contains lakhs (hundreds of
thousands) of lines.
My use case is as follows.
Log file names follow the pattern foldername-filename-timestamp.suffix; an
example file name is LogFiles-Log1-1463238298.log
My configuration is below:
a1.sinks = k1
a1.channels = c1
#the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = F:\\SpoolingDirectory
a1.sources.r1.deletePolicy = immediate
a1.sources.r1.fileHeader = true
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = com.company.CustomInterceptor.CustomInterceptor$Builder
#the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.fileSuffix = .txt
a1.sinks.k1.hdfs.path = hdfs://localhost:9000/spoolingdirectory/{foldername}
#Channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000
#Flow
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

In the custom interceptor we process the file here, extract the folder name,
and add it as the {foldername} header, which is used in the HDFS path. The
problem we are facing is that, for a single file with lakhs of lines, the
interceptor extracts the same folder name lakhs of times, which leads to very
high performance degradation.
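
To make the per-event work concrete, here is a simplified sketch of the kind of extraction involved. It is not the actual interceptor code: the class and method names are hypothetical, it uses a plain Map in place of Flume's event headers, and it assumes the file path is read from the "file" header that the spooling directory source sets when fileHeader = true (the default header key).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the header logic inside the custom interceptor.
class FolderNameSketch {

    // Returns the part of the file's base name before the first '-',
    // e.g. "LogFiles" for "LogFiles-Log1-1463238298.log".
    static String folderNameOf(String filePath) {
        // Accept both Windows and Unix path separators.
        int sep = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
        String baseName = filePath.substring(sep + 1);
        int dash = baseName.indexOf('-');
        return dash < 0 ? baseName : baseName.substring(0, dash);
    }

    // Called once per event, so the same path is parsed again for every line.
    static void addFolderHeader(Map<String, String> headers) {
        String path = headers.get("file"); // assumed header key
        if (path != null) {
            headers.put("foldername", folderNameOf(path));
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("file", "F:\\SpoolingDirectory\\LogFiles-Log1-1463238298.log");
        addFolderHeader(headers);
        System.out.println(headers.get("foldername")); // prints LogFiles
    }
}
```

Every line of a given file carries the same "file" header value, so each call repeats identical parsing and produces an identical result.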
Is there any way to handle my case without computing the same file header
lakhs of times?
Thanks.

                                          
