Yep, you haven't configured hdfs.rollInterval and hdfs.rollSize in the sink, so it will produce tiny files. The batch size is not related to the file size.
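With the HDFS sink's defaults (hdfs.rollInterval = 30, hdfs.rollSize = 1024, hdfs.rollCount = 10), a file is rolled after 30 seconds, ~1 KB, or 10 events, whichever comes first, which is consistent with the 766-byte files you are seeing. Something along these lines in your sink section should help (the values below are only illustrative, tune them to your workload; setting a parameter to 0 disables that trigger):

# roll a new file every 10 minutes or at ~128 MB, whichever comes first;
# rollCount = 0 disables rolling by event count
flumeAgentSpoolDir.sinks.hdfsSink.hdfs.rollInterval = 600
flumeAgentSpoolDir.sinks.hdfsSink.hdfs.rollSize = 134217728
flumeAgentSpoolDir.sinks.hdfsSink.hdfs.rollCount = 0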
Regards,
Gonzalo

On Oct 4, 2015 8:15 AM, "Shiva Ram" <[email protected]> wrote:

> This is my conf file:
>
> flumeAgentSpoolDir.sources = spoolDirSource
> flumeAgentSpoolDir.channels = memoryChannel
> flumeAgentSpoolDir.sinks = hdfsSink
>
> # source
> flumeAgentSpoolDir.sources.spoolDirSource.type = spooldir
> flumeAgentSpoolDir.sources.spoolDirSource.channels = memoryChannel
> flumeAgentSpoolDir.sources.spoolDirSource.spoolDir = /home/hduser/Downloads/app_log_data
> flumeAgentSpoolDir.sources.spoolDirSource.fileHeader = true
>
> # HDFS sink
> flumeAgentSpoolDir.sinks.hdfsSink.type = hdfs
> flumeAgentSpoolDir.sinks.hdfsSink.hdfs.fileType = DataStream
> # change to your host
> flumeAgentSpoolDir.sinks.hdfsSink.hdfs.path = hdfs://192.168.234.181:8020/flume_ng/log_data
> flumeAgentSpoolDir.sinks.hdfsSink.hdfs.filePrefix = sales_web_log
> flumeAgentSpoolDir.sinks.hdfsSink.hdfs.fileSuffix = .log
> flumeAgentSpoolDir.sinks.hdfsSink.hdfs.batchSize = 50
> flumeAgentSpoolDir.sinks.hdfsSink.hdfs.bufferMaxLines = 50
>
> # Use a channel which buffers events in memory
> flumeAgentSpoolDir.channels.memoryChannel.type = memory
> flumeAgentSpoolDir.channels.memoryChannel.capacity = 10000
> flumeAgentSpoolDir.channels.memoryChannel.transactionCapacity = 1000
> flumeAgentSpoolDir.channels.memoryChannel.byteCapacity = 100000000
>
> # Bind the source and sink to the channel
> flumeAgentSpoolDir.sources.spoolDirSource.channels = memoryChannel
> flumeAgentSpoolDir.sinks.hdfsSink.channel = memoryChannel
>
> Thanks & Regards,
> Shiva Ram
> Website: http://datamaking.com
> Facebook Page: www.facebook.com/datamaking
>
> On Sun, Oct 4, 2015 at 12:19 PM, IT CTO <[email protected]> wrote:
>
> > Sorry, I can't see the attached file.
> >
> > On Sun, Oct 4, 2015 at 08:34, Shiva Ram <[email protected]> wrote:
> >
> > > Thanks for your inputs.
> > >
> > > This is my conf file.
> > >
> > > On Sat, Oct 3, 2015 at 11:09 PM, IT CTO <[email protected]> wrote:
> > >
> > > > Can you share your conf file?
> > > > The size of the file can be determined by a few parameters, such as roll* or idle-timeout.
> > > > Eran
> > > >
> > > > On Sat, Oct 3, 2015 at 6:33 PM, Shiva Ram <[email protected]> wrote:
> > > >
> > > > > My flume agent conf file is attached.
> > > > >
> > > > > How to increase the output file size? Thanks.
> > > > >
> > > > > On Sat, Oct 3, 2015 at 4:58 PM, Shiva Ram <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I am using a spooldir source, memory channel, and HDFS sink to collect log files and store them in HDFS.
> > > > > >
> > > > > > When I run the flume agent, it creates very small files of 766 bytes each.
> > > > > >
> > > > > > Input file: test.log [11.4 KB]
> > > > > > Output files: sales_web_log.1443871052640.log, etc. [all very small files, 766 bytes each]
> > > > > >
> > > > > > How to increase the output file size?
> > > > > >
> > > > > > Thanks & Regards,
> > > > > > Shiva Ram
> > > >
> > > > --
> > > > Eran | "You don't need eyes to see, you need vision" (Faithless)
