Thanks for your reply. I will try the syslog source.
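A minimal sketch of what that could look like, assuming the agent and channel names from the configuration quoted below (a1, r1, c2) and Flume's built-in syslogtcp source type; the host and port here are placeholders:

```
a1.sources = r1
a1.channels = c2

# Replace the netcat source with a TCP syslog source; syslog-ng (or
# rsyslog) on the sending side forwards the tailed file to this port.
a1.sources.r1.type = syslogtcp
a1.sources.r1.host = 0.0.0.0
a1.sources.r1.port = 5140
a1.sources.r1.channels = c2
```

With this, each syslog message arrives as one delimited event, instead of one long-lived nc stream that Flume cannot see the end of.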
Zhiwen Sun

On Wed, Mar 20, 2013 at 3:11 PM, Alexander Alten-Lorenz <[email protected]> wrote:
> Hi,
>
> I suspect tail -F and nc are filling up the directory. What is inside
> such a file that grows without an event?
>
> My assumption: nc opens one stream and delivers all incoming events over
> this stream. Flume doesn't know that no event is coming in, since the
> stream never breaks up. I wonder if you could use syslog(-ng) for the
> event delivery?
>
> Cheers,
> Alex
>
> On Mar 20, 2013, at 2:30 AM, Zhiwen Sun <[email protected]> wrote:
>
> > Thanks all for your reply.
> >
> > @Kenison
> > I stopped my tail -F | nc program and there is no new event file in
> > HDFS, so I think no events arrive. To make sure, I will test again
> > with JMX enabled.
> >
> > @Alex
> >
> > The latest log follows. I can't see any exception or warning.
> >
> > 13/03/19 15:28:16 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901
> > 13/03/19 15:28:16 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp
> > 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Start checkpoint for /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 3
> > 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1363659953997, queueSize: 0, queueHead: 362981
> > 13/03/19 15:28:17 INFO file.LogFileV3: Updating log-7.meta currentPosition = 216278208, logWriteOrderID = 1363659953997
> > 13/03/19 15:28:17 INFO file.Log: Updated checkpoint for file: /home/zhiwensun/.flume/file-channel/data/log-7 position: 216278208 logWriteOrderID: 1363659953997
> > 13/03/19 15:28:26 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902
> > 13/03/19 15:28:27 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp
> > 13/03/19 15:28:37 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903
> > 13/03/19 15:28:37 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp
> >
> > 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Start checkpoint for /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 2
> > 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1363659954200, queueSize: 0, queueHead: 362981
> > 13/03/19 15:28:47 INFO file.LogFileV3: Updating log-7.meta currentPosition = 216288815, logWriteOrderID = 1363659954200
> > 13/03/19 15:28:47 INFO file.Log: Updated checkpoint for file: /home/zhiwensun/.flume/file-channel/data/log-7 position: 216288815 logWriteOrderID: 1363659954200
> > 13/03/19 15:28:48 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904
> >
> > @Hari
> > Hmm, 12 hours have passed and the size of the file channel directory
> > has not decreased.
> >
> > Files in the file channel directory:
> >
> > -rw-r--r-- 1 zhiwensun zhiwensun    0 2013-03-19 09:15 in_use.lock
> > -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 log-6
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 10:12 log-6.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 log-7
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 15:28 log-7.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 ./file-channel/data/log-7
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 10:12 ./file-channel/data/log-6.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 15:28 ./file-channel/data/log-7.meta
> > -rw-r--r-- 1 zhiwensun zhiwensun    0 2013-03-19 09:15 ./file-channel/data/in_use.lock
> > -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 ./file-channel/data/log-6
> >
> > Zhiwen Sun
> >
> > On Wed, Mar 20, 2013 at 2:32 AM, Hari Shreedharan <[email protected]> wrote:
> > It is possible for the directory size to increase even if no writes are
> > going into the channel. If the channel size is non-zero and the sink is
> > still writing events to HDFS, the takes get written to disk as well (so
> > we know which events in the files were removed when the channel/agent
> > restarts). Eventually the channel will clean up the files from which all
> > events have been taken (though it will keep at least 2 files per data
> > directory, just to be safe).
> >
> > --
> > Hari Shreedharan
> >
> > On Tuesday, March 19, 2013 at 10:32 AM, Alexander Alten-Lorenz wrote:
> >
> >> Hey,
> >>
> >> What does debug say? Can you gather logs and attach them?
> >>
> >> - Alex
> >>
> >> On Mar 19, 2013, at 5:27 PM, "Kenison, Matt" <[email protected]> wrote:
> >>
> >>> Check the JMX counter first, to make sure you really are not sending
> >>> new events. If not, is it your checkpoint directory or data directory
> >>> that is increasing in size?
> >>>
> >>> From: Zhiwen Sun <[email protected]>
> >>> Reply-To: "[email protected]" <[email protected]>
> >>> Date: Tue, 19 Mar 2013 01:19:19 -0700
> >>> To: "[email protected]" <[email protected]>
> >>> Subject: Why does used space of file channel buffer directory increase?
> >>>
> >>> Hi all:
> >>>
> >>> I am testing flume-ng on my local machine. The data flow is:
> >>>
> >>> tail -F file | nc 127.0.0.1 4444 > flume agent > hdfs
> >>>
> >>> My configuration file is here:
> >>>
> >>>> a1.sources = r1
> >>>> a1.channels = c2
> >>>>
> >>>> a1.sources.r1.type = netcat
> >>>> a1.sources.r1.bind = 192.168.201.197
> >>>> a1.sources.r1.port = 44444
> >>>> a1.sources.r1.max-line-length = 1000000
> >>>>
> >>>> a1.sinks.k1.type = logger
> >>>>
> >>>> a1.channels.c1.type = memory
> >>>> a1.channels.c1.capacity = 10000
> >>>> a1.channels.c1.transactionCapacity = 10000
> >>>>
> >>>> a1.channels.c2.type = file
> >>>> a1.sources.r1.channels = c2
> >>>>
> >>>> a1.sources.r1.interceptors = i1
> >>>> a1.sources.r1.interceptors.i1.type = timestamp
> >>>>
> >>>> a1.sinks = k2
> >>>> a1.sinks.k2.type = hdfs
> >>>> a1.sinks.k2.channel = c2
> >>>> a1.sinks.k2.hdfs.path = hdfs://127.0.0.1:9000/flume/events/%Y-%m-%d
> >>>> a1.sinks.k2.hdfs.writeFormat = Text
> >>>> a1.sinks.k2.hdfs.rollInterval = 10
> >>>> a1.sinks.k2.hdfs.rollSize = 10000000
> >>>> a1.sinks.k2.hdfs.rollCount = 0
> >>>>
> >>>> a1.sinks.k2.hdfs.filePrefix = app
> >>>> a1.sinks.k2.hdfs.fileType = DataStream
> >>>
> >>> It seems that events are collected correctly.
> >>>
> >>> But one problem bothers me: the used space of the file channel
> >>> directory (~/.flume) keeps increasing, even when there are no new
> >>> events.
> >>>
> >>> Is my configuration wrong, or is there another problem?
> >>>
> >>> Thanks.
> >>>
> >>> Best regards,
> >>>
> >>> Zhiwen Sun
> >>
> >> --
> >> Alexander Alten-Lorenz
> >> http://mapredit.blogspot.com
> >> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
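P.S. On Matt's suggestion to check the counters: instead of raw JMX, Flume can expose the same metrics as JSON over HTTP. A sketch, assuming the agent is started from a config file named flume.conf and an arbitrarily chosen monitoring port:

```
# Start the agent with the built-in HTTP/JSON metrics reporter.
flume-ng agent -n a1 -c conf -f flume.conf \
  -Dflume.monitoring.type=http -Dflume.monitoring.port=34545

# In another terminal: under CHANNEL.c2, if EventPutSuccessCount stops
# growing while ChannelSize stays at 0, no new events are reaching the
# channel even though the data directory may still grow for a while.
curl -s http://127.0.0.1:34545/metrics
```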
