Thanks for your reply. I just want to confirm whether the space used by the file channel has an upper limit.
Zhiwen Sun

On Wed, Mar 20, 2013 at 4:06 PM, Hari Shreedharan <[email protected]> wrote:

> If you reduce the capacity, the channel will be able to buffer fewer
> events. If you want to reduce the space used when there are only a few
> events remaining, set the config param "maxFileSize" to something lower
> (this is in bytes). I don't advise setting this lower than a few hundred
> megabytes (in fact, the default value works pretty well - do you really
> need to save 3 GB of space?), else you will end up with a huge number of
> small files if there are many events waiting to be taken from the channel.
>
>
> Hari
>
>
> On Wed, Mar 20, 2013 at 12:50 AM, Zhiwen Sun <[email protected]> wrote:
>
>> Hi Hari:
>>
>> Does that mean I can reduce the capacity of the file channel to cut down
>> the maximum disk space used by the file channel?
>>
>>
>> Zhiwen Sun
>>
>>
>> On Wed, Mar 20, 2013 at 3:23 PM, Hari Shreedharan
>> <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Like I mentioned earlier, we will always keep 2 data files in each data
>>> directory (the ".meta" files are metadata associated with the actual
>>> data). Once log-8 is created (when log-7 gets rotated after hitting its
>>> maximum size) and all of the events in log-6 are taken, log-6 will be
>>> deleted, but you will still see log-7 and log-8. So what you are seeing
>>> is not unexpected.
>>>
>>>
>>> Hari
>>>
>>> --
>>> Hari Shreedharan
>>>
>>> On Tuesday, March 19, 2013 at 6:30 PM, Zhiwen Sun wrote:
>>>
>>> Thanks all for your replies.
>>>
>>> @Kenison
>>> I stopped my tail -F | nc program and there is no new event file in
>>> HDFS, so I think no new events are arriving. To make sure, I will test
>>> again with JMX enabled.
>>>
>>> @Alex
>>>
>>> The latest log is below. I can't see any exception or warning.
>>>
>>> 13/03/19 15:28:16 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901
>>> 13/03/19 15:28:16 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp
>>> 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Start checkpoint for /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 3
>>> 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1363659953997, queueSize: 0, queueHead: 362981
>>> 13/03/19 15:28:17 INFO file.LogFileV3: Updating log-7.meta currentPosition = 216278208, logWriteOrderID = 1363659953997
>>> 13/03/19 15:28:17 INFO file.Log: Updated checkpoint for file: /home/zhiwensun/.flume/file-channel/data/log-7 position: 216278208 logWriteOrderID: 1363659953997
>>> 13/03/19 15:28:26 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902
>>> 13/03/19 15:28:27 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp
>>> 13/03/19 15:28:37 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903
>>> 13/03/19 15:28:37 INFO hdfs.BucketWriter: Creating hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp
>>>
>>> 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Start checkpoint for /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 2
>>> 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Updating checkpoint metadata: logWriteOrderID: 1363659954200, queueSize: 0, queueHead: 362981
>>> 13/03/19 15:28:47 INFO file.LogFileV3: Updating log-7.meta currentPosition = 216288815, logWriteOrderID = 1363659954200
>>> 13/03/19 15:28:47 INFO file.Log: Updated checkpoint for file: /home/zhiwensun/.flume/file-channel/data/log-7 position: 216288815 logWriteOrderID: 1363659954200
>>> 13/03/19 15:28:48 INFO hdfs.BucketWriter: Renaming hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp to hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904
>>>
>>>
>>> @Hari
>>> Hmm, 12 hours have passed and the size of the file channel directory
>>> has not decreased.
>>>
>>> Files in the file channel directory:
>>>
>>> -rw-r--r-- 1 zhiwensun zhiwensun    0 2013-03-19 09:15 in_use.lock
>>> -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 log-6
>>> -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 10:12 log-6.meta
>>> -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 log-7
>>> -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 15:28 log-7.meta
>>> -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 ./file-channel/data/log-7
>>> -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 10:12 ./file-channel/data/log-6.meta
>>> -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 15:28 ./file-channel/data/log-7.meta
>>> -rw-r--r-- 1 zhiwensun zhiwensun    0 2013-03-19 09:15 ./file-channel/data/in_use.lock
>>> -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 ./file-channel/data/log-6
>>>
>>>
>>> Zhiwen Sun
>>>
>>>
>>> On Wed, Mar 20, 2013 at 2:32 AM, Hari Shreedharan
>>> <[email protected]> wrote:
>>>
>>> It is possible for the directory size to increase even if no writes are
>>> going into the channel. If the channel size is non-zero and the sink is
>>> still writing events to HDFS, the takes get written to disk as well (so
>>> we know which events in the files were removed when the channel/agent
>>> restarts). Eventually the channel will clean up the files whose events
>>> have all been taken (though it will keep at least 2 files per data
>>> directory, just to be safe).
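
The cleanup behavior Hari describes can be checked from the shell. A minimal sketch, assuming the default channel location under ~/.flume as in the listing above:

```shell
# Inspect the file channel's data directory. Once cleanup has run,
# at most the two newest log-N data files should remain (plus their
# .meta metadata files and in_use.lock).
DATA_DIR="$HOME/.flume/file-channel/data"
ls -lh "$DATA_DIR"
du -sh "$DATA_DIR"
# Count the data files (log-N, excluding .meta) still on disk:
ls "$DATA_DIR" | grep -c '^log-[0-9]*$'
```

If the count stays above 2 for a long time, some older log-N file still holds events that have not been taken yet.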
>>>
>>> --
>>> Hari Shreedharan
>>>
>>> On Tuesday, March 19, 2013 at 10:32 AM, Alexander Alten-Lorenz wrote:
>>>
>>> Hey,
>>>
>>> What does debug say? Can you gather logs and attach them?
>>>
>>> - Alex
>>>
>>> On Mar 19, 2013, at 5:27 PM, "Kenison, Matt" <[email protected]> wrote:
>>>
>>> Check the JMX counters first, to make sure you really are not sending
>>> new events. If not, is it your checkpoint directory or data directory
>>> that is increasing in size?
>>>
>>>
>>> From: Zhiwen Sun <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Tue, 19 Mar 2013 01:19:19 -0700
>>> To: "[email protected]" <[email protected]>
>>> Subject: Why does used space of file channel buffer directory increase?
>>>
>>> Hi all:
>>>
>>> I am testing flume-ng on my local machine. The data flow is:
>>>
>>> tail -F file | nc 127.0.0.1 4444 > flume agent > hdfs
>>>
>>> My configuration file is here:
>>>
>>> a1.sources = r1
>>> a1.channels = c2
>>>
>>> a1.sources.r1.type = netcat
>>> a1.sources.r1.bind = 192.168.201.197
>>> a1.sources.r1.port = 44444
>>> a1.sources.r1.max-line-length = 1000000
>>>
>>> a1.sinks.k1.type = logger
>>>
>>> a1.channels.c1.type = memory
>>> a1.channels.c1.capacity = 10000
>>> a1.channels.c1.transactionCapacity = 10000
>>>
>>> a1.channels.c2.type = file
>>> a1.sources.r1.channels = c2
>>>
>>> a1.sources.r1.interceptors = i1
>>> a1.sources.r1.interceptors.i1.type = timestamp
>>>
>>> a1.sinks = k2
>>> a1.sinks.k2.type = hdfs
>>> a1.sinks.k2.channel = c2
>>> a1.sinks.k2.hdfs.path = hdfs://127.0.0.1:9000/flume/events/%Y-%m-%d
>>> a1.sinks.k2.hdfs.writeFormat = Text
>>> a1.sinks.k2.hdfs.rollInterval = 10
>>> a1.sinks.k2.hdfs.rollSize = 10000000
>>> a1.sinks.k2.hdfs.rollCount = 0
>>>
>>> a1.sinks.k2.hdfs.filePrefix = app
>>> a1.sinks.k2.hdfs.fileType = DataStream
>>>
>>>
>>> It seems that events were collected correctly.
>>>
>>> But there is a problem bothering me: the used space of the file channel
>>> directory (~/.flume) keeps increasing, even when there are no new
>>> events.
>>>
>>> Is my configuration wrong, or is there some other problem?
>>>
>>> Thanks.
>>>
>>>
>>> Best regards,
>>>
>>> Zhiwen Sun
>>>
>>>
>>> --
>>> Alexander Alten-Lorenz
>>> http://mapredit.blogspot.com
>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
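
To apply Hari's "maxFileSize" suggestion from earlier in the thread to this agent, the file channel section of the config would look something like the sketch below. The 500 MB figure is illustrative only; the thread itself only advises staying above a few hundred megabytes, and the other values are assumptions:

```
# File channel c2 from the config above, with an explicit data-file cap.
a1.channels.c2.type = file
# Maximum number of events the channel will buffer (illustrative value).
a1.channels.c2.capacity = 1000000
# Maximum size of each data file, in bytes (~500 MB here).
# Hari advises against setting this below a few hundred megabytes,
# or the channel may create a large number of small files.
a1.channels.c2.maxFileSize = 524288000
```

Note that even with a lower maxFileSize, the channel still keeps at least two data files per data directory, so disk usage will not drop below roughly twice this value.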
