Hi,

I suspect tail -F and nc are filling up the directory. What's inside such a 
file that grows without an event?

My assumption:
nc opens one stream and delivers all incoming events over it. Flume 
doesn't know that no events are coming in, since the stream never closes. I 
wonder if you could use syslog(-ng) for the event delivery instead?
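If you want to try that, a minimal sketch of what a syslog-based source could look like (replacing the netcat source; the source name, channel name, and port below are assumptions for illustration, not taken from your config):

```properties
# Hypothetical replacement for the netcat source: Flume's syslogtcp source
# frames each syslog message as one event, so delivery does not depend on
# the client holding a single long-lived stream open.
a1.sources = r1
a1.channels = c2

a1.sources.r1.type = syslogtcp
a1.sources.r1.host = 0.0.0.0
a1.sources.r1.port = 5140
a1.sources.r1.channels = c2
```

On the sending side you would then point syslog-ng (or a similar syslog forwarder) at that port, instead of piping tail through nc.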

Cheers,
 Alex



On Mar 20, 2013, at 2:30 AM, Zhiwen Sun <[email protected]> wrote:

> Thanks all for your reply.
> 
> @Kenison 
> I stopped my tail -F | nc program and there is no new event file in HDFS, so I 
> think no events are arriving. To make sure, I will test again with JMX 
> enabled.
> 
> @Alex
> 
> The latest log follows; I can't see any exception or warning.
> 
> 13/03/19 15:28:16 INFO hdfs.BucketWriter: Renaming 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901.tmp to 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490901
> 13/03/19 15:28:16 INFO hdfs.BucketWriter: Creating 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp
> 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Start checkpoint for 
> /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 
> 3
> 13/03/19 15:28:17 INFO file.EventQueueBackingStoreFile: Updating checkpoint 
> metadata: logWriteOrderID: 1363659953997, queueSize: 0, queueHead: 362981
> 13/03/19 15:28:17 INFO file.LogFileV3: Updating log-7.meta currentPosition = 
> 216278208, logWriteOrderID = 1363659953997
> 13/03/19 15:28:17 INFO file.Log: Updated checkpoint for file: 
> /home/zhiwensun/.flume/file-channel/data/log-7 position: 216278208 
> logWriteOrderID: 1363659953997
> 13/03/19 15:28:26 INFO hdfs.BucketWriter: Renaming 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902.tmp to 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490902
> 13/03/19 15:28:27 INFO hdfs.BucketWriter: Creating 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp
> 13/03/19 15:28:37 INFO hdfs.BucketWriter: Renaming 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903.tmp to 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490903
> 13/03/19 15:28:37 INFO hdfs.BucketWriter: Creating 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp
> 
> 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Start checkpoint for 
> /home/zhiwensun/.flume/file-channel/checkpoint/checkpoint, elements to sync = 
> 2
> 13/03/19 15:28:47 INFO file.EventQueueBackingStoreFile: Updating checkpoint 
> metadata: logWriteOrderID: 1363659954200, queueSize: 0, queueHead: 362981
> 13/03/19 15:28:47 INFO file.LogFileV3: Updating log-7.meta currentPosition = 
> 216288815, logWriteOrderID = 1363659954200
> 13/03/19 15:28:47 INFO file.Log: Updated checkpoint for file: 
> /home/zhiwensun/.flume/file-channel/data/log-7 position: 216288815 
> logWriteOrderID: 1363659954200
> 13/03/19 15:28:48 INFO hdfs.BucketWriter: Renaming 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904.tmp to 
> hdfs://127.0.0.1:9000/flume/events/2013-03-19/app.1363660490904
> 
> @Hari
> Hmm, 12 hours have passed and the size of the file channel directory has not decreased.
> 
> Files in file channel directory:
> 
> -rw-r--r-- 1 zhiwensun zhiwensun    0 2013-03-19 09:15 in_use.lock
> -rw-r--r-- 1 zhiwensun zhiwensun 1.0M 2013-03-19 10:11 log-6
> -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 10:12 log-6.meta
> -rw-r--r-- 1 zhiwensun zhiwensun 207M 2013-03-19 15:28 log-7
> -rw-r--r-- 1 zhiwensun zhiwensun   29 2013-03-19 15:28 log-7.meta
> 
> 
> 
> 
> 
> Zhiwen Sun 
> 
> 
> 
> On Wed, Mar 20, 2013 at 2:32 AM, Hari Shreedharan <[email protected]> 
> wrote:
> It is possible for the directory size to increase even if no writes are going 
> into the channel. If the channel size is non-zero and the sink is still 
> writing events to HDFS, the takes get written to disk as well (so we know 
> which events in the files were removed when the channel/agent restarts). 
> Eventually the channel will clean up the files whose events have all been 
> taken (though it will keep at least 2 files per data directory, just to be safe).
> 
> -- 
> Hari Shreedharan
> 
> On Tuesday, March 19, 2013 at 10:32 AM, Alexander Alten-Lorenz wrote:
> 
>> Hey,
>> 
>> what does debug say? Can you gather the logs and attach them?
>> 
>> - Alex
>> 
>> On Mar 19, 2013, at 5:27 PM, "Kenison, Matt" <[email protected]> wrote:
>> 
>>> Check the JMX counter first, to make sure you really are not sending new 
>>> events. If not, is it your checkpoint directory or data directory that is 
>>> increasing in size?
>>> 
>>> 
>>> From: Zhiwen Sun <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Tue, 19 Mar 2013 01:19:19 -0700
>>> To: "[email protected]" <[email protected]>
>>> Subject: Why does the used space of the file channel buffer directory increase?
>>> 
>>> hi all:
>>> 
>>> I am testing flume-ng on my local machine. The data flow is:
>>> 
>>> tail -F file | nc 127.0.0.1 44444 -> flume agent -> hdfs
>>> 
>>> My configuration file is here :
>>> 
>>>> a1.sources = r1
>>>> a1.channels = c2
>>>> 
>>>> a1.sources.r1.type = netcat
>>>> a1.sources.r1.bind = 192.168.201.197
>>>> a1.sources.r1.port = 44444
>>>> a1.sources.r1.max-line-length = 1000000
>>>> 
>>>> a1.sinks.k1.type = logger
>>>> 
>>>> a1.channels.c1.type = memory
>>>> a1.channels.c1.capacity = 10000
>>>> a1.channels.c1.transactionCapacity = 10000
>>>> 
>>>> a1.channels.c2.type = file
>>>> a1.sources.r1.channels = c2
>>>> 
>>>> a1.sources.r1.interceptors = i1
>>>> a1.sources.r1.interceptors.i1.type = timestamp
>>>> 
>>>> a1.sinks = k2
>>>> a1.sinks.k2.type = hdfs
>>>> a1.sinks.k2.channel = c2
>>>> a1.sinks.k2.hdfs.path = hdfs://127.0.0.1:9000/flume/events/%Y-%m-%d
>>>> a1.sinks.k2.hdfs.writeFormat = Text
>>>> a1.sinks.k2.hdfs.rollInterval = 10
>>>> a1.sinks.k2.hdfs.rollSize = 10000000
>>>> a1.sinks.k2.hdfs.rollCount = 0
>>>> 
>>>> a1.sinks.k2.hdfs.filePrefix = app
>>>> a1.sinks.k2.hdfs.fileType = DataStream
>>> 
>>> 
>>> 
>>> It seems that events were collected correctly.
>>> 
>>> But there is a problem bothering me: the used space of the file channel 
>>> (~/.flume) keeps increasing, even when there are no new events.
>>> 
>>> Is my configuration wrong, or is there some other problem?
>>> 
>>> thanks.
>>> 
>>> 
>>> Best regards.
>>> 
>>> Zhiwen Sun
>> 
>> --
>> Alexander Alten-Lorenz
>> http://mapredit.blogspot.com
>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
> 
> 

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF
