Flume removes data files once there are no more events left to be read from 
them. However, once 2 or more files have been created, at least 2 will always 
remain; that is just a safety net.
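
For reference, here is a minimal sketch of the channel configuration from the 
original message with maxFileSize set explicitly. The agent and channel names 
(a1, ch1) and the paths come from the quoted config. The maxFileSize value is 
only an illustration: I believe it matches the default listed in the Flume 
User Guide, so please check it against your version. Because the value is in 
bytes, 500000 would mean data files of about 0.5 MB each, and the 1.5G shown in 
the listing would be spread over roughly 3,000 files.

```properties
# File channel from the original message (names and paths as quoted there).
a1.channels.ch1.type = file
a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
a1.channels.ch1.dataDirs = /hadoop/user/flume/channels/data
a1.channels.ch1.capacity = 100000
a1.channels.ch1.transactionCapacity = 5000

# Maximum size of a single data (log-N) file, in BYTES.
# Illustrative value (assumed to be the documented default, just under 2 GB);
# a small value such as 500000 would produce a very large number of files.
a1.channels.ch1.maxFileSize = 2146435071
```

With a setting like this, Flume still keeps at least 2 data files per data 
directory once 2 have been created; older files beyond that are removed when 
they hold no unread events.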


Thanks,
Hari

On Mon, Nov 10, 2014 at 1:23 AM, Needham, Guy
<[email protected]> wrote:

> Is there a concept of a data file which is 'done'? Does Flume remove data 
> files it no longer needs, or will these build up?
> Regards,
> Guy Needham | Data Discovery
> Virgin Media | Enterprise Data, Design & Management
> Bartley Wood Business Park, Hook, Hampshire RG27 9UP
> D 01256 75 3362
> I welcome VSRE emails. Learn more at http://vsre.info/
> ________________________________
> From: Hari Shreedharan [mailto:[email protected]]
> Sent: 10 November 2014 09:15
> To: [email protected]
> Cc: [email protected]
> Subject: RE: File channels creating many large files
> That value is in bytes. At 500k, you will likely end up with too many files. 
> You should set it as high as you can.
> Thanks, Hari
> On Mon, Nov 10, 2014 at 1:05 AM, Needham, Guy 
> <[email protected]> wrote:
> Hari, Jeff,
> thanks for your replies. It's Flume 1.5.0, I'll use the maxFileSize parameter 
> to fix this. Is there any impact on channel optimisation from setting it to 
> say 500000?
> Regards,
> ________________________________
> From: Hari Shreedharan [mailto:[email protected]]
> Sent: 07 November 2014 17:59
> To: [email protected]
> Cc: [email protected]
> Subject: Re: File channels creating many large files
> Flume will leave at least 2 files per data directory. Once you have enough 
> events to cause 2 files to be created, there will be at least 2 per dir. You 
> can use maxFileSize parameter to control the size of these files.
> Thanks, Hari
> On Fri, Nov 7, 2014 at 10:25 AM, Jeff Lord 
> <[email protected]> wrote:
> Guy,
> What version of flume is this?
> -Jeff
> On Fri, Nov 7, 2014 at 1:19 AM, Needham, Guy 
> <[email protected]> wrote:
> Hi all,
> I have a configuration with a file channel configured such that:
> a1.channels.ch1.type = file
> a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
> a1.channels.ch1.dataDirs = /hadoop/user/flume/channels/data
> a1.channels.ch1.capacity = 100000
> a1.channels.ch1.transactionCapacity = 5000
> It's been running since October 28th with no issues, but when I looked today 
> in /hadoop/user/flume/channels/data I saw that the file channel was building 
> up large files which had been processed and not deleting them:
> [rdd@hadoop-kn-p2-m01 flume]$ ls -lh channels/data/
> total 1.6G
> -rw-r----- 1 rdd rdd 1.5G Oct 28 16:10 log-1
> -rw-r----- 1 rdd rdd   47 Oct 28 16:10 log-1.meta
> -rw-r----- 1 rdd rdd  72M Oct 31 16:28 log-2
> -rw-r----- 1 rdd rdd   47 Oct 31 16:29 log-2.meta
> It seems that for each day on which data landed (we're still in testing, so 
> data is not landing constantly) a data file was created but not deleted once 
> reading was completed.
> Is this expected behaviour? Is there a way to stop large files building up 
> and still use the file channel?
> Regards,
> --------------------------------------------------------------------
> Save Paper - Do you really need to print this e-mail?
> Visit www.virginmedia.com for more information, 
> and more fun.
> This email and any attachments are or may be confidential and legally 
> privileged
> and are sent solely for the attention of the addressee(s). If you have 
> received this
> email in error, please delete it from your system: its use, disclosure or 
> copying is
> unauthorised. Statements and opinions expressed in this email may not 
> represent
> those of Virgin Media. Any representations or commitments in this email are
> subject to contract.
> Registered office: Media House, Bartley Wood Business Park, Hook, Hampshire, 
> RG27 9UP
> Registered in England and Wales with number 2591237
> --------------------------------------------------------------------
