Flume has a tool that will allow you to run all events in the file channel 
through a piece of custom code you’d supply:

bin/flume-ng tool FCINTEGRITYTOOL

Executing this command as-is lists the arguments you’d need to supply.
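For illustration, here is a rough sketch of what a full invocation might look like. The `--conf`, `-l` (file channel data directories) and `-e` (custom validator class) options are written from memory of the Flume user guide, so treat them as assumptions and check them against the usage output of the tool itself. The paths and the class name `org.apache.flume.MyEventValidator` are placeholders, not real locations or a class shipped with Flume.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder to adapt.
FLUME_HOME="."                                  # assumed Flume install directory
CONF_DIR="./conf"                               # assumed Flume conf directory
DATA_DIRS="./datadir"                           # point this at a saved COPY of the data dirs
VALIDATOR="org.apache.flume.MyEventValidator"   # hypothetical custom class you supply

# Assemble the command (none of the placeholder values contain spaces).
CMD="$FLUME_HOME/bin/flume-ng tool --conf $CONF_DIR FCINTEGRITYTOOL -l $DATA_DIRS -e $VALIDATOR"
echo "$CMD"

# Run it only if flume-ng is actually present; otherwise just show the command.
if [ -x "$FLUME_HOME/bin/flume-ng" ]; then
  $CMD
else
  echo "flume-ng not found under $FLUME_HOME; command printed only" >&2
fi
```

As far as I know the integrity tool can remove events it considers bad, so it is safer to run it against a copy of the channel directories rather than the live ones.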

Thanks,
Hari Shreedharan




> On Jun 8, 2015, at 7:30 PM, Robert B Hamilton <[email protected]> wrote:
> 
> Is there anything like a logdump tool for flume file channel?   
> Specifically I’m looking for some way to extract say the event data for 
> the last N puts.
> Alternatively can the logs be modified so that the last N (sink) commits will 
> be ignored on restart?
>  
> The scenario that I’m concerned about is this:  
>  
> 1. Server crashes, flume is restarted once the server is brought back.
> 2. End user sees something odd in his HiveQL and speculates that data 
>    was lost.
> 3. We peek into the WAL as they existed just before the restart (we 
>    saved off a copy) and either
>    a. Find an event corresponding to the missing data and use that to fix 
>       the data in the destination, or
>    b. Prove that the event corresponding to the missing data was not 
>       present at least as far back as the logs go
>  
> I’m just wondering if there is a tool which makes number 3 possible…
>  
> 
> 
