On Thu, Aug 1, 2013 at 1:29 PM, Anat Rozenzon <[email protected]> wrote:

> Hi,
>
> I'm having the same problem with HDFS sink.
>
> A 'poison' message which doesn't have timestamp header in it as the sink
> expects.
> This causes a NPE which ends in returning the message to the channel,
> over and over again.
>
> Is my only option to re-write the HDFS sink?
> Isn't there any way to intercept in the sink work?
>

You can write a custom interceptor and remove/modify the poison message.

Interceptors are called before a message makes its way into the channel.

http://flume.apache.org/FlumeUserGuide.html#flume-interceptors

I wrote a blog about it a while back
http://www.ashishpaliwal.com/blog/2013/06/flume-cookbook-implementing-custom-interceptors/

>
> Thanks
> Anat
>
>
> On Fri, Jul 26, 2013 at 3:35 AM, Arvind Prabhakar <[email protected]> wrote:
>
>> Sounds like a bug in ElasticSearch sink to me. Do you mind filing a Jira
>> to track this? Sample data to cause this would be even better.
>>
>> Regards,
>> Arvind Prabhakar
>>
>>
>> On Thu, Jul 25, 2013 at 9:50 AM, Jeremy Karlson
>> <[email protected]> wrote:
>>
>>> This was using the provided ElasticSearch sink. The logs were not
>>> helpful. I ran it through with the debugger and found the source of the
>>> problem.
>>>
>>> ContentBuilderUtil uses a very "aggressive" method to determine if the
>>> content is JSON; if it contains a "{" anywhere in it, it's considered JSON.
>>> My body contained that but wasn't JSON, causing the JSON parser to throw a
>>> CharConversionException from addComplexField(...) (but not the expected
>>> JSONException). We've changed addComplexField(...) to catch different
>>> types of exceptions and fall back to treating it as a simple field. We'll
>>> probably submit a patch for this soon.
>>>
>>> I'm reasonably happy with this, but I still think that in the bigger
>>> picture there should be some sort of mechanism to automatically detect and
>>> toss / skip / flag problematic events without them plugging up the flow.
>>>
>>> -- Jeremy
>>>
>>>
>>> On Wed, Jul 24, 2013 at 7:51 PM, Arvind Prabhakar <[email protected]> wrote:
>>>
>>>> Jeremy, would it be possible for you to show us logs for the part where
>>>> the sink fails to remove an event from the channel? I am assuming this is a
>>>> standard sink that Flume provides and not a custom one.
>>>>
>>>> The reason I ask is because sinks do not introspect the event, and
>>>> hence there is no reason why it will fail during the event's removal. It is
>>>> more likely that there is a problem within the channel in that it cannot
>>>> dereference the event correctly. Looking at the logs will help us identify
>>>> the root cause for what you are experiencing.
>>>>
>>>> Regards,
>>>> Arvind Prabhakar
>>>>
>>>>
>>>> On Wed, Jul 24, 2013 at 3:56 PM, Jeremy Karlson <
>>>> [email protected]> wrote:
>>>>
>>>>> Both reasonable suggestions. What would a custom sink look like in
>>>>> this case, and how would I only eliminate the problem events since I don't
>>>>> know what they are until they are attempted by the "real" sink?
>>>>>
>>>>> My philosophical concern (in general) is that we're taking the
>>>>> approach of exhaustively finding and eliminating possible failure cases.
>>>>> It's not possible to eliminate every single failure case, so shouldn't
>>>>> there be a method of last resort to eliminate problem events from the
>>>>> channel?
>>>>>
>>>>> -- Jeremy
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 24, 2013 at 3:45 PM, Hari Shreedharan <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Or you could write a custom sink that removes this event (more work
>>>>>> of course)
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Hari
>>>>>>
>>>>>> On Wednesday, July 24, 2013 at 3:36 PM, Roshan Naik wrote:
>>>>>>
>>>>>> if you have a way to identify such events.. you may be able to use
>>>>>> the Regex interceptor to toss them out before they get into the channel.
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 24, 2013 at 2:52 PM, Jeremy Karlson <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>> Hi everyone. My Flume adventures continue.
>>>>>>
>>>>>> I'm in a situation now where I have a channel that's filling because
>>>>>> a stubborn message is stuck. The sink won't accept it (for whatever
>>>>>> reason; I can go into detail but that's not my point here). This just
>>>>>> blocks up the channel entirely, because it goes back into the channel
>>>>>> when the sink refuses. Obviously, this isn't ideal.
>>>>>>
>>>>>> I'm wondering what mechanisms, if any, Flume has to deal with these
>>>>>> situations. Things that come to mind might be:
>>>>>>
>>>>>> 1. Ditch the event after n attempts.
>>>>>> 2. After n attempts, send the event to a "problem area" (maybe a
>>>>>> different source / sink / channel?) that someone can look at later.
>>>>>> 3. Some sort of mechanism that allows operators to manually kill
>>>>>> these messages.
>>>>>>
>>>>>> I'm open to suggestions on alternatives as well.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> -- Jeremy
>>>>>>
>>>>>
>>>>
>>>
>>
>

--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
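
For reference, the interceptor suggestion in the reply above could look roughly like the sketch below. It is only an illustration of the filtering logic, not the code from the blog post. To keep it compilable without Flume on the classpath, SimpleEvent is a hypothetical stand-in for org.apache.flume.Event, and PoisonFilterSketch is a made-up class name. A real interceptor would implement org.apache.flume.interceptor.Interceptor, where returning null from intercept(Event) drops the event. The "timestamp" header key is the one the HDFS sink looks up when the path uses time escapes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch of the filtering logic a custom Flume interceptor could apply.
 * Nothing here depends on Flume; SimpleEvent is a hypothetical stand-in
 * for org.apache.flume.Event.
 */
public class PoisonFilterSketch {

    /** Hypothetical stand-in for a Flume event: headers plus a body. */
    public static final class SimpleEvent {
        final Map<String, String> headers;
        final byte[] body;

        public SimpleEvent(Map<String, String> headers, byte[] body) {
            this.headers = headers;
            this.body = body;
        }
    }

    /** Header key the HDFS sink reads for time-based path escapes. */
    static final String TIMESTAMP_HEADER = "timestamp";

    /** Returns the event unchanged, or null to signal that it should be dropped. */
    public static SimpleEvent intercept(SimpleEvent event) {
        String ts = event.headers.get(TIMESTAMP_HEADER);
        if (ts == null) {
            return null; // missing header: this is the 'poison' case from the thread
        }
        try {
            Long.parseLong(ts); // header must be a parseable epoch-millis value
        } catch (NumberFormatException e) {
            return null; // unparseable timestamp would also break the sink
        }
        return event;
    }

    /** Batch variant: keeps only the events that pass the single-event check. */
    public static List<SimpleEvent> intercept(List<SimpleEvent> events) {
        List<SimpleEvent> kept = new ArrayList<>();
        for (SimpleEvent e : events) {
            SimpleEvent out = intercept(e);
            if (out != null) {
                kept.add(out);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> goodHeaders = new HashMap<>();
        goodHeaders.put(TIMESTAMP_HEADER, "1375388940000");

        List<SimpleEvent> batch = new ArrayList<>();
        batch.add(new SimpleEvent(goodHeaders, "ok".getBytes()));
        batch.add(new SimpleEvent(new HashMap<>(), "poison".getBytes()));

        List<SimpleEvent> kept = intercept(batch);
        System.out.println(kept.size() + " of " + batch.size() + " events kept");
    }
}
```

Dropping is only one choice. Since the advice was to "remove/modify" the message, the same hook could instead add a timestamp header to events that lack one, which is what Flume's built-in timestamp interceptor does, so that no data is lost.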
