Sounds like a bug in the ElasticSearch sink to me. Do you mind filing a Jira to track this? Sample data that reproduces the problem would be even better.

Regards,
Arvind Prabhakar

On Thu, Jul 25, 2013 at 9:50 AM, Jeremy Karlson <[email protected]> wrote:

> This was using the provided ElasticSearch sink. The logs were not
> helpful. I ran it through with the debugger and found the source of the
> problem.
>
> ContentBuilderUtil uses a very "aggressive" method to determine if the
> content is JSON; if it contains a "{" anywhere in it, it's considered JSON.
> My body contained that but wasn't JSON, causing the JSON parser to throw a
> CharConversionException from addComplexField(...) (but not the expected
> JSONException). We've changed addComplexField(...) to catch different
> types of exceptions and fall back to treating it as a simple field. We'll
> probably submit a patch for this soon.
>
> I'm reasonably happy with this, but I still think that in the bigger
> picture there should be some sort of mechanism to automatically detect and
> toss / skip / flag problematic events without them plugging up the flow.
>
> -- Jeremy
>
>
> On Wed, Jul 24, 2013 at 7:51 PM, Arvind Prabhakar <[email protected]> wrote:
>
>> Jeremy, would it be possible for you to show us logs for the part where
>> the sink fails to remove an event from the channel? I am assuming this is a
>> standard sink that Flume provides and not a custom one.
>>
>> The reason I ask is because sinks do not introspect the event, and hence
>> there is no reason why it will fail during the event's removal. It is more
>> likely that there is a problem within the channel in that it cannot
>> dereference the event correctly. Looking at the logs will help us identify
>> the root cause for what you are experiencing.
>>
>> Regards,
>> Arvind Prabhakar
>>
>>
>> On Wed, Jul 24, 2013 at 3:56 PM, Jeremy Karlson <[email protected]> wrote:
>>
>>> Both reasonable suggestions. What would a custom sink look like in this
>>> case, and how would I only eliminate the problem events since I don't know
>>> what they are until they are attempted by the "real" sink?
>>>
>>> My philosophical concern (in general) is that we're taking the approach
>>> of exhaustively finding and eliminating possible failure cases. It's not
>>> possible to eliminate every single failure case, so shouldn't there be a
>>> method of last resort to eliminate problem events from the channel?
>>>
>>> -- Jeremy
>>>
>>>
>>> On Wed, Jul 24, 2013 at 3:45 PM, Hari Shreedharan <[email protected]> wrote:
>>>
>>>> Or you could write a custom sink that removes this event (more work of
>>>> course)
>>>>
>>>>
>>>> Thanks,
>>>> Hari
>>>>
>>>> On Wednesday, July 24, 2013 at 3:36 PM, Roshan Naik wrote:
>>>>
>>>> if you have a way to identify such events.. you may be able to use the
>>>> Regex interceptor to toss them out before they get into the channel.
>>>>
>>>>
>>>> On Wed, Jul 24, 2013 at 2:52 PM, Jeremy Karlson <[email protected]> wrote:
>>>>
>>>> Hi everyone. My Flume adventures continue.
>>>>
>>>> I'm in a situation now where I have a channel that's filling because a
>>>> stubborn message is stuck. The sink won't accept it (for whatever reason;
>>>> I can go into detail but that's not my point here). This just blocks up
>>>> the channel entirely, because it goes back into the channel when the sink
>>>> refuses. Obviously, this isn't ideal.
>>>>
>>>> I'm wondering what mechanisms, if any, Flume has to deal with these
>>>> situations. Things that come to mind might be:
>>>>
>>>> 1. Ditch the event after n attempts.
>>>> 2. After n attempts, send the event to a "problem area" (maybe a
>>>> different source / sink / channel?) that someone can look at later.
>>>> 3. Some sort of mechanism that allows operators to manually kill these
>>>> messages.
>>>>
>>>> I'm open to suggestions on alternatives as well.
>>>>
>>>> Thanks.
>>>>
>>>> -- Jeremy
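
For illustration, the detection problem Jeremy describes, and the kind of fallback he outlines, can be sketched in a few lines. This is Python rather than the sink's Java, and it is not the Flume code: looks_like_json, add_field, and the dict standing in for the document being built are hypothetical names. It only shows the idea of treating the "{" check as a hint, attempting a real parse, and falling back to a plain text field when the parse or the character decoding fails.

```python
import json


def looks_like_json(body: bytes) -> bool:
    """The "aggressive" heuristic from the thread: any "{" counts as JSON."""
    return b"{" in body


def add_field(doc: dict, name: str, body: bytes) -> None:
    """Store body as a parsed (complex) field if it really is JSON, else as text."""
    if looks_like_json(body):
        try:
            doc[name] = json.loads(body.decode("utf-8"))
            return
        except ValueError:
            # json.JSONDecodeError and UnicodeDecodeError are both ValueError
            # subclasses, so malformed JSON and bad bytes end up here alike.
            pass
    # Fallback: keep the event, store the body as a simple string field.
    doc[name] = body.decode("utf-8", errors="replace")


doc = {}
add_field(doc, "ok", b'{"user": "jeremy"}')   # parsed into a dict
add_field(doc, "log", b"error at {line 12")   # contains "{" but is kept as text
```

The point of catching the broad base class is the same as in the thread: the parser may fail with something other than the one exception type the caller expected, and the event should still be indexed as plain text instead of being rejected.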

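
Options 1 and 2 from the original message (give up after n attempts, then park the event somewhere for later inspection) could look roughly like the sketch below. It is not a Flume API: drain, deliver, and dead_letters are hypothetical names, a deque stands in for the channel, and the retry happens immediately with no backoff.

```python
from collections import deque
from typing import Callable, Deque, List


def drain(channel: Deque[bytes], deliver: Callable[[bytes], None],
          max_attempts: int = 3) -> List[bytes]:
    """Deliver events in order; park an event after max_attempts failures."""
    dead_letters: List[bytes] = []
    while channel:
        event = channel.popleft()
        for attempt in range(1, max_attempts + 1):
            try:
                deliver(event)
                break  # delivered, move on to the next event
            except Exception:
                # Assumption: any exception means the sink refused the event.
                if attempt == max_attempts:
                    dead_letters.append(event)  # the "problem area"
    return dead_letters


delivered = []


def picky_sink(event: bytes) -> None:
    """Stand-in sink that always refuses one particular event."""
    if event == b"bad":
        raise ValueError("sink refused event")
    delivered.append(event)


parked = drain(deque([b"a", b"bad", b"b"]), picky_sink, max_attempts=2)
```

A real implementation would also have to tell a poison event apart from a sink that is simply down, since in the second case every event would be parked after n attempts.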