I suppose that really depends on the usage scenario. There are a hundred things that may affect the ability of the Flume chain to keep up with incoming data, only one of which is the sink being a JDBC connection. I think for cases like mine where the data is structured and of a reasonable volume, a JDBC connection makes sense.

I guess what I'm saying is that if someone uses it without thinking or testing what they're doing with it... That's not a problem with JDBC, the sink, or Flume. It's a problem with the operator. :-P

-- Jeremy


On Thu, Nov 28, 2013 at 8:33 AM, Steve Morin <[email protected]> wrote:

> Think the biggest problem is not that people wouldn't want to use it but
> that data wouldn't be written fast enough to DBs to clear channels at many
> moderate volumes.
>
> I'll follow the ticket, thanks
>
>
> On Thu, Nov 28, 2013 at 8:17 AM, Jeremy Karlson
> <[email protected]> wrote:
>
>> Hi Steve,
>>
>> I’ve submitted the sink for review here:
>>
>> http://issues.apache.org/jira/browse/FLUME-2256
>>
>> If it’s something that interests you, I encourage you to apply the patch
>> and let me know if it meets your needs or if you find problems.
>>
>> So far, no movement on it… But it’s only been a couple of days. If
>> Flume doesn’t want it (for whatever reason) I’ll just take off all of the
>> Apache headers and put it up on GitHub with a similar license. It’ll get
>> open sourced one way or another, but I think folding it into Flume makes
>> the most sense.
>>
>> -- Jeremy
>>
>>
>> On Nov 28, 2013, at 7:39, Steve Morin <[email protected]> wrote:
>>
>> Jeremy,
>> I am interested in a JDBC Flume sink. Are you open sourcing it?
>> -Steve
>>
>>
>> On Tue, Nov 26, 2013 at 8:52 PM, Jeremy Karlson
>> <[email protected]> wrote:
>>
>>> Is there any interest in a generic JDBC sink?
>>>
>>> Over the last few days I decided to try and write one. I have something
>>> that requires more testing, but seems to be working.
>>>
>>> Since the config file is how you’d interact with it, here’s a working
>>> example from my source tree:
>>>
>>> a.sinks.k.type=jdbc
>>> a.sinks.k.channel=c
>>> a.sinks.k.driver=com.mysql.jdbc.Driver
>>> a.sinks.k.url=jdbc:mysql://localhost:8889/flume
>>> a.sinks.k.user=username
>>> a.sinks.k.password=password
>>> a.sinks.k.batchSize=100
>>> a.sinks.k.sql=insert into twitter (body, timestamp) values (${body:string}, ${header.timestamp:long})
>>>
>>> The interesting part is the SQL statement. You can put anything you
>>> want in there - it will get converted to a prepared statement on execution.
>>> The Ant-ish tokens get parsed and replaced with parameters at startup.
>>>
>>> The tokens have three parts. For example, in:
>>>
>>> ${body:string(UTF-8)}
>>>
>>> The first part is a place in the event to get the value from (“body”,
>>> “header.foo”, or “custom”). The second part (“string”) is a type
>>> identifier that converts into an appropriate JDBC parameter. The third
>>> part (“UTF-8”) is a configuration string for that type, if needed. As for
>>> types, so far I’ve defined:
>>>
>>> body: string (with optional charset encoding), bytearray
>>> header: string, long, int, float, double, date (with mandatory date
>>> format and optional timezone)
>>>
>>> Additionally, if none of those make you happy you can define your own
>>> parameter converters:
>>>
>>> ${custom:com.company.foo.MyConverter(optionaltextconfig)}
>>>
>>> I know there is still improvement to be made, but I’d like to get some
>>> feedback, bug fixes, and maybe get it included before I do a bunch of
>>> useless work. If there is interest, how would you like it for review or
>>> inclusion?
>>>
>>> -- Jeremy
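
For readers following the thread, here is a rough sketch of how the three-part tokens described in the original message (`${source:type(config)}`) might be turned into prepared-statement placeholders. It is not taken from the FLUME-2256 patch: the class name `SqlTokenSketch`, the `Token` and `Parsed` holders, and the regular expression are all assumptions made up for illustration. It also assumes the optional config string contains no closing parenthesis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only; names and regex are NOT from the actual sink.
class SqlTokenSketch {
    // Matches ${source:type} or ${source:type(config)}.
    private static final Pattern TOKEN =
        Pattern.compile("\\$\\{([^:}]+):([^(}]+)(?:\\(([^)]*)\\))?\\}");

    // One parsed token: where the value comes from, its type, optional config.
    static final class Token {
        public final String source;  // e.g. "body", "header.timestamp", "custom"
        public final String type;    // e.g. "string", "long", or a converter class name
        public final String config;  // e.g. "UTF-8"; null when no (config) was given

        Token(String source, String type, String config) {
            this.source = source;
            this.type = type;
            this.config = config;
        }
    }

    // The rewritten SQL plus the tokens, in placeholder order.
    static final class Parsed {
        public final String sql;
        public final List<Token> tokens;

        Parsed(String sql, List<Token> tokens) {
            this.sql = sql;
            this.tokens = tokens;
        }
    }

    // Replace every token with a JDBC '?' placeholder and record it in order.
    static Parsed parse(String template) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer sql = new StringBuffer();
        List<Token> tokens = new ArrayList<>();
        while (m.find()) {
            tokens.add(new Token(m.group(1).trim(), m.group(2).trim(), m.group(3)));
            m.appendReplacement(sql, "?");
        }
        m.appendTail(sql);
        return new Parsed(sql.toString(), tokens);
    }

    public static void main(String[] args) {
        Parsed p = parse("insert into twitter (body, timestamp) values "
            + "(${body:string}, ${header.timestamp:long})");
        // Prints: insert into twitter (body, timestamp) values (?, ?)
        System.out.println(p.sql);
        for (Token t : p.tokens) {
            System.out.println(t.source + " -> " + t.type + " / " + t.config);
        }
    }
}
```

The resulting string could then be handed to `Connection.prepareStatement`, with each recorded token deciding which `PreparedStatement` setter to call for its position. How the real sink maps types to setters is not described in the thread, so that part is left out here.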
