Haha, no. I have just been too swamped to start a meaningful discussion. If you do have time, you can do that too - you don’t need to wait for me to start it.
Thanks,
Hari

On Thursday, December 12, 2013 at 9:40 AM, Jeremy Karlson wrote:

> Did I fall asleep at the wheel for a bit and miss the discussion on
> contributed sources / sinks?
>
> -- Jeremy
>
> On Thu, Nov 28, 2013 at 11:04 AM, Hari Shreedharan <[email protected]> wrote:
>
> > I think we could add this to flume as a contrib module (rather than in core
> > flume itself). At this time, there is no contrib module yet, but I will
> > start a discussion on this early next week on the dev list and let's take
> > it from there.
> >
> > Hari
> >
> > On Thursday, November 28, 2013, Jeremy Karlson wrote:
> >
> > > I suppose that really depends on the usage scenario. There are a hundred
> > > things that may affect the ability of the Flume chain to keep up with
> > > incoming data, only one of which is the sink being a JDBC connection. I
> > > think for cases like mine where the data is structured and of a reasonable
> > > volume, a JDBC connection makes sense.
> > >
> > > I guess what I'm saying is that if someone uses it without thinking or
> > > testing what they're doing with it... That's not a problem with JDBC, the
> > > sink, or Flume. It's a problem with the operator. :-P
> > >
> > > -- Jeremy
> > >
> > > On Thu, Nov 28, 2013 at 8:33 AM, Steve Morin <[email protected]> wrote:
> > >
> > > > Think the biggest problem is not that people wouldn't want to use it but
> > > > that data wouldn't be written fast enough to DB's to clear channels in many
> > > > moderate volumes.
> > > >
> > > > I'll follow the ticket thanks
> > > >
> > > > On Thu, Nov 28, 2013 at 8:17 AM, Jeremy Karlson <[email protected]> wrote:
> > > >
> > > > > Hi Steve,
> > > > >
> > > > > I’ve submitted the sink for review here:
> > > > >
> > > > > http://issues.apache.org/jira/browse/FLUME-2256
> > > > >
> > > > > If it’s something that interests you, I encourage you to apply the patch
> > > > > and let me know if it meets your needs or if you find problems.
> > > > >
> > > > > So far, no movement on it… But it’s only been a couple of days. If
> > > > > Flume doesn’t want it (for whatever reason) I’ll just take off all of the
> > > > > Apache headers and put it up on GitHub with a similar license. It’ll get
> > > > > open sourced one way or another, but I think folding it into Flume makes
> > > > > the most sense.
> > > > >
> > > > > -- Jeremy
> > > > >
> > > > > On Nov 28, 2013, at 7:39, Steve Morin <[email protected]> wrote:
> > > > >
> > > > > Jeremy,
> > > > > I am interested in a JDBC Flume sink. Are you open sourcing it?
> > > > > -Steve
> > > > >
> > > > > On Tue, Nov 26, 2013 at 8:52 PM, Jeremy Karlson <[email protected]> wrote:
> > > > >
> > > > > > Is there any interest in a generic JDBC sink?
> > > > > >
> > > > > > Over the last few days I decided to try and write one. I have something that
> > > > > > requires more testing, but seems to be working.
> > > > > >
> > > > > > Since the config file is how you’d interact with it, here’s a working
> > > > > > example from my source tree:
> > > > > >
> > > > > > a.sinks.k.type=jdbc
> > > > > > a.sinks.k.channel=c
> > > > > > a.sinks.k.driver=com.mysql.jdbc.Driver
> > > > > > a.sinks.k.url=jdbc:mysql://localhost:8889/flume
> > > > > > a.sinks.k.user=username
> > > > > > a.sinks.k.password=password
> > > > > > a.sinks.k.batchSize=100
> > > > > > a.sinks.k.sql=insert into twitter (body, timestamp) values (${body:string}, ${header.timestamp:long})
> > > > > >
> > > > > > The interesting part is the SQL statement. You can put anything you
> > > > > > want in there - it will get converted to a prepared statement on execution.
> > > > > > The Ant-ish tokens get parsed and replaced with parameters at startup.
> > > > > >
> > > > > > The tokens are three part. For example, in:
> > > > > >
> > > > > > ${body:string(UTF-8)}
> > > > > >
> > > > > > The first is a place in the event to get the value from (“body”,
> > > > > > “header.foo”, or “custom”). The second part (“string”) is a type
> > > > > > identifier that converts into an appropriate JDBC parameter. The third
> > > > > > part (“UTF-8”) is a configuration string for that type, if needed.
> > > > > > As for types, so far I’ve defined:
> > > > > >
> > > > > > body: string (with optional charset encoding), bytearray
> > > > > > header: string, long, int, float, double, date (with mandatory date
> > > > > > format and optional timezone)
> > > > > >
> > > > > > Additionally, if none of those make you happy you can define your own
> > > > > > parameter converters:
> > > > > >
> > > > > > ${custom:com.company.foo.MyConverter(optionaltextconfig)}
> > > > > >
> > > > > > I know there is still improvement to be made, but I’d like to get some
> > > > > > feedback, bug fixes, and maybe get it included before I do a bunch of
> > > > > > useless work. If there is interest, how would you like it for review or
> > > > > > inclusion?
> > > > > >
> > > > > > -- Jeremy
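
For readers who want to see how the three-part tokens in Jeremy's original message could map onto a prepared statement, here is a minimal sketch in Python. It is not the code from the FLUME-2256 patch. The token grammar is inferred only from the examples quoted above, and the names `TOKEN`, `Param`, `parse_sql`, and `bind` are hypothetical. It handles only a subset of the listed types (string, long, int, float, double), and the UTF-8 default charset is an assumption.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Hypothetical grammar inferred from the thread's examples:
#   ${source:type}   or   ${source:type(config)}
TOKEN = re.compile(r"\$\{([^:}]+):([^(}]+)(?:\(([^)]*)\))?\}")


class Param(NamedTuple):
    source: str            # "body", "header.<name>", or "custom"
    type_name: str         # e.g. "string", "long", or a converter class name
    config: Optional[str]  # e.g. "UTF-8"; None when no parentheses are given


def parse_sql(template: str) -> Tuple[str, List[Param]]:
    """Replace each token with '?' and collect the parameters in order."""
    params: List[Param] = []

    def replace(match):
        source, type_name, config = match.groups()
        params.append(Param(source.strip(), type_name.strip(), config))
        return "?"

    return TOKEN.sub(replace, template), params


def bind(params: List[Param], body: bytes, headers: dict) -> list:
    """Turn one event into positional values, in token order."""
    values = []
    for p in params:
        if p.source == "body":
            raw = body
        elif p.source.startswith("header."):
            raw = headers[p.source[len("header."):]]
        else:
            raise ValueError("unsupported source in this sketch: " + p.source)

        if p.type_name == "string":
            # Assumption: fall back to UTF-8 when no charset is configured.
            if isinstance(raw, bytes):
                values.append(raw.decode(p.config or "UTF-8"))
            else:
                values.append(str(raw))
        elif p.type_name in ("long", "int"):
            values.append(int(raw))
        elif p.type_name in ("float", "double"):
            values.append(float(raw))
        else:
            raise ValueError("unsupported type in this sketch: " + p.type_name)
    return values


if __name__ == "__main__":
    sql, params = parse_sql(
        "insert into twitter (body, timestamp) values "
        "(${body:string}, ${header.timestamp:long})"
    )
    print(sql)
    print(bind(params, b"hello", {"timestamp": "1385510000"}))
```

Parsing once and binding per event mirrors the description in the thread, where tokens are replaced with parameters at startup and the statement is executed as a prepared statement afterwards.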
