Hi Edward,

Thank you for reaching out. Of the two features I requested for the ElasticSearch sink in Flume, I have implemented the smaller one (https://issues.apache.org/jira/browse/FLUME-2206), which lets users specify TTL values with a unit qualifier (day / hour / week etc.), as in the current version of ElasticSearch. I have posted the patch for review here (https://reviews.apache.org/r/14614/).
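
To illustrate the TTL format, here is a minimal sketch of how a qualified value could be converted to milliseconds. It is not the code in the FLUME-2206 patch: the class name TtlParser and the method toMillis are hypothetical, and I am assuming "m" means minutes (as in ElasticSearch time values) and that a bare integer keeps its current meaning of days.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Flume or the FLUME-2206 patch. */
final class TtlParser {
  // An integer followed by an optional unit: s, m, h, d or w.
  private static final Pattern TTL = Pattern.compile("(\\d+)([smhdw]?)");

  private TtlParser() {}

  /** Converts a TTL such as "3w", "12h" or "5" (days) to milliseconds. */
  static long toMillis(String ttl) {
    Matcher m = TTL.matcher(ttl.trim());
    if (!m.matches()) {
      throw new IllegalArgumentException("Invalid TTL: " + ttl);
    }
    long amount = Long.parseLong(m.group(1));
    switch (m.group(2)) {
      case "s": return TimeUnit.SECONDS.toMillis(amount);
      case "m": return TimeUnit.MINUTES.toMillis(amount); // assumption: minutes
      case "h": return TimeUnit.HOURS.toMillis(amount);
      case "w": return TimeUnit.DAYS.toMillis(amount * 7);
      default:  return TimeUnit.DAYS.toMillis(amount);    // "d" or no qualifier
    }
  }
}
```

With this sketch, TtlParser.toMillis("3w") and TtlParser.toMillis("21") give the same result, so existing integer-only configurations would keep working.
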
The second, bigger one would let users specify the ElasticSearch index name with or without a rolling specifier. Currently the user provides the ElasticSearch indexName, e.g. "flume", and ElasticSearchSink appends %daytimestamp, i.e. 2013-10-28, to create the index "flume-2013-10-28", and it keeps creating a new index every day. While this works great when the user wants to roll indices daily, it prevents the user from creating indices on a monthly basis (i.e. "flume-2013-10", "flume-2013-11", etc.), on a yearly basis, and so on. Essentially, I was looking for HDFS filePrefix-style ElasticSearch index naming.

I haven't started working on this patch yet. Please go ahead if you want to work on this feature request. I have already created a JIRA ticket (https://issues.apache.org/jira/browse/FLUME-2207) for it.

Best,
- Dib

On Sun, Oct 27, 2013 at 8:03 AM, Edward Sargisson <[email protected]> wrote:

> Hi Dib,
> I seem to spend the most time maintaining the Elasticsearch Sink and,
> sadly, am *way* behind on email.
>
> If you raise Jira issues for your proposed changes and set up the Review
> board for them then either I or a colleague should be able to take a look.
> Normally, once we're happy, a committer will commit them to the repository.
>
> I will note that I'm on parental leave until Dec 9 and won't have a chance
> to have a look until then. However, when everything's ready drop me an
> email and I'll see if a colleague has time.
>
> Cheers,
> Edward
>
> "On Friday, October 4, 2013 at 12:14 PM, Dibyajyoti Ghosh wrote:
>
> > Hi all,
> >
> > This is a repost from [email protected] (mailto:[email protected]).
> > I was not sure if flume developers got the email thus pardon my repost if
> > it feels like I am spamming the mailing list.
> >
> > I have a couple of feature requests for ElasticSearchSink and didn't find
> > open JIRA tickets for these requirements.
> >
> > I have already modified ElasticSearchSink locally for the smaller of the
> > feature request and the longer one is in progress. I wanted to discuss the
> > features first with you first before creating the JIRA tickets so here is a
> > brief summary of the improvements I have in mind.
> >
> >
> > DETAILS>>>
> >
> > Flume version:
> >
> > Flume 1.4.0-cdh4.4.0
> > Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
> > Revision: 154d35659212f07edc896b414a43996fb8121773
> > Compiled by jenkins on Tue Sep 3 20:53:28 PDT 2013
> > From source with checksum f95b4a7f48080f876d6482bb88bcc342
> >
> >
> > And ElasticSearch v0.90.1.
> >
> > Improvement request #1 - HDFS file suffix style index suffix in
> > ElasticSearchSink:
> >
> > agent.sinks.myESsink.indexName = myIndex
> >
> > ElasticSearchSink uses the provided index name as index prefix and
> > appends "YYYY-MM-DD" to generate the actual index in ES which being
> > convenient for my testing purposes, doesn't allow creating index monthly /
> > yearly or more generally speaking based on some regex provided in flume
> > config similar to HDFS fileSuffix .e.g.
> >
> > agent.sinks.myESsink.indexSuffix = "YYYY" will create index as
> > myIndex-2013 / myIndex-2014 etc and when not provided will create index
> > with just the index name or can default back to 'YYYY-MM-DD'.
> >
> > Improvement request #2 - ElasticSearchSink ttl field modification to
> > mimic actual ES:
> >
> > agent.sinks.myESsink.ttl = <some integer value> (current specification)
> >
> > The second one is comparatively trivial but good to have. Current
> > ElasticSearch TTL defaults to 5 days and works with integers only again
> > which is treated as days.
> >
> > It will be good to have a qualifier like "d" / "s" / "m" / "w" / "h" to
> > mimic the TTL configuration in ElasticSearch mapping.
> >
> > agent.sinks.myESsink.ttl = "3w" / 3 (requested specification)
> >
> > For the ttl I have already made changes in my local flume git repo and
> > currently testing it. The change doesn't break existing way of specifying
> > TTL field only extends it to allow "1d" / "2w" style TTL specification.
> >
> > <<<DETAILS
> >
> > Kindly suggest what should I do to make these changes incorporated in the
> > future release(s) of Flume.
> >
> > Best and thanks,
> > - Dib
>
>
> Thanks Hari.
>
> I am creating JIRA tickets for the improvements.
>
> Best,
> - Dib"
>
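
For reference, the index-naming request in this thread (FLUME-2207) could be sketched roughly as below. This is only an illustration under stated assumptions, not ElasticSearchSink code: the class IndexNames and the method indexFor are hypothetical, the suffix is formatted in UTC by assumption, and the pattern uses Java's date letters ("yyyy-MM-dd") rather than the "YYYY-MM-DD" spelling used above, because in Java's DateTimeFormatter "YYYY" means week-based year and "DD" means day of year.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper illustrating the request; not Flume's actual API. */
final class IndexNames {
  private IndexNames() {}

  /**
   * Builds "prefix-suffix", where the suffix is the event time formatted with
   * the given pattern. A null or empty pattern yields the bare prefix, which
   * corresponds to an index that never rolls.
   */
  static String indexFor(String prefix, String suffixPattern, Instant eventTime) {
    if (suffixPattern == null || suffixPattern.isEmpty()) {
      return prefix;
    }
    // Assumption: index boundaries are computed in UTC.
    DateTimeFormatter fmt =
        DateTimeFormatter.ofPattern(suffixPattern).withZone(ZoneOffset.UTC);
    return prefix + "-" + fmt.format(eventTime);
  }
}
```

For an event on 2013-10-28, a "yyyy-MM-dd" pattern reproduces the current daily behaviour ("flume-2013-10-28"), while "yyyy-MM" and "yyyy" roll the index monthly and yearly.
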
