Re: Roadmap for v1.1.0

Mike Percy Sun, 04 Mar 2012 23:44:38 -0800

Sorry, I missed a couple things at the end (inline)

On Mar 4, 2012, at 11:29 PM, Mike Percy wrote:


> On Mar 4, 2012, at 9:52 PM, Juhani Connolly wrote:
> 
>> In the "poor code reviews" discussion, Mike Percy suggested opening up a 
>> thread regarding the roadmap for 1.1.0 and beyond, so here's a go at kicking 
>> that off.
>> 
>> I think the following questions present themselves, along with my opinions:
>> 
>> - When do we hope to make the next solid release? Do we have a planned 
>> schedule (that I may be unaware of)?
>> Personally I am not too attached to deciding a date in advance and would 
>> prefer to decide on a fixed set of issues that we prioritize, then limit 
>> the branch to bug fixes only (moving any further dev to a separate branch), 
>> and push that out as the next release once it has been sufficiently tested 
>> and harmful bugs have been removed.
> 
> I'd be inclined to try to release as often as we think we have useful 
> features and bug fixes implemented, to maintain a rhythm and keep the 
> vitality of the project high. I think releasing often also helps encourage 
> users to engage with the developer community and try out and vet experimental 
> features.
> 
>> - What belongs in 1.1.0?
>> I for one think that for any log delivery infrastructure the core parts for 
>> delivery mechanisms and error recovery mechanisms should be of primary 
>> importance, and this is what I've been trying to work on. I do not feel that 
>> any further sources or sinks are necessary, but feel that for delivery 
>> mechanisms, the lack of a FileChannel is pretty painful. I also feel that a 
>> buffering mechanism (as in scribed) that stores channel overflow in a 
>> long-term medium should be a priority.
> 
> I tend to agree with what you're saying, although I don't really have an 
> aversion to integrating more Sinks as long as they have maintainers. I agree 
> that a long-term buffering solution is very important, though I think that 
> would be part of FileChannel. Overall I think we should strive for correctness 
> in the core, medium term API stability, and system speed, in that order for 
> the next release. The primary thing I am looking at right now is the RPC 
> mechanism, to ensure we are set up to take full advantage of Avro RPC 
> performance features and ensure that remote clients can integrate with Flume 
> in the future. I have some concerns there and I'll start a thread about it 
> tomorrow probably, since if there are reasons to break wire compatibility we 
> should do it as early as possible in the life of 1.x. (Incidentally, I also 
> think we should start calling it 1.x instead of NG to avoid coining terms 
> like Flume ONG and Flume NNG for 2.x :)
> 
> Along the vein of system interfaces, one big thing that I think is missing in 
> Flume is Javadoc of all the core interfaces and classes. This is something I 
> am certainly willing to work on. Mainly I believe that the various interface 
> contracts need to be strongly specified in the base class Javadoc so that 
> it's easier to tell if something is wrong and to ensure consistency across 
> implementations. For example, if there is an error delivering an event, should 
> a Sink return BACKOFF or throw an EventDeliveryException? I'm not sure why 
> one is a return value and the other is an exception, but we should make sure 
> consequences and best practices are documented, and any Sinks in the core 
> should be consistent. I'm still getting my head around the system and using 
> the source (, Luke) to figure these things out. But hopefully future devs and 
> API users won't have to do that as much.
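
To make the contract question concrete, here is a minimal sketch of how a
Javadoc-specified Sink contract could read. Everything in it is a hypothetical
stand-in: SinkContractSketch, ExampleSink, DeliveryException and QueueSink are
illustration names, not Flume's actual classes, and the BACKOFF-versus-exception
rule shown is only one possible choice, not a decision the project has made.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-ins for illustration only; not Flume's actual classes. */
class SinkContractSketch {

    /** Result of one delivery attempt. */
    public enum Status { READY, BACKOFF }

    /** Signals that delivery was attempted and failed. */
    public static class DeliveryException extends Exception {
        public DeliveryException(String message) { super(message); }
    }

    /**
     * Drains events toward a destination.
     *
     * <p>Contract assumed for this sketch:
     * <ul>
     *   <li>Return READY when one event was delivered.</li>
     *   <li>Return BACKOFF when there was nothing to deliver, so the caller
     *       should wait before calling again.</li>
     *   <li>Throw DeliveryException when delivery was attempted and failed;
     *       the event must stay available for a later retry.</li>
     * </ul>
     * Thread safety: implementations need not be thread-safe; callers invoke
     * process() from a single thread.
     */
    public interface ExampleSink {
        Status process() throws DeliveryException;
    }

    /** Small in-memory implementation that follows the contract above. */
    public static class QueueSink implements ExampleSink {
        private final Queue<String> pending = new ArrayDeque<>();
        private boolean destinationUp = true;
        private int delivered = 0;

        public void offer(String event) { pending.add(event); }
        public void setDestinationUp(boolean up) { destinationUp = up; }
        public int delivered() { return delivered; }

        @Override
        public Status process() throws DeliveryException {
            if (pending.peek() == null) {
                return Status.BACKOFF;          // nothing to do right now
            }
            if (!destinationUp) {
                // Event is left in the queue so a retry can deliver it.
                throw new DeliveryException("destination unavailable");
            }
            pending.remove();                   // "deliver" the head event
            delivered++;
            return Status.READY;
        }
    }

    public static void main(String[] args) throws DeliveryException {
        QueueSink sink = new QueueSink();
        System.out.println(sink.process());     // empty queue
        sink.offer("event-1");
        System.out.println(sink.process());     // one event delivered
    }
}
```

With the rules written on the interface, a reviewer can check any Sink against
them instead of inferring the intent from existing implementations.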
> 
> One more thing that I think is important, while not really related to a 
> software release per se, is coming up with stories around how common use 
> cases are supposed to work or eventually be possible. Something I've been 
> thinking about a lot is Apache web server log collection onto HDFS. While 
> tail source is known to be problematic (deserves a FAQ entry), we should 
> provide explanations and best practices for the most common cases. (In this 
> case I think it involves writing an apache httpd mod_flume module that speaks 
> Avro). We can then eventually provide code for these most common cases when 
> we have time to implement them or as they are contributed. These very common 
> use cases and the stories around them should inform our design decisions.
> 
>> I am unsure of configuration overhauls. We have one configuration method 
>> that works. Should a centralized one be an immediate target or one for 
>> 1.1.0? Should refactoring the configuration be a priority (it was pointed 
>> out that FlumeConfiguration has become a god class)?
> 
> OK, so my understanding is that some changes to how we do config validation 
> are required to be able to write a tool to validate Flume configs without 
> having to start an agent. The idea is for this functionality to be separated 
> from the core to some extent so that the validation mechanism can be exposed 
> as an API. The initial request for an API came from the Cloudera enterprise 
> team, which wants to add Flume configuration validation support in the Cloudera 
> Manager app. Personally I think it would be a great feature to have in a 
> command line tool as well. From an operations perspective, it's nice to have 
> the ability to check that your config is valid before pushing it, instead of 
> finding out your config is broken once you deploy to all your agents… 
> especially if you are in an emergency production situation and you need to 
> make changes fast. If you have concerns about the implementation beyond the 
> issues that Eric raised, or even if you agree/disagree with the current 
> feedback on the review, then I know Hari would appreciate any constructive 
> feedback that you or other folks can provide. Of course if folks think that 
> it's an undesirable feature, have concerns, or think there is a better way to 
> design it then they should definitely speak up in the JIRA, the review tool, 
> or here as well.
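
As a rough illustration of validating a config without starting an agent, the
sketch below checks a properties file offline. It assumes a layout along the
lines of agent.sources = s1, agent.sources.s1.type = ..., 
agent.sources.s1.channels = c1 and agent.sinks.k1.channel = c1. The class
ConfigValidatorSketch and its checks are hypothetical; this is not the API
under review in the JIRA, and the type values in the sample are placeholders
that the sketch does not interpret.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

/** Hypothetical offline validator; not the API proposed in the review. */
class ConfigValidatorSketch {

    /** Splits a whitespace-separated component list such as "c1 c2". */
    static Set<String> names(Properties p, String key) {
        String value = p.getProperty(key, "").trim();
        Set<String> out = new LinkedHashSet<>();
        if (!value.isEmpty()) {
            out.addAll(Arrays.asList(value.split("\\s+")));
        }
        return out;
    }

    /** Returns readable problems; an empty list means none were found. */
    static List<String> validate(String agent, Properties p) {
        List<String> errors = new ArrayList<>();
        Set<String> channels = names(p, agent + ".channels");
        if (channels.isEmpty()) {
            errors.add(agent + ": no channels declared");
        }
        // Every declared component needs a type.
        for (String kind : new String[] {"sources", "channels", "sinks"}) {
            for (String name : names(p, agent + "." + kind)) {
                String prefix = agent + "." + kind + "." + name;
                if (p.getProperty(prefix + ".type") == null) {
                    errors.add(prefix + ": missing type");
                }
            }
        }
        // Sources may feed several channels; each must be declared.
        for (String src : names(p, agent + ".sources")) {
            String key = agent + ".sources." + src + ".channels";
            Set<String> used = names(p, key);
            if (used.isEmpty()) {
                errors.add(key + ": source has no channel");
            }
            for (String ch : used) {
                if (!channels.contains(ch)) {
                    errors.add(key + ": unknown channel " + ch);
                }
            }
        }
        // Each sink reads from exactly one declared channel.
        for (String sink : names(p, agent + ".sinks")) {
            String key = agent + ".sinks." + sink + ".channel";
            String ch = p.getProperty(key);
            if (ch == null || !channels.contains(ch.trim())) {
                errors.add(key + ": missing or unknown channel");
            }
        }
        return errors;
    }

    /** Parses properties text without touching the file system. */
    static Properties parse(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return p;
    }

    public static void main(String[] args) throws IOException {
        String config = String.join("\n",
                "agent.sources = s1",
                "agent.channels = c1",
                "agent.sinks = k1",
                "agent.sources.s1.type = netcat",
                "agent.sources.s1.channels = c1",
                "agent.sinks.k1.type = logger",
                "agent.sinks.k1.channel = c2");
        validate("agent", parse(config)).forEach(System.out::println);
    }
}
```

Because validate only needs a Properties object, the same method could back
both a command line tool and a management UI, which is the separation from the
core described above.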
> 
> Anyway, I think other folks should chime in on this thread and we should 
> ultimately morph this discussion into a list of JIRAs for inclusion into a 
> 1.1.0. And I would advocate that the rest would move to 1.2.0 by default.

> 
>> There are a few other leftovers from flume-728: metric collection 
>> infrastructure, documentation, master. Should these be targets for 1.1.0 or 
>> for further down the road?
>> We should probably also make clear which components need to be thread safe 
>> and which don't. We should also verify this is the case.

What do you mean by Master?

+1 on documenting thread safety and providing much more documentation in 
general.

I'm not sure about exposing metrics for 1.1.0… while it's important for folks 
running Flume and we should make it a high priority, I think we could probably 
provide enough value with more important stuff to justify a next release 
without it, if we are releasing frequently. Then again if someone wanted to 
work on JMX support or something like that I wouldn't be against it!
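
For anyone curious what basic JMX support might involve, here is a minimal
sketch using only the standard javax.management API. The names MetricsSketch,
ChannelCounter and ChannelCounterMBean, the two counters, and the
"example.flume" ObjectName domain are all hypothetical and chosen for
illustration; nothing like this exists in Flume as far as this thread is
concerned.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical metrics holder; names and domain are illustrative only. */
class MetricsSketch {

    /** Management interface: JMX exposes each getter as a read-only attribute. */
    public interface ChannelCounterMBean {
        long getEventPutCount();
        long getEventTakeCount();
    }

    /** Thread-safe counters a channel could update as events pass through. */
    public static class ChannelCounter implements ChannelCounterMBean {
        private final AtomicLong puts = new AtomicLong();
        private final AtomicLong takes = new AtomicLong();

        public void recordPut() { puts.incrementAndGet(); }
        public void recordTake() { takes.incrementAndGet(); }

        @Override public long getEventPutCount() { return puts.get(); }
        @Override public long getEventTakeCount() { return takes.get(); }
    }

    /** Registers the counter with the platform MBean server. */
    static ObjectName register(ChannelCounter counter, String channelName)
            throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // "example.flume" is an assumed domain, not an established convention.
        ObjectName name = new ObjectName(
                "example.flume:type=ChannelCounter,name=" + channelName);
        server.registerMBean(counter, name);
        return name;
    }

    public static void main(String[] args) throws Exception {
        ChannelCounter counter = new ChannelCounter();
        ObjectName name = register(counter, "memChannel");
        counter.recordPut();
        counter.recordPut();
        counter.recordTake();

        // Read the values back the way a JMX client would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println("puts=" + server.getAttribute(name, "EventPutCount")
                + " takes=" + server.getAttribute(name, "EventTakeCount"));
        server.unregisterMBean(name);
    }
}
```

Once registered, the attributes would be visible to any standard JMX client
such as jconsole, without extra transport code on our side.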

Regards,
Mike
