On Thu, Oct 20, 2011 at 11:43 PM, Mingjie Lai <[email protected]> wrote:
> Eric.
>
> It makes sense to me. I also agree it can be applied to similar sources,
> such as the UDP source I'm dealing with
>
> > that executes a Unix process and ...
> Can also be a Windows command, right?

You know, I haven't tested it but provided ProcessBuilder / Process and
stdout work "as expected" on Windows, I don't see why it wouldn't work just
fine. My Windows knowledge is severely limited.

> Thanks,
> Mingjie
>
> On 10/20/2011 04:32 PM, Eric Sammer wrote:
>
>> I've included the following in the javadoc for the ExecSource in NG:
>>
>> """
>>
>> *org.apache.flume.source.ExecSource*
>>
>> A Source implementation that executes a Unix process and turns each
>> line of text into an event.
>>
>> The ExecSource is meant for situations where one must integrate with
>> existing systems without modifying code. It is a compatibility gateway
>> built to allow simple, stop-gap integration and doesn't necessarily
>> offer all of the benefits or guarantees of native integration with
>> Flume. If one has the option of using the AvroSource, for instance,
>> that would be greatly preferred to this source as it (and similarly
>> implemented sources) can maintain the transactional guarantees that
>> exec can not.
>>
>> Why doesn't *ExecSource* offer transactional guarantees?
>>
>> The problem with ExecSource and other asynchronous sources is that the
>> source can not guarantee that if there is a failure to put the event
>> into the Channel the client knows about it. As a for instance, one of
>> the most commonly requested features is the tail -F [file]-like use
>> case where an application writes to a log file on disk and Flume tails
>> the file, sending each line as an event. While this is possible,
>> there's an obvious problem; what happens if the channel fills up and
>> Flume can't send an event? Flume has no way of indicating to the
>> application writing the log file that it needs to retain the log or
>> that the event hasn't been sent, for some reason. If this doesn't make
>> sense, you need only know this: *Your application can never guarantee
>> data has been received when using a unidirectional asynchronous
>> interface such as ExecSource!* As an extension of this warning - and to
>> be completely clear - there is absolutely zero guarantee of event
>> delivery when using this source. You have been warned.
>>
>> """
>>
>> Does anyone feel like this isn't clear or disagrees with this warning?
>> I'd like to make sure this is *very* well understood by users going
>> forward. This would carry for any kind of source similar to exec (which
>> absolutely includes "tail").
>>
>> Feedback welcome / appreciated.

-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com
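
P.S. For anyone who wants to see the failure mode from the javadoc in miniature, here is a rough sketch. It is not the ExecSource code. The class name ExecLossSketch and both method names are made up for illustration, and a bounded BlockingQueue stands in for the Channel, which is an assumption about how to model "the channel fills up". It only relies on the standard ProcessBuilder / Process / stdout behaviour mentioned above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Illustrative only: NOT Flume's ExecSource.
// The bounded BlockingQueue is a stand-in for a Channel (an assumption).
class ExecLossSketch {

    // Reads lines until EOF and offers each one to the bounded queue.
    // Returns how many lines were dropped because the queue was full.
    static int pump(BufferedReader lines, BlockingQueue<String> channel)
            throws IOException {
        int dropped = 0;
        String line;
        while ((line = lines.readLine()) != null) {
            // offer() returns false right away when the queue is full.
            // Whoever produced the line is never told about it.
            if (!channel.offer(line)) {
                dropped++;
            }
        }
        return dropped;
    }

    // Starts the command and pumps its stdout (with stderr merged in)
    // into the queue, one line per "event".
    static int run(List<String> command, BlockingQueue<String> channel)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            int dropped = pump(reader, channel);
            process.waitFor();
            return dropped;
        }
    }
}
```

Calling something like ExecLossSketch.run(List.of("tail", "-F", "/var/log/app.log"), new ArrayBlockingQueue<>(1000)) (the path and capacity are hypothetical) would keep reading for as long as tail runs. The point is that the only party that ever sees the dropped count is the reader side; the application writing the log file has no channel back, which is exactly the unidirectional problem described in the warning.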
