David,

First of all, thanks a lot for your input in this matter. For me it's eye-opening in many areas and it's very interesting for this subject in general.
More inline.

2014/1/23 David Lang <[email protected]>

> On Thu, 23 Jan 2014, Rainer Gerhards wrote:
>
>> On Thu, Jan 23, 2014 at 6:46 PM, David Lang <[email protected]> wrote:
>>
>>> so what exactly is being proposed?
>>>
>>> It sounds as if we are talking about omprog, but that also captures
>>> stderr of the program that's executed, and watches that stderr for
>>> specific keywords.
>>
>> I think yes, that's basically the idea.
>>
>>> what keywords are you talking about, and what actions will be taken?
>>> how would these actions differ from the program just stalling reading
>>> of the pipe or exiting with an error code?
>>
>> don't know yet - this needs to evolve/be specified. Currently working
>> on a python test script and test integration.
>
> Ok, probably a better question than what keywords, is what functionality
> you are looking for in this feedback. Radu, do you have thoughts?

With the way I've used omprog so far, I never tried (or missed being able) to communicate from the script back to rsyslog. The script's internal queue would be tiny (say, 1000 messages), and if it gets full, bad luck, it just stops. If something catastrophic happens, it throws an exception and relies on omprog to restart it.

Warnings would have been logged to the script's own logfile, and I guess that's where it would be nice to have a channel back to rsyslog. But again, I wouldn't stress too much about that. For example, a custom UDP port to listen on, bound to a ruleset that writes those logs to a file, should be enough. You wouldn't want to send those logs back to the same script, because that would create an endless loop.

In short, what I needed was to discard malformed messages (which should be a rare exception, indicating a bug somewhere, which is why it's nice to log when that happens) and to stop and retry on temporary errors (like network issues).
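A minimal sketch of those two behaviours (discard and log malformed messages, stop and retry on temporary errors). All names here (MalformedMessage, TemporaryError, process, and the send/log callables) are hypothetical and only for illustration; nothing in it is an rsyslog or omprog API.

```python
import time


class MalformedMessage(Exception):
    """Hypothetical: the message can never be delivered as-is."""


class TemporaryError(Exception):
    """Hypothetical: delivery failed for now (e.g. a network issue)."""


def process(msg, send, log, retry_delay=1.0, sleep=time.sleep):
    """Deliver one message; return True if sent, False if discarded.

    send and log are caller-supplied callables (assumptions of this sketch).
    """
    while True:
        try:
            send(msg)
            return True
        except MalformedMessage as exc:
            # Rare and probably a bug somewhere, so log it and move on.
            log("discarding malformed message %r: %s" % (msg, exc))
            return False
        except TemporaryError:
            # Wait, then try the same message again.
            sleep(retry_delay)
```

Since nothing else is processed while process() is retrying, a script that calls it for each message pauses on a temporary error, which matches the "it just stops" behaviour described above.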
>>> This is not going to be able to support batch mode, some number of logs
>>> are going to be lost in any case because we will dequeue messages
>>> (consider them delivered) when we put them in the pipe to the program,
>>> any feedback we get from stderr can only affect logs that haven't been
>>> sent to the pipe yet.
>>
>> yeah, it's not totally reliable. It's the same scenario we have with
>> omfwd in plain tcp syslog mode. Judging from that, it's still good for
>> many applications. If that's a problem, we could even go down (via
>> config option) to do a half-duplex mode, where only one message is
>> posted at one time, and reply awaited. Obviously much slower, but there
>> always is a price for reliability.
>
> given this level of reliability, I'm not sure that it's really worth
> trying to get more feedback than we get with TCP (it died and we are
> restarting, or it blocked and we can't send to it)

+1

>>> what other restrictions are there?
>>
>> I don't know yet. Quite honestly, I don't try to design this fully
>> through. This time, I'd prefer to do something, see how it works out
>> and what hurts, refactor and begin a new cycle. In this mode, I hope to
>> have something workable (even without real feedback, much like UDP)
>> within a day or two. That would probably be enough to see the utility
>> and if it is being used. A full design requires more time - time I
>> don't have and I don't know if it would be well spent. You may consider
>> this effort of a simple kind of "omprog" evolution but from the
>> marketing PoV this seems not to be very appealing... So let's do what
>> everyone does and exaggerate the terms. Still, I think it's technically
>> far superior, and we could have a fairly working SOLR script (with
>> batches) very quickly...
>
> how would you handle batches?
This is all in the script; rsyslog only pushes messages to the script's stdin and doesn't bother with batching:

- one thread continuously reads from stdin and puts messages into a queue
- one or more threads write to Solr (or whatever).

Logic would be something like:

#########################
output_array = []
while True:
    try:
        output_array.append(get_new_message_from_queue())
    except QueueIsEmptyException:
        if output_array:
            # batch is sent if the queue is empty
            PushArrayContentsToSolr(output_array)
            output_array = []
    if len(output_array) == max_batch_size:
        # batch is sent if the maximum batch size is reached
        PushArrayContentsToSolr(output_array)
        output_array = []
#########################

Yes, it does reinvent the rsyslog batch logic (or some of it, at least; I'm not super-intimate with how rsyslog does it). The point of providing such a "skeleton" script is to spare people from redoing this logic a third time, a fourth time and so on :)

Basically, for each destination, one would have to modify the PushArrayContentsToSolr() function from the snippet above. Which proves that a name containing Solr is bad :D
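For reference, here is a runnable sketch of the skeleton described above, using only the Python standard library. It is one possible shape and not a finished script: the names read_lines, run_batches, demo and push_batch, and the 0.2 second idle timeout, are assumptions made for illustration. push_batch is the destination-specific part that would replace PushArrayContentsToSolr().

```python
import io
import queue
import threading

_EOF = object()  # sentinel the reader puts on the queue when input ends


def read_lines(stream, q):
    """Reader thread: move each line from the stream into the queue."""
    for line in stream:
        q.put(line.rstrip("\n"))
    q.put(_EOF)


def run_batches(q, push_batch, max_batch_size=3, idle_timeout=0.2):
    """Send a batch when it is full or when the queue stays empty."""
    batch = []
    while True:
        try:
            msg = q.get(timeout=idle_timeout)
        except queue.Empty:
            if batch:  # queue is empty: send what we have
                push_batch(batch)
                batch = []
            continue
        if msg is _EOF:
            if batch:  # end of input: send the remainder
                push_batch(batch)
            return
        batch.append(msg)
        if len(batch) >= max_batch_size:  # maximum batch size reached
            push_batch(batch)
            batch = []


def demo(lines, max_batch_size=3):
    """Feed lines through an in-memory stream and collect the batches."""
    sent = []
    q = queue.Queue(maxsize=1000)  # small bounded queue, as discussed
    stream = io.StringIO("".join(line + "\n" for line in lines))
    reader = threading.Thread(target=read_lines, args=(stream, q), daemon=True)
    reader.start()
    run_batches(q, sent.append, max_batch_size=max_batch_size)
    reader.join()
    return sent
```

In a real omprog script the reader would be given sys.stdin instead of the in-memory stream, and push_batch would do the actual write to Solr or whatever the destination is. Because the queue is bounded, a slow destination eventually blocks the reader, which in turn stops the script from reading stdin.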

