On 2011-02-16 11:39, Rainer Gerhards wrote:
> The SIP CLF WG has just recently rejected IPFIX for it being binary and
> chosen indexed ASCII instead for their format. Their reasoning (after a long
> struggle) is probably educating:
>
> http://www.ietf.org/mail-archive/web/sip-clf/current/msg00364.html
>
> I don't think that IPFIX is a good solution *in the syslog context*. It is
> very far from what people expect. Other than that, I'd probably need to
> re-iterate the arguments made on the SIP CLF mailing list, so it probably is
> better to refer to their archive ;)

Why would they expect anything about the *DATA* format of a protocol?

Note that the whole point of IPFIX (or any other structured data format, for that matter) is that it 'solves' the problem of having to write a parser for every single log file format out there. Doing the parsing at the meter tends to be cheaper than doing it at the aggregation point, because the work can be distributed across the meters. (Then again, sFlow, for example, does it exactly the other way around: it just pushes packets and lets the collector do the hard parsing. But there we are talking about sampled flows, so you will miss out on events, and that is not a decision you can make at the meter if you are looking at, say, break-in attempts or failures ;)

I think the pro-ASCII side of the ASCII versus binary argument comes primarily from organizations that already process large amounts of variable-string ASCII data and that do not really care about a few extra bits or a bit more processing overhead, as they already have large global clusters of hosts doing that work. Their programming languages also tend to be scripting languages, which make it harder or less efficient to work on binary data but work great with ASCII-like data.

Nevertheless, I have a generic log line parser which simply converts syslog and other log file formats into IPFIX. The problem with the whole ASCII thing, though, is that one has to teach the parser which fields are which and, in the case of for instance the Apache CLF, teach it the weird delimiters that are present. These are all special cases, something that one would really like to avoid if one wants to keep it speedy.

My model partially solves that, as I only have to do the special casing at the edge, where the log file gets converted into IPFIX. As those are considered 'meters', I just deploy more and more of them, while on the collector side I can generally keep a single box, or otherwise easily distribute the data amongst several.
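
To illustrate the kind of edge-side special casing meant here, the following is a minimal Python sketch, not the actual parser described above. It parses one Apache CLF line (with its bracket and quote delimiters) into typed fields and packs them into a small binary record. The names `parse_clf` and `pack_record` are made up for this example, the record layout is a hypothetical fixed layout and NOT the IPFIX wire format, and the client is assumed to be logged as an IPv4 address.

```python
import re
import socket
import struct
from datetime import datetime

# Apache Common Log Format:
#   host ident authuser [date] "request" status bytes
# The [..] and ".." delimiters are the format-specific special cases.
CLF_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf(line):
    """Parse one CLF line into a dict of typed fields, or None on mismatch."""
    m = CLF_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    size = m.group("size")
    return {
        "host": m.group("host"),
        "user": m.group("user"),
        "timestamp": int(ts.timestamp()),      # seconds since the epoch (UTC)
        "request": m.group("request"),
        "status": int(m.group("status")),
        "size": 0 if size == "-" else int(size),  # "-" means no body sent
    }


def pack_record(rec):
    """Pack the numeric fields into a hypothetical 14-byte record.

    Layout (network byte order): IPv4 address (4), timestamp (4),
    status (2), size (4). This is an illustration only, not IPFIX.
    """
    return socket.inet_aton(rec["host"]) + struct.pack(
        "!IHI", rec["timestamp"], rec["status"], rec["size"]
    )


if __name__ == "__main__":
    line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
            '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    rec = parse_clf(line)
    print(rec)
    print(pack_record(rec).hex())
```

Only `parse_clf` knows anything about the CLF delimiters; everything after it deals with typed fields, which is the reason for doing this step at the meter rather than at the collector.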

And of course, the conversion goes the other way too: it can spit out reformatted 'ascii' again if needed.

Greets,
 Jeroen

(who finds it funny to see ASCII btw, as there is this thing called UTF-8 that makes it possible to express things in all languages of the world. I guess those people have to live with punycode etc...)
