On 2011-02-16 11:39, Rainer Gerhards wrote:
> The SIP CLF WG has just recently rejected IPFIX for it being binary and
> chosen indexed ASCII instead for their format. Their reasoning (after a long
> struggle) is probably educating:
> 
> http://www.ietf.org/mail-archive/web/sip-clf/current/msg00364.html
> 
> I don't think that IPFIX is a good solution *in the syslog context*. It is
> very far from what people expect. Other than that, I'd probably need to
> re-iterate the arguments made on the SIP CLF mailing list, so it probably is
> better to refer to their archive ;)

Why would they expect anything about the *DATA* format of a protocol?

Note that the whole problem that IPFIX (or any other structured data
format, for that matter) 'solves' is that one otherwise has to write a
parser for every single log file format out there. Doing the parsing at
the meter tends to be cheaper than doing it at the aggregation point,
because the work can be distributed over many meters. (Then again,
sFlow, as an example, does it exactly the other way around, just
pushing packets and letting the collector do the hard parsing part. But
we are talking about sampled flows there, so you will miss out on
events, which is not a decision you can make at the meter if you are
looking at, say, break-in attempts or failures ;)

I think the pro-ASCII side of the ASCII-versus-binary argument comes
primarily from organizations that already process large amounts of
variable-length ASCII string data and that do not really care about a
few extra bits or a bit more processing overhead, as they already have
large global clusters of hosts doing that work. Their programming
languages also tend to be scripting languages, which make it harder /
less efficient to work on binary data but work great with ASCII-like
data.

Nevertheless, I have a generic logline parser which simply converts
syslog and other log file formats into IPFIX. The problem with the whole
ASCII thing, though, is that one has to teach the parser which fields
are what and, in the case of for instance the Apache CLF, teach it the
weird delimiters that are present. These are all special cases,
something one would really like to avoid if one wants to keep it speedy.
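
To make the 'weird delimiters' point concrete, here is a minimal sketch
in Python of the kind of special casing a single format needs. It is
not my actual parser; the names CLF_RE and parse_clf are made up for
this example, and it assumes plain Apache Common Log Format lines
without escaped quotes inside the request field.

```python
import re

# Apache Common Log Format:
#   host ident authuser [date] "request" status bytes
# Three different delimiters in one line: spaces, [...] and "...".
CLF_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = CLF_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["status"] = int(fields["status"])
    # Apache logs "-" when no response body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields
```

Every other log format needs its own expression like this, which is
exactly the per-format work one wants to push out to the edge.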

My model partially solves that, as I only have to do the special casing
at the edge, where the log file gets converted into IPFIX. As those
converters are considered 'meters', I can just deploy more and more of
them, while the collector side generally stays a single box or, if it
does not, the data is easy to distribute amongst several collectors.

And of course the conversion goes the other way too: it can spit out
reformatted 'ASCII' again if needed.
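
As a rough illustration of that round trip, again in Python: the record
layout below is a hypothetical fixed layout invented for this example
and is NOT the IPFIX wire format (there are no templates, set headers
or information element IDs here). The names RECORD, to_binary and
to_ascii are likewise made up.

```python
import socket
import struct

# Hypothetical fixed layout, NOT real IPFIX encoding:
#   IPv4 address (4 bytes), status (uint16), size (uint32),
#   all in network byte order without padding.
RECORD = struct.Struct("!4sHI")

def to_binary(host, status, size):
    """Pack already-parsed fields into a fixed 10-byte record."""
    return RECORD.pack(socket.inet_aton(host), status, size)

def to_ascii(blob):
    """Unpack a record and format it as a space-separated text line."""
    addr, status, size = RECORD.unpack(blob)
    return "%s %d %d" % (socket.inet_ntoa(addr), status, size)
```

Once the fields are typed and at fixed offsets, the collector needs no
per-format knowledge, and producing text again is a plain format
string.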

Greets,
 Jeroen

 (who finds it funny to see ASCII btw, as there is this thing called
  UTF-8 that makes it possible to express things in all languages of
  the world. I guess those people have to live with punycode etc...)
_______________________________________________
Syslog mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/syslog
