David Arnold wrote:

> Not claiming this assumption does imply parsing of a *rolling* set
> of log lines with *previously unknown cardinality*. That's expensive
> on computing resources. I don't have actual numbers, but it doesn't
> seem too far fetched, neither.
> I filed a question to the author of fluent-bit to that extent which
> you can consult here:
> https://github.com/fluent/fluent-bit/issues/564 Let's see what
> Eduardo has to inform us about this...

fluent-bit does not appear to support CSV, as mentioned in
https://github.com/fluent/fluent-bit/issues/459
which was flagged as an enhancement request some time ago.

In CSV, a line break inside a field is easy for a parser to process,
because (per https://tools.ietf.org/html/rfc4180):

  "Fields containing line breaks (CRLF), double quotes, and commas
   should be enclosed in double-quotes"

So there is no lookahead to do. In a character-by-character loop,
when encountering a line break, either the current field started with
a double quote that has not been closed yet, and the line break is
part of the content, or the parser is not inside a quoted field, and
the line break ends the current record.

What doesn't quite work is parsing CSV with a regex. This is
discussed in some detail here, for instance:
https://softwareengineering.stackexchange.com/questions/166454/can-the-csv-format-be-defined-by-a-regex

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
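
For illustration, the character-by-character loop mentioned above could look roughly like the following Python sketch. The function name split_csv_records is hypothetical, it is not taken from fluent-bit or any other tool, and it assumes well-formed RFC 4180 input (no validation of stray quotes). The only state needed to classify a line break is whether the parser is currently inside a quoted field.

```python
def split_csv_records(text):
    """Split CSV text into records (lists of fields), RFC 4180 style.

    Hypothetical helper for illustration; assumes well-formed input.
    """
    records, fields, field = [], [], []
    in_quotes = False      # True while inside an unclosed quoted field
    record_open = False    # True once the current record has any content
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')   # doubled quote = literal quote
                    i += 1
                else:
                    in_quotes = False   # closing quote
            else:
                field.append(ch)        # line breaks here are field content
        elif ch == '"' and not field:
            in_quotes = True            # field starts with a quote
            record_open = True
        elif ch == ',':
            fields.append(''.join(field))
            field = []
            record_open = True
        elif ch in '\r\n':
            if ch == '\r' and i + 1 < n and text[i + 1] == '\n':
                i += 1                  # treat CRLF as one line break
            fields.append(''.join(field))
            records.append(fields)      # unquoted line break ends the record
            fields, field = [], []
            record_open = False
        else:
            field.append(ch)
            record_open = True
        i += 1
    if record_open:                     # last record without trailing newline
        fields.append(''.join(field))
        records.append(fields)
    return records


sample = 'id,msg\r\n1,"line one\nline two"\r\n2,"say ""hi"", ok"\r\n'
for record in split_csv_records(sample):
    print(record)
```

The second record of the sample keeps its embedded line break inside a single field, with no lookahead beyond the one character needed to tell a doubled quote from a closing quote.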