This discussion has made big steps forward. It is very encouraging to see
this amount of interest. It seems that this has been around at the back of
many minds for some time already...
Thanks to Christophe's friendly reminder, I will try to define the problem
space as concisely as possible.
I think what is left to clarify is the root cause behind the observation
that log streaming/processing tools do not embrace multi-line records.
Then, hopefully, a final decision can be reached on whether or not this is
an admissible problem worth addressing from the Postgres side.
If it is, we can additionally decide whether the UX should be improved
at the same time (-> JSON/logfmt).
Let's hope Eduardo, the maker of fluent-bit, soon finds time to tell us what
he has to say about the multi-line problem in log parsing.
On Mon, Apr 16, 2018 at 9:41, David Fetter (<da...@fetter.org>) wrote:
> On Mon, Apr 16, 2018 at 10:06:29AM -0400, Andrew Dunstan wrote:
> > On 04/15/2018 05:05 PM, Christophe Pettus wrote:
> > >> On Apr 15, 2018, at 12:16, David Arnold <firstname.lastname@example.org> wrote:
> > >>
> > >> Core-Problem: "Multi line logs are unnecessarily inconvenient to
> > >> parse and are not compatible with the design of some (commonly used)
> > >> logging aggregation flows."
> > > I'd argue that the first line of attack on that should be to explain
> > > to those consumers of logs that they are making some unwarranted
> > > assumptions about the kind of inputs they'll be seeing. PostgreSQL's CSV
> > > log format is not a particularly bizarre format, or very difficult to
> > > parse. The standard Python CSV library handles it just fine, for
> > > example. You have to handle newlines that are part of a log message
> > > somehow; a newline in a PostgreSQL query, for example, needs to be emitted
> > > to the logs.
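To make the point above concrete, here is a minimal sketch of the difference
between a CSV-aware reader and a naive line splitter. The two records are
made-up and cut down to three columns (real csvlog lines have many more); the
name split_naive_and_csv is only for illustration.

```python
import csv
import io

# Hypothetical csvlog excerpt, reduced to three columns for brevity.
# The message field of the first record contains an embedded newline.
RAW = (
    '2018-04-16 10:06:29 UTC,LOG,"statement: SELECT 1,\n       2"\n'
    '2018-04-16 10:06:30 UTC,LOG,"statement: SELECT 3"\n'
)


def split_naive_and_csv(raw):
    """Return (physical lines, CSV records) for the same input text."""
    naive_lines = raw.splitlines()               # what a line-based tailer sees
    records = list(csv.reader(io.StringIO(raw)))  # what a CSV parser sees
    return naive_lines, records


naive_lines, records = split_naive_and_csv(RAW)
print(len(naive_lines), "physical lines,", len(records), "log records")
```

The CSV reader keeps the quoted newline inside the message field, while the
line splitter breaks the first record into two fragments.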
> > In JSON newlines would have to be escaped, since literal newlines are
> > not legal in JSON strings. Postgres' own CSV parser has no difficulty at
> > all in handling newlines embedded in the fields of CSV logs.
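As a small illustration of the escaping mentioned above, this sketch
serializes one record with Python's json module. The field names (severity,
message) are assumptions of mine, not a proposed Postgres log schema.

```python
import json

# Hypothetical log record; the key names are illustrative only.
record = {
    "severity": "LOG",
    "message": "statement: SELECT 1,\n       2",
}

# json.dumps escapes the newline as the two characters backslash + n,
# so the serialized record occupies exactly one physical line.
line = json.dumps(record)
print(line)

# Decoding restores the original message, including the newline.
restored = json.loads(line)
```

A line-oriented consumer can therefore treat each physical line as one
record and still recover the multi-line query after decoding.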
> True, and anything that malloc()s in the process of doing that
> escaping could fail on OOM, and hilarity would ensue. I don't see
> these as show-stoppers, or even as super relevant to the vast majority
> of users. If you're that close to the edge, you were going to crash
> > I'm not necessarily opposed to providing for JSON logs, but the
> > overhead of named keys could get substantial. Abbreviated keys might
> > help, but generally I think I would want to put such logs on a
> > compressed ZFS drive or some such.
> Frequently at places I've worked, the end destination is of less
> immediate concern than the ability to process those logs for
> near-real-time monitoring. This is where formats like JSON really
> David Fetter <david(at)fetter(dot)org> http://fetter.org/
> Phone: +1 415 235 3778
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
DAVID ARNOLD
XOE Solutions <http://xoe.solutions/>
+57 (315) 304 13 68