This is especially important when doing log processing on Apache Hadoop: if you give a Hadoop job uncompressed text files as input, it splits large log files on newlines so they can be processed on multiple nodes. That split should fall on a record boundary.
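
To make the parsing order from the quoted message below concrete (split on newline, then on the field delimiter, then unescape), here is a minimal Python sketch. It is not our actual parser. The tab delimiter and the names `unescape_field` and `parse_log` are hypothetical, and the escape set (`\b`, `\n`, `\r`, `\t`, `\v`, `\\`, `\"`, `\xhh`) is my understanding of what ap_escape_logitem() emits, so treat it as an assumption.

```python
import re

# Assumed escape set: a backslash followed by one of b n r t v \ " or xHH.
_ESCAPE_RE = re.compile(r'\\(x[0-9A-Fa-f]{2}|[bnrtv\\"])')
_SIMPLE = {"b": "\b", "n": "\n", "r": "\r", "t": "\t", "v": "\v",
           "\\": "\\", '"': '"'}


def unescape_field(field):
    """Undo backslash escapes in one field, in a single left-to-right pass."""
    def repl(match):
        token = match.group(1)
        if token[0] == "x":
            # \xhh: map the hex value to the character with that code point.
            return chr(int(token[1:], 16))
        return _SIMPLE[token]
    return _ESCAPE_RE.sub(repl, field)


def parse_log(text, delimiter="\t"):
    """Split into records first, then fields, and only then unescape.

    The delimiter is a hypothetical choice; it must be a character that
    the log writer escapes inside field data.
    """
    records = []
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty string after the final newline
        records.append([unescape_field(f) for f in line.split(delimiter)])
    return records
```

Unescaping has to come last: if a raw newline could appear inside a field, the first split (and any Hadoop input split) would cut a record in two before the parser ever saw it.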

On Wed, Apr 13, 2016 at 5:16 PM, Yann Ylavic <ylavic....@gmail.com> wrote:

> On Wed, Apr 13, 2016 at 11:08 PM, Eric Covener <cove...@gmail.com> wrote:
> > On Wed, Apr 13, 2016 at 5:05 PM, Daniel Lescohier
> > <daniel.lescoh...@cbsi.com> wrote:
> >> Isn't T_ESCAPE_LOGITEM also used by mod_log_config's use of
> >> ap_escape_logitem? We rely on the API that data from HTTP requests
> >> that are logged in our mod_log_config logfiles are newline-escaped,
> >> so that one line in the logfile is parsed as one log entry. Our
> >> parsers first split on newline to get records, then splits the
> >> fields of the record on the field delimiter to get fields, then it
> >> unescapes the backslash-escapes to get the original data for that
> >> field.
> >
> > You make a good point, it couldn't change and affect current callers
> > of ap_escape_logitem().
>
> IMHO, even ErrorLog shouldn't contain splitted lines (w/o "[date]
> [level] [pid]" prefix).