Yes, I am putting it in its own Java package.

On 2017-07-22 22:13, Gary Gregory wrote:
Should this API sit in its own package?

On Jul 22, 2017 11:41, "Mikael Ståldal" <mi...@apache.org> wrote:

I'll give it a shot:

https://issues.apache.org/jira/browse/LOG4J2-1986


On 2017-07-22 20:28, Ralph Goers wrote:

Yes, that is a good idea. Probably XML and YAML as well.

Ralph

On Jul 22, 2017, at 11:21 AM, Mikael Ståldal <mi...@apache.org> wrote:

But here we have a concrete use case: a third-party tool, Lilith, needs
to parse log events from our JsonLayout. (Our SocketServer in log4j-server
has the same need, and it uses this class.)

Should we provide a public API in log4j-core for doing so, which Lilith,
our SocketServer, and possibly others (Chainsaw maybe?) can use? Maybe such
a public API should be a facade around this which hides the fact that we
use Jackson.
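
For illustration, such a facade could look something like this (a rough
sketch only; all names are hypothetical, nothing like this exists in
log4j-core yet):

import java.io.IOException;
import org.apache.logging.log4j.core.LogEvent;

// Hypothetical facade: callers parse events without ever seeing Jackson.
public interface LogEventParser {
    /** Parses a single log event from its serialized form. */
    LogEvent parse(byte[] input, int offset, int length) throws IOException;

    /** Convenience overload for a whole buffer. */
    default LogEvent parse(byte[] input) throws IOException {
        return parse(input, 0, input.length);
    }
}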


On 2017-07-22 19:04, Gary Gregory wrote:

Hi All,
Just like anything in log4j-core, we try not to break binary compatibility
when we can, but we do not bend over backwards like we do for log4j-api.
This class is public probably because Jackson requires it to be; otherwise
I would have made it package private. The fact that we use Jackson for JSON
IO is an implementation detail IMO, not an extension point. If anyone wants
to hard-code a reference to log4j-core's guts, then I think it's fair to
say that this comes with caveats.
Gary
On Sat, Jul 22, 2017 at 7:27 AM, Jörn Huxhorn <jhuxh...@googlemail.com>
wrote:

Well… it says

/**
 * A Jackson JSON {@link ObjectMapper} initialized for Log4j.
 * <p>
 * <em>Consider this class private.</em>
 * </p>
 */

so I wanted to holler about how to continue.

I’ll give it a shot.


On 22. July 2017 at 15:06:28, Mikael Ståldal (mi...@apache.org) wrote:

Would it work to use
org.apache.logging.log4j.core.jackson.Log4jJsonObjectMapper, which is
public?

See here how it is used:

https://github.com/apache/logging-log4j-tools/blob/master/log4j-server/src/main/java/org/apache/logging/log4j/server/JsonInputStreamLogEventBridge.java
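
Something like this, roughly (a minimal sketch; the constructor arguments
vary between versions, and using Log4jLogEvent as the target class matches
what the bridge linked above does):

import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.impl.Log4jLogEvent;
import org.apache.logging.log4j.core.jackson.Log4jJsonObjectMapper;

public class JsonEventReader {
    private final Log4jJsonObjectMapper mapper = new Log4jJsonObjectMapper();

    /** Deserializes one JSON-encoded log event. */
    public LogEvent read(String json) throws java.io.IOException {
        return mapper.readValue(json, Log4jLogEvent.class);
    }
}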



On 2017-07-22 13:47, Jörn Huxhorn wrote:

It seems like I don’t get access to the log4j2 ObjectMapper or Jackson
Module used for JSON serialization, i.e. I can’t do something like this:

new JacksonFactory.JSON(encodeThreadContextAsList, includeStacktrace, false)

It would make sense to reuse your ObjectMapper/Module in my code instead of
reimplementing them to prevent future compatibility issues down the road.



On 19. July 2017 at 23:04:26, Jörn Huxhorn (jhuxh...@googlemail.com) wrote:

I’ll give this a shot over the next few days.

Thanks!


On 19. July 2017 at 21:33:49, Mikael Ståldal (mi...@apache.org) wrote:

JsonLayout (and XmlLayout and YamlLayout) now supports 0-byte
termination of log events. Will be part of the upcoming 2.9
release.
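
For reference, enabling it programmatically would look roughly like this
(a sketch; I'm assuming the builder option ends up being called
includeNullDelimiter, matching GelfLayout's existing option, so check the
released 2.9 API for the final names):

import org.apache.logging.log4j.core.layout.JsonLayout;

public class NullDelimitedJson {
    public static JsonLayout create() {
        // compact output, each event terminated by a 0-byte
        return JsonLayout.newBuilder()
                .setCompact(true)
                .setIncludeNullDelimiter(true) // assumed option name
                .build();
    }
}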


On 2017-07-18 11:48, Jörn Huxhorn wrote:

A general event header and footer would probably be overkill.

I’m using the following interface in Lilith:

public interface WriteByteStrategy {
    void writeBytes(DataOutputStream dataOutputStream, byte[] bytes)
        throws IOException;
}

One implementation writes an int with the length of bytes before “bytes”,
the other implementation writes a 0-byte after “bytes”.
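
Spelled out, the two implementations are essentially this (a sketch; the
actual Lilith classes may be named differently):

import java.io.DataOutputStream;
import java.io.IOException;

class LengthPrefixingStrategy implements WriteByteStrategy {
    @Override
    public void writeBytes(DataOutputStream dataOutputStream, byte[] bytes)
            throws IOException {
        dataOutputStream.writeInt(bytes.length); // length prefix first
        dataOutputStream.write(bytes);
    }
}

class ZeroByteTerminatingStrategy implements WriteByteStrategy {
    @Override
    public void writeBytes(DataOutputStream dataOutputStream, byte[] bytes)
            throws IOException {
        dataOutputStream.write(bytes);
        dataOutputStream.writeByte(0); // 0-byte terminator after the payload
    }
}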


The problem with that approach is that it requires the intermediate “bytes”
array and is thus probably incompatible with your zero-garbage approach.


I didn’t try end-of-line (\r\n) mainly because I first wanted to discuss
this with you. I would also have to implement a new receiver type whereas I
could just reuse an existing one in case of 0-byte termination.


I’d prefer using 0-byte termination since it would be less fragile than
end-of-line. The end-of-line approach would break if compact=“false” was
set erroneously, while you’d still have the option of using either
compact=“true” or compact=“false” in case of 0-byte termination. 0-byte
termination is also slightly more efficient while reading since the
splitting can operate on bytes instead of chars or Strings - another reason
why I didn’t just implement it and instead wanted to discuss this with you
first.
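
To illustrate the byte-level splitting (a sketch, not the actual Lilith
code):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ZeroByteEventReader {
    /** Reads one 0-byte-terminated event; returns null at end of stream. */
    static byte[] readEvent(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == 0) {
                return buffer.toByteArray(); // complete event, no char decoding
            }
            buffer.write(b);
        }
        return null;
    }
}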


Joern


On 18. July 2017 at 03:26:32, Ralph Goers (ralph.go...@dslextreme.com) wrote:

Right now that is all handled by the specific layouts. For example, by
default the RFC5424Layout doesn’t append newlines so when writing to a file
they will all be on the same “line”, but it has an option to append one if
needed. Doing the same would be another option for the JSONLayout.

I’d be ok with event headers and footers but ONLY if they have zero
overhead when not present. IOW, when the Layout is initialized it would
have to wire in the appropriate encode method.
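
Something like this, perhaps (a sketch of the idea; all names invented):

class DelimitedEncoder {
    interface Terminator {
        void append(StringBuilder target);
    }

    private final Terminator terminator;

    DelimitedEncoder(boolean zeroByteTermination) {
        // wired in once at initialization; the common no-terminator
        // case pays no per-event cost
        this.terminator = zeroByteTermination
                ? target -> target.append('\0')
                : target -> { };
    }

    void encode(String serializedEvent, StringBuilder target) {
        target.append(serializedEvent);
        terminator.append(target);
    }
}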


Ralph

On Jul 17, 2017, at 3:06 PM, Gary Gregory wrote:

Do we want a general event header and footer then?

Gary

On Jul 17, 2017 14:43, "Ralph Goers" wrote:

No. A Footer is only used at end of file. He needs to know how long each
event is or when it is the start of a new event.

Ralph

On Jul 17, 2017, at 12:32 PM, Gary Gregory wrote:


Can't you use a footer for any terminator you wish?

Gary

On Mon, Jul 17, 2017 at 12:13 PM, Mikael Ståldal wrote:


Hi.

(Moving this discussion to logging dev mailing list.)

Have you tried to use:

<JsonLayout compact="true" eventEol="true"/>

Then each log event will be terminated by end-of-line (\r\n).

I think it would be easy to implement 0-byte terminated log events in
JsonLayout, and that would make sense since we have implemented support for
that in GelfLayout. Created a JIRA issue for it:
https://issues.apache.org/jira/browse/LOG4J2-1981

As for other tools, the only receivers for Log4j 2 SerializedLayout we know
of are Log4j's own SocketServer and Lilith. Chainsaw currently only
supports Log4j 1 SerializedLayout. Log4j's own SocketServer supports
JsonLayout and XmlLayout as well, and we are changing it to use JsonLayout
by default.


On 2017-07-17 15:01, Joern Huxhorn wrote:

Hi Mikael,

I've just taken a look at the JsonLayout and see no way to parse its output
event by event.

It's semi-fine for files because then it will result in a well-formed array
at the root level that can be read in one big swoop. It isn't really ideal
for that use case either since it means that the whole array containing all
events needs to be read into memory in one piece, causing OOM in case of
huge files.

But in Lilith, I really need to process events one by one. This is
absolutely crucial.

The only way to achieve this would essentially require me to re-implement a
JSON parser that can cope with data like this.

I implemented a similar kludge to support reading of log4j xml files (see
https://github.com/huxi/lilith/blob/master/log4j/log4j-xml/src/main/java/de/huxhorn/lilith/log4j/xml/Log4jImportCallable.java
) but this was way easier since I could simply search for "</log4j:event>"
to find the position at which I could split the stream of events into a
valid XML document for every single event. This was still kind of nasty but
doable, especially since processing an "offline" XML file doesn't have the
same performance restriction as handling of live events.

I wouldn't have the luxury of such a unique split signal in JSON, though.


I would need to detect the "}" at the level of the root array, which would
not simply involve counting of opening and closing brackets but also
ignoring of brackets inside of strings. This would boil down to
implementing a custom JSON reader and I won't be doing this.

This isn't just me being lazy either. Such an implementation wouldn't be
resource (CPU/memory) friendly anyway.

In my own JSON receiver using my own JSON format there are two ways to send
those events:
- write an int containing the number of bytes representing the event, then
the bytes of the event. This type of event also supports compression.
- write the bytes of the event, followed by a 0-byte. This works fine, and
JavaScript/ActionScript is able to produce events like that while they are
unable (or were unable 7 years ago) to count the bytes on their own. This
type of event only supports plain text like JSON or XML and no compression,
since compressed events could contain the 0-byte while XML and JSON both
won't allow it.

Both are essentially "binary" formats because of either the raw ints or the
0-bytes.

I'm not sure why you deprecate SerializedLayout since the security issues
arise only while deserializing the events, not while serializing them. Is
this just meant to educate the user about the issue?

I fixed the remote code execution exploit by implementing
https://github.com/huxi/lilith/blob/master/lilith-engine/src/main/java/de/huxhorn/lilith/engine/impl/eventproducer/WhitelistObjectInputStream.java
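
The core of the technique is overriding resolveClass (a generic sketch of
the approach, not the actual Lilith code):

import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.util.Set;

public class WhitelistingObjectInputStream extends ObjectInputStream {
    private final Set<String> allowedClassNames;

    public WhitelistingObjectInputStream(InputStream in, Set<String> allowedClassNames)
            throws IOException {
        super(in);
        this.allowedClassNames = allowedClassNames;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        // reject any class not explicitly whitelisted, before it is loaded
        if (!allowedClassNames.contains(desc.getName())) {
            throw new InvalidClassException("Unauthorized class: " + desc.getName());
        }
        return super.resolveClass(desc);
    }
}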

This still doesn't fix DoS scenarios (endless loops or OOM in case of
malicious data) but is close enough for me, mainly because fixing the DoS
issues would really have to be handled by Java, not by user code.

Manually re-implementing deserialization would likely be fragile and
error-prone, possibly even introducing other security issues of its own.


I'd be interested in how other tools are tackling this issue. Is Chainsaw
officially dead, or are they implementing receivers for event streams like
this?

I'm totally fine with implementing a new receiver as a replacement for
log4j2 SerializedLayout (even though I won't be able to get rid of the
Serializable receiver anyway), but the current JsonLayout just isn't up to
the job at the moment.

Feel free to share this mail with your fellow developers. I'm open to
suggestions.

I hope this doesn't come across in a harsh way. It isn't meant to be.


Cheers,
Jörn.
