Yes, that is a good idea. Probably XML and YAML as well.

Ralph

> On Jul 22, 2017, at 11:21 AM, Mikael Ståldal <mi...@apache.org> wrote:
> 
> But here we have a concrete use case: a third-party tool, Lilith, needs to 
> parse log events from our JsonLayout. (Our SocketServer in log4j-server has 
> the same need, and it uses this class.)
> 
> Should we provide a public API in log4j-core for doing so, which Lilith, our 
> SocketServer, and possibly others (Chainsaw maybe?) can use? Maybe such a 
> public API should be a facade around this that hides the fact that we use 
> Jackson.
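A purely hypothetical sketch of what such a facade could look like (no API of this
shape exists in log4j-core; the interface name and method are invented for
illustration):

    // Hypothetical facade hiding the Jackson-based implementation detail.
    public interface LogEventJsonReader {
        org.apache.logging.log4j.core.LogEvent readLogEvent(String json) throws java.io.IOException;
    }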
> 
> 
> On 2017-07-22 19:04, Gary Gregory wrote:
>> Hi All,
>> Just like anything in log4j-core, we try not to break binary compatibility
>> when we can, but we do not bend over backwards like we do for log4j-api.
>> This class is public probably because Jackson requires it to be, otherwise
>> I would have made it package private. The fact that we use Jackson for JSON
>> IO is an implementation detail IMO, not an extension point. If anyone wants
>> to hard-code a reference to log4j-core's guts then I think it's fair to say
>> that this comes with caveats.
>> Gary
>> On Sat, Jul 22, 2017 at 7:27 AM, Jörn Huxhorn <jhuxh...@googlemail.com>
>> wrote:
>>> Well… it says
>>> 
>>> /**
>>>  * A Jackson JSON {@link ObjectMapper} initialized for Log4j.
>>>  * <p>
>>>  * <em>Consider this class private.</em>
>>>  * </p>
>>>  */
>>> 
>>> so I wanted to holler about how to continue.
>>> 
>>> I’ll give it a shot.
>>> 
>>> 
>>> On 22. July 2017 at 15:06:28, Mikael Ståldal (mi...@apache.org) wrote:
>>>> Would it work to use
>>>> org.apache.logging.log4j.core.jackson.Log4jJsonObjectMapper, which is
>>>> public?
>>>> 
>>>> See here how it is used:
>>>> 
>>>> https://github.com/apache/logging-log4j-tools/blob/master/log4j-server/src/main/java/org/apache/logging/log4j/server/JsonInputStreamLogEventBridge.java
>>>> 
>>>> 
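A minimal sketch (not part of the original thread) of what parsing a single
JsonLayout-formatted event with the public Log4jJsonObjectMapper might look like;
only Log4jJsonObjectMapper, LogEvent, and Log4jLogEvent are actual log4j-core
classes, the wrapper class and method names are illustrative:

    import org.apache.logging.log4j.core.LogEvent;
    import org.apache.logging.log4j.core.impl.Log4jLogEvent;
    import org.apache.logging.log4j.core.jackson.Log4jJsonObjectMapper;

    public class JsonLogEventParser {
        // Jackson ObjectMapper pre-configured with Log4j's serializers and mix-ins.
        private final Log4jJsonObjectMapper mapper = new Log4jJsonObjectMapper();

        // Deserializes one JsonLayout event (a single JSON object) back into a LogEvent.
        public LogEvent parse(String json) throws java.io.IOException {
            return mapper.readValue(json, Log4jLogEvent.class);
        }
    }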
>>>> On 2017-07-22 13:47, Jörn Huxhorn wrote:
>>>>> It seems like I don’t get access to the log4j2 ObjectMapper or Jackson
>>>>> Module used for JSON serialization, i.e. I can’t do something like this:
>>>>> new JacksonFactory.JSON(encodeThreadContextAsList, includeStacktrace, false)
>>>>> 
>>>>> It would make sense to reuse your ObjectMapper/Module in my code instead
>>>>> of reimplementing them to prevent future compatibility issues down the
>>>>> road.
>>>>> 
>>>>> 
>>>>> On 19. July 2017 at 23:04:26, Jörn Huxhorn (jhuxh...@googlemail.com) wrote:
>>>>>> I’ll give this a shot over the next few days.
>>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> 
>>>>>> On 19. July 2017 at 21:33:49, Mikael Ståldal (mi...@apache.org) wrote:
>>>>>>> JsonLayout (and XmlLayout and YamlLayout) now supports 0-byte
>>>>>>> termination of log events. Will be part of the upcoming 2.9 release.
>>>>>>> 
>>>>>>> 
>>>>>>> On 2017-07-18 11:48, Jörn Huxhorn wrote:
>>>>>>>> A general event header and footer would probably be overkill.
>>>>>>>> 
>>>>>>>> I’m using the following interface in Lilith:
>>>>>>>> 
>>>>>>>> public interface WriteByteStrategy {
>>>>>>>>     void writeBytes(DataOutputStream dataOutputStream, byte[] bytes)
>>>>>>>>         throws IOException;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> One implementation writes an int with the length of bytes before “bytes”,
>>>>>>>> the other implementation writes a 0-byte after “bytes”.
>>>>>>>> 
>>>>>>>> The problem with that approach is that it requires the intermediate
>>>>>>>> “bytes” array and is thus probably incompatible with your zero garbage
>>>>>>>> approach.
>>>>>>>> 
>>>>>>>> I didn’t try end-of-line (\r\n) mainly because I first wanted to discuss
>>>>>>>> this with you. I would also have to implement a new receiver type whereas
>>>>>>>> I could just reuse an existing one in case of 0-byte termination.
>>>>>>>> 
>>>>>>>> I’d prefer using 0-byte termination since it would be less fragile than
>>>>>>>> end-of-line. The end-of-line approach would break if compact=“false” was
>>>>>>>> set erroneously while you’d still have the option of using either
>>>>>>>> compact=“true” or compact=“false” in case of 0-byte termination. 0-byte
>>>>>>>> termination is also slightly more efficient while reading since the
>>>>>>>> splitting can operate on bytes instead of chars or Strings - another
>>>>>>>> reason why I didn’t just implement it and instead wanted to discuss this
>>>>>>>> with you first.
>>>>>>>> 
>>>>>>>> Joern
>>>>>>>> 
>>>>>>>> 
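For illustration, a rough sketch of the two WriteByteStrategy implementations
described above (these are not Lilith's actual classes; the class names are
invented, only the interface comes from the quoted mail):

    import java.io.DataOutputStream;
    import java.io.IOException;

    // Length-prefixed framing: a 4-byte int with the payload size, then the payload.
    class LengthPrefixedWriteByteStrategy implements WriteByteStrategy {
        @Override
        public void writeBytes(DataOutputStream dataOutputStream, byte[] bytes) throws IOException {
            dataOutputStream.writeInt(bytes.length);
            dataOutputStream.write(bytes);
        }
    }

    // 0-byte-terminated framing: the payload, then a single 0-byte terminator.
    class ZeroTerminatedWriteByteStrategy implements WriteByteStrategy {
        @Override
        public void writeBytes(DataOutputStream dataOutputStream, byte[] bytes) throws IOException {
            dataOutputStream.write(bytes);
            dataOutputStream.write(0);
        }
    }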
>>>>>>>> On 18. July 2017 at 03:26:32, Ralph Goers (ralph.go...@dslextreme.com) wrote:
>>>>>>>>> Right now that is all handled by the specific layouts. For example, by
>>>>>>>>> default the RFC5424Layout doesn’t append newlines so when writing to a
>>>>>>>>> file they will all be on the same “line”, but it has an option to append
>>>>>>>>> one if needed. Doing the same would be another option for the JSONLayout.
>>>>>>>>> 
>>>>>>>>> I’d be ok with event header and footers but ONLY if they have zero
>>>>>>>>> overhead when not present. IOW, when the Layout is initialized it would
>>>>>>>>> have to wire in the appropriate encode method.
>>>>>>>>> 
>>>>>>>>> Ralph
>>>>>>>>> 
>>>>>>>>>> On Jul 17, 2017, at 3:06 PM, Gary Gregory wrote:
>>>>>>>>>> 
>>>>>>>>>> Do we want a general event header and footer then?
>>>>>>>>>> 
>>>>>>>>>> Gary
>>>>>>>>>> 
>>>>>>>>>> On Jul 17, 2017 14:43, "Ralph Goers" wrote:
>>>>>>>>>> 
>>>>>>>>>>> No. A Footer is only used at end of file. He needs to know how long
>>>>>>>>>>> each event is or when it is the start of a new event.
>>>>>>>>>>> 
>>>>>>>>>>> Ralph
>>>>>>>>>>> 
>>>>>>>>>>> On Jul 17, 2017, at 12:32 PM, Gary Gregory wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Can't you use a footer for any terminator you wish?
>>>>>>>>>>>> 
>>>>>>>>>>>> Gary
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Jul 17, 2017 at 12:13 PM, Mikael Ståldal wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> (Moving this discussion to logging dev mailing list.)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Have you tried to use:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> <JsonLayout compact="true" eventEol="true" />
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Then each log event will be terminated by end-of-line (\r\n).
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I think it would be easy to implement 0-byte terminated log events in
>>>>>>>>>>>>> JsonLayout, and that would make sense since we have implemented support
>>>>>>>>>>>>> for that in GelfLayout. Created a JIRA issue for it:
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LOG4J2-1981
>>>>>>>>>>>>> 
>>>>>>>>>>>>> As for other tools, the only receivers for Log4j 2 SerializedLayout we
>>>>>>>>>>>>> know of are Log4j's own SocketServer and Lilith. Chainsaw currently only
>>>>>>>>>>>>> supports Log4j 1 SerializedLayout. Log4j's own SocketServer supports
>>>>>>>>>>>>> JsonLayout and XmlLayout as well, and we are changing it to use
>>>>>>>>>>>>> JsonLayout by default.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 2017-07-17 15:01, Joern Huxhorn wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Mikael,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I've just taken a look at the JsonLayout and see no way to parse its
>>>>>>>>>>>>>> output event by event.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It's semi-fine for files because then it will result in a well-formed
>>>>>>>>>>>>>> array at the root level that can be read in one big swoop. It isn't
>>>>>>>>>>>>>> really ideal for that use case either since it means that the whole
>>>>>>>>>>>>>> array containing all events needs to be read into memory in one piece,
>>>>>>>>>>>>>> causing OOM in case of huge files.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> But in Lilith, I really need to process events one by one. This is
>>>>>>>>>>>>>> absolutely crucial.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The only way to achieve this would essentially require me to
>>>>>>>>>>>>>> re-implement a JSON parser that can cope with data like this.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I implemented a similar kludge to support reading of log4j xml files
>>>>>>>>>>>>>> (see
>>>>>>>>>>>>>> https://github.com/huxi/lilith/blob/master/log4j/log4j-xml/src/main/java/de/huxhorn/lilith/log4j/xml/Log4jImportCallable.java
>>>>>>>>>>>>>> ) but this was way easier since I could simply search for
>>>>>>>>>>>>>> "</log4j:event>" to find the position at which I could split the stream
>>>>>>>>>>>>>> of events into a valid XML document for every single event. This was
>>>>>>>>>>>>>> still kind of nasty but doable, especially since processing an "offline"
>>>>>>>>>>>>>> XML file doesn't have the same performance restriction as handling of
>>>>>>>>>>>>>> live events.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I wouldn't have the luxury of such a unique split signal in JSON,
>>>>>>>>>>>>>> though.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I would need to detect the "}" at the level of the root array which
>>>>>>>>>>>>>> would not simply involve counting of opening and closing brackets but
>>>>>>>>>>>>>> also ignoring of brackets inside of strings. This would boil down to
>>>>>>>>>>>>>> implementing a custom JSON reader and I won't be doing this.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This isn't just me being lazy either. Such an implementation wouldn't
>>>>>>>>>>>>>> be resource (CPU/memory) friendly anyway.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In my own JSON receiver using my own JSON format there are two ways to
>>>>>>>>>>>>>> send those events:
>>>>>>>>>>>>>> - write an int containing the amount of bytes representing the event,
>>>>>>>>>>>>>> then the bytes of the event. This type of events also supports
>>>>>>>>>>>>>> compression.
>>>>>>>>>>>>>> - write the bytes of the event, followed by a 0-byte. This works fine
>>>>>>>>>>>>>> and JavaScript/ActionScript is able to produce events like that while
>>>>>>>>>>>>>> they are unable (or were unable 7 years ago) to count the bytes on
>>>>>>>>>>>>>> their own. This type of events only supports plain text like JSON or
>>>>>>>>>>>>>> XML and no compression since compressed events could contain the
>>>>>>>>>>>>>> 0-byte while XML and JSON both won't allow it.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Both are essentially "binary" formats because of either the raw ints or
>>>>>>>>>>>>>> the 0-bytes.
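A rough sketch of the receiving side for the 0-byte-terminated variant, splitting
the raw stream on the terminator before handing each chunk to a JSON parser (the
class and method names here are illustrative, not Lilith's actual receiver):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;
    import java.util.function.Consumer;

    class ZeroTerminatedEventReader {
        // Reads until end of stream, emitting one UTF-8 string per 0-byte-terminated event.
        static void readEvents(InputStream in, Consumer<String> handler) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                if (b == 0) {
                    // Terminator reached: hand the accumulated event bytes to the handler.
                    handler.accept(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
                    buffer.reset();
                } else {
                    buffer.write(b);
                }
            }
        }
    }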
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'm not sure why you deprecate SerializedLayout since the security
>>>>>>>>>>>>>> issues arise only while deserializing the events, not while serializing
>>>>>>>>>>>>>> them. Is this just meant to educate the user about the issue?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I fixed the remote code execution exploit by implementing
>>>>>>>>>>>>>> https://github.com/huxi/lilith/blob/master/lilith-engine/src/main/java/de/huxhorn/lilith/engine/impl/eventproducer/WhitelistObjectInputStream.java
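For context, a generic sketch of the whitelisting technique (this is not the linked
WhitelistObjectInputStream, just the standard pattern of restricting resolveClass to
an explicit set of class names):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InvalidClassException;
    import java.io.ObjectInputStream;
    import java.io.ObjectStreamClass;
    import java.util.Set;

    // ObjectInputStream that refuses to resolve any class outside an explicit whitelist,
    // so deserialization gadget classes are never instantiated.
    class WhitelistingObjectInputStream extends ObjectInputStream {
        private final Set<String> allowedClassNames;

        WhitelistingObjectInputStream(InputStream in, Set<String> allowedClassNames) throws IOException {
            super(in);
            this.allowedClassNames = allowedClassNames;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
            if (!allowedClassNames.contains(desc.getName())) {
                throw new InvalidClassException(desc.getName(), "class is not whitelisted for deserialization");
            }
            return super.resolveClass(desc);
        }
    }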
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This still doesn't fix DoS scenarios (endless loops or OOM in case of
>>>>>>>>>>>>>> malicious data) but is close enough for me, mainly because fixing the
>>>>>>>>>>>>>> DoS issues would really have to be handled by Java, not by user code.
>>>>>>>>>>>>>> Manually re-implementing deserialization would likely be fragile and
>>>>>>>>>>>>>> error prone, possibly even introducing other security issues on its
>>>>>>>>>>>>>> own.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'd be interested in how other tools are tackling this issue. Is
>>>>>>>>>>>>>> Chainsaw officially dead or are they implementing receivers for event
>>>>>>>>>>>>>> streams like this?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'm totally fine with implementing a new receiver as a replacement for
>>>>>>>>>>>>>> log4j2 SerializedLayout (even though I won't be able to get rid of the
>>>>>>>>>>>>>> Serializable receiver anyway) but the current JsonLayout just isn't up
>>>>>>>>>>>>>> to the job at the moment.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Feel free to share this mail with your fellow developers. I'm open to
>>>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I hope this doesn't come across in a harsh way. It isn't meant to be.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Jörn.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
> 
> 

