A general event header and footer would probably be overkill. I’m using the following interface in Lilith:
public interface WriteByteStrategy {
    void writeBytes(DataOutputStream dataOutputStream, byte[] bytes) throws IOException;
}

One implementation writes an int with the length of "bytes" before "bytes"; the other implementation writes a 0-byte after "bytes". The problem with that approach is that it requires the intermediate "bytes" array and is thus probably incompatible with your zero-garbage approach.

I didn't try end-of-line (\r\n), mainly because I first wanted to discuss this with you. I would also have to implement a new receiver type, whereas I could just reuse an existing one in the case of 0-byte termination.

I'd prefer 0-byte termination since it would be less fragile than end-of-line: the end-of-line approach would break if compact="false" was set erroneously, while with 0-byte termination you'd still have the option of using either compact="true" or compact="false". 0-byte termination is also slightly more efficient while reading, since the splitting can operate on bytes instead of chars or Strings - another reason why I didn't just implement it and instead wanted to discuss this with you first.

Joern

On 18. July 2017 at 03:26:32, Ralph Goers (ralph.go...@dslextreme.com) wrote:

> Right now that is all handled by the specific layouts. For example, by
> default the RFC5424Layout doesn't append newlines, so when writing to a
> file they will all be on the same "line", but it has an option to append
> one if needed. Doing the same would be another option for the JSONLayout.
>
> I'd be ok with event headers and footers, but ONLY if they have zero
> overhead when not present. IOW, when the Layout is initialized it would
> have to wire in the appropriate encode method.
>
> Ralph
>
> > On Jul 17, 2017, at 3:06 PM, Gary Gregory wrote:
> >
> > Do we want a general event header and footer then?
> >
> > Gary
> >
> > On Jul 17, 2017 14:43, "Ralph Goers" wrote:
> >
> >> No. A Footer is only used at end of file.
> >> He needs to know how long each event is, or when it is the start of a
> >> new event.
> >>
> >> Ralph
> >>
> >>> On Jul 17, 2017, at 12:32 PM, Gary Gregory wrote:
> >>>
> >>> Can't you use a footer for any terminator you wish?
> >>>
> >>> Gary
> >>>
> >>> On Mon, Jul 17, 2017 at 12:13 PM, Mikael Ståldal wrote:
> >>>
> >>>> Hi.
> >>>>
> >>>> (Moving this discussion to the logging dev mailing list.)
> >>>>
> >>>> Have you tried to use:
> >>>>
> >>>> <JsonLayout compact="true" eventEol="true"/>
> >>>>
> >>>> Then each log event will be terminated by end-of-line (\r\n).
> >>>>
> >>>> I think it would be easy to implement 0-byte terminated log events in
> >>>> JsonLayout, and that would make sense since we have implemented support
> >>>> for that in GelfLayout. Created a JIRA issue for it:
> >>>> https://issues.apache.org/jira/browse/LOG4J2-1981
> >>>>
> >>>> As for other tools, the only receivers for Log4j 2 SerializedLayout we
> >>>> know of are Log4j's own SocketServer and Lilith. Chainsaw currently only
> >>>> supports Log4j 1 SerializedLayout. Log4j's own SocketServer supports
> >>>> JsonLayout and XmlLayout as well, and we are changing it to use
> >>>> JsonLayout by default.
> >>>>
> >>>> On 2017-07-17 15:01, Joern Huxhorn wrote:
> >>>>
> >>>>> Hi Mikael,
> >>>>>
> >>>>> I've just taken a look at the JsonLayout and see no way to parse its
> >>>>> output event by event.
> >>>>>
> >>>>> It's semi-fine for files, because then it will result in a well-formed
> >>>>> array at the root level that can be read in one big swoop. It isn't
> >>>>> really ideal for that use case either, since it means that the whole
> >>>>> array containing all events needs to be read into memory in one piece,
> >>>>> causing OOM in the case of huge files.
> >>>>>
> >>>>> But in Lilith, I really need to process events one by one. This is
> >>>>> absolutely crucial.
> >>>>>
> >>>>> The only way to achieve this would essentially require me to
> >>>>> re-implement a JSON parser that can cope with data like this.
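[Editor's illustration, not part of the original mails: splitting JsonLayout's default root-array output event by event means tracking brace depth while also handling strings and escape sequences, which is already a small JSON reader. A minimal sketch of that approach, assuming well-formed input; the class and method names are hypothetical:]

```java
import java.util.ArrayList;
import java.util.List;

public class JsonEventSplitter {
    // Splits the top-level array produced by JsonLayout into one string per
    // event object. Tracks brace depth, in-string state, and escapes; even
    // this minimal version is effectively a partial JSON reader.
    public static List<String> split(String json) {
        List<String> events = new ArrayList<>();
        int depth = 0;            // brace nesting relative to the root array
        boolean inString = false; // inside a JSON string literal?
        boolean escaped = false;  // previous char was a backslash?
        int start = -1;           // index of the current event's opening '{'
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue; // braces inside strings must be ignored
            }
            switch (c) {
                case '"': inString = true; break;
                case '{': if (depth++ == 0) start = i; break;
                case '}': if (--depth == 0) events.add(json.substring(start, i + 1)); break;
                default:  break; // the root array's '[' ']' and ',' are skipped
            }
        }
        return events;
    }
}
```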
> >>>>>
> >>>>> I implemented a similar kludge to support reading of log4j xml files
> >>>>> (see
> >>>>> https://github.com/huxi/lilith/blob/master/log4j/log4j-xml/src/main/java/de/huxhorn/lilith/log4j/xml/Log4jImportCallable.java
> >>>>> ), but this was way easier since I could simply search for
> >>>>> "</log4j:event>" to find the position at which I could split the stream
> >>>>> of events into a valid XML document for every single event. This was
> >>>>> still kind of nasty but doable, especially since processing an "offline"
> >>>>> XML file doesn't have the same performance restrictions as handling of
> >>>>> live events.
> >>>>>
> >>>>> I wouldn't have the luxury of such a unique split signal in JSON,
> >>>>> though.
> >>>>>
> >>>>> I would need to detect the "}" at the level of the root array, which
> >>>>> would involve not just counting opening and closing brackets but also
> >>>>> ignoring brackets inside strings. This would boil down to implementing
> >>>>> a custom JSON reader, and I won't be doing this.
> >>>>>
> >>>>> This isn't just me being lazy, either. Such an implementation wouldn't
> >>>>> be resource (CPU/memory) friendly anyway.
> >>>>>
> >>>>> In my own JSON receiver, using my own JSON format, there are two ways
> >>>>> to send those events:
> >>>>> - write an int containing the number of bytes representing the event,
> >>>>>   then the bytes of the event. This type of event also supports
> >>>>>   compression.
> >>>>> - write the bytes of the event, followed by a 0-byte. This works fine,
> >>>>>   and JavaScript/ActionScript is able to produce events like that while
> >>>>>   being unable (or at least unable 7 years ago) to count the bytes on
> >>>>>   their own. This type of event only supports plain text like JSON or
> >>>>>   XML, and no compression, since compressed events could contain the
> >>>>>   0-byte while XML and JSON both won't allow it.
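[Editor's sketch, not part of the original mails: the two framing modes described here match the two WriteByteStrategy implementations mentioned at the top of the thread. The implementation class names below are illustrative, not Lilith's actual ones:]

```java
import java.io.DataOutputStream;
import java.io.IOException;

// Interface as quoted from Lilith.
interface WriteByteStrategy {
    void writeBytes(DataOutputStream dataOutputStream, byte[] bytes) throws IOException;
}

// Length-prefix framing: an int with the length of "bytes", then the bytes.
// DataOutputStream.writeInt emits 4 bytes, big-endian.
class LengthPrefixStrategy implements WriteByteStrategy {
    @Override
    public void writeBytes(DataOutputStream out, byte[] bytes) throws IOException {
        out.writeInt(bytes.length);
        out.write(bytes);
    }
}

// 0-byte termination: the bytes, then a single 0-byte. Only safe for
// payloads such as JSON or XML that can never contain 0x00.
class ZeroTerminationStrategy implements WriteByteStrategy {
    @Override
    public void writeBytes(DataOutputStream out, byte[] bytes) throws IOException {
        out.write(bytes);
        out.write(0);
    }
}
```

Note how both strategies need the complete event as a byte[] up front, which is the incompatibility with a zero-garbage encoder mentioned earlier.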
> >>>>>
> >>>>> Both are essentially "binary" formats, because of either the raw ints
> >>>>> or the 0-bytes.
> >>>>>
> >>>>> I'm not sure why you deprecate SerializedLayout, since the security
> >>>>> issues arise only while deserializing the events, not while serializing
> >>>>> them. Is this just meant to educate the user about the issue?
> >>>>>
> >>>>> I fixed the remote code execution exploit by implementing
> >>>>> https://github.com/huxi/lilith/blob/master/lilith-engine/src/main/java/de/huxhorn/lilith/engine/impl/eventproducer/WhitelistObjectInputStream.java
> >>>>>
> >>>>> This still doesn't fix DoS scenarios (endless loops or OOM in the case
> >>>>> of malicious data), but it's close enough for me, mainly because fixing
> >>>>> the DoS issues would really have to be handled by Java, not by user
> >>>>> code. Manually re-implementing deserialization would likely be fragile
> >>>>> and error-prone, possibly even introducing other security issues of its
> >>>>> own.
> >>>>>
> >>>>> I'd be interested in how other tools are tackling this issue. Is
> >>>>> Chainsaw officially dead, or are they implementing receivers for event
> >>>>> streams like this?
> >>>>>
> >>>>> I'm totally fine with implementing a new receiver as a replacement for
> >>>>> the log4j2 SerializedLayout one (even though I won't be able to get rid
> >>>>> of the Serializable receiver anyway), but the current JsonLayout just
> >>>>> isn't up to the job at the moment.
> >>>>>
> >>>>> Feel free to share this mail with your fellow developers. I'm open to
> >>>>> suggestions.
> >>>>>
> >>>>> I hope this doesn't come across in a harsh way. It isn't meant to be.
> >>>>>
> >>>>> Cheers,
> >>>>> Jörn.
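[Editor's sketch, appended for reference: the receiver side of the 0-byte termination argued for in this thread. Splitting can operate on raw bytes, deferring any char or String handling until a complete event has been framed. The class name is hypothetical, not an actual Lilith or Log4j type:]

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ZeroByteEventReader {
    // Reads one 0-byte-terminated event from the stream, operating purely on
    // bytes. Returns null once the stream is exhausted.
    public static byte[] readEvent(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == 0) {
                return buf.toByteArray(); // terminator found: one complete event
            }
            buf.write(b);
        }
        // End of stream: hand back a trailing unterminated event, if any.
        return buf.size() > 0 ? buf.toByteArray() : null;
    }
}
```

Unlike the end-of-line variant, this framing keeps working regardless of whether the layout pretty-prints the JSON, since neither JSON nor XML may contain a raw 0x00.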