[
https://issues.apache.org/jira/browse/LOG4J2-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177735#comment-15177735
]
Ralph Goers edited comment on LOG4J2-1305 at 3/3/16 12:23 PM:
--------------------------------------------------------------
Hessian is a protocol -
https://en.m.wikipedia.org/wiki/Hessian_(Web_service_protocol). While you can
use the library provided by Caucho, it isn't required.
> Binary Layout
> -------------
>
> Key: LOG4J2-1305
> URL: https://issues.apache.org/jira/browse/LOG4J2-1305
> Project: Log4j 2
> Issue Type: New Feature
> Components: Layouts
> Reporter: Remko Popma
> Labels: binary
>
> Logging in a binary format instead of in text can give large performance
> improvements.
> Logging text means going from a LogEvent object to formatted text, and then
> converting this text to bytes. Performance investigations with text-based
> logging formats like PatternLayout (see LOG4J2-930), and encoding Strings to
> bytes (LOG4J2-935, LOG4J2-1151) suggest that formatting and encoding text is
> expensive and imposes limits on the performance that can be achieved.
> A different approach would be to convert the LogEvent to a binary
> representation directly without creating a text representation first. This
> would result in extremely compact log files that are fast to write. The
> trade-off is that a binary log cannot easily be read in a general-purpose
> editor like vi or Notepad. A specialized tool would be needed to either
> display the log or convert it to human-readable form.
> This ticket proposes a simple BinaryLayout, where each LogEvent is logged in
> a binary format.
> *Example BinaryLayout log event record format*
> ||Offset||Type||Log Event Record Field Description||
> |0|long|TimeMillis|
> |8|long|NanoTime|
> |16|int|Level|
> |20|int|Logger name index - string value in separate file|
> |24|int|Thread name index - string value in separate file|
> |28|long|Thread ID|
> |36|int|Thread priority|
> |40|int|Marker index - value & hierarchy in separate file|
> |44|int|Message length|
> |48|int|Message encoder FQCN index|
> |52|byte[]|Message data - offsets below assume 18 bytes of message data|
> |70|int|Throwable data length|
> |74|byte[]|Throwable data - offsets below assume 26 bytes of Throwable data|
> |100|int|ThreadContext key/value pair count|
> |104|int|ThreadContext key index - string value in separate file|
> |108|int|ThreadContext value index - string value in separate file|
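
For illustration, here is a minimal Java sketch of writing one record in the field order of the example table above. The class name, method name and parameters are hypothetical and not part of Log4j. String-valued fields are assumed to be resolved to indexes already, and big-endian byte order is assumed because the ticket still leaves that open.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BinaryRecordSketch {

    /**
     * Encodes one log event record using the field order of the example table.
     * contextIdx holds alternating "key index, value index" entries.
     */
    static ByteBuffer encode(long timeMillis, long nanoTime, int level,
                             int loggerIdx, int threadNameIdx, long threadId,
                             int threadPriority, int markerIdx, int encoderIdx,
                             byte[] message, byte[] throwable, int[] contextIdx) {
        int size = 52 + message.length          // fixed-size fields, then message data
                 + 4 + throwable.length         // Throwable length, then Throwable data
                 + 4 + 4 * contextIdx.length;   // pair count, then key/value indexes
        // Assumption: big-endian; the byte order is still TBD in the proposal.
        ByteBuffer buf = ByteBuffer.allocate(size).order(ByteOrder.BIG_ENDIAN);
        buf.putLong(timeMillis);       // offset 0
        buf.putLong(nanoTime);         // offset 8
        buf.putInt(level);             // offset 16
        buf.putInt(loggerIdx);         // offset 20
        buf.putInt(threadNameIdx);     // offset 24
        buf.putLong(threadId);         // offset 28
        buf.putInt(threadPriority);    // offset 36
        buf.putInt(markerIdx);         // offset 40
        buf.putInt(message.length);    // offset 44
        buf.putInt(encoderIdx);        // offset 48
        buf.put(message);              // offset 52
        buf.putInt(throwable.length);  // offset 52 + message length
        buf.put(throwable);
        buf.putInt(contextIdx.length / 2); // number of key/value pairs
        for (int idx : contextIdx) {
            buf.putInt(idx);
        }
        buf.flip(); // make the buffer readable from offset 0
        return buf;
    }
}
```

With 18 bytes of message data, 26 bytes of Throwable data and one ThreadContext pair, this produces a 112-byte record whose field offsets match the table.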
> *Repeating String Data*
> Repeating String data like thread names, logger names, marker names and
> ThreadContextMap keys and values should be logged only once, after which they
> can be referenced by their index.
> One way to do this is to save string data to a separate file. The main log
> file contains an index (the line number, zero-based) into the string-data
> file instead of the full string. Index -1 means the String value was
> {{null}}. The format of the string-data file can simply be: each unique
> string on a separate line (separated by a '\n' (0x0A) character). Any '\n'
> characters embedded in the string value are Unicode-escaped and written as
> "\u000A".
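
As a rough sketch of the separate string-data file described above, the class below assigns zero-based indexes to unique strings and escapes embedded newlines. It is kept in memory for brevity, and the class and method names are made up for illustration. One thing it does not address: a string that already contains the literal escape text would be indistinguishable from an escaped newline when read back.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StringTableSketch {
    private final Map<String, Integer> indexByValue = new HashMap<>();
    private final List<String> lines = new ArrayList<>();

    /** Returns the zero-based line number for the value, or -1 for null. */
    int indexOf(String value) {
        if (value == null) {
            return -1; // the proposal reserves -1 for null
        }
        Integer existing = indexByValue.get(value);
        if (existing != null) {
            return existing; // already logged once, reuse its index
        }
        int index = lines.size();
        indexByValue.put(value, index);
        // Replace each embedded newline with a six-character escape sequence
        // so that every entry stays on a single line.
        lines.add(value.replace("\n", "\\u000A"));
        return index;
    }

    /** The content of the string-data file: one unique string per line. */
    String fileContent() {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }
}
```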
> An alternative to separate files is interspersing "string-data" records with
> "log event" records. Records could be prefixed with a single byte indicating
> their record type (e.g. '#' (0x23)=header, '\n' (0x0A)=log event, '$'
> (0x24)=string data).
> String-data record format:
> ||Offset||Type||String-Data Record Field Description||
> |0|int|index of the string (each unique String has a unique index)|
> |4|byte[]|the String value, encoded in the standard Java [modified
> UTF-8|https://docs.oracle.com/javase/8/docs/api/java/io/DataInput.html#modified-utf-8]
> format used by
> [DataOutput.writeUTF(String)|https://docs.oracle.com/javase/8/docs/api/java/io/DataOutput.html#writeUTF-java.lang.String-]|
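
A possible shape for the interspersed variant, assuming the '$' (0x24) type byte is followed directly by the string-data record from the table above. The class and method names are hypothetical. Note that writeUTF itself adds a two-byte length prefix and rejects strings whose encoded form exceeds 65535 bytes, so very long values would need another encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class StringDataRecordSketch {
    static final byte STRING_DATA = '$'; // 0x24, record type from the description

    /** Writes: type byte, int index, then the value in modified UTF-8. */
    static byte[] write(int index, String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(STRING_DATA);
            out.writeInt(index);  // DataOutputStream writes ints big-endian
            out.writeUTF(value);  // two-byte length prefix + modified UTF-8 bytes
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a record back and returns it as "index=value". */
    static String read(byte[] record) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(record));
            byte type = in.readByte();
            if (type != STRING_DATA) {
                throw new IOException("not a string-data record: " + type);
            }
            int index = in.readInt();
            return index + "=" + in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```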
> *Custom Messages*
> Note: custom Messages that implement the {{Encoder}} interface (introduced
> with LOG4J2-1274) can be written in binary form directly without first being
> converted to text (LOG4J2-506). Any specialized tool for reading binary log
> files should handle messages of type "text" out of the box, but could have
> some plugin mechanism for decoding custom messages.
> A more flexible and less intrusive variation of this is to have a registry of
> Encoders that map Classes to the associated Encoder. That would allow not
> only custom Messages, but also the content of any ObjectMessage to be encoded
> in custom binary format. Domain classes then no longer need to implement the
> Message interface.
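
To make the registry idea concrete, here is a small sketch. The BinaryEncoder interface below is a stand-in defined only for this example and is not the Log4j Encoder interface; the lookup is by exact class and ignores superclasses and interfaces.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class EncoderRegistrySketch {

    /** Hypothetical stand-in for an encoder; not the Log4j interface. */
    interface BinaryEncoder<T> {
        void encode(T source, ByteBuffer destination);
    }

    private final Map<Class<?>, BinaryEncoder<?>> encoders = new ConcurrentHashMap<>();

    /** Associates a class with the encoder that knows its binary form. */
    <T> void register(Class<T> type, BinaryEncoder<? super T> encoder) {
        encoders.put(type, encoder);
    }

    /**
     * Encodes the value if an encoder is registered for its exact class.
     * Returns false so that the caller can fall back to text formatting.
     */
    @SuppressWarnings("unchecked")
    <T> boolean encode(T value, ByteBuffer destination) {
        BinaryEncoder<T> encoder = (BinaryEncoder<T>) encoders.get(value.getClass());
        if (encoder == null) {
            return false;
        }
        encoder.encode(value, destination);
        return true;
    }
}
```

A domain class would then be registered once, for example `registry.register(Long.class, (v, dst) -> dst.putLong(v))`, without having to implement Message.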
> *Markers*
> TBD: as Matt points out in the comments, Markers are special since they are
> hierarchical. One way to deal with this is to manage a separate file to save
> the Marker hierarchy. Another way is to do something similar to
> PatternLayout: treat it as a String value, where the string includes
> hierarchy information. I like the simplicity of the latter approach.
> *Versioning*
> The binary file must start with a header, indicating version information and
> perhaps schema information providing metadata on the log record. Schema
> information may make it possible to include/exclude fields. For version 1.0,
> the schema can either be fixed like the above example, or it could be a
> simple bitmask for the fields mentioned above.
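
One way the header with a field bitmask could look, purely as a sketch. The seven-byte layout (type byte, major, minor, int mask), the bit assignments and all names are assumptions made for this example and are not specified by the ticket.

```java
import java.nio.ByteBuffer;

final class HeaderSketch {
    static final byte HEADER = '#'; // 0x23, header record type from the description

    // Hypothetical bit assignments, one bit per optional field.
    static final int TIME_MILLIS = 1;
    static final int NANO_TIME   = 1 << 1;
    static final int LEVEL       = 1 << 2;
    static final int THREAD_ID   = 1 << 3;

    /** Writes: type byte, major version, minor version, int field mask. */
    static byte[] write(int major, int minor, int fieldMask) {
        return ByteBuffer.allocate(7)
                .put(HEADER)
                .put((byte) major)
                .put((byte) minor)
                .putInt(fieldMask)
                .array();
    }

    /** Tells a reader whether records in this file include the given field. */
    static boolean hasField(byte[] header, int fieldBit) {
        int mask = ByteBuffer.wrap(header).getInt(3); // mask starts after 3 bytes
        return (mask & fieldBit) != 0;
    }
}
```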
> *Byte Order*
> TBD: Are multi-byte values like ints and longs written in big-endian or
> little-endian order? This could be specified in the header, or we could fix
> it to either one. Exchange protocols like ITCH tend to select a fixed byte
> order (ITCH uses big-endian, i.e. network byte order). I like the simplicity of this
> approach.
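
For reference, a tiny sketch of what the two choices mean on the wire. In Java, a new ByteBuffer starts out big-endian and DataOutputStream always writes big-endian, so fixing the format to big-endian would need no extra configuration on the writing side. The helper name is made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteOrderSketch {
    /** Returns the four bytes of the value in the requested byte order. */
    static byte[] intBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }
}
```

For the value 0x01020304, big-endian yields the bytes 01 02 03 04 and little-endian yields 04 03 02 01.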