I am pretty sure the file will be useless. As I said in my other post, I would want to understand what the cost/benefit looks like before spending time on this.
Sent from my iPad

> On May 28, 2014, at 9:39 AM, Matt Sicker <[email protected]> wrote:
>
> We can use GZIPOutputStream, DeflaterOutputStream, and ZipOutputStream
> all out of the box.
>
> What happens if you interrupt a stream in progress? No idea! But gzip at
> least has CRC32 checksums on hand, so corruption can be detected. We'll
> have to experiment a bit to see what really happens. I couldn't find
> anything in zlib.net's FAQ.
>
>> On 28 May 2014 08:56, Ralph Goers <[email protected]> wrote:
>> What would happen to the file if the system crashed before the file is
>> closed? Would the file still be decompressible, or would it be
>> corrupted?
>>
>> Sent from my iPad
>>
>>> On May 28, 2014, at 6:35 AM, Remko Popma <[email protected]> wrote:
>>>
>>> David, thank you for the clarification. I now understand better what
>>> you are trying to achieve.
>>>
>>> Interesting idea to have an appender that writes to a
>>> GZIPOutputStream. Would you mind raising a Jira ticket for that
>>> feature request?
>>>
>>> I would certainly be interested in learning about efficient techniques
>>> for compressing very large files. I'm not sure if or how the dd/direct
>>> I/O approach mentioned in the blog you linked to could be leveraged
>>> from Java. If you find a way that works well for log file rollover,
>>> and you're interested in sharing it, please let us know.
>>>
>>>> On Wed, May 28, 2014 at 3:42 PM, David Hoa <[email protected]> wrote:
>>>> Hi Remko,
>>>>
>>>> My point about gzip, which we've experienced, is that compressing
>>>> very large files (multi-GB) has a considerable impact on the system.
>>>> The dd/direct I/O workaround avoids pushing that much log data
>>>> through the filesystem cache. After I sent the email, I looked at the
>>>> log4j2 implementation and saw that DefaultRolloverStrategy::rollover()
>>>> calls GzCompressAction, so I see how I can write my own strategy and
>>>> Action to customize how gzip is called.
>>>>
>>>> My second question was not about appending to existing gzip files; as
>>>> far as I know that's not possible. But if the GZIPOutputStream is
>>>> kept open and written to until closed by a rollover event, then the
>>>> cost of gzipping is amortized over time rather than incurred all at
>>>> once when rollover triggers. The benefit is that there is no
>>>> resource-usage spike; the downside is writing both compressed and
>>>> uncompressed log files and maintaining rollover strategies for both.
>>>> So a built-in appender that wrote directly to .gz files would be
>>>> useful for this.
>>>>
>>>> Thanks,
>>>> David
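A minimal sketch of the keep-the-stream-open idea David describes above, using only java.util.zip; the class name, buffer sizes, and file handling are illustrative, not anything log4j2 ships:

    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.GZIPOutputStream;

    /** Writes log lines straight into a .gz file, so the compression cost
     *  is paid per write instead of all at once at rollover time. */
    public class GzipLogWriter implements AutoCloseable {
        private final GZIPOutputStream out;

        public GzipLogWriter(String path) throws IOException {
            // syncFlush=true makes flush() emit a complete deflate block,
            // so everything flushed so far survives a crash.
            this.out = new GZIPOutputStream(
                    new BufferedOutputStream(new FileOutputStream(path), 64 * 1024),
                    8 * 1024, true);
        }

        public void write(String line) throws IOException {
            out.write((line + System.lineSeparator())
                    .getBytes(StandardCharsets.UTF_8));
        }

        /** How often to call this is a durability/compression trade-off:
         *  each sync flush ends the current deflate block. */
        public void flush() throws IOException {
            out.flush();
        }

        @Override
        public void close() throws IOException {
            out.close(); // finishes the stream and writes the CRC32 trailer
        }
    }

Note that after a crash the trailer (CRC32 plus uncompressed length) is missing, so gzip -d will complain about an unexpected end of file, but the data flushed up to that point still decompresses.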
>>>>> On Tue, May 27, 2014 at 4:52 PM, Remko Popma <[email protected]> wrote:
>>>>> Hi David,
>>>>>
>>>>> I read the blog post you linked to. It seems the author was very,
>>>>> very upset that a utility called cp only uses a 512-byte buffer, and
>>>>> then goes on to praise gzip for having a 32KB buffer. So judging by
>>>>> your link alone, gzip is actually pretty good.
>>>>>
>>>>> That said, there are plans to improve the file rollover mechanism.
>>>>> These plans are currently spread out over a number of Jira tickets.
>>>>> One existing request is to delete archived log files that are older
>>>>> than some number of days
>>>>> (https://issues.apache.org/jira/browse/LOG4J2-656,
>>>>> https://issues.apache.org/jira/browse/LOG4J2-524).
>>>>> This could be extended to cover your request to keep M compressed
>>>>> files.
>>>>>
>>>>> I'm not sure about appending to existing gzip files. Why is this
>>>>> desirable? What are you trying to accomplish with that?
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>>> On 2014/05/28, at 3:22, David Hoa <[email protected]> wrote:
>>>>>>
>>>>>> Hi Log4j Dev,
>>>>>>
>>>>>> I am interested in the log rollover and compression feature in
>>>>>> log4j2. I have read the documentation online and still have some
>>>>>> questions.
>>>>>>
>>>>>> - Gzipping large files has a performance impact on latency, CPU,
>>>>>> and the file cache, and there is a workaround using dd and direct
>>>>>> I/O. Is it possible to customize how log4j2 gzips files (or does
>>>>>> log4j2 already do this)? See this link for a description of the
>>>>>> common problem:
>>>>>> http://kevinclosson.wordpress.com/2007/02/23/standard-file-utilities-with-direct-io/
>>>>>>
>>>>>> - Is it possible to use the existing appenders to output directly
>>>>>> to their final gzipped files, maintain M of those gzipped files,
>>>>>> and roll over/maintain N of the uncompressed logs? I suspect the
>>>>>> complicated part would be JVM crash recovery / application restart.
>>>>>> Any suggestions on how best to add/extend/customize support for
>>>>>> this?
>>>>>>
>>>>>> Thanks,
>>>>>> David
>
> --
> Matt Sicker <[email protected]>
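To Ralph's question about a crash before the file is closed, and Matt's point about checksums: the gzip trailer stores a CRC32 and the uncompressed length, and GZIPInputStream verifies both when it reaches end of stream, so an interrupted file can be detected by simply decompressing it to the end. A small check along those lines, again plain java.util.zip; the class and method names are made up for illustration:

    import java.io.EOFException;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.ZipException;

    public final class GzipCheck {
        /** Returns true if the file decompresses cleanly end to end;
         *  GZIPInputStream checks the CRC32 and length in the trailer. */
        public static boolean isIntact(String path) throws IOException {
            try (GZIPInputStream in =
                    new GZIPInputStream(new FileInputStream(path))) {
                byte[] buf = new byte[8192];
                while (in.read(buf) != -1) {
                    // discard the output; we only want the trailer checks
                }
                return true;
            } catch (EOFException | ZipException e) {
                return false; // truncated or corrupt; earlier bytes are still valid
            }
        }
    }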

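For the customization David mentions (DefaultRolloverStrategy::rollover() delegating to GzCompressAction), one option is a custom Action that shells out to the external gzip binary, which keeps the compression work out of the JVM and could later be swapped for a dd/direct-I/O pipeline like the one in the linked post. A sketch only, assuming the AbstractAction base class from log4j-core 2.x; the Action API should be verified against the version actually in use:

    import java.io.File;
    import java.io.IOException;
    import org.apache.logging.log4j.core.appender.rolling.action.AbstractAction;

    /** Compresses a rolled file by invoking the system's gzip binary
     *  instead of compressing in-process. */
    public final class ExternalGzipAction extends AbstractAction {
        private final File source;

        public ExternalGzipAction(File source) {
            this.source = source;
        }

        @Override
        public boolean execute() throws IOException {
            // gzip replaces source with source.gz and exits 0 on success
            Process p = new ProcessBuilder("gzip", source.getAbsolutePath())
                    .inheritIO()
                    .start();
            try {
                return p.waitFor() == 0;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("gzip was interrupted", e);
            }
        }
    }

Wiring this into a custom RolloverStrategy so it runs in place of the built-in action is left out here.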