On 2013-06-26 Alexander Clouter wrote: > Attached is a patch that enables 'streaming' support for xz output, > in short LZMA_SYNC_FLUSH is called every X milliseconds.
I like the idea. The patch uses LZMA_SYNC_FLUSH after every X milliseconds even if all read() calls are able to fill the buffer without blocking. A possible alternative could be to flush when at least X milliseconds have passed and read() gives EAGAIN. That is, don't flush as long as input is coming faster than xz can compress it. I don't know if this is a good or bad idea. It might mean much higher latency especially in threaded mode (which doesn't support LZMA_SYNC_FLUSH yet). A few other thoughts: The timeout must be disabled when --list (MODE_LIST) is used. gettimeofday() shouldn't fail as long as the first argument is sane and the second argument is NULL, so there's no need to test the return value (I hope). It could be good to use clock_gettime(CLOCK_MONOTONIC, ...) when it is available. It makes a difference if the system time jumps for some reason. The threading code in liblzma uses it already so it's not a new dependency. Currently message.c uses gettimeofday() and that could use clock_gettime() too. If select() gives EINTR, there should be a test for user_abort. Otherwise if there is no input, xz won't react to SIGINT until the timeout has expired. I noticed that there is a race condition in signal handling in the existing xz code. If e.g. SIGINT is sent after the value of user_abort has been checked but before a blocking read() or write(), the read/write will block and another signal is needed to make xz notice that user_abort has been set. This affects the same code as your patch so I think this should be fixed first. Could signals be a good way to set a flag when to flush? It would allow triggering flushing from another process. xz already supports SIGUSR1/SIGINFO for to show progress info if --verbose wasn't used. A possible problem is how to raise such signals within xz. timer_create() and friends look nice but after checking a few OSes I think they aren't portable enough. setitimer() could be more portable but in practice it would mean using SIGALRM. Currently xz uses alarm() for the progress indicator. Creating a thread solely for sending timer signals should work, but I'm not sure I like that. Maybe just polling the time like your patch does is the way to go. > The patch is for 5.0.0 (what is currently in Debian > 'oldstable/squeeze') but if the community likes the look of the > patch, I can roll a version for whatever is at the HEAD of the git > tree. It won't apply directly because there's new code that uses LZMA_FULL_FLUSH. But let's not worry about it until I have fixed the race condition with signals and user_abort. It may need select() or poll(), which may then be used to implement flushing too. -- Lasse Collin | IRC: Larhzu @ IRCnet & Freenode