Re: [gentoo-portage-dev] [PATCH v2] movefile: support in-kernel file copying on Linux (bug 607868)

Zac Medico Fri, 03 Mar 2017 18:37:27 -0800

On 03/03/2017 12:45 PM, Michał Górny wrote:
> W dniu 02.03.2017, czw o godzinie 17∶09 -0800, użytkownik Zac Medico
>> +def copyfile(src, dst):
>> +    """
>> +    Copy the contents (no metadata) of the file named src to a file
>> +    named dst.
>> +
>> +    If possible, copying is done within the kernel, and uses
>> +    "copy acceleration" techniques (such as reflinks). This also
>> +    supports sparse files.
>> +
>> +    @param src: path of source file
>> +    @type src: str
>> +    @param dst: path of destination file
>> +    @type dst: str
>> +    """
>> +    global _copyfile
>> +
>> +    if _copyfile is None:
>> +            if _file_copy is None:
>> +                    _copyfile = shutil.copyfile
>> +            else:
>> +                    _copyfile = _optimized_copyfile
> 
> I dare say the caching logic takes more time than the actual logic.
> However, this could be made even simpler -- you could just check
> the condition in global scope and set _copyfile value there.


In v3 it's like this:

if _file_copy is None:
        copyfile = shutil.copyfile
else:
        copyfile = _optimized_copyfile

>> +static off_t
>> +do_lseek(int fd_out, int fd_in, loff_t *off_out) {
> 
> Maybe name it do_lseek_data() to make the goal clearer? Some function
> docs in comments would also be helpful.

Renamed to do_lseek_data v3.

>> +
>> +        if (!(error && errno == EINTR && PyErr_CheckSignals() == 0))
>> +            eintr_retry = 0;
> 
> To be honest, I can't do it but it'd be really a good idea if someone
> could analyze all the code paths where the code above could give 'error
> = 1' and ensure that:
> 
> a. there is no case when we error=1 and leave errno unmodified/unset
> (i.e. can accidentally leave earlier EINTR there),

In v3, I set error = errno immediately after every function call that
fails. That makes it really easy to see exactly which function calls the
errno value is picked up from.

> b. we can correctly recover from any EINTR, i.e. if EINTR causes some
> function to be interrupted in the middle, our next iteration starts
> safely.

The state is mostly tracked by the offset_out variable, which makes it
hard to screw up. We just have to watch that variable carefully, and
make sure we don't corrupt its state. I added in a couple of extra lseek
calls to ensure that the fd_in file offset is always in sync with the
offset_out variable when we resume from EINTR.

>> +    }
>> +
>> +    free(buf);
> 
> if (buf != NULL) ?

The free(3) documentation says we're allowed to pass in a null pointer,
but I've added a buf != NULL check for readability.
-- 
Thanks,
Zac

Re: [gentoo-portage-dev] [PATCH v2] movefile: support in-kernel file copying on Linux (bug 607868)

Reply via email to