On 06/05/18 18:58, Jason Smith wrote:
> Hello, Pádraig,
>
> I was actually the one to identify the bug and analyze the code underlying it
> (Illia, my colleague, was kind enough to volunteer to file the bug report and
> attempt a patch), and so I may be in the better position to follow up on
> this. I appreciate your addressing the matter so promptly.
>
> We are indeed provisionally using --remove-destination; however, we would
> prefer to use --force instead for the atomicity it affords.
>
> Regarding the proposed solution, I think that it well addresses the present
> issue. But there are related issues that I should raise while we are
> discussing it and that may influence its resolution.
>
> To begin, it seems problematic to make the return value in the case of
> hard-link creation dependent on whether the source and the destination are
> associated with the same device ID, as the purpose of the same_file_ok
> function seems to be to indicate whether it matters that the files may
> somehow be the same. If their file systems are different, that is not a
> problem with the files' being the same; it is a problem with their being on
> different file systems. The resulting error message (that the operation
> cannot be performed because the files are the same) is therefore incorrect
> and misleading. Perhaps, as it seems to me, this check should rather be
> performed outside this function and associated with its own error message.
>
> In fact, even if the source and destination are completely unrelated, this
> function may still return false and cause the same-file error message to be
> displayed. For example, let [source] be a regular file or directory on one
> file system, and let [destination] be a symbolic link on another. Then the
> following command will result in a same-file error, even if [destination]
> does not refer to [source]:
>
> cp --recursive --link --no-dereference [source] [destination]
>
> Moreover, the name and description of the function seem misleading. One would
> think from reading these that the source and destination files passed to it
> have already been determined to be equivalent in some way, and that the
> function is meant to determine whether it is all right in that case to
> overwrite the destination; however, it appears that the function is called
> whenever the destination already exists, regardless of its relation to the
> source. This discrepancy can lead to misunderstandings and problems such as
> the kind last described. Ideally, the function's name and description would
> be made more accurate and precise, and the function used strictly accordingly.
>
> Assuming we were to move the file-system check outside the same_file_ok
> function, we would be left with the following code within the function:
>
> /* It's ok to remove a destination symbolic link when creating a symbolic
> link or hard link. */
> if (S_ISLNK (dst_sb_link->st_mode) && (x->symbolic_link || x->hard_link))
> return true;
>
> But as we earlier in the function handle the cases when both the source and
> the destination are symbolic links, and there does not seem to be a reason to
> constrain overwriting of symbolic links otherwise (at least within this
> function), we can perhaps simplify this even further to the following:
>
> /* At this point, it's ok to remove a destination symbolic link. */
> if (S_ISLNK (dst_sb_link->st_mode))
> return true;
>
> But I have not analyzed all the relevant code to determine the full impact of
> this change.
>
> Finally, it would seem to make sense for the same_file_ok function to return
> true immediately if the source and destination files passed to it have
> different inode numbers or are on different file systems (and are thus not
> the same file). In fact, this is done in lines 1488-89 if DEREF_NEVER is not
> set; however, it is not done if DEREF_NEVER is set. For reference, those
> lines read as follows:
>
> if (!same)
> return true;
>
> Cursory analysis suggests that if these lines were unconditionally executed
> near the top of the function, certain problems might be avoided.
>
> Cheers,
> Jason
same_file_ok() has accreted awkward complexity,
having started out with a series of FIXMEs.
It gained support for `mv hardlink1 hardlink2` in fc6073d6
and then having that removed in 222d7ac0.
As part of the change in fc6073d6 (2003),
it actually made the problematic clause we've
been analyzing in same_file_ok() a noop.
Thus that clause stagnated for 14 years and so when
being converted as part of the recent 37696788 change
it no longer became a noop.
So while I agree this whole area definitely needs a refactor,
let's address the specific issues for now,
so that distributors can fix the issue with minimal risk.
Since this clause was a noop for 14 years,
we should discount it. However the recent change
also tries to handle this case:
touch foo
ln -s foo symlink
cp -dl foo symlink
Up until the recent 3