Hi, Julian,
You wrote:
> Hi Markus. I just noticed there doesn't seem to be a patch attached to your
> last email.
I did re-send the mail with the patch attached, but maybe that duplicate
got filtered by some (mental) spam filters.
Here's it again:
Summary of test results:
1619 tests PASSED
85 tests SKIPPED
38 tests XFAILED (1 WORK-IN-PROGRESS)
[[
Optimize merge_file_trivial() by avoiding to read the files twice by using a
new comparison function which compares 3 files at once.
* subversion/include/svn_io.h
(svn_io_filesizes_three_different_p): Add new declaration
(svn_io_files_contents_three_same_p): Add new declaration
* subversion/libsvn_subr/io.c
(svn_io_filesizes_three_different_p): Add new function in analogy to
svn_io_filesizes_different_p().
(contents_three_identical_p): Add new function in analogy to
contents_identical_p().
(svn_io_files_contents_three_same_p): Add new function in analogy to
svn_io_files_contents_same_p.
* subversion/libsvn_wc/merge.c
(merge_file_trivial): Use the new three-file comparison functions to avoid
reading files twice.
Patch by: Markus Schaber <[email protected]>
]]
I also intend to add test cases for the new as well as the existing filesizes
and files_contents functions, but it seems I could need some help to get
kickstarted.
Best regards
Markus Schaber
--
We software Automation.
3S-Smart Software Solutions GmbH
Markus Schaber | Developer
Memminger Str. 151 | 87439 Kempten | Germany | Tel. +49-831-54031-0 | Fax
+49-831-54031-50
Email: [email protected] | Web: http://www.3s-software.com
CoDeSys internet forum: http://forum.3s-software.com
Download CoDeSys sample projects:
http://www.3s-software.com/index.shtml?sample_projects
Managing Directors: Dipl.Inf. Dieter Hess, Dipl.Inf. Manfred Werner | Trade
register: Kempten HRB 6186 | Tax ID No.: DE 167014915
________________________________________
Von: Julian Foad [[email protected]]
Gesendet: Freitag, 15. Juni 2012 11:14
Bis: Markus Schaber
Cc: [email protected]
Betreff: Re: AW: [PATCH]: Optimize merge_file_trivial()
Markus Schaber wrote:
> Here's the second round. I hope I did catch all issues.
Hi Markus. I just noticed there doesn't seem to be a patch attached to your
last email.
> As there were no comments about the names and the parameter order, I guess
> they're okay.
I think they are OK.
- Julian
> The tests still look well:
>
> Summary of test results:
> 1619 tests PASSED
> 85 tests SKIPPED
> 38 tests XFAILED (1 WORK-IN-PROGRESS)
>
> [[
> Optimize merge_file_trivial() by avoiding to read the files twice by using a
> new comparison function which compares 3 files at once.
>
> * subversion/include/svn_io.h
> (svn_io_filesizes_three_different_p): Add new declaration
> (svn_io_files_contents_three_same_p): Add new declaration
>
> * subversion/libsvn_subr/io.c
> (svn_io_filesizes_three_different_p): Add new function in analogy to
> svn_io_filesizes_different_p().
> (contents_three_identical_p): Add new function in analogy to
> contents_identical_p().
> (svn_io_files_contents_three_same_p): Add new function in analogy to
> svn_io_files_contents_same_p.
>
> * subversion/libsvn_wc/merge.c
> (merge_file_trivial): Use the new three-file comparison functions to avoid
> reading files twice.
>
> Patch by: Markus Schaber <[email protected]>
> ]]
>
>
> Best regards
>
> Markus Schaber
> --
>
> We software Automation.
>
> 3S-Smart Software Solutions GmbH
> Markus Schaber | Developer
> Memminger Str. 151 | 87439 Kempten | Germany | Tel. +49-831-54031-0 | Fax
> +49-831-54031-50
>
> Email: [email protected] | Web: http://www.3s-software.com
> CoDeSys internet forum: http://forum.3s-software.com
> Download CoDeSys sample projects:
> http://www.3s-software.com/index.shtml?sample_projects
>
> Managing Directors: Dipl.Inf. Dieter Hess, Dipl.Inf. Manfred Werner | Trade
> register: Kempten HRB 6186 | Tax ID No.: DE 167014915
>
> ________________________________________
> Von: Julian Foad [[email protected]]
> Gesendet: Mittwoch, 13. Juni 2012 16:52
> Bis: Markus Schaber
> Cc: [email protected]
> Betreff: Re: AW: [PATCH]: Optimize merge_file_trivial() (Was: Fix issue #4128)
>
> Markus Schaber wrote:
>
>>> Please use 'TRUE' and 'FALSE' for Boolean values in the
> code and in the doc strings, not zero and non-zero.
>>
>> I did just copy this from the 2-file functions that existed. I'll
> update the patch accordingly.
>
>
> Ah, I see. I've fixed those in r1349889.
>
> As for auto source formatting, I have settings for the Vim text editor, but I
> don't know about VS or a stand-alone tool.
>
> - Julian
>
>
>>> Please replace 'rsp' with 'and' or 'or' in the
> doc strings. See
> <http://www.transblawg.eu/index.php?/archives/870-Resp.-and-other-non-existent-English-wordsNicht-existente-englische-Woerter.html>
> :-)
>>
>>> Please don't initialize variables that are going to be
> unconditionally initialized later: variables 'file1_h',
> 'file2_h', 'file3_h' in contents_three_identical_p().
>>
>> I'll fix both issues, too.
>>
>>> Please make all the indentation consistent, and put operators on the
> beginning of each continuation line indented to just inside the relevant
> opening
> parenthesis, so instead of this:
>>
>> Is there some reformatter script or (even better) visual studio plugin that
> implements all the formatting rules?
>>
>> Best regards
>>
>> Markus Schaber
>
Index: subversion/include/svn_io.h
===================================================================
--- subversion/include/svn_io.h (revision 1349990)
+++ subversion/include/svn_io.h (working copy)
@@ -607,6 +607,25 @@
const char *file2,
apr_pool_t *pool);
+/** Set @a *different_p12 to non-zero if @a file1 and @a file2 have different
+ * sizes, else set to zero. Do the similar for @a *different_p23 with
+ * @a file2 and @a file3, and @a *different_p13 for @a file1 and @a file3.
+ * All three of @a file1, @a file2 and @a file3 are utf8-encoded.
+ *
+ * Setting @a *different_p12 to zero does not mean the files definitely
+ * have the same size, it merely means that the sizes are not
+ * definitely different. That is, if the size of one or both files
+ * cannot be determined, then the sizes are not known to be different,
+ * so @a *different_p12 is set to 0.
+ */
+svn_error_t *
+svn_io_filesizes_three_different_p(svn_boolean_t *different_p12,
+ svn_boolean_t *different_p23,
+ svn_boolean_t *different_p13,
+ const char *file1,
+ const char *file2,
+ const char *file3,
+ apr_pool_t *pool);
/** Return in @a *checksum the checksum of type @a kind of @a file
* Use @a pool for temporary allocations and to allocate @a *checksum.
@@ -642,6 +661,20 @@
const char *file2,
apr_pool_t *pool);
+/** Set @a *same12 to TRUE if @a file1 and @a file2 have the same
+ * contents, else set it to FALSE. Do the similar for @a *same23
+ * with @a file2 and @a file3, and @a *same13 for @a file1 and @a
+ * file3. Use @a pool for temporary allocations.
+ */
+svn_error_t *
+svn_io_files_contents_three_same_p(svn_boolean_t *same12,
+ svn_boolean_t *same23,
+ svn_boolean_t *same13,
+ const char *file1,
+ const char *file2,
+ const char *file3,
+ apr_pool_t *pool);
+
/** Create file at utf8-encoded @a file with contents @a contents.
* @a file must not already exist.
* Use @a pool for memory allocations.
Index: subversion/libsvn_subr/io.c
===================================================================
--- subversion/libsvn_subr/io.c (revision 1349990)
+++ subversion/libsvn_subr/io.c (working copy)
@@ -1311,6 +1311,43 @@
svn_error_t *
+svn_io_filesizes_three_different_p(svn_boolean_t *different_p12,
+ svn_boolean_t *different_p23,
+ svn_boolean_t *different_p13,
+ const char *file1,
+ const char *file2,
+ const char *file3,
+ apr_pool_t *pool)
+{
+ apr_finfo_t finfo1, finfo2, finfo3;
+ apr_status_t status1, status2, status3;
+ const char *file1_apr, *file2_apr, *file3_apr;
+
+ /* Not using svn_io_stat() because don't want to generate
+ svn_error_t objects for non-error conditions. */
+
+ SVN_ERR(cstring_from_utf8(&file1_apr, file1, pool));
+ SVN_ERR(cstring_from_utf8(&file2_apr, file2, pool));
+ SVN_ERR(cstring_from_utf8(&file3_apr, file3, pool));
+
+ /* Stat all three files */
+ status1 = apr_stat(&finfo1, file1_apr, APR_FINFO_MIN, pool);
+ status2 = apr_stat(&finfo2, file2_apr, APR_FINFO_MIN, pool);
+ status3 = apr_stat(&finfo3, file3_apr, APR_FINFO_MIN, pool);
+
+ /* If we got an error stat'ing a file, it could be because the
+ file was removed... or who knows. Whatever the case, we
+ don't know if the filesizes are definitely different, so
+ assume that they're not. */
+ *different_p12 = !status1 && !status2 && finfo1.size != finfo2.size;
+ *different_p23 = !status2 && !status3 && finfo2.size != finfo3.size;
+ *different_p13 = !status1 && !status3 && finfo1.size != finfo3.size;
+
+ return SVN_NO_ERROR;
+}
+
+
+svn_error_t *
svn_io_file_checksum2(svn_checksum_t **checksum,
const char *file,
svn_checksum_kind_t kind,
@@ -4072,6 +4109,130 @@
+/* Do a byte-for-byte comparison of FILE1, FILE2 and FILE3. */
+static svn_error_t *
+contents_three_identical_p(svn_boolean_t *identical_p12,
+ svn_boolean_t *identical_p23,
+ svn_boolean_t *identical_p13,
+ const char *file1,
+ const char *file2,
+ const char *file3,
+ apr_pool_t *pool)
+{
+ svn_error_t *err;
+ apr_size_t bytes_read1, bytes_read2, bytes_read3;
+ char *buf1 = apr_palloc(pool, SVN__STREAM_CHUNK_SIZE);
+ char *buf2 = apr_palloc(pool, SVN__STREAM_CHUNK_SIZE);
+ char *buf3 = apr_palloc(pool, SVN__STREAM_CHUNK_SIZE);
+ apr_file_t *file1_h;
+ apr_file_t *file2_h;
+ apr_file_t *file3_h;
+ svn_boolean_t eof1 = FALSE;
+ svn_boolean_t eof2 = FALSE;
+ svn_boolean_t eof3 = FALSE;
+
+ SVN_ERR(svn_io_file_open(&file1_h, file1, APR_READ, APR_OS_DEFAULT,
+ pool));
+
+ err = svn_io_file_open(&file2_h, file2, APR_READ, APR_OS_DEFAULT,
+ pool);
+
+ if (err)
+ return svn_error_trace(
+ svn_error_compose_create(err,
+ svn_io_file_close(file1_h, pool)));
+
+ err = svn_io_file_open(&file3_h, file3, APR_READ, APR_OS_DEFAULT,
+ pool);
+
+ if (err)
+ return svn_error_trace(
+ svn_error_compose_create(
+ err,
+ svn_error_compose_create(svn_io_file_close(file1_h, pool),
+ svn_io_file_close(file2_h,
pool))));
+
+ /* assume TRUE, until disproved below */
+ *identical_p12 = *identical_p23 = *identical_p13 = TRUE;
+ /* We need to read as long as no error occurs, and as long as one of the
+ * flags could still change due to a read operation */
+ while (!err
+ && (*identical_p12 && !eof1 && !eof2)
+ || (*identical_p23 && !eof2 && !eof3)
+ || (*identical_p13 && !eof1 && !eof3))
+ {
+ /* As long as a file is not at the end yet, and it is still
+ * potentially identical to another file, we read the next chunk.*/
+ if (!eof1 && (identical_p12 || identical_p13))
+ {
+ err = svn_io_file_read_full2(file1_h, buf1,
+ SVN__STREAM_CHUNK_SIZE, &bytes_read1,
+ &eof1, pool);
+ if (err)
+ break;
+ }
+
+ if (!eof2 && (identical_p12 || identical_p23))
+ {
+ err = svn_io_file_read_full2(file2_h, buf2,
+ SVN__STREAM_CHUNK_SIZE, &bytes_read2,
+ &eof2, pool);
+ if (err)
+ break;
+ }
+
+ if (!eof3 && (identical_p13 || identical_p23))
+ {
+ err = svn_io_file_read_full2(file3_h, buf3,
+ SVN__STREAM_CHUNK_SIZE, &bytes_read3,
+ &eof3, pool);
+ if (err)
+ break;
+ }
+
+ /* If the files are still marked identical, and at least one of them
+ * is not at the end of file, we check whether they differ, and set
+ * their flag to false then. */
+ if (*identical_p12
+ && !(eof1 && eof2)
+ && ((eof1 != eof2)
+ || (bytes_read1 != bytes_read2)
+ || memcmp(buf1, buf2, bytes_read1)))
+ {
+ *identical_p12 = FALSE;
+ }
+
+ if (*identical_p23
+ && !(eof2 && eof3)
+ && ((eof2 != eof3)
+ || (bytes_read2 != bytes_read3)
+ || memcmp(buf2, buf3, bytes_read2)))
+ {
+ *identical_p23 = FALSE;
+ }
+
+ if (*identical_p13
+ && !(eof1 && eof3)
+ && ((eof1 != eof3)
+ || (bytes_read1 != bytes_read3)
+ || memcmp(buf1, buf3, bytes_read3)))
+ {
+ *identical_p13 = FALSE;
+ }
+ }
+
+ return svn_error_trace(
+ svn_error_compose_create(
+ err,
+ svn_error_compose_create(
+ svn_io_file_close(file1_h, pool),
+ svn_error_compose_create(
+ svn_io_file_close(file2_h, pool),
+ svn_io_file_close(file3_h, pool)))));
+}
+
+
+
svn_error_t *
svn_io_files_contents_same_p(svn_boolean_t *same,
const char *file1,
@@ -4098,6 +4259,53 @@
return SVN_NO_ERROR;
}
+svn_error_t *
+svn_io_files_contents_three_same_p(svn_boolean_t *same12,
+ svn_boolean_t *same23,
+ svn_boolean_t *same13,
+ const char *file1,
+ const char *file2,
+ const char *file3,
+ apr_pool_t *pool)
+{
+ svn_boolean_t diff_size12, diff_size23, diff_size13;
+
+ SVN_ERR(svn_io_filesizes_three_different_p(&diff_size12,
+ &diff_size23,
+ &diff_size13,
+ file1,
+ file2,
+ file3,
+ pool));
+
+ if (diff_size12 && diff_size23 && diff_size13)
+ {
+ *same12 = *same23 = *same13 = FALSE;
+ }
+ else if (diff_size12 && diff_size23)
+ {
+ *same12 = *same23 = FALSE;
+ SVN_ERR(contents_identical_p(same13, file1, file3, pool));
+ }
+ else if (diff_size23 && diff_size13)
+ {
+ *same23 = *same13 = FALSE;
+ SVN_ERR(contents_identical_p(same12, file1, file2, pool));
+ }
+ else if (diff_size12 && diff_size13)
+ {
+ *same12 = *same13 = FALSE;
+ SVN_ERR(contents_identical_p(same23, file2, file3, pool));
+ }
+ else
+ {
+ SVN_ERR_ASSERT(!diff_size12 && !diff_size23 && !diff_size13);
+ SVN_ERR(contents_three_identical_p(same12, same23, same13, file1, file2,
file3, pool));
+ }
+
+ return SVN_NO_ERROR;
+}
+
#ifdef WIN32
/* Counter value of file_mktemp request (used in a threadsafe way), to make
sure that a single process normally never generates the same tempname
Index: subversion/libsvn_wc/merge.c
===================================================================
--- subversion/libsvn_wc/merge.c (revision 1349990)
+++ subversion/libsvn_wc/merge.c (working copy)
@@ -948,7 +948,9 @@
apr_pool_t *scratch_pool)
{
svn_skel_t *work_item;
- svn_boolean_t same_contents = FALSE;
+ svn_boolean_t same_left_right;
+ svn_boolean_t same_right_target;
+ svn_boolean_t same_left_target;
svn_node_kind_t kind;
svn_boolean_t is_special;
@@ -961,19 +963,23 @@
return SVN_NO_ERROR;
}
+ /* Check the files */
+ SVN_ERR(svn_io_files_contents_three_same_p(&same_left_right,
+ &same_right_target,
+ &same_left_target,
+ left_abspath,
+ right_abspath,
+ detranslated_target_abspath,
+ scratch_pool));
+
/* If the LEFT side of the merge is equal to WORKING, then we can
* copy RIGHT directly. */
- SVN_ERR(svn_io_files_contents_same_p(&same_contents, left_abspath,
- detranslated_target_abspath,
- scratch_pool));
- if (same_contents)
+ if (same_left_target)
{
/* Check whether the left side equals the right side.
* If it does, there is no change to merge so we leave the target
* unchanged. */
- SVN_ERR(svn_io_files_contents_same_p(&same_contents, left_abspath,
- right_abspath, scratch_pool));
- if (same_contents)
+ if (same_left_right)
{
*merge_outcome = svn_wc_merge_unchanged;
}
@@ -1002,10 +1008,7 @@
* conflicted them needlessly, while merge_text_file figured it out
* eventually and returned svn_wc_merge_unchanged for them, which
* is what we do here. */
- SVN_ERR(svn_io_files_contents_same_p(&same_contents,
- detranslated_target_abspath,
- right_abspath, scratch_pool));
- if (same_contents)
+ if (same_right_target)
{
*merge_outcome = svn_wc_merge_unchanged;
return SVN_NO_ERROR;