Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

2012-02-09 Thread Hyrum K Wright
On Thu, Feb 9, 2012 at 12:24 PM, Daniel Shahaf danie...@elego.de wrote:
 hwri...@apache.org wrote on Wed, Feb 08, 2012 at 01:55:55 -:
 Author: hwright
 Date: Wed Feb  8 01:55:54 2012
 New Revision: 1241733

 URL: http://svn.apache.org/viewvc?rev=1241733view=rev
 Log:
 Ev2 shims: Make sure that if we are providing content to alter_file(), we
 are also providing a checksum.

 * subversion/libsvn_delta/compat.c
   (path_checksum_args): Remove checksum member.
   (process_actions): Checksum the content file.
   (ev2_apply_textdelta): Don't checksum the output stream.

 Modified:
     subversion/trunk/subversion/libsvn_delta/compat.c

 Modified: subversion/trunk/subversion/libsvn_delta/compat.c
 URL: 
 http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delta/compat.c?rev=1241733r1=1241732r2=1241733view=diff
 ==
 --- subversion/trunk/subversion/libsvn_delta/compat.c (original)
 +++ subversion/trunk/subversion/libsvn_delta/compat.c Wed Feb  8 01:55:54 
 2012
 @@ -203,7 +203,6 @@ struct path_checksum_args
  {
    const char *path;
    svn_revnum_t base_revision;
 -  svn_checksum_t *checksum;
  };

  static svn_error_t *
 @@ -340,9 +339,10 @@ process_actions(void *edit_baton,
                /* We can only set text on files. */
                kind = svn_node_file;

 +              SVN_ERR(svn_io_file_checksum2(checksum, pca-path,
 +                                            svn_checksum_sha1, 
 scratch_pool));
                SVN_ERR(svn_stream_open_readonly(contents, pca-path,
                                                 scratch_pool, scratch_pool));

 So, you first open the file and read it through to checksum it, and then
 open it again to read it through as a stream.

 That's because right now add_file()/alter_file() require the checksum to
 be provided up front.

 Would it make sense to relax that requirement?  On the one hand,
 many receivers don't need to know the checksum until after they've
 received the whole file (eg, the fs txn process and the pristine store);
 on the other hand, having the checksum will allow in some cases not
 transmitting the file over the wire at all, and will allow for
 error-checking the transmission in other cases.

Agreed that the current approach taken by the shim is wasteful, though
other attempts were a bit dodgy, and this one Just Works.  If there
are better ways to get this checksum, let's use 'em.

I'm not certain about relaxing the requirement just yet.  I'll have to
think about the design implications.

(I'll also note that we actually *do* have a checksum by this point,
only it is the md5 provided by close_file(), and Ev2 uses sha1s
exclusively, so we have to recalculate.  I suspect this will be a
somewhat common issue while we use a mixed shim environment.)

-Hyrum


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/


Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

2012-02-09 Thread Daniel Shahaf
Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
 (I'll also note that we actually *do* have a checksum by this point,
 only it is the md5 provided by close_file(), and Ev2 uses sha1s
 exclusively, so we have to recalculate.  I suspect this will be a

Only sha1?  It allows neither md5 nor sha256?  Why?

 somewhat common issue while we use a mixed shim environment.)


Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

2012-02-09 Thread Hyrum K Wright
On Thu, Feb 9, 2012 at 1:05 PM, Daniel Shahaf danie...@elego.de wrote:
 Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
 (I'll also note that we actually *do* have a checksum by this point,
 only it is the md5 provided by close_file(), and Ev2 uses sha1s
 exclusively, so we have to recalculate.  I suspect this will be a

 Only sha1?  It allows neither md5 nor sha256?  Why?

Convention would be my initial response.  It makes much more sense
to standardize checksum kinds across the system than to have a mixed
environment.  Any checksum can help detect corruption, but being able
to answer the question is this content the same as that content? is
much more difficult in a mixed-checksum environment.  sha1 felt like a
reasonable choice[1].

I could  be overestimating the negatives, though.  What reasons do you
suppose we *should* allow arbitrary checksum types?

-Hyrum

[1] Though, given the timeline of Ev2 and the proposed timeline for
sha3, I'm really interested in the possibility of using sha3 when
standardized, throughout *all* of Subversion.  (But that's a
completely separate discussion.)


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/


Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

2012-02-09 Thread Daniel Shahaf
Hyrum K Wright wrote on Thu, Feb 09, 2012 at 13:16:13 -0600:
 On Thu, Feb 9, 2012 at 1:05 PM, Daniel Shahaf danie...@elego.de wrote:
  Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
  (I'll also note that we actually *do* have a checksum by this point,
  only it is the md5 provided by close_file(), and Ev2 uses sha1s
  exclusively, so we have to recalculate.  I suspect this will be a
 
  Only sha1?  It allows neither md5 nor sha256?  Why?
 
 Convention would be my initial response.  It makes much more sense
 to standardize checksum kinds across the system than to have a mixed
 environment.  Any checksum can help detect corruption, but being able
 to answer the question is this content the same as that content? is
 much more difficult in a mixed-checksum environment.  sha1 felt like a

It's hardly impossible.  It shouldn't take much effort to whip up an
SQLite table that maps sha1's to sha3's (both in the wc layer and the FS
layer).

 reasonable choice[1].
 
 I could  be overestimating the negatives, though.  What reasons do you
 suppose we *should* allow arbitrary checksum types?
 

I didn't say arbitrary; but hard-wiring a specific checksum algorithm
into our APIs seems wrong.  Suppose Ev2 is in 1.8 and then 1.9 clients
want to use sha4 with 1.9 servers.  Does the editor interface allow for
that?

I'm assuming we could mandate sha1 support but allow driver/receiver to
negotiate on using a different algorithm.

 -Hyrum
 
 [1] Though, given the timeline of Ev2 and the proposed timeline for
 sha3, I'm really interested in the possibility of using sha3 when
 standardized, throughout *all* of Subversion.  (But that's a
 completely separate discussion.)
 
 
 -- 
 
 uberSVN: Apache Subversion Made Easy
 http://www.uberSVN.com/