Attaching a cleaner patch... Removed a couple of commented-out lines...
Regards
Prabhu
On Monday 19 September 2011 10:24 PM, Prabhu Gnana Sundar wrote:
Thanks Kamesh for your valuable review...
On Monday 12 September 2011 06:25 PM, Kamesh Jayachandran wrote:
I like this patch. We need it.
Thank you...
I found that these bogus mergeinfo entries were causing 404 "file not found"
errors, which sometimes caused the merge to fail or to run longer
than necessary.
A few comments.
1.
> opts, args = my_getopt(sys.argv[1:], "h?fRc",
>                        ["help", "fix", "recursive", "commit"])
If you do not want to support 'commit' you can remove it as well.
Yeah... I removed it now..
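For reference, the trimmed option handling now boils down to this (a minimal sketch; the complete version is in the attached script):

import getopt
import sys

try:
  # -R/--recursive and -c/--commit are gone; only help and fix remain.
  opts, args = getopt.getopt(sys.argv[1:], "h?f", ["help", "fix"])
except getopt.GetoptError, e:
  sys.stderr.write(str(e) + "\n")
  sys.exit(1)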
2. I ran it against a working copy of
https://ctf.open.collab.net/svn/repos/svnedge/trunk/console
and got the following output
[kamesh@kamesh console]$ python /tmp/mergeinfo-sanitizer.py .
The mergeinfo has to be sanitized/reverse-merged for the following as:
/branches/CTF_MODE/SvnEdge: 515-766,
[kamesh@kamesh console]$ python /tmp/mergeinfo-sanitizer.py --fix .
[kamesh@kamesh console]$ svn di
Property changes on: .
___________________________________________________________________
Modified: svn:mergeinfo
Reverse-merged /branches/CTF_MODE/SvnEdge:r512-515
Based on my analysis of the above repo, I could see that
/branches/CTF_MODE/SvnEdge:r512-515 is bogus
and hence needs to be removed. Anyway, your script handles it correctly
with --fix.
I would suggest changing the output of the default invocation (i.e. no --fix).
Changed the output as suggested (a sketch of how the new summary is printed follows the example below)...
<snip>
Bogus mergeinfo summary:
Target 1: .
/branches/CTF_MODE/SvnEdge: 512-515,
/merge/source2: X-Y,
..........
Target 2: subtree_path
/merge/source3: A-B
....
Target 3: sub_sub_tree
/merge/source4: C-D
......
Run with --fix to fix this in your working copy and commit yourself.
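For what it's worth, that grouping can be produced with something along these lines (an illustrative sketch only; print_bogus_summary and bogus_catalog are made-up names for a dict mapping each target path to its bogus sources and range strings, which is not necessarily the structure the script uses internally):

def print_bogus_summary(bogus_catalog):
  # bogus_catalog: {target_path: {merge_source: "X-Y,A-B", ...}, ...}
  print "Bogus mergeinfo summary:"
  target_number = 1
  for target in sorted(bogus_catalog):
    print "  Target %d: %s" % (target_number, target)
    for source in sorted(bogus_catalog[target]):
      print "    %s: %s" % (source, bogus_catalog[target][source])
    target_number += 1
  print "Run with --fix to fix this in your working copy and commit yourself."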
3. Your script seemed to sanitize to the depth of 'immediates' by
default. What is the rationale behind that?
No rationale behind it; I was using that as the default for my own
development purposes.
It is fixed now.
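The default is now simply the named constant (tiny sketch):

from svn import core

depth = core.svn_depth_infinity   # was effectively 'immediates' during development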
4. I ran your script against
https://svn.apache.org/repos/asf/subversion/trunk
And got the following output
<snip>
The path
/subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c
is not found
The path
/subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h
is not found
The mergeinfo has to be sanitized/reverse-merged for the following as:
/subversion/trunk/subversion/libsvn_subr/svn_temp_serializer.c:
/subversion/branches/tree-conflicts/subversion/include/svn_string.h:
872524-873154,868290-872329,
/subversion/trunk/subversion/include/private/svn_temp_serializer.h:
/subversion/branches/diff-optimizations/subversion/libsvn_subr/adler32.c:
/subversion/branches/tree-conflicts/subversion/libsvn_diff/diff.h:
872524-873154,868290-872329,
/subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.h:
/subversion/branches/svn-patch-improvements: 918520-934609,
/subversion/trunk/subversion/libsvn_fs_fs/temp_serializer.c:
/subversion/branches/tree-conflicts/subversion/libsvn_subr/hash.c:
872524-873154,868290-872329,
/subversion/branches/diff-optimizations/subversion/include/svn_string.h:
1031270-1037352,
/subversion/branches/diff-optimizations-bytes/subversion/include/private/svn_adler32.h:
1054361-1067789,
/subversion/branches/svn-patch-improvements/subversion/include/svn_string.h:
918520-934609,
/subversion/branches/diff-optimizations/subversion/libsvn_subr/hash.c:
1031270-1037352,
/subversion/branches/tree-conflicts: 872524-873154,868290-872329,
/subversion/branches/diff-optimizations/subversion/include/private/svn_adler32.h:
/subversion/branches/diff-optimizations-bytes/subversion/libsvn_subr/adler32.c:
1054361-1067789,
/subversion/branches/svn-patch-improvements/subversion/libsvn_diff/diff.h:
918520-934609,
/subversion/branches/diff-optimizations: 1031270-1037352,
/subversion/branches/svn-patch-improvements/subversion/libsvn_subr/hash.c:
918520-934609,
/subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.h:
1068738-1068739,
/subversion/branches/integrate-cache-item-serialization/subversion/libsvn_subr/svn_temp_serializer.c:
1068723-1068739,
/subversion/branches/integrate-cache-item-serialization/subversion/libsvn_fs_fs/temp_serializer.c:
1068738-1068739,
/subversion/trunk/subversion/libsvn_diff/diff.h: 1037352-1054248,
/subversion/trunk/subversion/libsvn_subr/adler32.c: 1054276-1054360,
/subversion/trunk/subversion/include/private/svn_adler32.h:
1054276-1054360,
/subversion/branches/integrate-cache-item-serialization/subversion/include/private/svn_temp_serializer.h:
1068723-1068739,
/subversion/trunk/subversion/libsvn_diff/util.c: 1037352-1054251,
</snip>
The "path not found" error needs to be handled, and the error should go to STDERR.
It is now handled, and the message is sent to stderr...
5. In main() you can call check_local_modifications before
get_original_mergeinfo().
That sounds really good, though I don't see 'propget' as a costly
operation.
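So the relevant part of main() now reads roughly like this (an ordering sketch; the function names are the ones from the attached script):

  # Bail out early: no point fetching mergeinfo from a dirty working copy.
  check_local_modifications(wcpath, temp_pool)
  parsed_original_mergeinfo = get_original_mergeinfo(wcpath, revision,
                                                     depth, ctx, temp_pool)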
6. In get_original_mergeinfo()
> if depth == 3:
Use the named constant rather than the arbitrary value.
> if depth == 3:
>   for entry in mergeinfo_catalog:
>     pathwise_mergeinfo = pathwise_mergeinfo + mergeinfo_catalog[entry] + "\n"
I think you are doing something wrong with the above concatenation of
the mergeinfo of different targets.
> else:
>   pathwise_mergeinfo = mergeinfo_catalog[wcpath]
> return core.svn_mergeinfo_parse(pathwise_mergeinfo, temp_pool)
Thanks for pointing this Kamesh... Fixed this in the attached patch...
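For clarity, the reworked parsing keeps each target's mergeinfo separate (a simplified sketch of what get_original_mergeinfo() in the attached script now does; the helper name parse_mergeinfo_catalog is only for illustration):

from svn import core

def parse_mergeinfo_catalog(mergeinfo_catalog, temp_pool):
  # Parse each target's svn:mergeinfo value into its own entry instead of
  # concatenating the values of different targets into one string.
  parsed = {}
  for target in mergeinfo_catalog:
    parsed[target] = core.svn_mergeinfo_parse(mergeinfo_catalog[target], temp_pool)
  return parsed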
7. In get_new_location_segments():
> for revision_range in parsed_original_mergeinfo[path]:
>   try:
>     ra.get_location_segments(ra_session, "", revision_range.end,
>                              revision_range.end,
>                              revision_range.start + 1, receiver)
>   except svn.core.SubversionException:
>     try:
>       ra.get_location_segments(ra_session, "", core.SVN_INVALID_REVNUM,
>                                revision_range.end,
>                                revision_range.start + 1, receiver)
Not sure why you run location segments against HEAD upon exception.
I was just thinking that the source path could be present even outside
the recorded revision ranges.
But it looks like I was doing something weird there, so I fixed it by
removing the part that runs the location segments against HEAD
(see the simplified loop below)...
>     except svn.core.SubversionException:
>       print " The path %s is not found " % path
8. Function parse_args is not used.
Removed it. It was mistakenly left in...
9. Function receiver can have more meaningful name.
Changed the name ...
10. You need to organize your functions so that they appear in some
logical order.
For example, I started my review from 'main()', which is in the middle
of the file, and had to move above and below that function to review further.
Organized :)
Attached the latest patch to this mail... Please share your thoughts...
Thanks and regards
Prabhu
#!/usr/bin/env python
import svn
import sys
import os
import getopt
import hashlib
import pickle
import getpass
import re
from svn import client, core, ra, wc
## This script first fetches the svn:mergeinfo of the working copy and then
## fetches the location segments for the source paths at the revisions
## recorded in that mergeinfo. From the location segments it obtains, it
## builds a new, sanitized mergeinfo. The depth is infinity by default.
## The script refuses to run if there are any local modifications in the
## working copy.
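##
## Example usage (paths are placeholders):
##   $ python mergeinfo-sanitizer.py ~/my-working-copy        # report bogus mergeinfo
##   $ python mergeinfo-sanitizer.py --fix ~/my-working-copy  # propset the sanitized
##                                                            # mergeinfo (nothing is committed)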
try:
  my_getopt = getopt.gnu_getopt
except AttributeError:
  my_getopt = getopt.getopt
mergeinfo = {}
def usage():
  sys.stderr.write(""" Usage: %s [OPTIONS] WCPATH

Analyze the svn:mergeinfo property of the given WCPATH.
Look for the existence of each merge source's location at its recorded
merge ranges.  If a non-existent merge source is found, fix the mergeinfo.

Valid Options:
 -f [--fix]  : set the sanitized svn:mergeinfo property (nothing is committed)
 -h [--help] : display this usage message
""" % os.path.basename(sys.argv[0]) )
##
# This function would 'svn propset' the new mergeinfo to the working copy
##
def set_new_mergeinfo(wcpath, newmergeinfo, ctx):
  client.propset3("svn:mergeinfo", newmergeinfo, wcpath, core.svn_depth_empty,
                  0, core.SVN_INVALID_REVNUM, None, None, ctx)
##
# Returns the md5 hash of the file
##
def md5_of_file(f, block_size=2**20):
  md5 = hashlib.md5()
  while True:
    data = f.read(block_size)
    if not data:
      break
    md5.update(data)
  return md5.digest()

def hasher(hash_file, newmergeinfo_file):
  new_mergeinfo = core.svn_mergeinfo_to_string(mergeinfo)
  with open(newmergeinfo_file, "a") as buffer_file:
    pickle.dump(new_mergeinfo, buffer_file)
  with open(newmergeinfo_file, "rb") as buffer_file:
    hash_of_buffer_file = md5_of_file(buffer_file)
  with open(hash_file, "w") as f:
    pickle.dump(hash_of_buffer_file, f)
def location_segment_callback(segment, pool):
  if segment.path is not None:
    source_path = '/' + segment.path
    path_ranges = mergeinfo.get(source_path, [])
    range = svn.core.svn_merge_range_t()
    range.start = segment.range_start - 1
    range.end = segment.range_end
    range.inheritable = 1
    path_ranges.append(range)
    mergeinfo[source_path] = path_ranges
##
# This function does the authentication in an interactive way
##
def prompt_func_ssl_unknown_cert(realm, failures, cert_info, may_save, pool):
  print "The certificate details are as follows:"
  print "--------------------------------------"
  print "Issuer : " + str(cert_info.issuer_dname)
  print "Hostname : " + str(cert_info.hostname)
  print "ValidFrom : " + str(cert_info.valid_from)
  print "ValidUpto : " + str(cert_info.valid_until)
  print "Fingerprint: " + str(cert_info.fingerprint)
  print ""
  ssl_trust = core.svn_auth_cred_ssl_server_trust_t()
  if may_save:
    choice = raw_input("accept (t)temporarily (p)permanently: ")
  else:
    choice = raw_input("(r)Reject or accept (t)temporarily: ")
  if choice[0] == "t" or choice[0] == "T":
    ssl_trust.may_save = False
    ssl_trust.accepted_failures = failures
  elif choice[0] == "p" or choice[0] == "P":
    ssl_trust.may_save = True
    ssl_trust.accepted_failures = failures
  else:
    ssl_trust = None
  return ssl_trust
def prompt_func_simple_prompt(realm, username, may_save, pool):
  username = raw_input("username: ")
  password = getpass.getpass(prompt="password: ")
  simple_cred = core.svn_auth_cred_simple_t()
  simple_cred.username = username
  simple_cred.password = password
  simple_cred.may_save = False
  return simple_cred
##
# This function tries to authenticate (if needed), fetch the
# location segments for the available mergeinfo, and build a new
# mergeinfo dictionary
##
def get_new_location_segments(parsed_original_mergeinfo, repo_root,
                              wcpath, ctx):
  for path in parsed_original_mergeinfo:
    full_url = repo_root + path
    ra_callbacks = ra.callbacks_t()
    ra_callbacks.auth_baton = core.svn_auth_open([
      core.svn_auth_get_ssl_server_trust_file_provider(),
      core.svn_auth_get_simple_prompt_provider(prompt_func_simple_prompt, 2),
      core.svn_auth_get_ssl_server_trust_prompt_provider(prompt_func_ssl_unknown_cert),
      svn.client.get_simple_provider(),
      svn.client.get_username_provider()
      ])
    try:
      ctx.config = core.svn_config_get_config(None)
      ra_session = ra.open(full_url, ra_callbacks, None, ctx.config)
      for revision_range in parsed_original_mergeinfo[path]:
        try:
          ra.get_location_segments(ra_session, "", revision_range.end,
                                   revision_range.end,
                                   revision_range.start + 1,
                                   location_segment_callback)
        except svn.core.SubversionException:
          sys.stderr.write(" Could not find location segments for %s \n" % path)
    except Exception, e:
      # Report RA/authentication failures instead of swallowing them silently.
      sys.stderr.write(str(e) + "\n")
def sanitize_mergeinfo(parsed_original_mergeinfo, repo_root, wcpath,
                       ctx, hash_file, newmergeinfo_file, temp_pool):
  full_mergeinfo = {}
  for entry in parsed_original_mergeinfo:
    get_new_location_segments(parsed_original_mergeinfo[entry], repo_root,
                              wcpath, ctx)
    full_mergeinfo.update(parsed_original_mergeinfo[entry])
  hasher(hash_file, newmergeinfo_file)
  diff_mergeinfo = core.svn_mergeinfo_diff(full_mergeinfo,
                                           mergeinfo, 1, temp_pool)
  print "The bogus mergeinfo summary:"
  for pathsource in diff_mergeinfo:
    if pathsource is not None:
      str_diff_info = core.svn_mergeinfo_to_string(pathsource)
      parsed_diff_info = core.svn_mergeinfo_parse(str_diff_info, temp_pool)
      for bogus_mergeinfo_path in parsed_diff_info:
        try:
          if mergeinfo[bogus_mergeinfo_path] is not None:
            sys.stdout.write(bogus_mergeinfo_path + ": ")
        except Exception, e:
          sys.stderr.write("")
        try:
          for revision_range in mergeinfo[bogus_mergeinfo_path]:
            sys.stdout.write(str(revision_range.start + 1) + "-" +
                             str(revision_range.end) + ",")
          print ""
        except Exception, e:
          sys.stderr.write("")
##
# This function tries to 'propset' the new mergeinfo into the working copy.
# It reads the new mergeinfo from the .newmergeinfo file and verifies its
# hash against the hash in the .hashfile
##
def fix_sanitized_mergeinfo(parsed_original_mergeinfo, repo_root, wcpath,
                            ctx, hash_file, newmergeinfo_file, temp_pool):
  check_local_modifications(wcpath, temp_pool)
  old_hash = ''
  new_hash = ''
  try:
    with open(hash_file, "r") as f:
      old_hash = pickle.load(f)
  except IOError, e:
    # The hash file is missing; regenerate the sanitized mergeinfo first,
    # walking each target's parsed mergeinfo as sanitize_mergeinfo() does.
    for entry in parsed_original_mergeinfo:
      get_new_location_segments(parsed_original_mergeinfo[entry], repo_root,
                                wcpath, ctx)
    hasher(hash_file, newmergeinfo_file)
    try:
      with open(hash_file, "r") as f:
        old_hash = pickle.load(f)
    except IOError:
      hasher(hash_file, newmergeinfo_file)
  try:
    with open(newmergeinfo_file, "r") as f:
      new_hash = md5_of_file(f)
  except IOError, e:
    if not mergeinfo:
      for entry in parsed_original_mergeinfo:
        get_new_location_segments(parsed_original_mergeinfo[entry], repo_root,
                                  wcpath, ctx)
    hasher(hash_file, newmergeinfo_file)
    with open(newmergeinfo_file, "r") as f:
      new_hash = md5_of_file(f)
  if old_hash == new_hash:
    with open(newmergeinfo_file, "r") as f:
      newmergeinfo = pickle.load(f)
    set_new_mergeinfo(wcpath, newmergeinfo, ctx)
    if os.path.exists(newmergeinfo_file):
      os.remove(newmergeinfo_file)
      os.remove(hash_file)
  else:
    print "The hashes do not match; the generated mergeinfo may have been tweaked unexpectedly."
##
# This function checks the working copy for any local modifications
##
def check_local_modifications(wcpath, temp_pool):
  has_local_mod = wc.svn_wc_revision_status(wcpath, None, 0, None, temp_pool)
  if has_local_mod.modified:
    print """The working copy has local modifications. Please revert them or clean
the working copy before running the script."""
    sys.exit(1)
def get_original_mergeinfo(wcpath, revision, depth, ctx, temp_pool):
  propget_list = client.svn_client_propget3("svn:mergeinfo", wcpath,
                                            revision, revision, depth, None,
                                            ctx, temp_pool)
  mergeinfo_catalog = propget_list[0]
  mergeinfo_catalog_dict = {}
  for entry in mergeinfo_catalog:
    mergeinfo_catalog_dict[entry] = core.svn_mergeinfo_parse(mergeinfo_catalog[entry],
                                                             temp_pool)
  return mergeinfo_catalog_dict
def main():
  try:
    opts, args = my_getopt(sys.argv[1:], "h?f", ["help", "fix"])
  except Exception, e:
    sys.stderr.write("Improperly used. Run with -h or --help for usage.\n")
    sys.exit(1)
  if len(args) == 1:
    wcpath = args[0]
    wcpath = os.path.abspath(wcpath)
  else:
    usage()
    sys.exit(1)
  fix = 0
  current_path = os.getcwd()
  hash_file = os.path.join(current_path, ".hashfile")
  newmergeinfo_file = os.path.join(current_path, ".newmergeinfo")
  temp_pool = core.svn_pool_create()
  ctx = client.svn_client_create_context(temp_pool)
  depth = core.svn_depth_infinity
  revision = core.svn_opt_revision_t()
  revision.kind = core.svn_opt_revision_unspecified
  for opt, values in opts:
    if opt == "--help" or opt in ("-h", "-?"):
      usage()
      sys.exit(0)
    elif opt == "--fix" or opt == "-f":
      fix = 1
  # Check for any local modifications in the working copy
  check_local_modifications(wcpath, temp_pool)
  parsed_original_mergeinfo = get_original_mergeinfo(wcpath, revision,
                                                     depth, ctx, temp_pool)
  repo_root = client.svn_client_root_url_from_path(wcpath, ctx, temp_pool)
  core.svn_config_ensure(None)
  if fix == 0:
    sanitize_mergeinfo(parsed_original_mergeinfo, repo_root, wcpath, ctx,
                       hash_file, newmergeinfo_file, temp_pool)
  if fix == 1:
    fix_sanitized_mergeinfo(parsed_original_mergeinfo, repo_root, wcpath,
                            ctx, hash_file, newmergeinfo_file, temp_pool)
if __name__ == "__main__":
  try:
    main()
  except KeyboardInterrupt:
    print ""
    sys.stderr.write("The script was interrupted and stopped manually.")
    print ""