[gentoo-portage-dev] [PATCH] movefile: support in-kernel file copying on Linux (bug 607868)

2017-03-01 Thread Zac Medico
Perform in-kernel file copying when possible, and also support
reflinks and sparse files. If the optimized implementation
fails at runtime, gracefully fall back to shutil.copyfile.

Compile-time and run-time fallbacks are implemented, so that
any incompatibilities will be handled gracefully. For example,
if the code is compiled on a system that supports the
copy_file_range syscall, but at run-time an older kernel that
does not support this syscall is detected, it will be handled
gracefully.
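
The run-time fallback described above can be sketched in pure Python as
follows. This is only an illustration, not the code from the patch: the
names copyfile_with_fallback and native_copy are hypothetical, and the set
of errno values treated as "not supported here" is an assumption.

```python
import errno
import shutil

# Assumption: these errno values mean "the optimized path is unavailable on
# this kernel or filesystem", so falling back is safe.
_UNSUPPORTED = (errno.ENOSYS, errno.EOPNOTSUPP, errno.EXDEV)


def copyfile_with_fallback(src, dst, native_copy=None):
    """Copy src to dst and report which code path did the work.

    native_copy is a hypothetical callable taking (src_fd, dst_fd). It stands
    in for an optional compiled helper and may be None when that helper
    could not be built or imported (the compile-time fallback).
    """
    if native_copy is not None:
        try:
            with open(src, 'rb') as src_file, open(dst, 'wb') as dst_file:
                native_copy(src_file.fileno(), dst_file.fileno())
            return 'native'
        except OSError as e:
            # Re-raise real I/O errors; only swallow "unsupported" ones.
            if e.errno not in _UNSUPPORTED:
                raise
    # Run-time fallback: portable copy from the standard library.
    shutil.copyfile(src, dst)
    return 'fallback'
```

With native_copy=None the function goes straight to shutil.copyfile, which
corresponds to the case where the extension module is missing entirely.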

X-Gentoo-Bug: 607868
X-Gentoo-Bug-Url: https://bugs.gentoo.org/show_bug.cgi?id=607868
---
 pym/portage/tests/util/file_copy/__init__.py  |   0
 pym/portage/tests/util/file_copy/__test__.py  |   0
 pym/portage/tests/util/file_copy/test_copyfile.py |  68 +++
 pym/portage/util/file_copy/__init__.py|  78 
 pym/portage/util/movefile.py  |   4 +-
 setup.py  |   9 +
 src/portage_util_file_copy_reflink_linux.c| 225 ++
 7 files changed, 383 insertions(+), 1 deletion(-)
 create mode 100644 pym/portage/tests/util/file_copy/__init__.py
 create mode 100644 pym/portage/tests/util/file_copy/__test__.py
 create mode 100644 pym/portage/tests/util/file_copy/test_copyfile.py
 create mode 100644 pym/portage/util/file_copy/__init__.py
 create mode 100644 src/portage_util_file_copy_reflink_linux.c

diff --git a/pym/portage/tests/util/file_copy/__init__.py b/pym/portage/tests/util/file_copy/__init__.py
new file mode 100644
index 000..e69de29
diff --git a/pym/portage/tests/util/file_copy/__test__.py b/pym/portage/tests/util/file_copy/__test__.py
new file mode 100644
index 000..e69de29
diff --git a/pym/portage/tests/util/file_copy/test_copyfile.py b/pym/portage/tests/util/file_copy/test_copyfile.py
new file mode 100644
index 000..987a701
--- /dev/null
+++ b/pym/portage/tests/util/file_copy/test_copyfile.py
@@ -0,0 +1,68 @@
+# Copyright 2017 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
+
+import shutil
+import tempfile
+
+from portage import os
+from portage.tests import TestCase
+from portage.checksum import perform_md5
+from portage.util.file_copy import copyfile
+
+
+class CopyFileTestCase(TestCase):
+
+	def testCopyFile(self):
+
+		tempdir = tempfile.mkdtemp()
+		try:
+			src_path = os.path.join(tempdir, 'src')
+			dest_path = os.path.join(tempdir, 'dest')
+			content = b'foo'
+
+			with open(src_path, 'wb') as f:
+				f.write(content)
+
+			copyfile(src_path, dest_path)
+
+			self.assertEqual(perform_md5(src_path), perform_md5(dest_path))
+		finally:
+			shutil.rmtree(tempdir)
+
+
+class CopyFileSparseTestCase(TestCase):
+
+	def testCopyFileSparse(self):
+
+		# This test is expected to fail on platforms where we have
+		# not implemented sparse copy, so set the todo flag in order
+		# to tolerate failures.
+		self.todo = True
+
+		tempdir = tempfile.mkdtemp()
+		try:
+			src_path = os.path.join(tempdir, 'src')
+			dest_path = os.path.join(tempdir, 'dest')
+			content = b'foo'
+
+			# Use seek to create some sparse blocks. Don't make these
+			# files too big, in case the filesystem doesn't support
+			# sparse files.
+			with open(src_path, 'wb') as f:
+				f.write(content)
+				f.seek(2**18, 1)
+				f.write(content)
+				f.seek(2**19, 1)
+				f.write(content)
+
+			copyfile(src_path, dest_path)
+
+			self.assertEqual(perform_md5(src_path), perform_md5(dest_path))
+
+			# If sparse blocks were preserved, then both files should
+			# consume the same number of blocks.
+			self.assertEqual(
+				os.stat(src_path).st_blocks,
+				os.stat(dest_path).st_blocks)
+		finally:
+			shutil.rmtree(tempdir)
diff --git a/pym/portage/util/file_copy/__init__.py b/pym/portage/util/file_copy/__init__.py
new file mode 100644
index 000..5c7aff1
--- /dev/null
+++ b/pym/portage/util/file_copy/__init__.py
@@ -0,0 +1,78 @@
+# Copyright 2017 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
+
+import os
+import shutil
+import tempfile
+
+try:
+   from portage.util.file_copy.reflink_linux import file_copy as _file_copy
+except ImportError:
+   _file_copy = None
+
+
+_copyfile = None
+
+
+def 

Re: [gentoo-portage-dev] [PATCH 1/2] checksum: Fix overriding fallbacks on broken pycrypto

2017-03-01 Thread Michał Górny
On Tue, 28.02.2017 at 23:57 -0800, Zac Medico wrote:
> On 02/28/2017 11:34 PM, Michał Górny wrote:
> > The pycrypto override used the same variables as actual hash functions
> > before determining whether its functions are useful. As a result, if
> > pycrypto had a broken module and no hash function was generated,
> > the possible previous implementation was replaced by None.
> > ---
> >  pym/portage/checksum.py | 12 ++--
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/pym/portage/checksum.py b/pym/portage/checksum.py
> > index a46b820af..fc38417a7 100644
> > --- a/pym/portage/checksum.py
> > +++ b/pym/portage/checksum.py
> > @@ -105,14 +105,14 @@ except ImportError:
> >  # is broken somehow.
> >  try:
> >  	from Crypto.Hash import SHA256, RIPEMD
> > -	sha256hash = getattr(SHA256, 'new', None)
> > -	if sha256hash is not None:
> > +	sha256hash_ = getattr(SHA256, 'new', None)
> > +	if sha256hash_ is not None:
> >  		sha256hash = _generate_hash_function("SHA256",
> > -			sha256hash, origin="pycrypto")
> > -	rmd160hash = getattr(RIPEMD, 'new', None)
> > -	if rmd160hash is not None:
> > +			sha256hash_, origin="pycrypto")
> > +	rmd160hash_ = getattr(RIPEMD, 'new', None)
> > +	if rmd160hash_ is not None:
> >  		rmd160hash = _generate_hash_function("RMD160",
> > -			rmd160hash, origin="pycrypto")
> > +			rmd160hash_, origin="pycrypto")
> >  except ImportError:
> >  	pass
> >  
> > 
> 
> Looks good.

Thanks. Merged both patches.
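
For readers skimming the thread, the bug and the fix in the quoted patch
can be reproduced in isolation. The sketch below is not Portage code: the
helper override_if_usable and the stand-in module objects are hypothetical,
and hashlib.sha256 merely plays the role of a previously selected
implementation.

```python
import hashlib
import types


def override_if_usable(current, module, name='new'):
    """Return module.<name> if it exists, otherwise keep the current value.

    The lookup result goes into a temporary first, so a module that lacks
    the attribute can never replace a working implementation with None.
    """
    candidate = getattr(module, name, None)
    if candidate is not None:
        return candidate
    return current


# A previously selected, working implementation.
sha256hash = hashlib.sha256

# Hypothetical stand-in for a broken module that has no 'new' attribute.
broken_module = types.SimpleNamespace()

# Buggy pattern from before the patch: the lookup result is assigned
# directly, so the working implementation is lost.
clobbered = getattr(broken_module, 'new', None)

# Fixed pattern: the working implementation survives.
sha256hash = override_if_usable(sha256hash, broken_module)
```

Here clobbered ends up as None, while sha256hash still refers to the
original function.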

-- 
Best regards,
Michał Górny



Re: [gentoo-portage-dev] [PATCHES] Fix md5 refs in vartree

2017-03-01 Thread Michał Górny
On Tue, 28.02.2017 at 11:03 -0800, Zac Medico wrote:
> On 02/28/2017 02:12 AM, Michał Górny wrote:
> > Hi,
> > 
> > Here's a prequel to my other patch set. It cleans up the use of MD5
> > in vartree.
> > 
> > Currently, MD5 is loaded from three different locations -- most of
> > merging code uses portage.checksum high-level functions, one bit uses
> > implementation detail of portage.checksum and one bit imports hashlib
> > directly.
> > 
> > I've replaced the use of portage.checksum implementation detail with
> > another use of hashlib, and removed the compatibility for Python < 2.5.
> > I think it's a reasonable temporary measure until someone on friendly
> > terms with the code reworks it not to use MD5.
> > 
> > --
> > Best regards,
> > Michał Górny
> > 
> > 
> 
> Both series look good to me.

Thanks. Merged both yesterday.
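
As a rough illustration of what "using hashlib directly" amounts to in the
quoted description, a file's MD5 can be computed as below. The function
name file_md5 and the chunk size are assumptions made for this sketch; the
actual vartree code is not shown in this thread.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use bounded for large files.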

-- 
Best regards,
Michał Górny

