Re: WIP patch to reduce rewrite_filename duplication
Collin Funk wrote: > > Correct. Feel free to add tests for --copy-file, if you like. > > I didn't spend time on it, since --copy-file is simple enough. > > Sure, I can have a look at adding some. I see it is part of the Gnulib > Savannah group so I can commit changes there. I assume same rules go > there too, always send patches for review? In the beginning, yes, the same rules hold as for the main gnulib repository. People with more experience can commit right away. And you have seen that I announced my test suite changes in an informal way; code review is less important in this case because possible regressions will only affect us ourselves. Bruno
Re: WIP patch to reduce rewrite_filename duplication
On 4/25/24 5:14 PM, Bruno Haible wrote: > Yes, the other heuristic is that when many functions operate on the > same object, like methods in class do, the pointer to that object > is often passed first. And, as you noticed, the two heuristics collide... > > Here, I would not think of 'rewrite_file_name' as a method pertaining > to the class GLConfig. Rather, GLConfig is just one of the parameters. Yeah, that makes sense. I'll keep that in mind. > Perfect. OK to push. Done. > Correct. Feel free to add tests for --copy-file, if you like. > I didn't spend time on it, since --copy-file is simple enough. Sure, I can have a look at adding some. I see it is part of the Gnulib Savannah group so I can commit changes there. I assume same rules go there too, always send patches for review? > Ah, that's because exitfail.h was modified in Gnulib a few days ago. > Will adapt the override in gettext. Thanks for the notice. Thanks! I see now that the #ifdef __cplusplus change messed up the patch. Collin
Re: WIP patch to reduce rewrite_filename duplication
Hi Collin, > > Yep, that's the right way to do it. Maybe file_name should come first > > (by the usual heuristic that the argument that shows most variation > > comes first)? > > Sure, that is fine with me. I think the other way is based on how I > like to write C. Usually I like to place structures (or pointers to > them) first. The best example I can think of off the top of my head > is: > > int fprintf (FILE *stream, const char *format, ...); > > vs. > >int fputs (const char *str, FILE *stream); > > I always feel awkward using 'fputs' because of that. Yes, the other heuristic is that when many functions operate on the same object, like methods in class do, the pointer to that object is often passed first. And, as you noticed, the two heuristics collide... Here, I would not think of 'rewrite_file_name' as a method pertaining to the class GLConfig. Rather, GLConfig is just one of the parameters. > How does this patch look? Perfect. OK to push. > I ran all tests with Python 3.7 and they pass. I don't think > gnulib-tool --copy-file is tested there though. Correct. Feel free to add tests for --copy-file, if you like. I didn't spend time on it, since --copy-file is simple enough. > I tried to use gettext > since I remember autogen.sh does it a few times but I see the > following failure: > > $ cd gettext > $ env GNULIB_TOOL_IMPL=sh+py sh -x ./autogen.sh > [...] > Copying file gnulib-lib/exitfail.c > 1 out of 1 hunk FAILED -- saving rejects to file /tmp/glDMpS4d/exitfail.h.rej > /home/collin/.local/src/gnulib/gnulib-tool.sh: *** patch file > gnulib-local/lib/exitfail.h.diff didn't apply cleanly > /home/collin/.local/src/gnulib/gnulib-tool.sh: *** Stop. > + exit 1 Ah, that's because exitfail.h was modified in Gnulib a few days ago. Will adapt the override in gettext. Thanks for the notice. Bruno
Re: WIP patch to reduce rewrite_filename duplication
Hi Bruno, On 4/25/24 1:18 PM, Bruno Haible wrote: > Yep, that's the right way to do it. Maybe file_name should come first > (by the usual heuristic that the argument that shows most variation > comes first)? Sure, that is fine with me. I think the other way is based on how I like to write C. Usually I like to place structures (or pointers to them) first. The best example I can think of off the top of my head is: int fprintf (FILE *stream, const char *format, ...); vs. int fputs (const char *str, FILE *stream); I always feel awkward using 'fputs' because of that. But that is a different language so I can adapt. :) How does this patch look? I'll wait for you to review it before pushing any changes. I ran all tests with Python 3.7 and they pass. I don't think gnulib-tool --copy-file is tested there though. I tried to use gettext since I remember autogen.sh does it a few times but I see the following failure: $ cd gettext $ env GNULIB_TOOL_IMPL=sh+py sh -x ./autogen.sh [...] Copying file gnulib-lib/exitfail.c 1 out of 1 hunk FAILED -- saving rejects to file /tmp/glDMpS4d/exitfail.h.rej /home/collin/.local/src/gnulib/gnulib-tool.sh: *** patch file gnulib-local/lib/exitfail.h.diff didn't apply cleanly /home/collin/.local/src/gnulib/gnulib-tool.sh: *** Stop. + exit 1 I assume that is unrelated to these changes and will be an easy fix for you. When I remove the diff everything passes with GNULIB_TOOL_IMPL=sh+py. CollinFrom 95f83aabfcd2cd0624ae85357f7f40f34bb1a01b Mon Sep 17 00:00:00 2001 From: Collin Funk Date: Thu, 25 Apr 2024 15:30:29 -0700 Subject: [PATCH] gnulib-tool.py: Reduce code duplication in file name transformations. * pygnulib/functions.py: New file for shared functions between modules. Add a function based on functions removed from GLImport and GLTestDir. Accepts a single file name instead of a list. * pygnulib/GLImport.py (GLImport.prepare): Use the new function. (GLImport.rewrite_new_files, GLImport.rewrite_old_files): Remove functions. * pygnulib/GLTestDir.py (GLTestDir.execute): Use the new function. (GLTestDir.rewrite_files): Remove functions. * pygnulib/main.py (main): Remove unused function import. Use the new function. --- ChangeLog | 14 +++ pygnulib/GLImport.py | 97 +-- pygnulib/GLTestDir.py | 55 +--- pygnulib/functions.py | 52 +++ pygnulib/main.py | 17 +--- 5 files changed, 88 insertions(+), 147 deletions(-) create mode 100644 pygnulib/functions.py diff --git a/ChangeLog b/ChangeLog index abefd3bae2..276258b885 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,17 @@ +2024-04-25 Collin Funk + + gnulib-tool.py: Reduce code duplication in file name transformations. + * pygnulib/functions.py: New file for shared functions between modules. + Add a function based on functions removed from GLImport and GLTestDir. + Accepts a single file name instead of a list. + * pygnulib/GLImport.py (GLImport.prepare): Use the new function. + (GLImport.rewrite_new_files, GLImport.rewrite_old_files): Remove + functions. + * pygnulib/GLTestDir.py (GLTestDir.execute): Use the new function. + (GLTestDir.rewrite_files): Remove functions. + * pygnulib/main.py (main): Remove unused function import. Use the new + function. + 2024-04-25 Bruno Haible doc: Remove documentation of IRIX as supported platform. diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py index cf1ebd10ec..47c0e83555 100644 --- a/pygnulib/GLImport.py +++ b/pygnulib/GLImport.py @@ -38,6 +38,7 @@ relativize, rmtree, ) +from .functions import rewrite_file_name from .GLError import GLError from .GLConfig import GLConfig from .GLModuleSystem import GLModuleTable @@ -303,80 +304,6 @@ def __repr__(self) -> str: result = '' % hex(id(self)) return result -def rewrite_old_files(self, files: list[str]) -> list[str]: -'''Replace auxdir, docbase, sourcebase, m4base and testsbase from default -to their version from cached config.''' -if type(files) is not list: -raise TypeError('files argument must has list type, not %s' -% type(files).__name__) -for file in files: -if type(file) is not str: -raise TypeError('each file must be a string instance') -files = sorted(set(files)) -files = [ '%s%s' % (file, os.path.sep) - for file in files ] -auxdir = self.cache['auxdir'] -docbase = self.cache['docbase'] -sourcebase = self.cache['sourcebase'] -m4base = self.cache['m4base'] -testsbase = self.cache['testsbase'] -result = [] -for file in files: -if file.startswith('build-aux/'): -path = substart('build-aux/', '%s/' % auxdir, file) -elif file.startswith('doc/'): -path = substart('doc/', '%s/' % docbase, file) -elif file.startswith('lib/'):
Re: WIP patch to reduce rewrite_filename duplication
Hi Collin, > > IMO, the creation of a rewrite_file_name method (singular! you mentioned > > a couple of days ago that the ability to rewrite a single file name is > > one of the motivations for this refactoring) in a central place goes in > > the correct direction. However, moving knowledge about 'cache' and 'config' > > into GLFileTable does not. > > Sure, I agree. Though I'm not sure the best place to put it. All of > our non-private functions are in constants.py, but this function seems > out of place there. Since all of the functions there are string > helpers and such. In other words, no GL* classes are imported there. I agree, it's out-of-place in constants.py. > Maybe it would be best to create a functions.py file? Sounds reasonable, yes. > For the actual function it would be something like this: > > def rewrite_file_name(config: GLConfig, > file_name: str, > also_tests_lib: bool = False) -> str: > ... > if also_tests_lib and file_name.startswith('tests=lib/'): ># OK Yep, that's the right way to do it. Maybe file_name should come first (by the usual heuristic that the argument that shows most variation comes first)? Bruno
Re: WIP patch to reduce rewrite_filename duplication
Hi Bruno, Thank you for the detailed explanations as always. On 4/25/24 1:48 AM, Bruno Haible wrote: > Object-oriented programming is not easy, and it comes with constant > hesitations and deliberations. :) My least favorite part of my degree was object-oriented design and database normalization, for similar reasons. > Another, bigger warning sign is that you can no longer easily explain > what the GLFileTable is. > Before, it was a data structure that holds file lists during an operation. > After, it is a helper class that holds file lists and also meddles with > file name rewriting as required by the caller. (That's the best description > I can come up with!) I see what you mean. I guess the strange setters would be a symptom of that warning sign that GLFileTable is doing to much. The way you described the GLFileTable reminds me of the reasoning for adding the 'dataclasses' module added in Python 3.7 [1]. Since we are both very big fans of decorators of course. :) > IMO, the creation of a rewrite_file_name method (singular! you mentioned > a couple of days ago that the ability to rewrite a single file name is > one of the motivations for this refactoring) in a central place goes in > the correct direction. However, moving knowledge about 'cache' and 'config' > into GLFileTable does not. Sure, I agree. Though I'm not sure the best place to put it. All of our non-private functions are in constants.py, but this function seems out of place there. Since all of the functions there are string helpers and such. In other words, no GL* classes are imported there. Maybe it would be best to create a functions.py file? > Neither of the two proprosals smells well. Why not add a boolean > 'also_tests_lib' parameter to the rewrite_file_name method, that tells > whether to replace 'tests=lib/' or not? This way, the refactoring does > not introduce a functional difference. Sorry, I think I worded that poorly. Since 'tests=lib/' is internally used I was thinking that maybe we should warn/error if it is used. I guess that isn't a problem worth losing sleep over. For the actual function it would be something like this: def rewrite_file_name(config: GLConfig, file_name: str, also_tests_lib: bool = False) -> str: ... if also_tests_lib and file_name.startswith('tests=lib/'): # OK Since 'tests=lib/' is used internally I felt it makes more sense to default it to False. That way the call in main.py can ignore it and the calls in GLImport.py and GLTestDir.py can use it internally. > In a case like this, it's better if you post the patch for review, rather > than push it too quickly. Trust me: OO takes a long time to learn. Yes, I agree. [1] https://docs.python.org/3/library/dataclasses.html#module-dataclasses Collin
Re: WIP patch to reduce rewrite_filename duplication
Hi Collin, Object-oriented programming is not easy, and it comes with constant hesitations and deliberations. > Less repeated code to maintain. Yes, this is one of the goals to keep in mind. The other most relevant questions are: - Can the purpose of a class be explained in simple terms? - Is a class properly decoupled from the code that uses the class? - Are we using the common design patterns (such as getters/setters, class-private caching, etc.)? Here you already have noticed a warning sign: > The setters also take two arguments which feels a bit unorthodox to me. Another, bigger warning sign is that you can no longer easily explain what the GLFileTable is. Before, it was a data structure that holds file lists during an operation. After, it is a helper class that holds file lists and also meddles with file name rewriting as required by the caller. (That's the best description I can come up with!) The two warning signs together indicate that, in this case, you are trying to move too much logic from the GLImport and GLTestDir classes into GLFileTable. Don't get me wrong: Moving application code to helper classes can be OK in other cases. But when done so, it should decouple the class and the caller. Which is not being done here: The fact that the caller has a 'cache' and a 'config' variable had no impact on the GLFileTable class, and after the patch it would. > I haven't pushed this patch yet, but it goes in the correct direction, > in my opinion. IMO, the creation of a rewrite_file_name method (singular! you mentioned a couple of days ago that the ability to rewrite a single file name is one of the motivations for this refactoring) in a central place goes in the correct direction. However, moving knowledge about 'cache' and 'config' into GLFileTable does not. > > 'tests=lib/' should not occur in the scope of mode == 'copy-file'. > > But this string is an indicator inside gnulib-tool; users of gnulib-tool > > should not be allowed to pass it. > > Does it make more sense to check for a filename starting with > 'tests=lib/' and error out or is it safe to assume the input is clean? Neither of the two proprosals smells well. Why not add a boolean 'also_tests_lib' parameter to the rewrite_file_name method, that tells whether to replace 'tests=lib/' or not? This way, the refactoring does not introduce a functional difference. > Ideally I would like to use a module-level function or @staticmethod > there. Module-level functions are OK. @staticmethod raises questions; in particular, whether the method belongs there, where you are attempting to put it. > Any opinions on these ideas? If not, I will probably sleep on it and > push an improved patch. In a case like this, it's better if you post the patch for review, rather than push it too quickly. Trust me: OO takes a long time to learn. Bruno