Re: [PATCH v2] remote-mediawiki: limit filenames to legal

2017-10-30 Thread Antoine Beaupré
On 2017-10-30 11:34:11, Matthieu Moy wrote:
> Antoine Beaupré writes:
>
>> @@ -52,7 +53,7 @@ sub smudge_filename {
>>  $filename =~ s/ /_/g;
>>  # Decode forbidden characters encoded in clean_filename
>>  $filename =~ s/_%_([0-9a-fA-F][0-9a-fA-F])/sprintf('%c', hex($1))/ge;
>> -return $filename;
>> +return substr($filename, 0, NAME_MAX-3);
>
> There's a request to allow a configurable extension (.mediawiki would
> help importing in some wikis, see
> https://github.com/Git-Mediawiki/Git-Mediawiki/issues/42). You should at
> least make this something like length(".mw") so that the next search
> for ".mw" finds this.

I believe I did that in v3.

> Also, note that your solution works for using Git-Mediawiki in a
> read-only way, but if you start modifying and pushing such files, you'll
> get into trouble. It probably makes sense to issue a warning in such
> a case.

True. I didn't consider that, but then again the patch is not a
regression: you couldn't have pushed those repos in the first place
anyways...

A.

-- 
The history of any one part of the earth, like the life of a soldier,
consists of long periods of boredom and short periods of terror.
   - British geologist Derek V. Ager


Re: [PATCH v2] remote-mediawiki: limit filenames to legal

2017-10-30 Thread Matthieu Moy
Antoine Beaupré writes:

> @@ -52,7 +53,7 @@ sub smudge_filename {
>   $filename =~ s/ /_/g;
>   # Decode forbidden characters encoded in clean_filename
>   $filename =~ s/_%_([0-9a-fA-F][0-9a-fA-F])/sprintf('%c', hex($1))/ge;
> - return $filename;
> + return substr($filename, 0, NAME_MAX-3);

There's a request to allow a configurable extension (.mediawiki would
help importing in some wikis, see
https://github.com/Git-Mediawiki/Git-Mediawiki/issues/42). You should at
least make this something like length(".mw") so that the next search
for ".mw" finds this.

Also, note that your solution works for using Git-Mediawiki in a
read-only way, but if you start modifying and pushing such files, you'll
get into trouble. It probably makes sense to issue a warning in such
a case.
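
A minimal sketch of the two suggestions above, not taken from any posted
version of the patch. The names $EXT, $NAME_MAX and truncate_filename are
hypothetical, the warning text is made up, and the fallback to 255 is an
assumption for platforms where POSIX does not define NAME_MAX:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical: the suffix appended to page files. '.mw' is the one the
# thread mentions; making it configurable is only an assumption here.
my $EXT = '.mw';

# NAME_MAX is usually 255; fall back to that if POSIX cannot provide it.
my $NAME_MAX = eval { POSIX::NAME_MAX() } || 255;

# Truncate $filename so that "$filename$EXT" fits in NAME_MAX, and warn
# whenever something was cut off.
# Note: length() counts characters on decoded strings, while NAME_MAX is
# a limit in bytes, so non-ASCII titles may need a stricter bound.
sub truncate_filename {
    my ($filename) = @_;
    my $limit = $NAME_MAX - length($EXT);
    return $filename if length($filename) <= $limit;
    warn "filename too long, truncated to $limit characters; "
        . "pushing this page back may not work\n";
    return substr($filename, 0, $limit);
}

print truncate_filename('Main_Page'), "$EXT\n";
```

Deriving the limit from length($EXT) rather than a literal 3 keeps the
truncation correct if the extension ever changes, and makes it easy to
find when searching for ".mw".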

Regards,

-- 
Matthieu Moy
https://matthieu-moy.fr/


[PATCH v2] remote-mediawiki: limit filenames to legal

2017-10-29 Thread Antoine Beaupré
MediaWiki page names can be longer than NAME_MAX (generally 255)
characters, which makes checkout fail. We simply strip the extra
characters, which may mean that one page's content overwrites another's
(the last edit wins).

Ideally, we would use a cleverer scheme to find unique names, but
that would be more difficult and error-prone for a situation that
should rarely happen in the first place.

Signed-off-by: Antoine Beaupré 
---
 contrib/mw-to-git/Git/Mediawiki.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/mw-to-git/Git/Mediawiki.pm b/contrib/mw-to-git/Git/Mediawiki.pm
index d13c4dfa7..c9f22680a 100644
--- a/contrib/mw-to-git/Git/Mediawiki.pm
+++ b/contrib/mw-to-git/Git/Mediawiki.pm
@@ -2,6 +2,7 @@ package Git::Mediawiki;
 
 use 5.008;
 use strict;
+use POSIX;
 use Git;
 
 BEGIN {
@@ -52,7 +53,7 @@ sub smudge_filename {
    $filename =~ s/ /_/g;
    # Decode forbidden characters encoded in clean_filename
    $filename =~ s/_%_([0-9a-fA-F][0-9a-fA-F])/sprintf('%c', hex($1))/ge;
-   return $filename;
+   return substr($filename, 0, NAME_MAX-3);
 }
 
 sub connect_maybe {
-- 
2.11.0
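
The commit message above notes that plain truncation can make two pages
share one file name, and that a cleverer scheme for unique names would be
ideal. One possible scheme, sketched here purely as an illustration and not
part of the patch, replaces the tail of an over-long name with a short hash
of the full name. The helper name unique_short_name and the 8-character tag
length are hypothetical choices:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);
use Encode qw(encode_utf8);

# Hypothetical helper: shorten an over-long name but keep it distinct by
# replacing its tail with a short hash of the complete name.
sub unique_short_name {
    my ($filename, $limit) = @_;
    return $filename if length($filename) <= $limit;
    # sha1_hex needs bytes, so encode possibly non-ASCII titles first.
    my $tag = '_' . substr(sha1_hex(encode_utf8($filename)), 0, 8);
    return substr($filename, 0, $limit - length($tag)) . $tag;
}

my $limit  = 255 - length('.mw');
my $page_a = ('x' x 300) . 'A';
my $page_b = ('x' x 300) . 'B';

# Plain truncation maps both titles to the same file name...
print "plain truncation collides\n"
    if substr($page_a, 0, $limit) eq substr($page_b, 0, $limit);

# ...while the hashed variant keeps them apart.
print "hashed names differ\n"
    if unique_short_name($page_a, $limit) ne unique_short_name($page_b, $limit);
```

A hash is not reversible, so mapping such a file name back to its page
title would need a separate lookup, which this sketch does not attempt.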