Re: [win32][patch] executable() may fail on very long filename

2019-02-20 Thread ktakata65536
Hi,

2019/2/19 Tue 23:22:35 UTC+9 ktakat...@gmail.com wrote:
> Hi,
> 
> 2018/10/8 Mon 10:50:46 UTC+9 Ken Takata wrote:
> > Hi,
> > 
> > 2016/2/2 Tue 7:11:32 UTC+9 Bram Moolenaar wrote:
> > > Ken Takata wrote:
> > > 
> > > > When 'enc' is utf-8, executable() may fail on a very long filename that
> > > > is longer than _MAX_PATH bytes in UTF-8 but shorter than _MAX_PATH
> > > > characters in UTF-16.
> > > > Here is an example on Japanese Windows:
> > > > 
> > > > C:\tmp>gvim -N -u NONE --cmd "set enc=utf-8" 
> > > > .bat
> > > > :w
> > > > :echo glob('あ*.bat')
> > > > .bat
> > > > :echo strlen(glob('あ*.bat'))
> > > > 604   " longer than 260
> > > > :echo strchars(glob('あ*.bat'))
> > > > 204   " shorter than 260
> > > > :echo executable(glob('あ*.bat'))
> > > > 0 " 1 is expected.
> > > > 
> > > > 
> > > > Attached patch fixes the problem.
> > > 
> > > Thanks!
> > 
> > I have updated the patch for 8.1.0453:
> > 
> > * Fixed conflicts.
> > * Fixed a typo in a comment which was added in 8.1.0453.
> 
> I have updated the patch for the latest source code:
> 
> * Fixed conflicts.
> * Use C++ style comments.

I added a test for this and created a PR so that we can check if the test
passes on AppVeyor: https://github.com/vim/vim/pull/4015

Regards,
Ken Takata

-- 
-- 
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

--- 
You received this message because you are subscribed to the Google Groups 
"vim_dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to vim_dev+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [win32][patch] executable() may fail on very long filename

2019-02-19 Thread ktakata65536
Hi,

2018/10/8 Mon 10:50:46 UTC+9 Ken Takata wrote:
> Hi,
> 
> 2016/2/2 Tue 7:11:32 UTC+9 Bram Moolenaar wrote:
> > Ken Takata wrote:
> > 
> > > When 'enc' is utf-8, executable() may fail on a very long filename that
> > > is longer than _MAX_PATH bytes in UTF-8 but shorter than _MAX_PATH
> > > characters in UTF-16.
> > > Here is an example on Japanese Windows:
> > > 
> > > C:\tmp>gvim -N -u NONE --cmd "set enc=utf-8" 
> > > .bat
> > > :w
> > > :echo glob('あ*.bat')
> > > .bat
> > > :echo strlen(glob('あ*.bat'))
> > > 604   " longer than 260
> > > :echo strchars(glob('あ*.bat'))
> > > 204   " shorter than 260
> > > :echo executable(glob('あ*.bat'))
> > > 0 " 1 is expected.
> > > 
> > > 
> > > Attached patch fixes the problem.
> > 
> > Thanks!
> 
> I have updated the patch for 8.1.0453:
> 
> * Fixed conflicts.
> * Fixed a typo in a comment which was added in 8.1.0453.

I have updated the patch for the latest source code:

* Fixed conflicts.
* Use C++ style comments.

Regards,
Ken Takata

# HG changeset patch
# Parent 6ee3a815113c128147fa9831c2004916a7daba86
# Parent  54b1387be17d939d03f3c1abc54ad15387e64c31

diff --git a/src/os_win32.c b/src/os_win32.c
--- a/src/os_win32.c
+++ b/src/os_win32.c
@@ -3515,14 +3515,17 @@ mch_writable(char_u *name)
 int
 mch_can_exe(char_u *name, char_u **path, int use_path)
 {
-char_u	buf[_MAX_PATH];
+// WinNT and later can use _MAX_PATH wide characters for a pathname, which
+// means that the maximum pathname is _MAX_PATH * 3 bytes when 'enc' is
+// UTF-8.
+char_u	buf[_MAX_PATH * 3];
 int		len = (int)STRLEN(name);
 char_u	*p, *saved;
 
-if (len >= _MAX_PATH)	/* safety check */
+if (len >= sizeof(buf))	// safety check
 	return FALSE;
 
-/* Ty using the name directly when a Unix-shell like 'shell'. */
+// Try using the name directly when a Unix-shell like 'shell'.
 if (strstr((char *)gettail(p_sh), "sh") != NULL)
 	if (executable_exists((char *)name, path, use_path))
 	return TRUE;
@@ -3555,7 +3558,7 @@ mch_can_exe(char_u *name, char_u **path,
 }
 vim_free(saved);
 
-vim_strncpy(buf, name, _MAX_PATH - 1);
+vim_strncpy(buf, name, sizeof(buf) - 1);
 p = mch_getenv("PATHEXT");
 if (p == NULL)
 	p = (char_u *)".com;.exe;.bat;.cmd";
@@ -3570,7 +3573,7 @@ mch_can_exe(char_u *name, char_u **path,
 		++p;
 	}
 	else
-	copy_option_part(&p, buf + len, _MAX_PATH - len, ";");
+	copy_option_part(&p, buf + len, sizeof(buf) - len, ";");
 	if (executable_exists((char *)buf, path, use_path))
 	return TRUE;
 }


Re: [win32][patch] executable() may fail on very long filename

2018-10-07 Thread Ken Takata
Hi,

2018/10/8 Mon 11:40:07 UTC+9 Ken Takata wrote:
> Hi Tony,
> 
> 2018/10/8 Mon 11:32:39 UTC+9 Tony Mechelynck wrote:
> > On Mon, Oct 8, 2018 at 3:50 AM Ken Takata  wrote:
> > > [...]
> > > I have updated the patch for 8.1.0453:
> > >
> > > * Fixed conflicts.
> > > * Fixed a typo in a comment which was added in 8.1.0453.
> > >
> > > Regards,
> > > Ken Takata
> > 
> > In UTF-8, characters outside the BMP (i.e. characters in the range
> > > U+10000 to U+10FFFD), including some "CJK Extension" characters in
> > plane 2, use 4 bytes each, not 3. However, in UTF-16le as used by
> > Windows, each of those non-BMP characters takes up 2 words (one high
> > surrogate and one low surrogate) instead of 1, so maybe (I don't know)
> > they might "count double" towards the allowed _MAX_PATH characters.
> 
> Of course, the buffer size "_MAX_PATH * 3" takes into account those
> non-BMP characters. A non-BMP character will be stored as two WCHARs
> which are 4 bytes in UTF-16. And if it is converted to UTF-8, it is
> also 4 bytes. So the buffer size is correct. No need to multiply by 4.

And a WCHAR from U+0800 to U+FFFF will be converted to a 3-byte UTF-8
sequence. So it really does need to be multiplied by 3.

Regards,
Ken Takata



Re: [win32][patch] executable() may fail on very long filename

2018-10-07 Thread Ken Takata
Hi Tony,

2018/10/8 Mon 11:32:39 UTC+9 Tony Mechelynck wrote:
> On Mon, Oct 8, 2018 at 3:50 AM Ken Takata  wrote:
> > [...]
> > I have updated the patch for 8.1.0453:
> >
> > * Fixed conflicts.
> > * Fixed a typo in a comment which was added in 8.1.0453.
> >
> > Regards,
> > Ken Takata
> 
> In UTF-8, characters outside the BMP (i.e. characters in the range
> U+10000 to U+10FFFD), including some "CJK Extension" characters in
> plane 2, use 4 bytes each, not 3. However, in UTF-16le as used by
> Windows, each of those non-BMP characters takes up 2 words (one high
> surrogate and one low surrogate) instead of 1, so maybe (I don't know)
> they might "count double" towards the allowed _MAX_PATH characters.

Of course, the buffer size "_MAX_PATH * 3" takes into account those
non-BMP characters. A non-BMP character will be stored as two WCHARs
which are 4 bytes in UTF-16. And if it is converted to UTF-8, it is
also 4 bytes. So the buffer size is correct. No need to multiply by 4.

Regards,
Ken Takata



Re: [win32][patch] executable() may fail on very long filename

2018-10-07 Thread Tony Mechelynck
On Mon, Oct 8, 2018 at 3:50 AM Ken Takata  wrote:
> [...]
> I have updated the patch for 8.1.0453:
>
> * Fixed conflicts.
> * Fixed a typo in a comment which was added in 8.1.0453.
>
> Regards,
> Ken Takata

In UTF-8, characters outside the BMP (i.e. characters in the range
U+10000 to U+10FFFD), including some "CJK Extension" characters in
plane 2, use 4 bytes each, not 3. However, in UTF-16le as used by
Windows, each of those non-BMP characters takes up 2 words (one high
surrogate and one low surrogate) instead of 1, so maybe (I don't know)
they might "count double" towards the allowed _MAX_PATH characters.

Best regards,
Tony.



Re: [win32][patch] executable() may fail on very long filename

2018-10-07 Thread Ken Takata
Hi,

2016/2/2 Tue 7:11:32 UTC+9 Bram Moolenaar wrote:
> Ken Takata wrote:
> 
> > When 'enc' is utf-8, executable() may fail on a very long filename that is
> > longer than _MAX_PATH bytes in UTF-8 but shorter than _MAX_PATH characters
> > in UTF-16.
> > Here is an example on Japanese Windows:
> > 
> > C:\tmp>gvim -N -u NONE --cmd "set enc=utf-8" 
> > .bat
> > :w
> > :echo glob('あ*.bat')
> > .bat
> > :echo strlen(glob('あ*.bat'))
> > 604   " longer than 260
> > :echo strchars(glob('あ*.bat'))
> > 204   " shorter than 260
> > :echo executable(glob('あ*.bat'))
> > 0 " 1 is expected.
> > 
> > 
> > Attached patch fixes the problem.
> 
> Thanks!

I have updated the patch for 8.1.0453:

* Fixed conflicts.
* Fixed a typo in a comment which was added in 8.1.0453.

Regards,
Ken Takata

# HG changeset patch
# Parent 6ee3a815113c128147fa9831c2004916a7daba86
# Parent  707014f5a78471ca3aa6fc1bb893119da5489633

diff --git a/src/os_win32.c b/src/os_win32.c
--- a/src/os_win32.c
+++ b/src/os_win32.c
@@ -3547,14 +3547,21 @@ mch_writable(char_u *name)
 int
 mch_can_exe(char_u *name, char_u **path, int use_path)
 {
+#ifdef FEAT_MBYTE
+/* WinNT and later can use _MAX_PATH wide characters for a pathname, which
+ * means that the maximum pathname is _MAX_PATH * 3 bytes when 'enc' is
+ * UTF-8. */
+char_u	buf[_MAX_PATH * 3];
+#else
 char_u	buf[_MAX_PATH];
+#endif
 int		len = (int)STRLEN(name);
 char_u	*p, *saved;
 
-if (len >= _MAX_PATH)	/* safety check */
+if (len >= sizeof(buf))	/* safety check */
 	return FALSE;
 
-/* Ty using the name directly when a Unix-shell like 'shell'. */
+/* Try using the name directly when a Unix-shell like 'shell'. */
 if (strstr((char *)gettail(p_sh), "sh") != NULL)
 	if (executable_exists((char *)name, path, use_path))
 	return TRUE;
@@ -3587,7 +3594,7 @@ mch_can_exe(char_u *name, char_u **path,
 }
 vim_free(saved);
 
-vim_strncpy(buf, name, _MAX_PATH - 1);
+vim_strncpy(buf, name, sizeof(buf) - 1);
 p = mch_getenv("PATHEXT");
 if (p == NULL)
 	p = (char_u *)".com;.exe;.bat;.cmd";
@@ -3602,7 +3609,7 @@ mch_can_exe(char_u *name, char_u **path,
 		++p;
 	}
 	else
-	copy_option_part(&p, buf + len, _MAX_PATH - len, ";");
+	copy_option_part(&p, buf + len, sizeof(buf) - len, ";");
 	if (executable_exists((char *)buf, path, use_path))
 	return TRUE;
 }


Re: [win32][patch] executable() may fail on very long filename

2016-02-01 Thread Bram Moolenaar

Ken Takata wrote:

> When 'enc' is utf-8, executable() may fail on a very long filename that is
> longer than _MAX_PATH bytes in UTF-8 but shorter than _MAX_PATH characters in
> UTF-16.
> Here is an example on Japanese Windows:
> 
> C:\tmp>gvim -N -u NONE --cmd "set enc=utf-8" 
> .bat
> :w
> :echo glob('あ*.bat')
> .bat
> :echo strlen(glob('あ*.bat'))
> 604   " longer than 260
> :echo strchars(glob('あ*.bat'))
> 204   " shorter than 260
> :echo executable(glob('あ*.bat'))
> 0 " 1 is expected.
> 
> 
> Attached patch fixes the problem.

Thanks!

-- 
hundred-and-one symptoms of being an internet addict:
100. The most exciting sporting events you noticed during summer 1996
was Netscape vs. Microsoft.

 /// Bram Moolenaar -- b...@moolenaar.net -- http://www.Moolenaar.net   \\\
///sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\  an exciting new programming language -- http://www.Zimbu.org///
 \\\help me help AIDS victims -- http://ICCF-Holland.org///
