Re: [Bioc-devel] Windows, normalizePath(), and non-ASCII characters

2018-06-01 Thread Mike Smith
Hi Val,

I think I achieved some resolution, if not total clarity.  It's to do with
the encoding of the two path variables:

> Encoding(path1)
[1] "unknown"
> Encoding(path2)
[1] "UTF-8"
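
For anyone wanting to see the difference without rhdf5, here is a minimal sketch in base R. It assumes the "unknown" string holds native code-page bytes (Latin-1 is used as a stand-in here), and the names p_latin1 and p_utf8 are made up for illustration. identical() compares strings after translating them to UTF-8, which would explain why the two paths look 'identical' even though the bytes handed to C code differ:

```r
# The same text held in two encodings (byte escapes keep this script ASCII-only)
p_latin1 <- "\xe9xample/test.h5"
Encoding(p_latin1) <- "latin1"
p_utf8 <- enc2utf8(p_latin1)

Encoding(c(p_latin1, p_utf8))   # "latin1" "UTF-8"
identical(p_latin1, p_utf8)     # TRUE: compared after translation to UTF-8

# The underlying bytes are what a C library actually receives
charToRaw(p_latin1)[1]          # e9
charToRaw(p_utf8)[1:2]          # c3 a9
```

Comparing charToRaw() output for path1 and path2 should show the same kind of difference on an affected Windows machine.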

I don't understand why recursive calls to normalizePath() change the
encoding, but the combination of HDF5 & Windows fails when given UTF-8
paths.  I've updated rhdf5 to try to ensure paths are encoded in Latin-1,
which Windows is fine with, but it'll still go awry if you use characters
outside that set.  I'm still searching for a more comprehensive solution.
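
As a rough illustration of the Latin-1 approach described above (not necessarily what rhdf5 actually does), a conversion step might look like the sketch below. to_latin1_path() is a hypothetical helper, and it relies on iconv() returning NA for input it cannot convert, which is the documented default but may vary between platforms:

```r
# Hypothetical helper: re-encode a path as Latin-1 before passing it to C code.
to_latin1_path <- function(path) {
  # Normalise to UTF-8 first so the 'from' encoding is known
  out <- iconv(enc2utf8(path), from = "UTF-8", to = "latin1")
  # iconv() yields NA for strings it cannot convert (default sub = NA)
  if (anyNA(out)) {
    stop("path contains characters that cannot be represented in Latin-1")
  }
  out
}

p <- to_latin1_path("\u00e9xample/test.h5")
charToRaw(p)[1]   # e9: a single Latin-1 byte rather than the UTF-8 pair c3 a9
```

Stopping with an error for unrepresentable paths at least makes the failure explicit, rather than letting the file creation fail later inside the HDF5 library.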

Thanks,
Mike

On Thu, 31 May 2018 at 20:09, Obenchain, Valerie <
valerie.obench...@roswellpark.org> wrote:

> Hi Mike,
> Is this still an issue or has it been resolved?
> Val
>
>
> On 05/22/2018 02:19 PM, Mike Smith wrote:
>
> In trying to diagnose this issue at https://support.bioconductor.org/p/108548/
> I've found some weird behaviour
> with Windows, normalizePath(), and non-ASCII characters.  Essentially, if I
> run normalizePath() recursively on a path that contains  'é' (I haven't
> tried other characters) something 'changes' in the string, but I can't work
> out what, and it breaks a subsequent .Call() which uses the path.
>
> The example below tries to demonstrate this in a fairly concise manner. It
> works fine if normalizePath() is run once, but fails after it's run a
> second time on itself.
>
> However, change "éxample" to "example" and both instances work. Similarly,
> both run fine on my Linux machine with the non-ASCII character in place.
>
> I'd be grateful if anyone else with a Windows machine could verify this
> behaviour, or shed any light on what might be the difference between path1
> and path2 below.
>
> Thanks,
> Mike
>
> --
>
> ## setup some HDF5 components required later
> flags <- rhdf5:::h5checkConstants("H5F_ACC", h5default("H5F_ACC"))
> fcpl <- rhdf5:::h5checktypeAndPLC(NULL, "H5P_FILE_CREATE", allowNULL = TRUE)
> fapl <- rhdf5::H5Pcreate("H5P_FILE_ACCESS")
>
> ## create a folder with non-ASCII character
> dir.create('éxample')
> setwd("éxample")
>
> ## create two normalized paths recursively - these are 'identical'
> path1 <- normalizePath('test.h5', mustWork = FALSE)
> path2 <- normalizePath(path1, mustWork = FALSE)
> identical(path1, path2)
>
> ## create an HDF5 file using path1 - this works
> fid <- .Call("_H5Fcreate", path1, flags, fcpl@ID, fapl@ID,
>  PACKAGE = "rhdf5")
> .Call("_H5Fclose", fid, PACKAGE = "rhdf5")
> file.remove(path1)
>
> ## create an HDF5 file using path2 - this fails
> fid2 <- .Call("_H5Fcreate", path2, flags, fcpl@ID, fapl@ID,
>  PACKAGE = "rhdf5")
> if(exists('fid2')) {
>   .Call("_H5Fclose", fid2, PACKAGE = "rhdf5")
>   file.remove(path2)
> }
>
> ## tidy up
> rhdf5::h5closeAll()
> setwd("../")

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Windows, normalizePath(), and non-ASCII characters

2018-05-31 Thread Obenchain, Valerie
Hi Mike,
Is this still an issue or has it been resolved?
Val


On 05/22/2018 02:19 PM, Mike Smith wrote:

In trying to diagnose this issue at
https://support.bioconductor.org/p/108548/ I've found some weird behaviour
with Windows, normalizePath(), and non-ASCII characters.  Essentially, if I
run normalizePath() recursively on a path that contains 'é' (I haven't
tried other characters) something 'changes' in the string, but I can't work
out what, and it breaks a subsequent .Call() which uses the path.

The example below tries to demonstrate this in a fairly concise manner. It
works fine if normalizePath() is run once, but fails after it's run a
second time on itself.

However, change "éxample" to "example" and both instances work. Similarly,
both run fine on my Linux machine with the non-ASCII character in place.

I'd be grateful if anyone else with a Windows machine could verify this
behaviour, or shed any light on what might be the difference between path1
and path2 below.

Thanks,
Mike

--

## setup some HDF5 components required later
flags <- rhdf5:::h5checkConstants("H5F_ACC", h5default("H5F_ACC"))
fcpl <- rhdf5:::h5checktypeAndPLC(NULL, "H5P_FILE_CREATE", allowNULL = TRUE)
fapl <- rhdf5::H5Pcreate("H5P_FILE_ACCESS")

## create a folder with non-ASCII character
dir.create('éxample')
setwd("éxample")

## create two normalized paths recursively - these are 'identical'
path1 <- normalizePath('test.h5', mustWork = FALSE)
path2 <- normalizePath(path1, mustWork = FALSE)
identical(path1, path2)

## create an HDF5 file using path1 - this works
fid <- .Call("_H5Fcreate", path1, flags, fcpl@ID, fapl@ID,
 PACKAGE = "rhdf5")
.Call("_H5Fclose", fid, PACKAGE = "rhdf5")
file.remove(path1)

## create an HDF5 file using path2 - this fails
fid2 <- .Call("_H5Fcreate", path2, flags, fcpl@ID, fapl@ID,
 PACKAGE = "rhdf5")
if(exists('fid2')) {
  .Call("_H5Fclose", fid2, PACKAGE = "rhdf5")
  file.remove(path2)
}

## tidy up
rhdf5::h5closeAll()
setwd("../")



[Bioc-devel] Windows, normalizePath(), and non-ASCII characters

2018-05-22 Thread Mike Smith
In trying to diagnose this issue at
https://support.bioconductor.org/p/108548/ I've found some weird behaviour
with Windows, normalizePath(), and non-ASCII characters.  Essentially, if I
run normalizePath() recursively on a path that contains  'é' (I haven't
tried other characters) something 'changes' in the string, but I can't work
out what, and it breaks a subsequent .Call() which uses the path.

The example below tries to demonstrate this in a fairly concise manner. It
works fine if normalizePath() is run once, but fails after it's run a
second time on itself.

However, change "éxample" to "example" and both instances work. Similarly,
both run fine on my Linux machine with the non-ASCII character in place.

I'd be grateful if anyone else with a Windows machine could verify this
behaviour, or shed any light on what might be the difference between path1
and path2 below.

Thanks,
Mike

--

## setup some HDF5 components required later
flags <- rhdf5:::h5checkConstants("H5F_ACC", h5default("H5F_ACC"))
fcpl <- rhdf5:::h5checktypeAndPLC(NULL, "H5P_FILE_CREATE", allowNULL = TRUE)
fapl <- rhdf5::H5Pcreate("H5P_FILE_ACCESS")

## create a folder with non-ASCII character
dir.create('éxample')
setwd("éxample")

## create two normalized paths recursively - these are 'identical'
path1 <- normalizePath('test.h5', mustWork = FALSE)
path2 <- normalizePath(path1, mustWork = FALSE)
identical(path1, path2)

## create an HDF5 file using path1 - this works
fid <- .Call("_H5Fcreate", path1, flags, fcpl@ID, fapl@ID,
 PACKAGE = "rhdf5")
.Call("_H5Fclose", fid, PACKAGE = "rhdf5")
file.remove(path1)

## create an HDF5 file using path2 - this fails
fid2 <- .Call("_H5Fcreate", path2, flags, fcpl@ID, fapl@ID,
 PACKAGE = "rhdf5")
if(exists('fid2')) {
  .Call("_H5Fclose", fid2, PACKAGE = "rhdf5")
  file.remove(path2)
}

## tidy up
rhdf5::h5closeAll()
setwd("../")
