Re: [Bioc-devel] Tensorflow support for bioconductor packages

2018-04-04 Thread Kieran Campbell
Hi all,

Thanks for your replies. I had a look at the greta package
(https://github.com/greta-dev/greta) on CRAN that uses tensorflow,
which seemingly exports the install_tensorflow function but still
requires the user to call it, so it looks like that's the best way to
go for now. I'll submit our first package to bioc issues this week and
highlight this exchange, and we can take it from there.

Thanks again for all your help,

Kieran


On 30 March 2018 at 11:55, Hervé Pagès  wrote:
> FWIW the tensorflow authors didn't opt for automatic lazy installation:
>
>   > run_example("hello.R")
>   Error: Installation of TensorFlow not found.
>
>   Python environments searched for 'tensorflow' package:
>     /usr/bin/python2.7
>     /usr/bin/python3.5
>
>   You can install TensorFlow using the install_tensorflow() function.
>
> Would be interesting to know why.
>
> install_tensorflow() has various arguments and the chances that it
> will just work and do the right thing when called with no argument
> are low. There is also this 'restart_session' argument that is TRUE
> by default and will only work within RStudio. This suggests that after
> successful completion R needs to be restarted before the tensorflow
> package becomes operational. I didn't test that but that's something
> you might want to investigate before opting for lazy installation.
>
> Also it might help to look at how the handful of CRAN packages that
> depend on tensorflow handle this. These packages are listed in the
> reverse dependencies section of the tensorflow landing page:
>
>   https://cran.r-project.org/web/packages/tensorflow/index.html
>
> We'll install the tensorflow Python module on the build machines when
> you submit your package.
>
> Cheers,
> H.

Re: [Bioc-devel] Tensorflow support for bioconductor packages

2018-03-29 Thread Michael Lawrence
The problem with requiring explicit TensorFlow installation is that
it is tantamount to a system dependency in many ways, and those are
annoying. Hervé points out the problems with installing at load time.
My suggestion was to install TensorFlow the first time someone tries
to e.g. load an R matrix into a tensor. That way, you know that
examples and vignettes will always just work (if the installation
works) on any build machine, without any admin intervention. And the
last thing a user wants when running an example is an error, even if
that error is easily remedied. One downside is that the user could
have just forgotten to point the package to a system installation of
TensorFlow, in which case they will be cursing themselves for
forgetting while watching the installation process. You could check
for interactive() and then prompt the user to avoid that case.
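
A minimal sketch of that lazy approach, under stated assumptions: the
helper name ensure_tensorflow() and its arguments are hypothetical, and
the checker, installer and prompt are passed in as functions only so the
control flow can be tried without touching the system. The defaults use
reticulate::py_module_available() and tensorflow::install_tensorflow().

```r
# Hypothetical helper: install TensorFlow on first use, not at load time.
# All arguments are illustrative; the defaults call the real reticulate
# and tensorflow functions, but only when they are actually needed.
ensure_tensorflow <- function(
    is_available = function() reticulate::py_module_available("tensorflow"),
    install = function() tensorflow::install_tensorflow(),
    ask = function() {
      utils::menu(c("Yes", "No"),
                  title = "TensorFlow was not found. Install it now?") == 1L
    },
    is_interactive = interactive()) {
  # Nothing to do if the Python module is already importable.
  if (is_available()) return(invisible(TRUE))

  # In an interactive session, ask first: the user may simply have
  # forgotten to point R at an existing installation.
  if (is_interactive && !ask()) {
    stop("TensorFlow is required; run tensorflow::install_tensorflow() ",
         "or point reticulate at an existing installation.",
         call. = FALSE)
  }

  # Non-interactive (e.g. a build machine) or the user agreed: install.
  install()
  invisible(TRUE)
}
```

A function that needs TensorFlow would call ensure_tensorflow() as its
first line. This sketch does not address whether R has to be restarted
after the installation finishes.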


Re: [Bioc-devel] Tensorflow support for bioconductor packages

2018-03-29 Thread Kieran Campbell
Hi Hervé, Michael,

Thanks for your feedback. I will add in the reticulate check to ensure
tensorflow is installed prior to running and appropriate sections in
the vignettes. We have one package essentially ready for submission to
bioc, so is the best route forward to submit now or wait until
tensorflow is installed on the build servers?

Many thanks

Kieran



___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Tensorflow support for bioconductor packages

2018-03-28 Thread Hervé Pagès

On 03/28/2018 02:41 PM, Hervé Pagès wrote:
>
> Hi Kieran,
>
> Note that you can execute arbitrary code at load time by defining
> an .onLoad() hook in your package. So you *could* put something
> like this in your package:
>
>   .onUnload <- function(libpath)
>   {
>     if (!reticulate::py_module_available("tensorflow"))
>       tensorflow::install_tensorflow()
>   }

should be .onLoad() in the above code

more below...

> However, having things being automatically downloaded/installed
> on the user machine at package load-time is not a good idea. There
> are just too many things that can go wrong.
>
> For example, I just tried to run tensorflow::install_tensorflow()
> on my laptop (Ubuntu 16.04) and was successful only after the 3rd
> attempt (I had to make some changes/adjustments to my system between
> each attempt). And Debian Linux is probably the easiest target!
>
> Also note that install.packages() tries to load the package at the
> end of the installation when installing from source so if the
> .onUnload() hook fails, install.packages() considers that
  ^^^^^^^^^^^
  .onLoad()

same here, sorry

H.

> the installation of the package failed and it removes it.


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



Re: [Bioc-devel] Tensorflow support for bioconductor packages

2018-03-28 Thread Hervé Pagès

Hi Kieran,

Note that you can execute arbitrary code at load time by defining
an .onLoad() hook in your package. So you *could* put something
like this in your package:

  .onLoad <- function(libname, pkgname)
  {
    if (!reticulate::py_module_available("tensorflow"))
      tensorflow::install_tensorflow()
  }

However, having things automatically downloaded and installed
on the user's machine at package load time is not a good idea. There
are just too many things that can go wrong.

For example, I just tried to run tensorflow::install_tensorflow()
on my laptop (Ubuntu 16.04) and was successful only after the 3rd
attempt (I had to make some changes/adjustments to my system between
each attempt). And Debian Linux is probably the easiest target!

Also note that install.packages() tries to load the package at the
end of the installation when installing from source, so if the
.onLoad() hook fails, install.packages() considers that
the installation of the package failed and removes it.
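
Given those concerns, one hedged alternative is a hook that only reports
a missing module and never installs anything. The sketch below is an
assumption, not something tested against the build system: it uses
.onAttach() with packageStartupMessage(), the conventional place for
startup messages, and tf_available() is a hypothetical helper that
swallows any error raised while probing so that the hook cannot make
loading fail.

```r
# Hypothetical helper: TRUE only if the TensorFlow Python module can be
# found. Any error raised while probing is treated as "not available".
tf_available <- function() {
  tryCatch(
    requireNamespace("reticulate", quietly = TRUE) &&
      reticulate::py_module_available("tensorflow"),
    error = function(e) FALSE
  )
}

# Attach-time hook: inform the user, but never download or install.
.onAttach <- function(libname, pkgname) {
  if (!tf_available()) {
    packageStartupMessage(
      "TensorFlow Python module not found. ",
      "Run tensorflow::install_tensorflow() before using ", pkgname, "."
    )
  }
}
```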

Finally, note that this installation needs to download hundreds of
MB of Python stuff.

So these are probably the reasons why the authors of the tensorflow
CRAN package chose to separate installation of the tensorflow Python
module from the installation of the package itself. There are plenty
of good reasons for doing that.

What I would suggest instead is that you start your vignette with a
note reminding users to run tensorflow::install_tensorflow() if
they haven't already done so. As a side note: the man page for
tensorflow::install_tensorflow() doesn't say how to check
programmatically whether the tensorflow Python module is already
installed; I had to dig in the source code of the unit tests to find
reticulate::py_module_available("tensorflow").


In addition, you could also start each of your functions that rely on
the tensorflow Python module with a check to see whether the module is
available, and fail gracefully (with an informative error message) if
it's not.
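
A minimal sketch of such a check, assuming nothing beyond
reticulate::py_module_available(); the names check_tensorflow() and
fit_model() are hypothetical, and the 'available' argument exists only
so the failure path can be exercised without a Python installation.

```r
# Hypothetical helper: stop with an informative message when the
# TensorFlow Python module cannot be found. The default for `available`
# is evaluated lazily, so reticulate is only consulted when needed.
check_tensorflow <- function(
    available = requireNamespace("reticulate", quietly = TRUE) &&
      reticulate::py_module_available("tensorflow")) {
  if (!isTRUE(available)) {
    stop(
      "The TensorFlow Python module was not found.\n",
      "Install it with tensorflow::install_tensorflow() ",
      "(you may need to restart R afterwards), then try again.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Hypothetical package function that relies on TensorFlow.
fit_model <- function(x) {
  check_tensorflow()  # fail early and gracefully
  # ... TensorFlow-backed computation would go here ...
  invisible(x)
}
```

Calling the check first means the user sees one clear instruction
instead of an error from deep inside the Python bridge.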

We'll figure out a way to install the tensorflow Python module on our
build machines.

Hope this helps,
H.




Re: [Bioc-devel] Tensorflow support for bioconductor packages

2018-03-28 Thread Michael Lawrence
Presumably the installation of TensorFlow only has to happen once, so you
could factor your interface such that it installs TensorFlow lazily.

Michael



[Bioc-devel] Tensorflow support for bioconductor packages

2018-03-28 Thread Kieran Campbell
Hi all,

RStudio have released the tensorflow package for R -
https://tensorflow.rstudio.com/tensorflow/ - and we have started
incorporating it into some of our genomics packages for the heavy
numerical computation.

We would ideally like these to be submitted to Bioconductor, but
installing TensorFlow requires an extra step: after calling

install.packages("tensorflow")

TensorFlow itself must then be installed via

tensorflow::install_tensorflow()

This would break package testing if tensorflow was simply imported
into the R package and TensorFlow wasn't already installed. Is there
any way to customise a package installation within Bioconductor to
trigger tensorflow::install_tensorflow()?

As more people use tensorflow / deep learning in genomics I can see
this being a problem so it would be good to have a solution in place.

Many thanks,

Kieran Campbell
