Hello,

I am working on a software package which is primarily built on Python but also 
uses an R package I have designed with some necessary functionality.  The R 
portion of the software uses two packages which perform specialized tasks that 
are not available in Python packages.  (Those packages are MSIseq, which 
designates tumors as microsatellite stable or unstable based on mutation data, 
and deconstructSigs, which determines the major mutation signatures present in 
tumor samples.)  I hope to package this software together in such a way that it 
is easily installed on other systems, as I know I appreciate such conveniences 
when working with new software.  I understand that creating a package for the 
pip installer is standard practice for Python-based software, but as far as I 
understand it, pip can only declare dependencies on other Python modules, not 
on the R package needed to run the software or on the r-base package itself.

I have explored a couple of alternative options so far:
First, I have tried using rpy2 to run the necessary R scripts from within 
Python scripts but have encountered issues with installing the previously 
mentioned packages.  (I believe the issue had to do with one of those packages 
being dependent on rJava, which was not installing correctly through rpy2).  
Currently, I am instead using the subprocess module in Python to call the R 
scripts, passing arguments on the command line.  This has worked just fine so 
far.
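
A minimal sketch of this subprocess approach, assuming Rscript (shipped with 
r-base) is on the PATH.  The helper names and the script and argument names 
are hypothetical placeholders, not the project's actual code.

```python
import shutil
import subprocess


def build_rscript_command(script_path, args, rscript="Rscript"):
    """Return the argument list for running an R script with positional arguments."""
    # --vanilla keeps the run independent of the user's saved workspace and profile.
    return [rscript, "--vanilla", str(script_path)] + [str(a) for a in args]


def run_r_script(script_path, args, rscript="Rscript"):
    """Run the R script and return its standard output as text."""
    # Fail early with a clear message if R itself is not installed.
    if shutil.which(rscript) is None:
        raise RuntimeError(f"{rscript} was not found on PATH; is r-base installed?")
    completed = subprocess.run(
        build_rscript_command(script_path, args, rscript),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output to str
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

Passing the command as a list avoids shell quoting problems with file names, 
and the PATH check turns a missing R installation into a readable error 
rather than a FileNotFoundError from subprocess.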

The second option I have explored is to package all the software, both the 
Python package and the R package, into a single Debian package so that they are 
distributed as a unit.  So far, I have been successful in using stdeb to build 
and deploy the Python portion of my project.  (I also had success using 
dh-python, although I eventually found that stdeb streamlined the build process 
significantly.)  However, I am having issues packaging the R portion of the 
project in a way that allows it to be installed directly from my PPA.  (This 
may venture outside of the scope of this mailing list, but my understanding is 
that my difficulties stem from the fact that MSIseq and deconstructSigs are not 
R packages which are directly available in the default Debian apt 
repositories.)

My goal for this project is to be able to package everything in such a way 
that a single command is all that is needed to install the package and get 
things running (e.g. "sudo apt install [my_package]").  Is this reasonable, 
considering the problems that I am encountering and the admittedly dubious 
decision I've made to hybridize Python and R code into a single project?  Are 
there any solutions I have missed or documentation that you believe would be 
helpful?   Furthermore, this is the first time I have attempted a project of 
this scale, so I am open to criticism about the goals I have for distribution 
as well as the general architecture of the project itself.  I have been 
developing this software in a vacuum for far too long and would really 
appreciate any assistance anyone is willing to give.

Thank you,
Ben Morledge-Hampton
