Would deploying a virtualenv in each working directory on the cluster be viable? The dependencies would get tricky, but I think this is the sort of situation it's built for.
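One way the virtualenv route could look, sketched in Python rather than shell for concreteness. The path and the spark-submit invocation are hypothetical -- on a real cluster the environment would need to live somewhere every worker node can see, and PYSPARK_PYTHON is how you point the workers at that interpreter:

```python
import os
import subprocess
import venv

# Hypothetical path; on a real cluster this would need to be a location
# visible to every worker node (e.g. shared storage), not /tmp.
env_dir = "/tmp/pyspark_env"

# Create the isolated environment with pip available inside it.
venv.create(env_dir, with_pip=True)

# Install numpy into it (needs network access; no check=True, so the
# sketch degrades gracefully when offline).
subprocess.run([os.path.join(env_dir, "bin", "pip"), "install", "numpy"])

# Point the PySpark workers at this interpreter before submitting.
os.environ["PYSPARK_PYTHON"] = os.path.join(env_dir, "bin", "python")
# subprocess.run(["spark-submit", "my_job.py"])
```

No root privileges are needed for any of this, which is why it may fit Avishek's constraint.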

On 6/27/14, 11:06 AM, Avishek Saha wrote:
I felt the same, Nick, but unfortunately I don't have root privileges on the cluster. Are there any alternatives?


On 27 June 2014 08:04, Nick Pentreath <nick.pentre...@gmail.com <mailto:nick.pentre...@gmail.com>> wrote:

    I've not tried this -- but numpy is a tricky and complex package
    with many dependencies on Fortran/C libraries etc. I'd say by the
    time you figure out how to deploy numpy correctly in this manner,
    you may as well have built it into your cluster bootstrap process,
    or used PSSH to install it on each node...
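The compiled pieces Nick mentions are the crux. A quick way to see that numpy is not a pure-Python package -- a sketch, assuming numpy is importable on the machine you run it on:

```python
import pathlib
import numpy

# numpy bundles compiled extension modules (.so on Linux, .pyd on
# Windows). A zip shipped via --py-files is added to sys.path, but the
# dynamic loader cannot import shared libraries straight out of a zip,
# which is why this approach tends to break at import time on workers.
pkg_dir = pathlib.Path(numpy.__file__).parent
compiled = [p.name for p in pkg_dir.rglob("*") if p.suffix in (".so", ".pyd")]
print("compiled extension modules:", len(compiled))
```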


    On Fri, Jun 27, 2014 at 4:58 PM, Avishek Saha
    <avishek.s...@gmail.com <mailto:avishek.s...@gmail.com>> wrote:

        To clarify, I tried it and it almost worked -- but I am getting
        some errors from the random module in numpy. If anyone has
        successfully passed the numpy package (via the --py-files option)
        to spark-submit, then please let me know.

        Thanks !!
        Avishek


        On 26 June 2014 17:45, Avishek Saha <avishek.s...@gmail.com
        <mailto:avishek.s...@gmail.com>> wrote:

            Hi all,

            Instead of installing numpy on each worker node, is it
            possible to ship numpy (via the --py-files option, maybe)
            when invoking spark-submit?

            Thanks,
            Avishek
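For context, --py-files does work for pure-Python dependencies, since Python can import plain .py modules from a zip on sys.path. A minimal sketch, using a hypothetical package name "mylib":

```python
import os
import shutil

# Hypothetical pure-Python package "mylib" -- the kind of dependency
# --py-files handles well (no compiled extension modules).
os.makedirs("mylib", exist_ok=True)
with open(os.path.join("mylib", "__init__.py"), "w") as f:
    f.write("VERSION = '0.1'\n")

# Build mylib.zip, ready to hand to:
#   spark-submit --py-files mylib.zip my_job.py
shutil.make_archive("mylib", "zip", root_dir=".", base_dir="mylib")
```

numpy is the exception precisely because it is not pure Python, which is what the rest of the thread gets into.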




