Re: Issue with loading dependencies and jars

2018-03-15 Thread Donald Szeto
Hi Shane, Although not highly recommended, would you mind trying to set MYSQL_JDBC_DRIVER in your conf/pio-env.sh to point to the aws-java-sdk.jar and try running again? In code, third party JARs for MySQL and PostgreSQL will always be appended at the very front for the "spark-submit --jars" argum
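
As a rough illustration of the suggestion above, the override in conf/pio-env.sh might look like the fragment below. The jar path is a placeholder assumption, not a location taken from this thread, and repurposing the MySQL driver variable for a non-JDBC jar is a workaround rather than a supported setting.

```shell
# conf/pio-env.sh (fragment, sketch only)
# Assumption: /path/to/aws-java-sdk.jar is a hypothetical location; replace it
# with wherever the jar actually lives in your PredictionIO distribution.
# The variable normally names the MySQL JDBC driver; here it is reused so that
# the AWS SDK jar lands at the front of the jars passed to spark-submit.
MYSQL_JDBC_DRIVER="/path/to/aws-java-sdk.jar"
```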

Re: Issue with loading dependencies and jars

2018-03-11 Thread Mars Hall
On Sat, Mar 10, 2018 at 7:49 PM, Shane Johnson wrote: > Mars, I was reviewing the code that you are referencing, the "jars for > Spark" function, this morning and trying to see how it ties in. This code > that is outside the custom binary distribution correct, I could not find it > in the distrib

Re: Issue with loading dependencies and jars

2018-03-10 Thread Shane Johnson
Thanks for looking into this with me, Mars and Donald. It's nice to have experts. We are not completely stuck, as our models train successfully ~50% of the time, but we are looking forward to finding a way to order the jars and stabilize our processes. Mars, I was reviewing the code that you are refe

Re: Issue with loading dependencies and jars

2018-03-09 Thread Mars Hall
Correction, it’s this “jars for Spark” function: https://github.com/apache/predictionio/blob/develop/tools/src/main/scala/org/apache/predictionio/tools/Common.scala#L105 On Fri, Mar 9, 2018 at 17:54 Mars Hall wrote: > It looks like this Scala function is the source of that jars list: > > https:/

Re: Issue with loading dependencies and jars

2018-03-09 Thread Mars Hall
It looks like this Scala function is the source of that jars list: https://github.com/apache/predictionio/blob/develop/tools/src/main/scala/org/apache/predictionio/tools/Common.scala#L81 On Fri, Mar 9, 2018 at 17:42 Mars Hall wrote: > Where does the classpath in spark-submit originate? Is > comp

Re: Issue with loading dependencies and jars

2018-03-09 Thread Mars Hall
Where does the classpath in spark-submit originate? Is compute-classpath.sh not the source? As noted previously, the stable-ordering fix by me in compute-classpath.sh no longer seems to be effective either. Looks like some tracing of classpath assembly through the Spark command runner is required
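
One possible way to start tracing this is to check where a given jar ends up in whatever list gets logged. The helper below is only a sketch: the function name `jar_position` and the example paths are assumptions for illustration, and nothing here comes from PredictionIO or Spark. It reports the 1-based position of the first entry containing a substring, using `:` as the separator by default (pass `,` for a `--jars` style list).

```shell
# Sketch only: jar_position is a hypothetical helper, not part of PredictionIO.
# Prints the 1-based index of the first list entry containing the given text,
# or nothing if no entry matches.
jar_position() (
  needle=$1          # substring to look for, e.g. aws-java-sdk
  list=$2            # separator-delimited list of jar paths
  sep=${3:-:}        # separator; ':' for CLASSPATH, ',' for --jars

  printf '%s\n' "$list" \
    | tr "$sep" '\n' \
    | grep -n -F -- "$needle" \
    | head -n 1 \
    | cut -d: -f1
)
```

Running `jar_position aws-java-sdk "$CLASSPATH"` before and after the edit to compute-classpath.sh, and again on the jar list from the logged spark-submit command, would show at which stage the ordering changes.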

Re: Issue with loading dependencies and jars

2018-03-09 Thread Shane Johnson
One additional item that you mentioned earlier is that we would need to remove or skip the aws-java-sdk.jar that is already in the CLASSPATH. Do you think this has impact? I did not write anything to skip or remove the existing aws-java-sdk.jar. aws-java-sdk.jar is already in the CLASSPATH though,
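
A minimal sketch of the "skip the existing entry" idea, assuming the duplicate appears in CLASSPATH under exactly the same path as the jar being moved to the front. The function name `prepend_jar` and the paths in the usage note are hypothetical and do not come from compute-classpath.sh.

```shell
# Sketch only: prepend_jar is a hypothetical helper, not PredictionIO code.
# Puts one jar first and drops any later entry with the identical path.
prepend_jar() (
  preferred=$1       # full path of the jar that must be loaded first
  classpath=$2       # existing colon-separated classpath
  result=$preferred

  IFS=:              # split the classpath on colons
  set -f             # keep wildcard entries such as lib/* literal
  for entry in $classpath; do
    [ -n "$entry" ] || continue                  # ignore empty segments
    [ "$entry" = "$preferred" ] && continue      # skip the duplicate
    result="$result:$entry"
  done

  printf '%s\n' "$result"
)
```

For example, `CLASSPATH=$(prepend_jar /app/lib/aws-java-sdk.jar "$CLASSPATH")` would leave a single aws-java-sdk.jar entry at the head of the list. If the existing copy sits under a different path, an exact match will not catch it and the comparison would need to be loosened.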

Re: Issue with loading dependencies and jars

2018-03-09 Thread Shane Johnson
Now that I am able to deploy I reset the buildpack to ...#debug-custom-dist and redeployed. Here is the build log...URL does point to the correct distribution with the edited compute-classpath.sh file. -> JVM Common app detected -> Installing JDK 1.8... done -> PredictionIO app detec

Re: Issue with loading dependencies and jars

2018-03-09 Thread Shane Johnson
Thanks Mars, trying to get back to a stable state and then I will respond with what I find re: Classpath. Here is some context in the meantime. I was able to deploy to Heroku again. Not sure how I introduced that error, perhaps what Donald said. > Does this error occur locally when you `pio build

Re: Issue with loading dependencies and jars

2018-03-09 Thread Donald Szeto
This error looks like pio build was not run at the engine template directory, which would have the proper configuration to enable that command. On Fri, Mar 9, 2018 at 11:28 AM Mars Hall wrote: > At this point, I'm just searching the internet for "Not a valid command: > assemblyPackageDependency"

Re: Issue with loading dependencies and jars

2018-03-09 Thread Mars Hall
At this point, I'm just searching the internet for "Not a valid command: assemblyPackageDependency" errors, which I imagine you are too. Does this error occur locally when you `pio build` this engine? Are there any diffs in your local code? How are you running locally? Are you using the buildpack'

Re: Issue with loading dependencies and jars

2018-03-09 Thread Shane Johnson
Mars, to test what may be happening I just reverted to the original buildpack (https://github.com/heroku/predictionio-buildpack.git) and removed the variable PREDICTIONIO_DIST_URL and I am still getting the same error. I don't know where I would have introduced this. Have you seen it before, perhap

Re: Issue with loading dependencies and jars

2018-03-09 Thread Shane Johnson
The URL appears to be the correct URL of the custom PredictionIO dist. However, another error seems to occur when I deploy. -> JVM Common app detected -> Installing JDK 1.8... done -> PredictionIO app detected -> Install core components

Re: Issue with loading dependencies and jars

2018-03-09 Thread Mars Hall
I'm lost as to how such direct manipulation of CLASSPATH is not appearing in the logged spark-submit command. What could cause this!? I just pushed a version of the buildpack which should help debug. Assuming only a single buildpack is assigned to the app, here's how to set it: heroku buildpac

Re: Issue with loading dependencies and jars

2018-03-09 Thread Shane Johnson
Thanks Donald and Mars, I created a new distribution ( https://s3-us-west-1.amazonaws.com/predictionio/0.12.0-incubating/apache-predictionio-0.12.0-incubating-bin.tar.gz) with the a

Re: Issue with loading dependencies and jars

2018-03-07 Thread Mars Hall
Shane, On Wed, Mar 7, 2018 at 4:49 AM, Shane Johnson wrote: > > Re: adding a line to ensure a jar is loaded first. Is this what you are > referring to...(line at the bottom in red)? > I believe the code would need to look like this to effect the output classpath as intended: > CLASSPATH="/ap

Re: Issue with loading dependencies and jars

2018-03-06 Thread Shane Johnson
Thanks Donald, Re: adding a line to ensure a jar is loaded first. Is this what you are referring to...(line at the bottom in red)? # Add hadoop conf dir if given -- otherwise FileSystem.*, etc fail ! Note, this # assumes that there is either a HADOOP_CONF_DIR or YARN_CONF_DIR which hosts # the c

Re: Issue with loading dependencies and jars

2018-03-06 Thread Donald Szeto
Even easier: skip cloning, and just edit the shell script directly in the binary distribution. Hope that works. Regards, Donald On Tue, Mar 6, 2018 at 5:41 PM Shane Johnson wrote: > Thanks Mars and Donald. I think this gets me to next steps: > >- Clone PredictionIO 0.12 and adjust the bin/c

Re: Issue with loading dependencies and jars

2018-03-06 Thread Shane Johnson
Thanks Mars and Donald. I think this gets me to next steps: - Clone PredictionIO 0.12 and adjust the bin/compute-classpath.sh to have aws-java-sdk-1.7.4 loaded first. - Create custom binary distribution of PredictionIO 0.12. - Add config var to point to custom binary distribution. This

Re: Issue with loading dependencies and jars

2018-03-06 Thread Mars Hall
On Tue, Mar 6, 2018 at 11:39 PM, Shane Johnson wrote: > > Do you know the version of hadoop-aws.jar and aws-java-sdk.jar that you > are using? > > I do not know what version is being used. Is this something that I can > specify or control? I am using the PredictionIO buildpack > https://github.co

Re: Issue with loading dependencies and jars

2018-03-06 Thread Shane Johnson
Thanks for the response Donald. Do you know the version of hadoop-aws.jar and aws-java-sdk.jar that you are using? I do not know what version is being used. Is this something that I can specify or control? I am using the PredictionIO buildpack https://github.com/heroku/predictionio-buildpack. I a

Re: Issue with loading dependencies and jars

2018-03-06 Thread Donald Szeto
Hi Shane, Do you know the version of hadoop-aws.jar and aws-java-sdk.jar that you are using? You are also right that you can modify the class path in bin/compute-classpath.sh as a short term fix. The current order is following the output of your target system's `ls`, so the order is not guarantee
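
To illustrate the ordering point, here is a hedged sketch of assembling a classpath in a stable order with one jar pinned to the front. It is not the actual contents of bin/compute-classpath.sh. The function name `build_classpath` and its arguments are assumptions made for this example. It relies on the shell's own glob expansion under `LC_ALL=C` rather than on `ls`, so the result should not vary with the target system's locale.

```shell
# Sketch only: build_classpath is a hypothetical helper, not PredictionIO code.
# Emits "<preferred>:<remaining jars in byte-wise sorted order>".
build_classpath() (
  jar_dir=$1         # directory that holds the *.jar files
  preferred=$2       # file name of the jar to load first (may be empty)
  LC_ALL=C           # force byte-wise ordering for glob expansion
  export LC_ALL
  result=""

  # Pin the preferred jar to the front when it is present.
  if [ -n "$preferred" ] && [ -f "$jar_dir/$preferred" ]; then
    result="$jar_dir/$preferred"
  fi

  # Append every other jar; the glob expands in sorted order.
  for jar in "$jar_dir"/*.jar; do
    [ -f "$jar" ] || continue                          # no jars matched
    [ "$jar" = "$jar_dir/$preferred" ] && continue     # already first
    result="${result:+$result:}$jar"
  done

  printf '%s\n' "$result"
)
```

A call such as `CLASSPATH=$(build_classpath "$PIO_HOME/lib" aws-java-sdk-1.7.4.jar)` would then give the same order on every run; the directory and file name in that call are examples only.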