Ajinkya Kale wrote:
> I am using Google Cloud Dataproc, which comes with Spark 1.6.1, so an upgrade
> is not really an option.
> Is there any way / hack to save the models in Spark 1.6.1?
>
> On Tue, Jul 19, 2016 at 8:13 PM Shuai Lin wrote:
>
>> It's added in the not-released-yet 2.0:
>> https://issues.apache.org/jira/browse/SPARK-13036
>> https://github.com/apache/spark/commit/83302c3b
>>
>> So I guess you need to wait for the 2.0 release (or use the current rc4).
>
> On Wed, Jul 20, 2016 at 6:54 AM, Ajinkya Kale wrote:
Is there a way to save a pyspark.ml.feature.PCA model? I know mllib has
that but mllib does not have PCA AFAIK. How do people do model persistence
for inference using the pyspark ml models? I did not find any documentation
on model persistence for ml.
--ajinkya
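For reference, the persistence API that SPARK-13036 added looks roughly like
this in pyspark once you are on Spark 2.0 (a minimal sketch; the DataFrame df,
the save path, and the column names are placeholders):

from pyspark.ml.feature import PCA, PCAModel

pca = PCA(k=3, inputCol="features", outputCol="pca_features")
model = pca.fit(df)  # df: a DataFrame with a vector-typed "features" column

# Persist the fitted model, then reload it later for inference.
model.save("hdfs:///models/pca")
same_model = PCAModel.load("hdfs:///models/pca")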
> From: Jakob Odersky
> Sent: Thursday, March 17, 2016 6:40 PM
> Subject: Re: installing packages with pyspark
> To: Ajinkya Kale
> Cc:
>
>
> Hi,
> regarding 1, packages are resolved locally. That means that when you
> specify a package, spark-submit will resolve and download it on the local machine.
Hi all,
I had a couple of questions.
1. Is there documentation on how to add the graphframes package, or any other
package for that matter, on the Google Dataproc managed Spark clusters?
2. Is there a way to add a package to an existing pyspark context through a
Jupyter notebook?
--aj
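One common workaround for question 2, sketched below: a package cannot be
attached to an already-running context, but in a Jupyter notebook you can set
PYSPARK_SUBMIT_ARGS before the SparkContext is created. The graphframes
coordinates shown are an assumption; pick the version matching your Spark build.

import os

# Must be set before the SparkContext is created; the trailing
# "pyspark-shell" is required when launching outside spark-submit.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--packages graphframes:graphframes:0.1.0-spark1.6 pyspark-shell"
)

from pyspark import SparkContext
sc = SparkContext()  # the package is resolved when the context starts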
Please take a look at the example here
http://spark.apache.org/docs/latest/ml-guide.html#example-pipeline
On Thu, Feb 18, 2016 at 9:27 PM Arunkumar Pillai wrote:
> Hi
>
> I'm trying to build logistic regression using ML Pipeline
>
> val lr = new LogisticRegression()
>
> lr.setFitIntercept(true)
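For context, the pipeline example linked above looks roughly like this in
pyspark (a sketch; a training DataFrame with "text" and "label" columns is
assumed):

from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import HashingTF, Tokenizer

# Feature stages feed into the classifier via named columns.
tokenizer = Tokenizer(inputCol="text", outputCol="words")
hashingTF = HashingTF(inputCol="words", outputCol="features")
lr = LogisticRegression(maxIter=10, regParam=0.01, fitIntercept=True)

pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
model = pipeline.fit(training)  # fits all stages as a single estimator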
Trying to load Avro from HDFS. I have around 1000 part avro files in a directory.
I am using this to read them:
val df =
sqlContext.read.format("com.databricks.spark.avro").load("path/to/avro/dir")
df.select("QUERY").take(50).foreach(println)
It works if I pass only 1 or 2 avro files in the path.
I tried --jars, which supposedly does that, but it did not work.
On Fri, Jan 22, 2016 at 4:33 PM Ajinkya Kale wrote:
Hi Ted,
Is there a way for the executors to have the hbase-protocol jar on their
classpath ?
On Fri, Jan 22, 2016 at 4:00 PM Ted Yu wrote:
> The classpath is formed differently on the driver and on the executors.
>
> Cheers
>
> On Fri, Jan 22, 2016 at 3:25 PM, Ajinkya Kale wrote:
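For what it's worth, one common way to get a jar onto the executor classpath
is the extraClassPath settings (a sketch; the jar location is a placeholder,
and with extraClassPath the file must already exist at that path on every node):

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .set("spark.executor.extraClassPath", "/opt/jars/hbase-protocol.jar")
        .set("spark.driver.extraClassPath", "/opt/jars/hbase-protocol.jar"))
sc = SparkContext(conf=conf)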
Is this issue present only when the computations run in distributed mode?
If I do (pseudo code):
rdd.collect.call_to_hbase, I don't get this error,
but if I do:
rdd.call_to_hbase.collect, it throws this error.
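That pattern matches the driver/executor split: with collect first, the HBase
call runs only on the driver, which has the jar; calling before collect ships
it into executor tasks, which do not. A sketch using the poster's
call_to_hbase placeholder:

# Works: collect() pulls the data to the driver, and call_to_hbase
# runs only there, where the hbase-protocol jar is on the classpath.
for record in rdd.collect():
    call_to_hbase(record)

# Fails: the function now runs inside executor tasks, whose
# classpath lacks the jar.
rdd.map(call_to_hbase).collect()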
On Wed, Jan 20, 2016 at 6:50 PM Ajinkya Kale wrote:
Unfortunately I cannot at this moment (not a decision I can make) :(
On Wed, Jan 20, 2016 at 6:46 PM Ted Yu wrote:
> I am not aware of a workaround.
>
> Can you upgrade to a 0.98.4+ release?
>
> Cheers
>
> On Wed, Jan 20, 2016 at 6:26 PM, Ajinkya Kale wrote:
>
>
> If there is still a problem, please pastebin the stack trace.
>
> Thanks
>
> On Wed, Jan 20, 2016 at 5:41 PM, Ajinkya Kale wrote:
I have posted this on the HBase user list but I thought it makes more sense
on the Spark user list.
I am able to read the table in yarn-client mode from spark-shell, but I have
exhausted all online forums for options to get it working in yarn-cluster
mode through spark-submit.
I am using this code-example