Hi,
10.x.x.x is a private network range (see https://en.wikipedia.org/wiki/IP_address).
You should use the public IP of your AWS instance instead.
On Sat, Apr 29, 2017 at 6:35 AM, Yuan Fang
wrote:
>
> object SparkPi {
> private val logger = Logger(this.getClass)
>
> val sparkConf = new SparkConf()
> .setAppName("
Hi Tim,
The Spark ML API doesn't currently support setting an initial model for GMM. I hope we
can get this feature into Spark 2.3.
Thanks
Yanbo
On Fri, Apr 28, 2017 at 1:46 AM, Tim Smith wrote:
> Hi,
>
> I am trying to figure out the API to initialize a Gaussian mixture model
> using either centroids crea
Jacek,
Thanks for your help. I didn’t want to write a bug/enhancement unless
warranted.
~ Shawn
From: Jacek Laskowski [mailto:ja...@japila.pl]
Sent: Thursday, April 27, 2017 8:39 AM
To: Lavelle, Shawn
Cc: user
Subject: Re: Spark-SQL Query Optimization: overlapping ranges
Hi Shawn,
If yo
Hi,
the following code reads a table from my PostgreSQL database, following
guidance I've found online:
val txs = spark.read.format("jdbc").options(Map(
("driver" -> "org.postgresql.Driver"),
("url" -> "jdbc:postgresql://host/dbname"),
("dbtable" -> "(se
Oh, and if you want a default other than null ("n/a" below is just an example default):
import org.apache.spark.sql.functions._
df.withColumn("address", coalesce($"address", lit("n/a")))
On Mon, May 1, 2017 at 10:29 AM, Michael Armbrust
wrote:
> The following should work:
>
> val schema = implicitly[org.apache.spark.sql.Encoder[Course]].sch
The following should work:
val schema = implicitly[org.apache.spark.sql.Encoder[Course]].schema
spark.read.schema(schema).parquet("data.parquet").as[Course]
Note this will only work for nullable fields (i.e. if you add a primitive
like Int you need to make it an Option[Int])
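A minimal, Spark-free sketch of that rule (the `Course` case class here is hypothetical): a primitive `Int` field can never hold null, so a column that may contain nulls must be declared as `Option[Int]` for the derived Encoder to mark it nullable:

```scala
// Hypothetical schema: `credits` may be null in the data, so it is
// declared Option[Int]; a plain Int field would fail on null values.
case class Course(id: Int, title: String, credits: Option[Int])

// Option encodes presence/absence: a null credits value becomes None.
val withCredits = Course(1, "Scala", Some(3))
val noCredits   = Course(2, "Spark", None)

println(withCredits.credits.getOrElse(0)) // 3
println(noCredits.credits.getOrElse(0))   // 0
```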
On Sun, Apr 30, 2017
Two more ways:
*Using the Typed Dataset API with Rows*
Caveat: The docs about flatMapGroups do warn "This function does not
support partial aggregation, and as a result requires shuffling all the
data in the Dataset. If an application intends to perform an aggregation
over each key, it is best to
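As a plain-Scala analogy (not the Spark API itself): like `flatMapGroups`, `groupBy` gathers every row of a key together before the per-group function runs, so nothing is pre-aggregated before the grouping step:

```scala
// All rows for a key are materialized into one collection first
// (the "shuffle"), and only then does the per-group function run --
// there is no partial aggregation along the way.
val rows = Seq(("a", 1), ("a", 2), ("b", 3))
val perKey: Map[String, Int] =
  rows.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

println(perKey("a")) // 3
println(perKey("b")) // 3
```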
Use cache or persist. The DataFrame will be materialized when the first
action is called and then reused from memory for each subsequent usage.
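A plain-Scala analogy of the idea (no Spark cluster needed): like a cached DataFrame, a `lazy val` is computed once on the first use and then reused, instead of being recomputed per access:

```scala
// The block stands in for the expensive JDBC read; `reads` counts
// how many times it actually runs.
var reads = 0
lazy val table: Seq[Int] = { reads += 1; 1 to 5 }

val firstAction  = table.sum // triggers the "read" once
val secondAction = table.max // reuses the already-computed result

println(reads) // 1
```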
On 1 May 2017 at 4:51 PM, "Saulo Ricci" wrote:
> Hi,
>
>
> I have the following code that is reading a table into an Apache Spark
> DataFrame:
>
> val df =
Hi,
I have the following code that is reading a table into an Apache Spark
DataFrame:
val df = spark.read.format("jdbc")
.option("url","jdbc:postgresql://host/database")
.option("dbtable","tablename").option("user","username")
.option("password", "password")
.load()
When I first
On 28 Apr 2017, at 16:10, Anubhav Agarwal wrote:
Are you using Spark's textFiles method? If so, go through this blog:
http://tech.kinja.com/how-not-to-pull-from-s3-using-apache-spark-1704509219
That's an old/dated blog post.
If you get the Hadoop 2.8 binaries on your clas