Hi there.
I am calling custom Scala code from PySpark (interpreter). The custom
Scala code is simple: it just reads a text file using sparkContext.textFile
and returns an RDD[String].
In PySpark, I am using sc._jvm to make the call to the Scala code:
s_rdd =
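A minimal sketch of the call plus the rewrap back into a Python RDD, assuming the Scala side exposes a hypothetical object com.example.Reader with a readText(sc, path) method returning RDD[String] (this needs a live Spark session, so it is shown as an untested sketch):

```python
from pyspark.rdd import RDD
from pyspark.serializers import UTF8Deserializer

# Call the (hypothetical) Scala object through the Py4J gateway,
# passing the underlying Scala SparkContext
jrdd = sc._jvm.com.example.Reader.readText(sc._jsc.sc(), "hdfs:///data.txt")

# The JVM hands back an RDD[String]; wrap it as a Python RDD of str
s_rdd = RDD(jrdd.toJavaRDD(), sc, UTF8Deserializer())
s_rdd.take(1)
```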
On Thu, Mar 15, 2018 at 8:00 PM, Alan Featherston Lago wrote:
I'm a pretty new user of Spark and I've run into this issue with the
PySpark docs:
The functions pyspark.sql.functions.to_date and
pyspark.sql.functions.to_timestamp behave in the same way, in that both
functions convert a Column of pyspark.sql.types.StringType or
pyspark.sql.types.TimestampType
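For what it's worth, the difference the doc strings don't spell out is the return type: to_date yields a DateType column with the time of day dropped, while to_timestamp yields a TimestampType column. A rough plain-Python analogy of that distinction:

```python
from datetime import datetime

s = "2018-03-15 20:00:00"

# Like to_timestamp: parse and keep the full timestamp
ts = datetime.strptime(s, "%Y-%m-%d %H:%M:%S")

# Like to_date: truncate to the calendar day
d = ts.date()

print(ts.hour)        # 20
print(d.isoformat())  # 2018-03-15
```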
*Environment:*
Spark 2.2.0
*Kafka:* 0.10.0
*Language:* Java
*UseCase:* Streaming data from Kafka using JavaDStreams and storing into a
downstream database.
*Issue:*
I have a use case wherein I have to launch a thread in the background that
would connect to a DB and cache the retrieved
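The poster's code is Java, but the background-loader pattern itself is small; here is a minimal Python sketch of the idea, with the DB call stubbed out as a hypothetical fetch_rows function:

```python
import threading

cache = {}
cache_ready = threading.Event()

def fetch_rows():
    # Hypothetical stand-in for the real DB query
    return {"k1": "v1", "k2": "v2"}

def refresh_cache():
    # Runs off the main thread so the streaming job is not blocked
    cache.update(fetch_rows())
    cache_ready.set()

threading.Thread(target=refresh_cache, daemon=True).start()
cache_ready.wait(timeout=5)  # block only until the first load finishes
print(cache["k1"])  # v1
```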
Awesome, thanks for detailing!
I was thinking the same; we have to split by comma for the CSV while casting inside.
Cool! I shall try it and revert tomorrow.
Thanks a ton!
On 15-Mar-2018 11:50 PM, "Bowden, Chris" wrote:
> To remain generic, the KafkaSource can only offer
Hey Chris,
You got it right. I'm reading a *csv* file from local, as mentioned above,
with a console producer on the Kafka side.
So, since it is CSV data with headers, shall I then use from_csv on the
Spark side, provide a StructType to shape it up with a schema, and then
cast it to string as TD
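Assuming the Kafka value really is one comma-separated line per message (as it would be from a console producer fed a csv file), the split step on its own looks like this in plain Python — the header and field names here are made up:

```python
import csv
import io

header_line = "id,name,score"  # hypothetical header row from the csv file
value = "1,alice,0.9"          # one Kafka message value, already cast to string

# csv.reader handles quoted fields correctly, unlike a bare value.split(",")
header = next(csv.reader(io.StringIO(header_line)))
fields = next(csv.reader(io.StringIO(value)))
row = dict(zip(header, fields))

print(row["name"])  # alice
```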
Hi all,
I am currently trying to enable dynamic resource allocation for a small
YARN-managed Spark cluster.
We are using sparklyr to access Spark from R and have multiple jobs which
should run in parallel, because some of them take several days to complete or
are in development.
Everything
Chris identified the problem correctly. You need to parse out the json text
from Kafka into separate columns before you can join them up.
I walk through an example of this in my slides -
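Stripped of the Spark machinery, the "parse the JSON text into separate columns before joining" step amounts to this (field names hypothetical); in Structured Streaming the equivalent is from_json with an explicit schema:

```python
import json

# Two raw Kafka values, each a JSON document (hypothetical payloads)
raw = ['{"id": 1, "city": "Oslo"}', '{"id": 2, "city": "Lima"}']

rows = [json.loads(r) for r in raw]

# Pivot the parsed records into named columns, ready for a keyed join
cols = {k: [r[k] for r in rows] for k in rows[0]}

print(cols["city"])  # ['Oslo', 'Lima']
```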
Hi
"In general, configuration values explicitly set on a SparkConf take the
highest precedence, then flags passed to spark-submit, then values in the
defaults file."
https://spark.apache.org/docs/latest/submitting-applications.html
Perhaps this will help, Vinyas:
Look at args.sparkProperties in
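To make that precedence order concrete (the memory values below are made up):

```shell
# spark-defaults.conf (lowest precedence):
#     spark.executor.memory    2g

# spark-submit flag (overrides the defaults file):
spark-submit --conf spark.executor.memory=4g app.py

# SparkConf set in code (highest precedence, wins over both):
#     SparkConf().set("spark.executor.memory", "8g")
```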
Hi,
And if I run this piece of code below -

from pyspark.sql import SparkSession
import time

class test:
    spark = SparkSession.builder \
        .appName("DirectKafka_Spark_Stream_Stream_Join") \
        .getOrCreate()
    # ssc = StreamingContext(spark, 20)
    table1_stream =
Any help on the above?
On Thu, Mar 15, 2018 at 3:53 PM, Aakash Basu wrote:
> Hi,
>
> I progressed a bit in the above mentioned topic -
>
> 1) I am feeding a CSV file into the Kafka topic.
> 2) Feeding the Kafka topic as readStream as TD's article suggests.
> 3) Then,
Hi David,
I ended up building my own. Livy sounded great on paper, but was heavy to
work with. I found out about Jobserver too late. We did not find it too
complicated to build ours, with a small Spring Boot app that held the
session (we did not need more than one session).
jg
> On Mar 15,
Hi David,
Which type of incompatibility problems do you have with Apache Livy?
BR,
Liana
From: David Espinosa
Sent: 15 March 2018 12:06:20
To: user@spark.apache.org
Subject: What's the best way to have Spark a service?
Hi all,
I'm quite
Hi all,
I'm quite new to Spark, and I would like to ask what's the best way to have
Spark as a service, by which I mean being able to include the response
of a Scala app/job running on Spark in a common RESTful request.
Up to now I have read about Apache Livy (which I tried and found
Hi,
I progressed a bit in the above mentioned topic -
1) I am feeding a CSV file into the Kafka topic.
2) Feeding the Kafka topic as readStream as TD's article suggests.
3) Then, simply trying to do a show on the streaming dataframe, using
queryName('XYZ') in the writeStream and writing a sql