Spark Image resizing

2019-07-30 Thread Nick Dawes
Hi

I'm new to Spark's image data source.

After creating a dataframe using Spark's image data source, I would like to
resize the images in PySpark.

df = spark.read.format("image").load(imageDir)

Can you please help me with this?

Nick
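
One way to approach the resizing asked about above is a UDF that rewrites the
raw pixel buffer of each image row. The sketch below is only an illustration
under stated assumptions, not a built-in Spark feature: it assumes the image
column has the fields origin, height, width, nChannels, mode and data, that
each channel is stored as one byte in row-major order, and that plain
nearest-neighbour sampling is good enough. The names resize_nearest and
resize_image_column are hypothetical helpers introduced here.

```python
def resize_nearest(data, height, width, n_channels, new_height, new_width):
    """Nearest-neighbour resize of a row-major pixel buffer.

    Assumes one byte per channel (8-bit images).
    """
    out = bytearray(new_height * new_width * n_channels)
    row_stride = width * n_channels
    pos = 0
    for y in range(new_height):
        src_y = y * height // new_height      # source row for this output row
        for x in range(new_width):
            src_x = x * width // new_width    # source column for this output column
            start = src_y * row_stride + src_x * n_channels
            out[pos:pos + n_channels] = data[start:start + n_channels]
            pos += n_channels
    return bytes(out)


def resize_image_column(df, new_height, new_width, column="image"):
    """Return df with every image in `column` resized (hypothetical helper)."""
    # Imported here so resize_nearest can be tried without a Spark installation.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import (BinaryType, IntegerType, StringType,
                                   StructField, StructType)

    # Assumed layout of the image struct produced by the image data source.
    image_struct = StructType([
        StructField("origin", StringType(), True),
        StructField("height", IntegerType(), False),
        StructField("width", IntegerType(), False),
        StructField("nChannels", IntegerType(), False),
        StructField("mode", IntegerType(), False),
        StructField("data", BinaryType(), False),
    ])

    def _resize(image):
        data = resize_nearest(image["data"], image["height"], image["width"],
                              image["nChannels"], new_height, new_width)
        # The channel count is unchanged, so the original mode is kept.
        return (image["origin"], new_height, new_width,
                image["nChannels"], image["mode"], bytearray(data))

    return df.withColumn(column, udf(_resize, image_struct)(col(column)))
```

With the dataframe from the question, this could be called as
resized = resize_image_column(df, 224, 224), where 224 is just an example
size. A Python UDF processes every pixel in the interpreter, so for large
images a library-backed resize inside the UDF would likely be faster.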


Kafka Integration libraries put in the fat jar

2019-07-30 Thread Spico Florin
Hello!

I would like to use Spark Structured Streaming integrated with Kafka
in the way described here:
https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html


but I got the following issue:

Caused by: org.apache.spark.sql.AnalysisException: Failed to find data
source: kafka. Please deploy the application as per the deployment section
of "Structured Streaming + Kafka Integration Guide".;

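even though I've added the Kafka SQL dependency to the generated fat jar:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
      <version>2.4.3</version>
      <scope>compile</scope>
    </dependency>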


When I submit with the command

spark-submit --master spark://spark-master:7077 --class myClass \
  --deploy-mode client \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.3 \
  my-fat-jar-with-dependencies.jar

the problem is gone.

Since the --packages option requires downloading the libraries from an
environment with internet access, which I don't have, can you please
advise how I can add the Kafka dependencies, either in the fat jar or
by some other means?

Thank you.

Regards,

Florin
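
One possible cause of the error above (an assumption, not something confirmed
in this thread) is that the fat-jar build kept only one copy of the service
registration file that Spark uses to look up data sources by short name, so
the Kafka provider is present as a class but not registered. The sketch below
inspects a jar for that registration using only the Python standard library.
The function names are hypothetical, and the provider class name
org.apache.spark.sql.kafka010.KafkaSourceProvider is assumed for the
spark-sql-kafka-0-10 artifact.

```python
import zipfile

# Service file read by Java's ServiceLoader to discover Spark data sources.
SERVICE_FILE = "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister"
# Assumed provider class for the "kafka" source in spark-sql-kafka-0-10.
KAFKA_PROVIDER = "org.apache.spark.sql.kafka010.KafkaSourceProvider"


def registered_sources(jar):
    """List the provider classes registered in the jar's service file.

    `jar` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(jar) as zf:
        if SERVICE_FILE not in zf.namelist():
            return []
        text = zf.read(SERVICE_FILE).decode("utf-8")
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            names.append(line)
    return names


def has_kafka_source(jar):
    """True if the Kafka provider is registered in the jar."""
    return KAFKA_PROVIDER in registered_sources(jar)
```

Running has_kafka_source("my-fat-jar-with-dependencies.jar") on the jar from
the command above would show whether the registration survived packaging. If
it did not, merging the META-INF/services files during the build (for example
with the Maven Shade plugin's ServicesResourceTransformer) is one commonly
used remedy. Another option that needs no internet access is to pass locally
available jar files to spark-submit with --jars.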