UnpicklingError while using spark streaming

2017-07-13 Thread lovemoon
Environment: Spark 2.1.1 and Python 2.7.11. I want to union another RDD inside DStream.transform(), like below:
    sc = SparkContext()
    ssc = StreamingContext(sc, 1)
    init_rdd = sc.textFile('file:///home/zht/PycharmProjects/test/text_file.txt')
    lines = ssc.socketTextStream('localhost', )
    lin

How to do multiple join in pyspark

2017-02-26 Thread lovemoon
My code is below:
    cfg = SparkConf().setAppName('MyApp')
    spark = SparkSession.builder.config(conf=cfg).getOrCreate()
    rdd1 = spark.createDataFrame([(1, 'a'), (2, 'b'), (4, 'c')], ['idx', 'val'])
    rdd1.registerTempTable('rdd1')
    rdd2 = spark.createDataFrame([(1, 2, 100), (1, 3, 200), (2, 3,