Hi everyone, I have the following scenario , and I tried to use window method with direct kafka streaming. The program can run, but doesn't run right!
1. The data is stored in kafka. 2. Every single item of the data has its primary key. 3. Every single item of the data will be split into multipe parts,and these parts will arrive at kafka in order. Here's my sample code: JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, Durations.seconds(20)); JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(jssc, String.class, String.class, StringDecoder.class, StringDecoder.class, kafkaParams, topicsSet); messages.window(Durations.seconds(60), Durations.seconds(40)).print(); I couldn't get the data of RDD@40 ,when I tried to print the data of windowed RDD@80. Can I have some suggestions!
