Cody,
Yes - I was able to verify that I am not seeing duplicate calls to
createDirectStream. If the spark-streaming-kafka-0-10 connector will work on
a 2.3 cluster, I can go ahead and give that a shot.
Regards,
Bryan Jeffrey
On Fri, Aug 31, 2018 at 11:56 AM Cody Koeninger wrote:
Just to be 100% sure, when you're logging the group id in
createDirectStream, you no longer see any duplicates?
Regarding testing master, is the blocker that your spark cluster is on
2.3? There's at least a reasonable chance that building an
application assembly jar that uses the master version j
Cody,
We are connecting to multiple clusters for each topic. This morning I
experimented both with adding a cluster identifier to the group id and with
simply moving to only a single one of our clusters. Neither was successful.
I am not able to run a test against master right now.
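A sketch of the cluster-identifier idea mentioned above, in case it helps pin down the duplicate-group issue. This is only an illustration, not the poster's actual code: the base group id, cluster names, and broker address below are hypothetical, and the map would be handed to KafkaUtils.createDirectStream via ConsumerStrategies.Subscribe in spark-streaming-kafka-0-10 (kept in a comment here so the snippet stays self-contained).

```scala
// Hedged sketch: give each cluster's direct stream its own group.id so two
// streams against different Kafka clusters never share consumer state.
// All names below are hypothetical examples.
def groupIdFor(baseGroupId: String, clusterId: String): String =
  s"$baseGroupId-$clusterId"

// In spark-streaming-kafka-0-10 this map would be passed as the kafkaParams
// argument of ConsumerStrategies.Subscribe inside
// KafkaUtils.createDirectStream; shown as a plain Map here.
def kafkaParamsFor(bootstrap: String, groupId: String): Map[String, Object] =
  Map[String, Object](
    "bootstrap.servers"  -> bootstrap,
    "group.id"           -> groupId,
    "auto.offset.reset"  -> "latest",
    "enable.auto.commit" -> (false: java.lang.Boolean)
  )
```

Logging the result of groupIdFor for every stream at startup is one quick way to confirm that no two streams ended up with the same group id.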
I want to be able to read Snappy-compressed files in Spark. I can do a
val df = spark.read.textFile("hdfs:// path")
and it passes that test in spark-shell, but beyond that, when I do a
df.show(10, false) or something, it shows me binary data mixed with real
text - how do I read the decompressed file in
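One likely cause of the binary-mixed-with-text symptom: Hadoop (and hence spark.read.textFile) picks a decompression codec from the file extension, and Hadoop's SnappyCodec uses its own block framing that is not compatible with raw or framed Snappy written by tools like python-snappy or snzip. If the extension and the actual framing disagree, the bytes come back undecoded. The helper below only illustrates the extension-to-codec mapping; the codec class names are the standard Hadoop ones, and the file names are made up.

```scala
// Hedged sketch: which Hadoop codec class a given extension selects.
// If a *.snappy file was NOT written with Hadoop's SnappyCodec framing,
// textFile will return raw bytes (binary mixed with text in df.show).
def hadoopCodecForExtension(path: String): Option[String] = {
  val byExt = Map(
    ".snappy" -> "org.apache.hadoop.io.compress.SnappyCodec",
    ".gz"     -> "org.apache.hadoop.io.compress.GzipCodec",
    ".bz2"    -> "org.apache.hadoop.io.compress.BZip2Codec"
  )
  byExt.collectFirst { case (ext, codec) if path.endsWith(ext) => codec }
}
```

If the file turns out to be raw Snappy, re-compressing it with the Hadoop codec (or decompressing it with the same library that wrote it) is usually the fix, rather than anything on the Spark read side.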
Hi Folks,
I came across a problem while reading Parquet through Spark.
A Parquet file was written with field 'a' of type 'Integer'.
Reading this file afterwards with a schema that declares 'a' as 'Long' throws
an exception. I thought this compatible type change was supported, but it is
not working.
Code s
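A common workaround for this situation (offered as a sketch, not a guaranteed fix, since the exact behavior depends on the Spark version and on whether the vectorized Parquet reader is enabled): read the file with the schema it was actually written with, then widen the column with a cast. The path below is hypothetical.

```scala
// Hedged sketch: Parquet stored 'a' as INT32, so instead of forcing LongType
// at read time, read with the file's own schema and cast afterwards:
//
//   val df      = spark.read.parquet("hdfs://.../data")   // 'a' inferred as int
//   val widened = df.withColumn("a", df("a").cast("long"))
//
// The cast itself is an ordinary widening conversion, lossless for every Int:
def widen(a: Int): Long = a.toLong
```

Since Int to Long is a pure widening, no data can be lost by the cast; the only cost is doing it after the read instead of inside the Parquet reader.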