mqtt module

2021-05-27 Thread jianxu
Hi Amit; Do you have any idea where to find the mqtt module? It is supposed to be under pyspark.streaming, but I could not find it in the latest version, 3.1.1. I need to connect Structured Streaming via MQTT. I appreciate any help with the matter. Regards, Jian Xu From: Amit

can not find module of mqtt under pyspark.streaming

2021-05-27 Thread jianxu
Hi There; I am using the latest Spark, 3.1.1, and trying to connect via MQTT with Structured Streaming. It seems to me that mqtt no longer exists under pyspark.streaming. Does anyone know where this module went and how to load it in a Python application? I would appreciate an answer. Regards,

Re: Reading Large File in Pyspark

2021-05-27 Thread Molotch
You can specify the line separator to make Spark split your records into separate rows. df = spark.read.option("lineSep","^^^").text("path") Then you need to split the column into an array with df.select(split("value", r"\*\*\*").alias("arrayColumn")) (split takes a regular expression, so the literal *** has to be escaped, and PySpark uses alias rather than Scala's as) and map over it with getItem to create a column for each proper
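The two-level split described in this reply (records on ^^^, then fields on ***) can be sketched in plain Python, without Spark. This is only an illustration under assumptions: the sample string and the parse_records helper are hypothetical and not from the thread. It is meant to show one detail that is easy to miss: a regex-based split cannot take *** as-is, because * is a quantifier, so the separator has to be escaped. pyspark.sql.functions.split also treats its pattern argument as a regular expression, so the same escaping applies there.

```python
import re
from typing import List

RECORD_SEP = "^^^"  # record separator mentioned in the post
FIELD_SEP = "***"   # field separator mentioned in the post


def parse_records(raw: str) -> List[List[str]]:
    """Split raw text into records, then split each record into fields.

    Hypothetical helper for illustration; not part of Spark.
    """
    # str.split treats the separator as a literal string,
    # much like a literal line separator would be treated.
    records = [r for r in raw.split(RECORD_SEP) if r]
    # re.split expects a regex, so the literal "***" is escaped first.
    field_pattern = re.escape(FIELD_SEP)
    return [re.split(field_pattern, record) for record in records]


# Hypothetical sample data: two records with three fields each.
sample = "a***b***c^^^d***e***f^^^"
rows = parse_records(sample)
print(rows)
```

Passing the unescaped "***" to re.split raises re.error, which is the plain-Python analogue of the pattern problem a regex-based split would hit with that separator.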

Accumulators and other important metrics for your job

2021-05-27 Thread Hamish Whittal
Hi folks, I have a problematic dataset I'm working with and am trying to find ways of "debugging" the data. For example, the simplest thing I would like to do is to find out how many rows of data I've read and compare that to a simple count of the lines in the file. I could do: df.count() but

Re: [apache spark] Does Spark 2.4.8 have issues with ServletContextHandler

2021-05-27 Thread Sean Owen
Despite the name, the error doesn't mean the class wasn't found, but that it could not be initialized. What's the rest of the error? I don't believe any testing has ever encountered this error, so it's likely something to do with your environment, but I don't know what. On Thu, May 27, 2021 at 7:32 AM Kanch

Re: [apache spark] Does Spark 2.4.8 have issues with ServletContextHandler

2021-05-27 Thread Kanchan Kauthale
Hello, I can see that the Jetty version was updated from 9.4.28 to 9.4.35 in JIRA: https://issues.apache.org/jira/browse/SPARK-33831 Does it have something to do with it? Thank you Kanchan On Thu, May 27, 2021 at 5:16 PM Kanchan Kauthale < kanchankauthal...@gmail.com> wrote: > Hello, > > We have

[apache spark] Does Spark 2.4.8 have issues with ServletContextHandler

2021-05-27 Thread Kanchan Kauthale
Hello, We have an existing project which works fine with Spark 2.4.7. We want to upgrade the Spark version to 2.4.8. The Scala version we are using is 2.11. After building with the upgraded pom, we get the error below for test cases: java.lang.NoClassDefFoundError: Could not initialize class org.spar