Spark Thrift Server in Kubernetes deployment

2020-06-21 Thread Subash K
Hi,

We are currently using Spark 2.4.4 with the Spark Thrift Server (STS) to expose a 
JDBC interface to reporting tools that generate reports from Spark tables.

Now that we are evaluating a containerized deployment of Spark and STS, I would 
like to understand whether an STS deployment on Kubernetes is supported out of 
the box. We were not able to find any documentation on how to configure and 
spin up a container for STS. Please help us with this.
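For context, the kind of launch we have been experimenting with is starting HiveThriftServer2 in client mode from inside a pod via spark-submit. A rough, untested sketch follows; the master URL, container image name, and Spark home are placeholders, not known-good values:

```python
# Untested sketch: build a spark-submit command that would start the Thrift Server
# class in client mode against a Kubernetes master. All paths/names are placeholders.
import shlex

spark_home = "/opt/spark"  # assumed install path inside the container

cmd = shlex.split(
    f"{spark_home}/bin/spark-submit "
    "--master k8s://https://kubernetes.default.svc "  # assumed in-cluster API endpoint
    "--deploy-mode client "
    "--class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 "
    "--conf spark.kubernetes.container.image=my-spark:2.4.4 "  # hypothetical image
    "spark-internal"
)
print(cmd[0])
```

This is only the shape of the command we have been trying; whether client-mode STS on Kubernetes is actually supported in 2.4.x is exactly what we are asking.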

Regards,
Subash Kunjupillai



Re: Unsubscribe

2020-06-21 Thread Wesley

To unsubscribe from the lists, please send an empty email to:
dev-unsubscr...@spark.apache.org
user-unsubscr...@spark.apache.org

Thanks.


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Hey good looking toPandas () error stack

2020-06-21 Thread Anwar AliKhan
The only change I am making is the Spark directory name.
It keeps failing in the same cell: df.toPandas()


findspark.init('/home/spark-2.4.6-bin-hadoop2.7')  FAIL

findspark.init('/home/spark-3.0.0-bin-hadoop2.7')  PASS





On Sun, 21 Jun 2020, 19:51 randy clinton wrote:

> You can see from the GitHub history for "toPandas()" that the function has
> been in the code for 5 years.
>
> https://github.com/apache/spark/blame/a075cd5b700f88ef447b559c6411518136558d78/python/pyspark/sql/dataframe.py#L923
>
> When I google IllegalArgumentException: 'Unsupported class file major
> version 55'
>
> I see posts about the Java version being used. Are you sure your configs
> are right?
>
> https://stackoverflow.com/questions/53583199/pyspark-error-unsupported-class-file-major-version
>
> On Sat, Jun 20, 2020 at 6:17 AM Anwar AliKhan 
> wrote:
>
>>
>> Two versions of Spark running against same code
>>
>>
>> https://towardsdatascience.com/your-first-apache-spark-ml-model-d2bb82b599dd
>>
>> version spark-2.4.6-bin-hadoop2.7 is producing error for toPandas(). See
>> error stack below
>>
>> Jupyter Notebook
>>
>> import findspark
>>
>> findspark.init('/home/spark-3.0.0-bin-hadoop2.7')
>>
>> cell "spark"
>>
>> cell output
>>
>> SparkSession - in-memory
>>
>> SparkContext
>>
>> Spark UI
>>
>> Version
>>
>> v3.0.0
>>
>> Master
>>
>> local[*]
>>
>> AppName
>>
>> Titanic Data
>>
>>
>> import findspark
>>
>> findspark.init('/home/spark-2.4.6-bin-hadoop2.7')
>>
>> cell  "spark"
>>
>>
>>
>> cell output
>>
>> SparkSession - in-memory
>>
>> SparkContext
>>
>> Spark UI
>>
>> Version
>>
>> v2.4.6
>>
>> Master
>>
>> local[*]
>>
>> AppName
>>
>> Titanic Data
>>
>> cell "df.show(5)"
>>
>>
>> +-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+
>> |PassengerId|Survived|Pclass|                Name|   Sex|Age|SibSp|Parch|          Ticket|   Fare|Cabin|Embarked|
>> +-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+
>> |          1|       0|     3|Braund, Mr. Owen ...|  male| 22|    1|    0|       A/5 21171|   7.25| null|       S|
>> |          2|       1|     1|Cumings, Mrs. Joh...|female| 38|    1|    0|        PC 17599|71.2833|  C85|       C|
>> |          3|       1|     3|Heikkinen, Miss. ...|female| 26|    0|    0|STON/O2. 3101282|  7.925| null|       S|
>> |          4|       1|     1|Futrelle, Mrs. Ja...|female| 35|    1|    0|          113803|   53.1| C123|       S|
>> |          5|       0|     3|Allen, Mr. Willia...|  male| 35|    0|    0|          373450|   8.05| null|       S|
>> +-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+
>>
>> only showing top 5 rows
>>
>> cell "df.toPandas()"
>>
>> cell output
>>
>>
>> ---
>>
>> Py4JJavaError Traceback (most recent call
>> last)
>>
>> /home/spark-2.4.6-bin-hadoop2.7/python/pyspark/sql/utils.py in deco(*a,
>> **kw)
>>
>>  62 try:
>>
>> ---> 63 return f(*a, **kw)
>>
>>  64 except py4j.protocol.Py4JJavaError as e:
>>
>> /home/spark-2.4.6-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py
>> in get_return_value(answer, gateway_client, target_id, name)
>>
>> 327 "An error occurred while calling
>> {0}{1}{2}.\n".
>>
>> --> 328 format(target_id, ".", name), value)
>>
>> 329 else:
>>
>> Py4JJavaError: An error occurred while calling o33.collectToPython.
>>
>> : java.lang.IllegalArgumentException: Unsupported class file major
>> version 55
>>
>> at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:166)
>>
>> at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:148)
>>
>> at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:136)
>>
>> at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:237)
>>
>> at
>> org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:50)
>>
>> at
>> org.apache.spark.util.FieldAccessFinder$$anon$4$$anonfun$visitMethodInsn$7.apply(ClosureCleaner.scala:845)
>>
>> at
>> org.apache.spark.util.FieldAccessFinder$$anon$4$$anonfun$visitMethodInsn$7.apply(ClosureCleaner.scala:828)
>>
>> at
>> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
>>
>> at
>> scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
>>
>> at
>> scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
>>
>> at
>> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
>>
>> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>>
>> at
>> scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
>>
>> at
>> 

Re: Hey good looking toPandas () error stack

2020-06-21 Thread Sean Owen
That part isn't related to Spark. It means you have some code compiled for
Java 11, but are running Java 8.
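For reference, the "major version" in that message maps linearly onto Java releases; the offset of 44 is fixed by the class file format. A small sketch of the mapping:

```python
# Class file major version N corresponds to Java release N - 44:
# 52 -> Java 8, 55 -> Java 11, and so on.
def java_release(class_file_major: int) -> int:
    return class_file_major - 44

print(java_release(55))  # -> 11
```

So "major version 55" means some class on the classpath was compiled for Java 11, which is what points the diagnosis at a Java version mismatch rather than at Spark.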

On Sun, Jun 21, 2020 at 1:51 PM randy clinton 
wrote:


Unsubscribe

2020-06-21 Thread Punna Yenumala



Re: Hey good looking toPandas () error stack

2020-06-21 Thread randy clinton
You can see from the GitHub history for "toPandas()" that the function has
been in the code for 5 years.
https://github.com/apache/spark/blame/a075cd5b700f88ef447b559c6411518136558d78/python/pyspark/sql/dataframe.py#L923

When I google IllegalArgumentException: 'Unsupported class file major
version 55'

I see posts about the Java version being used. Are you sure your configs
are right?
https://stackoverflow.com/questions/53583199/pyspark-error-unsupported-class-file-major-version

On Sat, Jun 20, 2020 at 6:17 AM Anwar AliKhan 
wrote:


Re: Kafka Zeppelin integration

2020-06-21 Thread Alex Ott
Can you post the settings you have configured for the Spark interpreter?
I recently did a demo using Zeppelin 0.9.0 preview1 + Structured Streaming
+ Kafka, running in distributed mode on DSE Analytics, and everything
just worked...

P.S. Here is the notebook if you're interested
https://github.com/alexott/zeppelin-demos/blob/master/cassandra-day-russia/Cassandra%20Day%20Russia%20Streaming%20demo.zpln

silav...@dtechspace.com  at "Fri, 19 Jun 2020 19:41:45 -0700" wrote:
 s> Hi, here is my question: Spark code run on Zeppelin is unable to find the
 s> Kafka source even though a dependency is specified. Is there any way to fix
 s> this? Zeppelin version is 0.9.0, Spark version is 2.4.6, and Kafka version
 s> is 2.4.1. I have specified the dependency in the packages and added a jar
 s> file that contained the Kafka 0-10 streaming connector.
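One thing worth double-checking: for Spark 2.4.6 (which is built against Scala 2.11), the Structured Streaming Kafka connector coordinate should be `spark-sql-kafka-0-10_2.11:2.4.6`. A minimal sketch of wiring it in (broker and topic names are placeholders; this assumes pyspark is on the path):

```python
# Sketch: the connector coordinate must match both the Spark version (2.4.6)
# and the Scala version of that build (2.11). Broker/topic below are placeholders.
SPARK_VERSION, SCALA_VERSION = "2.4.6", "2.11"
KAFKA_PKG = f"org.apache.spark:spark-sql-kafka-0-10_{SCALA_VERSION}:{SPARK_VERSION}"

def make_kafka_stream():
    # Imported lazily so this sketch loads even where Spark is not installed.
    from pyspark.sql import SparkSession
    spark = (SparkSession.builder
             .config("spark.jars.packages", KAFKA_PKG)
             .getOrCreate())
    return (spark.readStream
            .format("kafka")
            .option("kafka.bootstrap.servers", "localhost:9092")
            .option("subscribe", "my-topic")
            .load())

print(KAFKA_PKG)
```

Note that in Zeppelin, `spark.jars.packages` generally has to be set in the interpreter settings before the Spark interpreter starts; setting it from inside a note after the interpreter is already running usually has no effect.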


-- 
With best wishes,
Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)
