Re: TypeError: Can not infer schema for type:

2022-02-06 Thread capitnfrakass

Thanks for the reply.

It seems strange that in the Scala shell I can do this conversion directly:

scala> sc.parallelize(List(3,2,1,4)).toDF.show
+-----+
|value|
+-----+
|    3|
|    2|
|    1|
|    4|
+-----+

But in PySpark I have to write it as:

sc.parallelize([3,2,1,4]).map(lambda x: (x,1)).toDF(['id','count']).show()

+---+-----+
| id|count|
+---+-----+
|  3|    1|
|  2|    1|
|  1|    1|
|  4|    1|
+---+-----+


So there are differences between the PySpark and Scala implementations.
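
For reference, here is a minimal sketch of another way that I believe works in PySpark, passing an explicit element type so no dummy count column is needed (the app name and variable names are arbitrary, not from this thread):

from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.appName("int-rdd-to-df").getOrCreate()
sc = spark.sparkContext

# Passing an explicit element type lets Spark build a single-column
# DataFrame (column name "value") without wrapping each int in a tuple.
df = spark.createDataFrame(sc.parallelize([3, 2, 1, 4]), IntegerType())
df.show()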

Thanks

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: TypeError: Can not infer schema for type:

2022-02-06 Thread Sean Owen
You are passing a list of primitives. It expects something like a list of
tuples, each of which can hold a single int if you like.
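
For example, a quick sketch of both forms, assuming the pyspark shell where sc is already defined (the column name "id" is arbitrary):

from pyspark.sql import Row

# One-element tuples: schema inference now sees rows with a single int field.
sc.parallelize([(3,), (2,), (1,), (4,)]).toDF(["id"]).show()

# Equivalent with Row objects, which also carry the column name.
sc.parallelize([Row(id=3), Row(id=2), Row(id=1), Row(id=4)]).toDF().show()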

On Sun, Feb 6, 2022, 10:10 PM  wrote:

> >>> rdd = sc.parallelize([3,2,1,4])
> >>> rdd.toDF().show()
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>File "/opt/spark/python/pyspark/sql/session.py", line 66, in toDF
>  return sparkSession.createDataFrame(self, schema, sampleRatio)
>File "/opt/spark/python/pyspark/sql/session.py", line 675, in
> createDataFrame
>  return self._create_dataframe(data, schema, samplingRatio,
> verifySchema)
>File "/opt/spark/python/pyspark/sql/session.py", line 698, in
> _create_dataframe
>  rdd, schema = self._createFromRDD(data.map(prepare), schema,
> samplingRatio)
>File "/opt/spark/python/pyspark/sql/session.py", line 486, in
> _createFromRDD
>  struct = self._inferSchema(rdd, samplingRatio, names=schema)
>File "/opt/spark/python/pyspark/sql/session.py", line 466, in
> _inferSchema
>  schema = _infer_schema(first, names=names)
>File "/opt/spark/python/pyspark/sql/types.py", line 1067, in
> _infer_schema
>  raise TypeError("Can not infer schema for type: %s" % type(row))
> TypeError: Can not infer schema for type: <class 'int'>
>
>
> Why does this fail in my PySpark? I don't understand it.
> Thanks for any help.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


TypeError: Can not infer schema for type:

2022-02-06 Thread capitnfrakass

rdd = sc.parallelize([3,2,1,4])
rdd.toDF().show()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/spark/python/pyspark/sql/session.py", line 66, in toDF
    return sparkSession.createDataFrame(self, schema, sampleRatio)
  File "/opt/spark/python/pyspark/sql/session.py", line 675, in createDataFrame
    return self._create_dataframe(data, schema, samplingRatio, verifySchema)
  File "/opt/spark/python/pyspark/sql/session.py", line 698, in _create_dataframe
    rdd, schema = self._createFromRDD(data.map(prepare), schema, samplingRatio)
  File "/opt/spark/python/pyspark/sql/session.py", line 486, in _createFromRDD
    struct = self._inferSchema(rdd, samplingRatio, names=schema)
  File "/opt/spark/python/pyspark/sql/session.py", line 466, in _inferSchema
    schema = _infer_schema(first, names=names)
  File "/opt/spark/python/pyspark/sql/types.py", line 1067, in _infer_schema
    raise TypeError("Can not infer schema for type: %s" % type(row))
TypeError: Can not infer schema for type: <class 'int'>


Why does this fail in my PySpark? I don't understand it.
Thanks for any help.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: dataframe doesn't support higher order func, right?

2022-02-06 Thread Sean Owen
Scala and Python are not the same in this regard. This isn't related to how
Spark works.

On Sun, Feb 6, 2022, 10:04 PM  wrote:

> Indeed. In spark-shell I always omit the parentheses:
>
> scala> sc.parallelize(List(3,2,1,4)).toDF.show
> +-----+
> |value|
> +-----+
> |    3|
> |    2|
> |    1|
> |    4|
> +-----+
>
> So I thought it would be OK in PySpark too.
>
> But this still doesn't work. Why?
>
> >>> sc.parallelize([3,2,1,4]).toDF().show()
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>File "/opt/spark/python/pyspark/sql/session.py", line 66, in toDF
>  return sparkSession.createDataFrame(self, schema, sampleRatio)
>File "/opt/spark/python/pyspark/sql/session.py", line 675, in
> createDataFrame
>  return self._create_dataframe(data, schema, samplingRatio,
> verifySchema)
>File "/opt/spark/python/pyspark/sql/session.py", line 698, in
> _create_dataframe
>  rdd, schema = self._createFromRDD(data.map(prepare), schema,
> samplingRatio)
>File "/opt/spark/python/pyspark/sql/session.py", line 486, in
> _createFromRDD
>  struct = self._inferSchema(rdd, samplingRatio, names=schema)
>File "/opt/spark/python/pyspark/sql/session.py", line 466, in
> _inferSchema
>  schema = _infer_schema(first, names=names)
>File "/opt/spark/python/pyspark/sql/types.py", line 1067, in
> _infer_schema
>  raise TypeError("Can not infer schema for type: %s" % type(row))
> TypeError: Can not infer schema for type: <class 'int'>
>
>
> spark 3.2.0
>
>
> On 07/02/2022 11:44, Sean Owen wrote:
> > This is just basic Python - you're missing parentheses on toDF, so you
> > are not calling a function nor getting its result.
> >
> > On Sun, Feb 6, 2022 at 9:39 PM  wrote:
> >
> >> I am a bit confused why this doesn't work in PySpark:
> >>
> > x = sc.parallelize([3,2,1,4])
> > x.toDF.show()
> >> Traceback (most recent call last):
> >> File "<stdin>", line 1, in <module>
> >> AttributeError: 'function' object has no attribute 'show'
> >>
> >> Thank you.
> >>
> >>
> > -
> >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>


Re: dataframe doesn't support higher order func, right?

2022-02-06 Thread capitnfrakass

Indeed. In spark-shell I always omit the parentheses:

scala> sc.parallelize(List(3,2,1,4)).toDF.show
+-----+
|value|
+-----+
|    3|
|    2|
|    1|
|    4|
+-----+

So I thought it would be OK in PySpark too.

But this still doesn't work. Why?


sc.parallelize([3,2,1,4]).toDF().show()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/spark/python/pyspark/sql/session.py", line 66, in toDF
    return sparkSession.createDataFrame(self, schema, sampleRatio)
  File "/opt/spark/python/pyspark/sql/session.py", line 675, in createDataFrame
    return self._create_dataframe(data, schema, samplingRatio, verifySchema)
  File "/opt/spark/python/pyspark/sql/session.py", line 698, in _create_dataframe
    rdd, schema = self._createFromRDD(data.map(prepare), schema, samplingRatio)
  File "/opt/spark/python/pyspark/sql/session.py", line 486, in _createFromRDD
    struct = self._inferSchema(rdd, samplingRatio, names=schema)
  File "/opt/spark/python/pyspark/sql/session.py", line 466, in _inferSchema
    schema = _infer_schema(first, names=names)
  File "/opt/spark/python/pyspark/sql/types.py", line 1067, in _infer_schema
    raise TypeError("Can not infer schema for type: %s" % type(row))
TypeError: Can not infer schema for type: <class 'int'>


spark 3.2.0


On 07/02/2022 11:44, Sean Owen wrote:

This is just basic Python - you're missing parentheses on toDF, so you
are not calling a function nor getting its result.

On Sun, Feb 6, 2022 at 9:39 PM  wrote:


I am a bit confused why this doesn't work in PySpark:


x = sc.parallelize([3,2,1,4])
x.toDF.show()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'function' object has no attribute 'show'

Thank you.



-

To unsubscribe e-mail: user-unsubscr...@spark.apache.org


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: dataframe doesn't support higher order func, right?

2022-02-06 Thread Sean Owen
This is just basic Python - you're missing parentheses on toDF, so you are
not calling a function nor getting its result.
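
As a small sketch tying this to the other thread: even with the parentheses added, an RDD of bare ints still hits the schema-inference error, so the elements also need to be wrapped first (this assumes the pyspark shell where sc is defined; the column name "value" is my own choice):

# x.toDF is just the bound method object; x.toDF() actually builds the
# DataFrame, but only if each element looks like a row (e.g. a 1-tuple),
# not a bare int.
x = sc.parallelize([3, 2, 1, 4]).map(lambda n: (n,))
x.toDF(["value"]).show()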

On Sun, Feb 6, 2022 at 9:39 PM  wrote:

> I am a bit confused why this doesn't work in PySpark:
>
> >>> x = sc.parallelize([3,2,1,4])
> >>> x.toDF.show()
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> AttributeError: 'function' object has no attribute 'show'
>
>
> Thank you.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: dataframe doesn't support higher order func, right?

2022-02-06 Thread capitnfrakass

I am a bit confused why this doesn't work in PySpark:


x = sc.parallelize([3,2,1,4])
x.toDF.show()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'function' object has no attribute 'show'


Thank you.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: add an auto_increment column

2022-02-06 Thread Siva Samraj
monotonically_increasing_id() will give you similar functionality.
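
For example, a minimal usage sketch, assuming you already have a DataFrame df (the column name "row_id" is just for illustration):

from pyspark.sql.functions import monotonically_increasing_id

# Adds a unique 64-bit ID per row; values increase with partition/row order
# but are not consecutive, so gaps between IDs are expected.
df_with_id = df.withColumn("row_id", monotonically_increasing_id())
df_with_id.show()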

On Mon, 7 Feb, 2022, 6:57 am ,  wrote:

> For a dataframe object, how do I add a column that auto-increments, like
> MySQL's auto_increment behavior?
>
> Thank you.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: add an auto_increment column

2022-02-06 Thread ayan guha
Try this:
https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.functions.monotonically_increasing_id.html
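
If strictly consecutive IDs (1, 2, 3, ...) are really needed, one alternative is row_number() over a window; this is my own suggestion rather than something from the linked docs, it assumes an existing DataFrame df, and a window without partitionBy pulls all rows into a single partition, so it does not scale well:

from pyspark.sql.functions import row_number, monotonically_increasing_id
from pyspark.sql.window import Window

# Order by a throwaway monotonic ID so row_number() has a stable ordering;
# replace with a real ordering column if one exists.
w = Window.orderBy(monotonically_increasing_id())
df_seq = df.withColumn("id", row_number().over(w))
df_seq.show()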



On Mon, 7 Feb 2022 at 12:27 pm,  wrote:

> For a dataframe object, how do I add a column that auto-increments, like
> MySQL's auto_increment behavior?
>
> Thank you.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
> --
Best Regards,
Ayan Guha


Fwd: (send this email to subscribe)

2022-02-06 Thread Madhuchaitanya Joshi
-- Forwarded message -
From: Madhuchaitanya Joshi 
Date: Wed, 19 Jan, 2022, 10:51
Subject: (send this email to subscribe)
To: 


Hello team,

I am trying to build and compile the Spark source code using IntelliJ and
Eclipse, but I am getting a jackson-bind.jar not found error in IntelliJ. I
have tried Generate Sources and Update Folders, mvn clean compile, and
rebuilding the project, and also invalidating the caches, but it still does
not work.

Please help me with this. I want to build and compile the Spark source code
in Eclipse/IntelliJ so I can understand the flow of the code.

Thanks and regards,
Madhuchaitanya Joshi


add an auto_increment column

2022-02-06 Thread capitnfrakass
For a dataframe object, how do I add a column that auto-increments, like
MySQL's auto_increment behavior?


Thank you.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Unsubscribe

2022-02-06 Thread Yogitha Ramanathan
Unsubscribe


Re: dataframe doesn't support higher order func, right?

2022-02-06 Thread Mich Talebzadeh
Basically you are creating a dataframe (a dataframe is a *Dataset* organized
into named columns; it is conceptually equivalent to a table in a
relational database) out of an RDD here.


scala> val rdd = sc.parallelize( List(3, 2, 1, 4, 0))

rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[19] at
parallelize at <console>:24


scala> // convert it to a dataframe


scala> val df = rdd.toDF

df: org.apache.spark.sql.DataFrame = [value: int]


scala> df.filter('value > 2).show

+-----+
|value|
+-----+
|    3|
|    4|
+-----+

HTH







*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sun, 6 Feb 2022 at 11:51,  wrote:

> for example, this works for an RDD object:
>
> scala> val li = List(3,2,1,4,0)
> li: List[Int] = List(3, 2, 1, 4, 0)
>
> scala> val rdd = sc.parallelize(li)
> rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at
> parallelize at <console>:24
>
> scala> rdd.filter(_ > 2).collect()
> res0: Array[Int] = Array(3, 4)
>
>
> After I convert the RDD to a dataframe, the filter won't work:
>
> scala> val df = rdd.toDF
> df: org.apache.spark.sql.DataFrame = [value: int]
>
> scala> df.filter(_ > 2).show()
> <console>:24: error: value > is not a member of org.apache.spark.sql.Row
> df.filter(_ > 2).show()
>
>
> But this can work:
>
> scala> df.filter($"value" > 2).show()
> +-----+
> |value|
> +-----+
> |    3|
> |    4|
> +-----+
>
>
> Where can I check all the methods supported by dataframes?
>
>
> Thank you.
> Frakass
>
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: dataframe doesn't support higher order func, right?

2022-02-06 Thread Sean Owen
DataFrames are a quite different API, more SQL-like in their operations, not
functional. The equivalent would be more like df.filter("value > 2")
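
A short PySpark sketch of the same idea, to make the contrast concrete; this is my illustration (assuming the usual spark session from the shell), not text from the thread:

from pyspark.sql.functions import col

df = spark.createDataFrame([(3,), (2,), (1,), (4,), (0,)], ["value"])

# DataFrame filters take a SQL expression string or a Column expression,
# not an arbitrary lambda over rows as RDD.filter does.
df.filter("value > 2").show()
df.filter(col("value") > 2).show()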

On Sun, Feb 6, 2022 at 5:51 AM  wrote:

> for example, this works for an RDD object:
>
> scala> val li = List(3,2,1,4,0)
> li: List[Int] = List(3, 2, 1, 4, 0)
>
> scala> val rdd = sc.parallelize(li)
> rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at
> parallelize at <console>:24
>
> scala> rdd.filter(_ > 2).collect()
> res0: Array[Int] = Array(3, 4)
>
>
> After I convert the RDD to a dataframe, the filter won't work:
>
> scala> val df = rdd.toDF
> df: org.apache.spark.sql.DataFrame = [value: int]
>
> scala> df.filter(_ > 2).show()
> <console>:24: error: value > is not a member of org.apache.spark.sql.Row
> df.filter(_ > 2).show()
>
>
> But this can work:
>
> scala> df.filter($"value" > 2).show()
> +-----+
> |value|
> +-----+
> |    3|
> |    4|
> +-----+
>
>
> Where can I check all the methods supported by dataframes?
>
>
> Thank you.
> Frakass
>
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


dataframe doesn't support higher order func, right?

2022-02-06 Thread capitnfrakass

for example, this works for an RDD object:

scala> val li = List(3,2,1,4,0)
li: List[Int] = List(3, 2, 1, 4, 0)

scala> val rdd = sc.parallelize(li)
rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at 
parallelize at <console>:24


scala> rdd.filter(_ > 2).collect()
res0: Array[Int] = Array(3, 4)


After I convert the RDD to a dataframe, the filter won't work:

scala> val df = rdd.toDF
df: org.apache.spark.sql.DataFrame = [value: int]

scala> df.filter(_ > 2).show()
<console>:24: error: value > is not a member of org.apache.spark.sql.Row
   df.filter(_ > 2).show()


But this can work:

scala> df.filter($"value" > 2).show()
+-----+
|value|
+-----+
|    3|
|    4|
+-----+


Where can I check all the methods supported by dataframes?


Thank you.
Frakass


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Python performance

2022-02-06 Thread Hinko Kocevar
Thanks for your input guys!   //hinko

On 4 Feb 2022, at 14:58, Sean Owen  wrote:


Yes, in the sense that any transformation that can be expressed in the SQL-like
DataFrame API will push down to the JVM and take advantage of other
optimizations, avoiding the data movement to/from Python and more.
But you can't do this if you're expressing operations that are not in the
DataFrame API - custom logic. They are not always alternatives.

There, pandas UDFs are a better choice in Python, as you can take advantage of
Arrow for data movement, and that is also a reason to use DataFrames in a case
like this. It still has to execute code in Python, though.
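
For example, a minimal pandas UDF sketch of the kind of custom logic this refers to; the function and column names are mine, chosen only for illustration, and it assumes the shell's spark session plus pyarrow installed:

import pandas as pd
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import LongType

# A vectorized (Arrow-backed) UDF: each call receives a pandas Series batch,
# so data moves between the JVM and Python in columnar Arrow batches.
@pandas_udf(LongType())
def plus_one(v: pd.Series) -> pd.Series:
    return v + 1

df = spark.range(10)
df.select(plus_one(df["id"]).alias("id_plus_one")).show()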

On Fri, Feb 4, 2022 at 3:20 AM Bitfox  wrote:
Please see this test of mine:
https://blog.cloudcache.net/computing-performance-comparison-for-words-statistics/

Don't use Python RDDs; use dataframes instead.
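
For instance, the word count from that comparison can be written against the DataFrame API so the heavy lifting stays in the JVM; this is a rough sketch of my own (not copied from the post), and the input path is just a placeholder:

from pyspark.sql.functions import explode, split, col

# Assumes a text file path; adjust to a real input.
lines = spark.read.text("input.txt")

word_counts = (
    lines
    .select(explode(split(col("value"), r"\s+")).alias("word"))
    .where(col("word") != "")
    .groupBy("word")
    .count()
)
word_counts.show()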

Regards

On Fri, Feb 4, 2022 at 5:02 PM Hinko Kocevar  
wrote:
I'm looking into using the Python interface with Spark and came across this [1]
chart showing some performance hit when going with Python RDDs. The data is ~7
years old and for an older version of Spark. Is this still the case with more
recent Spark releases?

I'm trying to understand what to expect from Python and Spark and under what 
conditions.

[1] 
https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html

Thanks,
//hinko
-
To unsubscribe e-mail: 
user-unsubscr...@spark.apache.org



Re: help check my simple job

2022-02-06 Thread capitnfrakass

That did resolve my issue.
Thanks a lot.

frakass


On 06/02/2022 17:25, Hannes Bibel wrote:

Hi,

looks like you're packaging your application for Scala 2.13 (should be
specified in your build.sbt) while your Spark installation is built
for Scala 2.12.

Go to https://spark.apache.org/downloads.html, select under "Choose a
package type" the package type that says "Scala 2.13". With that
release you should be able to run your application.

In general, minor versions of Scala (e.g. 2.12 and 2.13) are
incompatible.

Best
Hannes

On Sun, Feb 6, 2022 at 10:01 AM  wrote:


Hello

I wrote this simple job in scala:

$ cat Myjob.scala
import org.apache.spark.sql.SparkSession

object Myjob {
def main(args: Array[String]): Unit = {
val sparkSession = SparkSession.builder.appName("Simple
Application").getOrCreate()
val sparkContext = sparkSession.sparkContext

val arrayRDD = sparkContext.parallelize(List(1,2,3,4,5,6,7,8))
println(arrayRDD.getClass, arrayRDD.count())
}
}

After package it then I submit it to spark, it gets the error:

$ /opt/spark/bin/spark-submit --class "Myjob" --master local[4]
target/scala-2.13/my-job_2.13-1.0.jar

Exception in thread "main" java.lang.NoSuchMethodError:
'scala.collection.immutable.ArraySeq
scala.runtime.ScalaRunTime$.wrapIntArray(int[])'
        at Myjob$.main(Myjob.scala:8)
        at Myjob.main(Myjob.scala)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

What's the issue?

Thank you.



-

To unsubscribe e-mail: user-unsubscr...@spark.apache.org





-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: help check my simple job

2022-02-06 Thread Hannes Bibel
Hi,

looks like you're packaging your application for Scala 2.13 (should be
specified in your build.sbt) while your Spark installation is built for
Scala 2.12.

Go to https://spark.apache.org/downloads.html, select under "Choose a
package type" the package type that says "Scala 2.13". With that release
you should be able to run your application.

In general, minor versions of Scala (e.g. 2.12 and 2.13) are incompatible.

Best
Hannes


On Sun, Feb 6, 2022 at 10:01 AM  wrote:

> Hello
>
>   I wrote this simple job in scala:
>
> $ cat Myjob.scala
> import org.apache.spark.sql.SparkSession
>
> object Myjob {
>def main(args: Array[String]): Unit = {
>  val sparkSession = SparkSession.builder.appName("Simple
> Application").getOrCreate()
>  val sparkContext = sparkSession.sparkContext
>
>  val arrayRDD = sparkContext.parallelize(List(1,2,3,4,5,6,7,8))
>  println(arrayRDD.getClass, arrayRDD.count())
>}
> }
>
>
> After package it then I submit it to spark, it gets the error:
>
> $ /opt/spark/bin/spark-submit --class "Myjob" --master local[4]
> target/scala-2.13/my-job_2.13-1.0.jar
>
> Exception in thread "main" java.lang.NoSuchMethodError:
> 'scala.collection.immutable.ArraySeq
> scala.runtime.ScalaRunTime$.wrapIntArray(int[])'
> at Myjob$.main(Myjob.scala:8)
> at Myjob.main(Myjob.scala)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
>
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at
>
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
> at
> org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
> at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at
>
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
> at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> What's the issue?
>
> Thank you.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


help check my simple job

2022-02-06 Thread capitnfrakass

Hello

 I wrote this simple job in scala:

$ cat Myjob.scala
import org.apache.spark.sql.SparkSession

object Myjob {
  def main(args: Array[String]): Unit = {
val sparkSession = SparkSession.builder.appName("Simple 
Application").getOrCreate()

val sparkContext = sparkSession.sparkContext

val arrayRDD = sparkContext.parallelize(List(1,2,3,4,5,6,7,8))
println(arrayRDD.getClass, arrayRDD.count())
  }
}


After package it then I submit it to spark, it gets the error:

$ /opt/spark/bin/spark-submit --class "Myjob" --master local[4] 
target/scala-2.13/my-job_2.13-1.0.jar


Exception in thread "main" java.lang.NoSuchMethodError:
'scala.collection.immutable.ArraySeq
scala.runtime.ScalaRunTime$.wrapIntArray(int[])'
        at Myjob$.main(Myjob.scala:8)
        at Myjob.main(Myjob.scala)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


What's the issue?

Thank you.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org