But you cannot get what you expect in PySpark, because the underlying RDD
on the Scala side holds serialized Python objects, so its type is always
RDD[Array[Byte]], whatever the element type of the RDD is in Python.
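
For example, from a pyspark shell (a minimal, untested sketch; it assumes
a SparkContext named sc, and _jrdd is an internal attribute that may
change between versions):

rdd = sc.parallelize([1, 2, 3])

# The JVM side only ever sees serialized data, so the class tag is byte[]:
print(rdd._jrdd.classTag().runtimeClass())   # class [B

# To learn the Python-side element type, inspect an element instead:
print(type(rdd.first()))                     # <type 'int'>

# For a mixed RDD, collect the distinct element type names:
print(rdd.map(lambda x: type(x).__name__).distinct().collect())  # ['int']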

Davies

On Sat, Sep 6, 2014 at 4:09 AM, Aaron Davidson <ilike...@gmail.com> wrote:
> Pretty easy to do in Scala:
>
> rdd.elementClassTag.runtimeClass
>
> You can access this method from Python as well by using the internal _jrdd.
> It would look something like this (warning, I have not tested it):
> rdd._jrdd.classTag().runtimeClass()
>
> (The method name is "classTag" for JavaRDDLike, and "elementClassTag" for
> Scala's RDD.)
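>
> For example (again untested), a full call from pyspark might look like:
>
> sc.parallelize([1, 2, 3])._jrdd.classTag().runtimeClass()
>
> Since _jrdd is internal, this may change between Spark versions.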
>
>
> On Thu, Sep 4, 2014 at 1:32 PM, esamanas <evan.sama...@gmail.com> wrote:
>>
>> Hi,
>>
>> I'm new to spark and scala, so apologies if this is obvious.
>>
>> Every RDD appears to be typed, which I can see from the output in the
>> spark-shell when I execute 'take':
>>
>> scala> val t = sc.parallelize(Array(1,2,3))
>> t: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[6] at parallelize
>> at <console>:12
>>
>> scala> t.take(3)
>> res4: Array[Int] = Array(1, 2, 3)
>>
>>
>> scala> val u = sc.parallelize(Array(1,Array(2,2,2,2,2),3))
>> u: org.apache.spark.rdd.RDD[Any] = ParallelCollectionRDD[3] at parallelize
>> at <console>:12
>>
>> scala> u.take(3)
>> res5: Array[Any] = Array(1, Array(2, 2, 2, 2, 2), 3)
>>
>> The Array type stays the same even if the elements returned are all of
>> one type:
>> scala> u.take(1)
>> res6: Array[Any] = Array(1)
>>
>>
>> Is there some function call that just returns the name of the element
>> type of an entire RDD?  I would also really like the same functionality
>> in pyspark, so I'm wondering if it exists on that side, since the
>> underlying RDD is clearly typed (I'd be fine with either the Scala or
>> the Python type name).
>>
>> Thank you,
>>
>> Evan
>
