Kanchan,
the `toDebugString` output looks unformatted because in PySpark it comes back
as a bytes object, so `print` shows the raw `b'...'` literal with `\n` escapes
instead of real line breaks. I suggest you print the RDD lineage using
`print(rdd.toDebugString().decode("utf-8"))` instead (note: this only happens
in PySpark).

As for your other question, you can call `rdd.getNumPartitions()`. The number
in parentheses at the start of the lineage output is also the partition count.
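To put both suggestions together, here is a minimal sketch. The helper names `lineage_text` and `describe` are mine, purely for illustration; the only Spark calls used are `toDebugString()` and `getNumPartitions()` on an RDD. It decodes only when the value is actually bytes, so it should also be harmless if some version hands back a str.

```python
def lineage_text(rdd):
    """Return the RDD lineage as a printable str.

    In PySpark, toDebugString() can return bytes; decode only in that case.
    """
    debug = rdd.toDebugString()
    if isinstance(debug, bytes):
        debug = debug.decode("utf-8")
    return debug


def describe(rdd):
    """Build a short report: partition count first, then the lineage."""
    return "partitions: {}\n{}".format(rdd.getNumPartitions(), lineage_text(rdd))
```

Assuming you already have a SparkContext named `sc` (as in the pyspark shell), you could try it with something like
`print(describe(sc.parallelize(range(100), 4).map(lambda x: x * 2)))`.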

On Sat, Apr 20, 2019 at 2:40 PM kanchan tewary <kanchan.tew...@gmail.com>
wrote:

> Dear All,
>
> Greetings!
>
> I am new to Apache Spark and working on RDDs using pyspark. I am trying to
> understand the logical plan provided by toDebugString function, but I find
> two issues a) the output is not formatted when I print the result
> b) I do not see number of partitions shown.
>
> Can anyone direct me to any reference documentation to understand the
> logical plan better? Or, do you suggest to use DAG from spark UI instead?
>
>
> Thanks & Best Regards,
> Kanchan
> Data Engineer, IBM
>
