From what I understand, this is not possible to do. However, let me share my
workaround with you.
Assuming you have your debugger up and running in PyCharm, set a breakpoint at
this line, take/collect/sample your data (consider a glom first if it's
critical that the data remain partitioned, then the take/collect), and pass it
into the function directly (plain Python, no Spark). Use the debugger to step
through the function on that small sample.
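To make the idea concrete, here is a minimal sketch of that workaround. The function f and the sample data are illustrative, and the Spark calls appear only in comments so the snippet runs stand-alone; in your code you would replace the fake sample with the rows you collect from the actual dataset.

```python
# Sketch of the workaround: instead of letting Spark invoke f on an
# executor, pull a small sample to the driver and call f directly, so the
# PyCharm debugger can step into it like any local function.

def f(partition):
    # The function you would normally pass to dataset.foreachPartition(f).
    # foreachPartition hands it an iterator over the rows of one partition.
    total = 0
    for row in partition:
        total += row["value"]
    return total  # returning a value is fine when calling f by hand

# In Spark this would look something like (not executed here):
#   sample = dataset.limit(20).collect()       # small, driver-local sample
# or, to keep the partition structure (glom groups rows per partition):
#   parts = dataset.rdd.glom().collect()       # list of lists, one per partition
# Here we fake the collected sample with plain dicts so the sketch runs
# without a Spark session:
sample = [{"value": 1}, {"value": 2}, {"value": 3}]

# Set a breakpoint inside f, then call it directly -- plain Python,
# no Spark, so you can step through it in the debugger.
result = f(iter(sample))
```

If you used glom, you would instead loop over the collected partitions and call `f(iter(part))` for each one, preserving the per-partition behaviour.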
Alternatively, you can open PyCharm's interactive Python console. In the
console, do the same as above with the RDD and pass the result into the
function. This avoids writing throwaway debugging code. I find this approach
useful and a bit faster, but it does not offer step-through debugging.
Best of luck!
From: Vitaliy Pisarev <vitaliy.pisa...@biocatch.com>
Date: Sunday, March 11, 2018 at 8:46 AM
To: "firstname.lastname@example.org" <email@example.com>
Subject: [EXT] Debugging a local spark executor in pycharm
I want to step through the work of a spark executor running locally on my
machine, from Pycharm.
I am running explicit functionality, in the form of dataset.foreachPartition(f)
and I want to see what is going on inside f.
Is there a straightforward way to do it or do I need to resort to remote
debugging?
Posted this on