My understanding is that this log is printed by PythonRunner.scala in the Spark code base, but I may be mistaken.
On Thu, Nov 22, 2018, 17:54 Eike von Seggern <eike.segg...@sevenval.com> wrote:

> Hi,
>
> Abdeali Kothari <abdealikoth...@gmail.com> wrote on Thu, 22 Nov 2018 at 10:04:
>
>> When I run Python UDFs with pyspark, I get multiple logs where it says:
>>
>> 18/11/22 01:51:59 INFO python.PythonUDFRunner: Times: total = 44, boot = -25, init = 67, finish = 2
>>
>> I am wondering if I can easily identify from these logs which of my Python UDFs this timing information is for. I have about a hundred, so it's quite difficult for me to identify this otherwise ...
>
> If the log is created using Python's logging module, it should be possible. This supports `funcName` (https://docs.python.org/3.6/library/logging.html#logrecord-attributes). But I do not know how to configure the log format for pyspark.
>
> HTH
>
> Eike
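For reference, a minimal sketch of the `funcName` idea Eike mentions. This assumes the logging happens inside your own Python code (the `PythonUDFRunner` timing line itself is emitted by Spark's JVM side, so this won't label that log, but logging you add inside a UDF can carry the function name automatically). The logger name and UDF here are made up for illustration:

```python
import logging

# Attach a formatter whose "%(funcName)s" field resolves to the name
# of the function that made the logging call.
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(levelname)s %(funcName)s: %(message)s")
)
logger = logging.getLogger("udf-demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def my_udf(value):
    # This record's funcName attribute will be "my_udf".
    logger.info("processing %r", value)
    return value

my_udf(42)
```

Run as a plain script, this prints a line like `INFO my_udf: processing 42`, so each UDF's log entries identify themselves without manual tagging.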