[GitHub] [spark] zero323 commented on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema.

GitBox Sat, 26 Oct 2019 09:21:37 -0700

URL: https://github.com/apache/spark/pull/26118#issuecomment-546617204
 
 
   @HyukjinKwon To be honest, I have mixed feelings about this. It looks 
sensible as a _temporary workaround_, but I am not fond of the idea of 
enforcing the notion of `Row` as an unordered, dictionary-like object (though 
with compact dicts as the standard, that doesn't matter that much), especially 
when it is close to becoming completely obsolete. 
   
   Personally, I'd prefer to wait a moment and see where the discussion on 
SPARK-22232 goes. If the resolution is the introduction of a legacy mode, then 
the scope of this particular change could be made conditional on it and on the 
Python version.
   
   If not, I'd like to see some profiling data first (especially memory; 
timings might actually be better for now, as we skip all the nasty `obj[n]`, 
but that's not very meaningful*).
   
   ----
   \* Is there any reason why we do this:
   
   
https://github.com/apache/spark/blob/2115bf61465b504bc21e37465cb34878039b5cb8/python/pyspark/sql/types.py#L615
   
   instead of just `tuple(self)`? That's a huge performance bottleneck with 
wide schemas. Depending on the resolution of this one, that's something to fix, 
don't you think?


