singhpk234 commented on PR #37083:
URL: https://github.com/apache/spark/pull/37083#issuecomment-1178646200

   > After this PR, what's the difference between 
SizeInBytesOnlyStatsPlanVisitor and BasicStatsPlanVisitor
   
   BasicStatsPlanVisitor additionally takes column stats (NDV / null count / 
min / max etc.) into account during estimation; these are generally not 
provided by the DSv1 / DSv2 relation itself.
   
   As per my understanding, prior to this PR, SizeInBytesOnlyStatsPlanVisitor 
estimated stats from a subset of the info, i.e. only sizeInBytes, while 
BasicStatsPlanVisitor used all 3 (sizeInBytes, rowCount, and column stats such 
as min / max / NDV). With this PR, SizeInBytesOnlyStatsPlanVisitor still 
estimates from a subset of the info, but that subset is now sizeInBytes and 
rowCount; BasicStatsPlanVisitor continues to use all 3.
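
   To make the split above concrete, here is a toy model in Python. It is only 
a sketch of which fields each visitor looks at, as I understand it; it is not 
Spark's code, and every name in it (`Stats`, `ColumnStat`, `size_only_before`, 
`size_only_after`, `basic`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ColumnStat:
    """Hypothetical per-column stats (NDV, null count, min, max)."""
    distinct_count: Optional[int] = None
    null_count: Optional[int] = None
    min_value: Optional[int] = None
    max_value: Optional[int] = None


@dataclass(frozen=True)
class Stats:
    """Hypothetical stand-in for the stats attached to a plan node."""
    size_in_bytes: int
    row_count: Optional[int] = None
    column_stats: Dict[str, ColumnStat] = field(default_factory=dict)


def size_only_before(s: Stats) -> Stats:
    # Prior to the PR (per the comment): only sizeInBytes is used.
    return Stats(s.size_in_bytes)


def size_only_after(s: Stats) -> Stats:
    # After the PR: sizeInBytes and rowCount, still no column stats.
    return Stats(s.size_in_bytes, s.row_count)


def basic(s: Stats) -> Stats:
    # BasicStatsPlanVisitor: all three kinds of info.
    return Stats(s.size_in_bytes, s.row_count, dict(s.column_stats))


# Made-up stats for a relation, for illustration only.
relation = Stats(
    size_in_bytes=1024,
    row_count=100,
    column_stats={"id": ColumnStat(distinct_count=100, null_count=0,
                                   min_value=1, max_value=100)},
)

print(size_only_before(relation))
print(size_only_after(relation))
print(basic(relation))
```

   In this model, the only thing the PR changes is that `row_count` survives 
in the size-only path; column stats are still seen only by the basic path.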


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

