DaveBirdsall opened a new pull request #1848: [TRAFODION-3316] Three fixes to 
UPDATE STATISTICS
URL: https://github.com/apache/trafodion/pull/1848
 
 
   Three fixes to UPDATE STATISTICS:
   
   1. Add support for Hive TIMESTAMP data type. The only change needed is to 
tolerate a precision of 9 on a TIMESTAMP.
   
   2. When creating a sample table on a small Hive table, the SAMPLING_RATIO 
recorded in SB_PERSISTENT_SAMPLES was incorrect. This happened because the 
estimated size of the table was off by two orders of magnitude, and we 
calculated the sampling ratio from the number of rows returned in the sample 
and this estimated size. There was logic to adjust the estimated table size 
from the statistics of the sample SELECT; however, that logic was stubbed out 
because it didn't work. That logic has been removed. Its design premise was 
flawed anyway, because it did not take into account the possibility that a key 
predicate will be used in incremental UPDATE STATISTICS. Instead, we now 
re-estimate the rows in the original table using the user-specified sampling 
ratio and the number of rows in the sample.
   
   Note: As part of this change, a parameter to HSFuncExecQuery is nearly 
obsolete. I did not replace it, however, because this function is called in 
over a hundred places in the code. I will leave that for a later clean-up.
   
   3. There was one place (extractTblName and its caller) where we were using a 
const char * pointer into an NAString object built on the stack after that 
object had gone out of scope. The dangling pointer caused occasional failures. 
This has been fixed.
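
The lifetime bug in item 3 can be illustrated with a minimal sketch. It uses std::string as a stand-in for NAString, and the function names below are hypothetical; this is not the actual extractTblName code or its real signature. The broken pattern is shown only in a comment, because executing it is undefined behavior.

```cpp
#include <string>

// Hypothetical helper: builds a qualified name in a temporary string object.
static std::string buildQualifiedName(const std::string& catalog,
                                      const std::string& schema,
                                      const std::string& table)
{
    return catalog + "." + schema + "." + table;
}

// BROKEN pattern (left as a comment on purpose):
//
//   const char* extractNameBroken(const std::string& c,
//                                 const std::string& s,
//                                 const std::string& t)
//   {
//       std::string name = buildQualifiedName(c, s, t); // lives on the stack
//       return name.c_str();  // 'name' is destroyed on return, so the
//   }                         // returned pointer dangles
//
// Such code may appear to work until the freed storage is reused, which
// would be consistent with failures that show up only occasionally.

// One possible repair: return the string by value so the caller owns the
// storage and decides how long it lives.
std::string extractNameSafe(const std::string& catalog,
                            const std::string& schema,
                            const std::string& table)
{
    return buildQualifiedName(catalog, schema, table);
}
```

A caller that needs a const char * would keep the returned object alive for as long as the pointer is used, for example `std::string owned = extractNameSafe("CAT", "SCH", "T1");` followed by `owned.c_str()`.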

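The arithmetic behind item 2 can be sketched as follows. The function names are hypothetical and the numbers are made up for illustration; this is not the code in the pull request. The first function mirrors the old approach described above (ratio derived from a table size estimate), and the second mirrors the new one (row count derived from the user-specified ratio).

```cpp
#include <cmath>
#include <cstdint>

// Old approach, as described in item 2: derive the ratio from the sample row
// count and an estimated table size. If the estimate is 100x too large, the
// resulting ratio is 100x too small.
double ratioFromEstimate(std::int64_t sampleRows, std::int64_t estimatedRows)
{
    if (sampleRows < 0 || estimatedRows <= 0)
        return 0.0;  // assumption: treat unusable input as "no ratio"
    return static_cast<double>(sampleRows) / static_cast<double>(estimatedRows);
}

// New approach, as described in item 2: trust the user-specified sampling
// ratio and re-estimate the source row count from the sample row count.
// Returns -1 for input this sketch cannot handle (an assumption, not
// Trafodion behavior).
std::int64_t estimateSourceRows(std::int64_t sampleRows, double requestedRatio)
{
    if (sampleRows < 0 || requestedRatio <= 0.0 || requestedRatio > 1.0)
        return -1;
    return static_cast<std::int64_t>(
        std::llround(static_cast<double>(sampleRows) / requestedRatio));
}
```

With 250 sampled rows and a requested ratio of 0.10, `estimateSourceRows(250, 0.10)` gives 2500 rows. Under the old approach, an estimate of 250000 rows for that same table would have produced a ratio of about 0.001 instead of 0.10.
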
