DaveBirdsall opened a new pull request #1848: [TRAFODION-3316] Three fixes to UPDATE STATISTICS
URL: https://github.com/apache/trafodion/pull/1848

Three fixes to UPDATE STATISTICS:

1. Add support for the Hive TIMESTAMP data type. The only change needed is to tolerate a precision of 9 on a TIMESTAMP.

2. When creating a sample table on a small Hive table, the SAMPLING_RATIO recorded in SB_PERSISTENT_SAMPLES was incorrect. This happened because the estimated size for the table was off by two orders of magnitude, and we calculated the sampling ratio from the number of rows returned in the sample and this estimated size. There was logic to adjust the estimated size of the table from the statistics of the sample SELECT as well; however, that logic was stubbed out because it didn't work. That logic has been removed. The design premise was flawed anyway because it does not take into account the possibility that a key predicate will be used in incremental UPDATE STATISTICS. Instead, we now re-estimate the rows in the original table using the user-specified sampling ratio and the number of rows in the sample.

   Note: As part of this change, a parameter to HSFuncExecQuery is nearly obsolete. I did not replace it, however, because this function is called in over a hundred places in the code. I will leave that for a later clean-up.

3. There was one place (extractTblName and its caller) where we were using a const char * pointer into an NAString object that was built on the stack and then went out of scope. This caused occasional failures. This has been fixed.
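The re-estimation described in fix 2 can be sketched roughly as follows. This is a minimal illustration, not the code in the pull request: the function name `reestimateTableRows` and its parameters are hypothetical, and the formula assumes the sample holds approximately the requested fraction of the table's rows.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not the Trafodion implementation: re-estimate the
// row count of the source table from the sample that was just taken.
//   sampleRows    - rows actually written to the sample table
//   samplingRatio - user-specified sampling fraction, in (0, 1]
std::int64_t reestimateTableRows(std::int64_t sampleRows, double samplingRatio)
{
    assert(sampleRows >= 0);
    assert(samplingRatio > 0.0 && samplingRatio <= 1.0);

    // If a fraction r of N rows was sampled, sampleRows is about r * N,
    // so N is about sampleRows / r. Round to the nearest whole row.
    return std::llround(static_cast<double>(sampleRows) / samplingRatio);
}
```

For example, a 10 percent sample that returns 50 rows gives a re-estimate of 500 rows. Under the old approach, dividing those 50 rows by a table size that was overestimated a hundredfold (50,000) would have recorded a ratio of 0.001 instead of 0.1; these figures are illustrative only.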
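The kind of lifetime bug described in fix 3 can be illustrated with a small sketch. It uses `std::string` as a stand-in for NAString, and the function names are hypothetical; the real extractTblName signature and the actual fix in the pull request may differ.

```cpp
#include <string>

// std::string stands in for NAString here; the function names are
// hypothetical and do not mirror the real extractTblName signature.

// Unsafe pattern (left commented out so it cannot be called): the returned
// pointer refers to the buffer of a local string that is destroyed when the
// function returns, so the caller reads freed or reused stack memory.
//
// const char* extractNameUnsafe(const std::string& qualified) {
//     std::string name = qualified.substr(qualified.rfind('.') + 1);
//     return name.c_str();   // dangling as soon as the function returns
// }

// Safe pattern: return the string by value so the caller owns the storage.
// Simplification: splits on the last '.', ignoring delimited identifiers.
std::string extractNameSafe(const std::string& qualified)
{
    const std::string::size_type dot = qualified.rfind('.');
    // No '.' means the input is already an unqualified name.
    return dot == std::string::npos ? qualified : qualified.substr(dot + 1);
}
```

A caller that needs a `const char *` would keep the returned string in a local variable and call `c_str()` on it only while that variable is still in scope. Because the unsafe version may appear to work until the stack memory is reused, it would be consistent with the occasional, rather than constant, failures mentioned above.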
