[ https://issues.apache.org/jira/browse/HIVE-16919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt McCline resolved HIVE-16919.
---------------------------------
    Resolution: Fixed

> Vectorization: vectorization_short_regress.q has query result differences
> with non-vectorized run. Vectorized unary function broken?
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16919
>                 URL: https://issues.apache.org/jira/browse/HIVE-16919
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>
> Jason spotted a difference in the query result for
> vectorization_short_regress.q.out -- that is, when vectorization is turned off
> and a base .q.out file is created, there are 2 differences.
> They both seem to be related to negation. For example, in the first one
> MAX(cint) and MIN(cint) appear earlier as columns and match between non-vec
> and vec, so it doesn't appear that aggregation is failing. It seems that, now
> that the Reducer is vectorizing, a bug is exposed: even though
> MAX and MIN are the same, the expression with negation returns different
> results.
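
A minimal sketch of how the comparison described above might be reproduced from a Hive session, assuming the standard alltypesorc test table is loaded. The SET properties are existing Hive settings. The trimmed-down SELECT, which keeps only the negated column and the WHERE clause of the second query in this report, is an illustration and not part of the original test file.

```sql
-- Baseline run: vectorization off (gives the non-vectorized result).
SET hive.vectorized.execution.enabled=false;

SELECT cdouble,
       (-(cdouble)) AS c4
FROM alltypesorc
WHERE (((-1.389 >= cint)
        AND ((csmallint < ctinyint)
             AND (-6432 > csmallint)))
       OR ((cdouble >= cfloat)
           AND (cstring2 <= 'a'))
       OR ((cstring1 LIKE 'ss%')
           AND (10.175 > cbigint)));

-- Second run: vectorization on, including the reduce side.
SET hive.vectorized.execution.enabled=true;
SET hive.vectorized.execution.reduce.enabled=true;

-- Same statement as above; compare c4 between the two runs.
SELECT cdouble,
       (-(cdouble)) AS c4
FROM alltypesorc
WHERE (((-1.389 >= cint)
        AND ((csmallint < ctinyint)
             AND (-6432 > csmallint)))
       OR ((cdouble >= cfloat)
           AND (cstring2 <= 'a'))
       OR ((cstring1 LIKE 'ss%')
           AND (10.175 > cbigint)));
```

The sketch has no ORDER BY, so row order may differ between runs; sorting both outputs before diffing avoids false mismatches.
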
> 19th field of the query below: Vectorized 511 vs. Non-Vectorized -58
> {noformat}
> SELECT MAX(cint),
>        (MAX(cint) / -3728),
>        (MAX(cint) * -3728),
>        VAR_POP(cbigint),
>        (-((MAX(cint) * -3728))),
>        STDDEV_POP(csmallint),
>        (-563 % (MAX(cint) * -3728)),
>        (VAR_POP(cbigint) / STDDEV_POP(csmallint)),
>        (-(STDDEV_POP(csmallint))),
>        MAX(cdouble),
>        AVG(ctinyint),
>        (STDDEV_POP(csmallint) - 10.175),
>        MIN(cint),
>        ((MAX(cint) * -3728) % (STDDEV_POP(csmallint) - 10.175)),
>        (-(MAX(cdouble))),
>        MIN(cdouble),
>        (MAX(cdouble) % -26.28),
>        STDDEV_SAMP(csmallint),
>        (-((MAX(cint) / -3728))),
>        ((-((MAX(cint) * -3728))) % (-563 % (MAX(cint) * -3728))),
>        ((MAX(cint) / -3728) - AVG(ctinyint)),
>        (-((MAX(cint) * -3728))),
>        VAR_SAMP(cint)
> FROM alltypesorc
> WHERE (((cbigint <= 197)
>         AND (cint < cbigint))
>        OR ((cdouble >= -26.28)
>            AND (csmallint > cdouble))
>        OR ((ctinyint > cfloat)
>            AND (cstring1 RLIKE '.*ss.*'))
>        OR ((cfloat > 79.553)
>            AND (cstring2 LIKE '10%')))
> {noformat}
> Column expression is: ((-((MAX(cint) * -3728))) % (-563 % (MAX(cint) *
> -3728))),
> -----------------------------------------------
> This is a previously existing issue and now filed as HIVE-16919:
> "Vectorization: vectorization_short_regress.q has query result differences
> with non-vectorized run"
> -----------------------------------------------
> 10th field of the query below: Non-Vectorized -6432.000015344526 vs.
> Vectorized -6432.0
> Column expression is (-(cdouble)) as c4,
> {noformat}
> SELECT ctimestamp1,
>        cstring2,
>        cdouble,
>        cfloat,
>        cbigint,
>        csmallint,
>        (cbigint / 3569) as c1,
>        (-257 - csmallint) as c2,
>        (-6432 * cfloat) as c3,
>        (-(cdouble)) as c4,
>        (cdouble * 10.175) as c5,
>        ((-6432 * cfloat) / cfloat) as c6,
>        (-(cfloat)) as c7,
>        (cint % csmallint) as c8,
>        (-(cdouble)) as c9,
>        (cdouble * (-(cdouble))) as c10
> FROM alltypesorc
> WHERE (((-1.389 >= cint)
>         AND ((csmallint < ctinyint)
>              AND (-6432 > csmallint)))
>        OR ((cdouble >= cfloat)
>            AND (cstring2 <= 'a'))
>        OR ((cstring1 LIKE 'ss%')
>            AND (10.175 > cbigint)))
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)