[ https://issues.apache.org/jira/browse/HIVE-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Shelukhin updated HIVE-5304: ----------------------------------- Description: Hive uses JDOQL filters to optimize partition retrieval; recently direct SQL was added to optimize it further. Both of these methods may end up pushing StringCol op 'SomeString' to underlying SQL datastore. Many paths also pushes order by-s, although these are not as problematic. The problem is that different datastores handle string compares differently. While testing on Postgres, I see that results in different things, from innocent like order changes in "show partitions", to more serious like {code} alter table ptestfilter drop partition (c>='US', d<='2') {code} in drop_partitions_filter.q - in Derby, with which the .q.out file was generated, it drops "c=Uganda/d=2"; this also passes on MySQL (I ran tests with autocreated db); on Postgres with a db from the script it doesn't. Looks like we need to enforce collation in partition names and part_key_values-es; both in the create scripts, as well as during autocreate (via package.jdo?) EDIT: also affected are - show indexes. So all names need to be taken care of Then; describe_comment_nonascii.q fails against MySQL on autocreated db due to problems with commends. was: Hive uses JDOQL filters to optimize partition retrieval; recently direct SQL was added to optimize it further. Both of these methods may end up pushing StringCol op 'SomeString' to underlying SQL datastore. Many paths also pushes order by-s, although these are not as problematic. The problem is that different datastores handle string compares differently. While testing on Postgres, I see that results in different things, from innocent like order changes in "show partitions", to more serious like {code} alter table ptestfilter drop partition (c>='US', d<='2') {code} in drop_partitions_filter.q - in Derby, with which the .q.out file was generated, it drops "c=Uganda/d=2"; this also passes on MySQL (I ran tests with autocreated db); on Postgres with a db from the script it doesn't. Looks like we need to enforce collation in partition names and part_key_values-es; both in the create scripts, as well as during autocreate (via package.jdo?) > there are various string encoding problems in metastore depending on > underlying datastore > ----------------------------------------------------------------------------------------- > > Key: HIVE-5304 > URL: https://issues.apache.org/jira/browse/HIVE-5304 > Project: Hive > Issue Type: Bug > Components: Metastore > Reporter: Sergey Shelukhin > > Hive uses JDOQL filters to optimize partition retrieval; recently direct SQL > was added to optimize it further. Both of these methods may end up pushing > StringCol op 'SomeString' to underlying SQL datastore. Many paths also pushes > order by-s, although these are not as problematic. > The problem is that different datastores handle string compares differently. > While testing on Postgres, I see that results in different things, from > innocent like order changes in "show partitions", to more serious like > {code} > alter table ptestfilter drop partition (c>='US', d<='2') > {code} > in drop_partitions_filter.q - in Derby, with which the .q.out file was > generated, it drops "c=Uganda/d=2"; this also passes on MySQL (I ran tests > with autocreated db); on Postgres with a db from the script it doesn't. > Looks like we need to enforce collation in partition names and > part_key_values-es; both in the create scripts, as well as during autocreate > (via package.jdo?) > EDIT: > also affected are - show indexes. So all names need to be taken care of > Then; describe_comment_nonascii.q fails against MySQL on autocreated db due > to problems with commends. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira