[jira] [Created] (HIVE-19265) Potential NPE and hiding actual exception in Hive#copyFiles
Igor Kryvenko created HIVE-19265:

Summary: Potential NPE and hiding actual exception in Hive#copyFiles
Key: HIVE-19265
URL: https://issues.apache.org/jira/browse/HIVE-19265
Project: Hive
Issue Type: Bug
Reporter: Igor Kryvenko
Assignee: Igor Kryvenko

In {{Hive#copyFiles}} we have the following code:
{code:java}
if (src.isDirectory()) {
  try {
    files = srcFs.listStatus(src.getPath(), FileUtils.HIDDEN_FILES_PATH_FILTER);
  } catch (IOException e) {
    pool.shutdownNow();
    throw new HiveException(e);
  }
}
{code}
If {{pool}} is null, {{pool.shutdownNow()}} throws an NPE and the actual cause (the {{IOException}}) is lost.

The pool is initialized as follows:
{code:java}
final ExecutorService pool = conf.getInt(ConfVars.HIVE_MOVE_FILES_THREAD_COUNT.varname, 25) > 0 ?
    Executors.newFixedThreadPool(conf.getInt(ConfVars.HIVE_MOVE_FILES_THREAD_COUNT.varname, 25),
        new ThreadFactoryBuilder().setDaemon(true).setNameFormat("Move-Thread-%d").build()) : null;
{code}
So when the pool is not created (the configured thread count is not greater than 0), the catch block can throw an NPE and swallow the actual exception.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
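One possible shape of a fix is to guard the shutdown call so that the original exception always reaches the caller. The sketch below is only an illustration under stated assumptions: {{SketchHiveException}}, {{Lister}} and {{CopyFilesSketch}} are hypothetical stand-ins invented here so the example compiles on its own, and they are not Hive classes or the patch attached to the ticket.

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for HiveException so the sketch is self-contained.
class SketchHiveException extends Exception {
    SketchHiveException(Throwable cause) {
        super(cause);
    }
}

// Hypothetical stand-in for the srcFs.listStatus(...) call, which may fail.
interface Lister {
    String[] list() throws IOException;
}

class CopyFilesSketch {
    // Mirrors the ticket's initialization: no pool when the thread count is not positive.
    static ExecutorService newPoolOrNull(int threadCount) {
        return threadCount > 0 ? Executors.newFixedThreadPool(threadCount) : null;
    }

    // Lists files; on failure shuts the pool down only if it exists,
    // then rethrows with the original IOException as the cause.
    static String[] listOrFail(Lister lister, ExecutorService pool) throws SketchHiveException {
        try {
            return lister.list();
        } catch (IOException e) {
            if (pool != null) {
                pool.shutdownNow();
            }
            throw new SketchHiveException(e);
        }
    }
}
```

With the null check in place, a failing listing with a null pool surfaces as the wrapped {{IOException}} rather than as a {{NullPointerException}}.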
[jira] [Created] (HIVE-19264) Vectorization: Reenable vectorization in vector_adaptor_usage_mode.q
Matt McCline created HIVE-19264:
---

Summary: Vectorization: Reenable vectorization in vector_adaptor_usage_mode.q
Key: HIVE-19264
URL: https://issues.apache.org/jira/browse/HIVE-19264
Project: Hive
Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Fix For: 3.0.0, 3.1.0

[~vihangk1] observed vectorization had accidentally been turned off.
[jira] [Created] (HIVE-19263) Improve ugly exception handling in HiveMetaStore
Igor Kryvenko created HIVE-19263:

Summary: Improve ugly exception handling in HiveMetaStore
Key: HIVE-19263
URL: https://issues.apache.org/jira/browse/HIVE-19263
Project: Hive
Issue Type: Improvement
Components: Standalone Metastore
Reporter: Igor Kryvenko
Assignee: Igor Kryvenko

The {{HiveMetaStore}} class contains a lot of ugly exception handling code that relies on {{instanceof}} chains:
{code:java}
catch (Exception e) {
  ex = e;
  if (e instanceof MetaException) {
    throw (MetaException) e;
  } else if (e instanceof InvalidObjectException) {
    throw (InvalidObjectException) e;
  } else if (e instanceof AlreadyExistsException) {
    throw (AlreadyExistsException) e;
  } else {
    throw newMetaException(e);
  }
}
{code}
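One way such a chain could be flattened is with a Java 7 multi-catch that rethrows the known types as they are and wraps everything else. This is a minimal sketch, not the change proposed in the ticket: the {{...Sketch}} exception classes, {{MetaCall}}, and {{wrap}} are hypothetical stand-ins for the metastore types and for {{newMetaException}}, and the {{ex = e}} bookkeeping from the snippet is left out.

```java
// Hypothetical stand-ins for the metastore exception types.
class MetaExceptionSketch extends Exception {
    MetaExceptionSketch(String message) {
        super(message);
    }
}

class InvalidObjectExceptionSketch extends Exception {
    InvalidObjectExceptionSketch(String message) {
        super(message);
    }
}

class AlreadyExistsExceptionSketch extends Exception {
    AlreadyExistsExceptionSketch(String message) {
        super(message);
    }
}

// Hypothetical callback representing the body of a metastore operation.
interface MetaCall {
    void run() throws Exception;
}

class HandlerSketch {
    // Hypothetical equivalent of newMetaException: wraps an arbitrary failure.
    static MetaExceptionSketch wrap(Exception e) {
        MetaExceptionSketch wrapped = new MetaExceptionSketch(e.toString());
        wrapped.initCause(e);
        return wrapped;
    }

    static void call(MetaCall body)
            throws MetaExceptionSketch, InvalidObjectExceptionSketch, AlreadyExistsExceptionSketch {
        try {
            body.run();
        } catch (MetaExceptionSketch | InvalidObjectExceptionSketch | AlreadyExistsExceptionSketch e) {
            // Known types pass through unchanged; no instanceof checks or casts needed.
            throw e;
        } catch (Exception e) {
            // Anything else is wrapped, as in the final else branch of the original chain.
            throw wrap(e);
        }
    }
}
```

The multi-catch parameter is implicitly final, so the compiler knows the rethrown exception is one of the three declared types.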
[jira] [Created] (HIVE-19262) empty array will be saved as NULL by insert into select
liupengcheng created HIVE-19262:
---

Summary: empty array will be saved as NULL by insert into select
Key: HIVE-19262
URL: https://issues.apache.org/jira/browse/HIVE-19262
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 0.13.1
Reporter: liupengcheng

The data is generated by MR in Parquet format and contains empty lists. When executing the following SQL, the empty list column in the result differs from the original data.

`insert into table a as select * from b `
{code:java}
> select col1 from a where size(col1) = 0 limit 1;
[]      // will show []
> insert into table b select col1 from a;
> select col1 from b;
NULL    // will show NULL
{code}
I was wondering whether we should return the same result as before and leave the saved data unchanged.
[jira] [Created] (HIVE-19261) Avro SerDe's InstanceCache should not be synchronized on retrieve
Fangshi Li created HIVE-19261:
-

Summary: Avro SerDe's InstanceCache should not be synchronized on retrieve
Key: HIVE-19261
URL: https://issues.apache.org/jira/browse/HIVE-19261
Project: Hive
Issue Type: Improvement
Reporter: Fangshi Li
Assignee: Fangshi Li

In HIVE-16175, upstream made a patch to fix the thread safety issue in AvroSerDe's InstanceCache. This fix made the retrieve method in InstanceCache synchronized. While this should make InstanceCache thread-safe, synchronizing retrieve can be expensive in a highly concurrent environment like Spark, because every thread has to acquire the same lock on entering the method.

We propose another way to fix this thread safety issue: making the underlying map of InstanceCache a ConcurrentHashMap. Ideally, we would use the atomic computeIfAbsent in the retrieve method to avoid synchronizing the entire method. Since computeIfAbsent is only available in Java 8 and Java 7 is still supported in Hive, we use a pattern that simulates the behavior of computeIfAbsent. In the future, we should move to computeIfAbsent once Hive requires Java 8.
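The ticket does not spell out the pattern, so the following is only one common way to approximate computeIfAbsent on Java 7: a lock-free get followed by putIfAbsent. {{InstanceCacheSketch}} and its method shapes are hypothetical and simplified; they are not the actual Hive class or the submitted patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified cache; the real InstanceCache may differ in its API.
abstract class InstanceCacheSketch<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<K, V>();

    // Builds the value for a key that is not cached yet.
    protected abstract V makeInstance(K key);

    V retrieve(K key) {
        // Fast path: no lock is taken when the key is already cached.
        V existing = cache.get(key);
        if (existing != null) {
            return existing;
        }
        V created = makeInstance(key);
        // putIfAbsent is atomic: if another thread stored a value first,
        // that value is returned and ours is discarded.
        V raced = cache.putIfAbsent(key, created);
        return raced != null ? raced : created;
    }
}
```

Unlike computeIfAbsent, this pattern may call {{makeInstance}} more than once for the same key under contention, although every caller still ends up with the same cached instance. That trade-off is acceptable only when creating an instance is free of side effects.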