[jira] [Commented] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620627#comment-14620627 ] Jesus Camacho Rodriguez commented on HIVE-10882: The filters data structure was filled in after HIVE-10533, but filtersMap was not; this patch solves that issue. [~ashutoshc], could you take a look? Thanks CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results --- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10882.01.patch CBO return path creates join operator with empty filtersMap. This causes outer joins to produce wrong results. To reproduce, run louter_join_ppr.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11190) No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden
[ https://issues.apache.org/jira/browse/HIVE-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620648#comment-14620648 ] Hive QA commented on HIVE-11190: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744398/HIVE-11190.002.patch {color:green}SUCCESS:{color} +1 9138 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4552/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4552/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4552/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}
This message is automatically generated. ATTACHMENT ID: 12744398 - PreCommit-HIVE-TRUNK-Build No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden -- Key: HIVE-11190 URL: https://issues.apache.org/jira/browse/HIVE-11190 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11190.001.patch, HIVE-11190.002.patch ConfVars.METASTORE_FILTER_HOOK in authorization V2 will be overridden without any prompt or warning, which causes users' attempts to customize METASTORE_FILTER_HOOK to fail silently. We should log a message stating that the configured value is ignored when the override happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620692#comment-14620692 ] Elliot West commented on HIVE-10165: Thanks [~ekoifman]. With regard to your observation, I agree that the use of locks is incorrect. I followed the pattern in the existing Streaming API, but of course that is concerned with inserts only. Using [this reference|http://www.slideshare.net/Hadoop_Summit/adding-acid-transactions-inserts-updates-a] I note that I should be using a semi-shared lock. I'd be grateful for any additional advice you can give on when each lock type/target should be employed. A potential concern of mine is that the system may not know the set of partitions when the transaction is initiated. In this case would it suffice to use a lock with a broader scope (i.e. a table lock?), or should I acquire additional locks each time I encounter a new partition? As a side note, it appears as though the current locking documentation does not cover update/delete scenarios or semi-shared locks. I'll volunteer to update these pages once I have a clearer understanding of how these lock types apply to these operations and partitions: * https://cwiki.apache.org/confluence/display/Hive/Locking * https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions Finally, as this issue is now resolved, should I submit patches using additional JIRA issues or reopen this one? Improve hive-hcatalog-streaming extensibility and support updates and deletes. -- Key: HIVE-10165 URL: https://issues.apache.org/jira/browse/HIVE-10165 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 1.2.0 Reporter: Elliot West Assignee: Elliot West Labels: TODOC2.0, streaming_api Fix For: 2.0.0 Attachments: HIVE-10165.0.patch, HIVE-10165.10.patch, HIVE-10165.4.patch, HIVE-10165.5.patch, HIVE-10165.6.patch, HIVE-10165.7.patch, HIVE-10165.9.patch, mutate-system-overview.png h3. 
Overview I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts. h3. Motivation We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by: reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small when compared with the records contained in a partition. This approach results in a number of operational issues: * Excessive amount of write activity required for small data changes. * Downstream applications cannot robustly read these datasets while they are being updated. * Due to the scale of the updates (hundreds of partitions) the scope for contention is high. I believe we can address this problem by instead writing only the changed records to a Hive transactional table. This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API which will then have the required data to perform an update or insert in a transactional manner. h3. Benefits * Enables the creation of large-scale dataset merge processes * Opens up Hive transactional functionality in an accessible manner to processes that operate outside of Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column
[ https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kohli updated HIVE-11217: Description: If you try to use a create-table-as-select (CTAS) statement and create an ORC File format based table, then you can't use NULL as a column value in the select clause CREATE TABLE empty (x int); CREATE TABLE orc_table_with_null STORED AS ORC AS SELECT x, null FROM empty; Error: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.IllegalArgumentException: Unknown primitive type VOID was: If you try to use a create-table-as-select (CTAS) statement and create an ORC File format based table, then you can't use NULL as a column value in the select clause CREATE TABLE empty (x int); CREATE TABLE orc_table_with_null STORED AS ORC AS SELECT x, null FROM empty; CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column -- Key: HIVE-11217 URL: https://issues.apache.org/jira/browse/HIVE-11217 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Reporter: Gaurav Kohli Priority: Minor If you try to use a create-table-as-select (CTAS) statement and create an ORC File format based table, then you can't use NULL as a column value in the select clause CREATE TABLE empty (x int); CREATE TABLE orc_table_with_null STORED AS ORC AS SELECT x, null FROM empty; Error: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.IllegalArgumentException: Unknown primitive type VOID -- This message was sent by Atlassian JIRA (v6.3.4#6332)
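[Editor's note] A commonly used workaround (not proposed in this ticket, just a standard Hive idiom) is to give the NULL column a concrete type with CAST, so ORC never has to represent a VOID column; table names follow the reproduction above, and the alias {{y}} is invented for illustration:

```sql
CREATE TABLE empty (x int);

-- Fails as described above: the untyped null column is typed VOID,
-- which the ORC SerDe cannot map to an object inspector.
-- CREATE TABLE orc_table_with_null STORED AS ORC AS SELECT x, null FROM empty;

-- Works: CAST gives the column a concrete type (string here) before the
-- ORC table schema is derived.
CREATE TABLE orc_table_with_null STORED AS ORC
AS SELECT x, CAST(null AS string) AS y FROM empty;
```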
[jira] [Updated] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10882: --- Attachment: (was: HIVE-10882.patch) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results --- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez CBO return path creates join operator with empty filtersMap. This causes outer joins to produce wrong results. To reproduce, run louter_join_ppr.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10882: --- Attachment: HIVE-10882.01.patch CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results --- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10882.01.patch CBO return path creates join operator with empty filtersMap. This causes outer joins to produce wrong results. To reproduce, run louter_join_ppr.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column
[ https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kohli updated HIVE-11217: Description: If you try to use create-table-as-select (CTAS) statement and create a ORC File format based table, then you can't use NULL as a column value in select clause CREATE TABLE empty (x int); CREATE TABLE orc_table_with_null STORED AS ORC AS SELECT x, null FROM empty; Error: {quote} 347084 [main] ERROR hive.ql.exec.DDLTask - org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: Unknown primitive type VOID at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464) at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633) at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323) at 
org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284) at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39) at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID at org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530) at org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.init(OrcStruct.java:195) at org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534) at org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106) at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:345) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:292) at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:194) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:621) ... 
35 more {quote} was: If you try to use a create-table-as-select (CTAS) statement and create an ORC File format based table, then you can't use NULL as a column value in the select clause CREATE TABLE empty (x int); CREATE TABLE orc_table_with_null STORED AS ORC AS SELECT x, null FROM empty; Error: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.IllegalArgumentException: Unknown primitive type VOID CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column
[jira] [Updated] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10882: --- Attachment: HIVE-10882.patch CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results --- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10882.patch CBO return path creates join operator with empty filtersMap. This causes outer joins to produce wrong results. To reproduce, run louter_join_ppr.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11159) Integrate hplsql.Conf with HiveConf
[ https://issues.apache.org/jira/browse/HIVE-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620850#comment-14620850 ] Alan Gates commented on HIVE-11159: --- Can you give an example of what this would look like in terms of configuration? It's not immediately clear to me why we'd want to keep it out of hive-site. Integrate hplsql.Conf with HiveConf --- Key: HIVE-11159 URL: https://issues.apache.org/jira/browse/HIVE-11159 Project: Hive Issue Type: Task Components: hpl/sql Affects Versions: 2.0.0 Reporter: Alan Gates Assignee: Dmitry Tolpeko HPL/SQL has its own Conf object. It should re-use HiveConf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11145) Remove OFFLINE and NO_DROP from tables and partitions
[ https://issues.apache.org/jira/browse/HIVE-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620891#comment-14620891 ] Alan Gates commented on HIVE-11145: --- https://reviews.apache.org/r/36355/ Remove OFFLINE and NO_DROP from tables and partitions - Key: HIVE-11145 URL: https://issues.apache.org/jira/browse/HIVE-11145 Project: Hive Issue Type: Improvement Components: Metastore, SQL Affects Versions: 2.0.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-11145.2.patch, HIVE-11145.3.patch, HIVE-11145.patch Currently a table or partition can be marked no_drop or offline. This prevents users from dropping or reading (and dropping) the table or partition. This was built in 0.7 before SQL standard authorization was an option. This is an expensive feature as when a table is dropped every partition must be fetched and checked to make sure it can be dropped. This feature is also redundant now that real authorization is available in Hive. This feature should be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
[ https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620733#comment-14620733 ] Aihua Xu commented on HIVE-10895: - [~vgumashta] Can you help commit the patch? ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources --- Key: HIVE-10895 URL: https://issues.apache.org/jira/browse/HIVE-10895 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Takahiko Saito Assignee: Aihua Xu Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, HIVE-10895.3.patch During testing, we've noticed Oracle db running out of cursors. Might be related to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
[ https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620743#comment-14620743 ] Vaibhav Gumashta commented on HIVE-10895: - Was waiting for a precommit run. Looks good, will commit it in few mins. ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources --- Key: HIVE-10895 URL: https://issues.apache.org/jira/browse/HIVE-10895 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Takahiko Saito Assignee: Aihua Xu Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, HIVE-10895.3.patch During testing, we've noticed Oracle db running out of cursors. Might be related to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11215) Vectorized grace hash-join throws FileUtil warnings
[ https://issues.apache.org/jira/browse/HIVE-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620821#comment-14620821 ] Hive QA commented on HIVE-11215: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/1279/HIVE-11215.1.patch {color:green}SUCCESS:{color} +1 9138 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4553/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4553/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4553/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 1279 - PreCommit-HIVE-TRUNK-Build Vectorized grace hash-join throws FileUtil warnings --- Key: HIVE-11215 URL: https://issues.apache.org/jira/browse/HIVE-11215 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0, 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Attachments: HIVE-11215.1.patch TPC-DS query13 warnings about a null-file deletion. {code} 2015-07-09 03:14:18,880 INFO [TezChild] exec.MapJoinOperator: Hybrid Grace Hash Join: Number of rows restored from KeyValueContainer: 31184 2015-07-09 03:14:18,881 INFO [TezChild] exec.MapJoinOperator: Hybrid Grace Hash Join: Deserializing spilled hash partition... 2015-07-09 03:14:18,881 INFO [TezChild] exec.MapJoinOperator: Hybrid Grace Hash Join: Number of rows in hashmap: 31184 2015-07-09 03:14:18,897 INFO [TezChild] exec.MapJoinOperator: spilled: true abort: false. Clearing spilled partitions. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 
2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11030) Enhance storage layer to create one delta file per write
[ https://issues.apache.org/jira/browse/HIVE-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620870#comment-14620870 ] Alan Gates commented on HIVE-11030: --- bq. OrcRecordUpdate, end of the constructor (line 265 in your patch)... bq. This does stat the file but if this check were to fail unnoticed it leads to data loss which seems really bad. I could wrap this in LOG.isInfoEnabled() for the most perf sensitive cases... I don't have a feel for how frequent a case this would be. What would cause this to happen? It just feels like this belongs in test mode but not in production. Other than that, +1 Enhance storage layer to create one delta file per write Key: HIVE-11030 URL: https://issues.apache.org/jira/browse/HIVE-11030 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11030.2.patch, HIVE-11030.3.patch, HIVE-11030.4.patch, HIVE-11030.5.patch, HIVE-11030.6.patch, HIVE-11030.7.patch Currently each txn using ACID insert/update/delete will generate a delta directory like delta_100_101. In order to support multi-statement transactions we must generate one delta per operation within the transaction so the deltas would be named like delta_100_101_0001, etc. Support for MERGE (HIVE-10924) would need the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11158) Add tests for HPL/SQL
[ https://issues.apache.org/jira/browse/HIVE-11158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620873#comment-14620873 ] Alan Gates commented on HIVE-11158: --- +1 Add tests for HPL/SQL - Key: HIVE-11158 URL: https://issues.apache.org/jira/browse/HIVE-11158 Project: Hive Issue Type: Task Components: hpl/sql Affects Versions: 2.0.0 Reporter: Alan Gates Assignee: Dmitry Tolpeko Attachments: HIVE-11158.1.patch The new HPL/SQL module does not have any tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621078#comment-14621078 ] Eugene Koifman commented on HIVE-10165: --- Hive had several implementations of lock managers prior to the ACID work that the user could choose among. https://cwiki.apache.org/confluence/display/Hive/Locking applies to the behavior of those managers. When you choose to use hive.txn.manager=DbTxnManager, you get a new lock manager implementation that is installed automatically. This lock manager is designed to support Snapshot Isolation. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ShowLocks mentions the types of locks it has. The terminology is somewhat inconsistent but basically there are 3 types of locks:
1. S/read/shared/shared_read - acquired to read or insert
2. SS/write/semi_shared/shared_write - acquired to update/delete
3. X/exclusive - acquired for drop-type DDL operations
X conflicts with everything. SS is compatible with S but conflicts with other SS (thus there is at most 1 update/delete happening concurrently). There are 3 types of resources that can be locked: database, table and partition. The current (acid) lock manager is built to support auto-commit mode where all locks for a given transaction are known at the start of the txn. As you mention in the docs, it doesn't yet have/need deadlock detection. So given the current state of things, I would vote to have the Mutable Streaming API acquire SS locks on the tables that it writes. This will of course limit concurrent writes, but I think correctness (i.e. properly working Snapshot Isolation) is more important. I intend to build deadlock detection as part of HIVE-9675, which will hopefully happen in the next couple of months. Then it will be possible to acquire partition-level locks as you go. There is one crucial concern here: where do you learn that you need to lock a partition - in the MutatorClient or the Mutator? (If I understood things correctly, the former runs on the client and the latter on the grid.) Since the current lock manager uses the metastore RDBMS to manage its state, you cannot talk to the metastore from the grid as it will likely overwhelm it. So if the Mutator is the entity that needs to acquire locks, this would have to wait for the HBase-based metastore that [~alangates] is working on (and I believe will only be available in branch-2 aka master). Improve hive-hcatalog-streaming extensibility and support updates and deletes. -- Key: HIVE-10165 URL: https://issues.apache.org/jira/browse/HIVE-10165 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 1.2.0 Reporter: Elliot West Assignee: Elliot West Labels: TODOC2.0, streaming_api Fix For: 2.0.0 Attachments: HIVE-10165.0.patch, HIVE-10165.10.patch, HIVE-10165.4.patch, HIVE-10165.5.patch, HIVE-10165.6.patch, HIVE-10165.7.patch, HIVE-10165.9.patch, mutate-system-overview.png h3. Overview I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts. h3. Motivation We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by: reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small when compared with the records contained in a partition. This approach results in a number of operational issues: * Excessive amount of write activity required for small data changes. 
* Downstream applications cannot robustly read these datasets while they are being updated. * Due to the scale of the updates (hundreds of partitions) the scope for contention is high. I believe we can address this problem by instead writing only the changed records to a Hive transactional table. This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API which will then have the required data to perform an update or insert in a transactional manner. h3. Benefits * Enables the creation of large-scale dataset
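[Editor's note] Eugene's lock taxonomy above can be made concrete with a short (hypothetical - the table name is invented) ACID session; under hive.txn.manager=DbTxnManager, each statement below would take the lock type he describes, and SHOW LOCKS makes the currently held locks visible:

```sql
-- Assumes acid_events is a transactional (ACID) table; name is illustrative.
SELECT * FROM acid_events;                        -- S (shared_read): read or insert
UPDATE acid_events SET val = 'x' WHERE id = 1;    -- SS (shared_write): update/delete;
                                                  -- compatible with S, conflicts with other SS
DROP TABLE acid_events;                           -- X (exclusive): drop-type DDL;
                                                  -- conflicts with everything

-- Inspect locks held by in-flight operations:
SHOW LOCKS acid_events;
```

Since SS conflicts with SS, at most one update/delete runs against the resource at a time, which is what makes table-level SS locks the conservative-but-correct choice proposed for the mutation API.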
[jira] [Updated] (HIVE-11211) Reset the fields in JoinStatsRule in StatsRulesProcFactory
[ https://issues.apache.org/jira/browse/HIVE-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11211: --- Attachment: HIVE-11211.03.patch .03 patch changes the name to getCardinality following [~jpullokkaran]'s suggestion. Reset the fields in JoinStatsRule in StatsRulesProcFactory -- Key: HIVE-11211 URL: https://issues.apache.org/jira/browse/HIVE-11211 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11211.02.patch, HIVE-11211.03.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10927) Add number of HMS/HS2 connection metrics
[ https://issues.apache.org/jira/browse/HIVE-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620977#comment-14620977 ] Xuefu Zhang commented on HIVE-10927: Are this patch and those from previous JIRAs also applicable to branch-1? If so, we should probably commit them to that branch as well. Add number of HMS/HS2 connection metrics Key: HIVE-10927 URL: https://issues.apache.org/jira/browse/HIVE-10927 Project: Hive Issue Type: Sub-task Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC2.0 Fix For: 2.0.0 Attachments: HIVE-10927.2.patch, HIVE-10927.2.patch, HIVE-10927.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
[ https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620897#comment-14620897 ] Aihua Xu commented on HIVE-10895: - Thanks [~vgumashta]. Most are from your work. :) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources --- Key: HIVE-10895 URL: https://issues.apache.org/jira/browse/HIVE-10895 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Takahiko Saito Assignee: Aihua Xu Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, HIVE-10895.3.patch During testing, we've noticed Oracle db running out of cursors. Might be related to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10927) Add number of HMS/HS2 connection metrics
[ https://issues.apache.org/jira/browse/HIVE-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620966#comment-14620966 ] Szehon Ho commented on HIVE-10927: -- Sorry I was confused about the next release version, I'll change it. Add number of HMS/HS2 connection metrics Key: HIVE-10927 URL: https://issues.apache.org/jira/browse/HIVE-10927 Project: Hive Issue Type: Sub-task Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.3.0 Attachments: HIVE-10927.2.patch, HIVE-10927.2.patch, HIVE-10927.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4239) Remove lock on compilation stage
[ https://issues.apache.org/jira/browse/HIVE-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620970#comment-14620970 ] Sergey Shelukhin commented on HIVE-4239: That looks unrelated Remove lock on compilation stage Key: HIVE-4239 URL: https://issues.apache.org/jira/browse/HIVE-4239 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Reporter: Carl Steinbach Assignee: Sergey Shelukhin Attachments: HIVE-4239.01.patch, HIVE-4239.02.patch, HIVE-4239.03.patch, HIVE-4239.04.patch, HIVE-4239.05.patch, HIVE-4239.06.patch, HIVE-4239.07.patch, HIVE-4239.08.patch, HIVE-4239.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11216) UDF GenericUDFMapKeys throws NPE when a null map value is passed in
[ https://issues.apache.org/jira/browse/HIVE-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620989#comment-14620989 ] Hive QA commented on HIVE-11216: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744463/HIVE-11216.patch {color:green}SUCCESS:{color} +1 9138 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4554/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4554/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4554/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12744463 - PreCommit-HIVE-TRUNK-Build UDF GenericUDFMapKeys throws NPE when a null map value is passed in --- Key: HIVE-11216 URL: https://issues.apache.org/jira/browse/HIVE-11216 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.2.0 Reporter: Yibing Shi Assignee: Yibing Shi Attachments: HIVE-11216.patch We can reproduce the problem as below: {noformat} hive> show create table map_txt; OK CREATE TABLE `map_txt`( `id` int, `content` map<int,string>) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ... Time taken: 0.233 seconds, Fetched: 18 row(s) hive> select * from map_txt; OK 1 NULL Time taken: 0.679 seconds, Fetched: 1 row(s) hive> select id, map_keys(content) from map_txt; Error during job, obtaining debugging information... 
Examining task ID: task_1435534231122_0025_m_00 (and more) from job job_1435534231122_0025 Task with the most failures(4): - Task ID: task_1435534231122_0025_m_00 URL: http://host-10-17-80-40.coe.cloudera.com:8088/taskdetails.jsp?jobid=job_1435534231122_0025tipid=task_1435534231122_0025_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {id:1,content:null} at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {id:1,content:null} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:559) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180) ... 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating map_keys(content) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549) ... 
9 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79) ... 13 more FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask MapReduce Jobs Launched: Stage-Stage-1: Map: 1 HDFS
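The stack trace points at GenericUDFMapKeys.evaluate() dereferencing the map without a null check. A hedged sketch of the kind of guard the patch presumably adds; mapKeys() below is an illustrative stand-in, not the real UDF code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class MapKeysDemo {
    // Returns the keys of the map, or an empty list when the map value is NULL,
    // instead of throwing a NullPointerException as in the reported bug.
    static <K, V> List<K> mapKeys(Map<K, V> map) {
        if (map == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(map.keySet());
    }
}
```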
[jira] [Commented] (HIVE-11200) LLAP: Cache BuddyAllocator throws NPE
[ https://issues.apache.org/jira/browse/HIVE-11200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621052#comment-14621052 ] Sergey Shelukhin commented on HIVE-11200: - [~prasanth_j] maybe you can review? LLAP: Cache BuddyAllocator throws NPE - Key: HIVE-11200 URL: https://issues.apache.org/jira/browse/HIVE-11200 Project: Hive Issue Type: Sub-task Affects Versions: llap Environment: large perf cluster - with 64Gb cache sizes Reporter: Gopal V Assignee: Sergey Shelukhin Priority: Minor Fix For: llap Attachments: HIVE-11200.patch Built off da1e0cf21aeff0a9501c5e220a6f66ba61f6da94 merge point {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithSplit(BuddyAllocator.java:331) at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:399) at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$300(BuddyAllocator.java:228) at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:156) at org.apache.hadoop.hive.ql.io.orc.InStream.readEncodedStream(InStream.java:761) at org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:462) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:342) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:59) at org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37) ... 4 more 2015-07-08 01:17:42,798 [TezTaskRunner_attempt_1435700346116_1212_4_05_80_0(attempt_1435700346116_1212_4_05_80_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: java.lang.NullPointerException {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11197) While extracting join conditions follow Hive rules for type conversion instead of Calcite
[ https://issues.apache.org/jira/browse/HIVE-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620994#comment-14620994 ] Ashutosh Chauhan commented on HIVE-11197: - [~jcamachorodriguez] Didn't follow your double negative statements : ) Also, I hadn't updated RB earlier. Just did it. Can you comment there on what you are suggesting? While extracting join conditions follow Hive rules for type conversion instead of Calcite - Key: HIVE-11197 URL: https://issues.apache.org/jira/browse/HIVE-11197 Project: Hive Issue Type: Bug Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-11197.2.patch, HIVE-11197.2.patch, HIVE-11197.3.patch, HIVE-11197.patch, HIVE-11197.patch Calcite strict type system throws exception in those cases, which are legal in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11206) CBO (Calcite Return Path): Join translation should update all ExprNode recursively
[ https://issues.apache.org/jira/browse/HIVE-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621041#comment-14621041 ] Ashutosh Chauhan commented on HIVE-11206: - Are these filters in the ON clause of the join that are not on joining columns? Can you add a comment on what's going on there? Also, I wonder if it would be possible to refactor some pieces of SemanticAnalyzer::genJoinOperatorChildren() so that we have identical logic on the old path and the new path? CBO (Calcite Return Path): Join translation should update all ExprNode recursively -- Key: HIVE-11206 URL: https://issues.apache.org/jira/browse/HIVE-11206 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11206.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10927) Add number of HMS/HS2 connection metrics
[ https://issues.apache.org/jira/browse/HIVE-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-10927: - Labels: TODOC2.0 (was: ) Fix Version/s: (was: 1.3.0) 2.0.0 Add number of HMS/HS2 connection metrics Key: HIVE-10927 URL: https://issues.apache.org/jira/browse/HIVE-10927 Project: Hive Issue Type: Sub-task Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC2.0 Fix For: 2.0.0 Attachments: HIVE-10927.2.patch, HIVE-10927.2.patch, HIVE-10927.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11198) Fix load data query file format check for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620976#comment-14620976 ] Prasanth Jayachandran commented on HIVE-11198: -- Test failures are unrelated. It's fixed in HIVE-11202. Fix load data query file format check for partitioned tables Key: HIVE-11198 URL: https://issues.apache.org/jira/browse/HIVE-11198 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11198.patch HIVE-8 added a file format check for the ORC format. The check throws an exception when a non-ORC format is loaded into an ORC managed table. But it does not work for partitioned tables. Partitioned tables are allowed to have some partitions with a different file format. See this discussion for more details https://issues.apache.org/jira/browse/HIVE-8?focusedCommentId=14617271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14617271 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11170) port parts of HIVE-11015 to master for ease of future merging
[ https://issues.apache.org/jira/browse/HIVE-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11170: Attachment: HIVE-11170.02.patch again for HiveQA... port parts of HIVE-11015 to master for ease of future merging - Key: HIVE-11170 URL: https://issues.apache.org/jira/browse/HIVE-11170 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 2.0.0 Attachments: HIVE-11170.01.patch, HIVE-11170.02.patch, HIVE-11170.patch That patch changes how IOContext is created (file structure) and adds tests; I will merge non-LLAP parts of it now, so it's easier to merge later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11191) Beeline-cli: support hive.cli.errors.ignore in new CLI
[ https://issues.apache.org/jira/browse/HIVE-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-11191: Fix Version/s: beeline-cli-branch Beeline-cli: support hive.cli.errors.ignore in new CLI -- Key: HIVE-11191 URL: https://issues.apache.org/jira/browse/HIVE-11191 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Fix For: beeline-cli-branch Attachments: HIVE-11191.1-beeline-cli.patch, HIVE-11191.2-beeline-cli.patch In the old CLI, it uses hive.cli.errors.ignore from the hive configuration to force execution a script when errors occurred. In the beeline, it has a similar option called force. We need to support the previous configuration using beeline functionality. More details about force option are available in https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation
[ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-11110: -- Summary: Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation (was: Enable HiveJoinAddNotNullRule in CBO) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation Key: HIVE-11110 URL: https://issues.apache.org/jira/browse/HIVE-11110 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Laljo John Pullokkaran Attachments: HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.patch Query {code} select count(*) from store_sales ,store_returns ,date_dim d1 ,date_dim d2 where d1.d_quarter_name = '2000Q1' and d1.d_date_sk = ss_sold_date_sk and ss_customer_sk = sr_customer_sk and ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number and sr_returned_date_sk = d2.d_date_sk and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3'); {code} The store_sales table is partitioned on ss_sold_date_sk, which is also used in a join clause. The join clause should add a filter "filterExpr: ss_sold_date_sk is not null", which should get pushed to the MetaStore when fetching the stats. Currently this is not done in CBO planning, which results in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in the optimization phase. In particular, this increases the NDV for the join columns and may result in wrong planning. Including HiveJoinAddNotNullRule in the optimization phase solves this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
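To make the mechanism concrete: for an equi-join, a NULL key can never satisfy the equality, so an IS NOT NULL predicate can be derived for each join key and pushed down to the metastore, excluding __HIVE_DEFAULT_PARTITION__ from the stats fetch. The helper below is purely illustrative (not Calcite or Hive code):

```java
import java.util.List;
import java.util.stream.Collectors;

public class NotNullRuleDemo {
    // Derives the implied not-null filter from a list of equi-join key columns,
    // e.g. ["ss_sold_date_sk"] -> "ss_sold_date_sk IS NOT NULL".
    static String notNullFilter(List<String> joinKeys) {
        return joinKeys.stream()
            .map(k -> k + " IS NOT NULL")
            .collect(Collectors.joining(" AND "));
    }
}
```

Without such a filter, the NULL-keyed default partition inflates the NDV estimates for the join columns, which is the mis-planning the issue describes.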
[jira] [Updated] (HIVE-11160) Collect column stats when set hive.stats.autogather=true
[ https://issues.apache.org/jira/browse/HIVE-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11160: --- Attachment: (was: Design doc for auto column stats gathering.docx) Collect column stats when set hive.stats.autogather=true Key: HIVE-11160 URL: https://issues.apache.org/jira/browse/HIVE-11160 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11160.01.patch Hive will collect table stats when set hive.stats.autogather=true during the INSERT OVERWRITE command. And then the users need to collect the column stats themselves using Analyze command. In this patch, the column stats will also be collected automatically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11160) Auto-gather column stats
[ https://issues.apache.org/jira/browse/HIVE-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11160: --- Summary: Auto-gather column stats (was: Collect column stats when set hive.stats.autogather=true) Auto-gather column stats Key: HIVE-11160 URL: https://issues.apache.org/jira/browse/HIVE-11160 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11160.01.patch, HIVE-11160.02.patch Hive will collect table stats when set hive.stats.autogather=true during the INSERT OVERWRITE command. And then the users need to collect the column stats themselves using Analyze command. In this patch, the column stats will also be collected automatically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11214) Insert into ACID table switches vectorization off
[ https://issues.apache.org/jira/browse/HIVE-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11214: Attachment: HIVE-11214.03.patch Insert into ACID table switches vectorization off -- Key: HIVE-11214 URL: https://issues.apache.org/jira/browse/HIVE-11214 Project: Hive Issue Type: Bug Components: Hive Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-11214.01.patch, HIVE-11214.02.patch, HIVE-11214.03.patch PROBLEM: vectorization is switched off automatically after running insert into an ACID table. STEPS TO REPRODUCE: set hive.vectorized.execution.enabled=true; create table testv (id int, name string) clustered by (id) into 2 buckets stored as orc tblproperties("transactional"="true"); insert into testv values(1,'a'); set hive.vectorized.execution.enabled; false -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11196) Utilities.getPartitionDesc() should try to reuse TableDesc object
[ https://issues.apache.org/jira/browse/HIVE-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621763#comment-14621763 ] Hive QA commented on HIVE-11196: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744584/HIVE-11196.2.patch {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9149 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fileformat_mix org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat8 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_bmj_schema_evolution org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_schema_evolution {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4562/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4562/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4562/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12744584 - PreCommit-HIVE-TRUNK-Build Utilities.getPartitionDesc() should try to reuse TableDesc object -- Key: HIVE-11196 URL: https://issues.apache.org/jira/browse/HIVE-11196 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11196.1.patch, HIVE-11196.2.patch Currently, Utilities.getPartitionDesc() creates a new PartitionDesc object which inturn creates new TableDesc object via Utilities.getTableDesc(part.getTable()) for every call. This value needs to be reused so that we can avoid the expense of creating new Descriptor object wherever possible -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11224) AggregateStatsCache triggers java.util.ConcurrentModificationException under some conditions
[ https://issues.apache.org/jira/browse/HIVE-11224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621721#comment-14621721 ] Pengcheng Xiong commented on HIVE-11224: As per [~jpullokkaran]'s request, could [~thejas] and [~vgumashta] take a look? It is related to the HIVE-10382, AggregateStatsCache work. Thanks. AggregateStatsCache triggers java.util.ConcurrentModificationException under some conditions Key: HIVE-11224 URL: https://issues.apache.org/jira/browse/HIVE-11224 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11224.01.patch Stack {code} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922) at java.util.HashMap$EntryIterator.next(HashMap.java:962) at java.util.HashMap$EntryIterator.next(HashMap.java:960) at org.apache.hadoop.hive.metastore.AggregateStatsCache.findBestMatch(AggregateStatsCache.java:244) at org.apache.hadoop.hive.metastore.AggregateStatsCache.get(AggregateStatsCache.java:186) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1131) at org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6174) at org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6170) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2405) at org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6170) at sun.reflect.GeneratedMethodAccessor103.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy6.get_aggr_stats_for(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_aggr_stats_for(HiveMetaStore.java:5707) at 
sun.reflect.GeneratedMethodAccessor102.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy8.get_aggr_stats_for(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAggrColStatsFor(HiveMetaStoreClient.java:2067) at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
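The top of the stack shows HashMap's fail-fast iterator: findBestMatch iterates the cache's entry set while another code path mutates it. The single-threaded sketch below reproduces the mechanism with a plain HashMap and shows that a weakly consistent ConcurrentHashMap iterator does not throw; that is one possible fix direction, though whether the attached patch takes this route is not stated here.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CmeDemo {
    // Returns true if mutating the map mid-iteration throws
    // ConcurrentModificationException, as HashMap's fail-fast iterator does.
    static boolean triggersCme(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        try {
            Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
            it.next();
            map.put("c", 3);   // structural modification while iterating
            it.next();         // HashMap fails fast here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Running triggersCme with `new HashMap<>()` demonstrates the bug's mechanism; with `new ConcurrentHashMap<>()` the weakly consistent iterator tolerates the concurrent put.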
[jira] [Updated] (HIVE-11224) AggregateStatsCache triggers java.util.ConcurrentModificationException under some conditions
[ https://issues.apache.org/jira/browse/HIVE-11224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11224: --- Attachment: HIVE-11224.01.patch AggregateStatsCache triggers java.util.ConcurrentModificationException under some conditions Key: HIVE-11224 URL: https://issues.apache.org/jira/browse/HIVE-11224 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11224.01.patch Stack {code} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922) at java.util.HashMap$EntryIterator.next(HashMap.java:962) at java.util.HashMap$EntryIterator.next(HashMap.java:960) at org.apache.hadoop.hive.metastore.AggregateStatsCache.findBestMatch(AggregateStatsCache.java:244) at org.apache.hadoop.hive.metastore.AggregateStatsCache.get(AggregateStatsCache.java:186) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1131) at org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6174) at org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6170) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2405) at org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6170) at sun.reflect.GeneratedMethodAccessor103.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy6.get_aggr_stats_for(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_aggr_stats_for(HiveMetaStore.java:5707) at sun.reflect.GeneratedMethodAccessor102.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy8.get_aggr_stats_for(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAggrColStatsFor(HiveMetaStoreClient.java:2067) at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11190) No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden
[ https://issues.apache.org/jira/browse/HIVE-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621744#comment-14621744 ] Dapeng Sun commented on HIVE-11190: --- Updated patch according to [~thejas]'s comments. No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden -- Key: HIVE-11190 URL: https://issues.apache.org/jira/browse/HIVE-11190 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11190.001.patch, HIVE-11190.002.patch, HIVE-11190.003.patch ConfVars.METASTORE_FILTER_HOOK in authorization V2 will be overridden without any prompting info or warning, which causes users to fail to customize the METASTORE_FILTER_HOOK. We should log information such as "this value is ignored" when the override happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11190) No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden
[ https://issues.apache.org/jira/browse/HIVE-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dapeng Sun updated HIVE-11190: -- Attachment: HIVE-11190.003.patch No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden -- Key: HIVE-11190 URL: https://issues.apache.org/jira/browse/HIVE-11190 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11190.001.patch, HIVE-11190.002.patch, HIVE-11190.003.patch ConfVars.METASTORE_FILTER_HOOK in authorization V2 will be overridden without any prompting info or warning, which causes users to fail to customize the METASTORE_FILTER_HOOK. We should log information such as "this value is ignored" when the override happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11215) Vectorized grace hash-join throws FileUtil warnings
[ https://issues.apache.org/jira/browse/HIVE-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621746#comment-14621746 ] Matt McCline commented on HIVE-11215: - lgtm +1 Vectorized grace hash-join throws FileUtil warnings --- Key: HIVE-11215 URL: https://issues.apache.org/jira/browse/HIVE-11215 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0, 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Attachments: HIVE-11215.1.patch TPC-DS query13 warns about a null-file deletion. {code} 2015-07-09 03:14:18,880 INFO [TezChild] exec.MapJoinOperator: Hybrid Grace Hash Join: Number of rows restored from KeyValueContainer: 31184 2015-07-09 03:14:18,881 INFO [TezChild] exec.MapJoinOperator: Hybrid Grace Hash Join: Deserializing spilled hash partition... 2015-07-09 03:14:18,881 INFO [TezChild] exec.MapJoinOperator: Hybrid Grace Hash Join: Number of rows in hashmap: 31184 2015-07-09 03:14:18,897 INFO [TezChild] exec.MapJoinOperator: spilled: true abort: false. Clearing spilled partitions. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. 2015-07-09 03:14:18,898 WARN [TezChild] fs.FileUtil: null file argument. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
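A hedged guess at the fix shape: the spill-cleanup loop hands never-created (null) spill-file entries to the delete helper, which logs the "null file argument" warning. Skipping null entries before calling delete avoids the noise. deleteIfPresent() below is illustrative, not the actual Hadoop FileUtil or MapJoinOperator API:

```java
import java.io.File;

public class SpillCleanupDemo {
    // Deletes only the spill files that were actually created; null slots
    // correspond to partitions that never spilled, so there is nothing to
    // delete and no warning to emit.
    static int deleteIfPresent(File[] spillFiles) {
        int deleted = 0;
        for (File f : spillFiles) {
            if (f == null) {
                continue;  // partition never spilled; skip instead of warning
            }
            if (f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```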
[jira] [Updated] (HIVE-11160) Collect column stats when set hive.stats.autogather=true
[ https://issues.apache.org/jira/browse/HIVE-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11160: --- Attachment: HIVE-11160.02.patch Collect column stats when set hive.stats.autogather=true Key: HIVE-11160 URL: https://issues.apache.org/jira/browse/HIVE-11160 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11160.01.patch, HIVE-11160.02.patch Hive will collect table stats when set hive.stats.autogather=true during the INSERT OVERWRITE command. And then the users need to collect the column stats themselves using Analyze command. In this patch, the column stats will also be collected automatically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11110) Enable HiveJoinAddNotNullRule in CBO
[ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran reassigned HIVE-11110: - Assignee: Laljo John Pullokkaran (was: Jesus Camacho Rodriguez) Enable HiveJoinAddNotNullRule in CBO Key: HIVE-11110 URL: https://issues.apache.org/jira/browse/HIVE-11110 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Laljo John Pullokkaran Attachments: HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.patch Query {code} select count(*) from store_sales ,store_returns ,date_dim d1 ,date_dim d2 where d1.d_quarter_name = '2000Q1' and d1.d_date_sk = ss_sold_date_sk and ss_customer_sk = sr_customer_sk and ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number and sr_returned_date_sk = d2.d_date_sk and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3'); {code} The store_sales table is partitioned on ss_sold_date_sk, which is also used in a join clause. The join clause should add a filter "filterExpr: ss_sold_date_sk is not null", which should get pushed to the MetaStore when fetching the stats. Currently this is not done in CBO planning, which results in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in the optimization phase. In particular, this increases the NDV for the join columns and may result in wrong planning. Including HiveJoinAddNotNullRule in the optimization phase solves this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE
[ https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621614#comment-14621614 ] Hive QA commented on HIVE-11221: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744571/HIVE-11221.1.patch {color:green}SUCCESS:{color} +1 9148 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4560/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4560/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4560/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12744571 - PreCommit-HIVE-TRUNK-Build In Tez mode, alter table concatenate orc files can intermittently fail with NPE --- Key: HIVE-11221 URL: https://issues.apache.org/jira/browse/HIVE-11221 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11221.1.patch We are not waiting for input-ready events, which can trigger an occasional NPE if the input is not actually ready. 
Stacktrace: {code} java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:478) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:471) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:648) at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:146) at org.apache.tez.mapreduce.lib.MRReaderMapred.init(MRReaderMapred.java:73) at org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:483) at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:108) at 
org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.getMRInput(MergeFileRecordProcessor.java:220) at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.init(MergeFileRecordProcessor.java:72) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162) ... 13 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10515) Create tests to cover existing (supported) Hive CLI functionality
[ https://issues.apache.org/jira/browse/HIVE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621627#comment-14621627 ] Xuefu Zhang commented on HIVE-10515: [~Ferd], Sure, if you think the coverage has reached an acceptable level. Thanks. Create tests to cover existing (supported) Hive CLI functionality - Key: HIVE-10515 URL: https://issues.apache.org/jira/browse/HIVE-10515 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Ferdinand Xu After removing HiveServer1, Hive CLI's functionality is reduced to its original use case, a thick client application. Let's identify this so that we maintain it when implementation is changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11222) LLAP: occasional NPE in parallel queries in ORC reader
[ https://issues.apache.org/jira/browse/HIVE-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621560#comment-14621560 ] Sergey Shelukhin commented on HIVE-11222: - Patch appears to fix it in the cluster LLAP: occasional NPE in parallel queries in ORC reader -- Key: HIVE-11222 URL: https://issues.apache.org/jira/browse/HIVE-11222 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap Attachments: HIVE-11222.patch {noformat} Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:275) at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:227) at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:155) at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:101) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350) ... 22 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:709) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:618) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:195) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:59) at org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37) ... 4 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11190) No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden
[ https://issues.apache.org/jira/browse/HIVE-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621665#comment-14621665 ] Thejas M Nair commented on HIVE-11190: -- I didn't realize the config description already said that. So, no change required there. No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden -- Key: HIVE-11190 URL: https://issues.apache.org/jira/browse/HIVE-11190 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11190.001.patch, HIVE-11190.002.patch ConfVars.METASTORE_FILTER_HOOK in authorization V2 will be overridden without any prompting info or warning, which will cause users to fail to customize the METASTORE_FILTER_HOOK. We should log information, such as that this value is ignored, when the override happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11214) Insert into ACID table switches vectorization off
[ https://issues.apache.org/jira/browse/HIVE-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621681#comment-14621681 ] Hive QA commented on HIVE-11214: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744582/HIVE-11214.02.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9151 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_acid3 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4561/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4561/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4561/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12744582 - PreCommit-HIVE-TRUNK-Build Insert into ACID table switches vectorization off -- Key: HIVE-11214 URL: https://issues.apache.org/jira/browse/HIVE-11214 Project: Hive Issue Type: Bug Components: Hive Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-11214.01.patch, HIVE-11214.02.patch PROBLEM: vectorization is switched off automatically after running INSERT INTO an ACID table. 
STEPS TO REPRODUCE:
set hive.vectorized.execution.enabled=true;
create table testv (id int, name string) clustered by (id) into 2 buckets stored as orc tblproperties('transactional'='true');
insert into testv values(1,'a');
set hive.vectorized.execution.enabled;
false
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11222) LLAP: occasional NPE in parallel queries in ORC reader
[ https://issues.apache.org/jira/browse/HIVE-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621530#comment-14621530 ] Sergey Shelukhin commented on HIVE-11222: - Two vertices scanning the same stripe in parallel, so hopefully it shouldn't happen in a single-query case LLAP: occasional NPE in parallel queries in ORC reader -- Key: HIVE-11222 URL: https://issues.apache.org/jira/browse/HIVE-11222 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap Attachments: HIVE-11222.patch {noformat} Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:275) at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:227) at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:155) at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:101) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350) ... 22 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:709) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:618) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:195) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:59) at org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37) ... 4 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11170) port parts of HIVE-11015 to master for ease of future merging
[ https://issues.apache.org/jira/browse/HIVE-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621528#comment-14621528 ] Hive QA commented on HIVE-11170: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744543/HIVE-11170.02.patch {color:green}SUCCESS:{color} +1 9148 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4558/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4558/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4558/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12744543 - PreCommit-HIVE-TRUNK-Build port parts of HIVE-11015 to master for ease of future merging - Key: HIVE-11170 URL: https://issues.apache.org/jira/browse/HIVE-11170 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 2.0.0 Attachments: HIVE-11170.01.patch, HIVE-11170.02.patch, HIVE-11170.patch That patch changes how IOContext is created (file structure) and adds tests; I will merge non-LLAP parts of it now, so it's easier to merge later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10937) LLAP: make ObjectCache for plans work properly in the daemon
[ https://issues.apache.org/jira/browse/HIVE-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621542#comment-14621542 ] Sergey Shelukhin commented on HIVE-10937: - I just ran a bunch of queries; I think it runs ok... [~gopalv] [~hagleitn] do you want to review? This will need a rebase after the mapjoin cleanup patch is committed. LLAP: make ObjectCache for plans work properly in the daemon Key: HIVE-10937 URL: https://issues.apache.org/jira/browse/HIVE-10937 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap Attachments: HIVE-10937.01.patch, HIVE-10937.02.patch, HIVE-10937.patch There's a perf hit otherwise, esp. when the planner creates 1009 reducers of 4Mb each. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11223) CBO (Calcite Return Path): MapJoin and SMBJoin conversion not triggered
[ https://issues.apache.org/jira/browse/HIVE-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11223: --- Attachment: HIVE-11223.patch [~jpullokkaran], could you take a look at the patch? Thanks CBO (Calcite Return Path): MapJoin and SMBJoin conversion not triggered --- Key: HIVE-11223 URL: https://issues.apache.org/jira/browse/HIVE-11223 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11223.patch Information in aux data structures is not complete, thus MapJoin and SMBJoin conversion are not triggered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10515) Create tests to cover existing (supported) Hive CLI functionality
[ https://issues.apache.org/jira/browse/HIVE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu resolved HIVE-10515. - Resolution: Fixed Resolving it because it was committed as part of other JIRAs. Create tests to cover existing (supported) Hive CLI functionality - Key: HIVE-10515 URL: https://issues.apache.org/jira/browse/HIVE-10515 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Ferdinand Xu After removing HiveServer1, Hive CLI's functionality is reduced to its original use case, a thick client application. Let's identify this so that we maintain it when implementation is changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11190) No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden
[ https://issues.apache.org/jira/browse/HIVE-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621655#comment-14621655 ] Dapeng Sun commented on HIVE-11190: --- Thank you for your comments, [~thejas]. {quote} replace the use of the string org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook with AuthorizationMetaStoreFilterHook.class.getName() ? (less chances of typos happening). {quote} Good suggestion. I will update it in the next patch. {quote} update the description of ConfVars.METASTORE_FILTER_HOOK to say that it gets overridden when V2 auth is used ? {quote} The description of {{ConfVars.METASTORE_FILTER_HOOK}} is: If hive.security.authorization.manager is set to instance of HiveAuthorizerFactory, then this value is ignored. Do you think it still needs to be updated? No prompting info or warning provided when METASTORE_FILTER_HOOK in authorization V2 is overridden -- Key: HIVE-11190 URL: https://issues.apache.org/jira/browse/HIVE-11190 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11190.001.patch, HIVE-11190.002.patch ConfVars.METASTORE_FILTER_HOOK in authorization V2 will be overridden without any prompting info or warning, which will cause users to fail to customize the METASTORE_FILTER_HOOK. We should log information, such as that this value is ignored, when the override happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
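The `class.getName()` suggestion quoted above is a general Java pattern worth spelling out: a class name held as a string literal compiles even when mistyped, while a class-literal reference is checked by the compiler. A minimal sketch with a stand-in class (not the real Hive hook type):

```java
public class HookNameDemo {
    // Stand-in for AuthorizationMetaStoreFilterHook; the real class lives in
    // org.apache.hadoop.hive.ql.security.authorization.plugin.
    static class AuthorizationMetaStoreFilterHook {}

    // Fragile: a typo in the literal still compiles and only fails at
    // runtime, when the hook class cannot be loaded.
    static String nameByLiteral() {
        return "org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook";
    }

    // Robust: the compiler verifies the class exists, and refactoring
    // tools keep the name in sync if the class is ever renamed.
    static String nameByClass() {
        return AuthorizationMetaStoreFilterHook.class.getName();
    }

    public static void main(String[] args) {
        System.out.println(nameByClass());
    }
}
```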
[jira] [Commented] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619982#comment-14619982 ] Hive QA commented on HIVE-9152: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744405/HIVE-9152.10-spark.patch {color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 8000 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.initializationError org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_spark_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap_auto org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_column_access_stats org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_partition_metadataonly org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_11 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_udf_example_add org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_udf_in_file org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_view org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_elt org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_string_concat 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_decimal_date org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_div0 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_case org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_math_funcs org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_string_funcs {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/927/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/927/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-927/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 25 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12744405 - PreCommit-HIVE-SPARK-Build Dynamic Partition Pruning [Spark Branch] Key: HIVE-9152 URL: https://issues.apache.org/jira/browse/HIVE-9152 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Chao Sun Attachments: HIVE-9152.1-spark.patch, HIVE-9152.10-spark.patch, HIVE-9152.2-spark.patch, HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch, HIVE-9152.6-spark.patch, HIVE-9152.8-spark.patch, HIVE-9152.9-spark.patch Tez implemented dynamic partition pruning in HIVE-7826. This is a nice optimization and we should implement the same in HOS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
[ https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619983#comment-14619983 ] Hive QA commented on HIVE-10895: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744331/HIVE-10895.3.patch {color:green}SUCCESS:{color} +1 9143 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4547/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4547/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4547/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12744331 - PreCommit-HIVE-TRUNK-Build ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources --- Key: HIVE-10895 URL: https://issues.apache.org/jira/browse/HIVE-10895 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Takahiko Saito Assignee: Aihua Xu Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, HIVE-10895.3.patch During testing, we've noticed Oracle db running out of cursors. Might be related to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
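The leak pattern behind HIVE-10895 is generic: each query object pins a database-side resource (an Oracle cursor here) until it is explicitly released. A toy simulation of the close-in-finally fix; FakeQuery is invented for this sketch, while the real ObjectStore code releases JDO `Query` objects via `closeAll()`:

```java
import java.util.ArrayList;
import java.util.List;

public class CursorLeakDemo {
    // Tracks "open cursors", standing in for the metastore db's bookkeeping.
    static final List<FakeQuery> OPEN = new ArrayList<>();

    // Simulates a JDO Query whose underlying cursor must be closed explicitly.
    static class FakeQuery {
        FakeQuery() { OPEN.add(this); }
        List<String> execute() { return List.of("row"); }
        void closeAll() { OPEN.remove(this); }
    }

    // The fix pattern: release the query in a finally block so the cursor
    // is freed even when execute() throws.
    static List<String> runQuery() {
        FakeQuery q = new FakeQuery();
        try {
            return q.execute();
        } finally {
            q.closeAll();
        }
    }

    public static void main(String[] args) {
        runQuery();
        // No cursors remain open after the query completes.
        System.out.println("open cursors: " + OPEN.size());
    }
}
```

Without the finally block, every call path that returns or throws before the close leaks one cursor, which matches the symptom of Oracle running out of cursors under sustained load.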
[jira] [Commented] (HIVE-11179) HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers
[ https://issues.apache.org/jira/browse/HIVE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620023#comment-14620023 ] Lefty Leverenz commented on HIVE-11179: --- Does this need documentation? HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers - Key: HIVE-11179 URL: https://issues.apache.org/jira/browse/HIVE-11179 Project: Hive Issue Type: Improvement Reporter: Dapeng Sun Assignee: Dapeng Sun Labels: Authorization Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11179.001.patch, HIVE-11179.001.patch HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers: There is a case in Apache Sentry: Sentry supports URI- and server-level privileges, but on the Hive side, it uses {{AuthorizationUtils.getHivePrivilegeObject(privSubjectDesc)}} to do the conversion, and the code in {{getHivePrivilegeObject()}} only handles the cases of table and database {noformat} privSubjectDesc.getTable() ? HivePrivilegeObjectType.TABLE_OR_VIEW : HivePrivilegeObjectType.DATABASE; {noformat} A solution is to move this method to {{HiveAuthorizer}}, so that a custom Authorizer could enhance it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11179) HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers
[ https://issues.apache.org/jira/browse/HIVE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620041#comment-14620041 ] Lefty Leverenz commented on HIVE-11179: --- Okay, thanks. HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers - Key: HIVE-11179 URL: https://issues.apache.org/jira/browse/HIVE-11179 Project: Hive Issue Type: Improvement Reporter: Dapeng Sun Assignee: Dapeng Sun Labels: Authorization Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11179.001.patch, HIVE-11179.001.patch HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers: There is a case in Apache Sentry: Sentry supports URI- and server-level privileges, but on the Hive side, it uses {{AuthorizationUtils.getHivePrivilegeObject(privSubjectDesc)}} to do the conversion, and the code in {{getHivePrivilegeObject()}} only handles the cases of table and database {noformat} privSubjectDesc.getTable() ? HivePrivilegeObjectType.TABLE_OR_VIEW : HivePrivilegeObjectType.DATABASE; {noformat} A solution is to move this method to {{HiveAuthorizer}}, so that a custom Authorizer could enhance it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11191) Beeline-cli: support hive.cli.errors.ignore in new CLI
[ https://issues.apache.org/jira/browse/HIVE-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-11191: Attachment: HIVE-11191.2-beeline-cli.patch Update the patch addressing [~xuefuz]'s comments Beeline-cli: support hive.cli.errors.ignore in new CLI -- Key: HIVE-11191 URL: https://issues.apache.org/jira/browse/HIVE-11191 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11191.1-beeline-cli.patch, HIVE-11191.2-beeline-cli.patch In the old CLI, it uses hive.cli.errors.ignore from the hive configuration to force execution a script when errors occurred. In the beeline, it has a similar option called force. We need to support the previous configuration using beeline functionality. More details about force option are available in https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10515) Create tests to cover existing (supported) Hive CLI functionality
[ https://issues.apache.org/jira/browse/HIVE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620026#comment-14620026 ] Ferdinand Xu commented on HIVE-10515: - Hi [~xuefuz], I am wondering whether we can resolve this issue since TestHiveCli is serving as the functionality test as part of committed patches for other jiras. Create tests to cover existing (supported) Hive CLI functionality - Key: HIVE-10515 URL: https://issues.apache.org/jira/browse/HIVE-10515 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Ferdinand Xu After removing HiveServer1, Hive CLI's functionality is reduced to its original use case, a thick client application. Let's identify this so that we maintain it when implementation is changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11179) HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers
[ https://issues.apache.org/jira/browse/HIVE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620034#comment-14620034 ] Ferdinand Xu commented on HIVE-11179: - Hi [~leftylev], I don't think this JIRA needs extra documentation. It just improves the extensibility of the API. Thank you! HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers - Key: HIVE-11179 URL: https://issues.apache.org/jira/browse/HIVE-11179 Project: Hive Issue Type: Improvement Reporter: Dapeng Sun Assignee: Dapeng Sun Labels: Authorization Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11179.001.patch, HIVE-11179.001.patch HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers: There is a case in Apache Sentry: Sentry supports URI- and server-level privileges, but on the Hive side, it uses {{AuthorizationUtils.getHivePrivilegeObject(privSubjectDesc)}} to do the conversion, and the code in {{getHivePrivilegeObject()}} only handles the cases of table and database {noformat} privSubjectDesc.getTable() ? HivePrivilegeObjectType.TABLE_OR_VIEW : HivePrivilegeObjectType.DATABASE; {noformat} A solution is to move this method to {{HiveAuthorizer}}, so that a custom Authorizer could enhance it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
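The limitation in the {noformat} snippet above, and the proposed overridable conversion, can be sketched with simplified stand-in types. Every name below is illustrative, not Hive's actual API:

```java
public class PrivConversionDemo {
    // Simplified stand-ins for the Hive privilege-object types.
    enum ObjectType { TABLE_OR_VIEW, DATABASE, URI, SERVER }

    static class PrivSubjectDesc {
        private final boolean table;
        PrivSubjectDesc(boolean table) { this.table = table; }
        boolean getTable() { return table; }
    }

    // Today's behavior: the static utility's ternary can only ever yield
    // TABLE_OR_VIEW or DATABASE, so URI/SERVER privileges (needed by
    // Sentry) are unrepresentable.
    static ObjectType defaultConvert(PrivSubjectDesc d) {
        return d.getTable() ? ObjectType.TABLE_OR_VIEW : ObjectType.DATABASE;
    }

    // Proposed direction: make the conversion an overridable instance
    // method on the authorizer so a plugin can widen the mapping.
    static class BaseAuthorizer {
        ObjectType convert(PrivSubjectDesc d) { return defaultConvert(d); }
    }

    static class UriAwareAuthorizer extends BaseAuthorizer {
        @Override
        ObjectType convert(PrivSubjectDesc d) {
            // A real authorizer would inspect the descriptor for a
            // URI/server target; hard-coded here for illustration.
            return ObjectType.URI;
        }
    }

    public static void main(String[] args) {
        System.out.println(new UriAwareAuthorizer().convert(new PrivSubjectDesc(false)));
    }
}
```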
[jira] [Commented] (HIVE-11211) Reset the fields in JoinStatsRule in StatsRulesProcFactory
[ https://issues.apache.org/jira/browse/HIVE-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620072#comment-14620072 ] Hive QA commented on HIVE-11211: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744344/HIVE-11211.02.patch {color:green}SUCCESS:{color} +1 9138 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4548/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4548/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4548/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12744344 - PreCommit-HIVE-TRUNK-Build Reset the fields in JoinStatsRule in StatsRulesProcFactory -- Key: HIVE-11211 URL: https://issues.apache.org/jira/browse/HIVE-11211 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11211.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11159) Integrate hplsql.Conf with HiveConf
[ https://issues.apache.org/jira/browse/HIVE-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-11159: - Assignee: Dmitry Tolpeko Integrate hplsql.Conf with HiveConf --- Key: HIVE-11159 URL: https://issues.apache.org/jira/browse/HIVE-11159 Project: Hive Issue Type: Task Components: hpl/sql Affects Versions: 2.0.0 Reporter: Alan Gates Assignee: Dmitry Tolpeko HPL/SQL has its own Conf object. It should re-use HiveConf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11206) CBO (Calcite Return Path): Join translation should update all ExprNode recursively
[ https://issues.apache.org/jira/browse/HIVE-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620020#comment-14620020 ] Jesus Camacho Rodriguez commented on HIVE-11206: [~ashutoshc], could you take a look? Thanks CBO (Calcite Return Path): Join translation should update all ExprNode recursively -- Key: HIVE-11206 URL: https://issues.apache.org/jira/browse/HIVE-11206 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11206.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11197) While extracting join conditions follow Hive rules for type conversion instead of Calcite
[ https://issues.apache.org/jira/browse/HIVE-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620055#comment-14620055 ] Jesus Camacho Rodriguez commented on HIVE-11197: [~ashutoshc], thanks for adding that comment. I've been checking the patch again, and I'm wondering whether it wouldn't be more correct to bail out of CBO when the Exception is thrown, instead of catching it in those classes, as we may end up hitting other issues if we don't. For instance, we can bail out in case we hit the Exception while checking the cost of a certain join, or in case some of the rules are not applied, e.g. HiveInsertExchange4JoinRule, which needs to be applied in the return path. What do you think? While extracting join conditions follow Hive rules for type conversion instead of Calcite - Key: HIVE-11197 URL: https://issues.apache.org/jira/browse/HIVE-11197 Project: Hive Issue Type: Bug Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-11197.2.patch, HIVE-11197.2.patch, HIVE-11197.3.patch, HIVE-11197.patch, HIVE-11197.patch Calcite's strict type system throws an exception in those cases, which are legal in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11193) ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted
[ https://issues.apache.org/jira/browse/HIVE-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620387#comment-14620387 ] Hive QA commented on HIVE-11193: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744360/HIVE-11193.02.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9139 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_constprog_dpp {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4550/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4550/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4550/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12744360 - PreCommit-HIVE-TRUNK-Build ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted --- Key: HIVE-11193 URL: https://issues.apache.org/jira/browse/HIVE-11193 Project: Hive Issue Type: Bug Components: Logical Optimizer Reporter: Wei Zheng Assignee: Wei Zheng Attachments: HIVE-11193.01.patch, HIVE-11193.02.patch During Constant Propagation optimization, sometimes a node ends up being added to opToDelete list more than once. Later in ConstantPropagate transform, we try to delete that operator multiple times, which will cause SemanticException since the node has already been removed in an earlier pass. The data structure for storing opToDelete is List. We should use Set to avoid the problem. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
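The List-versus-Set distinction described in HIVE-11193 can be illustrated with a minimal sketch. The class and variable names below are stand-ins for illustration (Hive's ConstantPropagateProcCtx holds Operator instances, not strings), but the dedup behavior is exactly what the issue relies on:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OpToDeleteDemo {
    public static void main(String[] args) {
        String op = "FIL_5"; // stand-in for an operator marked for deletion

        // A List records the same operator every time it is marked...
        List<String> opToDeleteList = new ArrayList<>();
        opToDeleteList.add(op);
        opToDeleteList.add(op); // a later optimizer pass marks the same node again
        // ...so the deletion loop would try to remove it twice.
        System.out.println("list size: " + opToDeleteList.size());

        // A Set collapses duplicate additions, so deletion runs once per node.
        Set<String> opToDeleteSet = new HashSet<>();
        opToDeleteSet.add(op);
        opToDeleteSet.add(op);
        System.out.println("set size: " + opToDeleteSet.size());
    }
}
```

With the List, the second removal attempt is what surfaces as the SemanticException described above; the Set makes the second add a no-op.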
[jira] [Assigned] (HIVE-11216) UDF GenericUDFMapKeys throws NPE when a null map value is passed in
[ https://issues.apache.org/jira/browse/HIVE-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yibing Shi reassigned HIVE-11216: - Assignee: Yibing Shi UDF GenericUDFMapKeys throws NPE when a null map value is passed in --- Key: HIVE-11216 URL: https://issues.apache.org/jira/browse/HIVE-11216 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.2.0 Reporter: Yibing Shi Assignee: Yibing Shi We can reproduce the problem as below: {noformat} hive> show create table map_txt; OK CREATE TABLE `map_txt`( `id` int, `content` map<int,string>) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ... Time taken: 0.233 seconds, Fetched: 18 row(s) hive> select * from map_txt; OK 1 NULL Time taken: 0.679 seconds, Fetched: 1 row(s) hive> select id, map_keys(content) from map_txt; Error during job, obtaining debugging information... 
Examining task ID: task_1435534231122_0025_m_00 (and more) from job job_1435534231122_0025 Task with the most failures(4): - Task ID: task_1435534231122_0025_m_00 URL: http://host-10-17-80-40.coe.cloudera.com:8088/taskdetails.jsp?jobid=job_1435534231122_0025tipid=task_1435534231122_0025_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {id:1,content:null} at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {id:1,content:null} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:559) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180) ... 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating map_keys(content) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549) ... 
9 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79) ... 13 more FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask MapReduce Jobs Launched: Stage-Stage-1: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL hive {noformat} The error is as below (in mappers): {noformat} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:778) ... 17 more {noformat} Looking at the source code: {code} public Object evaluate(DeferredObject[] arguments) throws HiveException {
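The quoted source cuts off at the start of `evaluate`, but the failure mode is clear from the stack: the map argument resolves to NULL and GenericUDFMapKeys dereferences it. Below is a hedged sketch of the kind of null guard a fix would add, using a simplified plain-Java signature rather than Hive's actual GenericUDF/ObjectInspector API (the method name `mapKeys` and the empty-list result are assumptions for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MapKeysSketch {
    // Simplified stand-in for GenericUDFMapKeys.evaluate: return the
    // map's keys, or an empty list when the map value itself is NULL.
    static List<Object> mapKeys(Map<Object, Object> map) {
        List<Object> keys = new ArrayList<>();
        if (map == null) {   // the missing guard behind the NPE at line 64
            return keys;     // NULL map -> empty key list instead of a crash
        }
        keys.addAll(map.keySet());
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(mapKeys(null));
        System.out.println(mapKeys(Map.of(1, "a")));
    }
}
```

Whether the real patch returns an empty list or a NULL for a NULL input is a design choice for the attached HIVE-11216.patch; the sketch only shows the guard itself.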
[jira] [Updated] (HIVE-11216) UDF GenericUDFMapKeys throws NPE when a null map value is passed in
[ https://issues.apache.org/jira/browse/HIVE-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yibing Shi updated HIVE-11216: -- Attachment: HIVE-11216.patch UDF GenericUDFMapKeys throws NPE when a null map value is passed in --- Key: HIVE-11216 URL: https://issues.apache.org/jira/browse/HIVE-11216 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.2.0 Reporter: Yibing Shi Assignee: Yibing Shi Attachments: HIVE-11216.patch We can reproduce the problem as below: {noformat} hive> show create table map_txt; OK CREATE TABLE `map_txt`( `id` int, `content` map<int,string>) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ... Time taken: 0.233 seconds, Fetched: 18 row(s) hive> select * from map_txt; OK 1 NULL Time taken: 0.679 seconds, Fetched: 1 row(s) hive> select id, map_keys(content) from map_txt; Error during job, obtaining debugging information... 
Examining task ID: task_1435534231122_0025_m_00 (and more) from job job_1435534231122_0025 Task with the most failures(4): - Task ID: task_1435534231122_0025_m_00 URL: http://host-10-17-80-40.coe.cloudera.com:8088/taskdetails.jsp?jobid=job_1435534231122_0025tipid=task_1435534231122_0025_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {id:1,content:null} at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {id:1,content:null} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:559) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180) ... 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating map_keys(content) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549) ... 
9 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79) ... 13 more FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask MapReduce Jobs Launched: Stage-Stage-1: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL hive {noformat} The error is as below (in mappers): {noformat} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:778) ... 17 more {noformat} Looking at the source code: {code} public Object
[jira] [Commented] (HIVE-11191) Beeline-cli: support hive.cli.errors.ignore in new CLI
[ https://issues.apache.org/jira/browse/HIVE-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620266#comment-14620266 ] Hive QA commented on HIVE-11191: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744430/HIVE-11191.2-beeline-cli.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9038 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BEELINE-Build/7/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BEELINE-Build/7/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-BEELINE-Build-7/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12744430 - PreCommit-HIVE-BEELINE-Build Beeline-cli: support hive.cli.errors.ignore in new CLI -- Key: HIVE-11191 URL: https://issues.apache.org/jira/browse/HIVE-11191 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11191.1-beeline-cli.patch, HIVE-11191.2-beeline-cli.patch In the old CLI, it uses hive.cli.errors.ignore from the Hive configuration to force execution of a script to continue when errors occur. Beeline has a similar option called force. We need to support the previous configuration using Beeline functionality. 
More details about force option are available in https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11210) Remove dependency on HiveConf from Orc reader writer
[ https://issues.apache.org/jira/browse/HIVE-11210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620280#comment-14620280 ] Hive QA commented on HIVE-11210: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744346/HIVE-11210.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4549/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4549/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4549/ Messages: {noformat} This message was trimmed, see log for full details main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to /data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- 
build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec --- [INFO] ANTLR: Processing source directory /data/hive-ptest/working/apache-github-source-source/ql/src/java ANTLR Parser Generator Version 3.4 org/apache/hadoop/hive/ql/parse/HiveLexer.g org/apache/hadoop/hive/ql/parse/HiveParser.g warning(200): IdentifiersParser.g:455:5: Decision can match input such as {KW_REGEXP,
[jira] [Updated] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10882: --- Summary: CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results (was: CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filters of join operator causes NPE exception) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results --- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez CBO return path creates join operator with empty filters. However, vectorization is checking the filters of bigTable in join. This causes NPE exception. To reproduce, run vector_outer_join2.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10882: --- Description: CBO return path creates join operator with empty filtersMap. This causes outer joins to produce wrong results. To reproduce, run louter_join_ppr.q with return path turned on. (was: CBO return path creates join operator with empty filters. However, vectorization is checking the filters of bigTable in join. This causes NPE exception. To reproduce, run vector_outer_join2.q with return path turned on.) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results --- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez CBO return path creates join operator with empty filtersMap. This causes outer joins to produce wrong results. To reproduce, run louter_join_ppr.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11214) Insert into ACID table switches vectorization off
[ https://issues.apache.org/jira/browse/HIVE-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620501#comment-14620501 ] Hive QA commented on HIVE-11214: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744369/HIVE-11214.01.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9139 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_acid3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_acid3 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers org.apache.hive.jdbc.TestSSL.testSSLFetchHttp {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4551/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4551/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4551/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12744369 - PreCommit-HIVE-TRUNK-Build Insert into ACID table switches vectorization off -- Key: HIVE-11214 URL: https://issues.apache.org/jira/browse/HIVE-11214 Project: Hive Issue Type: Bug Components: Hive Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-11214.01.patch PROBLEM: vectorization is switched off automatically after running insert into an ACID table. 
STEPS TO REPRODUCE:
{noformat}
set hive.vectorized.execution.enabled=true;
create table testv (id int, name string) clustered by (id) into 2 buckets stored as orc tblproperties('transactional'='true');
insert into testv values(1,'a');
set hive.vectorized.execution.enabled;
false
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11220) HS2 Lineage leakage with 16x concurrency tests
[ https://issues.apache.org/jira/browse/HIVE-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11220: --- Attachment: hs2_Top_Components.zip hs2_System_Overview.zip hs2_Leak_Suspects.zip HS2 Lineage leakage with 16x concurrency tests -- Key: HIVE-11220 URL: https://issues.apache.org/jira/browse/HIVE-11220 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: llap, 1.3.0 Reporter: Gopal V Attachments: hs2_Leak_Suspects.zip, hs2_System_Overview.zip, hs2_Top_Components.zip Test scenario is HS2 + LLAP, 16x concurrency of TPC-DS queries which take less than 4 seconds. session.LineageState accumulates optimizer lineage info and HS2 OOMs due to the amount of data being held in the SessionState, since the sessions are continuously being used without pause. The issue seems to be triggered by the volume of fast queries or the lifetime of a single JDBC connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11219) Transactional documentation is unclear
[ https://issues.apache.org/jira/browse/HIVE-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621173#comment-14621173 ] Johndee Burks commented on HIVE-11219: -- Also I can make this change, how do I get rights to edit the page? I propose something like this: If a table is to be used in ACID writes (insert, update, delete) then the table property transactional must be set on that table to true, starting with Hive 0.14.0. Without this value, inserts will be done in the old style; updates and deletes will be prohibited. However, this does not apply to Hive 0.13.0. An example DDL is below: {code} CREATE TABLE transactme( key int, id int) CLUSTERED BY ( id) INTO 3 BUCKETS STORED AS orc TBLPROPERTIES ('transactional'='true') {code} Transactional documentation is unclear -- Key: HIVE-11219 URL: https://issues.apache.org/jira/browse/HIVE-11219 Project: Hive Issue Type: Improvement Components: Transactions Reporter: Johndee Burks Assignee: Johndee Burks Priority: Minor At the [this|https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions] link the following is said. If a table is to be used in ACID writes (insert, update, delete) then the table property transactional must be set on that table, starting with Hive 0.14.0. Without this value, inserts will be done in the old style; updates and deletes will be prohibited. However, this does not apply to Hive 0.13.0. It does not tell you what the value of transactional should be. I think we should say it needs to be true and we should show an example DDL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11165) Calcite planner might have a thread-safety issue compiling in parallel
[ https://issues.apache.org/jira/browse/HIVE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621264#comment-14621264 ] Sergey Shelukhin commented on HIVE-11165: - [~jpullokkaran] [~pxiong] can you guys comment? Calcite planner might have a thread-safety issue compiling in parallel -- Key: HIVE-11165 URL: https://issues.apache.org/jira/browse/HIVE-11165 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 2.0.0 Reporter: Gopal V Assignee: Laljo John Pullokkaran Attachments: RunJar-2015-06-30.snapshot After about 6 minutes trying to plan a query, the HiveServer2 was killed to restore functionality to a test run. The HEP planner is stuck on a TopologicalOrder traversal and there were no queries being fed into the HiveServer2 after it got stuck. TPC-DS query13 was the query in question, at 4 way parallel, which triggered the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11165) Calcite planner might have a thread-safety issue compiling in parallel
[ https://issues.apache.org/jira/browse/HIVE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621262#comment-14621262 ] Sergey Shelukhin commented on HIVE-11165: - I have seen the following callstack that may be related, after ctrl-c-ing HiveServer2 that was stuck forever {noformat} Exception in thread HiveServer2-Handler-Pool: Thread-81 java.lang.OutOfMemoryError: GC overhead limit exceeded at java.util.HashMap.resize(HashMap.java:703) at java.util.HashMap.putVal(HashMap.java:662) at java.util.HashMap.put(HashMap.java:611) at java.util.HashSet.add(HashSet.java:219) at org.apache.calcite.util.graph.BreadthFirstIterator.reachable(BreadthFirstIterator.java:61) at org.apache.calcite.plan.hep.HepPlanner.collectGarbage(HepPlanner.java:900) at org.apache.calcite.plan.hep.HepPlanner.getGraphIterator(HepPlanner.java:427) at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:400) at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:285) at org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:72) at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:207) at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:1035) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:964) ... {noformat} Calcite planner might have a thread-safety issue compiling in parallel -- Key: HIVE-11165 URL: https://issues.apache.org/jira/browse/HIVE-11165 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 2.0.0 Reporter: Gopal V Attachments: RunJar-2015-06-30.snapshot After about 6 minutes trying to plan a query, the HiveServer2 was killed to restore functionality to a test run. 
The HEP planner is stuck on a TopologicalOrder traversal and there were no queries being fed into the HiveServer2 after it got stuck. TPC-DS query13 was the query in question, at 4 way parallel, which triggered the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
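The stack in the comment above shows HepPlanner mutating per-instance graph state while collecting garbage; if two concurrent compilations were ever to touch the same planner instance, that unsynchronized traversal could loop or corrupt. A common mitigation for this class of bug, sketched here with a hypothetical stand-in `Planner` class rather than Calcite's actual API, is to confine each compiling thread to its own instance:

```java
public class PerThreadPlannerDemo {
    // Stand-in for a non-thread-safe planner holding mutable state.
    static class Planner {
        int rulesApplied = 0;
        void apply() { rulesApplied++; } // unsynchronized mutation
    }

    // One planner per compiling thread, so no mutable state is shared.
    static final ThreadLocal<Planner> PLANNER =
        ThreadLocal.withInitial(Planner::new);

    public static void main(String[] args) throws InterruptedException {
        final int[] counts = new int[2];
        Runnable compileA = () -> {
            for (int i = 0; i < 1000; i++) PLANNER.get().apply();
            counts[0] = PLANNER.get().rulesApplied; // only this thread's work
        };
        Runnable compileB = () -> {
            for (int i = 0; i < 1000; i++) PLANNER.get().apply();
            counts[1] = PLANNER.get().rulesApplied;
        };
        Thread a = new Thread(compileA), b = new Thread(compileB);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

This is only an illustration of thread confinement, not a claim about how Hive's CalcitePlanner is structured; whether the reported hang is actually cross-thread sharing is exactly what the ticket is trying to establish.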
[jira] [Updated] (HIVE-11165) Calcite planner might have a thread-safety issue compiling in parallel
[ https://issues.apache.org/jira/browse/HIVE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11165: Assignee: Laljo John Pullokkaran Calcite planner might have a thread-safety issue compiling in parallel -- Key: HIVE-11165 URL: https://issues.apache.org/jira/browse/HIVE-11165 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 2.0.0 Reporter: Gopal V Assignee: Laljo John Pullokkaran Attachments: RunJar-2015-06-30.snapshot After about 6 minutes trying to plan a query, the HiveServer2 was killed to restore functionality to a test run. The HEP planner is stuck on a TopologicalOrder traversal and there were no queries being fed into the HiveServer2 after it got stuck. TPC-DS query13 was the query in question, at 4 way parallel, which triggered the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE
[ https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11221: - Attachment: HIVE-11221.1.patch In Tez mode, alter table concatenate orc files can intermittently fail with NPE --- Key: HIVE-11221 URL: https://issues.apache.org/jira/browse/HIVE-11221 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11221.1.patch We are not waiting for input-ready events, which can trigger an occasional NPE if the input is not actually ready. Stacktrace: {code}
java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
  at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:478)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:471)
  at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:648)
  at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:146)
  at org.apache.tez.mapreduce.lib.MRReaderMapred.init(MRReaderMapred.java:73)
  at org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:483)
  at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:108)
  at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.getMRInput(MergeFileRecordProcessor.java:220)
  at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.init(MergeFileRecordProcessor.java:72)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
  ... 13 more
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
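The fix the summary describes — wait for the input-ready event before initializing the processor — follows a standard synchronization pattern. A minimal, self-contained Java sketch of that pattern using a `CountDownLatch`; the class and method names (`InputReadyWait`, `onInputReady`) are hypothetical stand-ins, not the actual Tez event API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class InputReadyWait {
    private final CountDownLatch inputReady = new CountDownLatch(1);
    private volatile String recordReader;   // stands in for the MRInput record reader

    // Called by the framework when the "input ready" event arrives.
    public void onInputReady() {
        recordReader = "reader";            // becomes non-null only now
        inputReady.countDown();
    }

    // Processor side: block until the event has fired before touching the input.
    // Initializing without this wait is exactly what produces the NPE above.
    public String initProcessor() {
        try {
            if (!inputReady.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for input");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return recordReader;                // guaranteed non-null here
    }

    public static void main(String[] args) throws Exception {
        InputReadyWait p = new InputReadyWait();
        Thread event = new Thread(p::onInputReady);
        event.start();
        System.out.println(p.initProcessor()); // prints "reader"
        event.join();
    }
}
```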
[jira] [Commented] (HIVE-11219) Transactional documentation is unclear
[ https://issues.apache.org/jira/browse/HIVE-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621271#comment-14621271 ] Lefty Leverenz commented on HIVE-11219: --- For edit rights you need a Confluence username. You can post it here or send it to u...@hive.apache.org as described in About This Wiki: * [About This Wiki - How to get permission to edit | https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit] Transactional documentation is unclear -- Key: HIVE-11219 URL: https://issues.apache.org/jira/browse/HIVE-11219 Project: Hive Issue Type: Improvement Components: Transactions Reporter: Johndee Burks Assignee: Johndee Burks Priority: Minor The page at [this|https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions] link says the following: If a table is to be used in ACID writes (insert, update, delete) then the table property transactional must be set on that table, starting with Hive 0.14.0. Without this value, inserts will be done in the old style; updates and deletes will be prohibited. However, this does not apply to Hive 0.13.0. It does not tell you what the value of transactional should be. I think we should say it needs to be true and we should show an example DDL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11165) Calcite planner might have a thread-safety issue compiling in parallel
[ https://issues.apache.org/jira/browse/HIVE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621286#comment-14621286 ] Pengcheng Xiong commented on HIVE-11165: I attached query 13 here. I do not know the root cause yet, but I see lots of predicates. I suspect that this is related to the recent optimization on PPD? [~jpullokkaran]? {code}
select avg(ss_quantity)
      ,avg(ss_ext_sales_price)
      ,avg(ss_ext_wholesale_cost)
      ,sum(ss_ext_wholesale_cost)
from store_sales
    ,store
    ,customer_demographics
    ,household_demographics
    ,customer_address
    ,date_dim
where store.s_store_sk = store_sales.ss_store_sk
and store_sales.ss_sold_date_sk = date_dim.d_date_sk
and date_dim.d_year = 2001
and ss_sold_date between '2001-01-01' and '2001-12-31'
and((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
  and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
  and customer_demographics.cd_marital_status = 'M'
  and customer_demographics.cd_education_status = '4 yr Degree'
  and store_sales.ss_sales_price between 100.00 and 150.00
  and household_demographics.hd_dep_count = 3
  ) or
  (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
  and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
  and customer_demographics.cd_marital_status = 'D'
  and customer_demographics.cd_education_status = 'Primary'
  and store_sales.ss_sales_price between 50.00 and 100.00
  and household_demographics.hd_dep_count = 1
  ) or
  (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
  and customer_demographics.cd_demo_sk = ss_cdemo_sk
  and customer_demographics.cd_marital_status = 'U'
  and customer_demographics.cd_education_status = 'Advanced Degree'
  and store_sales.ss_sales_price between 150.00 and 200.00
  and household_demographics.hd_dep_count = 1
  ))
and((store_sales.ss_addr_sk = customer_address.ca_address_sk
  and customer_address.ca_country = 'United States'
  and customer_address.ca_state in ('KY', 'GA', 'NM')
  and store_sales.ss_net_profit between 100 and 200
  ) or
  (store_sales.ss_addr_sk = customer_address.ca_address_sk
  and customer_address.ca_country = 'United States'
  and customer_address.ca_state in ('MT', 'OR', 'IN')
  and store_sales.ss_net_profit between 150 and 300
  ) or
  (store_sales.ss_addr_sk = customer_address.ca_address_sk
  and customer_address.ca_country = 'United States'
  and customer_address.ca_state in ('WI', 'MO', 'WV')
  and store_sales.ss_net_profit between 50 and 250
  ))
;
{code} Calcite planner might have a thread-safety issue compiling in parallel -- Key: HIVE-11165 URL: https://issues.apache.org/jira/browse/HIVE-11165 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 2.0.0 Reporter: Gopal V Assignee: Laljo John Pullokkaran Attachments: RunJar-2015-06-30.snapshot After about 6 minutes trying to plan a query, the HiveServer2 was killed to restore functionality to a test run. The HEP planner is stuck on a TopologicalOrder traversal and there were no queries being fed into the HiveServer2 after it got stuck. TPC-DS query13 was the query in question, at 4 way parallel, which triggered the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
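One common way to sidestep a planner that may not be safe to share across compiling threads is to give each compiling thread its own instance. A hedged Java sketch of that approach; `Planner` here is a hypothetical stand-in, not the actual Calcite or Hive planner API:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadPlanner {
    // Hypothetical stand-in for a planner that is not safe to share.
    static class Planner {
        static final AtomicInteger created = new AtomicInteger();
        Planner() { created.incrementAndGet(); }
        String plan(String sql) { return "PLAN(" + sql + ")"; }
    }

    // One planner per compiling thread instead of one shared instance,
    // so concurrent compilations never touch the same mutable state.
    private static final ThreadLocal<Planner> PLANNER =
        ThreadLocal.withInitial(Planner::new);

    public static String compile(String sql) {
        return PLANNER.get().plan(sql);
    }

    public static void main(String[] args) throws Exception {
        Thread t1 = new Thread(() -> compile("q13"));
        Thread t2 = new Thread(() -> compile("q13"));
        t1.start(); t2.start(); t1.join(); t2.join();
        System.out.println(Planner.created.get()); // 2: one instance per thread
    }
}
```

The trade-off is memory (one planner per thread) against the locking or corruption risk of a shared instance.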
[jira] [Commented] (HIVE-11211) Reset the fields in JoinStatsRule in StatsRulesProcFactory
[ https://issues.apache.org/jira/browse/HIVE-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621140#comment-14621140 ] Laljo John Pullokkaran commented on HIVE-11211: --- +1 Reset the fields in JoinStatsRule in StatsRulesProcFactory -- Key: HIVE-11211 URL: https://issues.apache.org/jira/browse/HIVE-11211 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11211.02.patch, HIVE-11211.03.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11193) ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted
[ https://issues.apache.org/jira/browse/HIVE-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-11193: - Attachment: HIVE-11193.03.patch Attach patch 3. The EXPLAIN difference was due to my local JDK 8 use. JDK 7 produces consistent result. ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted --- Key: HIVE-11193 URL: https://issues.apache.org/jira/browse/HIVE-11193 Project: Hive Issue Type: Bug Components: Logical Optimizer Reporter: Wei Zheng Assignee: Wei Zheng Attachments: HIVE-11193.01.patch, HIVE-11193.02.patch, HIVE-11193.03.patch During Constant Propagation optimization, sometimes a node ends up being added to opToDelete list more than once. Later in ConstantPropagate transform, we try to delete that operator multiple times, which will cause SemanticException since the node has already been removed in an earlier pass. The data structure for storing opToDelete is List. We should use Set to avoid the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
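The failure mode described above is easy to see in isolation: adding the same operator to a List twice schedules two deletions, while a Set collapses the duplicate. A minimal Java sketch (the operator name `SEL_3` is illustrative, not taken from the patch):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;

public class OpToDeleteDemo {
    // Simulate two optimizer passes marking the same operator for deletion.
    public static int markedWithList() {
        ArrayList<String> ops = new ArrayList<>();
        ops.add("SEL_3");
        ops.add("SEL_3");          // duplicate slips in -> deleted twice later
        return ops.size();
    }

    public static int markedWithSet() {
        LinkedHashSet<String> ops = new LinkedHashSet<>();
        ops.add("SEL_3");
        ops.add("SEL_3");          // second add is a no-op
        return ops.size();
    }

    public static void main(String[] args) {
        System.out.println(markedWithList()); // 2
        System.out.println(markedWithSet());  // 1
    }
}
```

With the Set, the second deletion pass never sees an operator that was already removed, which is the SemanticException the issue describes.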
[jira] [Assigned] (HIVE-11219) Transactional documentation is unclear
[ https://issues.apache.org/jira/browse/HIVE-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Johndee Burks reassigned HIVE-11219: Assignee: Johndee Burks Transactional documentation is unclear -- Key: HIVE-11219 URL: https://issues.apache.org/jira/browse/HIVE-11219 Project: Hive Issue Type: Improvement Components: Transactions Reporter: Johndee Burks Assignee: Johndee Burks Priority: Minor The page at [this|https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions] link says the following: If a table is to be used in ACID writes (insert, update, delete) then the table property transactional must be set on that table, starting with Hive 0.14.0. Without this value, inserts will be done in the old style; updates and deletes will be prohibited. However, this does not apply to Hive 0.13.0. It does not tell you what the value of transactional should be. I think we should say it needs to be true and we should show an example DDL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621152#comment-14621152 ] Hive QA commented on HIVE-10882: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744507/HIVE-10882.01.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9147 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_semijoin1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4555/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4555/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4555/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12744507 - PreCommit-HIVE-TRUNK-Build CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filtersMap of join operator causes wrong results --- Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10882.01.patch CBO return path creates join operator with empty filtersMap. This causes outer joins to produce wrong results. To reproduce, run louter_join_ppr.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620726#comment-14620726 ] Alan Gates commented on HIVE-10165: --- bq. Finally, as this issue is now resolved, should I submit patches using additional JIRA issues or reopen this one? Open a new one, as this just covers adding the feature, not fixing the included bugs. :) Improve hive-hcatalog-streaming extensibility and support updates and deletes. -- Key: HIVE-10165 URL: https://issues.apache.org/jira/browse/HIVE-10165 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 1.2.0 Reporter: Elliot West Assignee: Elliot West Labels: TODOC2.0, streaming_api Fix For: 2.0.0 Attachments: HIVE-10165.0.patch, HIVE-10165.10.patch, HIVE-10165.4.patch, HIVE-10165.5.patch, HIVE-10165.6.patch, HIVE-10165.7.patch, HIVE-10165.9.patch, mutate-system-overview.png h3. Overview I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts. h3. Motivation We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by: reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small when compared with the records contained in a partition. This approach results in a number of operational issues: * Excessive amount of write activity required for small data changes. * Downstream applications cannot robustly read these datasets while they are being updated. 
* Due to the scale of the updates (hundreds of partitions) the scope for contention is high. I believe we can address this problem by instead writing only the changed records to a Hive transactional table. This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API which will then have the required data to perform an update or insert in a transactional manner. h3. Benefits * Enables the creation of large-scale dataset merge processes * Opens up Hive transactional functionality in an accessible manner to processes that operate outside of Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
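The merge step the reporter describes — group by a key, compare snapshots, classify each row as inserted, updated, or deleted — can be sketched in plain Java. This only illustrates the classification logic; it is not the hive-hcatalog-streaming API, and all names (`MergeClassifier`, `Mutation`, the keys) are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MergeClassifier {
    public enum Mutation { INSERT, UPDATE, DELETE }

    // Classify each key by comparing the modified snapshot to ground truth.
    public static Map<String, Mutation> classify(Map<String, String> truth,
                                                 Map<String, String> modified) {
        Map<String, Mutation> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : modified.entrySet()) {
            String old = truth.get(e.getKey());
            if (old == null) {
                out.put(e.getKey(), Mutation.INSERT);     // new key
            } else if (!old.equals(e.getValue())) {
                out.put(e.getKey(), Mutation.UPDATE);     // changed value
            }                                             // unchanged: no mutation
        }
        for (String k : truth.keySet()) {
            if (!modified.containsKey(k)) {
                out.put(k, Mutation.DELETE);              // key disappeared
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> truth = new LinkedHashMap<>();
        truth.put("k1", "a"); truth.put("k2", "b");
        Map<String, String> modified = new LinkedHashMap<>();
        modified.put("k1", "a2"); modified.put("k3", "c");
        System.out.println(classify(truth, modified));
        // {k1=UPDATE, k3=INSERT, k2=DELETE}
    }
}
```

In the proposed design, only the rows that come out of this classification would be written to the transactional table, each carrying its retained RecordIdentifier.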
[jira] [Updated] (HIVE-11220) HS2 Lineage leakage with 16x concurrency tests
[ https://issues.apache.org/jira/browse/HIVE-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11220: --- Assignee: Thejas M Nair HS2 Lineage leakage with 16x concurrency tests -- Key: HIVE-11220 URL: https://issues.apache.org/jira/browse/HIVE-11220 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: llap, 1.3.0 Reporter: Gopal V Assignee: Thejas M Nair Attachments: hs2_Leak_Suspects.zip, hs2_System_Overview.zip, hs2_Top_Components.zip Test scenario is HS2 + LLAP, 16x concurrency of TPCDS queries which take less than 4 seconds. session.LineageState accumulates optimizer lineage info and HS2 OOMs due to the amount of data being held in the SessionState, since the sessions are continuously being used without pause. The issue seems to be triggered by the volume of fast queries or the lifetime of a single JDBC connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
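The leak mechanism is straightforward to model: per-query lineage entries accumulate in session-scoped state, and nothing clears them between queries on a long-lived connection. A hedged sketch with hypothetical names standing in for session.LineageState (not the actual Hive class):

```java
import java.util.ArrayList;
import java.util.List;

public class SessionLineage {
    // Stands in for session.LineageState: lives as long as the session.
    private final List<String> lineage = new ArrayList<>();

    public void runQuery(String q) {
        lineage.add("lineage-for-" + q);    // optimizer adds entries per query
    }

    // Without a call like this after each query, a long-lived JDBC session
    // grows without bound -- the OOM described above.
    public void clearAfterQuery() {
        lineage.clear();
    }

    public int retained() { return lineage.size(); }

    public static void main(String[] args) {
        SessionLineage s = new SessionLineage();
        for (int i = 0; i < 1000; i++) s.runQuery("q" + i);
        System.out.println(s.retained());   // 1000 entries held in the session
        s.clearAfterQuery();
        System.out.println(s.retained());   // 0 after per-query cleanup
    }
}
```

This is why the symptom scales with query volume and connection lifetime rather than with query size.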
[jira] [Commented] (HIVE-11165) Calcite planner might have a thread-safety issue compiling in parallel
[ https://issues.apache.org/jira/browse/HIVE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621403#comment-14621403 ] Gopal V commented on HIVE-11165: [~pxiong]: this doesn't always happen - I have to increase concurrency to trigger this. The current test workaround is that there's a RandomOrderController in my tests so that it doesn't plan the same query (with the same vertex names, predicates etc) at the same time. Calcite planner might have a thread-safety issue compiling in parallel -- Key: HIVE-11165 URL: https://issues.apache.org/jira/browse/HIVE-11165 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 2.0.0 Reporter: Gopal V Assignee: Laljo John Pullokkaran Attachments: RunJar-2015-06-30.snapshot After about 6 minutes trying to plan a query, the HiveServer2 was killed to restore functionality to a test run. The HEP planner is stuck on a TopologicalOrder traversal and there were no queries being fed into the HiveServer2 after it got stuck. TPC-DS query13 was the query in question, at 4 way parallel, which triggered the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11211) Reset the fields in JoinStatsRule in StatsRulesProcFactory
[ https://issues.apache.org/jira/browse/HIVE-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621425#comment-14621425 ] Hive QA commented on HIVE-11211: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744529/HIVE-11211.03.patch {color:green}SUCCESS:{color} +1 9146 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4557/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4557/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4557/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12744529 - PreCommit-HIVE-TRUNK-Build Reset the fields in JoinStatsRule in StatsRulesProcFactory -- Key: HIVE-11211 URL: https://issues.apache.org/jira/browse/HIVE-11211 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11211.02.patch, HIVE-11211.03.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11193) ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted
[ https://issues.apache.org/jira/browse/HIVE-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621426#comment-14621426 ] Wei Zheng commented on HIVE-11193: -- Here's the link for that change: https://docs.oracle.com/javase/8/docs/technotes/guides/collections/changes8.html ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted --- Key: HIVE-11193 URL: https://issues.apache.org/jira/browse/HIVE-11193 Project: Hive Issue Type: Bug Components: Logical Optimizer Reporter: Wei Zheng Assignee: Wei Zheng Attachments: HIVE-11193.01.patch, HIVE-11193.02.patch, HIVE-11193.03.patch During Constant Propagation optimization, sometimes a node ends up being added to opToDelete list more than once. Later in ConstantPropagate transform, we try to delete that operator multiple times, which will cause SemanticException since the node has already been removed in an earlier pass. The data structure for storing opToDelete is List. We should use Set to avoid the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11165) Calcite planner might have a thread-safety issue compiling in parallel
[ https://issues.apache.org/jira/browse/HIVE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621431#comment-14621431 ] Laljo John Pullokkaran commented on HIVE-11165: --- [~gopalv] How did we get to the conclusion that Calcite is not thread safe? Calcite planner might have a thread-safety issue compiling in parallel -- Key: HIVE-11165 URL: https://issues.apache.org/jira/browse/HIVE-11165 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 2.0.0 Reporter: Gopal V Assignee: Laljo John Pullokkaran Attachments: RunJar-2015-06-30.snapshot After about 6 minutes trying to plan a query, the HiveServer2 was killed to restore functionality to a test run. The HEP planner is stuck on a TopologicalOrder traversal and there were no queries being fed into the HiveServer2 after it got stuck. TPC-DS query13 was the query in question, at 4 way parallel, which triggered the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11191) Beeline-cli: support hive.cli.errors.ignore in new CLI
[ https://issues.apache.org/jira/browse/HIVE-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621434#comment-14621434 ] Xuefu Zhang commented on HIVE-11191: +1 Beeline-cli: support hive.cli.errors.ignore in new CLI -- Key: HIVE-11191 URL: https://issues.apache.org/jira/browse/HIVE-11191 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11191.1-beeline-cli.patch, HIVE-11191.2-beeline-cli.patch The old CLI uses hive.cli.errors.ignore from the Hive configuration to force execution of a script when errors occur. Beeline has a similar option called force. We need to support the previous configuration using Beeline functionality. More details about the force option are available at https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance
[ https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-11131: --- Attachment: HIVE-11131.4.patch Get row information on DataWritableWriter once for better writing performance - Key: HIVE-11131 URL: https://issues.apache.org/jira/browse/HIVE-11131 Project: Hive Issue Type: Sub-task Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch, HIVE-11131.4.patch DataWritableWriter is a class used to write Hive records to Parquet files. This class is getting all the information about how to parse a record, such as schema and object inspector, every time a record is written (or write() is called). We can make this class perform better by initializing some writers per data type once, and saving all object inspectors on each writer. The class expects that the next records written will have the same object inspectors and schema, so there is no need to have conditions for that. When a new schema is written, DataWritableWriter is created again by Parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
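The optimization described above amounts to caching one writer per data type, so the expensive schema/object-inspector lookup happens once per schema rather than once per record. A minimal Java sketch of that caching pattern; `ValueWriter` and `render` are hypothetical names, not the actual DataWritableWriter API:

```java
import java.util.HashMap;
import java.util.Map;

public class CachedWriterDemo {
    interface ValueWriter { String write(Object v); }

    // Writes each record, creating (and counting) one writer per type.
    // Returns "rendered output | number of type inspections".
    public static String render(Object[] records) {
        Map<Class<?>, ValueWriter> cache = new HashMap<>();
        int[] inspections = {0};
        StringBuilder out = new StringBuilder();
        for (Object r : records) {
            // computeIfAbsent: the expensive type inspection runs once per type
            ValueWriter w = cache.computeIfAbsent(r.getClass(), type -> {
                inspections[0]++;
                if (type == Integer.class) {
                    return v -> "int:" + v;
                }
                return v -> "str:" + v;
            });
            out.append(w.write(r)).append(' ');
        }
        return out.toString().trim() + " | " + inspections[0];
    }

    public static void main(String[] args) {
        // Three records of one type -> one inspection, not three.
        System.out.println(render(new Object[]{1, 2, 3})); // int:1 int:2 int:3 | 1
    }
}
```

The description's assumption holds here too: the cache is only valid while the schema is stable, which matches Parquet recreating the writer when the schema changes.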
[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance
[ https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-11131: --- Attachment: (was: HIVE-11131.4.patch) Get row information on DataWritableWriter once for better writing performance - Key: HIVE-11131 URL: https://issues.apache.org/jira/browse/HIVE-11131 Project: Hive Issue Type: Sub-task Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch, HIVE-11131.4.patch DataWritableWriter is a class used to write Hive records to Parquet files. This class is getting all the information about how to parse a record, such as schema and object inspector, every time a record is written (or write() is called). We can make this class perform better by initializing some writers per data type once, and saving all object inspectors on each writer. The class expects that the next records written will have the same object inspectors and schema, so there is no need to have conditions for that. When a new schema is written, DataWritableWriter is created again by Parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-4239) Remove lock on compilation stage
[ https://issues.apache.org/jira/browse/HIVE-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-4239: - Labels: TODOC2.0 (was: ) Remove lock on compilation stage Key: HIVE-4239 URL: https://issues.apache.org/jira/browse/HIVE-4239 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Reporter: Carl Steinbach Assignee: Sergey Shelukhin Labels: TODOC2.0 Fix For: 2.0.0 Attachments: HIVE-4239.01.patch, HIVE-4239.02.patch, HIVE-4239.03.patch, HIVE-4239.04.patch, HIVE-4239.05.patch, HIVE-4239.06.patch, HIVE-4239.07.patch, HIVE-4239.08.patch, HIVE-4239.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-11145) Remove OFFLINE and NO_DROP from tables and partitions
[ https://issues.apache.org/jira/browse/HIVE-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621335#comment-14621335 ] Ashutosh Chauhan edited comment on HIVE-11145 at 7/9/15 9:44 PM: - Patch LGTM +1 Is there another mode called read only? We also need to update wiki to reflect that this feature is now gone and can be substituted with sql std auth cc: [~leftylev] was (Author: ashutoshc): Patch LGTM +1 Is there another mode called read only? We also need to update wiki to reflect that this feature is now gone and can be substituted with sql std auth Remove OFFLINE and NO_DROP from tables and partitions - Key: HIVE-11145 URL: https://issues.apache.org/jira/browse/HIVE-11145 Project: Hive Issue Type: Improvement Components: Metastore, SQL Affects Versions: 2.0.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-11145.2.patch, HIVE-11145.3.patch, HIVE-11145.patch Currently a table or partition can be marked no_drop or offline. This prevents users from dropping or reading (and dropping) the table or partition. This was built in 0.7 before SQL standard authorization was an option. This is an expensive feature as when a table is dropped every partition must be fetched and checked to make sure it can be dropped. This feature is also redundant now that real authorization is available in Hive. This feature should be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11214) Insert into ACID table switches vectorization off
[ https://issues.apache.org/jira/browse/HIVE-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11214: Attachment: HIVE-11214.02.patch Insert into ACID table switches vectorization off -- Key: HIVE-11214 URL: https://issues.apache.org/jira/browse/HIVE-11214 Project: Hive Issue Type: Bug Components: Hive Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-11214.01.patch, HIVE-11214.02.patch PROBLEM: vectorization is switched off automatically after running an insert into an ACID table. STEPS TO REPRODUCE:
set hive.vectorized.execution.enabled=true;
create table testv (id int, name string) clustered by (id) into 2 buckets stored as orc tblproperties('transactional'='true');
insert into testv values(1,'a');
set hive.vectorized.execution.enabled;
false
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11216) UDF GenericUDFMapKeys throws NPE when a null map value is passed in
[ https://issues.apache.org/jira/browse/HIVE-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yibing Shi updated HIVE-11216: -- Attachment: HIVE-11216.1.patch Attach a new patch. UDF GenericUDFMapKeys throws NPE when a null map value is passed in --- Key: HIVE-11216 URL: https://issues.apache.org/jira/browse/HIVE-11216 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.2.0 Reporter: Yibing Shi Assignee: Yibing Shi Attachments: HIVE-11216.1.patch, HIVE-11216.patch We can reproduce the problem as below: {noformat}
hive> show create table map_txt;
OK
CREATE TABLE `map_txt`(
  `id` int,
  `content` map<int,string>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
...
Time taken: 0.233 seconds, Fetched: 18 row(s)
hive> select * from map_txt;
OK
1 NULL
Time taken: 0.679 seconds, Fetched: 1 row(s)
hive> select id, map_keys(content) from map_txt;
Error during job, obtaining debugging information...
Examining task ID: task_1435534231122_0025_m_00 (and more) from job job_1435534231122_0025
Task with the most failures(4):
- Task ID: task_1435534231122_0025_m_00
URL: http://host-10-17-80-40.coe.cloudera.com:8088/taskdetails.jsp?jobid=job_1435534231122_0025&tipid=task_1435534231122_0025_m_00
- Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {id:1,content:null}
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:198)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {id:1,content:null}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:559)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:180)
  ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating map_keys(content)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
  ... 9 more
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64)
  at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
  at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
  at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
  ... 13 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
hive> {noformat} The error is as below (in mappers): {noformat}
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapKeys.evaluate(GenericUDFMapKeys.java:64)
  at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
  at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
  at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
  at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:778)
  ... 17 more
{noformat} Looking at the source code:
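The NPE above comes from dereferencing a NULL map value inside evaluate. A hedged Java sketch of the guard such a patch would need, written against plain collections rather than the actual GenericUDF/ObjectInspector API (SQL semantics suggest NULL in, NULL out):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NullSafeMapKeys {
    // Sketch of the missing guard: a NULL map column should yield NULL
    // (or an empty list), not an NPE on map.keySet().
    public static List<Integer> mapKeys(Map<Integer, String> map) {
        if (map == null) {
            return null;                 // SQL semantics: NULL in, NULL out
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(mapKeys(null));            // null, no exception
        System.out.println(mapKeys(Map.of(1, "a")));  // [1]
    }
}
```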
[jira] [Commented] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance
[ https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621306#comment-14621306 ] Hive QA commented on HIVE-11131: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744508/HIVE-11131.4.patch {color:green}SUCCESS:{color} +1 9146 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4556/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4556/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4556/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12744508 - PreCommit-HIVE-TRUNK-Build Get row information on DataWritableWriter once for better writing performance - Key: HIVE-11131 URL: https://issues.apache.org/jira/browse/HIVE-11131 Project: Hive Issue Type: Sub-task Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch, HIVE-11131.4.patch DataWritableWriter is a class used to write Hive records to Parquet files. This class is getting all the information about how to parse a record, such as schema and object inspector, every time a record is written (or write() is called). We can make this class perform better by initializing some writers per data type once, and saving all object inspectors on each writer. The class expects that the next records written will have the same object inspectors and schema, so there is no need to have conditions for that. When a new schema is written, DataWritableWriter is created again by Parquet. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
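The per-type writer caching idea described in HIVE-11131 can be sketched as follows. This is a minimal illustration, not the actual DataWritableWriter code: the class, interface, and type names below are hypothetical stand-ins for the real schema/ObjectInspector machinery.

```java
// Sketch of the HIVE-11131 idea: resolve a value writer once per schema
// field at construction time, instead of re-inspecting the schema on
// every write() call. All names here are illustrative.
public class CachedWriterSketch {
    interface ValueWriter { String write(Object value); }

    private final ValueWriter[] fieldWriters;

    // typeNames stands in for the schema/ObjectInspector pair the real
    // DataWritableWriter receives from Parquet.
    public CachedWriterSketch(String[] typeNames) {
        fieldWriters = new ValueWriter[typeNames.length];
        for (int i = 0; i < typeNames.length; i++) {
            // Done once per schema, not once per record.
            fieldWriters[i] = resolveWriter(typeNames[i]);
        }
    }

    private static ValueWriter resolveWriter(String typeName) {
        switch (typeName) {
            case "int":    return v -> "int:" + v;
            case "string": return v -> "str:" + v;
            default:       return v -> "raw:" + v;
        }
    }

    // The hot path just dispatches through the prebuilt writers.
    public String writeRecord(Object[] record) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < record.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(fieldWriters[i].write(record[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CachedWriterSketch w = new CachedWriterSketch(new String[]{"int", "string"});
        System.out.println(w.writeRecord(new Object[]{1, "a"}));
    }
}
```

The point of the pattern is that the type dispatch (the switch) happens once per schema rather than once per record, which is safe precisely because, as the issue notes, Parquet recreates the writer whenever the schema changes.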
[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE
[ https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621307#comment-14621307 ] Vikram Dixit K commented on HIVE-11221:
---
+1 LGTM.

In Tez mode, alter table concatenate orc files can intermittently fail with NPE
---
Key: HIVE-11221
URL: https://issues.apache.org/jira/browse/HIVE-11221
Project: Hive
Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Attachments: HIVE-11221.1.patch

We are not waiting for input ready events, which can trigger an occasional NPE if the input is not actually ready. Stacktrace:
{code}
java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:478)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:471)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:648)
	at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:146)
	at org.apache.tez.mapreduce.lib.MRReaderMapred.init(MRReaderMapred.java:73)
	at org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:483)
	at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:108)
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.getMRInput(MergeFileRecordProcessor.java:220)
	at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.init(MergeFileRecordProcessor.java:72)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
	... 13 more
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
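The race described in HIVE-11221 is that the processor dereferences its input before the framework has signalled the input is ready. The fix direction (block on an input-ready event before touching the input) can be illustrated with a minimal latch-based sketch; the class and method names here are hypothetical, not the actual Tez API.

```java
import java.util.concurrent.CountDownLatch;

// Minimal sketch of the HIVE-11221 race, assuming hypothetical names:
// a processor must not touch its input until an event-delivery thread
// marks it ready, or it risks the NullPointerException in the trace.
public class InputReadySketch {
    private final CountDownLatch inputReady = new CountDownLatch(1);
    private volatile String input; // stands in for the real MRInputLegacy

    // Called by the framework's event-delivery thread once the input
    // has been initialized.
    public void onInputReady() {
        input = "mr-input";
        inputReady.countDown();
    }

    // Processor run(): block until the input exists instead of racing
    // ahead and dereferencing null.
    public String run() throws InterruptedException {
        inputReady.await();
        return input; // safe: non-null once the latch is released
    }

    public static void main(String[] args) throws Exception {
        InputReadySketch p = new InputReadySketch();
        new Thread(p::onInputReady).start();
        System.out.println(p.run());
    }
}
```

Without the await(), a fast processor thread could read `input` while it is still null, which is exactly the intermittent character of the reported failure.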
[jira] [Commented] (HIVE-4239) Remove lock on compilation stage
[ https://issues.apache.org/jira/browse/HIVE-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621327#comment-14621327 ] Lefty Leverenz commented on HIVE-4239:
---
Doc note: This adds configuration parameter *hive.driver.parallel.compilation* to HiveConf.java, so it will need to be documented in the wiki for release 2.0. It belongs in the HiveServer2 section of Configuration Properties:
* [Configuration Properties -- HiveServer2 | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]

But shouldn't it be named *hive.server2.driver.parallel.compilation* to match the other HS2 parameters? (Sorry I didn't notice that earlier.)

And a nit: if you change the parameter name in a new jira, please start the parameter description on a new line so it will look better in the generated template file.

Remove lock on compilation stage
---
Key: HIVE-4239
URL: https://issues.apache.org/jira/browse/HIVE-4239
Project: Hive
Issue Type: Bug
Components: HiveServer2, Query Processor
Reporter: Carl Steinbach
Assignee: Sergey Shelukhin
Labels: TODOC2.0
Fix For: 2.0.0
Attachments: HIVE-4239.01.patch, HIVE-4239.02.patch, HIVE-4239.03.patch, HIVE-4239.04.patch, HIVE-4239.05.patch, HIVE-4239.06.patch, HIVE-4239.07.patch, HIVE-4239.08.patch, HIVE-4239.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11145) Remove OFFLINE and NO_DROP from tables and partitions
[ https://issues.apache.org/jira/browse/HIVE-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621335#comment-14621335 ] Ashutosh Chauhan commented on HIVE-11145:
---
Patch LGTM, +1. Is there another mode called read-only? We also need to update the wiki to reflect that this feature is now gone and can be substituted with SQL standard auth.

Remove OFFLINE and NO_DROP from tables and partitions
---
Key: HIVE-11145
URL: https://issues.apache.org/jira/browse/HIVE-11145
Project: Hive
Issue Type: Improvement
Components: Metastore, SQL
Affects Versions: 2.0.0
Reporter: Alan Gates
Assignee: Alan Gates
Attachments: HIVE-11145.2.patch, HIVE-11145.3.patch, HIVE-11145.patch

Currently a table or partition can be marked no_drop or offline. This prevents users from dropping (no_drop) or reading and dropping (offline) the table or partition. This was built in Hive 0.7, before SQL standard authorization was an option. It is an expensive feature: when a table is dropped, every partition must be fetched and checked to make sure it can be dropped. The feature is also redundant now that real authorization is available in Hive, so it should be removed.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)