[jira] [Commented] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability
[ https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027116#comment-15027116 ]

Sergio Peña commented on HIVE-12469:
------------------------------------

Changes look good guys. +1

[~ashutoshc] Could we commit this to branch-1 so that we have this vulnerability fix available soon? Should we upload another file to test the fix on branch-1?

> Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address
> vulnerability
> -------------------------------------------------------------------
>
>                 Key: HIVE-12469
>                 URL: https://issues.apache.org/jira/browse/HIVE-12469
>             Project: Hive
>          Issue Type: Bug
>          Components: Build Infrastructure
>            Reporter: Reuben Kuhnert
>            Assignee: Reuben Kuhnert
>            Priority: Blocker
>         Attachments: HIVE-12469.2.patch, HIVE-12469.patch
>
>
> Currently the commons-collections (3.2.1) library allows for invocation of
> arbitrary code through {{InvokerTransformer}}; we need to bump the version of
> commons-collections from 3.2.1 to 3.2.2 to resolve this issue.
> Results of {{mvn dependency:tree}}:
> {code}
> [INFO] ------------------------------------------------------------------------
> [INFO] Building Hive HPL/SQL 2.0.0-SNAPSHOT
> [INFO] ------------------------------------------------------------------------
> [INFO]
> [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-hplsql ---
> [INFO] org.apache.hive:hive-hplsql:jar:2.0.0-SNAPSHOT
> [INFO] +- com.google.guava:guava:jar:14.0.1:compile
> [INFO] +- commons-collections:commons-collections:jar:3.2.1:compile
> {code}
> {code}
> [INFO] ------------------------------------------------------------------------
> [INFO] Building Hive Packaging 2.0.0-SNAPSHOT
> [INFO] ------------------------------------------------------------------------
> [INFO] +- org.apache.hive:hive-hbase-handler:jar:2.0.0-SNAPSHOT:compile
> [INFO] |  +- org.apache.hbase:hbase-server:jar:1.1.1:compile
> [INFO] |  |  +- commons-collections:commons-collections:jar:3.2.1:compile
> {code}
> {code}
> [INFO] ------------------------------------------------------------------------
> [INFO] Building Hive Common 2.0.0-SNAPSHOT
> [INFO] ------------------------------------------------------------------------
> [INFO]
> [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-common ---
> [INFO] +- org.apache.hadoop:hadoop-common:jar:2.6.0:compile
> [INFO] |  +- commons-collections:commons-collections:jar:3.2.1:compile
> {code}
> {{Hadoop-Common}} dependency also found in: LLAP, Serde, Storage, Shims,
> Shims Common, Shims Scheduler
> {code}
> [INFO] ------------------------------------------------------------------------
> [INFO] Building Hive Ant Utilities 2.0.0-SNAPSHOT
> [INFO] ------------------------------------------------------------------------
> [INFO]
> [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-ant ---
> [INFO] |  +- commons-collections:commons-collections:jar:3.1:compile
> {code}
> {code}
> [INFO] ------------------------------------------------------------------------
> [INFO] ------------------------------------------------------------------------
> [INFO] Building Hive Accumulo Handler 2.0.0-SNAPSHOT
> [INFO] ------------------------------------------------------------------------
> [INFO] +- org.apache.accumulo:accumulo-core:jar:1.6.0:compile
> [INFO] |  +- commons-collections:commons-collections:jar:3.2.1:compile
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
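The remediation discussed in this thread is a version bump across all the modules that pull commons-collections in transitively. A hedged sketch of how such a pin can be expressed in a Maven pom (illustrative fragment only; Hive's actual pom may manage the version through its own dependency properties):

```xml
<!-- Illustrative pom.xml fragment: pin commons-collections to the patched
     3.2.2 release. Declaring it under dependencyManagement also overrides
     the 3.2.1 (and 3.1) copies pulled in transitively by hadoop-common,
     hbase-server, accumulo-core, etc. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>commons-collections</groupId>
      <artifactId>commons-collections</artifactId>
      <version>3.2.2</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

After rebuilding, rerunning `mvn dependency:tree -Dincludes=commons-collections` can confirm that only 3.2.2 remains in the tree.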
[jira] [Updated] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability
[ https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergio Peña updated HIVE-12469:
-------------------------------
    Target Version/s: 2.0.0, 1.2.2

> Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address
> vulnerability
> -------------------------------------------------------------------
>
>                 Key: HIVE-12469
>                 URL: https://issues.apache.org/jira/browse/HIVE-12469
>     Affects Versions: 1.2.1
>            Priority: Blocker
[jira] [Commented] (HIVE-12517) Beeline's use of failed connection(s) causes failures and leaks.
[ https://issues.apache.org/jira/browse/HIVE-12517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027202#comment-15027202 ]

Naveen Gangam commented on HIVE-12517:
--------------------------------------

Review is posted to RB at https://reviews.apache.org/r/40694/

> Beeline's use of failed connection(s) causes failures and leaks.
> -----------------------------------------------------------------
>
>                 Key: HIVE-12517
>                 URL: https://issues.apache.org/jira/browse/HIVE-12517
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12517.patch
>
>
> Beeline adds a bad connection(s) to the connection list and makes it the
> current connection, so any subsequent queries will attempt to use this bad
> connection and will fail. Even a "!close" would not work.
> 1) All queries fail unless !go is used.
> 2) !closeall cannot close the active connections either.
> 3) !exit will exit while attempting to establish these inactive connections
> without closing the active connections, so this could hold up server-side
> resources.
> {code}
> beeline> !connect jdbc:hive2://localhost:1 hive1 hive1
> scan complete in 8ms
> Connecting to jdbc:hive2://localhost:1
> Connected to: Apache Hive (version 2.0.0-SNAPSHOT)
> Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:1 hive1 hive1
> Connecting to jdbc:hive2://localhost:1
> Connected to: Apache Hive (version 2.0.0-SNAPSHOT)
> Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 1: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:1 hive1 hive1
> Connecting to jdbc:hive2://localhost:1
> Connected to: Apache Hive (version 2.0.0-SNAPSHOT)
> Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 2: jdbc:hive2://localhost:1> !tables
> +------------+--------------+---------------------+-------------+----------+
> | TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME          | TABLE_TYPE  | REMARKS  |
> +------------+--------------+---------------------+-------------+----------+
> |            | default      | char_nested_1       | TABLE       | NULL     |
> |            | default      | src                 | TABLE       | NULL     |
> |            | default      | char_nested_struct  | TABLE       | NULL     |
> |            | default      | src_thrift          | TABLE       | NULL     |
> |            | default      | x                   | TABLE       | NULL     |
> +------------+--------------+---------------------+-------------+----------+
> 2: jdbc:hive2://localhost:1> !list
> 3 active connections:
>  #0  open     jdbc:hive2://localhost:1
>  #1  open     jdbc:hive2://localhost:1
>  #2  open     jdbc:hive2://localhost:1
> 2: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:11000 hive1 hive1
> Connecting to jdbc:hive2://localhost:11000
> Error: Could not open client transport with JDBC Uri:
> jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused
> (state=08S01,code=0)
> 3: jdbc:hive2://localhost:11000 (closed)> !tables
> Error: Could not open client transport with JDBC Uri:
> jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused
> (state=08S01,code=0)
> 3: jdbc:hive2://localhost:11000 (closed)> !list
> 4 active connections:
>  #0  open     jdbc:hive2://localhost:1
>  #1  open     jdbc:hive2://localhost:1
>  #2  open     jdbc:hive2://localhost:1
>  #3  closed   jdbc:hive2://localhost:11000
> 3: jdbc:hive2://localhost:11000 (closed)> !close
> Error: Could not open client transport with JDBC Uri:
> jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused
> (state=08S01,code=0)
> 3: jdbc:hive2://localhost:11000 (closed)> !closeall
> Error: Could not open client transport with JDBC Uri:
> jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused
> (state=08S01,code=0)
> 4: jdbc:hive2://localhost:11000 (closed)> !exit
> Error: Could not open client transport with JDBC Uri:
> jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused
> (state=08S01,code=0)
> Error: Could not open client transport with JDBC Uri:
> jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused
> (state=08S01,code=0)
> {code}
> The workaround is to use !go to set the current connection to a "good"
> connection.
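The pattern the report describes is a client that registers a connection (and makes it "current") before knowing whether the open succeeded, so every later command retries the dead URL. A minimal, language-agnostic sketch of the "register only on success" fix, written in Python for brevity (the `ConnectionRegistry` and helper names are illustrative, not Beeline's actual classes):

```python
class ConnectionRegistry:
    """Track Beeline-style connections, registering only successful opens.

    Sketch of the fix pattern: a connection that failed to open must never
    be added to the list or become "current", so later commands such as
    !tables, !close, or !exit never retry a dead URL.
    """

    def __init__(self):
        self.connections = []   # only connections that actually opened
        self.current = None

    def connect(self, open_fn, url):
        try:
            conn = open_fn(url)
        except ConnectionError:
            # Failed open: report the error, register nothing,
            # and leave the current connection unchanged.
            return None
        self.connections.append(conn)
        self.current = conn
        return conn

    def close_all(self):
        # Close only connections that actually opened; no reconnect attempts.
        for conn in self.connections:
            conn.close()
        self.connections.clear()
        self.current = None


class FakeConnection:
    """Stand-in for a JDBC connection, used to exercise the sketch."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def open_ok(url):
    return FakeConnection()

def open_refused(url):
    raise ConnectionError("Connection refused: " + url)
```

With this shape, the failing !connect to port 11000 in the transcript above would leave the registry still pointing at connection #2, and !closeall would close only the live connections.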
[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15026922#comment-15026922 ]

Xuefu Zhang commented on HIVE-12483:
------------------------------------

[~spena], thanks for working on this. The following failures are expected as they happen on master as well:
{code}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby
{code}
The following are expected and passing in my local box.
{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
{code}

> Fix precommit Spark test branch
> -------------------------------
>
>                 Key: HIVE-12483
>                 URL: https://issues.apache.org/jira/browse/HIVE-12483
>             Project: Hive
>          Issue Type: Task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
[jira] [Updated] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability
[ https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergio Peña updated HIVE-12469:
-------------------------------
    Affects Version/s: 1.2.1

> Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address
> vulnerability
> -------------------------------------------------------------------
>
>                 Key: HIVE-12469
>                 URL: https://issues.apache.org/jira/browse/HIVE-12469
[jira] [Commented] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability
[ https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027105#comment-15027105 ]

Reuben Kuhnert commented on HIVE-12469:
---------------------------------------

LGTM (Non-committer) +1

> Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address
> vulnerability
> -------------------------------------------------------------------
>
>                 Key: HIVE-12469
>                 URL: https://issues.apache.org/jira/browse/HIVE-12469
[jira] [Commented] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability
[ https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027146#comment-15027146 ]

Ashutosh Chauhan commented on HIVE-12469:
-----------------------------------------

[~spena] How do I trigger the QA run for branch-1?

> Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address
> vulnerability
> -------------------------------------------------------------------
>
>                 Key: HIVE-12469
>                 URL: https://issues.apache.org/jira/browse/HIVE-12469
[jira] [Updated] (HIVE-12338) Add webui to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HIVE-12338:
-------------------------------
    Attachment: HIVE-12338.4.patch

Thanks a lot for the review. Attached v4, which disables the web UI during testing to fix those unit tests.

> Add webui to HiveServer2
> ------------------------
>
>                 Key: HIVE-12338
>                 URL: https://issues.apache.org/jira/browse/HIVE-12338
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>         Attachments: HIVE-12338.1.patch, HIVE-12338.2.patch, HIVE-12338.3.patch, HIVE-12338.4.patch, hs2-conf.png, hs2-logs.png, hs2-metrics.png, hs2-webui.png
>
>
> A web UI for HiveServer2 can show some useful information, such as:
> 1. Sessions
> 2. Queries that are executing on the HS2, their states, starting time, etc.
[jira] [Commented] (HIVE-11358) LLAP: move LlapConfiguration into HiveConf and document the settings
[ https://issues.apache.org/jira/browse/HIVE-11358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027218#comment-15027218 ]

Hive QA commented on HIVE-11358:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773911/HIVE-11358.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 9825 tests executed

*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_nonascii
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_join_transpose
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_fetchwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_mapwork_table
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_fetchwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6124/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6124/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6124/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773911 - PreCommit-HIVE-TRUNK-Build

> LLAP: move LlapConfiguration into HiveConf and document the settings
> --------------------------------------------------------------------
>
>                 Key: HIVE-11358
>                 URL: https://issues.apache.org/jira/browse/HIVE-11358
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-11358.01.patch, HIVE-11358.02.patch, HIVE-11358.03.patch, HIVE-11358.patch
>
>
> Hive uses HiveConf for configuration. LlapConfiguration should be replaced
> with parameters in HiveConf.
[jira] [Updated] (HIVE-12511) IN clause performs differently then = clause
[ https://issues.apache.org/jira/browse/HIVE-12511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HIVE-12511:
-------------------------------
    Attachment: HIVE-12511.1.patch

> IN clause performs differently then = clause
> --------------------------------------------
>
>                 Key: HIVE-12511
>                 URL: https://issues.apache.org/jira/browse/HIVE-12511
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>         Attachments: HIVE-12511.1.patch
>
>
> Similar to HIVE-11973, the IN clause performs differently than the = clause for "int"
> type with string values.
> For example,
> {noformat}
> SELECT * FROM inttest WHERE iValue IN ('01');
> {noformat}
> will not return any rows with int iValue = 1.
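The asymmetry described above comes down to how the two operators coerce mixed int/string operands: an equality comparison is typically evaluated after promoting both sides to a common numeric type, while the buggy IN evaluation compares the values under a different interpretation, so the leading zero in `'01'` prevents a match with int `1`. A small Python sketch of the two coercion strategies (purely illustrative; this is not Hive's actual expression-evaluation code):

```python
def eq_with_numeric_coercion(col_value, literal):
    """Mimic `iValue = '01'`: both sides promoted to a common numeric type."""
    return float(col_value) == float(literal)

def in_with_string_comparison(col_value, literals):
    """Mimic the reported `iValue IN ('01')` behavior: values compared as
    strings, so a leading zero makes otherwise-equal values differ."""
    return str(col_value) in [str(x) for x in literals]

# int 1 vs string '01':
eq_with_numeric_coercion(1, "01")      # numeric coercion: 1.0 == 1.0 -> True
in_with_string_comparison(1, ["01"])   # string comparison: "1" != "01" -> False
```

The fix direction implied by the issue (and HIVE-11973) is to make IN apply the same type coercion as the = operator.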
[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15026930#comment-15026930 ]

Sergio Peña commented on HIVE-12483:
------------------------------------

With "expected and passing in my local box", do you mean that those errors are expected to fail in ptest, or expected to pass?

> Fix precommit Spark test branch
> -------------------------------
>
>                 Key: HIVE-12483
>                 URL: https://issues.apache.org/jira/browse/HIVE-12483
>             Project: Hive
>          Issue Type: Task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
[jira] [Comment Edited] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15026922#comment-15026922 ]

Xuefu Zhang edited comment on HIVE-12483 at 11/25/15 4:47 PM:
--------------------------------------------------------------

[~spena], thanks for working on this. The following failures are expected as they happen on master as well:
{code}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby
{code}
The following are expected to pass and are passing in my local box.
{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
{code}

was (Author: xuefuz):
[~spena], thanks for working on this. The following failures are expected as they happen on master as well:
{code}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby
{code}
The following are expected and passing in my local box.
{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
{code}

> Fix precommit Spark test branch
> -------------------------------
>
>                 Key: HIVE-12483
>                 URL: https://issues.apache.org/jira/browse/HIVE-12483
>             Project: Hive
>          Issue Type: Task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
[jira] [Commented] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability
[ https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027156#comment-15027156 ]

Sergio Peña commented on HIVE-12469:
------------------------------------

Just upload another patch with the filename "HIVE-12469.2-branch1.patch".

> Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address
> vulnerability
> -------------------------------------------------------------------
>
>                 Key: HIVE-12469
>                 URL: https://issues.apache.org/jira/browse/HIVE-12469
[jira] [Commented] (HIVE-12476) Metastore NPE on Oracle with Direct SQL
[ https://issues.apache.org/jira/browse/HIVE-12476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027741#comment-15027741 ]

Jason Dere commented on HIVE-12476:
-----------------------------------

All of the precommit test failures are previously failing tests.

> Metastore NPE on Oracle with Direct SQL
> ---------------------------------------
>
>                 Key: HIVE-12476
>                 URL: https://issues.apache.org/jira/browse/HIVE-12476
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-12476.1.patch, HIVE-12476.2.patch
>
>
> Stack trace looks very similar to HIVE-8485. I believe the metastore's Direct
> SQL mode requires additional fixes similar to HIVE-8485, around the
> Partition/StorageDescriptor SerDe parameters.
> {noformat}
> 2015-11-19 18:08:33,841 ERROR [pool-5-thread-2]: server.TThreadPoolServer (TThreadPoolServer.java:run(296)) - Error occurred during processing of message.
> java.lang.NullPointerException
>         at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:200)
>         at org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:579)
>         at org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:501)
>         at org.apache.hadoop.hive.metastore.api.SerDeInfo.write(SerDeInfo.java:439)
>         at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1490)
>         at org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1288)
>         at org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1154)
>         at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1072)
>         at org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:929)
>         at org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:825)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java:64470)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java:64402)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.write(ThriftHiveMetastore.java:64340)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:681)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:676)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:676)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
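The crash pattern above (Thrift's `TBinaryProtocol.writeString` hitting an NPE) is what happens when a map that the serializer expects to contain only non-null strings, such as SerDe parameters read back via Direct SQL, carries a null value; Oracle in particular stores empty strings as NULL. A minimal sketch of the defensive normalization such a fix typically applies before serialization (illustrative only; the function name is not Hive's):

```python
def sanitize_parameters(params):
    """Replace None values with empty strings so that a writer which
    requires non-null strings (like Thrift's writeString) cannot crash.

    Databases such as Oracle store empty strings as NULL, so parameter
    values read back through direct SQL may unexpectedly be None.
    """
    if params is None:
        return {}
    return {k: ("" if v is None else v)
            for k, v in params.items()
            if k is not None}

# A SerDe parameter map as it might come back from Oracle:
raw = {"serialization.format": None, "field.delim": ","}
sanitize_parameters(raw)   # {'serialization.format': '', 'field.delim': ','}
```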
[jira] [Commented] (HIVE-12512) Include driver logs in execution-level Operation logs
[ https://issues.apache.org/jira/browse/HIVE-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027751#comment-15027751 ] Xuefu Zhang commented on HIVE-12512: +1 on test. > Include driver logs in execution-level Operation logs > - > > Key: HIVE-12512 > URL: https://issues.apache.org/jira/browse/HIVE-12512 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Minor > Attachments: HIVE-12512.patch > > > When {{hive.server2.logging.operation.level}} is set to {{EXECUTION}} > (default), operation logs do not include Driver logs, which contain useful > info like total number of jobs launched, stage getting executed, etc. that > help track high-level progress. It only adds a few more lines to the output. > {code} > 15/11/24 14:09:12 INFO ql.Driver: Semantic Analysis Completed > 15/11/24 14:09:12 INFO ql.Driver: Starting > command(queryId=hive_20151124140909_e8cbb9bd-bac0-40b2-83d0-382de25b80d1): > select count(*) from sample_08 > 15/11/24 14:09:12 INFO ql.Driver: Query ID = > hive_20151124140909_e8cbb9bd-bac0-40b2-83d0-382de25b80d1 > 15/11/24 14:09:12 INFO ql.Driver: Total jobs = 1 > ... > 15/11/24 14:09:40 INFO ql.Driver: MapReduce Jobs Launched: > 15/11/24 14:09:40 INFO ql.Driver: Stage-Stage-1: Map: 1 Reduce: 1 > Cumulative CPU: 3.58 sec HDFS Read: 52956 HDFS Write: 4 SUCCESS > 15/11/24 14:09:40 INFO ql.Driver: Total MapReduce CPU Time Spent: 3 seconds > 580 msec > 15/11/24 14:09:40 INFO ql.Driver: OK > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9970) Hive on spark
[ https://issues.apache.org/jira/browse/HIVE-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027785#comment-15027785 ] Xuefu Zhang commented on HIVE-9970: --- [~tarushg], the error seems different from the original issue but very strange: {code} Caused by: java.io.IOException: Cannot run program "/home/adt/server/spark1.5/spark-1.5.1set mapreduce.input.fileinputformat.split.maxsize=75000set hive.vectorized.execution.enabled=trueset hive.cbo.enable=trueset hive.optimize.reducededuplication.min.reducer=4set hive.optimize.reducededuplication=trueset hive.orc.splits.include.file.footer=falseset hive.merge.mapfiles=trueset hive.merge.sparkfiles=falseset hive.merge.smallfiles.avgsize=1600set hive.merge.size.per.task=25600set hive.merge.orcfile.stripe.level=trueset hive.auto.convert.join=trueset hive.auto.convert.join.noconditionaltask=trueset hive.auto.convert.join.noconditionaltask.size=894435328set hive.optimize.bucketmapjoin.sortedmerge=falseset hive.map.aggr.hash.percentmemory=0.5set hive.map.aggr=trueset hive.optimize.sort.dynamic.partition=falseset hive.stats.autogather=trueset hive.stats.fetch.column.stats=trueset hive.vectorized.execution.reduce.enabled=falseset hive.vectorized.groupby.checkinterval=4096set hive.vectorized.groupby.flush.percent=0.1set hive.compute.query.using.stats=trueset hive.limit.pushdown.memory.usage=0.4set hive.optimize.index.filter=trueset hive.exec.reducers.bytes.per.reducer=67108864set hive.smbjoin.cache.rows=1set hive.exec.orc.default.stripe.size=67108864set hive.fetch.task.conversion=moreset hive.fetch.task.conversion.threshold=1073741824set hive.fetch.task.aggr=falseset mapreduce.input.fileinputformat.list-status.num-threads=5set spark.kryo.referenceTracking=false#set spark.kryo.classesToRegister=org.apache.hadoop.hive.ql.io.HiveKey,org.apache.hadoop.io.BytesWritable,org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch/bin/spark-submit": error=36, File name too long {code} This is where Hive is 
building a process to launch a remote spark driver. It usually starts with something like "/home/adt/server/spark1.5/bin/spark-submit ...". It seems that the builder gets corrupted with a bunch of set commands. Could you describe how to reproduce this issue? > Hive on spark > - > > Key: HIVE-9970 > URL: https://issues.apache.org/jira/browse/HIVE-9970 > Project: Hive > Issue Type: Bug >Reporter: Amithsha >Assignee: Tarush Grover > > Hi all, > Recently i have configured Spark 1.2.0 and my environment is hadoop > 2.6.0 hive 1.1.0 Here i have tried hive on Spark while executing > insert into i am getting the following g error. > Query ID = hadoop2_20150313162828_8764adad-a8e4-49da-9ef5-35e4ebd6bc63 > Total jobs = 1 > Launching Job 1 out of 1 > In order to change the average load for a reducer (in bytes): > set hive.exec.reducers.bytes.per.reducer= > In order to limit the maximum number of reducers: > set hive.exec.reducers.max= > In order to set a constant number of reducers: > set mapreduce.job.reduces= > Failed to execute spark task, with exception > 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create > spark client.)' > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask > Have added the spark-assembly jar in hive lib > And also in hive console using the command add jar followed by the steps > set spark.home=/opt/spark-1.2.1/; > add jar > /opt/spark-1.2.1/assembly/target/scala-2.10/spark-assembly-1.2.1-hadoop2.4.0.jar; > set hive.execution.engine=spark; > set spark.master=spark://xxx:7077; > set spark.eventLog.enabled=true; > set spark.executor.memory=512m; > set spark.serializer=org.apache.spark.serializer.KryoSerializer; > Can anyone suggest > Thanks & Regards > Amithsha -- This message was sent by Atlassian JIRA (v6.3.4#6332)
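The failure mode described in the comment above — "error=36, File name too long" — happens when a blob of "set ..." configuration text ends up glued onto the executable path instead of being passed as separate arguments. A sketch of how a launch command list keeps the path uncorrupted; the path comes from the pasted error, while the single {{--conf}} shown is an arbitrary example, not Hive's actual launch arguments:

```java
import java.util.ArrayList;
import java.util.List;

// A process command must be built as a list, one argument per element, with
// the spark-submit path kept on its own. If configuration text is concatenated
// into element 0, the OS rejects it as an impossibly long file name.
public class LaunchSketch {
    static List<String> buildCommand(String sparkHome) {
        List<String> cmd = new ArrayList<>();
        cmd.add(sparkHome + "/bin/spark-submit");              // executable path stays separate
        cmd.add("--conf");
        cmd.add("spark.kryo.referenceTracking=false");         // each option is its own element
        return cmd;
    }

    public static void main(String[] args) {
        // ProcessBuilder would receive this list; no option can corrupt the path.
        System.out.println(String.join(" ", buildCommand("/home/adt/server/spark1.5")));
    }
}
```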
[jira] [Commented] (HIVE-12479) Vectorization: Vectorized Date UDFs with up-stream Joins
[ https://issues.apache.org/jira/browse/HIVE-12479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027720#comment-15027720 ] Prasanth Jayachandran commented on HIVE-12479: -- LGTM, +1. Pending tests > Vectorization: Vectorized Date UDFs with up-stream Joins > > > Key: HIVE-12479 > URL: https://issues.apache.org/jira/browse/HIVE-12479 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12479.1.patch, HIVE-12479.tar.gz > > > The row-counts expected with and without vectorization differ. > The attached small-scale repro case produces 5 rows with vectorized multi-key > joins and 53 rows without the vectorized join. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-9970) Hive on spark
[ https://issues.apache.org/jira/browse/HIVE-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9970: -- Comment: was deleted (was: My working Environment is Centos 6.4 Hadoop - 2.6.0 Hive - 1.1.0 Spark - 1.3.0 (Builded spark,sql.yarn) could you brief me about the error.Is that because of using higher end verions (or) my mistakes during building the spark. ) > Hive on spark > - > > Key: HIVE-9970 > URL: https://issues.apache.org/jira/browse/HIVE-9970 > Project: Hive > Issue Type: Bug >Reporter: Amithsha >Assignee: Tarush Grover > > Hi all, > Recently i have configured Spark 1.2.0 and my environment is hadoop > 2.6.0 hive 1.1.0 Here i have tried hive on Spark while executing > insert into i am getting the following g error. > Query ID = hadoop2_20150313162828_8764adad-a8e4-49da-9ef5-35e4ebd6bc63 > Total jobs = 1 > Launching Job 1 out of 1 > In order to change the average load for a reducer (in bytes): > set hive.exec.reducers.bytes.per.reducer= > In order to limit the maximum number of reducers: > set hive.exec.reducers.max= > In order to set a constant number of reducers: > set mapreduce.job.reduces= > Failed to execute spark task, with exception > 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create > spark client.)' > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask > Have added the spark-assembly jar in hive lib > And also in hive console using the command add jar followed by the steps > set spark.home=/opt/spark-1.2.1/; > add jar > /opt/spark-1.2.1/assembly/target/scala-2.10/spark-assembly-1.2.1-hadoop2.4.0.jar; > set hive.execution.engine=spark; > set spark.master=spark://xxx:7077; > set spark.eventLog.enabled=true; > set spark.executor.memory=512m; > set spark.serializer=org.apache.spark.serializer.KryoSerializer; > Can anyone suggest > Thanks & Regards > Amithsha -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12523) display Hive query name in explain plan
[ https://issues.apache.org/jira/browse/HIVE-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12523: Description: Query name is being added by HIVE-12357 NO PRECOMMIT TESTS was:Query name is being added by HIVE-12357 > display Hive query name in explain plan > --- > > Key: HIVE-12523 > URL: https://issues.apache.org/jira/browse/HIVE-12523 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12523.patch > > > Query name is being added by HIVE-12357 > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12523) display Hive query name in explain plan
[ https://issues.apache.org/jira/browse/HIVE-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12523: Attachment: HIVE-12523.patch Patch on top of HIVE-12357 > display Hive query name in explain plan > --- > > Key: HIVE-12523 > URL: https://issues.apache.org/jira/browse/HIVE-12523 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12523.patch > > > Query name is being added by HIVE-12357 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12528) don't start HS2 Tez sessions in a single thread
[ https://issues.apache.org/jira/browse/HIVE-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027768#comment-15027768 ] Sergey Shelukhin commented on HIVE-12528: - [~gopalv] [~sseth] fyi > don't start HS2 Tez sessions in a single thread > --- > > Key: HIVE-12528 > URL: https://issues.apache.org/jira/browse/HIVE-12528 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > Starting sessions in parallel would improve the startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
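The idea in HIVE-12528 — overlapping session launches rather than opening them sequentially — can be sketched with an executor pool. {{openSession}} below is a stand-in for the real Tez session startup call (which is the slow step being parallelized), not an actual HS2 API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Open N sessions concurrently instead of one at a time on a single thread,
// so total startup time is roughly one launch instead of N launches.
public class ParallelStartup {
    static String openSession(int id) {
        try {
            Thread.sleep(50); // simulate a slow session launch
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "session-" + id;
    }

    static List<String> startAll(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            final int id = i;
            futures.add(pool.submit(() -> openSession(id)));
        }
        List<String> started = new ArrayList<>();
        try {
            for (Future<String> f : futures) {
                started.add(f.get()); // launches overlap; wait is ~one sleep interval
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        pool.shutdown();
        return started;
    }

    public static void main(String[] args) {
        System.out.println(startAll(4));
    }
}
```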
[jira] [Issue Comment Deleted] (HIVE-9970) Hive on spark
[ https://issues.apache.org/jira/browse/HIVE-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9970: -- Comment: was deleted (was: my hive version is 1.2.0. and build spark1.3.1 on hadoop with "./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.5". That’s unfortunate,hive on spark still in error state,"Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)' FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask". java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT at org.apache.hive.spark.client.rpc.RpcConfiguration.(RpcConfiguration.java:46) at org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:146) at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480) So,Have these questions had answers?) > Hive on spark > - > > Key: HIVE-9970 > URL: https://issues.apache.org/jira/browse/HIVE-9970 > Project: Hive > Issue Type: Bug >Reporter: Amithsha >Assignee: Tarush Grover > > Hi all, > Recently i have configured Spark 1.2.0 and my environment is hadoop > 2.6.0 hive 1.1.0 Here i have tried hive on Spark while executing > insert into i am getting the following g error. 
> Query ID = hadoop2_20150313162828_8764adad-a8e4-49da-9ef5-35e4ebd6bc63 > Total jobs = 1 > Launching Job 1 out of 1 > In order to change the average load for a reducer (in bytes): > set hive.exec.reducers.bytes.per.reducer= > In order to limit the maximum number of reducers: > set hive.exec.reducers.max= > In order to set a constant number of reducers: > set mapreduce.job.reduces= > Failed to execute spark task, with exception > 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create > spark client.)' > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask > Have added the spark-assembly jar in hive lib > And also in hive console using the command add jar followed by the steps > set spark.home=/opt/spark-1.2.1/; > add jar > /opt/spark-1.2.1/assembly/target/scala-2.10/spark-assembly-1.2.1-hadoop2.4.0.jar; > set hive.execution.engine=spark; > set spark.master=spark://xxx:7077; > set spark.eventLog.enabled=true; > set spark.executor.memory=512m; > set spark.serializer=org.apache.spark.serializer.KryoSerializer; > Can anyone suggest > Thanks & Regards > Amithsha -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12501) LLAP: don't use read(ByteBuffer) in IO
[ https://issues.apache.org/jira/browse/HIVE-12501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027842#comment-15027842 ] Hive QA commented on HIVE-12501: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12774130/HIVE-12501.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9864 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6128/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6128/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6128/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12774130 - PreCommit-HIVE-TRUNK-Build > LLAP: don't use read(ByteBuffer) in IO > -- > > Key: HIVE-12501 > URL: https://issues.apache.org/jira/browse/HIVE-12501 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12501.01.patch, HIVE-12501.patch > > > read(ByteBuffer) API just copies the data anyway, and there's no readFully. > We need to use readFully and copy ourselves. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12491) Column Statistics: 3 attribute join on a 2-source table is off
[ https://issues.apache.org/jira/browse/HIVE-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027755#comment-15027755 ] Gopal V commented on HIVE-12491: Added a minimal query which demonstrates the issue. > Column Statistics: 3 attribute join on a 2-source table is off > -- > > Key: HIVE-12491 > URL: https://issues.apache.org/jira/browse/HIVE-12491 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran > Attachments: HIVE-12491.WIP.patch > > > The eased out denominator has to detect duplicate row-stats from different > attributes. > {code} > select account_id from customers c, customer_activation ca > where c.customer_id = ca.customer_id > and year(ca.dt) = year(c.dt) and month(ca.dt) = month(c.dt) > and year(ca.dt) between year('2013-12-26') and year('2013-12-26') > {code} > {code} > private Long getEasedOutDenominator(List distinctVals) { > // Exponential back-off for NDVs. > // 1) Descending order sort of NDVs > // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4))) * > Collections.sort(distinctVals, Collections.reverseOrder()); > long denom = distinctVals.get(0); > for (int i = 1; i < distinctVals.size(); i++) { > denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << > i))); > } > return denom; > } > {code} > This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns 2 > of which are derived from the same column. 
> {code} > Reduce Output Operator (RS_12) > key expressions: _col0 (type: bigint), year(_col2) (type: int), > month(_col2) (type: int) > sort order: +++ > Map-reduce partition columns: _col0 (type: bigint), year(_col2) > (type: int), month(_col2) (type: int) > value expressions: _col1 (type: bigint) > Join Operator (JOIN_13) > condition map: > Inner Join 0 to 1 > keys: > 0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) > (type: int) > 1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) > (type: int) > outputColumnNames: _col3 > {code} > So the eased out denominator is off by a factor of 30,000 or so, causing OOMs > in map-joins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
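A self-contained rendering of the {{getEasedOutDenominator}} snippet quoted above, with the stripped generic type restored, makes the overshoot easy to reproduce. It computes NDV1 * NDV2^(1/2) * NDV3^(1/4) * ... over the NDVs sorted in descending order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Runnable version of the exponential back-off denominator from the report.
public class EasedDenominator {
    static long easedOutDenominator(List<Long> distinctVals) {
        List<Long> vals = new ArrayList<>(distinctVals);
        Collections.sort(vals, Collections.reverseOrder()); // 1) descending order sort of NDVs
        long denom = vals.get(0);
        for (int i = 1; i < vals.size(); i++) {
            // 2) each further NDV contributes with an exponentially smaller power
            denom = (long) (denom * Math.pow(vals.get(i), 1.0 / (1 << i)));
        }
        return denom;
    }

    public static void main(String[] args) {
        // The NDVs from the report; the two equal values are derived from the
        // same source column, so the denominator overshoots badly.
        System.out.println(easedOutDenominator(Arrays.asList(8007986L, 821974390L, 821974390L)));
    }
}
```

Running this shows the denominator landing near 1.25e15, far above what two genuinely independent attributes would justify, matching the ~30,000x overestimate described above.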
[jira] [Updated] (HIVE-12491) Column Statistics: 3 attribute join on a 2-source table is off
[ https://issues.apache.org/jira/browse/HIVE-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-12491: --- Description: The eased out denominator has to detect duplicate row-stats from different attributes. {code} select account_id from customers c, customer_activation ca where c.customer_id = ca.customer_id and year(ca.dt) = year(c.dt) and month(ca.dt) = month(c.dt) and year(ca.dt) between year('2013-12-26') and year('2013-12-26') {code} {code} private Long getEasedOutDenominator(List distinctVals) { // Exponential back-off for NDVs. // 1) Descending order sort of NDVs // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4))) * Collections.sort(distinctVals, Collections.reverseOrder()); long denom = distinctVals.get(0); for (int i = 1; i < distinctVals.size(); i++) { denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << i))); } return denom; } {code} This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns 2 of which are derived from the same column. {code} Reduce Output Operator (RS_12) key expressions: _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) sort order: +++ Map-reduce partition columns: _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) value expressions: _col1 (type: bigint) Join Operator (JOIN_13) condition map: Inner Join 0 to 1 keys: 0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) (type: int) 1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) outputColumnNames: _col3 {code} So the eased out denominator is off by a factor of 30,000 or so, causing OOMs in map-joins. was: The eased out denominator has to detect duplicate row-stats from different attributes. {code} private Long getEasedOutDenominator(List distinctVals) { // Exponential back-off for NDVs. 
// 1) Descending order sort of NDVs // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4))) * Collections.sort(distinctVals, Collections.reverseOrder()); long denom = distinctVals.get(0); for (int i = 1; i < distinctVals.size(); i++) { denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << i))); } return denom; } {code} This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns 2 of which are derived from the same column. {code} Reduce Output Operator (RS_12) key expressions: _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) sort order: +++ Map-reduce partition columns: _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) value expressions: _col1 (type: bigint) Join Operator (JOIN_13) condition map: Inner Join 0 to 1 keys: 0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) (type: int) 1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) outputColumnNames: _col3 {code} So the eased out denominator is off by a factor of 30,000 or so, causing OOMs in map-joins. > Column Statistics: 3 attribute join on a 2-source table is off > -- > > Key: HIVE-12491 > URL: https://issues.apache.org/jira/browse/HIVE-12491 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran > Attachments: HIVE-12491.WIP.patch > > > The eased out denominator has to detect duplicate row-stats from different > attributes. > {code} > select account_id from customers c, customer_activation ca > where c.customer_id = ca.customer_id > and year(ca.dt) = year(c.dt) and month(ca.dt) = month(c.dt) > and year(ca.dt) between year('2013-12-26') and year('2013-12-26') > {code} > {code} > private Long getEasedOutDenominator(List distinctVals) { > // Exponential back-off for NDVs. 
> // 1) Descending order sort of NDVs > // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4))) * > Collections.sort(distinctVals, Collections.reverseOrder()); > long denom = distinctVals.get(0); > for (int i = 1; i < distinctVals.size(); i++) { > denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << > i))); > } > return denom; > } > {code} > This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns 2 > of which are derived from the same column. > {code} > Reduce Output Operator (RS_12) > key expressions: _col0 (type: bigint), year(_col2) (type:
[jira] [Commented] (HIVE-12527) use the hottest HS2 session first from the session pool
[ https://issues.apache.org/jira/browse/HIVE-12527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027766#comment-15027766 ] Sergey Shelukhin commented on HIVE-12527: - [~gopalv] fyi > use the hottest HS2 session first from the session pool > --- > > Key: HIVE-12527 > URL: https://issues.apache.org/jira/browse/HIVE-12527 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > It makes much more sense to use the latest used available session, rather > than using round-robin which would use the coldest one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
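The most-recently-used policy described above is easy to sketch. This is hypothetical code, not the actual HS2 session pool: released sessions go to the front of a deque, so `take()` hands back the hottest one, whereas round-robin would rotate to the coldest.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical MRU pool sketch (not Hive's implementation): the session
// released most recently is the first to be handed out again, keeping
// warm caches and JVMs in use instead of cycling to the coldest session.
class MruPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();

    // A session returned to the pool becomes the "hottest" candidate.
    void release(T session) { idle.addFirst(session); }

    // Hand out the most recently released session; null if the pool is empty.
    T take() { return idle.pollFirst(); }
}
```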
[jira] [Assigned] (HIVE-12519) ANTI JOIN in hive
[ https://issues.apache.org/jira/browse/HIVE-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-12519: -- Assignee: Pengcheng Xiong > ANTI JOIN in hive > - > > Key: HIVE-12519 > URL: https://issues.apache.org/jira/browse/HIVE-12519 > Project: Hive > Issue Type: Wish >Reporter: Keren Edri >Assignee: Pengcheng Xiong > Labels: hive > > I wish there was "ANTI JOIN" in hive... > please implement the ANTI JOIN as described here > http://blog.montmere.com/2010/12/08/the-anti-join-all-values-from-table1-where-not-in-table2/ > Thank You > have a nice day -- This message was sent by Atlassian JIRA (v6.3.4#6332)
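Until a first-class ANTI JOIN construct exists, the pattern from the linked post can be expressed with joins Hive already supports. A sketch with hypothetical tables `t1`/`t2` joined on a key column `id`:

```sql
-- All rows of t1 with no match in t2 (null-extended rows from the
-- outer join are exactly the non-matching ones).
SELECT t1.*
FROM t1
LEFT OUTER JOIN t2 ON t1.id = t2.id
WHERE t2.id IS NULL;

-- Equivalent NOT EXISTS form (Hive supports EXISTS/NOT EXISTS
-- subqueries in the WHERE clause as of 0.13):
SELECT t1.*
FROM t1
WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.id = t1.id);
```

Note the usual caveat from the linked post: a NOT IN rewrite behaves differently when `t2.id` contains NULLs, which is why the outer-join or NOT EXISTS forms are preferred.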
[jira] [Commented] (HIVE-12445) Tracking of completed dags is a slow memory leak
[ https://issues.apache.org/jira/browse/HIVE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028081#comment-15028081 ] Hive QA commented on HIVE-12445: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773973/HIVE-12445.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9864 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6129/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6129/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6129/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12773973 - PreCommit-HIVE-TRUNK-Build > Tracking of completed dags is a slow memory leak > > > Key: HIVE-12445 > URL: https://issues.apache.org/jira/browse/HIVE-12445 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-12445.patch > > > LLAP daemons track completed DAGs, but never clean up these structures. This > is primarily to disallow out of order executions. 
Evaluate whether that can > be avoided - otherwise this structure needs to be cleaned up with a delay. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
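One way to "clean up with a delay" while still rejecting out-of-order submissions is to expire entries on a scheduled executor. A minimal sketch with hypothetical names, not the LLAP daemon's actual code:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: remember a completed DAG id only for a bounded
// retention window, so the tracking structure no longer grows forever.
class CompletedDagTracker {
    private final Set<String> completed = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService cleaner =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "dag-cleaner");
            t.setDaemon(true); // don't block JVM shutdown
            return t;
        });
    private final long retentionMs;

    CompletedDagTracker(long retentionMs) { this.retentionMs = retentionMs; }

    void markCompleted(String dagId) {
        completed.add(dagId);
        // Drop the entry after the retention window instead of never.
        cleaner.schedule(() -> completed.remove(dagId),
            retentionMs, TimeUnit.MILLISECONDS);
    }

    // Out-of-order submissions of a recently finished DAG are still caught.
    boolean isCompleted(String dagId) { return completed.contains(dagId); }
}
```

The trade-off is the window length: a submission arriving later than the retention window would no longer be recognized as a duplicate, which is presumably acceptable if the window exceeds any realistic delivery delay.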
[jira] [Updated] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation
[ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-11110: -- Attachment: HIVE-11110.26.patch > Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, > improve Filter selectivity estimation > > > Key: HIVE-11110 > URL: https://issues.apache.org/jira/browse/HIVE-11110 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Laljo John Pullokkaran > Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, > HIVE-11110-12.patch, HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, > HIVE-11110.13.patch, HIVE-11110.14.patch, HIVE-11110.15.patch, > HIVE-11110.16.patch, HIVE-11110.17.patch, HIVE-11110.18.patch, > HIVE-11110.19.patch, HIVE-11110.2.patch, HIVE-11110.20.patch, > HIVE-11110.21.patch, HIVE-11110.22.patch, HIVE-11110.23.patch, > HIVE-11110.24.patch, HIVE-11110.25.patch, HIVE-11110.26.patch, > HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, > HIVE-11110.7.patch, HIVE-11110.8.patch, HIVE-11110.9.patch, > HIVE-11110.91.patch, HIVE-11110.92.patch, HIVE-11110.patch > > > Query > {code} > select count(*) > from store_sales > ,store_returns > ,date_dim d1 > ,date_dim d2 > where d1.d_quarter_name = '2000Q1' >and d1.d_date_sk = ss_sold_date_sk >and ss_customer_sk = sr_customer_sk >and ss_item_sk = sr_item_sk >and ss_ticket_number = sr_ticket_number >and sr_returned_date_sk = d2.d_date_sk >and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3'); > {code} > The store_sales table is partitioned on ss_sold_date_sk, which is also used > in a join clause. The join clause should add a filter "filterExpr: > ss_sold_date_sk is not null", which should get pushed to the MetaStore when > fetching the stats. Currently this is not done in CBO planning, which results > in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in > the optimization phase. In particular, this increases the NDV for the join > columns and may result in wrong planning. > Including HiveJoinAddNotNullRule in the optimization phase solves this issue.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11358) LLAP: move LlapConfiguration into HiveConf and document the settings
[ https://issues.apache.org/jira/browse/HIVE-11358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11358: Attachment: HIVE-11358.04.patch The same patch again. There was a number of broken tests caused by other JIRAs, will just do another run over the weekend. > LLAP: move LlapConfiguration into HiveConf and document the settings > > > Key: HIVE-11358 > URL: https://issues.apache.org/jira/browse/HIVE-11358 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11358.01.patch, HIVE-11358.02.patch, > HIVE-11358.03.patch, HIVE-11358.04.patch, HIVE-11358.patch > > > Hive uses HiveConf for configuration. LlapConfiguration should be replaced > with parameters in HiveConf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12075) add analyze command to explictly cache file metadata in HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12075: Attachment: HIVE-12075.03.patch > add analyze command to explictly cache file metadata in HBase metastore > --- > > Key: HIVE-12075 > URL: https://issues.apache.org/jira/browse/HIVE-12075 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12075.01.nogen.patch, HIVE-12075.01.patch, > HIVE-12075.02.patch, HIVE-12075.03.patch, HIVE-12075.nogen.patch, > HIVE-12075.patch > > > ANALYZE TABLE (spec as usual) CACHE METADATA -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028102#comment-15028102 ] Laljo John Pullokkaran commented on HIVE-11927: --- [~pxiong] I looked at the patch: 1. Patch has bunch of print statements 2. ReduceExpressions for filter (select x from r1 where false ) will generate an empty logical value list (i.e null scan) instead of TS. ASTConverter needs to handle this. I.e translate "NULL SCAN" to table "dual" > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028109#comment-15028109 ] Pengcheng Xiong commented on HIVE-11927: [~jpullokkaran], sorry, i think i forgot to remove those print statements for debugging. I will do 2 accordingly. Thanks. > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9983) Vectorizer doesn't vectorize (1) partitions with different schema anywhere (2) any MapWork with >1 table scans in MR
[ https://issues.apache.org/jira/browse/HIVE-9983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-9983. Resolution: Won't Fix > Vectorizer doesn't vectorize (1) partitions with different schema anywhere > (2) any MapWork with >1 table scans in MR > > > Key: HIVE-9983 > URL: https://issues.apache.org/jira/browse/HIVE-9983 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Sergey Shelukhin >Assignee: Matt McCline > > For some test, tables are created as such: > {noformat} > CREATE TABLE orc_llap_part( > csmallint SMALLINT, > cint INT, > cbigint BIGINT, > cfloat FLOAT, > cdouble DOUBLE, > cstring1 STRING, > cstring2 STRING, > ctimestamp1 TIMESTAMP, > ctimestamp2 TIMESTAMP, > cboolean1 BOOLEAN, > cboolean2 BOOLEAN > ) PARTITIONED BY (ctinyint TINYINT) STORED AS ORC; > CREATE TABLE orc_llap_dim_part( > cbigint BIGINT > ) PARTITIONED BY (ctinyint TINYINT) STORED AS ORC; > INSERT OVERWRITE TABLE orc_llap_part PARTITION (ctinyint) > SELECT csmallint, cint, cbigint, cfloat, cdouble, cstring1, cstring2, > ctimestamp1, ctimestamp2, cboolean1, cboolean2, ctinyint FROM alltypesorc; > INSERT OVERWRITE TABLE orc_llap_dim_part PARTITION (ctinyint) > SELECT sum(cbigint) as cbigint, ctinyint FROM alltypesorc WHERE ctinyint > 10 > AND ctinyint < 21 GROUP BY ctinyint; > {noformat} > The query is: > {noformat} > explain > SELECT oft.ctinyint, oft.cint FROM orc_llap_part oft > INNER JOIN orc_llap_dim_part od ON oft.ctinyint = od.ctinyint; > {noformat} > This results in a failure to vectorize in MR: > {noformat} > Could not vectorize partition > pfile:/Users/sergey/git/hive3/itests/qtest/target/warehouse/orc_llap_dim_part/ctinyint=11. 
> Its column names cbigint do not match the other column names > csmallint,cint,cbigint,cfloat,cdouble,cstring1,cstring2,ctimestamp1,ctimestamp2,cboolean1,cboolean2 > {noformat} > This is comparing schemas from different tables because MapWork has 2 > TableScan-s; in Tez this error will never happen as MapWork will not have 2 > scans. > In Tez (and MR as well), the other case can happen, namely partitions of the > same table having different schemas. > Tez case can be solved by making a super-schema to include all variations and > handling missing columns where necessary. > MR case may be harder to solve. > Of note is that despite schema being different (and not a prefix of a schema > by coincidence or some such), query passes if validation is commented out. > Perhaps in some cases it can work? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
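The super-schema idea from the closing paragraph — cover every partition's columns and read the absent ones as missing (null) — can be sketched as an ordered union. This is a hypothetical helper, not Hive's Vectorizer, and it glosses over type reconciliation between partitions:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class SuperSchema {
    // Hypothetical sketch: union of all partition schemas, preserving the
    // order in which columns are first seen. A partition that lacks one of
    // the merged columns would supply null for it at read time.
    static List<String> superSchema(List<List<String>> partitionSchemas) {
        LinkedHashSet<String> merged = new LinkedHashSet<>();
        for (List<String> schema : partitionSchemas) {
            merged.addAll(schema);
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Column lists resembling the two tables from the error message.
        System.out.println(superSchema(Arrays.asList(
            Arrays.asList("cbigint"),
            Arrays.asList("csmallint", "cint", "cbigint", "cfloat"))));
    }
}
```

This handles the Tez case described above; the MR case stays hard because one MapWork can legitimately carry two unrelated table scans whose schemas should never be compared at all.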
[jira] [Resolved] (HIVE-7487) Vectorize GROUP BY on the Reduce-Side (Part 2 – Count Distinct)
[ https://issues.apache.org/jira/browse/HIVE-7487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-7487. Resolution: Won't Fix > Vectorize GROUP BY on the Reduce-Side (Part 2 – Count Distinct) > --- > > Key: HIVE-7487 > URL: https://issues.apache.org/jira/browse/HIVE-7487 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > > Vectorize reduce count distinct aggregation, depending on how we decide to > change how the optimizer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)
[ https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12341: Attachment: HIVE-12341.05.patch Addressed the feedback. The main change is separating the protocols. I hope they can work in the same class, need to test. > LLAP: add security to daemon protocol endpoint (excluding shuffle) > -- > > Key: HIVE-12341 > URL: https://issues.apache.org/jira/browse/HIVE-12341 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, > HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.04.patch, > HIVE-12341.05.patch, HIVE-12341.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9970) Hive on spark
[ https://issues.apache.org/jira/browse/HIVE-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028065#comment-15028065 ] TerrenceYTQ commented on HIVE-9970: --- With the same problem " ERROR util.Utils: uncaught error in thread SparkListenerBus, stopping SparkContext ", has anyone already solved it ??? ---My spark 1.5.2 Hive 1.2.1 , build by myself with commands : mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Dscala-2.10 -DskipTests clean package –e ---Error Log : 15/11/26 09:39:40 INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap 2015-11-26 09:39:40,245 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:40 INFO exec.Utilities: Processing alias dc_mf_device_one_check 2015-11-26 09:39:40,245 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:40 INFO exec.Utilities: Adding input file hdfs://cluster1/user/hive/warehouse/vendorzhhs.db/dc_mf_device_one_check 2015-11-26 09:39:40,735 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:40 INFO log.PerfLogger: 2015-11-26 09:39:40,735 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:40 INFO exec.Utilities: Serializing MapWork via kryo 2015-11-26 09:39:40,902 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:40 INFO log.PerfLogger: 2015-11-26 09:39:41,248 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:41 INFO storage.MemoryStore: ensureFreeSpace(599952) called with curMem=0, maxMem=555755765 2015-11-26 09:39:41,250 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:41 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 585.9 KB, free 529.4 MB) 2015-11-26 09:39:41,429 INFO [stderr-redir-1]: 
client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:41 INFO storage.MemoryStore: ensureFreeSpace(43801) called with curMem=599952, maxMem=555755765 2015-11-26 09:39:41,429 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:41 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 42.8 KB, free 529.4 MB) 2015-11-26 09:39:41,433 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:41 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.0.69:39388 (size: 42.8 KB, free: 530.0 MB) 2015-11-26 09:39:41,437 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:41 INFO spark.SparkContext: Created broadcast 0 from hadoopRDD at SparkPlanGenerator.java:188 2015-11-26 09:39:41,441 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/26 09:39:41 ERROR util.Utils: uncaught error in thread SparkListenerBus, stopping SparkContext 2015-11-26 09:39:41,441 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - java.lang.AbstractMethodError 2015-11-26 09:39:41,441 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) -at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:62) 2015-11-26 09:39:41,442 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) -at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) 2015-11-26 09:39:41,442 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) -at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31) 2015-11-26 09:39:41,442 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) -at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56) 2015-11-26 09:39:41,442 INFO [stderr-redir-1]: 
client.SparkClientImpl (SparkClientImpl.java:run(569)) -at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37) 2015-11-26 09:39:41,442 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) -at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79) 2015-11-26 09:39:41,442 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) -at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1136) 2015-11-26 09:39:41,442 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) -at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63) 2015-11-26 09:39:41,467 INFO [stderr-redir-1]: client.SparkClientImpl
[jira] [Updated] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability
[ https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-12469: Attachment: HIVE-12469.2-branch1.patch Pushed patch to master. Attached a patch for branch-1 > Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address > vulnerability > - > > Key: HIVE-12469 > URL: https://issues.apache.org/jira/browse/HIVE-12469 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Affects Versions: 1.2.1 >Reporter: Reuben Kuhnert >Assignee: Ashutosh Chauhan >Priority: Blocker > Attachments: HIVE-12469.2-branch1.patch, HIVE-12469.2.patch, > HIVE-12469.patch > > > Currently the commons-collections (3.2.1) library allows for invocation of > arbitrary code through {{InvokerTransformer}}, need to bump the version of > commons-collections from 3.2.1 to 3.2.2 to resolve this issue. > Results of {{mvn dependency:tree}}: > {code} > [INFO] > > [INFO] Building Hive HPL/SQL 2.0.0-SNAPSHOT > [INFO] > > [INFO] > [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-hplsql --- > [INFO] org.apache.hive:hive-hplsql:jar:2.0.0-SNAPSHOT > [INFO] +- com.google.guava:guava:jar:14.0.1:compile > [INFO] +- commons-collections:commons-collections:jar:3.2.1:compile > {code} > {code} > [INFO] > > [INFO] Building Hive Packaging 2.0.0-SNAPSHOT > [INFO] > > [INFO] +- org.apache.hive:hive-hbase-handler:jar:2.0.0-SNAPSHOT:compile > [INFO] | +- org.apache.hbase:hbase-server:jar:1.1.1:compile > [INFO] | | +- commons-collections:commons-collections:jar:3.2.1:compile > {code} > {code} > [INFO] > > [INFO] Building Hive Common 2.0.0-SNAPSHOT > [INFO] > > [INFO] > [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-common --- > [INFO] +- org.apache.hadoop:hadoop-common:jar:2.6.0:compile > [INFO] | +- commons-collections:commons-collections:jar:3.2.1:compile > {code} > {{Hadoop-Common}} dependency also found in: LLAP, Serde, Storage, Shims, > Shims Common, Shims Scheduler) > {code} 
> [INFO] > > [INFO] Building Hive Ant Utilities 2.0.0-SNAPSHOT > [INFO] > > [INFO] > [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-ant --- > [INFO] | +- commons-collections:commons-collections:jar:3.1:compile > {code} > {code} > [INFO] > > [INFO] > > [INFO] Building Hive Accumulo Handler 2.0.0-SNAPSHOT > [INFO] > > [INFO] +- org.apache.accumulo:accumulo-core:jar:1.6.0:compile > [INFO] | +- commons-collections:commons-collections:jar:3.2.1:compile > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12498) ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect
[ https://issues.apache.org/jira/browse/HIVE-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12498: - Attachment: HIVE-12498-branch-1.patch The branch-1 patch is a little more involved, as master underwent refactorings that made it easy to pass table properties to writer options. > ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect > - > > Key: HIVE-12498 > URL: https://issues.apache.org/jira/browse/HIVE-12498 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: ACID, ORC > Attachments: HIVE-12498-branch-1.patch, HIVE-12498.1.patch, > HIVE-12498.2.patch > > > OrcRecordUpdater does not honor the > OrcRecordUpdater.OrcOptions.tableProperties() setting. > It would need to translate the specified tableProperties (as listed in > OrcTableProperties enum) to the properties that OrcWriter internally > understands (listed in HiveConf.ConfVars). > This is needed for multiple clients, like Streaming API and Compactor. > {code:java} > Properties orcTblProps = .. // get Orc Table Properties from MetaStore; > AcidOutputFormat.Options updaterOptions = new > OrcRecordUpdater.OrcOptions(conf) > .inspector(..) > .bucket(..) > .minimumTransactionId(..) > .maximumTransactionId(..) > > .tableProperties(orcTblProps); // <<== > OrcOutputFormat orcOutput = new ... > orcOutput.getRecordUpdater(partitionPath, updaterOptions ); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027451#comment-15027451 ] Sergey Shelukhin commented on HIVE-12462: - Hmm. The previous logic removed it exactly after processing, and then removed the TS one at the end. This one should do the same.. > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Critical > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12522) Wrong FS error during Tez merge files when warehouse and scratchdir are on different FS
[ https://issues.apache.org/jira/browse/HIVE-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-12522: -- Attachment: HIVE-12522.1.patch Initial patch - DiagUtils shouldn't assume the temp path is always on the scratchdir. > Wrong FS error during Tez merge files when warehouse and scratchdir are on > different FS > --- > > Key: HIVE-12522 > URL: https://issues.apache.org/jira/browse/HIVE-12522 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-12522.1.patch > > > When hive.merge.tezfiles=true, and the warehouse dir/scratchdir are on > different filesystems. > {noformat} > 2015-11-13 10:22:10,617 ERROR exec.Task (TezTask.java:execute(184)) - Failed > to execute tez graph. > java.lang.IllegalArgumentException: Wrong FS: > wasb://chaoyitezt...@chaoyiteztest.blob.core.windows.net/hive/scratch/chaoyitest/c888f405-3c98-46b1-bf39-e57f067dfe4c/hive_2015-11-13_10-16-10_216_8161037519951665173-1/_tmp.-ext-1, > expected: hdfs://headnodehost:9000 > at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193) > at > org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105) > at > org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1136) > at > org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132) > at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1423) > at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:579) > at > org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:1083) > at 
org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:329) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:156) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:733) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > 2015-11-13 10:22:10,620 INFO hooks.ATSHook (ATSHook.java:(84)) - > Created ATS Hook > {noformat} > When the scratchdir is set to the same FS as the warehouse the problem goes > away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12522) Wrong FS error during Tez merge files when warehouse and scratchdir are on different FS
[ https://issues.apache.org/jira/browse/HIVE-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027497#comment-15027497 ] Prasanth Jayachandran commented on HIVE-12522: -- LGTM, +1. Pending tests > Wrong FS error during Tez merge files when warehouse and scratchdir are on > different FS > --- > > Key: HIVE-12522 > URL: https://issues.apache.org/jira/browse/HIVE-12522 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-12522.1.patch > > > When hive.merge.tezfiles=true, and the warehouse dir/scratchdir are on > different filesystems. > {noformat} > 2015-11-13 10:22:10,617 ERROR exec.Task (TezTask.java:execute(184)) - Failed > to execute tez graph. > java.lang.IllegalArgumentException: Wrong FS: > wasb://chaoyitezt...@chaoyiteztest.blob.core.windows.net/hive/scratch/chaoyitest/c888f405-3c98-46b1-bf39-e57f067dfe4c/hive_2015-11-13_10-16-10_216_8161037519951665173-1/_tmp.-ext-1, > expected: hdfs://headnodehost:9000 > at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193) > at > org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105) > at > org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1136) > at > org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132) > at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1423) > at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:579) > at > org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:1083) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:329) > at 
org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:156) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:733) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > 2015-11-13 10:22:10,620 INFO hooks.ATSHook (ATSHook.java:(84)) - > Created ATS Hook > {noformat} > When the scratchdir is set to the same FS as the warehouse the problem goes > away. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12520) Fix schema_evol* tests on master
[ https://issues.apache.org/jira/browse/HIVE-12520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-12520: Attachment: HIVE-12520.patch Also included are golden file updates missed in HIVE-12329 > Fix schema_evol* tests on master > > > Key: HIVE-12520 > URL: https://issues.apache.org/jira/browse/HIVE-12520 > Project: Hive > Issue Type: Task > Components: Tests >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12520.patch > > > Few q file updates missed in HIVE-12331 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12520) Fix schema_evol* tests on master
[ https://issues.apache.org/jira/browse/HIVE-12520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027280#comment-15027280 ] Prasanth Jayachandran commented on HIVE-12520: -- +1 > Fix schema_evol* tests on master > > > Key: HIVE-12520 > URL: https://issues.apache.org/jira/browse/HIVE-12520 > Project: Hive > Issue Type: Task > Components: Tests >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12520.patch > > > Few q file updates missed in HIVE-12331 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12511) IN clause performs differently than = clause
[ https://issues.apache.org/jira/browse/HIVE-12511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027461#comment-15027461 ] Yongzhi Chen commented on HIVE-12511: - +1 if the approach can pass all the decimal unit tests. > IN clause performs differently than = clause > > > Key: HIVE-12511 > URL: https://issues.apache.org/jira/browse/HIVE-12511 > Project: Hive > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Attachments: HIVE-12511.1.patch > > > Similar to HIVE-11973, the IN clause performs differently than the = clause for "int" > type with string values. > For example, > {noformat} > SELECT * FROM inttest WHERE iValue IN ('01'); > {noformat} > will not return any rows with int iValue = 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
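The divergence reported above is at heart a type-coercion question: compared as numbers, 1 and '01' match, while compared as strings they do not. The following is a minimal, hypothetical illustration in plain Java (it is not Hive's actual coercion code path; the method names are invented for the sketch):

```java
public class CoercionSketch {
    // Numeric comparison: '01' parses to the int 1, so the predicate matches,
    // which is what = does for an int column after coercion.
    public static boolean numericEquals(int value, String literal) {
        return value == Integer.parseInt(literal);
    }

    // String comparison: the int 1 renders as "1", which is not equal to "01",
    // so a predicate evaluated on string representations misses the row.
    public static boolean stringEquals(int value, String literal) {
        return String.valueOf(value).equals(literal);
    }

    public static void main(String[] args) {
        System.out.println(numericEquals(1, "01")); // true
        System.out.println(stringEquals(1, "01"));  // false
    }
}
```

If IN takes the second path while = takes the first, `IN ('01')` returns no rows for `iValue = 1`, matching the symptom in the report.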
[jira] [Updated] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-12483: --- Attachment: HIVE-12483.1.patch > Fix precommit Spark test branch > --- > > Key: HIVE-12483 > URL: https://issues.apache.org/jira/browse/HIVE-12483 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-12483.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12491) Column Statistics: 3 attribute join on a 2-source table is off
[ https://issues.apache.org/jira/browse/HIVE-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027319#comment-15027319 ] Ashutosh Chauhan commented on HIVE-12491: - [~gopalv] If you have a minimal query which repros the effect, can you post it here? > Column Statistics: 3 attribute join on a 2-source table is off > -- > > Key: HIVE-12491 > URL: https://issues.apache.org/jira/browse/HIVE-12491 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran > Attachments: HIVE-12491.WIP.patch > > > The eased-out denominator has to detect duplicate row-stats from different > attributes. > {code} > private Long getEasedOutDenominator(List<Long> distinctVals) { > // Exponential back-off for NDVs. > // 1) Descending order sort of NDVs > // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4)) * ... > Collections.sort(distinctVals, Collections.reverseOrder()); > long denom = distinctVals.get(0); > for (int i = 1; i < distinctVals.size(); i++) { > denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << i))); > } > return denom; > } > {code} > This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns, 2 > of which are derived from the same column. > {code} > Reduce Output Operator (RS_12) > key expressions: _col0 (type: bigint), year(_col2) (type: int), > month(_col2) (type: int) > sort order: +++ > Map-reduce partition columns: _col0 (type: bigint), year(_col2) > (type: int), month(_col2) (type: int) > value expressions: _col1 (type: bigint) > Join Operator (JOIN_13) > condition map: > Inner Join 0 to 1 > keys: > 0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) > (type: int) > 1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) > (type: int) > outputColumnNames: _col3 > {code} > So the eased-out denominator is off by a factor of 30,000 or so, causing OOMs > in map-joins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
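The back-off formula quoted in the report can be exercised outside Hive. This is a standalone sketch (the wrapper class name is invented; only the method body mirrors the snippet above) showing how a duplicated row-stat inflates the denominator, which in turn shrinks the join cardinality estimate and invites a bad map-join choice:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class EasedOutDenominator {
    // denominator = NDV1 * NDV2^(1/2) * NDV3^(1/4) * ... over descending NDVs
    public static long getEasedOutDenominator(List<Long> distinctVals) {
        Collections.sort(distinctVals, Collections.reverseOrder());
        long denom = distinctVals.get(0);
        for (int i = 1; i < distinctVals.size(); i++) {
            denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << i)));
        }
        return denom;
    }

    public static void main(String[] args) {
        // The three NDVs from the report; two come from the same source column.
        long withDup = getEasedOutDenominator(
            new ArrayList<>(Arrays.asList(8007986L, 821974390L, 821974390L)));
        // Counting the duplicated row-stat only once yields a far smaller
        // denominator, i.e. a much larger (safer) row-count estimate.
        long deduped = getEasedOutDenominator(
            new ArrayList<>(Arrays.asList(8007986L, 821974390L)));
        System.out.println(withDup + " vs " + deduped);
    }
}
```

Since the estimated join output is divided by this denominator, the duplicate-inflated value under-estimates the output by orders of magnitude, consistent with the map-join OOMs described.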
[jira] [Commented] (HIVE-12307) Streaming API TransactionBatch.close() must abort any remaining transactions in the batch
[ https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027381#comment-15027381 ] Alan Gates commented on HIVE-12307: --- +1 bq. It's worthwhile to do a thread safety review but was not my goal here. Agreed, I'm not trying to add a thread safety review to this patch. I've created HIVE-12521 to start tracking javadoc issues in Hive streaming. I've put in there that we should document the assumptions about thread safety and the meaning of SerializationError. If there are other issues you are aware of in here that need documenting feel free to add to that JIRA. I've assigned it to myself so I don't forget about it but feel free to take it on if you want. > Streaming API TransactionBatch.close() must abort any remaining transactions > in the batch > - > > Key: HIVE-12307 > URL: https://issues.apache.org/jira/browse/HIVE-12307 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 0.14.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-12307.2.patch, HIVE-12307.patch > > > When the client of TransactionBatch API encounters an error it must close() > the batch and start a new one. This prevents attempts to continue writing to > a file that may damaged in some way. > The close() should ensure to abort the any txns that still remain in the > batch and close (best effort) all the files it's writing to. The batch > should also put itself into a mode where any future ops on this batch fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
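The close() contract described in this issue can be sketched as a small state machine. The class and method names below are hypothetical stand-ins for the real streaming TransactionBatch API, not Hive code; the sketch only shows the behavior being agreed on — abort whatever transactions remain in the batch, then make any future operation on the batch fail:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TxnBatchSketch {
    private final Deque<Long> remainingTxns = new ArrayDeque<>();
    private boolean closed = false;
    public int aborted = 0; // exposed only so the sketch is easy to inspect

    public TxnBatchSketch(long... txnIds) {
        for (long id : txnIds) remainingTxns.add(id);
    }

    public void write(byte[] record) {
        // After close(), the batch rejects all further operations.
        if (closed) throw new IllegalStateException("batch is closed");
        // ... write the record under the current transaction ...
    }

    public void close() {
        // Best-effort: abort every transaction still left in the batch,
        // so a possibly damaged file is never written to again.
        while (!remainingTxns.isEmpty()) {
            remainingTxns.poll();
            aborted++;
        }
        closed = true;
    }
}
```

A client that hits an error would call close(), drop the batch, and start a new one, which is exactly the recovery path the description prescribes.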
[jira] [Commented] (HIVE-12417) Support for exclamation mark missing in regexp
[ https://issues.apache.org/jira/browse/HIVE-12417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027413#comment-15027413 ] Olaf Flebbe commented on HIVE-12417: I do not think so, since the documentation in the wiki was right and the implementation wrong. > Support for exclamation mark missing in regexp > -- > > Key: HIVE-12417 > URL: https://issues.apache.org/jira/browse/HIVE-12417 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Olaf Flebbe >Assignee: Olaf Flebbe > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12417.1.patch, HIVE-12417.2.patch > > > With HIVE-6013, Hive gets support for regular expressions. However, the ! character > is valid, too. It is needed for expressions like > {code} > set hive.support.quoted.identifiers = none; > select `^(?!donotuseme).*$` from table; > {code} > which is the idiom to select all but column {{donotuseme}}. > See http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html for > a reference of supported chars in Java regexp. > The patch simply fixes the lexer to support '!' as a REGEX char. And it does > simply work. > Please review. > If you would like to have an iTest for it, I beg you to help me. I tried several > days on a different issue to figure out how it is supposed to work and failed > miserably. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12496) Open ServerTransport After MetaStore Initialization
[ https://issues.apache.org/jira/browse/HIVE-12496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027541#comment-15027541 ] Ashutosh Chauhan commented on HIVE-12496: - +1 > Open ServerTransport After MetaStore Initialization > > > Key: HIVE-12496 > URL: https://issues.apache.org/jira/browse/HIVE-12496 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 1.2.1 > Environment: Standalone MetaStore, cluster mode(multiple instances) >Reporter: Nemon Lou >Assignee: Nemon Lou >Priority: Minor > Attachments: HIVE-12496.patch > > > During HiveMetaStore startup, the following steps should be reordered: > 1,Creation of TServerSocket > 2,Creation of HMSHandler > 3,Creation of TThreadPoolServer > Step 2 involves some initialization work, including: > {noformat} > createDefaultDB(); > createDefaultRoles(); > addAdminUsers(); > {noformat} > TServerSocket shall be created after this initialization work to prevent > unnecessary waiting from the client side. And if there are errors during > initialization (multiple metastores creating the default DB at the same time can > cause errors), clients shall not connect to this metastore, as it will be shutting > down due to the error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
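The proposed ordering can be sketched in a few lines. Everything here is illustrative — the method names mirror the ones quoted in the issue, but this is not HiveMetaStore code: the point is simply that the listening socket is bound only after the one-time initialization has succeeded, so a client can never connect to an instance that is about to fail and shut down:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

public class MetastoreStartupSketch {
    // Records the order in which startup steps ran, for inspection.
    public static final List<String> startupOrder = new ArrayList<>();

    static void initialize() {
        // One-time work done by the handler; any of these may throw,
        // e.g. when multiple metastores create the default DB concurrently.
        startupOrder.add("createDefaultDB");
        startupOrder.add("createDefaultRoles");
        startupOrder.add("addAdminUsers");
    }

    public static ServerSocket start(int port) {
        initialize(); // if this fails, nothing is listening yet
        try {
            ServerSocket s = new ServerSocket(port); // bind only after init
            startupOrder.add("bindSocket");
            return s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the original ordering (socket first), a failed initialize() would leave a bound socket that accepts connections from clients while the server is already dying — the waiting the issue wants to avoid.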
[jira] [Commented] (HIVE-12519) ANTI JOIN in hive
[ https://issues.apache.org/jira/browse/HIVE-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027557#comment-15027557 ] Laljo John Pullokkaran commented on HIVE-12519: --- You can achieve this by rewriting the query with an outer join & filter: Proj(R1.*)(Filter(R2.JoinKeys is null)(R1 LOJ R2)) There are three possible solutions: 1. User rewrites the query as above 2. Change Parser/GenPlan to do the rewrite 3. Change the Join operator to filter out tuples where RHS keys are not null. Given the impact on the rest of the optimizers, #2 may be a better option. [~pxiong] This will be very similar to the Union Distinct rewrite. > ANTI JOIN in hive > - > > Key: HIVE-12519 > URL: https://issues.apache.org/jira/browse/HIVE-12519 > Project: Hive > Issue Type: Wish >Reporter: Keren Edri > Labels: hive > > I wish there was an "ANTI JOIN" in hive... > please implement the ANTI JOIN as described here > http://blog.montmere.com/2010/12/08/the-anti-join-all-values-from-table1-where-not-in-table2/ > Thank You > have a nice day -- This message was sent by Atlassian JIRA (v6.3.4#6332)
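The equivalence behind option 1 — anti-join as a left outer join followed by a filter on null right-hand-side keys — can be checked on toy data. This is a hypothetical Java sketch standing in for the two query forms (it is not Hive operator code; keys stand in for whole rows):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class AntiJoinSketch {
    // Direct reading: keep each row of r1 whose key has no match in r2.
    public static List<Integer> antiJoinDirect(List<Integer> r1, Set<Integer> r2Keys) {
        return r1.stream().filter(k -> !r2Keys.contains(k)).collect(Collectors.toList());
    }

    // Rewrite: left outer join, then filter on null RHS keys.
    public static List<Integer> antiJoinViaLeftOuter(List<Integer> r1, Set<Integer> r2Keys) {
        List<Integer> out = new ArrayList<>();
        for (Integer k : r1) {
            // LOJ semantics: a probe miss pads the right side with null...
            Integer rhsKey = r2Keys.contains(k) ? k : null;
            // ...and the filter keeps only the null-padded tuples.
            if (rhsKey == null) {
                out.add(k);
            }
        }
        return out;
    }
}
```

Both forms produce the same result, which is why either a user-level rewrite (#1) or a parser/plan-generation rewrite (#2) suffices without a dedicated anti-join operator.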
[jira] [Commented] (HIVE-12525) Cleanup unused metrics in HMS
[ https://issues.apache.org/jira/browse/HIVE-12525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027690#comment-15027690 ] Jimmy Xiang commented on HIVE-12525: +1 > Cleanup unused metrics in HMS > - > > Key: HIVE-12525 > URL: https://issues.apache.org/jira/browse/HIVE-12525 > Project: Hive > Issue Type: Sub-task >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-12525.patch > > > I had added these without much thought when writing the metrics framework to > test out the concept. > Looking back, these actually need more investigation, as some are actually > wrong or at least do not add much value. The active-transaction count is wrong, as > each ObjectStore is actually thread-local and an aggregate number is what > was meant. Open/committed/rollback need some investigation into what really helps. > The goal is to remove these before the release to reduce confusion for users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5119) MapJoin & Partition Pruning (MapJoin can take advantage of materialized data to prune partitions of big table)
[ https://issues.apache.org/jira/browse/HIVE-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027699#comment-15027699 ] Laljo John Pullokkaran commented on HIVE-5119: -- This is fixed by DPP work. > MapJoin & Partition Pruning (MapJoin can take advantage of materialized data > to prune partitions of big table) > -- > > Key: HIVE-5119 > URL: https://issues.apache.org/jira/browse/HIVE-5119 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 0.11.0 >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > > Map-Join predicates where the joining columns from big table (streamed table) > are partition columns and corresponding columns from small table is not > partitioned, the join would not prune the unnecessary partitions from big > table. Since data for all small tables is materialized before big table is > streamed, theoretically it would be possible to prune the unnecessary > partitions from big table. > Proposal document is at https://cwiki.apache.org/confluence/x/sgkHAg -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-7462) CBO:Add rule to push filter through GB
[ https://issues.apache.org/jira/browse/HIVE-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran resolved HIVE-7462. -- Resolution: Fixed > CBO:Add rule to push filter through GB > -- > > Key: HIVE-7462 > URL: https://issues.apache.org/jira/browse/HIVE-7462 > Project: Hive > Issue Type: Sub-task >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-5119) MapJoin & Partition Pruning (MapJoin can take advantage of materialized data to prune partitions of big table)
[ https://issues.apache.org/jira/browse/HIVE-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran resolved HIVE-5119. -- Resolution: Fixed > MapJoin & Partition Pruning (MapJoin can take advantage of materialized data > to prune partitions of big table) > -- > > Key: HIVE-5119 > URL: https://issues.apache.org/jira/browse/HIVE-5119 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 0.11.0 >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > > Map-Join predicates where the joining columns from big table (streamed table) > are partition columns and corresponding columns from small table is not > partitioned, the join would not prune the unnecessary partitions from big > table. Since data for all small tables is materialized before big table is > streamed, theoretically it would be possible to prune the unnecessary > partitions from big table. > Proposal document is at https://cwiki.apache.org/confluence/x/sgkHAg -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-8077) CBO Trunk Merge: Test Failure vectorization_7
[ https://issues.apache.org/jira/browse/HIVE-8077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran resolved HIVE-8077. -- Resolution: Fixed > CBO Trunk Merge: Test Failure vectorization_7 > - > > Key: HIVE-8077 > URL: https://issues.apache.org/jira/browse/HIVE-8077 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-8234) CBO: Refactor SemanticAnalyzer and move out CBO code
[ https://issues.apache.org/jira/browse/HIVE-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran resolved HIVE-8234. -- Resolution: Fixed > CBO: Refactor SemanticAnalyzer and move out CBO code > - > > Key: HIVE-8234 > URL: https://issues.apache.org/jira/browse/HIVE-8234 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027709#comment-15027709 ] Gopal V commented on HIVE-12462: [~sershe]: yes, you're right. LGTM - +1. > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Critical > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12357) Allow user to set tez job name
[ https://issues.apache.org/jira/browse/HIVE-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027616#comment-15027616 ] Sergey Shelukhin commented on HIVE-12357: - [~sseth] LLAP tests fail like so; is this related to using the wrong dag name to track queries, the patch that depends on 0.8.2 that you have posted? {noformat} , TaskAttempt 1 failed, info=[org.apache.hadoop.ipc.RemoteException(java.lang.RuntimeException): Dag -- INPUT_RECORDS: 2000 (2 row groups) ...100(Stage-1) already complete. Rejecting fragment [Map 1, 0, 1 at org.apache.hadoop.hive.llap.daemon.impl.QueryTracker.registerFragment(QueryTracker.java:125) at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(ContainerRunnerImpl.java:172) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(LlapDaemon.java:321) at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemonProtocolServerImpl.submitWork(LlapDaemonProtocolServerImpl.java:75) at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(LlapDaemonProtocolProtos.java:12094) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) {noformat} > Allow user to set tez job name > -- > > Key: HIVE-12357 > URL: https://issues.apache.org/jira/browse/HIVE-12357 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-12357.04.patch, HIVE-12357.1.patch, > HIVE-12357.2.patch, HIVE-12357.3.patch > > > Need something 
like mapred.job.name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12448) Change to tracking of dag status via dagIdentifier instead of dag name
[ https://issues.apache.org/jira/browse/HIVE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027528#comment-15027528 ] Sergey Shelukhin commented on HIVE-12448: - RB? > Change to tracking of dag status via dagIdentifier instead of dag name > -- > > Key: HIVE-12448 > URL: https://issues.apache.org/jira/browse/HIVE-12448 > Project: Hive > Issue Type: Sub-task > Components: llap >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12448.1.txt, HIVE-12448.2.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10924) add support for MERGE statement
[ https://issues.apache.org/jira/browse/HIVE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027546#comment-15027546 ] Eugene Koifman commented on HIVE-10924: --- One more to think about: In a multi-insert, we have multiple MoveTasks see MoveTask.releaseLocks() > add support for MERGE statement > --- > > Key: HIVE-10924 > URL: https://issues.apache.org/jira/browse/HIVE-10924 > Project: Hive > Issue Type: New Feature > Components: Query Planning, Query Processor, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > add support for > MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8396) Hive CliDriver command splitting can be broken when comments are present
[ https://issues.apache.org/jira/browse/HIVE-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027560#comment-15027560 ] Hive QA commented on HIVE-8396: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773931/HIVE-8396.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 9825 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_nonascii org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_join_transpose org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_fetchwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_mapwork_table org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_fetchwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6125/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6125/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6125/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 26 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12773931 - PreCommit-HIVE-TRUNK-Build > Hive CliDriver command splitting can be broken when comments are present > > > Key: HIVE-8396 > URL: https://issues.apache.org/jira/browse/HIVE-8396 > Project: Hive > Issue Type: Bug > Components: Parser, Query Processor >Affects Versions: 0.14.0 >Reporter: Sergey Shelukhin >Assignee: Elliot West > Attachments: HIVE-8396.0.patch, HIVE-8396.01.patch > > > {noformat} > -- SORT_QUERY_RESULTS > set hive.cbo.enable=true; > ... 
commands ... > {noformat} > causes > {noformat} > 2014-10-07 18:55:57,193 ERROR ql.Driver (SessionState.java:printError(825)) - > FAILED: ParseException line 2:4 missing KW_ROLE at 'hive' near 'hive' > {noformat} > If the comment is moved after the command it works. > I noticed this earlier when I comment out parts of some random q file for > debugging purposes, and it starts failing. This is annoying. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027604#comment-15027604 ] Hive QA commented on HIVE-12483: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12774389/HIVE-12483.1-spark.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9788 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1014/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1014/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1014/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12774389 - PreCommit-HIVE-SPARK-Build > Fix precommit Spark test branch > --- > > Key: HIVE-12483 > URL: https://issues.apache.org/jira/browse/HIVE-12483 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-12483.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8396) Hive CliDriver command splitting can be broken when comments are present
[ https://issues.apache.org/jira/browse/HIVE-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027607#comment-15027607 ] Sergey Shelukhin commented on HIVE-8396: These test failures are not related, it looks like. +1. Will commit at some point > Hive CliDriver command splitting can be broken when comments are present > > > Key: HIVE-8396 > URL: https://issues.apache.org/jira/browse/HIVE-8396 > Project: Hive > Issue Type: Bug > Components: Parser, Query Processor >Affects Versions: 0.14.0 >Reporter: Sergey Shelukhin >Assignee: Elliot West > Attachments: HIVE-8396.0.patch, HIVE-8396.01.patch > > > {noformat} > -- SORT_QUERY_RESULTS > set hive.cbo.enable=true; > ... commands ... > {noformat} > causes > {noformat} > 2014-10-07 18:55:57,193 ERROR ql.Driver (SessionState.java:printError(825)) - > FAILED: ParseException line 2:4 missing KW_ROLE at 'hive' near 'hive' > {noformat} > If the comment is moved after the command it works. > I noticed this earlier when I comment out parts of some random q file for > debugging purposes, and it starts failing. This is annoying. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12524) SemiJoinProjectTranspose/SemiJoinFilterTransposeRule Rule in CBO is not firing
[ https://issues.apache.org/jira/browse/HIVE-12524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-12524: -- Summary: SemiJoinProjectTranspose/SemiJoinFilterTransposeRule Rule in CBO is not firing (was: SemiJoinProjectTranspose Rule in CBO is not firing) > SemiJoinProjectTranspose/SemiJoinFilterTransposeRule Rule in CBO is not firing > -- > > Key: HIVE-12524 > URL: https://issues.apache.org/jira/browse/HIVE-12524 > Project: Hive > Issue Type: Bug >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > > SemiJoinProjectTransposeRule uses LogicalProject and hence doesn't fire for > CBO. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12525) Cleanup unused metrics in HMS
[ https://issues.apache.org/jira/browse/HIVE-12525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-12525: - Description: I had added these without much thought when writing the metrics framework to test out the concept. Looking back, these actually need more investigation, as some are actually wrong or at least do not add much value. The active-transaction count is wrong, as each ObjectStore is actually thread-local and an aggregate number is what was meant. Open/committed/rollback need some investigation into what really helps. The goal is to remove these before the release to reduce confusion for users. > Cleanup unused metrics in HMS > - > > Key: HIVE-12525 > URL: https://issues.apache.org/jira/browse/HIVE-12525 > Project: Hive > Issue Type: Sub-task >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-12525.patch > > > I had added these without much thought when writing the metrics framework to > test out the concept. > Looking back, these actually need more investigation, as some are actually > wrong or at least do not add much value. The active-transaction count is wrong, as > each ObjectStore is actually thread-local and an aggregate number is what > was meant. Open/committed/rollback need some investigation into what really helps. > The goal is to remove these before the release to reduce confusion for users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12526) PerfLogger for hive compiler and optimizer
[ https://issues.apache.org/jira/browse/HIVE-12526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12526: - Attachment: HIVE-12526.1.patch cc-ing [~ashutoshc] and [~jcamachorodriguez] > PerfLogger for hive compiler and optimizer > -- > > Key: HIVE-12526 > URL: https://issues.apache.org/jira/browse/HIVE-12526 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12526.1.patch > > > This jira is intended to use the perflogger to track compilation times and > optimization times (calcite, tez compiler, physical compiler) etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12357) Allow user to set tez job name
[ https://issues.apache.org/jira/browse/HIVE-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027568#comment-15027568 ] Sergey Shelukhin commented on HIVE-12357: - [~hagleitn] your patches are full of tabs, this is the latest example. Your emacs-fu is weak :P > Allow user to set tez job name > -- > > Key: HIVE-12357 > URL: https://issues.apache.org/jira/browse/HIVE-12357 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-12357.1.patch, HIVE-12357.2.patch, > HIVE-12357.3.patch > > > Need something like mapred.job.name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12357) Allow user to set tez job name
[ https://issues.apache.org/jira/browse/HIVE-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12357: Attachment: HIVE-12357.04.patch Rebased patch. > Allow user to set tez job name > -- > > Key: HIVE-12357 > URL: https://issues.apache.org/jira/browse/HIVE-12357 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-12357.04.patch, HIVE-12357.1.patch, > HIVE-12357.2.patch, HIVE-12357.3.patch > > > Need something like mapred.job.name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs
[ https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027621#comment-15027621 ] Ashutosh Chauhan commented on HIVE-12331: - The fix has already been committed as part of HIVE-12520 > Remove hive.enforce.bucketing & hive.enforce.sorting configs > > > Key: HIVE-12331 > URL: https://issues.apache.org/jira/browse/HIVE-12331 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12331.1.patch, HIVE-12331.patch > > > If a table is created as bucketed and/or sorted and this config is set to > false, you will insert data into the wrong buckets and/or sort order, and if > you then use these tables in BMJ or SMBJ you will get wrong results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12525) Cleanup unused metrics in HMS
[ https://issues.apache.org/jira/browse/HIVE-12525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-12525: - Attachment: HIVE-12525.patch > Cleanup unused metrics in HMS > - > > Key: HIVE-12525 > URL: https://issues.apache.org/jira/browse/HIVE-12525 > Project: Hive > Issue Type: Sub-task >Reporter: Szehon Ho > Attachments: HIVE-12525.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12357) Allow user to set tez job name
[ https://issues.apache.org/jira/browse/HIVE-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027680#comment-15027680 ] Siddharth Seth commented on HIVE-12357: --- Yes. HIVE-12448 will be required to get llap tests to work after this patch. This patch also depends on a newer version of Tez. > Allow user to set tez job name > -- > > Key: HIVE-12357 > URL: https://issues.apache.org/jira/browse/HIVE-12357 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-12357.04.patch, HIVE-12357.1.patch, > HIVE-12357.2.patch, HIVE-12357.3.patch > > > Need something like mapred.job.name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12514) Setup renewal of LLAP tokens
[ https://issues.apache.org/jira/browse/HIVE-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12514: Description: LLAP tokens are currently issued for 2 weeks; instead, they should be renewed by someone (RM?) > Setup renewal of LLAP tokens > > > Key: HIVE-12514 > URL: https://issues.apache.org/jira/browse/HIVE-12514 > Project: Hive > Issue Type: Improvement > Components: llap, Security >Affects Versions: 2.0.0 >Reporter: Siddharth Seth > > LLAP tokens are currently issued for 2 weeks; instead, they should be renewed > by someone (RM?) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8339) Job status not found after 100% succeeded map
[ https://issues.apache.org/jira/browse/HIVE-8339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027594#comment-15027594 ] Radim Kubacki commented on HIVE-8339: - [~myltik] Please look at https://issues.apache.org/jira/browse/MAPREDUCE-6312 again. Our problem is not caused by lack of space on HDFS. It is a race condition in the code that reads the job status. > Job status not found after 100% succeeded map > --- > > Key: HIVE-8339 > URL: https://issues.apache.org/jira/browse/HIVE-8339 > Project: Hive > Issue Type: Bug >Affects Versions: 0.13.1 > Environment: Hadoop 2.4.0, Hive 0.13.1. > Amazon EMR cluster of 9 i2.4xlarge nodes. > 800+GB of data in HDFS. >Reporter: Valera Chevtaev > > According to the logs, the job seems to reach 100% for both map and reduce, > but Hive was then unable to get the job status from the job history server. > Hive logs: > 2014-10-03 07:57:26,593 INFO [main]: exec.Task > (SessionState.java:printInfo(536)) - 2014-10-03 07:57:26,593 Stage-1 map = > 100%, reduce = 99%, Cumulative CPU 872541.02 sec > 2014-10-03 07:57:47,447 INFO [main]: exec.Task > (SessionState.java:printInfo(536)) - 2014-10-03 07:57:47,446 Stage-1 map = > 100%, reduce = 100%, Cumulative CPU 872566.55 sec > 2014-10-03 07:57:48,710 INFO [main]: mapred.ClientServiceDelegate > (ClientServiceDelegate.java:getProxy(273)) - Application state is completed. > FinalApplicationStatus=SUCCEEDED. 
Redirecting to job history server > 2014-10-03 07:57:48,716 ERROR [main]: exec.Task > (SessionState.java:printError(545)) - Ended Job = job_1412263771568_0002 with > exception 'java.io.IOException(Could not find status of > job:job_1412263771568_0002)' > java.io.IOException: Could not find status of job:job_1412263771568_0002 >at > org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:294) >at > org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547) >at > org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426) >at > org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136) >at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) >at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) >at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503) >at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270) >at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088) >at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) >at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901) >at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:275) >at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:227) >at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:430) >at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:366) >at > org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:463) >at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:479) >at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:759) >at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697) >at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:636) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >at java.lang.reflect.Method.invoke(Method.java:606) >at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > 2014-10-03 07:57:48,763 ERROR [main]: ql.Driver > (SessionState.java:printError(545)) - FAILED: Execution Error, return code 1 > from org.apache.hadoop.hive.ql.exec.mr.MapRedTask -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12500) JDBC driver not overlaying params supplied via properties object when reading params from ZK
[ https://issues.apache.org/jira/browse/HIVE-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027601#comment-15027601 ] Hive QA commented on HIVE-12500: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773952/HIVE-12500.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6127/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6127/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6127/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/itests/util (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-it-util --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (download-spark) @ hive-it-util --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-it-util --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-it-util --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/itests/util/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-it-util --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-it-util --- [INFO] Compiling 51 source files to /data/hive-ptest/working/apache-github-source-source/itests/util/target/classes [WARNING] /data/hive-ptest/working/apache-github-source-source/itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java: Some input files use or override a deprecated API. [WARNING] /data/hive-ptest/working/apache-github-source-source/itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java: Recompile with -Xlint:deprecation for details. [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hive-it-util --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/itests/util/src/test/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-it-util --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/util/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp/conf [copy] Copying 14 files to /data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-it-util --- [INFO] No sources to compile [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-it-util --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-it-util --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/itests/util/target/hive-it-util-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-it-util --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-it-util --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/itests/util/target/hive-it-util-2.0.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-it-util/2.0.0-SNAPSHOT/hive-it-util-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/itests/util/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-it-util/2.0.0-SNAPSHOT/hive-it-util-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Integration - Unit Tests 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-it-unit --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/itests/hive-unit/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/itests/hive-unit (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
[jira] [Commented] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs
[ https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027611#comment-15027611 ] Sergey Shelukhin commented on HIVE-12331: - Looks like some tests were committed with this setting and got broken. A follow-up JIRA is probably needed to clean up, since that timing couldn't be coordinated. > Remove hive.enforce.bucketing & hive.enforce.sorting configs > > > Key: HIVE-12331 > URL: https://issues.apache.org/jira/browse/HIVE-12331 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12331.1.patch, HIVE-12331.patch > > > If a table is created as bucketed and/or sorted and this config is set to > false, you will insert data into the wrong buckets and/or sort order, and if > you then use these tables in BMJ or SMBJ you will get wrong results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12525) Cleanup unused metrics in HMS
[ https://issues.apache.org/jira/browse/HIVE-12525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-12525: - Issue Type: Sub-task (was: Bug) Parent: HIVE-10761 > Cleanup unused metrics in HMS > - > > Key: HIVE-12525 > URL: https://issues.apache.org/jira/browse/HIVE-12525 > Project: Hive > Issue Type: Sub-task >Reporter: Szehon Ho > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-12483: --- Attachment: HIVE-12483.1-spark.patch > Fix precommit Spark test branch > --- > > Key: HIVE-12483 > URL: https://issues.apache.org/jira/browse/HIVE-12483 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-12483.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12498) ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect
[ https://issues.apache.org/jira/browse/HIVE-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027295#comment-15027295 ] Prasanth Jayachandran commented on HIVE-12498: -- Test failures are not related to this patch. They are happening in master as well. HIVE-12520 addresses the test failures. > ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect > - > > Key: HIVE-12498 > URL: https://issues.apache.org/jira/browse/HIVE-12498 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: ACID, ORC > Attachments: HIVE-12498.1.patch, HIVE-12498.2.patch > > > OrcRecordUpdater does not honor the > OrcRecordUpdater.OrcOptions.tableProperties() setting. > It would need to translate the specified tableProperties (as listed in the > OrcTableProperties enum) to the properties that OrcWriter internally > understands (listed in HiveConf.ConfVars). > This is needed for multiple clients, like the Streaming API and the Compactor. > {code:java} > Properties orcTblProps = .. // get Orc Table Properties from MetaStore; > AcidOutputFormat.Options updaterOptions = new > OrcRecordUpdater.OrcOptions(conf) > .inspector(..) > .bucket(..) > .minimumTransactionId(..) > .maximumTransactionId(..) > > .tableProperties(orcTblProps); // <<== > OrcOutputFormat orcOutput = new ... > orcOutput.getRecordUpdater(partitionPath, updaterOptions ); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
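A rough sketch of the translation the description above asks for (illustrative only: the helper and the exact config key mapping are my own assumptions, and the real fix wires this into OrcRecordUpdater rather than a standalone class):

```java
import java.util.Properties;

public class TablePropsTranslation {
    // Illustrative mapping from one ORC table property to a writer-side
    // config key; the key names are placeholders for the idea, not a
    // verified copy of Hive's property tables.
    static Properties translate(Properties tblProps) {
        Properties writerProps = new Properties();
        String compress = tblProps.getProperty("orc.compress");
        if (compress != null) {
            writerProps.setProperty("hive.exec.orc.default.compress", compress);
        }
        return writerProps;
    }

    public static void main(String[] args) {
        Properties tblProps = new Properties();
        tblProps.setProperty("orc.compress", "ZLIB");
        // The table-level setting is carried over to the writer config.
        System.out.println(translate(tblProps)
                .getProperty("hive.exec.orc.default.compress")); // prints "ZLIB"
    }
}
```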
[jira] [Updated] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-12483: --- Attachment: (was: HIVE-12483.1.patch) > Fix precommit Spark test branch > --- > > Key: HIVE-12483 > URL: https://issues.apache.org/jira/browse/HIVE-12483 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-12483.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12511) IN clause performs differently than = clause
[ https://issues.apache.org/jira/browse/HIVE-12511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027407#comment-15027407 ] Szehon Ho commented on HIVE-12511: -- Looks fine to me +1 > IN clause performs differently than = clause > > > Key: HIVE-12511 > URL: https://issues.apache.org/jira/browse/HIVE-12511 > Project: Hive > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Attachments: HIVE-12511.1.patch > > > Similar to HIVE-11973, the IN clause performs differently than the = clause for "int" > type with string values. > For example, > {noformat} > SELECT * FROM inttest WHERE iValue IN ('01'); > {noformat} > will not return any rows with int iValue = 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
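The coercion gap behind that query can be reproduced outside Hive with a minimal Java sketch (the helper names are mine, not Hive's UDF code): comparing the int column value as a string never matches the literal '01', while coercing the literal to the column type first does.

```java
public class TypeCoercionDemo {
    // Comparing as strings: the int renders as "1", which does not equal
    // the literal "01", so the row is (wrongly) filtered out.
    static boolean compareAsStrings(int iValue, String literal) {
        return String.valueOf(iValue).equals(literal);
    }

    // Comparing as ints: the literal is coerced to the column type first,
    // so "01" becomes 1 and the row matches, like "iValue = '01'" does.
    static boolean compareAsInts(int iValue, String literal) {
        return iValue == Integer.parseInt(literal);
    }

    public static void main(String[] args) {
        System.out.println(compareAsStrings(1, "01")); // prints "false"
        System.out.println(compareAsInts(1, "01"));    // prints "true"
    }
}
```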
[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028192#comment-15028192 ] Ratandeep Ratti commented on HIVE-11878: That's correct [~jdere] > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878 ClassLoader Issues when Registering > Jars.pptx, HIVE-11878.2.patch, HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_approach3_per_session_clasloader.patch, > HIVE-11878_approach3_with_review_comments.patch, > HIVE-11878_approach3_with_review_comments1.patch, HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassLoader is created, Hive > sets it as the current ThreadContextClassLoader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in the Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. 
We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
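The parent-first delegation at the heart of this bug can be demonstrated with a minimal, self-contained Java sketch (not Hive code; class and method names are mine): a child URLClassLoader with no URLs of its own never defines the classes it loads, so a class looked up through *u2* can still be defined by *u1*, and its references then resolve against *u1* only.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class DelegationDemo {
    // Returns true when a child URLClassLoader delegates a lookup to its
    // ancestors, i.e. the child is not the *defining* loader of the class.
    // This mirrors the scenario above: c1 looked up via u2 is defined by u1,
    // so c1's reference to c2 is resolved against u1, which lacks j2.
    static boolean childDelegatesToAncestor() {
        try {
            ClassLoader parent = DelegationDemo.class.getClassLoader();
            URLClassLoader child = new URLClassLoader(new URL[0], parent);
            Class<?> c = Class.forName("java.util.ArrayList", true, child);
            // ArrayList is found by delegation, so "child" never defines it.
            return c.getClassLoader() != child;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(childDelegatesToAncestor()); // prints "true"
    }
}
```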
[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11927: --- Attachment: HIVE-11927.08.patch > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook
[ https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028284#comment-15028284 ] Dapeng Sun commented on HIVE-12367: --- Thanks [~alangates] for your review :) > Lock/unlock database should add current database to inputs and outputs of > authz hook > > > Key: HIVE-12367 > URL: https://issues.apache.org/jira/browse/HIVE-12367 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.2.1 >Reporter: Dapeng Sun >Assignee: Dapeng Sun > Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, > HIVE-12367.003.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12497) Remove HADOOP_CLIENT_OPTS from hive script
[ https://issues.apache.org/jira/browse/HIVE-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028286#comment-15028286 ] Gopal V commented on HIVE-12497: [~prasanth_j]: LGTM - +1. The mapredcp looks like it adds some delays too. Making a note to remove HBase deps for beeline. > Remove HADOOP_CLIENT_OPTS from hive script > -- > > Key: HIVE-12497 > URL: https://issues.apache.org/jira/browse/HIVE-12497 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12497.1.patch, HIVE-12497.2.patch, > HIVE-12497.3.patch > > > HADOOP_CLIENT_OPTS added in HIVE-11304 to get around log4j error adds ~5 > seconds delay to hive startup. > {code:title=with HADOOP_CLIENT_OPTS} > time hive --version > real 0m11.948s > user 0m13.026s > sys 0m3.979s > {code} > {code:title=without HADOOP_CLIENT_OPTS} > time hive --version > real 0m7.053s > user 0m7.254s > sys 0m3.589s > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12479) Vectorization: Vectorized Date UDFs with up-stream Joins
[ https://issues.apache.org/jira/browse/HIVE-12479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026558#comment-15026558 ] Gopal V commented on HIVE-12479: The batch.selectedInUse and col.isRepeating were interfering with each other's semantics. Fix for all Date UDFs + added test-case with 1 value column, with isRepeating=true, isSelectedInUse=true & .selected[0] = 42; After the fix, it should not be trying to read col[42], but should operate on col[0] because isRepeating is true. > Vectorization: Vectorized Date UDFs with up-stream Joins > > > Key: HIVE-12479 > URL: https://issues.apache.org/jira/browse/HIVE-12479 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12479.1.patch, HIVE-12479.tar.gz > > > The row-counts expected with and without vectorization differ. > The attached small-scale repro case produces 5 rows with vectorized multi-key > joins and 53 rows without the vectorized join. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
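The selected[]/isRepeating interaction in that comment can be sketched with plain Java (a simplified model of a vectorized column, not the actual ColumnVector code): when isRepeating is set, the single physical value lives in slot 0, so the selected[] indirection must not be applied.

```java
public class RepeatingColumnDemo {
    // Simplified model of reading one logical row from a vectorized column.
    // With isRepeating, the column stores a single value in vector[0];
    // applying selected[] indirection first (e.g. selected[0] = 42) would
    // index past the end of the one-element array -- the bug described above.
    static long valueAt(long[] vector, boolean isRepeating,
                        boolean selectedInUse, int[] selected, int row) {
        if (isRepeating) {
            return vector[0];                   // ignore selected[] entirely
        }
        int i = selectedInUse ? selected[row] : row;
        return vector[i];
    }

    public static void main(String[] args) {
        long[] vector = { 7L };   // repeating column: one physical value
        int[] selected = { 42 };  // selected[0] = 42, as in the test-case
        System.out.println(valueAt(vector, true, true, selected, 0)); // prints "7"
    }
}
```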
[jira] [Commented] (HIVE-12490) Metastore: Mysql ANSI_QUOTES is not there for some cases
[ https://issues.apache.org/jira/browse/HIVE-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026567#comment-15026567 ] Hive QA commented on HIVE-12490: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773905/HIVE-12490.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 9825 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_nonascii org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_join_transpose org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_fetchwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_mapwork_table org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_fetchwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6122/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6122/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6122/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 26 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12773905 - PreCommit-HIVE-TRUNK-Build > Metastore: Mysql ANSI_QUOTES is not there for some cases > > > Key: HIVE-12490 > URL: https://issues.apache.org/jira/browse/HIVE-12490 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-12490.WIP.patch, HIVE-12490.patch > > > {code} > Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You > have an error in your SQL syntax; check the manual that corresponds to your > MySQL server version for the right syntax to use near '"PART_COL_STATS" where > "DB_NAME" = 'tpcds_100' and "TABLE_NAME" = > 'store_sales' at line 1 > ... > at > org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451) > ~[datanucleus-api-jdo-3.2.6.jar:?] > at > org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) > ~[datanucleus-api-jdo-3.2.6.jar:?] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:1644) > [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.partsFoundForPartitions(MetaStoreDirectSql.java:1227) > [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT] > at >
[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026452#comment-15026452 ] Gopal V commented on HIVE-12462: I think the FIL path is having its Synthetic predicate removed too early. I'll take another shot at this. > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Critical > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12463) VectorMapJoinFastKeyStore has Array OOB errors
[ https://issues.apache.org/jira/browse/HIVE-12463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-12463: --- Fix Version/s: 2.0.0 1.3.0 > VectorMapJoinFastKeyStore has Array OOB errors > -- > > Key: HIVE-12463 > URL: https://issues.apache.org/jira/browse/HIVE-12463 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12463.1.patch, HIVE-12463.2.patch > > > When combining different-sized keys, we observe an occasional error in > hashtable probes. > {code} > Caused by: java.lang.ArrayIndexOutOfBoundsException: 162046429 > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastKeyStore.equalKey(VectorMapJoinFastKeyStore.java:150) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.findReadSlot(VectorMapJoinFastBytesHashTable.java:191) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.lookup(VectorMapJoinFastBytesHashMap.java:76) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerMultiKeyOperator.process(VectorMapJoinInnerMultiKeyOperator.java:300) > ... 26 more > {code} > {code} > // Our reading is positioned to the key. > writeBuffers.getByteSegmentRefToCurrent(byteSegmentRef, keyLength, > readPos); > byte[] currentBytes = byteSegmentRef.getBytes(); > int currentStart = (int) byteSegmentRef.getOffset(); > for (int i = 0; i < keyLength; i++) { > if (currentBytes[currentStart + i] != keyBytes[keyStart + i]) { > // LOG.debug("VectorMapJoinFastKeyStore equalKey no match on bytes"); > return false; > } > } > {code} > This needs an identical fix to match: > {code} > // Rare case of buffer boundary. Unfortunately we'd have to copy some > bytes. 
> byte[] bytes = new byte[length]; > int destOffset = 0; > while (destOffset < length) { > ponderNextBufferToRead(readPos); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
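The boundary-copy fix referenced above can be sketched in plain Java. This is an illustrative reconstruction only — the class name `KeyStoreBoundarySketch` and the `equalKey` signature are invented for this sketch and are not Hive's actual `VectorMapJoinFastKeyStore` API. The point it shows: when a stored key straddles two internal buffers, the bytes must first be copied into one contiguous array before the byte-for-byte comparison quoted in the issue is safe.

```java
public class KeyStoreBoundarySketch {
  // Compare a probe key against a stored key that may be split across two
  // buffers, with 'boundary' bytes of the key living in the first buffer.
  static boolean equalKey(byte[] buf1, int start1, byte[] buf2,
                          int boundary, byte[] keyBytes, int keyStart, int keyLength) {
    // Rare case of buffer boundary: copy the pieces into one array first,
    // so the index arithmetic below can never run past either buffer.
    byte[] bytes = new byte[keyLength];
    int firstPart = Math.min(boundary, keyLength);
    System.arraycopy(buf1, start1, bytes, 0, firstPart);
    if (firstPart < keyLength) {
      System.arraycopy(buf2, 0, bytes, firstPart, keyLength - firstPart);
    }
    for (int i = 0; i < keyLength; i++) {
      if (bytes[i] != keyBytes[keyStart + i]) {
        return false; // no match on bytes
      }
    }
    return true;
  }

  public static void main(String[] args) {
    byte[] buf1 = {1, 2, 3};
    byte[] buf2 = {4, 5};
    byte[] key = {1, 2, 3, 4, 5};
    System.out.println(equalKey(buf1, 0, buf2, 3, key, 0, 5)); // true
  }
}
```

The bug described in the issue is the opposite pattern: indexing `currentBytes[currentStart + i]` directly, which overruns the array when the key crosses the buffer boundary.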
[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly
[ https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026545#comment-15026545 ] Gopal V commented on HIVE-12473: [~sershe]: LGTM - +1. I'll make a note & do a bit of type refactoring after talking to [~hagleitn] next sprint. > DPP: UDFs on the partition column side does not evaluate correctly > -- > > Key: HIVE-12473 > URL: https://issues.apache.org/jira/browse/HIVE-12473 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-12473.patch > > > Related to HIVE-12462 > {code} > select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) > and account_id = 22; > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > {code} > Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only > checks for final type, not the column type. > {code} > ObjectInspector oi = > > PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory > .getPrimitiveTypeInfo(si.fieldInspector.getTypeName())); > Converter converter = > ObjectInspectorConverters.getConverter( > PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
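The type mismatch described above — `year(dt)` effectively becoming `year(cast(dt as int))` because the pruner converts using the expression's final type instead of the column type — can be illustrated without Hive's ObjectInspector machinery. A minimal sketch, assuming the partition column is a date-formatted string; both helper names are hypothetical:

```java
import java.time.LocalDate;

public class DppTypeSketch {
  // Correct path: convert the partition value using the column's type (date),
  // then apply year().
  static Integer yearViaColumnType(String dt) {
    return LocalDate.parse(dt).getYear();
  }

  // Buggy path: convert using the expression's final type (int). A date
  // string is not a valid int, so a Hive-style lenient cast yields null and
  // the pruning predicate silently evaluates wrong.
  static Integer yearViaFinalType(String dt) {
    try {
      return Integer.parseInt(dt); // "2015-11-25" is not parseable as int
    } catch (NumberFormatException e) {
      return null; // lenient cast: null instead of an error
    }
  }

  public static void main(String[] args) {
    System.out.println(yearViaColumnType("2015-11-25")); // 2015
    System.out.println(yearViaFinalType("2015-11-25"));  // null
  }
}
```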
[jira] [Updated] (HIVE-12479) Vectorization: Vectorized Date UDFs with up-stream Joins
[ https://issues.apache.org/jira/browse/HIVE-12479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-12479: --- Attachment: HIVE-12479.1.patch > Vectorization: Vectorized Date UDFs with up-stream Joins > > > Key: HIVE-12479 > URL: https://issues.apache.org/jira/browse/HIVE-12479 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12479.1.patch, HIVE-12479.tar.gz > > > The row-counts expected with and without vectorization differ. > The attached small-scale repro case produces 5 rows with vectorized multi-key > joins and 53 rows without the vectorized join. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12479) Vectorization: Vectorized Date UDFs with up-stream Joins
[ https://issues.apache.org/jira/browse/HIVE-12479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026559#comment-15026559 ] Gopal V commented on HIVE-12479: [~prasanth_j]: can you take a look? > Vectorization: Vectorized Date UDFs with up-stream Joins > > > Key: HIVE-12479 > URL: https://issues.apache.org/jira/browse/HIVE-12479 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12479.1.patch, HIVE-12479.tar.gz > > > The row-counts expected with and without vectorization differ. > The attached small-scale repro case produces 5 rows with vectorized multi-key > joins and 53 rows without the vectorized join. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6113) Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
[ https://issues.apache.org/jira/browse/HIVE-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksiy Sayankin updated HIVE-6113: --- Attachment: HIVE-6113.5.patch Fixed compilation error in TestMetastoreVersion.java > Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient > -- > > Key: HIVE-6113 > URL: https://issues.apache.org/jira/browse/HIVE-6113 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.1 > Environment: hadoop-0.20.2-cdh3u3,hive-0.12.0 >Reporter: William Stone >Assignee: Oleksiy Sayankin >Priority: Critical > Labels: HiveMetaStoreClient, metastore, unable_instantiate > Attachments: HIVE-6113-2.patch, HIVE-6113.3.patch, HIVE-6113.4.patch, > HIVE-6113.5.patch, HIVE-6113.patch > > > When I execute the SQL "use fdm; desc formatted fdm.tableName;" from Python, it throws the > error below, > but when I try it again, it succeeds. > 2013-12-25 03:01:32,290 ERROR exec.DDLTask (DDLTask.java:execute(435)) - > org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: > Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient > at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1143) > at > org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1128) > at > org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3479) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:237) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:260) > at 
org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:507) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:875) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:769) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:708) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:197) > Caused by: java.lang.RuntimeException: Unable to instantiate > org.apache.hadoop.hive.metastore.HiveMetaStoreClient > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1217) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:62) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72) > at > org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2372) > at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2383) > at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1139) > ... 20 more > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1210) > ... 
25 more > Caused by: javax.jdo.JDODataStoreException: Exception thrown flushing changes > to datastore > NestedThrowables: > java.sql.BatchUpdateException: Duplicate entry 'default' for key > 'UNIQUE_DATABASE' > at > org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451) > at > org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:165) > at > org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:358) > at > org.apache.hadoop.hive.metastore.ObjectStore.createDatabase(ObjectStore.java:404) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
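The innermost exception above — a duplicate-key insert of the 'default' database — is the classic non-atomic check-then-create race: two clients both see the database as missing and both attempt the insert, and the loser hits the `UNIQUE_DATABASE` constraint. A minimal stdlib sketch of the idempotent-create pattern, with a `ConcurrentHashMap` standing in for the metastore table and all names hypothetical (this is not ObjectStore code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DefaultDbRaceSketch {
  // putIfAbsent stands in for the SQL INSERT: a non-null prior value plays
  // the role of the duplicate-key error, which is treated as "already
  // exists" rather than a failure for a built-in database like 'default'.
  static boolean createDatabaseIdempotent(Map<String, String> dbs, String name) {
    return dbs.putIfAbsent(name, "created") == null;
  }

  public static void main(String[] args) {
    Map<String, String> dbs = new ConcurrentHashMap<>();
    System.out.println(createDatabaseIdempotent(dbs, "default")); // true: created
    System.out.println(createDatabaseIdempotent(dbs, "default")); // false: already existed, no error
  }
}
```

This also matches the reporter's symptom: the first attempt fails while the race is lost, and the retry succeeds because the database now exists.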
[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x
[ https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026431#comment-15026431 ] Feng Yuan commented on HIVE-12175: -- Sorry, I applied this patch on master and got: Caused by: java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: java.util.Properties Serialization trace: keyDesc (org.apache.hadoop.hive.ql.plan.ReduceWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:423) at org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:294) at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:117) ... 14 more Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: java.util.Properties Serialization trace: keyDesc (org.apache.hadoop.hive.ql.plan.ReduceWork) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1025) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:933) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:947) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:403) ... 
16 more Caused by: java.lang.ClassNotFoundException: java.util.Properties at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:270) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136) ... 25 more > Upgrade Kryo version to 3.0.x > - > > Key: HIVE-12175 > URL: https://issues.apache.org/jira/browse/HIVE-12175 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, > HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, > HIVE-12175.5.patch, HIVE-12175.6.patch > > > The current version of kryo (2.22) has an issue (see the exception below and in > HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We > need to either replace all occurrences of Arrays.asList() or change the > current StdInstantiatorStrategy. This issue is fixed in later versions and > the kryo community recommends using DefaultInstantiatorStrategy with fallback to > StdInstantiatorStrategy. More discussion about this issue is here > https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom > serialization/deserialization class can be provided for Arrays.asList. > Also, kryo 3.0 introduced unsafe-based serialization which claims to have > much better performance for certain types of serialization.
> Exception: > {code} > Caused by: java.lang.NullPointerException > at java.util.Arrays$ArrayList.size(Arrays.java:2847) > at java.util.AbstractList.add(AbstractList.java:108) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) > ... 57 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
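The Arrays.asList() failure mode can be demonstrated without Kryo itself: Arrays.asList returns a fixed-size List whose add() throws, so any deserializer that repopulates a collection via add() (as Kryo 2.22's CollectionSerializer does) breaks; the NPE quoted above is a variant where StdInstantiatorStrategy builds the list without its backing array. A minimal sketch of the fixed-size behavior and the copy-to-ArrayList workaround mentioned in the issue (class name illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArraysAsListSketch {
  // Probe whether a list tolerates the add() calls a collection
  // deserializer makes while repopulating it.
  static boolean supportsAdd(List<Integer> list) {
    try {
      list.add(42);
      return true;
    } catch (UnsupportedOperationException e) {
      return false; // Arrays.asList's fixed-size view rejects add()
    }
  }

  public static void main(String[] args) {
    List<Integer> fixed = Arrays.asList(1, 2, 3);
    // Workaround: copy into a plain, growable ArrayList before serializing.
    List<Integer> copy = new ArrayList<>(fixed);
    System.out.println(supportsAdd(fixed)); // false
    System.out.println(supportsAdd(copy));  // true
  }
}
```

The DefaultInstantiatorStrategy fix recommended by the kryo community sidesteps this by using the no-arg constructor when one exists, so collection classes start from a properly initialized state.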
[jira] [Updated] (HIVE-9642) Hive metastore client retries don't happen consistently for all api calls
[ https://issues.apache.org/jira/browse/HIVE-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-9642: - Attachment: HIVE-9642.4.patch A minor change to move reloginExpiringKeytabUser inside loop. > Hive metastore client retries don't happen consistently for all api calls > - > > Key: HIVE-9642 > URL: https://issues.apache.org/jira/browse/HIVE-9642 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Xiaobing Zhou >Assignee: Daniel Dai > Attachments: HIVE-9642.1.patch, HIVE-9642.2.patch, HIVE-9642.3.patch, > HIVE-9642.4.patch > > > When org.apache.thrift.transport.TTransportException is thrown for issues > like socket timeout, the retry via RetryingMetaStoreClient happens only in > certain cases. > Retry happens for the getDatabase call in but not for getAllDatabases(). > The reason is RetryingMetaStoreClient checks for TTransportException being > the cause for InvocationTargetException. But in case of some calls such as > getAllDatabases in HiveMetastoreClient, all exceptions get wrapped in a > MetaException. We should remove this unnecessary wrapping of exceptions for > certain functions in HMC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
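The retry gap described in this issue can be sketched with plain exceptions: the retry wrapper only fires when the transport error appears in the cause chain, so an API call that re-wraps the transport error in a MetaException-style exception with the cause dropped defeats the check. All names here are hypothetical stand-ins, not Hive's actual RetryingMetaStoreClient API:

```java
public class RetryCauseSketch {
  static class TransportException extends RuntimeException {}

  // Mimics the problematic wrapping: only the message survives; the cause
  // chain is deliberately NOT attached, as with some HiveMetaStoreClient calls.
  static class MetaException extends RuntimeException {
    MetaException(String msg) { super(msg); }
  }

  // Mirrors the retry decision: walk the cause chain looking for a
  // transport-level failure worth retrying.
  static boolean shouldRetry(Throwable t) {
    for (Throwable c = t; c != null; c = c.getCause()) {
      if (c instanceof TransportException) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // getDatabase-style failure: transport error kept as the cause -> retried.
    RuntimeException direct = new RuntimeException(new TransportException());
    // getAllDatabases-style failure: message copied, cause chain lost -> not retried.
    MetaException wrapped = new MetaException("transport error: socket timeout");
    System.out.println(shouldRetry(direct));  // true
    System.out.println(shouldRetry(wrapped)); // false
  }
}
```

This is why the issue proposes removing the unnecessary MetaException wrapping: preserving the original cause chain is what lets the generic retry logic work uniformly across API calls.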