[jira] [Commented] (HIVE-20103) WM: Only Aggregate DAG counters if at least one is used
[ https://issues.apache.org/jira/browse/HIVE-20103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535636#comment-16535636 ]

Hive QA commented on HIVE-20103:

| (/) *{color:green}+1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 4s{color} | {color:blue} ql in master has 2287 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 49s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12437/dev-support/hive-personality.sh |
| git revision | master / 8494522 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12437/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> WM: Only Aggregate DAG counters if at least one is used
> ---
>
> Key: HIVE-20103
> URL: https://issues.apache.org/jira/browse/HIVE-20103
> Project: Hive
> Issue Type: Bug
> Components: llap
> Affects Versions: 3.0.0, 4.0.0
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Major
> Labels: Branch3Candidate
> Attachments: HIVE-20103.1.patch
>
> {code}
> status = dagClient.getDAGStatus(EnumSet.of(StatusGetOpts.GET_COUNTERS), checkInterval);
> TezCounters dagCounters = status.getDAGCounters();
> ...
> if (dagCounters != null && wmContext != null) {
>   Set<String> desiredCounters = wmContext.getSubscribedCounters();
>   if (desiredCounters != null && !desiredCounters.isEmpty()) {
>     Map<String, Long> currentCounters = getCounterValues(dagCounters,
>         vertexNames, vertexProgressMap, desiredCounters, done);
> {code}
> Skip collecting DAG counters unless there is at least one desired counter in
> wmContext.
> The AM has a hard-lock around the counters, so the current jstacks are full
> of
> {code}
> java.lang.Thread.State: RUNNABLE
> at java.lang.String.intern(Native Method)
> at org.apache.hadoop.util.StringInterner.weakIntern(StringInterner.java:71)
> at org.apache.tez.common.counters.GenericCounter.<init>(GenericCounter.java:50)
> at org.apache.tez.common.counters.TezCounters$GenericGroup.newCounter(TezCounters.java:65)
> at org.apache.tez.common.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:92)
> {code}
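The fix described above amounts to moving the subscription check in front of the expensive GET_COUNTERS status call, so the AM's counter lock is never touched when nobody consumes the result. A minimal sketch of that guard, using a plain `Set<String>` stand-in for the real WmContext/DAGClient API (the method name `shouldFetchCounters` is illustrative, not Hive code):

```java
import java.util.Collections;
import java.util.Set;

public class CounterFetchGuard {
    // Decide whether the expensive GET_COUNTERS status request is worthwhile:
    // only when at least one counter actually has a subscriber.
    static boolean shouldFetchCounters(Set<String> subscribedCounters) {
        return subscribedCounters != null && !subscribedCounters.isEmpty();
    }

    public static void main(String[] args) {
        // No subscribers (null or empty): skip the status call entirely.
        assert !shouldFetchCounters(null);
        assert !shouldFetchCounters(Collections.emptySet());
        // At least one subscriber: worth aggregating DAG counters.
        assert shouldFetchCounters(Set.of("NUM_KILLED_TASKS"));
        System.out.println("guard ok");
    }
}
```

The point of the reordering is that the cheap local check short-circuits before any RPC to the AM is issued, rather than fetching counters first and discarding them.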
[jira] [Comment Edited] (HIVE-19267) Replicate ACID/MM tables write operations.
[ https://issues.apache.org/jira/browse/HIVE-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535630#comment-16535630 ]

Sankar Hariappan edited comment on HIVE-19267 at 7/7/18 5:17 AM:
-

[~maheshk114], I think [~vgarg]'s point is that if this patch is targeted to branch-3 (3.2.0), then we need to create upgrade-3.1.0-to-3.2.0..sql scripts for each RDBMS and move all the changes relevant to this patch from upgrade-3.0.0-to-3.1.0..sql. Also, as 3.1.0 is already cut off, we need to keep these changes in the hive-schema-3.2.0..sql scripts and remove them from hive-schema-3.1.0..sql.

was (Author: sankarh): [~maheshk114], I think [~vgarg]'s point is that if this patch is targeted to branch-3 (3.2.0), then we need to create upgrade-3.1.0-to-3.2.0..sql scripts for each RDBMS and move all the changes from upgrade-3.0.0-to-3.1.0..sql. Also, as 3.1.0 is already cut off, we need to keep these changes in the hive-schema-3.2.0..sql scripts and remove them from hive-schema-3.1.0..sql.

> Replicate ACID/MM tables write operations.
> --
>
> Key: HIVE-19267
> URL: https://issues.apache.org/jira/browse/HIVE-19267
> Project: Hive
> Issue Type: Sub-task
> Components: repl, Transactions
> Affects Versions: 3.0.0
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 4.0.0
>
> Attachments: HIVE-19267.01-branch-3.patch, HIVE-19267.01.patch, HIVE-19267.02.patch, HIVE-19267.03.patch, HIVE-19267.04.patch, HIVE-19267.05.patch, HIVE-19267.06.patch, HIVE-19267.07.patch, HIVE-19267.08.patch, HIVE-19267.09.patch, HIVE-19267.10.patch, HIVE-19267.11.patch, HIVE-19267.12.patch, HIVE-19267.13.patch, HIVE-19267.14.patch, HIVE-19267.15.patch, HIVE-19267.16.patch, HIVE-19267.17.patch, HIVE-19267.18.patch, HIVE-19267.19.patch, HIVE-19267.20.patch, HIVE-19267.21.patch, HIVE-19267.22.patch
>
> h1. Replicate ACID write Events
> * Create a new EVENT_WRITE event with a related message format to log the write operations within a txn along with the data associated.
> * Log this event when performing any writes (insert into, insert overwrite, load table, delete, update, merge, truncate) on a table/partition.
> * If a single MERGE/UPDATE/INSERT/DELETE statement operates on multiple partitions, then we need to log one event per partition.
> * DbNotificationListener should log this type of event to a special metastore table named "MTxnWriteNotificationLog".
> * This table should maintain a map of txn ID against the list of tables/partitions written by the given txn.
> * The entry for a given txn should be removed by the cleaner thread that removes the expired events from EventNotificationTable.
> h1. Replicate Commit Txn operation (with writes)
> Add a new EVENT_COMMIT_TXN to log the metadata/data of all tables/partitions modified within the txn.
> *Source warehouse:*
> * This event should read the EVENT_WRITEs from the "MTxnWriteNotificationLog" metastore table to consolidate the list of tables/partitions modified within this txn scope.
> * Based on the list of tables/partitions modified and the table Write ID, we need to compute the list of delta files added by this txn.
> * Repl dump should read this message and dump the metadata and the delta files list.
> *Target warehouse:*
> * Ensure snapshot isolation at the target for on-going read txns, which shouldn't view the data replicated from the committed txn. (Ensured with open and allocate write ID events.)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
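The bullet points above amount to a txn-id-to-writes map with TTL-based cleanup: one entry per write event, consolidated at commit, and pruned by the cleaner thread. A hedged, self-contained sketch of that bookkeeping (the class and method names are illustrative stand-ins, not the actual MTxnWriteNotificationLog metastore schema):

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class TxnWriteLog {
    // txn id -> tables/partitions it wrote
    private final Map<Long, Set<String>> writesByTxn = new ConcurrentHashMap<>();
    // txn id -> time the first write event for that txn was logged
    private final Map<Long, Long> loggedAtMs = new ConcurrentHashMap<>();

    // Called once per (txn, table-or-partition) write event.
    public void logWrite(long txnId, String tableOrPartition, long nowMs) {
        writesByTxn.computeIfAbsent(txnId, k -> ConcurrentHashMap.newKeySet())
                   .add(tableOrPartition);
        loggedAtMs.putIfAbsent(txnId, nowMs);
    }

    // At COMMIT_TXN time: consolidate everything the txn touched.
    public Set<String> writesOf(long txnId) {
        return writesByTxn.getOrDefault(txnId, Collections.emptySet());
    }

    // Cleaner thread: drop entries older than the event TTL.
    public void expire(long nowMs, long ttlMs) {
        loggedAtMs.entrySet().removeIf(e -> {
            if (nowMs - e.getValue() > ttlMs) {
                writesByTxn.remove(e.getKey());
                return true;
            }
            return false;
        });
    }
}
```

In the real design this state lives in a metastore table rather than in memory, but the access pattern (append per write, read-and-consolidate at commit, expire with the notification-log cleaner) is the same.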
[jira] [Commented] (HIVE-19267) Replicate ACID/MM tables write operations.
[ https://issues.apache.org/jira/browse/HIVE-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535630#comment-16535630 ]

Sankar Hariappan commented on HIVE-19267:
-

[~maheshk114], I think [~vgarg]'s point is that if this patch is targeted to branch-3 (3.2.0), then we need to create upgrade-3.1.0-to-3.2.0..sql scripts for each RDBMS and move all the changes from upgrade-3.0.0-to-3.1.0..sql. Also, as 3.1.0 is already cut off, we need to keep these changes in the hive-schema-3.2.0..sql scripts and remove them from hive-schema-3.1.0..sql.
[jira] [Commented] (HIVE-20069) Fix reoptimization in case of DPP and Semijoin optimization
[ https://issues.apache.org/jira/browse/HIVE-20069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535628#comment-16535628 ]

Hive QA commented on HIVE-20069:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12930491/HIVE-20069.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14632 tests executed

*Failed tests:*
{noformat}
TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=240)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12436/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12436/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12436/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12930491 - PreCommit-HIVE-Build

> Fix reoptimization in case of DPP and Semijoin optimization
> ---
>
> Key: HIVE-20069
> URL: https://issues.apache.org/jira/browse/HIVE-20069
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Attachments: HIVE-20069.01.patch, HIVE-20069.02.patch, HIVE-20069.02.patch
>
> reported by [~t3rmin4t0r]
> In the case of dynamic partition pruning, the operator statistics become partial and only reflect the actually scanned partitions, but they are then used as information about the "full" table, which leads to the two joined tables being exchanged, which is really unfortunate.
[jira] [Updated] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores
[ https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-20006:
---
Attachment: HIVE-20006.03.patch

> Make materializations invalidation cache work with multiple active remote
> metastores
>
> Key: HIVE-20006
> URL: https://issues.apache.org/jira/browse/HIVE-20006
> Project: Hive
> Issue Type: Improvement
> Components: Materialized views
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, HIVE-19027.03.patch, HIVE-19027.04.patch, HIVE-20006.01.patch, HIVE-20006.02.patch, HIVE-20006.03.patch, HIVE-20006.patch
>
> The main points:
> - Only MVs stored in transactional tables can have a time window value of 0. Those are the only MVs that can be guaranteed not to be outdated when a query is executed; if we use custom storage handlers to store the materialized view, we cannot make any promises.
> - For MVs that +cannot be outdated+, we do not check the metastore. Instead, the comparison is based on valid write id lists.
> - For MVs that +can be outdated+, we still rely on the invalidation cache.
> ** The window for valid outdated MVs can be specified in intervals of 1 minute (below that, it is difficult to give any guarantee about whether the MV is actually outdated by less than a minute or not).
> ** The async loading is done every interval / 2 (or, probably better, we can make it configurable).
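The time-window rules above reduce to simple arithmetic: an MV that can be outdated is usable without consulting the invalidation cache only while its age is within the configured window (whole minutes), and the async cache reload runs every window / 2. A hedged sketch of just that arithmetic (class and method names are illustrative, not Hive's API):

```java
public class MvFreshness {
    static final long MINUTE_MS = 60_000L;

    // Window is expressed in whole minutes; a window of 0 is reserved for
    // transactional MVs, which are checked via valid write id lists instead.
    static boolean usableWithoutCacheCheck(long lastRebuildMs, long nowMs, long windowMinutes) {
        return windowMinutes > 0 && nowMs - lastRebuildMs <= windowMinutes * MINUTE_MS;
    }

    // Async invalidation-cache reload period: half the window.
    static long asyncReloadPeriodMs(long windowMinutes) {
        return windowMinutes * MINUTE_MS / 2;
    }

    public static void main(String[] args) {
        assert usableWithoutCacheCheck(0L, 30_000L, 1L);   // 30s old, 1m window
        assert !usableWithoutCacheCheck(0L, 61_000L, 1L);  // just past the window
        assert asyncReloadPeriodMs(4L) == 120_000L;        // 4m window -> 2m reload
        System.out.println("freshness ok");
    }
}
```

Reloading at half the window is the usual way to guarantee the cache is never more than one window out of date at decision time.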
[jira] [Commented] (HIVE-19267) Replicate ACID/MM tables write operations.
[ https://issues.apache.org/jira/browse/HIVE-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535624#comment-16535624 ]

mahesh kumar behera commented on HIVE-19267:

The patch will be committed to branch-3 also.
[jira] [Commented] (HIVE-19267) Replicate ACID/MM tables write operations.
[ https://issues.apache.org/jira/browse/HIVE-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535620#comment-16535620 ]

mahesh kumar behera commented on HIVE-19267:

[~vgarg] The 2.3.0-to-3.0.0 scripts were modified because there were problems in them and they were not following the standards ("-- These lines need to be last. Insert any changes above."). Upgrade/installation tests were done for mysql, but for the 5.7 version. The above mentioned issue comes up only for 5.6 and below.
[jira] [Commented] (HIVE-20069) Fix reoptimization in case of DPP and Semijoin optimization
[ https://issues.apache.org/jira/browse/HIVE-20069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535616#comment-16535616 ]

Hive QA commented on HIVE-20069:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 4s{color} | {color:blue} ql in master has 2287 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 39s{color} | {color:red} ql: The patch generated 2 new + 38 unchanged - 0 fixed = 40 total (was 38) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 29 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 55s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12436/dev-support/hive-personality.sh |
| git revision | master / 8494522 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12436/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-12436/yetus/whitespace-eol.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-12436/yetus/whitespace-tabs.txt |
| modules | C: ql itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12436/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-19267) Replicate ACID/MM tables write operations.
[ https://issues.apache.org/jira/browse/HIVE-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535611#comment-16535611 ]

Vineet Garg commented on HIVE-19267:

The commit pushed to master changes the wrong upgrade files (upgrade-2.3.0-to-3.0.0.derby.sql and upgrade-3.0.0-to-3.1.0.derby.sql). Hive 3.0.0 is already released and the 3.1 branch has been cut. If this isn't targeted for branch-3 (3.2), then the changes should go in upgrade-3.1.0-to-4.0.0.derby.sql (same for all databases), and hive-schema-3.1 etc. shouldn't be touched. Also, please run the metastore upgrade/installation tests (you can find more on them in standalone-metastore/DEV-README). These should tell you whether your changes break upgrade or installation. BTW, is this targeted for 3.2? If so, there is a whole new set of files which need to be created and modified.
[jira] [Commented] (HIVE-20090) Extend creation of semijoin reduction filters to be able to discover new opportunities
[ https://issues.apache.org/jira/browse/HIVE-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535604#comment-16535604 ]

Hive QA commented on HIVE-20090:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12930483/HIVE-20090.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 39 failed/errored test(s), 14631 tests executed

*Failed tests:*
{noformat}
TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=240)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_partition_pruning] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_2] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_fixed_bucket_pruning] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_partition_pruning] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_3] (batchId=186)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query17] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query18] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query1] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query24] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query25] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query29] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query2] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query31] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query32] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query39] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query40] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query50] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query54] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query59] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query5] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query69] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query72] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query77] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query78] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query80] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query91] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query92] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query93] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=261)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query95] (batchId=261)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12435/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12435/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12435/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 39 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12930483 - PreCommit-HIVE-Build

> Extend creation of semijoin reduction filters to be able to discover new
> opportunities
>
[jira] [Commented] (HIVE-20090) Extend creation of semijoin reduction filters to be able to discover new opportunities
[ https://issues.apache.org/jira/browse/HIVE-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535603#comment-16535603 ] Gopal V commented on HIVE-20090: [~jcamachorodriguez]: you end up calling
{code}
+ GenTezUtils.removeBranch(p.getLeft());
{code}
on the same item twice, which needs GenTezUtils.removeBranch to do this
{code}
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java
index bb0de94..2aae6dc 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java
@@ -496,6 +496,10 @@ public static void removeBranch(Operator event) {
     while (curr.getChildOperators().size() <= 1) {
       child = curr;
+      if (curr.getParentOperators().isEmpty()) {
+        // this is already a dead-branch
+        return;
+      }
       curr = curr.getParentOperators().get(0);
     }
{code}
> Extend creation of semijoin reduction filters to be able to discover new > opportunities > -- > > Key: HIVE-20090 > URL: https://issues.apache.org/jira/browse/HIVE-20090 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20090.01.patch, HIVE-20090.02.patch > > > Assume the following plan: > {noformat} > TS[0] - RS[1] - JOIN[4] - RS[5] - JOIN[8] - FS[9] > TS[2] - RS[3] - JOIN[4] > TS[6] - RS[7] - JOIN[8] > {noformat} > Currently, {{TS\[6\]}} may only be reduced with the output of {{RS\[5\]}}, > i.e., input to join between both subplans. > However, it may be useful to consider other possibilities too, e.g., reduced > by the output of {{RS\[1\]}} or {{RS\[3\]}}. For instance, this is important > when, given a large plan, an edge between {{RS[5]}} and {{TS[0]}} would > create a cycle, while an edge between {{RS[1]}} and {{TS[6]}} would not. > This patch comprises two parts. 
First, it creates additional predicates when > possible. Secondly, it removes duplicate semijoin reduction > branches/predicates, e.g., if another semijoin that consumes the output of > the same expression already reduces a certain table scan operator (heuristic, > since this may not result in most efficient plan in all cases). Ultimately, > the decision on whether to use one or another should be cost-driven > (follow-up). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
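The guard suggested above can be illustrated outside Hive. The sketch below is a minimal stand-in (the `Op` class and method shapes are hypothetical, not the real Hive operator classes): walking up a chain of single-child operators, a node with no parents means the branch was already detached, so a second removeBranch call on the same item returns safely instead of failing on `getParentOperators().get(0)`.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for an operator with parent/child links (hypothetical,
// not the real Hive Operator class).
class Op {
    final String name;
    final List<Op> parents = new ArrayList<>();
    final List<Op> children = new ArrayList<>();

    Op(String name) { this.name = name; }

    static void link(Op parent, Op child) {
        parent.children.add(child);
        child.parents.add(parent);
    }
}

public class RemoveBranchDemo {
    // Walk up from 'event' while each node has at most one child, then detach
    // the branch from the first multi-child ancestor. Returns false when the
    // branch was already detached (the dead-branch guard).
    static boolean removeBranch(Op event) {
        Op curr = event;
        Op child = null;
        while (curr.children.size() <= 1) {
            child = curr;
            if (curr.parents.isEmpty()) {
                return false; // already a dead branch: nothing left to detach
            }
            curr = curr.parents.get(0);
        }
        curr.children.remove(child);
        child.parents.remove(curr);
        return true;
    }

    public static void main(String[] args) {
        // TS feeds both a SEL->RS branch and a JOIN, so TS has two children.
        Op ts = new Op("TS"), sel = new Op("SEL"), rs = new Op("RS"), join = new Op("JOIN");
        Op.link(ts, sel);
        Op.link(sel, rs);
        Op.link(ts, join);
        System.out.println(removeBranch(rs)); // true: first call detaches the branch
        System.out.println(removeBranch(rs)); // false: second call is a safe no-op
    }
}
```

Without the emptiness check, the second call would walk to the detached SEL node and throw on an empty parent list, which is exactly the double-call hazard described in the comment.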
[jira] [Commented] (HIVE-19267) Replicate ACID/MM tables write operations.
[ https://issues.apache.org/jira/browse/HIVE-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535596#comment-16535596 ] mahesh kumar behera commented on HIVE-19267: [~sershe] The issue exists for MySQL 5.6 and below. Thanks for pointing it out. I will upload an updated patch soon. > Replicate ACID/MM tables write operations. > -- > > Key: HIVE-19267 > URL: https://issues.apache.org/jira/browse/HIVE-19267 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 3.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: ACID, DR, pull-request-available, replication > Fix For: 4.0.0 > > Attachments: HIVE-19267.01-branch-3.patch, HIVE-19267.01.patch, > HIVE-19267.02.patch, HIVE-19267.03.patch, HIVE-19267.04.patch, > HIVE-19267.05.patch, HIVE-19267.06.patch, HIVE-19267.07.patch, > HIVE-19267.08.patch, HIVE-19267.09.patch, HIVE-19267.10.patch, > HIVE-19267.11.patch, HIVE-19267.12.patch, HIVE-19267.13.patch, > HIVE-19267.14.patch, HIVE-19267.15.patch, HIVE-19267.16.patch, > HIVE-19267.17.patch, HIVE-19267.18.patch, HIVE-19267.19.patch, > HIVE-19267.20.patch, HIVE-19267.21.patch, HIVE-19267.22.patch > > > > h1. Replicate ACID write Events > * Create new EVENT_WRITE event with related message format to log the write > operations with in a txn along with data associated. > * Log this event when perform any writes (insert into, insert overwrite, > load table, delete, update, merge, truncate) on table/partition. > * If a single MERGE/UPDATE/INSERT/DELETE statement operates on multiple > partitions, then need to log one event per partition. > * DbNotificationListener should log this type of event to special metastore > table named "MTxnWriteNotificationLog". > * This table should maintain a map of txn ID against list of > tables/partitions written by given txn. 
> * The entry for a given txn should be removed by the cleaner thread that > removes the expired events from EventNotificationTable. > h1. Replicate Commit Txn operation (with writes) > Add new EVENT_COMMIT_TXN to log the metadata/data of all tables/partitions > modified within the txn. > *Source warehouse:* > * This event should read the EVENT_WRITEs from "MTxnWriteNotificationLog" > metastore table to consolidate the list of tables/partitions modified within > this txn scope. > * Based on the list of tables/partitions modified and table Write ID, need > to compute the list of delta files added by this txn. > * Repl dump should read this message and dump the metadata and delta files > list. > *Target warehouse:* > * Ensure snapshot isolation at target for on-going read txns which shouldn't > view the data replicated from committed txn. (Ensured with open and allocate > write ID events). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20100) OpTraits : Select Optraits should stop when a mismatch is detected
[ https://issues.apache.org/jira/browse/HIVE-20100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-20100: -- Attachment: HIVE-20100.3.patch > OpTraits : Select Optraits should stop when a mismatch is detected > -- > > Key: HIVE-20100 > URL: https://issues.apache.org/jira/browse/HIVE-20100 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-20100.1.patch, HIVE-20100.2.patch, > HIVE-20100.3.patch > > > The select operator's optraits logic, as stated in the comment, is: > // For bucket columns > // If all the columns match to the parent, put them in the bucket cols > // else, add empty list. > // For sort columns > // Keep the subset of all the columns as long as order is maintained. > > However, this is not happening due to a bug. The bool found is never reset, > so once a single match is found, the value remains true and allows the optraits > to be populated with a partial list of columns for the bucket cols, which is incorrect. > This may lead to the creation of an SMB join, which should not happen.
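The flag-reset bug described above follows a generic pattern, sketched below with hypothetical method names (this is an illustration of the pattern, not Hive's actual SelectOperator optraits code): a `found` flag declared outside the per-column loop is never reset, so a single matching column makes every later column look matched too.

```java
import java.util.Arrays;
import java.util.List;

public class FoundFlagDemo {

    // Buggy pattern: 'found' is declared once and never reset, so after the
    // first match the "mismatch" check can never fire again.
    static boolean allMatchBuggy(List<String> cols, List<String> parentCols) {
        boolean found = false;
        for (String c : cols) {
            for (String p : parentCols) {
                if (c.equals(p)) { found = true; break; }
            }
            if (!found) return false; // never triggers after the first match
        }
        return found;
    }

    // Fixed pattern: reset the flag for every column, so one mismatch is
    // enough to reject the whole bucket-column list.
    static boolean allMatchFixed(List<String> cols, List<String> parentCols) {
        for (String c : cols) {
            boolean found = false; // reset per column
            for (String p : parentCols) {
                if (c.equals(p)) { found = true; break; }
            }
            if (!found) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> parent = Arrays.asList("key");
        List<String> cols = Arrays.asList("key", "value"); // "value" does not match
        System.out.println(allMatchBuggy(cols, parent)); // true  (wrong: partial match accepted)
        System.out.println(allMatchFixed(cols, parent)); // false (correct: mismatch detected)
    }
}
```

With the buggy variant, a partial column match is accepted as a full match, which is how the optraits could end up advertising bucket columns that do not hold and enable an invalid SMB join.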
[jira] [Commented] (HIVE-20090) Extend creation of semijoin reduction filters to be able to discover new opportunities
[ https://issues.apache.org/jira/browse/HIVE-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535585#comment-16535585 ] Hive QA commented on HIVE-20090: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 49s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 31s{color} | {color:blue} common in master has 64 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 7s{color} | {color:blue} ql in master has 2287 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 10 new + 43 unchanged - 0 fixed = 53 total (was 43) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 23s{color} | {color:red} ql generated 1 new + 2287 unchanged - 0 fixed = 2288 total (was 2287) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 53s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Should org.apache.hadoop.hive.ql.parse.TezCompiler$RedundantSemijoinAndDppContext be a _static_ inner class? At TezCompiler.java:inner class? 
At TezCompiler.java:[lines 1150-1159] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12435/dev-support/hive-personality.sh | | git revision | master / 8494522 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12435/yetus/diff-checkstyle-ql.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12435/yetus/new-findbugs-ql.html | | modules | C: common ql itests U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12435/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Extend creation of semijoin reduction filters to be able to discover new > opportunities > -- > > Key: HIVE-20090 > URL: https://issues.apache.org/jira/browse/HIVE-20090 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20090.01.patch, HIVE-20090.02.patch > > > Assume the following plan: > {noformat} > TS[0] - RS[1] - JOIN[4] - RS[5] - JOIN[8] - FS[9] > TS[2] - RS[3] - JOIN[4] >
[jira] [Updated] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-20117: Attachment: HIVE-20117.patch > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-20117.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20102) Add a couple of additional tests for query parsing
[ https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535578#comment-16535578 ] Hive QA commented on HIVE-20102: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12930556/HIVE-20102.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 14617 tests executed *Failed tests:* {noformat} TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestAutoPurgeTables - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestReplicationScenariosAcidTables - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=240) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12434/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12434/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12434/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12930556 - PreCommit-HIVE-Build > Add a couple of additional tests for query parsing > -- > > Key: HIVE-20102 > URL: https://issues.apache.org/jira/browse/HIVE-20102 > Project: Hive > Issue Type: Improvement > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20102.01.patch, HIVE-20102.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-20117: Attachment: (was: HIVE-20117.patch) > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-20117: Attachment: HIVE-20117.patch > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-20117.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535577#comment-16535577 ] Sergey Shelukhin commented on HIVE-20117: - [~ekoifman] can you take a look? > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-20117.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-20117: Status: Patch Available (was: Open) > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-20117.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20117) schema changes for txn stats
[ https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-20117: --- > schema changes for txn stats > > > Key: HIVE-20117 > URL: https://issues.apache.org/jira/browse/HIVE-20117 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19267) Replicate ACID/MM tables write operations.
[ https://issues.apache.org/jira/browse/HIVE-19267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535562#comment-16535562 ] Sergey Shelukhin commented on HIVE-19267: - This breaks mysql script:
{noformat}
Error: (conn=4) Specified key was too long; max key length is 767 bytes (state=42000,code=1071)
Aborting command set because "force" is false and command failed:
"CREATE TABLE TXN_WRITE_NOTIFICATION_LOG (
  WNL_ID bigint NOT NULL,
  WNL_TXNID bigint NOT NULL,
  WNL_WRITEID bigint NOT NULL,
  WNL_DATABASE varchar(128) NOT NULL,
  WNL_TABLE varchar(128) NOT NULL,
  WNL_PARTITION varchar(1024) NOT NULL,
  WNL_TABLE_OBJ longtext NOT NULL,
  WNL_PARTITION_OBJ longtext,
  WNL_FILES longtext,
  WNL_EVENT_TIME INT(11) NOT NULL,
  PRIMARY KEY (WNL_TXNID, WNL_DATABASE, WNL_TABLE, WNL_PARTITION)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;"
{noformat}
Also it's committed but not resolved. cc [~thejas] [~hagleitn] [~sankarh] this needs to be fixed or reverted > Replicate ACID/MM tables write operations. > -- > > Key: HIVE-19267 > URL: https://issues.apache.org/jira/browse/HIVE-19267 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 3.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: ACID, DR, pull-request-available, replication > Fix For: 4.0.0 > > Attachments: HIVE-19267.01-branch-3.patch, HIVE-19267.01.patch, > HIVE-19267.02.patch, HIVE-19267.03.patch, HIVE-19267.04.patch, > HIVE-19267.05.patch, HIVE-19267.06.patch, HIVE-19267.07.patch, > HIVE-19267.08.patch, HIVE-19267.09.patch, HIVE-19267.10.patch, > HIVE-19267.11.patch, HIVE-19267.12.patch, HIVE-19267.13.patch, > HIVE-19267.14.patch, HIVE-19267.15.patch, HIVE-19267.16.patch, > HIVE-19267.17.patch, HIVE-19267.18.patch, HIVE-19267.19.patch, > HIVE-19267.20.patch, HIVE-19267.21.patch, HIVE-19267.22.patch > > > > h1. 
Replicate ACID write Events > * Create new EVENT_WRITE event with related message format to log the write > operations with in a txn along with data associated. > * Log this event when perform any writes (insert into, insert overwrite, > load table, delete, update, merge, truncate) on table/partition. > * If a single MERGE/UPDATE/INSERT/DELETE statement operates on multiple > partitions, then need to log one event per partition. > * DbNotificationListener should log this type of event to special metastore > table named "MTxnWriteNotificationLog". > * This table should maintain a map of txn ID against list of > tables/partitions written by given txn. > * The entry for a given txn should be removed by the cleaner thread that > removes the expired events from EventNotificationTable. > h1. Replicate Commit Txn operation (with writes) > Add new EVENT_COMMIT_TXN to log the metadata/data of all tables/partitions > modified within the txn. > *Source warehouse:* > * This event should read the EVENT_WRITEs from "MTxnWriteNotificationLog" > metastore table to consolidate the list of tables/partitions modified within > this txn scope. > * Based on the list of tables/partitions modified and table Write ID, need > to compute the list of delta files added by this txn. > * Repl dump should read this message and dump the metadata and delta files > list. > *Target warehouse:* > * Ensure snapshot isolation at target for on-going read txns which shouldn't > view the data replicated from committed txn. (Ensured with open and allocate > write ID events). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
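For context on the error Sergey quotes: the primary key includes WNL_PARTITION varchar(1024), and with latin1 (one byte per character) that column alone already exceeds the 767-byte index-key limit that InnoDB enforces by default on MySQL 5.6 and earlier. A back-of-the-envelope check (assumptions: latin1 is 1 byte/char, bigint is 8 bytes, varchar length-prefix bytes ignored; the fix in the patch presumably shrinks the key below the limit):

```java
// Rough arithmetic behind "Specified key was too long; max key length is 767 bytes".
public class IndexKeyWidth {
    static final int INNODB_DEFAULT_KEY_LIMIT = 767; // MySQL <= 5.6 default

    // Byte width of the composite PRIMARY KEY, under latin1 (1 byte per char).
    static int primaryKeyBytes() {
        int wnlTxnId = 8;        // WNL_TXNID bigint
        int wnlDatabase = 128;   // WNL_DATABASE varchar(128)
        int wnlTable = 128;      // WNL_TABLE varchar(128)
        int wnlPartition = 1024; // WNL_PARTITION varchar(1024)
        return wnlTxnId + wnlDatabase + wnlTable + wnlPartition;
    }

    public static void main(String[] args) {
        System.out.println(primaryKeyBytes());                            // 1288
        System.out.println(primaryKeyBytes() > INNODB_DEFAULT_KEY_LIMIT); // true
        // WNL_PARTITION alone (1024 bytes) exceeds the limit, so shrinking
        // that column or dropping it from the key is the obvious direction.
        System.out.println(1024 > INNODB_DEFAULT_KEY_LIMIT);              // true
    }
}
```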
[jira] [Commented] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535557#comment-16535557 ] Ashutosh Chauhan commented on HIVE-20111: - +1 > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20111.01.patch > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20112) Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535556#comment-16535556 ] Ashutosh Chauhan commented on HIVE-20112: - +1 > Accumulo-Hive (managed) table creation fails with strict managed table > checks: Table is marked as a managed table but is not transactional > -- > > Key: HIVE-20112 > URL: https://issues.apache.org/jira/browse/HIVE-20112 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20112.patch > > > Similar to HIVE-20085 and HIVE-20111. Accumulo-Hive (managed) table creation > fails with strict managed table checks: Table is marked as a managed table > but is not transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20102) Add a couple of additional tests for query parsing
[ https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535550#comment-16535550 ] Hive QA commented on HIVE-20102: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 14s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 13s{color} | {color:blue} ql in master has 2287 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 35s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12434/dev-support/hive-personality.sh | | git revision | master / 8494522 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12434/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Add a couple of additional tests for query parsing > -- > > Key: HIVE-20102 > URL: https://issues.apache.org/jira/browse/HIVE-20102 > Project: Hive > Issue Type: Improvement > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20102.01.patch, HIVE-20102.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20116) TezTask is using parent logger
[ https://issues.apache.org/jira/browse/HIVE-20116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535535#comment-16535535 ] Sergey Shelukhin commented on HIVE-20116: - +1 > TezTask is using parent logger > -- > > Key: HIVE-20116 > URL: https://issues.apache.org/jira/browse/HIVE-20116 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 4.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Major > Attachments: HIVE-20116.1.patch > > > TezTask is using parent's logger (Task). It should instead use its own class > name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20101) BloomKFilter: Avoid using the local byte[] arrays entirely
[ https://issues.apache.org/jira/browse/HIVE-20101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535532#comment-16535532 ] Hive QA commented on HIVE-20101: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12930475/HIVE-20101.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 14616 tests executed *Failed tests:* {noformat} TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestAutoPurgeTables - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestReplicationScenariosAcidTables - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=240) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12433/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12433/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12433/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12930475 - PreCommit-HIVE-Build > BloomKFilter: Avoid using the local byte[] arrays entirely > -- > > Key: HIVE-20101 > URL: https://issues.apache.org/jira/browse/HIVE-20101 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-20101.1.patch > > > HIVE-18866 introduced a fast-path for integer -> murmur hash, but the change > hasn't been applied to BloomKFilter for integers. > {code} > public class BloomKFilter { > private final byte[] BYTE_ARRAY_4 = new byte[4]; > private final byte[] BYTE_ARRAY_8 = new byte[8]; > {code} > Remove these objects and use the fast-path. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
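The fast-path the issue above refers to avoids serializing an integer into the shared byte[8] before hashing. As a sketch of the general idea (assumptions: this uses the standard Murmur3 fmix64 finalizer constants and the Guava-style two-half probe derivation; it is not Hive's exact BloomKFilter code):

```java
// Hash a long directly instead of copying it into a byte[8] buffer first.
public class LongHashDemo {
    // Murmur3 64-bit finalizer mix (standard fmix64 constants).
    static long fmix64(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    public static void main(String[] args) {
        long key = 42L;
        long h = fmix64(key); // no byte[] allocation or copy on this path

        // Derive k probe positions Guava/Bloom-filter style: split the 64-bit
        // hash into two 32-bit halves and combine as h1 + i * h2.
        int h1 = (int) h, h2 = (int) (h >>> 32);
        int numBits = 1 << 20, k = 4;
        for (int i = 1; i <= k; i++) {
            int combined = h1 + (i * h2);
            if (combined < 0) {
                combined = ~combined; // flip instead of abs to avoid Integer.MIN_VALUE
            }
            System.out.println(combined % numBits);
        }
    }
}
```

Since the per-call byte arrays also make the filter stateful, removing them is a thread-safety win as well as an allocation win.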
[jira] [Updated] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20113: --- Status: Patch Available (was: Open) > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch, HIVE-20113.2.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20113: --- Status: Open (was: Patch Available) > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch, HIVE-20113.2.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20113: --- Attachment: HIVE-20113.2.patch > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch, HIVE-20113.2.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-12316) Improved integration test for Hive
[ https://issues.apache.org/jira/browse/HIVE-12316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535522#comment-16535522 ] Colin Williams commented on HIVE-12316: --- I was looking at [https://cwiki.apache.org/confluence/display/Hive/Unit+Testing+Hive+SQL] and stumbled upon this. But it looks like Hive 3 has been released. Is this a dead effort? I'm left wondering what's the best way to unit test hive, as the mentioned frameworks in the wiki don't seem to support recent versions of Hive. > Improved integration test for Hive > -- > > Key: HIVE-12316 > URL: https://issues.apache.org/jira/browse/HIVE-12316 > Project: Hive > Issue Type: New Feature > Components: Testing Infrastructure >Affects Versions: 2.0.0 >Reporter: Alan Gates >Assignee: Alan Gates >Priority: Major > Attachments: HIVE-12316.2.patch, HIVE-12316.5.patch, HIVE-12316.patch > > > In working with Hive testing I have found there are several issues that are > causing problems for developers, testers, and users: > * Because Hive has many tunable knobs (file format, security, etc.) we end up > with tests that cover the same functionality with different permutations of > these features. > * The Hive integration tests (ie qfiles) cannot be run on a cluster. This > means we cannot run any of those tests at scale. The HBase community by > contrast uses the same test suite locally and on a cluster, and has found > that this helps them greatly in testing. > * Golden files are a grievous evil. Test writers are forced to eyeball > results the first time they run a test and decide whether they look > reasonable, which is error prone and makes testing at scale impossible. And > changes to one part of Hive often end up changing the plan (and the output of > explain) thus breaking many tests that are not related. This is particularly > an issue for people working on the optimizer. 
> * The lack of ability to run on a cluster means that when people test Hive at > scale, they are forced to develop custom frameworks which can't then benefit > the community. > * There is no easy mechanism to bring user queries into the test suite. > I propose we build a new testing capability with the following requirements: > * One test should be able to run all reasonable permutations (mr/tez/spark, > orc/parquet/text/rcfile, secure/non-secure etc.) This doesn't mean it would > run every permutation every time, but that the tester could choose which > permutation to run. > * The same tests should run locally and on a cluster. The tests should > support scaling of input data from Ks to Ts. > * Expected results should be auto-generated whenever possible, and this > should work with the scaling of inputs. The dev should be able to provide > expected results or custom expected result generation in cases where > auto-generation doesn't make sense. > * Access to the query plan should be available as an API in the tests so that > golden files of explain output are not required. > * This should run in maven, junit, and java so that developers do not need to > manage yet another framework. > * It should be possible to simulate user data (based on schema and > statistics) and quickly incorporate user queries so that tests from user > scenarios can be quickly incorporated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20116) TezTask is using parent logger
[ https://issues.apache.org/jira/browse/HIVE-20116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535516#comment-16535516 ] Prasanth Jayachandran commented on HIVE-20116: -- [~sershe] can you please take a look? small patch > TezTask is using parent logger > -- > > Key: HIVE-20116 > URL: https://issues.apache.org/jira/browse/HIVE-20116 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 4.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Major > Attachments: HIVE-20116.1.patch > > > TezTask is using parent's logger (Task). It should instead use its own class > name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
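The bug pattern above — a subclass reusing its parent's logger so messages appear under the parent's name — can be shown with plain `java.util.logging` (Hive itself uses slf4j; this is a self-contained illustration, not the actual patch):

```java
import java.util.logging.Logger;

// Parent class owns a logger named after itself.
class Task {
    protected static final Logger LOG = Logger.getLogger(Task.class.getName());

    // Any logging done through the inherited field is attributed to "Task".
    String inheritedLoggerName() {
        return LOG.getName();
    }
}

// The fix: the subclass declares its own logger named after its own class,
// so its messages can be filtered and located by "TezTask".
class TezTask extends Task {
    private static final Logger LOG = Logger.getLogger(TezTask.class.getName());

    String ownLoggerName() {
        return LOG.getName();
    }
}
```

Note that methods defined in the parent still resolve to the parent's logger; only code in the subclass sees the shadowing field.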
[jira] [Assigned] (HIVE-20116) TezTask is using parent logger
[ https://issues.apache.org/jira/browse/HIVE-20116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-20116: > TezTask is using parent logger > -- > > Key: HIVE-20116 > URL: https://issues.apache.org/jira/browse/HIVE-20116 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 4.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Major > Attachments: HIVE-20116.1.patch > > > TezTask is using parent's logger (Task). It should instead use its own class > name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20116) TezTask is using parent logger
[ https://issues.apache.org/jira/browse/HIVE-20116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-20116: - Attachment: HIVE-20116.1.patch > TezTask is using parent logger > -- > > Key: HIVE-20116 > URL: https://issues.apache.org/jira/browse/HIVE-20116 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 4.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Major > Attachments: HIVE-20116.1.patch > > > TezTask is using parent's logger (Task). It should instead use its own class > name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20116) TezTask is using parent logger
[ https://issues.apache.org/jira/browse/HIVE-20116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-20116: - Status: Patch Available (was: Open) > TezTask is using parent logger > -- > > Key: HIVE-20116 > URL: https://issues.apache.org/jira/browse/HIVE-20116 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 4.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Major > Attachments: HIVE-20116.1.patch > > > TezTask is using parent's logger (Task). It should instead use its own class > name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20101) BloomKFilter: Avoid using the local byte[] arrays entirely
[ https://issues.apache.org/jira/browse/HIVE-20101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535514#comment-16535514 ] Hive QA commented on HIVE-20101: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 45s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 10s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 22s{color} | {color:blue} storage-api in master has 48 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} storage-api: The patch generated 0 new + 7 unchanged - 2 fixed = 7 total (was 9) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 11m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12433/dev-support/hive-personality.sh | | git revision | master / 8494522 | | Default Java | 1.8.0_111 | | findbugs | v3.0.1 | | modules | C: storage-api U: storage-api | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12433/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > BloomKFilter: Avoid using the local byte[] arrays entirely > -- > > Key: HIVE-20101 > URL: https://issues.apache.org/jira/browse/HIVE-20101 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-20101.1.patch > > > HIVE-18866 introduced a fast-path for integer -> murmur hash, but the change > hasn't been applied to BloomKFilter for integers. > {code} > public class BloomKFilter { > private final byte[] BYTE_ARRAY_4 = new byte[4]; > private final byte[] BYTE_ARRAY_8 = new byte[8]; > {code} > Remove these objects and use the fast-path. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
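The fast path referenced above amounts to mixing the integer bits directly instead of serializing into a scratch `byte[8]` and hashing the bytes. The sketch below contrasts the two shapes using Murmur3's public fmix64 finalizer; it is illustrative only — Hive's BloomKFilter must keep its hash output compatible with filters already written, so the real change reuses the HIVE-18866 fast path rather than this standalone mix.

```java
// Contrast of the slow path (scratch byte[] allocation) vs. hashing the
// long's bits directly. fmix64 is Murmur3's 64-bit finalization mix.
public class LongHashSketch {
    // Fast path: mix the long directly, no allocation.
    static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Slow path being avoided: copy the long into a temporary little-endian
    // array first, then hand the bytes to a byte-oriented hash function.
    static byte[] toBytes(long v) {
        byte[] b = new byte[8];
        for (int i = 0; i < 8; i++) {
            b[i] = (byte) (v >>> (8 * i));
        }
        return b;
    }
}
```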
[jira] [Assigned] (HIVE-20115) acid_no_buckets.q fails
[ https://issues.apache.org/jira/browse/HIVE-20115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-20115: - Assignee: Steve Yeom > acid_no_buckets.q fails > --- > > Key: HIVE-20115 > URL: https://issues.apache.org/jira/browse/HIVE-20115 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-20113: -- Assignee: Gopal V (was: Jesus Camacho Rodriguez) > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20113: --- Status: Patch Available (was: In Progress) > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-20113 started by Jesus Camacho Rodriguez. -- > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-20113: -- Assignee: Jesus Camacho Rodriguez (was: Gopal V) > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20100) OpTraits : Select Optraits should stop when a mismatch is detected
[ https://issues.apache.org/jira/browse/HIVE-20100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535500#comment-16535500 ] Hive QA commented on HIVE-20100: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12930609/HIVE-20100.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 14629 tests executed *Failed tests:* {noformat} TestAutoPurgeTables - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=240) TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=240) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin] (batchId=173) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_join] (batchId=165) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12432/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12432/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12432/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12930609 - PreCommit-HIVE-Build > OpTraits : Select Optraits should stop when a mismatch is detected > -- > > Key: HIVE-20100 > URL: https://issues.apache.org/jira/browse/HIVE-20100 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-20100.1.patch, HIVE-20100.2.patch > > > The select operator's optraits logic as stated in the comment is, > // For bucket columns > // If all the columns match to the parent, put them in the bucket cols > // else, add empty list. > // For sort columns > // Keep the subset of all the columns as long as order is maintained. > > However, this is not happening due to a bug. The bool found is never reset, > so if a single match is found, the value remains true and allows the optraits > get populated with partial list of columns for bucket col which is incorrect. > This may lead to creation of SMB join which should not happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
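The fix described above — `found` must be reset for every column, so a single earlier match cannot carry over and produce a partial bucket-column list — can be sketched as follows (illustrative names, not Hive's SelectOperator code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BucketColsCheck {
    /**
     * Returns the bucket columns only if ALL columns match the parent;
     * otherwise returns the empty list, per the comment quoted in the issue.
     */
    static List<String> bucketCols(List<String> cols, List<String> parentCols) {
        List<String> matched = new ArrayList<>();
        for (String col : cols) {
            boolean found = false; // reset per column -- the step the bug omitted
            for (String parent : parentCols) {
                if (parent.equals(col)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                // Any mismatch invalidates the whole list; a partial list
                // could incorrectly enable an SMB join downstream.
                return Collections.emptyList();
            }
            matched.add(col);
        }
        return matched;
    }
}
```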
[jira] [Commented] (HIVE-20047) consider removing txnID argument for txn stats methods
[ https://issues.apache.org/jira/browse/HIVE-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535481#comment-16535481 ] Eugene Koifman commented on HIVE-20047: --- yes, your own id should be valid within your own txn context. > consider removing txnID argument for txn stats methods > -- > > Key: HIVE-20047 > URL: https://issues.apache.org/jira/browse/HIVE-20047 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Priority: Major > > Followup from HIVE-19975. > W.r.t. write IDs and txn IDs, stats validity check currently verifies one of > two things - that stats write ID is valid for query write ID list, or that > stats txn ID (derived from write ID) is the same as the query txn ID. > I'm not sure the latter check is needed; removing it would allow us to make a > bunch of APIs a little bit simpler. > [~ekoifman] do you have any feedback? Can any stats reader (e.g. compile) > observe stats written by the same txn; but in such manner that it doesn't > have the write ID of the same-txn stats writer, in its valid write ID list? > I'm assuming it's not possible, e.g. in multi statement txn each query would > have the previous same-txn writer for the same table in its valid write ID > list? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20047) consider removing txnID argument for txn stats methods
[ https://issues.apache.org/jira/browse/HIVE-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535477#comment-16535477 ] Sergey Shelukhin commented on HIVE-20047: - Basically your own write Id would always return true from validWriteIdList.isValidWriteID()? [~steveyeom2017] do you remember why the code in the stats check had verification for query txn ID == stats txn ID? I preserved it when changing the logic there, but according to the above it should not be needed. > consider removing txnID argument for txn stats methods > -- > > Key: HIVE-20047 > URL: https://issues.apache.org/jira/browse/HIVE-20047 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Priority: Major > > Followup from HIVE-19975. > W.r.t. write IDs and txn IDs, stats validity check currently verifies one of > two things - that stats write ID is valid for query write ID list, or that > stats txn ID (derived from write ID) is the same as the query txn ID. > I'm not sure the latter check is needed; removing it would allow us to make a > bunch of APIs a little bit simpler. > [~ekoifman] do you have any feedback? Can any stats reader (e.g. compile) > observe stats written by the same txn; but in such manner that it doesn't > have the write ID of the same-txn stats writer, in its valid write ID list? > I'm assuming it's not possible, e.g. in multi statement txn each query would > have the previous same-txn writer for the same table in its valid write ID > list? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
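The two-pronged validity check under discussion can be paraphrased as the disjunction below (hypothetical signature, not the metastore API). The thread's question is whether the second disjunct is redundant: if a transaction's own write ID is always in its valid write ID list, the write-ID check alone subsumes the txn-ID comparison.

```java
import java.util.Set;

public class StatsValidityCheck {
    /**
     * Stats are usable if their write ID appears in the query's valid write ID
     * list, OR the stats were written by the query's own transaction. The
     * second clause is the one HIVE-20047 proposes to drop.
     */
    static boolean statsValid(long statsWriteId, Set<Long> validWriteIds,
                              long statsTxnId, long queryTxnId) {
        return validWriteIds.contains(statsWriteId) || statsTxnId == queryTxnId;
    }
}
```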
[jira] [Reopened] (HIVE-20047) consider removing txnID argument for txn stats methods
[ https://issues.apache.org/jira/browse/HIVE-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HIVE-20047: - > consider removing txnID argument for txn stats methods > -- > > Key: HIVE-20047 > URL: https://issues.apache.org/jira/browse/HIVE-20047 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Priority: Major > > Followup from HIVE-19975. > W.r.t. write IDs and txn IDs, stats validity check currently verifies one of > two things - that stats write ID is valid for query write ID list, or that > stats txn ID (derived from write ID) is the same as the query txn ID. > I'm not sure the latter check is needed; removing it would allow us to make a > bunch of APIs a little bit simpler. > [~ekoifman] do you have any feedback? Can any stats reader (e.g. compile) > observe stats written by the same txn; but in such manner that it doesn't > have the write ID of the same-txn stats writer, in its valid write ID list? > I'm assuming it's not possible, e.g. in multi statement txn each query would > have the previous same-txn writer for the same table in its valid write ID > list? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20112) Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20112: --- Status: Patch Available (was: In Progress) > Accumulo-Hive (managed) table creation fails with strict managed table > checks: Table is marked as a managed table but is not transactional > -- > > Key: HIVE-20112 > URL: https://issues.apache.org/jira/browse/HIVE-20112 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20112.patch > > > Similar to HIVE-20085 and HIVE-20111. Accumulo-Hive (managed) table creation > fails with strict managed table checks: Table is marked as a managed table > but is not transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20112) Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20112: --- Attachment: HIVE-20112.patch > Accumulo-Hive (managed) table creation fails with strict managed table > checks: Table is marked as a managed table but is not transactional > -- > > Key: HIVE-20112 > URL: https://issues.apache.org/jira/browse/HIVE-20112 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20112.patch > > > Similar to HIVE-20085 and HIVE-20111. Accumulo-Hive (managed) table creation > fails with strict managed table checks: Table is marked as a managed table > but is not transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20111: --- Attachment: HIVE-20111.01.patch > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20111.01.patch > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20111: --- Attachment: (was: HIVE-20111.patch) > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20111.01.patch > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20100) OpTraits : Select Optraits should stop when a mismatch is detected
[ https://issues.apache.org/jira/browse/HIVE-20100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535467#comment-16535467 ] Hive QA commented on HIVE-20100: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 13s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 10s{color} | {color:blue} ql in master has 2287 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 18s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12432/dev-support/hive-personality.sh | | git revision | master / eae5225 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12432/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > OpTraits : Select Optraits should stop when a mismatch is detected > -- > > Key: HIVE-20100 > URL: https://issues.apache.org/jira/browse/HIVE-20100 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-20100.1.patch, HIVE-20100.2.patch > > > The select operator's optraits logic as stated in the comment is, > // For bucket columns > // If all the columns match to the parent, put them in the bucket cols > // else, add empty list. > // For sort columns > // Keep the subset of all the columns as long as order is maintained. > > However, this is not happening due to a bug. 
The bool found is never reset, > so once a single match is found, the value remains true and allows the optraits > to get populated with a partial list of columns for the bucket cols, which is incorrect. > This may lead to the creation of an SMB join, which should not happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
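The never-reset flag pattern described in HIVE-20100 can be reduced to a minimal sketch. All names below are hypothetical illustrations, not the actual Hive OpTraits rule code:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the bug class described above: a `found` flag hoisted out of the
// per-column loop keeps its value from an earlier match, so a partial match
// is wrongly treated as a full match.
public class OpTraitsFlagSketch {

  // Buggy variant: `found` is declared once and never reset per column.
  static boolean allColumnsMatchBuggy(List<String> cols, List<String> parentCols) {
    boolean found = false;
    for (String col : cols) {
      for (String parent : parentCols) {
        if (col.equals(parent)) {
          found = true; // sticks at true after the first matching column
        }
      }
      // missing: stop on a mismatch and reset `found` for the next column
    }
    return found;
  }

  // Fixed variant: reset the flag for every column and stop on a mismatch,
  // so a partial match never populates the bucket cols.
  static boolean allColumnsMatchFixed(List<String> cols, List<String> parentCols) {
    for (String col : cols) {
      boolean found = false;
      for (String parent : parentCols) {
        if (col.equals(parent)) {
          found = true;
          break;
        }
      }
      if (!found) {
        return false; // mismatch detected: report no match at all
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> parent = Arrays.asList("a", "b");
    List<String> partial = Arrays.asList("a", "x"); // "x" has no parent match
    System.out.println(allColumnsMatchBuggy(partial, parent)); // true (wrong)
    System.out.println(allColumnsMatchFixed(partial, parent)); // false (correct)
  }
}
```

The buggy variant is what lets a partial column list flow into the bucket cols and later justify an SMB join that should never be formed.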
[jira] [Updated] (HIVE-20098) Statistics: NPE when getting Date column partition statistics
[ https://issues.apache.org/jira/browse/HIVE-20098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-20098: Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks, Andy! > Statistics: NPE when getting Date column partition statistics > - > > Key: HIVE-20098 > URL: https://issues.apache.org/jira/browse/HIVE-20098 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore >Affects Versions: 1.2.1, 4.0.0 > Environment: Tested on versions `1.2.1` and the latest 4.0.0-SNAPSHOT >Reporter: Andy Rosa >Assignee: Andy Rosa >Priority: Major > Labels: Branch3Candidate > Fix For: 4.0.0 > > Attachments: > 0001-Fix-NPE-when-getting-statistics-for-date-column.patch, HIVE-20098.1.patch > > > The issue reproduces only for a date column for a partitioned table. It > reproduces only if the date column has all the values set to null, and if the > partition is not empty. > Here is a quick reproducer: > > > {code:java} > CREATE TABLE dummy_table ( > c_date DATE, > c_bigint BIGINT > ) > PARTITIONED BY (ds STRING); > INSERT OVERWRITE TABLE dummy_table PARTITION (ds='2018-01-01') SELECT > CAST(null AS DATE), CAST(null AS BIGINT) FROM ; > ANALYZE TABLE dummy_table COMPUTE STATISTICS FOR COLUMNS; > DESCRIBE FORMATTED dummy_table.c_bigint PARTITION (ds='2018-01-01'); > DESCRIBE FORMATTED dummy_table.c_date PARTITION (ds='2018-01-01'); > {code} > > > The first `DESCRIBE FORMATTED` statement succeeds, when the second fails with > an `NPE` > > It happens because the null check is missing when converting Object from the > ObjectStore to the Thrift object. The null check is missing only in the date > statistics conversion for the partitioned table. 
> Missing: > [https://github.com/apache/hive/blob/master/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java#L469] > Present: > https://github.com/apache/hive/blob/master/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java#L298 > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
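The failure mode in HIVE-20098 — auto-unboxing a null boxed value when every row of the partition's date column is NULL — can be sketched as follows. The class and method names are hypothetical stand-ins for the real StatObjectConverter paths:

```java
// Sketch: the ObjectStore hands back boxed values that are null when the
// column contains only NULLs, while the Thrift-side setter takes a primitive,
// so an unguarded conversion auto-unboxes null and throws NullPointerException.
public class DateStatsNullSketch {

  static final class ThriftDateStats {
    private long highValue;
    private boolean highValueSet;

    void setHighValue(long v) { // primitive parameter forces unboxing
      highValue = v;
      highValueSet = true;
    }

    boolean isHighValueSet() {
      return highValueSet;
    }
  }

  // Buggy conversion: unconditional unboxing, mirroring the partitioned-table
  // date path that lacks the null check.
  static ThriftDateStats convertBuggy(Long storedHigh) {
    ThriftDateStats t = new ThriftDateStats();
    t.setHighValue(storedHigh); // NPE when storedHigh == null
    return t;
  }

  // Fixed conversion: guard the unboxing, as the non-partitioned path already does.
  static ThriftDateStats convertFixed(Long storedHigh) {
    ThriftDateStats t = new ThriftDateStats();
    if (storedHigh != null) {
      t.setHighValue(storedHigh);
    }
    return t;
  }

  public static void main(String[] args) {
    try {
      convertBuggy(null); // the DESCRIBE FORMATTED failure on the date column
    } catch (NullPointerException e) {
      System.out.println("NPE");
    }
    System.out.println(convertFixed(null).isHighValueSet()); // false: stat left unset
  }
}
```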
[jira] [Commented] (HIVE-20047) consider removing txnID argument for txn stats methods
[ https://issues.apache.org/jira/browse/HIVE-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535458#comment-16535458 ] Eugene Koifman commented on HIVE-20047: --- your own id is specifically removed from the 'exception' list precisely so that you can read your own write This was definitely the case for txnid - before writeids were added. I'd expect that to be true with write ids. > consider removing txnID argument for txn stats methods > -- > > Key: HIVE-20047 > URL: https://issues.apache.org/jira/browse/HIVE-20047 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Priority: Major > > Followup from HIVE-19975. > W.r.t. write IDs and txn IDs, stats validity check currently verifies one of > two things - that stats write ID is valid for query write ID list, or that > stats txn ID (derived from write ID) is the same as the query txn ID. > I'm not sure the latter check is needed; removing it would allow us to make a > bunch of APIs a little bit simpler. > [~ekoifman] do you have any feedback? Can any stats reader (e.g. compile) > observe stats written by the same txn; but in such manner that it doesn't > have the write ID of the same-txn stats writer, in its valid write ID list? > I'm assuming it's not possible, e.g. in multi statement txn each query would > have the previous same-txn writer for the same table in its valid write ID > list? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
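The read-your-own-write rule described in the comment above can be sketched as a toy snapshot check. This is a simplified illustration with hypothetical names, not Hive's actual ValidTxnList/ValidWriteIdList logic:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch: a snapshot is (high watermark, exception list of open/aborted txns).
// The reader's own txn id is removed from the exception list precisely so
// that the reader can see its own uncommitted writes.
public class SnapshotSketch {
  final long highWatermark;
  final Set<Long> exceptions; // open or aborted txns, minus the reader's own id

  SnapshotSketch(long highWatermark, long myTxnId, Set<Long> openOrAborted) {
    this.highWatermark = highWatermark;
    this.exceptions = new HashSet<>(openOrAborted);
    this.exceptions.remove(myTxnId); // read-your-own-write exemption
  }

  // A write is visible if it started before the snapshot and is not in the
  // exception list (own writes pass because they were removed above).
  boolean isVisible(long writerTxnId) {
    return writerTxnId <= highWatermark && !exceptions.contains(writerTxnId);
  }

  public static void main(String[] args) {
    Set<Long> open = new HashSet<>(Arrays.asList(5L, 7L, 9L));
    SnapshotSketch snap = new SnapshotSketch(10, 7, open); // reader is txn 7
    System.out.println(snap.isVisible(7)); // true: own uncommitted write
    System.out.println(snap.isVisible(5)); // false: another open txn
    System.out.println(snap.isVisible(3)); // true: committed before the snapshot
  }
}
```

The open question in the thread is whether the same exemption carries over cleanly once the unit of visibility is a per-table write ID rather than the txn id itself.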
[jira] [Resolved] (HIVE-20047) consider removing txnID argument for txn stats methods
[ https://issues.apache.org/jira/browse/HIVE-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-20047. - Resolution: Invalid > consider removing txnID argument for txn stats methods > -- > > Key: HIVE-20047 > URL: https://issues.apache.org/jira/browse/HIVE-20047 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Priority: Major > > Followup from HIVE-19975. > W.r.t. write IDs and txn IDs, stats validity check currently verifies one of > two things - that stats write ID is valid for query write ID list, or that > stats txn ID (derived from write ID) is the same as the query txn ID. > I'm not sure the latter check is needed; removing it would allow us to make a > bunch of APIs a little bit simpler. > [~ekoifman] do you have any feedback? Can any stats reader (e.g. compile) > observe stats written by the same txn; but in such manner that it doesn't > have the write ID of the same-txn stats writer, in its valid write ID list? > I'm assuming it's not possible, e.g. in multi statement txn each query would > have the previous same-txn writer for the same table in its valid write ID > list? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20047) consider removing txnID argument for txn stats methods
[ https://issues.apache.org/jira/browse/HIVE-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535455#comment-16535455 ] Sergey Shelukhin commented on HIVE-20047: - Hmm. So in that case I guess we have to keep it... if an insert inserts some stats, and then select from the same table happens in the same txn, it won't have its own write ID in valid write ID list because txn is not committed yet. > consider removing txnID argument for txn stats methods > -- > > Key: HIVE-20047 > URL: https://issues.apache.org/jira/browse/HIVE-20047 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Priority: Major > > Followup from HIVE-19975. > W.r.t. write IDs and txn IDs, stats validity check currently verifies one of > two things - that stats write ID is valid for query write ID list, or that > stats txn ID (derived from write ID) is the same as the query txn ID. > I'm not sure the latter check is needed; removing it would allow us to make a > bunch of APIs a little bit simpler. > [~ekoifman] do you have any feedback? Can any stats reader (e.g. compile) > observe stats written by the same txn; but in such manner that it doesn't > have the write ID of the same-txn stats writer, in its valid write ID list? > I'm assuming it's not possible, e.g. in multi statement txn each query would > have the previous same-txn writer for the same table in its valid write ID > list? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20047) consider removing txnID argument for txn stats methods
[ https://issues.apache.org/jira/browse/HIVE-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535453#comment-16535453 ] Eugene Koifman commented on HIVE-20047: --- Currently the logic is that you get 1 write id per (txnid, table). Multiple data writes in a txn (to same partition), use a 'statementID' which is a suffix of the delta dir: delta_txnid_txnid_stmtId > consider removing txnID argument for txn stats methods > -- > > Key: HIVE-20047 > URL: https://issues.apache.org/jira/browse/HIVE-20047 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Priority: Major > > Followup from HIVE-19975. > W.r.t. write IDs and txn IDs, stats validity check currently verifies one of > two things - that stats write ID is valid for query write ID list, or that > stats txn ID (derived from write ID) is the same as the query txn ID. > I'm not sure the latter check is needed; removing it would allow us to make a > bunch of APIs a little bit simpler. > [~ekoifman] do you have any feedback? Can any stats reader (e.g. compile) > observe stats written by the same txn; but in such manner that it doesn't > have the write ID of the same-txn stats writer, in its valid write ID list? > I'm assuming it's not possible, e.g. in multi statement txn each query would > have the previous same-txn writer for the same table in its valid write ID > list? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535450#comment-16535450 ] Gunther Hagleitner commented on HIVE-20113: --- +1 > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20040) JDBC: HTTP listen queue is 50 and SYNs are lost
[ https://issues.apache.org/jira/browse/HIVE-20040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20040: --- Labels: Branch3Candidate (was: ) > JDBC: HTTP listen queue is 50 and SYNs are lost > --- > > Key: HIVE-20040 > URL: https://issues.apache.org/jira/browse/HIVE-20040 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 3.0.0, 3.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20040.1.patch > > > When testing with 5000 concurrent users, the JDBC HTTP port ends up > overflowing on SYNs when the HS2 gc pauses. > This is because each getQueryProgress request is an independent HTTP request, > so unlike the BINARY mode, there are more connections being established & > closed in HTTP mode. > {code} > LISTEN 0 50 *:10004*:* > {code} > This turns into connection errors when enabling > {{net.ipv4.tcp_abort_on_overflow=1}}, but the better approach is to enqueue > the connections until the HS2 is done with its GC pause. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
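The remedy described in HIVE-20040 — letting connection attempts queue during a GC pause instead of dropping SYNs — is the standard listen-backlog knob. As a minimal illustration in plain Java (HS2's HTTP mode would set the equivalent accept-queue option on its servlet container; this is not the actual HS2 code):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BacklogSketch {
  public static void main(String[] args) throws IOException {
    // backlog = 50 mirrors the `ss` output in the report: the kernel queues
    // at most ~50 pending connections before new SYNs are dropped.
    ServerSocket shallow = new ServerSocket(0, 50);

    // A deeper backlog lets a burst of fresh HTTP connections wait out a GC
    // pause instead of failing at connect() time. (4096 is an illustrative
    // value, not a recommendation from the ticket.)
    ServerSocket deep = new ServerSocket();
    deep.bind(new InetSocketAddress(0), 4096);

    System.out.println(shallow.getLocalPort() > 0 && deep.getLocalPort() > 0);
    shallow.close();
    deep.close();
  }
}
```

The backlog matters more in HTTP mode because each getQueryProgress poll is its own short-lived connection, so connection-establishment bursts are far larger than in BINARY mode.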
[jira] [Updated] (HIVE-20021) LLAP: Fall back to Synthetic File-ids when getting a HdfsConstants.GRANDFATHER_INODE_ID
[ https://issues.apache.org/jira/browse/HIVE-20021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20021: --- Labels: Branch3Candidate (was: ) > LLAP: Fall back to Synthetic File-ids when getting a > HdfsConstants.GRANDFATHER_INODE_ID > --- > > Key: HIVE-20021 > URL: https://issues.apache.org/jira/browse/HIVE-20021 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Fix For: 3.2.0 > > Attachments: HIVE-20021.1.patch > > > HDFS client implementations have multiple server implementations, which do > not all support the inodes for file locations. > If the client returns a 0 InodeId, fall back to the synthetic ones. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries
[ https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-19985: --- Labels: Branch3Candidate (was: ) > ACID: Skip decoding the ROW__ID sections for read-only queries > --- > > Key: HIVE-19985 > URL: https://issues.apache.org/jira/browse/HIVE-19985 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > > For a base_n file there are no aborted transactions within the file and if > there are no pending delete deltas, the entire ACID ROW__ID can be skipped > for all read-only queries (i.e SELECT), though it still needs to be projected > out for MERGE, UPDATE and DELETE queries. > This patch tries to entirely ignore the ACID ROW__ID fields for all tables > where there are no possible deletes or aborted transactions for an ACID split. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20113: --- Labels: Branch3Candidate (was: ) > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20113.1.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20043) HiveServer2: SessionState has a static sync block around an AtomicBoolean
[ https://issues.apache.org/jira/browse/HIVE-20043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20043: --- Labels: Branch3Candidate Concurrency (was: Concurrency) > HiveServer2: SessionState has a static sync block around an AtomicBoolean > - > > Key: HIVE-20043 > URL: https://issues.apache.org/jira/browse/HIVE-20043 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Laszlo Bodor >Priority: Major > Labels: Branch3Candidate, Concurrency > Attachments: HIVE-20043.01.patch, HIVE-20043.02.patch > > > {code} > private static void start(SessionState startSs, boolean isAsync, LogHelper > console) { > ... > synchronized(SessionState.class) { > if (!startSs.isStarted.compareAndSet(false, true)) { > return; > } > } > {code} > startSs.isStarted is an AtomicBoolean, which makes it hard to know why this > code is locked with a static lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
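Since AtomicBoolean.compareAndSet is itself atomic, the static synchronized block quoted in HIVE-20043 adds a class-wide lock without adding safety. A minimal sketch of the lock-free start-once pattern (hypothetical class, not the actual SessionState fix):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: compareAndSet already guarantees that exactly one caller wins the
// false -> true transition, so the synchronized(SessionState.class) wrapper
// only serializes every session start behind one shared monitor.
public class StartOnceSketch {
  private final AtomicBoolean isStarted = new AtomicBoolean(false);

  // Lock-free equivalent of the guarded region: returns true for exactly one
  // caller and false for everyone else, with no class-wide lock.
  boolean startOnce() {
    return isStarted.compareAndSet(false, true);
  }

  public static void main(String[] args) throws InterruptedException {
    StartOnceSketch session = new StartOnceSketch();
    AtomicInteger winners = new AtomicInteger();
    Thread[] threads = new Thread[8];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(() -> {
        if (session.startOnce()) {
          winners.incrementAndGet();
        }
      });
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println(winners.get()); // exactly 1, regardless of interleaving
  }
}
```

If the original block was meant to protect more than the flag flip (e.g. other static state mutated after a successful CAS), that invariant should be stated and locked explicitly rather than implied.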
[jira] [Updated] (HIVE-20103) WM: Only Aggregate DAG counters if at least one is used
[ https://issues.apache.org/jira/browse/HIVE-20103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20103: --- Labels: Branch3Candidate (was: ) > WM: Only Aggregate DAG counters if at least one is used > --- > > Key: HIVE-20103 > URL: https://issues.apache.org/jira/browse/HIVE-20103 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 3.0.0, 4.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Labels: Branch3Candidate > Attachments: HIVE-20103.1.patch > > > {code} > status = > dagClient.getDAGStatus(EnumSet.of(StatusGetOpts.GET_COUNTERS), checkInterval); > TezCounters dagCounters = status.getDAGCounters(); > ... > if (dagCounters != null && wmContext != null) { > Set desiredCounters = wmContext.getSubscribedCounters(); > if (desiredCounters != null && !desiredCounters.isEmpty()) { > Map currentCounters = getCounterValues(dagCounters, > vertexNames, vertexProgressMap, > desiredCounters, done); > {code} > Skip collecting DAG counters unless there at least one desired counter in > wmContext. 
> The AM has a hard-lock around the counters, so the current jstacks are full > of > {code} >java.lang.Thread.State: RUNNABLE > at java.lang.String.intern(Native Method) > at > org.apache.hadoop.util.StringInterner.weakIntern(StringInterner.java:71) > at > org.apache.tez.common.counters.GenericCounter.(GenericCounter.java:50) > at > org.apache.tez.common.counters.TezCounters$GenericGroup.newCounter(TezCounters.java:65) > at > org.apache.tez.common.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:92) > at > org.apache.tez.common.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:104) > - locked <0x7efb3ac7af38> (a > org.apache.tez.common.counters.TezCounters$GenericGroup) > at > org.apache.tez.common.counters.AbstractCounterGroup.aggrAllCounters(AbstractCounterGroup.java:204) > at > org.apache.tez.common.counters.AbstractCounters.aggrAllCounters(AbstractCounters.java:372) > - eliminated <0x7efb3ac64ee8> (a > org.apache.tez.common.counters.TezCounters) > at > org.apache.tez.common.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:357) > - locked <0x7efb3ac64ee8> (a > org.apache.tez.common.counters.TezCounters) > at > org.apache.tez.dag.app.dag.impl.TaskImpl.getCounters(TaskImpl.java:462) > at > org.apache.tez.dag.app.dag.impl.VertexImpl.aggrTaskCounters(VertexImpl.java:1342) > at > org.apache.tez.dag.app.dag.impl.VertexImpl.getAllCounters(VertexImpl.java:1202) > at > org.apache.tez.dag.app.dag.impl.DAGImpl.aggrTaskCounters(DAGImpl.java:755) > at > org.apache.tez.dag.app.dag.impl.DAGImpl.getAllCounters(DAGImpl.java:704) > at > org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:901) > at > org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:940) > at > org.apache.tez.dag.api.client.DAGClientHandler.getDAGStatus(DAGClientHandler.java:73) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
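The proposed guard — skip the counter aggregation entirely unless at least one subscriber wants a counter — can be sketched like this (hypothetical names; the real check sits in Hive's Tez job monitoring loop around getDAGStatus, and "CPU_MS" is an illustrative counter name):

```java
import java.util.Collections;
import java.util.Set;

// Sketch of the short-circuit: request DAG counters (an expensive, lock-heavy
// aggregation inside the Tez AM, per the jstack above) only when some
// workload-management subscriber actually asked for at least one counter.
public class CounterGuardSketch {

  interface WmContext {
    Set<String> getSubscribedCounters();
  }

  static int aggregations = 0; // stands in for the expensive AM-side work

  static void pollDagStatus(WmContext wmContext) {
    Set<String> desired =
        wmContext == null ? null : wmContext.getSubscribedCounters();
    if (desired == null || desired.isEmpty()) {
      return; // nobody subscribed: never take the AM's counter lock
    }
    aggregations++; // would call getDAGStatus(GET_COUNTERS, ...) here
  }

  public static void main(String[] args) {
    pollDagStatus(null);                                  // skipped
    pollDagStatus(Collections::emptySet);                 // skipped
    pollDagStatus(() -> Collections.singleton("CPU_MS")); // aggregates once
    System.out.println(aggregations); // 1
  }
}
```

Hoisting the subscriber check ahead of the status call is what keeps the AM's String.intern-heavy counter aggregation off the hot path when no one is listening.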
[jira] [Updated] (HIVE-20098) Statistics: NPE when getting Date column partition statistics
[ https://issues.apache.org/jira/browse/HIVE-20098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20098: --- Labels: Branch3Candidate (was: ) > Statistics: NPE when getting Date column partition statistics > - > > Key: HIVE-20098 > URL: https://issues.apache.org/jira/browse/HIVE-20098 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore >Affects Versions: 1.2.1, 4.0.0 > Environment: Tested on versions `1.2.1` and the latest 4.0.0-SNAPSHOT >Reporter: Andy Rosa >Assignee: Andy Rosa >Priority: Major > Labels: Branch3Candidate > Attachments: > 0001-Fix-NPE-when-getting-statistics-for-date-column.patch, HIVE-20098.1.patch > > > The issue reproduces only for a date column for a partitioned table. It > reproduces only if the date column has all the values set to null, and if the > partition is not empty. > Here is a quick reproducer: > > > {code:java} > CREATE TABLE dummy_table ( > c_date DATE, > c_bigint BIGINT > ) > PARTITIONED BY (ds STRING); > INSERT OVERWRITE TABLE dummy_table PARTITION (ds='2018-01-01') SELECT > CAST(null AS DATE), CAST(null AS BIGINT) FROM ; > ANALYZE TABLE dummy_table COMPUTE STATISTICS FOR COLUMNS; > DESCRIBE FORMATTED dummy_table.c_bigint PARTITION (ds='2018-01-01'); > DESCRIBE FORMATTED dummy_table.c_date PARTITION (ds='2018-01-01'); > {code} > > > The first `DESCRIBE FORMATTED` statement succeeds, when the second fails with > an `NPE` > > It happens because the null check is missing when converting Object from the > ObjectStore to the Thrift object. The null check is missing only in the date > statistics conversion for the partitioned table. 
> Missing: > [https://github.com/apache/hive/blob/master/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java#L469] > Present: > https://github.com/apache/hive/blob/master/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java#L298 > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20041) ResultsCache: Improve logging for concurrent queries
[ https://issues.apache.org/jira/browse/HIVE-20041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20041: --- Labels: Branch3Candidate (was: ) > ResultsCache: Improve logging for concurrent queries > > > Key: HIVE-20041 > URL: https://issues.apache.org/jira/browse/HIVE-20041 > Project: Hive > Issue Type: Improvement > Components: Diagnosability >Reporter: Gopal V >Assignee: Laszlo Bodor >Priority: Minor > Labels: Branch3Candidate > Attachments: HIVE-20041.01.patch, HIVE-20041.02.patch, > HIVE-20041.03.patch, HIVE-20041.04.patch > > > The logging for QueryResultsCache ends up printing information without > context, like > {code} > 2018-06-30T17:48:45,502 INFO [HiveServer2-Background-Pool: Thread-166] > results.QueryResultsCache: Waiting on pending cacheEntry > {code} > {code} > 2018-06-30T17:50:17,963 INFO [HiveServer2-Background-Pool: Thread-145] > ql.Driver: savedToCache: true > {code} > The previous lines for this are in DEBUG level, so the logging ends up being > useless at INFO level to debug. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20098) Statistics: NPE when getting Date column partition statistics
[ https://issues.apache.org/jira/browse/HIVE-20098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535443#comment-16535443 ] Hive QA commented on HIVE-20098: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12930466/HIVE-20098.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14631 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12431/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12431/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12431/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12930466 - PreCommit-HIVE-Build > Statistics: NPE when getting Date column partition statistics > - > > Key: HIVE-20098 > URL: https://issues.apache.org/jira/browse/HIVE-20098 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore >Affects Versions: 1.2.1, 4.0.0 > Environment: Tested on versions `1.2.1` and the latest 4.0.0-SNAPSHOT >Reporter: Andy Rosa >Assignee: Andy Rosa >Priority: Major > Attachments: > 0001-Fix-NPE-when-getting-statistics-for-date-column.patch, HIVE-20098.1.patch > > > The issue reproduces only for a date column for a partitioned table. It > reproduces only if the date column has all the values set to null, and if the > partition is not empty. 
> Here is a quick reproducer: > > > {code:java} > CREATE TABLE dummy_table ( > c_date DATE, > c_bigint BIGINT > ) > PARTITIONED BY (ds STRING); > INSERT OVERWRITE TABLE dummy_table PARTITION (ds='2018-01-01') SELECT > CAST(null AS DATE), CAST(null AS BIGINT) FROM ; > ANALYZE TABLE dummy_table COMPUTE STATISTICS FOR COLUMNS; > DESCRIBE FORMATTED dummy_table.c_bigint PARTITION (ds='2018-01-01'); > DESCRIBE FORMATTED dummy_table.c_date PARTITION (ds='2018-01-01'); > {code} > > > The first `DESCRIBE FORMATTED` statement succeeds, when the second fails with > an `NPE` > > It happens because the null check is missing when converting Object from the > ObjectStore to the Thrift object. The null check is missing only in the date > statistics conversion for the partitioned table. > Missing: > [https://github.com/apache/hive/blob/master/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java#L469] > Present: > https://github.com/apache/hive/blob/master/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java#L298 > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20100) OpTraits : Select Optraits should stop when a mismatch is detected
[ https://issues.apache.org/jira/browse/HIVE-20100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-20100: -- Attachment: HIVE-20100.2.patch > OpTraits : Select Optraits should stop when a mismatch is detected > -- > > Key: HIVE-20100 > URL: https://issues.apache.org/jira/browse/HIVE-20100 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-20100.1.patch, HIVE-20100.2.patch > > > The select operator's optraits logic as stated in the comment is, > // For bucket columns > // If all the columns match to the parent, put them in the bucket cols > // else, add empty list. > // For sort columns > // Keep the subset of all the columns as long as order is maintained. > > However, this is not happening due to a bug. The bool found is never reset, > so if a single match is found, the value remains true and allows the optraits > get populated with partial list of columns for bucket col which is incorrect. > This may lead to creation of SMB join which should not happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-20113: --- Attachment: HIVE-20113.1.patch > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > Attachments: HIVE-20113.1.patch > > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20113) Shuffle avoidance: Disable 1-1 edges for sorted shuffle
[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-20113: -- Assignee: Gopal V > Shuffle avoidance: Disable 1-1 edges for sorted shuffle > > > Key: HIVE-20113 > URL: https://issues.apache.org/jira/browse/HIVE-20113 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Gopal V >Assignee: Gopal V >Priority: Major > > The sorted shuffle avoidance can have some issues when the shuffle data gets > broken up into multiple chunks on disk. > The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to > have a final merge at all, it should open a single compressed file and write > a single index entry. > Until the shuffle issue is resolved & a lot more testing, it is prudent to > disable the optimization for sorted shuffle edges and stop rewriting the > RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20111: --- Reporter: Romil Choksi (was: Dileep Kumar Chiguruvada) > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20111.patch > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20112) Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20112: --- Description: Similar to HIVE-20085 and HIVE-20111. Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional (was: Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional) > Accumulo-Hive (managed) table creation fails with strict managed table > checks: Table is marked as a managed table but is not transactional > -- > > Key: HIVE-20112 > URL: https://issues.apache.org/jira/browse/HIVE-20112 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > > Similar to HIVE-20085 and HIVE-20111. Accumulo-Hive (managed) table creation > fails with strict managed table checks: Table is marked as a managed table > but is not transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-20112) Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-20112 started by Jesus Camacho Rodriguez. -- > Accumulo-Hive (managed) table creation fails with strict managed table > checks: Table is marked as a managed table but is not transactional > -- > > Key: HIVE-20112 > URL: https://issues.apache.org/jira/browse/HIVE-20112 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > > Similar to HIVE-20085 and HIVE-20111. Accumulo-Hive (managed) table creation > fails with strict managed table checks: Table is marked as a managed table > but is not transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20112) Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20112: --- Reporter: Romil Choksi (was: Dileep Kumar Chiguruvada) > Accumulo-Hive (managed) table creation fails with strict managed table > checks: Table is marked as a managed table but is not transactional > -- > > Key: HIVE-20112 > URL: https://issues.apache.org/jira/browse/HIVE-20112 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Romil Choksi >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > > Similar to HIVE-20085 and HIVE-20111. Accumulo-Hive (managed) table creation > fails with strict managed table checks: Table is marked as a managed table > but is not transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20112) Accumulo-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-20112: -- > Accumulo-Hive (managed) table creation fails with strict managed table > checks: Table is marked as a managed table but is not transactional > -- > > Key: HIVE-20112 > URL: https://issues.apache.org/jira/browse/HIVE-20112 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-20111 started by Jesus Camacho Rodriguez. -- > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20111.patch > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20111: --- Attachment: HIVE-20111.patch > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20111.patch > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20111: --- Status: Patch Available (was: In Progress) > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20111.patch > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-20111: -- Assignee: Jesus Camacho Rodriguez (was: Nishant Bangarwa) > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20111.patch > > > Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict > managed table checks: Table is marked as a managed table but is not > transactional -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD caching are disabled
[ https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-20032: Attachment: HIVE-20032.2.patch > Don't serialize hashCode when groupByShuffle and RDD caching are disabled > - > > Key: HIVE-20032 > URL: https://issues.apache.org/jira/browse/HIVE-20032 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch > > > Follow-up to HIVE-15104: if we don't enable RDD caching or groupByShuffles, > then we don't need to serialize the hashCode when shuffling data in HoS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD caching are disabled
[ https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-20032: Attachment: (was: HIVE-20032.2.patch) > Don't serialize hashCode when groupByShuffle and RDD caching are disabled > - > > Key: HIVE-20032 > URL: https://issues.apache.org/jira/browse/HIVE-20032 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch > > > Follow-up to HIVE-15104: if we don't enable RDD caching or groupByShuffles, > then we don't need to serialize the hashCode when shuffling data in HoS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD caching are disabled
[ https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535353#comment-16535353 ] Sahil Takiar edited comment on HIVE-20032 at 7/6/18 9:57 PM: - Attaching dummy patch with {{hive.spark.optimize.shuffle.serde}} set to true by default, {{hive.combine.equivalent.work.optimization}} to false by default, and {{hive.spark.use.groupby.shuffle}} to false, just to see if there are any HoS test failures (besides explain plan diffs). was (Author: stakiar): Attaching dummy patch with {{hive.spark.optimize.shuffle.serde}} set to true by default and {{hive.combine.equivalent.work.optimization}} to false by default, just to see if there are any HoS test failures (besides explain plan diffs). > Don't serialize hashCode when groupByShuffle and RDD caching are disabled > - > > Key: HIVE-20032 > URL: https://issues.apache.org/jira/browse/HIVE-20032 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch > > > Follow-up to HIVE-15104: if we don't enable RDD caching or groupByShuffles, > then we don't need to serialize the hashCode when shuffling data in HoS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
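For reference, the three defaults toggled by the dummy patch correspond to the following session-level settings. This is illustrative only: the patch flips the compiled-in defaults in HiveConf rather than issuing per-session SET commands.

```sql
-- Session-level equivalents of the defaults flipped in the dummy patch
SET hive.spark.optimize.shuffle.serde=true;
SET hive.combine.equivalent.work.optimization=false;
SET hive.spark.use.groupby.shuffle=false;
```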
[jira] [Commented] (HIVE-20098) Statistics: NPE when getting Date column partition statistics
[ https://issues.apache.org/jira/browse/HIVE-20098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535398#comment-16535398 ] Hive QA commented on HIVE-20098: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 19s{color} | {color:blue} standalone-metastore in master has 217 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 20m 23s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12431/dev-support/hive-personality.sh | | git revision | master / eae5225 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: standalone-metastore U: standalone-metastore | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12431/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Statistics: NPE when getting Date column partition statistics > - > > Key: HIVE-20098 > URL: https://issues.apache.org/jira/browse/HIVE-20098 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore >Affects Versions: 1.2.1, 4.0.0 > Environment: Tested on versions `1.2.1` and the latest 4.0.0-SNAPSHOT >Reporter: Andy Rosa >Assignee: Andy Rosa >Priority: Major > Attachments: > 0001-Fix-NPE-when-getting-statistics-for-date-column.patch, HIVE-20098.1.patch > > > The issue reproduces only for a date column for a partitioned table. It > reproduces only if the date column has all the values set to null, and if the > partition is not empty. 
> Here is a quick reproducer:
>
> {code:java}
> CREATE TABLE dummy_table (
>   c_date DATE,
>   c_bigint BIGINT
> )
> PARTITIONED BY (ds STRING);
> INSERT OVERWRITE TABLE dummy_table PARTITION (ds='2018-01-01') SELECT CAST(null AS DATE), CAST(null AS BIGINT) FROM ;
> ANALYZE TABLE dummy_table COMPUTE STATISTICS FOR COLUMNS;
> DESCRIBE FORMATTED dummy_table.c_bigint PARTITION (ds='2018-01-01');
> DESCRIBE FORMATTED dummy_table.c_date PARTITION (ds='2018-01-01');
> {code}
>
> The first `DESCRIBE FORMATTED` statement succeeds, while the second fails with an `NPE`.
>
> It happens because a null check is missing when converting the object from the ObjectStore to the Thrift object. The null check is missing only in the date statistics conversion for the partitioned table.
> Missing:
>
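A minimal sketch of the kind of guard the fix adds. The class and field names below are illustrative stand-ins, not the actual Hive metastore or Thrift types; the point is that the low/high date values must be null-checked before conversion, since both are null when every value in the partition's date column is NULL.

```java
// Hypothetical stand-ins for the metastore-side and Thrift-side stats
// objects; names are illustrative, not the real Hive classes.
public class DateStatsConverter {
    /** Metastore-side row: low/high are null when the column is all NULLs. */
    public static class MDateStats {
        public final Long lowDays, highDays;  // days since epoch, nullable
        public final long numNulls, numDVs;
        public MDateStats(Long lowDays, Long highDays, long numNulls, long numDVs) {
            this.lowDays = lowDays;
            this.highDays = highDays;
            this.numNulls = numNulls;
            this.numDVs = numDVs;
        }
    }

    /** Thrift-side holder (simplified). */
    public static class DateColumnStatsData {
        public Long lowValue, highValue;  // stay null when unset
        public long numNulls, numDVs;
    }

    public static DateColumnStatsData convert(MDateStats m) {
        DateColumnStatsData d = new DateColumnStatsData();
        // The reported NPE: unboxing m.lowDays/m.highDays unconditionally
        // blows up when the partition's date column contains only NULLs.
        // Guard before converting:
        if (m.lowDays != null) {
            d.lowValue = m.lowDays;
        }
        if (m.highDays != null) {
            d.highValue = m.highDays;
        }
        d.numNulls = m.numNulls;
        d.numDVs = m.numDVs;
        return d;
    }
}
```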
[jira] [Updated] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20111: --- Description: Similar to HIVE-20085. HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional (was: Druid-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional {code} drop table if exists calcs; create table calcs STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' TBLPROPERTIES ( "druid.segment.granularity" = "MONTH", "druid.query.granularity" = "DAY") AS SELECT cast(datetime0 as timestamp with local time zone) `__time`, key, str0, str1, str2, str3, date0, date1, date2, date3, time0, time1, datetime0, datetime1, zzz, cast(bool0 as string) bool0, cast(bool1 as string) bool1, cast(bool2 as string) bool2, cast(bool3 as string) bool3, int0, int1, int2, int3, num0, num1, num2, num3, num4 from tableau_orc.calcs; 2018-07-03 04:57:31,911|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Status: Running (Executing on YARN cluster with App id application_1530592209763_0009) ... ... 
2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_BYTES_TO_MEM: 0 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_PHASE_TIME: 330 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPILLED_RECORDS: 17 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : TaskCounter_Reducer_2_OUTPUT_out_Reducer_2: 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : OUTPUT_RECORDS: 0 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : org.apache.hadoop.hive.llap.counters.LlapWmCounters: 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_QUEUED_NS: 0 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_RUNNING_NS: 0 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPECULATIVE_QUEUED_NS: 2162643606 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPECULATIVE_RUNNING_NS: 12151664909 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-2:DEPENDENCY_COLLECTION] in serial mode 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-0:MOVE] in serial mode 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Moving data to directory hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/calcs from 
hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/.hive-staging_hive_2018-07-03_04-57-27_351_7124633902209008283-3/-ext-10002 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task [Stage-4:DDL] in serial mode 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table druid_tableau.calcs failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.) 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Completed executing command(queryId=hive_20180703045727_c39c40d2-7d4a-46c7-a36d-7925e7c4a788); Time taken: 6.794 seconds 2018-07-03 04:57:36,337|INFO|Thread-721|machine.py:111 - tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table druid_tableau.calcs failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.) (state=08S01,code=1) {code} This will not allow Druid tables to be managed, so it is not straightforward to create Druid tables. While trying to change them to external tables, we see the issues below: 1) INSERT / INSERT OVERWRITE / DROP are supported by Hive managed tables (not external); we have a few tests which cover this. What would be the course of
[jira] [Assigned] (HIVE-20111) HBase-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-20111: -- > HBase-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20111 > URL: https://issues.apache.org/jira/browse/HIVE-20111 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Nishant Bangarwa >Priority: Major > Fix For: 3.0.0 > > > Druid-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > {code} > drop table if exists calcs; > create table calcs > STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' > TBLPROPERTIES ( > "druid.segment.granularity" = "MONTH", > "druid.query.granularity" = "DAY") > AS SELECT > cast(datetime0 as timestamp with local time zone) `__time`, > key, > str0, str1, str2, str3, > date0, date1, date2, date3, > time0, time1, > datetime0, datetime1, > zzz, > cast(bool0 as string) bool0, > cast(bool1 as string) bool1, > cast(bool2 as string) bool2, > cast(bool3 as string) bool3, > int0, int1, int2, int3, > num0, num1, num2, num3, num4 > from tableau_orc.calcs; > 2018-07-03 04:57:31,911|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Status: Running > (Executing on YARN cluster with App id application_1530592209763_0009) > ... > ... 
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_BYTES_TO_MEM: > 0 > 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_PHASE_TIME: > 330 > 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPILLED_RECORDS: 17 > 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > TaskCounter_Reducer_2_OUTPUT_out_Reducer_2: > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : OUTPUT_RECORDS: 0 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > org.apache.hadoop.hive.llap.counters.LlapWmCounters: > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_QUEUED_NS: > 0 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > GUARANTEED_RUNNING_NS: 0 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > SPECULATIVE_QUEUED_NS: 2162643606 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > SPECULATIVE_RUNNING_NS: 12151664909 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task > [Stage-2:DEPENDENCY_COLLECTION] in serial mode > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task > [Stage-0:MOVE] in serial mode > 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Moving data to > directory > 
hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/calcs > from > hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/.hive-staging_hive_2018-07-03_04-57-27_351_7124633902209008283-3/-ext-10002 > 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task > [Stage-4:DDL] in serial mode > 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|ERROR : FAILED: Execution > Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. > MetaException(message:Table druid_tableau.calcs failed strict managed table > checks due to the following reason: Table is marked as a managed table but is > not transactional.) > 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Completed executing > command(queryId=hive_20180703045727_c39c40d2-7d4a-46c7-a36d-7925e7c4a788); > Time taken: 6.794 seconds > 2018-07-03 04:57:36,337|INFO|Thread-721|machine.py:111 - >
[jira] [Comment Edited] (HIVE-19820) add ACID stats support to background stats updater and fix a bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535380#comment-16535380 ] Sergey Shelukhin edited comment on HIVE-19820 at 7/6/18 9:39 PM: - Trying to write analyze pre-committing stats state makes me think it's still going to be potentially racy, esp since most DB txns in metastore use read_committed instead of a higher concurrency level so it's possible for some write to sneak in that invalidates the stats without updating write ID, and analyze won't notice it. On top of that logic is also extremely ugly and hacky and requires a new API that we then cannot remove. Thinking of alternative solutions (proper one would be https://issues.apache.org/jira/browse/HIVE-20109 but there's no time for that... maybe we can make sure that any path that sets the flag also sets write ID, and any table alteration verifies that you cannot change the stats without setting write ID (and throws otherwise, so we can catch any missing paths in tests). We can also add back validWriteIdList in TBLS/PARTITIONS tables and only populate it on analyze for an extra check, then clear it after any other valid stats update. I also found additional path in set table stats/set partition stats that doesn't appear to be updating stats state in the same DB txn as checking stats validity, when merging stats. It may still be valid but some tests will be needed for parallel updates and reads. This and general absence of any negative tests (parallel inserts/updates/deletes, or even reads with someone updating stats in parallel; parallel insert+analyze, various ways to invalidate the stats like truncate, etc. - we only have a test for a non-parallel insert without stats collection right now for the negatives) gives me pause. I think it would be better to push it to M05 and ideally fix HIVE-20109 (or at least as per above make sure every stats state change sets write ID so we can detect parallel changes), and add more tests. 
was (Author: sershe): Trying to write analyze pre-committing stats state makes me think it's still going to be potentially racy, esp since most DB txns in metastore use read_committed instead of a higher concurrency level so it's possible for some write to sneak in that invalidates the stats without updating write ID, and analyze won't notice it. On top of that logic is also extremely ugly and hacky and requires a new API that we then cannot remove. Thinking of alternative solutions (proper one would be https://issues.apache.org/jira/browse/HIVE-20109 but there's no time for that... maybe we can make sure that any path that sets the flag also sets write ID, and any table alteration verifies that you cannot change the stats without setting write ID (and throws otherwise, so we can catch any missing paths in tests). We can also add back validWriteIdList in TBLS/PARTITIONS tables and only populate it on analyze for an extra check, then clear it after any other valid stats update. I also found additional path in set table stats/set partition stats that doesn't appear to be updating stats state in the same DB txn as checking stats validity, when merging stats. It may still be valid but some tests will be needed for parallel updates and reads. This and general absence of any negative tests (parallel inserts/updates/deletes, or even reads with someone updating stats in parallel; parallel insert+analyze, various ways to invalidate the stats like truncate, etc. - we only have a test for a non-parallel insert without stats collection right now for the negatives) gives me pause. I think it would be better to push it to M05 and ideally fix HIVE-20109 (or at least as per above make sure every stats state change sets write ID so we can detect parallel changes), and add more tests. Right now I feel this is too hot and way too under-tested; I doubt system tests that are not specifically targeted at this will catch any subtle bugs, or even any glaring bugs e.g. 
for inserts and select count running in parallel. cc [~hagleitn] [~ekoifman] [~steveyeom2017] > add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.02-master-txnstats.patch, HIVE-19820.03-master-txnstats.patch, > HIVE-19820.04-master-txnstats.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards
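One of the mitigations suggested in the comment above — throw whenever the stats state changes without the write ID being set, so missing code paths surface in tests — can be sketched as follows. This is a hypothetical illustration, not Hive's actual alter-table verification code; the snapshot fields are simplified stand-ins for the stats-accurate flag and the table's write ID.

```java
// Hypothetical guard: reject any table alteration that changes the
// stats-accurate state without also advancing the write ID, so that
// code paths that forget to set the write ID fail loudly in tests.
public class StatsWriteIdGuard {
    public static class TableSnapshot {
        public final boolean statsAccurate;  // stand-in for COLUMN_STATS_ACCURATE
        public final long writeId;
        public TableSnapshot(boolean statsAccurate, long writeId) {
            this.statsAccurate = statsAccurate;
            this.writeId = writeId;
        }
    }

    /** Throws if the stats state changed but the write ID did not advance. */
    public static void checkAlter(TableSnapshot before, TableSnapshot after) {
        boolean statsChanged = before.statsAccurate != after.statsAccurate;
        if (statsChanged && after.writeId <= before.writeId) {
            throw new IllegalStateException(
                "stats state changed without setting a new write ID");
        }
    }
}
```

Under this scheme a concurrent writer that invalidates stats without bumping the write ID is detected at alteration time rather than silently producing stale-but-trusted statistics.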
[jira] [Commented] (HIVE-20110) Bypass HMS CachedStore for transactional stats
[ https://issues.apache.org/jira/browse/HIVE-20110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535382#comment-16535382 ] Steve Yeom commented on HIVE-20110: --- Patch 04 of HIVE-19532 already has a fix. > Bypass HMS CachedStore for transactional stats > -- > > Key: HIVE-20110 > URL: https://issues.apache.org/jira/browse/HIVE-20110 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HIVE-20110) Bypass HMS CachedStore for transactional stats
[ https://issues.apache.org/jira/browse/HIVE-20110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom resolved HIVE-20110. --- Resolution: Fixed > Bypass HMS CachedStore for transactional stats > -- > > Key: HIVE-20110 > URL: https://issues.apache.org/jira/browse/HIVE-20110 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19820) add ACID stats support to background stats updater and fix a bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535380#comment-16535380 ] Sergey Shelukhin edited comment on HIVE-19820 at 7/6/18 9:38 PM: - Trying to write analyze pre-committing stats state makes me think it's still going to be potentially racy, esp since most DB txns in metastore use read_committed instead of a higher concurrency level so it's possible for some write to sneak in that invalidates the stats without updating write ID, and analyze won't notice it. On top of that logic is also extremely ugly and hacky and requires a new API that we then cannot remove. Thinking of alternative solutions (proper one would be https://issues.apache.org/jira/browse/HIVE-20109 but there's no time for that... maybe we can make sure that any path that sets the flag also sets write ID, and any table alteration verifies that you cannot change the stats without setting write ID (and throws otherwise, so we can catch any missing paths in tests). We can also add back validWriteIdList in TBLS/PARTITIONS tables and only populate it on analyze for an extra check, then clear it after any other valid stats update. I also found additional path in set table stats/set partition stats that doesn't appear to be updating stats state in the same DB txn as checking stats validity, when merging stats. It may still be valid but some tests will be needed for parallel updates and reads. This and general absence of any negative tests (parallel inserts/updates/deletes, or even reads with someone updating stats in parallel; parallel insert+analyze, various ways to invalidate the stats like truncate, etc. - we only have a test for a non-parallel insert without stats collection right now for the negatives) gives me pause. I think it would be better to push it to M05 and ideally fix HIVE-20109 (or at least as per above make sure every stats state change sets write ID so we can detect parallel changes), and add more tests. 
Right now I feel this is too hot and way too under-tested; I doubt system tests that are not specifically targeted at this will catch any subtle bugs, or even any glaring bugs, e.g. for inserts and a select count running in parallel. cc [~hagleitn] [~ekoifman] [~steveyeom2017]
> add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.02-master-txnstats.patch, HIVE-19820.03-master-txnstats.patch, > HIVE-19820.04-master-txnstats.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
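The race the comment describes - stats being trusted even though a write snuck in without bumping the write ID - can be sketched in a few lines. This is a minimal illustration in Python, not Hive's actual metastore code; the class and function names are hypothetical, and the real check involves a validWriteIdList snapshot rather than a single counter.

```python
# Hypothetical sketch of write-ID-based stats validation. The invariant the
# comment proposes: every path that changes table data must bump the write ID,
# so stale stats can be detected by comparing write IDs.

class Table:
    def __init__(self):
        self.write_id = 0           # last committed write to the table
        self.stats_write_id = None  # write ID the stats were computed at
        self.row_count = None

def commit_write(table):
    # Any data change must bump the write ID. A code path that forgets
    # this is exactly the bug class the proposed throwing check would catch.
    table.write_id += 1

def analyze(table, row_count):
    # Analyze records both the stats and the write ID they correspond to.
    table.row_count = row_count
    table.stats_write_id = table.write_id

def stats_valid(table):
    # Stats are usable only if no write happened after they were computed.
    return table.stats_write_id == table.write_id

t = Table()
commit_write(t)
analyze(t, row_count=100)
assert stats_valid(t)        # no writes since analyze
commit_write(t)              # a write "sneaks in" without re-running analyze
assert not stats_valid(t)    # detected, because the write ID was bumped
```

If a write path updated the data but skipped `commit_write`, `stats_valid` would wrongly return True - which is why the comment suggests making every stats state change set the write ID and throwing when it doesn't.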
[jira] [Commented] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535380#comment-16535380 ] Sergey Shelukhin commented on HIVE-19820: - Trying to write analyze pre-committing stats state makes me think it's still going to be potentially racy, especially since most DB txns in the metastore use read_committed rather than a higher isolation level, so it's possible for some write to sneak in that invalidates the stats without updating the write ID, and analyze won't notice it. On top of that, the logic is also extremely ugly and hacky and requires a new API that we then cannot remove. Thinking of alternative solutions (the proper one would be https://issues.apache.org/jira/browse/HIVE-20109, but there's no time for that)... maybe we can make sure that any path that sets the flag also sets the write ID, and that any table alteration verifies that you cannot change the stats without setting the write ID (and throws otherwise, so we can catch any missing paths in tests). I also found an additional path in set table stats/set partition stats that doesn't appear to update stats state in the same DB txn as the stats validity check when merging stats. It may still be valid, but some tests will be needed for parallel updates and reads. This, plus the general absence of any negative tests (parallel inserts/updates/deletes, or even reads with someone updating stats in parallel; parallel insert+analyze; various ways to invalidate the stats like truncate, etc. - right now we only have a test for a non-parallel insert without stats collection), gives me pause. I think it would be better to push it to M05 and ideally fix HIVE-20109 (or at least, as per above, make sure every stats state change sets the write ID so we can detect parallel changes), and add more tests. Right now I feel this is too hot and way too under-tested; I doubt system tests that are not specifically targeted at this will catch any subtle bugs, or even any glaring bugs, e.g. 
for inserts and select count running in parallel. cc [~hagleitn] [~ekoifman] [~steveyeom2017] > add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.02-master-txnstats.patch, HIVE-19820.03-master-txnstats.patch, > HIVE-19820.04-master-txnstats.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20110) Bypass HMS CachedStore for transactional stats
[ https://issues.apache.org/jira/browse/HIVE-20110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-20110: - Assignee: Steve Yeom > Bypass HMS CachedStore for transactional stats > -- > > Key: HIVE-20110 > URL: https://issues.apache.org/jira/browse/HIVE-20110 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20102) Add a couple of additional tests for query parsing
[ https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535372#comment-16535372 ] Ashutosh Chauhan commented on HIVE-20102: - +1 > Add a couple of additional tests for query parsing > -- > > Key: HIVE-20102 > URL: https://issues.apache.org/jira/browse/HIVE-20102 > Project: Hive > Issue Type: Improvement > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20102.01.patch, HIVE-20102.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20100) OpTraits : Select Optraits should stop when a mismatch is detected
[ https://issues.apache.org/jira/browse/HIVE-20100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535373#comment-16535373 ] Hive QA commented on HIVE-20100: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12930472/HIVE-20100.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 14631 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin] (batchId=173) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_join] (batchId=165) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12430/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12430/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12430/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12930472 - PreCommit-HIVE-Build > OpTraits : Select Optraits should stop when a mismatch is detected > -- > > Key: HIVE-20100 > URL: https://issues.apache.org/jira/browse/HIVE-20100 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-20100.1.patch > > > The select operator's optraits logic as stated in the comment is, > // For bucket columns > // If all the columns match to the parent, put them in the bucket cols > // else, add empty list. > // For sort columns > // Keep the subset of all the columns as long as order is maintained. 
> > However, this is not happening due to a bug. The bool found is never reset, > so if a single match is found, the value remains true and allows the optraits > to get populated with a partial list of columns for the bucket cols, which is incorrect. > This may lead to creation of an SMB join which should not happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
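The never-reset flag bug described above can be shown with a small sketch. This is a simplified, hypothetical Python rendering of the pattern, not the actual select-operator optraits code (which is Java): once `found` goes stale, columns leak through even when they don't match.

```python
# Buggy pattern: `found` is initialized once outside the loop, so a single
# earlier match leaves it True for every later bucket column.
def match_columns_buggy(bucket_cols, select_cols):
    matched = []
    found = False                      # bug: should be reset per bucket column
    for col in bucket_cols:
        for sel in select_cols:
            if col == sel:
                found = True
        if found:                      # stale True from a previous column
            matched.append(col)
    return matched

# Fixed pattern: decide per column; any single mismatch invalidates bucketing
# and yields the empty list, as the comment in the code requires.
def match_columns_fixed(bucket_cols, select_cols):
    matched = []
    for col in bucket_cols:
        found = False                  # reset for every bucket column
        for sel in select_cols:
            if col == sel:
                found = True
        if not found:
            return []                  # mismatch: no bucket cols survive
        matched.append(col)
    return matched

# 'b' is not projected by the select, so no bucket columns should survive:
assert match_columns_buggy(['a', 'b'], ['a']) == ['a', 'b']  # wrong
assert match_columns_fixed(['a', 'b'], ['a']) == []          # correct
assert match_columns_fixed(['a', 'b'], ['a', 'b', 'c']) == ['a', 'b']
```

The incorrect non-empty list is what could later let an SMB join be planned on columns that are not actually bucketed.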
[jira] [Commented] (HIVE-20085) Druid-Hive (managed) table creation fails with strict managed table checks: Table is marked as a managed table but is not transactional
[ https://issues.apache.org/jira/browse/HIVE-20085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535370#comment-16535370 ] Ashutosh Chauhan commented on HIVE-20085: - +1 > Druid-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > --- > > Key: HIVE-20085 > URL: https://issues.apache.org/jira/browse/HIVE-20085 > Project: Hive > Issue Type: Bug > Components: Hive, StorageHandler >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Nishant Bangarwa >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20085.1.patch, HIVE-20085.2.patch, > HIVE-20085.3.patch, HIVE-20085.4.patch, HIVE-20085.5.patch, HIVE-20085.patch > > > Druid-Hive (managed) table creation fails with strict managed table checks: > Table is marked as a managed table but is not transactional > {code} > drop table if exists calcs; > create table calcs > STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' > TBLPROPERTIES ( > "druid.segment.granularity" = "MONTH", > "druid.query.granularity" = "DAY") > AS SELECT > cast(datetime0 as timestamp with local time zone) `__time`, > key, > str0, str1, str2, str3, > date0, date1, date2, date3, > time0, time1, > datetime0, datetime1, > zzz, > cast(bool0 as string) bool0, > cast(bool1 as string) bool1, > cast(bool2 as string) bool2, > cast(bool3 as string) bool3, > int0, int1, int2, int3, > num0, num1, num2, num3, num4 > from tableau_orc.calcs; > 2018-07-03 04:57:31,911|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Status: Running > (Executing on YARN cluster with App id application_1530592209763_0009) > ... > ... 
> 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_BYTES_TO_MEM: > 0 > 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SHUFFLE_PHASE_TIME: > 330 > 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : SPILLED_RECORDS: 17 > 2018-07-03 04:57:36,334|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > TaskCounter_Reducer_2_OUTPUT_out_Reducer_2: > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : OUTPUT_RECORDS: 0 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > org.apache.hadoop.hive.llap.counters.LlapWmCounters: > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : GUARANTEED_QUEUED_NS: > 0 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > GUARANTEED_RUNNING_NS: 0 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > SPECULATIVE_QUEUED_NS: 2162643606 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : > SPECULATIVE_RUNNING_NS: 12151664909 > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task > [Stage-2:DEPENDENCY_COLLECTION] in serial mode > 2018-07-03 04:57:36,335|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task > [Stage-0:MOVE] in serial mode > 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Moving data to > directory > 
hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/calcs > from > hdfs://mycluster/warehouse/tablespace/managed/hive/druid_tableau.db/.hive-staging_hive_2018-07-03_04-57-27_351_7124633902209008283-3/-ext-10002 > 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Starting task > [Stage-4:DDL] in serial mode > 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|ERROR : FAILED: Execution > Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. > MetaException(message:Table druid_tableau.calcs failed strict managed table > checks due to the following reason: Table is marked as a managed table but is > not transactional.) > 2018-07-03 04:57:36,336|INFO|Thread-721|machine.py:111 - > tee_pipe()||aa121a45-29eb-48a8-8628-ae5368aa172d|INFO : Completed executing >
[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-19820: Summary: add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests (was: add ACID stats support to background stats updater) > add ACID stats support to background stats updater and fix bunch of edge > cases found in SU tests > > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch, > HIVE-19820.02-master-txnstats.patch, HIVE-19820.03-master-txnstats.patch, > HIVE-19820.04-master-txnstats.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20105) Druid-Hive: tpcds query on timestamp throws java.lang.IllegalArgumentException: Cannot create timestamp, parsing error
[ https://issues.apache.org/jira/browse/HIVE-20105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535368#comment-16535368 ] Ashutosh Chauhan commented on HIVE-20105: - +1 > Druid-Hive: tpcds query on timestamp throws > java.lang.IllegalArgumentException: Cannot create timestamp, parsing error > -- > > Key: HIVE-20105 > URL: https://issues.apache.org/jira/browse/HIVE-20105 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Dileep Kumar Chiguruvada >Assignee: Nishant Bangarwa >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-20105.patch > > > Druid-Hive: tpcds query on timestamp throws > java.lang.IllegalArgumentException: Cannot create timestamp, parsing error. > {code} > SELECT `__time`, max(ss_quantity), sum(ss_wholesale_cost) > FROM misc_store_sales_denormalized_subset > GROUP BY `__time`; > INFO : Compiling > command(queryId=hive_20180705123007_dd94e295-9e3e-440e-9818-2e7f8458f06d): > SELECT `__time`, max(ss_quantity), sum(ss_wholesale_cost) > FROM misc_store_sales_denormalized_subset > GROUP BY `__time` > INFO : Semantic Analysis Completed (retrial = false) > INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:__time, > type:timestamp, comment:null), FieldSchema(name:$f1, type:int, comment:null), > FieldSchema(name:$f2, type:double, comment:null)], properties:null) > INFO : Completed compiling > command(queryId=hive_20180705123007_dd94e295-9e3e-440e-9818-2e7f8458f06d); > Time taken: 0.143 seconds > INFO : Executing > command(queryId=hive_20180705123007_dd94e295-9e3e-440e-9818-2e7f8458f06d): > SELECT `__time`, max(ss_quantity), sum(ss_wholesale_cost) > FROM misc_store_sales_denormalized_subset > GROUP BY `__time` > INFO : Completed executing > command(queryId=hive_20180705123007_dd94e295-9e3e-440e-9818-2e7f8458f06d); > Time taken: 0.003 seconds > INFO : OK > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > java.lang.IllegalArgumentException: Cannot create 
timestamp, parsing error > Closing: 0: > jdbc:hive2://ctr-e138-1518143905142-397384-01-06.hwx.site:2181,ctr-e138-1518143905142-397384-01-05.hwx.site:2181,ctr-e138-1518143905142-397384-01-04.hwx.site:2181,ctr-e138-1518143905142-397384-01-07.hwx.site:2181,ctr-e138-1518143905142-397384-01-08.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive;principal=hive/_h...@hwqe.hortonworks.com > {code} > Facing this issue after removed condition to create Druid Hive table with > (TIMESTAMP with local time zone) > {code} > -SELECT CAST(d_date AS TIMESTAMP with local time zone) AS `__time`, > +SELECT CAST(d_date AS TIMESTAMP) AS `__time`, > {code} > create table with > SELECT CAST(d_date AS TIMESTAMP with local time zone) AS `__time . works fine > HSI log: > {code} > 2018-07-05T12:30:08,297 INFO [6b9ca95f-3aee-44cc-b2eb-2aa9bdec2b38 > HiveServer2-Handler-Pool: Thread-326]: session.SessionState > (SessionState.java:resetThreadName(449)) - Resetting thread name to > HiveServer2-Handler-Pool: Thread-326 > 2018-07-05T12:30:08,297 WARN [HiveServer2-Handler-Pool: Thread-326]: > thrift.ThriftCLIService (ThriftCLIService.java:FetchResults(795)) - Error > fetching results: > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > java.lang.IllegalArgumentException: Cannot create timestamp, parsing error > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:465) > ~[hive-service-3.1.0.3.0.0.0-1602.jar:3.1.0.3.0.0.0-1602] > at > org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:309) > ~[hive-service-3.1.0.3.0.0.0-1602.jar:3.1.0.3.0.0.0-1602] > at > org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:905) > ~[hive-service-3.1.0.3.0.0.0-1602.jar:3.1.0.3.0.0.0-1602] > at > org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:561) > ~[hive-service-3.1.0.3.0.0.0-1602.jar:3.1.0.3.0.0.0-1602] > at > 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786) > [hive-service-3.1.0.3.0.0.0-1602.jar:3.1.0.3.0.0.0-1602] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837) > [hive-exec-3.1.0.3.0.0.0-1602.jar:3.1.0.3.0.0.0-1602] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822) > [hive-exec-3.1.0.3.0.0.0-1602.jar:3.1.0.3.0.0.0-1602] > at
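The failure mode above - a value written as TIMESTAMP WITH LOCAL TIME ZONE no longer parsing once the column is declared plain TIMESTAMP - has a generic analogue that is easy to demonstrate. This Python sketch is an illustration of that class of error only, not the Druid/Hive code path, which is Java and uses its own timestamp parser.

```python
# A parser expecting a zone-less timestamp fails once the stored value
# carries time-zone information it was not told to expect - the generic
# shape of a "cannot create timestamp, parsing error".

from datetime import datetime

PLAIN = "%Y-%m-%d %H:%M:%S"  # no time-zone field in the expected format

# A plain timestamp string parses fine:
assert datetime.strptime("2018-07-05 12:30:08", PLAIN).year == 2018

# The same instant with zone info appended no longer matches the format:
try:
    datetime.strptime("2018-07-05 12:30:08 US/Pacific", PLAIN)
    raised = False
except ValueError:
    raised = True
assert raised
```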
[jira] [Commented] (HIVE-20094) Update Druid to 0.12.1 version
[ https://issues.apache.org/jira/browse/HIVE-20094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535363#comment-16535363 ] Ashutosh Chauhan commented on HIVE-20094: - +1 > Update Druid to 0.12.1 version > -- > > Key: HIVE-20094 > URL: https://issues.apache.org/jira/browse/HIVE-20094 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: slim bouguerra >Priority: Minor > Attachments: HIVE-20094.patch > > > As per Jira title. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled
[ https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535353#comment-16535353 ] Sahil Takiar commented on HIVE-20032: - Attaching dummy patch with {{hive.spark.optimize.shuffle.serde}} set to true by default and {{hive.combine.equivalent.work.optimization}} to false by default, just to see if there are any HoS test failures (besides explain plan diffs). > Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled > - > > Key: HIVE-20032 > URL: https://issues.apache.org/jira/browse/HIVE-20032 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch > > > Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, > then we don't need to serialize the hashCode when shuffling data in HoS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
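The space argument behind HIVE-20032 - every shuffled record carries a serialized hashCode it may not need - can be made concrete with a rough sketch. This uses Python's pickle purely as a stand-in; Hive on Spark's actual Writable-based serialization is different, and the exact savings depend on the formats involved.

```python
# Compare the serialized size of shuffle records with and without a
# per-record hash value attached, to illustrate why skipping the hashCode
# shrinks the shuffle payload when nothing downstream needs it.

import pickle

pairs = [("user%d" % i, i) for i in range(1000)]

# (hash, key, value) triples: each record carries an extra integer payload.
with_hash = pickle.dumps([(hash(k), k, v) for k, v in pairs])

# Plain (key, value) pairs: the hash is simply omitted.
without_hash = pickle.dumps(pairs)

# Dropping the per-record hash yields a strictly smaller payload.
assert len(without_hash) < len(with_hash)
```

The same reasoning motivates the patch: when neither group-by shuffles nor RDD caching will re-read the hash, serializing it is pure overhead.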
[jira] [Updated] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled
[ https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-20032: Attachment: HIVE-20032.2.patch > Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled > - > > Key: HIVE-20032 > URL: https://issues.apache.org/jira/browse/HIVE-20032 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch > > > Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, > then we don't need to serialize the hashCode when shuffling data in HoS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20108) Investigate alternatives to groupByKey
[ https://issues.apache.org/jira/browse/HIVE-20108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-20108: --- > Investigate alternatives to groupByKey > -- > > Key: HIVE-20108 > URL: https://issues.apache.org/jira/browse/HIVE-20108 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > > We use {{groupByKey}} for aggregations (or if > {{hive.spark.use.groupby.shuffle}} is false we use > {{repartitionAndSortWithinPartitions}}). > {{groupByKey}} has its drawbacks because it can't spill records within a > single key group. It also seems to be doing some unnecessary work in Spark's > {{Aggregator}} (not positive about this part). > {{repartitionAndSortWithinPartitions}} is better, but the sorting within > partitions isn't necessary for aggregations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
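The drawback of groupByKey noted above can be seen in a plain-Python analogue (this is a conceptual sketch, not Spark code): a groupByKey-style aggregation must materialize every value for a key before reducing, while a combine/fold-style aggregation keeps only one running value per key - which is why the former cannot spill within a single key group and the latter can stay in constant memory per key.

```python
# Two ways to sum values per key, mirroring groupByKey-then-reduce versus
# a fold/combine (reduceByKey-style) aggregation.

from collections import defaultdict

records = [("a", 1), ("b", 2), ("a", 3), ("a", 4), ("b", 5)]

def group_by_key_then_sum(recs):
    groups = defaultdict(list)
    for k, v in recs:
        groups[k].append(v)        # every value for a key is held at once
    return {k: sum(vs) for k, vs in groups.items()}

def fold_by_key(recs):
    totals = defaultdict(int)
    for k, v in recs:
        totals[k] += v             # constant state per key, values never buffered
    return dict(totals)

# Same result, very different peak memory behavior for skewed keys:
assert group_by_key_then_sum(records) == {"a": 8, "b": 7}
assert fold_by_key(records) == {"a": 8, "b": 7}
```

Sorting within partitions, as repartitionAndSortWithinPartitions does, is a third option that bounds memory but pays a sort cost the fold does not need for plain aggregation.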
[jira] [Commented] (HIVE-17751) Separate HMS Client and HMS server into separate sub-modules
[ https://issues.apache.org/jira/browse/HIVE-17751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535341#comment-16535341 ] Alexander Kolbasov commented on HIVE-17751: --- Patch 13 merged with {code} * commit eae5225f4301254cd8c5ad127bc242890bd441a8 (origin/master, origin/HEAD) | Author: Alan Gates | Date: Fri Jul 6 09:26:17 2018 -0700 | | HIVE-20060 Refactor HiveSchemaTool and MetastoreSchemaTool (Alan Gates, reviewed by Daniel Dai) {code} > Separate HMS Client and HMS server into separate sub-modules > > > Key: HIVE-17751 > URL: https://issues.apache.org/jira/browse/HIVE-17751 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Vihang Karajgaonkar >Assignee: Alexander Kolbasov >Priority: Major > Attachments: HIVE-17751.01.patch, HIVE-17751.02.patch, > HIVE-17751.03.patch, HIVE-17751.04.patch, > HIVE-17751.06-standalone-metastore.patch, HIVE-17751.07.patch, > HIVE-17751.08.patch, HIVE-17751.09.patch, HIVE-17751.10.patch, > HIVE-17751.11.patch, HIVE-17751.13.patch > > > external applications which are interfacing with HMS should ideally only > include HMSClient library instead of one big library containing server as > well. We should ideally have a thin client library so that cross version > support for external applications is easier. We should sub-divide the > standalone module into possibly 3 modules (one for common classes, one for > client classes and one for server) or 2 sub-modules (one for client and one > for server) so that we can generate separate jars for HMS client and server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17751) Separate HMS Client and HMS server into separate sub-modules
[ https://issues.apache.org/jira/browse/HIVE-17751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Kolbasov updated HIVE-17751: -- Attachment: HIVE-17751.13.patch > Separate HMS Client and HMS server into separate sub-modules > > > Key: HIVE-17751 > URL: https://issues.apache.org/jira/browse/HIVE-17751 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Vihang Karajgaonkar >Assignee: Alexander Kolbasov >Priority: Major > Attachments: HIVE-17751.01.patch, HIVE-17751.02.patch, > HIVE-17751.03.patch, HIVE-17751.04.patch, > HIVE-17751.06-standalone-metastore.patch, HIVE-17751.07.patch, > HIVE-17751.08.patch, HIVE-17751.09.patch, HIVE-17751.10.patch, > HIVE-17751.11.patch, HIVE-17751.13.patch > > > external applications which are interfacing with HMS should ideally only > include HMSClient library instead of one big library containing server as > well. We should ideally have a thin client library so that cross version > support for external applications is easier. We should sub-divide the > standalone module into possibly 3 modules (one for common classes, one for > client classes and one for server) or 2 sub-modules (one for client and one > for server) so that we can generate separate jars for HMS client and server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)