[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778022#comment-16778022 ]

mahesh kumar behera commented on HIVE-21197:

05.patch committed to master. Thanks [~sankarh] for review.

> Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21197.01.patch, HIVE-21197.02.patch, HIVE-21197.03.patch, HIVE-21197.04.patch, HIVE-21197.05.patch
> Time Spent: 22h 20m
> Remaining Estimate: 0h
>
> During the bootstrap phase, the files copied to the target may include files
> created by events that are not part of the bootstrap. This happens because
> bootstrap first gets the last event id and then the file list. If events are
> added between these two steps, bootstrap also includes the files created by
> those events. The same files are then copied again during the first
> incremental replication just after the bootstrap. In the normal scenario, the
> duplicate copy does not cause any issue, as Hive allows the use of the target
> database only after the first incremental. But in case of migration, the files
> at source and target are copied to different locations (based on the write id
> at the target), and thus this may lead to duplicate data at the target. This
> can be avoided by having a check at load time for duplicate files. The check
> needs to be done only for the first incremental, and the search can be done in
> the bootstrap directory (with write id 1). If the file is already present,
> the copy is simply skipped.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
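The load-time check described in the issue can be sketched roughly as follows. This is a minimal Python illustration of the idea only, not the code from the committed patch: the function names, the directory names (`delta_0000001_0000001` standing in for the bootstrap directory with write id 1) and the data file names are all assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def bootstrap_dir(table_root: Path) -> Path:
    # Hypothetical layout: the directory written by bootstrap load (write id 1).
    return table_root / "delta_0000001_0000001"


def copy_unless_in_bootstrap(src: Path, dest_dir: Path, table_root: Path,
                             first_incremental: bool) -> bool:
    """Copy src into dest_dir. Return False if the copy was skipped as a duplicate."""
    # The check is only needed for the first incremental load after bootstrap.
    if first_incremental and (bootstrap_dir(table_root) / src.name).exists():
        return False
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest_dir / src.name)
    return True


def demo() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "source"
        source.mkdir()
        target_table = root / "target" / "t1"
        (source / "000000_0").write_text("row1\n")
        (source / "000001_0").write_text("row2\n")

        # Bootstrap already copied 000000_0, although the event that created
        # it arrived after the last event id had been recorded.
        boot = bootstrap_dir(target_table)
        boot.mkdir(parents=True)
        shutil.copy2(source / "000000_0", boot / "000000_0")

        # The first incremental replays both events into a new write id directory.
        incr = target_table / "delta_0000002_0000002"
        results = [
            copy_unless_in_bootstrap(source / name, incr, target_table, True)
            for name in ("000000_0", "000001_0")
        ]
        copied = sorted(p.name for p in incr.iterdir())
        return [results, copied]


if __name__ == "__main__":
    print(demo())
```

In this sketch the file already present in the bootstrap directory is skipped, and only the file that bootstrap did not see is copied into the incremental directory. Matching by file name alone is a simplification chosen for the example.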
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777915#comment-16777915 ]

Sankar Hariappan commented on HIVE-21197:

+1
05.patch looks good to me.
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1677#comment-1677 ]

Hive QA commented on HIVE-21197:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960156/HIVE-21197.05.patch

SUCCESS: +1 due to 3 test(s) being added or modified.

SUCCESS: +1 due to 15818 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16246/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16246/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16246/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960156 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1659#comment-1659 ]

Hive QA commented on HIVE-21197:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 50s | Maven dependency ordering for branch |
| +1 | mvninstall | 9m 29s | master passed |
| +1 | compile | 2m 16s | master passed |
| +1 | checkstyle | 1m 22s | master passed |
| 0 | findbugs | 1m 9s | standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. |
| 0 | findbugs | 4m 1s | ql in master has 2261 extant Findbugs warnings. |
| 0 | findbugs | 0m 44s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| +1 | javadoc | 1m 48s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 28s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 47s | the patch passed |
| +1 | compile | 2m 14s | the patch passed |
| +1 | javac | 2m 14s | the patch passed |
| -1 | checkstyle | 0m 20s | itests/hive-unit: The patch generated 90 new + 272 unchanged - 0 fixed = 362 total (was 272) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 4m 13s | ql generated 1 new + 2261 unchanged - 0 fixed = 2262 total (was 2261) |
| +1 | javadoc | 1m 46s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 38m 51s | |

|| Reason || Tests ||
| FindBugs | module:ql |
| | Redundant nullcheck of tableTuple, which is known to be non-null in org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.bootStrapDump(Path, DumpMetaData, Path, Hive) Redundant null check at ReplDumpTask.java:[line 289] |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16246/dev-support/hive-personality.sh |
| git revision | master / 2daaed7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16246/yetus/diff-checkstyle-itests_hive-unit.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-16246/yetus/new-findbugs-ql.html |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16246/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
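The checkstyle and findbugs comments in the report above follow a fixed "N new + N unchanged - N fixed = N total (was N)" shape. The sketch below parses such a comment and checks the counts against each other. The two relations it checks (total = new + unchanged, and previous count = unchanged + fixed) are an assumed reading of the wording that happens to fit every line in this thread, not a documented Yetus contract. The names `parse_delta` and `is_consistent` are hypothetical helpers introduced for this example.

```python
import re

# Matches the count summary used in the checkstyle/findbugs comment column.
DELTA = re.compile(
    r"(?P<new>\d+) new \+ (?P<unchanged>\d+) unchanged"
    r" - (?P<fixed>\d+) fixed = (?P<total>\d+) total \(was (?P<was>\d+)\)"
)


def parse_delta(comment: str):
    """Return the five counts as ints, or None if the comment has no summary."""
    match = DELTA.search(comment)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def is_consistent(counts: dict) -> bool:
    # Assumed relations: the new total is what was added plus what stayed,
    # and the previous total is what stayed plus what was fixed.
    return (counts["total"] == counts["new"] + counts["unchanged"]
            and counts["was"] == counts["unchanged"] + counts["fixed"])


if __name__ == "__main__":
    line = ("itests/hive-unit: The patch generated 90 new + 272 unchanged"
            " - 0 fixed = 362 total (was 272)")
    counts = parse_delta(line)
    print(counts, is_consistent(counts))
```

A check like this is mainly useful for spotting which module accounts for a -1 vote: here the 90 new checkstyle issues in itests/hive-unit and the single new findbugs warning in ql.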
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1627#comment-1627 ]

Hive QA commented on HIVE-21197:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960153/HIVE-21197.05.patch

ERROR: -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16245/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16245/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16245/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-02-26 08:31:11.883
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-16245/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-02-26 08:31:11.887
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2daaed7 HIVE-21307: Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY (Sankar Hariappan, reviewed by Mahesh Kumar Behera)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 2daaed7 HIVE-21307: Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY (Sankar Hariappan, reviewed by Mahesh Kumar Behera)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-02-26 08:31:13.025
+ rm -rf ../yetus_PreCommit-HIVE-Build-16245
+ mkdir ../yetus_PreCommit-HIVE-Build-16245
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-16245
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-16245/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java: does not exist in index
error: a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/WarehouseInstance.java: does not exist in index
error: a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadDatabase.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/TableContext.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/incremental/IncrementalLoadTasksBuilder.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/util/ReplUtils.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSpec.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/io/TableSerializer.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/AlterDatabaseHandler.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/p
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777647#comment-16777647 ]

Hive QA commented on HIVE-21197:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960126/HIVE-21197.04.patch

SUCCESS: +1 due to 3 test(s) being added or modified.

ERROR: -1 due to 1 failed/errored test(s), 15818 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.exec.repl.TestReplDumpTask.removeDBPropertyToPreventRenameWhenBootstrapDumpOfTableFails (batchId=321)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16243/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16243/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16243/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960126 - PreCommit-HIVE-Build

> Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21197.01.patch, HIVE-21197.02.patch, HIVE-21197.03.patch, HIVE-21197.04.patch
> Time Spent: 22h 10m
> Remaining Estimate: 0h
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777621#comment-16777621 ]

Hive QA commented on HIVE-21197:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 55s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 17s | master passed |
| +1 | compile | 2m 19s | master passed |
| +1 | checkstyle | 1m 21s | master passed |
| 0 | findbugs | 1m 16s | standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. |
| 0 | findbugs | 4m 4s | ql in master has 2261 extant Findbugs warnings. |
| 0 | findbugs | 0m 46s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| +1 | javadoc | 1m 45s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 29s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 40s | the patch passed |
| +1 | compile | 2m 20s | the patch passed |
| +1 | javac | 2m 20s | the patch passed |
| -1 | checkstyle | 0m 45s | ql: The patch generated 3 new + 346 unchanged - 0 fixed = 349 total (was 346) |
| -1 | checkstyle | 0m 21s | itests/hive-unit: The patch generated 90 new + 272 unchanged - 0 fixed = 362 total (was 272) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 6m 34s | the patch passed |
| +1 | javadoc | 1m 48s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 37m 19s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16243/dev-support/hive-personality.sh |
| git revision | master / 2daaed7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16243/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16243/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16243/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776524#comment-16776524 ]

Hive QA commented on HIVE-21197:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959970/HIVE-21197.03.patch

SUCCESS: +1 due to 3 test(s) being added or modified.

SUCCESS: +1 due to 15816 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16228/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16228/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16228/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959970 - PreCommit-HIVE-Build

> Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21197.01.patch, HIVE-21197.02.patch, HIVE-21197.03.patch
> Time Spent: 11h 10m
> Remaining Estimate: 0h
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776522#comment-16776522 ]

Hive QA commented on HIVE-21197:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 16s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 46s | master passed |
| +1 | compile | 2m 13s | master passed |
| +1 | checkstyle | 1m 20s | master passed |
| 0 | findbugs | 1m 12s | standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. |
| 0 | findbugs | 4m 3s | ql in master has 2261 extant Findbugs warnings. |
| 0 | findbugs | 0m 47s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| +1 | javadoc | 1m 49s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 28s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 47s | the patch passed |
| +1 | compile | 2m 17s | the patch passed |
| +1 | javac | 2m 17s | the patch passed |
| -1 | checkstyle | 0m 42s | ql: The patch generated 1 new + 325 unchanged - 0 fixed = 326 total (was 325) |
| -1 | checkstyle | 0m 20s | itests/hive-unit: The patch generated 71 new + 272 unchanged - 0 fixed = 343 total (was 272) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 6m 23s | the patch passed |
| +1 | javadoc | 1m 50s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 37m 36s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16228/dev-support/hive-personality.sh |
| git revision | master / 2daaed7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16228/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16228/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16228/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776405#comment-16776405 ]

Hive QA commented on HIVE-21197:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959955/HIVE-21197.03.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 15816 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] (batchId=86)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=109)
org.apache.hadoop.hive.ql.exec.repl.TestReplDumpTask.removeDBPropertyToPreventRenameWhenBootstrapDumpOfTableFails (batchId=321)
org.apache.hadoop.hive.ql.parse.TestReplWithJsonMessageFormat.testBootstrapWithConcurrentDropTable (batchId=244)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testBootstrapWithConcurrentDropTable (batchId=246)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16224/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16224/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16224/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959955 - PreCommit-HIVE-Build

> Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21197.01.patch, HIVE-21197.02.patch, HIVE-21197.03.patch
> Time Spent: 11h 10m
> Remaining Estimate: 0h
>
> During the bootstrap phase, the files copied to the target may include files
> created by events that are not part of the bootstrap. This happens because
> bootstrap first gets the last event id and then the file list.
> If events are added during this period, bootstrap also includes the files
> created by those events. The same files are then copied again during the
> first incremental replication just after the bootstrap. In the normal scenario,
> the duplicate copy does not cause any issue, as Hive allows use of the target
> database only after the first incremental. But in case of migration, the files
> at source and target are copied to different locations (based on the write id
> at target), and this may lead to duplicate data at the target. This can be
> avoided by having a check at load time for duplicate files. This check can be
> done only for the first incremental, and the search can be done in the
> bootstrap directory (with write id 1). If the file is already present,
> just ignore the copy.
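
The race and the proposed load-time check described in the issue can be sketched in a few lines. This is only an illustrative model, not the code from the attached patches: the function names, the dictionary standing in for target directories, the event ids, the use of write id 2 for the incremental load, and matching files by name are all assumptions made for the example. Only the ordering (last event id first, file list second) and the lookup in the bootstrap directory with write id 1 come from the description.

```python
BOOTSTRAP_WRITE_ID = 1  # per the description, bootstrap data sits under write id 1


def load_files(target, write_id, files, first_incremental=False):
    """Copy files into the target location for write_id.

    target maps write id -> set of file names (a stand-in for directories).
    Returns the list of files that were actually copied.
    """
    bootstrap_files = target.get(BOOTSTRAP_WRITE_ID, set())
    copied = []
    for name in files:
        # Proposed check: during the first incremental load only, skip any
        # file that bootstrap already placed in its directory.
        if first_incremental and name in bootstrap_files:
            continue
        target.setdefault(write_id, set()).add(name)
        copied.append(name)
    return copied


def replicate(dedup):
    """Simulate bootstrap followed by the first incremental (hypothetical ids)."""
    events = {10: "f1"}                 # source: event id -> file it created
    last_event_id = max(events)         # bootstrap records the last event id first
    events[11] = "f2"                   # a new event lands before the file listing
    bootstrap_list = sorted(events.values())  # the listing now includes f2

    target = {}
    load_files(target, BOOTSTRAP_WRITE_ID, bootstrap_list)
    # The first incremental replays every event after the recorded id.
    incremental = [f for eid, f in sorted(events.items()) if eid > last_event_id]
    load_files(target, 2, incremental, first_incremental=dedup)
    return target
```

Without the check, `replicate(False)` leaves `f2` under both write ids, which is the duplicate data the issue reports; with it, `replicate(True)` keeps `f2` only in the bootstrap location.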
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776399#comment-16776399 ]

Hive QA commented on HIVE-21197:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 25s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 12s{color} | {color:blue} standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 12s{color} | {color:blue} ql in master has 2261 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 45s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 45s{color} | {color:red} ql: The patch generated 1 new + 325 unchanged - 0 fixed = 326 total (was 325) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s{color} | {color:red} itests/hive-unit: The patch generated 71 new + 272 unchanged - 0 fixed = 343 total (was 272) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16224/dev-support/hive-personality.sh |
| git revision | master / 2daaed7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16224/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16224/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16224/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Prio
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774032#comment-16774032 ]

Hive QA commented on HIVE-21197:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959554/HIVE-21197.02.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15813 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning (batchId=337)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16178/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16178/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16178/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959554 - PreCommit-HIVE-Build

> Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Attachments: HIVE-21197.01.patch, HIVE-21197.02.patch
>
> During the bootstrap phase, the files copied to the target may include files
> created by events that are not part of the bootstrap. This happens because
> bootstrap first gets the last event id and then the file list.
> If events are added during this period, bootstrap also includes the files
> created by those events. The same files are then copied again during the
> first incremental replication just after the bootstrap. In the normal scenario,
> the duplicate copy does not cause any issue, as Hive allows use of the target
> database only after the first incremental. But in case of migration, the files
> at source and target are copied to different locations (based on the write id
> at target), and this may lead to duplicate data at the target. This can be
> avoided by having a check at load time for duplicate files. This check can be
> done only for the first incremental, and the search can be done in the
> bootstrap directory (with write id 1). If the file is already present,
> just ignore the copy.
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774014#comment-16774014 ]

Hive QA commented on HIVE-21197:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 3s{color} | {color:blue} standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 46s{color} | {color:blue} ql in master has 2260 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 42s{color} | {color:red} ql: The patch generated 1 new + 310 unchanged - 0 fixed = 311 total (was 310) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s{color} | {color:red} itests/hive-unit: The patch generated 71 new + 272 unchanged - 0 fixed = 343 total (was 272) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 14s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16178/dev-support/hive-personality.sh |
| git revision | master / 89b9f12 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16178/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16178/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16178/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Prio
[jira] [Commented] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773713#comment-16773713 ]

mahesh kumar behera commented on HIVE-21197:

[https://github.com/apache/hive/pull/541] – pull request

> Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Attachments: HIVE-21197.01.patch, HIVE-21197.02.patch
>
> During the bootstrap phase, the files copied to the target may include files
> created by events that are not part of the bootstrap. This happens because
> bootstrap first gets the last event id and then the file list.
> If events are added during this period, bootstrap also includes the files
> created by those events. The same files are then copied again during the
> first incremental replication just after the bootstrap. In the normal scenario,
> the duplicate copy does not cause any issue, as Hive allows use of the target
> database only after the first incremental. But in case of migration, the files
> at source and target are copied to different locations (based on the write id
> at target), and this may lead to duplicate data at the target. This can be
> avoided by having a check at load time for duplicate files. This check can be
> done only for the first incremental, and the search can be done in the
> bootstrap directory (with write id 1). If the file is already present,
> just ignore the copy.
[jira] [Commented] (HIVE-21197) Hive Replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773035#comment-16773035 ]

Hive QA commented on HIVE-21197:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959405/HIVE-21197.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15813 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testDisableCompactionDuringReplLoad (batchId=242)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16159/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16159/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16159/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959405 - PreCommit-HIVE-Build

> Hive Replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Attachments: HIVE-21197.01.patch
>
> During the bootstrap phase, the files copied to the target may include files
> created by events that are not part of the bootstrap. This happens because
> bootstrap first gets the last event id and then the file list.
> If events are added during this period, bootstrap also includes the files
> created by those events. The same files are then copied again during the
> first incremental replication just after the bootstrap. In the normal scenario,
> the duplicate copy does not cause any issue, as Hive allows use of the target
> database only after the first incremental. But in case of migration, the files
> at source and target are copied to different locations (based on the write id
> at target), and this may lead to duplicate data at the target. This can be
> avoided by having a check at load time for duplicate files. This check can be
> done only for the first incremental, and the search can be done in the
> bootstrap directory (with write id 1). If the file is already present,
> just ignore the copy.
[jira] [Commented] (HIVE-21197) Hive Replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
[ https://issues.apache.org/jira/browse/HIVE-21197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773012#comment-16773012 ]

Hive QA commented on HIVE-21197:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 18s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 14s{color} | {color:blue} standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 0s{color} | {color:blue} ql in master has 2262 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 43s{color} | {color:red} ql: The patch generated 4 new + 310 unchanged - 0 fixed = 314 total (was 310) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s{color} | {color:red} itests/hive-unit: The patch generated 81 new + 272 unchanged - 0 fixed = 353 total (was 272) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16159/dev-support/hive-personality.sh |
| git revision | master / e5a35e7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16159/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16159/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16159/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Hive Replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Prio