[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577146#comment-16577146 ]

Alisha Prabhu commented on HIVE-19927:
--------------------------------------

Thanks for the quick fix [~sankarh] !!

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19927
>                 URL: https://issues.apache.org/jira/browse/HIVE-19927
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, repl, Transactions
>    Affects Versions: 3.1.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, pull-request-available, replication
>             Fix For: 4.0.0, 3.2.0
>
>         Attachments: HIVE-19927.01-branch-3.patch, HIVE-19927.01.patch, HIVE-19927.02.patch, HIVE-19927.03.patch, HIVE-19927.04.patch
>
>
> During bootstrap dump of ACID tables, let's consider the below sequence.
> - Current session (REPL DUMP), Open txn (Txn1) - Event-10
> - Another session (Session-2), Open txn (Txn2) - Event-11
> - Session-2 -> Insert data (T1.D1) to ACID table. - Event-12
> - Get lastReplId = last event ID logged. (Event-12)
> - Session-2 -> Commit Txn (Txn2) - Event-13
> - Dump ACID tables based on validTxnList based on Txn1. --> This step skips all the data written by txns > Txn1. So, T1.D1 will be missing.
> - Commit Txn (Txn1)
> - REPL LOAD from bootstrap dump will skip T1.D1.
> - Incremental REPL DUMP will start from Event-13 and hence lose Txn2 which is opened after Txn1. So, data T1.D1 will be lost for ever.
> Proposed to capture the lastReplId of bootstrap before opening current txn (Txn1) and store it in Driver context and use it for dump.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
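The event sequence in the issue description can be replayed as a toy model to see why the point at which lastReplId is captured matters. The sketch below is only an illustration under stated assumptions: the event IDs are the ones from the description, the class and method names are hypothetical, and "an incremental dump replays every event with an ID greater than lastReplId" is a simplification of what Hive actually does.

```java
import java.util.ArrayList;
import java.util.List;

public class BootstrapReplIdSketch {
    // Event IDs taken from the sequence in the issue description.
    static final long OPEN_TXN1 = 10;
    static final long OPEN_TXN2 = 11;
    static final long INSERT_T1_D1 = 12;
    static final long COMMIT_TXN2 = 13;

    /** Simplified model: an incremental dump replays every event logged after lastReplId. */
    static List<Long> incrementalEvents(long lastReplId) {
        List<Long> replayed = new ArrayList<>();
        for (long id = OPEN_TXN1; id <= COMMIT_TXN2; id++) {
            if (id > lastReplId) {
                replayed.add(id);
            }
        }
        return replayed;
    }

    /** Txn2 reaches the target only if its open, insert and commit events are all replayed. */
    static boolean txn2Replicated(long lastReplId) {
        List<Long> events = incrementalEvents(lastReplId);
        return events.contains(OPEN_TXN2)
            && events.contains(INSERT_T1_D1)
            && events.contains(COMMIT_TXN2);
    }

    public static void main(String[] args) {
        // Reported behaviour: lastReplId is read after Txn1 was opened (Event-12).
        long capturedAfterOpen = INSERT_T1_D1;
        // Proposed behaviour: lastReplId is read before Txn1 is opened.
        long capturedBeforeOpen = OPEN_TXN1 - 1;

        System.out.println("captured after open:  Txn2 replicated = " + txn2Replicated(capturedAfterOpen));
        System.out.println("captured before open: Txn2 replicated = " + txn2Replicated(capturedBeforeOpen));
    }
}
```

In this model the bootstrap dump never contains T1.D1 in either case, because its snapshot is based on Txn1. The only difference is whether the following incremental dump still sees all of Txn2's events.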
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576544#comment-16576544 ]

Sankar Hariappan commented on HIVE-19927:
-----------------------------------------

[~alishap],
It fails because of the assert in the code below.
{code:java}
Long bootDumpBeginReplId = queryState.getConf().getLong(ReplicationSemanticAnalyzer.LAST_REPL_ID_KEY, -1L);
assert (bootDumpBeginReplId >= 0L);{code}
Driver is expected to set the "hive.repl.last.repl.id" config in queryState.queryConf before bootstrapDump is invoked. So this unit test should mock it like this.
{code:java}
// Only the two newly required imports are shown; the test's other imports are omitted.
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.QueryState;

@RunWith(PowerMockRunner.class)
@PrepareForTest({ Utils.class })
@PowerMockIgnore({ "javax.management.*" })
public class ReplDumpTaskTest {

  @Mock
  private Hive hive;

  @Mock
  private HiveConf conf;

  @Mock
  private QueryState qs;

  class StubReplDumpTask extends ReplDumpTask {

    @Override
    protected Hive getHive() {
      return hive;
    }

    @Override
    long currentNotificationId(Hive hiveDb) {
      return Long.MAX_VALUE;
    }

    @Override
    String getValidTxnListForReplDump(Hive hiveDb) {
      return "";
    }

    @Override
    void dumpFunctionMetadata(String dbName, Path dumpRoot) {
    }

    @Override
    Path dumpDbMetadata(String dbName, Path dumpRoot, long lastReplId) {
      return Mockito.mock(Path.class);
    }

    @Override
    void dumpConstraintMetadata(String dbName, String tblName, Path dbRoot) {
    }
  }

  private static class TestException extends Exception {
  }

  @Test(expected = TestException.class)
  public void removeDBPropertyToPreventRenameWhenBootstrapDumpOfTableFails() throws Exception {
    List<String> tableList = Arrays.asList("a1", "a2");
    String dbRandomKey = "akeytoberandom";

    mockStatic(Utils.class);
    when(Utils.matchesDb(same(hive), eq("default")))
        .thenReturn(Collections.singletonList("default"));
    when(Utils.getAllTables(same(hive), eq("default"))).thenReturn(tableList);
    when(Utils.setDbBootstrapDumpState(same(hive), eq("default"))).thenReturn(dbRandomKey);
    when(Utils.matchesTbl(same(hive), eq("default"), anyString())).thenReturn(tableList);

    when(hive.getAllFunctions()).thenReturn(Collections.emptyList());
    // Mock the query conf so that the last repl ID lookup returns a non-negative value.
    when(qs.getConf()).thenReturn(conf);
    when(conf.getLong("hive.repl.last.repl.id", -1L)).thenReturn(1L);

    ReplDumpTask task = new StubReplDumpTask() {
      private int tableDumpCount = 0;

      @Override
      void dumpTable(String dbName, String tblName, String validTxnList, Path dbRoot, long lastReplId)
          throws Exception {
        tableDumpCount++;
        // Fail on the second table to simulate a bootstrap dump that breaks midway.
        if (tableDumpCount > 1) {
          throw new TestException();
        }
      }
    };

    task.initialize(qs, null, null, null);
    task.setWork(
        new ReplDumpWork("default", "", Long.MAX_VALUE, Long.MAX_VALUE, "", Integer.MAX_VALUE, "")
    );

    try {
      task.bootStrapDump(mock(Path.class), null, mock(Path.class));
    } finally {
      verifyStatic();
      Utils.resetDbBootstrapDumpState(same(hive), eq("default"), eq(dbRandomKey));
    }
  }
}{code}
I think this test was not run by ptest because its name doesn't start with Test*. I will submit a patch with this fix.

Thanks for bringing this up!
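The contract described in the comment above (the caller stores the last repl ID in the query configuration before the dump starts, and the dump refuses to proceed with the -1 default) can be sketched in isolation. This is a minimal illustration and not Hive code: `java.util.Properties` stands in for `HiveConf`, the class and method names are hypothetical, and an explicit exception is used in place of the `assert`, on the assumption that a check should also fire when the JVM runs without `-ea`.

```java
import java.util.Properties;

public class LastReplIdGuardSketch {
    // Config key quoted in the comment above.
    static final String LAST_REPL_ID_KEY = "hive.repl.last.repl.id";

    /** Stand-in for conf.getLong(key, defaultValue), backed by a plain Properties object. */
    static long getLong(Properties conf, String key, long defaultValue) {
        String raw = conf.getProperty(key);
        return raw == null ? defaultValue : Long.parseLong(raw.trim());
    }

    /** Returns the ID the bootstrap dump should record, or fails if the caller never set it. */
    static long bootDumpBeginReplId(Properties conf) {
        long id = getLong(conf, LAST_REPL_ID_KEY, -1L);
        if (id < 0L) {
            throw new IllegalStateException(LAST_REPL_ID_KEY + " must be set before bootstrap dump");
        }
        return id;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Mirrors the mocked value in the unit test above.
        conf.setProperty(LAST_REPL_ID_KEY, "1");
        System.out.println("bootstrap dump begins at repl ID " + bootDumpBeginReplId(conf));
    }
}
```

A test that builds the task without populating this key hits the guard immediately, which matches the failure reported for ReplDumpTaskTest.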
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576181#comment-16576181 ]

Alisha Prabhu commented on HIVE-19927:
--------------------------------------

Hi [~sankarh],

I am facing an issue in ReplDumpTaskTest.java after the above commit.
Command used in the ql module: mvn -Dtest=ReplDumpTaskTest test
Error:
{code:java}
[ERROR] removeDBPropertyToPreventRenameWhenBootstrapDumpOfTableFails(org.apache.hadoop.hive.ql.exec.repl.ReplDumpTaskTest) Time elapsed: 3.008 s <<< ERROR!
java.lang.Exception: Unexpected exception, expected but was
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.handleException(PowerMockJUnit44RunnerDelegateImpl.java:370)
{code}
While debugging, I observed that the following statement at line 225 of ReplDumpTask.java is unable to fetch details from the "queryState" object:

Long bootDumpBeginReplId = queryState.getConf().getLong(ReplicationSemanticAnalyzer.LAST_REPL_ID_KEY, -1L);

Could you please help me understand the context behind the change, or a possible reason for the above error?
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560632#comment-16560632 ]

Sankar Hariappan commented on HIVE-19927:
-----------------------------------------

Test failures are irrelevant.
01-branch-3.patch is committed to branch-3.
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1655#comment-1655 ]

Hive QA commented on HIVE-19927:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933322/HIVE-19927.01-branch-3.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 14410 tests executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=237)
TestTezPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_all] (batchId=70)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_with_masking] (batchId=174)
org.apache.hadoop.hive.ql.TestWarehouseExternalDir.testManagedPaths (batchId=235)
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation (batchId=243)
org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=310)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12900/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12900/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12900/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933322 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559927#comment-16559927 ]

Hive QA commented on HIVE-19927:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 19s{color} | {color:red} /data/hiveptest/logs/PreCommit-HIVE-Build-12900/patches/PreCommit-HIVE-Build-12900.patch does not apply to master. Rebase required? Wrong Branch? See http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12900/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559360#comment-16559360 ]

Sankar Hariappan commented on HIVE-19927:
-----------------------------------------

Attached branch-3 patch.
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559347#comment-16559347 ]

ASF GitHub Bot commented on HIVE-19927:
---------------------------------------

Github user sankarh closed the pull request at:

    https://github.com/apache/hive/pull/403
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559320#comment-16559320 ]

Sankar Hariappan commented on HIVE-19927:
-----------------------------------------

Thanks for the review [~maheshk114] and [~anishek]!
04.patch is committed to master.
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559308#comment-16559308 ]

anishek commented on HIVE-19927:
--------------------------------

+1
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558568#comment-16558568 ]

Hive QA commented on HIVE-19927:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933197/HIVE-19927.04.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12878/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12878/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12878/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12933197/HIVE-19927.04.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933197 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558419#comment-16558419 ]

Hive QA commented on HIVE-19927:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933197/HIVE-19927.04.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14812 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12876/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12876/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12876/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933197 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558335#comment-16558335 ]

Hive QA commented on HIVE-19927:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 38s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 4s{color} | {color:blue} ql in master has 2296 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s{color} | {color:red} metastore-server in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s{color} | {color:red} itests/hive-unit: The patch generated 21 new + 166 unchanged - 0 fixed = 187 total (was 166) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 42s{color} | {color:red} ql: The patch generated 1 new + 258 unchanged - 1 fixed = 259 total (was 259) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 14s{color} | {color:red} metastore-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 25s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12876/dev-support/hive-personality.sh |
| git revision | master / 2820fc4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus/branch-findbugs-standalone-metastore_metastore-server.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus/diff-checkstyle-itests_hive-unit.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus/patch-findbugs-standalone-metastore_metastore-server.txt |
| modules | C: itests/hive-unit ql standalone-metastore/metastore-server U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558206#comment-16558206 ]

Hive QA commented on HIVE-19927:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933166/HIVE-19927.04.patch

SUCCESS: +1 due to 4 test(s) being added or modified.

ERROR: -1 due to 1 failed/errored test(s), 14812 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestRestrictedList.testRestrictedList (batchId=251)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12875/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12875/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12875/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933166 - PreCommit-HIVE-Build

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if
> have ACID/MM tables.
> ---
>
> Key: HIVE-19927
> URL: https://issues.apache.org/jira/browse/HIVE-19927
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, Transactions
> Affects Versions: 3.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: DR, pull-request-available, replication
> Attachments: HIVE-19927.01.patch, HIVE-19927.02.patch,
> HIVE-19927.03.patch, HIVE-19927.04.patch
>
>
> During bootstrap dump of ACID tables, let's consider the below sequence.
> - Current session (REPL DUMP), Open txn (Txn1) - Event-10
> - Another session (Session-2), Open txn (Txn2) - Event-11
> - Session-2 -> Insert data (T1.D1) to ACID table. - Event-12
> - Get lastReplId = last event ID logged. (Event-12)
> - Session-2 -> Commit Txn (Txn2) - Event-13
> - Dump ACID tables based on validTxnList based on Txn1. --> This step skips
> all the data written by txns > Txn1. So, T1.D1 will be missing.
> - Commit Txn (Txn1)
> - REPL LOAD from bootstrap dump will skip T1.D1.
> - Incremental REPL DUMP will start from Event-13 and hence lose Txn2 which is
> opened after Txn1. So, data T1.D1 will be lost for ever.
> Proposed to capture the lastReplId of bootstrap before opening current txn
> (Txn1) and store it in Driver context and use it for dump.
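To make the ordering problem concrete, the event sequence from the issue description can be replayed in a small simulation. This is only a sketch under stated assumptions: `EventLog`, `append`, `eventsAfter`, and `run` are hypothetical names invented for illustration and are not Hive classes. The one behaviour taken from the description is that an incremental dump replays events whose IDs are strictly greater than lastReplId.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ReplIdOrderingSketch {

    /** Hypothetical stand-in for the notification event log (not a Hive class). */
    static final class EventLog {
        private final TreeMap<Long, String> events = new TreeMap<>();
        // Start at 9 so the first appended event is Event-10, as in the description.
        private long lastId = 9;

        long append(String description) {
            events.put(++lastId, description);
            return lastId;
        }

        long lastEventId() {
            return lastId;
        }

        /** Events an incremental dump would replay: IDs strictly greater than lastReplId. */
        List<String> eventsAfter(long lastReplId) {
            return new ArrayList<>(events.tailMap(lastReplId, false).values());
        }
    }

    /**
     * Replays the sequence from the description and returns what a later
     * incremental dump would see. The flag selects where lastReplId is captured.
     */
    static List<String> run(boolean captureBeforeOpenTxn) {
        EventLog log = new EventLog();
        long lastReplId = -1L;

        if (captureBeforeOpenTxn) {
            lastReplId = log.lastEventId();      // proposed ordering
        }
        log.append("OPEN_TXN Txn1");             // Event-10 (REPL DUMP session)
        log.append("OPEN_TXN Txn2");             // Event-11 (Session-2)
        log.append("INSERT T1.D1 by Txn2");      // Event-12
        if (!captureBeforeOpenTxn) {
            lastReplId = log.lastEventId();      // reported ordering
        }
        log.append("COMMIT_TXN Txn2");           // Event-13

        // The bootstrap dump itself skips T1.D1 in both cases, because its
        // snapshot is based on Txn1. Only the incremental dump can recover it.
        return log.eventsAfter(lastReplId);
    }

    public static void main(String[] args) {
        System.out.println("captured after open txn:  " + run(false));
        System.out.println("captured before open txn: " + run(true));
    }
}
```

With the ID captured after Txn1 is opened, the incremental replay in this toy model contains only the commit of Txn2, so the open and insert events are never shipped. With the ID captured before Txn1 is opened, all of Txn2's events fall inside the incremental range.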
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558174#comment-16558174 ]

Hive QA commented on HIVE-19927:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 46s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 23s | master passed |
| +1 | compile | 2m 10s | master passed |
| +1 | checkstyle | 1m 3s | master passed |
| 0 | findbugs | 0m 38s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| 0 | findbugs | 3m 56s | ql in master has 2296 extant Findbugs warnings. |
| -1 | findbugs | 0m 14s | metastore-server in master failed. |
| +1 | javadoc | 1m 45s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 43s | the patch passed |
| +1 | compile | 2m 14s | the patch passed |
| +1 | javac | 2m 14s | the patch passed |
| -1 | checkstyle | 0m 17s | itests/hive-unit: The patch generated 21 new + 166 unchanged - 0 fixed = 187 total (was 166) |
| -1 | checkstyle | 0m 44s | ql: The patch generated 1 new + 258 unchanged - 1 fixed = 259 total (was 259) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 0m 14s | metastore-server in the patch failed. |
| +1 | javadoc | 1m 47s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 32m 19s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12875/dev-support/hive-personality.sh |
| git revision | master / 2820fc4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus/branch-findbugs-standalone-metastore_metastore-server.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus/diff-checkstyle-itests_hive-unit.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus/patch-findbugs-standalone-metastore_metastore-server.txt |
| modules | C: itests/hive-unit ql standalone-metastore/metastore-server U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if
> have ACID/MM tables.
> ---
>
> Key: HIVE-19927
> URL:
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558054#comment-16558054 ]

Sankar Hariappan commented on HIVE-19927:

Attached 04.patch after rebasing with master.
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558045#comment-16558045 ]

mahesh kumar behera commented on HIVE-19927:

[~sankarh] patch 3 looks fine to me
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558042#comment-16558042 ]

Hive QA commented on HIVE-19927:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933145/HIVE-19927.03.patch

ERROR: -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12872/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12872/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12872/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-26 08:02:33.008
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12872/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-26 08:02:33.010
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2820fc4 HIVE-20203: Arrow SerDe leaks a DirectByteBuffer (Eric Wohlstadter, reviewed by Teddy Choi)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 2820fc4 HIVE-20203: Arrow SerDe leaks a DirectByteBuffer (Eric Wohlstadter, reviewed by Teddy Choi)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-26 08:02:34.163
+ rm -rf ../yetus_PreCommit-HIVE-Build-12872
+ mkdir ../yetus_PreCommit-HIVE-Build-12872
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12872
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12872/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java: does not exist in index
error: a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java: does not exist in index
error: a/ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java: does not exist in index
error: a/ql/src/test/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTaskTest.java: does not exist in index
error: a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: does not exist in index
error: a/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java: does not exist in index
error: patch failed: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java:19
Falling back to three-way merge...
Applied patch to 'standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java' with conflicts.
Going to apply patch with: git apply -p1
error: patch failed: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java:19
Falling back to three-way merge...
Applied patch to 'standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java' with conflicts.
U
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552917#comment-16552917 ]

Hive QA commented on HIVE-19927:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932695/HIVE-19927.02.patch

SUCCESS: +1 due to 4 test(s) being added or modified.

SUCCESS: +1 due to 14681 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12791/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12791/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932695 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552882#comment-16552882 ]

Hive QA commented on HIVE-19927:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 26s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 30s | master passed |
| +1 | compile | 2m 16s | master passed |
| +1 | checkstyle | 1m 6s | master passed |
| 0 | findbugs | 0m 39s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| 0 | findbugs | 4m 9s | ql in master has 2280 extant Findbugs warnings. |
| -1 | findbugs | 0m 14s | metastore-server in master failed. |
| +1 | javadoc | 1m 48s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 41s | the patch passed |
| +1 | compile | 2m 13s | the patch passed |
| +1 | javac | 2m 13s | the patch passed |
| -1 | checkstyle | 0m 19s | itests/hive-unit: The patch generated 21 new + 166 unchanged - 0 fixed = 187 total (was 166) |
| -1 | checkstyle | 0m 42s | ql: The patch generated 1 new + 258 unchanged - 1 fixed = 259 total (was 259) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 0m 13s | metastore-server in the patch failed. |
| +1 | javadoc | 1m 46s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 33m 22s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12791/dev-support/hive-personality.sh |
| git revision | master / 6b15816 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus/branch-findbugs-standalone-metastore_metastore-server.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus/diff-checkstyle-itests_hive-unit.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus/patch-findbugs-standalone-metastore_metastore-server.txt |
| modules | C: itests/hive-unit ql standalone-metastore/metastore-server U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if
> have ACID/MM tables.
> ---
>
> Key: HIVE-19927
> URL:
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552775#comment-16552775 ]

Sankar Hariappan commented on HIVE-19927:

Attached 02.patch with
* Rebased with master.
* Bug fix for idempotent behaviour of create/drop function operations that occur concurrently with the bootstrap dump, after the last repl ID has been fetched.
* Set last repl ID in the queryState conf for each query, overwriting the old one.
* Set last repl ID only if a txn is opened.

Request [~maheshk114] to take a look!
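The last two points of the 02.patch summary (write the ID into the per-query conf, overwriting any earlier value, and only when a txn is opened) can be sketched as follows. This is an illustration under assumptions, not the patch itself: a plain `Map` stands in for the query conf, and `beforeOpenTxn` and `bootDumpBeginReplId` are hypothetical helper names. Only the key string `hive.repl.last.repl.id` and the `-1` default on the reading side come from this thread.

```java
import java.util.HashMap;
import java.util.Map;

public class LastReplIdConfSketch {

    // Config key quoted in this thread; everything else here is a stand-in.
    static final String LAST_REPL_ID_KEY = "hive.repl.last.repl.id";

    /**
     * Hypothetical hook run once per query, before its transaction is opened.
     * The ID is recorded only when a txn will be opened, and it replaces
     * whatever an earlier query left in the same conf.
     */
    static void beforeOpenTxn(Map<String, Long> queryConf,
                              boolean txnWillBeOpened,
                              long currentNotificationId) {
        if (txnWillBeOpened) {
            queryConf.put(LAST_REPL_ID_KEY, currentNotificationId);
        }
    }

    /** Reading side: -1 signals that no ID was recorded for this query. */
    static long bootDumpBeginReplId(Map<String, Long> queryConf) {
        return queryConf.getOrDefault(LAST_REPL_ID_KEY, -1L);
    }

    public static void main(String[] args) {
        Map<String, Long> conf = new HashMap<>();
        beforeOpenTxn(conf, true, 9L);   // first query opens a txn
        beforeOpenTxn(conf, true, 14L);  // next query overwrites the old value
        System.out.println(bootDumpBeginReplId(conf));
    }
}
```

A reader that requires a non-negative value, as in the assert quoted elsewhere in this thread, would fail whenever the writing step is skipped, which is why a unit test that bypasses the driver has to populate the key itself.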
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550342#comment-16550342 ]

Hive QA commented on HIVE-19927:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932104/HIVE-19927.01.patch

ERROR: -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12722/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12722/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12722/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12932104/HIVE-19927.01.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932104 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549735#comment-16549735 ]

Hive QA commented on HIVE-19927:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932104/HIVE-19927.01.patch

SUCCESS: +1 due to 3 test(s) being added or modified.

ERROR: -1 due to 6 failed/errored test(s), 14672 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.txn.TestTxnHandler.testReplAllocWriteId (batchId=273)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testDumpLimit (batchId=239)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testTruncateWithCM (batchId=239)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootStrapDumpOfWarehouse (batchId=241)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testDropFunctionIncrementalReplication (batchId=241)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testIncrementalDumpOfWarehouse (batchId=241)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12701/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12701/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12701/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932104 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549716#comment-16549716 ]

Hive QA commented on HIVE-19927:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 18s{color} | {color:blue} standalone-metastore/metastore-common in master has 218 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 1s{color} | {color:blue} ql in master has 2273 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 45s{color} | {color:red} ql: The patch generated 1 new + 211 unchanged - 0 fixed = 212 total (was 211) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s{color} | {color:red} itests/hive-unit: The patch generated 21 new + 166 unchanged - 0 fixed = 187 total (was 166) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 8m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12701/dev-support/hive-personality.sh |
| git revision | master / 6d15ce4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12701/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12701/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: standalone-metastore/metastore-common ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12701/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
> ---
>
> Key: HIVE-19927
> URL: https://issues.apache.org/jira/browse/HIVE-19927
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, Transactions
> Affects Versions: 3.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
>
[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
[ https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548155#comment-16548155 ]

ASF GitHub Bot commented on HIVE-19927:
---

GitHub user sankarh opened a pull request:

    https://github.com/apache/hive/pull/403

    HIVE-19927: Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sankarh/hive HIVE-19927

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/403.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #403

commit 434aa1fe7060d1aab393d693e471e67880d599d8
Author: Sankar Hariappan
Date: 2018-07-08T10:43:47Z

    HIVE-19927: Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.
> ---
>
> Key: HIVE-19927
> URL: https://issues.apache.org/jira/browse/HIVE-19927
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, Transactions
> Affects Versions: 3.1.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: DR, pull-request-available, replication
> Attachments: HIVE-19927.01.patch
>
>
> During bootstrap dump of ACID tables, let's consider the below sequence.
> - Current session (REPL DUMP), Open txn (Txn1) - Event-10
> - Another session (Session-2), Open txn (Txn2) - Event-11
> - Session-2 -> Insert data (T1.D1) to ACID table. - Event-12
> - Get lastReplId = last event ID logged. (Event-12)
> - Session-2 -> Commit Txn (Txn2) - Event-13
> - Dump ACID tables based on validTxnList based on Txn1. --> This step skips all the data written by txns > Txn1. So, T1.D1 will be missing.
> - Commit Txn (Txn1)
> - REPL LOAD from bootstrap dump will skip T1.D1.
> - Incremental REPL DUMP will start from Event-13 and hence lose Txn2 which is opened after Txn1. So, data T1.D1 will be lost for ever.
> Proposed to capture the lastReplId of bootstrap before opening current txn (Txn1) and store it in Driver context and use it for dump.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)