[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-08-11 Thread Alisha Prabhu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577146#comment-16577146
 ] 

Alisha Prabhu commented on HIVE-19927:
--

Thanks for the quick fix [~sankarh] !!

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if 
> have ACID/MM tables.
> ---
>
> Key: HIVE-19927
> URL: https://issues.apache.org/jira/browse/HIVE-19927
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl, Transactions
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, pull-request-available, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19927.01-branch-3.patch, HIVE-19927.01.patch, 
> HIVE-19927.02.patch, HIVE-19927.03.patch, HIVE-19927.04.patch
>
>
> During bootstrap dump of ACID tables, let's consider the below sequence.
> - Current session (REPL DUMP), Open txn (Txn1) - Event-10
> - Another session (Session-2), Open txn (Txn2) - Event-11
> - Session-2 -> Insert data (T1.D1) to ACID table. - Event-12
> - Get lastReplId = last event ID logged. (Event-12)
> - Session-2 -> Commit Txn (Txn2) - Event-13
> - Dump ACID tables using the validTxnList based on Txn1. --> This step skips 
> all the data written by txns > Txn1. So, T1.D1 will be missing.
> - Commit Txn (Txn1)
> - REPL LOAD from bootstrap dump will skip T1.D1.
> - Incremental REPL DUMP will start from Event-13 and hence lose Txn2, which 
> was opened after Txn1. So, data T1.D1 will be lost forever.
> The proposal is to capture the lastReplId of bootstrap before opening the 
> current txn (Txn1), store it in the Driver context, and use it for the dump.
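
The sequence in the description can be replayed with a toy model. The sketch below is not Hive code: the `EventLog` class, the event strings, and the `incrementalEvents` helper are all hypothetical stand-ins, and event IDs start at 10 only to match the numbering above. It compares capturing lastReplId after Txn2's insert (the reported behaviour) with capturing it before Txn1 is opened (the proposal).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the event sequence in the issue description; none of this is Hive code. */
class BootstrapReplIdSketch {

  /** Hypothetical stand-in for the notification event log. */
  static final class EventLog {
    private final long firstId;
    private final List<String> events = new ArrayList<>();

    EventLog(long firstId) {
      this.firstId = firstId;
    }

    void log(String event) {
      events.add(event);
    }

    /** ID of the newest event, or firstId - 1 while the log is still empty. */
    long lastId() {
      return firstId + events.size() - 1;
    }

    /** Events with an ID strictly greater than replId. */
    List<String> eventsAfter(long replId) {
      int from = (int) Math.max(0L, replId - firstId + 1);
      return new ArrayList<>(events.subList(from, events.size()));
    }
  }

  /** Returns the events a later incremental dump would replay. */
  static List<String> incrementalEvents(boolean captureBeforeOpenTxn) {
    EventLog log = new EventLog(10L);
    long lastReplId = -1L;
    if (captureBeforeOpenTxn) {
      lastReplId = log.lastId();      // proposal: capture before Txn1 opens (ID 9)
    }
    log.log("OPEN_TXN Txn1");         // Event-10
    log.log("OPEN_TXN Txn2");         // Event-11
    log.log("INSERT T1.D1 (Txn2)");   // Event-12
    if (!captureBeforeOpenTxn) {
      lastReplId = log.lastId();      // reported behaviour: capture here (ID 12)
    }
    log.log("COMMIT_TXN Txn2");       // Event-13
    // The bootstrap dump runs here with Txn1's snapshot, so it never sees T1.D1.
    log.log("COMMIT_TXN Txn1");       // Event-14
    return log.eventsAfter(lastReplId);
  }

  public static void main(String[] args) {
    System.out.println("capture after insert:     " + incrementalEvents(false));
    System.out.println("capture before open txn:  " + incrementalEvents(true));
  }
}
```

In this model the late capture replays only the two commit events, so neither the bootstrap dump nor the incremental dump carries the insert. The early capture replays everything from Txn1's open onwards, including Txn2's insert.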



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-08-10 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576544#comment-16576544
 ] 

Sankar Hariappan commented on HIVE-19927:
-

[~alishap], 

It fails because of the assert in the code below. 
{code:java}
Long bootDumpBeginReplId =
    queryState.getConf().getLong(ReplicationSemanticAnalyzer.LAST_REPL_ID_KEY, -1L);
assert (bootDumpBeginReplId >= 0L);{code}
The Driver is expected to set the "hive.repl.last.repl.id" config in 
queryState.queryConf before invoking bootstrapDump.
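
As a rough illustration of why an unset key trips this guard, here is a minimal sketch that uses a plain `Map` in place of the real configuration object. The class name, the `getLong` helper, and the choice of `IllegalStateException` (instead of `assert`, which only fires when the JVM runs with `-ea`) are assumptions made for the example; only the key name and the `-1L` default come from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a Map stands in for the query configuration. */
class LastReplIdGuardSketch {
  static final String LAST_REPL_ID_KEY = "hive.repl.last.repl.id";

  /** Hypothetical equivalent of a getLong(key, default) lookup. */
  static long getLong(Map<String, String> conf, String key, long defaultValue) {
    String value = conf.get(key);
    return value == null ? defaultValue : Long.parseLong(value);
  }

  /** Mirrors the guard: the ID must have been set before the bootstrap dump starts. */
  static long bootDumpBeginReplId(Map<String, String> conf) {
    long id = getLong(conf, LAST_REPL_ID_KEY, -1L);
    if (id < 0L) {
      throw new IllegalStateException(LAST_REPL_ID_KEY + " was not set before the bootstrap dump");
    }
    return id;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    try {
      bootDumpBeginReplId(conf);
    } catch (IllegalStateException e) {
      System.out.println("unset: " + e.getMessage());
    }
    conf.put(LAST_REPL_ID_KEY, "1");
    System.out.println("set: " + bootDumpBeginReplId(conf));
  }
}
```

A caller that never populates the key falls through to the `-1L` default and fails the check, which is the situation a test has to avoid by supplying a value up front.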

So, this unit test should mock it like this.
{code:java}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.QueryState;

@RunWith(PowerMockRunner.class)
@PrepareForTest({ Utils.class })
@PowerMockIgnore({ "javax.management.*" })
public class ReplDumpTaskTest {

  @Mock
  private Hive hive;

  @Mock
  private HiveConf conf;

  @Mock
  private QueryState qs;

  class StubReplDumpTask extends ReplDumpTask {

    @Override
    protected Hive getHive() {
      return hive;
    }

    @Override
    long currentNotificationId(Hive hiveDb) {
      return Long.MAX_VALUE;
    }

    @Override
    String getValidTxnListForReplDump(Hive hiveDb) {
      return "";
    }

    @Override
    void dumpFunctionMetadata(String dbName, Path dumpRoot) {
    }

    @Override
    Path dumpDbMetadata(String dbName, Path dumpRoot, long lastReplId) {
      return Mockito.mock(Path.class);
    }

    @Override
    void dumpConstraintMetadata(String dbName, String tblName, Path dbRoot) {
    }
  }

  private static class TestException extends Exception {
  }

  @Test(expected = TestException.class)
  public void removeDBPropertyToPreventRenameWhenBootstrapDumpOfTableFails() throws Exception {
    List<String> tableList = Arrays.asList("a1", "a2");
    String dbRandomKey = "akeytoberandom";

    mockStatic(Utils.class);
    when(Utils.matchesDb(same(hive), eq("default")))
        .thenReturn(Collections.singletonList("default"));
    when(Utils.getAllTables(same(hive), eq("default"))).thenReturn(tableList);
    when(Utils.setDbBootstrapDumpState(same(hive), eq("default"))).thenReturn(dbRandomKey);
    when(Utils.matchesTbl(same(hive), eq("default"), anyString())).thenReturn(tableList);

    when(hive.getAllFunctions()).thenReturn(Collections.emptyList());
    when(qs.getConf()).thenReturn(conf);
    when(conf.getLong("hive.repl.last.repl.id", -1L)).thenReturn(1L);

    ReplDumpTask task = new StubReplDumpTask() {
      private int tableDumpCount = 0;

      @Override
      void dumpTable(String dbName, String tblName, String validTxnList, Path dbRoot,
          long lastReplId) throws Exception {
        tableDumpCount++;
        if (tableDumpCount > 1) {
          throw new TestException();
        }
      }
    };

    task.initialize(qs, null, null, null);
    task.setWork(
        new ReplDumpWork("default", "",
            Long.MAX_VALUE, Long.MAX_VALUE, "",
            Integer.MAX_VALUE, "")
    );

    try {
      task.bootStrapDump(mock(Path.class), null, mock(Path.class));
    } finally {
      verifyStatic();
      Utils.resetDbBootstrapDumpState(same(hive), eq("default"), eq(dbRandomKey));
    }
  }
}{code}
I think this test was not run by ptest because its name doesn't start with Test*.

Will submit a patch with this fix. Thanks for bringing this up!


[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-08-10 Thread Alisha Prabhu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576181#comment-16576181
 ] 

Alisha Prabhu commented on HIVE-19927:
--

Hi [~sankarh],
I am facing an issue in ReplDumpTaskTest.java after the above commit.
Command used in the ql module: mvn -Dtest=ReplDumpTaskTest test
Error:
{code:java}
[ERROR] removeDBPropertyToPreventRenameWhenBootstrapDumpOfTableFails(org.apache.hadoop.hive.ql.exec.repl.ReplDumpTaskTest)  Time elapsed: 3.008 s  <<< ERROR!
java.lang.Exception: Unexpected exception, 
expected 
but was
	at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.handleException(PowerMockJUnit44RunnerDelegateImpl.java:370)
{code}
After debugging, I observed that at line 225 of ReplDumpTask.java, shown below,
Long bootDumpBeginReplId = queryState.getConf().getLong(ReplicationSemanticAnalyzer.LAST_REPL_ID_KEY, -1L);
it is unable to fetch details from the "queryState" object.

Could you please help me understand the context behind the change, or a 
possible reason for the above error?



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-27 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560632#comment-16560632
 ] 

Sankar Hariappan commented on HIVE-19927:
-

Test failures are irrelevant.

01-branch-3.patch is committed to branch-3.



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1655#comment-1655
 ] 

Hive QA commented on HIVE-19927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933322/HIVE-19927.01-branch-3.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 14410 tests executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestSparkStatistics - did not produce a TEST-*.xml file (likely timed out) (batchId=237)
TestTezPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_all] (batchId=70)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_with_masking] (batchId=174)
org.apache.hadoop.hive.ql.TestWarehouseExternalDir.testManagedPaths (batchId=235)
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation (batchId=243)
org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=310)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12900/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12900/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12900/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933322 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559927#comment-16559927
 ] 

Hive QA commented on HIVE-19927:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 19s{color} | {color:red} /data/hiveptest/logs/PreCommit-HIVE-Build-12900/patches/PreCommit-HIVE-Build-12900.patch does not apply to master. Rebase required? Wrong Branch? See http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12900/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-27 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559360#comment-16559360
 ] 

Sankar Hariappan commented on HIVE-19927:
-

Attached branch-3 patch.



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559347#comment-16559347
 ] 

ASF GitHub Bot commented on HIVE-19927:
---

Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/403




[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-27 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559320#comment-16559320
 ] 

Sankar Hariappan commented on HIVE-19927:
-

Thanks for the review [~maheshk114] and [~anishek]!

04.patch is committed to master.



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-27 Thread anishek (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559308#comment-16559308
 ] 

anishek commented on HIVE-19927:


+1 



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558568#comment-16558568
 ] 

Hive QA commented on HIVE-19927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933197/HIVE-19927.04.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12878/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12878/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12878/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12933197/HIVE-19927.04.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933197 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558419#comment-16558419
 ] 

Hive QA commented on HIVE-19927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933197/HIVE-19927.04.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14812 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12876/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12876/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12876/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933197 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558335#comment-16558335
 ] 

Hive QA commented on HIVE-19927:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  5s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 38s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  4s{color} | {color:blue} ql in master has 2296 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 13s{color} | {color:red} metastore-server in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 46s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 12s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 19s{color} | {color:red} itests/hive-unit: The patch generated 21 new + 166 unchanged - 0 fixed = 187 total (was 166) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 42s{color} | {color:red} ql: The patch generated 1 new + 258 unchanged - 1 fixed = 259 total (was 259) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 14s{color} | {color:red} metastore-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12876/dev-support/hive-personality.sh
 |
| git revision | master / 2820fc4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus/branch-findbugs-standalone-metastore_metastore-server.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus/patch-findbugs-standalone-metastore_metastore-server.txt
 |
| modules | C: itests/hive-unit ql standalone-metastore/metastore-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12876/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558206#comment-16558206
 ] 

Hive QA commented on HIVE-19927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933166/HIVE-19927.04.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14812 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestRestrictedList.testRestrictedList (batchId=251)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12875/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12875/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12875/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933166 - PreCommit-HIVE-Build

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if 
> have ACID/MM tables.
> ---
>
> Key: HIVE-19927
> URL: https://issues.apache.org/jira/browse/HIVE-19927
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, Transactions
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, pull-request-available, replication
> Attachments: HIVE-19927.01.patch, HIVE-19927.02.patch, 
> HIVE-19927.03.patch, HIVE-19927.04.patch
>
>
> During a bootstrap dump of ACID tables, consider the sequence below.
> - Current session (REPL DUMP), Open txn (Txn1) - Event-10
> - Another session (Session-2), Open txn (Txn2) - Event-11
> - Session-2 -> Insert data (T1.D1) to ACID table. - Event-12
> - Get lastReplId = last event ID logged. (Event-12)
> - Session-2 -> Commit Txn (Txn2) - Event-13
> - Dump ACID tables using the validTxnList based on Txn1. --> This step skips 
> all the data written by txns > Txn1. So, T1.D1 will be missing.
> - Commit Txn (Txn1)
> - REPL LOAD from bootstrap dump will skip T1.D1.
> - Incremental REPL DUMP will start from Event-13 and hence lose Txn2, which 
> was opened after Txn1. So, data T1.D1 will be lost forever.
> The proposal is to capture the lastReplId of the bootstrap dump before opening 
> the current txn (Txn1), store it in the Driver context, and use it for the dump.
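
The ordering problem in the sequence above can be sketched as a small,
self-contained simulation. This is only an illustration under simplifying
assumptions, not Hive code: the class BootstrapReplIdSketch, its Event type
and all method names are hypothetical, the event log is assumed to end at
Event-9 before the scenario starts, and an incremental dump is modelled as
"replays a txn only if its open, write and commit events all have an ID
greater than lastReplId".

```java
import java.util.ArrayList;
import java.util.List;

public class BootstrapReplIdSketch {

    /** One notification-log entry: event ID, owning txn ID, and event kind. */
    static final class Event {
        final long id;
        final int txn;
        final String kind; // "OPEN", "WRITE" or "COMMIT"

        Event(long id, int txn, String kind) {
            this.id = id;
            this.txn = txn;
            this.kind = kind;
        }
    }

    /**
     * Appends the events of the scenario to the log and returns the lastReplId
     * that the bootstrap dump records. When captureBeforeOpenTxn is true, the
     * ID is captured before Txn1 is opened (the proposed order); otherwise it
     * is read after Session-2 has already written (the reported order).
     */
    public static long recordLastReplId(List<Event> log, boolean captureBeforeOpenTxn) {
        long lastEventId = 9;   // assumption: events up to Event-9 already exist
        long lastReplId = -1;
        if (captureBeforeOpenTxn) {
            lastReplId = lastEventId;                     // proposed: Event-9
        }
        log.add(new Event(++lastEventId, 1, "OPEN"));     // Event-10: REPL DUMP opens Txn1
        log.add(new Event(++lastEventId, 2, "OPEN"));     // Event-11: Session-2 opens Txn2
        log.add(new Event(++lastEventId, 2, "WRITE"));    // Event-12: Session-2 inserts T1.D1
        if (!captureBeforeOpenTxn) {
            lastReplId = lastEventId;                     // reported: Event-12
        }
        log.add(new Event(++lastEventId, 2, "COMMIT"));   // Event-13: Session-2 commits Txn2
        return lastReplId;
    }

    /** True if the events with id > lastReplId contain the open, write and commit of txn. */
    public static boolean incrementalCoversTxn(List<Event> log, long lastReplId, int txn) {
        boolean open = false;
        boolean write = false;
        boolean commit = false;
        for (Event e : log) {
            if (e.id <= lastReplId || e.txn != txn) {
                continue; // not replayed by the incremental dump, or a different txn
            }
            if (e.kind.equals("OPEN")) {
                open = true;
            } else if (e.kind.equals("WRITE")) {
                write = true;
            } else if (e.kind.equals("COMMIT")) {
                commit = true;
            }
        }
        return open && write && commit;
    }

    /**
     * The bootstrap dump reads with a snapshot based on Txn1, so it never
     * contains Txn2's data; only the incremental dump can deliver it.
     */
    public static boolean txn2Replicated(boolean captureBeforeOpenTxn) {
        List<Event> log = new ArrayList<>();
        long lastReplId = recordLastReplId(log, captureBeforeOpenTxn);
        return incrementalCoversTxn(log, lastReplId, 2);
    }

    public static void main(String[] args) {
        System.out.println("reported order, Txn2 replicated: " + txn2Replicated(false));
        System.out.println("proposed order, Txn2 replicated: " + txn2Replicated(true));
    }
}
```

In this model the reported order records lastReplId = 12, so the incremental
dump sees only the commit of Txn2 and T1.D1 is never replayed; capturing the
ID before opening Txn1 records 9, so Events 11 to 13 are all replayed.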





[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558174#comment-16558174
 ] 

Hive QA commented on HIVE-19927:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 46s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 23s | master passed |
| +1 | compile | 2m 10s | master passed |
| +1 | checkstyle | 1m 3s | master passed |
| 0 | findbugs | 0m 38s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| 0 | findbugs | 3m 56s | ql in master has 2296 extant Findbugs warnings. |
| -1 | findbugs | 0m 14s | metastore-server in master failed. |
| +1 | javadoc | 1m 45s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 43s | the patch passed |
| +1 | compile | 2m 14s | the patch passed |
| +1 | javac | 2m 14s | the patch passed |
| -1 | checkstyle | 0m 17s | itests/hive-unit: The patch generated 21 new + 166 unchanged - 0 fixed = 187 total (was 166) |
| -1 | checkstyle | 0m 44s | ql: The patch generated 1 new + 258 unchanged - 1 fixed = 259 total (was 259) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 0m 14s | metastore-server in the patch failed. |
| +1 | javadoc | 1m 47s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 32m 19s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12875/dev-support/hive-personality.sh |
| git revision | master / 2820fc4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus/branch-findbugs-standalone-metastore_metastore-server.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus/diff-checkstyle-itests_hive-unit.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus/patch-findbugs-standalone-metastore_metastore-server.txt |
| modules | C: itests/hive-unit ql standalone-metastore/metastore-server U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12875/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-26 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558054#comment-16558054
 ] 

Sankar Hariappan commented on HIVE-19927:
-

Attached 04.patch after rebasing with master.






[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-26 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558045#comment-16558045
 ] 

mahesh kumar behera commented on HIVE-19927:


[~sankarh]
patch 3 looks fine to me 






[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558042#comment-16558042
 ] 

Hive QA commented on HIVE-19927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933145/HIVE-19927.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12872/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12872/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12872/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-26 08:02:33.008
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12872/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-26 08:02:33.010
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2820fc4 HIVE-20203: Arrow SerDe leaks a DirectByteBuffer (Eric 
Wohlstadter, reviewed by Teddy Choi)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 2820fc4 HIVE-20203: Arrow SerDe leaks a DirectByteBuffer (Eric 
Wohlstadter, reviewed by Teddy Choi)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-26 08:02:34.163
+ rm -rf ../yetus_PreCommit-HIVE-Build-12872
+ mkdir ../yetus_PreCommit-HIVE-Build-12872
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12872
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12872/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java:
 does not exist in index
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java: does not exist in 
index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java: 
does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java: 
does not exist in index
error: a/ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java: 
does not exist in index
error: a/ql/src/test/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTaskTest.java: 
does not exist in index
error: 
a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
 does not exist in index
error: 
a/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java:
 does not exist in index
error: patch failed: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java:19
Falling back to three-way merge...
Applied patch to 
'standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java'
 with conflicts.
Going to apply patch with: git apply -p1
error: patch failed: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java:19
Falling back to three-way merge...
Applied patch to 
'standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/InjectableBehaviourObjectStore.java'
 with conflicts.
U 

[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552917#comment-16552917
 ] 

Hive QA commented on HIVE-19927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932695/HIVE-19927.02.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14681 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12791/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12791/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932695 - PreCommit-HIVE-Build






[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552882#comment-16552882
 ] 

Hive QA commented on HIVE-19927:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 26s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 30s | master passed |
| +1 | compile | 2m 16s | master passed |
| +1 | checkstyle | 1m 6s | master passed |
| 0 | findbugs | 0m 39s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| 0 | findbugs | 4m 9s | ql in master has 2280 extant Findbugs warnings. |
| -1 | findbugs | 0m 14s | metastore-server in master failed. |
| +1 | javadoc | 1m 48s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 9s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 41s | the patch passed |
| +1 | compile | 2m 13s | the patch passed |
| +1 | javac | 2m 13s | the patch passed |
| -1 | checkstyle | 0m 19s | itests/hive-unit: The patch generated 21 new + 166 unchanged - 0 fixed = 187 total (was 166) |
| -1 | checkstyle | 0m 42s | ql: The patch generated 1 new + 258 unchanged - 1 fixed = 259 total (was 259) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 0m 13s | metastore-server in the patch failed. |
| +1 | javadoc | 1m 46s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 33m 22s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12791/dev-support/hive-personality.sh |
| git revision | master / 6b15816 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus/branch-findbugs-standalone-metastore_metastore-server.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus/diff-checkstyle-itests_hive-unit.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus/patch-findbugs-standalone-metastore_metastore-server.txt |
| modules | C: itests/hive-unit ql standalone-metastore/metastore-server U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12791/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-23 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552775#comment-16552775
 ] 

Sankar Hariappan commented on HIVE-19927:
-

Attached 02.patch with:
 * Rebased with master.
 * Bug fix to make create/drop function events idempotent when they occur 
concurrently with the bootstrap dump, after the last repl ID has been fetched.
 * Set last repl ID in queryState conf for each query, overwriting the old one.
 * Set last repl ID only if a txn is opened.

Request [~maheshk114] to take a look!






[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-20 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550342#comment-16550342
 ] 

Hive QA commented on HIVE-19927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932104/HIVE-19927.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12722/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12722/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12722/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12932104/HIVE-19927.01.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932104 - PreCommit-HIVE-Build

> Last Repl ID set by bootstrap dump is incorrect and may cause data loss if 
> have ACID/MM tables.
> ---
>
> Key: HIVE-19927
> URL: https://issues.apache.org/jira/browse/HIVE-19927
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, Transactions
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, pull-request-available, replication
> Attachments: HIVE-19927.01.patch
>
>
> During a bootstrap dump of ACID tables, consider the following sequence:
> - Current session (REPL DUMP): open txn (Txn1) - Event-10
> - Another session (Session-2): open txn (Txn2) - Event-11
> - Session-2: insert data (T1.D1) into an ACID table - Event-12
> - Get lastReplId = last event ID logged (Event-12).
> - Session-2: commit txn (Txn2) - Event-13
> - Dump ACID tables using the validTxnList based on Txn1. This step skips all data written by txns > Txn1, so T1.D1 will be missing.
> - Commit txn (Txn1).
> - REPL LOAD from the bootstrap dump will skip T1.D1.
> - Incremental REPL DUMP will start from Event-13 and hence miss Txn2, which was opened after Txn1. So the data T1.D1 will be lost forever.
> Proposed fix: capture the lastReplId of the bootstrap dump before opening the current txn (Txn1), store it in the Driver context, and use it for the dump.
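
The event sequence quoted above can be sketched as a small simulation. This is only an illustrative model, not Hive code: the Event class, the helper functions, and the value 9 (standing in for "the last event logged before Event-10") are assumptions made for the example. It shows which events an incremental dump would replay depending on where lastReplId was captured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for one notification event."""
    event_id: int
    txn: str
    kind: str  # "open", "insert", or "commit"


# Event log taken from the sequence in the issue description.
EVENT_LOG = [
    Event(10, "Txn1", "open"),    # REPL DUMP session opens its own txn
    Event(11, "Txn2", "open"),    # Session-2 opens a txn
    Event(12, "Txn2", "insert"),  # Session-2 writes T1.D1
    Event(13, "Txn2", "commit"),  # Session-2 commits
]


def incremental_events(last_repl_id):
    """Events an incremental dump would replay: all events after last_repl_id."""
    return [e for e in EVENT_LOG if e.event_id > last_repl_id]


def replays_insert(last_repl_id):
    """True if the incremental dump still sees the insert of T1.D1."""
    return any(
        e.txn == "Txn2" and e.kind == "insert"
        for e in incremental_events(last_repl_id)
    )


# Ordering described as buggy: lastReplId is read after Txn1 is opened,
# by which time Event-12 has already been logged.
BUGGY_LAST_REPL_ID = 12

# Proposed ordering: lastReplId is read before Txn1 is opened.
# 9 is an assumed ID for the last event preceding Event-10.
FIXED_LAST_REPL_ID = 9

if __name__ == "__main__":
    print("buggy ordering replays insert:", replays_insert(BUGGY_LAST_REPL_ID))
    print("fixed ordering replays insert:", replays_insert(FIXED_LAST_REPL_ID))
```

In this model the bootstrap dump never contains T1.D1 (Txn2 is newer than Txn1), so the data survives only if the incremental dump replays Event-12. That holds when lastReplId is captured before Txn1 is opened, and fails when it is captured afterwards.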





[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549735#comment-16549735
 ] 

Hive QA commented on HIVE-19927:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932104/HIVE-19927.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 14672 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.txn.TestTxnHandler.testReplAllocWriteId (batchId=273)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testDumpLimit (batchId=239)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testTruncateWithCM (batchId=239)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootStrapDumpOfWarehouse (batchId=241)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testDropFunctionIncrementalReplication (batchId=241)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testIncrementalDumpOfWarehouse (batchId=241)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12701/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12701/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12701/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932104 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549716#comment-16549716
 ] 

Hive QA commented on HIVE-19927:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  7s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 18s{color} | {color:blue} standalone-metastore/metastore-common in master has 218 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  1s{color} | {color:blue} ql in master has 2273 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 45s{color} | {color:red} ql: The patch generated 1 new + 211 unchanged - 0 fixed = 212 total (was 211) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 19s{color} | {color:red} itests/hive-unit: The patch generated 21 new + 166 unchanged - 0 fixed = 187 total (was 166) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m  2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12701/dev-support/hive-personality.sh |
| git revision | master / 6d15ce4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12701/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12701/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: standalone-metastore/metastore-common ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12701/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-19927) Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548155#comment-16548155
 ] 

ASF GitHub Bot commented on HIVE-19927:
---

GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/403

HIVE-19927: Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-19927

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/403.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #403


commit 434aa1fe7060d1aab393d693e471e67880d599d8
Author: Sankar Hariappan 
Date:   2018-07-08T10:43:47Z

HIVE-19927: Last Repl ID set by bootstrap dump is incorrect and may cause data loss if have ACID/MM tables.



