[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job

2018-10-15 Thread Yonatan Gottesman (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651163#comment-16651163
 ] 

Yonatan Gottesman commented on OMID-109:


OK, cool. It looks like many tests fail.

Are any of them expected to fail, before I jump into debugging?

> Unable to build phoenix-integration branch through Jenkins job
> --
>
> Key: OMID-109
> URL: https://issues.apache.org/jira/browse/OMID-109
> Project: Apache Omid
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: Yonatan Gottesman
> Priority: Blocker
> Attachments: OMID-109.patch, omid109.patch, omid109_v2.patch
>
>
> Based on Jenkins job failures 
> (https://builds.apache.org/job/Phoenix-omid2/81/), the repo URL in the pom 
> needs to be updated to 
> https://raw.githubusercontent.com/synergian/wagon-git/releases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job

2018-10-15 Thread Yonatan Gottesman (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650588#comment-16650588
 ] 

Yonatan Gottesman commented on OMID-109:


https://issues.apache.org/jira/browse/INFRA-17149

Is this ok?

 



[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job

2018-10-15 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650580#comment-16650580
 ] 

James Taylor commented on OMID-109:
---

[~yonigo] & [~ohads] - would you guys mind filing INFRA Jira tickets so that 
you can make changes to these Jenkins jobs?



[jira] [Commented] (OMID-90) Reducing begin/commit latency by distributing the write to the commit table

2018-10-15 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650361#comment-16650361
 ] 

James Taylor commented on OMID-90:
--

Ping, [~ohads]. I didn't understand your question:
{quote}This commit removes the addition of the fence information to the commit 
table. Could you please check if it hurts correctness? It is related to the 
index population with auto commit that you have just added.
{quote}
We need the fence functionality in Phoenix so that when an index is created, we 
can guarantee that we don't miss creation of any data rows.

> Reducing begin/commit latency by distributing the write to the commit table
> ---
>
> Key: OMID-90
> URL: https://issues.apache.org/jira/browse/OMID-90
> Project: Apache Omid
> Issue Type: Sub-task
> Reporter: Ohad Shacham
> Assignee: Yonatan Gottesman
> Priority: Major
> Attachments: OmidCloud-VLDB.pdf, omid90.patch
>
>
> Today, Omid's commits are done by the transaction manager. In order to 
> write to the commit table efficiently, the transaction manager batches these 
> writes. This optimization, even though it reduces the write time to HBase, 
> significantly increases the begin and commit latency. The commit latency 
> increases since a commit operation returns only after its commit timestamp 
> has been persisted in the commit table. The begin latency increases since 
> begin returns a transaction id that is also used by the transaction to 
> identify its snapshot; therefore, begin returns only after all commits 
> with a commit id smaller than the begin id have been persisted in the commit 
> table. This is crucial, since a snapshot change during a transaction run may 
> violate snapshot isolation. 
>  
> The idea of this feature is to distribute the commit by moving the write to 
> the commit table from the server to the client. The transaction manager does 
> conflict analysis and returns a commit timestamp, while the client atomically 
> persists this commit in the commit table.
> This significantly reduces the begin and commit latency, since batching is 
> no longer required. A begin operation can return immediately, and a commit 
> operation returns after conflict detection. 
> This can introduce snapshot isolation violations, since a slow client can 
> commit and change another transaction's snapshot. Therefore, we use an 
> invalidation technique similar to the one Omid uses today to 
> maintain snapshot isolation in high availability mode.
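
The race between a slow committing client and a reader can be illustrated with a small in-memory sketch. This is not Omid's code: the class, the method names, the INVALIDATED marker, and the use of a ConcurrentHashMap as a stand-in for the commit table are all assumptions made for illustration. It only shows the general idea that both the commit and the invalidation are atomic put-if-absent writes on the same row, so exactly one of them wins.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ClientCommitSketch {
    // Hypothetical marker meaning "a reader invalidated this transaction".
    public static final long INVALIDATED = -1L;

    // Stand-in for the commit table: start timestamp -> commit timestamp or INVALIDATED.
    private final ConcurrentHashMap<Long, Long> commitTable = new ConcurrentHashMap<>();

    // Client-side commit: succeeds only if no entry (in particular, no
    // invalidation) exists yet for this transaction.
    public boolean tryCommit(long startTs, long commitTs) {
        return commitTable.putIfAbsent(startTs, commitTs) == null;
    }

    // Reader that needs a definite answer about a transaction: it tries to
    // invalidate atomically. If a commit entry already exists, the commit
    // timestamp is returned; otherwise the invalidation wins.
    public long resolve(long startTs) {
        Long previous = commitTable.putIfAbsent(startTs, INVALIDATED);
        return previous == null ? INVALIDATED : previous;
    }

    public static void main(String[] args) {
        ClientCommitSketch table = new ClientCommitSketch();

        // Transaction 10 commits before any reader looks at it.
        System.out.println("tx 10 committed: " + table.tryCommit(10L, 15L));
        System.out.println("reader sees tx 10 as: " + table.resolve(10L));

        // Transaction 20 is slow: a reader invalidates it first, so the
        // late commit must fail and the client has to abort.
        System.out.println("reader sees tx 20 as: " + table.resolve(20L));
        System.out.println("tx 20 committed: " + table.tryCommit(20L, 25L));
    }
}
```

In a real deployment the put-if-absent would presumably be an atomic check-and-put against the HBase commit table; the map here only keeps the example self-contained.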





[jira] [Commented] (OMID-117) Ensure timeouts are configured low for RPCs to commit table

2018-10-15 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650343#comment-16650343
 ] 

James Taylor commented on OMID-117:
---

{quote}About the retries, what's the worst thing that can happen with this? How 
bad is it to have it like this?
{quote}
Check out the comment I added to RegionConnectionFactory:
{code:java}
// This setting controls how many retries occur on the region server if an
// IOException occurs while trying to access the commit table. Because a
// handler thread will be in use while these retries occur and the client
// will be blocked waiting, it must not tie up the call for longer than
// the client RPC timeout. Otherwise, the client will initiate retries on its
// end, tying up yet another handler thread. It's best if the retries can be
// zero, as in that case the handler is released and the retries occur on the
// client side. In testing, we've seen NoServerForRegionException occur, which
// is a DoNotRetryIOException and so is not retried on the client. It's not
// clear if this is a real issue or a test-only issue.
private static final int DEFAULT_COMMIT_TABLE_ACCESS_ON_READ_RETRIES_NUMBER = 11;
private static final int DEFAULT_COMMIT_TABLE_ACCESS_ON_READ_RETRY_PAUSE = 100;
{code}
As it is with this patch, if retries are necessary to reach the RS hosting the 
commit table, they will occur from the RS handling the scan for 48 seconds. 
During this time, the handler thread will be tied up (i.e. it can't be used by 
any other HBase client). If this occurs for all the handler threads on the RS, 
then all incoming requests will be queued. For example, non-transactional 
queries would potentially not be processed during this time. If the retries 
(and pauses) occur on the client side, then non-transactional workloads 
wouldn't be impacted. 
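
The 48-second figure is consistent with the two defaults above if the pauses follow HBase's usual backoff schedule. The following is a minimal sketch of that arithmetic, not code from the patch: the multiplier table is copied here on the assumption that it matches HBase's HConstants.RETRY_BACKOFF, the class and method names are hypothetical, and the small random jitter HBase adds to each pause is ignored.

```java
public class CommitTableRetryBudget {
    // Assumed to mirror HBase's HConstants.RETRY_BACKOFF multipliers.
    static final int[] RETRY_BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    // Sums the pause before each retry: base pause times the backoff
    // multiplier for that attempt (the last multiplier is reused if the
    // number of retries exceeds the table length).
    public static long totalPauseMillis(long pauseMillis, int retries) {
        long total = 0;
        for (int attempt = 0; attempt < retries; attempt++) {
            int index = Math.min(attempt, RETRY_BACKOFF.length - 1);
            total += pauseMillis * RETRY_BACKOFF[index];
        }
        return total;
    }

    public static void main(String[] args) {
        // 11 retries with a 100 ms base pause, as in the defaults above.
        long budget = totalPauseMillis(100, 11);
        System.out.println("Pause budget: " + budget + " ms"); // prints "Pause budget: 48100 ms"
    }
}
```

Under these assumptions a single handler thread can be held for roughly 48 seconds of pauses alone, before counting the time spent in the failed calls themselves.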

Ideally, we'd have a test that reproduces this NoServerForRegionException so 
we can see if any changes are needed to handle this situation. You might be 
able to repro this by manually splitting the commit table and then performing 
a read against a transactional table. It may also occur only the very first 
time an RS attempts to reach the commit table after the commit table is 
created.

 

> Ensure timeouts are configured low for RPCs to commit table
> ---
>
> Key: OMID-117
> URL: https://issues.apache.org/jira/browse/OMID-117
> Project: Apache Omid
> Issue Type: Bug
> Reporter: James Taylor
> Priority: Major
> Attachments: OMID-117.patch, OMID-117_hbase2.patch, 
> OMID-117_v2.patch, OMID-117_v3.patch, OMID-117_v4.patch, OMID-117_v5.patch, 
> OMID-117_v6.patch, OMID-117_v7.patch
>
>






[jira] [Commented] (OMID-117) Ensure timeouts are configured low for RPCs to commit table

2018-10-15 Thread Yonatan Gottesman (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649798#comment-16649798
 ] 

Yonatan Gottesman commented on OMID-117:


Hi, it looks good.

I don't understand what you said about changing the poms; all tests pass 
without changing anything (hbase2 too).

About the retries, what's the worst thing that can happen with this? How bad 
is it to have it like this? If it's bad, I can investigate why the tests don't 
pass.

OK to commit.
