[ 
https://issues.apache.org/jira/browse/PHOENIX-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350883#comment-14350883
 ] 

Gary Helmling commented on PHOENIX-1674:
----------------------------------------

{quote}
I think those should be implemented first. Going down the path of a custom 
delete marker is going to be really ugly. Plus, then we have to support it 
going forward (at least until we know that any major compaction ran, I guess). 
Do you think it'd be feasible to attempt that? Maybe you and Lars Hofhansl 
could collaborate?
{quote}

Tephra already implements a custom delete marker, and this should be completely 
transparent to Phoenix.  What I was describing was simply a write-time 
optimization on top of that: implementing column family delete markers in 
Tephra as well.  It seems like that might be necessary, depending on the HBase 
versions you want to support.  Even if development of the HBase changes were 
already done, something like adding an "undelete" operation doesn't seem like 
it would go into a release prior to 1.1, so there will be some delay in the 
release and adoption cycle.

I would definitely like to avoid duplicating HBase functionality, but 
availability of the new features is something we need to consider as well.

{quote}
Functionally I think it's the same to have the SkipScanFilter and 
TransactionVisibilityFilter in either order, but I think we'd take a perf hit 
under some cases if SkipScanFilter isn't first. Phoenix doesn't use the HBase 
Get for point lookups, but uses the SkipScanFilter instead. If we're seeking 
over a billion rows and only returning a few rows (which would be a few seeks 
from the SkipScanFilter), running the TransactionVisibilityFilter over every 
row is going to be more expensive than running it over only the rows that pass 
through the SkipScanFilter.
{quote}

Looking at the code for FilterList, I'm not sure this is true.  It looks to me 
like any seek hint returned by SkipScanFilter would still be respected 
regardless of ordering.  We can examine this in more detail together.
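
One way to reason about the ordering question is a toy model, sketched here in 
Python rather than taken from HBase.  It assumes that the filter list ANDs its 
filters in order and returns the first non-INCLUDE answer, and that the scanner 
honors any seek hint it receives.  Whether FilterList really behaves this way 
is exactly what needs to be checked against the code.  The names SkipScan, 
Visibility, scan and run are hypothetical stand-ins, not Phoenix or Tephra 
classes.

```python
from bisect import bisect_left

INCLUDE, SKIP, SEEK, DONE = "INCLUDE", "SKIP", "SEEK", "DONE"


class SkipScan:
    """Toy stand-in for a skip-scan filter: accepts only `targets` and
    returns a seek hint to the next target for any other row."""

    def __init__(self, targets):
        self.targets = sorted(targets)

    def check(self, row):
        i = bisect_left(self.targets, row)
        if i == len(self.targets):
            return DONE, None            # no target at or after this row
        if self.targets[i] == row:
            return INCLUDE, None
        return SEEK, self.targets[i]     # hint: jump to the next target


class Visibility:
    """Toy stand-in for a visibility filter; counts how often it runs."""

    def __init__(self, invisible=()):
        self.invisible = set(invisible)
        self.calls = 0

    def check(self, row):
        self.calls += 1
        return (SKIP, None) if row in self.invisible else (INCLUDE, None)


def scan(num_rows, filters):
    """AND the filters in order; the first non-INCLUDE answer wins
    (an assumption of this model, not a statement about HBase)."""
    row, out = 0, []
    while row < num_rows:
        code, hint = INCLUDE, None
        for f in filters:
            code, hint = f.check(row)
            if code != INCLUDE:
                break
        if code == DONE:
            break
        if code == INCLUDE:
            out.append(row)
        # Honor a seek hint if one was returned; otherwise step one row.
        row = hint if code == SEEK else row + 1
    return out


def run(order, num_rows=1_000_000, targets=(10, 500_000, 999_999)):
    """Scan with the two filters in the given order; return the rows
    emitted and the number of visibility checks performed."""
    vis = Visibility()
    skip = SkipScan(targets)
    filters = [skip, vis] if order == "skip_first" else [vis, skip]
    return scan(num_rows, filters), vis.calls
```

Under these assumptions, run("skip_first") and run("vis_first") return the same 
three rows.  The visibility check runs 3 times in the first order and 6 times 
in the second, not once per row, because the seek hint is still honored when 
the skip-scan stand-in comes second.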

{quote}
Can these [transaction managers] be co-located with RSs (and does that make 
sense)? Would it be possible to communicate to them via an EndPoint coprocessor 
and even run in the same JVM as the RS? Not that Tephra may want to always work 
this way, but for Phoenix I think it'd make sense, as we're already pinging the 
RS (as mentioned above) to ensure our metadata is up to date, so there'd be no 
extra overhead. It'd also negate any issues with ensuring a separate, new 
service is up and running.
{quote}

Yes, you can run multiple stand-by transaction managers.  These could be 
co-located with region servers, as long as the region servers are not consuming 
all of the system resources.  You don't want the transaction managers to be 
CPU-starved, for example.  For that reason, co-locating with master nodes 
might be a better option.


> Snapshot isolation transaction support through Tephra
> -----------------------------------------------------
>
>                 Key: PHOENIX-1674
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1674
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>
> Tephra (http://tephra.io/ and https://github.com/caskdata/tephra) is one 
> option for getting transaction support in Phoenix. Let's use this JIRA to 
> discuss the way in which this could be integrated along with the pros and 
> cons.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
