Author: jamestaylor Date: Thu Dec 1 21:09:53 2016 New Revision: 1772276 URL: http://svn.apache.org/viewvc?rev=1772276&view=rev Log: Update docs based on 4.9 features (add new markdown page for Atomic Upsert)
Added: phoenix/site/publish/atomic_upsert.html phoenix/site/source/src/site/markdown/atomic_upsert.md Added: phoenix/site/publish/atomic_upsert.html URL: http://svn.apache.org/viewvc/phoenix/site/publish/atomic_upsert.html?rev=1772276&view=auto ============================================================================== --- phoenix/site/publish/atomic_upsert.html (added) +++ phoenix/site/publish/atomic_upsert.html Thu Dec 1 21:09:53 2016 @@ -0,0 +1,455 @@ + +<!DOCTYPE html> +<!-- + Generated by Apache Maven Doxia at 2016-12-01 + Rendered using Reflow Maven Skin 1.1.0 (http://andriusvelykis.github.io/reflow-maven-skin) +--> +<html xml:lang="en" lang="en"> + + <head> + <meta charset="UTF-8" /> + <title>Atomic Upsert | Apache Phoenix</title> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta name="description" content="" /> + <meta http-equiv="content-language" content="en" /> + + <link href="//netdna.bootstrapcdn.com/bootswatch/2.3.2/flatly/bootstrap.min.css" rel="stylesheet" /> + <link href="//netdna.bootstrapcdn.com/twitter-bootstrap/2.3.1/css/bootstrap-responsive.min.css" rel="stylesheet" /> + <link href="./css/bootswatch.css" rel="stylesheet" /> + <link href="./css/reflow-skin.css" rel="stylesheet" /> + + <link href="//yandex.st/highlightjs/7.5/styles/default.min.css" rel="stylesheet" /> + + <link href="./css/lightbox.css" rel="stylesheet" /> + + <link href="./css/site.css" rel="stylesheet" /> + <link href="./css/print.css" rel="stylesheet" media="print" /> + + <!-- Le HTML5 shim, for IE6-8 support of HTML5 elements --> + <!--[if lt IE 9]> + <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script> + <![endif]--> + + + + </head> + + <body class="page-atomic_upsert project-phoenix-site" data-spy="scroll" data-offset="60" data-target="#toc-scroll-target"> + + <div class="navbar navbar-fixed-top"> + <div class="navbar-inner"> + <div class="container"> + <a class="btn btn-navbar" data-toggle="collapse" 
data-target="#top-nav-collapse"> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </a> + <a class="brand" href="index.html"><div class="xtoplogo"></div></a> + <div class="nav-collapse collapse" id="top-nav-collapse"> + <ul class="nav pull-right"> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown">About <b class="caret"></b></a> + <ul class="dropdown-menu"> + <li ><a href="index.html" title="Overview">Overview</a></li> + <li ><a href="who_is_using.html" title="Who is Using">Who is Using</a></li> + <li ><a href="recent.html" title="New Features">New Features</a></li> + <li ><a href="roadmap.html" title="Roadmap">Roadmap</a></li> + <li ><a href="performance.html" title="Performance">Performance</a></li> + <li ><a href="team.html" title="Team">Team</a></li> + <li ><a href="resources.html" title="Presentations">Presentations</a></li> + <li ><a href="mailing_list.html" title="Mailing Lists">Mailing Lists</a></li> + <li ><a href="source.html" title="Source Repository">Source Repository</a></li> + <li ><a href="issues.html" title="Issue Tracking">Issue Tracking</a></li> + <li ><a href="download.html" title="Download">Download</a></li> + <li ><a href="installation.html" title="Installation">Installation</a></li> + <li class="divider"/> + <li ><a href="contributing.html" title="How to Contribute">How to Contribute</a></li> + <li ><a href="develop.html" title="How to Develop">How to Develop</a></li> + <li ><a href="building_website.html" title="How to Update Website">How to Update Website</a></li> + <li ><a href="release.html" title="How to Release">How to Release</a></li> + <li class="divider"/> + <li ><a href="http://www.apache.org/licenses/" title="License" class="externalLink">License</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown">Using <b class="caret"></b></a> + <ul class="dropdown-menu"> + <li ><a href="faq.html" 
title="F.A.Q.">F.A.Q.</a></li> + <li ><a href="Phoenix-in-15-minutes-or-less.html" title="Quick Start">Quick Start</a></li> + <li ><a href="building.html" title="Building">Building</a></li> + <li ><a href="tuning.html" title="Tuning">Tuning</a></li> + <li ><a href="upgrading.html" title="Backward Compatibility">Backward Compatibility</a></li> + <li ><a href="release_notes.html" title="Release Notes">Release Notes</a></li> + <li ><a href="pherf.html" title="Performance Testing">Performance Testing</a></li> + <li class="divider"/> + <li ><a href="phoenix_spark.html" title="Apache Spark Integration">Apache Spark Integration</a></li> + <li ><a href="hive_storage_handler.html" title="Apache Hive Storage Handler">Apache Hive Storage Handler</a></li> + <li ><a href="pig_integration.html" title="Apache Pig Integration">Apache Pig Integration</a></li> + <li ><a href="phoenix_mr.html" title="Map Reduce Integration">Map Reduce Integration</a></li> + <li ><a href="flume.html" title="Apache Flume Plugin">Apache Flume Plugin</a></li> + <li class="divider"/> + <li ><a href="phoenix_on_emr.html" title="Phoenix on Amazon EMR">Phoenix on Amazon EMR</a></li> + <li ><a href="phoenix_python.html" title="Phoenix Adapter for Python">Phoenix Adapter for Python</a></li> + <li ><a href="phoenix_orm.html" title="Phoenix ORM Library">Phoenix ORM Library</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown">Features <b class="caret"></b></a> + <ul class="dropdown-menu"> + <li ><a href="transactions.html" title="Transactions">Transactions</a></li> + <li ><a href="udf.html" title="User-defined Functions">User-defined Functions</a></li> + <li class="divider"/> + <li ><a href="secondary_indexing.html" title="Secondary Indexes">Secondary Indexes</a></li> + <li class="active"><a href="" title="Atomic Upsert">Atomic Upsert</a></li> + <li ><a href="namspace_mapping.html" title="Namespace Mapping">Namespace Mapping</a></li> + <li ><a 
href="update_statistics.html" title="Statistics Collection">Statistics Collection</a></li> + <li ><a href="rowtimestamp.html" title="Row Timestamp Column">Row Timestamp Column</a></li> + <li ><a href="paged.html" title="Paged Queries">Paged Queries</a></li> + <li ><a href="salted.html" title="Salted Tables">Salted Tables</a></li> + <li ><a href="skip_scan.html" title="Skip Scan">Skip Scan</a></li> + <li class="divider"/> + <li ><a href="views.html" title="Views">Views</a></li> + <li ><a href="multi-tenancy.html" title="Multi tenancy">Multi tenancy</a></li> + <li ><a href="dynamic_columns.html" title="Dynamic Columns">Dynamic Columns</a></li> + <li class="divider"/> + <li ><a href="bulk_dataload.html" title="Bulk Loading">Bulk Loading</a></li> + <li ><a href="server.html" title="Query Server">Query Server</a></li> + <li ><a href="tracing.html" title="Tracing">Tracing</a></li> + <li ><a href="metrics.html" title="Metrics">Metrics</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown">Reference <b class="caret"></b></a> + <ul class="dropdown-menu"> + <li ><a href="language/index.html" title="Grammar">Grammar</a></li> + <li ><a href="language/functions.html" title="Functions">Functions</a></li> + <li ><a href="language/datatypes.html" title="Datatypes">Datatypes</a></li> + <li ><a href="array_type.html" title="ARRAY type">ARRAY type</a></li> + <li class="divider"/> + <li ><a href="sequences.html" title="Sequences">Sequences</a></li> + <li ><a href="joins.html" title="Joins">Joins</a></li> + <li ><a href="subqueries.html" title="Subqueries">Subqueries</a></li> + </ul> + </li> + </ul> + </div><!--/.nav-collapse --> + </div> + </div> + </div> + + <div class="container"> + + <!-- Masthead + ================================================== --> + + <header> + </header> + + <div class="main-body"> + <div class="row"> + <div class="span12"> + <div class="body-content"> +<div class="page-header"> + <h1>Atomic Upsert</h1> 
+</div> +<p>To support atomic upsert, an optional ON DUPLICATE KEY clause, similar to the MySQL syntax, has been incorporated into the UPSERT VALUES command as of Phoenix 4.9. The general syntax is described <a href="language/index.html#upsert_values">here</a>. This feature provides a superset of the HBase Increment and CheckAndPut functionality to enable atomic upserts. On the server side, when the commit is processed, the row being updated will be locked while the current column values are read and the ON DUPLICATE KEY clause is executed. Given that the row must be locked and read when the ON DUPLICATE KEY clause is used, there will be a performance penalty (much as an HBase CheckAndPut costs more than a plain Put).</p> +<p>In the presence of the ON DUPLICATE KEY clause, if the row already exists, the VALUES specified will be ignored and instead either:</p> +<ul> + <li>the row will not be updated if ON DUPLICATE KEY IGNORE is specified or</li> + <li>the row will be updated (under lock) by executing the expressions following the ON DUPLICATE KEY UPDATE clause.</li> +</ul> +<p>Multiple UPSERT statements for the same row in the same commit batch will be processed in the order of their execution. 
Thus the same result is produced whether auto commit is on or off.</p> +<div class="section"> + <h2 id="Examples">Examples</h2> + <p>For example, to atomically increment two counter columns, you would execute the following command:</p> + <div class="source"> + <pre>UPSERT INTO my_table(id, counter1, counter2) VALUES ('abc', 0, 0) +ON DUPLICATE KEY UPDATE counter1 = counter1 + 1, counter2 = counter2 + 1; +</pre> + </div> + <p>To write a column value only if the row doesn't yet exist:</p> + <div class="source"> + <pre>UPSERT INTO my_table(id, my_col) VALUES ('abc', 100) +ON DUPLICATE KEY IGNORE; +</pre> + </div> + <p>Note that arbitrarily complex expressions may be used in this new clause:</p> + <div class="source"> + <pre>UPSERT INTO my_table(id, total_deal_size, deal_size) VALUES ('abc', 0, 100) +ON DUPLICATE KEY UPDATE + total_deal_size = total_deal_size + deal_size, + approval_reqd = CASE WHEN total_deal_size &lt; 100 THEN 'NONE' + WHEN total_deal_size &lt; 1000 THEN 'MANAGER APPROVAL' + ELSE 'VP APPROVAL' END; +</pre> + </div> +</div> +<div class="section"> + <h2 id="Limitations">Limitations</h2> + <p>The following limitations are enforced for the ON DUPLICATE KEY clause usage:</p> + <ul> + <li>Primary key columns may not be updated, since this would essentially be creating a <i>new</i> row.</li> + <li>Transactional tables may not use this clause as atomic upserts are already possible through exception handling when a conflict occurs.</li> + <li>Immutable tables may not use this clause as by definition there should be no updates to existing rows.</li> + <li>The CURRENT_SCN property may not be set on connection when this clause is used as HBase does not handle atomicity unless the latest value is being updated.</li> + <li>The same column should not be updated more than once in the same statement.</li> + <li>No aggregation or references to sequences are allowed within the clause.</li> + <li>Although global indexes on columns being atomically updated are supported, it's not 
recommended, since a separate RPC across the wire could potentially be made while the row is under lock to maintain the secondary index.</li> + </ul> +</div> + </div> + </div> + </div> + </div> + + </div><!-- /container --> + + <!-- Footer + ================================================== --> + <footer class="well"> + <div class="container"> + <div class="row"> + <div class="span2 bottom-nav"> + <ul class="nav nav-list"> + <li class="nav-header">About</li> + <li > + <a href="index.html" title="Overview">Overview</a> + </li> + <li > + <a href="who_is_using.html" title="Who is Using">Who is Using</a> + </li> + <li > + <a href="recent.html" title="New Features">New Features</a> + </li> + <li > + <a href="roadmap.html" title="Roadmap">Roadmap</a> + </li> + <li > + <a href="performance.html" title="Performance">Performance</a> + </li> + <li > + <a href="team.html" title="Team">Team</a> + </li> + <li > + <a href="resources.html" title="Presentations">Presentations</a> + </li> + <li > + <a href="mailing_list.html" title="Mailing Lists">Mailing Lists</a> + </li> + <li > + <a href="source.html" title="Source Repository">Source Repository</a> + </li> + <li > + <a href="issues.html" title="Issue Tracking">Issue Tracking</a> + </li> + <li > + <a href="download.html" title="Download">Download</a> + </li> + <li > + <a href="installation.html" title="Installation">Installation</a> + </li> + <li > + <a href="http:divider" title=""></a> + </li> + <li > + <a href="contributing.html" title="How to Contribute">How to Contribute</a> + </li> + <li > + <a href="develop.html" title="How to Develop">How to Develop</a> + </li> + <li > + <a href="building_website.html" title="How to Update Website">How to Update Website</a> + </li> + <li > + <a href="release.html" title="How to Release">How to Release</a> + </li> + <li > + <a href="http:divider" title=""></a> + </li> + <li > + <a href="http://www.apache.org/licenses/" title="License" class="externalLink">License</a> + </li> + </ul> + </div> 
+ <div class="span2 bottom-nav"> + <ul class="nav nav-list"> + <li class="nav-header">Using</li> + <li > + <a href="faq.html" title="F.A.Q.">F.A.Q.</a> + </li> + <li > + <a href="Phoenix-in-15-minutes-or-less.html" title="Quick Start">Quick Start</a> + </li> + <li > + <a href="building.html" title="Building">Building</a> + </li> + <li > + <a href="tuning.html" title="Tuning">Tuning</a> + </li> + <li > + <a href="upgrading.html" title="Backward Compatibility">Backward Compatibility</a> + </li> + <li > + <a href="release_notes.html" title="Release Notes">Release Notes</a> + </li> + <li > + <a href="pherf.html" title="Performance Testing">Performance Testing</a> + </li> + <li > + <a href="http:divider" title=""></a> + </li> + <li > + <a href="phoenix_spark.html" title="Apache Spark Integration">Apache Spark Integration</a> + </li> + <li > + <a href="hive_storage_handler.html" title="Apache Hive Storage Handler">Apache Hive Storage Handler</a> + </li> + <li > + <a href="pig_integration.html" title="Apache Pig Integration">Apache Pig Integration</a> + </li> + <li > + <a href="phoenix_mr.html" title="Map Reduce Integration">Map Reduce Integration</a> + </li> + <li > + <a href="flume.html" title="Apache Flume Plugin">Apache Flume Plugin</a> + </li> + <li > + <a href="http:divider" title=""></a> + </li> + <li > + <a href="phoenix_on_emr.html" title="Phoenix on Amazon EMR">Phoenix on Amazon EMR</a> + </li> + <li > + <a href="phoenix_python.html" title="Phoenix Adapter for Python">Phoenix Adapter for Python</a> + </li> + <li > + <a href="phoenix_orm.html" title="Phoenix ORM Library">Phoenix ORM Library</a> + </li> + </ul> + </div> + <div class="span2 bottom-nav"> + <ul class="nav nav-list"> + <li class="nav-header">Features</li> + <li > + <a href="transactions.html" title="Transactions">Transactions</a> + </li> + <li > + <a href="udf.html" title="User-defined Functions">User-defined Functions</a> + </li> + <li > + <a href="http:divider" title=""></a> + </li> + <li > + <a 
href="secondary_indexing.html" title="Secondary Indexes">Secondary Indexes</a> + </li> + <li class="active"> + <a href="#" title="Atomic Upsert">Atomic Upsert</a> + </li> + <li > + <a href="namspace_mapping.html" title="Namespace Mapping">Namespace Mapping</a> + </li> + <li > + <a href="update_statistics.html" title="Statistics Collection">Statistics Collection</a> + </li> + <li > + <a href="rowtimestamp.html" title="Row Timestamp Column">Row Timestamp Column</a> + </li> + <li > + <a href="paged.html" title="Paged Queries">Paged Queries</a> + </li> + <li > + <a href="salted.html" title="Salted Tables">Salted Tables</a> + </li> + <li > + <a href="skip_scan.html" title="Skip Scan">Skip Scan</a> + </li> + <li > + <a href="http:divider" title=""></a> + </li> + <li > + <a href="views.html" title="Views">Views</a> + </li> + <li > + <a href="multi-tenancy.html" title="Multi tenancy">Multi tenancy</a> + </li> + <li > + <a href="dynamic_columns.html" title="Dynamic Columns">Dynamic Columns</a> + </li> + <li > + <a href="http:divider" title=""></a> + </li> + <li > + <a href="bulk_dataload.html" title="Bulk Loading">Bulk Loading</a> + </li> + <li > + <a href="server.html" title="Query Server">Query Server</a> + </li> + <li > + <a href="tracing.html" title="Tracing">Tracing</a> + </li> + <li > + <a href="metrics.html" title="Metrics">Metrics</a> + </li> + </ul> + </div> + <div class="span3 bottom-nav"> + <ul class="nav nav-list"> + <li class="nav-header">Reference</li> + <li > + <a href="language/index.html" title="Grammar">Grammar</a> + </li> + <li > + <a href="language/functions.html" title="Functions">Functions</a> + </li> + <li > + <a href="language/datatypes.html" title="Datatypes">Datatypes</a> + </li> + <li > + <a href="array_type.html" title="ARRAY type">ARRAY type</a> + </li> + <li > + <a href="http:divider" title=""></a> + </li> + <li > + <a href="sequences.html" title="Sequences">Sequences</a> + </li> + <li > + <a href="joins.html" title="Joins">Joins</a> + </li> + 
<li > + <a href="subqueries.html" title="Subqueries">Subqueries</a> + </li> + </ul> + </div> + <div class="span3 bottom-description"> + <form action="http://search-hadoop.com/?" method="get"><input value="Phoenix" name="fc_project" type="hidden"><input placeholder="Search Phoenix…" required="required" style="width:170px;" size="18" name="q" id="query" type="search"></form> + </div> + </div> + </div> + </footer> + + <div class="container subfooter"> + <div class="row"> + <div class="span12"> + <p class="pull-right"><a href="#">Back to top</a></p> + <p class="copyright">Copyright ©2016 <a href="http://www.apache.org">Apache Software Foundation</a>. All Rights Reserved.</p> + </div> + </div> + </div> + + <!-- Le javascript + ================================================== --> + <!-- Placed at the end of the document so the pages load faster --> + <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script> + + <script src="//netdna.bootstrapcdn.com/twitter-bootstrap/2.3.2/js/bootstrap.min.js"></script> + <script src="./js/lightbox.js"></script> + <script src="./js/jquery.smooth-scroll.min.js"></script> + <!-- back button support for smooth scroll --> + <script src="./js/jquery.ba-bbq.min.js"></script> + <script src="//yandex.st/highlightjs/7.5/highlight.min.js"></script> + + <script src="./js/reflow-skin.js"></script> + + </body> +</html> Added: phoenix/site/source/src/site/markdown/atomic_upsert.md URL: http://svn.apache.org/viewvc/phoenix/site/source/src/site/markdown/atomic_upsert.md?rev=1772276&view=auto ============================================================================== --- phoenix/site/source/src/site/markdown/atomic_upsert.md (added) +++ phoenix/site/source/src/site/markdown/atomic_upsert.md Thu Dec 1 21:09:53 2016 @@ -0,0 +1,58 @@ +# Atomic Upsert + +To support atomic upsert, an optional ON DUPLICATE KEY clause, similar to the MySQL syntax, has been +incorporated into the UPSERT VALUES command as of Phoenix 4.9. 
The general syntax is described +[here](language/index.html#upsert_values). This feature provides a superset of the HBase Increment and +CheckAndPut functionality to enable atomic upserts. On the server side, when the commit +is processed, the row being updated will be locked while the current column values are read and the +ON DUPLICATE KEY clause is executed. Given that the row must be locked and read when the ON DUPLICATE KEY +clause is used, there will be a performance penalty (much as an HBase CheckAndPut costs more than a plain Put). + +In the presence of the ON DUPLICATE KEY clause, if the row already exists, the VALUES specified will +be ignored and instead either: + +* the row will not be updated if ON DUPLICATE KEY IGNORE is specified or +* the row will be updated (under lock) by executing the expressions following the ON DUPLICATE KEY UPDATE +clause. + +Multiple UPSERT statements for the same row in the same commit batch will be processed in the order of their +execution. Thus the same result is produced whether auto commit is on or off. 
+ +## Examples + +For example, to atomically increment two counter columns, you would execute the following command: + + UPSERT INTO my_table(id, counter1, counter2) VALUES ('abc', 0, 0) + ON DUPLICATE KEY UPDATE counter1 = counter1 + 1, counter2 = counter2 + 1; + +To write a column value only if the row doesn't yet exist: + + UPSERT INTO my_table(id, my_col) VALUES ('abc', 100) + ON DUPLICATE KEY IGNORE; + +Note that arbitrarily complex expressions may be used in this new clause: + + UPSERT INTO my_table(id, total_deal_size, deal_size) VALUES ('abc', 0, 100) + ON DUPLICATE KEY UPDATE + total_deal_size = total_deal_size + deal_size, + approval_reqd = CASE WHEN total_deal_size < 100 THEN 'NONE' + WHEN total_deal_size < 1000 THEN 'MANAGER APPROVAL' + ELSE 'VP APPROVAL' END; + +## Limitations + +The following limitations are enforced for the ON DUPLICATE KEY clause usage: + +* Primary key columns may not be updated, since this would essentially be creating a *new* row. +* Transactional tables may not use this clause as atomic upserts are already possible through + exception handling when a conflict occurs. +* Immutable tables may not use this clause as by definition there should be no updates to + existing rows. +* The CURRENT_SCN property may not be set on connection when this clause is used as HBase + does not handle atomicity unless the latest value is being updated. +* The same column should not be updated more than once in the same statement. +* No aggregation or references to sequences are allowed within the clause. +* Although global indexes on columns being atomically updated are supported, it's not recommended, + since a separate RPC across the wire could potentially be made while the row is under lock to + maintain the secondary index. +
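The ON DUPLICATE KEY behavior described in the added page can be modelled outside Phoenix to make the three cases concrete (new row, IGNORE, UPDATE). The following is a minimal Python sketch, not Phoenix code: `ToyTable`, `upsert`, and the `on_duplicate` argument are hypothetical names invented for illustration. Evaluating every UPDATE expression against a snapshot of the pre-update row is an assumption of this sketch, and row locking is not modelled at all.

```python
from typing import Callable, Dict, Optional, Union

Row = Dict[str, object]
# Either the string "IGNORE", or a mapping of column name -> expression
# evaluated over the current row (a stand-in for ON DUPLICATE KEY UPDATE).
OnDuplicate = Union[str, Dict[str, Callable[[Row], object]]]


class ToyTable:
    """In-memory stand-in for a table keyed by primary key (illustrative only)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Row] = {}

    def upsert(self, key: str, values: Row,
               on_duplicate: Optional[OnDuplicate] = None) -> None:
        existing = self.rows.get(key)
        if existing is None:
            # Row does not exist yet: the VALUES are written.
            self.rows[key] = dict(values)
            return
        if on_duplicate is None:
            # Plain UPSERT without the clause: overwrite the given columns.
            existing.update(values)
            return
        if on_duplicate == "IGNORE":
            # Row exists and IGNORE was requested: leave it untouched.
            return
        # Row exists and UPDATE was requested: VALUES are ignored and the
        # expressions run instead. Assumption of this sketch: each expression
        # sees a snapshot of the row as it was before the statement.
        snapshot = dict(existing)
        for column, expression in on_duplicate.items():
            existing[column] = expression(snapshot)


table = ToyTable()
increment = {
    "counter1": lambda row: row["counter1"] + 1,
    "counter2": lambda row: row["counter2"] + 1,
}
# Three statements for the same row, applied in order of execution.
for _ in range(3):
    table.upsert("abc", {"counter1": 0, "counter2": 0}, increment)

# The second statement is ignored because the row already exists.
table.upsert("xyz", {"my_col": 100}, "IGNORE")
table.upsert("xyz", {"my_col": 200}, "IGNORE")
```

In this model the first of the three counter statements creates the row from its VALUES and the next two increment it, which mirrors the page's statement that VALUES are ignored once the row exists.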