http://git-wip-us.apache.org/repos/asf/kudu-site/blob/9b792926/docs/prior_release_notes.html ---------------------------------------------------------------------- diff --git a/docs/prior_release_notes.html b/docs/prior_release_notes.html index 24029cb..7e5cdee 100644 --- a/docs/prior_release_notes.html +++ b/docs/prior_release_notes.html @@ -115,7 +115,228 @@ limitations under the License. <div class="col-md-9"> <h1>Apache Kudu Prior Version Release Notes</h1> - <div class="sect1"> + <div id="preamble"> +<div class="sectionbody"> +<div class="paragraph"> +<p>This section reproduces the release notes for new features and incompatible +changes in prior releases of Apache Kudu.</p> +</div> +<div class="admonitionblock note"> +<table> +<tr> +<td class="icon"> +<i class="fa icon-note" title="Note"></i> +</td> +<td class="content"> +The list of known issues and limitations for prior releases are not +reproduced on this page. Please consult the +<a href="http://kudu.apache.org/releases/">documentation of the appropriate release</a> +for a list of known issues and limitations. +</td> +</tr> +</table> +</div> +</div> +</div> +<div class="sect1"> +<h2 id="rn_1.1.0"><a class="link" href="#rn_1.1.0">Release notes specific to 1.1.0</a></h2> +<div class="sectionbody"> + +</div> +</div> +<div class="sect1"> +<h2 id="rn_1.1.0_new_features"><a class="link" href="#rn_1.1.0_new_features">New features</a></h2> +<div class="sectionbody"> +<div class="ulist"> +<ul> +<li> +<p>The Python client has been brought up to feature parity with the Java and C++ clients +and as such the package version will be brought to 1.1 with this release (from 0.3). 
A +list of the highlights can be found below.</p> +<div class="ulist"> +<ul> +<li> +<p>Improved Partial Row semantics</p> +</li> +<li> +<p>Range partition support</p> +</li> +<li> +<p>Scan Token API</p> +</li> +<li> +<p>Enhanced predicate support</p> +</li> +<li> +<p>Support for all Kudu data types (including a mapping of Python’s <code>datetime.datetime</code> to +<code>UNIXTIME_MICROS</code>)</p> +</li> +<li> +<p>Alter table support</p> +</li> +<li> +<p>Enabled Read at Snapshot for Scanners</p> +</li> +<li> +<p>Enabled Scanner Replica Selection</p> +</li> +<li> +<p>A few bug fixes for Python 3 in addition to various other improvements.</p> +</li> +</ul> +</div> +</li> +<li> +<p>IN LIST predicate pushdown support was added to allow optimized execution of filters which +match on a set of column values. Support for Spark, Map Reduce and Impala queries utilizing +IN LIST pushdown is not yet complete.</p> +</li> +<li> +<p>The Java client now features client-side request tracing in order to help troubleshoot timeouts. +Error messages are now augmented with traces that show which servers were contacted before the +timeout occurred instead of just the last error. The traces also contain RPCs that were +required to fulfill the client’s request, such as contacting the master to discover a tablet’s +location. Note that the traces are not available for successful requests and are not +programmatically queryable.</p> +</li> +</ul> +</div> +</div> +</div> +<div class="sect1"> +<h2 id="_optimizations_and_improvements"><a class="link" href="#_optimizations_and_improvements">Optimizations and improvements</a></h2> +<div class="sectionbody"> +<div class="ulist"> +<ul> +<li> +<p>Kudu now publishes JAR files for Spark 2.0 compiled with Scala 2.11 along with the +existing Spark 1.6 JAR compiled with Scala 2.10.</p> +</li> +<li> +<p>The Java client now allows configuring scanners to read from the closest replica instead of +the known leader replica. The default remains the latter.
Use the relevant <code>ReplicaSelection</code> +enum with the scanner’s builder to change this behavior.</p> +</li> +<li> +<p>Tablet servers use a new policy for retaining write-ahead log (WAL) segments. +Previously, servers used the 'log_min_segments_to_retain' flag to prioritize +any flushes which were retaining log segments past the configured value (default 2). +This policy caused servers to flush in-memory data more frequently than necessary, +limiting write performance.</p> +<div class="paragraph"> +<p>The new policy introduces a new flag 'log_target_replay_size_mb' which + determines the threshold at which write-ahead log retention will prioritize flushes. + The new flag is considered experimental and users should not need to modify + its value.</p> +</div> +<div class="paragraph"> +<p>The improved policy has been seen to improve write performance in some use cases + by a factor of 2x relative to the old policy.</p> +</div> +</li> +<li> +<p>Kudu’s implementation of the Raft consensus algorithm has been improved to include +a "pre-election" phase. 
This can improve the stability of tablet leader election +in high-load scenarios, especially if each server hosts a high number of tablets.</p> +</li> +<li> +<p>Tablet server start-up time has been substantially improved in the case that +the server contains a high number of tombstoned tablet replicas.</p> +</li> +</ul> +</div> +<div class="sect2"> +<h3 id="_command_line_tools"><a class="link" href="#_command_line_tools">Command line tools</a></h3> +<div class="ulist"> +<ul> +<li> +<p>The tool <code>kudu tablet leader_step_down</code> has been added to manually force a leader to step down.</p> +</li> +<li> +<p>The tool <code>kudu remote_replica copy</code> has been added to manually copy a replica from +one running tablet server to another.</p> +</li> +<li> +<p>The tool <code>kudu local_replica delete</code> has been added to delete a replica of a tablet.</p> +</li> +<li> +<p>The <code>kudu test loadgen</code> tool has been added to replace the obsoleted +<code>insert-generated-rows</code> standalone binary. The new tool is enriched with +additional functionality and can be used to run load generation tests against +a Kudu cluster.</p> +</li> +</ul> +</div> +</div> +</div> +</div> +<div class="sect1"> +<h2 id="_wire_protocol_compatibility"><a class="link" href="#_wire_protocol_compatibility">Wire protocol compatibility</a></h2> +<div class="sectionbody"> +<div class="paragraph"> +<p>Kudu 1.1.0 is wire-compatible with previous versions of Kudu:</p> +</div> +<div class="ulist"> +<ul> +<li> +<p>Kudu 1.1 clients may connect to servers running Kudu 1.0. If the client uses the new +'IN LIST' predicate type, an error will be returned.</p> +</li> +<li> +<p>Kudu 1.0 clients may connect to servers running Kudu 1.1 without limitations.</p> +</li> +<li> +<p>Rolling upgrade between Kudu 1.0 and Kudu 1.1 servers is believed to be possible +though has not been sufficiently tested. 
Users are encouraged to shut down all nodes +in the cluster, upgrade the software, and then restart the daemons on the new version.</p> +</li> +</ul> +</div> +</div> +</div> +<div class="sect1"> +<h2 id="rn_1.1.0_incompatible_changes"><a class="link" href="#rn_1.1.0_incompatible_changes">Incompatible changes in Kudu 1.1.0</a></h2> +<div class="sectionbody"> +<div class="sect2"> +<h3 id="_client_apis_c_java_python"><a class="link" href="#_client_apis_c_java_python">Client APIs (C++/Java/Python)</a></h3> +<div class="ulist"> +<ul> +<li> +<p>The C++ client no longer requires the +<a href="https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html">old gcc5 ABI</a>. +Which ABI is actually used depends on the compiler configuration. Some new distros +(e.g. Ubuntu 16.04) will use the new ABI. Your application must use the same ABI as is +used by the client library; an easy way to guarantee this is to use the same compiler +to build both.</p> +</li> +<li> +<p>The C++ client’s <code>KuduSession::CountBufferedOperations()</code> method is +deprecated. Its behavior is inconsistent unless the session runs in the +<code>MANUAL_FLUSH</code> mode. Instead, to get number of buffered operations, count +invocations of the <code>KuduSession::Apply()</code> method since last +<code>KuduSession::Flush()</code> call or, if using asynchronous flushing, since last +invocation of the callback passed into <code>KuduSession::FlushAsync()</code>.</p> +</li> +<li> +<p>The Java client’s <code>OperationResponse.getWriteTimestamp</code> method was renamed to <code>getWriteTimestampRaw</code> +to emphasize that it doesn’t return milliseconds, unlike what its Javadoc indicated. 
The renamed +method was also hidden from the public APIs and should not be used.</p> +</li> +<li> +<p>The Java client’s sync API (<code>KuduClient</code>, <code>KuduSession</code>, <code>KuduScanner</code>) used to throw either +a <code>NonRecoverableException</code> or a <code>TimeoutException</code> for a timeout, and now it’s only possible for the +client to throw the former.</p> +</li> +<li> +<p>The Java client’s handling of errors in <code>KuduSession</code> was modified so that subclasses of +<code>KuduException</code> are converted into RowErrors instead of being thrown.</p> +</li> +</ul> +</div> +</div> +</div> +</div> +<div class="sect1"> <h2 id="rn_1.0.1"><a class="link" href="#rn_1.0.1">Release notes specific to 1.0.1</a></h2> <div class="sectionbody"> <div class="paragraph"> @@ -232,7 +453,7 @@ This can provide higher throughput for ingest workloads.</p> </div> </div> <div class="sect2"> -<h3 id="_optimizations_and_improvements"><a class="link" href="#_optimizations_and_improvements">Optimizations and improvements</a></h3> +<h3 id="_optimizations_and_improvements_2"><a class="link" href="#_optimizations_and_improvements_2">Optimizations and improvements</a></h3> <div class="ulist"> <ul> <li> @@ -263,7 +484,7 @@ that temporarily lag behind the other replicas.</p> </div> </div> <div class="sect2"> -<h3 id="_wire_protocol_compatibility"><a class="link" href="#_wire_protocol_compatibility">Wire protocol compatibility</a></h3> +<h3 id="_wire_protocol_compatibility_2"><a class="link" href="#_wire_protocol_compatibility_2">Wire protocol compatibility</a></h3> <div class="paragraph"> <p>Kudu 1.0.0 maintains client-server wire-compatibility with previous releases. 
Applications using the Kudu client libraries may be upgraded either @@ -278,7 +499,7 @@ Kudu 1.0.0 are not supported.</p> <div class="sect2"> <h3 id="rn_1.0.0_incompatible_changes"><a class="link" href="#rn_1.0.0_incompatible_changes">Incompatible changes in Kudu 1.0.0</a></h3> <div class="sect3"> -<h4 id="_command_line_tools"><a class="link" href="#_command_line_tools">Command line tools</a></h4> +<h4 id="_command_line_tools_2"><a class="link" href="#_command_line_tools_2">Command line tools</a></h4> <div class="ulist"> <ul> <li> @@ -331,7 +552,7 @@ and without notice in future Kudu releases.</p> </div> </div> <div class="sect3"> -<h4 id="_client_apis_c_java_python"><a class="link" href="#_client_apis_c_java_python">Client APIs (C++/Java/Python)</a></h4> +<h4 id="_client_apis_c_java_python_2"><a class="link" href="#_client_apis_c_java_python_2">Client APIs (C++/Java/Python)</a></h4> <div class="ulist"> <ul> <li> @@ -399,7 +620,7 @@ for Kudu 0.10.0</a> and <a href="https://github.com/apache/kudu/compare/0.9.1... 
changes between 0.9.1 and 0.10.0</a>.</p> </div> <div class="paragraph"> -<p>To upgrade to Kudu 0.10.0, see <a href="#rn_0.10.0_upgrade">Upgrading from 0.9.x to 0.10.0</a>.</p> +<p>To upgrade to Kudu 0.10.0, see <a href="#rn_0.10.0_upgrade">[rn_0.10.0_upgrade]</a>.</p> </div> <div class="sect2"> <h3 id="rn_0.10.0_incompatible_changes"><a class="link" href="#rn_0.10.0_incompatible_changes">Incompatible changes and deprecated APIs in 0.10.0</a></h3> @@ -614,77 +835,6 @@ should not be visible to users.</p> </ul> </div> </div> -<div class="sect2"> -<h3 id="rn_0.10.0_upgrade"><a class="link" href="#rn_0.10.0_upgrade">Upgrading from 0.9.x to 0.10.0</a></h3> -<div class="paragraph"> -<p>Before upgrading, see <a href="#rn_0.10.0_incompatible_changes">Incompatible changes and deprecated APIs in 0.10.0</a> and -<a href="#rn_0.10.0_downgrade">Downgrading from 0.10.0 to 0.9.x</a>.</p> -</div> -<div class="paragraph"> -<p>To upgrade from Kudu 0.9.x to Kudu 0.10.0, perform the following high-level -steps, which are detailed in the installation guide under -<a href="installation.html#upgrade_procedure">Upgrade Procedure</a>:</p> -</div> -<div class="olist arabic"> -<ol class="arabic"> -<li> -<p>Shut down all Kudu services.</p> -</li> -<li> -<p>Install the new Kudu packages or parcels, or install Kudu 0.10.0 from source.</p> -</li> -<li> -<p>Restart all Kudu services.</p> -</li> -</ol> -</div> -<div class="admonitionblock warning"> -<table> -<tr> -<td class="icon"> -<i class="fa icon-warning" title="Warning"></i> -</td> -<td class="content"> -Rolling upgrades are not supported when upgrading from Kudu 0.9.x to -0.10.0 and they are known to cause errors in this release. If you run into a -problem after an accidental rolling upgrade, shut down all services and then -restart all services and the system should come up properly. 
-</td> -</tr> -</table> -</div> -<div class="admonitionblock note"> -<table> -<tr> -<td class="icon"> -<i class="fa icon-note" title="Note"></i> -</td> -<td class="content"> -For the duration of the Kudu Beta, upgrade instructions are generally -only given for going from the previous latest version to the newly released -version. -</td> -</tr> -</table> -</div> -</div> -<div class="sect2"> -<h3 id="rn_0.10.0_downgrade"><a class="link" href="#rn_0.10.0_downgrade">Downgrading from 0.10.0 to 0.9.x</a></h3> -<div class="paragraph"> -<p>After upgrading to Kudu 0.10.0, it is possible to downgrade to 0.9.x with the -following exceptions:</p> -</div> -<div class="olist arabic"> -<ol class="arabic"> -<li> -<p>Tables created in 0.10.0 will not be accessible after a downgrade to 0.9.x</p> -</li> -<li> -<p>A multi-master setup formatted in 0.10.0 may not be downgraded to 0.9.x</p> -</li> -</ol> -</div> -</div> </div> </div> <div class="sect1"> @@ -701,30 +851,6 @@ for Kudu 0.9.1</a> and <a href="https://github.com/apache/kudu/compare/0.9.0...0 changes between 0.9.0 and 0.9.1</a>.</p> </div> <div class="sect2"> -<h3 id="rn_0.9.1_upgrade"><a class="link" href="#rn_0.9.1_upgrade">Upgrading from 0.9.0 to 0.9.1</a></h3> -<div class="paragraph"> -<p>Before upgrading to Kudu 0.9.1 from Kudu 0.8.0, please read the <a href="#rn_0.9.0">Release notes specific to 0.9.0</a>.</p> -</div> -<div class="paragraph"> -<p>Upgrading from 0.8.0 or 0.9.0 to 0.9.1 is supported. To upgrade from Kudu 0.8.0 -or Kudu 0.9.0 to Kudu 0.9.1, use the procedure documented in <a href="#rn_0.9.0_upgrade">Upgrading from 0.8.0 to 0.9.x</a>.</p> -</div> -<div class="admonitionblock note"> -<table> -<tr> -<td class="icon"> -<i class="fa icon-note" title="Note"></i> -</td> -<td class="content"> -For the duration of the Kudu Beta, upgrade instructions are generally -only given for going from the previous latest version to the newly released -version. 
-</td> -</tr> -</table> -</div> -</div> -<div class="sect2"> <h3 id="rn_0.9.1_fixed_issues"><a class="link" href="#rn_0.9.1_fixed_issues">Fixed Issues</a></h3> <div class="ulist"> <ul> @@ -763,7 +889,7 @@ for Kudu 0.9.0</a> and <a href="https://github.com/apache/kudu/compare/0.8.0...0 changes between 0.8.0 and 0.9.0</a>.</p> </div> <div class="paragraph"> -<p>To upgrade to Kudu 0.10.0, see <a href="#rn_0.9.0_upgrade">Upgrading from 0.8.0 to 0.9.x</a>.</p> +<p>To upgrade to Kudu 0.10.0, see <a href="#rn_0.9.0_upgrade">[rn_0.9.0_upgrade]</a>.</p> </div> <div class="sect2"> <h3 id="rn_0.9.0_incompatible_changes"><a class="link" href="#rn_0.9.0_incompatible_changes">Incompatible changes</a></h3> @@ -890,52 +1016,6 @@ values will provide better throughput for write-heavy applications on typical se </ul> </div> </div> -<div class="sect2"> -<h3 id="rn_0.9.0_upgrade"><a class="link" href="#rn_0.9.0_upgrade">Upgrading from 0.8.0 to 0.9.x</a></h3> -<div class="paragraph"> -<p>Before upgrading, see <a href="#rn_0.9.0_incompatible_changes">Incompatible changes</a> and -<a href="#rn_0.9.0_client_compatibility">Client compatibility</a>. 
To upgrade from Kudu 0.8.0 to 0.9.0, perform -the following high-level steps, which are detailed in the installation guide -under <a href="installation.html#upgrade_procedure">Upgrade Procedure</a>:</p> -</div> -<div class="olist arabic"> -<ol class="arabic"> -<li> -<p>Shut down all Kudu services.</p> -</li> -<li> -<p>Install the new Kudu packages or parcels, or install Kudu 0.9.1 from source.</p> -</li> -<li> -<p>Restart all Kudu services.</p> -</li> -</ol> -</div> -<div class="paragraph"> -<p>It is technically possible to upgrade Kudu using rolling restarts, but it has not -been tested and is not recommended.</p> -</div> -<div class="admonitionblock note"> -<table> -<tr> -<td class="icon"> -<i class="fa icon-note" title="Note"></i> -</td> -<td class="content"> -For the duration of the Kudu Beta, upgrade instructions are only given for going -from the previous latest version to the newest. -</td> -</tr> -</table> -</div> -</div> -<div class="sect2"> -<h3 id="rn_0.9.0_client_compatibility"><a class="link" href="#rn_0.9.0_client_compatibility">Client compatibility</a></h3> -<div class="paragraph"> -<p>Masters and tablet servers should be upgraded before clients are upgraded. For specific -information about client compatibility, see the <a href="#rn_0.9.0_incompatible_changes">Incompatible changes</a> section.</p> -</div> -</div> </div> </div> <div class="sect1"> @@ -1227,24 +1307,6 @@ previous link and link:http://developerblog.redhat.com/2015/02/05/gcc5-and-the-c </ul> </div> </div> -<div class="sect2"> -<h3 id="_limitations"><a class="link" href="#_limitations">Limitations</a></h3> -<div class="paragraph"> -<p>See also <a href="#beta_limitations">Limitations of the Kudu Public Beta</a>. 
Where applicable, this list adds to or overrides that -list.</p> -</div> -<div class="sect3"> -<h4 id="_operating_system_limitations"><a class="link" href="#_operating_system_limitations">Operating System Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>Kudu 0.7 is known to work on RHEL 7 or 6.4 or newer, CentOS 7 or 6.4 or newer, Ubuntu -Trusty, and SLES 12. Other operating systems may work but have not been tested.</p> -</li> -</ul> -</div> -</div> -</div> </div> </div> <div class="sect1"> @@ -1276,338 +1338,14 @@ instructions in <a href="installation.html#osx_from_source">OS X</a>.</p> </li> </ul> </div> -<div class="sect2"> -<h3 id="_limitations_2"><a class="link" href="#_limitations_2">Limitations</a></h3> -<div class="paragraph"> -<p>See also <a href="#beta_limitations">Limitations of the Kudu Public Beta</a>. Where applicable, this list adds to or overrides that -list.</p> -</div> -<div class="sect3"> -<h4 id="_operating_system_limitations_2"><a class="link" href="#_operating_system_limitations_2">Operating System Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>Kudu 0.6 is known to work on RHEL 6.4 or newer, CentOS 6.4 or newer, and Ubuntu -Trusty. Other operating systems may work but have not been tested.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_api_limitations"><a class="link" href="#_api_limitations">API Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>The Python client is still considered experimental.</p> -</li> -</ul> -</div> -</div> -</div> </div> </div> <div class="sect1"> <h2 id="rn_0.5.0"><a class="link" href="#rn_0.5.0">Release Notes Specific to 0.5.0</a></h2> <div class="sectionbody"> -<div class="sect2"> -<h3 id="_limitations_3"><a class="link" href="#_limitations_3">Limitations</a></h3> -<div class="paragraph"> -<p>See also <a href="#beta_limitations">Limitations of the Kudu Public Beta</a>. 
Where applicable, this list adds to or overrides that -list.</p> -</div> -<div class="sect3"> -<h4 id="_operating_system_limitations_3"><a class="link" href="#_operating_system_limitations_3">Operating System Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>Kudu 0.5 is known to work on RHEL 7 or 6.4 or newer, CentOS 7 or 6.4 or newer, Ubuntu -Trusty, and SLES 12. Other operating systems may work but have not been tested.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_api_limitations_2"><a class="link" href="#_api_limitations_2">API Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>The Python client is considered experimental.</p> -</li> -</ul> -</div> -</div> -</div> -</div> -</div> -<div class="sect1"> -<h2 id="_about_the_kudu_public_beta"><a class="link" href="#_about_the_kudu_public_beta">About the Kudu Public Beta</a></h2> -<div class="sectionbody"> -<div class="paragraph"> -<p>Releases of Apache Kudu prior to 1.0 are considered beta. Do not run beta releases on production clusters. -During the public beta period, Kudu will be supported via a -<a href="https://issues.cloudera.org/projects/KUDU">public JIRA</a> and a public -<a href="http://mail-archives.apache.org/mod_mbox/kudu-user/">mailing list</a>, which will be -monitored by the Kudu development team and community members. Commercial support -is not available at this time.</p> -</div> -<div class="ulist"> -<ul> -<li> -<p>You can submit any issues or feedback related to your Kudu experience via either -the JIRA system or the mailing list. The Kudu development team and community members -will respond and assist as quickly as possible.</p> -</li> -<li> -<p>The Kudu team will work with early adopters to fix bugs and release new binary drops -when fixes or features are ready. 
However, we cannot commit to issue resolution or -bug fix delivery times during the public beta period, and it is possible that some -fixes or enhancements will not be selected for a release.</p> -</li> -<li> -<p>We can’t guarantee time frames or contents for future beta code drops. However, -they will be announced to the user group when they occur.</p> -</li> -<li> -<p>No guarantees are made regarding upgrades from this release to follow-on releases. -While multiple drops of beta code are planned, we can’t guarantee their schedules -or contents.</p> -</li> -</ul> -</div> -<div class="sect2"> -<h3 id="beta_limitations"><a class="link" href="#beta_limitations">Limitations of the Kudu Public Beta</a></h3> -<div class="paragraph"> -<p>Items in this list may be amended or superseded by limitations listed in the release -notes for specific Kudu releases above.</p> -</div> -<div class="sect3"> -<h4 id="_schema_limitations"><a class="link" href="#_schema_limitations">Schema Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>Kudu is primarily designed for analytic use cases and, in the beta release, -you are likely to encounter issues if a single row contains multiple kilobytes of data.</p> -</li> -<li> -<p>The columns which make up the primary key must be listed first in the schema.</p> -</li> -<li> -<p>Key columns cannot be altered. You must drop and recreate a table to change its keys.</p> -</li> -<li> -<p>Key columns must not be null.</p> -</li> -<li> -<p>Columns with <code>DOUBLE</code>, <code>FLOAT</code>, or <code>BOOL</code> types are not allowed as part of a -primary key definition.</p> -</li> -<li> -<p>Type and nullability of existing columns cannot be changed by altering the table.</p> -</li> -<li> -<p>A table’s primary key cannot be changed.</p> -</li> -<li> -<p>Dropping a column does not immediately reclaim space. Compaction must run first.
-There is no way to run compaction manually, but dropping the table will reclaim the -space immediately.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_ingest_limitations"><a class="link" href="#_ingest_limitations">Ingest Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>Ingest via Sqoop or Flume is not supported in the public beta. The recommended -approach for bulk ingest is to use Impala’s <code>CREATE TABLE AS SELECT</code> functionality -or use the Kudu Java or C++ API.</p> -</li> -<li> -<p>Tables must be manually pre-split into tablets using simple or compound primary -keys. Automatic splitting is not yet possible. See -<a href="schema_design.html">Schema Design</a>.</p> -</li> -<li> -<p>Tablets cannot currently be merged. Instead, create a new table with the contents -of the old tables to be merged.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_replication_and_backup_limitations"><a class="link" href="#_replication_and_backup_limitations">Replication and Backup Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>Replication and failover of Kudu masters is considered experimental. It is -recommended to run a single master and periodically perform a manual backup of -its data directories.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_impala_limitations"><a class="link" href="#_impala_limitations">Impala Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>To use Kudu with Impala, you must install a special release of Impala called -Impala_Kudu. Obtaining and installing a compatible Impala release is detailed in Kudu’s -<a href="kudu_impala_integration.html">Impala Integration</a> documentation.</p> -</li> -<li> -<p>To use Impala_Kudu alongside an existing Impala instance, you must install using parcels.</p> -</li> -<li> -<p>Updates, inserts, and deletes via Impala are non-transactional.
If a query -fails part of the way through, its partial effects will not be rolled back.</p> -</li> -<li> -<p>All queries will be distributed across all Impala hosts which host a replica -of the target table(s), even if a predicate on a primary key could correctly -restrict the query to a single tablet. This limits the maximum concurrency of -short queries made via Impala.</p> -</li> -<li> -<p>No timestamp and decimal type support.</p> -</li> -<li> -<p>The maximum parallelism of a single query is limited to the number of tablets -in a table. For good analytic performance, aim for 10 or more tablets per host -or use large tables.</p> -</li> -<li> -<p>Impala is only able to push down predicates involving <code>=</code>, <code><=</code>, <code>>=</code>, -or <code>BETWEEN</code> comparisons between any column and a literal value, and <code><</code> and <code>></code> -for integer columns only. For example, for a table with an integer key <code>ts</code>, and -a string key <code>name</code>, the predicate <code>WHERE ts >= 12345</code> will convert into an -efficient range scan, whereas <code>where name > 'lipcon'</code> will currently fetch all -data from the table and evaluate the predicate within Impala.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_security_limitations"><a class="link" href="#_security_limitations">Security Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>Authentication and authorization are not included in the public beta.</p> -</li> -<li> -<p>Data encryption is not included in the public beta.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_client_and_api_limitations"><a class="link" href="#_client_and_api_limitations">Client and API Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>Potentially-incompatible C++, Java and Python API changes may be required during the -public beta.</p> -</li> -<li> -<p><code>ALTER TABLE</code> is not yet fully supported via the client APIs.
More <code>ALTER TABLE</code> -operations will become available in future betas.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_application_integration_limitations"><a class="link" href="#_application_integration_limitations">Application Integration Limitations</a></h4> -<div class="ulist"> -<ul> -<li> -<p>The Spark DataFrame implementation is not yet complete.</p> -</li> -</ul> -</div> -</div> -<div class="sect3"> -<h4 id="_other_known_issues"><a class="link" href="#_other_known_issues">Other Known Issues</a></h4> -<div class="paragraph"> -<p>The following are known bugs and issues with the current release of Kudu. They will -be addressed in later beta releases.</p> -</div> -<div class="ulist"> -<ul> -<li> -<p>If the Kudu master is configured with the <code>-log_fsync_all</code> option, tablet servers -and clients will experience frequent timeouts, and the cluster may become unusable.</p> -</li> -<li> -<p>If a tablet server has a very large number of tablets, it may take several minutes -to start up. It is recommended to limit the number of tablets per server to 100 or fewer. -Consider this limitation when pre-splitting your tables. 
If you notice slow start-up times, -you can monitor the number of tablets per server in the web UI.</p> -</li> -</ul> -</div> -</div> -</div> -</div> -</div> -<div class="sect1"> -<h2 id="_resources"><a class="link" href="#_resources">Resources</a></h2> -<div class="sectionbody"> -<div class="ulist"> -<ul> -<li> -<p><a href="http://getkudu.io">Kudu Website</a></p> -</li> -<li> -<p><a href="http://github.com/apache/kudu">Kudu GitHub Repository</a></p> -</li> -<li> -<p><a href="index.html">Kudu Documentation</a></p> -</li> -</ul> -</div> -</div> -</div> -<div class="sect1"> -<h2 id="_installation_options"><a class="link" href="#_installation_options">Installation Options</a></h2> -<div class="sectionbody"> -<div class="ulist"> -<ul> -<li> -<p>A Quickstart VM is provided to get you up and running quickly.</p> -</li> -<li> -<p>You can install Kudu using provided deb/yum packages.</p> -</li> -<li> -<p>You can install Kudu, in clusters managed by Cloudera Manager, using parcels or deb/yum packages.</p> -</li> -<li> -<p>You can build Kudu from source.</p> -</li> -</ul> -</div> <div class="paragraph"> -<p>For full installation details, see <a href="installation.html">Kudu Installation</a>.</p> -</div> -</div> -</div> -<div class="sect1"> -<h2 id="_next_steps"><a class="link" href="#_next_steps">Next Steps</a></h2> -<div class="sectionbody"> -<div class="ulist"> -<ul> -<li> -<p><a href="quickstart.html">Kudu Quickstart</a></p> -</li> -<li> -<p><a href="installation.html">Installing Kudu</a></p> -</li> -<li> -<p><a href="configuration.html">Configuring Kudu</a></p> -</li> -</ul> +<p>Kudu 0.5.0 was the first public release. 
As such, no improvements or changes were +noted in its release notes.</p> </div> </div> </div> @@ -1675,6 +1413,10 @@ you can monitor the number of tablets per server in the web UI.</p> </li> <li> + <a href="known_issues.html">Known Issues and Limitations</a> + </li> + <li> + <a href="export_control.html">Export Control Notice</a> </li> </ul> @@ -1684,7 +1426,7 @@ you can monitor the number of tablets per server in the web UI.</p> </div> <footer class="footer"> <p class="small"> - Copyright © 2016 The Apache Software Foundation. Last updated 2016-11-14 15:52:59 PST + Copyright © 2016 The Apache Software Foundation. Last updated 2017-01-12 12:48:06 PST </p> </footer> </div>
http://git-wip-us.apache.org/repos/asf/kudu-site/blob/9b792926/docs/quickstart.html ---------------------------------------------------------------------- diff --git a/docs/quickstart.html b/docs/quickstart.html index 190c4b5..32aa2e5 100644 --- a/docs/quickstart.html +++ b/docs/quickstart.html @@ -191,18 +191,32 @@ to consult the <a href="#trouble">Troubleshooting</a> section.</p> <h2 id="_load_data"><a class="link" href="#_load_data">Load Data</a></h2> <div class="sectionbody"> <div class="paragraph"> -<p>To perform some typical operations with Kudu and Impala, you can load the -<a href="http://www.flysfo.com/media/facts-statistics/air-traffic-statistics">SFO Passenger Data</a> -into Impala and then load it into Kudu.</p> +<p>To practice some typical operations with Kudu and Impala, we’ll use the +<a href="https://data.sfgov.org/Transportation/Raw-AVL-GPS-data/5fk7-ivit/data">San Francisco MTA +GPS dataset</a>. This dataset contains raw location data transmitted periodically from +sensors installed on the busses in the SF MTA’s fleet.</p> </div> <div class="olist arabic"> <ol class="arabic"> <li> -<p>Upload the sample data from the home directory to HDFS.</p> +<p>Download the sample data and load it into HDFS</p> +<div class="paragraph"> +<p>First we’ll download the sample dataset, prepare it, and upload it into the HDFS +cluster.</p> +</div> +<div class="paragraph"> +<p>The SF MTA’s site is often a bit slow, so we’ve mirrored a sample CSV file from the +dataset at <a href="http://kudu-sample-data.s3.amazonaws.com/sfmtaAVLRawData01012013.csv.gz" class="bare">http://kudu-sample-data.s3.amazonaws.com/sfmtaAVLRawData01012013.csv.gz</a></p> +</div> +<div class="paragraph"> +<p>The original dataset uses DOS-style line endings, so we’ll convert it to +UNIX-style during the upload process using <code>tr</code>.</p> +</div> <div class="listingblock"> <div class="content"> -<pre class="highlight"><code class="language-bash" data-lang="bash">$ hdfs dfs -mkdir /data -$ hdfs 
dfs -put examples/SFO_Passenger_Data/MonthlyPassengerData_200507_to_201506.csv /data</code></pre> +<pre class="highlight"><code class="language-bash" data-lang="bash">$ wget http://kudu-sample-data.s3.amazonaws.com/sfmtaAVLRawData01012013.csv.gz +$ hdfs dfs -mkdir /sfmta +$ zcat sfmtaAVLRawData01012013.csv.gz | tr -d '\r' | hadoop fs -put - /sfmta/data.csv</code></pre> </div> </div> </li> @@ -219,24 +233,19 @@ in the virtual machine issue the following command:</p> </div> <div class="listingblock"> <div class="content"> -<pre class="highlight"><code class="language-sql" data-lang="sql">CREATE EXTERNAL TABLE passenger_data_raw ( - id int, - activity_period int, - operating_airline string, - airline_iata_code string, - published_airline string, - published_airline_iata_code string, - geo_summary string, - geo_region string, - activity_type_code string, - price_category_code string, - terminal string, - boarding_area string, - passenger_count bigint +<pre class="highlight"><code class="language-sql" data-lang="sql">CREATE EXTERNAL TABLE sfmta_raw ( + revision int, + report_time string, + vehicle_tag int, + longitude float, + latitude float, + speed float, + heading float ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' -LOCATION '/data/';</code></pre> +LOCATION '/sfmta/' +TBLPROPERTIES ('skip.header.line.count'='1');</code></pre> </div> </div> </li> @@ -244,63 +253,53 @@ LOCATION '/data/';</code></pre> <p>To validate that the data was actually loaded, run the following command:</p> <div class="listingblock"> <div class="content"> -<pre class="highlight"><code class="language-sql" data-lang="sql">SELECT count(*) FROM passenger_data_raw; +<pre class="highlight"><code class="language-sql" data-lang="sql">SELECT count(*) FROM sfmta_raw; +----------+ | count(*) | +----------+ -| 13901 | +| 859086 | +----------+</code></pre> </div> </div> </li> <li> -<p>It’s easy to convert data from any Hadoop file format and store it Kudu using the -<code>CREATE TABLE AS SELECT</code> 
statement.</p> +<p>Next we’ll create a Kudu table and load the data. Note that we convert +the string <code>report_time</code> field into a unix-style timestamp for more efficient +storage.</p> <div class="listingblock"> <div class="content"> -<pre class="highlight"><code class="language-sql" data-lang="sql">CREATE TABLE passenger_data -DISTRIBUTE BY HASH (id) INTO 16 BUCKETS -TBLPROPERTIES( -'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler', -'kudu.table_name' = 'passenger_data', -'kudu.master_addresses' = '127.0.0.1', -'kudu.key_columns' = 'id' - ) AS SELECT * FROM passenger_data_raw; - -+-----------------------+ -| summary | -+-----------------------+ -| Inserted 13901 row(s) | -+-----------------------+ -Fetched 1 row(s) in 1.26s</code></pre> -</div> -</div> -</li> -</ol> -</div> -<div class="exampleblock"> -<div class="content"> -<div class="paragraph"> -<p>For <code>CREATE TABLE …​ AS SELECT</code> we currently require that the first columns that are -projected in the <code>SELECT</code> statement correspond to the Kudu table keys and are in the -same order (<code>id</code> in the example above). 
If the default projection generated by <code>*</code> -does not meet this requirement, the user should avoid using <code>*</code> and explicitly mention -the columns to project, in the correct order.</p> -</div> +<pre class="highlight"><code class="language-sql" data-lang="sql">CREATE TABLE sfmta ( + report_time BIGINT NOT NULL, + vehicle_tag STRING NOT NULL, + longitude FLOAT NOT NULL, + latitude FLOAT NOT NULL, + speed FLOAT NOT NULL, + heading FLOAT NOT NULL, + PRIMARY KEY (report_time, vehicle_tag) +) +DISTRIBUTE BY HASH(report_time) INTO 8 BUCKETS +STORED AS KUDU; + +INSERT INTO sfmta SELECT + UNIX_TIMESTAMP(report_time, 'MM/dd/yyyy HH:mm:ss'), + vehicle_tag, + longitude, + latitude, + speed, + heading +FROM sfmta_raw; + +-- Modified 859086 row(s), 0 row error(s) in 8.55s</code></pre> </div> </div> <div class="paragraph"> -<p>+ -The created table uses a simple single column primary key. See +<p>The created table uses a composite primary key. See <a href="kudu_impala_integration.html#kudu_impala">Kudu Impala Integration</a> for a more detailed introduction to the extended SQL syntax for Impala.</p> </div> -<div class="paragraph"> -<p>+ -The columns of the created table are copied from the <code>passenger_data_raw</code> base table. See -<a href="http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/impala_create_table.html">Impala’s -documentation</a> for more details about the extended SQL syntax for Impala.</p> +</li> +</ol> </div> </div> </div> @@ -309,63 +308,48 @@ documentation</a> for more details about the extended SQL syntax for Impala.</p> <div class="sectionbody"> <div class="paragraph"> <p>Now that the data is stored in Kudu, you can run queries against it. 
The following query -lists the airline with the highest passenger volume over the entire reporting timeframe.</p> +finds the data point containing the highest recorded vehicle speed.</p> </div> <div class="listingblock"> <div class="content"> -<pre class="highlight"><code class="language-sql" data-lang="sql">SELECT sum(passenger_count) AS total, operating_airline FROM passenger_data - GROUP BY operating_airline - HAVING total IS NOT null - ORDER BY total DESC LIMIT 10; - -+-----------+----------------------------------+ -| total | operating_airline | -+-----------+----------------------------------+ -| 105363917 | United Airlines - Pre 07/01/2013 | -| 51319845 | United Airlines | -| 32657456 | SkyWest Airlines | -| 31727343 | American Airlines | -| 23801507 | Delta Air Lines | -| 23685267 | Virgin America | -| 22507320 | Southwest Airlines | -| 16235520 | US Airways | -| 11860630 | Alaska Airlines | -| 6706438 | JetBlue Airways | -+-----------+----------------------------------+</code></pre> +<pre class="highlight"><code class="language-sql" data-lang="sql">SELECT * FROM sfmta ORDER BY speed DESC LIMIT 1; + ++-------------+-------------+--------------------+-------------------+-------------------+---------+ +| report_time | vehicle_tag | longitude | latitude | speed | heading | ++-------------+-------------+--------------------+-------------------+-------------------+---------+ +| 1357022342 | 5411 | -122.3968811035156 | 37.76665878295898 | 68.33300018310547 | 82 | ++-------------+-------------+--------------------+-------------------+-------------------+---------+</code></pre> </div> </div> <div class="paragraph"> -<p>Looking at the result, you can already see a problem with the dataset. There is a -duplicate airline name. 
Since the data is stored in Kudu rather than HDFS, you can quickly -change any individual record and fix the problem without having to rewrite the entire -table.</p> +<p>With a quick <a href="https://www.google.com/search?q=122.3968811035156W+37.76665878295898N">Google search</a> +we can see that this bus was traveling east on 16th street at 68MPH. +At first glance, this seems unlikely to be true. Perhaps we do some research +and find that this bus’s sensor equipment was broken and we decide to +remove the data. With Kudu this is very easy to correct using standard +SQL:</p> </div> <div class="listingblock"> <div class="content"> -<pre class="highlight"><code class="language-sql" data-lang="sql">UPDATE passenger_data - SET operating_airline="United Airlines" - WHERE operating_airline LIKE "United Airlines - Pre%"; - -SELECT sum(passenger_count) AS total, operating_airline FROM passenger_data - GROUP BY operating_airline - HAVING total IS NOT null - ORDER BY total DESC LIMIT 10; - -+-----------+--------------------+ -| total | operating_airline | -+-----------+--------------------+ -| 156683762 | United Airlines | -| 32657456 | SkyWest Airlines | -| 31727343 | American Airlines | -| 23801507 | Delta Air Lines | -| 23685267 | Virgin America | -| 22507320 | Southwest Airlines | -| 16235520 | US Airways | -| 11860630 | Alaska Airlines | -| 6706438 | JetBlue Airways | -| 6266220 | Northwest Airlines | -+-----------+--------------------+</code></pre> +<pre class="highlight"><code class="language-sql" data-lang="sql">DELETE FROM sfmta WHERE vehicle_tag = '5411'; + +-- Modified 1169 row(s), 0 row error(s) in 0.25s</code></pre> +</div> +</div> +</div> </div> +<div class="sect1"> +<h2 id="_next_steps"><a class="link" href="#_next_steps">Next steps</a></h2> +<div class="sectionbody"> +<div class="paragraph"> +<p>The above example showed how to load, query, and mutate a static dataset with Impala +and Kudu. 
The real power of Kudu, however, is the ability to ingest and mutate data +in a streaming fashion.</p> +</div> +<div class="paragraph"> +<p>As an exercise to learn the Kudu programmatic APIs, try implementing a program +that uses the <a href="http://www.nextbus.com/xmlFeedDocs/NextBusXMLFeed.pdf">SFMTA +XML data feed</a> to ingest this same dataset in real time into the Kudu table.</p> </div> <div class="sect2"> <h3 id="trouble"><a class="link" href="#trouble">Troubleshooting</a></h3> @@ -418,7 +402,7 @@ contain references to the previous VM’s SSH credentials. Remove any refere </div> </div> <div class="sect1"> -<h2 id="_next_steps"><a class="link" href="#_next_steps">Next Steps</a></h2> +<h2 id="_next_steps_2"><a class="link" href="#_next_steps_2">Next Steps</a></h2> <div class="sectionbody"> <div class="ulist"> <ul> @@ -456,12 +440,13 @@ contain references to the previous VM’s SSH credentials. Remove any refere </ul> </li> <li><a href="#_load_data">Load Data</a></li> -<li><a href="#_read_and_modify_data">Read and Modify Data</a> +<li><a href="#_read_and_modify_data">Read and Modify Data</a></li> +<li><a href="#_next_steps">Next steps</a> <ul class="sectlevel2"> <li><a href="#trouble">Troubleshooting</a></li> </ul> </li> -<li><a href="#_next_steps">Next Steps</a></li> +<li><a href="#_next_steps_2">Next Steps</a></li> </ul> </li> <li> @@ -510,6 +495,10 @@ contain references to the previous VM’s SSH credentials. Remove any refere </li> <li> + <a href="known_issues.html">Known Issues and Limitations</a> + </li> + <li> + <a href="export_control.html">Export Control Notice</a> </li> </ul> @@ -527,7 +516,7 @@ contain references to the previous VM’s SSH credentials. Remove any refere </div> <footer class="footer"> <p class="small"> - Copyright © 2016 The Apache Software Foundation. Last updated 2016-10-25 14:39:46 PDT + Copyright © 2016 The Apache Software Foundation. 
Last updated 2017-01-12 20:06:29 PST </p> </footer> </div> http://git-wip-us.apache.org/repos/asf/kudu-site/blob/9b792926/docs/release_notes.html ---------------------------------------------------------------------- diff --git a/docs/release_notes.html b/docs/release_notes.html index 4fa5119..55999f9 100644 --- a/docs/release_notes.html +++ b/docs/release_notes.html @@ -7,7 +7,7 @@ <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> <meta name="description" content="A new open source Apache Hadoop ecosystem project, Apache Kudu completes Hadoop's storage layer to enable fast analytics on fast data" /> <meta name="author" content="Cloudera" /> - <title>Apache Kudu - Apache Kudu 1.1 Release Notes</title> + <title>Apache Kudu - Apache Kudu 1.2.0 Release Notes</title> <!-- Bootstrap core CSS --> <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" @@ -114,61 +114,65 @@ limitations under the License. <div class="row"> <div class="col-md-9"> -<h1>Apache Kudu 1.1 Release Notes</h1> +<h1>Apache Kudu 1.2.0 Release Notes</h1> <div class="sect1"> -<h2 id="rn_1.1.0_new_features"><a class="link" href="#rn_1.1.0_new_features">New features</a></h2> +<h2 id="rn_1.2.0_new_features"><a class="link" href="#rn_1.2.0_new_features">New features</a></h2> <div class="sectionbody"> <div class="ulist"> <ul> <li> -<p>The Python client has been brought up to feature parity with the Java and C++ clients -and as such the package version will be brought to 1.1 with this release (from 0.3). 
A -list of the highlights can be found below.</p> -<div class="ulist"> -<ul> -<li> -<p>Improved Partial Row semantics</p> -</li> -<li> -<p>Range partition support</p> -</li> -<li> -<p>Scan Token API</p> -</li> -<li> -<p>Enhanced predicate support</p> +<p>Kudu clients and servers now redact user data such as cell values +from log messages, Java exception messages, and <code>Status</code> strings. +User metadata such as table names, column names, and partition +bounds are not redacted.</p> +<div class="paragraph"> +<p>Redaction is enabled by default, but may be disabled by setting the new +<code>log_redact_user_data</code> flag to <code>false</code>.</p> +</div> </li> <li> -<p>Support for all Kudu data types (including a mapping of Python’s <code>datetime.datetime</code> to -<code>UNIXTIME_MICROS</code>)</p> -</li> +<p>Kudu’s ability to provide consistency guarantees has been substantially +improved:</p> +<div class="ulist"> +<ul> <li> -<p>Alter table support</p> +<p>Replicas now correctly track their "safe timestamp". This timestamp +is the maximum timestamp at which reads are guaranteed to be +repeatable.</p> </li> <li> -<p>Enabled Read at Snapshot for Scanners</p> +<p>A scan created using the <code>SCAN_AT_SNAPSHOT</code> mode will now +either wait for the requested snapshot to be "safe" at the replica +being scanned, or be re-routed to a replica where the requested +snapshot is "safe". This ensures that all such scans are repeatable.</p> </li> <li> -<p>Enabled Scanner Replica Selection</p> +<p>Kudu Tablet Servers now properly retain historical data when a row +with a given primary key is inserted and deleted, followed by the +insertion of a new row with the same key. Previous versions of Kudu +would not retain history in such situations. 
This allows the server +to return correct results for snapshot scans with a timestamp in the +past, even in the presence of such "reinsertion" scenarios.</p> </li> <li> -<p>A few bug fixes for Python 3 in addition to various other improvements.</p> +<p>The Kudu clients now automatically retain the timestamp of their latest +successful read or write operation. Scans using the <code>READ_AT_SNAPSHOT</code> mode +without a client-provided timestamp automatically assign a timestamp +higher than the timestamp of their most recent write. Writes also propagate +the timestamp, ensuring that sequences of operations with causal dependencies +between them are assigned increasing timestamps. Together, these changes +allow clients to achieve read-your-writes consistency, and also ensure +that snapshot scans performed by other clients return causally-consistent +results.</p> </li> </ul> </div> </li> <li> -<p>IN LIST predicate pushdown support was added to allow optimized execution of filters which -match on a set of column values. Support for Spark, Map Reduce and Impala queries utilizing -IN LIST pushdown is not yet complete.</p> -</li> -<li> -<p>The Java client now features client-side request tracing in order to help troubleshoot timeouts. -Error messages are now augmented with traces that show which servers were contacted before the -timeout occured instead of just the last error. The traces also contain RPCs that were -required to fulfill the client’s request, such as contacting the master to discover a tablet’s -location. Note that the traces are not available for successful requests and are not -programatically queryable.</p> +<p>Kudu servers now automatically limit the number of log files. +The number of log files retained can be configured using the +<code>max_log_files</code> flag. 
By default, 10 log files will be retained +at each severity level.</p> </li> </ul> </div> @@ -180,290 +184,241 @@ programatically queryable.</p> <div class="ulist"> <ul> <li> -<p>Kudu now publishes JAR files for Spark 2.0 compiled with Scala 2.11 along with the -existing Spark 1.6 JAR compiled with Scala 2.10.</p> -</li> -<li> -<p>The Java client now allows configuring scanners to read from the closest replica instead of -the known leader replica. The default remains the latter. Use the relevant <code>ReplicaSelection</code> -enum with the scanner’s builder to change this behavior.</p> +<p>The logging in the Java and C++ clients has been substantially quieted. +Clients no longer log messages in normal operation unless there +is some kind of error.</p> </li> <li> -<p>Tablet servers use a new policy for retaining write-ahead log (WAL) segments. -Previously, servers used the 'log_min_segments_to_retain' flag to prioritize -any flushes which were retaining log segments past the configured value (default 2). -This policy caused servers to flush in-memory data more frequently than necessary, -limiting write performance.</p> -<div class="paragraph"> -<p>The new policy introduces a new flag 'log_target_replay_size_mb' which - determines the threshold at which write-ahead log retention will prioritize flushes. - The new flag is considered experimental and users should not need to modify - its value.</p> -</div> -<div class="paragraph"> -<p>The improved policy has been seen to improve write performance in some use cases - by a factor of 2x relative to the old policy.</p> -</div> +<p>The C++ client now includes a <code>KuduSession::SetErrorBufferSpace</code> +API which can limit the amount of memory used to buffer +errors from asynchronous operations.</p> </li> <li> -<p>Kudu’s implementation of the Raft consensus algorithm has been improved to include -a "pre-election" phase. 
This can improve the stability of tablet leader election -in high-load scenarios, especially if each server hosts a high number of tablets.</p> +<p>The Java client now fetches tablet locations from the Kudu Master +in batches of 1000, increased from batches of 10 in prior versions. +This can substantially improve the performance of Spark and Impala +queries running against Kudu tables with large numbers of tablets.</p> </li> <li> -<p>Tablet server start-up time has been substantially improved in the case that -the server contains a high number of tombstoned tablet replicas.</p> +<p>Table metadata lock contention in the Kudu Master was substantially +reduced. This improves the performance of tablet location lookups on +large clusters with a high degree of concurrency.</p> </li> -</ul> -</div> -<div class="sect2"> -<h3 id="_command_line_tools"><a class="link" href="#_command_line_tools">Command line tools</a></h3> -<div class="ulist"> -<ul> <li> -<p>The tool <code>kudu tablet leader_step_down</code> has been added to manually force a leader to step down.</p> +<p>Lock contention in the Kudu Tablet Server during high-concurrency +write workloads was also reduced. This can reduce CPU consumption and +improve performance when a large number of concurrent clients are writing +to a smaller number of a servers.</p> </li> <li> -<p>The tool <code>kudu remote_replica copy</code> has been added to manually copy a replica from -one running tablet server to another.</p> +<p>Lock contention when writing log messages has been substantially reduced. 
+This source of contention could cause high tail latencies on requests, +and when under high load could contribute to cluster instability +such as election storms and request timeouts.</p> </li> <li> -<p>The tool <code>kudu local_replica delete</code> has been added to delete a replica of a tablet.</p> +<p>The <code>BITSHUFFLE</code> column encoding has been optimized to use the <code>AVX2</code> +instruction set present on processors including Intel® Haswell +and later. Scans on <code>BITSHUFFLE</code>-encoded columns are now up to 30% faster.</p> </li> <li> -<p>The <code>kudu test loadgen</code> tool has been added to replace the obsoleted -<code>insert-generated-rows</code> standalone binary. The new tool is enriched with -additional functionality and can be used to run load generation tests against -a Kudu cluster.</p> +<p>The <code>kudu</code> tool now accepts hyphens as an alternative to underscores +when specifying actions. For example, <code>kudu local-replica copy-from-remote</code> +may be used as an alternative to <code>kudu local_replica copy_from_remote</code>.</p> </li> </ul> </div> </div> </div> -</div> <div class="sect1"> -<h2 id="_wire_protocol_compatibility"><a class="link" href="#_wire_protocol_compatibility">Wire protocol compatibility</a></h2> +<h2 id="rn_1.2.0_fixed_issues"><a class="link" href="#rn_1.2.0_fixed_issues">Fixed Issues</a></h2> <div class="sectionbody"> -<div class="paragraph"> -<p>Kudu 1.1.0 is wire-compatible with previous versions of Kudu:</p> -</div> <div class="ulist"> <ul> <li> -<p>Kudu 1.1 clients may connect to servers running Kudu 1.0. 
If the client uses the new -'IN LIST' predicate type, an error will be returned.</p> +<p><a href="https://issues.apache.org/jira/browse/KUDU-1508">KUDU-1508</a> +Fixed a long-standing issue in which running Kudu on <code>ext4</code> file systems +could cause file system corruption.</p> </li> <li> -<p>Kudu 1.0 clients may connect to servers running Kudu 1.1 without limitations.</p> +<p><a href="https://issues.apache.org/jira/browse/KUDU-1399">KUDU-1399</a> +Implemented an LRU cache for open files, which prevents running out of +file descriptors on long-lived Kudu clusters. By default, Kudu will +limit its file descriptor usage to half of its configured <code>ulimit</code>.</p> </li> <li> -<p>Rolling upgrade between Kudu 1.0 and Kudu 1.1 servers is believed to be possible -though has not been sufficiently tested. Users are encouraged to shut down all nodes -in the cluster, upgrade the software, and then restart the daemons on the new version.</p> +<p><a href="http://gerrit.cloudera.org:8080/5192">Gerrit #5192</a> +Fixed an issue which caused data corruption and crashes in the case that +a table had a non-composite (single-column) primary key, and that column +was specified to use <code>DICT_ENCODING</code> or <code>BITSHUFFLE</code> encodings. 
If a +table with an affected schema was written in previous versions of Kudu, +the corruption will not be automatically repaired; users are encouraged +to re-insert such tables after upgrading to Kudu 1.2 or later.</p> </li> -</ul> -</div> -</div> -</div> -<div class="sect1"> -<h2 id="rn_1.1.0_incompatible_changes"><a class="link" href="#rn_1.1.0_incompatible_changes">Incompatible changes in Kudu 1.1.0</a></h2> -<div class="sectionbody"> -<div class="sect2"> -<h3 id="_client_apis_c_java_python"><a class="link" href="#_client_apis_c_java_python">Client APIs (C++/Java/Python)</a></h3> -<div class="ulist"> -<ul> <li> -<p>The C++ client no longer requires the -<a href="https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html">old gcc5 ABI</a>. -Which ABI is actually used depends on the compiler configuration. Some new distros -(e.g. Ubuntu 16.04) will use the new ABI. Your application must use the same ABI as is -used by the client library; an easy way to guarantee this is to use the same compiler -to build both.</p> +<p><a href="http://gerrit.cloudera.org:8080/5541">Gerrit #5541</a> +Fixed a bug in the Spark <code>KuduRDD</code> implementation which could cause +rows in the result set to be silently skipped in some cases.</p> </li> <li> -<p>The C++ client’s <code>KuduSession::CountBufferedOperations()</code> method is -deprecated. Its behavior is inconsistent unless the session runs in the -<code>MANUAL_FLUSH</code> mode. 
Instead, to get number of buffered operations, count -invocations of the <code>KuduSession::Apply()</code> method since last -<code>KuduSession::Flush()</code> call or, if using asynchronous flushing, since last -invocation of the callback passed into <code>KuduSession::FlushAsync()</code>.</p> +<p><a href="https://issues.apache.org/jira/browse/KUDU-1551">KUDU-1551</a> +Fixed an issue in which the tablet server would crash on restart in the +case that it had previously crashed during the process of allocating +a new WAL segment.</p> </li> <li> -<p>The Java client’s <code>OperationResponse.getWriteTimestamp</code> method was renamed to <code>getWriteTimestampRaw</code> -to emphasize that it doesn’t return milliseconds, unlike what its Javadoc indicated. The renamed -method was also hidden from the public APIs and should not be used.</p> +<p><a href="https://issues.apache.org/jira/browse/KUDU-1764">KUDU-1764</a> +Fixed an issue where Kudu servers would leak approximately 16-32MB of disk +space for every 10GB of data written to disk. 
After upgrading to Kudu +1.2 or later, any disk space leaked in previous versions will be +automatically recovered on startup.</p> </li> <li> -<p>The Java client’s sync API (<code>KuduClient</code>, <code>KuduSession</code>, <code>KuduScanner</code>) used to throw either -a <code>NonRecoverableException</code> or a <code>TimeoutException</code> for a timeout, and now it’s only possible for the -client to throw the former.</p> +<p><a href="https://issues.apache.org/jira/browse/KUDU-1750">KUDU-1750</a> +Fixed an issue where the API to drop a range partition would drop any +partition with a matching lower <em>or</em> upper bound, rather than any partition +with matching lower <em>and</em> upper bound.</p> </li> <li> -<p>The Java client’s handling of errors in <code>KuduSession</code> was modified so that subclasses of -<code>KuduException</code> are converted into RowErrors instead of being thrown.</p> +<p><a href="https://issues.apache.org/jira/browse/KUDU-1766">KUDU-1766</a> +Fixed an issue in the Java client where equality predicates which compared +an integer column to its maximum possible value (e.g. <code>Integer.MAX_VALUE</code>) +would return incorrect results.</p> +</li> +<li> +<p><a href="https://issues.apache.org/jira/browse/KUDU-1780">KUDU-1780</a> +Fixed the <code>kudu-client</code> Java artifact to properly shade classes in the +<code>com.google.thirdparty</code> namespace. The lack of proper shading in prior +releases could cause conflicts with certain versions of Google Guava.</p> +</li> +<li> +<p><a href="http://gerrit.cloudera.org:8080/5327">Gerrit #5327</a> +Fixed shading issues in the <code>kudu-flume-sink</code> Java artifact. 
The sink +now expects that Hadoop dependencies are provided by Flume, and properly +shades the Kudu client’s dependencies.</p> +</li> +<li> +<p>Fixed a few issues using the Python client library from Python 3.</p> </li> </ul> </div> </div> </div> -</div> <div class="sect1"> -<h2 id="known_issues_and_limitations"><a class="link" href="#known_issues_and_limitations">Known Issues and Limitations</a></h2> +<h2 id="rn_1.2.0_wire_compatibility"><a class="link" href="#rn_1.2.0_wire_compatibility">Wire Protocol compatibility</a></h2> <div class="sectionbody"> -<div class="sect2"> -<h3 id="_schema_and_usage_limitations"><a class="link" href="#_schema_and_usage_limitations">Schema and Usage Limitations</a></h3> +<div class="paragraph"> +<p>Kudu 1.2.0 is wire-compatible with previous versions of Kudu:</p> +</div> <div class="ulist"> <ul> <li> -<p>Kudu is primarily designed for analytic use cases. You are likely to encounter issues if -a single row contains multiple kilobytes of data.</p> +<p>Kudu 1.2 clients may connect to servers running Kudu 1.0. If the client uses features +that are not available on the target server, an error will be returned.</p> </li> <li> -<p>The columns which make up the primary key must be listed first in the schema.</p> +<p>Kudu 1.0 clients may connect to servers running Kudu 1.2 without limitations.</p> </li> <li> -<p>Key columns cannot be altered. You must drop and recreate a table to change its keys.</p> -</li> -<li> -<p>Key columns must not be null.</p> -</li> -<li> -<p>Columns with <code>DOUBLE</code>, <code>FLOAT</code>, or <code>BOOL</code> types are not allowed as part of a -primary key definition.</p> -</li> -<li> -<p>Type and nullability of existing columns cannot be changed by altering the table.</p> -</li> -<li> -<p>A table’s primary key cannot be changed.</p> -</li> -<li> -<p>Dropping a column does not immediately reclaim space. Compaction must run first. 
-There is no way to run compaction manually, but dropping the table will reclaim the -space immediately.</p> +<p>Rolling upgrade between Kudu 1.1 and Kudu 1.2 servers is believed to be possible +though has not been sufficiently tested. Users are encouraged to shut down all nodes +in the cluster, upgrade the software, and then restart the daemons on the new version.</p> </li> </ul> </div> </div> -<div class="sect2"> -<h3 id="_partitioning_limitations"><a class="link" href="#_partitioning_limitations">Partitioning Limitations</a></h3> +</div> +<div class="sect1"> +<h2 id="rn_1.2.0_incompatible_changes"><a class="link" href="#rn_1.2.0_incompatible_changes">Incompatible Changes in Kudu 1.2.0</a></h2> +<div class="sectionbody"> <div class="ulist"> <ul> <li> -<p>Tables must be manually pre-split into tablets using simple or compound primary -keys. Automatic splitting is not yet possible. Range partitions may be added -or dropped after a table has been created. See -<a href="schema_design.html">Schema Design</a> for more information.</p> +<p>The replication factor of tables is now limited to a maximum of 7. In addition, +it is no longer allowed to create a table with an even replication factor.</p> </li> <li> -<p>Data in existing tables cannot currently be automatically repartitioned. As a workaround, -create a new table with the new partitioning and insert the contents of the old -table.</p> +<p>The <code>GROUP_VARINT</code> encoding is now deprecated. Kudu servers have never supported +this encoding, and now the client-side constant has been deprecated to match the +server’s capabilities.</p> </li> </ul> </div> -</div> <div class="sect2"> -<h3 id="_replication_and_backup_limitations"><a class="link" href="#_replication_and_backup_limitations">Replication and Backup Limitations</a></h3> -<div class="ulist"> -<ul> -<li> -<p>Kudu does not currently include any built-in features for backup and restore. 
-Users are encouraged to use tools such as Spark or Impala to export or import -tables as necessary.</p> -</li> -</ul> +<h3 id="_new_restrictions_on_data_schemas_and_identifiers"><a class="link" href="#_new_restrictions_on_data_schemas_and_identifiers">New Restrictions on Data, Schemas, and Identifiers</a></h3> +<div class="paragraph"> +<p>Kudu 1.2.0 introduces several new restrictions on schemas, cell size, and identifiers:</p> +</div> +<div class="dlist"> +<dl> +<dt class="hdlist1">Number of Columns</dt> +<dd> +<p>By default, Kudu will not permit the creation of tables with +more than 300 columns. We recommend schema designs that use fewer columns for best +performance.</p> +</dd> +<dt class="hdlist1">Size of Cells</dt> +<dd> +<p>No individual cell may be larger than 64KB. The cells making up a +composite key are limited to a total of 16KB after the internal composite-key encoding +done by Kudu. Inserting rows not conforming to these limitations will result in errors +being returned to the client.</p> +</dd> +<dt class="hdlist1">Valid Identifiers</dt> +<dd> +<p>Identifiers such as column and table names are now restricted to +be valid UTF-8 strings. Additionally, a maximum length of 256 characters is enforced.</p> +</dd> +</dl> </div> </div> <div class="sect2"> -<h3 id="_impala_limitations"><a class="link" href="#_impala_limitations">Impala Limitations</a></h3> +<h3 id="rn_1.2.0_client_compatibility"><a class="link" href="#rn_1.2.0_client_compatibility">Client Library Compatibility</a></h3> <div class="ulist"> <ul> <li> -<p>To use Kudu with Impala, you must install a special release of Impala called -Impala_Kudu. Obtaining and installing a compatible Impala release is detailed in Kudu’s -<a href="kudu_impala_integration.html">Impala Integration</a> documentation.</p> -</li> -<li> -<p>To use Impala_Kudu alongside an existing Impala instance, you must install using parcels.</p> -</li> -<li> -<p>Updates, inserts, and deletes via Impala are non-transactional. 
If a query -fails part of the way through, its partial effects will not be rolled back.</p> -</li> -<li> -<p>No timestamp and decimal type support.</p> +<p>The Kudu 1.2 Java client is API- and ABI-compatible with Kudu 1.1. Applications +written against Kudu 1.1 will compile and run against the Kudu 1.2 client and +vice-versa.</p> </li> <li> -<p>The maximum parallelism of a single query is limited to the number of tablets -in a table. For good analytic performance, aim for 10 or more tablets per host -or use large tables.</p> -</li> -</ul> -</div> -</div> -<div class="sect2"> -<h3 id="_security_limitations"><a class="link" href="#_security_limitations">Security Limitations</a></h3> +<p>The Kudu 1.2 C++ client is API- and ABI-forward-compatible with Kudu 1.1. +Applications written and compiled against the Kudu 1.1 client will run without +modification against the Kudu 1.2 client. Applications written and compiled +against the Kudu 1.2 client will run without modification against the Kudu 1.1 +client unless they use one of the following new APIs:</p> <div class="ulist"> <ul> <li> -<p>Authentication and authorization features are not implemented.</p> +<p><code>kudu::DisableSaslInitialization()</code></p> </li> <li> -<p>Data encryption is not built in. Kudu has been reported to run correctly -on systems using local block device encryption (e.g. <code>dmcrypt</code>).</p> +<p><code>KuduSession::SetErrorBufferSpace(…​)</code></p> </li> </ul> </div> -</div> -<div class="sect2"> -<h3 id="_client_and_api_limitations"><a class="link" href="#_client_and_api_limitations">Client and API Limitations</a></h3> -<div class="ulist"> -<ul> +</li> <li> -<p><code>ALTER TABLE</code> is not yet fully supported via the client APIs. More <code>ALTER TABLE</code> -operations will become available in future releases.</p> +<p>The Kudu 1.2 Python client is API-compatible with Kudu 1.1. 
Applications +written against Kudu 1.1 will continue to run against the Kudu 1.2 client +and vice-versa.</p> </li> </ul> </div> </div> -<div class="sect2"> -<h3 id="_other_known_issues"><a class="link" href="#_other_known_issues">Other Known Issues</a></h3> -<div class="paragraph"> -<p>The following are known bugs and issues with the current release of Kudu. They will -be addressed in later releases. Note that this list is not exhaustive, and is meant -to communicate only the most important known issues.</p> -</div> -<div class="ulist"> -<ul> -<li> -<p>If the Kudu master is configured with the <code>-log_fsync_all</code> option, tablet servers -and clients will experience frequent timeouts, and the cluster may become unusable.</p> -</li> -<li> -<p>If a tablet server has a very large number of tablets, it may take several minutes -to start up. It is recommended to limit the number of tablets per server to 100 or fewer. -Consider this limitation when pre-splitting your tables. If you notice slow start-up times, -you can monitor the number of tablets per server in the web UI.</p> -</li> -<li> -<p>Due to a known bug in Linux kernels prior to 3.8, running Kudu on <code>ext4</code> mount points -may cause a subsequent <code>fsck</code> to fail with errors such as <code>Logical start <N> does -not match logical start <M> at next level</code>. These errors are repairable using <code>fsck -y</code>, -but may impact server restart time.</p> -<div class="paragraph"> -<p>This affects RHEL/CentOS 6.8 and below. A fix is planned for RHEL/CentOS 6.9. - RHEL 7.0 and higher are not affected. Ubuntu 14.04 and later are not affected. 
- SLES 12 and later are not affected.</p> </div> -</li> -</ul> </div> +<div class="sect1"> +<h2 id="rn_1.2.0_known_issues"><a class="link" href="#rn_1.2.0_known_issues">Known Issues and Limitations</a></h2> +<div class="sectionbody"> +<div class="paragraph"> +<p>Please refer to the <a href="known_issues.html">Known Issues and Limitations</a> section of the +documentation.</p> </div> </div> </div> <div class="sect1"> -<h2 id="_resources"><a class="link" href="#_resources">Resources</a></h2> +<h2 id="resources_and_next_steps"><a class="link" href="#resources_and_next_steps">Resources</a></h2> <div class="sectionbody"> <div class="ulist"> <ul> @@ -522,30 +477,18 @@ but may impact server restart time.</p> <li> <span class="active-toc">Kudu Release Notes</span> <ul class="sectlevel1"> -<li><a href="#rn_1.1.0_new_features">New features</a></li> -<li><a href="#_optimizations_and_improvements">Optimizations and improvements</a> -<ul class="sectlevel2"> -<li><a href="#_command_line_tools">Command line tools</a></li> -</ul> -</li> -<li><a href="#_wire_protocol_compatibility">Wire protocol compatibility</a></li> -<li><a href="#rn_1.1.0_incompatible_changes">Incompatible changes in Kudu 1.1.0</a> -<ul class="sectlevel2"> -<li><a href="#_client_apis_c_java_python">Client APIs (C++/Java/Python)</a></li> -</ul> -</li> -<li><a href="#known_issues_and_limitations">Known Issues and Limitations</a> +<li><a href="#rn_1.2.0_new_features">New features</a></li> +<li><a href="#_optimizations_and_improvements">Optimizations and improvements</a></li> +<li><a href="#rn_1.2.0_fixed_issues">Fixed Issues</a></li> +<li><a href="#rn_1.2.0_wire_compatibility">Wire Protocol compatibility</a></li> +<li><a href="#rn_1.2.0_incompatible_changes">Incompatible Changes in Kudu 1.2.0</a> <ul class="sectlevel2"> -<li><a href="#_schema_and_usage_limitations">Schema and Usage Limitations</a></li> -<li><a href="#_partitioning_limitations">Partitioning Limitations</a></li> -<li><a 
href="#_replication_and_backup_limitations">Replication and Backup Limitations</a></li> -<li><a href="#_impala_limitations">Impala Limitations</a></li> -<li><a href="#_security_limitations">Security Limitations</a></li> -<li><a href="#_client_and_api_limitations">Client and API Limitations</a></li> -<li><a href="#_other_known_issues">Other Known Issues</a></li> +<li><a href="#_new_restrictions_on_data_schemas_and_identifiers">New Restrictions on Data, Schemas, and Identifiers</a></li> +<li><a href="#rn_1.2.0_client_compatibility">Client Library Compatibility</a></li> </ul> </li> -<li><a href="#_resources">Resources</a></li> +<li><a href="#rn_1.2.0_known_issues">Known Issues and Limitations</a></li> +<li><a href="#resources_and_next_steps">Resources</a></li> <li><a href="#_installation_options">Installation Options</a></li> <li><a href="#_next_steps">Next Steps</a></li> </ul> @@ -600,6 +543,10 @@ but may impact server restart time.</p> </li> <li> + <a href="known_issues.html">Known Issues and Limitations</a> + </li> + <li> + <a href="export_control.html">Export Control Notice</a> </li> </ul> @@ -609,7 +556,7 @@ but may impact server restart time.</p> </div> <footer class="footer"> <p class="small"> - Copyright © 2016 The Apache Software Foundation. Last updated 2016-11-17 10:36:43 PST + Copyright © 2016 The Apache Software Foundation. 
Last updated 2017-01-12 12:48:06 PST </p> </footer> </div> http://git-wip-us.apache.org/repos/asf/kudu-site/blob/9b792926/docs/schema_design.html ---------------------------------------------------------------------- diff --git a/docs/schema_design.html b/docs/schema_design.html index c4c18fa..d1d459e 100644 --- a/docs/schema_design.html +++ b/docs/schema_design.html @@ -201,10 +201,10 @@ column types include:</p> <p>double-precision (64-bit) IEEE-754 floating-point number</p> </li> <li> -<p>UTF-8 encoded string</p> +<p>UTF-8 encoded string (up to 64KB)</p> </li> <li> -<p>binary</p> +<p>binary (up to 64KB)</p> </li> </ul> </div> @@ -770,27 +770,32 @@ support renaming primary key columns. <h2 id="known-limitations"><a class="link" href="#known-limitations">Known Limitations</a></h2> <div class="sectionbody"> <div class="paragraph"> -<p>Kudu currently has some known limitations that may factor into schema design. When -designing your schema, consider these limitations together, not in isolation. If you -test these limitations and your findings are different from these, please share your -test cases and results.</p> +<p>Kudu currently has some known limitations that may factor into schema design.</p> </div> <div class="dlist"> <dl> <dt class="hdlist1">Number of Columns</dt> <dd> -<p>Kudu has not been thoroughly tested with more than 200 columns -and we recommend schemas with fewer than 50 columns per table.</p> +<p>By default, Kudu will not permit the creation of tables with +more than 300 columns. We recommend schema designs that use fewer columns for best +performance.</p> +</dd> +<dt class="hdlist1">Size of Cells</dt> +<dd> +<p>No individual cell may be larger than 64KB. The cells making up a +a composite key are limited to a total of 16KB after the internal composite-key encoding +done by Kudu. 
Inserting rows not conforming to these limitations will result in errors +being returned to the client.</p> </dd> <dt class="hdlist1">Size of Rows</dt> <dd> -<p>Kudu has not been thoroughly tested with rows larger than 10 kb. Most -testing has been on rows at 1 kb.</p> +<p>Although individual cells may be up to 64KB, and Kudu supports up to +300 columns, it is recommended that no single row be larger than a few hundred KB.</p> </dd> -<dt class="hdlist1">Size of Cells</dt> +<dt class="hdlist1">Valid Identifiers</dt> <dd> -<p>There is no hard limit imposed by Kudu, however large cells may -push the entire row over the recommended size.</p> +<p>Identifiers such as table and column names must be valid UTF-8 +sequences and no longer than 256 bytes.</p> </dd> <dt class="hdlist1">Immutable Primary Keys</dt> <dd> @@ -805,7 +810,8 @@ columns after table creation.</p> <dt class="hdlist1">Non-alterable Partitioning</dt> <dd> <p>Kudu does not allow you to change how a table is -partitioned after creation.</p> +partitioned after creation, with the exception of adding or dropping range +partitions.</p> </dd> <dt class="hdlist1">Non-alterable Column Types</dt> <dd> @@ -919,6 +925,10 @@ altered.</p> </li> <li> + <a href="known_issues.html">Known Issues and Limitations</a> + </li> + <li> + <a href="export_control.html">Export Control Notice</a> </li> </ul> @@ -928,7 +938,7 @@ altered.</p> </div> <footer class="footer"> <p class="small"> - Copyright © 2016 The Apache Software Foundation. Last updated 2016-11-08 09:35:57 PST + Copyright © 2016 The Apache Software Foundation. 
Last updated 2017-01-12 12:48:06 PST </p> </footer> </div> http://git-wip-us.apache.org/repos/asf/kudu-site/blob/9b792926/docs/style_guide.html ---------------------------------------------------------------------- diff --git a/docs/style_guide.html b/docs/style_guide.html index ea5a215..24358af 100644 --- a/docs/style_guide.html +++ b/docs/style_guide.html @@ -831,6 +831,10 @@ Nothing between the slashes will show up. </li> <li> + <a href="known_issues.html">Known Issues and Limitations</a> + </li> + <li> + <a href="export_control.html">Export Control Notice</a> </li> </ul>
