[kudu] branch gh-pages updated: Update the Kudu FAQ

granthenke Tue, 02 Jul 2019 17:36:12 -0700

This is an automated email from the ASF dual-hosted git repository.

granthenke pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/kudu.git



The following commit(s) were added to refs/heads/gh-pages by this push:
     new 166def7  Update the Kudu FAQ
166def7 is described below

commit 166def7389ce97f85bb9801cfff7b42762e4b834
Author: Grant Henke <[email protected]>
AuthorDate: Tue Jul 2 12:13:38 2019 -0500

    Update the Kudu FAQ
    
    Change-Id: I7b0b57830a6ede9a12e3a889c918273788bc0de8
    Reviewed-on: http://gerrit.cloudera.org:8080/13787
    Reviewed-by: Alexey Serbin <[email protected]>
    Reviewed-by: Andrew Wong <[email protected]>
    Tested-by: Grant Henke <[email protected]>
---
 faq.md | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/faq.md b/faq.md
index 9278e71..7443142 100644
--- a/faq.md
+++ b/faq.md
@@ -45,8 +45,7 @@ workloads.
 
 #### Is Apache Kudu ready to be deployed into production yet?
 
-Yes! Although Kudu is still relatively new, as far as storage engines are considered,
-it is ready for production workloads.
+Yes! Kudu has been battle tested in production at many major corporations.
 
 #### Is Kudu open source?
 
@@ -82,12 +81,11 @@ store, and access data in Kudu tables with Apache Impala.
 Aside from training, you can also get help with using Kudu through
 [documentation](docs/index.html),
 the [mailing lists](community.html),
-the [Kudu chat room](https://getkudu-slack.herokuapp.com/), and the
-[Cloudera beta release forum](https://community.cloudera.com/t5/Beta-Releases-Kudu-RecordService/bd-p/Beta).
+and the [Kudu chat room](https://getkudu-slack.herokuapp.com/).
 
-#### Is there a quickstart VM?
+#### Is there a quickstart?
 
-Yes. Instructions on getting up and running on Kudu via a VM are provided in Kudu's
+Yes. Instructions on getting up and running on Kudu via a Docker based quickstart are provided in Kudu's
 [quickstart guide](http://kudu.apache.org/docs/quickstart.html).
 
 
@@ -257,10 +255,8 @@ Kudu hasn't been publicly tested with Jepsen but it is possible to run a set of
 Kudu provides direct access via Java and C++ APIs. An experimental Python API is
 also available and is expected to be fully supported in the future. The easiest
 way to load data into Kudu is to use a `CREATE TABLE ... AS SELECT * FROM ...`
-statement in Impala. Although Kudu has not been extensively tested to work with
-ingest tools such as Flume, Sqoop, or Kafka, several of these have been
-experimentally tested. Explicit support for these ingest tools is expected with
-Kudu's first generally available release.
+statement in Impala. Additionally, data is commonly ingested into Kudu using
+Spark, Nifi, and Flume.
 
 #### What's the most efficient way to bulk load data into Kudu?
 
@@ -268,8 +264,8 @@ The easiest way to load data into Kudu is if the data is already managed by Impa
 In this case, a simple `INSERT INTO TABLE some_kudu_table SELECT * FROM some_csv_table`
 does the trick.
 
-You can also use Kudu's MapReduce OutputFormat to load data from HDFS, HBase, or
-any other data store that has an InputFormat.
+You can also use Kudu's Spark integration to load data from or
+any other Spark compatible data store.
 
 No tool is provided to load data directly into Kudu's on-disk data format. We
 have found that for many workloads, the insert performance of Kudu is comparable
@@ -284,7 +280,7 @@ BINARY column, but large values (10s of KB or more) are likely to cause
 performance or stability problems in current versions.
 
 Fuller support for semi-structured types like JSON and protobuf will be added in
-the future, contingent on demand from early adopters.
+the future, contingent on demand.
 
 #### Is there a JDBC driver available?
 
@@ -324,9 +320,8 @@ direction, for the following reasons:
 
 #### What frameworks are integrated with Kudu for data access?
 
-Kudu is already integrated with Impala, MapReduce, and Spark. Additional
-frameworks are expected for GA with Hive being the current highest priority
-addition.
+Kudu is integrated with Impala, Spark, Nifi, MapReduce, and more. Additional
+frameworks are expected, with Hive being the current highest priority addition.
 
 #### Can I colocate Kudu with HDFS on the same servers?
 
@@ -411,13 +406,18 @@ features.
 
 #### How can I back up my Kudu data?
 
-Kudu doesn't yet have a built-in backup mechanism. Similar to bulk loading data,
-Impala can help if you have it available. You can use it to copy your data into
-Parquet format using a statement like:
+As of Kudu 1.10.0, Kudu supports both full and incremental table backups via a
+job implemented using Apache Spark. Additionally it supports restoring tables
+from full and incremental backups via a restore job implemented using Apache Spark.
+See the [administration documentation](docs/administration.html) for details.
+
+For older versions which do not have a built-in backup mechanism, Impala can
+help if you have it available. You can use it to copy your data into Parquet
+format using a statement like:
 
     INSERT INTO TABLE some_parquet_table SELECT * FROM kudu_table
 
-then use [distcp](http://hadoop.apache.org/docs/r1.2.1/distcp2.html)
+then use [distcp](http://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html)
 to copy the Parquet data to another cluster.
 
 #### Can the WAL transaction logs be used to build a disaster recovery site?
