[3/4] spark git commit: [SPARK-24499][SQL][DOC] Split the page of sql-programming-guide.html to multiple separate pages

2018-10-18 Thread lixiao
http://git-wip-us.apache.org/repos/asf/spark/blob/71535516/docs/sql-data-sources-troubleshooting.md
--
diff --git a/docs/sql-data-sources-troubleshooting.md b/docs/sql-data-sources-troubleshooting.md
new file mode 100644
index 0000000..5775eb8
--- /dev/null
+++ b/docs/sql-data-sources-troubleshooting.md
@@ -0,0 +1,9 @@
+---
+layout: global
+title: Troubleshooting
+displayTitle: Troubleshooting
+---
+
+ * The JDBC driver class must be visible to the primordial class loader on the client session and on all executors. This is because Java's DriverManager class does a security check that results in it ignoring all drivers not visible to the primordial class loader when one goes to open a connection. One convenient way to do this is to modify compute_classpath.sh on all worker nodes to include your driver JARs.
+ * Some databases, such as H2, convert all names to upper case. You'll need to use upper case to refer to those names in Spark SQL.
+ * Users can specify vendor-specific JDBC connection properties in the data source options to apply vendor-specific handling. For example, `spark.read.format("jdbc").option("url", oracleJdbcUrl).option("oracle.jdbc.mapDateToTimestamp", "false")`. `oracle.jdbc.mapDateToTimestamp` defaults to true; users often need to disable this flag to keep Oracle DATE columns from being read as timestamps (see the sketch after this diff).

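The vendor-specific option in the last bullet above can be exercised end to end. The following is a minimal Scala sketch, not taken from the commit: the Oracle URL, table, and credentials are hypothetical placeholders, and it assumes the Oracle JDBC driver JAR is on the classpath per the first bullet. `url`, `dbtable`, `user`, and `password` are standard Spark JDBC data source options; `oracle.jdbc.mapDateToTimestamp` is forwarded untouched to the driver.

```scala
import org.apache.spark.sql.SparkSession

object OracleJdbcReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("OracleJdbcReadSketch")
      .getOrCreate()

    // Hypothetical connection details -- substitute your own.
    val oracleJdbcUrl = "jdbc:oracle:thin:@//dbhost:1521/XEPDB1"

    val df = spark.read
      .format("jdbc")
      .option("url", oracleJdbcUrl)
      .option("dbtable", "HR.EMPLOYEES") // hypothetical schema-qualified table
      .option("user", "hr")              // hypothetical credentials
      .option("password", "hr")
      // Vendor-specific property, passed through to the Oracle driver: when
      // left at its default (true), Oracle DATE columns surface as timestamps.
      .option("oracle.jdbc.mapDateToTimestamp", "false")
      .load()

    df.printSchema()
    spark.stop()
  }
}
```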
http://git-wip-us.apache.org/repos/asf/spark/blob/71535516/docs/sql-data-sources.md
--
diff --git a/docs/sql-data-sources.md b/docs/sql-data-sources.md
new file mode 100644
index 0000000..aa607ec
--- /dev/null
+++ b/docs/sql-data-sources.md
@@ -0,0 +1,42 @@
+---
+layout: global
+title: Data Sources
+displayTitle: Data Sources
+---
+
+
+Spark SQL supports operating on a variety of data sources through the DataFrame interface.
+A DataFrame can be operated on using relational transformations and can also be used to create a temporary view.
+Registering a DataFrame as a temporary view allows you to run SQL queries over its data (see the sketch after this diff). This section
+describes the general methods for loading and saving data using the Spark data sources and then
+goes into specific options that are available for the built-in data sources.
+
+
+* [Generic Load/Save Functions](sql-data-sources-load-save-functions.html)
+  * [Manually Specifying Options](sql-data-sources-load-save-functions.html#manually-specifying-options)
+  * [Run SQL on files directly](sql-data-sources-load-save-functions.html#run-sql-on-files-directly)
+  * [Save Modes](sql-data-sources-load-save-functions.html#save-modes)
+  * [Saving to Persistent Tables](sql-data-sources-load-save-functions.html#saving-to-persistent-tables)
+  * [Bucketing, Sorting and Partitioning](sql-data-sources-load-save-functions.html#bucketing-sorting-and-partitioning)
+* [Parquet Files](sql-data-sources-parquet.html)
+  * [Loading Data Programmatically](sql-data-sources-parquet.html#loading-data-programmatically)
+  * [Partition Discovery](sql-data-sources-parquet.html#partition-discovery)
+  * [Schema Merging](sql-data-sources-parquet.html#schema-merging)
+  * [Hive metastore Parquet table conversion](sql-data-sources-parquet.html#hive-metastore-parquet-table-conversion)
+  * [Configuration](sql-data-sources-parquet.html#configuration)
+* [ORC Files](sql-data-sources-orc.html)
+* [JSON Files](sql-data-sources-json.html)
+* [Hive Tables](sql-data-sources-hive-tables.html)
+  * [Specifying storage format for Hive tables](sql-data-sources-hive-tables.html#specifying-storage-format-for-hive-tables)
+  * [Interacting with Different Versions of Hive Metastore](sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore)
+* [JDBC To Other Databases](sql-data-sources-jdbc.html)
+* [Avro Files](sql-data-sources-avro.html)
+  * [Deploying](sql-data-sources-avro.html#deploying)
+  * [Load and Save Functions](sql-data-sources-avro.html#load-and-save-functions)
+  * [to_avro() and from_avro()](sql-data-sources-avro.html#to_avro-and-from_avro)
+  * [Data Source Option](sql-data-sources-avro.html#data-source-option)
+  * [Configuration](sql-data-sources-avro.html#configuration)
+  * [Compatibility with Databricks spark-avro](sql-data-sources-avro.html#compatibility-with-databricks-spark-avro)
+  * [Supported types for Avro -> Spark SQL conversion](sql-data-sources-avro.html#supported-types-for-avro---spark-sql-conversion)
+  * [Supported types for Spark SQL -> Avro conversion](sql-data-sources-avro.html#supported-types-for-spark-sql---avro-conversion)
+* [Troubleshooting](sql-data-sources-troubleshooting.html)

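As a companion to the intro of sql-data-sources.md above, here is a minimal Scala sketch of the load, temporary view, and SQL workflow it describes. The JSON path and local master are assumptions for illustration (people.json ships with the Spark source tree); any supported data source would work the same way.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("DataSourcesOverviewSketch")
  .master("local[*]") // assumption: a local run, e.g. from spark-shell
  .getOrCreate()

// Load a data source into a DataFrame (path is an assumption; people.json
// ships with the Spark source tree under examples/src/main/resources/).
val people = spark.read.format("json")
  .load("examples/src/main/resources/people.json")

// Relational transformations on the DataFrame...
people.select("name", "age").filter("age IS NOT NULL").show()

// ...and the same data queried with SQL through a temporary view.
people.createOrReplaceTempView("people")
spark.sql("SELECT name FROM people WHERE age BETWEEN 13 AND 19").show()

spark.stop()
```

A temporary view is session-scoped, so the SQL query and the DataFrame transformation above operate on the same underlying data within one SparkSession.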
http://git-wip-us.apache.org/repos/asf/spark/blob/71535516/docs/sql-distributed-sql-engine.md
--
diff --git a/docs/sql-distributed-sql-engine.md b/docs/sql-distributed-sql-engine.md
new file mode 100644
index 0000000..66d6fda
--- /dev/null
+++ b/docs/sql-distributed-sql-engine.md
