This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/iceberg-docs.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 79e0fe97 deploy: 3b0d73abe2be8c2fd57cc298ab13828c38bd59ba
79e0fe97 is described below
commit 79e0fe97d32338c1df811ad77e2123f6f587e855
Author: rdblue <[email protected]>
AuthorDate: Thu Nov 17 18:10:17 2022 +0000
deploy: 3b0d73abe2be8c2fd57cc298ab13828c38bd59ba
---
getting-started/index.html | 20 +-------------
landingpagesearch.json | 2 +-
spark-quickstart/index.html | 67 +++++++++++++++++++++++++++++++++------------
3 files changed, 52 insertions(+), 37 deletions(-)
diff --git a/getting-started/index.html b/getting-started/index.html
index b3c6e0d8..52cc0335 100644
--- a/getting-started/index.html
+++ b/getting-started/index.html
@@ -1,19 +1 @@
-<!--
- - Licensed to the Apache Software Foundation (ASF) under one or more
- - contributor license agreements. See the NOTICE file distributed with
- - this work for additional information regarding copyright ownership.
- - The ASF licenses this file to You under the Apache License, Version 2.0
- - (the "License"); you may not use this file except in compliance with
- - the License. You may obtain a copy of the License at
- -
- - http://www.apache.org/licenses/LICENSE-2.0
- -
- - Unless required by applicable law or agreed to in writing, software
- - distributed under the License is distributed on an "AS IS" BASIS,
- - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- - See the License for the specific language governing permissions and
- - limitations under the License.
- -->
-<head>
- <meta http-equiv="Refresh" content="0; url='/docs/latest/getting-started'" />
-</head>
+<!doctype html><html
lang=en-us><head><title>https://iceberg.apache.org/spark-quickstart/</title><link
rel=canonical href=https://iceberg.apache.org/spark-quickstart/><meta
name=robots content="noindex"><meta charset=utf-8><meta http-equiv=refresh
content="0; url=https://iceberg.apache.org/spark-quickstart/"></head></html>
\ No newline at end of file
diff --git a/landingpagesearch.json b/landingpagesearch.json
index 5c522b37..e322700f 100644
--- a/landingpagesearch.json
+++ b/landingpagesearch.json
@@ -1 +1 @@
-[{"categories":null,"content":" Spark and Iceberg Quickstart This guide will
get you up and running with an Iceberg and Spark environment, including sample
code to highlight some powerful features. You can learn more about Iceberg’s
Spark runtime by checking out the Spark section.\nDocker-Compose Creating a
table Writing Data to a Table Reading Data from a Table Adding A Catalog Next
Steps Docker-Compose The fastest way to get started is to use a docker-compose
file that uses the the tab [...]
\ No newline at end of file
+[{"categories":null,"content":" Spark and Iceberg Quickstart This guide will
get you up and running with an Iceberg and Spark environment, including sample
code to highlight some powerful features. You can learn more about Iceberg’s
Spark runtime by checking out the Spark section.\nDocker-Compose Creating a
table Writing Data to a Table Reading Data from a Table Adding A Catalog Next
Steps Docker-Compose The fastest way to get started is to use a docker-compose
file that uses the the tab [...]
\ No newline at end of file
diff --git a/spark-quickstart/index.html b/spark-quickstart/index.html
index d9ddc693..711d8010 100644
--- a/spark-quickstart/index.html
+++ b/spark-quickstart/index.html
@@ -10,29 +10,62 @@ which contains a local Spark cluster with a configured
Iceberg catalog. To use t
</span></span><span style=display:flex><span><span
style=color:#f92672>services</span>:
</span></span><span style=display:flex><span> <span
style=color:#f92672>spark-iceberg</span>:
</span></span><span style=display:flex><span> <span
style=color:#f92672>image</span>: <span
style=color:#ae81ff>tabulario/spark-iceberg</span>
-</span></span><span style=display:flex><span> <span
style=color:#f92672>depends_on</span>:
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>postgres</span>
</span></span><span style=display:flex><span> <span
style=color:#f92672>container_name</span>: <span
style=color:#ae81ff>spark-iceberg</span>
-</span></span><span style=display:flex><span> <span
style=color:#f92672>environment</span>:
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>SPARK_HOME=/opt/spark</span>
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>PYSPARK_PYTON=/usr/bin/python3.9</span>
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/spark/bin</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>build</span>: <span style=color:#ae81ff>spark/</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>depends_on</span>:
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>rest</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>minio</span>
</span></span><span style=display:flex><span> <span
style=color:#f92672>volumes</span>:
</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>./warehouse:/home/iceberg/warehouse</span>
</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>./notebooks:/home/iceberg/notebooks/notebooks</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>environment</span>:
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_ACCESS_KEY_ID=admin</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_SECRET_ACCESS_KEY=password</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_REGION=us-east-1</span>
</span></span><span style=display:flex><span> <span
style=color:#f92672>ports</span>:
</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>8888</span>:<span style=color:#ae81ff>8888</span>
</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>8080</span>:<span style=color:#ae81ff>8080</span>
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>18080</span>:<span style=color:#ae81ff>18080</span>
-</span></span><span style=display:flex><span> <span
style=color:#f92672>postgres</span>:
-</span></span><span style=display:flex><span> <span
style=color:#f92672>image</span>: <span
style=color:#ae81ff>postgres:13.4-bullseye</span>
-</span></span><span style=display:flex><span> <span
style=color:#f92672>container_name</span>: <span
style=color:#ae81ff>postgres</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>links</span>:
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>rest:rest</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>minio:minio</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>rest</span>:
+</span></span><span style=display:flex><span> <span
style=color:#f92672>image</span>: <span
style=color:#ae81ff>tabulario/iceberg-rest:0.1.0</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>ports</span>:
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>8181</span>:<span style=color:#ae81ff>8181</span>
</span></span><span style=display:flex><span> <span
style=color:#f92672>environment</span>:
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>POSTGRES_USER=admin</span>
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>POSTGRES_PASSWORD=password</span>
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>POSTGRES_DB=demo_catalog</span>
-</span></span><span style=display:flex><span> <span
style=color:#f92672>volumes</span>:
-</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>./postgres/data:/var/lib/postgresql/data</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_ACCESS_KEY_ID=admin</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_SECRET_ACCESS_KEY=password</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_REGION=us-east-1</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>CATALOG_WAREHOUSE=s3a://warehouse/wh/</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>CATALOG_S3_ENDPOINT=http://minio:9000</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>minio</span>:
+</span></span><span style=display:flex><span> <span
style=color:#f92672>image</span>: <span style=color:#ae81ff>minio/minio</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>container_name</span>: <span
style=color:#ae81ff>minio</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>environment</span>:
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>MINIO_ROOT_USER=admin</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>MINIO_ROOT_PASSWORD=password</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>ports</span>:
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>9001</span>:<span style=color:#ae81ff>9001</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>9000</span>:<span style=color:#ae81ff>9000</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>command</span>: [<span
style=color:#e6db74>"server"</span>, <span
style=color:#e6db74>"/data"</span>, <span
style=color:#e6db74>"--console-address"</span>, <span
style=color:#e6db74>":9001"</span>]
+</span></span><span style=display:flex><span> <span
style=color:#f92672>mc</span>:
+</span></span><span style=display:flex><span> <span
style=color:#f92672>depends_on</span>:
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>minio</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>image</span>: <span style=color:#ae81ff>minio/mc</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>container_name</span>: <span style=color:#ae81ff>mc</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>environment</span>:
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_ACCESS_KEY_ID=admin</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_SECRET_ACCESS_KEY=password</span>
+</span></span><span style=display:flex><span> - <span
style=color:#ae81ff>AWS_REGION=us-east-1</span>
+</span></span><span style=display:flex><span> <span
style=color:#f92672>entrypoint</span>: ><span style=color:#e6db74>
+</span></span></span><span style=display:flex><span><span style=color:#e6db74>
/bin/sh -c "
+</span></span></span><span style=display:flex><span><span style=color:#e6db74>
until (/usr/bin/mc config host add minio http://minio:9000 admin password)
do echo '...waiting...' && sleep 1; done;
+</span></span></span><span style=display:flex><span><span style=color:#e6db74>
/usr/bin/mc rm -r --force minio/warehouse;
+</span></span></span><span style=display:flex><span><span style=color:#e6db74>
/usr/bin/mc mb minio/warehouse;
+</span></span></span><span style=display:flex><span><span style=color:#e6db74>
/usr/bin/mc policy set public minio/warehouse;
+</span></span></span><span style=display:flex><span><span style=color:#e6db74>
exit 0;
+</span></span></span><span style=display:flex><span><span style=color:#e6db74>
"</span>
</span></span></code></pre></div><p>Next, start up the docker containers with
this command:</p><div class=highlight><pre tabindex=0
style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code
class=language-sh data-lang=sh><span style=display:flex><span>docker-compose up
</span></span></code></pre></div><p>You can then run any of the following
commands to start a Spark session.</p><div class=codetabs><input id=spark-sql
type=radio name=LaunchSparkClient
onclick='selectExampleLanguage("spark-queries","spark-sql")'>
<label for=spark-sql>SparkSQL</label>
@@ -76,8 +109,8 @@ using <code>demo.nyc.taxis</code> where <code>demo</code> is
the catalog name, <
</span></span><span style=display:flex><span> StructField(<span
style=color:#e6db74>"vendor_id"</span>, LongType(), <span
style=color:#66d9ef>True</span>),
</span></span><span style=display:flex><span> StructField(<span
style=color:#e6db74>"trip_id"</span>, LongType(), <span
style=color:#66d9ef>True</span>),
</span></span><span style=display:flex><span> StructField(<span
style=color:#e6db74>"trip_distance"</span>, FloatType(), <span
style=color:#66d9ef>True</span>),
-</span></span><span style=display:flex><span> StructField(<span
style=color:#e6db74>"fare_amount', DoubleType(), True),</span>
-</span></span><span style=display:flex><span> StructField(<span
style=color:#e6db74>"store_and_fwd_flag', StringType(), True)</span>
+</span></span><span style=display:flex><span> StructField(<span
style=color:#e6db74>"fare_amount"</span>, DoubleType(), <span
style=color:#66d9ef>True</span>),
+</span></span><span style=display:flex><span> StructField(<span
style=color:#e6db74>"store_and_fwd_flag"</span>, StringType(), <span
style=color:#66d9ef>True</span>)
</span></span><span style=display:flex><span>])
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>df <span
style=color:#f92672>=</span> spark<span
style=color:#f92672>.</span>createDataFrame([], schema)