HyukjinKwon commented on code in PR #48922:
URL: https://github.com/apache/spark/pull/48922#discussion_r1853117855


##########
docs/app-dev-spark-connect.md:
##########
@@ -0,0 +1,239 @@
+---
+layout: global
+title: Application Development with Spark Connect
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+**Spark Connect Overview**
+
+In Apache Spark 3.4, Spark Connect introduced a decoupled client-server
+architecture that allows remote connectivity to Spark clusters using the
+DataFrame API and unresolved logical plans as the protocol. The separation
+between client and server allows Spark and its open ecosystem to be
+leveraged from everywhere. It can be embedded in modern data applications,
+in IDEs, Notebooks and programming languages.
+
+To learn more about Spark Connect, see [Spark Connect Overview](spark-connect-overview.html).
+
+# Redefining Spark Applications using Spark Connect
+
+With its decoupled client-server architecture, Spark Connect simplifies how Spark Applications are
+developed.
+The notion of Spark Client Applications and Spark Server Libraries are introduced as follows:
+* _Spark Client Applications_ are regular Spark applications that use Spark and its rich ecosystem for
+distributed data processing. Examples include ETL pipelines, data preparation, and model training
+and inference.
+* _Spark Server Libraries_ build on, extend, and complement Spark's functionality, e.g.
+[MLlib](ml-guide.html) (distributed ML libraries that use Spark's powerful distributed processing). Spark Connect
+can be extended to expose client-side interfaces for Spark Server Libraries.
+
+With Spark 3.4 and Spark Connect, the development of Spark Client Applications is simplified, and
+clear extension points and guidelines are provided on how to build Spark Server Libraries, making
+it easy for both types of applications to evolve alongside Spark. As illustrated in Fig.1, Spark
+Client applications connect to Spark using the Spark Connect API, which is essentially the
+DataFrame API and fully declarative.
+
+<p style="text-align: center;">
+  <img src="img/extending-spark-connect.png" title="Figure 1: Architecture" alt="Extending Spark
+Connect Diagram" />

Review Comment:
   ```suggestion
   Connect Diagram"/>
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

