[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

BryanCutler Thu, 25 Jan 2018 16:23:25 -0800

Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19575#discussion_r164007019
  
    --- Diff: examples/src/main/python/sql/arrow.py ---
    @@ -0,0 +1,125 @@
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#    http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +
    +"""
    +A simple example demonstrating Arrow in Spark.
    +Run with:
    +  ./bin/spark-submit examples/src/main/python/sql/arrow.py
    +"""
    +
    +from __future__ import print_function
    +
    +from pyspark.sql import SparkSession
    +from pyspark.sql.utils import require_minimum_pandas_version, require_minimum_pyarrow_version
    +
    +require_minimum_pandas_version()
    +require_minimum_pyarrow_version()
    +
    +
    +def dataframe_with_arrow_example(spark):
    +    # $example on:dataframe_with_arrow$
    +    import numpy as np
    +    import pandas as pd
    +
    +    # Enable Arrow-based columnar data transfers
    +    spark.conf.set("spark.sql.execution.arrow.enabled", "true")
    +
    +    # Generate a Pandas DataFrame
    +    pdf = pd.DataFrame(np.random.rand(100, 3))
    +
    +    # Create a Spark DataFrame from a Pandas DataFrame using Arrow
    +    df = spark.createDataFrame(pdf)
    +
    +    # Convert the Spark DataFrame back to a Pandas DataFrame using Arrow
    +    result_pdf = df.select("*").toPandas()
    +    # $example off:dataframe_with_arrow$
    +
    --- End diff --
    
    @HyukjinKwon mind if I add a print at the end here? I think it can be
    confusing to have examples that run but don't show anything.
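
A minimal sketch of what such a print could look like, assuming the goal is a short summary of the resulting Pandas DataFrame. The helper name `describe_result` and the message text are hypothetical, not part of the diff. `result_pdf` is built directly with Pandas here as a stand-in for `df.select("*").toPandas()`, so the sketch runs without a Spark session.

```python
import numpy as np
import pandas as pd


def describe_result(result_pdf):
    """Return a short text summary of a Pandas DataFrame (hypothetical helper)."""
    # describe() reports count, mean, std, min, quartiles and max per column.
    return "Pandas DataFrame result statistics:\n%s\n" % result_pdf.describe()


# Stand-in for the DataFrame returned by df.select("*").toPandas() in the diff;
# an assumption made so that this sketch does not need Spark or Arrow.
result_pdf = pd.DataFrame(np.random.rand(100, 3))

summary = describe_result(result_pdf)
print(summary)
```

In the example under review, the same `print` call would go just before the `$example off:dataframe_with_arrow$` marker, so that running the script shows visible output.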


