GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20775#discussion_r173446568
--- Diff: examples/src/main/python/ml/dataframe_example.py ---
@@ -35,18 +35,18 @@
         print("Usage: dataframe_example.py <libsvm file>", file=sys.stderr)
         sys.exit(-1)
     elif len(sys.argv) == 2:
-        input = sys.argv[1]
+        input_path = sys.argv[1]
     else:
-        input = "data/mllib/sample_libsvm_data.txt"
+        input_path = "data/mllib/sample_libsvm_data.txt"
     spark = SparkSession \
         .builder \
         .appName("DataFrameExample") \
         .getOrCreate()
-    # Load input data
-    print("Loading LIBSVM file with UDT from " + input + ".")
-    df = spark.read.format("libsvm").load(input).cache()
+    # Load file from path
--- End diff --
Let's just write it as `Load an input file`.
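
For illustration only, here is a minimal sketch of how the argument handling might read with the rename and the suggested comment applied. `resolve_input_path` and `DEFAULT_INPUT_PATH` are hypothetical names, not part of the PR, and the Spark call is left as a comment so the snippet runs without a Spark installation.

```python
import sys

# Hypothetical constant; the value is the default path from the diff.
DEFAULT_INPUT_PATH = "data/mllib/sample_libsvm_data.txt"


def resolve_input_path(argv):
    """Return the LIBSVM path given on the command line, or the default."""
    if len(argv) > 2:
        print("Usage: dataframe_example.py <libsvm file>", file=sys.stderr)
        sys.exit(-1)
    elif len(argv) == 2:
        return argv[1]
    return DEFAULT_INPUT_PATH


# Load an input file
input_path = resolve_input_path(["dataframe_example.py"])
print("Loading LIBSVM file with UDT from " + input_path + ".")
# Assuming a SparkSession named `spark`, the load itself would be:
# df = spark.read.format("libsvm").load(input_path).cache()
```

Using `input_path` rather than `input` also leaves Python's built-in `input()` unshadowed in the script.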
---