Github user viirya commented on the issue:
https://github.com/apache/spark/pull/21952
@dbtsai I didn't use Spark 2.3 when testing databricks-avro; I also used
current master. But because a recent change to schema verification
(`FileFormat.supportDataType`) causes an incompatibility, I manually skipped
the call to `supportDataType`.
So basically I tested both the built-in avro and databricks-avro on current
master. I think the difference between Spark 2.3 and current master may
account for the discrepancy.
Btw, for the following benchmark numbers I modified the array feature length
from 16000 to 1600.
```scala
> "com.databricks.spark.avro"
scala> spark.sparkContext.parallelize(writeTimes.slice(50, 150)).toDF("writeTimes").describe("writeTimes").show()
+-------+--------------------+
|summary| writeTimes|
+-------+--------------------+
| count| 100|
| mean| 0.21102|
| stddev|0.010737435692590912|
| min| 0.195|
| max| 0.247|
+-------+--------------------+
scala> spark.sparkContext.parallelize(readTimes.slice(50, 150)).toDF("readTimes").describe("readTimes").show()
+-------+--------------------+
|summary| readTimes|
+-------+--------------------+
| count| 100|
| mean| 0.09441999999999999|
| stddev|0.016021563751722395|
| min| 0.07|
| max| 0.134|
+-------+--------------------+
> "avro"
scala> spark.sparkContext.parallelize(writeTimes.slice(50, 150)).toDF("writeTimes").describe("writeTimes").show()
+-------+--------------------+
|summary| writeTimes|
+-------+--------------------+
| count| 100|
| mean| 0.21445|
| stddev|0.008952596824329237|
| min| 0.201|
| max| 0.25|
+-------+--------------------+
scala> spark.sparkContext.parallelize(readTimes.slice(50, 150)).toDF("readTimes").describe("readTimes").show()
+-------+--------------------+
|summary| readTimes|
+-------+--------------------+
| count| 100|
| mean| 0.10792|
| stddev|0.015983375201386058|
| min| 0.08|
| max| 0.15|
+-------+--------------------+
```
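For context, here is a rough sketch of how the `writeTimes`/`readTimes` arrays above could have been collected in spark-shell. The row count, schema, iteration count, and output path are assumptions on my part; only the array feature length (1600) and the `slice(50, 150)` warm-up handling correspond to the numbers above.
```scala
// Rough sketch of a benchmark harness (spark-shell). Row count, schema and
// paths are assumptions; only featureLength = 1600 and the slice(50, 150)
// warm-up handling match the numbers above.
import org.apache.spark.sql.functions._
import scala.collection.mutable.ArrayBuffer
import spark.implicits._

val featureLength = 1600                 // array feature length used above
val numRows = 100000                     // assumed row count
val df = spark.range(numRows)
  .select($"id", array((0 until featureLength).map(i => lit(i.toDouble)): _*).as("features"))
  .cache()
df.count()                               // materialize the cache before timing

val writeTimes = ArrayBuffer[Double]()
val readTimes = ArrayBuffer[Double]()
for (i <- 0 until 150) {                 // first 50 iterations dropped as warm-up via slice(50, 150)
  val path = s"/tmp/avro-bench/$i"

  var start = System.nanoTime()
  df.write.mode("overwrite").format("avro").save(path)   // or "com.databricks.spark.avro"
  writeTimes += (System.nanoTime() - start) / 1e9         // seconds

  start = System.nanoTime()
  spark.read.format("avro").load(path).foreach(_ => ())   // force a full read
  readTimes += (System.nanoTime() - start) / 1e9           // seconds
}
```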