Hmm, as I said in https://github.com/databricks/spark-csv/issues/245#issuecomment-177682354,
it sounds reasonable in a way, though to me this looks like a fairly narrow use case. How about using csvRdd() (https://github.com/databricks/spark-csv/blob/master/src/main/scala/com/databricks/spark/csv/CsvParser.scala#L143-L162)? I think you can do something like the below. Note that csvRdd() takes an RDD[String], so the (LongWritable, Text) pairs coming out of newAPIHadoopFile() need to be mapped to plain strings first:

val rdd = sc.newAPIHadoopFile(
    "/file.csv.lzo",
    classOf[com.hadoop.mapreduce.LzoTextInputFormat],
    classOf[org.apache.hadoop.io.LongWritable],
    classOf[org.apache.hadoop.io.Text])
  .map { case (_, line) => line.toString }  // csvRdd() expects RDD[String]

val df = new CsvParser()
  .csvRdd(sqlContext, rdd)

2016-01-30 10:04 GMT+09:00 syepes <sye...@gmail.com>:
> Well, looking at the src it looks like it's not implemented:
>
> https://github.com/databricks/spark-csv/blob/master/src/main/scala/com/databricks/spark/csv/util/TextFile.scala#L34-L36
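
One more note on the snippet above: if I remember correctly, without a schema csvRdd() gives you all-string columns (C0, C1, ...) unless you turn on inference, so if you already know the column layout you can pass an explicit schema via withSchema() to get typed columns without an extra scan. A rough sketch; the field names and types here are just placeholders:

import org.apache.spark.sql.types._

// Hypothetical schema; replace with the actual columns of your CSV.
val schema = StructType(Seq(
  StructField("id", LongType, nullable = false),
  StructField("name", StringType, nullable = true)))

val typedDf = new CsvParser()
  .withSchema(schema)  // skip inference, use the declared types directly
  .csvRdd(sqlContext, rdd)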