> System.setProperty("spark.io.compression.codec",
> "com.hadoop.compression.lzo.LzopCodec")

This `spark.io.compression.codec` is a completely different setting from the codecs used for reading/writing from HDFS. (It is for compressing Spark's internal, non-HDFS intermediate output.)

> Hope this helps and someone can help read a LZO file

Spark just uses the regular Hadoop File System API, so any issues with reading LZO files would be Hadoop issues. I would search the Hadoop issue tracker and look for information on using LZO files with Hadoop/Hive; whatever works for them should work for Spark as well. This looks like a good place to start: https://github.com/twitter/hadoop-lzo

IANAE, but I would try passing one of these: https://github.com/twitter/hadoop-lzo/blob/master/src/main/java/com/hadoop/mapreduce/LzoTextInputFormat.java to the SparkContext.hadoopFile method.

- Stephen
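A minimal sketch of what passing that input format to Spark might look like. This assumes the hadoop-lzo jar and its native libraries are on the classpath, and the path is hypothetical; note that since `LzoTextInputFormat` lives in the new `mapreduce` API, it goes to `newAPIHadoopFile` rather than the old-API `hadoopFile`:

```scala
import com.hadoop.mapreduce.LzoTextInputFormat
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: hadoop-lzo jar + native libs are available to the JVM,
// and the file was compressed (and ideally indexed) with LZO tools.
val sc = new SparkContext(
  new SparkConf().setAppName("lzo-read").setMaster("local[*]"))

// newAPIHadoopFile yields (key, value) pairs; for text input the key is
// the byte offset and the value is the line, so keep only the value.
val lines = sc
  .newAPIHadoopFile[LongWritable, Text, LzoTextInputFormat](
    "hdfs:///path/to/data.lzo")  // hypothetical path
  .map { case (_, text) => text.toString }

lines.take(5).foreach(println)
```

If the file has been indexed with hadoop-lzo's indexer, this input format should also split it across tasks instead of reading it in a single partition.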
