Spark 1.0.2, Tachyon 0.4.1, Hadoop 1.0 (standard EC2 config)

scala> val gdeltT = sqlContext.parquetFile("tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/")
14/08/21 19:07:14 INFO : initialize(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005, Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, hdfs-default.xml, hdfs-site.xml). Connecting to Tachyon: tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005
14/08/21 19:07:14 INFO : Trying to connect master @ /172.31.42.40:19998
14/08/21 19:07:14 INFO : User registered at the master ip-172-31-42-40.us-west-2.compute.internal/172.31.42.40:19998 got UserId 14
14/08/21 19:07:14 INFO : Trying to get local worker host : ip-172-31-42-40.us-west-2.compute.internal
14/08/21 19:07:14 INFO : No local worker on ip-172-31-42-40.us-west-2.compute.internal
14/08/21 19:07:14 INFO : Connecting remote worker @ ip-172-31-47-74/172.31.47.74:29998
14/08/21 19:07:14 INFO : tachyon://172.31.42.40:19998 tachyon://172.31.42.40:19998 hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000
14/08/21 19:07:14 INFO : getFileStatus(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005): HDFS Path: hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000/gdelt-parquet/1979-2005 TPath: tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005
14/08/21 19:07:14 INFO : tachyon.client.TachyonFS@4b05b3ff hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000 /gdelt-parquet/1979-2005 tachyon.PrefixList@636c50d3
14/08/21 19:07:14 WARN : tachyon.home is not set. Using /mnt/tachyon_default_home as the default value.
14/08/21 19:07:14 INFO : Get: /gdelt-parquet/1979-2005/_SUCCESS
14/08/21 19:07:14 INFO : Get: /gdelt-parquet/1979-2005/_metadata
14/08/21 19:07:14 INFO : Get: /gdelt-parquet/1979-2005/part-r-1.parquet
....
14/08/21 19:07:14 INFO : getFileStatus(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/_metadata): HDFS Path: hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000/gdelt-parquet/1979-2005/_metadata TPath: tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/_metadata
14/08/21 19:07:14 INFO : open(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/_metadata, 65536)
14/08/21 19:07:14 ERROR : The machine does not have any local worker.
14/08/21 19:07:14 ERROR : Reading from HDFS directly
14/08/21 19:07:14 ERROR : Reading from HDFS directly
java.io.IOException: can not read class parquet.format.FileMetaData: null
        at parquet.format.Util.read(Util.java:50)
        at parquet.format.Util.readFileMetaData(Util.java:34)
        at parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:310)
        at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:296)

I'm not sure why it claims there is no local worker, since the Tachyon web UI reports all 8 nodes as up. And after falling back to reading from HDFS directly, the Parquet footer read still fails with the IOException above.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
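P.S. One way to narrow this down (a sketch, using only the endpoints already shown in the log above; whether the read succeeds on this cluster is an assumption to be verified) would be to load the same Parquet directory straight from HDFS, bypassing the Tachyon client entirely:

```scala
// Hypothetical isolation test: point parquetFile at the underlying HDFS path
// (the one Tachyon itself logs) instead of the tachyon:// URI. If this works,
// the FileMetaData failure is specific to the Tachyon read path; if it fails
// the same way, the _metadata footer itself is suspect.
val gdeltHdfs = sqlContext.parquetFile(
  "hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000/gdelt-parquet/1979-2005/")
gdeltHdfs.printSchema()  // forces the Parquet footer/metadata to be read
```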