And it worked earlier with a non-parquet directory.

On Thu, Aug 21, 2014 at 12:22 PM, Evan Chan <velvia.git...@gmail.com> wrote:
> The underFS is HDFS btw.
>
> On Thu, Aug 21, 2014 at 12:22 PM, Evan Chan <velvia.git...@gmail.com> wrote:
>> Spark 1.0.2, Tachyon 0.4.1, Hadoop 1.0 (standard EC2 config)
>>
>> scala> val gdeltT = sqlContext.parquetFile("tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/")
>> 14/08/21 19:07:14 INFO : initialize(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005, Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, hdfs-default.xml, hdfs-site.xml). Connecting to Tachyon: tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005
>> 14/08/21 19:07:14 INFO : Trying to connect master @ /172.31.42.40:19998
>> 14/08/21 19:07:14 INFO : User registered at the master ip-172-31-42-40.us-west-2.compute.internal/172.31.42.40:19998 got UserId 14
>> 14/08/21 19:07:14 INFO : Trying to get local worker host : ip-172-31-42-40.us-west-2.compute.internal
>> 14/08/21 19:07:14 INFO : No local worker on ip-172-31-42-40.us-west-2.compute.internal
>> 14/08/21 19:07:14 INFO : Connecting remote worker @ ip-172-31-47-74/172.31.47.74:29998
>> 14/08/21 19:07:14 INFO : tachyon://172.31.42.40:19998 tachyon://172.31.42.40:19998 hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000
>> 14/08/21 19:07:14 INFO : getFileStatus(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005): HDFS Path: hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000/gdelt-parquet/1979-2005 TPath: tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005
>> 14/08/21 19:07:14 INFO : tachyon.client.TachyonFS@4b05b3ff hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000 /gdelt-parquet/1979-2005 tachyon.PrefixList@636c50d3
>> 14/08/21 19:07:14 WARN : tachyon.home is not set. Using /mnt/tachyon_default_home as the default value.
>> 14/08/21 19:07:14 INFO : Get: /gdelt-parquet/1979-2005/_SUCCESS
>> 14/08/21 19:07:14 INFO : Get: /gdelt-parquet/1979-2005/_metadata
>> 14/08/21 19:07:14 INFO : Get: /gdelt-parquet/1979-2005/part-r-1.parquet
>>
>> ....
>>
>> 14/08/21 19:07:14 INFO : getFileStatus(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/_metadata): HDFS Path: hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000/gdelt-parquet/1979-2005/_metadata TPath: tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/_metadata
>> 14/08/21 19:07:14 INFO : open(tachyon://172.31.42.40:19998/gdelt-parquet/1979-2005/_metadata, 65536)
>> 14/08/21 19:07:14 ERROR : The machine does not have any local worker.
>> 14/08/21 19:07:14 ERROR : Reading from HDFS directly
>> 14/08/21 19:07:14 ERROR : Reading from HDFS directly
>> java.io.IOException: can not read class parquet.format.FileMetaData: null
>>         at parquet.format.Util.read(Util.java:50)
>>         at parquet.format.Util.readFileMetaData(Util.java:34)
>>         at parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:310)
>>         at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:296)
>>
>> I'm not sure why it's saying that, since the Tachyon UI reports all 8 nodes as up.
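
A quick way to isolate whether the failure is Tachyon-specific is to read the same directory straight from the HDFS underFS, bypassing the tachyon:// scheme entirely. A minimal sketch, assuming the paths from the log above (the gdeltHdfs name is just for illustration); the stack trace shows readFooter is called during parquetFile itself, so if this call succeeds the _metadata footer is intact and the problem lies in the no-local-worker HDFS fallback read path:

    scala> // Read the same parquet directory via the underFS, not via Tachyon.
    scala> // parquetFile already reads the _metadata footer, so success here
    scala> // means the footer bytes are fine on HDFS.
    scala> val gdeltHdfs = sqlContext.parquetFile("hdfs://ec2-54-213-113-173.us-west-2.compute.amazonaws.com:9000/gdelt-parquet/1979-2005/")
    scala> gdeltHdfs.count()  // forces a full scan of the row groups as well
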