Lewis,

You are using a pseudo-distributed HDFS, but are still providing a local gora.avrostore.output.path. Can you try with a location on HDFS?
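For example, something along these lines in gora.properties should do — the directory below is just an illustration, any HDFS location your user can write to is fine (and I am not certain of the exact key for the query-side path off the top of my head, so verify it against how DataStoreFactory resolves properties):

```properties
gora.datastore.default=org.apache.gora.avro.store.DataFileAvroStore
# point the store at HDFS instead of file:///
gora.avrostore.output.path=hdfs://localhost:9000/user/lewis/gora.output
# possibly also needed so queries resolve the same location -- verify the key
gora.avrostore.input.path=hdfs://localhost:9000/user/lewis/gora.output
```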
Enis

On Wed, Oct 10, 2012 at 1:40 PM, Lewis John Mcgibbney <[email protected]> wrote:
> Hi,
>
> For the sake of obtaining a pure understanding of this myself I'm
> trying to use DataFileAvroStore with the gora-tutorial LogManager
> scenario... with little luck. Config as follows:
>
> gora.properties
> ---------------
> gora.datastore.default=org.apache.gora.avro.store.DataFileAvroStore
> gora.avrostore.output.path=file:///home/lewis/ASF/gora_trunk/gora.output
>
> gora-datafileavrostore-mapping.xml
> ----------------------------------
> non-existent... yet
>
> I'm running Hadoop 1.0.1 (for compatibility with Gora trunk) in
> pseudo-distributed mode with the following settings:
>
> core-site.xml
> -------------
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://localhost:9000</value>
>     <description>URI of NameNode.</description>
>   </property>
> </configuration>
>
> hdfs-site.xml
> -------------
> <property>
>   <name>dfs.replication</name>
>   <value>1</value>
>   <description></description>
> </property>
>
> <property>
>   <name>dfs.name.dir</name>
>   <value>/home/lewis/ASF/hadoop_output/dfs/name/</value>
>   <description>Path on the local filesystem where the NameNode
>   stores the namespace and transaction logs persistently.</description>
> </property>
>
> <property>
>   <name>dfs.data.dir</name>
>   <value>/home/lewis/ASF/hadoop_output/dfs/data/</value>
>   <description>Comma-separated list of paths on the local
>   filesystem of a DataNode where it should store its blocks.
>   </description>
> </property>
>
> mapred-site.xml
> ---------------
> <property>
>   <name>mapred.job.tracker</name>
>   <value>localhost:9001</value>
>   <description>URI of job tracker.</description>
> </property>
>
> <property>
>   <name>mapred.system.dir</name>
>   <value>/home/lewis/ASF/hadoop_output/mapred/system_files</value>
>   <description>Path on HDFS where the MapReduce framework
>   stores system files, e.g. /hadoop/mapred/system/.</description>
> </property>
>
> <property>
>   <name>mapred.local.dir</name>
>   <value>/home/lewis/ASF/hadoop_output/mapred/</value>
>   <description>Comma-separated list of paths on the local filesystem
>   where temporary MapReduce data is written.</description>
> </property>
>
> <property>
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx1024m</value>
>   <description>Memory allocated to the mapred child
>   tasks.</description>
> </property>
>
> I've been running this setup with both Nutch 2.x (head) and Cassandra
> 1.1.1, as well as the goraci module, so I know my current Hadoop setup
> is OK. When I parse the webserver logs within the tutorial module
> everything is fine; however, when I attempt to query an individual
> record I get:
>
> lewis@lewis-desktop:~/ASF/gora_trunk$ ./bin/gora logmanager -query 10
> Exception in thread "main" java.lang.IllegalArgumentException: Can not
> create a Path from a null string
>         at org.apache.hadoop.fs.Path.checkPathArg(Path.java:78)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:90)
>         at org.apache.gora.avro.store.DataFileAvroStore.createFsInput(DataFileAvroStore.java:85)
>         at org.apache.gora.avro.store.DataFileAvroStore.executeQuery(DataFileAvroStore.java:67)
>         at org.apache.gora.store.impl.FileBackedDataStoreBase.execute(FileBackedDataStoreBase.java:163)
>         at org.apache.gora.query.impl.QueryBase.execute(QueryBase.java:71)
>         at org.apache.gora.tutorial.log.LogManager.query(LogManager.java:156)
>         at org.apache.gora.tutorial.log.LogManager.main(LogManager.java:246)
>
> Before I head over to the Hadoop forums I thought it best to fire this
> one off here, as it primarily concerns Gora config and fitting this
> around Hadoop.
>
> Any thoughts would be excellent here...
>
> Thanks
>
> Lewis
>
>
> On Wed, Oct 10, 2012 at 12:57 AM, Enis Söztutar <[email protected]> wrote:
> > Sorry, it's been some time since I last looked into these. AvroStore
> > uses files and writes data with DatumWriter directly, whereas
> > DataFileAvroStore uses the data file, which is an Avro file format.
> > This format supports blocks, so they can be split for MapReduce tasks.
> >
> > Yes, all FileBackedDataStores work on top of files stored at a Hadoop
> > file system. Even the local file system should work.
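To make the "blocks, so they can be split" point concrete: here is a toy sketch in Python of the idea behind the Avro object container format that DataFileAvroStore writes. This is NOT the real Avro wire format (real Avro uses schemas, codecs, and a random per-file sync marker); it only illustrates why a block/sync-marker layout lets a MapReduce split start at an arbitrary byte offset.

```python
import io
import struct

# Toy block-based container: records are grouped into blocks, each block
# carries a record count and byte length and ends with a 16-byte sync
# marker. A reader handed an arbitrary offset scans forward to the next
# marker and consumes whole blocks from there -- which is what makes
# such files splittable for MapReduce.

SYNC = b"SYNCMARKER-demo!"  # 16 bytes; real Avro uses a random per-file marker


def write_blocks(records, block_size=2):
    """Serialize records into length-prefixed blocks separated by SYNC."""
    buf = io.BytesIO()
    for i in range(0, len(records), block_size):
        block = records[i:i + block_size]
        payload = b"".join(struct.pack(">I", len(r)) + r for r in block)
        buf.write(struct.pack(">I", len(block)))    # records in this block
        buf.write(struct.pack(">I", len(payload)))  # block byte length
        buf.write(payload)
        buf.write(SYNC)
    return buf.getvalue()


def read_from(data, offset):
    """Read every record from the first block boundary at or after offset."""
    if offset == 0:
        start = 0
    else:
        start = data.find(SYNC, offset) + len(SYNC)  # skip to next boundary
    records = []
    while start < len(data):
        count, length = struct.unpack_from(">II", data, start)
        body = data[start + 8:start + 8 + length]
        pos = 0
        for _ in range(count):
            (rlen,) = struct.unpack_from(">I", body, pos)
            records.append(body[pos + 4:pos + 4 + rlen])
            pos += 4 + rlen
        start += 8 + length + len(SYNC)
    return records
```

A reader starting mid-block simply loses the partial block in front of it; the adjacent split processes that block instead, so between them no record is read twice or skipped.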

