avro to bcc:

>> Why can't it use the schema file from front-end invocation?
You're right. It should load the schema file in the front-end and pass it to
the back-end via properties. Unfortunately, Piggybank AvroStorage doesn't do
this. However, the new built-in AvroStorage in Pig 0.12 does exactly what you
want. Can you use it instead?

https://github.com/apache/pig/blob/trunk/src/org/apache/pig/builtin/AvroStorage.java#L120


On Tue, Dec 24, 2013 at 10:15 AM, Ruslan Al-Fakikh <[email protected]> wrote:

> Hey guys,
>
> I am using AvroStorage like this:
>
> STORE alias INTO '$OUTPUT'
> USING org.apache.pig.piggybank.storage.avro.AvroStorage('{
> "index" : 1,
> "schema_uri": "file://path/schema.avsc"}');
>
> so, it is explicit to take the schema.avsc from the local file system, not
> HDFS.
> It works in a pseudo-distributed cluster, but fails on a normal cluster
> with java.io.FileNotFoundException for the schema file
> Looks like this is happening in backend.
> I assume this is because the backend invocation of AvroStorage on a node,
> different from the node I am running the pig script from, cannot find the
> file in the local file system.
> Why can't it use the schema file from front-end invocation?
> Does it mean that I am only limited to either HDFS locations for
> schema_uri or using embedding the schema string in AvroStorage parameters?
>
> Thanks in advance
>
> Ruslan Al-Fakikh
>
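
For reference, the HDFS option raised in the question can be sketched as
below. This is only a sketch under assumptions: the HDFS path
/user/ruslan/schemas/schema.avsc is hypothetical, and the schema file is
assumed to have been copied there beforehand (for example with
hadoop fs -put). Everything else is taken from the STORE statement in the
original message.

```pig
-- Assumption: schema.avsc was uploaded to HDFS before running the script, e.g.
--   hadoop fs -put schema.avsc /user/ruslan/schemas/schema.avsc
-- The path below is hypothetical; substitute your own location.
STORE alias INTO '$OUTPUT'
USING org.apache.pig.piggybank.storage.avro.AvroStorage('{
"index" : 1,
"schema_uri": "hdfs:///user/ruslan/schemas/schema.avsc"}');
```

Because every task node can reach HDFS, the back-end invocation should no
longer depend on a file that exists only on the machine submitting the script.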
