Thank you for your reply. The problem is that I cannot find a way to go from the String representation of a complex data type to a nested Object of Pig DataTypes. I looked over the Pig 0.10.1 docs, but could not find a way to go from a String and a Schema to a Pig DataType Object.
For context, I am generating these Strings for my own JUnit testing of other UDFs. Currently, for complex types, I have to build each level of nesting from the Tuple and DataBag factories, append data, and nest them manually. For larger unit tests this becomes unwieldy (hundreds of lines per method, non-dynamic), and it would be much simpler to go directly from a String and a Schema to a DataBag Object for UDF testing (a few lines of code, easily modifiable).

-Dan

On Mon, Mar 18, 2013 at 6:31 PM, Jonathan Coveney <[email protected]> wrote:

> Why not just use PigStorage? This is essentially what it does. It saves a
> bag as text, and then loads it again.
>
> I suppose the question becomes: why do you need to do this?
>
>
> 2013/3/18 Dan DeCapria, CivicScience <[email protected]>
>
> > In Java, I am trying to convert a DataBag from its String representation
> > with its schema String to a valid DataBag Object:
> >
> > String databag_string = "{(apples,1024)}";
> > String schema_string = "b1:bag{t1:tuple(a:chararray,b:long)}";
> >
> > I've tried implementing something along the lines of this, but I believe
> > it's in the wrong direction, and then I get stuck:
> >
> > String[] aliases = {"b1", "t1", "a", "b"};
> > byte[] types = {DataType.BAG, DataType.TUPLE, DataType.CHARARRAY,
> >         DataType.LONG};
> > List<Schema.FieldSchema> fsList = new
> >         ArrayList<Schema.FieldSchema>();
> > for (int i = 0; i < aliases.length; i++) {
> >     fsList.add(new Schema.FieldSchema(aliases[i], types[i]));
> > }
> > Schema origSchema = new Schema(fsList);
> > ResourceSchema rsSchema = new ResourceSchema(origSchema);
> > Schema genSchema = Schema.getPigSchema(rsSchema);
> > ResourceSchema.ResourceFieldSchema[] rfschema =
> >         rsSchema.getFields();
> > ... lost here, maybe Utf8StorageConverter c = new
> > Utf8StorageConverter(); ???
> >
> >
> > An ideal process would be along the lines of:
> >
> > DataBag d = BagFactory.getInstance().newDefaultBag();
> > d.something(databag_string, schema_string); // ??? no idea what this
> > process could be
> > d.toString().equals(databag_string) == true.
> >
> > Thanks, -Dan
> >
>

--
Dan DeCapria
CivicScience, Inc.
Senior Informatics / DM / ML / BI Specialist
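
To make the "ideal process" concrete, here is a minimal sketch of the String-to-nested-structure round trip, under some loud assumptions: it uses only plain Java collections (a List of Lists) as stand-ins for Pig's DataBag and Tuple, it does not call PigStorage or Utf8StorageConverter, and the names BagTextParser, FieldType, parseBag and format are hypothetical and not part of Pig. It handles only a flat bag of tuples with chararray and long fields, with no escaping or deeper nesting.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in parser: bags and tuples become nested Lists, not Pig objects. */
final class BagTextParser {
    /** Leaf types this sketch understands; the names mirror the schema in the thread. */
    enum FieldType { CHARARRAY, LONG }

    private final String text;
    private int pos = 0;

    private BagTextParser(String text) { this.text = text; }

    /** Parses text such as "{(a,1),(b,2)}", converting each tuple field by tupleTypes. */
    static List<List<Object>> parseBag(String text, FieldType... tupleTypes) {
        BagTextParser p = new BagTextParser(text.trim());
        List<List<Object>> bag = p.readBag(tupleTypes);
        if (p.pos != p.text.length()) {
            throw new IllegalArgumentException("Trailing characters at index " + p.pos);
        }
        return bag;
    }

    private List<List<Object>> readBag(FieldType[] types) {
        expect('{');
        List<List<Object>> bag = new ArrayList<>();
        if (peek() == '}') { pos++; return bag; } // empty bag "{}"
        do {
            bag.add(readTuple(types));
        } while (consumeIf(','));
        expect('}');
        return bag;
    }

    private List<Object> readTuple(FieldType[] types) {
        expect('(');
        List<Object> tuple = new ArrayList<>();
        for (int i = 0; i < types.length; i++) {
            if (i > 0) expect(',');
            tuple.add(convert(readField(), types[i]));
        }
        expect(')');
        return tuple;
    }

    /** Reads raw characters up to the next ',' or ')'; escaping is not supported. */
    private String readField() {
        int start = pos;
        while (pos < text.length() && text.charAt(pos) != ',' && text.charAt(pos) != ')') pos++;
        return text.substring(start, pos);
    }

    private static Object convert(String raw, FieldType type) {
        switch (type) {
            case LONG: return Long.valueOf(raw.trim());
            default: return raw; // CHARARRAY stays a String
        }
    }

    private char peek() {
        if (pos >= text.length()) throw new IllegalArgumentException("Unexpected end of input");
        return text.charAt(pos);
    }

    private void expect(char c) {
        if (peek() != c) throw new IllegalArgumentException("Expected '" + c + "' at index " + pos);
        pos++;
    }

    private boolean consumeIf(char c) {
        if (pos < text.length() && text.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    /** Renders the nested lists back into the text form used in the thread. */
    static String format(List<List<Object>> bag) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < bag.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append('(');
            List<Object> tuple = bag.get(i);
            for (int j = 0; j < tuple.size(); j++) {
                if (j > 0) sb.append(',');
                sb.append(tuple.get(j));
            }
            sb.append(')');
        }
        return sb.append('}').toString();
    }
}
```

In a unit test this would be used as BagTextParser.parseBag(databag_string, FieldType.CHARARRAY, FieldType.LONG), followed by a check that BagTextParser.format(...) equals databag_string. Swapping the List stand-ins for real Pig Tuple and DataBag instances, and driving the field types from a parsed schema String rather than a varargs list, is left open here because it depends on the Pig version in use.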
