Why not just use PigStorage? Converting between a bag and its text form is essentially what it does: it saves a bag as text and then loads it again.
I suppose the question becomes: why do you need to do this?

2013/3/18 Dan DeCapria, CivicScience <[email protected]>

> In Java, I am trying to convert a DataBag from its String representation
> with its schema String to a valid DataBag Object:
>
>   String databag_string = "{(apples,1024)}";
>   String schema_string = "b1:bag{t1:tuple(a:chararray,b:long)}";
>
> I've tried implementing something along the lines of this, but I believe
> it's in the wrong direction, and then I get stuck:
>
>   String[] aliases = {"b1", "t1", "a", "b"};
>   byte[] types = {DataType.BAG, DataType.TUPLE, DataType.CHARARRAY,
>       DataType.LONG};
>   List<Schema.FieldSchema> fsList = new ArrayList<Schema.FieldSchema>();
>   for (int i = 0; i < aliases.length; i++) {
>       fsList.add(new Schema.FieldSchema(aliases[i], types[i]));
>   }
>   Schema origSchema = new Schema(fsList);
>   ResourceSchema rsSchema = new ResourceSchema(origSchema);
>   Schema genSchema = Schema.getPigSchema(rsSchema);
>   ResourceSchema.ResourceFieldSchema[] rfschema = rsSchema.getFields();
>   ... lost here, maybe Utf8StorageConverter c = new Utf8StorageConverter(); ???
>
> An ideal process would be along the lines of:
>
>   DataBag d = BagFactory.getInstance().newDefaultBag();
>   d.something(databag_string, schema_string); // ??? no idea what this process could be
>   d.toString().equals(databag_string) == true.
>
> Thanks, -Dan
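
To make the "save as text, then load it again" idea concrete for the bag in the question, here is a minimal, dependency-free sketch of the round trip. It does not call any Pig classes: `BagTextSketch`, `FieldType`, `parseBag` and `formatBag` are hypothetical names invented for illustration, and the `FieldType` arguments stand in for the schema string. It only handles flat tuples of chararray and long fields, with no escaping of commas or parentheses, so treat it as a picture of the idea rather than of how PigStorage is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig: round-trips a flat bag through its text form.
final class BagTextSketch {

    /** Field types of each tuple; a stand-in for the schema string in the question. */
    enum FieldType { CHARARRAY, LONG }

    /** Parses text such as "{(apples,1024),(pears,7)}" into a list of typed tuples. */
    static List<List<Object>> parseBag(String text, FieldType... types) {
        String s = text.trim();
        if (s.length() < 2 || s.charAt(0) != '{' || s.charAt(s.length() - 1) != '}') {
            throw new IllegalArgumentException("not a bag: " + text);
        }
        List<List<Object>> bag = new ArrayList<>();
        String body = s.substring(1, s.length() - 1).trim();
        if (body.isEmpty()) {
            return bag; // "{}" is an empty bag
        }
        if (body.length() < 2 || body.charAt(0) != '(' || body.charAt(body.length() - 1) != ')') {
            throw new IllegalArgumentException("bag must contain tuples: " + text);
        }
        // Drop the outermost parentheses, then split between adjacent tuples.
        String inner = body.substring(1, body.length() - 1);
        for (String tupleText : inner.split("\\),\\(", -1)) {
            String[] fields = tupleText.split(",", -1);
            if (fields.length != types.length) {
                throw new IllegalArgumentException("tuple does not match schema: " + tupleText);
            }
            List<Object> tuple = new ArrayList<>();
            for (int i = 0; i < fields.length; i++) {
                if (types[i] == FieldType.LONG) {
                    tuple.add(Long.valueOf(fields[i].trim()));
                } else {
                    tuple.add(fields[i]);
                }
            }
            bag.add(tuple);
        }
        return bag;
    }

    /** Writes the tuples back out in the same "{(a,b),(c,d)}" text form. */
    static String formatBag(List<List<Object>> bag) {
        StringBuilder sb = new StringBuilder("{");
        for (int t = 0; t < bag.size(); t++) {
            if (t > 0) {
                sb.append(',');
            }
            sb.append('(');
            List<Object> tuple = bag.get(t);
            for (int f = 0; f < tuple.size(); f++) {
                if (f > 0) {
                    sb.append(',');
                }
                sb.append(tuple.get(f));
            }
            sb.append(')');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        String databagString = "{(apples,1024)}";
        List<List<Object>> bag = parseBag(databagString, FieldType.CHARARRAY, FieldType.LONG);
        // The round trip the question asks for: text -> typed values -> same text.
        System.out.println(formatBag(bag).equals(databagString));
    }
}
```

Passing the field types explicitly keeps the sketch short. Parsing the schema string itself, nested bags, and the other Pig types are the parts a real loader has to handle on top of this.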
