I'm not sure either, but it's a good point. So basically it would be possible to create a UDF that generates a Map<String, Object> from my input, right?

--
Christian Decker
Software Architect
http://blog.snyke.net

On Wed, Aug 25, 2010 at 8:11 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

> Chris,
> This sort of pattern is not common because Map<String, Object> is a
> primitive data type in Pig, I am not sure why Cassandra doesn't just use
> it.
> That would seem to be the right solution based on what I am reading in your
> email.
>
> -D
>
> On Wed, Aug 25, 2010 at 10:59 AM, Christian Decker <
> decker.christ...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm trying to read some data from CassandraStorage (contrib by Cassandra)
> > and then work on it, but the format of the data is just incredibly ugly.
> > When just loading it and dumping it I can see that the format is something
> > like this:
> >
> > (key,{(col0,col0value),(col1,col1value),(col2,col2value),(col3,col3value)})
> >
> > which makes my UDFs incredibly ugly:
> >
> > public Boolean exec(Tuple arg0) throws IOException {
> >     DataBag b = (DataBag) arg0.get(0);
> >     Iterator<Tuple> i = b.iterator();
> >     while(i.hasNext()){
> >         Tuple next = i.next();
> >         if("col1".equals(next.get(0).toString()))
> >             col1 = Double.parseDouble(next.get(1).toString());
> >         else if("longitude".equals(next.get(0).toString()))
> >             col2 = Double.parseDouble(next.get(1).toString());
> >     }
> > }
> > ...
> > }
> >
> > As you can see the most part of this is just iterating over the DataBag and
> > mapping the column names to their value, before working on the real data.
> > Since my guess is that this is quite commonplace and timeconsuming, I was
> > wondering whether there is a better way to prepare the data before passing
> > it to the UDFs, some sort of HashMap that extracts column names and values
> > and stores them correctly.
> >
> > Regards,
> > Chris
> >
> > --
> > Christian Decker
> > Software Architect
> > http://blog.snyke.net
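
For illustration, the conversion discussed in this thread, turning a bag of (column name, value) pairs into a Map<String, Object>, could look roughly like the sketch below. It is written against plain Java collections so that it runs on its own: each (name, value) pair is modeled as a List, whereas a real Pig UDF would presumably extend EvalFunc<Map<String, Object>> and read Tuples from a DataBag. The class name ColumnBagToMap and the method toMap are hypothetical and are not part of Pig or Cassandra.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Pig or Cassandra.
final class ColumnBagToMap {
    private ColumnBagToMap() {}

    /**
     * Collects (name, value) pairs into a map keyed by column name.
     * Assumption: each pair is a List whose first element is the column
     * name and whose second element is the value. In a Pig UDF the pairs
     * would be Tuples taken from a DataBag instead.
     */
    static Map<String, Object> toMap(Iterable<? extends List<?>> bag) {
        // LinkedHashMap keeps the columns in the order they were read.
        Map<String, Object> columns = new LinkedHashMap<>();
        if (bag == null) {
            return columns;
        }
        for (List<?> pair : bag) {
            // Skip malformed entries rather than failing the whole record.
            if (pair == null || pair.size() < 2 || pair.get(0) == null) {
                continue;
            }
            // If a column name repeats, the later value overwrites the earlier one.
            columns.put(pair.get(0).toString(), pair.get(1));
        }
        return columns;
    }
}
```

With something like this in place, a downstream UDF could look values up by column name (for example columns.get("col1")) instead of iterating over the bag and comparing names itself.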