Hi all, I'm trying to read some data through CassandraStorage (from Cassandra's contrib) and then work on it, but the format of the data is incredibly ugly. When I simply load it and dump it, I can see that the format is something like this:

    (key,{(col0,col0value),(col1,col1value),(col2,col2value),(col3,col3value)})

which makes my UDFs incredibly ugly:

    public Boolean exec(Tuple arg0) throws IOException {
        DataBag b = (DataBag) arg0.get(0);
        Iterator<Tuple> i = b.iterator();
        while (i.hasNext()) {
            Tuple next = i.next();
            // Map each column name to its value before doing any real work.
            if ("col1".equals(next.get(0).toString()))
                col1 = Double.parseDouble(next.get(1).toString());
            else if ("longitude".equals(next.get(0).toString()))
                col2 = Double.parseDouble(next.get(1).toString());
        }
        ...
    }

As you can see, most of this is just iterating over the DataBag and mapping the column names to their values before working on the real data. Since my guess is that this is quite commonplace and time-consuming, I was wondering whether there is a better way to prepare the data before passing it to the UDFs: some sort of HashMap that extracts the column names and values and stores them correctly.

Regards,
Chris

-- 
Christian Decker
Software Architect
http://blog.snyke.net
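
P.S. To make the "some sort of HashMap" idea more concrete, here is a rough sketch of the kind of helper I have in mind. It is only an illustration under a few assumptions: ColumnMapper and toMap are hypothetical names (nothing like this ships with Pig or CassandraStorage as far as I know), and plain Java lists stand in for the DataBag and its two-field Tuples so that the snippet compiles without Pig on the classpath. In a real UDF the loop would iterate over the DataBag and read the fields with Tuple.get(0) and Tuple.get(1) instead.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pig or CassandraStorage.
final class ColumnMapper {

    private ColumnMapper() {
    }

    // Converts a bag of (name, value) pairs into a name -> value map.
    // Assumption: each inner list stands in for a two-field Pig Tuple.
    static Map<String, String> toMap(Iterable<? extends List<?>> bag) {
        // LinkedHashMap keeps the columns in the order they were read.
        Map<String, String> columns = new LinkedHashMap<>();
        for (List<?> column : bag) {
            // Skip columns that are too short or have a null name or value.
            if (column.size() < 2 || column.get(0) == null || column.get(1) == null) {
                continue;
            }
            columns.put(column.get(0).toString(), column.get(1).toString());
        }
        return columns;
    }
}
```

With something like this, the body of exec would shrink to one call to toMap followed by lookups such as Double.parseDouble(columns.get("col1")), and the same helper could be shared by every UDF that reads these rows.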