I'm not sure either, but it's a good point. So basically it would be
possible to create a UDF that generates a Map<String, Object> from my input,
right?
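
A minimal sketch of the conversion I have in mind, in plain Java so it runs
without Pig on the classpath. The class name BagToMap and the method toMap
are hypothetical, and each inner List stands in for one (name, value) Tuple
from the DataBag. In a real UDF I assume this loop would live in the exec()
method of an EvalFunc<Map<String, Object>> and iterate the DataBag instead.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pig or CassandraStorage.
final class BagToMap {
    private BagToMap() {}

    // Each inner list models one (columnName, columnValue) tuple from the bag.
    static Map<String, Object> toMap(List<? extends List<?>> bag) {
        Map<String, Object> columns = new HashMap<>();
        if (bag == null) {
            return columns;
        }
        for (List<?> pair : bag) {
            // Skip malformed entries instead of failing the whole record.
            if (pair == null || pair.size() < 2 || pair.get(0) == null) {
                continue;
            }
            // A later duplicate column name overwrites an earlier one.
            columns.put(pair.get(0).toString(), pair.get(1));
        }
        return columns;
    }
}
```

The other UDFs could then look up columns by name instead of each one
repeating the iteration over the bag.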
--
Christian Decker
Software Architect
http://blog.snyke.net


On Wed, Aug 25, 2010 at 8:11 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

> Chris,
> This sort of pattern is not common because Map<String, Object> is a
> built-in data type in Pig; I am not sure why Cassandra doesn't just use
> it.
> That would seem to be the right solution based on what I am reading in your
> email.
>
> -D
>
> On Wed, Aug 25, 2010 at 10:59 AM, Christian Decker <
> decker.christ...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm trying to read some data from CassandraStorage (contrib by Cassandra)
> > and then work on it, but the format of the data is just incredibly ugly.
> > When just loading it and dumping it I can see that the format is something
> > like this:
> >
> >
> > (key,{(col0,col0value),(col1,col1value),(col2,col2value),(col3,col3value)})
> >
> >
> > which makes my UDFs incredibly ugly:
> >
> > public Boolean exec(Tuple arg0) throws IOException {
> >   // Column values pulled out of the bag of (name, value) tuples.
> >   double col1 = 0.0;
> >   double col2 = 0.0;
> >
> >   DataBag b = (DataBag) arg0.get(0);
> >   Iterator<Tuple> i = b.iterator();
> >   while (i.hasNext()) {
> >     Tuple next = i.next();
> >     String name = next.get(0).toString();
> >     if ("col1".equals(name)) {
> >       col1 = Double.parseDouble(next.get(1).toString());
> >     } else if ("col2".equals(name)) {
> >       col2 = Double.parseDouble(next.get(1).toString());
> >     }
> >   }
> >
> >   ...
> >
> > }
> >
> >
> > As you can see, most of this is just iterating over the DataBag and
> > mapping the column names to their values before working on the real data.
> > Since my guess is that this is quite commonplace and time-consuming, I was
> > wondering whether there is a better way to prepare the data before passing
> > it to the UDFs, some sort of HashMap that extracts column names and values
> > and stores them correctly.
> >
> > Regards,
> > Chris
> >
> > --
> > Christian Decker
> > Software Architect
> > http://blog.snyke.net
> >
>
