You can use three separate bolts in Storm, each grouping on a longer
key prefix: bolt1 groups by (col1), bolt2 groups by (col1, col2), and
bolt3 groups by (col1, col2, col3).
bolt1 will give the top-level output you are expecting:
a1 {43,136}
a2 {52,77}
a4 {99,66}
bolt2 will give the second-level output:
a1
--b1 {21,33}
--b2 {22,103}
a2
--b1 {30,44}
--b3 {22,33}
a4
--b4 {99,66}
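The remaining leaf rows, one per (col1, col2, col3) combination, will be
bolt3's output.
You cannot have dynamic grouping; the grouping fields are fixed when the
topology is defined.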
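
To make the three grouping levels concrete, here is a rough sketch in
plain Java (no Storm dependency) of the sums each bolt would produce. The
class name PivotRollup, the method rollup, and the "/"-joined path keys are
hypothetical, chosen only for illustration; the rows are the sample data
from the question. It is not a topology, just the aggregation that the
bolts would perform for one fixed key order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; all names here are hypothetical.
final class PivotRollup {

    // Sample rows from the question: col1, col2, col3, val1, val2.
    static final List<String[]> ROWS = Arrays.asList(
        new String[] {"a1", "b1", "c1", "10", "20"},
        new String[] {"a1", "b1", "c2", "11", "13"},
        new String[] {"a1", "b2", "c1", "9", "15"},
        new String[] {"a1", "b2", "c3", "13", "88"},
        new String[] {"a2", "b1", "c1", "30", "44"},
        new String[] {"a2", "b3", "c2", "22", "33"},
        new String[] {"a4", "b4", "c4", "99", "66"});

    // Sums val1 and val2 for every prefix of the chosen key order.
    // keyOrder holds column indexes, e.g. 0, 1, 2 for col1->col2->col3.
    // A one-column prefix corresponds to bolt1, two columns to bolt2,
    // and three columns to bolt3.
    static Map<String, long[]> rollup(List<String[]> rows, int... keyOrder) {
        Map<String, long[]> totals = new TreeMap<>();
        for (String[] row : rows) {
            long val1 = Long.parseLong(row[3]);
            long val2 = Long.parseLong(row[4]);
            StringBuilder path = new StringBuilder();
            for (int col : keyOrder) {
                if (path.length() > 0) {
                    path.append('/');
                }
                path.append(row[col]);
                // Add this row's values to the running sum for the prefix.
                long[] sum = totals.computeIfAbsent(path.toString(), k -> new long[2]);
                sum[0] += val1;
                sum[1] += val2;
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Print every level of the col1->col2->col3 hierarchy.
        for (Map.Entry<String, long[]> e : rollup(ROWS, 0, 1, 2).entrySet()) {
            System.out.println(e.getKey() + " " + Arrays.toString(e.getValue()));
        }
    }
}
```

With key order 0, 1, 2 the entry for "a1" holds {43, 136} and "a1/b2" holds
{22, 103}, matching the bolt1 and bolt2 output above. Passing 2, 0, 1
gives the col3->col1->col2 view, but in a topology that order would be a
separate, fixed set of groupings.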
On 1/31/14, Kavi Kumar <[email protected]> wrote:
> I have rows in BigData DB (Cassandra in my case) with column names
> col1,col2,col3,val1,val2
>
> In the SQL approach I can group by col1,col2 or col2,col1 or any other
> possible order. This way I can form a tree hierarchy easily.
>
> But now we are using Cassandra to store the data, which doesn't support
> group by. So we want to use Storm for doing group by and aggregations.
> We wrote some sample code to do aggregation and group by, but we are unable
> to form an opinion on whether we can achieve it or not.
>
> Data looks like this
>
>
> col1,col2,col3,val1,val2
> ------------------------
> a1,b1,c1,10,20
> a1,b1,c2,11,13
> a1,b2,c1,9,15
> a1,b2,c3,13,88
> a2,b1,c1,30,44
> a2,b3,c2,22,33
> a4,b4,c4,99,66
>
>
> As in an Excel pivot, I want to build a hierarchy
> root->child1->child2->child3-val1,val2; it may look like this if my
> hierarchy is col1->col2->col3
>
>
> a1 {43,136}
> --b1 {21,33}
> ----c1 10,20
> ----c2 11,13
> --b2 {22,103}
> ----c1 9,15
> ----c3 13,88
> a2 {52,77}
> --b1 {30,44}
> ----c1 30,44
> --b3 {22,33}
> ----c2 22,33
> a4 {99,66}
> --b4 {99,66}
> ----c4 99,66
>
>
> I want to give the user the ability to re-arrange hierarchy elements,
> something like col3->col1->col2 (or some other order, chosen dynamically);
> in this case the data will look like this
>
> c1 {49,79}
> --a1 {19,35}
> ----b1 10,20
> ----b2 9,15
> --a2 {30,44}
> ----b1 30,44
> c2 {33,46}
> --a1 {11,13}
> ----b1 11,13
> --a2 {22,33}
> ----b3 22,33
> c3 {13,88}
> --a1 {13,88}
> ----b2 13,88
> c4 {99,66}
> --a4 {99,66}
> ----b4 99,66
>
>
>
> A few lines of my Trident code look like this; it is not working as
> expected.
>
>
> topology.newStream("aggregation", spout).groupBy(new Fields(
> "col1","col2","col3","val1","val2")).aggregate(new Fields(
> "val1","val2"), new Sum(), new Fields("val1sum","val2sum"))
> .each(new Fields("col1","col2","col3","val1sum","val2sum"), new
> Utils.PrintFilter());
>
>
> For the above transformations I want to use Storm, with or without
> Trident API support.
> Can anyone guide me on how to achieve it? Any program ideas are very much
> appreciated.
>