Thank you for the inputs! @Norbert, but a GROUP BY on the column number does not guarantee that the order of the values within each group is preserved either. The row number would also have to be known, so that at the end we could sort each group by row number using a nested FOREACH. But since ordering is not guaranteed to be preserved after that FOREACH, the data in each row may again end up in the wrong order for any later operation.
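To make the row-number idea concrete, here is a minimal sketch in plain Python that only simulates the map, shuffle and reduce steps in one process. It is not Pig or Hadoop code, and it assumes every input row already arrives tagged with its row number, which is exactly the part that is hard to get in a distributed job. The names map_cells and transpose are made up for this illustration.

```python
from collections import defaultdict


def map_cells(row_number, row):
    """Map step: emit (column_index, (row_number, value)) for every cell."""
    for col_index, value in enumerate(row):
        yield col_index, (row_number, value)


def transpose(numbered_rows):
    """Transpose rows given as (row_number, row) pairs, in any arrival order."""
    # Simulated shuffle: group the emitted cells by column index.
    groups = defaultdict(list)
    for row_number, row in numbered_rows:
        for col_index, tagged_value in map_cells(row_number, row):
            groups[col_index].append(tagged_value)

    # Reduce step: sort each group by row number, then drop the tag.
    result = []
    for col_index in sorted(groups):
        cells = sorted(groups[col_index], key=lambda cell: cell[0])
        result.append([value for _, value in cells])
    return result


rows = [
    ["col1", "col2", "col3"],
    ["col4", "col5", "col6"],
    ["col7", "col8", "col9"],
    ["col10", "col11", "col12"],
]
numbered = list(enumerate(rows))

# Feeding the rows in reverse simulates an arbitrary shuffle order.
for out_row in transpose(reversed(numbered)):
    print(" ".join(out_row))
```

The sort inside the reduce step is what restores the order, so the result does not depend on the order in which the cells arrive. Whether that order survives further operations downstream is a separate question that this sketch does not address.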
To me it seems like it is not possible to do this in MR.

On Fri, Jun 22, 2012 at 12:56 AM, Robert Evans <ev...@yahoo-inc.com> wrote:

> That may be true, I have not read through the code very closely, if you
> have multiple reduces, so you can run it with a single reduce or you can
> write a custom partitioner to do it. You only need to know the length of
> the column, and then you can divide them up appropriately, kind of like how
> the total order partitioner does it.
>
> --Bobby Evans
>
> On 6/21/12 1:15 PM, "Norbert Burger" <norbert.bur...@gmail.com> wrote:
>
> While it may be fine for many cases, If I'm reading the Nectar code
> correctly, that transpose doesn't guarantee anything about the order of
> rows within each column. In other words, transposing:
>
> a - b -c
> d - e - f
> g - h - i
>
> may give you different permutations of "a - d - g" as the first row,
> depending on shuffle order. You can trivially avoid this with one
> mapper/reducer, but then you're not exploiting the framework. Note that
> you can accomplish same with a higher-level language like PIg by using a
> UDF like LinkedIn's Enumerate [1] to tag each column, and then simply
> GROUPing BY column number.
>
> [1]
> https://raw.github.com/linkedin/datafu/master/src/java/datafu/pig/bags/Enumerate.java
>
> Norbert
>
> On Thu, Jun 21, 2012 at 5:00 AM, madhu phatak <phatak....@gmail.com>
> wrote:
>
> > Hi,
> > Its possible in Map/Reduce. Look into the code here
> >
> > https://github.com/zinnia-phatak-dev/Nectar/tree/master/Nectar-regression/src/main/java/com/zinnia/nectar/regression/hadoop/primitive/mapreduce
> >
> > 2012/6/21 Subir S <subir.sasiku...@gmail.com>
> >
> > > Hi,
> > >
> > > Is it possible to implement transpose operation of rows into columns and
> > > vice versa...
> > >
> > > i.e.
> > >
> > > col1 col2 col3
> > > col4 col5 col6
> > > col7 col8 col9
> > > col10 col11 col12
> > >
> > > can this be converted to
> > >
> > > col1 col4 col7 col10
> > > col2 col5 col8 col11
> > > col3 col6 col9 col12
> > >
> > > Is this even possible with map reduce? If yes, which language helps to
> > > achieve this faster?
> > >
> > > Thanks
> > >
> >
> > --
> > https://github.com/zinnia-phatak-dev/Nectar