Hi all, I'm having a little trouble understanding the best way to perform some tabular file manipulations in Galaxy. I have several tabular files, each with a different number of columns, which I want to combine on a single column containing an identifier (which must match for the rows to be combined).
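To pin down the operation being asked about, here is a minimal sketch in plain Python, outside Galaxy. It is not how any Galaxy tool is implemented. The function name join_on_id, the key_col parameter, and the sample data are made up for illustration, and the sketch assumes each ID appears at most once per file and that at least one table is supplied.

```python
def join_on_id(tables, key_col=0):
    """Inner-join tables (lists of rows) on one column, keeping the key once.

    Hypothetical helper for illustration only; assumes unique IDs per table.
    """
    # Index each table by its key, storing the row without the key column.
    indexed = []
    for rows in tables:
        lookup = {}
        for row in rows:
            key = row[key_col]
            lookup[key] = row[:key_col] + row[key_col + 1:]
        indexed.append(lookup)

    merged = []
    # Walk the first table so that its row order is preserved.
    for row in tables[0]:
        key = row[key_col]
        # Only combine rows whose ID is present in every table.
        if all(key in lookup for lookup in indexed):
            combined = [key]
            for lookup in indexed:
                combined.extend(lookup[key])
            merged.append(combined)
    return merged


# Made-up sample data with 2, 4 and 3 columns respectively.
file1 = [["geneA", "0.1"], ["geneB", "0.2"]]
file2 = [["geneA", "1", "2", "3"], ["geneB", "4", "5", "6"]]
file3 = [["geneB", "x", "y"], ["geneA", "p", "q"]]

for row in join_on_id([file1, file2, file3]):
    print("\t".join(row))
```

Each output row holds the ID once, followed by the remaining columns of every input in order, which is the layout I am after.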
For example:

File 1 contains:
c1 = ID
c2 = Score1a

File 2 contains:
c1 = ID
c2 = Score2a
c3 = Score2b
c4 = Score2c

File 3 contains:
c1 = ID
c2 = Score3a
c3 = Score3b

Desired combined file containing:
c1 = ID
c2 = Score1a
c3 = Score2a
c4 = Score2b
c5 = Score2c
c6 = Score3a
c7 = Score3b

I have worked out how to do this with two calls to the "Join two Datasets" tool, but this repeats the join column (ID in this example) in the output, so a final clean-up is required using the "Cut" tool (which breaks the column assignments). The more flexible "Column Join" tool would let me combine an arbitrary number of files, but it is designed for input files that share the same column structure.

Is there a better way to do this with Galaxy as it stands? Alternatively, would an option on the "Join two Datasets" tool to omit the redundant join column be widely useful?

Peter

___________________________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:

http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists, please use the interface at:

http://lists.bx.psu.edu/