Hi Peter,
Florian's answers are very good. I am not sure this will add much, but
perhaps a little, for the parts of your questions about the Galaxy
output datasets.
The latest Using Galaxy paper, protocol 3, includes all of the
optional output that MACS in Galaxy will produce (in addition to the
linked files from the HTML report). Apart from the primary BED file and
HTML output, there are 4 files, paired by tags/control: 2 interval and
2 wig.
The coordinate system used by each file specification can vary, as you
have already observed. See the documentation links for exactly how
these files are formatted. Regardless of the file coordinate system,
a browser that interprets the datatype correctly will display the
start/stop correctly, which is where the output datasets in Galaxy can
be useful. That is, whether the start in the file is 1-based or
0-based, the actual start base will be displayed as the same base.
Load the output into the UCSC Browser or Trackster in Galaxy, scroll
into one of the regions to view this, and compare with the files (both
the datasets in Galaxy and those downloaded through links) to better
understand.
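
To make the coordinate point concrete, here is a small Python sketch of
the usual conversion between a 0-based, half-open interval (the
convention in the UCSC BED specification) and a 1-based, fully closed
interval. The function names are made up for illustration and nothing
here is taken from MACS or Galaxy. Note that the example in the question
below shifts both start and end by one, which is not this conversion, so
please confirm against the file specifications which convention each
file actually uses.

```python
# Sketch only: assumes exactly two conventions,
#   0-based half-open  [start, end)   (UCSC BED style)
#   1-based closed     [start, end]
# Function names are hypothetical, not part of MACS or Galaxy.

def zero_based_to_one_based(start, end):
    """Convert 0-based half-open [start, end) to 1-based closed [start, end]."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    # Only the start moves; the end value is numerically unchanged.
    return start + 1, end


def one_based_to_zero_based(start, end):
    """Convert 1-based closed [start, end] to 0-based half-open [start, end)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


def length_zero_based(start, end):
    """Number of bases covered by a 0-based half-open interval."""
    return end - start


def length_one_based(start, end):
    """Number of bases covered by a 1-based closed interval."""
    return end - start + 1


if __name__ == "__main__":
    print(zero_based_to_one_based(99, 120))  # (100, 120)
```

Both representations describe the same bases, which is why a browser
that knows the datatype draws the region in the same place either way.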
Full documentation for core MACS output is in the MACS documentation
(link given by Florian, also linked from MACS tool page).
Documentation/examples for the Galaxy output files is in our paper:
http://main.g2.bx.psu.edu/u/galaxyproject/p/using-galaxy-2012 (scroll to
protocol 3)
http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi1005s38/full#bi1005-prot-0003
(see step #6)
More help for datatypes:
http://wiki.g2.bx.psu.edu/Learn/Datatypes (bed, interval, wig are all
covered with links to more resources)
Florian mostly covered these, but I'll address them as well, to be clear:
On 9/11/12 9:45 AM, peter scot wrote:
I ran MACS on my ChIP-seq dataset and found various files:
1. Under the HTML report there are two files: one is negative peaks.xls
and the second is peaks.xls. The file peaks.xls is the same as the
peaks.interval file in the output flow on the right, with one bp added
to each position, e.g. if the peak coordinates under the HTML report are
99 to 120, then in the peaks.interval file they are 100 to 121. Which
one should be followed?
This is related to the different coordinate systems. See the file
specifications.
2. What is the meaning of the negative peak.interval file?
It is a type of control data: basically, the inputs are flipped to
produce it. It may not be needed/useful for further downstream analysis.
The advice to read the MACS documentation to fully understand it is good.
3. I have used ctrl and treated samples to run MACS. There are two wig
files, one ctrl.wig and another treatment.wig. Do these two files belong
to the ctrl and treated samples, and if so, where are the corresponding
bed files?
These show the data density (pileup) in a graphical format. There are no
bed files for them, although you can visualize these against the other
bed and/or interval peak data to see how density was interpreted when
calling peaks.
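
If you would rather check the density against a peak outside of a
browser, something like the following Python sketch could be a starting
point. It assumes the wig file uses variableStep declarations as
described in the UCSC wiggle documentation (positions are 1-based there),
and it takes the peak as a 1-based closed interval. The function names
and the sample data are made up for illustration; please check the
declaration lines in your own wig files before relying on it.

```python
# Sketch only: handles variableStep wiggle data; names are hypothetical.

def parse_variable_step(lines):
    """Yield (chrom, start, end, value) using 1-based closed coordinates."""
    chrom, span = None, 1
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, and track/browser header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("variableStep"):
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            span = int(fields.get("span", 1))  # span defaults to 1 base
            continue
        if line.startswith("fixedStep"):
            raise ValueError("fixedStep is not handled in this sketch")
        if chrom is None:
            raise ValueError("data line before any variableStep declaration")
        pos, value = line.split()
        pos = int(pos)
        yield chrom, pos, pos + span - 1, float(value)


def max_signal(records, chrom, start, end):
    """Largest wig value overlapping the 1-based closed peak [start, end]."""
    values = [v for c, s, e, v in records
              if c == chrom and s <= end and e >= start]
    return max(values) if values else None


# Made-up example data, not real MACS output.
wig_text = """track type=wiggle_0 name="treatment"
variableStep chrom=chr1 span=10
91 2
101 5
111 7
121 3
"""

records = list(parse_variable_step(wig_text.splitlines()))
peak_max = max_signal(records, "chr1", 100, 121)
```

Running the same peak against both the treatment and the control wig
gives a rough idea of how different the pileup is in that region.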
Hopefully this helps!
Jen
Galaxy team
If someone can direct me to documentation of the output we get in Galaxy
when using MACS, that will be helpful.
Thanks
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org. Please keep all replies on the list by
using reply all in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
--
Jennifer Jackson
http://galaxyproject.org