Re: [galaxy-user] BED to BAM conversion in Galaxy

2011-10-04 Thread Jennifer Jackson

Hello,

The last line of the report you sent suggests that the file has format 
problems. I read through your other email and noticed that a few steps 
were inserted, perhaps because they were needed. However, the line noted 
here is not in BED format:


On 10/4/11 12:06 PM, shamsher jagat wrote:

'Strand information can not be recognized in this line:
"chr1\t10093\t10093\t10292\t61PDWAAXX100706:4:82:5766:21319


c1 chrom = chr1
c2 start = 10093
c3 end   = 10093
c4 name  = 10292
c5 score = 61PDWAAXX100706:4:82:5766:21319
c6 strand = no data
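
As a rough illustration of the breakdown above, this Python sketch splits the reported line on tabs and pairs each value with the BED column it lands in. The helper name `label_bed_fields` is made up for this example, and the input is only the part of the line quoted in the error message (it is cut off there).

```python
# Standard names for the first six BED columns.
BED6_NAMES = ["chrom", "start", "end", "name", "score", "strand"]


def label_bed_fields(line):
    """Pair each tab-separated value with the BED column it occupies.

    Hypothetical helper for illustration; columns with no value are
    reported as None.
    """
    values = line.rstrip("\n").split("\t")
    padded = values + [None] * (len(BED6_NAMES) - len(values))
    return list(zip(BED6_NAMES, padded))


# The part of the line quoted in the error message (truncated there).
reported = "chr1\t10093\t10093\t10292\t61PDWAAXX100706:4:82:5766:21319"

for column, value in label_bed_fields(reported):
    print(column, "=", value if value is not None else "no data")
```

Read this way, the read name sits in the score column and nothing is left for the strand column.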

A description of BED format can be found at http://usegalaxy.org -> "Get 
Data -> Upload File" (scroll down to BED) or on most tool forms that 
use a BED file, such as "Convert Formats -> BED-to-GFF".


A few guidelines (subject to amendment by UCSC readers!):
1 - start is 0-based
2 - start is always a smaller number than end, as coordinates are 
reported with respect to the forward strand. Start and end are never 
the same value.
3 - score is a value between 0-1000, where 0 means undefined.
4 - strand can be "+", "-", or ".", where the "." means undefined.
5 - BED files have to be at least 3 columns, but can have up to 15. Any 
column used must have all preceding columns defined. Columns 7-12 are 
usually considered interdependent by the tools that use that data, as 
are columns 13-15 (newer spec, for microarray data).
6 - BED files are often BED3-6, BED12 or BED15.
7 - some older tools will not recognize columns 13-15 as being "strict 
BED" format.
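
The guidelines above could be checked line by line with something like the Python sketch below. It is only an illustration: `check_bed_line` is a hypothetical helper, it covers just guidelines 2-5 for the first six columns, and the "good" line is the earlier example line with a placeholder score of 0 added.

```python
def check_bed_line(line):
    """Return a list of problems found in one BED line (empty if none).

    Hypothetical checker covering only the guidelines listed above.
    """
    problems = []
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return ["fewer than 3 tab-separated columns"]

    start, end = fields[1], fields[2]
    if not (start.isdecimal() and end.isdecimal()):
        problems.append("start and end must be non-negative integers")
    elif int(start) >= int(end):
        problems.append("start must be smaller than end")

    if len(fields) >= 5:
        score = fields[4]
        if not score.isdecimal() or int(score) > 1000:
            problems.append("score must be an integer from 0 to 1000")

    if len(fields) >= 6 and fields[5] not in ("+", "-", "."):
        problems.append("strand must be '+', '-' or '.'")

    return problems


good = "chr1\t724027\t724226\t61PDWAAXX100706:4:19:6952:18071\t0\t-"
bad = "chr1\t10093\t10093\t10292\t61PDWAAXX100706:4:82:5766:21319"

print(check_bed_line(good))
for problem in check_bed_line(bad):
    print("problem:", problem)
```

Running a check like this over the whole file before starting a tool makes it easier to find the first offending line.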


Once the data is sorted out, if you continue to have problems, please 
send in a bug report from an error dataset and note in the comments that 
the bug is from you, if the account email address is different. Please 
be sure to leave all input datasets and the error dataset in the history 
until we can examine and provide feedback.


Thanks,

Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

 http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

 http://lists.bx.psu.edu/


Re: [galaxy-user] BED to BAM conversion in Galaxy

2011-10-04 Thread shamsher jagat
Now when I run the same files on the main Galaxy server, I get the following
errors. Do you have any suggestion as to why the same files work on the
development server but not on the main server, using the same steps?

INFO @ Tue, 04 Oct 2011 14:56:21:
# ARGUMENTS LIST:
# name = MACS_in_Galaxy
# format = BED
# ChIP-seq file = /galaxy/main_database/files/003/068/dataset_3068865.dat
# control file = /galaxy/main_database/files/003/068/dataset_3068668.dat
# effective genome size = 2.70e+09
# tag size = 25
# band width = 300
# model fold = 30
# pvalue cutoff = 5.00e-02
# Ranges for calculating regional lambda are : peak_region,1000,5000,1
INFO @ Tue, 04 Oct 2011 14:56:21: #1 read tag files...
INFO @ Tue, 04 Oct 2011 14:56:21: #1 read treatment tags...
INFO @ Tue, 04 Oct 2011 14:56:32: 100
INFO @ Tue, 04 Oct 2011 14:56:44: 200
INFO @ Tue, 04 Oct 2011 14:56:55: 300
INFO @ Tue, 04 Oct 2011 14:57:06: 400
INFO @ Tue, 04 Oct 2011 14:57:19: 500
INFO @ Tue, 04 Oct 2011 14:57:30: 600
INFO @ Tue, 04 Oct 2011 14:57:41: 700
INFO @ Tue, 04 Oct 2011 14:57:52: 800
INFO @ Tue, 04 Oct 2011 14:58:03: 900
INFO @ Tue, 04 Oct 2011 14:58:15: 1000
INFO @ Tue, 04 Oct 2011 14:58:26: 1100
INFO @ Tue, 04 Oct 2011 14:58:37: 1200
INFO @ Tue, 04 Oct 2011 14:58:49: #1.2 read input tags...
Traceback (most recent call last):
  File "/home/g2main/linux2.6-x86_64/bin/macs", line 273, in main()
  File "/home/g2main/linux2.6-x86_64/bin/macs", line 57, in main
    (treat, control) = load_tag_files_options (options)
  File "/home/g2main/linux2.6-x86_64/bin/macs", line 256, in load_tag_files_options
    control = options.build(open2(options.cfile, gzip_flag=options.gzip_flag))
  File "/home/g2main/linux2.6-x86_64/lib/python2.6/MACS/IO/__init__.py", line 1063, in build_fwtrack
    (chromosome,fpos,strand) = self.__fw_parse_line(thisline)
  File "/home/g2main/linux2.6-x86_64/lib/python2.6/MACS/IO/__init__.py", line 1102, in __fw_parse_line
    raise self.StrandFormatError(thisline,thisfields[5])
MACS.IO.StrandFormatError: 'Strand information can not be recognized in this line: "chr1\t10093\t10093\t10292\t61PDWAAXX100706:4:82:5766:21319

I can share this history if required.
Thanks.


Re: [galaxy-user] BED to BAM conversion in Galaxy

2011-10-03 Thread shamsher jagat
This is what I followed:


1.   Upload the BED file (60) > Text Manipulation > Add column: add the
value 0; iterate: no. This gives file 73.

2.   73 > Text Manipulation > Cut: c1,c2,c3,c4,c6,c5, delimited by
tab. This gives file 74.

3.   74 > pencil icon > change datatype to tabular (file 74).

4.   Text Manipulation > Convert all whitespace to tabs. This gives file 75.

5.   *Condense consecutive characters: I cannot find this option. I am
using the development Galaxy server. Is this option available
there?*

6.   Change the file type to BED (file 75).

7.   Pencil icon > Edit attributes: set column 5 as score (file 75).

8.   Run MACS from NGS: Peak Calling.

I have shared my history with you (http://test.g2.bx.psu.edu/root).
How can we annotate the genes corresponding to the peaks?
Thanks


Re: [galaxy-user] BED to BAM conversion in Galaxy

2011-09-30 Thread Jennifer Jackson

Hello,

The format of the BED file may be a problem. To be in BED format, an 
additional field is required for the "score" attribute. This would be 
column 5, moving the strand out to column 6.


To do this:

1 - use "Text Manipulation->Add column" with the value "0"
note: "0" is often used to represent a NULL or undefined score value in 
BED files. This field cannot be left as whitespace (two tabs); a 
placeholder value must be present.


2 - then use "Text Manipulation->Cut" and cut out the columns in the 
proper BED file order, in this case "c1,c2,c3,c4,c6,c5", to swap the 
last two


3 - change datatype to BED using the pencil icon/Edit attributes form
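
Outside Galaxy, the first two steps amount to the small transformation sketched below in Python. This is only an illustration of the column shuffle, assuming the input has exactly five tab-separated columns (chrom, start, end, name, strand); `add_score_and_reorder` is a hypothetical helper, not a Galaxy tool.

```python
def add_score_and_reorder(line, score="0"):
    """Turn 'chrom start end name strand' into BED6 column order.

    Hypothetical helper: appends a placeholder score as column 6
    (like "Add column"), then emits c1,c2,c3,c4,c6,c5 (like "Cut").
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError("expected 5 tab-separated columns, got %d" % len(fields))
    fields.append(score)            # step 1: add the placeholder score column
    order = [0, 1, 2, 3, 5, 4]      # step 2: c1,c2,c3,c4,c6,c5
    return "\t".join(fields[i] for i in order)


five_column = "chr1\t724027\t724226\t61PDWAAXX100706:4:19:6952:18071\t-"
print(add_score_and_reorder(five_column))
```

The result has the score "0" in column 5 and the strand in column 6, which is the order BED tools expect.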

In Galaxy, many of the tools in "NGS: Peak Calling" will work with 
ChIP-seq data in BED format. Having a control would be helpful, but is 
not required by all tools.


Good luck with your project,

Jen
Galaxy team



Re: [galaxy-user] BED to BAM conversion in Galaxy

2011-09-29 Thread shamsher jagat
Thanks Jen,

My problem is that I have ChIP-seq data where I have one BED
file with coordinates:


chr1   724027  724226  61PDWAAXX100706:4:19:6952:18071   -

Then there is a wig file. Is it possible to analyze this data in
Galaxy/Cistrome? I tried to use Cistrome, which gave me an error message.



Thanks



Re: [galaxy-user] BED to BAM conversion in Galaxy

2011-09-28 Thread Jennifer Jackson

Hello,

It is possible to go from SAM/BAM to BED, but not the reverse. SAM/BAM 
files contain the actual sequence data associated with the original 
aligned read. BED files only have the reference genome location of the 
alignment (no read "sequence").


It is possible to extract genomic sequence based on BED coordinates, but 
the resulting sequence would not necessarily be the same sequence as in 
the original aligned read (any variation would be lost).
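
A toy Python sketch of that point, with an invented reference string and read (neither comes from real data): slicing the reference by BED coordinates (0-based start, end exclusive) returns reference bases, so a mismatch carried by the read does not appear. `extract_interval` is a hypothetical helper.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_interval(reference, start, end, strand="+"):
    """Return reference bases for a BED interval (0-based start, end exclusive)."""
    segment = reference[start:end]
    if strand == "-":
        # Reverse-complement intervals on the minus strand.
        segment = segment.translate(COMPLEMENT)[::-1]
    return segment


# Toy data, invented for this example.
reference = "ACGTACGTAC"
read_sequence = "GTTC"   # a read aligned at 2..6 that carries one mismatch

from_bed = extract_interval(reference, 2, 6)
print(from_bed)                    # bases taken from the reference
print(from_bed == read_sequence)   # the read's own variation is not recovered
```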


BED is very similar to Interval format, so Interval tools also work with 
BED format. A BED file is basically a 3-12 column, tab delimited file, 
so tools that work with Tabular data are also appropriate for BED file. 
Note that you may need to change the datatype to be interval or tab for 
certain tools to recognize a BED file as an input.
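
Since a BED file is plain tab-delimited text, one quick sanity check is to count the columns on each line. The Python sketch below is an illustration only: `column_counts` is a hypothetical helper and the sample lines are made up from the example coordinates in this thread.

```python
from collections import Counter


def column_counts(lines):
    """Count how many lines have each number of tab-separated columns."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blanks, comments and UCSC header lines
        counts[len(line.split("\t"))] += 1
    return counts


sample = [
    "chr1\t724027\t724226\t61PDWAAXX100706:4:19:6952:18071\t0\t-",
    "chr1 724027 724226",   # space-separated: read as a single column
]
print(dict(column_counts(sample)))
```

A file where every line reports the same count (3 to 6, or 12) is at least consistently tab-delimited; a count of 1 usually means spaces were used instead of tabs.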


Hopefully this helps,

Jen
Galaxy team





[galaxy-user] BED to BAM conversion in Galaxy

2011-09-22 Thread shamsher jagat
Is it possible to use a tool in Galaxy to convert a BED file to a BAM/SAM
file? In other words, do we have BEDTools or another option in Galaxy?

Thanks