Re: [galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-07 Thread Jonas Grauholm
Hi Paul,

I'm happy to hear that so far you are happy with the data.

We used the Illumina TruSeq targeting kit. I have made the exome target regions 
available as both a .bed and a .gff file on the FTP.

Best regards
Jonas


From: galaxy-user-boun...@lists.bx.psu.edu 
[mailto:galaxy-user-boun...@lists.bx.psu.edu] On behalf of Guru Ananda
Sent: 06 May 2011 16:47
To: Peter Cock
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an 
plain text file ?

Hi Peter and Roman,

The "Count" tool under the "Statistics" section provides uniq-like functionality. 
If you run this tool with all columns selected in the "Count occurrences of 
values in column(s)" field, your output will contain one line per distinct record, 
with the 1st column containing the number of occurrences of that record.

Hope this answers your question.
Thanks for using Galaxy,
Guru.


On Fri, May 6, 2011 at 10:22 AM, Peter Cock 
<p.j.a.c...@googlemail.com> wrote:
On Fri, May 6, 2011 at 3:16 PM, Roman Valls 
<brainst...@nopcode.org> wrote:
> Well, having similarly basic tools (in Galaxy) that can be performed on
> the commandline, such as "sort" or "cut" I just wondered how come a
> "uniq" is not there on the tool panel in some form/name.
>
> Thanks for the feedback Rory !
That's a timely question - I was also looking for something within Galaxy
to take a text file and remove duplicate lines.

Peter
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

 http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

 http://lists.bx.psu.edu/



--
Graduate student, Bioinformatics and Genomics
Makova lab/Galaxy team
Penn State University
505 Wartik lab
University Park PA 16802
g...@psu.edu


Re: [galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-06 Thread Guru Ananda
Hi Peter and Roman,

The "Count" tool under the "Statistics" section provides uniq-like
functionality. If you run this tool with all columns selected in the "Count
occurrences of values in column(s)" field, your output will contain one line
per distinct record, with the 1st column containing the number of occurrences
of that record.
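
As a rough illustration of the output described above, here is a minimal
Python sketch that counts how often each whole line occurs, similar in spirit
to "sort file | uniq -c". It is not Galaxy's implementation of the Count tool;
the function name count_records and the sample data are made up for this
example, and the output ordering (sorted by record) is an assumption.

```python
from collections import Counter


def count_records(lines):
    """Return (count, record) pairs, one per distinct record, sorted by record."""
    # Strip the trailing newline so "a\n" and "a" count as the same record.
    counts = Counter(line.rstrip("\n") for line in lines)
    return [(n, record) for record, n in sorted(counts.items())]


if __name__ == "__main__":
    # Hypothetical tab-separated input with one duplicated record.
    sample = ["chr1\t100\n", "chr2\t200\n", "chr1\t100\n"]
    for n, record in count_records(sample):
        # The count goes in the first column, followed by the record itself.
        print(f"{n}\t{record}")
```

Dropping the first column of such output leaves one copy of each distinct
record, which is the uniq-like result asked about in this thread.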

Hope this answers your question.
Thanks for using Galaxy,
Guru.


On Fri, May 6, 2011 at 10:22 AM, Peter Cock wrote:

> On Fri, May 6, 2011 at 3:16 PM, Roman Valls wrote:
> > Well, having similarly basic tools (in Galaxy) that can be performed on
> > the commandline, such as "sort" or "cut" I just wondered how come a
> > "uniq" is not there on the tool panel in some form/name.
> >
> > Thanks for the feedback Rory !
>
> That's a timely question - I was also looking for something within Galaxy
> to take a text file and remove duplicate lines.
>
> Peter



-- 
Graduate student, Bioinformatics and Genomics
Makova lab/Galaxy team
Penn State University
505 Wartik lab
University Park PA 16802
g...@psu.edu

Re: [galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-06 Thread Peter Cock
On Fri, May 6, 2011 at 3:16 PM, Roman Valls wrote:
> Well, having similarly basic tools (in Galaxy) that can be performed on
> the commandline, such as "sort" or "cut" I just wondered how come a
> "uniq" is not there on the tool panel in some form/name.
>
> Thanks for the feedback Rory !

That's a timely question - I was also looking for something within Galaxy
to take a text file and remove duplicate lines.

Peter


Re: [galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-06 Thread Roman Valls
Well, having similarly basic tools (in Galaxy) that can be performed on
the commandline, such as "sort" or "cut" I just wondered how come a
"uniq" is not there on the tool panel in some form/name.

Thanks for the feedback Rory !

On 2011-05-06 15:56, Rory Kirchner wrote:
> Is there a reason why just using the command line tool isn't workable for 
> you? Personally, I'm happy when I can just do something quick like that.
> 
> Also you can simplify your command with sort filename | uniq.
> 
> -rory
> 
> On May 6, 2011, at 9:45 AM, Roman Valls wrote:
> 
>> Hey galaxy users,
>>
>> That's a fairly good question from one of my colleagues. I've looked
>> through the menus (mainly "Text Manipulation" and "Filter and
>> Sort"(Select)), googled (on the mailing list archives too), but couldn't
>> find an answer: How should I remove duplicates from plain text files
>> without resorting to:
>>
>> "cat file|sort|uniq" before uploading the file/text.
>>
>> or
>>
>> Putting a regexp together to replace the duplicate occurrences as in:
>>
>> http://www.regular-expressions.info/duplicatelines.html
>>
>>
>> I'm pretty sure I'm missing some really basic stuff here... is this
>> basic operation something supposed to be done outside galaxy perhaps ?
>>
>> Thanks in advance !
>>
> 


Re: [galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-06 Thread Rory Kirchner
Is there a reason why just using the command line tool isn't workable for you? 
Personally, I'm happy when I can just do something quick like that.

Also you can simplify your command with sort filename | uniq.
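
For anyone who wants the same result from a script rather than the shell, a
minimal Python sketch follows. The function names sorted_unique and
unique_in_order are made up for this example. Note that "sort" orders lines
according to the current locale, while Python's sorted() compares code points,
so the ordering can differ for some inputs.

```python
def sorted_unique(lines):
    """Distinct lines in sorted order, roughly what `sort filename | uniq` gives."""
    return sorted(set(lines))


def unique_in_order(lines):
    """Distinct lines in order of first appearance (no sorting needed)."""
    # dict keys keep insertion order, so the first occurrence of each line wins.
    return list(dict.fromkeys(lines))


if __name__ == "__main__":
    sample = ["geneB", "geneA", "geneB", "geneC", "geneA"]
    print(sorted_unique(sample))
    print(unique_in_order(sample))
```

The second variant is useful when the original line order matters, which
"sort | uniq" does not preserve.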

-rory

On May 6, 2011, at 9:45 AM, Roman Valls wrote:

> Hey galaxy users,
> 
> That's a fairly good question from one of my colleagues. I've looked
> through the menus (mainly "Text Manipulation" and "Filter and
> Sort"(Select)), googled (on the mailing list archives too), but couldn't
> find an answer: How should I remove duplicates from plain text files
> without resorting to:
> 
> "cat file|sort|uniq" before uploading the file/text.
> 
> or
> 
> Putting a regexp together to replace the duplicate occurrences as in:
> 
> http://www.regular-expressions.info/duplicatelines.html
> 
> 
> I'm pretty sure I'm missing some really basic stuff here... is this
> basic operation something supposed to be done outside galaxy perhaps ?
> 
> Thanks in advance !
> 




[galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-06 Thread Roman Valls
Hey galaxy users,

That's a fairly good question from one of my colleagues. I've looked
through the menus (mainly "Text Manipulation" and "Filter and
Sort"(Select)), googled (on the mailing list archives too), but couldn't
find an answer: How should I remove duplicates from plain text files
without resorting to:

"cat file|sort|uniq" before uploading the file/text.

or

Putting a regexp together to replace the duplicate occurrences as in:

http://www.regular-expressions.info/duplicatelines.html
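
To make the regexp option concrete, here is a minimal Python sketch using a
backreference to collapse runs of identical adjacent lines. The pattern is one
common form of this idea and is an assumption of this example, not a quotation
from the linked page. The name collapse_adjacent_duplicates is hypothetical.
It only removes duplicates that sit next to each other, so the input would
need to be sorted first to catch all of them.

```python
import re

# A line, followed by one or more exact copies of that same line.
ADJACENT_DUPLICATES = re.compile(r"^(.*)(\r?\n\1)+$", re.MULTILINE)


def collapse_adjacent_duplicates(text):
    """Replace each run of identical adjacent lines with a single copy."""
    return ADJACENT_DUPLICATES.sub(r"\1", text)


if __name__ == "__main__":
    sample = "apple\napple\nbanana\nbanana\nbanana\ncherry"
    print(collapse_adjacent_duplicates(sample))
```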


I'm pretty sure I'm missing some really basic stuff here... is this
basic operation something supposed to be done outside galaxy perhaps ?

Thanks in advance !
