Re: [galaxy-user] How to filter the sequences containing not[ATCG] character?

2013-12-10 Thread 朱师云
Hello, Yeah, it's interesting! I have tried it, and something like [^ATCGatcg] is useful. I have a large file to deal with, so I will look for an efficient regular expression. Thank you. Date: Mon, 9 Dec 2013 07:24:46 -0800 From: j...@bx.psu.edu To: zhus...@msn.cn CC:
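For a large file, one option outside Galaxy is to stream the FASTA records and keep only those whose sequence has no character other than A, T, C or G. The following is a minimal Python sketch, not the method used in this thread: the helper names `read_fasta` and `keep_clean` are hypothetical, the input is assumed to be standard FASTA with `>` header lines, and the `n` in the second sample record is made up for illustration.

```python
import re
from typing import Iterable, Iterator, Tuple

# One negated character class: matches any single character
# that is not A, T, C or G (upper or lower case).
NON_ATCG = re.compile(r"[^ATCGatcg]")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs one record at a time.

    Assumes standard FASTA: a header line starting with '>' followed
    by one or more sequence lines. Blank lines are skipped.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_clean(records: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Keep only records whose sequence contains nothing but A/T/C/G."""
    for header, seq in records:
        if NON_ATCG.search(seq) is None:
            yield header, seq


# Hypothetical sample data; the 'n' in the second record is illustrative.
sample = [">cuff102.1", "atcgtaaagggcgat", ">cuff103.1", "gtcgnttgactgtc"]
for header, seq in keep_clean(read_fasta(sample)):
    print(header)
    print(seq)
```

Because `read_fasta` is a generator, the same loop can be fed an open file handle instead of the sample list, so only one record is held in memory at a time.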

Re: [galaxy-user] How to filter the sequences containing not[ATCG] character?

2013-12-09 Thread Jennifer Jackson
Hello, If the data were in .fastqsanger format, you could use the tool Manipulate FASTQ, but with .fasta, this is a good way. But watch your regular expression: test it out on a smaller set to make sure it is doing what you want. I see a start of the line character in the middle of your

Re: [galaxy-user] How to filter the sequences containing not[ATCG] character?

2013-12-09 Thread 朱师云
Hi, It does indeed help. Your regular expression looks more concise and more useful. BTW, a start-of-line character (^) placed first inside [], for example [^ATCGatcg], means a character that is not in [ATCGatcg], which may not work in the SELECT tool. Thank you for your help! Date: Mon, 9 Dec 2013 06:34:28
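The two meanings of the caret can be checked quickly on a small test set. This sketch uses Python's `re` module, so it only illustrates common regex behaviour; whether Galaxy's Select tool treats the pattern the same way is an assumption to verify on a small dataset. The function name `has_non_atcg` is hypothetical.

```python
import re

# Caret first inside brackets: negation.
# Matches any one character that is NOT A, T, C or G.
negated = re.compile(r"[^ATCGatcg]")

# Caret outside brackets: start-of-line anchor.
# Matches an A/T/C/G only at the very start of the string.
anchored = re.compile(r"^[ATCGatcg]")

# Caret inside brackets but not first: a literal '^' character.
literal = re.compile(r"[A^T]")


def has_non_atcg(seq: str) -> bool:
    """Return True if seq contains any character other than A/T/C/G."""
    return negated.search(seq) is not None


print(has_non_atcg("atcgtaaagggcgat"))
print(has_non_atcg("atcgNtaaa"))
```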

Re: [galaxy-user] How to filter the sequences containing not[ATCG] character?

2013-12-09 Thread Jennifer Jackson
Hello, You are right! I forgot about that. Aren't regular expressions fun? And please test it out if you prefer your method or are just curious; I didn't try it that way. There are usually a few ways to do the same thing when using a regex. But I am glad that this helped a bit, and good luck

[galaxy-user] How to filter the sequences containing not[ATCG] character?

2013-12-08 Thread 朱师云
Hi Jen, As the title says, I have a [fasta] file that was obtained from a [gtf] file: cuff102.1 atcgtaaagggcgat cuff103.1 gtcgttgactgtc and I want to get output like this, filtering out the sequences that contain any non-[ATCG] character: cuff102.1 atcgtaaagggcgat I have a large number of sequences to filter. I