That's quite nice.

Looking at the genetic coding examples on the website Jan-Pieter
mentioned, and bearing in mind that "cut" is a provided (standard
library) verb, this might be the sort of thing he would like:
NB. script version of defining some taxa:

sample =: 0 : 0 NB. truncated lines from web example
>Taxon1
CCTGCGGAAGATCGGCAC
TCCCACTAATAA
>Taxon2
CCATCGGTAGC
ATATCCATTTGTCAGCAGAC
>Taxon3
CCACCCTCGT
TGGGAACCT
)

   >({.,<@;@}.)@(LF&cut)each '>' cut sample
+------+-------------------------------+
|Taxon1|CCTGCGGAAGATCGGCACTCCCACTAATAA |
+------+-------------------------------+
|Taxon2|CCATCGGTAGCATATCCATTTGTCAGCAGAC|
+------+-------------------------------+
|Taxon3|CCACCCTCGTTGGGAACCT            |
+------+-------------------------------+

NB. these boxes might not display nicely - sorry -
they look ok to me!
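For readers who find the one-liner dense, here is a minimal sketch of what it does to a single record, taken one step at a time. The noun rec and the names lines, hdr and body are assumptions made up for illustration; rec stands for one piece as produced by '>' cut, so the leading '>' is already gone.

```j
NB. rec is an assumed, shortened record (not taken from the web example)
rec =: 'Taxon1', LF, 'CCTG', LF, 'TCCC', LF

lines =: LF cut rec      NB. three boxes: 'Taxon1', 'CCTG', 'TCCC'
hdr   =: {. lines        NB. first box: the taxon name
body  =: < ; }. lines    NB. raze the remaining boxes into one boxed string
hdr , body               NB. same result as ({. , <@;@}.) lines
```

Since "cut" discards empty pieces, the trailing LF does not produce an extra empty box.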


If the file is very large, it may be necessary to read and process it in buffered chunks.

Any use?

Mike

On 25/11/2014 10:47, Jan-Pieter Jacobs wrote:
I think I would solve it like this (with intermediate steps):

x=: 'case_id', LF, 'GCTAGTCG', LF, 'ACGTC', LF

NB. Find the locations of LF's in the string
(LF = ]) x

NB. assign 0 to everything before the first LF, and 1 to everything from it onward
([: +./\  LF = ]) x

NB. The wonderful "key" adverb lets you apply a verb to groups selected
NB. by a key (another array), e.g. box
(</.~ [: +./\ LF =]) x

NB. Get rid of the LF's by replacing < by a version removing LF:
((<@#~ LF~:])/.~ [: +./\ LF = ]) x

This is what I came up with, but smarter people will probably suggest more
elegant ways.
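As a small sketch, the final expression can be given a name and checked against the expected two boxes. The name splitFirstLF is hypothetical and does not appear anywhere in the thread.

```j
x =: 'case_id', LF, 'GCTAGTCG', LF, 'ACGTC', LF

NB. splitFirstLF is a hypothetical name for the final expression above
splitFirstLF =: (<@#~ LF ~: ])/.~ [: +./\ LF = ]

splitFirstLF x    NB. two boxes: 'case_id' and 'GCTAGTCGACGTC'
```

Because the grouping relies on the key having two distinct values, an input with no LF at all, or one that starts with LF, gives a single box rather than two.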

I hope this is useful.
Jan-Pieter

2014-11-25 11:00 GMT+01:00 Ryan <rec...@bwh.harvard.edu>:

I have a character array with LF's, and want to split it on the
first LF, and remove the remaining LF's.  I'm wondering if there's a
simpler way than what I'm doing now:

x=: 'case_id', LF, 'GCTAGTCG', LF, 'ACGTC', LF

filterLF=: #~ ~:&LF
headLF=: {.~ i.&LF
tailLF=: }.~ i.&LF
(headLF ; filterLF@tailLF) x
┌───────┬─────────────┐
│case_id│GCTAGTCGACGTC│
└───────┴─────────────┘

(I'm reading text files in FASTA format:
http://rosalind.info/glossary/fasta-format/)

Thanks for any suggestions,
Ryan





----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
