I have a set of data that looks something like the following:

>human
acgtt---cgatacg---acgact-----t
>chimp
acgtacgatac---actgca---ac
>mouse
acgata---acgatcg----acgt

I am having trouble setting up a hash (or a similar structure) to count the number and lengths of contiguous gaps. For example, the 'human' sequence above has 2 runs of 3 gaps and 1 run of 5 gaps, 'chimp' has 2 runs of 3 gaps, and 'mouse' has 1 run of 3 gaps and 1 run of 4 gaps.

Specifically, I can't work out how to treat the gap length as a variable and use it in a pattern match, so that I can count how many gaps of each length occur in a given sequence. I know how to set up a hash to count the number of times a gap appears: '$gaptype{$gap}++' or something similar. The question is: what is the best way to set '$gap' dynamically, and how do I do it?
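A minimal sketch of the '$gaptype{$gap}++' idea, assuming the gap length does not have to be known in advance: the pattern '(-+)' matches each whole run of gaps, and 'length($1)' then supplies the hash key. The helper name 'count_gap_lengths' is made up for illustration.

```perl
use strict;
use warnings;

# Count runs of consecutive '-' characters, keyed by run length.
# count_gap_lengths is a hypothetical helper name, not from any module.
sub count_gap_lengths {
    my ($seq) = @_;
    my %gaptype;
    # /(-+)/g finds each maximal run of gaps in turn, whatever its length.
    while ($seq =~ /(-+)/g) {
        $gaptype{ length($1) }++;
    }
    return %gaptype;
}

my %gaptype = count_gap_lengths('acgtt---cgatacg---acgact-----t');
for my $len (sort { $a <=> $b } keys %gaptype) {
    print "number of $len base pair gaps: $gaptype{$len}\n";
}
```

For the 'human' sequence this should report 2 gaps of length 3 and 1 gap of length 5.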

I need to know the length of each consecutive run of gaps. I know how to count the total number of gap characters with the 'tr' operator, but it gets confusing when I need to keep a separate count for each gap length. I also need to know the position of each gap (given by the position of the first gap character in that run). I know that I can use the 'pos()' function for this.
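A rough sketch of how 'tr' and 'pos()' might fit together, assuming positions are counted from 1. After a successful '/g' match in scalar context, 'pos()' holds the 0-based offset just past the match, so subtracting the run length and adding 1 gives the 1-based start of the run. The helper name 'gap_positions' is hypothetical.

```perl
use strict;
use warnings;

# gap_positions is a hypothetical helper name.
# Returns a hash of arrays: gap length => list of 1-based start positions.
sub gap_positions {
    my ($seq) = @_;
    my %positions;
    while ($seq =~ /(-+)/g) {
        my $len = length($1);
        # pos($seq) is the 0-based offset just past this run of gaps,
        # so pos - length is the 0-based start; add 1 for a 1-based position.
        my $start = pos($seq) - $len + 1;
        push @{ $positions{$len} }, $start;
    }
    return %positions;
}

my $seq   = 'acgtt---cgatacg---acgact-----t';
my $total = ($seq =~ tr/-//);    # tr returns how many '-' characters it saw
my %positions = gap_positions($seq);

print "total gap characters: $total\n";
for my $len (sort { $a <=> $b } keys %positions) {
    print "$len: @{ $positions{$len} }\n";
}
```

Storing an array of positions per length also gives the count for free, since the number of elements in each array is the number of gaps of that length.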

So I think I know some of the bits of code I need; the problem is that I am getting lost on how to put it all together. For now I am just trying to get my output to look like this:

Human
number of 3 base pair gaps:             2
                        at positions:           6, 16
number of 5 base pair gaps:             1
                        at positions:           25

Chimp
.... and so on ...
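One possible way to structure the whole thing, offered as a sketch rather than a finished program: read the '>name' header lines, collect each sequence, and build the report from a hash of arrays. It assumes 1-based positions, tab-separated output columns, and that sequence lines belonging to one header are simply joined together; 'gap_report', '%seq_for' and '@order' are made-up names. It uses '$-[0]' (the 0-based start of the last match) in place of the 'pos()' arithmetic.

```perl
use strict;
use warnings;

# Build the report text for one sequence. gap_report is a hypothetical name.
sub gap_report {
    my ($name, $seq) = @_;
    my %positions;    # gap length => [1-based start positions]
    while ($seq =~ /(-+)/g) {
        # $-[0] is the 0-based offset where the last match started.
        push @{ $positions{ length($1) } }, $-[0] + 1;
    }
    my $out = ucfirst($name) . "\n";
    for my $len (sort { $a <=> $b } keys %positions) {
        my @at = @{ $positions{$len} };
        $out .= sprintf "number of %d base pair gaps:\t%d\n", $len, scalar @at;
        $out .= sprintf "\tat positions:\t%s\n", join(', ', @at);
    }
    return $out;
}

# Sample input held in a list here; in practice these lines would come from a file.
my @lines = (
    '>human', 'acgtt---cgatacg---acgact-----t',
    '>chimp', 'acgtacgatac---actgca---ac',
    '>mouse', 'acgata---acgatcg----acgt',
);

my (@order, %seq_for, $name);
for my $line (@lines) {
    if ($line =~ /^>(\S+)/) {        # header line starts a new record
        $name = $1;
        push @order, $name;
        $seq_for{$name} = '';
    }
    elsif (defined $name) {          # sequence line: append to current record
        $seq_for{$name} .= $line;
    }
}

print gap_report($_, $seq_for{$_}), "\n" for @order;
```

Keeping '@order' alongside the hash preserves the order in which the sequences appear in the input, which a plain hash would not.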

Any suggestions would be greatly appreciated, whether for all of this or just parts of it. This should help me get started on some more advanced parsing I need to do afterwards. I like to figure things out on my own when I can, so even pseudocode would be a great help!

-Thanks
-Mike


