RE: how to push a double dimensional array

Charles K. Clarkson Tue, 24 Feb 2004 12:57:22 -0800

Öznur Tastan <[EMAIL PROTECTED]> wrote:
: 
: ----- Original Message -----
: From: "Charles K. Clarkson" <[EMAIL PROTECTED]>
: To: "'Öznur Tastan'" <[EMAIL PROTECTED]>; "'Perl Lists'"
: <[EMAIL PROTECTED]>
: Sent: Tuesday, February 24, 2004 8:59 PM
: Subject: RE: how to push a double dimensional array
: 
: 
: > Öznur Tastan <[EMAIL PROTECTED]> wrote:
: > :
: > :
: > : Ok: I am starting from the beginning.
: >
: >     I thought I had accidentally discouraged you.
: >
: >
: > : The alignment is alignment of two strings so that similar
: > : characters according to a scoring system (this is out of scope
: > : I think) should be in the same register and there can be gaps
: > : in one of the strings denotes as a dash.
: >
: >     What do you mean by "should be in the same register"? What
: > is a register?
:
: Like in the alignment below A is in the same 'register'( bad 
: English himm :) with - but seq1 and seq2 have these information
: already so when  I know seq1 and seq2 I don't need any further
: information to visualize the alignment.
: Three feature is enough to define my alignment
: 
: AADALLL
: - -  EVLLL
: the alignment is there ( this is pairwise alignment of protein
: sequences that can be calculated by dynamic programming of two
: strings)


    You are aware that this is the first time you have actually come
out and said this was biology related, right? Have you checked out
http://bio.perl.org/? I don't know enough about your field to know
if your current application is already solved there.

    Okay, so we have established that each alignment has three
characteristics and we are trying to see how best to arrange
the alignments in a data structure that will ease your manipulation
of those alignments.


: > : So my alignments have three features (seq1 seq2 and the score
: > : of the alignments)
: > : seq1: AADALLL
: > : seq2: - -  EVLLL
: > : score:12
: > :
: > : There are groups of alignments that I want to keep separated.
: > : Say I have 15 alignments (the numbers and distributions can
: > : be changed with different input strings)
: > :          3 of the alignments belong to first group
: > :          4 of them second belong to 2nd group.
: > :          5 of them belong to 3rd
: > :          2 of them belong to 4th
: > :          1 of them belong to 5th)
: >
: >    What is a group of alignments grouped by? Is it by score
: > or some other quality?
: 
: No not score some other quality related to origins of the 
: sequences ( again related to biology ).
: So the alignments of certain sequence pairs should belong to one
: group and the others another. So we for the scope of problem
: believe me it is not important.

    So, in this example we have 15 alignments and they can be
grouped by their origin.

    Are the group names alphanumeric or numbered? If numbered, are
they in sequence? Assuming they are not a numbered sequence, I
would think the top level of your structure would be a hash.


(
    group_name_1 => [ '3 alignments' ],
    group_name_2 => [ '4 alignments' ],
    group_name_3 => [ '5 alignments' ],
    group_name_4 => [ '2 alignments' ],
    group_name_5 => [ '1 alignment' ],
)


    What goes inside each hash value depends on what you need
to do with those values. I am still not clear what that is, but
you have two basic choices:

    ( 'AADALLL', '- -  EVLLL', 12 )

    Or:

    (
        seq1 => 'AADALLL',
        seq2 => '- -  EVLLL',
        score => 12
    )
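
    Inside a larger structure each alignment has to be stored as a
reference, so the two choices become an anonymous array or an
anonymous hash. Here is a minimal sketch of both, using the sample
values above (the variable names are mine, not part of your code):

```perl
use strict;
use warnings;

# One alignment as an anonymous array: fields are found by position.
my $as_array = [ 'AADALLL', '- -  EVLLL', 12 ];

# The same alignment as an anonymous hash: fields are found by name.
my $as_hash = { seq1 => 'AADALLL', seq2 => '- -  EVLLL', score => 12 };

print "score (array): $as_array->[2]\n";
print "score (hash):  $as_hash->{score}\n";
```

    The array form is more compact. The hash form is easier to read
later because you do not have to remember which index holds the score.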

    A hash of arrays of arrays might look like this:
(
    group_name_1 => [
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
                    ],
    group_name_2 => [
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
                    ],
    group_name_3 => [
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
                    ],
    group_name_4 => [
        [ 'AADALLL', '- -  EVLLL', 12 ],
        [ 'AADALLL', '- -  EVLLL', 12 ],
                    ],
    group_name_5 => [
        [ 'AADALLL', '- -  EVLLL', 12 ],
                    ],
)

    Of course, the alignments would all be different and the
group names would probably be more topical, but I don't know
enough biology to make up other values.

    Assuming this data structure is named %groups, we access
each 'seq1' of a group named $group with:

my $group = 'group_name_1';
foreach my $sequence ( @{ $groups{ $group } } ) {
    print "$sequence->[0]\n";
    # do something with each sequence
}
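
    Since your subject line asks about push, here is a rough sketch
of how such a structure could be filled one alignment at a time. The
sub add_alignment is a hypothetical helper of my own, and I am
assuming you already know each alignment's group name when you
compute it:

```perl
use strict;
use warnings;

my %groups;

# Hypothetical helper: append one alignment to a named group.
sub add_alignment {
    my ( $groups, $group, $seq1, $seq2, $score ) = @_;

    # push creates the inner array the first time a group is seen.
    push @{ $groups->{ $group } }, [ $seq1, $seq2, $score ];
    return;
}

add_alignment( \%groups, 'group_name_1', 'AADALLL', '- -  EVLLL', 12 );
add_alignment( \%groups, 'group_name_1', 'AADALLL', '- -  EVLLL', 12 );
add_alignment( \%groups, 'group_name_2', 'AADALLL', '- -  EVLLL', 12 );

foreach my $group ( sort keys %groups ) {
    my $count = @{ $groups{ $group } };
    print "$group: $count alignment(s)\n";
}
```

    You never have to declare the groups ahead of time. Perl creates
the array reference for a new group on the first push.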


    A hash of array of hashes might look like this:

(
    group_name_1 => [
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
                    ],
    group_name_2 => [
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
                    ],
    group_name_3 => [
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
                    ],
    group_name_4 => [
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
                    ],
    group_name_5 => [
        {
            seq1 => 'AADALLL',
            seq2 => '- -  EVLLL',
            score => 12
        },
                    ],
)


    Assuming this data structure is named %groups, we access
each 'seq1' of a group named $group with:

my $group = 'group_name_1';
foreach my $sequence ( @{ $groups{ $group } } ) {
    print "$sequence->{seq1}\n";
    # do something with each sequence
}
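
    One advantage of the hash form shows up when you want to order
the alignments in a group. This is only a sketch: the scores 7 and 20
are invented for illustration, and I am guessing that sorting by
score is something you might want:

```perl
use strict;
use warnings;

my %groups;

# Scores other than 12 are made up so the sort has something to do.
foreach my $score ( 12, 7, 20 ) {
    push @{ $groups{group_name_1} },
        { seq1 => 'AADALLL', seq2 => '- -  EVLLL', score => $score };
}

# Highest score first.
my @by_score =
    sort { $b->{score} <=> $a->{score} } @{ $groups{group_name_1} };

print join( ' ', map { $_->{score} } @by_score ), "\n";
```

    The same sort works on the array form if you replace {score}
with [2], but the named key is harder to get wrong.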


    Unfortunately, I have no idea if I have helped or not. These
steps are the same ones many of us take when trying to find a
desirable data structure. It is more difficult for me here
because you are supplying very limited examples and because I do
not know your field of study well enough to add needed info.

    Don't let the size of the printed data structures fool you.
Both are efficient structures, and it is unlikely you will ever
print them out except for debugging.
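
    For that debugging, the core Data::Dumper module will print
either structure for you. A small sketch, using a one-alignment group
as sample data:

```perl
use strict;
use warnings;
use Data::Dumper;

my %groups = (
    group_name_5 => [
        { seq1 => 'AADALLL', seq2 => '- -  EVLLL', score => 12 },
    ],
);

$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps comparable
$Data::Dumper::Indent   = 1;    # less deeply indented output

my $dump = Dumper( \%groups );
print $dump;
```

    Pass a reference ( \%groups ) rather than the hash itself, or
Dumper will show the keys and values as a flat list of separate items.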



HTH,

Charles K. Clarkson
-- 
Head Bottle Washer,
Clarkson Energy Homes, Inc.
Mobile Home Specialists
254 968-8328


