The best reference in this area is the classic book:

Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison
By: David Sankoff; Joseph B. Kruskal
Publisher: Reading, Mass.: Addison-Wesley Pub. Co., 1983
ISBN: 0201078090

--stephen hirtle

At 01:08 PM 3/9/2005, you wrote:
>I am getting some great input from people. Thank you.
>
>The method I was trying to think of is 'unfolding' which several people
>identified, and Doug Carroll has directed me to multidimensional
>unfolding.
>
>
>Here is the problem that I may be faced with (given funding becomes
>available):
>
>1. Viruses contain sets of genes which for argument we will call A, B, C,
>D, etc.
>
>
>2. We want to infer a relationship among virus 'species', though species
>here is according to the virologists (or at least some of them) generally
>not thought to apply in the strict evolutionary sense, so I use this term
>loosely
>
>
>3. The order of the genes within each virus species is different as a
>result of significant genome rearrangements over time
>
>         e.g.,   ABCD  ABDC .... CDBA ... etc
>
>
>4. If we view the arrangement of genes among many viruses can we identify
>families of viruses where the rearrangement is less within the family than
>between families (sounds like clustering)
>
>
>5. One approach is to define a metric on gene arrangement patterns (e.g.,
>the minimum number of rearrangements needed to have identical gene
>arrangement between two virus species) and proceed using standard
>clustering
>
>
>6. I also thought unfolding is something I need to consider, so thank you
>for your input.
>
>
>Anyone ever approach a problem like this?
>
>
>Bill
>---
>
>   Joint Meeting of the Interface and
>   Classification Society of North America
>
>   http://ilya.wustl.edu/if_csna_2005_meeting/
>   Abstracts and Registration Deadline is 4/9/05
>
>
>William D. Shannon, Ph.D.
>
>Associate Professor of Biostatistics in Medicine
>Division of General Medical Sciences and Biostatistics
>
>Washington University School of Medicine
>Campus Box 8005, 660 S. Euclid
>St. Louis, MO 63110
>
>Phone: 314-454-8356
>Fax: 314-454-5113
>e-mail: [EMAIL PROTECTED]
>web page: http://ilya.wustl.edu/~shannon
>
>
>On Wed, 9 Mar 2005, J. Douglas Carroll wrote:
>
>> Maybe this is the usage of this term in some statistical fields, but the
>> mathematical psychologist Clyde Coombs's (as far as I know original) use of
>> the word "unfolding" implied finding a real valued continuum (a single
>> dimension defined on at least an interval, not merely an ordinal, scale)
>> such that all the input orders can be generated via a model in which each
>> order is inversely related to the order of distances from an "ideal point"
>> on that continuum. Unfolding was later generalized by some students of
>> Coombs's to the multidimensional case, in which the single dimension was
>> generalized to a multidimensional space, and the input orders were assumed
>> to be inversely monotonically related to (Euclidean) distances from a set
>> of ideal points in this multidimensional space. This generalization is
>> referred to as "MULTIDIMENSIONAL unfolding".
>>
>> While it's certainly true that, in Coombs's original unidimensional version
>> of unfolding analysis, an order can be associated with the unidimensional
>> continuum resulting from unfolding analysis, its purpose was NOT to
>> determine an ordering, but to determine an underlying
>> continuum. Furthermore, the order defined by the resulting continuum will
>> generally NOT be in any realistic sense a "consensus order"; in extreme
>> cases its average rank order correlation (calculated by any reasonable rank
>> order correlation coefficient) with the input orders could be zero, in fact.
>>
>> Doug Carroll
>>
>>
>> At 10:27 AM 3/9/2005 -0600, shannon wrote:
>> >Yes -- unfolding is the word. Thanks
>> >
>> >
>> >Bill
>> >
>> >
>> >On Wed, 9 Mar 2005, Paul R Swank wrote:
>> >
>> > > Do you mean unfolding?
>> > >
>> > > Paul R. Swank, Ph.D.
>> > > Professor, Developmental Pediatrics
>> > > Medical School
>> > > UT Health Science Center at Houston
>> > >
>> > >
>> > > -----Original Message-----
>> > > From: Classification, clustering, and phylogeny estimation
>> > > [mailto:[EMAIL PROTECTED] On Behalf Of shannon
>> > > Sent: Wednesday, March 09, 2005 9:19 AM
>> > > To: [email protected]
>> > > Subject: statistical method
>> > >
>> > >
>> > > What is the name of the statistical method which generates an order from a
>> > > set of orders:
>> > >
>> > > Vote preferences: A > B > C > D
>> > >                   B > A > C > D
>> > >                   A > B > C > D
>> > >                   A > C > D > B
>> > >                   etc
>> > >
>> > > It is something like peeling?
>> > >
>> > >
>> > > Bill
>> > >
>>
>>
>>
>> ######################################################################
>> # J. Douglas Carroll, Board of Governors Professor of Management and #
>> #Psychology, Rutgers University, Graduate School of Management,      #
>> #Marketing Dept., MEC125, 111 Washington Street, Newark, New Jersey  #
>> #07102-3027. Tel.: (973) 353-5814, Fax: (973) 353-5376.              #
>> # Home: 14 Forest Drive, Warren, New Jersey 07059-5802.              #
>> # Home Phone: (908) 753-6441 or 753-1620, Home Fax: (908) 757-1086.  #
>> # E-mail: [EMAIL PROTECTED]                                          #
>> ######################################################################
>>
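
For anyone who wants to try the approach in point 5 of Bill's message (define a metric on gene arrangements, then cluster), here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a method anyone in this thread proposed in detail. As a stand-in for "minimum number of rearrangements" it counts adjacent swaps (the Kendall tau distance); a reversal-based or other genome rearrangement distance could replace it. The clustering step simply links any two species whose distance is at or below a threshold, which matches cutting a single-linkage tree at that height. The virus names, gene orders, and threshold are made up.

```python
from itertools import combinations


def kendall_tau_distance(order_a, order_b):
    """Minimum number of adjacent swaps needed to turn order_a into order_b.

    Assumption: adjacent swaps stand in for "rearrangements" here.
    """
    if sorted(order_a) != sorted(order_b):
        raise ValueError("both orders must contain the same genes")
    position = {gene: i for i, gene in enumerate(order_b)}
    ranks = [position[gene] for gene in order_a]
    # Each inversion is a pair of genes whose relative order differs.
    return sum(
        1 for i, j in combinations(range(len(ranks)), 2) if ranks[i] > ranks[j]
    )


def cluster_by_threshold(orders, threshold):
    """Group species whose gene orders are linked by distances <= threshold.

    Links are transitive, so this is single-linkage clustering cut at
    the given threshold.
    """
    parent = {name: name for name in orders}

    def find(name):
        # Follow parent links up to the representative of the group.
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(list(orders), 2):
        if kendall_tau_distance(orders[a], orders[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in orders:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical gene orders, loosely following the ABCD / ABDC / CDBA example.
example = {"v1": "ABCD", "v2": "ABDC", "v3": "DCBA", "v4": "CDBA"}
print(cluster_by_threshold(example, threshold=1))
# prints [['v1', 'v2'], ['v3', 'v4']]
```

With real data the threshold would have to be chosen from the distribution of pairwise distances, and a standard hierarchical clustering routine run on the full distance matrix would probably be more informative than a single cut.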
