Re: [R] hclust/dendrogram merging

2013-09-16 Thread Thomas Parr
Josh,
A couple of things:
1) It would be helpful if you can provide some reproducible data and the
code you have developed thus far.
2) This is more of a stackexchange.com or crossvalidated.com question.

That said...without seeing the data...
Dendrograms/hclust are generated by using a distance matrix. In ecology
distance would be measured by an index summarizing site similarity based on
species composition. Selecting the appropriate index is crucial.  Legendre
and Legendre have a chapter on this in Numerical Ecology.  You don't have
species or sites, but you do have antibodies, antigens and blocked
antibodies. You could treat antibodies as sites and how they block other
antibodies as species.  So for your data, you might have two antibodies,
A1 and A2, that both block antibodies x, y, and z, but A1 blocks them with
efficiencies of 0.2, 0.5, and 0.8 while A2 blocks them with 0.8, 0.5, and
0.2.  Despite blocking the same antibodies, you would likely conclude that
they behave differently because of the difference in x and z blocking.

If this sounds like what you are looking for, I would start with
help(hclust) in R and scroll down to the examples.
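
As a rough illustration of the A1/A2 point, here is a minimal sketch.  The
efficiencies are the made-up numbers from the paragraph above, A3 is a
further hypothetical antibody added only so there is something to cluster,
and Euclidean distance with average linkage is an arbitrary choice, not a
recommendation for real assay data.

```r
# Blocking efficiencies against antibodies x, y and z (illustrative values).
# A3 is a hypothetical antibody that behaves almost like A1.
profiles <- rbind(A1 = c(x = 0.20, y = 0.5, z = 0.80),
                  A2 = c(x = 0.80, y = 0.5, z = 0.20),
                  A3 = c(x = 0.25, y = 0.5, z = 0.75))

# Presence/absence view: every antibody blocks x, y and z, so a binary
# index sees no difference between them.
d_bin <- as.matrix(dist((profiles > 0) * 1, method = "binary"))

# Quantitative view: Euclidean distance on the efficiencies separates
# A1 from A2.
d <- dist(profiles, method = "euclidean")

# Cluster and cut into two bins.
hc <- hclust(d, method = "average")
bins <- cutree(hc, k = 2)
print(bins)
```

The choice of index changes the answer: the binary distance between A1 and
A2 is zero, while the quantitative distance puts them in different bins.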


Message: 23
 Date: Sun, 15 Sep 2013 15:33:28 -0600
 From: Joshua Eckman josheck...@hotmail.com
 To: r-help@r-project.org r-help@r-project.org
 Subject: [R] hclust/dendrogram merging
 Message-ID: bay167-w1059032665f8387dd05cb26de...@phx.gbl
 Content-Type: text/plain

 I am working with protein blocking assays and the end result is a 2D
 matrix describing which antibodies block the binding of other antibodies to
 the target antigen.  I need to group the antibodies together into bins
 based on their combined profiles in both the row and column direction.  I am
 able to group the blocking profiles of rows vs rows, or columns vs columns,
 using clustering.  The end results could look something like this:

 col_bins  bin
 Ab1       1
 Ab2       2
 Ab3       2
 Ab4       2
 Ab5       3
 Ab6       4
 Ab7       5
 Ab8       5
 Ab9       6

 In this case the bin values are just to describe they have similar
 blocking profiles - so Ab2, Ab3, Ab4 have the same blocking profile, as do
 Ab7 and Ab8.
 Looking at the row profiles

 row_bins  bin
 Ab1       1
 Ab2       2
 Ab3       3
 Ab4       3
 Ab5       4
 Ab6       5
 Ab7       5
 Ab8       6
 Ab10      7

 The important end result, where I am stuck, is how to combine this with
 the row direction and only report those that are represented in both
 directions AND group together in both directions.  It is possible that some
 Abs will not be represented in both directions.  The bin values of
 row_bins and col_bins are also not important, just the relationship between
 Abs by name that belong in the same bin, in both directions.
 In other words, a combined bins report would look something like this:

       bin
 Ab1   A
 Ab3   B
 Ab4   B
 Ab5   C

 I made this visually because it is clear that these are the only groupings
 that are maintained in both directions.  But real data sets are much
 bigger, so I need some form of automation.
 Any ideas on how to do this with matrix, dendrogram or clustering functions?
 Thank you,
 josh


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hclust/dendrogram merging

2013-09-16 Thread Peter Langfelder
Joshua,

I'm not sure I understand your aim correctly, but if I do, here's my
advice: If you are able to find the clusters according to rows or
columns using clustering, you must be using some kind of a distance
matrix that encodes whether two antibodies should be in one bin for
rows, and a similar matrix for the columns. To get a clustering that
represents only bins that occur in both directions, you can
appropriately combine the two matrices into a single matrix. For
example, if each distance matrix is 0 where two antibodies go together
and 1 otherwise, you can add the two matrices, cluster the antibodies
on the sum using hclust (with complete linkage, if I understand it
correctly), and then use cutree() with a cut height of, say, 0.5.
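
A minimal sketch of that idea, using the bin assignments from your example.
The helper bin_dist() and the variable names are made up for illustration,
and I am assuming the bins are stored as named vectors.  Antibodies missing
from either direction are dropped first.

```r
# Bin assignments per direction, taken from the example in the question.
col_bins <- c(Ab1 = 1, Ab2 = 2, Ab3 = 2, Ab4 = 2, Ab5 = 3,
              Ab6 = 4, Ab7 = 5, Ab8 = 5, Ab9 = 6)
row_bins <- c(Ab1 = 1, Ab2 = 2, Ab3 = 3, Ab4 = 3, Ab5 = 4,
              Ab6 = 5, Ab7 = 5, Ab8 = 6, Ab10 = 7)

# Keep only antibodies represented in both directions.
common <- intersect(names(col_bins), names(row_bins))

# Hypothetical helper: 0 if two antibodies share a bin, 1 otherwise.
bin_dist <- function(bins) {
  d <- outer(bins, bins, FUN = "!=") * 1
  dimnames(d) <- list(names(bins), names(bins))
  d
}

# Sum of the two 0/1 matrices: 0 only where a pair is binned together
# in both directions.
combined <- bin_dist(col_bins[common]) + bin_dist(row_bins[common])

# Complete linkage, then cut below 1 so that only zero-distance pairs
# stay together.
hc <- hclust(as.dist(combined), method = "complete")
both_bins <- cutree(hc, h = 0.5)

# List the antibody names in each combined bin.
print(split(names(both_bins), both_bins))
```

With these inputs only Ab3 and Ab4 end up sharing a bin; every other
common antibody becomes a singleton.  Whether singletons such as Ab2 or
Ab6 should be reported or filtered out afterwards depends on what you
want the final report to contain.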

HTH,

Peter



[R] hclust/dendrogram merging

2013-09-15 Thread Joshua Eckman
I am working with protein blocking assays and the end result is a 2D matrix
describing which antibodies block the binding of other antibodies to the target
antigen.  I need to group the antibodies together into bins based on their
combined profiles in both the row and column direction.  I am able to group the
blocking profiles of rows vs rows, or columns vs columns, using clustering.
The end results could look something like this:

col_bins  bin
Ab1       1
Ab2       2
Ab3       2
Ab4       2
Ab5       3
Ab6       4
Ab7       5
Ab8       5
Ab9       6

In this case the bin values are just to describe they have similar blocking 
profiles - so Ab2, Ab3, Ab4 have the same blocking profile, as do Ab7 and Ab8.
Looking at the row profiles

row_bins  bin
Ab1       1
Ab2       2
Ab3       3
Ab4       3
Ab5       4
Ab6       5
Ab7       5
Ab8       6
Ab10      7

The important end result, where I am stuck, is how to combine this with the row 
direction and only report those that are represented in both directions AND 
group together in both directions.  It is possible that some Abs will not be 
represented in both directions.  The bin values of row_bins and col_bins are 
also not important, just the relationship between Abs by name that belong in 
the same bin, in both directions.
In other words, a combined bins report would look something like this:

      bin
Ab1   A
Ab3   B
Ab4   B
Ab5   C

I made this visually because it is clear that these are the only groupings that 
are maintained in both directions.  But real data sets are much bigger, so I 
need some form of automation.
Any ideas on how to do this with matrix, dendrogram or clustering functions?
Thank you,
josh

  