Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-06 Thread ying chen
I see. Thanks a lot,

Sean


On Tue, Aug 6, 2013 at 8:05 PM, Henrik Bengtsson <
henrik.bengts...@aroma-project.org> wrote:

> Hi.
>
> On Tue, Aug 6, 2013 at 12:31 PM, ying chen  wrote:
> > Hi Henrik,
> >
> >
> > Sorry I have one more question to bug you. The urls in the df (by df <-
> > readDataFrame(db);) point to UCSCgenome browser with NCBI36/hg18
> assembly. I
> > looked at the http://www.aroma-project.org/chipTypes/GenomeWideSNP_6page
> > resource part, the SNP6 chip type files are listed as based on hg19
> (Human
> > Genome build v36.1). This is a typo, right? I thought it should be hg18.
> So
> > the Cbs out put is also based on hg18, right?
>
> The genome version ("hg18") in the URLs to the USCC Genome Browser are
> (still) hard coded.  Please see:
>
>
> https://groups.google.com/d/msg/aroma-affymetrix/r8CtCvdGWnE/jzQFMifxt2MJ
>
> for more informations/details.
>
> /Henrik
>
> >
> > Thanks a lot for the help!
> >
> > Sean
> >
> >
> >
> >
> >
> >
> >
> > On Fri, Aug 2, 2013 at 11:03 AM, Henrik Bengtsson
> >  wrote:
> >>
> >> On Thu, Aug 1, 2013 at 6:37 PM, ying chen  wrote:
> >> > Hi Henrik,
> >> >
> >> > Thanks a lot for the help.
> >> >
> >> > For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <-
> >> > CbsModel(dsT, ref=dfR, tags="HapMapRef"), right?
> >>
> >> Correct, I meant:
> >>
> >> sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef")
> >>
> >> /H
> >>
> >> >
> >> > Thanks,
> >> >
> >> > Sean
> >> >
> >> >
> >> >
> >> > On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson
> >> >  wrote:
> >> >>
> >> >> On Mon, Jul 29, 2013 at 7:45 PM, ying chen  wrote:
> >> >> > Hi Henrik,
> >> >> > Thanks a lot for the help!
> >> >> > Sorry I have more questions. I am following "How to: Calculate
> total
> >> >> > copy
> >> >> > number ratios from total (non-polymorphic) signals" and "Vignette:
> >> >> > Total
> >> >> > copy-number segmentation (non-paired CBS)", but I am not sure if I
> do
> >> >> > it
> >> >> > correctly.
> >> >> > I have two SNP6 datasets Tumor and HapMap270 and I want to use
> >> >> > HpaMap270
> >> >> > as
> >> >> > reference to go all the way to CBS step. So I do the following
> steps
> >> >> > respectively.
> >> >> >   > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full")
> >> >> >   > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full")
> >> >> > After that, I do
> >> >> >  > dataSet <- "HapMap270"
> >> >> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> >> >> >  > chipType <- "GenomeWideSNP_6"
> >> >> >  > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> >> >> > chipType=chipType)
> >> >> >  > dataSet <- "Tumor"
> >> >> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> >> >> >  > chipType <- "GenomeWideSNP_6"
> >> >> >  > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> >> >> > chipType=chipType)
> >> >> >  > dfR <- getAverageFile(dsN)   # ?
> >> >> >  > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ?
> >> >> > Would the above two steps work? My question is how to go ahead from
> >> >> > here
> >> >> > to
> >> >> > do CBS and will sm <- CbsModel(dsTR) work?
> >> >>
> >> >> Skip the exportTotalCnRatioSet() call, and instead use:
> >> >>
> >> >> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef");
> >> >>
> >> >> The 'tags' is just to add an informative tag to the output data set.
> >> >>
> >> >> /Henrik
> >> >>
> >> >> > Thanks again for your help!
> >> >> > Sean
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson
> >> >> >  wrote:
> >> >> >>
> >> >> >> Hi.
> >> >> >>
> >> >> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj 
> wrote:
> >> >> >> > Hi guys,
> >> >> >> >
> >> >> >> > I have a question regarding how to calculate raw copy numbers
> >> >> >> > using
> >> >> >> > common
> >> >> >> > reference instead of average of all samples of the study.
> >> >> >> > Basically I
> >> >> >> > want
> >> >> >> > to use average of HapMap270 samples as reference for all further
> >> >> >> > copy
> >> >> >> > number
> >> >> >> > calculations.
> >> >> >> >
> >> >> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette:
> >> >> >> > Estimation
> >> >> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD)
> to
> >> >> >> > Step
> >> >> >> > 5 -
> >> >> >> > Calculation of raw copy numbers, and generated ceR and saved it
> as
> >> >> >> > a
> >> >> >> > RData
> >> >> >> > file ceR.Rdata.
> >> >> >>
> >> >> >> It's important to understand that almost all objects in the Aroma
> >> >> >> framework are basically "pointers" to external files.  For
> instance,
> >> >> >> your 'ceR', which I assume you've got from something like ceR <-
> >> >> >> getAverageFile(ces), is referring to the file with pathname
> >> >> >> getPathname(ceR).  More below...
> >> >> >>
> >> >> >> >
> >> >> >> > My first question is, how to use this data for any future copy
> >> >> >> > number
> >> >> >> > analysis? My guess is that instead of calculating the ceR from
> the
> >> >> >> > sample
> >> >> >> > set I can

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-06 Thread Henrik Bengtsson
Hi.

On Tue, Aug 6, 2013 at 12:31 PM, ying chen  wrote:
> Hi Henrik,
>
>
> Sorry I have one more question to bug you. The urls in the df (by df <-
> readDataFrame(db);) point to UCSCgenome browser with NCBI36/hg18 assembly. I
> looked at the http://www.aroma-project.org/chipTypes/GenomeWideSNP_6 page
> resource part, the SNP6 chip type files are listed as based on hg19 (Human
> Genome build v36.1). This is a typo, right? I thought it should be hg18. So
> the Cbs out put is also based on hg18, right?

The genome version ("hg18") in the URLs to the USCC Genome Browser are
(still) hard coded.  Please see:

  https://groups.google.com/d/msg/aroma-affymetrix/r8CtCvdGWnE/jzQFMifxt2MJ

for more informations/details.

/Henrik

>
> Thanks a lot for the help!
>
> Sean
>
>
>
>
>
>
>
> On Fri, Aug 2, 2013 at 11:03 AM, Henrik Bengtsson
>  wrote:
>>
>> On Thu, Aug 1, 2013 at 6:37 PM, ying chen  wrote:
>> > Hi Henrik,
>> >
>> > Thanks a lot for the help.
>> >
>> > For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <-
>> > CbsModel(dsT, ref=dfR, tags="HapMapRef"), right?
>>
>> Correct, I meant:
>>
>> sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef")
>>
>> /H
>>
>> >
>> > Thanks,
>> >
>> > Sean
>> >
>> >
>> >
>> > On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson
>> >  wrote:
>> >>
>> >> On Mon, Jul 29, 2013 at 7:45 PM, ying chen  wrote:
>> >> > Hi Henrik,
>> >> > Thanks a lot for the help!
>> >> > Sorry I have more questions. I am following "How to: Calculate total
>> >> > copy
>> >> > number ratios from total (non-polymorphic) signals" and "Vignette:
>> >> > Total
>> >> > copy-number segmentation (non-paired CBS)", but I am not sure if I do
>> >> > it
>> >> > correctly.
>> >> > I have two SNP6 datasets Tumor and HapMap270 and I want to use
>> >> > HpaMap270
>> >> > as
>> >> > reference to go all the way to CBS step. So I do the following steps
>> >> > respectively.
>> >> >   > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full")
>> >> >   > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full")
>> >> > After that, I do
>> >> >  > dataSet <- "HapMap270"
>> >> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>> >> >  > chipType <- "GenomeWideSNP_6"
>> >> >  > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
>> >> > chipType=chipType)
>> >> >  > dataSet <- "Tumor"
>> >> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>> >> >  > chipType <- "GenomeWideSNP_6"
>> >> >  > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
>> >> > chipType=chipType)
>> >> >  > dfR <- getAverageFile(dsN)   # ?
>> >> >  > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ?
>> >> > Would the above two steps work? My question is how to go ahead from
>> >> > here
>> >> > to
>> >> > do CBS and will sm <- CbsModel(dsTR) work?
>> >>
>> >> Skip the exportTotalCnRatioSet() call, and instead use:
>> >>
>> >> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef");
>> >>
>> >> The 'tags' is just to add an informative tag to the output data set.
>> >>
>> >> /Henrik
>> >>
>> >> > Thanks again for your help!
>> >> > Sean
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson
>> >> >  wrote:
>> >> >>
>> >> >> Hi.
>> >> >>
>> >> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj  wrote:
>> >> >> > Hi guys,
>> >> >> >
>> >> >> > I have a question regarding how to calculate raw copy numbers
>> >> >> > using
>> >> >> > common
>> >> >> > reference instead of average of all samples of the study.
>> >> >> > Basically I
>> >> >> > want
>> >> >> > to use average of HapMap270 samples as reference for all further
>> >> >> > copy
>> >> >> > number
>> >> >> > calculations.
>> >> >> >
>> >> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette:
>> >> >> > Estimation
>> >> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to
>> >> >> > Step
>> >> >> > 5 -
>> >> >> > Calculation of raw copy numbers, and generated ceR and saved it as
>> >> >> > a
>> >> >> > RData
>> >> >> > file ceR.Rdata.
>> >> >>
>> >> >> It's important to understand that almost all objects in the Aroma
>> >> >> framework are basically "pointers" to external files.  For instance,
>> >> >> your 'ceR', which I assume you've got from something like ceR <-
>> >> >> getAverageFile(ces), is referring to the file with pathname
>> >> >> getPathname(ceR).  More below...
>> >> >>
>> >> >> >
>> >> >> > My first question is, how to use this data for any future copy
>> >> >> > number
>> >> >> > analysis? My guess is that instead of calculating the ceR from the
>> >> >> > sample
>> >> >> > set I can just load the ceR.RData file I saved and use it. Right?
>> >> >>
>> >> >> First of all, please note that when do ceR <- getAverageFile(ces) on
>> >> >> the same data set 'ces', the result is already available on file and
>> >> >> it will be quickly found and returned.  In other words, it will not
>> >> >> recalcuate the averages again [unless you do ceR <-
>> >> >> getAverageFile(ces, force=TRUE)].
>> >> >>
>> >> >> However, I do unde

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-06 Thread ying chen
Hi Henrik,


Sorry I have one more question to bug you. The urls in the df (by df <-
readDataFrame(db);) point to UCSCgenome browser with NCBI36/hg18 assembly.
I looked at the http://www.aroma-project.org/chipTypes/GenomeWideSNP_6 page
resource part, the SNP6 chip type files are listed as based on hg19 (Human
Genome build v36.1). This is a typo, right? I thought it should be hg18. So
the Cbs out put is also based on hg18, right?

Thanks a lot for the help!

Sean






On Fri, Aug 2, 2013 at 11:03 AM, Henrik Bengtsson <
henrik.bengts...@aroma-project.org> wrote:

> On Thu, Aug 1, 2013 at 6:37 PM, ying chen  wrote:
> > Hi Henrik,
> >
> > Thanks a lot for the help.
> >
> > For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <-
> > CbsModel(dsT, ref=dfR, tags="HapMapRef"), right?
>
> Correct, I meant:
>
> sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef")
>
> /H
>
> >
> > Thanks,
> >
> > Sean
> >
> >
> >
> > On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson
> >  wrote:
> >>
> >> On Mon, Jul 29, 2013 at 7:45 PM, ying chen  wrote:
> >> > Hi Henrik,
> >> > Thanks a lot for the help!
> >> > Sorry I have more questions. I am following "How to: Calculate total
> >> > copy
> >> > number ratios from total (non-polymorphic) signals" and "Vignette:
> Total
> >> > copy-number segmentation (non-paired CBS)", but I am not sure if I do
> it
> >> > correctly.
> >> > I have two SNP6 datasets Tumor and HapMap270 and I want to use
> HpaMap270
> >> > as
> >> > reference to go all the way to CBS step. So I do the following steps
> >> > respectively.
> >> >   > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full")
> >> >   > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full")
> >> > After that, I do
> >> >  > dataSet <- "HapMap270"
> >> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> >> >  > chipType <- "GenomeWideSNP_6"
> >> >  > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> >> > chipType=chipType)
> >> >  > dataSet <- "Tumor"
> >> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> >> >  > chipType <- "GenomeWideSNP_6"
> >> >  > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> >> > chipType=chipType)
> >> >  > dfR <- getAverageFile(dsN)   # ?
> >> >  > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ?
> >> > Would the above two steps work? My question is how to go ahead from
> here
> >> > to
> >> > do CBS and will sm <- CbsModel(dsTR) work?
> >>
> >> Skip the exportTotalCnRatioSet() call, and instead use:
> >>
> >> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef");
> >>
> >> The 'tags' is just to add an informative tag to the output data set.
> >>
> >> /Henrik
> >>
> >> > Thanks again for your help!
> >> > Sean
> >> >
> >> >
> >> >
> >> >
> >> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson
> >> >  wrote:
> >> >>
> >> >> Hi.
> >> >>
> >> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj  wrote:
> >> >> > Hi guys,
> >> >> >
> >> >> > I have a question regarding how to calculate raw copy numbers using
> >> >> > common
> >> >> > reference instead of average of all samples of the study.
> Basically I
> >> >> > want
> >> >> > to use average of HapMap270 samples as reference for all further
> copy
> >> >> > number
> >> >> > calculations.
> >> >> >
> >> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette:
> >> >> > Estimation
> >> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to
> >> >> > Step
> >> >> > 5 -
> >> >> > Calculation of raw copy numbers, and generated ceR and saved it as
> a
> >> >> > RData
> >> >> > file ceR.Rdata.
> >> >>
> >> >> It's important to understand that almost all objects in the Aroma
> >> >> framework are basically "pointers" to external files.  For instance,
> >> >> your 'ceR', which I assume you've got from something like ceR <-
> >> >> getAverageFile(ces), is referring to the file with pathname
> >> >> getPathname(ceR).  More below...
> >> >>
> >> >> >
> >> >> > My first question is, how to use this data for any future copy
> number
> >> >> > analysis? My guess is that instead of calculating the ceR from the
> >> >> > sample
> >> >> > set I can just load the ceR.RData file I saved and use it. Right?
> >> >>
> >> >> First of all, please note that when do ceR <- getAverageFile(ces) on
> >> >> the same data set 'ces', the result is already available on file and
> >> >> it will be quickly found and returned.  In other words, it will not
> >> >> recalcuate the averages again [unless you do ceR <-
> >> >> getAverageFile(ces, force=TRUE)].
> >> >>
> >> >> However, I do understand that you may not want to have to keep a
> large
> >> >> 'ces' data set around, when you're only interested in the pooled
> >> >> average.  In that case, I would copy the file containing the
> "average"
> >> >> to a new data set.  Currently, this is not straightforward in Aroma
> >> >> (I'll think about something), but you can do the following:
> >> >>
> >> >> # Calculate the pooled average
> >> >> > ceR <- getAverageFile(cesN);
> >> >>
> >> >> # Copy t

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-02 Thread Henrik Bengtsson
On Thu, Aug 1, 2013 at 6:37 PM, ying chen  wrote:
> Hi Henrik,
>
> Thanks a lot for the help.
>
> For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <-
> CbsModel(dsT, ref=dfR, tags="HapMapRef"), right?

Correct, I meant:

sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef")

/H

>
> Thanks,
>
> Sean
>
>
>
> On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson
>  wrote:
>>
>> On Mon, Jul 29, 2013 at 7:45 PM, ying chen  wrote:
>> > Hi Henrik,
>> > Thanks a lot for the help!
>> > Sorry I have more questions. I am following "How to: Calculate total
>> > copy
>> > number ratios from total (non-polymorphic) signals" and "Vignette: Total
>> > copy-number segmentation (non-paired CBS)", but I am not sure if I do it
>> > correctly.
>> > I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270
>> > as
>> > reference to go all the way to CBS step. So I do the following steps
>> > respectively.
>> >   > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full")
>> >   > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full")
>> > After that, I do
>> >  > dataSet <- "HapMap270"
>> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>> >  > chipType <- "GenomeWideSNP_6"
>> >  > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
>> > chipType=chipType)
>> >  > dataSet <- "Tumor"
>> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>> >  > chipType <- "GenomeWideSNP_6"
>> >  > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
>> > chipType=chipType)
>> >  > dfR <- getAverageFile(dsN)   # ?
>> >  > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ?
>> > Would the above two steps work? My question is how to go ahead from here
>> > to
>> > do CBS and will sm <- CbsModel(dsTR) work?
>>
>> Skip the exportTotalCnRatioSet() call, and instead use:
>>
>> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef");
>>
>> The 'tags' is just to add an informative tag to the output data set.
>>
>> /Henrik
>>
>> > Thanks again for your help!
>> > Sean
>> >
>> >
>> >
>> >
>> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson
>> >  wrote:
>> >>
>> >> Hi.
>> >>
>> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj  wrote:
>> >> > Hi guys,
>> >> >
>> >> > I have a question regarding how to calculate raw copy numbers using
>> >> > common
>> >> > reference instead of average of all samples of the study. Basically I
>> >> > want
>> >> > to use average of HapMap270 samples as reference for all further copy
>> >> > number
>> >> > calculations.
>> >> >
>> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette:
>> >> > Estimation
>> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to
>> >> > Step
>> >> > 5 -
>> >> > Calculation of raw copy numbers, and generated ceR and saved it as a
>> >> > RData
>> >> > file ceR.Rdata.
>> >>
>> >> It's important to understand that almost all objects in the Aroma
>> >> framework are basically "pointers" to external files.  For instance,
>> >> your 'ceR', which I assume you've got from something like ceR <-
>> >> getAverageFile(ces), is referring to the file with pathname
>> >> getPathname(ceR).  More below...
>> >>
>> >> >
>> >> > My first question is, how to use this data for any future copy number
>> >> > analysis? My guess is that instead of calculating the ceR from the
>> >> > sample
>> >> > set I can just load the ceR.RData file I saved and use it. Right?
>> >>
>> >> First of all, please note that when do ceR <- getAverageFile(ces) on
>> >> the same data set 'ces', the result is already available on file and
>> >> it will be quickly found and returned.  In other words, it will not
>> >> recalcuate the averages again [unless you do ceR <-
>> >> getAverageFile(ces, force=TRUE)].
>> >>
>> >> However, I do understand that you may not want to have to keep a large
>> >> 'ces' data set around, when you're only interested in the pooled
>> >> average.  In that case, I would copy the file containing the "average"
>> >> to a new data set.  Currently, this is not straightforward in Aroma
>> >> (I'll think about something), but you can do the following:
>> >>
>> >> # Calculate the pooled average
>> >> > ceR <- getAverageFile(cesN);
>> >>
>> >> # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g.
>> >> > filename <- getFilename(ceR);
>> >> > filename
>> >> [1]
>> >> ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"
>> >> > rootPath <- getParent(getPath(cesN), depth=2L);
>> >> > dataSet <- "HapMap270,pooled";
>> >> > chipType <- getChipType(ceR, fullname=FALSE);
>> >> > path <- file.path(rootPath, dataSet, chipType);
>> >> > path
>> >> [1] "plmData/HapMap270,pooled/GenomeWideSNP_6"
>> >> > mkdirs(path);
>> >> > copyFile(getPathname(ceR), file.path(path, filename));
>> >>
>> >> With this done, you can then grab this pooled reference as:
>> >>
>> >> > library("aroma.affymetrix")
>> >> > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6";
>> >> > filename <-
>> >> >
>> >> > ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-01 Thread ying chen
Hi Henrik,

Thanks a lot for the help.

For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <-
CbsModel(dsT, ref=dfR, tags="HapMapRef"), right?

Thanks,

Sean



On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson <
henrik.bengts...@aroma-project.org> wrote:

> On Mon, Jul 29, 2013 at 7:45 PM, ying chen  wrote:
> > Hi Henrik,
> > Thanks a lot for the help!
> > Sorry I have more questions. I am following "How to: Calculate total copy
> > number ratios from total (non-polymorphic) signals" and "Vignette: Total
> > copy-number segmentation (non-paired CBS)", but I am not sure if I do it
> > correctly.
> > I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270
> as
> > reference to go all the way to CBS step. So I do the following steps
> > respectively.
> >   > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full")
> >   > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full")
> > After that, I do
> >  > dataSet <- "HapMap270"
> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> >  > chipType <- "GenomeWideSNP_6"
> >  > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> > chipType=chipType)
> >  > dataSet <- "Tumor"
> >  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> >  > chipType <- "GenomeWideSNP_6"
> >  > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> > chipType=chipType)
> >  > dfR <- getAverageFile(dsN)   # ?
> >  > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ?
> > Would the above two steps work? My question is how to go ahead from here
> to
> > do CBS and will sm <- CbsModel(dsTR) work?
>
> Skip the exportTotalCnRatioSet() call, and instead use:
>
> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef");
>
> The 'tags' is just to add an informative tag to the output data set.
>
> /Henrik
>
> > Thanks again for your help!
> > Sean
> >
> >
> >
> >
> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson
> >  wrote:
> >>
> >> Hi.
> >>
> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj  wrote:
> >> > Hi guys,
> >> >
> >> > I have a question regarding how to calculate raw copy numbers using
> >> > common
> >> > reference instead of average of all samples of the study. Basically I
> >> > want
> >> > to use average of HapMap270 samples as reference for all further copy
> >> > number
> >> > calculations.
> >> >
> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette:
> >> > Estimation
> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to
> Step
> >> > 5 -
> >> > Calculation of raw copy numbers, and generated ceR and saved it as a
> >> > RData
> >> > file ceR.Rdata.
> >>
> >> It's important to understand that almost all objects in the Aroma
> >> framework are basically "pointers" to external files.  For instance,
> >> your 'ceR', which I assume you've got from something like ceR <-
> >> getAverageFile(ces), is referring to the file with pathname
> >> getPathname(ceR).  More below...
> >>
> >> >
> >> > My first question is, how to use this data for any future copy number
> >> > analysis? My guess is that instead of calculating the ceR from the
> >> > sample
> >> > set I can just load the ceR.RData file I saved and use it. Right?
> >>
> >> First of all, please note that when do ceR <- getAverageFile(ces) on
> >> the same data set 'ces', the result is already available on file and
> >> it will be quickly found and returned.  In other words, it will not
> >> recalcuate the averages again [unless you do ceR <-
> >> getAverageFile(ces, force=TRUE)].
> >>
> >> However, I do understand that you may not want to have to keep a large
> >> 'ces' data set around, when you're only interested in the pooled
> >> average.  In that case, I would copy the file containing the "average"
> >> to a new data set.  Currently, this is not straightforward in Aroma
> >> (I'll think about something), but you can do the following:
> >>
> >> # Calculate the pooled average
> >> > ceR <- getAverageFile(cesN);
> >>
> >> # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g.
> >> > filename <- getFilename(ceR);
> >> > filename
> >> [1]
> ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"
> >> > rootPath <- getParent(getPath(cesN), depth=2L);
> >> > dataSet <- "HapMap270,pooled";
> >> > chipType <- getChipType(ceR, fullname=FALSE);
> >> > path <- file.path(rootPath, dataSet, chipType);
> >> > path
> >> [1] "plmData/HapMap270,pooled/GenomeWideSNP_6"
> >> > mkdirs(path);
> >> > copyFile(getPathname(ceR), file.path(path, filename));
> >>
> >> With this done, you can then grab this pooled reference as:
> >>
> >> > library("aroma.affymetrix")
> >> > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6";
> >> > filename <-
> >> >
> ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL";
> >> > ceR <- CnChipEffectFile(filename, path=path);
> >>
> >> Note, when you save 'ceR', you are basically saving the reference to
> >> the file.  Yes, you can load it later, but make sure not to move it,
> >> otherwise you'll get some t

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-01 Thread Henrik Bengtsson
On Mon, Jul 29, 2013 at 7:45 PM, ying chen  wrote:
> Hi Henrik,
> Thanks a lot for the help!
> Sorry I have more questions. I am following "How to: Calculate total copy
> number ratios from total (non-polymorphic) signals" and "Vignette: Total
> copy-number segmentation (non-paired CBS)", but I am not sure if I do it
> correctly.
> I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 as
> reference to go all the way to CBS step. So I do the following steps
> respectively.
>   > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full")
>   > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full")
> After that, I do
>  > dataSet <- "HapMap270"
>  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>  > chipType <- "GenomeWideSNP_6"
>  > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> chipType=chipType)
>  > dataSet <- "Tumor"
>  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>  > chipType <- "GenomeWideSNP_6"
>  > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> chipType=chipType)
>  > dfR <- getAverageFile(dsN)   # ?
>  > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ?
> Would the above two steps work? My question is how to go ahead from here to
> do CBS and will sm <- CbsModel(dsTR) work?

Skip the exportTotalCnRatioSet() call, and instead use:

sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef");

The 'tags' is just to add an informative tag to the output data set.

/Henrik

> Thanks again for your help!
> Sean
>
>
>
>
> On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson
>  wrote:
>>
>> Hi.
>>
>> On Fri, Jul 26, 2013 at 8:02 AM, sean nj  wrote:
>> > Hi guys,
>> >
>> > I have a question regarding how to calculate raw copy numbers using
>> > common
>> > reference instead of average of all samples of the study. Basically I
>> > want
>> > to use average of HapMap270 samples as reference for all further copy
>> > number
>> > calculations.
>> >
>> > I have a bunch HapMap270 snp6 cel files and I followed Vignette:
>> > Estimation
>> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step
>> > 5 -
>> > Calculation of raw copy numbers, and generated ceR and saved it as a
>> > RData
>> > file ceR.Rdata.
>>
>> It's important to understand that almost all objects in the Aroma
>> framework are basically "pointers" to external files.  For instance,
>> your 'ceR', which I assume you've got from something like ceR <-
>> getAverageFile(ces), is referring to the file with pathname
>> getPathname(ceR).  More below...
>>
>> >
>> > My first question is, how to use this data for any future copy number
>> > analysis? My guess is that instead of calculating the ceR from the
>> > sample
>> > set I can just load the ceR.RData file I saved and use it. Right?
>>
>> First of all, please note that when do ceR <- getAverageFile(ces) on
>> the same data set 'ces', the result is already available on file and
>> it will be quickly found and returned.  In other words, it will not
>> recalcuate the averages again [unless you do ceR <-
>> getAverageFile(ces, force=TRUE)].
>>
>> However, I do understand that you may not want to have to keep a large
>> 'ces' data set around, when you're only interested in the pooled
>> average.  In that case, I would copy the file containing the "average"
>> to a new data set.  Currently, this is not straightforward in Aroma
>> (I'll think about something), but you can do the following:
>>
>> # Calculate the pooled average
>> > ceR <- getAverageFile(cesN);
>>
>> # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g.
>> > filename <- getFilename(ceR);
>> > filename
>> [1] ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"
>> > rootPath <- getParent(getPath(cesN), depth=2L);
>> > dataSet <- "HapMap270,pooled";
>> > chipType <- getChipType(ceR, fullname=FALSE);
>> > path <- file.path(rootPath, dataSet, chipType);
>> > path
>> [1] "plmData/HapMap270,pooled/GenomeWideSNP_6"
>> > mkdirs(path);
>> > copyFile(getPathname(ceR), file.path(path, filename));
>>
>> With this done, you can then grab this pooled reference as:
>>
>> > library("aroma.affymetrix")
>> > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6";
>> > filename <-
>> > ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL";
>> > ceR <- CnChipEffectFile(filename, path=path);
>>
>> Note, when you save 'ceR', you are basically saving the reference to
>> the file.  Yes, you can load it later, but make sure not to move it,
>> otherwise you'll get some type of "file not found" error.
>>
>> > saveObject(ceR, "HapMap270,GenomeWideSNP_6,reference.Rdata");
>>
>> If already saved, and file not moved, you can then do:
>>
>> > library("aroma.affymetrix");
>> > ceR <- loadObject("HapMap270,GenomeWideSNP_6,reference.Rdata");
>>
>> All this is very ad hoc (=non-aroma style), and as I said, I'll see if
>> I can come up with a cleaner solution for storing and retrieving
>> pooled averages.
>>
>> >
>> > My second question is, how to go ahead from there to calculate 

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-01 Thread ying chen
Sorry, one more question regarding CBS. During

fit(sm, verbose=-10)
Building tuple of reference sets...
 Type of reference: median
 No reference available.
  Calculating average copy-number signals...
   Retrieving average unit signals across 291 arrays...
..

I thought I used HapMap270 data as reference, why here it says no reference
available and uses average instead?

Thanks,

Sean



On Thu, Aug 1, 2013 at 10:56 AM, ying chen  wrote:

> Sorry, one more question regarding CBS. During
>
> fit(sm, verbose=-10)
> Building tuple of reference sets...
>  Type of reference: median
>  No reference available.
>  Calculating average copy-number signals...
>   Retrieving average unit signals across 291 arrays...
> ..
>
>
> On Thu, Aug 1, 2013 at 10:53 AM, ying chen  wrote:
>
>> Hi Henrik,
>>
>> I tried method I mentioned above. But I got an error message when running
>> fit(sm, verbose=-10).
>>
>>
>>
>>  Array #1 ('321T') of 291 on chromosome 1...
>>  Error in UseMethod("getChecksum") :
>>no applicable method for 'getChecksum' applied to an object of class
>> "list"
>>
>> What did I do wrong?
>>
>> Thanks a lot for the help!
>>
>> Sean
>>
>>
>> > library(aroma.affymetrix)
>>
>> > dataSet <- "HapMap270"
>> > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>> > chipType <- "GenomeWideSNP_6"
>> > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
>> chipType=chipType)
>> > dataSet <- "Tumor"
>> > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
>> chipType=chipType)
>> > dfR <- getAverageFile(dsN)
>> > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR)
>> There were 50 or more warnings (use warnings() to see the first 50)
>> > warnings()
>> Warning messages:
>> 1: In log(C) : NaNs produced
>> 2: In log(C) : NaNs produced
>> 3: In log(C) : NaNs produced
>> ...
>> > print(dsTR)
>> AromaUnitTotalCnBinarySet:
>> Name: Tumor
>> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>> Number of files: 291
>> Names: 321T, 322T, 323T, ..., T97 [291]
>> Path (to the first file):
>> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
>> Total file size: 2088.72 MB
>> RAM: 0.37MB
>> > print(dsT)
>> AromaUnitTotalCnBinarySet:
>> Name: Tumor
>> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>> Number of files: 291
>> Names: 321T, 322T, 323T, ..., T97 [291]
>> Path (to the first file):
>> totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
>> Total file size: 2088.64 MB
>> RAM: 0.37MB
>> >
>> > sm <- CbsModel(dsTR)
>> >
>> Loading required package: DNAcopy
>> > print(sm)
>> CbsModel:
>> Name: Tumor
>> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired
>> Chip type (virtual): GenomeWideSNP_6
>> Path:
>> cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6
>> Number of chip types: 1
>> Sample & reference file pairs:
>> Chip type #1 ('GenomeWideSNP_6') of 1:
>> Sample data set:
>> AromaUnitTotalCnBinarySet:
>> Name: Tumor
>> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>> Number of files: 291
>> Names: 321T, 322T, 323T, ..., T97 [291]
>> Path (to the first file):
>> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
>> Total file size: 2088.72 MB
>> RAM: 0.37MB
>> Reference: 
>> RAM: 0.00MB
>> >
>> > fit(sm, verbose=-10)
>> Building tuple of reference sets...
>>  Type of reference: median
>>  No reference available.
>>  Calculating average copy-number signals...
>>   Retrieving average unit signals across 291 arrays...
>>AromaUnitTotalCnBinaryFile:
>>Name: .average-signals-median-mad
>>Tags: 8143f776d59b4c432debaf4bb288
>>Full name: .average-signals-median-mad,8143f776d59b4c432debaf4bb288
>>Pathname:
>> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f776d59b4c432debaf4bb288.asb
>>File size: 7.18 MB (7526027 bytes)
>>RAM: 0.00 MB
>>Number of data rows: 1881415
>>File format: v1
>>Dimensions: 1881415x1
>>Column classes: double
>>Number of bytes per column: 4
>>Footer: 20130801 10:20:33
>> EDTAffymetrixGenomeWideSNP_6,Full29160c565ccf6239a081e4162b3328e1ccbmedianmad
>>Platform: Affymetrix
>>Chip type: GenomeWideSNP_6,Full
>>Number of units to be updated: 1
>>Processing chunk...
>>  chr "Indices in chunk:"
>>  int 22933
>> Reading data...
>> Reading data...done
>> Estimating averages and standard deviations...
>> Estimating averages and standard deviations...done
>> Writing estimates...
>> Writing estimates...done
>>Processing chunk...done
>>   Retrieving average unit signals across 291 arrays...done
>>  Calculating average copy-number signals...done
>> Building tuple of reference sets...done
>> Using reference tuple:
>> AromaUnitTotalCnBinarySetTuple:
>> Name: Tumor
>> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>> C

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-01 Thread ying chen
Sorry, one more question regarding CBS. During

fit(sm, verbose=-10)
Building tuple of reference sets...
 Type of reference: median
 No reference available.
 Calculating average copy-number signals...
  Retrieving average unit signals across 291 arrays...
..


On Thu, Aug 1, 2013 at 10:53 AM, ying chen  wrote:

> Hi Henrik,
>
> I tried method I mentioned above. But I got an error message when running
> fit(sm, verbose=-10).
>
>
>
>  Array #1 ('321T') of 291 on chromosome 1...
>  Error in UseMethod("getChecksum") :
>no applicable method for 'getChecksum' applied to an object of class
> "list"
>
> What did I do wrong?
>
> Thanks a lot for the help!
>
> Sean
>
>
> > library(aroma.affymetrix)
>
> > dataSet <- "HapMap270"
> > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> > chipType <- "GenomeWideSNP_6"
> > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> chipType=chipType)
> > dataSet <- "Tumor"
> > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> chipType=chipType)
> > dfR <- getAverageFile(dsN)
> > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR)
> There were 50 or more warnings (use warnings() to see the first 50)
> > warnings()
> Warning messages:
> 1: In log(C) : NaNs produced
> 2: In log(C) : NaNs produced
> 3: In log(C) : NaNs produced
> ...
> > print(dsTR)
> AromaUnitTotalCnBinarySet:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Number of files: 291
> Names: 321T, 322T, 323T, ..., T97 [291]
> Path (to the first file):
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
> Total file size: 2088.72 MB
> RAM: 0.37MB
> > print(dsT)
> AromaUnitTotalCnBinarySet:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Number of files: 291
> Names: 321T, 322T, 323T, ..., T97 [291]
> Path (to the first file):
> totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
> Total file size: 2088.64 MB
> RAM: 0.37MB
> >
> > sm <- CbsModel(dsTR)
> >
> Loading required package: DNAcopy
> > print(sm)
> CbsModel:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired
> Chip type (virtual): GenomeWideSNP_6
> Path:
> cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6
> Number of chip types: 1
> Sample & reference file pairs:
> Chip type #1 ('GenomeWideSNP_6') of 1:
> Sample data set:
> AromaUnitTotalCnBinarySet:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Number of files: 291
> Names: 321T, 322T, 323T, ..., T97 [291]
> Path (to the first file):
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
> Total file size: 2088.72 MB
> RAM: 0.37MB
> Reference: 
> RAM: 0.00MB
> >
> > fit(sm, verbose=-10)
> Building tuple of reference sets...
>  Type of reference: median
>  No reference available.
>  Calculating average copy-number signals...
>   Retrieving average unit signals across 291 arrays...
>AromaUnitTotalCnBinaryFile:
>Name: .average-signals-median-mad
>Tags: 8143f776d59b4c432debaf4bb288
>Full name: .average-signals-median-mad,8143f776d59b4c432debaf4bb288
>Pathname:
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f776d59b4c432debaf4bb288.asb
>File size: 7.18 MB (7526027 bytes)
>RAM: 0.00 MB
>Number of data rows: 1881415
>File format: v1
>Dimensions: 1881415x1
>Column classes: double
>Number of bytes per column: 4
>Footer: 20130801 10:20:33
> EDTAffymetrixGenomeWideSNP_6,Full29160c565ccf6239a081e4162b3328e1ccbmedianmad
>Platform: Affymetrix
>Chip type: GenomeWideSNP_6,Full
>Number of units to be updated: 1
>Processing chunk...
>  chr "Indices in chunk:"
>  int 22933
> Reading data...
> Reading data...done
> Estimating averages and standard deviations...
> Estimating averages and standard deviations...done
> Writing estimates...
> Writing estimates...done
>Processing chunk...done
>   Retrieving average unit signals across 291 arrays...done
>  Calculating average copy-number signals...done
> Building tuple of reference sets...done
> Using reference tuple:
> AromaUnitTotalCnBinarySetTuple:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Chip types: GenomeWideSNP_6
> AromaUnitTotalCnBinarySet:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Number of files: 291
> Names: .average-signals-median-mad, .average-signals-median-mad,
> .average-signals-median-mad, ..., .average-signals-median-mad [291]
> Path (to the first file):
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
> Total file size: 2088.62 MB
> RAM: 0.37MB
> RAM: 0.00MB
> Extract DataFileMatrix...
>  Array: 1
>  Test data sets:
>  AromaUnitTotalCnBinarySetTuple:
>  Name: Tumor
>  Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,F

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-08-01 Thread ying chen
Hi Henrik,

I tried method I mentioned above. But I got an error message when running
fit(sm, verbose=-10).



 Array #1 ('321T') of 291 on chromosome 1...
 Error in UseMethod("getChecksum") :
   no applicable method for 'getChecksum' applied to an object of class
"list"

What did I do wrong?

Thanks a lot for the help!

Sean


> library(aroma.affymetrix)
> dataSet <- "HapMap270"
> tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> chipType <- "GenomeWideSNP_6"
> dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
chipType=chipType)
> dataSet <- "Tumor"
> dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
chipType=chipType)
> dfR <- getAverageFile(dsN)
> dsTR <- exportTotalCnRatioSet(dsT, ref=dfR)
There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In log(C) : NaNs produced
2: In log(C) : NaNs produced
3: In log(C) : NaNs produced
...
> print(dsTR)
AromaUnitTotalCnBinarySet:
Name: Tumor
Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Number of files: 291
Names: 321T, 322T, 323T, ..., T97 [291]
Path (to the first file):
rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
Total file size: 2088.72 MB
RAM: 0.37MB
> print(dsT)
AromaUnitTotalCnBinarySet:
Name: Tumor
Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Number of files: 291
Names: 321T, 322T, 323T, ..., T97 [291]
Path (to the first file):
totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
Total file size: 2088.64 MB
RAM: 0.37MB
>
> sm <- CbsModel(dsTR)
>
Loading required package: DNAcopy
> print(sm)
CbsModel:
Name: Tumor
Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired
Chip type (virtual): GenomeWideSNP_6
Path:
cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6
Number of chip types: 1
Sample & reference file pairs:
Chip type #1 ('GenomeWideSNP_6') of 1:
Sample data set:
AromaUnitTotalCnBinarySet:
Name: Tumor
Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Number of files: 291
Names: 321T, 322T, 323T, ..., T97 [291]
Path (to the first file):
rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
Total file size: 2088.72 MB
RAM: 0.37MB
Reference: 
RAM: 0.00MB
>
> fit(sm, verbose=-10)
Building tuple of reference sets...
 Type of reference: median
 No reference available.
 Calculating average copy-number signals...
  Retrieving average unit signals across 291 arrays...
   AromaUnitTotalCnBinaryFile:
   Name: .average-signals-median-mad
   Tags: 8143f776d59b4c432debaf4bb288
   Full name: .average-signals-median-mad,8143f776d59b4c432debaf4bb288
   Pathname:
rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f776d59b4c432debaf4bb288.asb
   File size: 7.18 MB (7526027 bytes)
   RAM: 0.00 MB
   Number of data rows: 1881415
   File format: v1
   Dimensions: 1881415x1
   Column classes: double
   Number of bytes per column: 4
   Footer: 20130801 10:20:33
EDTAffymetrixGenomeWideSNP_6,Full29160c565ccf6239a081e4162b3328e1ccbmedianmad
   Platform: Affymetrix
   Chip type: GenomeWideSNP_6,Full
   Number of units to be updated: 1
   Processing chunk...
 chr "Indices in chunk:"
 int 22933
Reading data...
Reading data...done
Estimating averages and standard deviations...
Estimating averages and standard deviations...done
Writing estimates...
Writing estimates...done
   Processing chunk...done
  Retrieving average unit signals across 291 arrays...done
 Calculating average copy-number signals...done
Building tuple of reference sets...done
Using reference tuple:
AromaUnitTotalCnBinarySetTuple:
Name: Tumor
Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Chip types: GenomeWideSNP_6
AromaUnitTotalCnBinarySet:
Name: Tumor
Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
Number of files: 291
Names: .average-signals-median-mad, .average-signals-median-mad,
.average-signals-median-mad, ..., .average-signals-median-mad [291]
Path (to the first file):
rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
Total file size: 2088.62 MB
RAM: 0.37MB
RAM: 0.00MB
Extract DataFileMatrix...
 Array: 1
 Test data sets:
 AromaUnitTotalCnBinarySetTuple:
 Name: Tumor
 Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
 Chip types: GenomeWideSNP_6
 AromaUnitTotalCnBinarySet:
 Name: Tumor
 Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
 Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
 Number of files: 291
 Names: 321T, 322T, 323T, ..., T97 [291]
 Path (to the first file):
rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
 Total file size: 2088.72 MB
 RAM: 0.37MB
 RAM: 0.00MB
 Test data files:
 $`GenomeWideSNP_6,Full`
 AromaUnitTotalCnBinaryFile:
 Name: 321T
 Tags:
ref=.average-signals-median-mad,899bb16b33895d8872112e64f5eb74b7,log2ratio,total
 Full name:
321T,ref=.average-signals-median-mad

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-07-29 Thread ying chen
Hi Henrik,
Thanks a lot for the help!
Sorry I have more questions. I am following "How to: Calculate total copy
number ratios from total (non-polymorphic) signals" and "Vignette: Total
copy-number segmentation (non-paired CBS)", but I am not sure if I do it
correctly.
I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 as
reference to go all the way to CBS step. So I do the following steps
respectively.
  > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full")
  > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full")
After that, I do
 > dataSet <- "HapMap270"
 > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
 > chipType <- "GenomeWideSNP_6"
 > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
chipType=chipType)
 > dataSet <- "Tumor"
 > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
 > chipType <- "GenomeWideSNP_6"
 > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
chipType=chipType)
 > dfR <- getAverageFile(dsN)   # ?
 > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ?
Would the above two steps work? My question is how to go ahead from here to
do CBS and will sm <- CbsModel(dsTR) work?
Thanks again for your help!
Sean




On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson <
henrik.bengts...@aroma-project.org> wrote:

> Hi.
>
> On Fri, Jul 26, 2013 at 8:02 AM, sean nj  wrote:
> > Hi guys,
> >
> > I have a question regarding how to calculate raw copy numbers using
> common
> > reference instead of average of all samples of the study. Basically I
> want
> > to use average of HapMap270 samples as reference for all further copy
> number
> > calculations.
> >
> > I have a bunch HapMap270 snp6 cel files and I followed Vignette:
> Estimation
> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step
> 5 -
> > Calculation of raw copy numbers, and generated ceR and saved it as a
> RData
> > file ceR.Rdata.
>
> It's important to understand that almost all objects in the Aroma
> framework are basically "pointers" to external files.  For instance,
> your 'ceR', which I assume you've got from something like ceR <-
> getAverageFile(ces), is referring to the file with pathname
> getPathname(ceR).  More below...
>
> >
> > My first question is, how to use this data for any future copy number
> > analysis? My guess is that instead of calculating the ceR from the sample
> > set I can just load the ceR.RData file I saved and use it. Right?
>
> First of all, please note that when do ceR <- getAverageFile(ces) on
> the same data set 'ces', the result is already available on file and
> it will be quickly found and returned.  In other words, it will not
> recalcuate the averages again [unless you do ceR <-
> getAverageFile(ces, force=TRUE)].
>
> However, I do understand that you may not want to have to keep a large
> 'ces' data set around, when you're only interested in the pooled
> average.  In that case, I would copy the file containing the "average"
> to a new data set.  Currently, this is not straightforward in Aroma
> (I'll think about something), but you can do the following:
>
> # Calculate the pooled average
> > ceR <- getAverageFile(cesN);
>
> # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g.
> > filename <- getFilename(ceR);
> > filename
> [1] ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"
> > rootPath <- getParent(getPath(cesN), depth=2L);
> > dataSet <- "HapMap270,pooled";
> > chipType <- getChipType(ceR, fullname=FALSE);
> > path <- file.path(rootPath, dataSet, chipType);
> > path
> [1] "plmData/HapMap270,pooled/GenomeWideSNP_6"
> > mkdirs(path);
> > copyFile(getPathname(ceR), file.path(path, filename));
>
> With this done, you can then grab this pooled reference as:
>
> > library("aroma.affymetrix")
> > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6";
> > filename <-
> ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL";
> > ceR <- CnChipEffectFile(filename, path=path);
>
> Note, when you save 'ceR', you are basically saving the reference to
> the file.  Yes, you can load it later, but make sure not to move it,
> otherwise you'll get some type of "file not found" error.
>
> > saveObject(ceR, "HapMap270,GenomeWideSNP_6,reference.Rdata");
>
> If already saved, and file not moved, you can then do:
>
> > library("aroma.affymetrix");
> > ceR <- loadObject("HapMap270,GenomeWideSNP_6,reference.Rdata");
>
> All this is very ad hoc (=non-aroma style), and as I said, I'll see if
> I can come up with a cleaner solution for storing and retrieving
> pooled averages.
>
> >
> > My second question is, how to go ahead from there to calculate the
> relative
> > copy numbers for all unit from all samples? The two examples given in the
> > Vignette  are for one unit from one sample and for a few unit on
> chromosome 2
> > for one sample. What is the function to retrieve all units on all
> > chromosomes instead of units <- getUnitsOnChromosome(gi, chromosome=2,
> > region=c(81,86)*1e6)?
>
> You can set 'units'

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

2013-07-28 Thread Henrik Bengtsson
Hi.

On Fri, Jul 26, 2013 at 8:02 AM, sean nj  wrote:
> Hi guys,
>
> I have a question regarding how to calculate raw copy numbers using common
> reference instead of average of all samples of the study. Basically I want
> to use average of HapMap270 samples as reference for all further copy number
> calculations.
>
> I have a bunch HapMap270 snp6 cel files and I followed Vignette: Estimation
> of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step 5 -
> Calculation of raw copy numbers, and generated ceR and saved it as a RData
> file ceR.Rdata.

It's important to understand that almost all objects in the Aroma
framework are basically "pointers" to external files.  For instance,
your 'ceR', which I assume you've got from something like ceR <-
getAverageFile(ces), is referring to the file with pathname
getPathname(ceR).  More below...

>
> My first question is, how to use this data for any future copy number
> analysis? My guess is that instead of calculating the ceR from the sample
> set I can just load the ceR.RData file I saved and use it. Right?

First of all, please note that when do ceR <- getAverageFile(ces) on
the same data set 'ces', the result is already available on file and
it will be quickly found and returned.  In other words, it will not
recalcuate the averages again [unless you do ceR <-
getAverageFile(ces, force=TRUE)].

However, I do understand that you may not want to have to keep a large
'ces' data set around, when you're only interested in the pooled
average.  In that case, I would copy the file containing the "average"
to a new data set.  Currently, this is not straightforward in Aroma
(I'll think about something), but you can do the following:

# Calculate the pooled average
> ceR <- getAverageFile(cesN);

# Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g.
> filename <- getFilename(ceR);
> filename
[1] ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"
> rootPath <- getParent(getPath(cesN), depth=2L);
> dataSet <- "HapMap270,pooled";
> chipType <- getChipType(ceR, fullname=FALSE);
> path <- file.path(rootPath, dataSet, chipType);
> path
[1] "plmData/HapMap270,pooled/GenomeWideSNP_6"
> mkdirs(path);
> copyFile(getPathname(ceR), file.path(path, filename));

With this done, you can then grab this pooled reference as:

> library("aroma.affymetrix")
> path <- "plmData/HapMap270,pooled/GenomeWideSNP_6";
> filename <- 
> ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL";
> ceR <- CnChipEffectFile(filename, path=path);

Note, when you save 'ceR', you are basically saving the reference to
the file.  Yes, you can load it later, but make sure not to move it,
otherwise you'll get some type of "file not found" error.

> saveObject(ceR, "HapMap270,GenomeWideSNP_6,reference.Rdata");

If already saved, and file not moved, you can then do:

> library("aroma.affymetrix");
> ceR <- loadObject("HapMap270,GenomeWideSNP_6,reference.Rdata");

All this is very ad hoc (=non-aroma style), and as I said, I'll see if
I can come up with a cleaner solution for storing and retrieving
pooled averages.

>
> My second question is, how to go ahead from there to calculate the relative
> copy numbers for all unit from all samples? The two examples given in the
> Vignette  are for one unit from one sample and for a few unit on chromosome 2
> for one sample. What is the function to retrieve all units on all
> chromosomes instead of units <- getUnitsOnChromosome(gi, chromosome=2,
> region=c(81,86)*1e6)?

You can set 'units' to NULL to retrieve all loci, i.e. no need to use
getUnitsOnChromosome().  FYI, units <- NULL will give the same data as
with units <- 1:nbrOfUnits(gi).

> And what is the function to retrieve all samples
> instead of ce <- getFile(cesN, indexOf(cesN, "NA06985"))?

Hmm... not clear what you mean.  All samples are in 'cesN', and you do
need to iterate over them somehow.  Is this what you're looking for?

for (ii in seq_along(cesN)) {
  ce <- getFile(cesN, ii)
  ...
}

Or are you asking how to extract the data from all samples?  Then you can do:

theta <- extractTheta(cesN, units=units)

but be careful because that loads a lot of data into memory.

Hope this helps,

Henrik

>
> Thanks a lot for the help,
>
> Sean
>
> --
> --
> When reporting problems on aroma.affymetrix, make sure 1) to run the latest
> version of the package, 2) to report the output of sessionInfo() and
> traceback(), and 3) to post a complete code example.
>
>
> You received this message because you are subscribed to the Google Groups
> "aroma.affymetrix" group with website http://www.aroma-project.org/.
> To post to this group, send email to aroma-affymetrix@googlegroups.com
> To unsubscribe and other options, go to http://www.aroma-project.org/forum/
>
> ---
> You received this message because you are subscribed to the Google Groups
> "aroma.affymetrix" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to aroma-affymetrix