Hi,

I could reproduce the error with the data you've sent me offline.
I've located the bug, which indeed was an edge-case bug, and I've
fixed it.  Update to PSCBS v0.35.0:

source("http://aroma-project.org/hbLite.R";);
hbInstall("PSCBS");

I've verified that it works with your data.  Below is the script I used.

stopifnot(packageVersion("PSCBS") >= "0.35.0");

library("PSCBS");

library("R.devices");
devOptions("png", width=1024);
setOption("devEval/args/force", FALSE);

sampleName <- "ingrid";

## Load the data from a previously saved segmentation object.
pathname <- sprintf("%s.Rbin", sampleName);
fit <- loadObject(pathname);
data <- getLocusData(fit);
rm(fit);


## Locate centromeres and other large gaps in data
gaps <- findLargeGaps(data, minLength=1e6);
knownSegments <- gapsToSegments(gaps);
## => dim(knownSegments) == c(64, 4)

## 'betaT' is already normalized using TumorBoost => tbn=FALSE.
fit <- segmentByNonPairedPSCBS(data, knownSegments=knownSegments,
avgDH="median", seed=0xBEEF, verbose=-10);
printf("Number of segments: %d\n", nbrOfSegments(fit));
## Number of segments: 568

toPNG(sampleName, tags="tracks,avgDH=median", aspectRatio=0.6, {
  plotTracks(fit);
});


## Estimate global background level in [0,1] (due to normal
contamination and more)
kappa <- estimateKappa(fit, verbose=-10);
printf("Kappa: %g\n", kappa);
## Kappa: 0.374814


## Call allelic balance
## (a) Estimate DH threshold for calling AB
deltaAB <- estimateDeltaAB(fit, kappa=kappa); # If skipped, will be
done internally
printf("Delta_AB: %g\n", deltaAB);
## Delta_AB: 0.146787

## (b) Call AB based on bootstrapped segment DH levels
fit <- callAB(fit, delta=deltaAB, verbose=-100);
printf("Number of AB segments: %d\n", sum(getSegments(fit)$abCall, na.rm=TRUE));
## Number of AB segments: 193

toPNG(sampleName, tags="tracks,avgDH=median,AB", aspectRatio=0.6, {
  plotTracks(fit);
});


## Call loss of heterozygosity (LOH)
## (a) Estimate C_1 threshold for calling LOH
deltaLOH <- estimateDeltaLOH(fit); # If skipped, will be done internally
printf("Delta_LOH: %g\n", deltaLOH);
## Delta_LOH: 0.566545

## (b) Call LOH based on bootstrapped segment C_1 levels
fit <- callLOH(fit, delta=deltaLOH, verbose=-10);
printf("Number of LOH segments: %d\n", sum(getSegments(fit)$lohCall,
na.rm=TRUE));
## Number of LOH segments: 149

toPNG(sampleName, tags="tracks,avgDH=median,AB+LOH", aspectRatio=0.6, {
  plotTracks(fit);
});


## Call NTCN
## (a) Estimate the threshold for calling neutral TCN segments
##     By shrinking 'scale', more segments will be non-NTCN.
deltaCN <- estimateDeltaCN(fit, scale=1.0, kappa=kappa);
printf("Delta_CN: %g\n", deltaCN);
## Delta_CN: 0.312593

## (b) Call NTCN based on bootstrapped segment TCN levels
fit <- callNTCN(fit, delta=deltaCN, verbose=-20);
printf("Number of NTCN segments: %d\n", sum(getSegments(fit)$ntcnCall,
na.rm=TRUE));
## Number of NTCN segments: 303

toPNG(sampleName, tags="tracks,avgDH=median,AB+LOH+NTCN", aspectRatio=0.6, {
  plotTracks(fit);
});

/Henrik


On Mon, Apr 22, 2013 at 6:00 PM, Henrik Bengtsson
<henrik.bengts...@aroma-project.org> wrote:
> Hi Ingrid.
>
> [cleaning up by dropping everything but the
> bootstrapTCNandDHByRegion() error; will reply to the other stuff
> separately.]
>
> On Sun, Apr 21, 2013 at 8:33 PM, Ingrid Lönnstedt
> <ingrid.lonnst...@gmail.com> wrote:
>>
>> Hi Henrik,
>> Thank you for your answers! You do a great job, indeed! I am now back to
>> this. See a couple of remaining issues below.
>> /Ingrid
>>
>>
>>
>> On Wednesday, 27 March 2013 18:16:07 UTC+11, Henrik Bengtsson wrote:
>>>
>>> On Tue, Mar 26, 2013 at 10:03 PM, Ingrid Lönnstedt
>>> <ingrid.l...@gmail.com> wrote:
>>> >
>>> > Thank you Henrik!
>>> > OK, I followed your instructions, and the same problem (the error in
>>> > callAB()) remains, or rather, it now occurs already in sample one. Also,
>
> [snip]
>
>> So, my main problem that remains is insanity check failures in callAB(). I
>> have followed my sample 1 into bootstrapTCNandDHByRegion.PairedPSCBS with
>> debug() and concluded that I have 592 segments, and the first failure occurs
>> with the 233rd segment. I have printed the CBS info for segments 220 to 233,
>> in case that would help, as well as the current error message and session
>> Info. If there is a failure with segment 233, is there a reason not to
>> report the callAB  outcomes of the other segments? What does the sanity
>> check failure tell me, and could I do any fix to get round it or at least
>> get something for the other segments?
>>
>> Thanks,
>> Ingrid
>>
>>
>> Browse[3]> segs[220:233,]
>>
>>     chromosome tcnId dhId  tcnStart    tcnEnd tcnNbrOfLoci   tcnMean
>> 220          9    11    1  73187382 125406672        38303 2.1977136
>> 221          9    12    1 125406672 141091395         8913 2.1506696
>> 222         NA    NA   NA        NA        NA            0        NA
>> 223         10     1    1     72760  17932882        15166 2.0003566
>> 224         10     2    1  17932882  17956182            2 4.5480683
>> 225         10     3    1  17956182  18179544           18 1.9775509
>> 226         10     4    1  18179544  18202780            3 0.6127961
>> 227         10     5    1  18202780  24376470         4403 2.0286301
>> 228         10     6    1  24376470  24378482           11 0.6265435
>> 229         10     7    1  24378482  39076222        10124 2.0268394
>> 230         10     8    1  39076223  42433539            0       NaN
>> 231         10     9    1  42433540  86411230        28982 2.0375534
>> 232         10    10    1  86411230  89029244         1697 2.0558821
>> 233         10    11    1  89029244  89115851           21 2.5505440
>>     tcnNbrOfSNPs tcnNbrOfHets dhNbrOfLoci   dhStart     dhEnd     dhMean
>> 220        18928         5290        5290  73187382 125406672 0.07756713
>> 221         4002         1101        1101 125406672 141091395 0.07943612
>> 222           NA           NA           0        NA        NA         NA
>> 223         8456         2087        2087     72760  17932882 0.07487577
>> 224            0            0           0  17932882  17956182         NA
>> 225            4            1           1  17956182  18179544 0.04524338
>> 226            0            0           0  18179544  18202780         NA
>> 227         2212          543         543  18202780  24376470 0.07859021
>> 228            0            0           0  24376470  24378482         NA
>> 229         4971         1086        1086  24378482  39076222 0.07712847
>> 230            0            0           0  39076223  42433539         NA
>> 231        13983         3576        3576  42433540  86411230 0.06990635
>> 232          858          192         192  86411230  89029244 0.06439847
>> 233            4            1           1  89029244  89115851 0.36474615
>>        c1Mean   c2Mean
>> 220 1.0136216 1.184092
>> 221 0.9899144 1.160755
>> 222        NA       NA
>> 223 0.9252892 1.075067
>> 224        NA       NA
>> 225 0.9440399 1.033511
>> 226        NA       NA
>> 227 0.9345998 1.094030
>> 228        NA       NA
>> 229 0.9352562 1.091583
>> 230        NA      NaN
>> 231 0.9475577 1.089996
>> 232 0.9617432 1.094139
>> 233 0.8101214 1.740423
>>
>> deltaCN <- estimateDeltaCN(fit, scale=1) #0.2601609 for my sample
>> fit = callAB(fit, delta = deltaCN, verbose = -10)
>> List of 7
>>  $ nbrOfTCNs      : int 21
>>  $ tcnNbrOfLoci   : int 21
>>  $ mu             : num 2.58
>>  $ tcnMean        : num 2.55
>>  $ dMu            : num 0.0248
>>  $ abs(dMu)       : num 0.0248
>>  $ range(x[units]): num [1:2] 89029244 89114401
>>
>> Error in bootstrapTCNandDHByRegion.PairedPSCBS(fit, statsFcn = statsFcn,
>> :
>>   INTERNAL ERROR: Incorrect TCN mean!
>
> This is an internal error that the *implementation* of the PSCBS
> algorithm is correct and handles all "edge cases"; apparently there is
> still another case that was not considered.  These "edge cases" are
> for instance when two SNPs share the exact same location but the
> internal CBS segmentation method wants to put a change point in
> between them because that's what the data happens to say.  In other
> words, it's nothing you're doing wrong and probably nothing you should
> try to work around either. (I'll think about this a bit more since it
> happens once in a while and it would be nice to be able to keep going.
>  On the other hand, if it would be just a warning you would have never
> told me and the error would never be fixed.)
>
> First, update to the latest PSCBS v0.34.8 (on CRAN since yday) and
> verify that the error remains.  If it does, could you save the
> segmentation result;
>
> saveObject(fit, "problem.Rdata")
>
> and make 'problem.Rdata' available to me for download somewhere?  Then
> I can troubleshoot.
>
> /Henrik
>
>>
>>
>> > sessionInfo()
>> R version 2.15.2 Patched (2013-02-09 r61879)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=C                 LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>  [1] DNAcopy_1.32.0         PSCBS_0.34.1           aroma.affymetrix_2.8.0
>>  [4] affxparser_1.30.2      aroma.apd_0.2.3        R.huge_0.4.1
>>  [7] aroma.light_1.28.0     aroma.core_2.8.0       matrixStats_0.7.0
>> [10] R.rsp_0.8.2            R.devices_2.1.3        R.cache_0.6.5
>> [13] R.filesets_2.0.0       R.utils_1.23.2         R.oo_1.13.0
>> [16] R.methodsS3_1.4.2
>>
>> loaded via a namespace (and not attached):
>> [1] calmate_0.10.0 digest_0.6.3   MASS_7.3-23
>>
>>
>>>
>>> > All the best,
>>> > Ingrid
>>> >
>>> >
>>> > On Saturday, 23 March 2013 05:03:50 UTC+11, Henrik Bengtsson wrote:
>>> >>
>>> >> Hi Ingrid.
>>> >>
>>> >> On Wed, Mar 20, 2013 at 4:45 PM, Ingrid Lönnstedt
>>> >> <ingrid.l...@gmail.com> wrote:
>>> >> >
>>> >> > Hi!
>>> >> >
>>> >> > I have segmented 56 SNP 6 tumour samples with
>>> >> > segmentByNonPairedPSCBS(),
>>> >> > and like to call allelic balance segments with callAB() as required,
>>> >> > I
>>> >> > understand, before I can call neutral TCNs with callNTCN().
>>> >>
>
> [snip]
>
>>> >>
>>> >> source("http://aroma-project.org/hbLite.R";);
>>> >> hbInstall("PSCBS");
>>> >>
>>> >>
>>> >>
>>> >> > For the first
>>> >> > tumour sample, both callAB() and callNTCN() seem to work fine, but
>>> >> > then on
>>> >> > the second sample I get errors with callAB(), see below.
>>> >>
>>> >> See comment after the error.
>>> >>
>>> >> > The plotTracks()
>>> >> > image looks pretty much the same between sample one and two, and I
>>> >> > have no
>>> >> > reason to believe that sample two has a major problem in the
>>> >> > segmentation. 4
>>> >> > of the 56 samples get very noisy segmentations, with lots of short
>>> >> > regions
>>> >> > scattered (has anybody seen that before?)
>
> [snip]
>
>>> >> >   betaT = dataN[,'fracB',samp]
>>> >> >   ugp = getAromaUgpFile(dsNList$total)
>>> >> >   chromosome = ugp[,1,drop = T]
>>> >> >   x = ugp[, 2, drop = T]
>>> >> >   df = data.frame(chromosome = chromosome, x=x, CT=CT, betaT =
>>> >> > betaT)
>>> >> >   df = dropSegmentationOutliers(df)
>>> >> >         #This gives warnings:
>>> >> >         #24: In DNAcopy::CNA(genomdat = yKK, chrom = chromosomeKK,
>>> >> > ... :
>>> >> >         #array has repeated maploc positions
>>> >> >   gaps = findLargeGaps(df, minLength = 1e+06)
>>> >> >   knownSegments = gapsToSegments(gaps)
>>> >> >
>>> >> >   fit = segmentByNonPairedPSCBS(df, knownSegments = knownSegments,
>>> >> > avgDH="median",
>>> >> >                                                      tauA = .33,
>>> >> > seed =
>>> >> > 48879, verbose = -10)
>>> >> >       #This gives warnings that there are multipe maploc positions
>>> >> >   fit = callAB(fit, verbose = -10)
>>> >> >
>>> >> > Calling segments of allelic balance from one-sided DH bootstrap
>>> >> > confidence
>>> >> > intervals...
>>> >> >  delta (offset adjusting for bias in DH): 0.211106073856354
>>> >> >  alpha (CI quantile; significance level): 0.05
>>> >> > List of 7
>>> >> >  $ nbrOfDHs       : int 70
>>> >> >  $ dhNbrOfLoci    : int 70
>>> >> >  $ mu             : num 0.183
>>> >> >  $ dhMean         : num 0.203
>>> >> >  $ dMu            : num -0.02
>>> >> >  $ abs(dMu)       : num 0.02
>>> >> >  $ range(x[units]): num [1:2] 990417 2792801
>>> >> > Error in bootstrapTCNandDHByRegion.PairedPSCBS(fit, statsFcn =
>>> >> > statsFcn,
>>> >> > :
>>> >> >   INTERNAL ERROR: Incorrect DH mean!
>>> >> >  Calling segments of allelic balance from one-sided DH bootstrap
>>> >> > confidence intervals...done
>>> >>
>>> >> This is a internal sanity check throwing an error.   As a start, I
>>> >> recommend to retry with (i) a proper CT calculate, and (ii) after
>>> >> updated PSCBS.
>>> >>
>>> >> Hope this helps
>>> >>
>>> >> Henrik
>>> >>
>>> >> >
>>> >> >
>>> >> > >
>>> >> > > head(getSegments(fit))
>>> >> >   chromosome tcnId dhId   tcnStart     tcnEnd tcnNbrOfLoci   tcnMean
>>> >> > tcnNbrOfSNPs tcnNbrOfHets dhNbrOfLoci    dhStart      dhEnd
>>> >> > 1          1     1    1    61736.0   811403.5           73 1.4735545
>>> >> > 15            2           2    61736.0   811403.5
>>> >> > 2          1     2    1   811403.5  2821394.0          527 0.9201369
>>> >> > 183           70          70   811403.5  2821394.0
>>> >> > 3          1     3    1  2821394.0  2827143.5            2 5.0273015
>>> >> > 2            0           0  2821394.0  2827143.5
>>> >> > 4          1     4    1  2827143.5 11694138.0         5531 0.9780667
>>> >> > 2940          983         983  2827143.5 11694138.0
>>> >> > 5          1     5    1 11694138.0 11709225.0            5 3.3949294
>>> >> > 5            3           3 11694138.0 11709225.0
>>> >> > 6          1     6    1 11709225.0 16923050.0         2918 1.0072282
>>> >> > 1485          550         550 11709225.0 16923050.0
>>> >> >      dhMean    c1Mean    c2Mean
>>> >> > 1 0.1695309 0.6118708 0.8616838
>>> >> > 2 0.2033105 0.3665317 0.5536052
>>> >> > 3        NA        NA        NA
>>> >> > 4 0.2033807 0.3895734 0.5884933
>>> >> > 5 0.1465555 1.4486918 1.9462376
>>> >> > 6 0.2036166 0.4010699 0.6061583
>>> >> >
>>> >> >
>>> >> > > traceback()
>>> >> > 7: stop("INTERNAL ERROR: Incorrect DH mean!")
>>> >> > 6: bootstrapTCNandDHByRegion.PairedPSCBS(fit, statsFcn = statsFcn,
>>> >> >        ..., verbose = less(verbose, 50))
>>> >> > 5: bootstrapTCNandDHByRegion(fit, statsFcn = statsFcn, ..., verbose
>>> >> > =
>>> >> > less(verbose,
>>> >> >        50))
>>> >> > 4: callAllelicBalanceByDH.PairedPSCBS(fit, ...)
>>> >> > 3: callAllelicBalanceByDH(fit, ...)
>>> >> > 2: callAB.PairedPSCBS(fit, verbose = -10)
>>> >> > 1: callAB(fit, verbose = -10)
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > > sessionInfo()
>>> >> >
>>> >> > R version 2.15.2 Patched (2013-02-09 r61879)
>>> >> > Platform: x86_64-unknown-linux-gnu (64-bit)
>>> >> >
>>> >> > locale:
>>> >> >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>> >> > LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>> >> >  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>> >> > LC_PAPER=C
>>> >> > LC_NAME=C
>>> >> >  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> >> > LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>> >> >
>>> >> > attached base packages:
>>> >> > [1] splines   stats     graphics  grDevices utils     datasets
>>> >> > methods
>>> >> > base
>>> >> >
>>> >> > other attached packages:
>>> >> >  [1] Hmisc_3.10-1           survival_2.37-4        DNAcopy_1.32.0
>>> >> > PSCBS_0.30.0           aroma.affymetrix_2.8.0
>>> >> >  [6] affxparser_1.30.2      aroma.apd_0.2.3        R.huge_0.4.1
>>> >> > aroma.light_1.28.0     aroma.core_2.8.0
>>> >> > [11] matrixStats_0.6.2      R.rsp_0.8.2            R.devices_2.1.3
>>> >> > R.cache_0.6.5          R.filesets_2.0.0
>>> >> > [16] crlmm_1.16.9           preprocessCore_1.20.0
>>> >> > oligoClasses_1.20.0
>>> >> > BiocGenerics_0.4.0     R.utils_1.19.5
>>> >> > [21] R.oo_1.11.7            R.methodsS3_1.4.2
>>> >> >
>>> >> > loaded via a namespace (and not attached):
>>> >> >  [1] affyio_1.26.0        annotate_1.36.0      AnnotationDbi_1.20.5
>>> >> > Biobase_2.18.0       BiocInstaller_1.8.3  Biostrings_2.26.3
>>> >> >  [7] bit_1.1-9            cluster_1.14.3       codetools_0.2-8
>>> >> > DBI_0.2-5            digest_0.6.3         ellipse_0.3-7
>>> >> > [13] ff_2.2-10            foreach_1.4.0        genefilter_1.40.0
>>> >> > GenomicRanges_1.10.7 grid_2.15.2          IRanges_1.16.6
>>> >> > [19] iterators_1.0.6      lattice_0.20-13      Matrix_1.0-11
>>> >> > mvtnorm_0.9-9994     parallel_2.15.2      RcppEigen_0.3.1.2.1
>>> >> > [25] RSQLite_0.11.2       stats4_2.15.2        tools_2.15.2
>>> >> > XML_3.95-0.1         xtable_1.7-1         zlibbioc_1.4.0
>>> >> >
>>> >> >
>>> >> > >
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > --
>>> >> > When reporting problems on aroma.affymetrix, make sure 1) to run the
>>> >> > latest version of the package, 2) to report the output of
>>> >> > sessionInfo() and
>>> >> > traceback(), and 3) to post a complete code example.
>>> >> >
>>> >> >
>>> >> > You received this message because you are subscribed to the Google
>>> >> > Groups
>>> >> > "aroma.affymetrix" group with website http://www.aroma-project.org/.
>>> >> > To post to this group, send email to aroma-af...@googlegroups.com
>>> >> > To unsubscribe and other options, go to
>>> >> > http://www.aroma-project.org/forum/
>>> >> >
>>> >> > ---
>>> >> > You received this message because you are subscribed to the Google
>>> >> > Groups
>>> >> > "aroma.affymetrix" group.
>>> >> > To unsubscribe from this group and stop receiving emails from it,
>>> >> > send an
>>> >> > email to aroma-affymetr...@googlegroups.com.
>>> >> > For more options, visit https://groups.google.com/groups/opt_out.
>>> >> >
>>> >> >
>>> >
>>> > --
>>> > --
>>> > When reporting problems on aroma.affymetrix, make sure 1) to run the
>>> > latest version of the package, 2) to report the output of sessionInfo() 
>>> > and
>>> > traceback(), and 3) to post a complete code example.
>>> >
>>> >
>>> > You received this message because you are subscribed to the Google
>>> > Groups "aroma.affymetrix" group with website 
>>> > http://www.aroma-project.org/.
>>> > To post to this group, send email to aroma-af...@googlegroups.com
>>> > To unsubscribe and other options, go to
>>> > http://www.aroma-project.org/forum/
>>> >
>>> > ---
>>> > You received this message because you are subscribed to the Google
>>> > Groups "aroma.affymetrix" group.
>>> > To unsubscribe from this group and stop receiving emails from it, send
>>> > an email to aroma-affymetr...@googlegroups.com.
>>> > For more options, visit https://groups.google.com/groups/opt_out.
>>> >
>>> >

-- 
-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

--- 
You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to aroma-affymetrix+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to