Re: [aroma.affymetrix] Questions on extracting probeset summaries

2015-01-23 Thread Henrik Bengtsson
Solved.  Before I finalize a release, would you mind making sure it
works on your end?  Install aroma.affymetrix 2.13.0-9001 by running
the following in a fresh R session:


source('http://callr.org/install#HenrikBengtsson/aroma.affymetrix@2.13.0-9001')

Then retry with

library(aroma.affymetrix)
cdf <- AffymetrixCdfFile$byChipType("HG-U133_Plus_2")
cdfM <- getMonocellCdf(cdf, verbose=TRUE)
print(cdfM)

If it complains about a pre-existing *.tmp file, remove that file and retry.
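
A minimal sketch of clearing such a leftover file from R, assuming the working directory is the one that contains annotationData/ and that the temporary file has the pathname shown in the verbose log quoted further down in this thread. It uses only base R; adjust the path if your setup differs.

```r
# Pathname as reported by the "Temporary pathname:" line in the verbose log;
# it is relative to the current working directory (an assumption here).
tmp <- file.path("annotationData", "chipTypes", "HG-U133_Plus_2",
                 "HG-U133_Plus_2,monocell.CDF.tmp")

# Remove the leftover temporary file only if it is actually present.
if (file.exists(tmp)) {
  file.remove(tmp)
}
```

After this, rerunning getMonocellCdf() should no longer find a stale *.tmp file in its way.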

As soon as you confirm it works, I'll make aroma.affymetrix 2.13.1
available, because this was a critical bug(*).

Thanks for the report

/Henrik

(*) DETAILS: Turns out to be due to a single stray newline. It should have been

  affxparser::writeCdfUnits(...)

but it was:

  affxparser::writeCdfUnits
  (...)

Despite running 24 hours of regular package testing, this piece of
code was never tested.  I've now added an explicit test on creating
and re-creating monocell CDF.
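
For illustration, the effect of such a stray newline can be reproduced in base R. This is only a sketch: good() and bad() are hypothetical stand-ins that wrap sum(), not code from aroma.affymetrix or affxparser.

```r
# Intended form: function name and argument list on the same line.
good <- function(...) {
  sum(...)
}

# Broken form: a newline separates the name from "(...)". R parses the body
# as two expressions: `sum` (evaluated and discarded) and `(...)`.
bad <- function(...) {
  sum
  (...)
}

print(good(1, 2, 3))

# With a single argument, "(...)" is just that argument in parentheses, so
# the value comes back unchanged and no error is raised.
print(bad(5))

# With several arguments the parenthesis operator itself fails; capture the
# error message instead of stopping.
res <- tryCatch(bad(1, 2, 3), error = function(e) conditionMessage(e))
print(res)
```

Because the broken form only fails when more than one argument is passed through, code like this can go unnoticed until that particular path is exercised, which is consistent with the error message reported in this thread.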

On Fri, Jan 23, 2015 at 8:49 AM, Henrik Bengtsson h...@biostat.ucsf.edu wrote:
 I managed to reproduce this now:

 Error in (...) : 3 arguments passed to '(' which requires 1
 20150123 08:48:49| Could not locate monocell CDF. Will create one for chip type...done
 20150123 08:48:49|Retrieving monocell CDF...done
 > traceback()
 5: .writeCdfUnits(con = con, srcUnits, verbose = verbose2)
 4: createMonocellCdf.AffymetrixCdfFile(this, ..., verbose = less(verbose))
 3: createMonocellCdf(this, ..., verbose = less(verbose))
 2: getMonocellCdf.AffymetrixCdfFile(cdf, verbose = Arguments$getVerbose(-8,
    timestamp = TRUE))
 1: getMonocellCdf(cdf, verbose = Arguments$getVerbose(-8, timestamp = TRUE))

 I'll investigate and fix this asap.

 /Henrik


 On Fri, Jan 23, 2015 at 7:37 AM, Henrik Bengtsson h...@biostat.ucsf.edu 
 wrote:

 On Jan 23, 2015 7:36 AM, Henrik Bengtsson h...@biostat.ucsf.edu wrote:

 This is odd for several reasons, e.g. I'm puzzled how you ended up with a
 monocell CDF previously but now it gives an error.  Let's troubleshoot
 more...

 What does troubleshoot() output directly after you get that error?

 I meant traceback()


 Henrik

 On Jan 23, 2015 7:23 AM, Qingzhou Zhang zqznept...@gmail.com wrote:
 
  Thanks, Henrik,
 
  While troubleshooting, it seems that something went wrong with the
  monocell CDF file:
 
 
  > cdf
 
  AffymetrixCdfFile:
 
  Path: annotationData/chipTypes/HG-U133_Plus_2
 
  Filename: HG-U133_Plus_2,monocell.CDF
 
  File size: 4.88 MB (5116945 bytes)
 
  Chip type: HG-U133_Plus_2,monocell
 
  RAM: 0.46MB
 
  File format: v4 (binary; XDA)
 
  Dimension: 182x182
 
  Number of cells: 33124
 
  Number of units: 27604
 
  Cells per unit: 1.20
 
  Number of QC units: 9
 
 
 
  So I deleted the previous monocell CDF file in
  annotationData/chipTypes/HG-U133_Plus_2 and re-created it with the following:
 
  cdf <- AffymetrixCdfFile$byChipType("HG-U133_Plus_2")
 
  cdfM <- getMonocellCdf(cdf, verbose = Arguments$getVerbose(-8, timestamp = TRUE))
 
 
 
  However, the above process also failed; here is the output:
 
  > cdfM <- getMonocellCdf(cdf, verbose = Arguments$getVerbose(-8, timestamp = TRUE))
 
  20150123 21:47:53|Retrieving monocell CDF...
 
  20150123 21:47:53| Monocell chip type: HG-U133_Plus_2,monocell
 
  20150123 21:47:53| Locating monocell CDF...
 
  20150123 21:47:53|  Pathname:
 
  20150123 21:47:53| Locating monocell CDF...done
 
  20150123 21:47:53| Could not locate monocell CDF. Will create one for
  chip type...
 
  20150123 21:47:53|  Creating monocell CDF...
 
  20150123 21:47:53|   Chip type: HG-U133_Plus_2
 
  20150123 21:47:53|   Validate (main) CDF...
 
  20150123 21:47:54|   Validate (main) CDF...done
 
  20150123 21:47:55|   Adding temporary suffix from file...
 
  20150123 21:47:55|Pathname:
  annotationData/chipTypes/HG-U133_Plus_2/HG-U133_Plus_2,monocell.CDF
 
  20150123 21:47:55|Suffix: .tmp
 
  20150123 21:47:55|Rename existing file?: FALSE
 
  20150123 21:47:55|Temporary pathname:
  annotationData/chipTypes/HG-U133_Plus_2/HG-U133_Plus_2,monocell.CDF.tmp
 
  20150123 21:47:55|   Adding temporary suffix from file...done
 
  20150123 21:47:55|   Number of cells per group field: 1
 
  20150123 21:47:55|   Reading CDF group names...
 
  20150123 21:47:55|   Reading CDF group names...done
 
            used (Mb) gc trigger (Mb) max used (Mb)
  Ncells  603933 32.3     899071 48.1   741108 39.6
  Vcells 1027587  7.9    1757946 13.5  1424724 10.9
 
            used (Mb) gc trigger (Mb) max used (Mb)
  Ncells  549349 29.4     899071 48.1   899071 48.1
  Vcells  945722  7.3    1757946 13.5  1424724 10.9
 
  20150123 21:47:56|   Number of cells per unit:
 
     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
        1       1       1       1       1       1
 
  20150123 21:47:56|   Reading CDF QC units...
 
  20150123 21:47:56|   Reading CDF QC units...done
 
  20150123 21:47:56|   Number of QC cells: 5385 in 9 QC units (0.1MB)
 
  20150123 21:47:56|   Total number

Re: [aroma.affymetrix] Questions on extracting probeset summaries

2015-01-23 Thread Henrik Bengtsson
This is odd for several reasons, e.g. I'm puzzled how you ended up with a
monocell CDF previously but now it gives an error.  Let's troubleshoot
more...

What does troubleshoot() output directly after you get that error?

Henrik

On Jan 23, 2015 7:23 AM, Qingzhou Zhang zqznept...@gmail.com wrote:

 Thanks, Henrik,

 While troubleshooting, it seems that something went wrong with the
 monocell CDF file:


 > cdf

 AffymetrixCdfFile:

 Path: annotationData/chipTypes/HG-U133_Plus_2

 Filename: HG-U133_Plus_2,monocell.CDF

 File size: 4.88 MB (5116945 bytes)

 Chip type: HG-U133_Plus_2,monocell

 RAM: 0.46MB

 File format: v4 (binary; XDA)

 Dimension: 182x182

 Number of cells: 33124

 Number of units: 27604

 Cells per unit: 1.20

 Number of QC units: 9



 So I deleted the previous monocell CDF file in
 annotationData/chipTypes/HG-U133_Plus_2 and re-created it with the following:

 cdf <- AffymetrixCdfFile$byChipType("HG-U133_Plus_2")

 cdfM <- getMonocellCdf(cdf, verbose = Arguments$getVerbose(-8, timestamp = TRUE))



 However, the above process also failed; here is the output:

 > cdfM <- getMonocellCdf(cdf, verbose = Arguments$getVerbose(-8, timestamp = TRUE))

 20150123 21:47:53|Retrieving monocell CDF...

 20150123 21:47:53| Monocell chip type: HG-U133_Plus_2,monocell

 20150123 21:47:53| Locating monocell CDF...

 20150123 21:47:53|  Pathname:

 20150123 21:47:53| Locating monocell CDF...done

 20150123 21:47:53| Could not locate monocell CDF. Will create one for
chip type...

 20150123 21:47:53|  Creating monocell CDF...

 20150123 21:47:53|   Chip type: HG-U133_Plus_2

 20150123 21:47:53|   Validate (main) CDF...

 20150123 21:47:54|   Validate (main) CDF...done

 20150123 21:47:55|   Adding temporary suffix from file...

 20150123 21:47:55|Pathname:
annotationData/chipTypes/HG-U133_Plus_2/HG-U133_Plus_2,monocell.CDF

 20150123 21:47:55|Suffix: .tmp

 20150123 21:47:55|Rename existing file?: FALSE

 20150123 21:47:55|Temporary pathname:
annotationData/chipTypes/HG-U133_Plus_2/HG-U133_Plus_2,monocell.CDF.tmp

 20150123 21:47:55|   Adding temporary suffix from file...done

 20150123 21:47:55|   Number of cells per group field: 1

 20150123 21:47:55|   Reading CDF group names...

 20150123 21:47:55|   Reading CDF group names...done

           used (Mb) gc trigger (Mb) max used (Mb)
 Ncells  603933 32.3     899071 48.1   741108 39.6
 Vcells 1027587  7.9    1757946 13.5  1424724 10.9

           used (Mb) gc trigger (Mb) max used (Mb)
 Ncells  549349 29.4     899071 48.1   899071 48.1
 Vcells  945722  7.3    1757946 13.5  1424724 10.9

 20150123 21:47:56|   Number of cells per unit:

    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       1       1       1       1       1       1

 20150123 21:47:56|   Reading CDF QC units...

 20150123 21:47:56|   Reading CDF QC units...done

 20150123 21:47:56|   Number of QC cells: 5385 in 9 QC units (0.1MB)

 20150123 21:47:56|   Total number of cells: 60060

 20150123 21:47:56|   Best array dimension: 246x245 (=60270 cells, i.e.
210 left-over cells)

 20150123 21:47:56|   Creating CDF header with source CDF as template...

 20150123 21:47:56|Setting up header...

 20150123 21:47:56| Reading CDF header...

 20150123 21:47:56| Reading CDF header...done

 20150123 21:47:56| Reading CDF unit names...

 20150123 21:47:56| Reading CDF unit names...done

 20150123 21:47:56|Setting up header...done

 20150123 21:47:56|Writing...

 20150123 21:47:56| destHeader:

  List of 12
   $ ncols      : int 245
   $ nrows      : int 246
   $ nunits     : int 54675
   $ nqcunits   : int 9
   $ refseq     : chr ""
   $ chiptype   : chr "HG-U133_Plus_2"
   $ filename   : chr "annotationData/chipTypes/HG-U133_Plus_2/HG-U133_Plus_2.cdf"
   $ rows       : int 1164
   $ cols       : int 1164
   $ probesets  : int 54675
   $ qcprobesets: int 9
   $ reference  : chr ""

 20150123 21:47:56| unitNames:

   chr [1:54675] "AFFX-BioB-5_at" "AFFX-BioB-M_at" "AFFX-BioB-3_at" "AFFX-BioC-5_at" ...

 20150123 21:47:56| qcUnitLengths:

   num [1:9] 15966 174 230 1658 69 ...

 20150123 21:47:56| unitLengths:

   num [1:54675] 116 116 116 116 116 116 116 116 116 116 ...

           used (Mb) gc trigger (Mb) max used (Mb)
 Ncells  561416 30.0     984024 52.6   899071 48.1
 Vcells 1120064  8.6    1925843 14.7  1515846 11.6

           used (Mb) gc trigger (Mb) max used (Mb)
 Ncells  562232 30.1     984024 52.6   899071 48.1
 Vcells 1010995  7.8    5484388 41.9  6516658 49.8

 20150123 21:47:57|Writing...done

 20150123 21:47:57|   Creating CDF header with source CDF as
template...done

 20150123 21:47:57|   Writing QC units...

 20150123 21:47:57|Rearranging QC unit cell indices...

 20150123 21:47:57| Units: 20150123 21:47:57|

 20150123 21:47:57|Rearranging QC unit cell indices...done

  used (Mb) gc trigger

Re: [aroma.affymetrix] Questions on extracting probeset summaries

2015-01-23 Thread Henrik Bengtsson
On Jan 23, 2015 7:36 AM, Henrik Bengtsson h...@biostat.ucsf.edu wrote:

 This is odd for several reasons, e.g. I'm puzzled how you ended up with a
monocell CDF previously but now it gives an error.  Let's troubleshoot
more...

 What does troubleshoot() output directly after you get that error?

I meant traceback()


 Henrik
