[aroma.affymetrix] Re: Question for custom CDF of ST-Array

cstratowa Mon, 01 Feb 2010 00:02:46 -0800

Dear Branko,

I have just seen your comment that you have an issue with XPS, which
is my BioC package. However, you have neither contacted me at
<[email protected]> nor sent a mail to the BioC mailing list. I would
really appreciate it if you contacted me first before making such a
statement.


Regarding your problem with HuGene-1_0-st-v1.na30.hg19.probeset._csv
(which could be the reason for your problem with xps): Affymetrix had
included probesets in this annotation file which were not included in
the corresponding *.PGF file. For this reason I contacted Affymetrix
DevNet and asked them to remove the extra probesets, which they kindly
did, resulting in the file
HuGene-1_0-st-v1.na30.1.hg19.probeset._csv.
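
In case it is useful, a mismatch of this kind can be checked in R with
setdiff(). The lines below are only a sketch: the probeset IDs are made
up, and the column name probeset_id mentioned in the comment is an
assumption about the annotation file, not something taken from this
thread.

```r
# Made-up probeset IDs standing in for the two real files.
# With the real annotation file one might read them via, e.g.,
#   read.csv(file, comment.char="#", stringsAsFactors=FALSE)$probeset_id
# (the column name probeset_id is an assumption here).
annot_ids <- c("7892501", "7892502", "7892503", "7892504", "7892505")
pgf_ids   <- c("7892501", "7892502", "7892504")

# Probesets present in the annotation file but not in the PGF file
extra_ids <- setdiff(annot_ids, pgf_ids)

# Probesets present in the PGF file but lacking annotation
unannotated_ids <- setdiff(pgf_ids, annot_ids)

print(extra_ids)
print(length(unannotated_ids))
```

A non-empty extra_ids would point to the situation described above,
where the annotation file lists probesets the PGF file does not define.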

Best regards
Christian

On Jan 28, 4:58 pm, branko <[email protected]> wrote:
> Hi all,
>
>   First, thank you Mark and Henrik for a nice tool, and also all the
> others for an informative mailing list.
> This is my first email, so hopefully not too many dummy questions.
>
> I found this thread as it is linked to my first 2 questions:
> A month ago I did some basic analysis with 300 chips, where your tool
> came in very handy. And I'm starting to analyze things again ...
>  (Btw, I found out about Aroma on the BioC list, as I had an issue
> with the XPS package that I couldn’t resolve. I would like to have
> them both working, so xps will come on board eventually.)
>
>  Now regarding this thread.
> 1)     When I compared HuGene-1_0-st-v1.na30.hg19.probeset._csv with
> the Aroma CDF file, there are
>      33257 probesets in the csv while Aroma has 33252, so they differ
> by 5. So maybe a CDF update is needed in Aroma?
>      I also just saw a more recent update on Affymetrix to “na30.1”,
> and I see I have used the “na30” csv file.
>
>   The following seems weird: many probesets in the “main” category
> have no gene symbol annotation:
>
>   tr <- read.csv("HuGene-1_0-st-v1.na30.hg19.transcript.csv",
>                  stringsAsFactors=FALSE, comment.char="#")
>
>   sum(tr$gene_assignment=="---" & tr$category=="main")  # 6711
>   sum(tr$unigene=="---" & tr$category=="main")          # 7376
>   sum(tr$swissprot=="---" & tr$category=="main")        # 8021
>
>    Affy 33257  Probesets are classified as:
>       TYPE                            TOTAL
>       control->affx                      57
>       control->bgp->antigenomic          45
>       main                            28829
>       normgene->exon                   1195
>       normgene->intron                 2904
>       rescue->FLmRNA->unmapped          227
>
>    Do you maybe know why there are so many (e.g. the 6711 above)
> without gene_assignment?
>
> 2) One more, related to the 1st question: how can I run normalization
> only on probesets classified as “main”? (The Affymetrix csv file says
> this class has 28829 probesets.) The idea is of course to minimize the
> normalization bias from non-main probesets.
>
> 3) A few questions regarding quality checks and basic data
> manipulation in Aroma:
>     I would like to give meaningful labels to the arrays in e.g. the
> box plots (instead of CEL file names; as said, I have 300).
>     How can I do that?
>     Also, how can I sort them?
>     I ask these silly questions because using R commands like str()
> doesn’t show me the object fields etc., so I can’t use standard R
> matrix commands; also, help("some Aroma command") doesn’t show enough
> information.
>     Sometimes it gives an empty help page.
>     I could not find a pdf manual in the installed Aroma libraries nor
> in the Google group. I see only an html file showing me all the
> functions & classes.
>     Is there an easier way to look for functions than the main html
> pages?
>     The code of functions is not visible by just typing func.name();
> I guess I can always get the source code and search, but there is
> likely an easy way to do it.
>
>     It is visible that Aroma uses different classes than
> Bioconductor. I assume there is a good reason for that, but maybe you
> can give a link with an explanation?
>
>     Oh, and thank you for putting up clean instructions for the basic
> preprocessing steps to get an expression matrix out for further
> analysis.
>
> One example of my attempts at searching for things:
>     cs <- AffymetrixCelSet$byName(dataSet, cdf=cdf)
>     print(cs)
>     str(cs)
>     #Classes 'AffymetrixCelSet', 'AffymetrixFileSet',
> 'AromaPlatformInterface', 'AromaMicroarrayDataSet',
> 'GenericDataFileSet', 'FullNameInterface', 'Object'  atomic [1:1] NA
> ….
>    ?classname put me on some track.
>
>    So I looked a bit and saw that the GenericDataFileSet class
> provides the function getFullNames(), and it does give me my CEL
> filenames (btw, ?getFullNames gives me no help).
>    setNames() points to a basic library… but before going deeper and
> making some mess, I hope you can give me an answer to the above
> questions, or some links?
>
> 4) Last one, regarding a QC issue with plotting… When doing array
> (pseudo) image plots, my RGui in Windows complains, e.g.:
>
>   If I do:   cf <- getFile(cs, 1)
>              plotImage(cf, transform=list(log2), palette=rainbow(256))
>
>                #Loading required package: EBImage
>                #Loading required package: abind
>                 ….
>
>     I get a “Runtime error!” message from Visual C++, and I have to
> click “ok” 2 times before I get the picture…
>     Here is the link to the error msg:
>      http://www.4shared.com/file/209798878/f5a3f82e/Aromaplotimageerror.html
>
>     So you can imagine I’m not enthusiastic about clicking twice for
> 300 arrays, and then again for several types of plots.
>     Any idea where the issue is? (I guess something in the EBImage
> dependency causes it.)
>     Below is my session info.
>
>    Hope you can help.
>
>   Best  regards,
>
>   Branko
>
> --------------------------
> Branislav Misovic,
> Department of Toxicogenetics
> Leiden University Medical Center
> PO.box 9600, Building2,Room:T3-11
> 2300 RC Leiden
> The Netherlands
> Phone: +31 71 526 9636
> Mob: 0653135855
> E-mail: [email protected]
>
>  > sessionInfo()
> R version 2.10.0 (2009-10-26)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     methods   base
>
> other attached packages:
>  [1] abind_1.1-0            aroma.affymetrix_1.3.4 aroma.apd_0.1.7
>  [4] affxparser_1.18.0      R.huge_0.2.0           aroma.core_1.3.4
>  [7] aroma.light_1.15.1     matrixStats_0.1.8      R.rsp_0.3.6
> [10] R.filesets_0.6.5       digest_0.4.1           R.cache_0.2.0
> [13] R.utils_1.2.4          R.oo_1.6.6             EBImage_3.2.0
> [16] R.methodsS3_1.0.3
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.0

-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/aroma-affymetrix?hl=en
