I am sorry, the subsetting line in my message below should read:

> fqhead[fqhead$V2 == "8" & fqhead$V3 == "120",]

My data frame looks like this. V2 is the lane number, V3 is the tile number, and V4 and V5 are the x and y coordinates of the cluster position.
> head(fqhead)
           V1 V2 V3 V4   V5
1  @ILLUMINA06  8  1  6  849
2  @ILLUMINA06  8  1  6 1169
3  @ILLUMINA06  8  1  6 1163
4  @ILLUMINA06  8  1  6 1512
5  @ILLUMINA06  8  1  6 1251
6  @ILLUMINA06  8  1  6  372
7  @ILLUMINA06  8  1  6 1555
8  @ILLUMINA06  8  1  6 1644
9  @ILLUMINA06  8  1  6 2011
10 @ILLUMINA06  8  1  7 1835
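
A data frame in this shape can be tabulated per lane and tile with base R. What follows is only a minimal sketch, not ShortRead's own method: it rebuilds the ten rows shown above by hand, and the output column names ('lane', 'tile', 'count', 'type') follow the description of the tile-count data frame given further down the thread. The value "read" for 'type' is a placeholder assumption.

```r
# Rebuild the rows printed above (V2 = lane, V3 = tile, V4/V5 = x/y)
fqhead <- data.frame(
  V1 = "@ILLUMINA06",
  V2 = 8L,
  V3 = 1L,
  V4 = c(rep(6L, 9), 7L),
  V5 = c(849L, 1169L, 1163L, 1512L, 1251L, 372L, 1555L, 1644L, 2011L, 1835L),
  stringsAsFactors = FALSE
)

# Sum a column of ones within each lane/tile group to get read counts
tileCounts <- aggregate(list(count = rep(1L, nrow(fqhead))),
                        by = list(lane = fqhead$V2, tile = fqhead$V3),
                        FUN = sum)
tileCounts$type <- "read"   # ASSUMPTION: placeholder value for 'type'
tileCounts
```

Whether the resulting data frame is accepted as-is by the unexported plotting functions is untested here.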

Sirisha


Sirisha Sunkara wrote:
Hi Martin,

I am using the sequence.txt files generated by the Illumina pipeline (OLB1.6/RTA1.6) as is; the read IDs in them seem to include the tile coordinates.

Just so I can focus on the read IDs for now (and I am sure this is not exactly what you asked for), I parsed the read IDs out of the fastq file and am working with those.

This is what my fastqs look like:

@ILLUMINA06:8:1:6:849#0/1
GCTCTTTTTGATTCTCAAATCCGGCGTCAACCATA
+ILLUMINA06:8:1:6:849#0/1
a`abaa_aaa]_a`_a_[]`]a_`aa_`_aa`aaa
@ILLUMINA06:8:1:6:1169#0/1
TAATGCCACTCCTCTCCCGACTGTTAACACTGCTG
+ILLUMINA06:8:1:6:1169#0/1
ab`_Z_aXa`bbababbbaabaaaaababaaa`V`

My very basic attempt at this:

> fqhead <- read.table("./Contam_Screening/Run703/sequence_8_1_hdrs.txt", sep=":")

To extract, for instance, all entries in lane 8, tile 120:
> fqhead[fqhead$V2 == "3" & fq$V3 == "120",]
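
The header lines could also be split directly in R, without a separate header file. This is a minimal sketch, assuming every ID follows the "@machine:lane:tile:x:y#index/read" layout of the two example records above; the helper name parseReadIds is made up for illustration.

```r
# Hypothetical helper: split Illumina-style read IDs into their fields.
# ASSUMPTION: IDs look like "@machine:lane:tile:x:y#index/read".
parseReadIds <- function(ids) {
  core <- sub("#.*$", "", ids)   # drop the "#0/1" suffix
  core <- sub("^@", "", core)    # drop the leading "@"
  parts <- do.call(rbind, strsplit(core, ":", fixed = TRUE))
  data.frame(machine = parts[, 1],
             lane = as.integer(parts[, 2]),
             tile = as.integer(parts[, 3]),
             x = as.integer(parts[, 4]),
             y = as.integer(parts[, 5]),
             stringsAsFactors = FALSE)
}

ids <- c("@ILLUMINA06:8:1:6:849#0/1", "@ILLUMINA06:8:1:6:1169#0/1")
hdr <- parseReadIds(ids)

# lane and tile are integers here, so compare against numbers
sel <- hdr[hdr$lane == 8L & hdr$tile == 1L, ]
sel
```

As a side note, read.table treats "#" as a comment character by default (comment.char = "#"), so reading such header lines with sep=":" drops the "#0/1" suffix on its own.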

I hope I am somewhat closer to what you asked for...

Thanks a lot!
Sirisha


Martin Morgan wrote:
On 04/12/2010 02:49 PM, Sirisha Sunkara wrote:
Hi Martin,

The qa function, when reading fastq format files, doesn't seem to
populate the perTile QA element with row information.
The row counts are zero for both the readCounts and
medianReadQualityScore list elements of perTile.

Is this feature still a work in progress? Essentially, I am trying to
get the TileQC plots for lanes where there was no reference genome to
align against (no export.txt files).

Hi Sirisha -- fastq files can't be guaranteed to have tile info so
ShortRead doesn't try to guess these, even if some software adopts
conventions for embedding the information in the read ids.

The tile images are generated by

  ShortRead:::.plotTileCounts

and

  ShortRead:::.plotTileQualityScore

Both take a regular data.frame. For .plotTileCounts, the columns are
'type' (safe to ignore, I think), 'tile' (integer tile index), 'lane'
(integer lane index), and 'count' (number of reads in this particular
lane & tile). As an untested work-around, you could create a data frame
like this by parsing your read IDs using standard R commands; provide an
example of what the read IDs look like and I'll help you. For the
.plotTileQualityScore, the columns are 'type', 'tile', 'lane', and
'score', where 'score' is the median 'qualityScore'
(alphabetScore(quality(srq)) / width(quality(srq)) for some ShortReadQ
object srq obtained by readFastq) over all reads in the tile.
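
The 'score' column described above could be assembled roughly as follows. This is a base R sketch under stated assumptions, not the ShortRead implementation: it does not call readFastq or alphabetScore, but decodes the quality strings by hand with an assumed offset of 64 (Phred+64), and the 'type' value is a placeholder. The lane and tile values are taken from the read IDs of the two example records earlier in the thread.

```r
# Mean per-read quality from a quality string.
# ASSUMPTION: Phred+64 encoding; change 'offset' for other encodings.
meanQuality <- function(q, offset = 64L) {
  vapply(q, function(s) mean(utf8ToInt(s) - offset), numeric(1),
         USE.NAMES = FALSE)
}

reads <- data.frame(
  lane = c(8L, 8L),
  tile = c(1L, 1L),
  quality = c("a`abaa_aaa]_a`_a_[]`]a_`aa_`_aa`aaa",
              "ab`_Z_aXa`bbababbbaabaaaaababaaa`V`"),
  stringsAsFactors = FALSE
)
reads$qualityScore <- meanQuality(reads$quality)

# Median of the per-read scores within each lane/tile group
tileScores <- aggregate(list(score = reads$qualityScore),
                        by = list(lane = reads$lane, tile = reads$tile),
                        FUN = median)
tileScores$type <- "read"   # ASSUMPTION: placeholder value for 'type'
tileScores
```

With a ShortReadQ object available, the per-read scores would instead come from the alphabetScore(quality(srq)) / width(quality(srq)) expression quoted above.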

Martin

> qafq <- qa("./Contam_Screening/Run703/", "s_8_1_sequence.txt", type="fastq")
> qafq
class: FastqQA(9)
QA elements (access with qa[["elt"]]):
 readCounts: data.frame(1 3)
 baseCalls: data.frame(1 5)
 readQualityScore: data.frame(512 4)
 baseQuality: data.frame(94 3)
 alignQuality: data.frame(1 3)
 frequentSequences: data.frame(50 4)
 sequenceDistribution: data.frame(1663 4)
 perCycle: list(2)
   baseCall: data.frame(150 4)
   quality: data.frame(1081 5)
 perTile: list(2)
   readCounts: data.frame(0 4)
   medianReadQualityScore: data.frame(0 4)

Thank You,
Sirisha

sessionInfo()
R version 2.11.0 Under development (unstable) (2010-03-07 r51225)
x86_64-unknown-linux-gnu

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ShortRead_1.5.21    lattice_0.18-3      Biostrings_2.15.22
[4] GenomicRanges_0.1.0 IRanges_1.5.74      Rmpi_0.5-8

loaded via a namespace (and not attached):
[1] Biobase_2.7.5 grid_2.11.0   hwriter_1.2   tools_2.11.0

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing




