New topic: A better way to implement a many to many to many relation?
<http://forums.realsoftware.com/viewtopic.php?t=45833> Page 1 of 1 [ 13 posts ] Previous topic | Next topic Author Message Markus Winter Post subject: A better way to implement a many to many to many relation?Posted: Wed Nov 07, 2012 9:08 am Joined: Sun Feb 19, 2006 4:00 pm Posts: 1215 Location: Heidelberg, Germany Hi all, might be a stupid question but here it goes: oligos are short pieces of DNA that can bind to longer DNA sequences at different positions which can be calculated quite nicely. A single oligo usually has quite a few possible binding sites on each sequence. I want to use a database with tables OLIGOS and SEQUENCES. However I want to keep the calculations to a minimum. When the binding sites for a given oligo to a sequence have been calculated, then I want to save that info as well so I can retrieve it later and do not have to do the calculation again (while the calculation is trivial for one oligo, if you have 3500 in your database then it is the speed limiting step). I thought about having a table BINDINGSITES which stores the binding sites as a comma delimited string (which in the app can be split into an array easily again) BindingSiteID SequenceID OligoID BindingSites 1 1 12 57,256,897,1212 2 1 23 323,767 3 1 112 33,699,1467,1776 4 2 23 456,864,888,1374 5 2 53 328,672,1007,1116 .. but that seems kind of inelegant. On the other hand I can't think of a more elegant solution. So is this the way to do it or is there a better way that you could recommend? Thanks Markus Top Jason_Adams Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 9:32 am Joined: Fri Nov 10, 2006 4:10 pm Posts: 1713 Location: Michigan, USA Please help me understand a bit more: Is an Oligo specific to only one Sequence? Or can a single Oligo bind to multiple Sequences? I'm confident there's a better way to do this, but I need to understand the Oligo - Sequence relationship better. _________________ Windows 7 Ultimate x64 Windows XP Pro SP3 Ubuntu 11.04 via Virtual Box RS Enterprise 2012r1.1 Programming Tutorials & Free Projects: http://www.JasonTheAdams.com "Christianity has not been tried and found wanting; it has been found difficult and not tried." - G.K. Chesterton Top DaveS Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 9:52 am Joined: Sun Aug 05, 2007 10:46 am Posts: 4238 Location: San Diego, CA Why not use relational database tables Table #1 Binding Site ID Sequence ID OligoID Table #2 OligioID BindingSite Unless I am duplicating BindingSiteID from Table #1 with BindingSite in Table #2 Like Jason said... more info please _________________ Dave Sisemore MacPro, OSX Lion 10.7.4 RB2012r1 Note : I am not interested in any solutions that involve custom Plug-ins of any kind Top Markus Winter Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 10:09 am Joined: Sun Feb 19, 2006 4:00 pm Posts: 1215 Location: Heidelberg, Germany Jason_Adams wrote:Please help me understand a bit more: Is an Oligo specific to only one Sequence? Or can a single Oligo bind to multiple Sequences? I'm confident there's a better way to do this, but I need to understand the Oligo - Sequence relationship better. An oligo can be used (bind to) for many sequences. Each sequence can have many oligos which bind to it. For each sequence an oligo can bind to multiple sites. Top Markus Winter Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 10:13 am Joined: Sun Feb 19, 2006 4:00 pm Posts: 1215 Location: Heidelberg, Germany DaveS wrote:Why not use relational database tables I thought I was doing that Basically BINDINGSITES is an xref table with OligoID and SequenceID as foreign keys. It is the many to many to many relationship which throws me here, and as I'm not too familiar with databases beyond some simple SQL to decide if this is the best way of doing it. Top npalardy Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 10:16 am Real Software Engineer Joined: Sat Dec 24, 2005 8:18 pm Posts: 7532 Location: Canada, Alberta, Near Red Deer You and Dave were on the right track - just needed to spit things up a bit more Sequence Table ============ Sequence ID Sequence (I'm assuming this is the particular DNA sequence or event the list of nucleotides) OligoTable ======= OligioID Oligo (I'm assuming this is the list of nucleotides that make up the oligo) BindingSite Table =========== BindingSiteID BindingSite (not sure what this means in this context but I assume there's some data that identifies a site) OligoSequenceSite table =============== OligoID SequenceID BindingSiteID Table 4 is the key to many to many to many Now with a single query you can find all the oligo's that bind to a site, all the sequences that have binding to that site or specific oligo's & sequences that bind to that site (and other variations) just through joins _________________ My web site Great White Software RBLibrary.com REALbasic learning Top Markus Winter Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 10:24 am Joined: Sun Feb 19, 2006 4:00 pm Posts: 1215 Location: Heidelberg, Germany DaveS wrote:Like Jason said... more info please A sequence is basically a long string (usually between 3,000 and 10,000 characters) made up of four letters (for simplicities sake) and looks like this: Seq1: GTGACCAGACATTGACACAGTGATTGATTGGACCCCACAGATTAGAG ... GGTAGACCACACCACATATACATACCTAGACATTGC Seq2: TCGGTATCGGCTCTAACACTCGGTAGCTTGCTCGAGCGAGCTCGAGAT ... AGATAGCTCGATGACACGACCATATACATAGACAG An oligo is basically a short (usually 18-30) string made up of four letters (for simplicities sake) and looks like this: Oligo1: CAGTGATTGATTGGA Oligo2: ACCACATATACAT Binding is basically defined as a match of the oligo in the sequence (where the match does not have to be perfect) and noting down the position (the BINDINGSITE) where it happens. As you can imagine running a few thousand oligos against a sequence is taking a bit of time. However in practice you often use the same 10-20 sequences over and over again (you just try to find which oligos are best suited for the task at hand). So computational time can be massively reduced by remembering where a specific oligo binds on a sequence. Top Markus Winter Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 10:28 am Joined: Sun Feb 19, 2006 4:00 pm Posts: 1215 Location: Heidelberg, Germany npalardy wrote:BindingSite (not sure what this means in this context but I assume there's some data that identifies a site) The Bindingsite is the position where an oligo binds to a sequence (so simply an integer value). Note that an oligo usually has many binding sites on each sequence (each with a certain strength which I also would need to save). Top npalardy Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 10:32 am Real Software Engineer Joined: Sat Dec 24, 2005 8:18 pm Posts: 7532 Location: Canada, Alberta, Near Red Deer Markus Winter wrote:npalardy wrote:BindingSite (not sure what this means in this context but I assume there's some data that identifies a site) The Bindingsite is the position where an oligo binds to a sequence (so simply an integer value). Note that an oligo usually has many binding sites on each sequence (each with a certain strength which I also would need to save). ah then you probably dont need the table 3 and alter table 4 to be Sequence Table ============ Sequence ID Sequence (I'm assuming this is the particular DNA sequence or event the list of nucleotides) OligoTable ======= OligioID Oligo (I'm assuming this is the list of nucleotides that make up the oligo) OligoSequenceSite table =============== OligoID SequenceID BindingSite Now you can have the same oligo bound to many sites on the same sequence (it's a distinct row in the OligoSequenceSite table You might make the OligoSequenceSite have a unique key on all columns so you dont insert the same one twice (that would be useless) But this should (from what I understand) suffice for your needs _________________ My web site Great White Software RBLibrary.com REALbasic learning Top Markus Winter Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 10:40 am Joined: Sun Feb 19, 2006 4:00 pm Posts: 1215 Location: Heidelberg, Germany npalardy wrote:OligoSequenceSite table =============== OligoID SequenceID BindingSite Now you can have the same oligo bound to many sites on the same sequence (it's a distinct row in the OligoSequenceSite table So if I understand this right then it would be BindingSiteID SequenceID OligoID BindingSites 1 1 12 57 2 1 12 256 3 1 12 897 4 1 12 1212 5 1 23 323 6 1 23 767 7 1 112 33 8 1 112 699 9 1 112 1467 10 1 112 1776 11 2 23 456 12 2 23 864 13 2 23 888 14 2 23 1374 15 2 53 328 16 2 53 672 17 2 53 1007 18 2 53 1116 P.S. Note that the binding site is simply the position where an oligo binds to a sequence. A binding site 56 on one sequence is completely different from a binding site 56 on another sequence. Top Markus Winter Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 10:49 am Joined: Sun Feb 19, 2006 4:00 pm Posts: 1215 Location: Heidelberg, Germany And that would allow me to save the strength of each binding in the table as well: BindingSiteID SequenceID OligoID BindingSite BindingStrength 1 1 12 57 88 2 1 12 256 97 3 1 12 897 85 4 1 12 1212 99 5 1 23 323 100 6 1 23 767 86 7 1 112 33 96 8 1 112 699 89 9 1 112 1467 84 10 1 112 1776 92 11 2 23 456 97 12 2 23 864 85 13 2 23 888 86 14 2 23 1374 86 15 2 53 328 84 16 2 53 672 94 17 2 53 1007 82 18 2 53 1116 93 Thanks Norman, that would work. Would I need a unique BindingSiteID for the table? Probably not ... Top Jason_Adams Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 11:39 am Joined: Fri Nov 10, 2006 4:10 pm Posts: 1713 Location: Michigan, USA Markus Winter wrote:Would I need a unique BindingSiteID for the table? Probably not ... No, I would set the Primary Key as (SequenceID, OligoID) â so long as you're sure the pair will always be unique, then you're good to go. _________________ Windows 7 Ultimate x64 Windows XP Pro SP3 Ubuntu 11.04 via Virtual Box RS Enterprise 2012r1.1 Programming Tutorials & Free Projects: http://www.JasonTheAdams.com "Christianity has not been tried and found wanting; it has been found difficult and not tried." - G.K. Chesterton Top Markus Winter Post subject: Re: A better way to implement a many to many to many relatioPosted: Wed Nov 07, 2012 11:44 am Joined: Sun Feb 19, 2006 4:00 pm Posts: 1215 Location: Heidelberg, Germany Jason_Adams wrote:Markus Winter wrote:Would I need a unique BindingSiteID for the table? Probably not ... No, I would set the Primary Key as (SequenceID, OligoID) â so long as you're sure the pair will always be unique, then you're good to go. As you can see from BindingSiteID SequenceID OligoID BindingSite BindingStrength 1 1 12 57 88 2 1 12 256 97 they would not be unique (as an oligo can bind to multiple sites on a sequence). After Norman's post I just wonder if a join table needs a unique ID. Top Display posts from previous: All posts1 day7 days2 weeks1 month3 months6 months1 year Sort by AuthorPost timeSubject AscendingDescending Page 1 of 1 [ 13 posts ]
-- Over 1500 classes with 29000 functions in one REALbasic plug-in collection. The Monkeybread Software Realbasic Plugin v9.3. http://www.monkeybreadsoftware.de/realbasic/plugins.shtml [email protected]
