Some months ago, I brought this up incorrectly.  I've spoken to the
scientists here, and I'd like to propose a set of semantics for
BetweenLocations.  This is driven by the two GenBank/EMBL cases 5^6 and
5^10.  LocationTools will require the boolean operations areEqual, contains,
and overlaps.  It also requires two operations that return a location: union
and intersection.

The general proposal is to be generous about returning a value where the
correct behavior is ill-defined.  For example, we recommend that 5^10
intersects 7^12 return 7^10.  It's easiest to define the semantics case by
case.

Assumptions
* Union, areEqual, Overlaps, and Intersection are symmetric operators
* Contains is a directional operator
* In general, 7^12 is analogous to 7.12 in behavior
* A location 1..10 represents nucleotides 1 through 10 inclusive
* A location 5^6 represents the space between nucleotide 5 and nucleotide 6
* A location 5^10 represents a single space between two adjacent
nucleotides, both of which lie between 5 and 10 inclusive
* Requesting the features on a subsequence must return the features with
between locations as well
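
To make the assumptions above concrete, here is a minimal sketch of how the
two kinds of location could be modelled.  It is not the BioJava API: the
class BetweenModelSketch, the record Loc, and the methods parse and
candidateGaps are all hypothetical names, and it assumes Java 16 or later
for records.  The candidateGaps method reflects my reading of the 5^10
assumption, namely that the site is one of the gaps 5^6 through 9^10.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the BioJava API: every name here is illustrative.
final class BetweenModelSketch {

    // A location is either a range (min..max) or a between location (min^max).
    record Loc(int min, int max, boolean between) {
        @Override
        public String toString() {
            return min + (between ? "^" : "..") + max;
        }
    }

    // Parses "1..10" as a range and "5^10" as a between location.
    static Loc parse(String text) {
        boolean between = text.contains("^");
        String[] parts = text.split(between ? "\\^" : "\\.\\.");
        return new Loc(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), between);
    }

    // Lists the gaps between adjacent nucleotides that a between location
    // could denote under the assumption above.
    static List<Loc> candidateGaps(Loc loc) {
        if (!loc.between()) {
            throw new IllegalArgumentException("not a between location: " + loc);
        }
        List<Loc> gaps = new ArrayList<>();
        for (int i = loc.min(); i < loc.max(); i++) {
            gaps.add(new Loc(i, i + 1, true));
        }
        return gaps;
    }

    public static void main(String[] args) {
        System.out.println(candidateGaps(parse("5^6")));
        System.out.println(candidateGaps(parse("5^10")));
    }
}
```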

Operation                 Intersection    Overlaps
1..10 intersects 5^6      5^6             TRUE
1..10 intersects 5^7      5^7             TRUE
1..10 intersects 10^11    EMPTY           FALSE
1..10 intersects 9^11     9^10            TRUE
5^6 intersects 5^6        5^6             TRUE
5^10 intersects 5^6       5^6             TRUE
5^10 intersects 7^12      7^10            TRUE
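
One rule that reproduces every row of the table above is to clip the two
locations to their common bounds and to treat the result as a between
location whenever either input is one.  The sketch below assumes that rule.
The names IntersectionSketch, Loc, intersect, and overlaps are hypothetical,
and the range-with-range case is ordinary interval intersection, which this
proposal does not itself cover.

```java
import java.util.Optional;

// Hypothetical sketch, not the BioJava API: every name here is illustrative.
final class IntersectionSketch {

    record Loc(int min, int max, boolean between) {
        @Override
        public String toString() {
            return min + (between ? "^" : "..") + max;
        }
    }

    // Clips both locations to their common bounds. An empty Optional stands
    // for the EMPTY result in the table.
    static Optional<Loc> intersect(Loc a, Loc b) {
        int lo = Math.max(a.min(), b.min());
        int hi = Math.min(a.max(), b.max());
        boolean between = a.between() || b.between();
        // A between result needs two distinct nucleotides to sit between
        // (lo < hi); a range result needs only one shared nucleotide (lo <= hi).
        boolean nonEmpty = between ? lo < hi : lo <= hi;
        return nonEmpty ? Optional.of(new Loc(lo, hi, between)) : Optional.empty();
    }

    // Overlaps is TRUE exactly when the intersection is not EMPTY.
    static boolean overlaps(Loc a, Loc b) {
        return intersect(a, b).isPresent();
    }

    public static void main(String[] args) {
        Loc range = new Loc(1, 10, false);
        System.out.println(intersect(range, new Loc(9, 11, true)));
        System.out.println(intersect(range, new Loc(10, 11, true)));
        System.out.println(intersect(new Loc(5, 10, true), new Loc(7, 12, true)));
    }
}
```

Because the clipping is symmetric in its arguments, intersect and overlaps
come out symmetric, as the assumptions require.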

areEqual
        5..6 equals 5^6         result FALSE
        5^6 equals 5^6          result TRUE
        5^6 equals 5^10         result FALSE

Contains
        1..10 contains 5^6      result TRUE
        5^6 contains 1..10      result FALSE
        1..10 contains 5^7      result TRUE
        1..10 contains 10^11    result FALSE
        1..10 contains 9^11     result FALSE
        5^6 contains 5^6        result TRUE
        5^10 contains 5^6       result TRUE
        5^6 contains 5^10       result FALSE
        5^10 contains 7^12      result FALSE
        7^12 contains 5^10      result FALSE
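
The areEqual and Contains cases above can be reproduced with a comparison of
kind and bounds.  The following is only a sketch under that assumption, with
hypothetical names (ContainsSketch, Loc, rangeOf, betweenOf).  One rule in
it is my own guess rather than part of the proposal: a between location
never contains a range.  The only such case listed is 5^6 contains 1..10,
which is FALSE under either choice.

```java
// Hypothetical sketch, not the BioJava API: every name here is illustrative.
final class ContainsSketch {

    record Loc(int min, int max, boolean between) {
        @Override
        public String toString() {
            return min + (between ? "^" : "..") + max;
        }
    }

    static Loc rangeOf(int min, int max) {
        return new Loc(min, max, false);
    }

    static Loc betweenOf(int min, int max) {
        return new Loc(min, max, true);
    }

    // Equal only when the kind and both bounds match, so 5..6 is not 5^6.
    static boolean areEqual(Loc a, Loc b) {
        return a.equals(b);
    }

    // Directional: outer must enclose inner on both sides.
    static boolean contains(Loc outer, Loc inner) {
        if (outer.between() && !inner.between()) {
            return false; // assumption: a gap never contains whole nucleotides
        }
        return outer.min() <= inner.min() && inner.max() <= outer.max();
    }

    public static void main(String[] args) {
        System.out.println(areEqual(rangeOf(5, 6), betweenOf(5, 6)));
        System.out.println(contains(rangeOf(1, 10), betweenOf(5, 6)));
        System.out.println(contains(rangeOf(1, 10), betweenOf(9, 11)));
        System.out.println(contains(betweenOf(5, 10), betweenOf(5, 6)));
    }
}
```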

Overlaps
        1..10 overlaps 5^6      result TRUE
        1..10 overlaps 5^7      result TRUE
        1..10 overlaps 10^11    result FALSE
        1..10 overlaps 9^11     result TRUE
        9^11 overlaps 1..10     result TRUE
        5^6 overlaps 5^6        result TRUE
        5^6 overlaps 5^10       result TRUE
        5^10 overlaps 7^12      result TRUE

Union
        1..10 union 5^6         result compoundLocation(1..10,5^6)
        1..10 union 5^7         result compoundLocation(1..10,5^7)
        1..10 union 10^11       result compoundLocation(1..10,10^11)
        1..10 union 9^11        result compoundLocation(1..10,9^11)
        5^6 union 5^6           result 5^6
        5^10 union 5^6          result 5^10
        5^10 union 7^12         result 5^12
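
The Union cases above suggest this rule: two between locations that overlap
merge into one, and a range combined with a between location always stays a
compound of both parts, even when the range contains the between location.
Below is a rough sketch of that rule.  UnionSketch, Loc, and union are
hypothetical names, and a plain list of parts stands in for
compoundLocation.  Merging two overlapping ranges and keeping
non-overlapping between locations as a compound are my assumptions, since
the cases above do not cover them.

```java
import java.util.List;

// Hypothetical sketch, not the BioJava API: every name here is illustrative.
final class UnionSketch {

    record Loc(int min, int max, boolean between) {
        @Override
        public String toString() {
            return min + (between ? "^" : "..") + max;
        }
    }

    // Returns one element for a merged location, or both inputs as the parts
    // of a compound location.
    static List<Loc> union(Loc a, Loc b) {
        if (a.between() == b.between()) {
            int lo = Math.max(a.min(), b.min());
            int hi = Math.min(a.max(), b.max());
            // Between locations must share a gap; ranges must share a nucleotide.
            boolean overlap = a.between() ? lo < hi : lo <= hi;
            if (overlap) {
                int min = Math.min(a.min(), b.min());
                int max = Math.max(a.max(), b.max());
                return List.of(new Loc(min, max, a.between()));
            }
        }
        // Mixed kinds, or no overlap: keep both parts.
        return List.of(a, b);
    }

    public static void main(String[] args) {
        System.out.println(union(new Loc(1, 10, false), new Loc(5, 6, true)));
        System.out.println(union(new Loc(5, 10, true), new Loc(5, 6, true)));
        System.out.println(union(new Loc(5, 10, true), new Loc(7, 12, true)));
    }
}
```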

If there aren't any objections, I'll code this up and commit it.

Greg
_______________________________________________
Biojava-l mailing list  -  [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l