Re: Python Module Exposure
Jacob Page [EMAIL PROTECTED] wrote: George Sakkis wrote: As I see it, there are two main options you have: 1. Keep Intervals immutable and pass all the responsibility of combining them to IntervalSet. In this case Interval.__add__ would have to go. This is simple to implement, but it's probably not the most convenient to the user. 2. Give Interval the same interface with IntervalSet, at least as far as interval combinations are concerned, so that Interval.between(2,3) | Interval.greaterThan(7) returns an IntervalSet. Apart from being user friendlier, an extra benefit is that you don't have to support factories for IntervalSets, so I am more in favor of this option. I selected option one; Intervals are immutable. However, this doesn't mean that __add__ has to go, as that function has no side-effects. The reason I chose option one was because it's uncommon for a mathematical operation on two objects to return a different type altogether. Mathematically what you described corresponds to sets that are not closed under some operation and it's not uncommon at all; a few examples are: - The set of integers is not closed under division: int / int - rational - The set of real numbers is not closed under square root: sqrt(real) - complex - The set of positive number is not closed under subtraction: pos_number - pos_number - number - And yes, the set of intervals is not closed under union. On another note, I noticed you use __contains__ both for membership and is-subset queries. This is problematic in case of Intervals that have other Intervals as members. The set builtin type uses __contains__ for membership checks and issubset for subset checks (with __le__ as synonym); it's good to keep the same interface. George -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
George Sakkis wrote: Jacob Page [EMAIL PROTECTED] wrote: I selected option one; Intervals are immutable. However, this doesn't mean that __add__ has to go, as that function has no side-effects. The reason I chose option one was because it's uncommon for a mathematical operation on two objects to return a different type altogether. Mathematically what you described corresponds to sets that are not closed under some operation and it's not uncommon at all; a few examples are: - The set of integers is not closed under division: int / int - rational - The set of real numbers is not closed under square root: sqrt(real) - complex - The set of positive number is not closed under subtraction: pos_number - pos_number - number - And yes, the set of intervals is not closed under union. Yes, but I wasn't talking about mathematical operations in general; I was talking about mathematical operations in Python. Example: 6 / 5 doesn't yield a float (though I heard that might change in future versions). If the union of two integers yielded a set of integers, then it'd make more since for the union of two Intervals to yield an IntervalSet. But it doesn't. Just as set([2, 6]) creates a set of two integers, IntervalSet(Interval(...), Interval(...)) creates a set of two intervals. On another note, I noticed you use __contains__ both for membership and is-subset queries. This is problematic in case of Intervals that have other Intervals as members. The set builtin type uses __contains__ for membership checks and issubset for subset checks (with __le__ as synonym); it's good to keep the same interface. Good catch. I think I've made similar mistakes for a few of the other functions, too. Thanks for your help. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
Jacob Page [EMAIL PROTECTED] wrote: If the union of two integers yielded a set of integers, then it'd make more since for the union of two Intervals to yield an IntervalSet. AFAIK union is defined over sets, not numbers, so I'm not sure what you mean by the union of two integers. What I'm saying is that while the union of two intervals is always defined (since intervals are sets), the result set is not guaranteed to be an interval. More specifically, the result is an interval if and only if the intervals overlap, e.g. (2,4] | [3,7] = (2,7] but (2,4] | [5,7] = { (2,4], [5,7] } That is, the set of intervals is not closed under union. OTOH, the set of intervals _is_ closed under intersection; intersecting two non-overlapping intervals gives the empty interval. George -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
George Sakkis wrote: Jacob Page [EMAIL PROTECTED] wrote: If the union of two integers yielded a set of integers, then it'd make more since for the union of two Intervals to yield an IntervalSet. AFAIK union is defined over sets, not numbers, so I'm not sure what you mean by the union of two integers. What I'm saying is that while the union of two intervals is always defined (since intervals are sets), the result set is not guaranteed to be an interval. More specifically, the result is an interval if and only if the intervals overlap, e.g. (2,4] | [3,7] = (2,7] but (2,4] | [5,7] = { (2,4], [5,7] } That is, the set of intervals is not closed under union. OTOH, the set of intervals _is_ closed under intersection; intersecting two non-overlapping intervals gives the empty interval. OK, you've convinced me now to support and, or, and xor between every combination of Intervals and IntervalSets, Intervals and IntervalSets, and IntervalSets and Intervals. However, I'm not sure I like the idea of an operation generating either one type or another. Thus, I'll have | and ^ operations between Intervals always return an IntervalSet instead of returning either an IntervalSet or an Interval. will return an Interval. I suppose that means I should just have + do a union and - return an IntervalSet. It will just have to be documented which types are to be expected for the return values depending on the operands. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
Jacob Page [EMAIL PROTECTED] wrote: OK, you've convinced me now to support and, or, and xor between every combination of Intervals and IntervalSets, Intervals and IntervalSets, and IntervalSets and Intervals. I'm sorry, this was not my intention wink. However, I'm not sure I like the idea of an operation generating either one type or another. I don't like it either; it makes the user's work harder by forcing him to check the result's type, and then either process each type differently or wrap Intervals into IntervalSets and deal with the latter only. Thus, I'll have | and ^ operations between Intervals always return an IntervalSet instead of returning either an IntervalSet or an Interval. will return an Interval. I suppose that means I should just have + do a union and - return an IntervalSet. It will just have to be documented which types are to be expected for the return values depending on the operands. This is better, but still I'm not sure if it's good enough. Splitting the set of operators into those returning Interval and those returning IntervalSet has to be documented of course, but nevertheles it is not intuitive for the simple user who doesn't think about closed and non-closed operations. It's a viable option though. The other two options are: - Return always an IntervalSet from all five operators (~, |, , ^, -). This is inconvenient for at least intersection and difference which are known to be closed operations. - Go back to your initial preference, that is don't support any operator at all for Intervals. Given that in your new version there are factories in IntervalSet as well, it's not as bad as before; simply the user can create Intervals through the IntervalSet factories. You can take this even further by disallowing (or discouraging at least) the user to instantiate Intervals directly. An analogy is the Match type in the re module. Match objects are returned by re.match() and re.search() and they expose a set of useful methods (group(), groups(), etc.). However if the user attempts to create a Match instance, a TypeError is raised. Currently I like this option better; it's both user and developer friendly :) George -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
Jacob Page wrote: Thomas Lotze wrote: Jacob Page wrote: better-named, Just a quick remark, without even having looked at it yet: the name is not really descriptive and runs a chance of misleading people. The example I'm thinking of is using zope.interface in the same project: it's customary to name interfaces ISomething. I've thought of IntervalSet (which is a very long name), IntSet (might be mistaken for integers), ItvlSet (doesn't roll off the fingers), ConSet, and many others. Perhaps IntervalSet is the best choice out of these. I'd love any suggestions. Hi Jacob, I liked a lot the idea of interval sets; I wonder why no one has thought of it before (hell, why *I* haven't thought about it before ? :)) I haven't had a chance to look into the internals, so my remarks now are only API-related and stylistic: 1. As already noted, ISet is not really descriptive of what the class does. How about RangeSet ? It's not that long and I find it pretty descriptive. In this case, it would be a good idea to change Interval to Range to make the association easier. 2. The module's helper functions -- which are usually called factory functions/methods because they are essentially alternative constructors of ISets -- would perhaps better be defined as classmethods of ISet; that's a common way to define instance factories in python. Except for 'eq' and 'ne', the rest define an ISet of a single Interval, so they would rather be classmethods of Interval. Also the function names could take some improvement; at the very least they should not be 2-letter abbreviations. Finally I would omit 'eq', an interval of a single value; single values can be given in ISet's constructor anyway. Here's a different layout: class Range(object): # Interval @classmethod def lowerThan(cls, value, closed=False): # lt, for closed==False # le, for closed==True return cls(None, False, value, closed) @classmethod def greaterThan(cls, value, closed=False): # gt, for closed==False # ge, for closed==True return cls(value, closed, None, False) @classmethod def between(cls, min, max, closed=False): # exInterval, for closed==False # incInterval, for closed==True return cls(min, closed, max, closed) class RangeSet(object): # ISet @classmethod def allExcept(cls, value): # ne return cls(Range.lowerThan(value), Range.greaterThan(value)) 3. Having more than one names for the same thing is good to be avoided in general. So keep either none or empty (preferably empty to avoid confusion with None) and remove ISet.__add__ since it is synonym to ISet.__or__. 4. Intervals shouldn't be comparable; the way __cmp__ works is arbitrary and not obvious. 5. Interval.adjacentTo() could take an optional argument to control whether an endpoint is allowed to be in both ranges or not (e.g. whether (1,3], [3, 7] are adjacent or not). 6. Possible ideas in your TODO list: - Implement the whole interface of sets for ISet's so that a client can use either or them transparently. - Add an ISet.remove() for removing elements, Intervals, ISets as complementary to ISet.append(). - More generally, think about mutable vs immutable Intervals and ISets. The sets module in the standard library will give you a good idea of the relevant design and implementation issues. After I look into the module's internals, I'll try to make some changes and send it back to you for feedback. Regards, George -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
George Sakkis wrote: Jacob Page [EMAIL PROTECTED] wrote: George Sakkis wrote: 1. As already noted, ISet is not really descriptive of what the class does. How about RangeSet ? It's not that long and I find it pretty descriptive. In this case, it would be a good idea to change Interval to Range to make the association easier. The reason I decided not to use the term Range was because that could be confused with the range function, which produces a discrete set of integers. Interval isn't a term used in the standard/built-in library, AFAIK, so I may stick with it. Is this sound reasoning? Yes, it is not unreasonable; I can't argue strongly against Interval. Still I'm a bit more in favor of Range and I don't think it is particularly confusing with range() because: 1. Range has to be either qualified with the name of the package (e.g. rangesets.Range) or imported as from rangesets import Range, so one cannot mistake it for the range builtin. 2. Most popular naming conventions use lower first letter for functions and capital for classes. 3. If you insist that only RangeSet should be exposed from the module's interface and Range (Interval) should be hidden, the potential conflict between range() and RangeSet is even less. Those are pretty good arguments, but after doing some poking around on planetmath.org and reading http://planetmath.org/encyclopedia/Interval.html, I've now settled on Interval, since that seems to be the proper use of the term. 2. The module's helper functions -- which are usually called factory functions/methods because they are essentially alternative constructors of ISets -- would perhaps better be defined as classmethods of ISet; that's a common way to define instance factories in python. Except for 'eq' and 'ne', the rest define an ISet of a single Interval, so they would rather be classmethods of Interval. Also the function names could take some improvement; at the very least they should not be 2-letter abbreviations. First, as far as having the factory functions create Interval instances (Range instances in your examples), I was hoping to avoid Intervals being publically exposed. I think it's cleaner if the end-programmer only has to deal with one object interface. First off, the python convention for names you intend to be 'private' is to prefix them with a single underscore, i.e. _Interval, so it was not obvious at all by reading the documentation that this was your intention. Assuming that Interval is to be exposed, I found Interval.lowerThan(5) a bit more intuitive than IntervalSet.lowerThan(5). The only slight problem is the 'ne'/ allExcept factory which doesn't return a continuous interval and therefore cannot be a classmethod in Interval. If the factories resided in separate classes, it seems like they might be less convenient to use. I wanted these things to be easily constructed. Maybe a good compromise is to implement lessThan and greaterThan in both Interval and IntervalSet. On whether Interval should be exposed or not: I believe that interval is a useful abstraction by itself and has the important property of being continuous, which IntervalSet doesn't. Perhaps I should add a boolean function for IntervalSet called continuous (isContinuous?). Having a simple single-class interface is a valid argument, but it may turn out to be restricted later. For example, I was thinking that a useful method of IntervalSet would be an iterator over its Intervals: for interval in myIntervalSet: print interval.min, interval.max I like the idea of allowing iteration over the Intervals. There are several possible use cases where dealing directly with intervals would be appropriate or necessary, so it's good to have them supported directly by the module. I think I will keep Interval exposed. It sort of raises a bunch of hard-to-answer design questions having two class interfaces, though. For example, would Interval.between(2, 3) + Interval.between(5, 7) raise an error (as it currently does) because the intervals are disjoint or yield an IntervalSet, or should it not even be implemented? How about subtraction, xoring, and anding? An exposed class should have a more complete interface. I think that IntervalSet.between(5, 7) | IntervalSet.between(2, 3) is more intuitive than IntervalSet(Interval.between(5, 7), Interval.between(2, 3)), but I can understand the reverse. I think I'll just support both. I like the idea of lengthening the factory function names, but I'm not sure that the factory methods would be better as class methods. One of the main reason for introducing classmethods was to allow alternate constructors that play well with subclasses. So if Interval is subclassed, say for a FrozenInterval class, FrozenInterval.lowerThan() returns FrozenInterval instances automatically. You've convinced me :). I sure hate to require the programmer to have to type the class name when using a factory
Re: Python Module Exposure
Jacob Page [EMAIL PROTECTED] wrote: George Sakkis wrote: There are several possible use cases where dealing directly with intervals would be appropriate or necessary, so it's good to have them supported directly by the module. I think I will keep Interval exposed. It sort of raises a bunch of hard-to-answer design questions having two class interfaces, though. For example, would Interval.between(2, 3) + Interval.between(5, 7) raise an error (as it currently does) because the intervals are disjoint or yield an IntervalSet, or should it not even be implemented? How about subtraction, xoring, and anding? An exposed class should have a more complete interface. I think that IntervalSet.between(5, 7) | IntervalSet.between(2, 3) is more intuitive than IntervalSet(Interval.between(5, 7), Interval.between(2, 3)), but I can understand the reverse. I think I'll just support both. As I see it, there are two main options you have: 1. Keep Intervals immutable and pass all the responsibility of combining them to IntervalSet. In this case Interval.__add__ would have to go. This is simple to implement, but it's probably not the most convenient to the user. 2. Give Interval the same interface with IntervalSet, at least as far as interval combinations are concerned, so that Interval.between(2,3) | Interval.greaterThan(7) returns an IntervalSet. Apart from being user friendlier, an extra benefit is that you don't have to support factories for IntervalSets, so I am more in favor of this option. Another hard design problem is how to combine intervals when inheritance comes to play. Say that you have FrozenInterval and FrozenIntervalSet subclasses. What should Interval.between(2,3) | FrozenInterval.greaterThan(7) return ? I ran into this problem for another datastructure that I wanted it to support both mutable and immutable versions. I was surprised to find out that: s1 = set([1]) s2 = frozenset([2]) s1|s2 == s2|s1 True type(s1|s2) == type(s2|s1) False It make sense why this happens if you think about the implementation, but I guess most users expect the result of a commutative operation to be independent of the operands' order. In the worst case, this may lead to some hard-to-find bugs. George -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
George Sakkis wrote: Jacob Page [EMAIL PROTECTED] wrote: I think I will keep Interval exposed. It sort of raises a bunch of hard-to-answer design questions having two class interfaces, though. For example, would Interval.between(2, 3) + Interval.between(5, 7) raise an error (as it currently does) because the intervals are disjoint or yield an IntervalSet, or should it not even be implemented? How about subtraction, xoring, and anding? An exposed class should have a more complete interface. I think that IntervalSet.between(5, 7) | IntervalSet.between(2, 3) is more intuitive than IntervalSet(Interval.between(5, 7), Interval.between(2, 3)), but I can understand the reverse. I think I'll just support both. As I see it, there are two main options you have: 1. Keep Intervals immutable and pass all the responsibility of combining them to IntervalSet. In this case Interval.__add__ would have to go. This is simple to implement, but it's probably not the most convenient to the user. 2. Give Interval the same interface with IntervalSet, at least as far as interval combinations are concerned, so that Interval.between(2,3) | Interval.greaterThan(7) returns an IntervalSet. Apart from being user friendlier, an extra benefit is that you don't have to support factories for IntervalSets, so I am more in favor of this option. I selected option one; Intervals are immutable. However, this doesn't mean that __add__ has to go, as that function has no side-effects. The reason I chose option one was because it's uncommon for a mathematical operation on two objects to return a different type altogether. Another hard design problem is how to combine intervals when inheritance comes to play. Say that you have FrozenInterval and FrozenIntervalSet subclasses. What should Interval.between(2,3) | FrozenInterval.greaterThan(7) return ? For now, operations will return mutable instances. They can always be frozen later if needs be. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
Jacob Page wrote: better-named, Just a quick remark, without even having looked at it yet: the name is not really descriptive and runs a chance of misleading people. The example I'm thinking of is using zope.interface in the same project: it's customary to name interfaces ISomething. -- Thomas -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
Thomas Lotze wrote: Jacob Page wrote: better-named, Just a quick remark, without even having looked at it yet: the name is not really descriptive and runs a chance of misleading people. The example I'm thinking of is using zope.interface in the same project: it's customary to name interfaces ISomething. I've thought of IntervalSet (which is a very long name), IntSet (might be mistaken for integers), ItvlSet (doesn't roll off the fingers), ConSet, and many others. Perhaps IntervalSet is the best choice out of these. I'd love any suggestions. -- http://mail.python.org/mailman/listinfo/python-list
Python Module Exposure
I have created what I think may be a useful Python module, but I'd like to share it with the Python community to get feedback, i.e. if it's Pythonic. If it's considered useful by Pythonistas, I'll see about hosting it on Sourceforge or something like that. Is this a good forum for exposing modules to the public, or is there somewhere more-acceptable? Does this newsgroup find attachments acceptable? -- Jacob -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
Jacob Page wrote: I have created what I think may be a useful Python module, but I'd like to share it with the Python community to get feedback, i.e. if it's Pythonic. If it's considered useful by Pythonistas, I'll see about hosting it on Sourceforge or something like that. Is this a good forum for exposing modules to the public, or is there somewhere more-acceptable? Does this newsgroup find attachments acceptable? No. Please put files somewhere on the web and post a URL. This would be a good forum to informally announce and discuss your module. Formal announcements once you, e.g. put it on SF should go to c.l.py.announce . -- Robert Kern [EMAIL PROTECTED] In the fields of hell where the grass grows high Are the graves of dreams allowed to die. -- Richard Harter -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
Robert Kern wrote: Jacob Page wrote: I have created what I think may be a useful Python module, but I'd like to share it with the Python community to get feedback, i.e. if it's Pythonic. If it's considered useful by Pythonistas, I'll see about hosting it on Sourceforge or something like that. Is this a good forum for exposing modules to the public, or is there somewhere more-acceptable? Does this newsgroup find attachments acceptable? No. Please put files somewhere on the web and post a URL. This would be a good forum to informally announce and discuss your module. Formal announcements once you, e.g. put it on SF should go to c.l.py.announce . Thanks for the information, Robert. Anyway, here's my informal announcement: The iset module is a pure Python module that provides the ISet class and some helper functions for creating them. Unlike Python sets, which are sets of discrete values, an ISet is a set of intervals. An ISet could, for example, stand for all values less than 0, all values from 2 up to, but not including 62, or all values not equal to zero. ISets can also pertain to non-numeric values. ISets can be used in much the same way as sets. They can be or'ed, and'ed, xor'ed, added, subtracted, and inversed. Membership testing is done the same as with a set. The documentation has some examples of how to create and use ISets. The iset module is for Python 2.4 or later. I am seeking feedback from programmers and mathematicians on how to possibly make this module more user-friendly, better-named, better-documented, better-tested, and more Pythonic. Then, if this module is considered acceptable by the community, I'll create a more permanent home for this project. To download the iset module and view its pydoc-generated documentation, please visit http://members.cox.net/apoco/iset/. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Module Exposure
Robert Kern wrote: Jacob Page wrote: Does this newsgroup find attachments acceptable? No. Please put files somewhere on the web and post a URL. This would be a good forum to informally announce and discuss your module. To add to what Robert said, keep in mind this newsgroup is also mirrored to a mailing list, so posting anything but example code snippets would quickly fill up people's inboxes. -- http://mail.python.org/mailman/listinfo/python-list