I can't really comment on whether this particular instance is a bug, but
I would definitely agree with Chris on the benefits of coming up with
your own deterministic rules and enforcing them.
Most of my work has been with various SQL databases, and "null handling"
always ends up being problematic, even when the database lets you perform
operations that include such values.
It also often results in almost comical statements like, "I *want*
this date column to be sorted, and put all the ones where we don't know
the date at the top *OR* the bottom - I don't care which."
The fact that most SQL databases will handle ordering nullable columns
at all is based on an important assumption made in the standard: that
although two null values are "not equal" to one another, they are also
"not distinct" from one another for the purposes of sorting and grouping.
This implies that all the entries where the publication date is unknown
occurred at the same time. While useful for grouping items, it is almost
always a clear, factual error about the data itself. (In your case, it
probably means it wasn't published.)
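
To make that concrete, here is a small sketch using Python's built-in
sqlite3 module. The table and column names are made up for illustration,
and the default placement of NULLs shown here is SQLite's; other
databases put them at the other end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: dates stored as ISO strings, NULL when unknown.
conn.execute("CREATE TABLE entries (title TEXT, pub_date TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [("a", "2011-01-10"), ("b", None), ("c", "2010-12-01"), ("d", None)],
)

# NULL = NULL is not true; it evaluates to NULL (None in Python).
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Yet GROUP BY treats all NULLs as one group.
groups = conn.execute(
    "SELECT pub_date, COUNT(*) FROM entries GROUP BY pub_date"
).fetchall()

# Default ascending order in SQLite: NULLs come first.
default_order = [
    row[0]
    for row in conn.execute("SELECT title FROM entries ORDER BY pub_date, title")
]

# An explicit rule: unknown dates always go last.
nulls_last = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM entries ORDER BY pub_date IS NULL, pub_date, title"
    )
]

print(null_equals_null)  # None
print(default_order)     # ['b', 'd', 'c', 'a']
print(nulls_last)        # ['c', 'a', 'b', 'd']
```

Spelling out the rule in the ORDER BY (the second query) is what I mean by
making it deterministic rather than relying on whatever the default is.
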
When this assumption is not clear, or later gets lost in the output,
people tend to make bad decisions using the "implied sort order".
Example: "Delete all the entries before a particular publication date."
What does that really imply? What will the result be? Will all the
unpublished ones get deleted also?
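
A rough illustration of how the answer depends on where the question is
asked (again with sqlite3 and made-up names; the second half imitates
application code working from a "NULLs first" listing, which is my own
assumption about how such a mistake would typically happen):

```python
import sqlite3

rows = [("old", "2010-06-01"), ("new", "2011-01-10"), ("draft", None)]
cutoff = "2011-01-01"

# 1) Asked in SQL: NULL < cutoff is not true, so the draft survives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (title TEXT, pub_date TEXT)")
conn.executemany("INSERT INTO entries VALUES (?, ?)", rows)
conn.execute("DELETE FROM entries WHERE pub_date < ?", (cutoff,))
remaining_sql = sorted(r[0] for r in conn.execute("SELECT title FROM entries"))

# 2) Asked of a sorted listing where unknown dates were placed first:
#    "delete everything above the first row on or after the cutoff".
listing = sorted(rows, key=lambda r: (r[1] is not None, r[1] or ""))
first_keep = next(
    i for i, r in enumerate(listing) if r[1] is not None and r[1] >= cutoff
)
remaining_listing = [r[0] for r in listing[first_keep:]]

print(remaining_sql)      # ['draft', 'new']
print(remaining_listing)  # ['new']  (the draft was deleted too)
```

Same data, same request, two different results, purely because of the
implied position of the unknown dates.
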
There is not one particular "right" way to handle all the situations, so
you have to think through each one. Maybe you could combine the use of
other data (creation date, etc.) into the sorting to get a better
chronology. Or, maybe it would be better to always keep the two groups
of records distinct: a list of published ones sorted by publication
date, and a list of unpublished ones sorted by something else.
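
Both ideas are easy to sketch in plain Python. The record layout
(title, created, published) is hypothetical and not anything from
repoze.catalog; it is only meant to show the two approaches side by side.

```python
from datetime import date

# Hypothetical records: (title, created, published); published may be None.
entries = [
    ("a", date(2011, 1, 2), date(2011, 1, 10)),
    ("b", date(2011, 1, 5), None),
    ("c", date(2010, 11, 20), date(2010, 12, 1)),
    ("d", date(2010, 12, 15), None),
]


def best_known_date(entry):
    """Use the publication date if known, otherwise the creation date."""
    _title, created, published = entry
    return published if published is not None else created


# Option 1: one chronology, built from the best date available.
chronology = [e[0] for e in sorted(entries, key=best_known_date)]

# Option 2: keep the two groups apart, each with its own sort key.
published = [
    e[0]
    for e in sorted((e for e in entries if e[2] is not None), key=lambda e: e[2])
]
unpublished = [
    e[0]
    for e in sorted((e for e in entries if e[2] is None), key=lambda e: e[1])
]

print(chronology)   # ['c', 'd', 'b', 'a']
print(published)    # ['c', 'a']
print(unpublished)  # ['d', 'b']
```

Neither is "the" answer; the point is only that the rule is written down
in one place instead of being left to the sort implementation.
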
Sorry if this response is a bit off-topic, but I just wanted to offer
some advice on a topic that has bitten me more than enough times.
--- On Fri, 1/14/11, Wichert Akkerman <wich...@wiggy.net> wrote:
> From: Wichert Akkerman <wich...@wiggy.net>
> Subject: Re: [Repoze-dev] catalog oddness
> To: email@example.com
> Date: Friday, January 14, 2011, 9:21 AM
> On 1/14/11 15:04, Chris Rossi wrote:
> > On Fri, Jan 14, 2011 at 5:20 AM, Wichert Akkerman
> > <wich...@wiggy.net> wrote:
> > This may already be different in the trunk of repoze.catalog, but I
> > just stumbled over this: when you do a catalog search and ask it to
> > order by an empty index you get an empty result set. I was expecting
> > the result to be an unordered result for that situation. Is this
> > expected behaviour?
> > Hi Wichert,
> > I haven't observed this behavior, but it seems like an undefined case,
> > to me. I'm not sure what I would expect it to do in such a situation.
> > Supposing you have a set of documents you want sorted by an index and
> > the index contains only a subset of those documents? It seems to me the
> > case is undefined--I would have a tendency to raise an exception,
> > personally.
> I was expecting a missing index value to be treated as None
> (or NULL in SQL terms) and the related items to appear
> either first or last. Raising an exception is undesirable:
> there are valid situations where an object might have a None
> value for an indexed attribute and that should not lead to
> exceptions when doing catalog queries.
> > I think in the interest of a well defined determinism I would suggest
> > that if you are using an index to sort, you should make sure that the
> > discriminator for that index be able to return some value for any
> > document. This way even if logically, to you, the document doesn't
> > really have a value for that index, you can at least be deterministic
> > about how it will be sorted.
> The object did have a value, but it was None, which the
> index apparently ignores. The fact that it was always None
> was a bug in my code that has been fixed now - it should be
> either None or a date (it was a publication-date field).
> Repoze-dev mailing list