Hi Russ,

Sorry for the delayed response. Just finished up chairing PyOhio and got 
through my work backlog.

Broadly speaking -- yes, sounds interesting; the trick will be coming 
> up with an API that *isn't* PostgreSQL specific. 
>

Definitely, and one that doesn't complicate things for the 99% case.

 

> Off the top of my head, I would look at this problem as the problem of 
> configuring the types of index that are to be created. db_index is 
> currently treated as a boolean yes/no; if you say yes, it creates an 
> index (or, in the case of PostgresSQL, indices); if you say no, it 
> doesn't. To me, what you're talking about is turning a True/False 
> option into something that can be explicitly configured (with some 
> sort of fallback so that "True" is the default index collection). 
>

That was my initial thought, but there is something elegant about 
db_index=True.  If it turns into more than that, it's become more complex 
for the 99% of the time you just want db_index=True.  The only time (today) 
that db_index would not be a boolean is if:

1. You are using a CharField
2. You are using PostgreSQL
3. You don't want the varchar_pattern_ops index created.

That's a really small subset.  Certainly there are probably going to be in 
the future other indexes potentially added to various types, but it seems 
to me it's always going to be a corner case.  It's a corner case I run into 
a lot, but I'm not egocentric enough to apply that to a general case :-).


This would need to be functionally driven -- e.g., "I want to add an 
> index that allows for partial matches", or "I want to add a 
> case-insensitive index" -- not exposing literal database syntax or 
> options. On some databases, some of these "functions" would be no-ops, 
> or subsumed by other index functions (so MySQL, for example, will only 
> create one index on CharFields).
>

There is a mechanism today to add custom indexes on fields that is 
*database-specific*, and that's custom SQL via the sql/<modelname>.sql 
structure.

The nice thing about your functionality-driven approach is it takes a 
*db-agnostic* approach to index creation.  None of the ORM's I've ever 
dealt with do anything more with index creation than the simple index, the 
foreign key index, or the unique/unique together index/constraint.  Doing 
something like this would certainly be novel, but that likely also means a 
lot of work. It would also require a lot of design thought, as to my 
knowledge there isn't an ORM that has already blazed this trail.


There is an analogous ticket around dealing with adding 
> composite/multicolumn indices: 
>
> https://code.djangoproject.com/ticket/5805 
>
> I don't know if there's any potential for overlap when it comes to the 
> API here, but it might be worth exploring. 
>

That is definitely interesting, and is a nice feature.  I'm not sure how to 
apply it to this situation without making db_index=True more complex.  The 
issue here is that Django is creating an index that is beyond the normal 
column index for CharFields in PostgreSQL and there isn't an in-Django way 
to override that behavior.  The extra index is getting created down in the 
Django PostgreSQL driver layer based on the column type.  Kind of like how 
you can create custom Model managers today.  Right now however, there is no 
easy way to extend or override the database driver layer.

In summary, I would say these are the possible approaches in rough order of 
complexity:

1. Don't specify db_index=True for CharFields where you don't want 
varchar_pattern_ops 
index created but want a regular index createrd.  Instead, create the 
regular index in sql/<modelname>.sql code.  This doesn't require any change 
to Django code.

2. Have some mechanism in Python code (via custom classes/overrides) that 
would allow one to override the default Django PostgreSQL layer creating 
the extra index.  With a custom "index creator" (not within Django, but 
written by the user, like a custom Model manager) could fine tune what/how 
indexes get created and even possibly change the create index statements to 
be specific to different databases.  In that case, you could use 
db_index=True, could make decisions based on db type, etc. which you can't 
with sql/<modelname>.sql code, and everything remains in Python code.

3. Create additional options to the Column's db_index parameter to allow 
for fine tuning of index creation.  Right now, the *only* fine tuning that 
makes sense (i.e. the only time more than one index is created when 
db_index=True) is PostgreSQL CharField's.

4. Extend the Django ORM to have an index creation capability in much the 
same manner as tables and columns are created.  This would provide a 
db-agnostic way to create indexes that have specific properties (case 
insensitive, multi-column, partial, etc.) and would generate CREATE INDEX 
statements that are specific to the underlying db (or not create if the db 
doesn't support).  This would be functionality not common in ORMs today, 
and Django would be a trail-blazer in this area.


Are there any I'm missing?  What seems like the most viable direction to 
take?

Thanks!
Eric

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-developers/-/c4pH0-mo1IoJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

Reply via email to