#26683: Oracle DBs: Add option to use VARCHAR2 data type
--------------------------------+--------------------------------------
     Reporter:  elarandjelovic  |                    Owner:  nobody
         Type:  New feature     |                   Status:  new
    Component:  Migrations      |                  Version:  1.9
     Severity:  Normal          |               Resolution:
     Keywords:  Oracle          |             Triage Stage:  Unreviewed
    Has patch:  0               |      Needs documentation:  0
  Needs tests:  0               |  Patch needs improvement:  0
Easy pickings:  0               |                    UI/UX:  0
--------------------------------+--------------------------------------

Comment (by jarshwah):

 Just about every database I've worked with uses VARCHAR rather than
 NVARCHAR. I wasn't aware of performance issues between the two though.
 Vackar, were the joins you're talking about joins between NVAR and VAR
 fields, or just two tables that contained those fields. I'm fairly sure
 you mean joining NCHAR to CHAR, but just want to be clear.

 I guess the original decision to support NVARCHAR was due to the
 possibility/frequency of oracle databases at the time not using utf8 (or
 similar) encoding by default, so the NVARCHAR was a way to make unicode
 work in all circumstances.

 Recommendations mostly follow this from Ask Tom
 
(https://asktom.oracle.com/pls/apex/f%3Fp%3D100:11:::NO:RP:P11_QUESTION_ID:9462837200346048883)

 > NVARCHAR2 is a unicode-only data type. It's useful if you want to have
 unicode for some columns (those that are NVARCHAR2) but have the rest of
 the database use a different characterset. In all other cases stick with
 using VARCHAR2.

 I think this is probably worth fixing provided a palatable solution can be
 found. Simply querying for the character set is probably not enough. You
 need to know this information every time you run a migration (or cache it
 somewhere). If the character set is changed between migrations, now you
 have different character types within the same application which is
 arguably worse. I'm unsure if django needs to know what the type of
 varchar column is at runtime or not, but if it does, that'd incur an extra
 query at connection creation time.

 Also, looking into the code a bit it looks like the
 startswith/istartswith/contains based operators all expect NCHAR, and do
 the appropriate conversions:

 {{{
  _standard_operators = {
         'contains': "LIKE TRANSLATE(%s USING NCHAR_CS) ESCAPE
 TRANSLATE('\\' USING NCHAR_CS)",
         'icontains': "LIKE UPPER(TRANSLATE(%s USING NCHAR_CS)) ESCAPE
 TRANSLATE('\\' USING NCHAR_CS)",
         'startswith': "LIKE TRANSLATE(%s USING NCHAR_CS) ESCAPE
 TRANSLATE('\\' USING NCHAR_CS)",
         'endswith': "LIKE TRANSLATE(%s USING NCHAR_CS) ESCAPE
 TRANSLATE('\\' USING NCHAR_CS)",
         'istartswith': "LIKE UPPER(TRANSLATE(%s USING NCHAR_CS)) ESCAPE
 TRANSLATE('\\' USING NCHAR_CS)",
         'iendswith': "LIKE UPPER(TRANSLATE(%s USING NCHAR_CS)) ESCAPE
 TRANSLATE('\\' USING NCHAR_CS)",
     }
 }}}

 I would think most of that TRANSLATE/USING would be unnecessary if we were
 simply using varchar to begin with. There's probably other optimisations
 we can make elsewhere if varchar was used. But these operators probably
 prevent you from creating a useful model field yourself - I haven't tried
 to see what happens if you do LIKE comparisons by translating char using
 nchar.

 Migrations between the two are possible, but it requires creating a new
 field, migrating the data across, and then dropping the existing field.
 For big tables that's going to be very expensive.

 It seems to me that this is an all or nothing proposition. Since the
 character encoding is database wide (not per user/schema), and there are
 many very old installations of oracle around, we can't guarantee that an
 existing django deployment will be able to continue on a version using
 varchar if they aren't able to change the global character set.

 I think this is going to require a new backend since there is no easy way
 to dynamically switch behaviours. A new backend would be able to sweep out
 a tonne of old work arounds that are probably not useful anymore. It'd
 also be able to break backwards compatibility, because I don't see the
 django organisation maintaining two oracle backends when maintaining the
 current one is hard enough due to lack of oracle maintainers.

 Sorry for the wall of text, but I kept jumping down new rabbit holes. As I
 said earlier though, if a palatable solution can be found, I think it's
 definitely worth looking into.

--
Ticket URL: <https://code.djangoproject.com/ticket/26683#comment:7>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/072.898515f46925ac1e41a6c035c6b7553d%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to