[jira] [Commented] (SENTRY-2028) Avoid datanucleus to create/update database schema

kalyan kumar kalvagadda (JIRA) Thu, 02 Nov 2017 12:36:25 -0700

    [ https://issues.apache.org/jira/browse/SENTRY-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236459#comment-16236459 ]


kalyan kumar kalvagadda commented on SENTRY-2028:
-------------------------------------------------

[~akolb]  
Below is the description of the datanucleus.schema.autoCreateAll property 
(http://www.datanucleus.org/products/accessplatform_4_1/jdo/schema.html):
"If you want to create the schema ("tables"+"columns"+"constraints") during the 
persistence process, the property datanucleus.schema.autoCreateAll provides a 
way of telling DataNucleus to do this. It's a shortcut to setting the other 3 
properties to true. Thereafter, during calls to DataNucleus to persist classes 
or performs queries of persisted data, whenever it encounters a new class to 
persist that it has no information about, it will use the MetaData to check the 
datastore for presence of the "table", and if it doesn't exist, will create it. 
In addition it will validate the correctness of the table (compared to the 
MetaData for the class), and any other constraints that it requires (to manage 
any relationships). If any constraints are missing it will create them."

With that said, this configuration will create index/unique constraints when 
the ones defined in JDO do not match the database schema. I don't think we 
want that. This change will stop datanucleus from updating the schema based on 
the JDO definition. 

*What is the issue we currently have?*
 We end up with multiple indexes on the same column because the index names in 
the JDO and the sql script differ. Here is an example:
{noformat}
+-------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name                      | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| SENTRY_USER |          0 | PRIMARY                       |            1 | USER_ID     | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| SENTRY_USER |          0 | SENTRY_USER_USER_NAME_UNIQUE  |            1 | USER_NAME   | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| SENTRY_USER |          0 | SENTRY_USER_NAME              |            1 | USER_NAME   | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
+-------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)

{noformat}
This takes up more space and will also affect performance as the data grows. 
We had not noticed this issue until now. It was exposed while we were testing 
this code with an Oracle database, as Oracle does not allow indexing the same 
column twice.
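
To make the duplicate easier to spot, here is a rough sketch of a check over 
the (Key_name, Column_name) pairs from the listing above. The class and method 
names are hypothetical and are not part of Sentry or the attached patch. It 
assumes single-column indexes only (Seq_in_index = 1), as in the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Sentry.
final class DuplicateIndexCheck {

    // Each row is {Key_name, Column_name}. Returns only the columns that are
    // covered by more than one index, mapped to the names of those indexes.
    static Map<String, List<String>> columnsIndexedTwice(String[][] rows) {
        Map<String, List<String>> byColumn = new TreeMap<>();
        for (String[] row : rows) {
            byColumn.computeIfAbsent(row[1], c -> new ArrayList<>()).add(row[0]);
        }
        // Drop columns that have a single index.
        byColumn.values().removeIf(names -> names.size() < 2);
        return byColumn;
    }

    public static void main(String[] args) {
        // Values copied from the SENTRY_USER listing above.
        String[][] rows = {
            {"PRIMARY", "USER_ID"},
            {"SENTRY_USER_USER_NAME_UNIQUE", "USER_NAME"},
            {"SENTRY_USER_NAME", "USER_NAME"},
        };
        // prints {USER_NAME=[SENTRY_USER_USER_NAME_UNIQUE, SENTRY_USER_NAME]}
        System.out.println(columnsIndexedTwice(rows));
    }
}
```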

With the SENTRY-1159 fix, schemas are created only if they do not exist. As the 
index names in the JDO and the sql script do not match, the indexes are created 
again under the JDO names.

*When we have sql scripts to create the schema, we should turn off the schema 
generating properties in the datanucleus configuration.*
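
As a rough sketch of what turning these off could look like in code: the 
helper below is hypothetical and is not taken from the attached patch. The 
three per-item property names (autoCreateTables, autoCreateColumns, 
autoCreateConstraints) are my reading of "the other 3 properties" in the 
DataNucleus 4.1 description quoted above, so treat them as an assumption to 
verify against the DataNucleus version in use.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Sentry or the patch.
final class SchemaAutoCreateOff {

    // Returns a copy of the given properties with schema generation disabled.
    // The input is left unchanged.
    static Properties disableSchemaGeneration(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        // Property named in the comment above.
        props.setProperty("datanucleus.schema.autoCreateAll", "false");
        // Assumed names of the three properties that autoCreateAll is a shortcut for.
        props.setProperty("datanucleus.schema.autoCreateTables", "false");
        props.setProperty("datanucleus.schema.autoCreateColumns", "false");
        props.setProperty("datanucleus.schema.autoCreateConstraints", "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = disableSchemaGeneration(new Properties());
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting properties would then be handed to whatever creates the 
PersistenceManagerFactory, so that only the sql scripts run by schemaTool 
define tables, indexes and constraints.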

> Avoid datanucleus to create/update database schema
> --------------------------------------------------
>
>                 Key: SENTRY-2028
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2028
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.0.0
>            Reporter: kalyan kumar kalvagadda
>            Assignee: kalyan kumar kalvagadda
>            Priority: Major
>         Attachments: SENTRY-2028.001.patch
>
>
> With the current default datanucleus configuration, datanucleus tries to 
> update the database schema. This is not desired as the schema creation and 
> update happens from schemaTool using sql files.  
> If this jira is resolved, the issues reported in SENTRY-1934 and SENTRY-2011 
> will not be seen. I feel we should revert the code changes done for these jiras.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
