[ 
https://issues.apache.org/jira/browse/JCR-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas D. Jordan updated JCR-4611:
------------------------------------
    Description: 
I’m trying to understand the reasoning behind the oracle schema making the 
XXX_BUNDLE tables column NODE_ID of column type raw instead of a char or binary 
type?

 

The reason that is driving this question is a performance issue when our 
application uses jackrabbit and the table XXX_BUNDLE is about 1M rows . The 
index XXX_BUNDLE_IDX on the XXX_BUNDLE.NODE_ID isn’t being used in selects and 
deletes. Changing the index to:

(RAWTOHEX("NODE_ID")) yields some improvement. But ultimately changing the 
column type of NODE_ID to varchar(32) yielded the best results.

 

I have output from SQL Developer's Sql Tune Advisor to support my claim. See 
results near the end.

 

The new jackrabbit schema for XXX_BUNDLE was introduced in version 1.3: 
[http://svn.apache.org/repos/asf/jackrabbit/branches/1.3/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/persistence/bundle/oracle.ddl]

 
{panel:title=My title}
------------ INFORMATION -----

IDX_PM_VER_BUNDLE_NODEID = CREATE INDEX "IDX_PM_VER_BUNDLE_NODEID" ON 
"PM_VER_BUNDLE" (RAWTOHEX("NODE_ID"));

Query: select * from jcr_user.pm_ver_bundle where node_id = 
'FE0CDF71A86E47E0ADC428078DF43955';

Scenario [1] : normal jackrabbit oracle schema since version 1.3

Scenario [2] : normal jackrabbit oracle schema since version 1.3, but with new 
index IDX_PM_VER_BUNDLE_NODEID

Scenario [3] : normal jackrabbit oracle schema since version 1.3, but node_id 
type is varchar(32)

------------ SQL TUNE ADVISOR RESULTS -----

 

[1]

-----------------------------------------------------------------------------------|
 Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time    
 
|-----------------------------------------------------------------------------------|
   0 | SELECT STATEMENT  |               |     1 |   301 |   102   (0)| 
00:00:01 ||*  1 |  TABLE ACCESS FULL| PM_VER_BUNDLE |     1 |   301 |   102   
(0)| 00:00:01 
|-----------------------------------------------------------------------------------

 

[2]

----------------------------------------------------------------------------------------------------------------|
 Id  | Operation                           | Name                     | Rows  | 
Bytes | Cost (%CPU)| Time     
|----------------------------------------------------------------------------------------------------------------|
   0 | SELECT STATEMENT                    |                          |   105 | 
31605 |    49   (0)| 00:00:01 ||   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| 
PM_VER_BUNDLE            |   105 | 31605 |    49   (0)| 00:00:01 ||*  2 |   
INDEX RANGE SCAN                  | IDX_PM_VER_BUNDLE_NODEID |    42 |       |  
   3   (0)| 00:00:01 
|----------------------------------------------------------------------------------------------------------------

 

[3]

-------------------------------------------------------------------------------------------------|
 Id  | Operation                   | Name              | Rows  | Bytes | Cost 
(%CPU)| Time     
|-------------------------------------------------------------------------------------------------|
   0 | SELECT STATEMENT            |                   |     1 |  2020 |     1  
 (0)| 00:00:01 ||   1 |  TABLE ACCESS BY INDEX ROWID| PM_VER_BUNDLE     |     1 
|  2020 |     1   (0)| 00:00:01 ||*  2 |   INDEX UNIQUE SCAN         | 
PM_VER_BUNDLE_IDX |     1 |       |     1   (0)| 00:00:01 
|-------------------------------------------------------------------------------------------------

 

 
{panel}
 
{noformat}
 {noformat}

  was:
I’m trying to understand the reasoning behind the oracle schema making the 
XXX_BUNDLE tables column NODE_ID of column type raw instead of a char or binary 
type?

 

The reason that is driving this question is a performance issue when our 
application uses jackrabbit and the table XXX_BUNDLE is about 1M rows . The 
index XXX_BUNDLE_IDX on the XXX_BUNDLE.NODE_ID isn’t being used in selects and 
deletes. Changing the index to:

(RAWTOHEX("NODE_ID")) yields some improvement. But ultimately changing the 
column type of NODE_ID to varchar(32) yielded the best results.

 

I have output from SQL Developer's Sql Tune Advisor to support my claim. See 
results near the end.

 

The new jackrabbit schema for XXX_BUNDLE was introduced in version 1.3: 
[http://svn.apache.org/repos/asf/jackrabbit/branches/1.3/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/persistence/bundle/oracle.ddl]

 
{noformat}
------------ INFORMATION -----IDX_PM_VER_BUNDLE_NODEID = CREATE INDEX 
"IDX_PM_VER_BUNDLE_NODEID" ON "PM_VER_BUNDLE" (RAWTOHEX("NODE_ID"));Query: 
select * from jcr_user.pm_ver_bundle where node_id = 
'FE0CDF71A86E47E0ADC428078DF43955';Scenario [1] : normal jackrabbit oracle 
schema since version 1.3Scenario [2] : normal jackrabbit oracle schema since 
version 1.3, but with new index IDX_PM_VER_BUNDLE_NODEIDScenario [3] : normal 
jackrabbit oracle schema since version 1.3, but node_id type is 
varchar(32)------------ SQL TUNE ADVISOR RESULTS ----- 
[1]-----------------------------------------------------------------------------------|
 Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time    
 
|-----------------------------------------------------------------------------------|
   0 | SELECT STATEMENT  |               |     1 |   301 |   102   (0)| 
00:00:01 ||*  1 |  TABLE ACCESS FULL| PM_VER_BUNDLE |     1 |   301 |   102   
(0)| 00:00:01 
|-----------------------------------------------------------------------------------
 
[2]----------------------------------------------------------------------------------------------------------------|
 Id  | Operation                           | Name                     | Rows  | 
Bytes | Cost (%CPU)| Time     
|----------------------------------------------------------------------------------------------------------------|
   0 | SELECT STATEMENT                    |                          |   105 | 
31605 |    49   (0)| 00:00:01 ||   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| 
PM_VER_BUNDLE            |   105 | 31605 |    49   (0)| 00:00:01 ||*  2 |   
INDEX RANGE SCAN                  | IDX_PM_VER_BUNDLE_NODEID |    42 |       |  
   3   (0)| 00:00:01 
|----------------------------------------------------------------------------------------------------------------
 
[3]-------------------------------------------------------------------------------------------------|
 Id  | Operation                   | Name              | Rows  | Bytes | Cost 
(%CPU)| Time     
|-------------------------------------------------------------------------------------------------|
   0 | SELECT STATEMENT            |                   |     1 |  2020 |     1  
 (0)| 00:00:01 ||   1 |  TABLE ACCESS BY INDEX ROWID| PM_VER_BUNDLE     |     1 
|  2020 |     1   (0)| 00:00:01 ||*  2 |   INDEX UNIQUE SCAN         | 
PM_VER_BUNDLE_IDX |     1 |       |     1   (0)| 00:00:01 
|-------------------------------------------------------------------------------------------------
  {noformat}
 


> JackRabbit Oracle Bundle Schema NODE_ID Type Change to VarChar
> --------------------------------------------------------------
>
>                 Key: JCR-4611
>                 URL: https://issues.apache.org/jira/browse/JCR-4611
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: core, indexing, jackrabbit-core, sql
>    Affects Versions: 1.3
>         Environment: Oracle 12g
>            Reporter: Nicholas D. Jordan
>            Priority: Minor
>
> I’m trying to understand the reasoning behind the oracle schema making the 
> XXX_BUNDLE tables column NODE_ID of column type raw instead of a char or 
> binary type?
>  
> The reason that is driving this question is a performance issue when our 
> application uses jackrabbit and the table XXX_BUNDLE is about 1M rows . The 
> index XXX_BUNDLE_IDX on the XXX_BUNDLE.NODE_ID isn’t being used in selects 
> and deletes. Changing the index to:
> (RAWTOHEX("NODE_ID")) yields some improvement. But ultimately changing the 
> column type of NODE_ID to varchar(32) yielded the best results.
>  
> I have output from SQL Developer's Sql Tune Advisor to support my claim. See 
> results near the end.
>  
> The new jackrabbit schema for XXX_BUNDLE was introduced in version 1.3: 
> [http://svn.apache.org/repos/asf/jackrabbit/branches/1.3/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/persistence/bundle/oracle.ddl]
>  
> {panel:title=My title}
> ------------ INFORMATION -----
> IDX_PM_VER_BUNDLE_NODEID = CREATE INDEX "IDX_PM_VER_BUNDLE_NODEID" ON 
> "PM_VER_BUNDLE" (RAWTOHEX("NODE_ID"));
> Query: select * from jcr_user.pm_ver_bundle where node_id = 
> 'FE0CDF71A86E47E0ADC428078DF43955';
> Scenario [1] : normal jackrabbit oracle schema since version 1.3
> Scenario [2] : normal jackrabbit oracle schema since version 1.3, but with 
> new index IDX_PM_VER_BUNDLE_NODEID
> Scenario [3] : normal jackrabbit oracle schema since version 1.3, but node_id 
> type is varchar(32)
> ------------ SQL TUNE ADVISOR RESULTS -----
>  
> [1]
> -----------------------------------------------------------------------------------|
>  Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time  
>    
> |-----------------------------------------------------------------------------------|
>    0 | SELECT STATEMENT  |               |     1 |   301 |   102   (0)| 
> 00:00:01 ||*  1 |  TABLE ACCESS FULL| PM_VER_BUNDLE |     1 |   301 |   102   
> (0)| 00:00:01 
> |-----------------------------------------------------------------------------------
>  
> [2]
> ----------------------------------------------------------------------------------------------------------------|
>  Id  | Operation                           | Name                     | Rows  
> | Bytes | Cost (%CPU)| Time     
> |----------------------------------------------------------------------------------------------------------------|
>    0 | SELECT STATEMENT                    |                          |   105 
> | 31605 |    49   (0)| 00:00:01 ||   1 |  TABLE ACCESS BY INDEX ROWID 
> BATCHED| PM_VER_BUNDLE            |   105 | 31605 |    49   (0)| 00:00:01 ||* 
>  2 |   INDEX RANGE SCAN                  | IDX_PM_VER_BUNDLE_NODEID |    42 | 
>       |     3   (0)| 00:00:01 
> |----------------------------------------------------------------------------------------------------------------
>  
> [3]
> -------------------------------------------------------------------------------------------------|
>  Id  | Operation                   | Name              | Rows  | Bytes | Cost 
> (%CPU)| Time     
> |-------------------------------------------------------------------------------------------------|
>    0 | SELECT STATEMENT            |                   |     1 |  2020 |     
> 1   (0)| 00:00:01 ||   1 |  TABLE ACCESS BY INDEX ROWID| PM_VER_BUNDLE     |  
>    1 |  2020 |     1   (0)| 00:00:01 ||*  2 |   INDEX UNIQUE SCAN         | 
> PM_VER_BUNDLE_IDX |     1 |       |     1   (0)| 00:00:01 
> |-------------------------------------------------------------------------------------------------
>  
>  
> {panel}
>  
> {noformat}
>  {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to