[
https://issues.apache.org/jira/browse/OAK-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070799#comment-17070799
]
Julian Reschke commented on OAK-8977:
-------------------------------------
Hm. You say:
{quote}prepared statements, even when parameterized, don't cache the execution
plan between connections (even if the connection is pooled and does not need to
re-authenticate)
{quote}
{quote}prepared statements only re-use the execution plan when executing the
same prepared statement with different parameters in the same session (the Java
code opens and closes the connection per each call)
{quote}
This benchmark is single-threaded, so it *should* be using the same connection.
What connection pool/datasource implementation do the statements above apply to?
(FWIW, thanks for your research. If handling of PreparedStatements is indeed
the problem, then it would probably better to fix *that*, instead of working
around that introducing vendor-specific code).
> RDBBlobStore performance improvement for SQL Server (MSSQL)
> -----------------------------------------------------------
>
> Key: OAK-8977
> URL: https://issues.apache.org/jira/browse/OAK-8977
> Project: Jackrabbit Oak
> Issue Type: Technical task
> Components: rdbmk
> Reporter: Solomon Rutzky
> Priority: Major
>
> The performance test done in OAK-8926 indicates that one or more aspects of
> using SQL Server as the database need to be addressed (also MySQL, but this
> issue only concerns SQL Server).
>
> OAK-8963 (specify a binary collation in the {{CREATE TABLE}} statement) will
> improve performance a little, but not nearly enough to get its performance up
> to the range of the other non-MySQL databases.
>
> I did some testing and found one area that can be improved, and that should
> be feasible to improve. The current DB access pattern is to submit
> parameterized dynamic SQL. JDBC handles this as a "prepared" statement. I did
> some research four years ago on prepared statements ( [What is the sense and
> benefit of using
> SqlCommand.Prepare()?|https://dba.stackexchange.com/a/129721/30859] on
> DBA.StackExchange ) and found that:
> # prepared statements, even when parameterized, don't cache the execution
> plan between connections (even if the connection is pooled and does not need
> to re-authenticate)
> # prepared statements only re-use the execution plan when executing the same
> prepared statement with different parameters in the _same session_ (the Java
> code opens and closes the connection per each call)
> One way to improve performance is to:
> # put the T-SQL code into a stored procedure (that way SQL Server does not
> need to parse it per each call and can re-use an execution plan), _and_
> # call the stored procedure using a "callable statement": [Using a stored
> procedure with input
> parameters|https://docs.microsoft.com/en-us/sql/connect/jdbc/using-a-stored-procedure-with-input-parameters]
>
> I tested this by first copying the basic sequence of T-SQL statements found
> in the
> [storeBlockInDatabase|http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/rdb/RDBBlobStore.java?revision=1874570&view=markup#l307]
> method (assuming that everything always worked / happy path / no error
> handling) and simulating the new
> [testInsertSmallBlobs()|http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/blob/RDBBlobStoreTest.java?view=markup&pathrev=1874693#l308]
> test, first as dynamic SQL, and then as a stored procedure:
> {code:sql}
> ALTER PROCEDURE dbo.[StoreBlockInDatabase]
> (
> @Digest VARBINARY(32),
> @Level INT,
> @Data VARBINARY(MAX)
> )
> AS
> SET NOCOUNT ON;
> DECLARE @Id VARCHAR(64) = CONVERT(VARCHAR(64), @Digest, 2),
> @Now BIGINT;
> SELECT @Now = [cpu_ticks]
> FROM sys.dm_os_sys_info;
> BEGIN TRY
> UPDATE dbo.[DATASTORE_META]
> SET [LASTMOD] = @Now
> WHERE [ID] = @Id;
> IF (@@ROWCOUNT = 0)
> BEGIN
> INSERT INTO dbo.[DATASTORE_DATA] ([ID], [DATA]) VALUES (@Id,
> @Data);
> INSERT INTO dbo.[DATASTORE_META] ([ID], [LVL], [LASTMOD])
> VALUES (@Id, @Level, @Now);
> END;
> END TRY
> BEGIN CATCH
> ;THROW;
> END CATCH;
> GO
> {code}
> I tested against both the {{Latin1_General_CI_AS}} collation and the
> {{Latin1_General_BIN2}} collation:
> * The stored procedure inserts 2.5 times more rows than the dynamic SQL
> version
> * The binary collation is slightly better than the non-binary collation per
> each test
> I will attach the test script once I get it cleaned up.
> P.S. I also tested with another pattern to see if doing a {{SELECT}} first
> (rather than assuming the {{UPDATE}}) would be better (based on my testing
> for a similar case: [UPDATE performance where no data
> changes|https://dba.stackexchange.com/a/114369/30859] on DBA.StackExchange),
> but it wasn't better.
> P.P.S. I figure the {{CREATE PROCEDURE}} statement could be placed in the
> vendor-specific area of
> [RDBBlobStoreDB|http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/rdb/RDBBlobStoreDB.java?revision=1874271&view=markup#l60]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)