Hi Thomas,

Is the ORCID dataset the only RDF dataset in the Virtuoso RDF Quad Store
currently, or are there others? What is the size of the ORCID dataset,
i.e. its triple count?

I would definitely suggest setting swappiness to 10 to reduce swapping to
disk, which should speed up insert rates.

Looking at your status() command output I see "Clients: 4177045 connects,
max 3 concurrent", indicating that more than 4 million SQL connections have
been made to Virtuoso since it was started on 9th Mar. What is making that
many connections? Is it this insertion process, or are there other clients
also reading from the instance?

Apart from that, the status() output looks fine: there are plenty of unused
buffers, so the database working set can still grow and fit in memory, there
are no deadlocks, and there is only one pending transaction, which is one of
your inserts.

You talk about the Oracle JDBC Driver, but I still don't see its relevance,
as ultimately your insertions to Virtuoso must be done via one of its client
interfaces / services, i.e. either the /sparql endpoint or the Virtuoso JDBC
driver, I would presume. Which is it?

The "DEFINE sql:log-enable 2" pragma being passed in the SPARQL insert
queries does set row-by-row auto-commit and turn off transaction logging,
which is the fastest transaction mode for write operations, see:

http://docs.openlinksw.com/virtuoso/fn_log_enable/

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc. // http://www.openlinksw.com/
Weblog -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter -- http://twitter.com/OpenLink
Google+ -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

> On 10 Mar 2017, at 10:54, Thomas Michaux <mich...@abes.fr> wrote:
>
> Hi,
>
> thanks Hugh, we reached 110 932 303 triples loaded from our ORCID dataset
> since yesterday, and still loading...
>
> Virtuoso process use VmSize: 32227664kB 32708 of memory of :
>
> KiB Mem : 32780296 total, 243972 free, 29985320 used, 2551004 buff/cache
> KiB Swap: 2097148 total, 1734244 free, 362904 used. 2241196 avail Mem
>
> previous 4h logs :
>
> ...
>
> 06:03:28 Checkpoint started
> 06:04:11 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310055817.trx
> 06:28:41 Checkpoint started
> 06:28:44 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310062412.trx
> 06:52:58 Checkpoint started
> 06:53:16 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310064844.trx
> 07:17:14 Checkpoint started
> 07:17:18 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310071317.trx
> 07:39:58 Write load high relative to disk write throughput. Flushing at 5.5 MB/s while application is making dirty pages at 1.5 MB/s. Doing a second flushing pass before checkpoint
> 07:41:10 Checkpoint started
> 07:41:17 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310073719.trx
> 08:04:53 Checkpoint started
> 08:04:56 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310080117.trx
> 08:27:35 Write load high relative to disk write throughput. Flushing at 5.7 MB/s while application is making dirty pages at 1.7 MB/s. Doing a second flushing pass before checkpoint
> 08:28:45 Checkpoint started
> 08:29:02 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310082457.trx
> 08:51:43 Write load high relative to disk write throughput. Flushing at 5.4 MB/s while application is making dirty pages at 1.7 MB/s. Doing a second flushing pass before checkpoint
> 08:52:57 Checkpoint started
> 08:53:01 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310084902.trx
> 09:15:40 Write load high relative to disk write throughput. Flushing at 5.6 MB/s while application is making dirty pages at 1.9 MB/s. Doing a second flushing pass before checkpoint
> 09:16:59 Checkpoint started
> 09:17:13 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310091301.trx
> 09:39:57 Write load high relative to disk write throughput. Flushing at 5.4 MB/s while application is making dirty pages at 1.7 MB/s. Doing a second flushing pass before checkpoint
> 09:41:13 Checkpoint started
> 09:41:16 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310093714.trx
> 10:04:13 Write load high relative to disk write throughput. Flushing at 5.2 MB/s while application is making dirty pages at 1.6 MB/s. Doing a second flushing pass before checkpoint
> 10:05:38 Checkpoint started
> 10:05:52 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310100118.trx
> 10:28:52 Write load high relative to disk write throughput. Flushing at 5.1 MB/s while application is making dirty pages at 1.8 MB/s. Doing a second flushing pass before checkpoint
> 10:30:31 Checkpoint started
> 10:30:34 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310102554.trx
> 10:53:32 Write load high relative to disk write throughput. Flushing at 5.2 MB/s while application is making dirty pages at 1.4 MB/s. Doing a second flushing pass before checkpoint
> 10:54:43 Checkpoint started
> 10:55:03 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310105036.trx
> 11:19:29 Checkpoint started
> 11:20:01 Checkpoint finished, new log is /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310111504.trx
>
>
> here is the output of "status()" :
>
> SQL> status();
> REPORT
> VARCHAR
> _______________________________________________________________________________
>
> OpenLink Virtuoso Server
> Version 07.20.3217-pthreads for Linux as of Feb 10 2017
> Started on: 2017-03-09 12:33 GMT+1
>
> Database Status:
> File size 0, 1000960 pages, 247031 free.
> 2720000 buffers, 447219 used, 112398 dirty 4 wired down, repl age 13435443 0 w. io 3 w/crsr.
> Disk Usage: 2212080 reads avg 0 msec, 0% r 0% w last 176 s, 12791013 writes flush 8.82 MB,
> 1221 read ahead, batch = 156. Autocompact 722034 in 631152 out, 12% saved col ac: 7230338 in 3% saved.
> Gate: 5993 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
> Log = /usr/local/virtuoso-opensource/var/lib/virtuoso/db/virtuoso20170310105036.trx, 90073727 bytes
> 558107 pages have been changed since last backup (in checkpoint state)
> Current backup timestamp: 0x0000-0x00-0x00
> Last backup date: unknown
> Clients: 4177045 connects, max 3 concurrent
> RPC: 25061533 calls, -4177308 pending, 2 max until now, 0 queued, 37 burst reads (0%), 0 second 5M large, 298M max
> Checkpoint Remap 132107 pages, 0 mapped back. 554 s atomic time.
> DB master 1000960 total 247030 free 132107 remap 44169 mapped back
> temp 165120 total 160375 free
>
> Lock Status: 0 deadlocks of which 0 2r1w, 28 waits,
> Currently 2 threads running 0 threads waiting 0 threads in vdb.
> Pending:
> 1100: IER 10.34.10.171
> 1: IER 10.34.10.171
>
> Client 1111:4175445: Account: dba, 364 bytes in, 359 bytes out, 1 stmts.
> PID: 25646, OS: unix, Application: unknown, IP#: 127.0.0.1
> Transaction status: PENDING, 1 threads.
> Locks:
>
> Client 1111:4177046: Account: ABES, 2728 bytes in, 361 bytes out, 2 stmts.
> Transaction status: PENDING, 0 threads.
> Locks:
>
>
> Running Statements:
> Time (msec) Text
> 8 sparql DEFINE sql:log-enable 2 INSERT DATA INTO GRAPH <http://hub.abes.fr/refere
> 76 status()
>
>
> Hash indexes
>
>
> 44 Rows. -- 77 msec.
>
>
>
> On 10/03/2017 at 02:03, Hugh Williams wrote:
>> Hi Thomas,
>>
>> What is this JDBC Connector from Oracle that is being used for the inserts
>> in RDF/XML form ?
> Oracle 12.1 brings it's own jdk 1.6.0_37, so if i'm right ojdbc6.jar Thin
> Driver or OCI Driver :
>
> "Oracle JDBC Drivers release 12.1.0.1.0 production Readme.txt :
> Driver Versions
> ---------------
>
> These are the driver versions in the 12R1 release:
>
> - JDBC Thin Driver 12R1
> 100% Java client-side JDBC driver for use in client applications,
> middle-tier servers and applets.
>
> - JDBC OCI Driver 12R1
> Client-side JDBC driver for use on a machine where OCI 12R1
> is installed.
>
> - JDBC Thin Server-side Driver 12R1
> JDBC driver for use in Java program in the database to access
> remote Oracle databases.
>
> - JDBC Server-side Internal Driver 12R1
> Server-side JDBC driver for use by Java Stored procedures. This
> driver used to be called the "JDBC Kprb Driver".
>
>
>
>>
>> What is the ORCID dataset being used as the only one I see is in N-Triple
>> format from 2014 at:
>>
>> https://datahub.io/dataset/orcid_2014_dataset
> will ask for this
>>
>> Performing inserts with transaction would consume more memory maintaining
>> the transaction than with log_enable(2) which auto commits without
>> transaction logging in memory.
> is it possible to have autocommit enabled the way we perform sparql INSERTs ?
> we used DEFINE sql:log-enable 2 in the query
>>
>> The O_DIRECT param set in your INI file is an old param for which no real
>> benefit has been seen on current OS’es and on a Linux system setting
>> swappiness as detailed at:
>>
>>
>> https://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFPerformanceTuning#Linux-only%20--%20”swappiness"
>>
>> Would give better results.
> ok, known this, thought it was done but raised back to 30 after check, will
> find a way to fix it @ 10.
>
>>
>> There is also no real need to set ColumnStore = 1 as for as the RDF_QUAD
>> tables is column store by default in Virtuoso 7 , so that setting would only
>> have effect on default SQL table creation
>>
>> If you still have problems, can you provide a copy of your virtuoso.log file
>> and the output of the “status();” command for review ...
>>
>> Best Regards
>> Hugh Williams
>> Professional Services
>> OpenLink Software, Inc. // http://www.openlinksw.com/
>>
>>
>>
>>> On 9 Mar 2017, at 17:28, Thomas Michaux <mich...@abes.fr> wrote:
>>>
>>> Hello,
>>>
>>> We are loading ORCID 2016 in a V7 instance (Version 07.20.3217-pthreads for
>>> Linux as of Feb 10 2017), we DO NOT want to use the bulk loader, instead we
>>> are providing SPARQL inserts of RDF/XML files via JDBC connector from
>>> Oracle.
>>>
>>> Virtuoso is hosted on 8 cores, 32Gb platform.
>>>
>>> We successfully inserted 75 633 079 triples until virtuoso.log signals
>>> performances problems on "disk write throughput", is there something else
>>> to optimize in the virtuoso.ini while we are in this "loading" phase (no
>>> SPARQL "read" query from clients at the moment ) ?
>>>
>>> We've already done :
>>>
>>> - full text indexation has been delayed ( DB.DBA.VT_BATCH_UPDATE ( 'DB.DBA.RDF_OBJ', 'ON', 8640 ); )
>>> - MaxCheckpointRemap = 505856 ( it's larger than 25% of total pages)
>>> - UnremapQuota = 0
>>> - DefaultIsolation = 2
>>> - O_DIRECT = 1 (we are on XFS filesystem)
>>> - ColumnStore = 1 (we started from a new, fresh .db, deleted all previous existing .db, .trx)
>>>
>>> Can we do something at transaction level ? We commit each JDBC insert as
>>> short as possible (1 insert-> 1 commit), query is :
>>>
>>> "'sparql DEFINE sql:log-enable 2 INSERT DATA INTO GRAPH '||graphe ||' { '|| var_clob_line|| ' }'"
>>>
>>> I can see that free memory slowly decrease, and finally the server hang.
>>>
>>> Thanks for your help ! (Attached is virtuoso.ini)
>>>
>>> Thomas
>>> <virtuoso.ini>
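
For reference, the two figures from the quoted status() output that the reply at the top comments on (the connection count and the buffer usage) can be turned into rates with some simple arithmetic. The sketch below is only an illustration. The function names are made up for this example, and the time at which status() was run is not stated anywhere in the thread, so STATUS_TAKEN is an assumption (about 11:00 local time on 10 March, based on the current .trx file name and the surrounding checkpoint log). The 8 KB size per buffer is the usual Virtuoso page size and is likewise treated as an assumption here.

```python
from datetime import datetime

# Figures copied from the status() output quoted in this thread.
STARTED = datetime(2017, 3, 9, 12, 33)   # "Started on: 2017-03-09 12:33 GMT+1"
CONNECTS = 4_177_045                     # "Clients: 4177045 connects"
BUFFERS_TOTAL = 2_720_000                # "2720000 buffers"
BUFFERS_USED = 447_219                   # "447219 used"

# ASSUMPTION: the thread does not say when status() was run; about 11:00
# local time on 10 March is inferred from the current .trx file name.
STATUS_TAKEN = datetime(2017, 3, 10, 11, 0)

# ASSUMPTION: each buffer caches one 8 KB database page.
PAGE_BYTES = 8 * 1024


def connects_per_second(connects, started, sampled):
    """Average number of new client connections per second since startup."""
    elapsed = (sampled - started).total_seconds()
    if elapsed <= 0:
        raise ValueError("sample time must be after the server start time")
    return connects / elapsed


def buffer_fraction_used(total, used):
    """Fraction of the configured buffers currently holding pages."""
    return used / total


def buffers_to_gib(buffers, page_bytes=PAGE_BYTES):
    """Memory covered by a number of buffers, in GiB."""
    return buffers * page_bytes / 2**30


if __name__ == "__main__":
    rate = connects_per_second(CONNECTS, STARTED, STATUS_TAKEN)
    used = buffer_fraction_used(BUFFERS_TOTAL, BUFFERS_USED)
    print(f"average new connections per second: {rate:.1f}")
    print(f"buffers in use: {used:.1%}")
    print(f"buffer pool size: {buffers_to_gib(BUFFERS_TOTAL):.1f} GiB")
```

Under those assumptions the server has been accepting on the order of 50 new connections per second, which would be consistent with (but does not prove) a new connection being opened for each insert, and only about a sixth of the buffer pool is in use.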
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users