Hi Hugh.

To answer your questions:

1. The total number of triples in Virtuoso is *179.686.927*, or at least
that is what the count query below returns when run against the
myURL:8890/sparql endpoint without specifying a default graph IRI. I
assume that, when no default graph IRI is given, Virtuoso treats the union
of all graphs as the default graph. Is this right?

SELECT COUNT(*)
WHERE {
  ?s ?p ?o
}
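
To compare this total with the count for a single graph, the same query can be sent with the standard SPARQL Protocol `default-graph-uri` parameter. The following is only a minimal sketch that builds the request URL and sends nothing; the `http://` scheme and the graph IRI are hypothetical placeholders, and `myURL:8890` is the endpoint host named above.

```python
from urllib.parse import urlencode

ENDPOINT = "http://myURL:8890/sparql"   # endpoint host from this message; scheme assumed
GRAPH_IRI = "http://example.org/graph"  # hypothetical placeholder graph IRI

COUNT_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def count_url(graph_iri=None):
    """Build a SPARQL Protocol GET URL for the count query.

    When graph_iri is None, no default graph is named and the
    endpoint applies its own default-graph behaviour.
    """
    params = {"query": COUNT_QUERY}
    if graph_iri is not None:
        params["default-graph-uri"] = graph_iri
    return f"{ENDPOINT}?{urlencode(params)}"


print(count_url())           # no default graph specified
print(count_url(GRAPH_IRI))  # count restricted to one graph
```

Comparing the two counts would show whether the unqualified query really covers more than one graph.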


2. I cannot interpret much of the status(); output, so I am including the
full response. I re-ran the long transaction and called status(); after
*45 minutes* (normally the process breaks and Virtuoso restarts after an
hour or so). The response was the following:

Database Status:
  File size 0, 11663872 pages, 4599861 free.
  2720000 buffers, 1201742 used, 2 dirty 0 wired down, repl age 0 0 w. io 1 w/crsr.
  Disk Usage: 1202392 reads avg 0 msec, 16% r 0% w last  821 s, 731 writes flush          0 MB,
    10509 read ahead, batch = 110.  Autocompact 0 in 0 out, 0% saved.
Gate:  42826 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
Log = /usr/local/var/lib/virtuoso/db/virtuoso.trx, 4769 bytes
7063527 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 2 connects, max 2 concurrent
RPC: 17 calls, 2 pending, 2 max until now, 0 queued, 5 burst reads (29%), 0 second 4057M large, 4057M max
Checkpoint Remap 38 pages, 0 mapped back. 0 s atomic time.
    DB master 11663872 total 4599861 free 38 remap 2 mapped back
   temp  256 total 251 free

Lock Status: 0 deadlocks of which 0 2r1w, 0 waits,
   Currently 2 threads running 0 threads waiting 0 threads in vdb.
Pending:
  3574016: IER 1:-2
      56: ISR NO OWNER
      52: ISR NO OWNER
      48: ISR NO OWNER
..... (Thousands of similar lines repeating) .......
Client 1111:1:-2:  Account: dba, 742 bytes in, 2516 bytes out, 1 stmts.
PID: 2120, OS: unix, Application: unknown, IP#: 127.0.0.1
Transaction status: PENDING, 1 threads.
Locks: 34868: IE, 3574134: IE, 34971: IE, 3574137: IE, 3574173: IE,
3574080: IE, 3574072: IE, 3574068: IE, 3574172: IE, 3574103: IE, 34933: IE,
....

Client 1111:3:-4:  Account: dba, 471 bytes in, 547444 bytes out, 1 stmts.
PID: 2268, OS: unix, Application: unknown, IP#: 127.0.0.1
Transaction status: PENDING, 1 threads.
Locks:


Running Statements:
 Time (msec) Text
        1707 status()
     2732874 SPARQL DEFINE sql:log-enable 3 INSERT { GRAPH <http://bio2rdf.org/clinicaltrials


Hash indexes


I tried to run status() again at a later stage, but the response was too
large to fit in the isql-v command line window (it was flooded by
repeating lines like the ones noted above).
The "lost connection" message came about an hour after the start of the
whole transaction, with the top command line utility showing the virtuoso
process consuming around 100% CPU and 98% memory.

I therefore set the following parameters in the INI file:
NumberOfBuffers = 1360000
MaxDirtyBuffers = 1000000
ThreadCleanupInterval    = 1
ResourcesCleanupInterval = 1

I restarted Virtuoso and repeated the whole process, and this time it
worked: the SPARUL operation completed successfully after 13 hours.
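
A rough sketch of why halving NumberOfBuffers frees memory for the insert, assuming each buffer caches one 8 KB database page (an assumption; per-buffer overhead is ignored, so real usage is somewhat higher). The "before" value of 2720000 is the buffer count reported by status() above, and 32 GB is the VM's RAM mentioned earlier in the thread.

```python
# Rough estimate of the memory reserved for Virtuoso's buffer pool.
# Assumption: each buffer caches one 8 KB database page; per-buffer
# bookkeeping overhead is ignored, so real usage is somewhat higher.
PAGE_SIZE_BYTES = 8 * 1024
VM_RAM_GIB = 32  # RAM of the VM, as stated earlier in the thread


def buffer_pool_gib(number_of_buffers: int) -> float:
    """Return the approximate buffer pool size in GiB."""
    return number_of_buffers * PAGE_SIZE_BYTES / 1024**3


# "before" is the count shown by status(); "after" is the new INI value.
for label, buffers in [("before", 2720000), ("after", 1360000)]:
    gib = buffer_pool_gib(buffers)
    share = 100 * gib / VM_RAM_GIB
    print(f"{label}: {buffers} buffers ~ {gib:.1f} GiB ({share:.0f}% of RAM)")
```

Under that assumption the buffer pool shrinks from about 20.8 GiB to about 10.4 GiB, which leaves correspondingly more memory for the transaction itself.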

Thank you very much for your help.

Kind regards,
Pantelis Natsiavas




2016-08-11 15:56 GMT+03:00 Hugh Williams <hwilli...@openlinksw.com>:

> Hi Pantelis,
>
> A 152M-triple graph is quite large to perform such an operation on, and it
> will require a significant amount of memory to complete. What is the total
> number of triples in Virtuoso, and how many of the buffers are in use when
> the database is in use? You can see this by running the status(); command
> from isql. You may be able to reduce the “NumberOfBuffers” param for the
> database working set to make more memory available to the system for
> performing such large insert queries.
>
> You should also ensure the following params are set in the INI file to
> ensure unused resource/threads are cleaned up immediately to maximise
> available memory:
>
> [Parameters]
> ....
> ThreadCleanupInterval    = 1
> ResourcesCleanupInterval = 1
> ...
>
> See, http://docs.openlinksw.com/virtuoso/dbadm/
>
> The server log does not show anything useful …
>
> Best Regards
> Hugh Williams
> Professional Services
> OpenLink Software, Inc.      //              http://www.openlinksw.com/
> Weblog   -- http://www.openlinksw.com/blogs/
> LinkedIn -- http://www.linkedin.com/company/openlink-software/
> Twitter  -- http://twitter.com/OpenLink
> Google+  -- http://plus.google.com/100570109519069333827/
> Facebook -- http://www.facebook.com/OpenLinkSoftware
> Universal Data Access, Integration, and Management Technology Providers
>
> > On 10 Aug 2016, at 09:44, Pantelis Natsiavas <natsia...@gmail.com> wrote:
> >
> > Thank you Hugh for your answer.
> >
> > I have actually already used the respective pragma before the SPARUL
> > query.
> >
> > SPARQL DEFINE sql:log-enable 3
> > INSERT ....
> >
> > The documentation of the Virtuoso functions is not clear to me.
> > However, I think I have already done what you suggest. Right?
> >
> > The graph the SPARUL is run against is rather big. It is the
> > "ClinicalTrials.gov" graph of Bio2RDF, namely 152.725.529 triples. My
> > VM has 32 GB of RAM, and the memory consumption of the virtuoso process
> > starts at 40% and constantly increases until the breakdown (I have seen
> > 70% memory consumption, but I cannot be certain about the peak). When
> > the operation breaks down, the memory is not fully released. It drops
> > down to around 40%. Even though the log file shows that Virtuoso
> > restarts, I have to restart the VM in order to get my full memory back.
> >
> > I have repeated the process today and I am attaching the respective
> > part of the log file (I thought it might be useful). Perhaps you can
> > understand more from the log, or you could give me instructions for
> > more detailed logging.
> >
> > Please note that I have already followed the instructions regarding
> > buffering (General Memory Usage Settings) and swapping (Linux-only
> > -- "swappiness") at
> > http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFPerformanceTuning.
> >
> > I really appreciate your help.
> >
> > Kind regards,
> > Pantelis Natsiavas
> >
> > 2016-08-10 4:09 GMT+03:00 Hugh Williams <hwilli...@openlinksw.com>:
> > Hi Pantelis,
> >
> > What is the memory consumption of Virtuoso whilst running the insert
> > query, and what is the size of the graph the insert is being performed
> > against? Note that for SPARUL operations against graphs with large
> > amounts of data, it is recommended that queries be performed in
> > row-wise auto-commit mode, using the Virtuoso log_enable function, to
> > reduce memory consumption. Otherwise memory can be depleted, causing
> > hangs or a server crash if memory cannot be allocated. Details are at:
> >
> >       http://docs.openlinksw.com/virtuoso/fn_log_enable.html
> >       http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtTipsAndTricksGuideDeleteLargeGraphs
> >
> > Best Regards
> > Hugh Williams
> > Professional Services
> > OpenLink Software, Inc.      //              http://www.openlinksw.com/
> >
> >> On 9 Aug 2016, at 12:11, Pantelis Natsiavas <natsia...@gmail.com> wrote:
> >>
> >> I have just retried the above long-running query and checked the
> >> virtuoso.log. I see that after 45 minutes I get the following log
> >> entries (I executed the query around 10.00):
> >>
> >> 10:24:23 * Monitor: High disk read (2)
> >> 10:41:23 * Monitor: High disk read (2)
> >> 10:43:24 * Monitor: High disk read (2)
> >> 10:43:41 mmap failed with 12
> >> 10:43:45 mmap failed with 12
> >> 10:43:45 mmap failed with 12
> >> 10:43:45 mmap failed with 12
> >> 10:43:46 mmap failed with 12
> >> 10:43:46 mmap failed with 12
> >> 10:43:46 GPF: Dkpool.c:1634 could not allocate memory with mmap
> >>
> >> The same messages (more or less) appeared in the previous "lost
> >> connection" incident.
> >>
> >> Could somebody provide some hints?
> >>
> >> Kind regards,
> >> Pantelis Natsiavas
> >>
> >>
> >> 2016-08-09 9:58 GMT+03:00 Pantelis Natsiavas <natsia...@gmail.com>:
> >> Hi.
> >>
> >> I am trying to execute a SPARQL INSERT-WHERE that takes a long time to
> >> complete.
> >>
> >> INSERT {
> >>      GRAPH <a> {
> >>         ?s <hasLabelAfterManualConversion> ?newLabel
> >>      }
> >> }
> >> WHERE {
> >>      GRAPH <a> {
> >>              ?s ?p ?o .
> >>              FILTER (REGEX(STR(?o),"abc","i")) .
> >>              BIND(REPLACE(?o, "abc", "", "i") AS ?newLabel)
> >>      }
> >> };
> >>
> >> When I try to execute it through the web environment, I get a timeout.
> >> I thought that the most appropriate way to do it would be through the
> >> isql-v command line environment. When I try that, after some hours I
> >> get
> >>
> >> *** Error 08S01: [Virtuoso Driver]CL065: Lost connection to server
> >>
> >> This is rather peculiar, as isql-v runs on the same VM as Virtuoso. It
> >> should also be noted that the Virtuoso instance responds and has not
> >> collapsed.
> >>
> >> I have three questions:
> >> 1. Is this "lost connection" behavior in the isql-v environment normal?
> >> 2. Does this "lost connection" message mean that the transaction has
> >> rolled back, or is the command still running?
> >> 3. Is there a better way to run long-time queries against virtuoso?
> >>
> >> Kind regards,
> >> Pantelis Natsiavas
> >>
> >> Virtuoso-users mailing list
> >> Virtuoso-users@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
> >
> >
> > <virtuoso.log>
>
>