Re: [Virtuoso-users] deadlocks in bulk loading using rdf_loader_run function
Is it possible to load files into staging tables first, then merge them into RDF_QUAD?

Best,
Gang

On Fri, Dec 18, 2015 at 2:34 PM, Gang Fu <gangfu1...@gmail.com> wrote:
> Hi,
>
> I have a deadlock problem with the function 'rdf_loader_run' running in
> parallel, and I have tried many ways to avoid deadlocks, including:
>
> 1) Preprocessing the files using sort and uniq
> 2) Loading without indexes
> 3) Splitting big files into very small chunks
>
> But the deadlock problem is still there. It seems the only way to avoid this
> is loading in a single thread, which is very slow. Any other solutions?
>
> Best,
> Gang

--
___
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
[Virtuoso-users] deadlocks in bulk loading using rdf_loader_run function
Hi,

I have a deadlock problem with the function 'rdf_loader_run' running in parallel, and I have tried many ways to avoid deadlocks, including:

1) Preprocessing the files using sort and uniq
2) Loading without indexes
3) Splitting big files into very small chunks

But the deadlock problem is still there. It seems the only way to avoid this is loading in a single thread, which is very slow. Any other solutions?

Best,
Gang
[Virtuoso-users] virtuoso.db file can not start the server
Hi,

We copied the virtuoso.db file from one server to another and started the server using just the db file, but we got the following error message in the virtuoso.log file. Can anyone help? Thank you very much!

Best,
Gang

16:00:50 OpenLink Virtuoso Universal Server
16:00:50 Version 07.20.3212-pthreads for Linux as of Mar 26 2015
16:00:50 uses parts of OpenSSL, PCRE, Html Tidy
16:00:50 Database version 3126
16:00:50 SQL Optimizer enabled (max 1000 layouts)
16:00:51 Compiler unit is timed at 0.000155 msec
16:00:57 built-in procedure "repl_undot_name" overruled by the RDBMS
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x924b48]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x924bb6]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x4cf137]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x4cf536]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x4cfc7b]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x562976]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x86fca5]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x870c4b]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x678df5]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x67afe3]
16:00:57 /opt/virtuoso/bin/virtuoso-t(insert_node_run+0x3d7) [0x615467]
16:00:57 /opt/virtuoso/bin/virtuoso-t(insert_node_input+0x9e) [0x6157fe]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5ea6d1]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5ea7d7]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e7a10]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x614252]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x618aca]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e5f26]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e79e3]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x614252]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x618aca]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e5f26]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e79e3]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x614252]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x61a089]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x61ab5c]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x4b3779]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x8c5e9c]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x61d86d]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x625d57]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x457e67]
16:00:57 /lib64/libc.so.6(__libc_start_main+0xfd) [0x319581ed5d]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x452ad9]
16:00:57 GPF: extent.c:848 an extent was made for a range already taken by other ext
[Virtuoso-users] cannot start server after delete
Hi,

I was running a SPARQL delete and stopped the job. Then, when I tried to start the server, I got this error message:

Starting virtuoso: The VDBMS server process terminated prematurely after checkpointing.

The server cannot be started anymore. I checked the log file and found this:

Thu May 07 2015
14:34:44 OpenLink Virtuoso Universal Server
14:34:44 Version 07.20.3212-pthreads for Linux as of Mar 26 2015
14:34:44 uses parts of OpenSSL, PCRE, Html Tidy
14:34:50 Database version 3126
14:34:50 SQL Optimizer enabled (max 1000 layouts)
14:34:51 Compiler unit is timed at 0.000171 msec
14:34:57 built-in procedure repl_undot_name overruled by the RDBMS
14:34:57 Roll forward started
14:34:58 86 transactions, 9940 bytes replayed (100 %)
14:34:58 Roll forward complete
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x924b48]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x924bb6]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x496660]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x4ff9da]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x4ffd7e]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x5000d1]
14:34:59 /opt/virtuoso/bin/virtuoso-t(ac_aq_func+0x13f) [0x50076f]
14:34:59 /opt/virtuoso/bin/virtuoso-t(aq_thread_func+0x1df) [0x45b0ff]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x92efdf]
14:34:59 /lib64/libpthread.so.0() [0x38db6079d1]
14:34:59 /lib64/libc.so.6(clone+0x6d) [0x38daae88fd]
14:34:59 GPF: colins.c:3412 uneven length cols after insert

Do you know why this happens, and how to recover from it? Thank you very much!

Best,
Gang
[Virtuoso-users] exact string match
Hi,

I want to do fast exact string matching over hundreds of millions of text strings. SPARQL filter regex is apparently too slow. I found that bif:contains is very fast, and in Virtuoso 7 we do not need to do anything after installation: bif:contains works out of the box. However, bif:contains only does substring matching, so most of the time I get many hits back. I am wondering how I can do very fast, case-insensitive exact string matching.

I have searched for other bif functions, and there are a couple of them: bif:strstr, bif:strcasestr, bif:starts_with, bif:ends_with. However, none of them work after the initial installation of Virtuoso 7. Do I need to run the following stored procedures to get them to work?

DB.DBA.RDF_OBJ_FT_RULE_ADD (null, null, 'All');
DB.DBA.VT_INC_INDEX_DB_DBA_RDF_OBJ ();
DB.DBA.VT_INDEX_DB_DBA_RDF_OBJ ();

Best,
Gang
Re: [Virtuoso-users] best way to update large RDF stores with triples of a large document
Thank you very much, Hugh!

You are right, we are going to update the large data set on a weekly basis, and the updates are on the scale of a couple of million triples, or fewer.

The bulk loader with the delete option sounds good to me, but only nquad files are allowed. Our input files are in ttl, dumped from a SQL database. Preparing another set of dump scripts is not good, and converting ttl to nquad requires an extra step in the pipeline. Is there a way to do the RDF loader 'with delete' with ttl files? Otherwise, I think SPARQL delete is better for us, since no extra effort is needed.

On Wed, Apr 8, 2015 at 12:59 PM, Hugh Williams <hwilli...@openlinksw.com> wrote:
> Hi Gang,
>
> To be clear, when you say "I want to update a large RDF store with 10 billions triples once a week", I presume you are *NOT* loading 10 billion new triples every week, but rather the base 10 billion triples are to be updated, with triples/graphs being inserted/deleted/updated, so the overall number of triples does not increase (or decrease) on that scale?
>
> If these updates are in the form of documents (i.e. datasets), and if they are in, or can be converted to, nquad format to meet the requirements of the Virtuoso RDF Bulk Loader with_delete [1] option, then this would be the fastest and most efficient way to do this, I would say ...
>
> [1] http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFBulkLoaderWithDelete
>
> Best Regards
> Hugh Williams
> Professional Services
> OpenLink Software, Inc. // http://www.openlinksw.com/
> Weblog -- http://www.openlinksw.com/blogs/
> LinkedIn -- http://www.linkedin.com/company/openlink-software/
> Twitter -- http://twitter.com/OpenLink
> Google+ -- http://plus.google.com/100570109519069333827/
> Facebook -- http://www.facebook.com/OpenLinkSoftware
> Universal Data Access, Integration, and Management Technology Providers
>
> On 8 Apr 2015, at 12:27, Gang Fu <gangfu1...@gmail.com> wrote:
>> Will using isql, JDBC, or HTTP make any difference?
>>
>> On Wed, Apr 8, 2015 at 7:25 AM, Gang Fu <gangfu1...@gmail.com> wrote:
>>> There are millions of triples to be updated on a weekly basis.
>>>
>>> On Wed, Apr 8, 2015 at 7:24 AM, Gang Fu <gangfu1...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I want to update a large RDF store with 10 billion triples once a week. The triples to be inserted or deleted are saved in documents. There are no variable bindings or blank nodes in the documents, so I guess the best-fitting SPARQL update functions are insert data/delete data.
>>>>
>>>> What is the best way to do this? Using a JDBC connection pool or HTTP? Using 'modify graph graph-iri insert/delete', or insert/delete data? Is it possible to run concurrent update jobs?
>>>>
>>>> Best,
>>>> Gang
Re: [Virtuoso-users] virtuoso striping
Thank you very much, Morty!

You are right, 'split' plus 'cat' is a better option, since the server can start immediately with the rebuilt db file.

Is there a way to test whether a stored procedure exists? I have another ticket about this question, but I have not gotten any reply there yet :)

On Wed, Apr 15, 2015 at 3:51 PM, Morty <morty+virtu...@frakir.org> wrote:
> Yes, you can split a large file into many small files. At the colo, you can put them back together again. The command to put them back together is cat. The join command does something else, so you don't want to try to use it. NB: this is actually what the cat command is for. cat is short for concatenate. Although it's rarely used for this purpose! ;)
>
> Alternatively, there are options for rsync that turn off the checksum stuff. So if a file transfer gets interrupted, it picks up right where it left off. You can then do file verification outside the scope of rsync, e.g. by doing sha1sum on both sides and comparing the results.
>
> Contact your local sysadmins for assistance with either of these options. :)
>
> - Morty
>
> On Wed, Apr 15, 2015 at 02:50:31PM -0400, Gang Fu wrote:
>> We want to transfer the files to another location, 'colo', for disaster recovery. The long-distance transfer is time-consuming and may sometimes fail. We are using rsync, and we believe that rsyncing one 500 GB file versus many small files does make a difference: rsync does a checksum validation before transfer, so if a large portion of the small files have the same checksum, we only need to transfer a small part of them.
>>
>> Can we just 'split' and 'join' the db files before and after transferring?
>>
>> Best,
>> Gang
>>
>> On Wed, Apr 15, 2015 at 1:17 PM, Morty <morty+virtu...@frakir.org> wrote:
>>> On Tue, Apr 14, 2015 at 12:24:22PM -0400, Gang Fu wrote:
>>>> We want to copy a large virtuoso db from one server to another in a different location. We cannot copy a single 500 GB db file, which is slow and unstable. So we want to break the db file into different segments. I have tried virtuoso striping: each segment has 20 GB, and in total we have over 25 segments.
>>>
>>> What issue are you seeing with transferring a 500GB file? Transferring one 500GB file should not be significantly slower than transferring 25x 20GB files.
>>>
>>> If you are concerned about a transfer interruption, you could use rsync. rsync has options to resume a failed transfer. Alternatively, you could use the Linux/Unix split command to split the one large file into a bunch of smaller files. Or you could use the commercial version of virtuoso with built-in replication.
>>>
>>> - Morty
>
> --
> Mordechai T. Abzug
> Linux red-sonja 3.11.0-24-generic #42-Ubuntu SMP Fri Jul 4 21:19:31 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> A verbal contract isn't worth the paper it's written on. - Samuel Goldwyn
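The split-and-cat approach Morty describes, with sha1sum verification on both sides, might look roughly like the sketch below. It is only an illustration: the 1 MiB random file stands in for the real virtuoso.db, the 256k piece size and the `virtuoso.db.part.` prefix are made-up values, and it assumes GNU coreutils on Linux. The sketch also assumes the server has been shut down cleanly before the database file is copied.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)

# Stand-in for virtuoso.db: 1 MiB of random bytes (the real file is ~500 GB).
head -c 1048576 /dev/urandom > "$work/virtuoso.db"

# Sender side: cut the file into fixed-size pieces and record a checksum.
# 256k is only for the demo; a size such as 20G would suit the real file.
split -b 256k "$work/virtuoso.db" "$work/virtuoso.db.part."
( cd "$work" && sha1sum virtuoso.db > virtuoso.db.sha1 )

# ... transfer the part files and virtuoso.db.sha1 to the other site ...

# Receiver side: the shell glob sorts the suffixes (aa, ab, ac, ...),
# so cat concatenates the pieces in their original order.
mkdir "$work/restored"
cat "$work"/virtuoso.db.part.* > "$work/restored/virtuoso.db"
cp "$work/virtuoso.db.sha1" "$work/restored/"

# Verify the rebuilt file against the checksum taken before the transfer.
( cd "$work/restored" && sha1sum -c virtuoso.db.sha1 )
```

Because each piece is an ordinary file, an interrupted transfer only needs to resend the pieces that did not arrive intact, and the final sha1sum check covers the reassembled file as a whole.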
Re: [Virtuoso-users] check if a stored procedure exists
Hi,

I just want to ping this question, in case it was missed.

Best,
Gang

On Sat, Apr 11, 2015 at 8:45 AM, Gang Fu <gangfu1...@gmail.com> wrote:
> Hi,
>
> How can we check whether a stored procedure exists before we drop it? We need to drop the stored procedure before we create it, otherwise there will be some issues; but if we drop a stored procedure that does not exist, we get an error. So we need to check for existence before dropping it.
>
> Thank you very much!
>
> Best,
> Gang
Re: [Virtuoso-users] virtuoso striping
By the way, I have tried compressing the big db file using gzip, and I get a compression ratio of over 1:4, so I think there is still a lot of space in the db file that is not essential. Is there a function in Virtuoso for post-load optimization that reduces the db file size, which may in turn boost performance as well?

Best,
Gang

On Fri, Apr 17, 2015 at 8:32 AM, Gang Fu <gangfu1...@gmail.com> wrote:
> Thank you very much, Morty!
>
> You are right, 'split' plus 'cat' is a better option, since the server can start immediately with the rebuilt db file.
>
> Is there a way to test whether a stored procedure exists? I have another ticket about this question, but I have not gotten any reply there yet :)
Re: [Virtuoso-users] virtuoso striping
Thank you very much, Hugh!

Will restoring from the online backup series take some time, or is it nearly instantaneous? Our system team is concerned: if the restore takes a long time for a large database (~10 billion triples), we do not have enough redundancy to guarantee the service.

Best,
Gang

On Tue, Apr 14, 2015 at 12:46 PM, Hugh Williams <hwilli...@openlinksw.com> wrote:
> Hi Gang
>
> Perform a Virtuoso online backup, which will split the database into files of manageable size that can then be copied to and restored at the new location, as detailed at:
>
> http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#backup
>
> Best Regards
> Hugh Williams
> Professional Services
> OpenLink Software, Inc. // http://www.openlinksw.com/
>
> On 14 Apr 2015, at 17:24, Gang Fu <gangfu1...@gmail.com> wrote:
>> Hi,
>>
>> We want to copy a large virtuoso db from one server to another in a different location. We cannot copy a single 500 GB db file, which is slow and unstable. So we want to break the db file into different segments. I have tried virtuoso striping: each segment has 20 GB, and in total we have over 25 segments.
>>
>> Then a new issue came up: the server starts very slowly with multiple segments. It may take half an hour to start after copying those segment files to another server. Any explanation? How can we avoid the long start time? We want to implement a db toggle mechanism, so we want the server to start immediately, as it does without striping.
>>
>> Best,
>> Gang
Re: [Virtuoso-users] virtuoso striping
We want to transfer the files to another location, 'colo', for disaster recovery. The long-distance transfer is time-consuming and may sometimes fail. We are using rsync, and we believe that rsyncing one 500 GB file versus many small files does make a difference: rsync does a checksum validation before transfer, so if a large portion of the small files have the same checksum, we only need to transfer a small part of them.

Can we just 'split' and 'join' the db files before and after transferring?

Best,
Gang

On Wed, Apr 15, 2015 at 1:17 PM, Morty <morty+virtu...@frakir.org> wrote:
> On Tue, Apr 14, 2015 at 12:24:22PM -0400, Gang Fu wrote:
>> We want to copy a large virtuoso db from one server to another in a different location. We cannot copy a single 500 GB db file, which is slow and unstable. So we want to break the db file into different segments. I have tried virtuoso striping: each segment has 20 GB, and in total we have over 25 segments.
>
> What issue are you seeing with transferring a 500GB file? Transferring one 500GB file should not be significantly slower than transferring 25x 20GB files.
>
> If you are concerned about a transfer interruption, you could use rsync. rsync has options to resume a failed transfer. Alternatively, you could use the Linux/Unix split command to split the one large file into a bunch of smaller files. Or you could use the commercial version of virtuoso with built-in replication.
>
> - Morty
[Virtuoso-users] virtuoso striping
Hi,

We want to copy a large virtuoso db from one server to another in a different location. We cannot copy a single 500 GB db file, which is slow and unstable. So we want to break the db file into different segments. I have tried virtuoso striping: each segment has 20 GB, and in total we have over 25 segments.

Then a new issue came up: the server starts very slowly with multiple segments. It may take half an hour to start after copying those segment files to another server. Any explanation? How can we avoid the long start time? We want to implement a db toggle mechanism, so we want the server to start immediately, as it does without striping.

Best,
Gang
[Virtuoso-users] check if a stored procedure exists
Hi,

How can we check whether a stored procedure exists before we drop it? We need to drop the stored procedure before we create it, otherwise there will be some issues; but if we drop a stored procedure that does not exist, we get an error. So we need to check for existence before dropping it.

Thank you very much!

Best,
Gang
[Virtuoso-users] best way to update large RDF stores with triples of a large document
Hi,

I want to update a large RDF store with 10 billion triples once a week. The triples to be inserted or deleted are saved in documents. There are no variable bindings or blank nodes in the documents, so I guess the best-fitting SPARQL update functions are insert data/delete data.

What is the best way to do this? Using a JDBC connection pool or HTTP? Using 'modify graph graph-iri insert/delete', or insert/delete data? Is it possible to run concurrent update jobs?

Best,
Gang
Re: [Virtuoso-users] best way to update large RDF stores with triples of a large document
Will using isql, JDBC, or HTTP make any difference?

On Wed, Apr 8, 2015 at 7:25 AM, Gang Fu <gangfu1...@gmail.com> wrote:
> There are millions of triples to be updated on a weekly basis.
>
> On Wed, Apr 8, 2015 at 7:24 AM, Gang Fu <gangfu1...@gmail.com> wrote:
>> Hi,
>>
>> I want to update a large RDF store with 10 billion triples once a week. The triples to be inserted or deleted are saved in documents. There are no variable bindings or blank nodes in the documents, so I guess the best-fitting SPARQL update functions are insert data/delete data.
>>
>> What is the best way to do this? Using a JDBC connection pool or HTTP? Using 'modify graph graph-iri insert/delete', or insert/delete data? Is it possible to run concurrent update jobs?
>>
>> Best,
>> Gang
Re: [Virtuoso-users] best way to update large RDF stores with triples of a large document
There are millions of triples to be updated on a weekly basis.

On Wed, Apr 8, 2015 at 7:24 AM, Gang Fu <gangfu1...@gmail.com> wrote:
> Hi,
>
> I want to update a large RDF store with 10 billion triples once a week. The triples to be inserted or deleted are saved in documents. There are no variable bindings or blank nodes in the documents, so I guess the best-fitting SPARQL update functions are insert data/delete data.
>
> What is the best way to do this? Using a JDBC connection pool or HTTP? Using 'modify graph graph-iri insert/delete', or insert/delete data? Is it possible to run concurrent update jobs?
>
> Best,
> Gang
[Virtuoso-users] create procedure in bash shell
Hi,

In our application, we want to create many different dump functions to dump different subsets of the triple collections from the database. According to the wiki page:

http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFDatasetDump

we can define the dump-one-graph function on the isql command line. Since we will create hundreds of customized dump functions, we do not want to copy and paste them into the isql command line. Instead, we want to prepare a shell script. In the bash shell, we need to take care of single quotes by converting ' to '''. Basically, I copy and paste the create procedure statement into a shell script, surrounded by:

/opt/virtuoso/bin/isql dba password exec='create procedure function'

But I always get this error:

*** Error 37000: [Virtuoso Driver][Virtuoso Server]SQ074: Line 40: at line 0 of Top-Level:

although the create procedure statement itself works fine on the isql command line.

Can anyone help me out? Thank you very much!

Best,
Gang

---
The shell command is as follows:

/opt/virtuoso/bin/isql dba password verbose=on banner=off prompt=off echo=ON errors=stdout \
exec='CREATE PROCEDURE dump_one_graph
  ( IN srcgraph VARCHAR
  , IN out_file VARCHAR
  , IN file_length_limit INTEGER := 10
  )
  {
    DECLARE file_name VARCHAR;
    DECLARE env, ses ANY;
    DECLARE ses_len, max_ses_len, file_len, file_idx INTEGER;
    SET ISOLATION = '''uncommitted''';
    max_ses_len := 1000;
    file_len := 0;
    file_idx := 1;
    file_name := sprintf ('''%s%06d.ttl''', out_file, file_idx);
    string_to_file ( file_name || '''.graph''', srcgraph, -2 );
    string_to_file ( file_name,
                     sprintf ( '''# Dump of graph %s, as of %s\n@base .\n''', srcgraph, CAST (NOW() AS VARCHAR) ),
                     -2 );
    env := vector (dict_new (16000), 0, , , , 0, 0, 0, 0, 0);
    ses := string_output ();
    FOR (SELECT * FROM ( SPARQL DEFINE input:storage
                         SELECT ?s ?p ?o { GRAPH `iri(?:srcgraph)` { ?s ?p ?o } }
                       ) AS sub OPTION (LOOP)) DO
      {
        http_ttl_triple (env, s, p, o, ses);
        ses_len := length (ses);
        IF (ses_len > max_ses_len)
          {
            file_len := file_len + ses_len;
            IF (file_len > file_length_limit)
              {
                http (''' .\n''', ses);
                string_to_file (file_name, ses, -1);
                gz_compress_file (file_name, file_name||'''.gz''');
                file_delete (file_name);
                file_len := 0;
                file_idx := file_idx + 1;
                file_name := sprintf ('''%s%06d.ttl''', out_file, file_idx);
                string_to_file ( file_name,
                                 sprintf ( '''# Dump of graph %s, as of %s (part %d)\n@base .\n''', srcgraph, CAST (NOW() AS VARCHAR), file_idx),
                                 -2 );
                env := VECTOR (dict_new (16000), 0, , , , 0, 0, 0, 0, 0);
              }
            ELSE
              string_to_file (file_name, ses, -1);
            ses := string_output ();
          }
      }
    IF (LENGTH (ses))
      {
        http (''' .\n''', ses);
        string_to_file (file_name, ses, -1);
        gz_compress_file (file_name, file_name||'''.gz''');
        file_delete (file_name);
      }
  }
;'
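One thing worth checking in a script like this is what bash does with the ''' sequences. Inside a single-quoted string, ''' closes the string, adds an empty string, and reopens it, so no literal quote reaches isql. The sketch below demonstrates that and shows one possible workaround: writing the statement to a file through a quoted heredoc, which needs no escaping at all. The procedure `demo_proc`, the port 1111, and the `RUN_ISQL` switch are hypothetical, and feeding the file to isql on standard input is an assumption about the local setup rather than a confirmed fix for the SQ074 error.

```shell
#!/usr/bin/env bash
set -eu

# What bash hands over when ''' appears inside '...': the quotes pair up
# and vanish, so the SQL string literal loses its delimiters.
mangled='SET ISOLATION = '''uncommitted''';'
printf '%s\n' "$mangled"

# A quoted heredoc delimiter ('SQL') disables all shell expansion, so the
# procedure text is written to the file exactly as typed.
sql_file=$(mktemp)
cat > "$sql_file" <<'SQL'
CREATE PROCEDURE demo_proc (IN a INTEGER, IN b INTEGER)
{
  DECLARE total INTEGER;
  SET ISOLATION = 'uncommitted';
  total := a;
  IF (a > b)
    total := a + b;
  RETURN total;
}
;
SQL

# Hypothetical invocation: port, user and password are placeholders, and
# the call only runs when RUN_ISQL=1 and the binary is present.
ISQL=/opt/virtuoso/bin/isql
if [ -x "$ISQL" ] && [ "${RUN_ISQL:-0}" = 1 ]; then
  "$ISQL" 1111 dba password errors=stdout < "$sql_file"
fi
```

If the statement has to stay inside a single-quoted shell string, the usual bash idiom for a literal single quote is '\'' rather than '''.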
Re: [Virtuoso-users] create procedure in bash shell
I double-checked the error message and found that the plus sign '+' is missing from the statement echoed in it:

file_len := file_len ses_len;

Do we need to escape the plus sign in a shell script? Any comments?
Re: [Virtuoso-users] create procedure in bash shell
Thank you very much, Morty and Hugh!

I just found out that isql has two modes, file mode and command-line mode:

isql dba dba file.sql
isql dba dba exec=command

Best,
Gang

On Tue, Apr 7, 2015 at 10:14 PM, Hugh Williams hwilli...@openlinksw.com wrote:

Hi Gang,

You could also place the procedure to be loaded in a file (dump.sql) and use the “load” command to load it into the database:

$ isql dba dba verbose=on banner=off prompt=off echo=ON errors=stdout exec='load dump.sql'
Connected to OpenLink Virtuoso
Driver: 07.10.3211 OpenLink Virtuoso ODBC Driver
OpenLink Interactive SQL (Virtuoso), version 0.9849b.
Type HELP; for help and EXIT; to exit.
-- Line 1: CREATE PROCEDURE dump_one_graph
  ( IN srcgraph VARCHAR
  , IN out_file VARCHAR
  , IN file_length_limit INTEGER := 10
  )
  {
    DECLARE file_name VARCHAR;
    DECLARE env, ses ANY;
    DECLARE ses_len, max_ses_len, file_len, file_idx INTEGER;
    SET ISOLATION = 'uncommitted';
    max_ses_len := 1000;
    file_len := 0;
    file_idx := 1;
    file_name := sprintf ('%s%06d.ttl', out_file, file_idx);
    string_to_file (file_name || '.graph', srcgraph, -2);
    string_to_file (file_name, sprintf ('# Dump of graph %s, as of %s\n@base .\n', srcgraph, CAST (NOW() AS VARCHAR)), -2);
    env := vector (dict_new (16000), 0, '', '', '', 0, 0, 0, 0, 0);
    ses := string_output ();
    FOR (SELECT * FROM ( SPARQL DEFINE input:storage SELECT ?s ?p ?o { GRAPH `iri(?:srcgraph)` { ?s ?p ?o } } ) AS sub OPTION (LOOP)) DO
      {
        http_ttl_triple (env, s, p, o, ses);
        ses_len := length (ses);
        IF (ses_len > max_ses_len)
          {
            file_len := file_len + ses_len;
            IF (file_len > file_length_limit)
              {
                http (' .\n', ses);
                string_to_file (file_name, ses, -1);
                gz_compress_file (file_name, file_name||'.gz');
                file_delete (file_name);
                file_len := 0;
                file_idx := file_idx + 1;
                file_name := sprintf ('%s%06d.ttl', out_file, file_idx);
                string_to_file (file_name, sprintf ('# Dump of graph %s, as of %s (part %d)\n@base .\n', srcgraph, CAST (NOW() AS VARCHAR), file_idx), -2);
                env := VECTOR (dict_new (16000), 0, '', '', '', 0, 0, 0, 0, 0);
              }
            ELSE
              string_to_file (file_name, ses, -1);
            ses := string_output ();
          }
      }
    IF (LENGTH (ses))
      {
        http (' .\n', ses);
        string_to_file (file_name, ses, -1);
        gz_compress_file (file_name, file_name||'.gz');
        file_delete (file_name);
      }
  }
Done. -- 13 msec.
-- Line 72:
$

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc. // http://www.openlinksw.com/

On 7 Apr 2015, at 22:08, Morty morty+virtu...@frakir.org wrote:

On Tue, Apr 07, 2015 at 04:36:39PM -0400, Gang Fu wrote:

In the bash shell, we need to take care of single quote by converting ' to '''.

Gang -- You can avoid some issues and debug more easily by using isql's run file syntax.

temp_file=`mktemp /somedir/tmpX`
$command_to_generate_procedure > $temp_file
isql dba $dba_password $temp_file

If there are any issues, you can look at the temporary file to debug quoting and the like.

- Morty
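The file-based approach suggested in this thread can be scripted roughly as follows. This is a minimal sketch under stated assumptions, not a tested deployment script: the procedure `demo_proc` is a shortened, hypothetical stand-in for a real dump procedure, the connection arguments `1111 dba dba` are placeholders, the `RUN_ISQL` switch is invented for the example, and passing the file as a positional argument follows the `isql dba dba file.sql` form quoted above. The quoted heredoc delimiter ('SQL') stops bash from expanding or stripping anything, so the single quotes in the SQL are written to the file exactly as typed.

```shell
#!/usr/bin/env bash
set -eu

# Temporary file that will hold the SQL; it is kept so it can be inspected.
sql_file=$(mktemp "${TMPDIR:-/tmp}/dump_proc.XXXXXX")

# Quoted delimiter: the shell performs no expansion inside the heredoc,
# so single quotes, backslashes and % signs need no escaping at all.
cat > "$sql_file" <<'SQL'
CREATE PROCEDURE demo_proc (IN out_file VARCHAR)
{
  DECLARE file_name VARCHAR;
  file_name := sprintf ('%s%06d.ttl', out_file, 1);
  RETURN file_name;
}
;
SQL

# Only call isql when explicitly requested (RUN_ISQL is a made-up switch).
if [ "${RUN_ISQL:-0}" = "1" ]; then
  isql 1111 dba dba "$sql_file"
else
  echo "SQL written to $sql_file (set RUN_ISQL=1 to load it with isql)"
fi
```

Keeping the generated file around, as Morty notes, makes it possible to open it and see exactly what the server would receive.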
Re: [Virtuoso-users] vhost_define vsp_user and real user
Hi Rumi,

I totally understand your point. My question is about the 'vsp_user' (also called 'vsp_host') used to expose the SPARQL endpoint. Our system security team is concerned about the 'vsp_user': they are not sure what it is used for or how to configure it. Basically, they are not familiar with VSP. I cannot explain it well to them, and they want to audit the user permissions for the /sparql endpoint. I have explained that the default user for the /sparql endpoint is 'SPARQL' and that it is read-only. But there is no way to audit that: if some configuration is changed later, they want to know whether the endpoint is still read-only. I found that the system table 'http_path' tells you the 'vsp_host' for an 'lpath', but not the user or the user's roles.

Best,
Gang

On Thu, Feb 5, 2015 at 10:17 AM, Rumi rtsek...@openlinksw.com wrote:

Hi Gang Fu,

On 05-Feb-15 3:03 PM, Gang Fu wrote:

Hi Rumi, Using vhost_define() through isql we can achieve the same thing:

DB.DBA.VHOST_DEFINE (
  lhost=>ip:port,
  vhost=>name,
  lpath=>'/sparql',
  ppath=>'/!sparql/',
  is_dav=>1,
  vsp_user=>'dba',
  ses_vars=>0,
  sec=>'digest',
  auth_fn=>'DB.DBA.HP_AUTH_SPARQL_USER',
  realm=>'SPARQL',
  opts=>vector('noinherit', 1, 'exec_as_get', 1),
  is_default_host=>0
);

It is password protected, but it is read+write, even though I have: 'exec_as_get', 1

Right, so it is not clear to me exactly what you want to do in this case. Which user are you logging in as? It seems your user has read+write permissions, i.e. you need to try logging in as a user with read permissions only. A simple scenario that demonstrates this:

1) Create 2 users, one that can update and one that can only select:

SQL> DB.DBA.USER_CREATE ('ana', 'ana');
Done. -- 0 msec.
SQL> DB.DBA.USER_CREATE ('brad', 'brad');
Done. -- 0 msec.
Done. -- 16 msec.
SQL> GRANT SPARQL_UPDATE to ana;
Done. -- 0 msec.
SQL> GRANT SPARQL_SELECT to brad;
Done. -- 0 msec.

So ana can update, brad can only select.

2) Then, from the default /sparql-auth endpoint, if I log in as brad and:

-- 1) attempt to insert data:

INSERT INTO GRAPH http://NewBookStore.com http://NewBookStore.com { ?book ?p ?v }

it fails with this error:

SPARQL Update was denied to brad

-- 2) attempt to clear data:

sparql clear graph urn:example:com;

it also fails, with the error:

Error SR186: No permission to execute procedure DB.DBA.SPARUL_CLEAR with user ID 127, group ID 127

Which is correct, since brad can only select data and has no update (read-write) permissions. The question is: what permissions does the user you are using have?

Best Regards,
Rumi Kocis
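The two-user scenario from Rumi's reply can be generated from a shell script rather than typed at the SQL prompt. The sketch below only builds the SQL text and does not contact a server. The function name `make_sparql_user_sql` is hypothetical, the account names are the ones from the example, and reusing the user name as the password merely mirrors that example and is not a recommendation. The statements themselves (`DB.DBA.USER_CREATE`, `GRANT SPARQL_SELECT`, `GRANT SPARQL_UPDATE`) are the ones shown in the thread.

```shell
#!/usr/bin/env bash
# Print the SQL that creates one account and grants it a single SPARQL role.
# $1 = user name, $2 = role (SPARQL_SELECT for read-only, SPARQL_UPDATE for read-write)
make_sparql_user_sql() {
  local user=$1 role=$2
  case $role in
    SPARQL_SELECT|SPARQL_UPDATE) ;;
    *) echo "unsupported role: $role" >&2; return 1 ;;
  esac
  # Password equals the user name here only because the thread's example does so.
  printf "DB.DBA.USER_CREATE ('%s', '%s');\n" "$user" "$user"
  printf 'GRANT %s TO %s;\n' "$role" "$user"
}

make_sparql_user_sql ana  SPARQL_UPDATE
make_sparql_user_sql brad SPARQL_SELECT
```

Redirecting the output to a file gives a script that could then be loaded with isql, and the same file doubles as a record of which account was granted which role.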
Re: [Virtuoso-users] vhost_define vsp_user and real user
Hi Rumi,

Using vhost_define() through isql we can achieve the same thing:

DB.DBA.VHOST_DEFINE (
  lhost=>ip:port,
  vhost=>name,
  lpath=>'/sparql',
  ppath=>'/!sparql/',
  is_dav=>1,
  vsp_user=>'dba',
  ses_vars=>0,
  sec=>'digest',
  auth_fn=>'DB.DBA.HP_AUTH_SPARQL_USER',
  realm=>'SPARQL',
  opts=>vector('noinherit', 1, 'exec_as_get', 1),
  is_default_host=>0
);

It is password protected, but it is read+write, even though I have: 'exec_as_get', 1

Best,
Gang

On Wed, Feb 4, 2015 at 6:57 AM, Rumi rtsek...@openlinksw.com wrote:

Hi Gang Fu,

On 04-Feb-15 2:22 AM, Gang Fu wrote:

Hi Rumi, I have tried to expose a password-protected sparql endpoint, actually it can be done using vhost_define() function as well, just add sec='digest' and authentication function. But the vsp_user to expose a password-protected sparql endpoint is still dba.

By default /sparql-auth is protected, so what you can try is:

1. Export the /sparql-auth definition from Conductor -> Web Application Server -> Virtual Domains & Directories.

2. In the generated script, replace /sparql-auth with /sparql.
   Note: the vsp_user is dba, but in the next step you can change a connection setting in the authentication function so that it uses your user.

3. In the authentication function DB.DBA.HP_AUTH_SPARQL_USER (sparql_io.sql) there is, at line 2935:

   user_id := connection_get ('SPARQLUserId', 'SPARQL');

   Change it so that it uses your user, and re-execute the function definition so that the change takes effect.

4. Execute the changed script from step 2 from Conductor or iSQL.

Please let me know if that worked for you.
Re: [Virtuoso-users] vhost_define vsp_user and real user
Hi Rumi,

I have also tried:

grant execute on DB.DBA.SPARUL_LOAD_SERVICE_DATA to SPARQL;

but user SPARQL still cannot be used as the vsp_user to expose a SPARQL endpoint. I got:

404 page not found
Resource /sparql not found. Access to page is forbidden

Best,
Gang

On Tue, Feb 3, 2015 at 8:10 AM, Rumi rtsek...@openlinksw.com wrote:

Hi Gang Fu,

On 03-Feb-15 1:15 PM, Gang Fu wrote:

Hi, I am using function vhost_define() to expose a read-only SPARQL endpoint through another port (different from 8890) for security reasons. I have two questions:

1) How can I expose a SPARQL endpoint using an account other than 'dba'? I have tried using vsp_user='SPARQL', but I got a '404 cannot access' error when I tried the URL. I also set the 'executable' option to 'yes' in opts; this option seems to allow any VSP user to have execute permission, but it still does not work. I also tried to give user 'SPARQL' the administrator role, but that does not work either.

Please try the steps from this guide:

Secure SPARQL Endpoint via SQL Accounts -- usage path digest authentication
Link: http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtSPARQLProtectSQLDigestAuthentication

Related:
-- Securing SPARQL endpoints: http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtTipsAndTricksGuideSPARQLEndpoints
-- Securing your SPARQL Endpoint via OAuth: http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtOAuthSPARQL
-- Securing your SPARQL Endpoint via WebID: http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSPARQLSecurityWebID

2) How can I find out and configure the user account that the '/sparql' endpoint uses by default? The system table 'DB.DBA.HTTP_PATH' only shows that the vsp_user is 'dba'; it does not show that the default user of that endpoint is 'SPARQL' (ID=106). The documentation says the user is 'SPARQL' for both '/sparql' and '/sparql-graph-crud', but I cannot find any system table for that. Our system team wants to audit that information.

The name 'SPARQL' is a constant in the code of the SPARQL web service endpoint pages (/sparql and /sparql-auth). Another name can be used if the authentication function sets the connection variable 'SPARQLUserId' to that name, for example by placing this call inside the authentication function:

connection_set ('SPARQLUserId', 'SOME_USER_NAME');

What you could try is to grant more roles to the user if needed, such as SPARQL_LOAD_SERVICE_DATA or SPARQL_UPDATE, either by granting them directly to the user or, better, to SPARQL_SELECT, since the endpoint page requires the user to be a member of the SPARQL_SELECT group. That is the minimal practical permission; one can, however, grant more permissions.

Best Regards,
Rumi Kocis
Re: [Virtuoso-users] vhost_define vsp_user and real user
Hi Rumi,

I looked at the source code of libsrc/Wi/sparql_io.sql for the procedure WS.WS./!sparql/:

create procedure WS.WS./!sparql/ (inout path varchar, inout params any, inout lines any)

I am not sure whether the user SPARQL for the /sparql endpoint is set by default here:

user_id := connection_get ('SPARQLUserId', 'SPARQL');
set_user_id (user_id, 1);

I have tried to grant SPARQL_UPDATE to user SPARQL; after that, the /sparql endpoint is no longer read-only. And when I tried to grant another role, I got:

The object SPARQL_LOAD_SERVICE_DATA does not exist.

But it still does not allow me to expose the /sparql endpoint using vsp_user SPARQL. What I am really interested in is how to expose a SPARQL endpoint using VSP users other than dba.

Best,
Gang
[Virtuoso-users] vhost_define vsp_user and real user
Hi,

I am using function vhost_define() to expose a read-only SPARQL endpoint through another port (different from 8890) for security reasons. I have two questions:

1) How can I expose a SPARQL endpoint using an account other than 'dba'? I have tried using vsp_user='SPARQL', but I got a '404 cannot access' error when I tried the URL. I also set the 'executable' option to 'yes' in opts; this option seems to allow any VSP user to have execute permission, but it still does not work. I also tried to give user 'SPARQL' the administrator role, but that does not work either.

2) How can I find out and configure the user account that the '/sparql' endpoint uses by default? The system table 'DB.DBA.HTTP_PATH' only shows that the vsp_user is 'dba'; it does not show that the default user of that endpoint is 'SPARQL' (ID=106). The documentation says the user is 'SPARQL' for both '/sparql' and '/sparql-graph-crud', but I cannot find any system table for that. Our system team wants to audit that information.

Best,
Gang
Re: [Virtuoso-users] vhost_define vsp_user and real user
Hi Rumi,

I have tried to expose a password-protected SPARQL endpoint. It can actually be done using the vhost_define() function as well: just add sec='digest' and an authentication function. But the vsp_user used to expose a password-protected SPARQL endpoint is still dba.

Best,
Gang

On Tue, Feb 3, 2015 at 12:35 PM, Rumi rtsek...@openlinksw.com wrote:

Hi Gang Fu,

On 03-Feb-15 3:47 PM, Gang Fu wrote:

Hi Rumi, I looked at the source code of libsrc/Wi/sparql_io.sql for procedure WS.WS./!sparql/:

create procedure WS.WS./!sparql/ (inout path varchar, inout params any, inout lines any)

I am not sure whether the user as SPARQL for /sparql endpoint are set by default here:

user_id := connection_get ('SPARQLUserId', 'SPARQL');
set_user_id (user_id, 1);

I have tried to grant SPARQL_UPDATE to user SPARQL, then the /sparql endpoint is not read-only. And when I tried to grant another role, I got: The object SPARQL_LOAD_SERVICE_DATA does not exist. But it does not allow me to expose /sparql endpoint using vsp_user SPARQL. What I am really interested in is how to expose sparql endpoint using vsp users other than dba.

Hm, I would say you should grant the roles to another VSP user, since that is what you want to achieve. Is this correct? Right now you have granted them to SPARQL instead? Additionally, did you try the steps from the guide http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtSPARQLProtectSQLDigestAuthentication ?

Best Regards,
Rumi Kocis