Re: [Virtuoso-users] deadlocks in bulk loading using rdf_loader_run function

2015-12-21 Thread Gang Fu
Is it possible to load files into staging tables first, then merge them
into RDF_QUAD?

Best,
Gang

On Fri, Dec 18, 2015 at 2:34 PM, Gang Fu <gangfu1...@gmail.com> wrote:

> Hi,
>
> I have a deadlock problem when running the function 'rdf_loader_run' in
> parallel, and I have tried many ways to avoid the deadlocks, including
>
> 1)  Preprocess files using sort and uniq
>
> 2)  Loading without indexes
>
> 3)  Split big files into very small chunks
>
>
>
> But the deadlock problem is still there. It seems the only way to avoid it
> is to load in a single thread, which is very slow… any other solutions?
>
>
>
> Best,
>
> Gang
>
>
>
--
___
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users


[Virtuoso-users] deadlocks in bulk loading using rdf_loader_run function

2015-12-18 Thread Gang Fu
Hi,

I have a deadlock problem when running the function 'rdf_loader_run' in parallel,
and I have tried many ways to avoid the deadlocks, including

1)  Preprocess files using sort and uniq

2)  Loading without indexes

3)  Split big files into very small chunks



But the deadlock problem is still there. It seems the only way to avoid it
is to load in a single thread, which is very slow… any other solutions?



Best,

Gang


[Virtuoso-users] virtuoso.db file can not start the server

2015-11-03 Thread Gang Fu
Hi,

We copied the virtuoso.db file from one server to another and tried to
start the server using just the db file, but we got the following error
message in the virtuoso.log file. Can anyone help?
Thank you very much!

Best,
Gang
-
16:00:50 OpenLink Virtuoso Universal Server
16:00:50 Version 07.20.3212-pthreads for Linux as of Mar 26 2015
16:00:50 uses parts of OpenSSL, PCRE, Html Tidy
16:00:50 Database version 3126
16:00:50 SQL Optimizer enabled (max 1000 layouts)
16:00:51 Compiler unit is timed at 0.000155 msec
16:00:57 built-in procedure "repl_undot_name" overruled by the RDBMS
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x924b48]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x924bb6]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x4cf137]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x4cf536]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x4cfc7b]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x562976]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x86fca5]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x870c4b]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x678df5]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x67afe3]
16:00:57 /opt/virtuoso/bin/virtuoso-t(insert_node_run+0x3d7) [0x615467]
16:00:57 /opt/virtuoso/bin/virtuoso-t(insert_node_input+0x9e) [0x6157fe]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5ea6d1]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5ea7d7]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e7a10]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x614252]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x618aca]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e5f26]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e79e3]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x614252]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x618aca]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e5f26]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x5e79e3]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x614252]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x61a089]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x61ab5c]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x4b3779]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x8c5e9c]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x61d86d]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x625d57]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x457e67]
16:00:57 /lib64/libc.so.6(__libc_start_main+0xfd) [0x319581ed5d]
16:00:57 /opt/virtuoso/bin/virtuoso-t() [0x452ad9]
16:00:57 GPF: extent.c:848 an extent was made for a range already taken by other ext


[Virtuoso-users] cannot start server after delete

2015-05-07 Thread Gang Fu
Hi,

I was running a SPARQL delete and stopped the job. Then, when I tried to start
the server, I got the error message:
Starting virtuoso: The VDBMS server process terminated prematurely
after checkpointing.

The server cannot be started anymore.
I checked the log file and found this message:

Thu May 07 2015
14:34:44 OpenLink Virtuoso Universal Server
14:34:44 Version 07.20.3212-pthreads for Linux as of Mar 26 2015
14:34:44 uses parts of OpenSSL, PCRE, Html Tidy
14:34:50 Database version 3126
14:34:50 SQL Optimizer enabled (max 1000 layouts)
14:34:51 Compiler unit is timed at 0.000171 msec
14:34:57 built-in procedure repl_undot_name overruled by the RDBMS
14:34:57 Roll forward started
14:34:58 86 transactions, 9940 bytes replayed (100 %)
14:34:58 Roll forward complete
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x924b48]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x924bb6]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x496660]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x4ff9da]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x4ffd7e]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x5000d1]
14:34:59 /opt/virtuoso/bin/virtuoso-t(ac_aq_func+0x13f) [0x50076f]
14:34:59 /opt/virtuoso/bin/virtuoso-t(aq_thread_func+0x1df) [0x45b0ff]
14:34:59 /opt/virtuoso/bin/virtuoso-t() [0x92efdf]
14:34:59 /lib64/libpthread.so.0() [0x38db6079d1]
14:34:59 /lib64/libc.so.6(clone+0x6d) [0x38daae88fd]
14:34:59 GPF: colins.c:3412 uneven length cols after insert

Do you know why this happens, and how to recover from it?

Thank you very much!

Best,
Gang


[Virtuoso-users] exact string match

2015-05-07 Thread Gang Fu
Hi,

I want to do fast exact string matching over hundreds of millions of text
strings. SPARQL FILTER regex is apparently too slow. I found that bif:contains
is very fast, and in Virtuoso 7 it works right after installation without
any extra setup. However, bif:contains only does substring matching, so most
of the time I get many hits back.

I am wondering how I can do very fast, case-insensitive exact string
matching. I have searched for other bif functions and found a couple of
them: bif:strstr, bif:strcasestr, bif:starts_with, bif:ends_with.
However, none of them works after the initial installation of Virtuoso 7. Do I
need to run the stored procedures below to get them to work?

Best,
Gang

DB.DBA.RDF_OBJ_FT_RULE_ADD (null, null, 'All');
DB.DBA.VT_INC_INDEX_DB_DBA_RDF_OBJ ();
DB.DBA.VT_INDEX_DB_DBA_RDF_OBJ ();
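
One pattern that may help, sketched here under assumptions rather than as a confirmed answer: let bif:contains narrow the candidates through the full-text index, then keep only exact, case-insensitive matches with a FILTER. The search term 'aspirin' and the temporary file are hypothetical, and the sketch assumes the literals are plain strings covered by the full-text index. The script only builds and prints the query; sending it to a SPARQL endpoint is left out.

```shell
#!/bin/sh
set -eu

# Hypothetical search term, already lower-cased so the FILTER can
# compare it against the lower-cased literal.
term='aspirin'
query_file=$(mktemp)

# bif:contains narrows the candidates through the full-text index;
# the FILTER keeps only literals whose whole value equals the term,
# ignoring case.
cat > "$query_file" <<EOF
SELECT ?s ?o
WHERE {
  ?s ?p ?o .
  ?o bif:contains "'$term'" .
  FILTER (lcase(str(?o)) = "$term")
}
EOF

cat "$query_file"
```

Whether this is fast enough depends on how selective the bif:contains step is for a given term, since the FILTER runs only on the candidates it returns.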


Re: [Virtuoso-users] best way to update large RDF stores with triples of a large document

2015-04-17 Thread Gang Fu
Thank you very much, Hugh! You are right, we are going to update the large
data set on a weekly basis, and the updates are on the scale of a couple of
million triples, or fewer. The bulk loader with the delete option sounds good
to me, but only nquad files are allowed. Our input files are in ttl,
dumped from a SQL database. Preparing another set of dump scripts
is not good, and converting ttl to nquad requires an extra step in the
pipeline. Is there a way to run the RDF loader 'with delete' on ttl files?
Otherwise, I think SPARQL delete is better for us, since no extra effort
is needed.

On Wed, Apr 8, 2015 at 12:59 PM, Hugh Williams hwilli...@openlinksw.com
wrote:

 Hi Gang,

 To be clear, when you say "I want to update a large RDF store with 10
 billions triples once a week", I presume you are *NOT* loading 10 billion new
 triples every week, but rather the base 10 billion triples are to be updated
 with triples/graphs being inserted/deleted/updated, thus the overall
 number of triples does increase (or decrease) on that scale?

 If these updates are in the form of documents, i.e. datasets, and if they
 are or can be converted to nquad format to meet the requirements of the
 Virtuoso RDF Bulk Loader with_delete [1] option, then this would be the
 fastest and most efficient way to do this, I would say ...

 [1]
 http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFBulkLoaderWithDelete

 Best Regards
 Hugh Williams
 Professional Services
 OpenLink Software, Inc.  //  http://www.openlinksw.com/
 Weblog   -- http://www.openlinksw.com/blogs/
 LinkedIn -- http://www.linkedin.com/company/openlink-software/
 Twitter  -- http://twitter.com/OpenLink
 Google+  -- http://plus.google.com/100570109519069333827/
 Facebook -- http://www.facebook.com/OpenLinkSoftware
 Universal Data Access, Integration, and Management Technology Providers

 On 8 Apr 2015, at 12:27, Gang Fu gangfu1...@gmail.com wrote:

 using isql or jdbc or http will make any difference?

 On Wed, Apr 8, 2015 at 7:25 AM, Gang Fu gangfu1...@gmail.com wrote:

 There are millions of triples to be updated on weekly basis.

 On Wed, Apr 8, 2015 at 7:24 AM, Gang Fu gangfu1...@gmail.com wrote:

 Hi,

 I want to update a large RDF store with 10 billions triples once a week.
 The triples to be inserted or deleted are save in documents.
 There is no variable binding or blank nodes in the documents.
 So I guess the best fit sparql update functions are
 insert data/delete data

 What is the best way to do this?
 Using JDBC connection pool or http?
 Using 'modify graph graph-iri insert/delete', or insert/delete data?
 Is it possible to run concurrent update jobs?


 Best,
 Gang






Re: [Virtuoso-users] virtuoso striping

2015-04-17 Thread Gang Fu
Thank you very much, Morty! You are right, 'split' plus 'cat' is a better
option, since the server can start immediately with the rebuilt db file.

Is there a way to test whether a stored procedure exists? I have another
ticket about this question, but I have not gotten any reply there yet :)

On Wed, Apr 15, 2015 at 3:51 PM, Morty morty+virtu...@frakir.org wrote:

 Yes, you can split a large file into many small files.  At the colo,
 you can put them back together again.  The command to put them back
 together is cat.  The join command does something else, so you
 don't want to try to use it.

 NB: this is actually what the cat command is for.  cat is short for
 concatenate.  Although it's rarely used for this purpose!  ;)

 Alternatively, there are options for rsync that turn off the checksum
 stuff.  So if a file transfer gets interrupted, it picks off right
 where it left off.  You can then do file verification outside the
 scope of rsync, e.g. by doing sha1sum on both sides and comparing the
 results.

 Contact your local sysadmins for assistance with either of these
 options.  :)

 - Morty
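
The split-and-cat round trip described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a tested procedure for a live database: the file name, the temporary directory and the 256 KiB piece size are stand-ins chosen for the example (a real 500 GB file would use much larger pieces), and it assumes the database file is not being written to while it is copied.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for virtuoso.db: 1 MiB of random data in a temp dir.
workdir=$(mktemp -d)
src="$workdir/virtuoso.db"
head -c 1048576 /dev/urandom > "$src"

# Split into fixed-size pieces named virtuoso.db.part.aa, .ab, ...
split -b 262144 "$src" "$src.part."

# ... the pieces would be transferred to the other site here ...

# Reassemble. The shell expands the glob in sorted order, which matches
# the order of the suffixes that split generates.
cat "$src.part."* > "$workdir/rebuilt.db"

# Verify outside of the transfer tool by comparing checksums.
src_sum=$(sha1sum < "$src" | cut -d' ' -f1)
dst_sum=$(sha1sum < "$workdir/rebuilt.db" | cut -d' ' -f1)
if [ "$src_sum" = "$dst_sum" ]; then
  echo "checksums match"
else
  echo "checksum MISMATCH"
fi
```

Comparing checksums on both sides, as suggested above, confirms that the reassembled file is byte-for-byte identical before the server is started on it.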


 On Wed, Apr 15, 2015 at 02:50:31PM -0400, Gang Fu wrote:
  We want to transfer the files to another location, 'colo' for disaster
  recovery. The long distance transfer is time-consuming and may fail
  sometimes.
 
  We are using rsync, and we believe that rsyncing one 500 GB file versus many
  small files does make a difference: rsync does a checksum validation before
  transfer, so if a large portion of the small files have the same checksum,
  we only need to transfer a small portion of them.
 
  Can we just 'split' and 'join' db files before and after transferring?
 
  Best,
  Gang
 
  On Wed, Apr 15, 2015 at 1:17 PM, Morty morty+virtu...@frakir.org
 wrote:
 
   On Tue, Apr 14, 2015 at 12:24:22PM -0400, Gang Fu wrote:
  
We want to copy a large virtuoso db from one server to another in
different location. We cannot copy single 500 GB db file, which is
slow and unstable.  So we want to break the db files in different
segments. I have tried with virtuoso striping: each segment has 20
GB, and in total we have over 25 segments.
  
   What issue are you seeing with transferring a 500GB file?
   Transferring one 500GB file should not be significantly slower than
   transferring 25x 20GB files.
  
   If you are concerned about a transfer interruption, you could use
   rsync.  rsync has options to resume a failed transfer.
  
   Alternatively, you could use the Linux/Unix split command to split
   the one large file into a bunch of smaller files.
  
   Or you could use the commercial version of virtuoso with built-in
   replication.
  
   - Morty
  

 --
Mordechai T. Abzug
 Linux red-sonja 3.11.0-24-generic #42-Ubuntu SMP Fri Jul 4 21:19:31 UTC
 2014 x86_64 x86_64 x86_64 GNU/Linux
 A verbal contract isn't worth the paper it's written on. - Samuel Goldwyn


Re: [Virtuoso-users] check if a stored procedure exists

2015-04-17 Thread Gang Fu
Hi,

I just want to ping this question, in case it was missed.

Best,
Gang

On Sat, Apr 11, 2015 at 8:45 AM, Gang Fu gangfu1...@gmail.com wrote:

 Hi,

 I want to ask how can we check the existence of a stored procedure before
 we drop it?
 We need to drop the stored procedure before we create it, otherwise, there
 will be some issue, but if we drop a stored procedure that does not exist,
 we will get an error. So we need to check existence before drop it.

 Thank you very much!

 Best,
 Gang



Re: [Virtuoso-users] virtuoso striping

2015-04-17 Thread Gang Fu
By the way, I have tried to compress the big db file using gzip, and I can
get over a 1:4 compression ratio, so I think a lot of the space in
the db file is not essential.

Is there a function in Virtuoso for post-loading optimization that can
reduce the db file size, which may in turn boost performance as well?

Best,
Gang

On Fri, Apr 17, 2015 at 8:32 AM, Gang Fu gangfu1...@gmail.com wrote:

 Thank you very much, Morty! You are right, 'split' plus 'cat' is a better
 option, since the server can start immediately with the rebuild db file.

 Is there a way to test whether a stored procedure exists? I have another
 ticket about this question, but I have not gotten any reply yet there :)

 On Wed, Apr 15, 2015 at 3:51 PM, Morty morty+virtu...@frakir.org wrote:

 Yes, you can split a large file into many small files.  At the colo,
 you can put them back together again.  The command to put them back
 together is cat.  The join command does something else, so you
 don't want to try to use it.

 NB: this is actually what the cat command is for.  cat is short for
 concatenate.  Although it's rarely used for this purpose!  ;)

 Alternatively, there are options for rsync that turn off the checksum
 stuff.  So if a file transfer gets interrupted, it picks off right
 where it left off.  You can then do file verification outside the
 scope of rsync, e.g. by doing sha1sum on both sides and comparing the
 results.

 Contact your local sysadmins for assistance with either of these
 options.  :)

 - Morty


 On Wed, Apr 15, 2015 at 02:50:31PM -0400, Gang Fu wrote:
  We want to transfer the files to another location, 'colo' for disaster
  recovery. The long distance transfer is time-consuming and may fail
  sometimes.
 
  We are using rsync, and we believe that rsyncing one 500 GB file versus many
  small files does make a difference: rsync does a checksum validation before
  transfer, so if a large portion of the small files have the same checksum,
  we only need to transfer a small portion of them.
 
  Can we just 'split' and 'join' db files before and after transferring?
 
  Best,
  Gang
 
  On Wed, Apr 15, 2015 at 1:17 PM, Morty morty+virtu...@frakir.org
 wrote:
 
   On Tue, Apr 14, 2015 at 12:24:22PM -0400, Gang Fu wrote:
  
We want to copy a large virtuoso db from one server to another in
different location. We cannot copy single 500 GB db file, which is
slow and unstable.  So we want to break the db files in different
segments. I have tried with virtuoso striping: each segment has 20
GB, and in total we have over 25 segments.
  
   What issue are you seeing with transferring a 500GB file?
   Transferring one 500GB file should not be significantly slower than
   transferring 25x 20GB files.
  
   If you are concerned about a transfer interruption, you could use
   rsync.  rsync has options to resume a failed transfer.
  
   Alternatively, you could use the Linux/Unix split command to split
   the one large file into a bunch of smaller files.
  
   Or you could use the commercial version of virtuoso with built-in
   replication.
  
   - Morty
  

 --
Mordechai T. Abzug
 Linux red-sonja 3.11.0-24-generic #42-Ubuntu SMP Fri Jul 4 21:19:31 UTC
 2014 x86_64 x86_64 x86_64 GNU/Linux
 A verbal contract isn't worth the paper it's written on. - Samuel
 Goldwyn





Re: [Virtuoso-users] virtuoso striping

2015-04-15 Thread Gang Fu
Thank you very much, Hugh!

Will restoring from the online backup series take some time, or is it
nearly instantaneous? Our system team is concerned that if the restore takes
a long time for a large database (~10 billion triples), we will not have
enough redundancy to guarantee the service.

Best,
Gang

On Tue, Apr 14, 2015 at 12:46 PM, Hugh Williams hwilli...@openlinksw.com
wrote:

 Hi Gang

 Perform a Virtuoso online backup, which will split the database into files
 of manageable size, then copy and restore them at the new location, as
 detailed at:

 http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#backup

 Best Regards
 Hugh Williams
 Professional Services
 OpenLink Software, Inc.  //  http://www.openlinksw.com/

 On 14 Apr 2015, at 17:24, Gang Fu gangfu1...@gmail.com wrote:

 Hi,

 We want to copy a large virtuoso db from one server to another in
 different location. We cannot copy single 500 GB db file, which is slow and
 unstable. So we want to break the db files in different segments. I have
 tried with virtuoso striping: each segment has 20 GB, and in total we have
 over 25 segments. Then new issue come out, the server start very slowly
 with multiple segments, it may take half an hour to start after copy those
 segment files to another server.

 Any explanation? How to avoid the long time start? We want to implement
 the db toggle mechanism, so we want the server to start immediately, like
 without striping.

 Best,
 Gang



Re: [Virtuoso-users] virtuoso striping

2015-04-15 Thread Gang Fu
We want to transfer the files to another location, 'colo' for disaster
recovery. The long distance transfer is time-consuming and may fail
sometimes.

We are using rsync, and we believe that rsyncing one 500 GB file versus many
small files does make a difference: rsync does a checksum validation before
transfer, so if a large portion of the small files have the same checksum,
we only need to transfer a small portion of them.

Can we just 'split' and 'join' db files before and after transferring?

Best,
Gang

On Wed, Apr 15, 2015 at 1:17 PM, Morty morty+virtu...@frakir.org wrote:

 On Tue, Apr 14, 2015 at 12:24:22PM -0400, Gang Fu wrote:

  We want to copy a large virtuoso db from one server to another in
  different location. We cannot copy single 500 GB db file, which is
  slow and unstable.  So we want to break the db files in different
  segments. I have tried with virtuoso striping: each segment has 20
  GB, and in total we have over 25 segments.

 What issue are you seeing with transferring a 500GB file?
 Transferring one 500GB file should not be significantly slower than
 transferring 25x 20GB files.

 If you are concerned about a transfer interruption, you could use
 rsync.  rsync has options to resume a failed transfer.

 Alternatively, you could use the Linux/Unix split command to split
 the one large file into a bunch of smaller files.

 Or you could use the commercial version of virtuoso with built-in
 replication.

 - Morty



[Virtuoso-users] virtuoso striping

2015-04-14 Thread Gang Fu
Hi,

We want to copy a large Virtuoso db from one server to another in a different
location. We cannot copy a single 500 GB db file, which is slow and unstable,
so we want to break the db file into segments. I have tried Virtuoso
striping: each segment is 20 GB, and in total we have over 25 segments.
Then a new issue came up: the server starts very slowly with multiple
segments; it may take half an hour to start after copying those
segment files to another server.

Any explanation? How can we avoid the long startup time? We want to implement
a db toggle mechanism, so we want the server to start immediately, as it does
without striping.

Best,
Gang


[Virtuoso-users] check if a stored procedure exists

2015-04-11 Thread Gang Fu
Hi,

I want to ask how we can check the existence of a stored procedure before
we drop it.
We need to drop the stored procedure before we create it, otherwise there
will be some issues, but if we drop a stored procedure that does not exist,
we get an error. So we need to check for existence before dropping it.

Thank you very much!

Best,
Gang


[Virtuoso-users] best way to update large RDF stores with triples of a large document

2015-04-08 Thread Gang Fu
Hi,

I want to update a large RDF store with 10 billion triples once a week.
The triples to be inserted or deleted are saved in documents.
There are no variable bindings or blank nodes in the documents,
so I guess the best-fitting SPARQL update functions are
insert data/delete data.

What is the best way to do this?
Using a JDBC connection pool or HTTP?
Using 'modify graph graph-iri insert/delete', or insert/delete data?
Is it possible to run concurrent update jobs?


Best,
Gang


Re: [Virtuoso-users] best way to update large RDF stores with triples of a large document

2015-04-08 Thread Gang Fu
Will using isql, JDBC, or HTTP make any difference?

On Wed, Apr 8, 2015 at 7:25 AM, Gang Fu gangfu1...@gmail.com wrote:

 There are millions of triples to be updated on weekly basis.

 On Wed, Apr 8, 2015 at 7:24 AM, Gang Fu gangfu1...@gmail.com wrote:

 Hi,

 I want to update a large RDF store with 10 billions triples once a week.
 The triples to be inserted or deleted are save in documents.
 There is no variable binding or blank nodes in the documents.
 So I guess the best fit sparql update functions are
 insert data/delete data

 What is the best way to do this?
 Using JDBC connection pool or http?
 Using 'modify graph graph-iri insert/delete', or insert/delete data?
 Is it possible to run concurrent update jobs?


 Best,
 Gang





Re: [Virtuoso-users] best way to update large RDF stores with triples of a large document

2015-04-08 Thread Gang Fu
There are millions of triples to be updated on a weekly basis.

On Wed, Apr 8, 2015 at 7:24 AM, Gang Fu gangfu1...@gmail.com wrote:

 Hi,

 I want to update a large RDF store with 10 billions triples once a week.
 The triples to be inserted or deleted are save in documents.
 There is no variable binding or blank nodes in the documents.
 So I guess the best fit sparql update functions are
 insert data/delete data

 What is the best way to do this?
 Using JDBC connection pool or http?
 Using 'modify graph graph-iri insert/delete', or insert/delete data?
 Is it possible to run concurrent update jobs?


 Best,
 Gang




[Virtuoso-users] create procedure in bash shell

2015-04-07 Thread Gang Fu
Hi,

In our application, we want to create many different dump functions to dump
different subsets of the triple collections from the database.

According to the wiki page:
http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFDatasetDump

we can define a function that dumps one graph from the isql command line.
Since we will create hundreds of customized dump functions, we do not want to
copy and paste into the isql command line. Instead we want to prepare a shell
script.

In the bash shell, we need to take care of single quotes by converting ' to
'''.

Basically, I copy and paste the create procedure statement into a shell
script, surrounded by:

/opt/virtuoso/bin/isql  dba password exec='create procedure
function'

But I always get this error:
*** Error 37000: [Virtuoso Driver][Virtuoso Server]SQ074: Line 40:
at line 0 of Top-Level:

although the create procedure function itself works fine in the isql
command line.
Can anyone help me out?

Thank you very much!

Best,
Gang
---
The shell command is as follows:
/opt/virtuoso/bin/isql  dba password verbose=on banner=off prompt=off
echo=ON errors=stdout  \
exec='CREATE PROCEDURE dump_one_graph
  ( IN  srcgraph   VARCHAR  ,
IN  out_file   VARCHAR  ,
IN  file_length_limit  INTEGER  := 10
  )
  {
DECLARE  file_name VARCHAR;
DECLARE  env,
 ses   ANY;
DECLARE  ses_len,
 max_ses_len,
 file_len,
 file_idx  INTEGER;
SET ISOLATION = '''uncommitted''';
max_ses_len  := 1000;
file_len := 0;
file_idx := 1;
file_name:= sprintf ('''%s%06d.ttl''', out_file, file_idx);
string_to_file ( file_name || '''.graph''',
 srcgraph,
 -2
   );
string_to_file ( file_name,
 sprintf ( '''# Dump of graph %s, as of %s\n@base
 .\n''',
   srcgraph,
   CAST (NOW() AS VARCHAR)
 ),
 -2
   );
env := vector (dict_new (16000), 0, , , , 0, 0, 0, 0,
0);
ses := string_output ();
FOR (SELECT * FROM ( SPARQL DEFINE input:storage 
 SELECT ?s ?p ?o { GRAPH `iri(?:srcgraph)` { ?s ?p
?o } }
   ) AS sub OPTION (LOOP)) DO
  {
http_ttl_triple (env, s, p, o, ses);
ses_len := length (ses);
IF (ses_len > max_ses_len)
  {
file_len := file_len + ses_len;
IF (file_len > file_length_limit)
  {
http (''' .\n''', ses);
string_to_file (file_name, ses, -1);
gz_compress_file (file_name, file_name||'''.gz''');
file_delete (file_name);
file_len := 0;
file_idx := file_idx + 1;
file_name := sprintf ('''%s%06d.ttl''', out_file,
file_idx);
string_to_file ( file_name,
 sprintf ( '''# Dump of graph %s, as of
%s (part %d)\n@base  .\n''',
   srcgraph,
   CAST (NOW() AS VARCHAR),
   file_idx),
 -2
   );
 env := VECTOR (dict_new (16000), 0, , ,
, 0, 0, 0, 0, 0);
  }
ELSE
  string_to_file (file_name, ses, -1);
ses := string_output ();
  }
  }
IF (LENGTH (ses))
  {
http (''' .\n''', ses);
string_to_file (file_name, ses, -1);
gz_compress_file (file_name, file_name||'''.gz''');
file_delete (file_name);
  }
  }
;'


Re: [Virtuoso-users] create procedure in bash shell

2015-04-07 Thread Gang Fu
I double-checked the error message and found that the plus sign '+' is
missing in it:

file_len := file_len   ses_len;

Do we need to escape the plus sign in a shell script? Any comments?
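
As a quick check, assuming the script is run by bash or another POSIX shell: a plus sign has no special meaning inside single quotes, so the shell passes it through unchanged. The helper show_arg below is hypothetical and only echoes back what it was given.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument exactly as the shell
# passed it, so any character the shell altered would be visible.
show_arg() {
  printf '%s\n' "$1"
}

# The same assignment statement, passed inside single quotes.
received=$(show_arg 'file_len := file_len + ses_len;')
printf '%s\n' "$received"
```

If the plus sign is intact here but missing in the isql error text, the shell quoting is probably not the layer that removes it.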



Re: [Virtuoso-users] create procedure in bash shell

2015-04-07 Thread Gang Fu
Thank you very much, Morty and Hugh!

I just found out that the isql command line has two modes, file mode and
command-line mode:
isql  dba dba file.sql
isql  dba dba exec=command
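
A sketch of the file-based route, assuming a POSIX shell: a here-document with a quoted delimiter writes the procedure text verbatim, so single quotes and operators need no shell escaping. The procedure demo_proc and the variable sql_file are made up for illustration, and the final isql line is only printed, not executed; PORT and the credentials are placeholders to adapt.

```shell
#!/bin/sh
# Write a small stored procedure to a temporary file without any shell
# escaping. The quoted delimiter ('EOF') disables expansion in the body.
sql_file=$(mktemp)
cat > "$sql_file" <<'EOF'
CREATE PROCEDURE demo_proc (IN a INTEGER, IN b INTEGER)
{
  -- demo_proc is a made-up example, not part of Virtuoso
  DECLARE label VARCHAR;
  label := 'sum';
  RETURN sprintf ('%s=%d', label, a + b);
}
;
EOF

# Inspect the file to confirm quotes and operators survived intact.
cat "$sql_file"

# Assumed invocation, modelled on the thread; PORT is a placeholder.
# Remove the temporary file once it has been loaded.
printf "isql PORT dba dba exec='load %s'\n" "$sql_file"
```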

Best,
Gang

On Tue, Apr 7, 2015 at 10:14 PM, Hugh Williams hwilli...@openlinksw.com
wrote:

 Hi Gang,

 You could also place the procedure to be loaded in a file (dump.sql) and
 use the “load” command to load it into the database:

 $ isql  dba dba verbose=on banner=off prompt=off echo=ON errors=stdout
 exec='load dump.sql'
 Connected to OpenLink Virtuoso
 Driver: 07.10.3211 OpenLink Virtuoso ODBC Driver
 OpenLink Interactive SQL (Virtuoso), version 0.9849b.
 Type HELP; for help and EXIT; to exit.

 -- Line 1:
 CREATE PROCEDURE dump_one_graph
   ( IN  srcgraph   VARCHAR  ,
 IN  out_file   VARCHAR  ,
 IN  file_length_limit  INTEGER  := 10
   )
   {
 DECLARE  file_name VARCHAR;
 DECLARE  env,
  ses   ANY;
 DECLARE  ses_len,
  max_ses_len,
  file_len,
  file_idx  INTEGER;
 SET ISOLATION = 'uncommitted';
 max_ses_len  := 1000;
 file_len := 0;
 file_idx := 1;
 file_name:= sprintf ('%s%06d.ttl', out_file, file_idx);
 string_to_file ( file_name || '.graph',
  srcgraph,
  -2
);
 string_to_file ( file_name,
  sprintf ( '# Dump of graph %s, as of %s\n@base 
 .\n',
srcgraph,
CAST (NOW() AS VARCHAR)
  ),
  -2
);
 env := vector (dict_new (16000), 0, '', '', '', 0, 0, 0, 0, 0);
 ses := string_output ();
 FOR (SELECT * FROM ( SPARQL DEFINE input:storage 
  SELECT ?s ?p ?o { GRAPH `iri(?:srcgraph)` { ?s ?p
 ?o } }
) AS sub OPTION (LOOP)) DO
   {
 http_ttl_triple (env, s, p, o, ses);
 ses_len := length (ses);
 IF (ses_len > max_ses_len)
   {
 file_len := file_len + ses_len;
 IF (file_len > file_length_limit)
   {
 http (' .\n', ses);
 string_to_file (file_name, ses, -1);
 gz_compress_file (file_name, file_name||'.gz');
 file_delete (file_name);
 file_len := 0;
 file_idx := file_idx + 1;
 file_name := sprintf ('%s%06d.ttl', out_file, file_idx);
 string_to_file ( file_name,
  sprintf ( '# Dump of graph %s, as of %s
 (part %d)\n@base  .\n',
srcgraph,
CAST (NOW() AS VARCHAR),
file_idx),
  -2
);
  env := VECTOR (dict_new (16000), 0, '', '', '', 0, 0, 0,
 0, 0);
   }
 ELSE
   string_to_file (file_name, ses, -1);
 ses := string_output ();
   }
   }
 IF (LENGTH (ses))
   {
 http (' .\n', ses);
 string_to_file (file_name, ses, -1);
 gz_compress_file (file_name, file_name||'.gz');
 file_delete (file_name);
   }
   }


 Done. -- 13 msec.

 -- Line 72:
 $

 Best Regards
 Hugh Williams
 Professional Services
 OpenLink Software, Inc.  //  http://www.openlinksw.com/

 On 7 Apr 2015, at 22:08, Morty morty+virtu...@frakir.org wrote:

 On Tue, Apr 07, 2015 at 04:36:39PM -0400, Gang Fu wrote:

 In the bash shell, we need to take care of single quote by converting ' to
 '''.


 Gang --

 You can avoid some issues and debug more easily by using isql's run
 file syntax.

 temp_file=`mktemp /somedir/tmpX`
 $command_to_generate_procedure > $temp_file
 isql  dba $dba_password $temp_file

 If there are any issues, you can look at the temporary file to debug
 quoting and the like.
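
One way to picture the generate-then-load approach above, as a sketch only: a hypothetical generator function emits one procedure per name into a temporary file, which can be reviewed before it is handed to isql. The function gen_procedure, the dump_ prefix, the sample names and the placeholder procedure body are all assumptions, not the real dump routine.

```shell
#!/bin/sh
# Hypothetical generator: emit one CREATE PROCEDURE statement per name.
# The body is a placeholder; a real dump routine would go there.
gen_procedure() {
  name=$1
  cat <<EOF
CREATE PROCEDURE dump_${name} (IN out_file VARCHAR)
{
  string_to_file (out_file || '.graph', '${name}', -2);
}
;
EOF
}

temp_file=$(mktemp)
for name in alpha beta gamma; do
  gen_procedure "$name"
done > "$temp_file"

# Review the generated SQL before handing the file to isql.
count=$(grep -c '^CREATE PROCEDURE' "$temp_file")
printf 'generated %s procedures in %s\n' "$count" "$temp_file"
```

Because the here-document is unquoted, ${name} is expanded by the shell while the single quotes in the SQL are written literally.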

 - Morty



Re: [Virtuoso-users] vhost_define vsp_user and real user

2015-02-06 Thread Gang Fu
Hi Rumi,

I totally understand your point. My question is about the 'vsp_user' (or
'vsp_host', as it is also called) used to expose the sparql endpoint. Our
system security team has concerns about the 'vsp_user': they are not sure
what it is used for or how to configure it. Basically, they are not familiar
with 'vsp'. I cannot explain it well to them, and they want to audit the user
permissions for the /sparql endpoint. I have explained that the default user
for the /sparql endpoint is 'SPARQL' and that it is read-only. But there is no
way to audit that: if some configuration is changed later, they want to know
whether the endpoint is still read-only...

I found the system table 'http_path' tells you the 'vsp_host' for 'lpath',
but not the user and user role...

Best,
Gang

On Thu, Feb 5, 2015 at 10:17 AM, Rumi rtsek...@openlinksw.com wrote:

  Hi Gang Fu,


 On 05-Feb-15 3:03 PM, Gang Fu wrote:

 Hi Rumi,

  Using vhost_define() through isql we can achieve the same thing:
  DB.DBA.VHOST_DEFINE (
   lhost=>ip:port,
   vhost=>name,
   lpath=>'/sparql',
   ppath=>'/!sparql/',
   is_dav=>1,
   vsp_user=>'dba',
   ses_vars=>0,
   sec=>'digest',
   auth_fn=>'DB.DBA.HP_AUTH_SPARQL_USER',
   realm=>'SPARQL',
   opts=>vector('noinherit', 1, 'exec_as_get', 1),
   is_default_host=>0
 );


  It is password protected, but it is read+write, even though I have:
 'exec_as_get', 1



 Right, so it is not clear to me exactly what you want to do in this case.
 Which user are you using to log in? It seems your user has read+write
 permissions, i.e. you need to try to log in as a user with read permissions
 only. A simple scenario that demonstrates this:

 1) Create 2 users, one can update, the other can only perform select:

 SQL> DB.DBA.USER_CREATE ('ana', 'ana');
 Done. -- 0 msec.
 SQL> DB.DBA.USER_CREATE ('brad', 'brad');
 Done. -- 0 msec.
 Done. -- 16 msec.
 SQL> GRANT SPARQL_UPDATE to ana;
 Done. -- 0 msec.
 SQL> GRANT SPARQL_SELECT to brad;
 Done. -- 0 msec.

 So ana can update, brad can only select.
 2) Then from the default /sparql-auth endpoint, if I log in as brad and:

 -- 1) attempt to insert data:
 INSERT INTO GRAPH http://NewBookStore.com http://NewBookStore.com {
 ?book ?p ?v }
 fails with this error:
 SPARQL Update was denied to  brad

 -- 2) attempt to clear data:
 sparql clear graph urn:example:com;
 also fails with error:

 Error SR186: No permission to execute procedure DB.DBA.SPARUL_CLEAR with
 user ID 127, group ID 127

 Which is correct, since brad can only select data, but has no update (
 read-write ) permissions.

 The question is: what permissions does the user you are using have?



 Best Regards,
 Rumi Kocis



  Best,
 Gang


 On Wed, Feb 4, 2015 at 6:57 AM, Rumi rtsek...@openlinksw.com wrote:

  Hi Gang Fu,

 On 04-Feb-15 2:22 AM, Gang Fu wrote:

  Hi Rumi,

  I have tried to expose a password-protected sparql endpoint, actually
 it can be done using vhost_define() function as well, just add
 sec='digest' and authentication function. But the vsp_user to expose a
 password-protected sparql endpoint is still dba.


 By default /sparql-auth is protected, so what you can try is:

 1. Export the /sparql-auth definition from Conductor -> Web Application
 Server -> Virtual Domains & Directories.
 2. In the generated script, replace /sparql-auth with /sparql.
 * Note: the vsp_user is dba, but in the next step you can change a
 connection setting in the authentication function so that it uses your user.
 3. In the authentication function DB.DBA.HP_AUTH_SPARQL_USER
 (sparql_io.sql) there is:
 Line 2935:   user_id := connection_get ('SPARQLUserId', 'SPARQL');
 Change it accordingly to use your user, then re-execute the function
 definition so that the change takes effect.
 4. Execute the changed script from step 2 from Conductor or iSQL.

 Please let me know if that worked for you.


 Best Regards,
 Rumi Kocis


  Best,
 Gang


Re: [Virtuoso-users] vhost_define vsp_user and real user

2015-02-05 Thread Gang Fu
Hi Rumi,

Using vhost_define() through isql we can achieve the same thing:
DB.DBA.VHOST_DEFINE (
  lhost=>ip:port,
  vhost=>name,
  lpath=>'/sparql',
  ppath=>'/!sparql/',
  is_dav=>1,
  vsp_user=>'dba',
  ses_vars=>0,
  sec=>'digest',
  auth_fn=>'DB.DBA.HP_AUTH_SPARQL_USER',
  realm=>'SPARQL',
  opts=>vector('noinherit', 1, 'exec_as_get', 1),
  is_default_host=>0
);


It is password protected, but it is read+write, even though I have:
'exec_as_get', 1


Best,
Gang



Re: [Virtuoso-users] vhost_define vsp_user and real user

2015-02-03 Thread Gang Fu
Hi Rumi,

I have also tried:
grant execute on DB.DBA.SPARUL_LOAD_SERVICE_DATA to SPARQL;

but still, user SPARQL cannot be used as vsp_user to expose a sparql
endpoint. I got:
404 page not found
Resource /sparql not found. Access to page is forbidden

Best,
Gang

On Tue, Feb 3, 2015 at 8:10 AM, Rumi rtsek...@openlinksw.com wrote:

  Hi Gang Fu,

 On 03-Feb-15 1:15 PM, Gang Fu wrote:

   Hi,

  I am using function vhost_define() to expose read-only sparql endpoint
 through another port (different from 8890) for security concern.

  I have two questions:
  1) how can I expose a sparql endpoint using account other than 'dba'. I
 have tried to using vsp_user='SPARQL', but I got '404 cannot access' error
 when I tried the url. I also set the opts-(executable, 'yes'), this option
 seems to allow any vsp user to have execute permission, but it still does
 not work. I also tried to set user 'SPARQL' to administrator role, but
 still cannot work


 Please try the steps from this guide: Secure SPARQL Endpoint via SQL
 Accounts -- usage path digest authentication

 Link:
 http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtSPARQLProtectSQLDigestAuthentication

 Related:
 -- Securing SPARQL endpoints:
 http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtTipsAndTricksGuideSPARQLEndpoints
 -- Securing your SPARQL Endpoint via OAuth:
 http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtOAuthSPARQL
 -- Securing your SPARQL Endpoint via WebID:
 http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSPARQLSecurityWebID


  2) how can I know and configure the user account to use '/sparql'
 endpoint by default. The system table 'DB.DBA.HTTP_PATH' only shows that
 the vsp_user is 'dba', but it does not show the default user of that
 endpoint is 'SPARQL' (ID=106). The documentation says the user is 'SPARQL'
 for both '/sparql' and '/sparql-graph-crud', but I cannot find any system
 table for that. Our system team wants to audit that information.


 The name 'SPARQL' is a constant in the code of SPARQL web service endpoint
 pages ( /sparql and /sparql-auth ).
 Another name can be used if authentication function sets connection
 variable 'SPARQLUserId' to that name, for ex., placing inside
 authentication call:

 connection_set ('SPARQLUserId', 'SOME_USER_NAME');


 What you could try is to grant more roles to the user if needed, such as:
 SPARQL_LOAD_SERVICE_DATA or SPARQL_UPDATE, by granting directly to the
 user or, better, to SPARQL_SELECT, since the endpoint page will require
 that the user is member of SPARQL_SELECT group -- that's the minimal
 practical permission, however one can grant more permissions.


 Best Regards,
 Rumi Kocis


 Best,
 Gang




Re: [Virtuoso-users] vhost_define vsp_user and real user

2015-02-03 Thread Gang Fu
Hi Rumi,

I looked at the source code of libsrc/Wi/sparql_io.sql for the procedure
WS.WS./!sparql/:
create procedure WS.WS./!sparql/ (inout path varchar, inout params any,
inout lines any)

I am not sure whether the user for the /sparql endpoint is set to SPARQL by
default here:
user_id := connection_get ('SPARQLUserId', 'SPARQL');
set_user_id (user_id, 1);


I have tried granting SPARQL_UPDATE to user SPARQL, after which the /sparql
endpoint is no longer read-only.
And when I tried to grant another role, I got:
The object SPARQL_LOAD_SERVICE_DATA does not exist.

But it does not allow me to expose the /sparql endpoint using vsp_user
SPARQL. What I am really interested in is how to expose a sparql endpoint
using vsp users other than dba.

Best,
Gang



[Virtuoso-users] vhost_define vsp_user and real user

2015-02-03 Thread Gang Fu
Hi,

I am using the vhost_define() function to expose a read-only sparql endpoint
through another port (different from 8890) for security reasons.

I have two questions:
1) How can I expose a sparql endpoint using an account other than 'dba'? I
have tried using vsp_user='SPARQL', but I got a '404 cannot access' error
when I tried the url. I also set the opts-(executable, 'yes'); this option
seems to allow any vsp user to have execute permission, but it still does
not work. I also tried giving user 'SPARQL' the administrator role, but it
still does not work.


2) How can I find out and configure the user account that the '/sparql'
endpoint uses by default? The system table 'DB.DBA.HTTP_PATH' only shows that
the vsp_user is 'dba'; it does not show that the default user of that endpoint
is 'SPARQL' (ID=106). The documentation says the user is 'SPARQL' for both
'/sparql' and '/sparql-graph-crud', but I cannot find any system table for
that. Our system team wants to audit that information.


Best,
Gang


Re: [Virtuoso-users] vhost_define vsp_user and real user

2015-02-03 Thread Gang Fu
Hi Rumi,

I have tried to expose a password-protected sparql endpoint. Actually, it
can be done using the vhost_define() function as well: just add sec='digest'
and an authentication function. But the vsp_user used to expose a
password-protected sparql endpoint is still dba.

Best,
Gang

On Tue, Feb 3, 2015 at 12:35 PM, Rumi rtsek...@openlinksw.com wrote:

  Hi Gang Fu,

 On 03-Feb-15 3:47 PM, Gang Fu wrote:

 Hi Rumi,

  I looked at the source code of libsrc/Wi/sparql_io.sql for procedure
 WS.WS./!sparql/:
 create procedure WS.WS./!sparql/ (inout path varchar, inout params any,
 inout lines any)

  I am not sure whether the user as SPARQL for /sparql endpoint are set
 by default here:
  user_id := connection_get ('SPARQLUserId', 'SPARQL');
  set_user_id (user_id, 1);


  I have tried to grant SPARQL_UPDATE to user SPARQL, then the /sparql
 endpoint is not read-only
 And when I tried to grant another role, I got
 The object SPARQL_LOAD_SERVICE_DATA does not exist.

  But it does not allow me to expose /sparql endpoint using vsp_user
 SPARQL. What I am really interested in is how to expose sparql endpoint
 using vsp users other than dba.


 Hm, I would say you should grant the roles to another vsp user, as this is
 what you want to achieve. Is this correct?
 As it stands, you granted them to SPARQL instead?
 Additionally, did you try the steps from the guide
 http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtSPARQLProtectSQLDigestAuthentication
 ?


 Best Regards,
 Rumi Kocis


  Best,
 Gang
