Hi Andra,

In my Virtuoso branch I do something similar, but with Sesame. However, if you 
want good speed on large loads, you need to turn off “log” and “auto 
checkpointing”.

These are JDBC-level actions, and I don’t know how you would do this in Jena, 
but in Sesame you can do it with utility methods like these:


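// Imports assume Sesame 2 (org.openrdf) and the Virtuoso Sesame 2 provider.
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

import org.openrdf.repository.RepositoryConnection;

import virtuoso.sesame2.driver.VirtuosoRepositoryConnection;

// Logger: import it from whichever logging library your project uses.

public class CheckpointLoggingUtil
{
        private static final Logger log = Logger.getInstance(CheckpointLoggingUtil.class);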

        public static void autoCheckpointOff(RepositoryConnection connection)
            throws SQLException
        {
                if (connection instanceof VirtuosoRepositoryConnection)
                {
                        Connection vrc = ((VirtuosoRepositoryConnection) connection)
                            .getQuadStoreConnection();
                        // An interval of 0 turns off the periodic checkpoints.
                        try (CallableStatement checkpointOff =
                                vrc.prepareCall("checkpoint_interval (0)"))
                        {
                                checkpointOff.execute();
                        }
                        log.debug("Virtuoso auto checkpointing off");
                }
        }

        public static void autoCheckpointOn(RepositoryConnection connection)
            throws SQLException
        {
                if (connection instanceof VirtuosoRepositoryConnection)
                {
                        Connection vrc = ((VirtuosoRepositoryConnection) connection)
                            .getQuadStoreConnection();
                        // Checkpoint automatically every 60 minutes again.
                        try (CallableStatement checkpointOn =
                                vrc.prepareCall("checkpoint_interval (60)"))
                        {
                                checkpointOn.execute();
                        }
                        log.debug("Virtuoso auto checkpointing on");
                }
        }
        
        public static void checkpoint(RepositoryConnection connection)
            throws SQLException
        {
                if (connection instanceof VirtuosoRepositoryConnection)
                {
                        log.debug("Virtuoso checkpoint starting");
                        Connection vrc = ((VirtuosoRepositoryConnection) connection)
                            .getQuadStoreConnection();
                        // Force a single checkpoint right now.
                        try (CallableStatement checkpoint =
                                vrc.prepareCall("checkpoint"))
                        {
                                checkpoint.execute();
                        }
                        log.debug("Virtuoso checkpoint finished");
                }
        }

        public static void disableLogging(RepositoryConnection connection)
            throws SQLException
        {
                if (connection instanceof VirtuosoRepositoryConnection)
                {
                        Connection vrc = ((VirtuosoRepositoryConnection) connection)
                            .getQuadStoreConnection();
                        // log_enable (0) switches transaction logging off.
                        try (CallableStatement loggingOff =
                                vrc.prepareCall("log_enable (0)"))
                        {
                                loggingOff.execute();
                        }
                }
        }

        public static void enableLogging(RepositoryConnection connection)
            throws SQLException
        {
                if (connection instanceof VirtuosoRepositoryConnection)
                {
                        Connection vrc = ((VirtuosoRepositoryConnection) connection)
                            .getQuadStoreConnection();
                        // log_enable (3) switches logging back on, with row-by-row autocommit.
                        try (CallableStatement loggingOn =
                                vrc.prepareCall("log_enable (3)"))
                        {
                                loggingOn.execute();
                        }
                }
        }
}

Then in my Sesame code I do something like this:

connection = getConnection();
connection.setAutoCommit(false);
CheckpointLoggingUtil.disableLogging(connection);
CheckpointLoggingUtil.autoCheckpointOff(connection);

connection.add(statements); // repeated for each batch of statements

Once in a while I might call a checkpoint, and at the end I commit everything 
and turn “log” and “auto checkpointing” back on:

connection.commit();
CheckpointLoggingUtil.checkpoint(connection);
CheckpointLoggingUtil.autoCheckpointOn(connection);
CheckpointLoggingUtil.enableLogging(connection);
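
One thing to watch with the sequence above: if the load throws, logging and auto checkpointing stay off. A minimal sketch of guarding against that with try/finally follows. The class `BulkLoadGuard` and its `Step` interface are hypothetical names made up for illustration, not part of Sesame or Virtuoso, and the demo in `main` only records the order of the calls instead of talking to a database.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: run a bulk load with settings switched off, restoring them even on failure. */
final class BulkLoadGuard {

    /** A step that may throw, e.g. a call into CheckpointLoggingUtil (hypothetical interface). */
    interface Step {
        void run() throws Exception;
    }

    /**
     * Runs disable, then load, and always runs restore afterwards.
     * An exception thrown by load is rethrown once restore has run.
     */
    static void runWithSettingsOff(Step disable, Step load, Step restore) throws Exception {
        disable.run();
        try {
            load.run();
        } finally {
            restore.run();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for the real calls: each step just records that it ran.
        List<String> trace = new ArrayList<>();
        try {
            runWithSettingsOff(
                () -> trace.add("logging and auto checkpointing off"),
                () -> { trace.add("load"); throw new IllegalStateException("load failed"); },
                () -> trace.add("logging and auto checkpointing on"));
        } catch (IllegalStateException e) {
            trace.add("caught: " + e.getMessage());
        }
        // The restore step shows up in the trace even though the load failed.
        System.out.println(trace);
    }
}
```

In real code the three steps would wrap the CheckpointLoggingUtil calls: the two "off" calls as the first step, the add/commit/checkpoint as the second, and the two "on" calls as the third.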


Of course, you can also just script the DBA connection with something like this:

isql -U dba -P dba -S 1111 "EXEC=DB.DBA.TTLP_MT (file_to_string_output ('wpContent_v0.0.73237_20140115.ttl'), '', 'http://rdf.wikipathways.org')"

See http://docs.openlinksw.com/virtuoso/isql.html for more details
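
The same statement can probably be sent from Java over plain JDBC as well. Below is a minimal sketch, not something I have run against your setup. It assumes the Virtuoso JDBC driver jar is on the classpath, a URL like jdbc:virtuoso://localhost:1111, and that the file sits in a directory the server is allowed to read (DirsAllowed in virtuoso.ini), since file_to_string_output reads it on the server side. The class `TtlpLoader` and its methods are hypothetical names for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: trigger DB.DBA.TTLP_MT over JDBC instead of through isql. */
final class TtlpLoader {

    /** Wraps a value in single quotes, doubling any embedded single quotes. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds the same call that is passed to isql via EXEC= above. */
    static String ttlpCall(String file, String graph) {
        return "DB.DBA.TTLP_MT (file_to_string_output (" + quote(file) + "), '', "
            + quote(graph) + ")";
    }

    /** Opens a JDBC connection, runs the load, and closes everything again. */
    static void load(String jdbcUrl, String user, String password, String file, String graph)
            throws SQLException {
        try (Connection connection = DriverManager.getConnection(jdbcUrl, user, password);
             Statement statement = connection.createStatement()) {
            statement.execute(ttlpCall(file, graph));
        }
    }

    public static void main(String[] args) throws SQLException {
        String file = "wpContent_v0.0.73237_20140115.ttl";
        String graph = "http://rdf.wikipathways.org";
        if (args.length == 0) {
            // No JDBC URL given: only show the statement that would be sent.
            System.out.println(ttlpCall(file, graph));
            return;
        }
        // args[0] is the JDBC URL (assumed form: jdbc:virtuoso://localhost:1111).
        load(args[0], "dba", "dba", file, graph);
    }
}
```

Run without arguments it only prints the statement, which is handy for checking the quoting before pointing it at a server.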

Regards,
Jerven

On 15 Jan 2014, at 21:29, Jim McCusker <mccus...@gmail.com> wrote:

> I do something like this:
> 
> OntModel memoryModel = ...;
> 
> VirtModel model = null;
> try {
>  model = VirtModel.openDatabaseModel(graphURI, virtuosoURL, virtuosoUser, virtuosoPassword);
>  model.begin();
>  model.removeAll();
>  model.add(memoryModel);
>  model.commit();
> } catch (Exception e) {
>  if (model != null) model.abort();
> } finally {
>  if (model != null) model.close();
> }
> 
> 
> 
> On Wed, Jan 15, 2014 at 3:02 PM, Andra Waagmeester 
> <andra.waagmees...@gmail.com> wrote:
> Hi,
> 
>     I am looking into a solution to automatically load triples into our 
> triple store. We use Jena to convert our data into triples. Our current 
> workflow is to dump the triples into a file (e.g. content.ttl), after which 
> the file is loaded through the isql command line:
> 
>     "DB.DBA.TTLP_MT (file_to_string_output 
> ('wpContent_v0.0.73237_20140115.ttl'), '', 'http://rdf.wikipathways.org');"
> 
> Loading the file takes minutes, but it still requires some manual steps. I 
> would very much like to make both the authoring and the subsequent loading 
> into the triple store an automatic process. 
> 
>  I have tried the following in Java:
> 
>               InputStream in = new FileInputStream("wpContent_v0.0.73237_20140115.ttl");
>               String url = "jdbc:virtuoso://localhost:4444";
> 
>               /*                      STEP 1                  */
>               VirtGraph wpGraph = new VirtGraph("http://rdf.wikipathways.org/", url, "dba", "<dba-pw>");
> 
>               wpGraph.clear();
>               VirtuosoUpdateRequest vur = VirtuosoUpdateFactory.read(in, wpGraph);
>               vur.exec();
> 
> 
> 
> or loading them triple by triple:
> 
>               VirtGraph wpGraph = new VirtGraph("http://rdf.wikipathways.org/", url, "dba", "<dba-pw>");
>               
>               //wpGraph.getTransactionHandler().begin();
>               wpGraph.clear();
>               StmtIterator iter = model.listStatements();
>               while (iter.hasNext()) {
>                       Statement stmt = iter.nextStatement();  // get next statement
>                       wpGraph.add(new Triple(stmt.getSubject().asNode(), stmt.getPredicate().asNode(), stmt.getObject().asNode()));
>                       System.out.println(stmt.getSubject() + " - " + stmt.getPredicate() + " - " + stmt.getObject());
>               }
> 
> Both of these approaches work, but they take hours to complete. 
> 
> Can I get the minutes needed for loading through the isql command line in an 
> automated pipeline process? Can I trigger DB.DBA.TTLP_MT through Java or 
> a shell script? 
> 
> Any guidance is much appreciated.
> 
> 
> Kind regards,
> 
> 
> Andra Waagmeester 
> 
> 
> 
> 
> 
> 
> 
> 
> ------------------------------------------------------------------------------
> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
> Learn Why More Businesses Are Choosing CenturyLink Cloud For
> Critical Workloads, Development Environments & Everything In Between.
> Get a Quote or Start a Free Trial Today.
> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
> 
> 
> 
> 
> -- 
> Jim McCusker
> 
> Data Scientist
> 5AM Solutions
> jmccus...@5amsolutions.com
> http://5amsolutions.com
> 
> PhD Student
> Tetherless World Constellation
> Rensselaer Polytechnic Institute
> mcc...@cs.rpi.edu
> http://tw.rpi.edu

-------------------------------------------------------------------
Jerven Bolleman                        jerven.bolle...@isb-sib.ch
SIB Swiss Institute of Bioinformatics      Tel: +41 (0)22 379 58 85
CMU, rue Michel Servet 1               Fax: +41 (0)22 379 58 58
1211 Geneve 4,
Switzerland     www.isb-sib.ch - www.uniprot.org
Follow us at https://twitter.com/#!/uniprot
-------------------------------------------------------------------


