Re: Store query results in new RDF

Adeeb Noor Thu, 07 Nov 2013 20:52:31 -0800

Here is the new version of the code using
QueryExecution.execConstructTriples:


FileLoader fileLoader = new FileLoader("src/aaCONSTRUCT.tql");

String q = fileLoader.loadAll();

Query query = QueryFactory.create(q) ;

QueryExecution qexec = QueryExecutionFactory.create(query, data.tdb);

Iterator<Triple> ti =  qexec.execConstructTriples();

StmtIterator si = ModelUtils.triplesToStatements(ti, data.tdb);

*Now I am trying to add the result of the CONSTRUCT to a new TDB.*

System.out.println(" ... Add new TDB  ...");

inferredData.tdb.add(si);

System.out.println(" ... RDF  ...");

inferredData.exportRDF();

inferredData.close();

My problem is that when I call inferredData.tdb.add(si); to save the result
in the new TDB (inferredData), the program keeps running forever and the TDB
size stays at 201 MB.

Am I doing something wrong here?

Thanks
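
For reference, a minimal sketch of one way to consume a large iterator without materialising everything at once: drain it in fixed-size batches and hand each batch to a sink. This is plain Java (8 or later) with no Jena calls. The names BatchDrain, drain and batchSize are hypothetical and do not come from the code above. In this program the source would be the Iterator<Triple> from execConstructTriples() and the sink would add each batch to the target TDB. Whether this cures the stall described above is an assumption, not something established in this thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: drains an iterator in fixed-size batches. */
final class BatchDrain {

    private BatchDrain() {}

    /**
     * Hands the items of source to sink in lists of at most batchSize
     * elements and returns the total number of items consumed.
     */
    static <T> long drain(Iterator<T> source, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long total = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            total++;
            if (batch.size() == batchSize) {
                sink.accept(batch);
                // Start a fresh list so the sink may keep or discard the old one.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch); // final partial batch
        }
        return total;
    }
}
```

A call such as BatchDrain.drain(ti, 10000, batch -> ...) would also make it easy to log progress per batch, which helps to tell a slow query apart from a stalled write.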




On Thu, Nov 7, 2013 at 6:57 AM, Andy Seaborne <[email protected]> wrote:

> On 07/11/13 02:55, Adeeb Noor wrote:
>
>>
>> On Wed, Nov 6, 2013 at 5:23 AM, Andy Seaborne <[email protected]> wrote:
>>
>>     On 06/11/13 00:31, Adeeb Noor wrote:
>>
>>         Any help with my question please.
>>
>>         AdeeB
>>
>>
>>         On Mon, Nov 4, 2013 at 1:48 PM, Adeeb Noor
>>         <[email protected]> wrote:
>>
>>             Hi Andy:
>>
>>             Thanks for the response.
>>
>>             My TDB is on my hard drive and is 15 GB in size.
>>
>>
>>     How many triples?
>>
>>
>> *The number of triples is: 37397456*
>>
>>
>>
>>     And how much of the DB does the SELECT query match?
>>
>>     What's SELECT (count(*) AS ?c) ....
>>     What's SELECT (count(distinct *) AS ?c) ....
>>
>
> No response?
>
> This is asking what proportion of the DB is being extracted.
>
> On the current information, I'd guess the system starts swapping due to a
> large construct graph but that's just a guess.
>
> Streaming the execConstructTriples to a disk file may help.
>
>
>
>
>>
>> *Here is the SELECT I want to apply to generate my subgraph:*
>>
>
> Long, incomplete query.
>
> Try reordering the FILTERs putting the || ones last.  May make no
> difference but without sizing figures, only you can know.
>
>         Andy
>
>
>> CONSTRUCT
>>
>> {
>>
>> ddidd:C0004057 ?r ?disease1 .
>>
>> ?disease1 ?r10 ddidd:C0004057 .
>>
>> ?disease1 ?r1 ?omim1 .
>>
>> ?omim1 ?r11 ?disease1 .
>>
>> ?omim1 ?r2 ?w .
>>
>> ?w ?r12 ?omim1 .
>>
>> ?omim1 ?r3 ?bp .
>>
>> ?bp ?r13 ?omim1 .
>>
>> ?omim1 ?r4 ?genotypePhenotype .
>>
>> ?genotypePhenotype ?r14 ?omim1 .
>>
>> ?omim1 ?r5 ?gene.
>>
>> ?gene ?r15 ?omim1 .
>>
>> ?w ?r6 ?gene2.
>>
>> ?gene2 ?r16 ?w .
>>
>> ?omim1 ?r7 ?gene3 .
>>
>> ?gene3 ?r17 ?omim1.
>>
>> ?gene3 ?r8 ?bp2 .
>>
>> ?bp2 ?r18 ?gene3 .
>>
>> ?gene3 ?r9 ?genotypePhenotype2 .
>>
>> ?genotypePhenotype2 ?r19 ?gene3 .
>>
>> ?gene a ?gCLASS.
>>
>> ?gene2 a ?g2CLASS.
>>
>> ?gene3 a ?g3CLASS.
>>
>> ?genotypePhenotype a ?genotypePhenotypeCLASS .
>>
>> ?genotypePhenotype2 a ?genotypePhenotype2CLASS.
>>
>> ?w a ?wCLASS .
>>
>> ?omim1 a ?omimt1 .
>>
>> ?bp a ?bpCLASS .
>>
>> ?bp2 a ?bp2CLASS .
>>
>> ddidd:C0004057 ddids:label ?ldrug1 .
>>
>> ?disease1 ddids:label ?ldisease1 .
>>
>> ?omim1 ddids:label ?lomim1 .
>>
>> ?w ddids:label ?lw .
>>
>> ?bp ddids:label ?lbp .
>>
>> ?genotypePhenotype ddids:label ?lgenotypePhenotype .
>>
>> ?gene ddids:label ?lgene .
>>
>> ?gene2 ddids:label ?lgene2 .
>>
>> ?gene3 ddids:label ?lgene3 .
>>
>> ?bp2 ddids:label ?lbp2 .
>>
>> ?genotypePhenotype2 ddids:label ?lgenotypePhenotype2 .
>>
>> }WHERE{
>>
>> ddidd:C0004057 ?r ?disease1 .
>>
>> ?disease1 ?r10 ddidd:C0004057 .
>>
>>   ?disease1 ?r1 ?omim1 .
>>
>> ?omim1 ?r11 ?disease1 .
>>
>> ?omim1 ?r2 ?w .
>>
>> ?w ?r12 ?omim1 .
>>
>> ?omim1 ?r3 ?bp .
>>
>> ?bp ?r13 ?omim1 .
>>
>> ?omim1 ?r4 ?genotypePhenotype .
>>
>> ?genotypePhenotype ?r14 ?omim1 .
>>
>> ?omim1 ?r5 ?gene.
>>
>> ?gene ?r15 ?omim1 .
>>
>> ?w ?r6 ?gene2.
>>
>> ?gene2 ?r16 ?w .
>>
>> ?omim1 ?r7 ?gene3 .
>>
>> ?gene3 ?r17 ?omim1.
>>
>> ?gene3 ?r8 ?bp2 .
>>
>> ?bp2 ?r18 ?gene3 .
>>
>> ?gene3 ?r9 ?genotypePhenotype2 .
>>
>> ?genotypePhenotype2 ?r19 ?gene3 .
>>
>> ?gene a ?gCLASS.
>>
>> ?gene2 a ?g2CLASS.
>>
>> ?gene3 a ?g3CLASS.
>>
>> ?genotypePhenotype a ?genotypePhenotypeCLASS .
>>
>> ?genotypePhenotype2 a ?genotypePhenotype2CLASS.
>>
>> ?w a ?wCLASS .
>>
>> ?omim1 a ?omimt1 .
>>
>> ?bp a ?bpCLASS .
>>
>> ?bp2 a ?bp2CLASS .
>>
>> ddidd:C0004057 ddids:label ?ldrug1 .
>>
>> ?disease1 ddids:label ?ldisease1 .
>>
>> ?omim1 ddids:label ?lomim1 .
>>
>> ?w ddids:label ?lw .
>>
>> ?bp ddids:label ?lbp .
>>
>> ?genotypePhenotype ddids:label ?lgenotypePhenotype .
>>
>> ?gene ddids:label ?lgene .
>>
>> ?gene2 ddids:label ?lgene2 .
>>
>> ?gene3 ddids:label ?lgene3 .
>>
>> ?bp2 ddids:label ?lbp2 .
>>
>> ?genotypePhenotype2 ddids:label ?lgenotypePhenotype2 .
>>
>>
>> FILTER ( ?r = ddids:may_treat ||  ?r = ddids:may_prevent )
>>
>> FILTER (?omimt1 = ddids:gene || ?omimt1 = ddids:genotypePhenotype )
>>
>> FILTER (?wCLASS = ddids:pathway || ?r2 = ddids:gene_is_element_in_pathway
>> )
>>
>> FILTER (?bpCLASS = ddids:biologicalProcess )
>>
>> FILTER (?bp2CLASS = ddids:biologicalProcess )
>>
>> FILTER (?genotypePhenotypeCLASS = ddids:genotypePhenotype )
>>
>> FILTER (?genotypePhenotype2CLASS = ddids:genotypePhenotype )
>>
>> FILTER (?gCLASS = ddids:gene )
>>
>> FILTER (?g2CLASS = ddids:gene )
>>
>> FILTER (?g3CLASS = ddids:gene )
>>
>> }
>>
>>
>>     We still know little about your setup.
>>
>>
>>
>>             and my PC is a Mac Pro with a 2.4 GHz CPU
>>             and 4 GB of memory.
>>
>>
>>     Java 32 bit or 64 bit?
>>
>>
>> *java version "1.6.0_65"*
>> *Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)*
>> *Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)*
>>
>>
>>
>>
>>             I was not able to use QueryExecution.execConstructTriples as
>>             it returns an iterator and I want to save the subgraph
>>             into a new TDB.
>>
>>
>>     Why is that a problem? Add them to a TDB database.
>>
>>
>> * I will try it and let you know. *
>>
>>
>>
>>     Or even use a SPARQL Update operation.
>>
>>
>> *SPARQL Update, if I am not wrong, will not work in my case, as I want to
>> create a subgraph from the whole data and store it in a new TDB. We can
>> only use SPARQL Update if we are adding data to the original TDB; am I
>> right?*
>>
>>
>>
>>
>>
>>             Here is my code below:
>>
>>                FileLoader fileLoader = new FileLoader("src/DDICONSTRUCT.tql");
>>
>>
>>                String q = fileLoader.loadAll();
>>
>>                Query query = QueryFactory.create(q) ;
>>
>>                QueryExecution qexec = QueryExecutionFactory.create(query, data.tdb);
>>
>>
>>
>>                Model constructModel = qexec.execConstruct();
>>
>>
>>             The program has been running for almost a day now. Let me
>>             know if there is something wrong or if there is an
>>             alternative to CONSTRUCT.
>>
>>
>>
>>             On Sun, Nov 3, 2013 at 12:59 PM, Andy Seaborne
>>             <[email protected]> wrote:
>>
>>                 On 03/11/13 07:05, Adeeb Noor wrote:
>>
>>                     Hi Andy:
>>
>>                     I did figure it out; however, it takes too much time
>>                     (CONSTRUCT) to finish as my query is complex. Is that
>>                     normal? In fact, it is still running.
>>
>>
>>                 Hard to tell - it depends on many factors such as
>>                 machine setup, where
>>                 the data is stored, structure and volume of your data
>>
>>                 Try
>>
>>                 QueryExecution.execConstructTriples
>>
>>
>>                           Andy
>>
>>
>>
>>                     AdeeB
>>
>>
>>                     On Sat, Nov 2, 2013 at 9:56 AM, Adeeb Noor
>>                     <[email protected]> wrote:
>>
>>                        Hi Andy:
>>
>>
>>                         Thanks for the quick response. I tried CONSTRUCT
>>                         and it did work out. But how can I rewrite a
>>                         query such as this one as a CONSTRUCT query:
>>
>>                         SELECT DISTINCT *
>>
>>                             {
>>
>>                              ?ddi ddids:has_association ?c .
>>
>>                             ?ddi ddids:has_association ?c2 .
>>
>>                         ?c ddids:chemical_or_drug_affects_gene_product ?omim .
>>
>>                         ?omim ddids:gene_product_encoded_by_gene ?g .
>>
>>                         ?g ddids:gene_plays_role_in_process ?w .
>>
>>                         ?g ddids:gene_plays_role_in_process ?bp .
>>
>>
>>                         ?bp ddids:process_involves_gene ?g2 .
>>
>>                         ?g2 ddids:gene_plays_role_in_process ?bp2 .
>>
>>
>>
>>                         where I need each variable (for example ?w, ?bp,
>>                         etc.) to be a new resource.
>>
>>                         Thanks
>>
>>
>>                         On Sat, Nov 2, 2013 at 6:41 AM, Andy Seaborne
>>                         <[email protected]> wrote:
>>
>>                            You need to use a CONSTRUCT query, not a
>>                         SELECT one.
>>
>>
>>                             outputAsRDF encodes the result set (i.e. the
>>                             table) as RDF - it is not
>>                             the datamodel of the original data.
>>
>>                             CONSTRUCT allows you to create one RDF graph
>>                             from data from another.
>>
>>                             See also SPARQL Update for doing that from
>>                             one graph to another in the
>>                             same database.
>>
>>                                        Andy
>>
>>
>>                             On 02/11/13 05:35, Adeeb Noor wrote:
>>
>>                                Hi guys:
>>
>>
>>                                 I would like to save my SPARQL result
>>                                 coming from a ResultSet as new RDF
>>                                 (new RDF resources), because I want to do
>>                                 more work on this subgraph and it has to
>>                                 be in the original RDF format.
>>
>>                                 I tried the outputAsRDF function and it
>>                                 worked; however, the result I got was
>>                                 the following:
>>
>>                                 <rdf:Description rdf:nodeID="A5">
>>                                         <rs:value rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.owl#genotypePhenotype"/>
>>                                         <rs:variable>omimt</rs:variable>
>>                                       </rdf:Description>
>>                                       <rdf:Description rdf:nodeID="A6">
>>                                         <rs:value rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.rdf#C0007589"/>
>>                                         <rs:variable>w</rs:variable>
>>                                       </rdf:Description>
>>                                       <rdf:Description rdf:nodeID="A7">
>>                                         <rs:binding rdf:nodeID="A8"/>
>>                                         <rs:binding rdf:nodeID="A9"/>
>>                                         <rs:binding rdf:nodeID="A10"/>
>>                                         <rs:binding rdf:nodeID="A11"/>
>>                                         <rs:binding rdf:nodeID="A12"/>
>>                                         <rs:binding rdf:nodeID="A13"/>
>>                                         <rs:binding rdf:nodeID="A14"/>
>>                                         <rs:binding rdf:nodeID="A15"/>
>>                                         <rs:binding rdf:nodeID="A16"/>
>>                                         <rs:binding rdf:nodeID="A17"/>
>>                                         <rs:binding rdf:nodeID="A18"/>
>>                                         <rs:binding rdf:nodeID="A19"/>
>>                                         <rs:binding rdf:nodeID="A20"/>
>>                                         <rs:binding rdf:nodeID="A21"/>
>>                                         <rs:binding rdf:nodeID="A22"/>
>>                                         <rs:binding rdf:nodeID="A23"/>
>>                                         <rs:binding rdf:nodeID="A24"/>
>>                                         <rs:binding rdf:nodeID="A25"/>
>>                                         <rs:binding rdf:nodeID="A26"/>
>>                                         <rs:binding rdf:nodeID="A27"/>
>>                                       </rdf:Description>
>>
>>                                 How can I remove these node entries and
>>                                 make it look something like:
>>
>>                                      <rdf:Description rdf:about="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.rdf#C3229174">
>>                                         <j.0:label>Cytra-K Oral Product</j.0:label>
>>                                         <rdf:type rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.owl#chemical"/>
>>                                       </rdf:Description>
>>
>>                                 please help me out
>>
>>
>>
>>
>>
>>                         --
>>                         Adeeb Noor
>>                         Ph.D. Candidate
>>                         Dept of Computer Science
>>                         University of Colorado at Boulder
>>                         Cell: 571-484-3303
>>                         Email: [email protected]
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>


-- 
Adeeb Noor
Ph.D. Candidate
Dept of Computer Science
University of Colorado at Boulder
Cell: 571-484-3303
Email: [email protected]
