Re: Store query results in new RDF

Adeeb Noor Sat, 09 Nov 2013 22:23:05 -0800

Any help, guys?


On Thu, Nov 7, 2013 at 9:51 PM, Adeeb Noor <[email protected]> wrote:

> Here is the new version of the code using
> QueryExecution.execConstructTriples:
>
> FileLoader fileLoader = new FileLoader("src/aaCONSTRUCT.tql");
>
> String q = fileLoader.loadAll();
>
> Query query = QueryFactory.create(q) ;
>
> QueryExecution qexec = QueryExecutionFactory.create(query, data.tdb);
>
> Iterator<Triple> ti =  qexec.execConstructTriples();
>
>  StmtIterator si = ModelUtils.triplesToStatements(ti, data.tdb);
>
> *Now I am trying to add the result of the CONSTRUCT to a new TDB.*
>
> System.out.println(" ... Add new TDB  ...");
>
> inferredData.tdb.add(si);
>
>  System.out.println(" ... RDF  ...");
>
> inferredData.exportRDF();
>
> inferredData.close();
>
> My problem is that when I call inferredData.tdb.add(si); to save the result
> in the new TDB (inferredData), the program keeps running forever and the TDB
> size never changes from 201 MB.
>
> Am I doing something wrong here?
>
> Thanks
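
A note on the hang described above: as far as I understand, the iterator returned by execConstructTriples is evaluated lazily, so the call to add(si) is probably where the query itself actually runs. From the outside, a slow query and a stuck write then look the same. Below is a minimal sketch, in plain Java with no Jena dependency, of draining such an iterator in batches so that progress becomes visible. The name `drainInBatches`, the batch size, and the string stand-ins for triples are all hypothetical; the sink is where a real write to the target store would go.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class BatchedDrainSketch {

    /**
     * Pulls items from a (possibly lazy) iterator and hands them to the sink
     * in fixed-size batches. Returns the total number of items consumed.
     */
    static <T> long drainInBatches(Iterator<T> source, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long total = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                sink.accept(batch);          // a real sink would write to the store here
                total += batch.size();
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {              // flush the final, partial batch
            sink.accept(batch);
            total += batch.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Strings stand in for triples; a real run would pass the query's iterator.
        Iterator<String> triples = Arrays.asList("t1", "t2", "t3", "t4", "t5").iterator();
        long n = drainInBatches(triples, 2,
                batch -> System.out.println("stored batch of " + batch.size()));
        System.out.println("total: " + n);
    }
}
```

If no batch is ever reported, the time is most likely going into the query rather than into the write.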
>
>
>
>
> On Thu, Nov 7, 2013 at 6:57 AM, Andy Seaborne <[email protected]> wrote:
>
>> On 07/11/13 02:55, Adeeb Noor wrote:
>>
>>>
>>> On Wed, Nov 6, 2013 at 5:23 AM, Andy Seaborne <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>>     On 06/11/13 00:31, Adeeb Noor wrote:
>>>
>>>         Any help with my question please.
>>>
>>>         AdeeB
>>>
>>>
>>>         On Mon, Nov 4, 2013 at 1:48 PM, Adeeb Noor
>>>         <[email protected] <mailto:[email protected]>>
>>> wrote:
>>>
>>>             Hi Andy:
>>>
>>>             Thanks for the response.
>>>
>>>             My TDB is on my hard drive and is 15 GB in size.
>>>
>>>
>>>     How many triples?
>>>
>>>
>>> *The number of triples is: 37397456*
>>>
>>>
>>>
>>>     And how much of the DB does the SELECT query match?
>>>
>>>     What's SELECT (count(*) AS ?c) ....
>>>     What's SELECT (count(distinct *) AS ?c) ....
>>>
>>
>> No response?
>>
>> This is asking what proportion of the DB is being extracted.
>>
>> On the current information, I'd guess the system starts swapping due to a
>> large construct graph but that's just a guess.
>>
>> Streaming the execConstructTriples to a disk file may help.
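
The streaming suggestion can be sketched roughly as follows: write each triple to disk as soon as the iterator yields it, so the full result never has to sit in memory. This is plain Java with no Jena dependency. `SimpleTriple`, `streamToFile`, and the example.org IRIs are hypothetical stand-ins, and the output format is a simplified N-Triples line that handles IRIs only. Jena's RIOT module may provide a ready-made writer for an iterator of triples; check the RDFDataMgr documentation for your version.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Iterator;

final class TripleStreamSketch {

    /** Hypothetical stand-in for a triple of IRIs; this is not a Jena class. */
    static final class SimpleTriple {
        final String s;
        final String p;
        final String o;

        SimpleTriple(String s, String p, String o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }

        /** Renders the triple as one N-Triples-style line (IRIs only, no literals). */
        String toNTriples() {
            return "<" + s + "> <" + p + "> <" + o + "> .";
        }
    }

    /** Writes one line per triple as it is pulled from the iterator; returns the count. */
    static long streamToFile(Iterator<SimpleTriple> triples, Path out) throws IOException {
        long count = 0;
        try (BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            while (triples.hasNext()) {
                w.write(triples.next().toNTriples());
                w.newLine();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("construct-result", ".nt");
        Iterator<SimpleTriple> it = Arrays.asList(
                new SimpleTriple("http://example.org/a", "http://example.org/p", "http://example.org/b"),
                new SimpleTriple("http://example.org/b", "http://example.org/p", "http://example.org/c")
        ).iterator();
        System.out.println("wrote " + streamToFile(it, out) + " triples to " + out);
    }
}
```

The resulting file can then be bulk-loaded into a fresh store as a separate step, which keeps the query and the load from competing for the same 4 GB of memory.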
>>
>>
>>
>>
>>>
>>> *Here is the SELECT I want to apply to generate my subgraph:*
>>>
>>
>> Long, incomplete query.
>>
>> Try reordering the FILTERs putting the || ones last.  May make no
>> difference but without sizing figures, only you can know.
>>
>>         Andy
>>
>>
>>> CONSTRUCT
>>>
>>> {
>>>
>>> ddidd:C0004057 ?r ?disease1 .
>>>
>>> ?disease1 ?r10 ddidd:C0004057 .
>>>
>>> ?disease1 ?r1 ?omim1 .
>>>
>>> ?omim1 ?r11 ?disease1 .
>>>
>>> ?omim1 ?r2 ?w .
>>>
>>> ?w ?r12 ?omim1 .
>>>
>>> ?omim1 ?r3 ?bp .
>>>
>>> ?bp ?r13 ?omim1 .
>>>
>>> ?omim1 ?r4 ?genotypePhenotype .
>>>
>>> ?genotypePhenotype ?r14 ?omim1 .
>>>
>>> ?omim1 ?r5 ?gene.
>>>
>>> ?gene ?r15 ?omim1 .
>>>
>>> ?w ?r6 ?gene2.
>>>
>>> ?gene2 ?r16 ?w .
>>>
>>> ?omim1 ?r7 ?gene3 .
>>>
>>> ?gene3 ?r17 ?omim1.
>>>
>>> ?gene3 ?r8 ?bp2 .
>>>
>>> ?bp2 ?r18 ?gene3 .
>>>
>>> ?gene3 ?r9 ?genotypePhenotype2 .
>>>
>>> ?genotypePhenotype2 ?r19 ?gene3 .
>>>
>>> ?gene a ?gCLASS.
>>>
>>> ?gene2 a ?g2CLASS.
>>>
>>> ?gene3 a ?g3CLASS.
>>>
>>> ?genotypePhenotype a ?genotypePhenotypeCLASS .
>>>
>>> ?genotypePhenotype2 a ?genotypePhenotype2CLASS.
>>>
>>> ?w a ?wCLASS .
>>>
>>> ?omim1 a ?omimt1 .
>>>
>>> ?bp a ?bpCLASS .
>>>
>>> ?bp2 a ?bp2CLASS .
>>>
>>> ddidd:C0004057 ddids:label ?ldrug1 .
>>>
>>> ?disease1 ddids:label ?ldisease1 .
>>>
>>> ?omim1 ddids:label ?lomim1 .
>>>
>>> ?w ddids:label ?lw .
>>>
>>> ?bp ddids:label ?lbp .
>>>
>>> ?genotypePhenotype ddids:label ?lgenotypePhenotype .
>>>
>>> ?gene ddids:label ?lgene .
>>>
>>> ?gene2 ddids:label ?lgene2 .
>>>
>>> ?gene3 ddids:label ?lgene3 .
>>>
>>> ?bp2 ddids:label ?lbp2 .
>>>
>>> ?genotypePhenotype2 ddids:label ?lgenotypePhenotype2 .
>>>
>>> }WHERE{
>>>
>>> ddidd:C0004057 ?r ?disease1 .
>>>
>>> ?disease1 ?r10 ddidd:C0004057 .
>>>
>>>   ?disease1 ?r1 ?omim1 .
>>>
>>> ?omim1 ?r11 ?disease1 .
>>>
>>> ?omim1 ?r2 ?w .
>>>
>>> ?w ?r12 ?omim1 .
>>>
>>> ?omim1 ?r3 ?bp .
>>>
>>> ?bp ?r13 ?omim1 .
>>>
>>> ?omim1 ?r4 ?genotypePhenotype .
>>>
>>> ?genotypePhenotype ?r14 ?omim1 .
>>>
>>> ?omim1 ?r5 ?gene.
>>>
>>> ?gene ?r15 ?omim1 .
>>>
>>> ?w ?r6 ?gene2.
>>>
>>> ?gene2 ?r16 ?w .
>>>
>>> ?omim1 ?r7 ?gene3 .
>>>
>>> ?gene3 ?r17 ?omim1.
>>>
>>> ?gene3 ?r8 ?bp2 .
>>>
>>> ?bp2 ?r18 ?gene3 .
>>>
>>> ?gene3 ?r9 ?genotypePhenotype2 .
>>>
>>> ?genotypePhenotype2 ?r19 ?gene3 .
>>>
>>> ?gene a ?gCLASS.
>>>
>>> ?gene2 a ?g2CLASS.
>>>
>>> ?gene3 a ?g3CLASS.
>>>
>>> ?genotypePhenotype a ?genotypePhenotypeCLASS .
>>>
>>> ?genotypePhenotype2 a ?genotypePhenotype2CLASS.
>>>
>>> ?w a ?wCLASS .
>>>
>>> ?omim1 a ?omimt1 .
>>>
>>> ?bp a ?bpCLASS .
>>>
>>> ?bp2 a ?bp2CLASS .
>>>
>>> ddidd:C0004057 ddids:label ?ldrug1 .
>>>
>>> ?disease1 ddids:label ?ldisease1 .
>>>
>>> ?omim1 ddids:label ?lomim1 .
>>>
>>> ?w ddids:label ?lw .
>>>
>>> ?bp ddids:label ?lbp .
>>>
>>> ?genotypePhenotype ddids:label ?lgenotypePhenotype .
>>>
>>> ?gene ddids:label ?lgene .
>>>
>>> ?gene2 ddids:label ?lgene2 .
>>>
>>> ?gene3 ddids:label ?lgene3 .
>>>
>>> ?bp2 ddids:label ?lbp2 .
>>>
>>> ?genotypePhenotype2 ddids:label ?lgenotypePhenotype2 .
>>>
>>>
>>> FILTER ( ?r = ddids:may_treat ||  ?r = ddids:may_prevent )
>>>
>>> FILTER (?omimt1 = ddids:gene || ?omimt1 = ddids:genotypePhenotype )
>>>
>>> FILTER (?wCLASS = ddids:pathway || ?r2 = ddids:gene_is_element_in_pathway
>>> )
>>>
>>> FILTER (?bpCLASS = ddids:biologicalProcess )
>>>
>>> FILTER (?bp2CLASS = ddids:biologicalProcess )
>>>
>>> FILTER (?genotypePhenotypeCLASS = ddids:genotypePhenotype )
>>>
>>> FILTER (?genotypePhenotype2CLASS = ddids:genotypePhenotype )
>>>
>>> FILTER (?gCLASS = ddids:gene )
>>>
>>> FILTER (?g2CLASS = ddids:gene )
>>>
>>> FILTER (?g3CLASS = ddids:gene )
>>>
>>> }
>>>
>>>
>>>     We still know little about your setup.
>>>
>>>
>>>
>>>             and my PC is a Mac Pro with a
>>>             2.4 GHz CPU and 4 GB of memory.
>>>
>>>
>>>     Java 32 bit or 64 bit?
>>>
>>>
>>> *java version "1.6.0_65"*
>>> *Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)*
>>> *Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)*
>>>
>>>
>>>
>>>
>>>             I was not able to use QueryExecution.execConstructTriples as
>>>             it returns an iterator and I want to save the subgraph
>>>             into a new TDB.
>>>
>>>
>>>     Why is that a problem? Add them to a TDB database.
>>>
>>>
>>> * I will try it and let you know. *
>>>
>>>
>>>
>>>     Or even use a SPARQL Update operation.
>>>
>>>
>>> *SPARQL Update, if I am not wrong, will not work in my case, as I want to
>>> create a subgraph from the whole data and store it in a new TDB. We can
>>> use SPARQL Update only if we can add data to the original TDB; am I
>>> right?*
>>>
>>>
>>>
>>>
>>>
>>>             Here is my code below:
>>>
>>>                FileLoader fileLoader = new FileLoader("src/DDICONSTRUCT.tql");
>>>
>>>                String q = fileLoader.loadAll();
>>>
>>>                Query query = QueryFactory.create(q);
>>>
>>>                QueryExecution qexec = QueryExecutionFactory.create(query, data.tdb);
>>>
>>>                Model constructModel = qexec.execConstruct();
>>>
>>>
>>>             The program has been running for almost a day now. Let me
>>>             know if there is something wrong or if there is an
>>>             alternative to CONSTRUCT.
>>>
>>>
>>>
>>>             On Sun, Nov 3, 2013 at 12:59 PM, Andy Seaborne
>>>             <[email protected] <mailto:[email protected]>> wrote:
>>>
>>>                 On 03/11/13 07:05, Adeeb Noor wrote:
>>>
>>>                     Hi Andy:
>>>
>>>                     I did figure it out; however, the CONSTRUCT takes too
>>>                     much time to finish, as my query is complex. Is that
>>>                     normal? In fact, it is still running.
>>>
>>>
>>>                 Hard to tell - it depends on many factors such as
>>>                 machine setup, where
>>>                 the data is stored, structure and volume of your data
>>>
>>>                 Try
>>>
>>>                 QueryExecution.execConstructTriples
>>>
>>>
>>>                           Andy
>>>
>>>
>>>
>>>                     AdeeB
>>>
>>>
>>>                     On Sat, Nov 2, 2013 at 9:56 AM, Adeeb Noor
>>>                     <[email protected]
>>>                     <mailto:[email protected]>>
>>>
>>>                     wrote:
>>>
>>>                        Hi Andy:
>>>
>>>
>>>                         Thanks for the quick response. I tried CONSTRUCT
>>>                         and it worked. But how can I rewrite a query
>>>                         like this one as a CONSTRUCT query:
>>>
>>>                         SELECT DISTINCT *
>>>
>>>                             {
>>>
>>>                              ?ddi ddids:has_association ?c .
>>>
>>>                             ?ddi ddids:has_association ?c2 .
>>>
>>>                         ?c ddids:chemical_or_drug_affects_gene_product ?omim .
>>>
>>>                         ?omim ddids:gene_product_encoded_by_gene ?g .
>>>
>>>                         ?g ddids:gene_plays_role_in_process ?w .
>>>
>>>                         ?g ddids:gene_plays_role_in_process ?bp .
>>>
>>>
>>>                         ?bp ddids:process_involves_gene ?g2 .
>>>
>>>                         ?g2 ddids:gene_plays_role_in_process ?bp2 .
>>>
>>>
>>>
>>>                         where I need each variable (for example ?w, ?bp,
>>>                         etc.) to be a new resource.
>>>
>>>                         Thanks
>>>
>>>
>>>                         On Sat, Nov 2, 2013 at 6:41 AM, Andy Seaborne
>>>                         <[email protected] <mailto:[email protected]>>
>>> wrote:
>>>
>>>                            You need to use a CONSTRUCT query, not a
>>>                         SELECT one.
>>>
>>>
>>>                             outputAsRDF encodes the result set (i.e. the
>>>                             table) as RDF - it is not
>>>                             the datamodel of the original data.
>>>
>>>                             CONSTRUCT allows you to create one RDF graph
>>>                             from data from another.
>>>
>>>                             See also SPARQL Update for doing that from
>>>                             one graph to another in the
>>>                             same database.
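
To illustrate the difference described above: a CONSTRUCT query fills in a triple template once per solution row and collects the resulting triples into a graph, whereas outputAsRDF describes the result table itself. Below is a rough sketch of that idea in plain Java, not Jena's implementation. The class and method names are hypothetical, the `ex:` values are made-up bindings, and only the two predicate names are taken from the query in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ConstructTemplateSketch {

    /** Replaces a "?var" term with its binding; returns null when the variable is unbound. */
    static String resolve(String term, Map<String, String> row) {
        return term.startsWith("?") ? row.get(term.substring(1)) : term;
    }

    /**
     * Instantiates every template triple once per solution row and collects the
     * results in a set, so repeated triples collapse as they would in a graph.
     */
    static Set<List<String>> construct(List<String[]> template, List<Map<String, String>> rows) {
        Set<List<String>> graph = new LinkedHashSet<>();
        for (Map<String, String> row : rows) {
            for (String[] pattern : template) {
                String s = resolve(pattern[0], row);
                String p = resolve(pattern[1], row);
                String o = resolve(pattern[2], row);
                if (s != null && p != null && o != null) {   // skip triples with unbound variables
                    graph.add(Arrays.asList(s, p, o));
                }
            }
        }
        return graph;
    }

    public static void main(String[] args) {
        List<String[]> template = new ArrayList<>();
        template.add(new String[] {"?ddi", "ddids:has_association", "?c"});
        template.add(new String[] {"?bp", "ddids:process_involves_gene", "?g2"});

        // One solution row with made-up bindings.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("ddi", "ex:ddi1");
        row.put("c", "ex:chem1");
        row.put("bp", "ex:process1");
        row.put("g2", "ex:gene2");

        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(row);
        for (List<String> t : construct(template, rows)) {
            System.out.println(String.join(" ", t) + " .");
        }
    }
}
```

In the real query, the template is the block after CONSTRUCT and the rows come from the WHERE pattern, so each variable ends up as an ordinary resource in the output graph rather than as an `rs:binding` node.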
>>>
>>>                                        Andy
>>>
>>>
>>>                             On 02/11/13 05:35, Adeeb Noor wrote:
>>>
>>>                                Hi guys:
>>>
>>>
>>>                                 I would like to save my SPARQL results
>>>                                 from a ResultSet as new RDF (new RDF
>>>                                 resources), because I want to do more work
>>>                                 on this subgraph and it has to be in the
>>>                                 original RDF format.
>>>
>>>                                 I tried the outputAsRDF function and it
>>>                                 worked; however, I got the following
>>>                                 result:
>>>
>>>                                 <rdf:Description rdf:nodeID="A5">
>>>                                         <rs:value rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.owl#genotypePhenotype"/>
>>>                                         <rs:variable>omimt</rs:variable>
>>>
>>>                                       </rdf:Description>
>>>                                       <rdf:Description rdf:nodeID="A6">
>>>                                         <rs:value rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.rdf#C0007589"/>
>>>                                         <rs:variable>w</rs:variable>
>>>                                       </rdf:Description>
>>>                                       <rdf:Description rdf:nodeID="A7">
>>>                                         <rs:binding rdf:nodeID="A8"/>
>>>                                         <rs:binding rdf:nodeID="A9"/>
>>>                                         <rs:binding rdf:nodeID="A10"/>
>>>                                         <rs:binding rdf:nodeID="A11"/>
>>>                                         <rs:binding rdf:nodeID="A12"/>
>>>                                         <rs:binding rdf:nodeID="A13"/>
>>>                                         <rs:binding rdf:nodeID="A14"/>
>>>                                         <rs:binding rdf:nodeID="A15"/>
>>>                                         <rs:binding rdf:nodeID="A16"/>
>>>                                         <rs:binding rdf:nodeID="A17"/>
>>>                                         <rs:binding rdf:nodeID="A18"/>
>>>                                         <rs:binding rdf:nodeID="A19"/>
>>>                                         <rs:binding rdf:nodeID="A20"/>
>>>                                         <rs:binding rdf:nodeID="A21"/>
>>>                                         <rs:binding rdf:nodeID="A22"/>
>>>                                         <rs:binding rdf:nodeID="A23"/>
>>>                                         <rs:binding rdf:nodeID="A24"/>
>>>                                         <rs:binding rdf:nodeID="A25"/>
>>>                                         <rs:binding rdf:nodeID="A26"/>
>>>                                         <rs:binding rdf:nodeID="A27"/>
>>>                                       </rdf:Description>
>>>
>>>                                 How can I remove these node entries and
>>>                                 make it something like:
>>>
>>>                                      <rdf:Description rdf:about="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.rdf#C3229174">
>>>                                         <j.0:label>Cytra-K Oral Product</j.0:label>
>>>                                         <rdf:type rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.owl#chemical"/>
>>>                                       </rdf:Description>
>>>
>>>                                 please help me out
>>>
>>>
>>>
>>>
>>>
>>>                         --
>>>                         Adeeb Noor
>>>                         Ph.D. Candidate
>>>                         Dept of Computer Science
>>>                         University of Colorado at Boulder
>>>                         Cell: 571-484-3303 <tel:571-484-3303>
>>>                         Email: [email protected]
>>>                         <mailto:[email protected]>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>             --
>>>             Adeeb Noor
>>>             Ph.D. Candidate
>>>             Dept of Computer Science
>>>             University of Colorado at Boulder
>>>             Cell: 571-484-3303 <tel:571-484-3303>
>>>             Email: [email protected] <mailto:Adeeb.noor@colorado.
>>> edu>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Adeeb Noor
>>> Ph.D. Candidate
>>> Dept of Computer Science
>>> University of Colorado at Boulder
>>> Cell: 571-484-3303
>>> Email: [email protected] <mailto:[email protected]>
>>>
>>
>>
>
>
> --
> Adeeb Noor
> Ph.D. Candidate
> Dept of Computer Science
> University of Colorado at Boulder
> Cell: 571-484-3303
> Email: [email protected]
>



-- 
Adeeb Noor
Ph.D. Candidate
Dept of Computer Science
University of Colorado at Boulder
Cell: 571-484-3303
Email: [email protected]
