Re: Strange behaviour of XMLLiterals in RDF/XML

2012-06-26 Thread Claude Warren
On Mon, Jun 25, 2012 at 12:57 PM, Martynas Jusevičius marty...@graphity.org
 wrote:


 Both <br/> and <br></br> are well-formed and equivalent in the XML
 context, so why the difference in serialization?
 I'm using Jena 2.6.4 and ARQ 2.8.7.

 Martynas
 graphity.org


Back in the bad old days <br/> was not parsed correctly by some parsers and
had to be written as <br /> (note the space).  If you try that
substitution, does it work?

Claude


-- 
I like: Like Like - The likeliest place on the web: http://like-like.xenei.com
Identity: https://www.identify.nu/user.php?cla...@xenei.com
LinkedIn: http://www.linkedin.com/in/claudewarren


Re: Can SDB 1.3.4 be used with Jena 2.7.1?

2012-06-26 Thread Martynas Jusevičius
Have you tried TDB? I think it's currently more actively developed and more
performant.
On Jun 26, 2012 1:55 AM, Holger Knublauch hol...@knublauch.com wrote:

 Yes this is my hope, assuming SDB still works for us.

 Holger


 On 6/26/2012 9:46, Martynas Jusevičius wrote:

 Holger,

 does that also mean a new release of SPIN API which will be packaged
 with the latest Jena?

 Martynas

 On Tue, Jun 26, 2012 at 1:33 AM, Holger Knublauch hol...@knublauch.com
 wrote:

 We are now starting the process of upgrading our platform to the latest
 Jena
 version(s). I noticed that SDB has not been released yet as an Apache
 module. Question: is it safe to use SDB 1.3.4 in conjunction with Jena
 2.7.1?

 Apologies if this has been asked before.

 Thanks
 Holger






Re: Can SDB 1.3.4 be used with Jena 2.7.1?

2012-06-26 Thread Holger Knublauch
Yes sure, but this doesn't help with our existing customer base of SDB 
users. A good reason for using SDB is the use of standard tools to 
back-up your data etc.


Holger


On 6/26/2012 17:41, Martynas Jusevičius wrote:

Have you tried TDB? I think it's currently more actively developed and more
performant.
On Jun 26, 2012 1:55 AM, Holger Knublauch hol...@knublauch.com wrote:


Yes this is my hope, assuming SDB still works for us.

Holger


On 6/26/2012 9:46, Martynas Jusevičius wrote:


Holger,

does that also mean a new release of SPIN API which will be packaged
with the latest Jena?

Martynas

On Tue, Jun 26, 2012 at 1:33 AM, Holger Knublauch hol...@knublauch.com
wrote:


We are now starting the process of upgrading our platform to the latest
Jena
version(s). I noticed that SDB has not been released yet as an Apache
module. Question: is it safe to use SDB 1.3.4 in conjunction with Jena
2.7.1?

Apologies if this has been asked before.

Thanks
Holger









Re: Reading JSON from Virtuoso OpenSource output

2012-06-26 Thread Andy Seaborne

#include everything Rob says about CONSTRUCT queries.

1/
But also, the JSON result set parser has a bug in it - it is reading the 
"link" field as a string, but it should be an array.


This is now fixed in SVN.  The development snapshot build has the fix in it.

https://repository.apache.org/content/repositories/snapshots/org/apache/jena/apache-jena/

2/
But I think the data has problems as well:

For example:

"s": { "type": "uri", "value": "_:vb43419" }

That looks like it is meant to be a bNode, not a URI.  It is illegal as a 
URI because there is no "_" URI scheme, and scheme names contain only 
letters, digits, plus (+), period (.), or hyphen (-).


"s": { "type": "bnode", "value": "_:vb43419" }

Andy

PS For testing, ARQ has a command line tool arq.rset for reading and 
writing result sets.



On 26/06/12 00:35, Rob Vesse wrote:

Hi Lorena

JenaReaderRdfJson is for reading a JSON serialization of RDF.  The
serialization you are trying to read is the JSON serialization of SPARQL
results, which is completely different.

I notice you say that you use a CONSTRUCT query, but the results you show
are in the SPARQL Results JSON format, which should only be used for
ASK/SELECT queries.  If Virtuoso is replying with that to your CONSTRUCT
query then it is behaving incorrectly and you should report a bug to
them.

If you genuinely expect SPARQL results instead, then use
ResultSetFactory.fromJSON(), which will give you a ResultSet object.
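For example, a minimal sketch of parsing SPARQL results JSON with the ARQ API of that era (the class name and the inline JSON string are placeholders, not from this thread):

import java.io.ByteArrayInputStream;
import java.io.InputStream;

import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFactory;
import com.hp.hpl.jena.query.ResultSetFormatter;

public class ReadSparqlJsonResults {
    public static void main(String[] args) {
        // Placeholder document; in practice this would be the response body
        // returned by the endpoint for a SELECT/ASK query.
        String jsonStr = "{ \"head\": { \"vars\": [\"s\"] }, "
                       + "\"results\": { \"bindings\": [] } }";
        InputStream in = new ByteArrayInputStream(jsonStr.getBytes());
        ResultSet rs = ResultSetFactory.fromJSON(in);    // parse SPARQL results JSON
        ResultSetFormatter.out(System.out, rs);          // print as a text table
    }
}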

Rob


On 6/25/12 3:14 PM, lorena lore...@fing.edu.uy wrote:


Hi:

I'm trying to process the results of performing a CONSTRUCT query on
Virtuoso using apache-jena-2.7.0-incubating.
[1] shows the JSON string I would like to read (schemaStr).

Here is an extract of my code:

SysRIOT.wireIntoJena();
Model modelSchema = ModelFactory.createDefaultModel();
RDFReader schemaReader = new JenaReaderRdfJson() ;

StringReader s = new StringReader(schemaStr);
schemaReader.read(modelSchema, s, "");

And I receive the following exception, raised on the line that executes
the read:

com.hp.hpl.jena.shared.JenaException: org.openjena.riot.RiotException:
[line: 2, col: 3 ] Relative IRI: head
at
org.openjena.riot.system.JenaReaderRIOT.readImpl(JenaReaderRIOT.java:150)
at org.openjena.riot.system.JenaReaderRIOT.read(JenaReaderRIOT.java:54)




It seems to have trouble reading the "head" section.
My questions:
Is Virtuoso's JSON output compatible with what JenaReaderRdfJson expects to
read?
Am I missing something else?
I'm using the empty string ("") as the base URI in the read method, but I
don't understand what the read method is expecting in this field.

Thanks in advance
Lorena



[1]
{ "head": { "link": [], "vars": [ "s", "p", "o" ] },
  "results": { "distinct": false, "ordered": true, "bindings": [
    { "s": { "type": "uri", "value": "_:vb43419" }, "p": { "type": "uri", "value": "http://purl.org/olap#hasAggregateFunction" }, "o": { "type": "uri", "value": "http://purl.org/olap#sum" } },
    { "s": { "type": "uri", "value": "_:vb43418" }, "p": { "type": "uri", "value": "http://purl.org/olap#level" }, "o": { "type": "uri", "value": "http://example.org/householdCS#year" } },
    { "s": { "type": "uri", "value": "http://example.org/householdCS#household_withoutGeo" }, "p": { "type": "uri", "value": "http://purl.org/linked-data/cube#component" }, "o": { "type": "uri", "value": "_:vb43418" } },
    { "s": { "type": "uri", "value": "http://example.org/householdCS#household_withoutGeo" }, "p": { "type": "uri", "value": "http://purl.org/linked-data/cube#component" }, "o": { "type": "uri", "value": "_:vb43419" } },
    { "s": { "type": "uri", "value": "_:vb43419" }, "p": { "type": "uri", "value": "http://purl.org/linked-data/cube#measure" }, "o": { "type": "uri", "value": "http://example.org/householdCS#household" } },
    { "s": { "type": "uri", "value": "http://example.org/householdCS#householdCS" }, "p": { "type": "uri", "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" }, "o": { "type": "uri", "value": "http://purl.org/linked-data/cube#DataStructureDefinition" } } ] } }







Re: Can SDB 1.3.4 be used with Jena 2.7.1?

2012-06-26 Thread Andy Seaborne

Yes (it's 1.3.4-SNAPSHOT)

SDB is being built against Jena each night.

You can check the POM for version - it says 2.7.2-SNAPSHOT but there are 
no changes from 2.7.1.


Andy

PS Your next question will be about a release.

We need a way to test SDB on all, or at least most, of the databases 
supported.  Can you help?


On 26/06/12 08:46, Holger Knublauch wrote:

Yes sure, but this doesn't help with our existing customer base of SDB
users. A good reason for using SDB is the use of standard tools to
back-up your data etc.

Holger


On 6/26/2012 17:41, Martynas Jusevičius wrote:

Have you tried TDB? I think it's currently more actively developed and
more
performant.
On Jun 26, 2012 1:55 AM, Holger Knublauch hol...@knublauch.com wrote:


Yes this is my hope, assuming SDB still works for us.

Holger


On 6/26/2012 9:46, Martynas Jusevičius wrote:


Holger,

does that also mean a new release of SPIN API which will be packaged
with the latest Jena?

Martynas

On Tue, Jun 26, 2012 at 1:33 AM, Holger Knublauch
hol...@knublauch.com
wrote:


We are now starting the process of upgrading our platform to the
latest
Jena
version(s). I noticed that SDB has not been released yet as an Apache
module. Question: is it safe to use SDB 1.3.4 in conjunction with Jena
2.7.1?

Apologies if this has been asked before.

Thanks
Holger












How to convert assign URL to blank node?

2012-06-26 Thread franswors...@googlemail.com
How can I assign a URI to a blank node? The Resource class only 
provides getURI() and getId() methods, but the URI can't be set. Do I 
have to create a new Resource, copy all properties, and delete the 
original node?


Re: How to convert assign URL to blank node?

2012-06-26 Thread Andy Seaborne

On 26/06/12 01:30, franswors...@googlemail.com wrote:

How can I assign an URI to a blank node? The Resource class only
provides getURI() or getId() methods, but the URI can't be set. Do I
have to create a new Resource, copy all properties and delete the
original node?



Yes, you create a new resource.  Resources are immutable - you can't 
modify them after creation.


Andy



Re: Property not removed?

2012-06-26 Thread Martynas Jusevičius

 Note that the statement seems to point to a bNode so a simple remove is
 probably not enough anyway.



Dave, can you elaborate on this a little?

What I'm trying to do is to replace an ORDER BY expression like this one

sp:orderBy ([ a   sp:Desc ;
sp:expression :TriplesVar
  ])

with my own -- say, change :TriplesVar into another node, or sp:Desc to sp:Asc.

Martynas


Re: Property not removed?

2012-06-26 Thread Dave Reynolds

On 26/06/12 09:56, Martynas Jusevičius wrote:


Note that the statement seems to point to a bNode so a simple remove is
probably not enough anyway.




Dave, can you elaborate on this a little?


I just meant that the contents of the bNode would remain. For some 
purposes that might be a problem. I guess the likelihood is that SPIN 
won't care and a few bits of no-longer-connected bNodes lying around in 
the model would do no harm.
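As an illustration only (not from this thread), a minimal Jena sketch of flipping sp:Desc to sp:Asc on such a descriptor; removeProperties() is what clears a bNode's own statements if you drop it instead. The SPIN namespace URI is assumed to be the standard http://spinrdf.org/sp#, and the list wrapping around the order-by bNode is left out for brevity.

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.RDF;

public class FlipOrderDirection {
    static final String SP = "http://spinrdf.org/sp#";   // assumed SPIN namespace

    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();
        Resource desc = m.createResource(SP + "Desc");
        Resource asc  = m.createResource(SP + "Asc");

        Resource orderExpr = m.createResource().addProperty(RDF.type, desc); // the bNode

        // Change the direction in place: drop the old rdf:type, add the new one.
        orderExpr.removeAll(RDF.type).addProperty(RDF.type, asc);

        // If you were deleting the whole expression, removeProperties() clears
        // the bNode's outgoing statements so they don't linger in the model.
        // orderExpr.removeProperties();
    }
}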


Dave


Re: How to convert assign URL to blank node?

2012-06-26 Thread Damian Steer

On 26/06/12 09:20, Andy Seaborne wrote:


You can use ResourceUtils.renameResource(oldResource, uri) [1] to
achieve the same effect. Behind the scenes this removes old statements
using oldResource and makes new ones with uri.

Damian

[1]
http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/util/ResourceUtils.html
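For illustration, a minimal sketch of using it (the example URI and label are made up):

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.util.ResourceUtils;
import com.hp.hpl.jena.vocabulary.RDFS;

public class RenameBlankNode {
    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();
        Resource bnode = m.createResource();              // a blank node
        bnode.addProperty(RDFS.label, "was anonymous");

        // Rewrites every statement mentioning the blank node to use the URI.
        Resource named = ResourceUtils.renameResource(bnode, "http://example.org/thing/1");
        named.getModel().write(System.out, "TURTLE");
    }
}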


Re: How to convert assign URL to blank node?

2012-06-26 Thread Damian Steer

On 26/06/12 09:20, Andy Seaborne wrote:
 On 26/06/12 01:30, franswors...@googlemail.com wrote:
 How can I assign an URI to a blank node? The Resource class only 
 provides getURI() or getId() methods, but the URI can't be set. 
 Do I have to create a new Resource, copy all properties and 
 delete the original node?
 
 
 Yes, you create a new resource.  Resources are immutable - you 
 can't modify them after creation.

You can use ResourceUtils.renameResource(oldResource, uri) [1] to
achieve the same effect. Behind the scenes this removes old statements
using oldResource and makes new ones with uri.

Damian

[1]
http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/util/ResourceUtils.html


Re: memory issues using TDB from servlet

2012-06-26 Thread Andy Seaborne

A few questions and then a suggestion:

How much physical RAM does the machine have?
Which version of the Jena software is this?
Is this running on MS Windows?

If you are on 64 bit hardware, then TDB uses out-of-heap memory as well 
as heap memory.


But what I am most suspicious of is

Dataset dataset = getDataset();
...
dataset.close();

which seems to be opening the database on every call, which may be the cause 
of your problems.  You may have many copies of the in-RAM data structures 
(especially on 64-bit Windows, which does not release memory-mapped 
segments during the lifetime of the JVM - an (in)famous Java bug).


You should open the database once at start-up and not close it when a 
request is finished.


With transactions, you can get away with not closing it at all, but to be 
neat, close it at shutdown if you like.
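For example, a minimal sketch of that pattern using a ServletContextListener (the class name, dataset location and the listener approach are illustrative assumptions, not the poster's actual code):

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TdbLifecycleListener implements ServletContextListener {
    private static volatile Dataset dataset;

    public static Dataset getDataset() {
        return dataset;                                   // shared by all requests
    }

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        dataset = TDBFactory.createDataset("/var/data/tdb");   // open once at start-up
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        if (dataset != null) {
            dataset.close();                              // close once at shutdown
        }
    }
}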


Otherwise, could you turn this into a standalone test case that 
simulates your set up but runs outside Spring so we can debug it?


Andy

On 25/06/12 21:55, Stephan Zednik wrote:

I have been having memory issues using TDB from a java servlet.  Memory usage 
by tomcat increases until the service becomes unresponsive and must be 
restarted.  The service operations appear to be completing successfully until 
the service becomes unresponsive.

The memory usage will rapidly rise to whatever my heap max size 
(CATALINA_OPTS="-Xms512m -Xmx4096m") or my available RAM can hold before 
the service becomes unresponsive.  Generally during testing that has been 1.5-1.6 GB 
before my RAM fills up.

I have a fairly simple set of unit tests, it does not have full coverage but 
what tests I do have all pass.

I am using Spring web.

Below is my Spring controller class; it asks the application context for a 
Dataset, which causes Spring to invoke DatasetFactoryBean.getObject().  The 
DatasetFactoryBean is a singleton that has been initialized with the location of my TDB 
dataset.

The controller method is fairly simple.  A POST request contains an XML payload.  
The payload is passed to a service method that parses the XML and generates an 
RDF representation of the input data, which is stored in an in-memory 
Jena model.  AnalysisSettings is a class that acts as a proxy to the Jena Model, 
with methods for manipulating/accessing the encoded RDF.

I have commented out the TDB-related code and tested both the xml parsing and 
xml parsing + in-memory rdf.  Service memory usage slowly grows to a level I am 
unhappy with (~1GB according to ActivityMonitor.app and VisualVM), but does 
stabilize.  Since it stabilizes and grows slowly I do not think it is the main 
culprit of my current memory problem.

If I test the TDB Dataset creation code, but leave all queries run against the 
TDB dataset commented out, memory usage grows much quicker to the 1.5 GB range 
before my RAM is full and the service becomes unresponsive.

My tests against the deployed servlet make 1000 requests against the 
service.  I check the response of each request to ensure it succeeded and wait 
10 ms before sending the next request.  The wait between runs of the test suite is 
around 6 seconds.  When TDB Dataset connections are made (but no queries are 
run), the service will become unresponsive within the 3rd or 4th run of the 
test suite, so somewhere in the 4k-5k request range.

Is this an unreasonable test suite?

Perhaps I need to adjust my tomcat configuration?  I am using the default 
except for -Xms and -Xmx.

Here are the relevant methods from my controller class

public class AnalysisSettingsController implements ApplicationContextAware {

 // private vars ...

private Dataset getDataset() {
return (Dataset) context.getBean("dataset");
}

@RequestMapping(value = "/test", method = RequestMethod.POST, consumes = 
{"application/xml", "text/xml"})
public void test(HttpServletRequest request, HttpServletResponse 
response) throws IOException {
logger.info("in create(...)");

OntModel m = 
ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
try {
AnalysisSettings settings = service.load(m, 
request.getInputStream()); // creates rdf representation of input, stores in 
in-memory model (m)

try {
String id = settings.getIdentifier();
String location = 
request.getRequestURL().toString() + "/report/" + id;
response.setHeader("Location", location);

Dataset dataset = getDataset();
logger.info("dataset connection opened");
try {
/* commented out during testing
if(service.has(dataset, id)) {

response.setStatus(HttpServletResponse.SC_FOUND);
   

Re: Queries with Multiple Aggregates in Select

2012-06-26 Thread Andy Seaborne

On 25/06/12 23:01, Stephen Allen wrote:

All,

I have a question about what the expected results are of a query with
multiple aggregates when there are no matching solutions, specifically
if one of them is COUNT.

Take the following query for example:

PREFIX books:   <http://example.org/book/>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
select (count(?b) as ?bookCount) (min(?title) as ?firstBook)
where {
   ?b dc:title ?title
}

If you run it against the books database on sparql.org you get:
(?bookCount ?firstBook) {
   ("7"^^<http://www.w3.org/2001/XMLSchema#integer> "Harry Potter and
the Chamber of Secrets")
}

However, running it against an empty triple store (or
s/dc:title/dc:title2) brings back a resultset consisting of a single
row with both variables unbound.  Intuitively, I would expect that you
should instead get back a single binding like:
(?bookCount ?firstBook) {
   ("0"^^<http://www.w3.org/2001/XMLSchema#integer> UNDEF)
}


Does anyone know if this behavior expected?  I'm running against Fuseki 0.2.2.


There's a bug when the second aggregate evaluates to an error on zero rows. 
If you reverse the select expressions you'll see a difference.


select (min(?title) as ?firstBook) (count(?b) as ?bookCount)

Then you get what I expect - a zero and an unbound variable (min is 
undefined).


So you get one row (for the aggregates) and a bound variable (count).

Fixed in SVN.

Andy




-Stephen






Re: Want to run SPARQL Query with Hadoop Map Reduce Framework

2012-06-26 Thread Alex Miller
 Right now I am only using DBPedia, Geoname and NYTimes for LOD cloud. And
 later on I want to extend my dataset.

 By the way, yes, I can use SPARQL directly to collect my required
 statistics, but my assumption is that using Hadoop could give me some boost in
 collecting those statistics.

 Sincerely
 Md Mizanur


Hello Md,

The Revelytix Spinner product supports SPARQL in Hadoop if you're
interested (SPARQL translated to map/reduce jobs). To fully use the
parallelism of Hadoop you would need to import all of the data.  You might
also find that using Spinner outside of Hadoop, with simple federation via
the SERVICE extension, is sufficient; that is also supported.

http://www.revelytix.com/content/download-spinner

Alex Miller


Re: memory issues using TDB from servlet

2012-06-26 Thread Stephan Zednik

On Jun 26, 2012, at 4:24 AM, Andy Seaborne wrote:

 a few questions and then a a suggestion:
 
 How much physical RAM does the machine have?

4 GB

 Which version of the Jena software is this?

0.9.0-incubating (set via Maven)

 Is this running on MS Windows?

Mac OSX 10.7.4

 
 If you are on 64 bit hardware, then TDB uses out-of-heap memory as well as 
 heap memory.

I am on 64bit software.

 
 But what I am most suspicious of is
 
 Dataset dataset = getDataset();
 ...
 dataset.close();

Ah. I thought opening a Dataset was like opening a JDBC connection, and I could 
consequently open and close Datasets as needed.

 
 which seems to opening the database on every call which may be the cause of 
 your problems.  You may have many copies of the in-RAM datastructres 
 (especially on 64-bit Windows which does not release mnamory mapped segments 
 during the lifetime of the JVM - (in)famous Java bug).
 
 You should open the database once at start up, do not close it when a request 
 is finished.
 
 With transactions, you can get away with not closing it at all but to be 
 neat, close at shutdown if you like.

OK, I will modify DatasetFactoryBean to return only one instance of Dataset and 
add logic to shut down the Dataset at servlet close.
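Something along these lines, as a rough sketch (class and property names are assumptions, not Stephan's actual bean): a singleton Spring FactoryBean that opens the TDB dataset once and closes it when the application context shuts down.

import org.springframework.beans.factory.DisposableBean;
import org.springframework.beans.factory.FactoryBean;

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.tdb.TDBFactory;

public class DatasetFactoryBean implements FactoryBean<Dataset>, DisposableBean {
    private String location;          // set from the Spring configuration
    private Dataset dataset;

    public void setLocation(String location) { this.location = location; }

    @Override
    public synchronized Dataset getObject() {
        if (dataset == null) {
            dataset = TDBFactory.createDataset(location);   // open once
        }
        return dataset;
    }

    @Override
    public Class<?> getObjectType() { return Dataset.class; }

    @Override
    public boolean isSingleton() { return true; }

    @Override
    public void destroy() {
        if (dataset != null) dataset.close();                // close at shutdown
    }
}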

 
 Otherwise, could you turn this into a standalone test case that simulates 
 your set up but runs outside Spring so we can debug it?

Taking it outside of Spring would require a great deal of refactoring; it would 
be easier to send my full project (built via Maven).

First though, I will make the change suggested above and report back to the 
list.

--Stephan

 
   Andy
 
 On 25/06/12 21:55, Stephan Zednik wrote:
 I have been having memory issues using TDB from a java servlet.  Memory 
 usage by tomcat increases until the service becomes unresponsive and must be 
 restarted.  The service operations appear to be completing successfully 
 until the service becomes unresponsive.
 
 The memory usage will rapidly rise to whatever my heap max size 
 (CATALINA_OPTS=-Xms512m -Xmx4096m) or what my available RAM can hold 
 before the service becomes unresponsive.  Generally during testing that has 
 been 1.5-1.6 GB before my RAM is full up.
 
 I have a fairly simple set of unit tests, it does not have full coverage but 
 what tests I do have all pass.
 
 I am using Spring web.
 
 Below is my Spring Controller class, it asks the application context for a 
 Dataset which causes Spring to invoke DatasetFactoryBean.getObject().  The 
 DatasetFactoryBean is a singleton has been initialized with location to my 
 TDB dataset.
 
 The controller method is fairly simple.  A post request contains XML 
 payload.  The payload is passed to a service method that parses the XML and 
 generates an RDF representation of the input data, encoded as RDF and stored 
 in an in-memory jena model.  AnalysisSettings is a class that acts as a 
 proxy to the Jena Model with methods for manipulating/accessing the encoded 
 RDF.
 
 I have commented out the TDB-related code and tested both the xml parsing 
 and xml parsing + in-memory rdf.  Service memory usage slowly grows to a 
 level I am unhappy with (~1GB according to ActivityMonitor.app and 
 VisualVM), but does stabilize.  Since it stabilizes and grows slowly I do 
 not think it is the main culprit of my current memory problem.
 
 If I test the TDB Dataset creation code, but leave all queries run against 
 the TDB dataset commented out, memory usage grows much quicker to the 1.5 GB 
 range before my RAM is full and the service becomes unresponsive.
 
 My tests against the deployed servlet are to make 1000 requests against the 
 service.  I check the response of each request to ensure it succeeded and 
 wait 10 ms before sending the next request.  Wait between runs of the test 
 suite is around 6 seconds.  When TDB Dataset connections are made (but no 
 queries are run), the service will become unresponsive within the 3rd of 4th 
 run of the test suite, so somewhere in the 4k-5k request range.
 
 Is this an unreasonable test suite?
 
 Perhaps I need to adjust my tomcat configuration?  I am using the default 
 except for -Xms and -Xmx.
 
 Here are the relevant methods from my controller class
 
 public class AnalysisSettingsController implements ApplicationContextAware {
 
 // private vars ...
 
  private Dataset getDataset() {
  return (Dataset) context.getBean(dataset);
  }
 
  @RequestMapping(value=/test, method = RequestMethod.POST, consumes = 
 {application/xml, text/xml})
  public void test(HttpServletRequest request, HttpServletResponse 
 response) throws IOException {
  logger.info(in create(...));
 
  OntModel m = 
 ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
  try {
  AnalysisSettings settings = service.load(m, 
 request.getInputStream()); // creates rdf representation of input, stores in 
 in-memory model (m)
 
  try {

Correct SPARQL query for all information for particular Individual

2012-06-26 Thread Lewis John Mcgibbney
Hi Everyone,
Having produced a subset of a rather large ontology, I'm now attempting
to write a SPARQL query to retrieve all attributes of any given
individual if the name matches. An example of one of my
NamedIndividuals is below:

<!-- http://www.buildingsmart-tech.org/ifcXML/IFC2x3/FINAL/IFC2X3_subset.owl#IfcBeam -->

<NamedIndividual rdf:about="&IFC2X3_subset;IfcBeam">
    <rdf:type>
        <Restriction>
            <onProperty rdf:resource="&IFC2X3_subset;hasSubstitutionGroup"/>
            <allValuesFrom rdf:resource="&IFC2X3_subset;IfcBuildingElement"/>
        </Restriction>
    </rdf:type>
    <rdf:type>
        <Restriction>
            <onProperty rdf:resource="&IFC2X3_subset;hasNillableValue"/>
            <allValuesFrom rdf:resource="&xsd;boolean"/>
        </Restriction>
    </rdf:type>
    <IFC2X3_subset:hasComplexTypeName rdf:datatype="&xsd;Name">IfcBeam</IFC2X3_subset:hasComplexTypeName>
    <IFC2X3_subset:hasExtensionBase rdf:datatype="&rdfs;Literal">ifc:IfcBuildingElement</IFC2X3_subset:hasExtensionBase>
    <IFC2X3_subset:hasNillableValue rdf:datatype="&xsd;boolean">true</IFC2X3_subset:hasNillableValue>
    <IFC2X3_subset:hasName rdf:resource="&IFC2X3_subset;IfcBeam"/>
    <IFC2X3_subset:isOfType rdf:resource="&IFC2X3_subset;IfcBeam"/>
</NamedIndividual>

Now say my query matched the rdf:about and the
<IFC2X3_subset:hasName rdf:resource="&IFC2X3_subset;IfcBeam"/> out of
all of the individuals I have persisted; I would like all the
information above to be returned within the response...

Any help would be greatly appreciated.

Thank you very much in advance
Lewis

-- 
Lewis


Re: Correct SPARQL query for all information for particular Individual

2012-06-26 Thread Andy Seaborne

I don't do RDF/XML :-) but this may achieve what you want.

PREFIX 
DESCRIBE ?x { ?x IFC2X3_subset:hasName IFC2X3_subset:IfcBeam }

The default for DESCRIBE is the bNode closure.

If you know the structure, then you can use CONSTRUCT or extract into 
variables with SELECT.
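
A minimal sketch of running such a DESCRIBE with ARQ against an in-memory model (the prefix URI and file name are assumptions, since the real IFC2X3_subset namespace is not shown in the thread):

import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.util.FileManager;

public class DescribeIndividual {
    public static void main(String[] args) {
        Model model = FileManager.get().loadModel("IFC2X3_subset.owl");   // assumed file name

        String queryStr =
            "PREFIX IFC2X3_subset: <http://example.org/IFC2X3_subset#>\n" +  // assumed URI
            "DESCRIBE ?x { ?x IFC2X3_subset:hasName IFC2X3_subset:IfcBeam }";

        QueryExecution qe = QueryExecutionFactory.create(queryStr, model);
        try {
            Model description = qe.execDescribe();   // bNode closure of the matched ?x
            description.write(System.out, "TURTLE");
        } finally {
            qe.close();
        }
    }
}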


Andy

On 26/06/12 18:14, Lewis John Mcgibbney wrote:

Hi Everyone,
Having produced a subset of a rather large ontology I'm now attempting
to write a SPARQL query to retrieve all attributes of any given
individuals if the name matches. An example of one of my
NamedIndividuals is below

!-- 
http://www.buildingsmart-tech.org/ifcXML/IFC2x3/FINAL/IFC2X3_subset.owl#IfcBeam
--

 NamedIndividual rdf:about=IFC2X3_subset;IfcBeam
 rdf:type
 Restriction
 onProperty 
rdf:resource=IFC2X3_subset;hasSubstitutionGroup/
 allValuesFrom
rdf:resource=IFC2X3_subset;IfcBuildingElement/
 /Restriction
 /rdf:type
 rdf:type
 Restriction
 onProperty rdf:resource=IFC2X3_subset;hasNillableValue/
 allValuesFrom rdf:resource=xsd;boolean/
 /Restriction
 /rdf:type
 IFC2X3_subset:hasComplexTypeName
rdf:datatype=xsd;NameIfcBeam/IFC2X3_subset:hasComplexTypeName
 IFC2X3_subset:hasExtensionBase
rdf:datatype=rdfs;Literalifc:IfcBuildingElement/IFC2X3_subset:hasExtensionBase
 IFC2X3_subset:hasNillableValue
rdf:datatype=xsd;booleantrue/IFC2X3_subset:hasNillableValue
 IFC2X3_subset:hasName rdf:resource=IFC2X3_subset;IfcBeam/
 IFC2X3_subset:isOfType rdf:resource=IFC2X3_subset;IfcBeam/
 /NamedIndividual

Now say my query matched the rdf:About= and the
IFC2X3_subset:hasName rdf:resource=IFC2X3_subset;IfcBeam  out of
all of the individuals I have persisted, I would like all the
information above to be returned within the response...

Any help would be greatly appreciated.

Thank you very much in advance
Lewis






Re: Want to run SPARQL Query with Hadoop Map Reduce Framework

2012-06-26 Thread Paolo Castagna
Md. Mizanur Rahoman wrote:
 Hi Paolo,
 
 Thanks for your reply.
 
 Right now I am only using DBPedia, Geoname and NYTimes for LOD cloud. And
 later on I want to extend my dataset.

Ok, so it's big, but not huge! ;-)
If you have enough RAM you can do everything on a single machine.

 By the way, yes, I can use sparql directly to collect my required
 statistics but my assumption is using Hadoop could give me some boosting in
 collecting those stat.

Well, it all depends if you already have an Hadoop cluster you can use.
If not, a single machine with a lot of RAM might be easier/faster/better.

 I will knock you after going through your links.

Sure, let me know how it goes.

Paolo

 
 -
 Sincerely
 Md Mizanur
 
 
 
 On Tue, Jun 26, 2012 at 12:50 AM, Paolo Castagna 
 castagna.li...@googlemail.com wrote:
 
 Hi Mizanur,
 when you have big RDF datasets, it might make sense to use MapReduce (but
 only if you already have an Hadoop cluster at hand. Is this your case?).
 You say that your data is 'huge', just for the sake of curiosity... how
 many triples/quads is 'huge'? ;-)
 Most of the use cases I've seen related to statistics on RDF datasets were
 trivial MapReduce jobs.

 For a couple of examples on using MapReduce with RDF datasets have a look
 here:
 https://github.com/castagna/jena-grande
 https://github.com/castagna/tdbloader4

 This, for example, is certainly not exactly what you need, but I am sure
 that with little changes you can get what you want:

 https://github.com/castagna/tdbloader4/blob/master/src/main/java/org/apache/jena/tdbloader4/StatsDriver.java

 Last but not least, you'll need to dump your RDF data out onto HDFS.
 I suggest you use N-Triples/N-Quads serialization formats.
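 As a rough sketch of such a "trivial" job (illustrative only, not Paolo's code; splitting on whitespace is a simplification that ignores literals containing spaces), here is a mapper/reducer pair counting how often each subject appears in N-Triples files on HDFS:

 import java.io.IOException;

 import org.apache.hadoop.io.IntWritable;
 import org.apache.hadoop.io.LongWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.mapreduce.Mapper;
 import org.apache.hadoop.mapreduce.Reducer;

 public class SubjectCount {

     // Map: one N-Triples line in, (subject, 1) out.
     public static class SubjectMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
         private static final IntWritable ONE = new IntWritable(1);

         @Override
         protected void map(LongWritable key, Text value, Context ctx)
                 throws IOException, InterruptedException {
             String line = value.toString().trim();
             if (line.isEmpty() || line.startsWith("#")) {
                 return;                                    // skip blank lines and comments
             }
             String subject = line.split("\\s+", 2)[0];     // first term of the triple
             ctx.write(new Text(subject), ONE);
         }
     }

     // Reduce: sum the counts per subject.
     public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
         @Override
         protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                 throws IOException, InterruptedException {
             int sum = 0;
             for (IntWritable v : values) {
                 sum += v.get();
             }
             ctx.write(key, new IntWritable(sum));
         }
     }
 }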

 Running SPARQL queries on top of an Hadoop cluster is another (long and
 not easy) story.
 But, it might be possible to translate part of the SPARQL algebra into Pig
 Latin scripts and use Pig.
 In my opinion however, it makes more sense to use MapReduce to
 filter/slice massive datasets, load the result into a triple store and
 refine your data analysis using SPARQL there.

 My 2 cents,
 Paolo

 Md. Mizanur Rahoman wrote:
 Dear All,

 I want to collect some statistics over RDF data. My triple store is
 Virtuoso and I am using Jena for executing my query.  I want to get some
 statistics like:
 i) how many resources are in my dataset, ii) in which positions of the
 dataset resources appear (i.e., subject/predicate/object), etc. As my data is huge, I want
 to
 use Hadoop MapReduce to calculate such statistics.

 Can you please suggest.