Re: create/removeStatement execution time
On 24/06/12 16:12, Lars Wißler wrote:
> Hi all,
>
> I need information on the efficiency of the createStatement and removeStatement operations on the OntModel, i.e. how many calls of these methods Jena can execute in, let's say, under 3-4 minutes.

That depends on your hardware, JDK, OS, memory settings, data, whether you have inference enabled or not, etc. The correct thing to do is simply to time it yourself in your environment.

> To make myself clearer: I load a model, transfer it to a GMFmodel, where changes can be made, and then need to transport it back to the Jena OntModel.

What's a GMFmodel?

Dave
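As a rough illustration of "time it yourself", here is a minimal sketch of a timing harness. It is not Jena code: the class name OpTimer, the method timeNanos and the HashSet workload are all stand-ins invented for this example. In a real measurement the lambda bodies would be replaced with your own add and remove calls on the OntModel, run against your own data and settings.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntConsumer;

class OpTimer {
    /** Runs op(i) for i = 0..n-1 and returns the elapsed time in nanoseconds. */
    static long timeNanos(int n, IntConsumer op) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            op.accept(i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption of this sketch): a HashSet, not a Jena model.
        Set<String> store = new HashSet<>();
        int n = 1_000_000;

        // Warm-up pass so the JIT has a chance to compile the hot path before measuring.
        timeNanos(n, i -> store.add("s" + i));
        store.clear();

        long addNs = timeNanos(n, i -> store.add("s" + i));
        long removeNs = timeNanos(n, i -> store.remove("s" + i));

        System.out.printf("add:    %.1f ns/op%n", (double) addNs / n);
        System.out.printf("remove: %.1f ns/op%n", (double) removeNs / n);
    }
}
```

Dividing the time budget (say 3 minutes) by the measured cost per operation gives a first estimate of how many calls fit. The figures only mean something when measured on the target machine with inference switched on or off as in production.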
Re: Strange behaviour of XMLLiterals in RDF/XML
On 25/06/12 12:57, Martynas Jusevičius wrote:
> Hey list,
>
> I'd like to know why the following triple
>
>   @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>   @prefix awol: <http://bblfish.net/work/atom-owl/2006-06-06/AtomOwl.html#> .
>
>   _:smth awol:xml '<div xmlns="http://www.w3.org/1999/xhtml"><p>stuff<br/>more stuff</p></div>'^^rdf:XMLLiteral .
>
> serialized into RDF/XML produces an escaped XMLLiteral:
>
>   <rdf:Description rdf:nodeID="A23">
>     <awol:xml rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;&lt;p&gt;stuff&lt;br/&gt;more stuff&lt;/p&gt;&lt;/div&gt;</awol:xml>
>   </rdf:Description>
>
> However, this one
>
>   _:smth awol:xml '<div xmlns="http://www.w3.org/1999/xhtml"><p>stuff<br></br>more stuff</p></div>'^^rdf:XMLLiteral .
>
> produces an unescaped XMLLiteral, as expected:
>
>   <rdf:Description rdf:nodeID="A23">
>     <awol:xml rdf:parseType="Literal"><div xmlns="http://www.w3.org/1999/xhtml"><p>stuff<br></br>more stuff</p></div></awol:xml>
>   </rdf:Description>
>
> Both <br/> and <br></br> are well-formed and equivalent in the XML context, so why the difference in serialization?
>
> I'm using Jena 2.6.4 and ARQ 2.8.7.

Try riot --validate. You may need a newer Jena; I can't remember how the Dec 2010 release behaves.

[[
WARN [line: 5, col: 19] Lexical form '<div xmlns="http://www.w3.org/1999/xhtml"><p>stuff<br/>more stuff</p></div>' not valid for datatype http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral
]]

<br/> is not canonical (c14n) XML.

The rules for a valid XMLLiteral are complicated. The best RDF-WG is going to do is make XMLLiteral less mandatory. At the moment, it is the only special datatype built into RDF, and it's built into the RDF/XML parser as well.

Andy

> Martynas
> graphity.org
Re: ARQ: Traversing and processing a SPARQL Algebra (Op) tree
On top of the visitor pattern, there is a subsystem for rewriting algebra expressions. Have a look at some of the ARQ transforms, e.g. TransformUnionQuery. The Transform instance can have state, and it is driven by the tree walker:

  com.hp.hpl.jena.sparql.algebra.Transformer
  Op Transformer.transform(Transform, op)

This is a bottom-up rewrite of the tree.

For the variables appearing in all of an Op's descendant Ops: OpVars.patternVars or OpVars.allVars.

Andy

On 23/06/12 07:14, Dimitris Spanos wrote:
> That's very interesting, Martynas, and I will keep this option in mind. I'm not sure it applies to what I want to do, as I do not want just a syntactic query translation, but also to tweak the query engine and get results by executing it.
>
> Dimitris
>
> On Fri, Jun 22, 2012 at 9:24 PM, Martynas Jusevičius marty...@graphity.org wrote:
> > It might not be the traditional solution, but I've done some SPARQL query transformations with RDF/XML and XSLT 2. It doesn't work on the algebra level, but rather on the SPIN serialization. It might or might not be easier than Java code. I don't know if that applies in your case.
> >
> > Martynas
> > graphity.org
> >
> > On Jun 22, 2012 6:10 PM, Paul Gearon gea...@ieee.org wrote:
> > > On Fri, Jun 22, 2012 at 9:30 AM, Dimitris Spanos dimi.s...@gmail.com wrote:
> > > > Hello all,
> > > >
> > > > I'm trying to traverse and process the algebra expression tree for a SPARQL query. For this purpose, I have created a class that implements OpVisitor and visits the entire tree in a top-down fashion. Ideally, I would like to be able to pass information from a visited operator to its children and vice versa, but I'm not sure how I can do this in an elegant way (using just some global class variables does not seem sufficient).
> > >
> > > From parents to children and back up again is reasonably straightforward. Just have the visitor keep a stack of where it's visited (so the visitor knows about the parent of the current Op), and accumulate results as it finishes processing an Op (so the visitor has information about what the processing of child Ops returned).
> > >
> > > > I guess a naive way to deal with it would be to start from the top of the tree, visit every Op, and transform it to a custom extension of the respective Op that will maybe hold a link to the parent (already transformed) Op and store the information that will get passed to its children. At the same time, every such transformed Op will be inserted into a Stack and, after the top-down traversal is done, popped out in reverse order in order to go back up the tree and pass information from children Ops to parent Ops.
> > >
> > > I think that what you're describing here is a transformation of the tree into another tree. That works, but it's more than what you said in the first paragraph (where you just need to have the results of processing children, plus knowledge of what your parents are).
> > >
> > > Personally, I prefer to update the nodes in my trees so that they can be transformed directly, rather than being mapped to an equivalent tree before transforming. However, in the case of Op I can appreciate why you may not want to touch the existing structure.
> > >
> > > If you just need the info of where you are in the tree as you process it, then I recommend just keeping a stack in the tree walker that you use. If you need to see across the tree as you're processing, then mapping the tree to an equivalent that is more amenable to processing (as you suggest) is preferable to modifying Op.
> > >
> > > > Is there another more straightforward procedure to accomplish this?
> > >
> > > For this kind of processing, I really, *really* prefer using Clojure (using protocols to extend Op would make this a breeze). However, I don't see that getting inserted into the Jena dependencies any time soon. ;-)
> > >
> > > > Sorry for sounding completely abstract, I hope I'm making (some) sense. Any ideas/hints/pointers to examples would be very much appreciated!
> > >
> > > That's OK. I sometimes find that the large amounts of boilerplate that Java requires for things like the visitor pattern can get in the way of seeing exactly how to accomplish something.
> > >
> > > Paul
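Paul's suggestion (keep a stack of ancestors in the walker and accumulate results as each node finishes) can be sketched as below. This is only an illustration under stated assumptions: Node is a hypothetical stand-in tree type, not Jena's Op, and StackWalkerSketch is not an OpVisitor implementation. The stack gives each node access to its parent on the way down, and the return value of walk carries a result from the children back up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class StackWalkerSketch {
    /** Hypothetical stand-in for an algebra operator; NOT Jena's Op. */
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Path from the root down to the node currently being visited. */
    private final Deque<Node> ancestors = new ArrayDeque<>();
    /** One entry per visited node, recorded in post-order. */
    final List<String> log = new ArrayList<>();

    /** Returns the number of nodes in the subtree rooted at node. */
    int walk(Node node) {
        // Top-down information: the parent is whatever is on top of the stack.
        Node parent = ancestors.peek();
        int depth = ancestors.size();

        ancestors.push(node);
        int size = 1;
        for (Node child : node.children) {
            // Bottom-up information: each child's result comes back as a return value.
            size += walk(child);
        }
        ancestors.pop();

        log.add(node.name + " parent=" + (parent == null ? "none" : parent.name)
                + " depth=" + depth + " subtreeSize=" + size);
        return size;
    }

    public static void main(String[] args) {
        Node tree = new Node("project",
                new Node("join", new Node("bgp1"), new Node("bgp2")));
        StackWalkerSketch walker = new StackWalkerSketch();
        walker.walk(tree);
        walker.log.forEach(System.out::println);
    }
}
```

With a real OpVisitor the visit methods return void, so the accumulated child results would have to live in a second stack inside the visitor rather than in a return value; the ancestor stack works the same way.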
Re: Strange behaviour of XMLLiterals in RDF/XML
On 25/06/12 13:34, Andy Seaborne wrote:
> The best RDF-WG is going to do is make XMLLiteral less mandatory.

'Less mandatory'? :-)

I was writing a similar reply as this came in. It's horrible trying to explain it, and it will be nice not to have to do that post-RDF 1.1.

Damian
Re: TDBLoader2 Performance on Empty vs Existing Store (WAS: Import Messures)
On 22/06/12 17:22, Rob Vesse wrote:
> Off the top of my head I believe loading into an empty database is always faster because of the way it generates the index files and node tables. When loading to an existing dataset it tends to be slower because it has to add to the existing files rather than generating them from scratch.

Yes. It performs the operations in an order that is index friendly: it avoids inserting in a random fashion and tries to make the inserts roughly sequential. That makes disk caches more efficient, and a real disk access is expensive, of the order of 1e6 instructions.

When empty:

tdbloader loads SPO and the nodes together, then creates the secondary indexes one at a time. In normal use, loading SPO is a sequential process because data arrives in blocks with the same subject. The code does not depend on this, but it is faster if that is the case. The SPO index is written in (at a macroscopic level) sequential order: each B+Tree block is completed and does not need to be revisited later. Making POS and OSP is done by a sequential walk through SPO.

tdbloader2 is more extreme. It loads the node data and outputs a stream of triples to a temporary file (as a text format!). It sorts the temporary file into the necessary order for an index, then loads it into the index, and repeats this for all the indexes. The sorting is done by unix sort(1). While it seems like more work, this is a very efficient program and, for large data, it is faster. Where the cut-over is depends on data shape and machine.

tdbloader3 is like tdbloader2 except that it is pure Java, binary rather than text, and does parallel block sorts.

> > Load time: 16 minutes
> > average loading: ca 81.000 triples / second
> > index time: 40 minutes
> > store size: 9,3GB
> >
> > The second test was to store the same data into an already filled store. When I started the import I had created a store with 348.398.593 triples from DNB and HBZ (which are German libraries, store size: 33 GB). Then I started to load the German DBpedia in.
> >
> > Load time: 3 hours and 4 minutes
> > average loading: ca 7200 / second

That looks like tdbloader: it needs to check for the existence of every triple when loading a pre-filled store. That is random access and slow.

> > index time: 38 minutes
> > store size: 19 GB!

I don't know why for sure, but jumps in size can happen because the indexes need to be slightly bigger while the unit of allocation is 8M (memory mapped files). That's why an empty database can look quite large: there are 8M files and, although they are sparse, some OSs (Mac) count the space.

If there are some bNodes then that can lead to some new nodes and triples, causing index sizes to jump.

What is ls -lh saying?

Andy
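The sort-then-load idea described for tdbloader2 can be illustrated with a toy sketch. This is not TDB code: in-memory lists stand in for the temporary file and the B+Tree index files, and the names SortedIndexSketch, buildIndex and compareKeys are invented for the example. Each index is built by permuting the triple columns into the index order (SPO, POS, OSP), sorting, and then reading the result front to back.

```java
import java.util.ArrayList;
import java.util.List;

class SortedIndexSketch {
    /** Lexicographic comparison of two three-column keys. */
    static int compareKeys(String[] x, String[] y) {
        for (int i = 0; i < 3; i++) {
            int c = x[i].compareTo(y[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    /**
     * Builds one index. Columns a, b, c select the key order, where
     * 0 = subject, 1 = predicate, 2 = object in the input triples.
     */
    static List<String[]> buildIndex(List<String[]> triples, int a, int b, int c) {
        List<String[]> index = new ArrayList<>();
        for (String[] t : triples) {
            index.add(new String[] { t[a], t[b], t[c] });
        }
        // Sort first, so that the "index" is then filled strictly in key order
        // instead of by random inserts.
        index.sort(SortedIndexSketch::compareKeys);
        return index;
    }

    public static void main(String[] args) {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] { "b", "p", "x" });
        triples.add(new String[] { "a", "q", "y" });
        triples.add(new String[] { "a", "p", "z" });

        System.out.println("SPO:");
        for (String[] k : buildIndex(triples, 0, 1, 2)) {
            System.out.println("  " + String.join(" ", k));
        }
        System.out.println("POS:");
        for (String[] k : buildIndex(triples, 1, 2, 0)) {
            System.out.println("  " + String.join(" ", k));
        }
        System.out.println("OSP:");
        for (String[] k : buildIndex(triples, 2, 0, 1)) {
            System.out.println("  " + String.join(" ", k));
        }
    }
}
```

Loading into a pre-filled store cannot use this shortcut in the same way, because every incoming triple first has to be looked up in the existing index, which is the random access the thread identifies as the slow part.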
Re: Strange behaviour of XMLLiterals in RDF/XML
Thanks, I didn't realize XMLLiterals have to be canonical.

You don't mean XMLLiterals are going away, do you? Escaped XML would cut off all XML processing tools (I heavily use XSLT on RDF/XML, for example).

Martynas

On Mon, Jun 25, 2012 at 2:43 PM, Damian Steer d.st...@bristol.ac.uk wrote:
> On 25/06/12 13:34, Andy Seaborne wrote:
> > The best RDF-WG is going to do is make XMLLiteral less mandatory.
>
> 'Less mandatory'? :-)
>
> I was writing a similar reply as this came in. It's horrible trying to explain it, and it will be nice not to have to do that post-RDF 1.1.
>
> Damian
Re: Strange behaviour of XMLLiterals in RDF/XML
On 25/06/12 13:43, Damian Steer wrote:
> On 25/06/12 13:34, Andy Seaborne wrote:
> > The best RDF-WG is going to do is make XMLLiteral less mandatory.
>
> 'Less mandatory'? :-)
>
> I was writing a similar reply as this came in. It's horrible trying to explain it, and it will be nice not to have to do that post-RDF 1.1.

I just rant about rdf:XMLLiterals a lot.

The definition isn't changing as far as I can remember. The lexical space is still c14n: exclusive canonicalization with comments, with empty inclusiveNamespaces. I only know where to look because of helping people with their data. I've never used the things myself.

There are so few real use cases. Real XML data can't be put straight into RDF because of the canonicalization rules, e.g. people having problems with GML and RDF. Canonicalization software often isn't available at the point of data creation.

I'll have another coffee now.

Andy

> Damian
Re: Want to run SPARQL Query with Hadoop Map Reduce Framework
Hi Mizanur,

when you have big RDF datasets, it might make sense to use MapReduce (but only if you already have a Hadoop cluster at hand. Is this your case?). You say that your data is 'huge'. Just for the sake of curiosity, how many triples/quads is 'huge'? ;-)

Most of the use cases I've seen related to statistics on RDF datasets were trivial MapReduce jobs. For a couple of examples of using MapReduce with RDF datasets have a look here:

  https://github.com/castagna/jena-grande
  https://github.com/castagna/tdbloader4

This, for example, is certainly not exactly what you need, but I am sure that with a few small changes you can get what you want:

  https://github.com/castagna/tdbloader4/blob/master/src/main/java/org/apache/jena/tdbloader4/StatsDriver.java

Last but not least, you'll need to dump your RDF data out onto HDFS. I suggest you use the N-Triples/N-Quads serialization formats.

Running SPARQL queries on top of a Hadoop cluster is another (long and not easy) story. But it might be possible to translate part of the SPARQL algebra into Pig Latin scripts and use Pig. In my opinion, however, it makes more sense to use MapReduce to filter/slice massive datasets, load the result into a triple store and refine your data analysis using SPARQL there.

My 2 cents,
Paolo

Md. Mizanur Rahoman wrote:
> Dear All,
>
> I want to collect some statistics over RDF data. My triple store is Virtuoso and I am using Jena for executing my query. I want to get some statistics like i) how many resources are in my dataset, ii) in which position of the dataset the resources occur (i.e., sub/prd/obj), etc. As my data is huge, I want to use Hadoop MapReduce to calculate such statistics.
>
> Can you please suggest.
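To make the "trivial MapReduce job" concrete, here is a minimal sketch of the per-position resource count Mizanur asks about, written in plain Java in a map/reduce shape. It does not use the Hadoop API and is not taken from the linked repositories; the class PositionStatsSketch and its map and reduce methods are invented for illustration. The N-Triples parsing is deliberately naive (it assumes subject and predicate contain no whitespace) and only IRIs are counted in the object position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class PositionStatsSketch {
    /** Map step: one N-Triples line becomes keys such as "S <iri>", "P <iri>", "O <iri>". */
    static List<String> map(String line) {
        List<String> keys = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return keys; // skip blank lines and comments
        }
        // Naive split: subject, predicate, then everything else up to the final '.'.
        String[] parts = trimmed.split("\\s+", 3);
        if (parts.length < 3) {
            return keys; // not a complete triple
        }
        String object = parts[2].replaceFirst("\\s*\\.\\s*$", "");
        keys.add("S " + parts[0]);
        keys.add("P " + parts[1]);
        if (object.startsWith("<")) {
            keys.add("O " + object); // literals and blank nodes are not counted as objects here
        }
        return keys;
    }

    /** Reduce step: sum the number of occurrences of each key. */
    static Map<String, Long> reduce(List<String> keys) {
        return keys.stream()
                .collect(Collectors.groupingBy(k -> k, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("<http://example.org/a> <http://example.org/p> <http://example.org/b> .");
        lines.add("<http://example.org/b> <http://example.org/p> \"a literal\" .");

        List<String> keys = new ArrayList<>();
        for (String line : lines) {
            keys.addAll(map(line));
        }
        reduce(keys).forEach((key, count) -> System.out.println(key + "\t" + count));
    }
}
```

In an actual Hadoop job the map method would become a Mapper emitting (key, 1) pairs and the reduce method a Reducer summing them; the number of distinct resources is then the number of distinct terms across the S, P and O keys.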
Planning for a new framework for Jena
Dear all,

I'm new to this list, but I would like to propose the building of a new framework for Jena. I'll be in charge of the design and programming of this new framework, but new ideas or collaborations with other developers are welcome.

The project is a Java web platform for configuring, managing and querying the Jena framework from any web browser. That framework will allow users to save time in the configuration, initial contact and database administration of the Jena framework. It could be a .war for any application server with the appropriate configuration files.

I consider that it has to be designed in a modular way, and the addition of new plug-ins has to be taken into account from the very beginning. I thought it could be a kind of Eclipse, that is, a platform for development with basic functionalities that allows the addition of a lot of plug-ins from the community or private companies.

Some of the current utilities for Jena could be migrated into a plug-in for this platform. REST and SOAP services could be a plug-in for this platform.

New ideas or suggestions are welcome. I think a framework like this will help Jena to be used more, because the intention is that the new framework has to be, in most cases, self-explanatory and intuitive, with a lot of helping tools.

Best regards,
Joan
Re: Planning for a new framework for Jena
Hi Joan

Comments inline:

On 6/25/12 9:30 AM, Joan Iglesias joan.igles...@live.com wrote:
> Dear all,
>
> I'm new to this list, but I would like to propose the building of a new framework for Jena. I'll be in charge of the design and programming of this new framework, but new ideas or collaborations with other developers are welcome.
>
> The project is a Java web platform for configuring, managing and querying the Jena framework from any web browser. That framework will allow users to save time in the configuration, initial contact and database administration of the Jena framework. It could be a .war for any application server with the appropriate configuration files.

Please take a look at JENA-201 (https://issues.apache.org/jira/browse/JENA-201), which contains a discussion on how to convert the existing Fuseki architecture into WAR form. If you are interested, maybe you would like to work on contributing towards that effort?

Also, Fuseki already includes much of the configuration, management and querying capabilities you are talking about. Granted, right now Fuseki can't easily be run in any Java application server because it runs off an embedded Jetty, but if that issue was addressed this would become possible. And equally, the built-in UI could be a little less basic, but none of us Jena developers claim to be graphic design or UX experts!

> I consider that it has to be designed in a modular way, and the addition of new plug-ins has to be taken into account from the very beginning. I thought it could be a kind of Eclipse, that is, a platform for development with basic functionalities that allows the addition of a lot of plug-ins from the community or private companies.

This is basically what the Jena platform is already, unless I am misunderstanding your point?

I don't know how familiar you are with Jena (I assume at least reasonably so given the scope of your proposal), but Jena already has many extension points that can be utilized, and many people using Jena commercially already use these widely.

Maybe you could elaborate on exactly what it is that you want to extend/do that you don't think Jena can do right now? You may find that the types of extensions you want are already possible in the existing framework and you are just not aware of it. For example, within Fuseki you can already leverage the Jena assembly mechanism for loading and executing arbitrary code, allowing you to add custom functionality to a standard Fuseki distro to some extent.

> Some of the current utilities for Jena could be migrated into a plug-in for this platform. REST and SOAP services could be a plug-in for this platform.

Utilities such as?

> New ideas or suggestions are welcome. I think a framework like this will help Jena to be used more, because the intention is that the new framework has to be, in most cases, self-explanatory and intuitive, with a lot of helping tools.

While I don't want to discourage you from contributing to the Jena ecosystem, it would be interesting to hear some more detail on what exactly you want to build. From reading your email I get the impression that maybe a lot of what you want may already be available and you're just looking to get it more solidly integrated into a user-friendly web-based UI?

Regards,

Rob Vesse

> Best regards,
> Joan
RE: Planning for a new framework for Jena
Hello Rob

Comments inline also.

> From: rve...@yarcdata.com
> To: users@jena.apache.org
> Subject: Re: Planning for a new framework for Jena
> Date: Mon, 25 Jun 2012 17:01:30 +
>
> Hi Joan
>
> Comments inline:
>
> On 6/25/12 9:30 AM, Joan Iglesias joan.igles...@live.com wrote:
> > Dear all,
> >
> > I'm new to this list, but I would like to propose the building of a new framework for Jena. I'll be in charge of the design and programming of this new framework, but new ideas or collaborations with other developers are welcome.
> >
> > The project is a Java web platform for configuring, managing and querying the Jena framework from any web browser. That framework will allow users to save time in the configuration, initial contact and database administration of the Jena framework. It could be a .war for any application server with the appropriate configuration files.
>
> Please take a look at JENA-201 (https://issues.apache.org/jira/browse/JENA-201), which contains a discussion on how to convert the existing Fuseki architecture into WAR form. If you are interested, maybe you would like to work on contributing towards that effort?
>
> Also, Fuseki already includes much of the configuration, management and querying capabilities you are talking about. Granted, right now Fuseki can't easily be run in any Java application server because it runs off an embedded Jetty, but if that issue was addressed this would become possible. And equally, the built-in UI could be a little less basic, but none of us Jena developers claim to be graphic design or UX experts!

I'm not an expert on Jena; I have only read some tutorials about Jena and its frameworks. Of course the development of such a framework needs much more knowledge than I have. Because of this I suggested the project to the Jena community: I don't want to start a long learning period if the framework is useless, not needed by the community, or if something similar already exists.

As with all technologies, the more user-friendly the user interface, the greater the success in the adoption of the technology. Some companies discard a framework if they foresee a long learning period for it.

I could deduce from the Fuseki tutorials and so on that most of the configuration is done through configuration files. The idea of my project is that you just download the war file, deploy it, and all the configuration and management is done using a very helpful and user-friendly interface: addition of modules, database configuration and administration, and so on. I think that any developer prefers a self-explanatory tool that takes little time to learn.

Maybe such a framework could be developed using Fuseki code or other Jena frameworks as a base or starting framework.

The platform I propose allows plug-ins to be managed graphically, like Eclipse, for example. A plug-in could also have its own web interface to configure or use it. I think that this functionality is not possible right now using Fuseki. For example, is there any framework that allows ontologies to be managed graphically and is integrated into Jena? Sometimes I think it's very useful for a plug-in to have a web interface, as in the case I mentioned before, integrated into a well-defined platform.

What do you think about all this?
Re: Planning for a new framework for Jena
Joan,

could the Graphity approach be similar to what you have in mind?

  http://www.w3.org/2011/09/LinkedData/ledp2011_submission_1.pdf

You can see what kind of UI it can render on http://linkeddata.dk.

Martynas
graphity.org

On Mon, Jun 25, 2012 at 7:57 PM, Joan Iglesias joan.igles...@live.com wrote:
> The idea of my project is that you just download the war file, deploy it, and all the configuration and management is done using a very helpful and user-friendly interface: addition of modules, database configuration and administration, and so on.
>
> The platform I propose allows plug-ins to be managed graphically, like Eclipse, for example. A plug-in could also have its own web interface to configure or use it.
RE: Planning for a new framework for Jena
Hello Martynas

I think it's not exactly what I had in mind. On your site I could see a group of Semantic Web sites accessible from your portal, and this portal generates a basic user interface, automatically I suppose. I suppose it's very easy to add new sites or repositories, and your platform generates a basic view-controller for the site, or allows the user to define their own view-controller. Correct me if I'm wrong, please.

My idea is that my framework allows the addition of other plug-ins to manage a single site and the complexities of a single site or repository, and helps the programmer or admin to manage the Jena capabilities and associated plug-ins graphically. Of course it could be linked with other sites or repositories.

I thought of having a very dynamic, Ajax-based VIEW, for example using the ZK framework. Because there are a lot of frameworks and utilities around Java [the most, I think], the core technology has to be Java-based.

Thank you for your answer.

Best regards
Joan

> Date: Mon, 25 Jun 2012 20:26:18 +0200
> Subject: Re: Planning for a new framework for Jena
> From: marty...@graphity.org
> To: users@jena.apache.org
>
> Joan,
>
> could the Graphity approach be similar to what you have in mind?
>
>   http://www.w3.org/2011/09/LinkedData/ledp2011_submission_1.pdf
>
> You can see what kind of UI it can render on http://linkeddata.dk.
>
> Martynas
> graphity.org
Re: Strange behaviour of XMLLiterals in RDF/XML
On 25/06/12 14:05, Martynas Jusevičius wrote:

Thanks, I didn't realize XMLLiterals have to be canonical. You don't mean XMLLiterals are going away, do you? Escaped XML would cut off all XML processing tools (I heavily use XSLT on RDF/XML, for example).

Not going away. They have a special status in that their lexical form is changed by the RDF/XML parser to be canonical; they don't behave like normal datatypes. The RDF/XML behaviour will remain but, for example, Turtle parsers will not be required to canonicalize.

[[ http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0198.html RESOLVED: in RDF 1.1: [a] XMLLiterals are optional; [b] lexical space consists of well-formed XML fragments; [c] the canonical lexical form is http://www.w3.org/TR/xml-exc-c14n/, as defined in RDF 2004; [d] the value space consists of (normalized) DOM trees. ]]

and http://www.w3.org/2011/rdf-wg/track/issues/13

Andy

Martynas

On Mon, Jun 25, 2012 at 2:43 PM, Damian Steer d.st...@bristol.ac.uk wrote:

On 25/06/12 13:34, Andy Seaborne wrote: The best RDF-WG is going to do is make XMLLiteral less mandatory.

'Less mandatory'? :-) I was writing a similar reply as this came in. It's horrible trying to explain it, and it will be nice not to have to do that post-RDF 1.1.

Damian
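The point in this thread that `<br/>` is not canonical while `<br></br>` is can be illustrated with a small sketch. Canonical XML writes every empty element as a start/end tag pair, so an empty-element tag in a lexical form is one quick sign that it is not canonical. The class and method names below are hypothetical, the check is a regex heuristic rather than a real canonicalizer (it ignores attribute ordering, namespace handling and the other c14n rules), and it uses no Jena API.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jena: a rough heuristic for one c14n rule only.
final class XmlLiteralHeuristic {

    // Matches an empty-element tag such as <br/> or <img src="x" />.
    // Group 1 is the element name, group 2 the (optional) attribute text.
    private static final Pattern EMPTY_ELEMENT_TAG =
            Pattern.compile("<([^\\s<>/!?]+)([^<>]*?)\\s*/>");

    /** True if the text contains at least one empty-element tag like <br/>. */
    static boolean hasEmptyElementTag(String lexicalForm) {
        return EMPTY_ELEMENT_TAG.matcher(lexicalForm).find();
    }

    /** Rewrites <name attrs/> as <name attrs></name>, the form canonical XML uses. */
    static String expandEmptyElementTags(String lexicalForm) {
        return EMPTY_ELEMENT_TAG.matcher(lexicalForm).replaceAll("<$1$2></$1>");
    }

    public static void main(String[] args) {
        String literal =
                "<div xmlns=\"http://www.w3.org/1999/xhtml\"><p>stuff<br/>more stuff</p></div>";
        System.out.println(hasEmptyElementTag(literal));
        System.out.println(expandEmptyElementTags(literal));
    }
}
```

A literal that passes this check can still be non-canonical for other reasons, so `riot --validate`, as suggested earlier in the thread, remains the authoritative test.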
Re: Planning for a new framework for Jena
Joan,

Graphity also allows defining XHTML templates, so the layout and functionality is fully customizable. You can include all the libraries you want, but the platform doesn't deal with client-side much -- linkeddata.dk is just one of the possible layouts. Do you mean http://www.zkoss.org? If I get the concept right, you will end up doing the same thing -- writing templates, only in ZK custom template language instead of standard XSLT, and probably some provider Java code?

Martynas

On Mon, Jun 25, 2012 at 9:16 PM, Joan Iglesias joan.igles...@live.com wrote:

Hello Martynas

I think it's not exactly what I had in mind. On your site I could see a group of Semantic Web sites accessible from your portal, and this portal generates a basic user interface, automatically I suppose. I suppose it's very easy to add new sites or repositories, and your platform generates a basic view-controller for the site, or allows the user to define their own view-controller. Correct me if I'm wrong, please. My idea is that my framework allows the addition of other plug-ins to manage a single site and the complexities of a single site or repository, and helps the programmer or admin to manage the Jena capabilities and associated plug-ins graphically. Of course it could be linked with other sites or repositories. I thought of having a very dynamic, Ajax-based view, for example using the ZK framework. Because there are a lot of frameworks and utilities around Java [the most, I think], the core technology has to be Java based. Thank you for your answer.

Best regards
Joan

Date: Mon, 25 Jun 2012 20:26:18 +0200 Subject: Re: Planning for a new framework for Jena From: marty...@graphity.org To: users@jena.apache.org

Joan, could the Graphity approach be similar to what you have in mind? http://www.w3.org/2011/09/LinkedData/ledp2011_submission_1.pdf You can see what kind of UI it can render on http://linkeddata.dk.
Martynas
graphity.org

On Mon, Jun 25, 2012 at 7:57 PM, Joan Iglesias joan.igles...@live.com wrote:

Hello Rob

Comments inline also.

From: rve...@yarcdata.com To: users@jena.apache.org Subject: Re: Planning for a new framework for Jena Date: Mon, 25 Jun 2012 17:01:30 +

Hi Joan

Comments inline:

On 6/25/12 9:30 AM, Joan Iglesias joan.igles...@live.com wrote:

Dear all

I'm new to this list, but I would like to propose building a new framework for Jena. I'll be in charge of the design and programming of this new framework, but new ideas or collaborations with other developers are welcome. The project is a Java web platform for configuring, managing and querying the Jena framework from any web browser. That framework will allow users to save time on configuration, first contact and database administration of the Jena framework. It could be a .war for any application server with the appropriate configuration files.

Please take a look at JENA-201 (https://issues.apache.org/jira/browse/JENA-201) which contains a discussion on how to convert the existing Fuseki architecture into WAR form. If you are interested, maybe you would like to work on contributing towards that effort? Also, Fuseki already includes much of the configuration, management and querying capabilities you are talking about. Granted, right now Fuseki can't easily be run in any Java application server because it runs off an embedded Jetty, but if that issue was addressed this would become possible. And equally the built-in UI could be a little less basic, but none of us Jena developers claim to be graphic design or UX experts!

I'm not an expert on Jena; I have only read some tutorials about Jena and its frameworks. Of course the development of such a framework needs much more knowledge than I have. Because of this I suggested the project to the Jena community: I don't want to start a long learning period if the framework is useless, not needed by the community, or something similar already exists. As with all technologies, the more user-friendly the interface, the more successful the adoption of the technology. Some companies discard a framework if they see that it needs a long learning period. I could deduce from the Fuseki tutorials and so on that most of the configuration is done through configuration files. The idea of my project is that you just download the WAR file, deploy it, and all the configuration and management is done through a very helpful and user-friendly interface: addition of modules, database configuration and administration, and so on. I think that any developer prefers a self-explanatory tool that takes little time to learn. Maybe such a framework could be developed using Fuseki code or other Jena frameworks as a base or starting point. The platform I propose allows managing plug-ins
RE: Planning for a new framework for Jena
Martynas,

OK, I see. But does your platform allow adding plug-ins the Eclipse way, just by specifying the URL of the provider? If I understood correctly, with your platform I could define something similar to what I wanted to do. I suppose that for each plug-in I have to make a kind of package, with my PHP code for controllers, XSLT, and so on. Am I right? How easy is it to define a plug-in for your platform?

Yes, you are right, ZKOSS is not a standard technology, but its presentation layer is very dynamic and powerful. Maybe it could be a good starting point for defining Jena plug-ins for your platform. But your platform should allow one-click installation of a plug-in and its customization through a web interface. Would that be possible?

Nowadays, I think XHTML is not enough. Some functionality needs more dynamic content generation. Maybe your platform could allow the view to be defined with the Dojo Toolkit? ZK and the other frameworks I have in mind allow very dynamic view generation, which also means that more types of plug-ins and utilities can be developed. XHTML can be limiting in some cases.

Best regards
Joan

Date: Mon, 25 Jun 2012 21:41:19 +0200 Subject: Re: Planning for a new framework for Jena From: marty...@graphity.org To: users@jena.apache.org

Joan, Graphity also allows defining XHTML templates, so the layout and functionality is fully customizable. You can include all the libraries you want, but the platform doesn't deal with client-side much -- linkeddata.dk is just one of the possible layouts. Do you mean http://www.zkoss.org? If I get the concept right, you will end up doing the same thing -- writing templates, only in ZK custom template language instead of standard XSLT, and probably some provider Java code?

Martynas

On Mon, Jun 25, 2012 at 9:16 PM, Joan Iglesias joan.igles...@live.com wrote: [snip]
RE: Planning for a new framework for Jena
I've been reading your article more carefully. Your platform seems very interesting, and I could see there will be a Java port. What about the other questions? The one-click installation of plug-ins the Eclipse way, for example.

Thank you
Joan

Date: Tue, 26 Jun 2012 00:09:44 +0200 Subject: Re: Planning for a new framework for Jena From: marty...@graphity.org To: users@jena.apache.org

Joan, you could try ZKOSS also, I wouldn't mind using widgets to increase productivity, but I think you'll have difficulties connecting the RDF code to them in a generic way. And if not, you will be building a tool, not a platform. Graphity does not include non-generic code. linkeddata.dk is just one instance running/extending the platform, the codebases are separate. There I use Google Chart Tools (https://developers.google.com/chart/), for example: http://linkeddata.dk/queries/world-bank/denmark/gdp-vs-household

Martynas

On Mon, Jun 25, 2012 at 10:22 PM, Joan Iglesias joan.igles...@live.com wrote: [snip]
Can SDB 1.3.4 be used with Jena 2.7.1?
We are now starting the process of upgrading our platform to the latest Jena version(s). I noticed that SDB has not been released yet as an Apache module. Question: is it safe to use SDB 1.3.4 in conjunction with Jena 2.7.1? Apologies if this has been asked before. Thanks Holger
Re: Reading JSON from Virtuoso OpenSource output
Hi Lorena

JenaReaderRdfJson is for reading a JSON serialization of RDF. The serialization you are trying to read is the JSON serialization of SPARQL Results, which is completely different. I notice you say that you use a CONSTRUCT query, but the results you show are in the SPARQL Results JSON format, which should only be used for ASK/SELECT queries. If Virtuoso is replying with that to your CONSTRUCT query then it is behaving incorrectly and you should report a bug to them. If you genuinely expect SPARQL results instead, then use ResultSetFactory.fromJSON(), which will give you a ResultSet object.

Rob

On 6/25/12 3:14 PM, lorena lore...@fing.edu.uy wrote:

Hi: I'm trying to process the results of performing a CONSTRUCT query on Virtuoso using apache-jena-2.7.0-incubating. [1] shows the JSON String I would like to read (schemaStr). Here is an extract of my code:

    SysRIOT.wireIntoJena();
    Model modelSchema = ModelFactory.createDefaultModel();
    RDFReader schemaReader = new JenaReaderRdfJson();
    StringReader s = new StringReader(schemaStr);
    schemaReader.read(modelSchema, s, "");

And I receive the following exception, caused by the line that executes the read:

    com.hp.hpl.jena.shared.JenaException: org.openjena.riot.RiotException: [line: 2, col: 3 ] Relative IRI: head
        at org.openjena.riot.system.JenaReaderRIOT.readImpl(JenaReaderRIOT.java:150)
        at org.openjena.riot.system.JenaReaderRIOT.read(JenaReaderRIOT.java:54)

It seems to have trouble reading the head section. My questions: Is Virtuoso JSON output compatible with what JenaReaderRdfJson expects to read? Am I missing something else? I'm using the empty string ("") as base URI in the read method, but I don't understand what the read method expects in this field.
Thanks in advance
Lorena

[1]
{ "head": { "link": [], "vars": [ "s", "p", "o" ] },
  "results": { "distinct": false, "ordered": true, "bindings": [
    { "s": { "type": "uri", "value": "_:vb43419" },
      "p": { "type": "uri", "value": "http://purl.org/olap#hasAggregateFunction" },
      "o": { "type": "uri", "value": "http://purl.org/olap#sum" } },
    { "s": { "type": "uri", "value": "_:vb43418" },
      "p": { "type": "uri", "value": "http://purl.org/olap#level" },
      "o": { "type": "uri", "value": "http://example.org/householdCS#year" } },
    { "s": { "type": "uri", "value": "http://example.org/householdCS#household_withoutGeo" },
      "p": { "type": "uri", "value": "http://purl.org/linked-data/cube#component" },
      "o": { "type": "uri", "value": "_:vb43418" } },
    { "s": { "type": "uri", "value": "http://example.org/householdCS#household_withoutGeo" },
      "p": { "type": "uri", "value": "http://purl.org/linked-data/cube#component" },
      "o": { "type": "uri", "value": "_:vb43419" } },
    { "s": { "type": "uri", "value": "_:vb43419" },
      "p": { "type": "uri", "value": "http://purl.org/linked-data/cube#measure" },
      "o": { "type": "uri", "value": "http://example.org/householdCS#household" } },
    { "s": { "type": "uri", "value": "http://example.org/householdCS#householdCS" },
      "p": { "type": "uri", "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" },
      "o": { "type": "uri", "value": "http://purl.org/linked-data/cube#DataStructureDefinition" } }
  ] } }
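The difference between the two JSON formats discussed in this message can be sketched as a quick check before choosing a reader. SPARQL Results JSON has a top-level "head" member plus either "results" (SELECT) or "boolean" (ASK), whereas in RDF/JSON the top-level keys are the subjects themselves. The class and method names below are hypothetical, the scanner assumes syntactically valid JSON, and nothing here calls Jena; it only decides which kind of document a string looks like.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Jena: sniffs the shape of a JSON document.
final class JsonFormatSniffer {

    /** Collects the member names of the outermost JSON object (escapes are left as written). */
    static Set<String> topLevelKeys(String json) {
        Set<String> keys = new LinkedHashSet<>();
        int depth = 0;
        int i = 0;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '"') {
                // Skip over the whole string literal, honouring backslash escapes.
                int start = i + 1;
                i = start;
                while (i < json.length() && json.charAt(i) != '"') {
                    if (json.charAt(i) == '\\') {
                        i++; // skip the escaped character
                    }
                    i++;
                }
                int end = Math.min(i, json.length());
                String literal = json.substring(start, end);
                i = end + 1; // step past the closing quote
                // A string at depth 1 that is followed by ':' is a top-level member name.
                int j = i;
                while (j < json.length() && Character.isWhitespace(json.charAt(j))) {
                    j++;
                }
                if (depth == 1 && j < json.length() && json.charAt(j) == ':') {
                    keys.add(literal);
                }
            } else {
                if (c == '{' || c == '[') {
                    depth++;
                } else if (c == '}' || c == ']') {
                    depth--;
                }
                i++;
            }
        }
        return keys;
    }

    /** Returns "sparql-results" for SELECT/ASK result documents, otherwise "maybe-rdf-json". */
    static String classify(String json) {
        Set<String> keys = topLevelKeys(json);
        if (keys.contains("head") && (keys.contains("results") || keys.contains("boolean"))) {
            return "sparql-results";
        }
        return "maybe-rdf-json";
    }

    public static void main(String[] args) {
        String select = "{ \"head\": { \"vars\": [ \"s\" ] }, \"results\": { \"bindings\": [] } }";
        String rdfJson = "{ \"http://example.org/s\": { \"http://example.org/p\": "
                + "[ { \"type\": \"uri\", \"value\": \"http://example.org/o\" } ] } }";
        System.out.println(classify(select));
        System.out.println(classify(rdfJson));
    }
}
```

A document classified as "sparql-results" is the kind Rob suggests passing to ResultSetFactory.fromJSON(); only the other kind is a candidate for an RDF/JSON reader.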
Re: Can SDB 1.3.4 be used with Jena 2.7.1?
Holger, does that also mean a new release of SPIN API which will be packaged with the latest Jena? Martynas On Tue, Jun 26, 2012 at 1:33 AM, Holger Knublauch hol...@knublauch.com wrote: We are now starting the process of upgrading our platform to the latest Jena version(s). I noticed that SDB has not been released yet as an Apache module. Question: is it safe to use SDB 1.3.4 in conjunction with Jena 2.7.1? Apologies if this has been asked before. Thanks Holger
Re: Can SDB 1.3.4 be used with Jena 2.7.1?
Yes this is my hope, assuming SDB still works for us. Holger On 6/26/2012 9:46, Martynas Jusevičius wrote: Holger, does that also mean a new release of SPIN API which will be packaged with the latest Jena? Martynas On Tue, Jun 26, 2012 at 1:33 AM, Holger Knublauch hol...@knublauch.com wrote: We are now starting the process of upgrading our platform to the latest Jena version(s). I noticed that SDB has not been released yet as an Apache module. Question: is it safe to use SDB 1.3.4 in conjunction with Jena 2.7.1? Apologies if this has been asked before. Thanks Holger
Re: Want to run SPARQL Query with Hadoop Map Reduce Framework
Hi Paolo,

Thanks for your reply. Right now I am only using DBPedia, Geoname and NYTimes from the LOD cloud, and later on I want to extend my dataset. And yes, I can use SPARQL directly to collect the statistics I need, but my assumption is that using Hadoop could give me a boost in collecting those stats. I will get back to you after going through your links.

Sincerely
Md Mizanur

On Tue, Jun 26, 2012 at 12:50 AM, Paolo Castagna castagna.li...@googlemail.com wrote:

Hi Mizanur, when you have big RDF datasets, it might make sense to use MapReduce (but only if you already have a Hadoop cluster at hand. Is this your case?). You say that your data is 'huge', just for the sake of curiosity... how many triples/quads is 'huge'? ;-)

Most of the use cases I've seen related to statistics on RDF datasets were trivial MapReduce jobs. For a couple of examples on using MapReduce with RDF datasets have a look here:
https://github.com/castagna/jena-grande
https://github.com/castagna/tdbloader4

This, for example, is certainly not exactly what you need, but I am sure that with little changes you can get what you want:
https://github.com/castagna/tdbloader4/blob/master/src/main/java/org/apache/jena/tdbloader4/StatsDriver.java

Last but not least, you'll need to dump your RDF data out onto HDFS. I suggest you use N-Triples/N-Quads serialization formats.

Running SPARQL queries on top of a Hadoop cluster is another (long and not easy) story. But it might be possible to translate part of the SPARQL algebra into Pig Latin scripts and use Pig. In my opinion, however, it makes more sense to use MapReduce to filter/slice massive datasets, load the result into a triple store and refine your data analysis using SPARQL there.

My 2 cents,
Paolo

Md. Mizanur Rahoman wrote:

Dear All, I want to collect some statistics over RDF data. My triple store is Virtuoso and I am using Jena for executing my query. I want to get some statistics like i) how many resources are in my dataset, ii) in which position of the dataset each resource occurs (i.e., sub/prd/obj), etc. As my data is huge, I want to use Hadoop MapReduce to calculate such statistics. Can you please suggest an approach?

--
Md Mizanur Rahoman
PhD Student, The Graduate University for Advanced Studies
National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan.
Cell # +81-80-4076-9044 email: mi...@nii.ac.jp Web: http://www.nii.ac.jp/en/
Lecturer, Department of Computer Science Engineering, Begum Rokeya University, Rangpur, Bangladesh.
email: mdmizanur.raho...@gmail.com, mi...@brur.ac.bd Cell # +88 01823 806618 Web: http://www.brur.ac.bd
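The statistics asked for here (how many distinct resources there are, and in which position each one occurs) have the shape of a trivial MapReduce job over N-Triples lines, as Paolo suggests. The sketch below shows that shape in plain Java with no Hadoop dependency: a "map" step that emits (resource, position) pairs per line and a "reduce" step that counts them. The class and method names are hypothetical, the position labels sub/prd/obj are taken from the question, and the line splitting is a simplification that assumes one well-formed triple per line and ignores literal objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the map and reduce steps; not Hadoop or Jena code.
final class TripleStats {

    /** "Map" step: turns one N-Triples line into (resource, position) pairs. */
    static List<String[]> map(String line) {
        List<String[]> out = new ArrayList<>();
        String t = line.trim();
        if (t.isEmpty() || t.startsWith("#")) {
            return out; // blank line or comment
        }
        String[] parts = t.split("\\s+", 3);
        if (parts.length < 3) {
            return out; // not a complete triple
        }
        // Drop the terminating " ." from the object term.
        String object = parts[2].replaceFirst("\\s*\\.\\s*$", "");
        out.add(new String[] {parts[0], "sub"});
        out.add(new String[] {parts[1], "prd"});
        // Only IRIs and blank nodes count as resources; literals are skipped.
        if (object.startsWith("<") || object.startsWith("_:")) {
            out.add(new String[] {object, "obj"});
        }
        return out;
    }

    /** "Reduce" step: counts occurrences per resource and per position. */
    static Map<String, Map<String, Integer>> reduce(List<String[]> pairs) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (String[] pair : pairs) {
            counts.computeIfAbsent(pair[0], k -> new TreeMap<>()).merge(pair[1], 1, Integer::sum);
        }
        return counts;
    }

    /** Runs both steps over a list of lines. */
    static Map<String, Map<String, Integer>> stats(List<String> lines) {
        List<String[]> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "<http://example.org/a> <http://example.org/knows> <http://example.org/b> .",
                "<http://example.org/b> <http://example.org/name> \"Bob\" .");
        Map<String, Map<String, Integer>> result = stats(lines);
        System.out.println("distinct resources: " + result.size());
        result.forEach((resource, positions) -> System.out.println(resource + " " + positions));
    }
}
```

In a real Hadoop job the two methods would correspond to the mapper and reducer, with the resource as the key; the StatsDriver class linked in Paolo's reply is the place to look for an actual implementation.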