Re: [Wikidata-l] Broken JSON in XML dumps

2015-02-26 Thread Martynas Jusevičius
Looks like someone hasn't learned the lesson:
https://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg02588.html

On Thu, Feb 26, 2015 at 9:27 PM, Lukas Benedix
lukas.bene...@fu-berlin.de wrote:
 I second this!


 btw:  what is the status of the problem with the missing dumps with
 history? (latest available from November 2014)

 Lukas

 On Thu, 26.02.2015 at 14:52, Markus Kroetzsch wrote:
 Hi,

 It's that time of the year again when I am sending a reminder that we
 still have broken JSON in the dump files ;-). As usual, the problem is
 that empty maps {} are serialized wrongly as empty lists []. I am not
 sure if there is any open bug that tracks this, so I am sending an
 email. There was one, but it was closed [1].

 As you know (I had sent an email a while ago), there are some remaining
 problems of this kind in the JSON dump, and also in the live exported
 JSON, e.g.,

 https://www.wikidata.org/wiki/Special:EntityData/Q4383128.json
 (uses [] as a value for snaks: this item has a reference with an empty
 list of snaks, which is an error by itself)

 However, the situation is considerably worse in the XML dumps, which
 have seen less usage since we have JSON, but as it turns out are still
 preferred by some users. Surprisingly (to me), the JSON content in the
 XML dumps is still not the same as in the JSON dumps. A large part of
 the records in the XML dump is broken because of the map-vs-list issue.

 For example, the latest dump of current revisions [2] has countless
 instances of the problem. The first is in the item Q3261 (empty list for
 claims), but you can easily find more by grepping for things like

 &quot;claims&quot;:[]

 It seems that all empty maps are serialized wrongly in this dump
 (aliases, descriptions, claims, ...). In contrast, the site's export
 simply omits the key of empty maps entirely, see

 https://www.wikidata.org/wiki/Special:EntityData/Q3261.json

 The JSON in the JSON dumps is the same.
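
(An illustrative aside: one way to count such occurrences with Python. It assumes, as the grep pattern above suggests, that quotes inside the dump's <text> elements are XML-escaped as &quot;; the key names and the dump filename are taken from this thread.)

    import bz2
    import re
    from collections import Counter

    # Keys whose values are maps in the Wikibase JSON model; per the report
    # above, empty maps {} show up for these keys as empty lists [] instead.
    MAP_KEYS = ["labels", "descriptions", "aliases", "claims", "sitelinks"]

    # Dump file from footnote [2]; adjust to whichever dump you are checking.
    DUMP = "wikidatawiki-20150207-pages-meta-current.xml.bz2"

    # Inside the XML dump the JSON quotes are escaped, so "claims":[] appears
    # as &quot;claims&quot;:[] in the raw text.
    pattern = re.compile(r"&quot;(%s)&quot;:\[\]" % "|".join(MAP_KEYS))

    counts = Counter()
    with bz2.open(DUMP, mode="rt", encoding="utf-8") as dump:
        for line in dump:
            counts.update(pattern.findall(line))

    for key, n in counts.most_common():
        print(key, n)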

 Cheers,

 Markus


 [1] https://github.com/wmde/WikibaseDataModelSerialization/issues/77
 [2]
 http://dumps.wikimedia.org/wikidatawiki/20150207/wikidatawiki-20150207-pages-meta-current.xml.bz2





 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF

2014-10-28 Thread Martynas Jusevičius
Gerard,

what about query functionality for example? This has been long
promised but shows no real progress.

And why do you think practical cases cannot be implemented using RDF?
What is the justification for ignoring the whole standard and
implementation stack? What makes you think Wikidata can do better than
RDF?


Martynas

On Tue, Oct 28, 2014 at 6:48 AM, Gerard Meijssen
gerard.meijs...@gmail.com wrote:
 Hoi,
 Hell no. Wikidata is first and foremost a product that is actually used. It
 has been that way from the start. Prioritising RDF over actual practical use
 cases is imho wrong. If anything, the continuous tinkering on the format of
 dumps has mostly brought us grief. Dumps that can no longer be read, like
 currently for the Wikidata statistics, really hurt.

 So let's not spend time on RDF at this time. Let's ensure that what we have
 works, works well, and plan carefully for a better RDF, but let's only have it
 go into production AFTER we know that it works well.
 Thanks,
   GerardM

 On 28 October 2014 02:46, Martynas Jusevičius marty...@graphity.org wrote:

 Hey all,

 so I see there is some work being done on mapping Wikidata data model
 to RDF [1].

 Just a thought: what if you actually used RDF and Wikidata's concepts
 modeled in it right from the start? And used standard RDF tools, APIs,
 query language (SPARQL) instead of building the whole thing from
 scratch?

 Is it just me or was this decision really a colossal waste of resources?


 [1] http://korrekt.org/papers/Wikidata-RDF-export-2014.pdf

 Martynas
 http://graphityhq.com

 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l



 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF

2014-10-28 Thread Martynas Jusevičius
John, please see inline:

On Tue, Oct 28, 2014 at 8:39 AM, John Erling Blad jeb...@gmail.com wrote:
 The data model is close to RDF, but not quite. Statements in items are
 reified statements, etc. Technically it is semantic data, where RDF is
 one possible representation.


Well it has been shown (in the paper I referenced) that Wikidata can
be modeled as RDF. And there is no reason why it couldn't be, because
in RDF anyone can say anything about anything.

 There was a deliberate choice to keep MediaWiki, to ease reuse within the
 Wikimedia sites: mostly so users could reuse their knowledge, but also
 so devs could reuse existing infrastructure.

This is exactly the decision that I question. I think it was
completely misguided. If the goal was to reuse knowledge and
infrastructure, then Wikidata has failed completely, as there is more
infrastructure and knowledge of RDF than there ever will be for
Mediawiki, or any structured/semantic data model for that matter.


 Some of the problems with Wikidata come from the fact that the similarities
 aren't clear enough for the users, and possibly the devs, which has
 resulted in a slightly introverted community and a technical structure
 that is slightly more Wikipedia-centric than necessary.

Here I can only agree with you. That is not an RDF problem though.


 On Tue, Oct 28, 2014 at 6:48 AM, Gerard Meijssen
 gerard.meijs...@gmail.com wrote:
 Hoi,
 Hell no. Wikidata is first and foremost a product that is actually used. It
 has been that way from the start. Prioritising RDF over actual practical use
 cases is imho wrong. If anything, the continuous tinkering on the format of
 dumps has mostly brought us grief. Dumps that can no longer be read, like
 currently for the Wikidata statistics, really hurt.

 So let's not spend time on RDF at this time. Let's ensure that what we have
 works, works well, and plan carefully for a better RDF, but let's only have it
 go into production AFTER we know that it works well.
 Thanks,
   GerardM

 On 28 October 2014 02:46, Martynas Jusevičius marty...@graphity.org wrote:

 Hey all,

 so I see there is some work being done on mapping Wikidata data model
 to RDF [1].

 Just a thought: what if you actually used RDF and Wikidata's concepts
 modeled in it right from the start? And used standard RDF tools, APIs,
 query language (SPARQL) instead of building the whole thing from
 scratch?

 Is it just me or was this decision really a colossal waste of resources?


 [1] http://korrekt.org/papers/Wikidata-RDF-export-2014.pdf

 Martynas
 http://graphityhq.com

 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l



 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF

2014-10-28 Thread Martynas Jusevičius
Gerard,

what is there practical about having a query language that 1) is not a
standard and never will be, and 2) is not supported by any other tool or
project and never will be?

I would understand this kind of reasoning coming from a hobbyist
project, but not from one claiming to be a global free linked
database.


Martynas

On Tue, Oct 28, 2014 at 11:37 AM, Gerard Meijssen
gerard.meijs...@gmail.com wrote:
 Hoi,
 Query has been promised and unofficially we have had it for a VERY long time.
 It is called WDQ, and it is used in many tools. The official query will only
 provide a subset of functionality for quite some time, as I understand it.

 Practical cases in RDF for what, by whom? Wikidata is first and foremost a
 vehicle to bring interwiki links to our projects. Then and only then does it
 become relevant to store data about the items involved. This data may be
 used in info boxes and whatnot in our projects. THAT is practical use to
 our community.

 RDF may be of interest to others, and it may be possible to do practical things
 with it, but that does not prioritise it. I do not think Wikidata can do
 better. As far as I am concerned it is the least of our problems. The reuse
 of data is first to happen within our projects, and THAT is not so much of a
 technical problem at all.
 Thanks,
GerardM

 On 28 October 2014 11:26, Martynas Jusevičius marty...@graphity.org wrote:

 Gerard,

 what about query functionality for example? This has been long
 promised but shows no real progress.

 And why do you think practical cases cannot be implemented using RDF?
 What is the justification for ignoring the whole standard and
 implementation stack? What makes you think Wikidata can do better than
 RDF?


 Martynas

 On Tue, Oct 28, 2014 at 6:48 AM, Gerard Meijssen
 gerard.meijs...@gmail.com wrote:
  Hoi,
  Hell no. Wikidata is first and foremost a product that is actually used.
  It has been that way from the start. Prioritising RDF over actual practical use
  cases is imho wrong. If anything, the continuous tinkering on the format
  of dumps has mostly brought us grief. Dumps that can no longer be read,
  like currently for the Wikidata statistics, really hurt.
 
  So let's not spend time on RDF at this time. Let's ensure that what we
  have works, works well, and plan carefully for a better RDF, but let's only
  have it go into production AFTER we know that it works well.
  Thanks,
GerardM
 
  On 28 October 2014 02:46, Martynas Jusevičius marty...@graphity.org
  wrote:
 
  Hey all,
 
  so I see there is some work being done on mapping Wikidata data model
  to RDF [1].
 
  Just a thought: what if you actually used RDF and Wikidata's concepts
  modeled in it right from the start? And used standard RDF tools, APIs,
  query language (SPARQL) instead of building the whole thing from
  scratch?
 
  Is it just me or was this decision really a colossal waste of
  resources?
 
 
  [1] http://korrekt.org/papers/Wikidata-RDF-export-2014.pdf
 
  Martynas
  http://graphityhq.com
 
  ___
  Wikidata-l mailing list
  Wikidata-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 
 
 
  ___
  Wikidata-l mailing list
  Wikidata-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikidata-l
 

 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l



 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


[Wikidata-l] Wikidata RDF

2014-10-27 Thread Martynas Jusevičius
Hey all,

so I see there is some work being done on mapping Wikidata data model
to RDF [1].

Just a thought: what if you actually used RDF and Wikidata's concepts
modeled in it right from the start? And used standard RDF tools, APIs,
query language (SPARQL) instead of building the whole thing from
scratch?

Is it just me or was this decision really a colossal waste of resources?


[1] http://korrekt.org/papers/Wikidata-RDF-export-2014.pdf

Martynas
http://graphityhq.com

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] How are queries doing?

2013-11-29 Thread Martynas Jusevičius
Jan,

my suspicion is that my predictions from last year hold true: it is a
far more complex task to design a scalable and performant data model,
query language and/or query engine solely for Wikidata than the
designers of this project anticipated - unless they did anticipate and
now knowingly fail to deliver.

You can check some threads from December last year, and they relate to
even older ones:
http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg01415.html

Martynas

On Fri, Nov 29, 2013 at 1:47 PM, Jan Kučera kozuc...@gmail.com wrote:
 Ok.

 One is a bit disappointed seeing various projects fail to deliver
 according to their original timelines... It seems like there is not enough
 money in? Do you need more developers to perform better?


 2013/11/26 Lydia Pintscher lydia.pintsc...@wikimedia.de

 On Mon, Nov 25, 2013 at 9:55 PM, Jan Kučera kozuc...@gmail.com wrote:
  Hi,
 
  so how things are going? Anything for testing already?

 Nothing to test yet. As soon as there is I will send an email to this
 list.
 The current status is that we still need to make some final
 adjustments to the database schema and finish the JavaScript part of
 the user interface, as well as ranks.


 Cheers
 Lydia

 --
 Lydia Pintscher - http://about.me/lydia.pintscher
 Product Manager for Wikidata

 Wikimedia Deutschland e.V.
 Obentrautstr. 72
 10963 Berlin
 www.wikimedia.de

 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

 Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
 unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
 Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l



 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] new search backend ready for testing

2013-11-06 Thread Martynas Jusevičius
Hey Lydia,

how about query access?

Martynas
graphityhq.com

On Wed, Nov 6, 2013 at 6:17 PM, Lydia Pintscher
lydia.pintsc...@wikimedia.de wrote:
 Hey everyone,

 Progress! We now have the long awaited new search backend up and
 running for testing on Wikidata. It will still need some tweaking but
 please do try it and give feedback. It is running in parallel to the
 old one. You will need to visit a special page to use it:
 https://www.wikidata.org/w/index.php?search=athens&button=&title=Special%3ASearch&srbackend=CirrusSearch
 Please let me know about any issues you can still find with this so
 we can soon make it the default.

 Thanks to Chad and Katie for working on this.


 Cheers
 Lydia

 --
 Lydia Pintscher - http://about.me/lydia.pintscher
 Product Manager for Wikidata

 Wikimedia Deutschland e.V.
 Obentrautstr. 72
 10963 Berlin
 www.wikimedia.de

 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

 Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
 unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
 Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF Issues

2013-09-26 Thread Martynas Jusevičius
There was a long discussion not so long ago about using established
RDF tools for Wikipedia dumps instead of home-brewed ones, but I guess
someone hasn't learnt the lesson yet.

On Thu, Sep 26, 2013 at 2:22 PM, Kingsley Idehen kide...@openlinksw.com wrote:
 All,

 See: https://www.wikidata.org/wiki/Q76

 The resource to which the URI above resolves contains:
 schema:version 72358096^^xsd:integer  .

 It should be:

 schema:version 72358096^^xsd:integer .

 Who is responsible for RDF resource publication and issue report handling?

 --

 Regards,

 Kingsley Idehen
 Founder & CEO
 OpenLink Software
 Company Web: http://www.openlinksw.com
 Personal Weblog: http://www.openlinksw.com/blog/~kidehen
 Twitter/Identi.ca handle: @kidehen
 Google+ Profile: https://plus.google.com/112399767740508618350/about
 LinkedIn Profile: http://www.linkedin.com/in/kidehen






 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved Wikicode

2013-07-08 Thread Martynas Jusevičius
Here's my approach to software code problems: we need less of it, not
more. We need to remove domain logic from source code and move it into
data, which can be managed and on which UI can be built.
In that way we can build generic scalable software agents. That is the
way to the Semantic Web.
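
(An illustrative aside: a tiny sketch of the idea in Python, with made-up rule names and values — the domain rule lives in data that can be stored and edited, while the code that applies it stays generic.)

    # Domain rules kept as data rather than hard-coded in if-statements.
    # Property names, operators and thresholds are invented for illustration.
    RULES = [
        {"property": "population", "operator": ">=", "value": 0},
        {"property": "label", "operator": "!=", "value": ""},
    ]

    OPERATORS = {
        ">=": lambda a, b: a >= b,
        "!=": lambda a, b: a != b,
    }

    def violations(record, rules):
        """Generic agent: applies whatever rules the data contains."""
        return [r for r in rules
                if not OPERATORS[r["operator"]](record[r["property"]], r["value"])]

    # Editing RULES changes behaviour; no domain-specific code is touched.
    print(violations({"population": -5, "label": "Berlin"}, RULES))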

Martynas
graphityhq.com

On Mon, Jul 8, 2013 at 10:13 PM, Michael Hale hale.michael...@live.com wrote:
 There are lots of code snippets scattered around the internet, but most of
 them can't be wired together in a simple flowchart manner. If you look at
 object libraries that are designed specifically for that purpose, like
 Modelica, you can do all sorts of neat engineering tasks like simulate the
 thermodynamics and power usage of a new refrigerator design. Then if your
 company is designing a new insulation material you would make a new block
 with the experimentally determined properties of your material to include in
 the programmatic flowchart to quickly calibrate other aspects of the
 refrigerator's design. To my understanding, Modelica is as big and good as
 it gets for code libraries that represent physically accurate objects.
 Often, the visual representation of those objects needs to be handled
 separately. As far as general purpose, standard programming libraries go,
 Mathematica is the best one I've found for quickly prototyping new
 functionality. A typical web mashup app or site will combine functionality
 and/or data from 3 to 6 APIs. Mobile apps will typically use the phone's
 functionality, an extra library for better graphics support, a proprietary
 library or two made by the company, and a couple of web APIs. A similar
 story for desktop media-editing programs, business software, and high-end
 games except the libraries are often larger. But there aren't many software
 libraries that I would describe as huge. And there are even fewer that
 manage to scale the usefulness of the library equally with the size it
 occupies on disk.

 Platform fragmentation (increase in number and popularity of smart phones
 and tablets) has proven to be a tremendous challenge for continuing to
 improve libraries. I now just have 15 different ways to draw a circle on
 different screens. The attempts to provide virtual machines with write-once
 run-anywhere functionality (Java and .NET) have failed, often due to
 customer lock-in reasons as much as platform fragmentation. Flash isn't
 designed to grow much beyond its current scope. The web standards can only
 progress as quickly as the least common denominator of functionality
 provided by other means, which is better than nothing I suppose. Mathematica
 has continued to improve their library (that's essentially what they sell),
 but they don't try to cover a lot of platforms. They also aren't open source
 and don't attempt to make the entire encyclopedia interactive and
 programmable. Open source attempts like the Boost C++ library don't seem to
 grow very quickly. But I think using Wikipedia articles as a scaffold for a
 massive open source, object-oriented library might be what is needed.

 I have a few approaches I use to decide what code to write next. They can be
 arranged from most useful as an exercise to stay sharp in the long term to
 most immediately useful for a specific project. Sometimes I just write code
 in a vacuum. Like, I will just choose a simple task like making a 2D ball
 bounce around some stairs interactively and I will just spend a few hours
 writing it and rewriting it to be more efficient and easier to expand. It
 always gives me a greater appreciation for the types of details that can be
 specified to a computer (and hence the scope of the computational universe,
 or space of all computer programs). Like with the ball bouncing example you
 can get lost defining interesting options for the ball and the ground or in
 the geometry logic for calculating the intersections (like if the ball
 doesn't deform or if the stairs have certain constraints on their shape
 there are optimizations you can make). At the end of the exercise I still
 just have a ball bouncing down some stairs, but my mind feels like it has
 been on a journey. Sometimes I try to write code that I think a group of
 people would find useful. I will browse the articles in the areas of
 computer science category by popularity and start writing the first things I
 see that aren't already in the libraries I use. So I'll expand Mathematica's
 FindClusters function to support density based methods or I'll expand the
 RandomSample function to support files that are too large to fit in memory
 with a reservoir sampling algorithm. Finally, I write code for specific
 projects. I'm trying to genetically engineer turf grass that doesn't need to
 be cut, so I need to automate some of the work I do for GenBank imports and
 sequence comparisons. For all of those, if there was an organized place to
 put my code afterwards so it would fit into a larger useful library I would
 totally be willing to do 

Re: [Wikidata-l] Accelerating software innovation with Wikidata and improved Wikicode

2013-07-08 Thread Martynas Jusevičius
Yes, that is one of the reasons functional languages are getting popular:
https://www.fpcomplete.com/blog/2012/04/the-downfall-of-imperative-programming
With PHP and JavaScript being the most widespread (and still misused)
languages we will not get there soon, however.

On Mon, Jul 8, 2013 at 10:57 PM, Michael Hale hale.michael...@live.com wrote:
 In the functional programming language family (think Lisp) there is no
 fundamental distinction between code and data.

 Date: Mon, 8 Jul 2013 22:47:46 +0300
 From: marty...@graphity.org

 To: wikidata-l@lists.wikimedia.org
 Subject: Re: [Wikidata-l] Accelerating software innovation with Wikidata
 and improved Wikicode

 Here's my approach to software code problems: we need less of it, not
 more. We need to remove domain logic from source code and move it into
 data, which can be managed and on which UI can be built.
 In that way we can build generic scalable software agents. That is the
 way to Semantic Web.

 Martynas
 graphityhq.com

 On Mon, Jul 8, 2013 at 10:13 PM, Michael Hale hale.michael...@live.com
 wrote:
  There are lots of code snippets scattered around the internet, but most
  of
  them can't be wired together in a simple flowchart manner. If you look
  at
  object libraries that are designed specifically for that purpose, like
  Modelica, you can do all sorts of neat engineering tasks like simulate
  the
  thermodynamics and power usage of a new refrigerator design. Then if
  your
  company is designing a new insulation material you would make a new
  block
  with the experimentally determined properties of your material to
  include in
  the programmatic flowchart to quickly calibrate other aspects of the
  refrigerator's design. To my understanding, Modelica is as big and good
  as
  it gets for code libraries that represent physically accurate objects.
  Often, the visual representation of those objects needs to be handled
  separately. As far as general purpose, standard programming libraries
  go,
  Mathematica is the best one I've found for quickly prototyping new
  functionality. A typical web mashup app or site will combine
  functionality
  and/or data from 3 to 6 APIs. Mobile apps will typically use the phone's
  functionality, an extra library for better graphics support, a
  proprietary
  library or two made by the company, and a couple of web APIs. A similar
  story for desktop media-editing programs, business software, and
  high-end
  games except the libraries are often larger. But there aren't many
  software
  libraries that I would describe as huge. And there are even fewer that
  manage to scale the usefulness of the library equally with the size it
  occupies on disk.
 
  Platform fragmentation (increase in number and popularity of smart
  phones
  and tablets) has proven to be a tremendous challenge for continuing to
  improve libraries. I now just have 15 different ways to draw a circle on
  different screens. The attempts to provide virtual machines with
  write-once
  run-anywhere functionality (Java and .NET) have failed, often due to
  customer lock-in reasons as much as platform fragmentation. Flash isn't
  designed to grow much beyond its current scope. The web standards can
  only
  progress as quickly as the least common denominator of functionality
  provided by other means, which is better than nothing I suppose.
  Mathematica
  has continued to improve their library (that's essentially what they
  sell),
  but they don't try to cover a lot of platforms. They also aren't open
  source
  and don't attempt to make the entire encyclopedia interactive and
  programmable. Open source attempts like the Boost C++ library don't seem
  to
  grow very quickly. But I think using Wikipedia articles as a scaffold
  for a
  massive open source, object-oriented library might be what is needed.
 
  I have a few approaches I use to decide what code to write next. They
  can be
  arranged from most useful as an exercise to stay sharp in the long term
  to
  most immediately useful for a specific project. Sometimes I just write
  code
  in a vacuum. Like, I will just choose a simple task like making a 2D
  ball
  bounce around some stairs interactively and I will just spend a few
  hours
  writing it and rewriting it to be more efficient and easier to expand.
  It
  always gives me a greater appreciation for the types of details that can
  be
  specified to a computer (and hence the scope of the computational
  universe,
  or space of all computer programs). Like with the ball bouncing example
  you
  can get lost defining interesting options for the ball and the ground or
  in
  the geometry logic for calculating the intersections (like if the ball
  doesn't deform or if the stairs have certain constraints on their shape
  there are optimizations you can make). At the end of the exercise I
  still
  just have a ball bouncing down some stairs, but my mind feels like it
  has
  been on a journey. Sometimes I try to write code that I 

Re: [Wikidata-l] Is an ecosystem of Wikidatas possible?

2013-06-20 Thread Martynas Jusevičius
You probably mean Linked Data?

On Tue, Jun 11, 2013 at 9:41 PM, David Cuenca dacu...@gmail.com wrote:
 While on the Hackathon I had the opportunity to talk with some people from
 sister projects about how they view Wikidata and the relationship it should
 have to sister projects. Probably you are already familiar with the views
 because they have been presented already several times. The hopes are high,
 in my opinion too high, about what can be accomplished when Wikidata is
 deployed to sister projects.

 There are conflicting needs about what belongs in Wikidata and what sister
 projects need, and that divide is far greater than can be overcome just by
 installing the extension. In fact, I think there is a confusion between the
 need for Wikidata and the need for structured data. True that Wikidata
 embodies that technology, but I don't think all problems can be approached
 by the same centralized tool. At least not from the social side of it.
 Wikiquote could have one item for each quote, or Wikivoyage an item for each
 bar, hostel, restaurant, etc..., and the question will always be: are they
 relevant enough to be created in Wikidata? Considering that Wikidata was
 initially conceived for Wikipedia, that scope wouldn't allow those uses.
 However, the structured data needs could be covered in other ways.

 It doesn't need to be one big Wikidata addressing it all. It could well be a
 central Wikidata addressing common issues (like author data, population
 data, etc), plus other Wikidata installs on each sister project that
 requires it. For instance there could be a data.wikiquote.org, a
 data.wikivoyage.org, etc that would cater for the needs of each community,
 that I predict will increase as soon as the benefits become clear, and of
 course linked to the central Wikidata whenever needed. Even Commons could be
 wikidatized with each file becoming an item and having different labels
 representing the file name depending on the language version being accessed.

 Could be this the right direction to go?

 Cheers,
 Micru

 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l


___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Data values

2012-12-19 Thread Martynas Jusevičius
Hey wikidatians,

occasionally checking threads in this list like the current one, I get
a mixed feeling: on one hand, it is sad to see the efforts and
resources wasted as Wikidata tries to reinvent RDF, and now also
triplestore design as well as XSD datatypes. What's next, WikiQL
instead of SPARQL?

On the other hand, it feels reassuring as I was right to predict this:
http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00056.html
http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00750.html

Best,

Martynas
graphity.org

On Wed, Dec 19, 2012 at 4:11 PM, Daniel Kinzler
daniel.kinz...@wikimedia.de wrote:
 On 19.12.2012 14:34, Friedrich Röhrs wrote:
 Hi,

 Sorry for my ignorance, if this is common knowledge: What is the use case for
 sorting millions of different measures from different objects?

 Finding all cities with more than 10 inhabitants requires the database to
 look through all values for the property population (or even all properties
 with countable values, depending on implementation and query planning), compare
 each value with 10 and return those with a greater value. To speed this
 up, an index sorted by this value would be needed.

 For cars there could be entries by the manufacturer, by some
 car-testing magazine, etc. I don't see how this could be adequately
 represented/sorted by a database-only query.

 If this cannot be done adequately on the database level, then it cannot be done
 efficiently, which means we will not allow it. So our task is to come up with
 an architecture that does allow this.

 (One way to allow scripted queries like this to run efficiently is to do this
 in a massively parallel way, using a map/reduce framework. But that's also not
 trivial, and would require a whole new server infrastructure).

 If however this is necessary, I still don't understand why it must affect the
 datavalue structure. If an index is necessary, it could be done over a
 serialized representation of the value.

 Serialized can mean a lot of things, but an index on some data blob is only
 useful for exact matches; it cannot be used for greater/lesser queries. We need
 to map our values to scalar data types the database can understand directly,
 and use them for indexing.

 This needs to be done anyway, since the values are
 saved in a specific unit (which is just a Wikidata item). To compare them on a
 database level they must all be saved in the same unit, or some sort of
 procedure must be used to compare them (or am I missing something again?).

 If they measure the same dimension, they should be saved using the same unit
 (probably the SI base unit for that dimension). Saving values using different
 units would make it impossible to run efficient queries against these values,
 thereby defeating one of the major reasons for Wikidata's existence. I don't
 see a way around this.
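
(An illustrative aside: the two points above — scalar values in a single unit, plus an index — sketched with SQLite; the table, items and figures are made up.)

    import sqlite3

    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # Values stored as plain integers, already normalised to one unit,
    # so the database can index and compare them directly.
    cur.execute("CREATE TABLE claim_value (item TEXT, property TEXT, value INTEGER)")
    cur.execute("CREATE INDEX idx_prop_value ON claim_value (property, value)")
    cur.executemany(
        "INSERT INTO claim_value VALUES (?, ?, ?)",
        [("Q1", "population", 3500000),
         ("Q2", "population", 1400000),
         ("Q3", "population", 56000)],
    )

    # A greater-than query can use the index; with values kept as opaque
    # serialized blobs, only exact-match lookups would be possible.
    cur.execute(
        "SELECT item, value FROM claim_value WHERE property = ? AND value > ?",
        ("population", 1000000),
    )
    print(cur.fetchall())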

 -- daniel

 --
 Daniel Kinzler, Softwarearchitekt
 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.


 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Data values

2012-12-19 Thread Martynas Jusevičius
Denny,

you're sidestepping the main issue here -- every sensible architecture
should build on as many previous standards as possible, and build its own
custom solution only if a *very* compelling reason is found to do so,
instead of finding a compromise between the requirements and the
standard. Wikidata seems to be constantly doing the opposite --
building a custom solution for whatever reason, or even without one.
This drives compatibility and reuse towards zero.

This thread originally discussed datatypes for values such as numbers,
dates and their intervals -- semantics for all of those are defined in
XML Schema Datatypes: http://www.w3.org/TR/xmlschema-2/
All the XML and RDF tools are compatible with XSD, however I don't
think there is even a single mention of it in this thread? What makes
Wikidata so special that its datatypes cannot build on XSD? And this
is only one of the issues, I've pointed out others earlier.
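
(An illustrative aside: what reusing XSD buys — typed literals that standard tooling already understands; the values and names are made up, using rdflib.)

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import XSD

    EX = Namespace("http://example.org/")  # hypothetical namespace

    g = Graph()
    # Numbers, dates and durations all have standard XSD datatypes that
    # off-the-shelf RDF tools can parse, compare and index.
    g.add((EX.Berlin, EX.population, Literal("3500000", datatype=XSD.integer)))
    g.add((EX.Berlin, EX.founded, Literal("1237-01-01", datatype=XSD.date)))
    g.add((EX.someSurvey, EX.period, Literal("P1Y6M", datatype=XSD.duration)))

    print(g.serialize(format="turtle"))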

Martynas
graphity.org


On Wed, Dec 19, 2012 at 5:58 PM, Denny Vrandečić
denny.vrande...@wikimedia.de wrote:
 Martynas,

 could you please let me know where RDF or any of the W3C standards cover
 topics like units, uncertainty, and their conversion? I would be very much
 interested in that.

 Cheers,
 Denny




 2012/12/19 Martynas Jusevičius marty...@graphity.org

 Hey wikidatians,

 occasionally checking threads in this list like the current one, I get
 a mixed feeling: on one hand, it is sad to see the efforts and
 resources wasted as Wikidata tries to reinvent RDF, and now also
 triplestore design as well as XSD datatypes. What's next, WikiQL
 instead of SPARQL?

 On the other hand, it feels reassuring as I was right to predict this:
 http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00056.html
 http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg00750.html

 Best,

 Martynas
 graphity.org

 On Wed, Dec 19, 2012 at 4:11 PM, Daniel Kinzler
 daniel.kinz...@wikimedia.de wrote:
  On 19.12.2012 14:34, Friedrich Röhrs wrote:
  Hi,
 
  Sorry for my ignorance, if this is common knowledge: What is the use
  case for
  sorting millions of different measures from different objects?
 
  Finding all cities with more than 10 inhabitants requires the
  database to
  look through all values for the property population (or even all
  properties
  with countable values, depending on implementation and query planning),
  compare
  each value with 10 and return those with a greater value. To speed
  this
  up, an index sorted by this value would be needed.
 
  For cars there could be entries by the manufacturer, by some
  car-testing magazine, etc. I don't see how this could be adequately
  represented/sorted by a database only query.
 
  If this cannot be done adequately on the database level, then it cannot
  be done
  efficiently, which means we will not allow it. So our task is to come up
  with an
  architecture that does allow this.
 
  (One way to allow scripted queries like this to run efficiently is to
  do this
  in a massively parallel way, using a map/reduce framework. But that's
  also not
  trivial, and would require a whole new server infrastructure).
 
  If however this is necessary, I still don't understand why it must
  affect the
  datavalue structure. If an index is necessary it could be done over a
  serialized
  representation of the value.
 
  Serialized can mean a lot of things, but an index on some data blob is
  only
  useful for exact matches, it can not be used for greater/lesser queries.
  We need
  to map our values to scalar data types the database can understand
  directly, and
  use for indexing.
 
  This needs to be done anyway, since the values are
  saved in a specific unit (which is just a Wikidata item). To compare
  them on a
  database level they must all be saved at the same unit, or some sort of
  procedure must be used to compare them (or am i missing something
  again?).
 
  If they measure the same dimension, they should be saved using the same
  unit
  (probably the SI base unit for that dimension). Saving values using
  different
  units would make it impossible to run efficient queries against these
  values,
  thereby defeating one of the major reasons for Wikidata's existence. I
  don't see a
  way around this.
 
  -- daniel
 
  --
  Daniel Kinzler, Softwarearchitekt
  Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
 
 
  ___
  Wikidata-l mailing list
  Wikidata-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikidata-l

 ___
 Wikidata-l mailing list
 Wikidata-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikidata-l




 --
 Project director Wikidata
 Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
 Tel. +49-30-219 158 26-0 | http://wikimedia.de

 Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
 Eingetragen im Vereinsregister des

Re: [Wikidata-l] Provenance tracking on the Web with NIF-URIs

2012-06-22 Thread Martynas Jusevičius
Denny, the statement level of granularity you're describing is achieved by
RDF reification. You describe it, however, as a deprecated mechanism of
provenance, without backing that up.

Why do you think there must be a better mechanism? Maybe you should take
another look at reification, or lower your provenance requirements, at
least initially?
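
(An illustrative aside: classic RDF reification for the statement-level provenance discussed in the quoted mail below, sketched with rdflib; the namespace and the described triple are made up.)

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/")  # hypothetical namespace

    g = Graph()
    stmt = EX.Statement1

    # Classic reification: the statement itself becomes a resource that
    # records its own subject, predicate and object ...
    g.add((stmt, RDF.type, RDF.Statement))
    g.add((stmt, RDF.subject, EX.Berlin))        # the reified triple is invented
    g.add((stmt, RDF.predicate, EX.population))  # purely for illustration
    g.add((stmt, RDF.object, Literal(3500000)))

    # ... so provenance can then be attached per statement.
    g.add((stmt, EX.accordingTo, URIRef("http://slashdot.org/")))

    print(g.serialize(format="turtle"))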

Martynas
graphity.org
On Jun 22, 2012 5:20 PM, Denny Vrandečić denny.vrande...@wikimedia.de
wrote:

 Here's the use case:

 Every statement in Wikidata will have a URI. Every statement can have
  one or more references.
 In many cases, the reference might be text on a website.

 Whereas it is always possible (and probably what we will do first) as
 well as correct to state:

 Statement1 accordingTo SlashDot .

 it would be preferable to be a bit more specific on that, and most
 preferably it would be to go all the way down to the sentence saying

 Statement1 accordingTo X .

 with X being a URI denoting the sentence that I mean in a specific
 Slashdot-Article.

  I would prefer a standard or widely adopted way to do that, and
 NIF-URIs seem to be a viable solution for that. We will come back to
 this once we start modeling references in more detail.

 The reference could be pointing to a book, to a video, to a
  Mesopotamian stone tablet, etc. (OK, I admit that the different media
 types will be differently prioritized).

 I hope this helps,
 Cheers,
 Denny

 2012/6/21 Sebastian Hellmann hellm...@informatik.uni-leipzig.de:
  Hello Denny,
  I was traveling for the past few weeks and can finally answer your email.
  See my comments inline.
 
   On 05/29/2012 05:25 PM, Denny Vrandečić wrote:
 
  Hello Sebastian,
 
 
  Just a few questions - as you note, it is easier if we all use the same
  standards, and so I want to ask about the relation to other related
  standards:
  * I understand that you dismiss IETF RFC 5147 because it is not stable
  enough, right?
 
  The offset scheme of NIF is built on this RFC.
  So the following would hold:
   @prefix ld: <http://www.w3.org/DesignIssues/LinkedData.html#> .
   @prefix owl: <http://www.w3.org/2002/07/owl#> .
  ld:offset_717_729  owl:sameAs ld:char=717,12 .
 
 
  We might change the syntax and reuse the RFC syntax, but it has several
  issues:
  1.  The optional part is not easy to handle, because you would need to
 add
  owl:sameAs statements:
 
  ld:char=717,12;length=12,UTF-8 owl:sameAs ld:char=717,12;length=12 .
  ld:char=717,12;length=12,UTF-8 owl:sameAs ld:char=717,12 .
  ld:char=717,12;UTF-8 owl:sameAs ld:char=717,12;length=9876 .
 
  So theoretically ok, but annoying to implement and check.
 
  2. When implementing web services, NIF allows the client to choose the
  prefix:
 
 http://nlp2rdf.lod2.eu/demo/NIFStemmer?input-type=textnif=trueprefix=http%3A%2F%2Fthis.is%2Fa%2Fslash%2Fprefix%2Furirecipe=offsetinput=President+Obama+is+president
 .
  returning URIs like http://this.is/a/slash/prefix/offset_10_15
  So RFC 5147 would look like:
  http://this.is/a/slash/prefix/char=717,12
  http://this.is/a/slash/prefix/char=717,12;UTF-8
  or
  http://this.is/a/slash/prefix?char=717,12
  http://this.is/a/slash/prefix?char=717,12;UTF-8
 
   3. Characters like '=' and ',' prevent the use of prefixes:
   echo '@prefix ld: <http://www.w3.org/DesignIssues/LinkedData.html#> .
   @prefix owl: <http://www.w3.org/2002/07/owl#> .
   ld:offset_717_729  owl:sameAs ld:char=717,12 .' > test.ttl ; rapper -i turtle test.ttl
 
  4. implementation is a little bit more difficult, given that :
   $arr = split("_", "offset_717_729");
  switch ($arr[0]){
  case 'offset' :
  $begin = $arr[1];
  $end = $arr[2];
  break;
  case 'hash' :
  $clength = $arr[1];
  $slength = $arr[2];
  $hash = $arr[3];
  $rest = /*merge remaining with '_' */
  break;
  }
 
  5. RFC assumes a certain mime type, i.e. plain text. NIF does have a
 broader
  assumption.
 
  * what is the relation to the W3C media fragment URIs? Did not find a
  pointer there.
 
  They are designed for media such as images, video, not strings.
  Potentially, the same principle can be applied, but it is not yet
  engineered/researched.
 
  * any plans of standardizing your approach?
 
  We will do NIF 2.0  as a community standard and finish it in a couple of
  months. It will be published under open licences, so anybody W3C or ISO
  might pick it up, easily. Other than that there are plans by several EU
  projects (see e.g. here
 
 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0101.html
 )
  and a US project to use it and there are several third party
  implementations, already.  We would rather have it adopted first on a
 large
  scale and then standardized, properly, i.e. W3C. This worked quite well
 for
  the FOAF project or for RDB2RDF Mappers.
  Chances for fast standardization are not so unlikely, I would assume.
 
  We would strongly prefer to just use a standard instead of advocating
  contenders for one -- if one 

Re: [Wikidata-l] Provenance tracking on the Web with NIF-URIs

2012-06-22 Thread Martynas Jusevičius
It says "deprecated" on the Data model wiki.

So maybe Wikidata doesn't need statement-level granularity? Maybe the named
graph approach is good enough? But it's not based on statements.

If you build this kind of data model on a relational store, not to mention
provenance, you will not be able to provide a reasonable query mechanism.
That's the reason why the development of Jena's SDB store is pretty much
abandoned.

Martynas
 On Jun 22, 2012 8:18 PM, Sebastian Hellmann 
hellm...@informatik.uni-leipzig.de wrote:

 Denny didn't even use the word "deprecated".
 Reification for statement-level provenance works, but you won't be able to
 sell it as an elegant solution to the problem.
 So could - yes , should - ?? - probably not

 If Wikidata is using statement-level provenance,  there might be better
 ways to serialize it in RDF than reification in the future
 e.g. NQuads: http://sw.deri.org/2008/07/n-quads/
 or JSON ;)

 For internal use I would discourage reification.
 If using a relational scheme, a statement ID that can be joined with
 another SQL table for provenance is the best way to do it, imho.
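
(An illustrative aside: the named-graph/N-Quads alternative mentioned above, sketched with rdflib's Dataset; the URIs and figures are made up — each source gets its own graph, and the quad's fourth element carries the provenance.)

    from rdflib import Dataset, Literal, Namespace, URIRef

    EX = Namespace("http://example.org/")  # hypothetical namespace

    ds = Dataset()

    # One named graph per source instead of reified statements.
    slashdot = ds.graph(URIRef("http://example.org/graphs/slashdot"))
    slashdot.add((EX.Berlin, EX.population, Literal(3500000)))

    census = ds.graph(URIRef("http://example.org/graphs/census"))
    census.add((EX.Berlin, EX.population, Literal(3469849)))

    print(ds.serialize(format="nquads"))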

 Before you drive us all mad with explaining why reification is bad, I
 would really like you to justify why Wikidata should consider reification.
 I really do not know many use cases (if any) where reification is the right
 choice of modelling. Before going into the discussion any further [1], I
 think you should name an example where reification is really better than
 other options.

 All the best,
 Sebastian

 [1]http://ceur-ws.org/Vol-699/Paper5.pdf


 On 06/22/2012 06:20 PM, Martynas Jusevičius wrote:

 Denny, the statement level of granularity you're describing is achieved by
 RDF reification. You describe it, however, as a deprecated mechanism of
 provenance, without backing that up.

 Why do you think there must be a better mechanism? Maybe you should take
 another look at reification, or lower your provenance requirements, at
 least initially?

 Martynas
 graphity.org
 On Jun 22, 2012 5:20 PM, Denny Vrandečić denny.vrande...@wikimedia.de
 wrote:


  Here's the use case:

 Every statement in Wikidata will have a URI. Every statement can have
 one or more references.
 In many cases, the reference might be text on a website.

 Whereas it is always possible (and probably what we will do first) as
 well as correct to state:

 Statement1 accordingTo SlashDot .

 it would be preferable to be a bit more specific on that, and most
 preferably it would be to go all the way down to the sentence saying

 Statement1 accordingTo X .

 with X being a URI denoting the sentence that I mean in a specific
 Slashdot-Article.

 I would prefer a standard or widely adopted way to do that, and
 NIF-URIs seem to be a viable solution for that. We will come back to
 this once we start modeling references in more detail.

 The reference could be pointing to a book, to a video, to a
 Mesopotamian stone tablet, etc. (OK, I admit that the different media
 types will be differently prioritized).

 I hope this helps,
 Cheers,
 Denny

  2012/6/21 Sebastian Hellmann hellm...@informatik.uni-leipzig.de:

  Hello Denny,
 I was traveling for the past few weeks and can finally answer your email.
 See my comments inline.

  On 05/29/2012 05:25 PM, Denny Vrandečić wrote:

 Hello Sebastian,


 Just a few questions - as you note, it is easier if we all use the same
 standards, and so I want to ask about the relation to other related
 standards:
 * I understand that you dismiss IETF RFC 5147 because it is not stable
 enough, right?

 The offset scheme of NIF is built on this RFC.
 So the following would hold:
  @prefix ld: <http://www.w3.org/DesignIssues/LinkedData.html#> .
  @prefix owl: <http://www.w3.org/2002/07/owl#> .
  ld:offset_717_729  owl:sameAs ld:char=717,12 .


 We might change the syntax and reuse the RFC syntax, but it has several
 issues:
 1.  The optional part is not easy to handle, because you would need to

  add

  owl:sameAs statements:

 ld:char=717,12;length=12,UTF-8 owl:sameAs ld:char=717,12;length=12 .
 ld:char=717,12;length=12,UTF-8 owl:sameAs ld:char=717,12 .
 ld:char=717,12;UTF-8 owl:sameAs ld:char=717,12;length=9876 .

 So theoretically ok, but annoying to implement and check.

 2. When implementing web services, NIF allows the client to choose the
 prefix:


  
 http://nlp2rdf.lod2.eu/demo/NIFStemmer?input-type=textnif=trueprefix=http%3A%2F%2Fthis.is%2Fa%2Fslash%2Fprefix%2Furirecipe=offsetinput=President+Obama+is+president
 .

   returning URIs like http://this.is/a/slash/prefix/offset_10_15
  So RFC 5147 would look like:
  http://this.is/a/slash/prefix/char=717,12
  http://this.is/a/slash/prefix/char=717,12;UTF-8
  or
  http://this.is/a/slash/prefix?char

Re: [Wikidata-l] Provenance tracking on the Web with NIF-URIs

2012-06-22 Thread Martynas Jusevičius
"You do not need the full expressive power of SPARQL or graph
querying" -- what kind of query mechanism is Wikidata planning to
support in later stages? I don't suppose the data model will be
redesigned for that? So in that case you have to have queries in mind
from the start of its design.

Regarding scalability again:

Long-term though it seems likely that native triplestores will have
the advantage for performance. A difficulty with implementing
triplestores over SQL is that although triples may thus be stored,
implementing efficient querying of a graph-based RDF model (i.e.
mapping from SPARQL) onto SQL queries is difficult.
http://en.wikipedia.org/wiki/Triplestore#Implementation

The above results indicate  a superior performance of native stores
like Sesame native, Mulgara and Virtuoso. This is in coherence with
the current emphasis on development of native stores since their
performance can be optimized for RDF.
http://www.bioontology.org/wiki/images/6/6a/Triple_Stores.pdf

On Jun 22, 2012 9:10 PM, Sebastian Hellmann
hellm...@informatik.uni-leipzig.de wrote:

 Dear Martynas,
  as far as I understand it, Wikidata will not need to worry about named graphs
  or the like.
  IIRC Wikidata is building fast software to edit facts and generate
 infoboxes. You do not need the full expressive power of SPARQL or graph 
 querying.
 That is a different use case and can be done by exporting the data and 
 loading it into a triple store/graph database.
 I would assume that the most efficient operation is to retrieve all data for 
 one entity/entry/page?
 So the database needs to be optimized for lookup/update, not graph querying.

 In another mail you said that:

  Regarding scalability -- I can only see these possible cases: either
  Wikidata will not have any query language, or its query language will
  be SQL with never-ending JOINs too complicated to be useful, or it's
 gonna be another query language translated to SQL -- for example
 SPARQL, which is doable but attempts have shown it doesn't scale. A
 native RDF store is much more performant.

 Do you have a reference for this? I always thought it was exactly the 
 opposite, i.e. SPARQL2SQL mappers performing better than native stores.

 Cheers,
 Sebastian


 On 06/22/2012 08:43 PM, Martynas Jusevičius wrote:

 It says deprecated on the Data model wiki.

 So maybe Wikidata doesn't need statement-level granularity? Maybe the named
 graph approach is good enough? But it's not based on statements.

 If you build this kind of data model on the relational, not to mention
 provenance, you will not be able to provide a reasonable query mechanism.
 That's the reason why the development of Jena's SDB store is pretty much
 abandoned.

 Martynas
  On Jun 22, 2012 8:18 PM, Sebastian Hellmann 
 hellm...@informatik.uni-leipzig.de wrote:

  Denny didn't even use the word deprecated.
 Reification for statement-level provenance works, but you won't be able to
 sell it as an elegant solution to the problem.
 So could - yes , should - ?? - probably not

 If Wikidata is using statement-level provenance,  there might be better
 ways to serialize it in RDF than reification in the future
 e.g. NQuads: http://sw.deri.org/2008/07/n-quads/
 or JSON ;)

 For internal use I would discourage reification.
  If using a relational scheme, a statement ID that can be joined with
  another SQL table for provenance is the best way to do it, imho.

  Before you drive us all mad with explaining why reification is bad, I
  would really like you to justify why Wikidata should consider reification.
  I really do not know many use cases (if any) where reification is the right
 choice of modelling. Before going into the discussion any further [1], I
 think you should name an example where reification is really better than
 other options.

 All the best,
 Sebastian

 [1]http://ceur-ws.org/Vol-699/Paper5.pdf


 On 06/22/2012 06:20 PM, Martynas Jusevičius wrote:

 Denny, the statement level of granularity you're describing is achieved by
 RDF reification. You describe it, however, as a deprecated mechanism of
 provenance, without backing that up.

 Why do you think there must be a better mechanism? Maybe you should take
 another look at reification, or lower your provenance requirements, at
 least initially?

 Martynas
 graphity.org
 On Jun 22, 2012 5:20 PM, Denny Vrandečić denny.vrande...@wikimedia.de
 wrote:


  Here's the use case:

 Every statement in Wikidata will have a URI. Every statement can have
 one or more references.
 In many cases, the reference might be text on a website.

 Whereas it is always possible (and probably what we will do first) as
 well as correct to state:

 Statement1 accordingTo SlashDot .

 it would be preferable to be a bit more specific on that, and most
 preferably it would be to go all the way down to the sentence saying

 Statement1 accordingTo X .

 with X being a URI denoting the sentence that I mean in a specific
 Slashdot-Article.

 I would prefer