[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-05-14 Thread Smalyshev
Smalyshev added a comment. @Thompsonbry.systap I've got this message:

  WARN : AbstractBTree.java:2135: Bloom filter disabled - maximum error rate would be exceeded: entryCount=1883228, factory=BloomFilterFactory{ n=100, p=0.02, maxP=0.15, maxN=1883227}

Anything to worry about/change the

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-05-14 Thread Thompsonbry.systap
Thompsonbry.systap added a comment. That is normal. You can choose to explicitly disable bloom filters in advance. Otherwise they are disabled once their expected error rate would be too high. Nothing to be concerned about. Bryan TASK DETAIL https://phabricator.wikimedia.org/T92308 EMAIL
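As a rough illustration of why a filter gets switched off once it holds too many entries, here is a minimal sketch using the textbook Bloom filter formulas. It is not Blazegraph's BloomFilterFactory code, and the function names and parameter values are assumptions chosen for the example.

```python
import math


def bloom_parameters(n, p):
    """Textbook sizing: bit count m and hash count k for n entries at target rate p."""
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    k = max(1, round(m / n * math.log(2)))
    return m, k


def false_positive_rate(m, k, entries):
    """Expected false-positive rate after `entries` insertions."""
    return (1.0 - math.exp(-k * entries / m)) ** k


def max_entries(m, k, max_p):
    """Largest entry count whose expected error rate stays at or below max_p."""
    # Invert the rate formula: entries = -(m / k) * ln(1 - max_p ** (1 / k))
    return math.floor(-(m / k) * math.log(1.0 - max_p ** (1.0 / k)))


# Illustrative values only (assumption): sized for 1000 entries at p=0.02,
# with 0.15 as the highest tolerated error rate.
m, k = bloom_parameters(1000, 0.02)
limit = max_entries(m, k, 0.15)
```

Past `limit` entries, the expected error rate in this model exceeds the tolerated maximum, which is the point at which keeping the filter stops paying off.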

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-05-01 Thread gerritbot
gerritbot added a comment. Change 208029 merged by jenkins-bot: Set up default settings as suggested in https://phabricator.wikimedia.org/T92308 https://gerrit.wikimedia.org/r/208029

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-04-30 Thread gerritbot
gerritbot added a subscriber: gerritbot. gerritbot added a comment. Change 208029 had a related patch set uploaded (by Smalyshev): Set up default settings as suggested in https://phabricator.wikimedia.org/T92308 https://gerrit.wikimedia.org/r/208029

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-04-30 Thread Smalyshev
Smalyshev added a comment. @Thompsonbry.systap I've added your recommended settings to our default config. One question: I see `com.bigdata.namespace.kb`, so I imagine this is a per-namespace setting, and if we use a different namespace we should change it? Can we set it for all namespaces?

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-04-30 Thread Thompsonbry.systap
Thompsonbry.systap added a comment. Yes, but. There are global defaults and you can set them. For example, if you list out the namespace properties you will see how the 128 default is set. The issue is that we are using different branching factors for the spo and lex relations, and even inside of
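If the per-namespace keys do need to follow the namespace name, one way to carry settings over is to rewrite the `com.bigdata.namespace.kb` prefix. This is a minimal sketch under that assumption. The helper name, the target namespace `wdq`, the key suffixes, and the values other than the 128 default are illustrative and not taken from this thread.

```python
def retarget_namespace(properties, old_ns, new_ns):
    """Return a copy of `properties` with per-namespace keys moved to `new_ns`.

    Keys that do not start with the old namespace prefix (for example
    global defaults) are copied unchanged.
    """
    old_prefix = f"com.bigdata.namespace.{old_ns}."
    new_prefix = f"com.bigdata.namespace.{new_ns}."
    result = {}
    for key, value in properties.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        result[key] = value
    return result


# Illustrative settings (assumption): separate branching factors for the
# spo and lex relations plus a global default.
settings = {
    "com.bigdata.btree.BTree.branchingFactor": "128",
    "com.bigdata.namespace.kb.spo.com.bigdata.btree.BTree.branchingFactor": "1024",
    "com.bigdata.namespace.kb.lex.com.bigdata.btree.BTree.branchingFactor": "400",
}
wdq_settings = retarget_namespace(settings, "kb", "wdq")
```

The global key is left alone, so any default it sets would still apply to every namespace.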

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-12 Thread Thompsonbry.systap
Thompsonbry.systap added a comment. Ok. That query

  PREFIX wdt: <http://wikidata-wdq.testme.wmflabs.org/entity/assert/>
  PREFIX entity: <http://wikidata-wdq.testme.wmflabs.org/entity/>
  SELECT ?h ?date WHERE {
    ?h wdt:P31 entity:Q5 .
    ?h wdt:P569 ?date .
    FILTER NOT EXISTS {?h wdt:P570
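For anyone wanting to try a query of this shape, here is a minimal sketch that prepares a SPARQL 1.1 Protocol POST request with the Python standard library. It uses only the two complete triple patterns and leaves out the FILTER clause, which is cut off above. The endpoint URL is an assumption about a local test install, and `build_request` is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

# Assumption: a local Blazegraph test instance; adjust to the real endpoint.
ENDPOINT = "http://localhost:9999/bigdata/sparql"

# Only the complete patterns from the quoted query; the truncated FILTER is omitted.
QUERY = """\
PREFIX wdt: <http://wikidata-wdq.testme.wmflabs.org/entity/assert/>
PREFIX entity: <http://wikidata-wdq.testme.wmflabs.org/entity/>
SELECT ?h ?date WHERE {
  ?h wdt:P31 entity:Q5 .
  ?h wdt:P569 ?date .
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a form-encoded POST request carrying the query, asking for JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )
```

With a server running, `urllib.request.urlopen(build_request(QUERY))` would send it. Nothing is sent until that call is made.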

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-12 Thread Smalyshev
Smalyshev added a comment. @Thompsonbry.systap I'll try the mailing list, thanks. 1. Does it imply the error I'm seeing is a bug and not something I'm doing wrong? 2. Yes, it's different data. We intend to have useable dumps soon (hopefully, next week) but don't have them yet. I'll see maybe

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-12 Thread Smalyshev
Smalyshev added a comment. Prefixes used in current format:

  @prefix wikibase: <http://www.wikidata.org/ontology-0.0.1#> .
  @prefix schema: <http://schema.org/> .
  @prefix cc: <http://creativecommons.org/ns#> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
  @prefix data:

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-11 Thread Thompsonbry.systap
Thompsonbry.systap added a comment. This may not be the right ticket, but I did some experimentation with the data sets that I referenced above looking at parameterization of the load. Using an Intel 2011 Mac Mini with 16GB of RAM and an SSD I have a total throughput across all datasets of 6

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-10 Thread Thompsonbry.systap
Thompsonbry.systap added a comment. It might be useful to take some of these questions to the bigdata-developers mailing list. Some of these questions already have answers on the wiki. 1. RDR Syntax. For the RDR exception, please create a unit test. There are a few places where that test

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-10 Thread Thompsonbry.systap
Thompsonbry.systap added a comment. 1. I am not sure. That's why I would like to see it in a test case. Note that the openrdf jars need to appear before the blazegraph jars on the classpath in order for the blazegraph RDF parser not to be replaced by the openrdf parser. If that happens then it

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-10 Thread Beebs.systap
Beebs.systap added a comment. Regarding the executable jar, you can pass the property file with -Dbigdata.propertyFile=path

  java -server -Xmx4g -Dbigdata.propertyFile=/etc/blazegraph/RWStore.properties -jar bigdata-1.5.0-bundled.jar

It is fine to run this way for development and testing. For
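When the server is launched from a script, the same invocation can be assembled programmatically. This is a minimal sketch. `blazegraph_command` is a hypothetical helper, and the defaults simply mirror the values in the command quoted above.

```python
def blazegraph_command(
    jar="bigdata-1.5.0-bundled.jar",
    property_file="/etc/blazegraph/RWStore.properties",
    heap="4g",
):
    """Return the argument list for starting the executable jar.

    JVM options and -D system properties must come before -jar; anything
    after the jar name is passed to the application instead of the JVM.
    """
    return [
        "java",
        "-server",
        f"-Xmx{heap}",
        f"-Dbigdata.propertyFile={property_file}",
        "-jar",
        jar,
    ]


command = blazegraph_command()
```

The list can be handed to `subprocess.Popen(command)` to start the server without involving a shell.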

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-10 Thread Thompsonbry.systap
Thompsonbry.systap added a comment. What is the correct prefix declaration for wdt?

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-10 Thread Smalyshev
Smalyshev added a comment. 1. I am running the bundled jar, so there should be no classpath issues. 2. Current versions of the dumps (short ones, 3m entities) can be seen here: http://wdq-wikidata.testme.wmflabs.org/dumps/ I'll look into how to do the unit tests.

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-10 Thread Thompsonbry.systap
Thompsonbry.systap added a comment. Great! Please put my name on ticket so I will see it. @beebs.systap Unfortunately the executable JAR might not provide a sufficiently strong class ordering guarantee.

[Wikidata-bugs] [Maniphest] [Commented On] T92308: Open questions for Blazegraph data model research

2015-03-10 Thread Smalyshev
Smalyshev added a comment. 1. I've created a test case: https://github.com/smalyshev/blazegraph/pull/1 I'll also submit an issue to trac.