Re: [Wikidata] Significant change of Wikidata dump size

2019-06-25 Thread Stas Malyshev
Hi!

On 6/25/19 11:17 PM, Ariel Glenn WMF wrote:
> I think the issue is with the 0624 json dumps, which do seem a lot
> smaller than previous weeks' runs.

Ah, true, I didn't realize that. I think this may be because of that dumpJson.php issue, which is now fixed. Maybe rerun the dump?

-- Stas Malyshev

Re: [Wikidata] Significant change of Wikidata dump size

2019-06-25 Thread Ariel Glenn WMF
I think the issue is with the 0624 json dumps, which do seem a lot smaller than previous weeks' runs.

On Wed, Jun 26, 2019 at 8:22 AM Stas Malyshev wrote:
> Hi!
>
> > Which script, please, and which dump? (The conversation was not
> > forwarded so I don't have the context.)
>
> Sorry, the original complaint was: […]

[Wikidata] Performance and update versus query

2019-06-25 Thread Gerard Meijssen
Hoi,

The performance of the query update is getting worse. Questions about this have been raised before. I do remember quality replies along the lines of "it is not exponential, so there is no problem." However, here we are, and there is a problem. The problem is that I run batch jobs, batch jobs that do not run […]
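For readers who want to put a number on the lag Gerard is describing, one rough way (not the project's official metric) is to compare the modification date that the query service itself reports against the current time. The sketch below is only illustrative; the User-Agent string is a placeholder.

import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# WDQS exposes the timestamp of the data it is serving on this triple.
QUERY = "SELECT ?modified WHERE { <http://www.wikidata.org> schema:dateModified ?modified }"
url = "https://query.wikidata.org/sparql?format=json&query=" + urllib.parse.quote(QUERY)
req = urllib.request.Request(url, headers={"User-Agent": "lag-check-sketch/0.1"})

with urllib.request.urlopen(req) as resp:
    data = json.load(resp)

value = data["results"]["bindings"][0]["modified"]["value"]
modified = datetime.fromisoformat(value.replace("Z", "+00:00"))
print("approximate update lag:", datetime.now(timezone.utc) - modified)

A small positive lag is normal; what batch jobs notice is when this difference grows into many minutes or hours.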

Re: [Wikidata] Significant change of Wikidata dump size

2019-06-25 Thread Stas Malyshev
Hi!

> Which script, please, and which dump? (The conversation was not
> forwarded so I don't have the context.)

Sorry, the original complaint was:

> I apologize if I missed something, but why is the current JSON dump size
> ~25GB while a week ago it was ~58GB? (see https://dumps.wikimedia.org/wiki […]

Re: [Wikidata] Significant change of Wikidata dump size

2019-06-25 Thread Ariel Glenn WMF
Which script, please, and which dump? (The conversation was not forwarded so I don't have the context.)

On Wed, Jun 26, 2019 at 3:39 AM Stas Malyshev wrote:
> Hi!
>
> > Follow-up: according to my processing script, this dump contains
> > only 30280591 entries, while the main page is still advertising
> > 57M+ data items. […]

Re: [Wikidata] Significant change of Wikidata dump size

2019-06-25 Thread Stas Malyshev
Hi!

> Follow-up: according to my processing script, this dump contains
> only 30280591 entries, while the main page is still advertising 57M+
> data items.
> Isn't it a bug in the dump process?

There was a problem with the dump script (since fixed), so the dump may indeed be broken. CCing Ariel to […]

Re: [Wikidata] Significant change of Wikidata dump size

2019-06-25 Thread Vladimir Ryabtsev
Follow-up: according to my processing script, this dump contains only 30280591 entries, while the main page is still advertising 57M+ data items. Isn't it a bug in the dump process?

Regards,
Vladimir

On Mon, Jun 24, 2019 at 19:37, Vladimir Ryabtsev wrote:
> Hello,
>
> I apologize if I missed something, […]
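For anyone who wants to reproduce this kind of entry count, here is a minimal sketch (not Vladimir's actual script) that tallies entities in the gzip-compressed JSON dump. The filename is a placeholder, and it relies on the usual dump layout: one serialized entity per line inside a single JSON array.

import gzip

DUMP_PATH = "wikidata-20190624-all.json.gz"  # placeholder; point at the dump you downloaded

def count_entities(path):
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as dump:
        for line in dump:
            line = line.strip()
            # Skip the opening "[" and closing "]" lines of the JSON array.
            if line in ("[", "]"):
                continue
            # Every remaining line holds one entity object (trailing comma on all but the last).
            count += 1
    return count

if __name__ == "__main__":
    print(count_entities(DUMP_PATH))

A complete weekly run should report something in the region of the 57M+ items advertised on the main page; a count near 30M, as reported above, would point to a truncated dump rather than a counting error.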

Re: [Wikidata] minimal hardware requirements for loading wikidata dump in Blazegraph

2019-06-25 Thread Ted Thibodeau Jr
On Jun 20, 2019, at 08:37 AM, Adam Sanchez wrote:
>
> For your information
> ...
> b) It took 43 hours to load the Wikidata RDF dump
> (wikidata-20190610-all-BETA.ttl, 383G) into the dev version of Virtuoso
> 07.20.3230.
> I had to patch Virtuoso because it was giving the following error each
> time […]
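For context, a load like the one Adam describes is normally done with Virtuoso's bulk loader (ld_dir / rdf_loader_run). The sketch below is not his procedure, only a rough illustration of that sequence driven from Python through the isql client; the port, credentials, directory, file mask, and graph IRI are all assumptions, and it presumes the big Turtle dump has been split into smaller chunks.

import subprocess

ISQL = ["isql", "1111", "dba", "dba"]       # host:port, user, password (defaults; adjust for your install)
DATA_DIR = "/data/wikidata"                 # must be listed in DirsAllowed in virtuoso.ini
GRAPH = "http://www.wikidata.org/"          # target graph IRI (assumption)

BULK_LOAD_SQL = f"""
-- Register every Turtle chunk in the directory with the bulk loader.
ld_dir('{DATA_DIR}', 'wikidata-*.ttl.gz', '{GRAPH}');
-- Run the loader; several of these can be run in parallel sessions.
rdf_loader_run();
-- Make the loaded data durable.
checkpoint;
"""

def run():
    # Feed the SQL script to isql on stdin.
    subprocess.run(ISQL, input=BULK_LOAD_SQL, text=True, check=True)

if __name__ == "__main__":
    run()

Running several rdf_loader_run() sessions at once is the commonly suggested way to shorten wall-clock load time on a multi-core machine, which is relevant to the 43-hour figure quoted above.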

Re: [Wikidata] Scaling Wikidata Query Service

2019-06-25 Thread Ted Thibodeau Jr
On Jun 17, 2019, at 03:41 PM, Finn Aarup Nielsen wrote:
>
> Changing the subject a bit:

Well... Changing the subject a *lot*, to an extent probably worthy of its own subject line, and an entirely new thread, not only because it seems more relevant to the "shex-simple Toolforge tool" you re[…]