[Wikidata-bugs] [Maniphest] T350106: Implement a spark job that converts a RDF triples table into a RDF file format

2024-06-03 Thread Maintenance_bot
Maintenance_bot removed a project: Patch-For-Review. TASK DETAIL https://phabricator.wikimedia.org/T350106 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dr0ptp4kt, Maintenance_bot Cc: Gehel, RKemper, EBernhardson, Aklapper, BTullis, bking,

2024-06-03 Thread gerritbot
gerritbot added a comment. Change #1038328 **merged** by Bking: [operations/puppet@production] Remove temporary firewall rule for WDQS graph_split https://gerrit.wikimedia.org/r/1038328

2024-06-03 Thread gerritbot
gerritbot added a project: Patch-For-Review.

2024-06-03 Thread gerritbot
gerritbot added a comment. Change #1038328 had a related patch set uploaded (by Btullis; author: Btullis): [operations/puppet@production] Remove temporary firewall rule for WDQS graph_split https://gerrit.wikimedia.org/r/1038328

2024-01-19 Thread Gehel
Gehel closed this task as "Resolved".

2024-01-09 Thread Maintenance_bot
Maintenance_bot removed a project: Patch-For-Review.

2024-01-09 Thread gerritbot
gerritbot added a comment. Change 980037 **merged** by jenkins-bot: [wikidata/query/rdf@master] HDFS to .ttl statement generator https://gerrit.wikimedia.org/r/980037

2024-01-04 Thread dr0ptp4kt
dr0ptp4kt added a comment. Imports seemed to work. **Non-scholarly article side (proxied to wdqs1024.eqiad.wmnet)** F41650681: split-non-schol-side.gif **Scholarly article side (proxied to wdqs1023.eqiad.wmnet)** F41650680:

2023-12-18 Thread Gehel
Gehel added a comment. We want to add some more tests before closing this task.

2023-12-07 Thread RKemper
RKemper added a comment. Here's some extra notes with some of the commands we ran/used: P54284

2023-12-07 Thread Stashbot
Stashbot added a comment. Mentioned in SAL (#wikimedia-operations) [2023-12-07T19:35:40Z] END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs[1022-1024].eqiad.wmnet with reason: graph split experiments T350106

2023-12-07 Thread Stashbot
Stashbot added a comment. Mentioned in SAL (#wikimedia-operations) [2023-12-07T19:35:24Z] START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wdqs[1022-1024].eqiad.wmnet with reason: graph split experiments T350106

2023-12-07 Thread bking
bking added a comment. I started a transfer of the gzip files mentioned above to `wdqs1023` from `wdqs1024` (wdqs hosts have 10Gbps Ethernet vs. 1Gbps for the stat machines, so this should be faster). You can set a temporary iptables rule to allow traffic between hosts on an

2023-12-06 Thread gerritbot
gerritbot added a comment. Change 980914 **merged** by Ryan Kemper: [operations/puppet@production] wdqs: open firewall rules for graph_split https://gerrit.wikimedia.org/r/980914

2023-12-06 Thread gerritbot
gerritbot added a comment. Change 980914 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper): [operations/puppet@production] wdqs: open firewall rules for graph_split https://gerrit.wikimedia.org/r/980914

2023-12-05 Thread dr0ptp4kt
dr0ptp4kt added a comment. After an update to the script (PS6) and a fresh run of the same commands new files have been `hdfs-rsync`'d to `stat1006:~dr0ptp4kt/gzips` in anticipation of doing a file transfer over to the WDQS graph split test servers. Here's a very small sample of what

2023-12-04 Thread dr0ptp4kt
dr0ptp4kt added a subscriber: RKemper. dr0ptp4kt added a comment. I ran the current version of the code as follows: spark3-submit --master yarn --driver-memory 16G --executor-memory 12G --executor-cores 4 --conf spark.driver.cores=2 --conf spark.executor.memoryOverhead=4g --conf

2023-12-04 Thread dr0ptp4kt
dr0ptp4kt added a comment. Not using right now, but here's roughly how one might go about generating more expanded Turtle statements without reverse-mapping prefixes: F41561068
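
For readers without access to the attachment, the difference between "expanded" statements and reverse-mapped prefixes can be illustrated with a minimal sketch. This is not the content of F41561068 and not the merged job's code: the `PREFIXES` map and the `expanded`, `compacted`, and `statement` helpers are hypothetical names chosen for illustration, and all three terms are assumed to be IRIs.

```python
# Hypothetical prefix table; the two namespaces are the usual Wikidata ones.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
}


def expanded(iri: str) -> str:
    """Write the IRI in full, which needs no prefix table at all."""
    return f"<{iri}>"


def compacted(iri: str, prefixes: dict = PREFIXES) -> str:
    """Reverse-map an IRI to prefix:local form, falling back to the full IRI."""
    # Try the longest namespace first so nested namespaces resolve correctly.
    for name, ns in sorted(prefixes.items(), key=lambda kv: len(kv[1]), reverse=True):
        if iri.startswith(ns):
            local = iri[len(ns):]
            # Only compact simple alphanumeric local names in this sketch.
            if local.isalnum():
                return f"{name}:{local}"
    return expanded(iri)


def statement(s: str, p: str, o: str, term=expanded) -> str:
    """Format one statement whose subject, predicate, and object are IRIs."""
    return f"{term(s)} {term(p)} {term(o)} ."
```

Calling `statement(...)` with the default `term=expanded` yields longer lines but avoids a lookup per term; passing `term=compacted` produces the shorter prefixed form, which additionally requires matching `@prefix` declarations in the output file.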

2023-12-04 Thread gerritbot
gerritbot added a project: Patch-For-Review.

2023-12-04 Thread gerritbot
gerritbot added a comment. Change 980037 had a related patch set uploaded (by Dr0ptp4kt; author: Dr0ptp4kt): [wikidata/query/rdf@master] WIP DNM: HDFS to .ttl statement generator https://gerrit.wikimedia.org/r/980037

2023-11-29 Thread dr0ptp4kt
dr0ptp4kt added a subscriber: EBernhardson. dr0ptp4kt added a comment. Adding a note so I don't forget: advice from @BTullis is to avoid NFS if possible, and advice from @JAllemandou is to consider use of `hdfs-rsync` (after our call I sought this out and found these:

2023-11-29 Thread dr0ptp4kt
dr0ptp4kt claimed this task.

2023-11-20 Thread Gehel
Gehel added a parent task: T350465: Load Wikidata split graphs into test servers.

2023-11-06 Thread Gehel
Gehel set the point value for this task to "5".

2023-11-03 Thread Gehel
Gehel triaged this task as "High" priority.

2023-11-03 Thread Gehel
Gehel removed a project: Data-Platform-SRE.

2023-10-31 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata, Wikidata-Query-Service, Data-Platform-SRE, Discovery-Search (Current work). TASK DESCRIPTION The table `wikibase_rdf` contains 4 columns (not counting partition columns): - context - subject - predicate - object We
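
As a rough illustration of the conversion the task title describes, the sketch below turns rows with the four listed columns into one statement per line. It rests on assumptions not stated in the snippet: each column is assumed to already hold a serialized RDF term (an IRI in angle brackets or a quoted literal), the `context` column is assumed not to appear in the output, and `TripleRow`, `row_to_statement`, and `rows_to_document` are hypothetical names. It is plain Python rather than the Spark job itself.

```python
from typing import Iterable, NamedTuple


class TripleRow(NamedTuple):
    """One row of the table, using the four column names from the task."""
    context: str
    subject: str
    predicate: str
    object: str


def row_to_statement(row: TripleRow) -> str:
    # Assumption: the terms are already serialized, so they are joined as-is.
    # The context column is deliberately left out of the statement here.
    return f"{row.subject} {row.predicate} {row.object} ."


def rows_to_document(rows: Iterable[TripleRow]) -> str:
    """Concatenate statements, one per line, into a single text document."""
    return "".join(row_to_statement(r) + "\n" for r in rows)
```

In a distributed job, the same per-row formatting would typically run inside a map over the table's partitions, with each partition written out as its own compressed file.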