[Wikidata-bugs] [Maniphest] T347333: Tune process_sparql_query_hourly so that it does not get killed by yarn

2023-09-29 Thread Gehel
Gehel closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T347333

2023-09-27 Thread EBernhardson
EBernhardson moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. EBernhardson added a comment. Reran 2023-09-21T16:00:00, which was previously failing, with memory overhead unconfigured and with the new patch to repartition the input. This has ru

2023-09-27 Thread Maintenance_bot
Maintenance_bot removed a project: Patch-For-Review.

2023-09-27 Thread CodeReviewBot
CodeReviewBot added a comment. ebernhardson merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/504 search: Update rdf-spark-tools to .131

2023-09-27 Thread CodeReviewBot
CodeReviewBot added a project: Patch-For-Review. CodeReviewBot added a comment. ebernhardson opened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/504 search: Update rdf-spark-tools to .131

2023-09-27 Thread Maintenance_bot
Maintenance_bot removed a project: Patch-For-Review.

2023-09-27 Thread gerritbot
gerritbot added a comment. Change 961441 **merged** by jenkins-bot: [wikidata/query/rdf@master] spark queries processor: Force a shuffle on input https://gerrit.wikimedia.org/r/961441

2023-09-27 Thread EBernhardson
EBernhardson added a comment. 8g was still insufficient, one of the failed jobs passed but the other three still had trouble. Increasing to 12g made it work, but if 8g is already excessive 12g is only more of the same. Returning to the earlier idea of forcing the job to be split up more, pa
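The idea of splitting the job into more pieces can be illustrated without a cluster. The sketch below is a plain-Python model, not the rdf-spark-tools patch and not Spark code: it assumes records are dealt round-robin into a chosen number of partitions, roughly what a forced shuffle does, and shows how the largest per-task slice shrinks. The split sizes, the partition count of 24, and the function names are all made up for illustration.

```python
from itertools import chain


def repartition(splits, num_partitions):
    """Deal every record round-robin into num_partitions new lists.

    A simplified model of a forced shuffle; real Spark moves data between
    executors and may start each deal at a random offset.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be >= 1")
    out = [[] for _ in range(num_partitions)]
    for i, record in enumerate(chain.from_iterable(splits)):
        out[i % num_partitions].append(record)
    return out


def largest(partitions):
    """Size of the biggest partition, a rough proxy for peak task memory."""
    return max((len(p) for p in partitions), default=0)


# Hypothetical input: three splits, the third much larger than the others.
input_splits = [list(range(100)), list(range(100)), list(range(1000))]
shuffled = repartition(input_splits, 24)

print(largest(input_splits))  # biggest slice before the shuffle
print(largest(shuffled))      # biggest slice after the shuffle
```

In this model the biggest slice a single task has to hold drops from the size of the largest input split to an even share of the total, which is why adding partitions can help where raising memory only postpones the failure.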

2023-09-27 Thread gerritbot
gerritbot added a project: Patch-For-Review.

2023-09-27 Thread gerritbot
gerritbot added a comment. Change 961441 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson): [wikidata/query/rdf@master] spark queries processor: Force a shuffle on input https://gerrit.wikimedia.org/r/961441

2023-09-26 Thread EBernhardson
EBernhardson added a comment. Unfortunately the above patch doesn't seem to have worked. Spark turned the input into three tasks. They were all assigned to the same executor, the first two finished and the third caused the container to die after another ~45s due to memory constraints. Spark

2023-09-26 Thread Maintenance_bot
Maintenance_bot removed a project: Patch-For-Review.

2023-09-26 Thread CodeReviewBot
CodeReviewBot added a comment. ebernhardson merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/502 search: tune process_sparql_query_hourly

2023-09-26 Thread CodeReviewBot
CodeReviewBot added a project: Patch-For-Review. CodeReviewBot added a comment. dcausse opened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/502 search: tune process_sparql_query_hourly

2023-09-26 Thread dcausse
dcausse claimed this task. dcausse moved this task from Incoming to In Progress on the Discovery-Search (Current work) board. dcausse set the point value for this task to "2".

2023-09-26 Thread dcausse
dcausse added a project: Discovery-Search (Current work).

2023-09-25 Thread Maintenance_bot
Maintenance_bot added a project: Wikidata.

2023-09-25 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION The job process_sparql_query_hourly is getting killed by YARN with: Caused by: org.apache.spark.SparkException: Job aborted due to stage failu
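For context on why YARN kills a Spark executor at all: the container is sized as executor memory plus memory overhead, and a process that grows past that total is terminated. The sketch below works through that arithmetic. It assumes Spark's documented defaults for the overhead (10% of executor memory, with a 384 MiB minimum), ignores YARN's rounding to its allocation increment and any extra PySpark or off-heap settings, and uses hypothetical memory values; the settings this job actually ran with are not shown in the truncated description.

```python
MIN_OVERHEAD_MIB = 384   # Spark's documented minimum memory overhead
OVERHEAD_FACTOR = 0.10   # Spark's documented default overhead factor


def container_limit_mib(executor_memory_mib, overhead_mib=None):
    """Approximate memory YARN grants one executor container, in MiB.

    If overhead_mib is None, the overhead is left "unconfigured" and the
    default rule applies: max(10% of executor memory, 384 MiB).
    """
    if overhead_mib is None:
        overhead_mib = max(int(executor_memory_mib * OVERHEAD_FACTOR),
                           MIN_OVERHEAD_MIB)
    return executor_memory_mib + overhead_mib


# Hypothetical executor sizes, not taken from the job's configuration.
print(container_limit_mib(2048))        # small executor: minimum overhead applies
print(container_limit_mib(8192))        # larger executor: 10% rule applies
print(container_limit_mib(4096, 2048))  # explicitly configured overhead
```

Raising either number lifts the ceiling, but it does not reduce how much a single task needs, which is the trade-off the later comments in this thread weigh against splitting the input further.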