[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread Eevans
Eevans added a comment.

In https://phabricator.wikimedia.org/T116247#1748095, @Ottomata wrote:

> > So the producer would store the same time stamp twice? UUID v1 already 
> > contains it.
>
>
> Could you provide an example of what this UUID would look like?
>
> A reason for having a timestamp only field is so that applications can use it 
> for time based logic without having to also know how to extract the timestamp 
> out of an overloaded uuid.


Using Python as an example (and sticking strictly to what's in the standard 
lib):

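  import datetime
  from uuid import uuid1
  
  u = uuid1()
  
  # u.time counts 100-ns units since the UUID epoch; shift to the Unix epoch, then convert to seconds
  print(datetime.datetime.fromtimestamp((u.time - 0x01b21dd213814000) * 100 / 1e9))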

The constant `0x01b21dd213814000` represents the number of 100-ns units between 
the epoch that UUIDs use (1582-10-15 00:00:00), and the standard unix epoch.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Eevans
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T114586: Special:NewItem redirects to https://www.wikidata.org/wiki/NewItem after clicking "create" following a "already exists" message

2015-10-23 Thread Sjoerddebruin
Sjoerddebruin added a subscriber: Sjoerddebruin.
Sjoerddebruin added a comment.

Can confirm.


TASK DETAIL
  https://phabricator.wikimedia.org/T114586

To: Sjoerddebruin
Cc: Sjoerddebruin, Aklapper, Stryn, Wikidata-bugs, aude



[Wikidata-bugs] [Maniphest] [Commented On] T114368: undefined method `last_session_ids=' for MediawikiSelenium::BrowserFactory::Chrome:Class (NoMethodError)

2015-10-23 Thread gerritbot
gerritbot added a comment.

Change 248372 merged by jenkins-bot:
Fix undefined `last_session_ids=` method exception

https://gerrit.wikimedia.org/r/248372


TASK DETAIL
  https://phabricator.wikimedia.org/T114368

To: dduvall, gerritbot
Cc: adrianheine, JanZerebecki, WMDE-Fisch, Jonas, Tobi_WMDE_SW, gerritbot, 
SBisson, dduvall, hashar, Aklapper, zeljkofilipin, Wikidata-bugs, 
matthiasmullie, aude, Gryllida, Mattflaschen, greg



[Wikidata-bugs] [Maniphest] [Changed Project Column] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread aude
aude moved this task to Review on the Wikidata-Sprint-2015-10-13 workboard.

TASK DETAIL
  https://phabricator.wikimedia.org/T116381

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1551/

To: aude
Cc: gerritbot, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread Ottomata
Ottomata added a comment.

> topics named something like mw-edit and mw-edit-private perhaps (where the 
> latter contains this extra info).


I'd prefer if we did this the other way around.  The 'private' topic will have 
more data and be the main source of truth.  The public one will contain a 
subset of this data, and thus is subordinate to the main one.
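
As a rough illustration of the "private topic is the source of truth, public topic is a subset" arrangement, the public event could be derived by dropping sensitive fields from the private one. This is only a sketch: the field names (`client_ip`, `user_agent`) and the helper `to_public_event` are hypothetical, and only the topic names `mw-edit` and `mw-edit-private` come from this thread. No Kafka client is involved.

```python
# Hypothetical set of fields considered too sensitive for the public topic.
PRIVATE_FIELDS = frozenset({"client_ip", "user_agent"})


def to_public_event(private_event):
    """Return a copy of the event without the privacy-sensitive fields."""
    return {k: v for k, v in private_event.items() if k not in PRIVATE_FIELDS}


# Illustrative event; every field name here is an assumption.
private_event = {
    "reqid": "b54adc00-67f9-11d9-9669-0800200c9a66",
    "title": "Example",
    "client_ip": "192.0.2.1",
}

# (topic, event) pairs a producer might emit: the private event is the
# primary record, the public one is derived from it.
messages = [
    ("mw-edit-private", private_event),
    ("mw-edit", to_public_event(private_event)),
]
```

Deriving the public event from the private one keeps a single place where the decision about what counts as sensitive is made.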


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: Ottomata
Cc: Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, 
bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, 
Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T116185: When running updateSearchIndexConfig.php for test.wikidata, the script chokes on the analyzers

2015-10-23 Thread aude
aude added a comment.

the patch that adds the --justMapping option got split up into two patches, one 
that adds the option to the script, and the second patch for "allowing mapping 
customization with numeric fields"

this is the patch that adds the option:

https://gerrit.wikimedia.org/r/#/c/247861/


TASK DETAIL
  https://phabricator.wikimedia.org/T116185

To: aude
Cc: JanZerebecki, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T114368: undefined method `last_session_ids=' for MediawikiSelenium::BrowserFactory::Chrome:Class (NoMethodError)

2015-10-23 Thread gerritbot
gerritbot added a comment.

Change 248353 had a related patch set uploaded (by Sbisson):
Browser tests: using mw_selenium 1.5 because 1.6 is broken

https://gerrit.wikimedia.org/r/248353


TASK DETAIL
  https://phabricator.wikimedia.org/T114368

To: dduvall, gerritbot
Cc: adrianheine, JanZerebecki, WMDE-Fisch, Jonas, Tobi_WMDE_SW, gerritbot, 
SBisson, dduvall, hashar, Aklapper, zeljkofilipin, Wikidata-bugs, 
matthiasmullie, aude, Gryllida, Mattflaschen, greg



[Wikidata-bugs] [Maniphest] [Commented On] T114368: undefined method `last_session_ids=' for MediawikiSelenium::BrowserFactory::Chrome:Class (NoMethodError)

2015-10-23 Thread gerritbot
gerritbot added a comment.

Change 248353 merged by jenkins-bot:
Browser tests: using mw_selenium 1.5 because 1.6 is broken

https://gerrit.wikimedia.org/r/248353


TASK DETAIL
  https://phabricator.wikimedia.org/T114368

To: dduvall, gerritbot
Cc: adrianheine, JanZerebecki, WMDE-Fisch, Jonas, Tobi_WMDE_SW, gerritbot, 
SBisson, dduvall, hashar, Aklapper, zeljkofilipin, Wikidata-bugs, 
matthiasmullie, aude, Gryllida, Mattflaschen, greg



[Wikidata-bugs] [Maniphest] [Closed] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread aude
aude closed this task as "Resolved".
aude removed a project: Patch-For-Review.

TASK DETAIL
  https://phabricator.wikimedia.org/T116381

To: aude
Cc: gerritbot, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T95686: [Task] write a maintenance script to migrate properties from string to new identifier datatype

2015-10-23 Thread JanZerebecki
JanZerebecki added a comment.

In https://phabricator.wikimedia.org/T95686#1746797, @Ricordisamoa wrote:

> In https://phabricator.wikimedia.org/T95686#1746565, @JanZerebecki wrote:
>
> > The JSON serialization for items in the db will not change
>
>
> I suppose it will change for new edits, won't it?


Besides the change the edit caused, I currently assume that any other change 
would be a bug.

> And what about XML dumps with full history?


The XML dumps are supposed to contain the reserialized JSON, similar to 
`wbgetentities`. I have never looked at those with history, but assume it is 
the same for historic revisions. So this change retroactively changes the 
history of each Entity's representation of information derived from other 
Entities, i.e. it will propagate back to the beginning of time except for the 
actual edit to the Property. (Similar to how, if you render an old revision of 
a MediaWiki page, it will still try to use the current revision of the 
templates it refers to.)


TASK DETAIL
  https://phabricator.wikimedia.org/T95686

To: hoo, JanZerebecki
Cc: JanZerebecki, jayvdb, gerritbot, MGChecker, daniel, Multichill, 
Ricordisamoa, Liuxinyu970226, Aklapper, Lydia_Pintscher, Wikidata-bugs, aude



[Wikidata-bugs] [Maniphest] [Commented On] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread gerritbot
gerritbot added a comment.

Change 248345 merged by jenkins-bot:
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/248345


TASK DETAIL
  https://phabricator.wikimedia.org/T116381

To: aude, gerritbot
Cc: gerritbot, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Changed Project Column] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread aude
aude moved this task to Done on the Wikidata-Sprint-2015-10-13 workboard.

TASK DETAIL
  https://phabricator.wikimedia.org/T116381

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1551/

To: aude
Cc: gerritbot, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Created] T116378: Set the 'name' parameter for Coord objects that Wikibase adds to GeoData

2015-10-23 Thread aude
aude created this task.
aude added a subscriber: aude.
aude added projects: Wikidata, MediaWiki-extensions-WikibaseRepository.
Herald added a subscriber: Aklapper.

TASK DESCRIPTION
  We could also set the 'name' parameter in Coord objects when Wikibase adds 
coordinates to GeoData.
  
  Unfortunately, name is not a multilingual field, but we could at least set it 
to the label in the content language of the wiki, with language fallback.
  
  For Wikidata, this would be English :/ but it would still be more helpful than 
seeing the name as NULL when looking at the geo_tags database table or the data 
in Elasticsearch.

TASK DETAIL
  https://phabricator.wikimedia.org/T116378

To: aude
Cc: aude, Aklapper, Wikidata-bugs



[Wikidata-bugs] [Maniphest] [Created] T116379: Set the 'country' parameter for Coord objects that Wikibase adds to GeoData

2015-10-23 Thread aude
aude created this task.
aude added a subscriber: aude.
aude added projects: Wikidata, MediaWiki-extensions-WikibaseRepository.
Herald added a subscriber: Aklapper.

TASK DESCRIPTION
  Think this is not as high priority, but we could also set the 'country' 
parameter for coordinates that Wikibase adds to geodata.
  
  If an item has P17 (country) and a primary coordinate, then it might work to 
set 'country' to the value of P17. Or if a coordinate value has P17 as a 
qualifier...

TASK DETAIL
  https://phabricator.wikimedia.org/T116379

To: aude
Cc: aude, Aklapper, Wikidata-bugs



[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread mobrovac
mobrovac added a comment.

In https://phabricator.wikimedia.org/T116247#1747924, @Ottomata wrote:

> I'd like an actual timestamp to be part of the framing for all events too.  
> I'm all for a reqid, (although I'd bikeshed about the name a bit), but having 
> a standardized canonical timestamp in all events is very useful.  Can we add:
>
> - **dt**: iso 8601 timestamp.  This may be the time of the event creation, or 
> it might be something else.  It can be set by the producer.


So the producer would store the same time stamp twice? UUID v1 already contains 
it.

> > Generally, no overly sensitive information (like client IPs for
> > authenticated edits) in primary events.
> >
> > Can be included in expanded message in separate topic, or stored
> > separately based on reqid.
>
> In the meeting we said that MW would generate two event streams directly, one
> that had more information, and another that had less, minus fields with
> privacy concerns.


Yes, if it has the data to produce both of them right away, sure: to topics 
named something like `mw-edit` and `mw-edit-private` perhaps (where the latter 
contains this extra info).


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: mobrovac
Cc: Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, 
bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, 
Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread Ottomata
Ottomata added a comment.

I'd like an actual timestamp to be part of the framing for all events too.  I'm 
all for a reqid (although I'd bikeshed about the name a bit), but having a 
standardized canonical timestamp in all events is very useful.  Can we add:

- **ts**: iso 8601 timestamp.  This may be the time of the event creation, or 
it might be something else.  It can be set by the producer.

> Generally, no overly sensitive information (like client IPs for authenticated 
> edits) in primary events.

>  Can be included in expanded message in separate topic, or stored separately 
> based on reqid.


In the meeting we said that MW would generate two event streams directly, one 
that had more information, and another that had less, minus fields with privacy 
concerns.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: Ottomata
Cc: Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, 
bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, 
Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T114368: undefined method `last_session_ids=' for MediawikiSelenium::BrowserFactory::Chrome:Class (NoMethodError)

2015-10-23 Thread gerritbot
gerritbot added a comment.

Change 248372 had a related patch set uploaded (by Dduvall):
Fix undefined `last_session_ids=` method exception

https://gerrit.wikimedia.org/r/248372


TASK DETAIL
  https://phabricator.wikimedia.org/T114368

To: dduvall, gerritbot
Cc: adrianheine, JanZerebecki, WMDE-Fisch, Jonas, Tobi_WMDE_SW, gerritbot, 
SBisson, dduvall, hashar, Aklapper, zeljkofilipin, Wikidata-bugs, 
matthiasmullie, aude, Gryllida, Mattflaschen, greg



[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a comment.

@ottomata, UUIDs are described in 
https://en.wikipedia.org/wiki/Universally_unique_identifier. An example for a 
v1 UUID is `b54adc00-67f9-11d9-9669-0800200c9a66`. There are libraries to 
extract the high-resolution timestamp for most environments.
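
For instance, the timestamp embedded in the example UUID above can be recovered with Python's standard `uuid` module. This is a minimal sketch: the helper name `uuid1_to_datetime` is made up for illustration, and the constant is the number of 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.

```python
import datetime
import uuid

# 100-ns intervals between 1582-10-15 00:00:00 and 1970-01-01 00:00:00 (UTC).
UUID_EPOCH_OFFSET = 0x01b21dd213814000


def uuid1_to_datetime(u):
    """Return the creation time stored in a version 1 UUID, in UTC."""
    if u.version != 1:
        raise ValueError("only version 1 (time-based) UUIDs carry a timestamp")
    # u.time is in 100-ns units; integer-divide by 10 to get microseconds.
    micros = (u.time - UUID_EPOCH_OFFSET) // 10
    unix_epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    return unix_epoch + datetime.timedelta(microseconds=micros)


example = uuid.UUID("b54adc00-67f9-11d9-9669-0800200c9a66")
created = uuid1_to_datetime(example)
print(created.isoformat())
```

Working in integer microseconds avoids the rounding that a float division by `1e7` would introduce.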

Regarding a separate timestamp in the framing information: Which time would 
this correspond to? The next version of Cassandra is likely going to track 
enqueue time itself & support efficient retrieval by timestamp, and enqueue 
time is something that should be handled in Kafka in any case. 
Other timestamps have event-specific semantics, like for example the MediaWiki 
save time, which is why I think it makes most sense to not include them in the 
framing information. All events should however have a unique identifier and 
timestamp that ties together all events triggered by the same original trigger, 
and can be used for per-topic de-duplication / idempotency. This is what the 
UUID in reqid would provide.
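
A consumer-side sketch of what per-topic de-duplication keyed on `reqid` might look like follows. It is an assumption-laden illustration, not a proposed implementation: the `Deduplicator` class, the bounded in-memory window, and the event layout are all hypothetical.

```python
from collections import OrderedDict


class Deduplicator:
    """Remembers the most recently seen reqids for one topic (bounded memory)."""

    def __init__(self, max_entries=10000):
        self._seen = OrderedDict()
        self._max_entries = max_entries

    def is_new(self, reqid):
        """Return True the first time a reqid is seen, False on repeats."""
        if reqid in self._seen:
            return False
        self._seen[reqid] = True
        if len(self._seen) > self._max_entries:
            self._seen.popitem(last=False)  # evict the oldest reqid
        return True


def process(events, dedup):
    """Return only the events whose reqid has not been handled before."""
    return [event for event in events if dedup.is_new(event["reqid"])]


# The second delivery of the first event is dropped.
events = [
    {"reqid": "b54adc00-67f9-11d9-9669-0800200c9a66", "n": 1},
    {"reqid": "c1e4f2a0-67f9-11d9-9669-0800200c9a66", "n": 2},
    {"reqid": "b54adc00-67f9-11d9-9669-0800200c9a66", "n": 1},
]
handled = process(events, Deduplicator())
```

A bounded window only catches duplicates that arrive reasonably close together; a consumer needing stronger guarantees would have to persist the seen identifiers.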


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Updated] T114368: undefined method `last_session_ids=' for MediawikiSelenium::BrowserFactory::Chrome:Class (NoMethodError)

2015-10-23 Thread ReleaseTaggerBot
ReleaseTaggerBot added a project: WMF-deploy-2015-10-27_(1.27.0-wmf.4).

TASK DETAIL
  https://phabricator.wikimedia.org/T114368

To: dduvall, ReleaseTaggerBot
Cc: adrianheine, JanZerebecki, WMDE-Fisch, Jonas, Tobi_WMDE_SW, gerritbot, 
SBisson, dduvall, hashar, Aklapper, zeljkofilipin, Wikidata-bugs, 
matthiasmullie, aude, Gryllida, Mattflaschen, greg



[Wikidata-bugs] [Maniphest] [Updated] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread ReleaseTaggerBot
ReleaseTaggerBot added a project: WMF-deploy-2015-10-27_(1.27.0-wmf.4).

TASK DETAIL
  https://phabricator.wikimedia.org/T116381

To: aude, ReleaseTaggerBot
Cc: gerritbot, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Updated] T116185: When running updateSearchIndexConfig.php for test.wikidata, the script chokes on the analyzers

2015-10-23 Thread ReleaseTaggerBot
ReleaseTaggerBot added a project: WMF-deploy-2015-10-27_(1.27.0-wmf.4).

TASK DETAIL
  https://phabricator.wikimedia.org/T116185

To: ReleaseTaggerBot
Cc: JanZerebecki, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a comment.

@JanZerebecki: Suppression information would indeed be needed for public access 
to older events. One option would be to key this on the event's UUID. We could 
also consider superseding the message using Kafka's deduplication (compaction) 
based on the same UUID.
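
To make the superseding idea concrete, here is a small in-memory simulation of the semantics of key-based compaction: for each key only the latest record survives, and a record with a null value removes the key. It does not talk to Kafka; the function `compact` and the event fields are hypothetical.

```python
def compact(log):
    """Keep only the latest value per key; a value of None acts as a tombstone."""
    latest = {}
    for key, value in log:
        latest[key] = value  # later records supersede earlier ones
    return {key: value for key, value in latest.items() if value is not None}


EDIT_A = "b54adc00-67f9-11d9-9669-0800200c9a66"
EDIT_B = "c1e4f2a0-67f9-11d9-9669-0800200c9a66"

log = [
    (EDIT_A, {"comment": "contains something sensitive"}),
    (EDIT_B, {"comment": "fix typo"}),
    # A later record with the same UUID replaces the original edit event.
    (EDIT_A, {"comment": None, "suppressed": True}),
]

compacted = compact(log)
```

After compaction only the redacted record remains for the suppressed edit, so a reader replaying the topic from the beginning would not see the original content.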


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Changed Subscribers] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a subscriber: EBernhardson.

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Changed Subscribers] T116404: EntityUsageTable::getUsedEntityIdStrings query on wbc_entity_usage table is sometimes fast, sometimes slow

2015-10-23 Thread hoo
hoo added subscribers: daniel, aude.
hoo added a project: Wikidata.
hoo set Security to None.

TASK DETAIL
  https://phabricator.wikimedia.org/T116404

To: jcrespo, hoo
Cc: aude, daniel, hoo, Aklapper, jcrespo, Wikidata-bugs, GWicke, Krenair



[Wikidata-bugs] [Maniphest] [Changed Project Column] T116404: EntityUsageTable::getUsedEntityIdStrings query on wbc_entity_usage table is sometimes fast, sometimes slow

2015-10-23 Thread hoo
hoo moved this task to monitoring on the Wikidata workboard.

TASK DETAIL
  https://phabricator.wikimedia.org/T116404

WORKBOARD
  https://phabricator.wikimedia.org/project/board/71/

To: jcrespo, hoo
Cc: aude, daniel, hoo, Aklapper, jcrespo, Wikidata-bugs, GWicke, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread JanZerebecki
JanZerebecki added a comment.

If we offer public access to the public events of the past we need to rewrite 
them according to new events that hide previous public events. Can you make 
sure that events that hide any part of previous public events are also public? 
So that a public archive of events can be maintained based on the public events 
alone.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: JanZerebecki
Cc: Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, 
bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, 
Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Changed Project Column] T116404: EntityUsageTable::getUsedEntityIdStrings query on wbc_entity_usage table is sometimes fast, sometimes slow

2015-10-23 Thread jcrespo
jcrespo moved this task to Backlog on the Database workboard.

TASK DETAIL
  https://phabricator.wikimedia.org/T116404

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1060/

To: jcrespo
Cc: aude, daniel, hoo, Aklapper, jcrespo, Wikidata-bugs, GWicke, Krenair



[Wikidata-bugs] [Maniphest] [Commented On] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread aude
aude added a comment.

even forceSearchIndex.php does not result in coordinates being added for a 
cached page :(


TASK DETAIL
  https://phabricator.wikimedia.org/T116381

To: aude
Cc: aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread Ottomata
Ottomata added a comment.

> So the producer would store the same time stamp twice? UUID v1 already 
> contains it.


Could you provide an example of what this UUID would look like?

A reason for having a timestamp only field is so that applications can use it 
for time based logic without having to also know how to extract the timestamp 
out of an overloaded uuid.

Also, who is responsible for setting this reqid?  In many cases, varnish, 
right?  A producer may emit several events during a given request, and it 
should have the ability to set what it considers to be the real timestamp of 
each event.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: Ottomata
Cc: Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, mobrovac, MZMcBride, 
bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, Eevans, mmodell, 
Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, jkroll, Wikidata-bugs, 
Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Created] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread aude
aude created this task.
aude added a subscriber: aude.
aude added projects: Wikidata, CirrusSearch, MediaWiki-extensions-GeoData.
Herald added a subscriber: Aklapper.
Herald added a project: Discovery.

TASK DESCRIPTION
  When updating a page, CirrusSearch attempts to get ParserOutput from 
ParserCache.
  
  
  ```
  private function getContentAndParserOutput( $page ) {
      $content = $page->getContent();
      $parserOptions = $page->makeParserOptions( 'canonical' );
      $parserOutput = ParserCache::singleton()->get( $page, $parserOptions );
      if ( !$parserOutput ) {
          // We specify the revision ID here. There might be a newer revision,
          // but we don't care because (a) we've already got a job somewhere
          // in the queue to index it, and (b) we want magic words like
          // {{REVISIONUSER}} to be accurate
          $revId = $page->getRevision()->getId();
          $parserOutput = $content->getParserOutput( $page->getTitle(), $revId );
      }
      return array( $content, $parserOutput );
  }
  ```
  
  For adding coordinates for a wiki with GeoData newly enabled, the 
ParserOutput would be lacking coordinates if obtained from cache.
  
  Either there needs to be a way to force a parse or, if the coordinates are 
already in geo_tags, perhaps a way for the coordinates to come from there.

TASK DETAIL
  https://phabricator.wikimedia.org/T116381

To: aude
Cc: aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread gerritbot
gerritbot added a subscriber: gerritbot.
gerritbot added a comment.

Change 248345 had a related patch set (by Aude) published:
Add --forceParse UpdaterFlag and option in forceSearchIndex script

https://gerrit.wikimedia.org/r/248345


TASK DETAIL
  https://phabricator.wikimedia.org/T116381

To: gerritbot
Cc: gerritbot, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Updated] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread gerritbot
gerritbot added a project: Patch-For-Review.

TASK DETAIL
  https://phabricator.wikimedia.org/T116381

To: gerritbot
Cc: gerritbot, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Claimed] T116381: When enabling GeoData and populating coordinates, CirrusSearch needs to bypass ParserCache

2015-10-23 Thread aude
aude claimed this task.
aude added a project: Wikidata-Sprint-2015-10-13.
aude set Security to None.

TASK DETAIL
  https://phabricator.wikimedia.org/T116381

To: aude
Cc: gerritbot, aude, Aklapper, Wikidata-bugs, Deskana, jeremyb



[Wikidata-bugs] [Maniphest] [Edited] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke edited the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb



[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a comment.

> Right, but how would you do this in say, Hive? Or in bash? Timestamp logic 
> should be easy and immediate.


Yeah, Hive really seems to be lacking built-in support for UUIDs. There seems 
to be UDF code to deal with them, but it's definitely not as convenient as it 
could be. I'm fine with including the timestamp corresponding to the timeuuid 
to help Hive. The overhead is fairly small, and we can automate adding the 
timestamp even if only the UUID was supplied.
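
As a rough illustration of automating that step, here is a minimal Python 3 sketch that fills in a timestamp from a version 1 UUID when none was supplied. The field names `id` and `ts` and the helper names `timeuuid_to_datetime` and `add_timestamp` are assumptions made for this example, not an agreed schema or an existing API.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01b21dd213814000


def timeuuid_to_datetime(u: uuid.UUID) -> datetime:
    """Extract the embedded timestamp of a version 1 UUID as a UTC datetime."""
    if u.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    # u.time counts 100-ns intervals; dividing by 10 yields microseconds.
    micros = (u.time - UUID_EPOCH_OFFSET) // 10
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)


def add_timestamp(event: dict) -> dict:
    """Return a copy of the event with `ts` derived from `id` if it is missing.

    The `id` / `ts` field names are hypothetical placeholders.
    """
    if "ts" in event:
        return dict(event)
    ts = timeuuid_to_datetime(uuid.UUID(event["id"]))
    return {**event, "ts": ts.isoformat()}


if __name__ == "__main__":
    # A producer that only supplied the UUID gets an ISO 8601 `ts` added.
    print(add_timestamp({"id": str(uuid.uuid1())}))
```

An event that already carries a `ts` is passed through unchanged, so a producer-supplied timestamp always wins in this sketch.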


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread Ottomata
Ottomata added a comment.

Right, but how would you do this in, say, Hive?  Or in bash?

Timestamp logic should be easy and immediate.

> Regarding a separate timestamp in the framing information: Which time would 
> this correspond to?


This is up to the producer, I think.  If more timestamps are needed for 
specific schemas, that is fine, but I see a lot of value in having a canonical 
and easily readable timestamp.  Camus uses this timestamp to auto-partition 
files by hour when they are imported from Kafka into HDFS.  ISO 8601 works, 
and Unix epoch seconds and milliseconds work.  We'd have to add more code to 
make UUID timestamps work.

Maybe this is ok, but I'd much rather be able to use my eyes and easy tools to 
do time logic.
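
To make the hourly partitioning idea concrete, here is a small Python 3 sketch that accepts the three timestamp forms mentioned above and maps each to an hour bucket. This is not Camus code: the helper names, the `topic/hourly/YYYY/MM/DD/HH` path layout, and the seconds-versus-milliseconds heuristic are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def parse_event_time(value) -> datetime:
    """Parse an ISO 8601 string, Unix epoch seconds, or epoch milliseconds."""
    if isinstance(value, str):
        # fromisoformat() rejects a trailing "Z" before Python 3.11.
        dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
        # Assumption: timestamps without an offset are taken to be UTC.
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)
        return dt.astimezone(timezone.utc)
    seconds = float(value)
    # Heuristic (assumption): values this large are milliseconds, not seconds.
    if seconds >= 1e11:
        seconds /= 1000.0
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def hourly_partition(topic: str, value) -> str:
    """Build a hypothetical per-hour directory name for an event timestamp."""
    dt = parse_event_time(value)
    return f"{topic}/hourly/{dt:%Y/%m/%d/%H}"


if __name__ == "__main__":
    for ts in ("2015-10-23T17:45:12Z", 1445622312, 1445622312000):
        print(ts, "->", hourly_partition("edit", ts))
```

All three inputs in the usage loop describe the same instant, so they land in the same bucket. A timeuuid would need the extra decoding step before it could be fed into something like this.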


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: Ottomata
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread GWicke
GWicke added a comment.

I went ahead and updated the task description with the current framing / 
per-event schema. I renamed `reqid` to just `id`, and added a `ts` field 
containing the same timestamp in ISO 8601 format.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: GWicke
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb





[Wikidata-bugs] [Maniphest] [Commented On] T116247: Define edit related events for change propagation

2015-10-23 Thread JanZerebecki
JanZerebecki added a comment.

As long as a separate public suppression event exists that refers to the old 
one, it sounds fine.


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: JanZerebecki
Cc: EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, GWicke, 
mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, chasemp, 
Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, JAllemandou, 
jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, RobLa-WMF, jeremyb


