[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-05-08 Thread ArielGlenn
ArielGlenn added a comment.

Great news!


TASK DETAIL
  https://phabricator.wikimedia.org/T74348

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hoo, ArielGlenn
Cc: JanZerebecki, Jimkont, Wikidata-bugs, Tobi_WMDE_SW, jayvdb, Svick, 
ArielGlenn, Ricordisamoa, mark, Lydia_Pintscher, jeremyb-phone, daniel, 
Manybubbles, hoo, RobH, aude, faidon, fgiunchedi, Dzahn, jeremyb, chasemp



___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-05-08 Thread daniel
daniel added a comment.

The double-check didn't turn anything up either. The dump seems to be clean.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-05-06 Thread Jimkont
Jimkont added a comment.

other examples of old serializations can be found here:
https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/json/JsonWikiParser.scala#L62-L67


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-05-06 Thread daniel
daniel added a comment.

@Jimkont: broken serialization of empty lists is a separate issue, unrelated to 
unconverted old-style serializations.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-05-06 Thread daniel
daniel added a comment.

I'm now running the following on tool labs to find old serializations:

  daniel@tools-bastion-01:/public/dumps/public/wikidatawiki/20150330$ bzgrep \
    ',&quot;entity&quot;:&quot;[qQpP][0-9]*&quot;\}' \
    wikidatawiki-20150330-pages-meta-history.xml.bz2 \
    | tee ~/wikidatawiki-20150330-pages-meta-history.bad-serialization.txt
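
For anyone who would rather script the same check, here is a minimal Python sketch of that search. It assumes the dump text carries XML-escaped quotes (`&quot;`), and the function names `find_old_serializations` and `scan_dump` are made up for illustration; this is not part of any dump tooling.

```python
import bz2
import re

# Same idea as the bzgrep pattern: an "entity" key holding an id such as
# q207, preceded by a comma and directly followed by the closing brace.
OLD_STYLE = re.compile(r',&quot;entity&quot;:&quot;[qQpP][0-9]*&quot;\}')


def find_old_serializations(lines):
    """Yield (line_number, line) for every line that matches OLD_STYLE."""
    for number, line in enumerate(lines, start=1):
        if OLD_STYLE.search(line):
            yield number, line.rstrip("\n")


def scan_dump(path):
    """Stream a .bz2 dump line by line without unpacking it to disk."""
    with bz2.open(path, mode="rt", encoding="utf-8", errors="replace") as dump:
        yield from find_old_serializations(dump)
```

Iterating over `scan_dump("wikidatawiki-20150330-pages-meta-history.xml.bz2")` would visit the same lines that bzgrep prints, one at a time, so memory use stays flat even on the full history dump.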


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-05-05 Thread daniel
daniel added a comment.

@JanZerebecki: Redirects are serialized like this:

  {"entity":"Q23","redirect":"Q42"}

Old-style serialization ends with this:

  ,"entity":"q207"}

So, if you grep for `,&quot;entity&quot;}`, you should find only old-style 
serializations.

Also, old-style serialization will contain `&quot;label&quot;:{`, while new 
style should contain `&quot;labels&quot;:{` (using label//s//, plural).
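
The markers described in this comment can be combined into a small classifier. The following is only a sketch under assumptions: it works on the XML-escaped text of a single revision as it appears in the dump, the function name `classify_revision` is hypothetical, and anything that carries none of the markers (for example an entity whose label list is empty) is reported as "unknown" rather than guessed.

```python
def classify_revision(text):
    """Classify one XML-escaped revision text from a Wikidata dump.

    Returns "redirect", "new", "old" or "unknown". This is a heuristic
    based on the markers named in the thread, not a real JSON parse.
    """
    # Redirects start with the "entity" key and also carry "redirect".
    if text.startswith('{&quot;entity&quot;:') and '&quot;redirect&quot;:' in text:
        return "redirect"
    # New-style serialization uses the plural key "labels".
    if '&quot;labels&quot;:{' in text:
        return "new"
    # Old-style serialization uses the singular key "label".
    if '&quot;label&quot;:{' in text:
        return "old"
    return "unknown"
```

Checking for redirects first matters, because a redirect also contains the `&quot;entity&quot;` key that was originally suggested as the old-style marker.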


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-05-05 Thread daniel
daniel added a comment.

Btw, if someone can tell me where to find a full history dump of wikidata, I'd 
be happy to check this myself. The annoying part here is to download and store 
the behemoth...


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-04-07 Thread hoo
hoo added a comment.

@Daniel: Could you have a quick look at this? Looks fixed to me, but I think 
you're the only one who can tell for sure.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-04-07 Thread daniel
daniel added a comment.

For redirects, the encoding {&quot;entity&quot;} is correct. There is no old 
encoding for redirects; entity redirects didn't exist when we used the old 
serialization format.

So, searching for &quot;entity&quot; is not a good indicator for detecting 
old-style serialization.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-03-04 Thread ArielGlenn
ArielGlenn added a comment.

Is anyone looking at the redirects serialization?


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-27 Thread ArielGlenn
ArielGlenn added a comment.

OK, I no longer feel as stupid. The number of items with the 'entity' format 
is small in comparison to the total number of entities; we would expect the 
opposite if old revisions were being kept as is. And as I said, I had checked 
with local testing that the export transform is indeed being called and 
changing the content. So I had a look at the problematic entries. It turns out 
that all but 27 are of the form

<text xml:space="preserve">{&quot;entity&quot;:&quot;Q547932&quot;,&quot;redirect&quot;:&quot;Q6150957&quot;}</text>

so I guess serializing of redirects needs work. I checked that newly added 
redirects are dumped with this format. The few remaining matches are likely 
discussions that happen to include the string; I spot checked some and found 
that to be the case.
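
A tally like the one described here could be reproduced along these lines. This is a hedged sketch, not the check that was actually run: the name `tally_entity_matches` is invented, the input is assumed to be lines of dump XML with `&quot;`-escaped quotes, and the redirect pattern assumes ids of the form Q or P followed by digits.

```python
import re
from collections import Counter

# A complete redirect serialization as it appears, XML-escaped, in the dump.
REDIRECT = re.compile(
    r'\{&quot;entity&quot;:&quot;[QP][0-9]+&quot;,'
    r'&quot;redirect&quot;:&quot;[QP][0-9]+&quot;\}'
)


def tally_entity_matches(lines):
    """Count lines mentioning the "entity" key, split into redirects and the rest."""
    counts = Counter()
    for line in lines:
        if '&quot;entity&quot;' not in line:
            continue  # not a candidate at all
        counts["redirect" if REDIRECT.search(line) else "other"] += 1
    return counts
```

Lines that land in the "other" bucket are the ones worth inspecting by hand, since they may be discussions quoting the string or genuine leftovers.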


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-27 Thread ArielGlenn
ArielGlenn added a comment.

Um, "with this format" means new redirects are dumped with {&quot;entity&quot; 
...  etc.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-26 Thread hoo
hoo added a comment.

In https://phabricator.wikimedia.org/T74348#1069658, @ArielGlenn wrote:

> right.  this is what you want; the old style  'entity'  is gone, the new 
> style 'descriptions' is present.  or am I missing something?


To me it seems like the old style entity is still present.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-26 Thread ArielGlenn
ArielGlenn added a comment.

right.  this is what you want; the old style  'entity'  is gone, the new style 
'descriptions' is present.  or am I missing something?


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-24 Thread hoo
hoo added a comment.

In https://phabricator.wikimedia.org/T74348#1059331, @ArielGlenn wrote:

> Hello?  Any wikidata dumps consumers on this ticket?  Otherwise I'll ask in 
> xmldatadumps-l.




In https://phabricator.wikimedia.org/T74348#768660, @daniel wrote:

> Bumping to critical, since it may result in data loss for clients that cannot 
> process the old style format. We really do not want them to implement that, 
> we changed for a reason...
>
> Btw: In order to check for old style serializations, grep for 
> &quot;entity&quot;. To detect new style serialization, check for 
> &quot;descriptions&quot; (plural).




  hoo@tools-dev:~$ grep -c '&quot;entity&quot;' wikidatawiki-20150207-pages-articles.xml
  129630

:(


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-24 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.

@hoo: could you have a look?


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-24 Thread hoo
hoo added a comment.

In https://phabricator.wikimedia.org/T74348#1062351, @Lydia_Pintscher wrote:

> @hoo: could you have a look?


Just kicked off the download of a dump, I'll verify some old revisions once 
that's done (later today).


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-23 Thread ArielGlenn
ArielGlenn added a comment.

Hello?  Any wikidata dumps consumers on this ticket?  Otherwise I'll ask in 
xmldatadumps-l.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-02-12 Thread ArielGlenn
ArielGlenn added a comment.

I ran a series of tests locally and also checked production output.  I can 
verify that the transform is actually applied, and the output looks good to me 
for prefetch or from the database, but a consumer of the data should probably 
look at it for 5 seconds to verify that the output format is the way you want it.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2015-01-12 Thread ArielGlenn
ArielGlenn added a comment.

Thanks for the patch!  I will check it out in the next couple of days.  I'm 
really sorry for the long delay; I've been out for medical reasons and am now 
trying to get caught up on everything.


[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2014-11-26 Thread hoo
hoo added a comment.

In T74348#787697, @Lydia_Pintscher wrote:

> Can I please have a status update on this? Do we know why it is happening?

As far as I know the problem is that during dump creation content from the last 
dump is being scraped in case nothing changed. That's probably fine for 
wikitext, but of course that bypasses our on-the-fly serialization change.

[Wikidata-bugs] [Maniphest] [Commented On] T74348: Wikidata dumps contain old-style serialization.

2014-11-26 Thread ArielGlenn
ArielGlenn added a comment.

Old revisions are indeed read from the old dump, as long as the length of the 
revision text is correct. And indeed this is a necessity; the db servers cannot 
handle requests for all revisions anew, and even if they could, the dumps would 
take many times longer to generate as well. The only thing that can be done is 
a manual run of the specific pass without prefetch, which will take... as long 
as it takes.  I need to check with Sean (DBA) about it before doing so.
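
To make the prefetch behaviour easier to picture, here is a rough Python sketch of the decision described above. It is not the actual dump code: `fetch_revision_text`, `previous_dump`, `load_from_db` and `use_prefetch` are hypothetical names, and comparing the UTF-8 byte length is an assumption about what "length of the revision text" means.

```python
def fetch_revision_text(rev_id, expected_length, previous_dump, load_from_db,
                        use_prefetch=True):
    """Return the text for one revision, reusing the previous dump when possible.

    previous_dump: dict mapping revision id to text from the last dump
                   (hypothetical stand-in for the prefetch source).
    load_from_db:  callable that fetches the revision from the database
                   (hypothetical stand-in for the expensive path).
    """
    if use_prefetch:
        cached = previous_dump.get(rev_id)
        # Assumption: the length check is on the UTF-8 encoded text.
        if cached is not None and len(cached.encode("utf-8")) == expected_length:
            # Reused verbatim, so nothing is re-read from the database.
            return cached
    # Prefetch miss, length mismatch, or a run without prefetch.
    return load_from_db(rev_id)
```

Calling it with `use_prefetch=False` corresponds to the manual run without prefetch: every revision goes through `load_from_db`, which is why that pass is so much slower.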
