Re: [Wikitech-l] Image scaling proposal: server-side mip-mapping

2014-05-01 Thread Gilles Dubuc

 _don't consider the upload complete_ until those are done! a web uploader
 or API-using bot should probably wait until it's done before uploading the
 next file, for instance...


You got me a little confused at that point, are you talking about the
client generating the intermediary sizes, or the server?

I think client-side thumbnail generation is risky when it comes to
corruption. A client-side bug could result in a user uploading thumbnails
that belong to a different image. And if you want to run a visual signature
check on the server side to avoid that issue, verifying that the thumbnail
matches the right image might take about as much processing time as having
the server generate the thumbnail itself. It would be worth researching
whether there's a very fast "is this thumbnail a smaller version of that
image" algorithm out there. We don't need 100% confidence either, if we're
just looking to catch shuffling bugs within a given upload batch.

Regarding the issue of a single intermediary size versus multiple, there's
still a near-future plan to have pregenerated buckets for Media Viewer
(which can be reused for a whole host of other things). Those could be used
like mip-maps like you describe. Since these sizes will be generated at
upload time, why not use them?

However, resizing starts to introduce noticeable visual artifacts when the
bucket's (source image's) dimensions are too close to those of the thumbnail
you want to render.

Consider the existing Media Viewer width buckets: 320, 640, 800, 1024,
1280, 1920, 2560, 2880

I think that generating the 300px thumbnail based on the 320 bucket is
likely to introduce very visible artifacts on thin lines, etc. compared to
using the biggest bucket (2880px). Maybe there's a smart compromise, like
picking the next bucket up (e.g. a 300px thumbnail would use the 640 bucket
as its source). I think we need a battery of visual tests to determine what
the best strategy is here.
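
To make that compromise concrete, here's a rough sketch of the "next bucket
up" idea (plain PHP, hypothetical helper, not actual MediaWiki code):

<?php
// Pick the scaling source for an arbitrary thumbnail request: skip the
// pregenerated bucket closest to the target and use the one above it,
// to limit resizing artifacts. Falls back to the original if no bucket
// is large enough.
function pickSourceBucket( $targetWidth, array $buckets, $originalWidth ) {
	sort( $buckets );
	$candidates = array();
	foreach ( $buckets as $width ) {
		if ( $width >= $targetWidth && $width <= $originalWidth ) {
			$candidates[] = $width;
		}
	}
	if ( count( $candidates ) >= 2 ) {
		return $candidates[1]; // one bucket above the closest match
	}
	if ( count( $candidates ) === 1 ) {
		return $candidates[0];
	}
	return $originalWidth; // no bucket is big enough, use the original
}

// Example: a 300px thumbnail would be rendered from the 640 bucket.
$buckets = array( 320, 640, 800, 1024, 1280, 1920, 2560, 2880 );
echo pickSourceBucket( 300, $buckets, 3002 ); // 640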

All of this is dependent on Ops giving the green light to pregenerating the
buckets, though. The Swift capacity for it is slowly being brought online,
but I think Ops' prerequisite for saying yes is that we focus on the
post-Swift strategy for thumbnails. We also need to figure out the
performance impact of generating all these thumbnails at upload time. On a
very meta note, we might generate the smaller buckets based on the biggest
bucket and only the 2-3 biggest buckets based on the original (again, to
avoid visual artifacts).

Another related angle I'd like to explore is to submit a simplified version
of this RFC:
https://www.mediawiki.org/wiki/Requests_for_comment/Standardized_thumbnails_sizes
where we'd propose a single bucket list instead of multiple options
(presumably the Media Viewer ones; if not, we'd update Media Viewer to use
the new canonical list of buckets), and where we would still allow arbitrary
thumbnail sizes below a certain limit. For example, people would still be
allowed to request thumbnails smaller than 800px at any size they want,
because those are likely to be thumbnails in the real sense of the term,
while for anything above 800px they would be limited to the available
buckets (e.g. 1024, 1280, 1920, 2560, 2880). This would still allow
foundation-hosted wikis to have flexible layout strategies for their
thumbnail sizes, while reducing the craziness of this attack vector on the
image scalers and the gigantic waste of disk and memory space on the
thumbnail hosting. I think it would be an easier sell to the community; the
current RFC is too extreme in banning all arbitrary sizes and offers too
many bucketing options. I feel like the standardization of true thumbnail
sizes (small images, under 800px) is much more subject to endless debate
with no consensus.
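
As a strawman for what that simplified rule could look like in code (the
800px cutoff and the bucket list are just the example values above, nothing
official):

<?php
// Arbitrary widths are fine below the cutoff; larger requests are only
// served at the predefined bucket sizes.
function isAllowedThumbnailWidth( $width, array $largeBuckets, $cutoff = 800 ) {
	if ( $width <= 0 ) {
		return false;
	}
	if ( $width < $cutoff ) {
		return true; // real thumbnails: any size goes
	}
	return in_array( $width, $largeBuckets, true ); // large sizes: buckets only
}

$largeBuckets = array( 1024, 1280, 1920, 2560, 2880 );
var_dump( isAllowedThumbnailWidth( 543, $largeBuckets ) );  // true
var_dump( isAllowedThumbnailWidth( 1900, $largeBuckets ) ); // false
var_dump( isAllowedThumbnailWidth( 1920, $largeBuckets ) ); // true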


On Thu, May 1, 2014 at 12:21 PM, Erwin Dokter er...@darcoury.nl wrote:

  On 04/30/2014 12:51 PM, Brion Vibber wrote:

 * at upload time, perform a series of scales to produce the mipmap levels
 * _don't consider the upload complete_ until those are done! a web
 uploader
 or API-using bot should probably wait until it's done before uploading the
 next file, for instance...
 * once upload is complete, keep on making user-facing thumbnails as
 before... but make them from the smaller mipmap levels instead of the
 full-scale original


 Would it not suffice to just produce *one* scaled down version (ie.
 2048px) which the real-time scaler can use to produce the thumbs?

 Regards,
 --
 Erwin Dokter




Re: [Wikitech-l] Image scaling proposal: server-side mip-mapping

2014-05-01 Thread Gilles Dubuc
Another point about picking the one true bucket list: currently Media
Viewer's buckets have been picked based on the most common screen
resolutions, because Media Viewer always tries to use the entire width of
the screen to display the image. Aiming for a 1-to-1 pixel correspondence
makes sense there, because it should give the sharpest possible result to
the average user.

However, sticking to that approach will likely come at a cost. As I've just
mentioned, we will probably need to generate more than one of the high
buckets based on the original in order to avoid resizing artifacts.

On the other hand, we could decide that the unified bucket list shouldn't
be based on screen resolutions (after all, the full-width display scenario
found in Media Viewer might be the exception, and the buckets will be for
everything MediaWiki does) and instead should progress by powers of 2. Then
creating a given bucket could always be done without resizing artifacts,
based on the bucket above the current one. This should provide the biggest
possible savings in image scaling time when generating the thumbnail
buckets.

To illustrate with an example, the bucket list could be: 256, 512, 1024,
2048, 4096. The 4096 bucket would be generated first, based on the
original, then 2048 would be generated based on 4096, then 1024 based on
2048, etc.

The big downside is that there's less progression in the 1000-3000 range (4
buckets in the Media Viewer resolution-oriented strategy, 2 buckets here)
where the majority of devices currently are. If I take a test image as an
example (https://commons.wikimedia.org/wiki/File:Swallow_flying_drinking.jpg),
the file size progression is quite different between the screen resolution
buckets and the geometric (powers of 2) buckets:

- screen resolution buckets
320 11.7KB
640 17KB
800 37.9KB
1024 58KB
1280 89.5KB
1920 218.9KB
2560 324.6KB
2880 421.5KB

- geometric buckets
256 9.4KB
512 20KB
1024 58KB
2048 253.1KB
4096 (not generated: the test image is smaller than 4096px)

It doesn't seem ideal that a screen resolution slightly above 1024 would
suddenly need to download an image 5 times as heavy for not that many extra
pixels on the actual screen. A similar thing can be said for the screen
resolution progression, where the file size more than doubles between 1280
and 1920. We could probably use at least one extra step between those two
if we go with screen resolution buckets, like 1366 and/or 1440.

I think the issue of buckets between 1000 and 3000 is tricky: it's going to
be difficult to avoid generating them based on the original while not
getting visual artifacts.

Maybe we can get away with generating 1280 (and possibly 1366, 1440) based
on 2048, the distance between the two guaranteeing that the quality issues
will be negligible. We definitely can't generate a 1920 based on a 2048
thumbnail, though, otherwise artifacts on thin lines will look awful.

A mixed progression like this might be the best of both worlds, if we
confirm that resizing between 1024 and 2048 is artifact-free enough:

256, 512, 1024, 1280, 1366, 1440, 2048, 4096, where 2048 is generated from
4096; 1024, 1280, 1366 and 1440 from 2048; 512 from 1024; and 256 from 512.

If, for example, the image width is between 1440 and 2048, then 1024, 1280,
1366 and 1440 would be generated based on the original. That's fine
performance-wise, since the original is small.
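
Spelling that mixed progression out as a sketch (hypothetical helper; the
exact source assignments would still depend on the visual tests mentioned
earlier):

<?php
// Decide which source each bucket is generated from: 2048 from 4096,
// 1024/1280/1366/1440 from 2048, 512 from 1024, 256 from 512. Whenever the
// designated source bucket cannot exist (the original is smaller than it),
// fall back to scaling from the original.
function buildGenerationPlan( $originalWidth ) {
	$sourceOf = array(
		4096 => 'original',
		2048 => 4096,
		1440 => 2048,
		1366 => 2048,
		1280 => 2048,
		1024 => 2048,
		512  => 1024,
		256  => 512,
	);
	$plan = array();
	foreach ( $sourceOf as $bucket => $source ) {
		if ( $bucket >= $originalWidth ) {
			continue; // never upscale past the original
		}
		if ( $source !== 'original' && $source >= $originalWidth ) {
			$source = 'original'; // designated source bucket doesn't exist
		}
		$plan[$bucket] = $source;
	}
	return $plan;
}

// For the 3002px test image: 2048 comes from the original (no 4096 bucket),
// 1024-1440 from 2048, 512 from 1024, 256 from 512.
print_r( buildGenerationPlan( 3002 ) );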

Something that might also be useful to generate is a thumbnail of the same
size as the original, if the original is smaller than 4096 (or whatever the
highest bucket is). Currently we seem to block generating such a thumbnail,
but the difference in file size is huge. For the test image mentioned above,
which is 3002 pixels wide, the original is 3.69MB, while a thumbnail of the
same size would be 465KB. For the benefit of retina displays that are
2560/2880, displaying a thumbnail of the same size as a 3002px original
would definitely be better than the highest available bucket (2048).

All of this is benchmark-worthy anyway; I might be splitting hairs looking
for powers of two if rendering a bucket chain (each bucket generated based
on the next one up) isn't that much faster than generating all buckets based
on the biggest bucket.


On Thu, May 1, 2014 at 12:54 PM, Gilles Dubuc gil...@wikimedia.org wrote:

Re: [Wikitech-l] Image scaling proposal: server-side mip-mapping

2014-05-01 Thread Gilles Dubuc
An extremely crude benchmark on our multimedia labs instance, still using
the same test image:

original -> 3002 (original size) 0m0.268s
original -> 2048 0m1.344s
original -> 1024 0m0.856s
original -> 512 0m0.740s
original -> 256 0m0.660s
2048 -> 1024 0m0.444s
2048 -> 512 0m0.332s
2048 -> 256 0m0.284s
1024 -> 512 0m0.112s
512 -> 256 0m0.040s

Which confirms that chaining, instead of generating all thumbnails based on
the biggest bucket, saves a significant amount of processing time. It's
definitely in the same order of magnitude as the savings achieved by going
from the original as the source to the biggest bucket as the source.

It's also worth noting that generating the thumbnail of the same size as
the original is relatively cheap. Using it as the source for the 2048 image
doesn't save that much time, though: 0m1.252s (for 3002 -> 2048).
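
For anyone who wants to poke at this, the comparison can be reproduced
roughly like this with the Imagick extension (a sketch only, not
necessarily how the numbers above were produced; file names and the Lanczos
filter are assumptions):

<?php
// Time a single resize from $sourcePath to $targetWidth, writing $destPath.
function timedResize( $sourcePath, $targetWidth, $destPath ) {
	$start = microtime( true );
	$img = new Imagick( $sourcePath );
	$geo = $img->getImageGeometry();
	$targetHeight = (int)round( $geo['height'] * $targetWidth / $geo['width'] );
	$img->resizeImage( $targetWidth, $targetHeight, Imagick::FILTER_LANCZOS, 1 );
	$img->writeImage( $destPath );
	$img->destroy();
	return microtime( true ) - $start;
}

$original = 'Swallow_flying_drinking.jpg';
printf( "original -> 2048: %.3fs\n", timedResize( $original, 2048, '2048.jpg' ) );
printf( "original -> 1024: %.3fs\n", timedResize( $original, 1024, '1024-direct.jpg' ) );
printf( "2048 -> 1024:     %.3fs\n", timedResize( '2048.jpg', 1024, '1024-chained.jpg' ) );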

And here's a side-by-side comparison of these images generated with
chaining and images that come from our regular image scalers:
https://dl.dropboxusercontent.com/u/109867/imagickchaining/index.html Try
to guess which is which before inspecting the page for the answer :)


On Thu, May 1, 2014 at 4:02 PM, Gilles Dubuc gil...@wikimedia.org wrote:

Re: [Wikitech-l] Image scaling proposal: server-side mip-mapping

2014-05-01 Thread Rob Lanphier
Hi Gilles,

Thanks for the comparison images. When I was playing around with this a
while back, I found that images with lots of parallel lines and lots of
easily recognized detail were the best for seeing what sorts of problems
rescaling can cause. Here are a few:

https://commons.wikimedia.org/wiki/File:13-11-02-olb-by-RalfR-03.jpg
https://commons.wikimedia.org/wiki/File:Basel_-_M%C3%BCnsterpfalz1.jpg
https://commons.wikimedia.org/wiki/File:Bouquiniste_Paris.jpg

I hadn't used faces, but a good group photo with easily recognized
faces is something else to possibly try (our brains are really good at
spotting subtle differences in faces).

Rob


On Thu, May 1, 2014 at 7:57 AM, Gilles Dubuc gil...@wikimedia.org wrote:

Re: [Wikitech-l] Image scaling proposal: server-side mip-mapping

2014-05-01 Thread Brion Vibber
On Thu, May 1, 2014 at 3:54 AM, Gilles Dubuc gil...@wikimedia.org wrote:

 
  _don't consider the upload complete_ until those are done! a web uploader
  or API-using bot should probably wait until it's done before uploading
 the
  next file, for instance...
 

 You got me a little confused at that point, are you talking about the
 client generating the intermediary sizes, or the server?


Server.

-- brion

Re: [Wikitech-l] Image scaling proposal: server-side mip-mapping

2014-05-01 Thread Max Semenik
The slowest part of large images scaling in production is their retrieval
from Swift, which could be much faster for bucketed images.


On Thu, May 1, 2014 at 7:57 AM, Gilles Dubuc gil...@wikimedia.org wrote:

Re: [Wikitech-l] Image scaling proposal: server-side mip-mapping

2014-05-01 Thread Erwin Dokter

On 01-05-2014 16:57, Gilles Dubuc wrote:


And here's a side-by-side comparison of these images generated with
chaining and images that come from our regular image scalers:
https://dl.dropboxusercontent.com/u/109867/imagickchaining/index.html Try
to guess which is which before inspecting the page for the answer :)


Not much difference, but it's there. Progressive scaling loses edge 
detail during each stage. The directly scaled images look sharper.


Regards,
--
Erwin Dokter



[Wikitech-l] Wikimedia Hackathon: Travel Information and Ticket sent

2014-05-01 Thread Manuel Schneider
Dear all,

sorry for cross-posting, but I have just sent all participants of the
Hackathon a PDF with travel information and a personalized public
transport ticket.

Should you plan to attend the Hackathon but have not received the travel
information mail, please contact me.

Thanks and regards,


Manuel
-- 
Wikimedia CH - Verein zur Förderung Freien Wissens
Lausanne, +41 (21) 34066-22 - www.wikimedia.ch


Re: [Wikitech-l] Bugday on General MediaWiki bugs on Tue, April 29 2014, 14:30UTC

2014-05-01 Thread Andre Klapper
Thank you to everybody who participated in this bugday!

We managed to update about 50 tickets and moved 29 tickets out of the
General/Unknown bucket. More details on
https://www.mediawiki.org/wiki/Bug_management/Triage/20140429

andre


 On Thu, 2014-04-24 at 00:21 +0530, Andre Klapper wrote:
  you are invited to join us on the next Bugday:
  
   Tuesday, April 29, 2014, 14:30 to 16:30UTC [1]
  in #wikimedia-office on Freenode IRC [2]
  
  We will be triaging Bugzilla tickets under the product MediaWiki and
  the component General/Unknown [3].
-- 
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/



Re: [Wikitech-l] Shifting from PHP mailer to Swift Mailer

2014-05-01 Thread Andre Klapper
Hi Tony,

could you clarify your intention with this email?
Are you seeking more input on whether it's a good idea, or on whether this
is actually needed? Or are you looking for agreement because you plan to
work on a patch yourself? Or are you looking for somebody else to work on
this?

It's not entirely clear to me from your email.

Thanks,
andre


On Tue, 2014-04-29 at 21:28 +0530, Tony Thomas wrote:
 Hi,
   It looks like there are no more replies about this proposal to shift
 from UserMailer to Swift Mailer. Since we have already started
 implementing VERP, it's high time this enhancement was applied, if it is
 to be applied at all. If Swift Mailer is adopted, VERP needs to be
 implemented as a plugin to it; otherwise as an additional script in the
 UserMailer code.
 
 Bugzilla ticket:- https://bugzilla.wikimedia.org/show_bug.cgi?id=63483
 
 Thanks,
 Tony Thomas http://tttwrites.in
 FOSS@Amrita http://foss.amrita.ac.in
 
 *where there is a wifi,there is a way*
 
 
 On Fri, Apr 4, 2014 at 10:45 PM, Tony Thomas 01tonytho...@gmail.com wrote:
 
  Hi,
    While working on implementing VERP for MediaWiki[1], Nemo pointed me to
  Tyler's recommendation[2] of shifting from PHP mailer to Swift
  Mailer[3]. Quoting Tyler's words:
  PHPMailer has everything packed into a few classes, whereas Swift_Mailer
  actually has a separation of concerns, with classes for attachments,
  transport types, etc. A result of this is that PHPMailer has two different
  functions for embedding multimedia: addEmbeddedImage() for files and
  addStringEmbeddedImage() for strings. Another example is that PHPMailer
  supports only two bodies for multipart messages, whereas Swift_Mailer will
  add in as many bodies as you tell it to, since a body is wrapped in its own
  object. In addition, PHPMailer only really supports SMTP, whereas
  Swift_Mailer has an extensible transport architecture and multiple
  transport providers. (And there are also plugins, monolog integration,
  etc.)
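
  For reference, basic Swift Mailer usage looks roughly like this (purely
  illustrative, not the proposed UserMailer integration):

  <?php
  // Transport, message and attachments are separate objects, which is the
  // separation of concerns described above.
  require_once 'lib/swift_required.php'; // or the Composer autoloader

  $transport = Swift_SmtpTransport::newInstance( 'localhost', 25 );
  $mailer = Swift_Mailer::newInstance( $transport );

  $message = Swift_Message::newInstance( 'Subject line' )
      ->setFrom( array( 'wiki@example.org' => 'Example Wiki' ) )
      ->setTo( array( 'user@example.org' ) )
      ->setBody( 'Plain text body' )
      ->addPart( '<p>HTML body</p>', 'text/html' ) // as many bodies as needed
      ->attach( Swift_Attachment::fromPath( '/tmp/report.pdf' ) );

  $numSent = $mailer->send( $message );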
 
  My mentors also think it would be a nice idea, and Nemo recommended
  adding it to my GSoC project deliverables here
  (https://www.mediawiki.org/wiki/VERP#Deliverables). But we need more
  community consensus on this, since the switch would have to be done
  first, with VERP as a plugin to it, if Swift Mailer is adopted. I have
  opened a BZ ticket for it
  (https://bugzilla.wikimedia.org/show_bug.cgi?id=63483). Please comment on
  this thread or in the BZ regarding the switch, as it needs to be decided
  before we can start. The discussions we have had on this to date are
  here: https://www.mediawiki.org/wiki/Talk:VERP#Swift_Mailer_and_VERP__40928.
 
  [1]: https://www.mediawiki.org/wiki/VERP
  [2]:
  https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Third-party_components
  [3]: http://swiftmailer.org/
 
 
  Thanks,
  Tony Thomas http://tttwrites.in
  FOSS@Amrita http://foss.amrita.ac.in
 
  *where there is a wifi,there is a way*
 

-- 
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/



[Wikitech-l] Fwd: [Wikimedia-GH] Wiki Loves Earth Begins!

2014-05-01 Thread rupert THURNER
hi,

is there a possibility to get a banner on enwp for Ghana for Wiki Loves
Earth, as this is this year's main contest there?
https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Earth_2014_in_Ghana

rupert


-- Forwarded message --
From: Enock Seth Nyamador kwadzo...@gmail.com
Date: Thu, May 1, 2014 at 11:37 AM
Subject: Re: [Wikimedia-GH] Wiki Loves Earth Begins!
To: Planning Wikimedia Ghana Chapter wikimedia...@lists.wikimedia.org


Here is our poster:


Regards,

Enock
enwp.org/User:Enock4seth


On Thu, May 1, 2014 at 1:47 AM, Enock Seth Nyamador kwadzo...@gmail.com wrote:

 Hi All,

 Wiki Loves Earth has started; you can now upload your photos here.

 FYI, anyone reading Wikipedia or Wikimedia Commons from Ghana
 (specifically from Ghanaian IPs), whether logged in or not, will see the
 image below:


 Regards,

 Enock
 enwp.org/User:Enock4seth




Re: [Wikitech-l] Fwd: [Wikimedia-GH] Wiki Loves Earth Begins!

2014-05-01 Thread Jeremy Baron
On May 1, 2014 6:22 PM, rupert THURNER rupert.thur...@gmail.com wrote:
 is there a possibility to get a banner on enwp for ghana to wiki loves
 earth, as this is this years main contest there?
 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Earth_2014_in_Ghana

That doesn't have anything to do with this list. (I can't think of a more
relevant list besides a general one like wikimedia-l though)

Please post a request on either of:
* https://en.wikipedia.org/wiki/WP:Geonotice
* https://meta.wikimedia.org/wiki/CentralNotice/Calendar (and also its talk
page)

Also, don't ask for notices for events after they've already started! Plan
and coordinate them ahead of time.

-Jeremy

[Wikitech-l] Fwd: [Wikimedia-l] Mobile Operator IP Drift Tracking and Remediation

2014-05-01 Thread Adam Baso
Update.

-- Forwarded message --
From: Adam Baso ab...@wikimedia.org
Date: Thu, May 1, 2014 at 7:52 PM
Subject: Re: [Wikimedia-l] Mobile Operator IP Drift Tracking and Remediation
To: Wikimedia Mailing List wikimedi...@lists.wikimedia.org


After examining this, it looks like EventLogging is better suited to the
logging task than debug logging, which would have required altering the
debug logging code in core MediaWiki.
EventLogging logs at the resolution of a second (instead of a day), but has
inbuilt support for record removal after 90 days.

Please do let us know in case of further questions. Here's the logging
schema for those with an interest:

https://meta.wikimedia.org/wiki/Schema:MobileOperatorCode

Here's the relevant server code:

https://gerrit.wikimedia.org/r/#/c/130991/

-Adam




On Wed, Apr 16, 2014 at 2:20 PM, Adam Baso ab...@wikimedia.org wrote:

 Great idea!

 Anyone on the list know if there's a way to make the debug log facilities
 do the MMDD timestamp instead of the longer one?

 If not, I suppose we could work to update the core MediaWiki code. [1]

 -Adam

 1. For those with PHP skills or equivalent, I'm referring to
 https://git.wikimedia.org/blob/mediawiki%2Fcore.git/a26687e81532def3faba64612ce79b701a13949e/includes%2FGlobalFunctions.php#L1042.
 Scroll to the bottom of the function definition to see the datetimestamp
 approach.


 On Wed, Apr 16, 2014 at 12:47 PM, Andrew Gray
 andrew.g...@dunelm.org.uk wrote:

 Hi Adam,

 One thought: you don't really need the date/time data at any detailed
 resolution, do you? If what you're wanting it for is to track major
 changes (last month it all switched to this IP) and to purge old
 data (delete anything older than 10 March), you could simply log day
 rather than datetime.

 enwiki / 127.0.0.1 / 123.45 / 2014-04-16:1245.45

 enwiki / 127.0.0.1 / 123.45 / 2014-04-16

 - the latter gives you the data you need while making it a lot harder
 to do any kind of close user-identification.

 Andrew.
 On 16 Apr 2014 19:17, Adam Baso ab...@wikimedia.org wrote:

  Inline.
 
  Thanks for starting this thread.
  
   Sorry if I've overlooked this, but who/what will have access to this
  data?
   Only members of the mobile team? Local project CheckUsers? Wikimedia
   Foundation-approved researchers? Wikimedia shell users? AbuseFilter
   filters?
  
 
  It's a good question. The thought is to put it in the customary
 wfDebugLog
  location (with, for example, filename mccmnc.log) on fluorine.
 
  It just occurred to me that the wiki name (e.g., enwiki), but not the
  full URL, gets logged additionally as part of the wfDebugLog call; to
 make
  the implicit explicit, wfDebugLog adds a datetime stamp as well, and
 that's
  useful for purging old records. I'll forward this email to mobile-l and
  wikitech-l to underscore this.
 
 
   And this may be a silly question, but is there a reasonable means of
   approximating how identifying these two data points alone are? That
 is,
   Using a mobile country code and exit IP address, is it possible to
   identify a particular editor or reader? Or perhaps rephrased, is this
  data
   considered anonymized?
  
 
  Not a silly question. My approximation is these tuples (datetime, now
 that
  it hit me - XYwiki, exit IP, and MCC-MNC) alone, although not perfectly
  anonymized, are low identifying (that is, indirect inferences on the
 data
  in isolation are unlikely, but technically possible, through
 examination of
  short tail outliers in a cluster analysis where such readers/editors
 exist
  in the short tail outliers sets), in contrast to regular web access logs
  (where direct inferences are easy).
 
  Thanks. I'll forward this along now.
 
  -Adam