Hi all,

Here is a little more detail from the funnel analysis of UploadWizard (if you
haven't been following the other funnel thread,
[[mw:UploadWizard/Funnel_analysis]]
<https://www.mediawiki.org/wiki/UploadWizard/Funnel_analysis> has a quick
summary).

*Users repeat the upload process many times*

The main thing I am trying to understand at this point is why people use
the "upload another file" button so much. UploadWizard allows uploading up
to 50 files at the same time, which should be more than enough for the
average user, but our click-tracking data shows that most people click
through the tutorial-file-deed-details-thanks screens, then click on the
upload more button (which effectively resets the process and starts again
from the file screen), then click through the screens again, then click on
the upload more button again, then do the same again, and again, and again.
(Doing this fifty times in a row is not uncommon.) This suggests some
fundamental failing in UW - Sage suggested it is the instability of
uploading more than a few files at the same time. I wonder if others have
relevant experience?

*Errors do not seem to be the main problem*

I have tried to identify the reason for failed UploadWizard sessions (a
series of UploadWizard events logged on the same page which are not
terminated by reaching the thanks page) by checking what the last event
was, and assuming that for failed sessions caused by errors, that error
would be the last event. Assuming this is sound, errors do not seem to be
the main problem - they only appear at the end of ~25% of the failed
sessions (which is ~8% of the total sessions).
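To make the method concrete, here is a rough Python sketch of the last-event heuristic. It is not the actual analysis code: the event names ("thanks", the "error:" prefix) and the (session_id, timestamp, event_name) tuple layout are assumptions made for illustration, and a session is treated as failed if it never reaches the thanks page at all.

```python
from collections import Counter

def classify_sessions(events):
    """Apply the last-event heuristic to an iterable of
    (session_id, timestamp, event_name) tuples, in any order.

    Assumed conventions (hypothetical, not the real schema):
    reaching the thanks page is logged as "thanks", and errors
    are logged as "error:<code>".
    """
    last_event = {}
    reached_thanks = set()
    # Sort by session, then time, so the last name seen per session
    # is that session's final event.
    for session_id, _timestamp, name in sorted(events, key=lambda e: (e[0], e[1])):
        last_event[session_id] = name
        if name == "thanks":
            reached_thanks.add(session_id)

    # Failed session: never reached the thanks page.
    failed = {s: n for s, n in last_event.items() if s not in reached_thanks}
    # Only failed sessions whose final event is an error are
    # attributed to that error.
    ended_in_error = Counter(n for n in failed.values() if n.startswith("error:"))
    return len(last_event), len(failed), ended_in_error
```

The share of failed sessions ending in an error is then sum(ended_in_error.values()) divided by the failed count. As a sanity check on the figures above: if ~25% of failed sessions is ~8% of all sessions, then roughly a third of all sessions (0.08 / 0.25 = 0.32) were counted as failed.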

*Top errors*

That said, here is a list of error codes (these are mostly API error codes,
but a few are internal to UploadWizard) sorted by frequency, collected over
~1000 sessions:

| filename             |    20 |
| badtoken             |    19 |
| missingresult        |    14 |
| title                |    13 |
| publishfailed        |    11 |
| stasherror           |     7 |
| server-error         |     3 |
| fileexists-forbidden |     2 |
| filetype-banned-type |     1 |
| unknown              |     1 |
| verification-error   |     1 |
| unknownerror         |     1 |

A little explanation about the more frequent ones:

   - filename: these seem to be user errors - most often invalid filetype
   (doc, bmp etc), sometimes no extension at all or trying to add the same
   file twice.
   - badtoken: some sort of CSRF token expiration; bug 69691
   <https://bugzilla.wikimedia.org/show_bug.cgi?id=69691>
   - missingresult: returned by the upload API in the details step when the
   uploaded file has gone missing; bug 43967
   <https://bugzilla.wikimedia.org/show_bug.cgi?id=43967>
   - title: an error about duplicate files (i.e. the same file already
   exists on Commons) that somehow happens in the details step instead of the
   file step.
   - publishfailed: this seems to be some sort of race condition: the first
   API call to publish a file from the stash puts it into the job queue and
   sets its status to pending; a second call then throws this error.
   - stasherror: could be lots of things. bug 56302
   <https://bugzilla.wikimedia.org/show_bug.cgi?id=56302>, bug 54028
   <https://bugzilla.wikimedia.org/show_bug.cgi?id=54028> and more.


*Some suggestions based on the findings so far*

Quick wins:

   - review UX for "fatal user errors" (i.e. when UploadWizard says "you
   can't upload this file type") - is the error message helpful?
   - review and improve API error messages (api-error-*), possibly override
   them with UW-specific ones. Do they identify next steps? Do they even
   exist? (E.g. api-error-publishfailed does not.)
   - renew token on badtoken error (bug 69691
   <https://bugzilla.wikimedia.org/show_bug.cgi?id=69691>)
   - make sure that the specific error message thrown by
   ApiUpload::dieUsage gets logged somewhere. Currently we only log a generic
   message derived from the API error code, so e.g. all the dozen different
   UploadStashException subclasses are reported with the same message.
   - poll for success on publishfailed error (contrary to what its name
   suggests, it seems to actually mean something like "publish in progress")
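The token-renewal and polling ideas above could be combined along these lines. This is only a sketch in Python to show the control flow, not UploadWizard code: ApiError, publish, renew_token and check_status are hypothetical stand-ins for whatever the client really uses, and the status strings "pending" / "done" / "failed" are assumptions.

```python
import time

class ApiError(Exception):
    """Hypothetical wrapper for an API error carrying its error code."""
    def __init__(self, code):
        super().__init__(code)
        self.code = code

def publish_with_recovery(publish, renew_token, check_status, token,
                          max_polls=5, poll_delay=1.0, sleep=time.sleep):
    """Try to publish; recover from badtoken and publishfailed.

    publish(token) returns a status string or raises ApiError,
    renew_token() returns a fresh token, and check_status() returns
    "pending", "done" or "failed". All three are assumed callables.
    """
    token_renewed = False
    while True:
        try:
            return publish(token)
        except ApiError as err:
            if err.code == "badtoken" and not token_renewed:
                # Renew the token once, then retry the same request.
                token = renew_token()
                token_renewed = True
                continue
            if err.code == "publishfailed":
                # Treat this as "publish in progress": poll instead of
                # sending a second publish request.
                for _ in range(max_polls):
                    status = check_status()
                    if status == "done":
                        return status
                    if status == "failed":
                        break
                    sleep(poll_delay)
            # Anything else, or recovery did not work: re-raise.
            raise
```

The token is renewed only once so that a persistently bad token cannot cause an endless retry loop, and the number of polls is capped for the same reason.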

Medium wins:

   - understand better why people repeat the upload process so often. This
   might reveal serious UX deficiencies or functional errors (e.g. in an older
   thread about funnel analysis, Sage claims uploading more than three files
   at the same time is too unreliable for him).
   - Investigate if there is a low-effort way to recover entered details
   when the upload process has to be restarted. (There are drop-in solutions
   like garlic.js <http://garlicjs.org/> or sisyphus.js
   <https://github.com/simsalabim/sisyphus> but the very dynamic nature of
   UW forms might be a problem.)
   - figure out why some title errors are only reported in the details step
   - log information
   <https://meta.wikimedia.org/wiki/Schema:UploadWizardFlowEvent> about
   uploaded files to better identify size- or filetype-specific issues

Bigger / longer-term effort:

   - figure out a way to retry when the user already entered all the
   details but publishing the file failed. (This points towards the
   per-file-workflow-instead-of-global-workflow direction.)
   - make stashed / async uploads rely on the database instead of the
   session (bug 43967
   <https://bugzilla.wikimedia.org/show_bug.cgi?id=43967>)
_______________________________________________
Multimedia mailing list
https://lists.wikimedia.org/mailman/listinfo/multimedia
