See also https://phabricator.wikimedia.org/T119352, which proposes tracking time on site / page in general.
On Jul 1, 2016 4:24 PM, "Marcel Ruiz Forns" <[email protected]> wrote:

If we were doing this internally, one possibility would be to instrument MediaWiki and send sampled events with the time on page to EventLogging. This would not be retroactive, though; we would have to wait a couple of months to collect significant data. In any case, I'm not sure whether this would be possible with an NDA.

On Fri, Jul 1, 2016 at 11:52 AM, Marc Miquel <[email protected]> wrote:

> I see it is quite complicated to work with this data. It is a pity,
> considering that valuable insights could be drawn from readers' behaviors. I
> will think about what can be useful for the study.
>
> Thanks for the answers, Nuria and Marcel! :)
> Cheers,
>
> Marc
>
> On Thu, 30 June 2016 at 14:16, Marcel Ruiz Forns (<[email protected]>)
> wrote:
>
>> Marc, I also see what Nuria says. Also please consider that the majority
>> of Wikipedia sessions have only one pageview. So in the majority of
>> sessions it would not be possible to approximate the time spent on page
>> with boundaries with Joseph's alternative.
>>
>> On Thu, Jun 30, 2016 at 2:02 PM, Nuria Ruiz <[email protected]> wrote:
>>
>>> > Aye, as Joseph says, the time-on-page or time-leaving is not
>>> > collected, except as an extension of session reconstruction work. If you
>>> > want a concrete time, you're not gonna get it.
>>>
>>> I was about to make the same point. The data set that will most closely
>>> answer your questions is the one Oliver mentioned. Otherwise, we do not keep
>>> any information related to time on site and page requests, so there is no
>>> "approximation" possible that will work on overall data. Even if you
>>> calculate signatures with IP-hash + user agent to approximate users (a
>>> method with known issues), there is no way for you to distinguish someone
>>> reading a page for an hour from someone who came to Wikipedia twice in the
>>> same hour and spent a minute each time. Hopefully my example makes things
>>> clearer.
>>>
>>> Thanks,
>>>
>>> Nuria
>>>
>>> On Wed, Jun 29, 2016 at 4:58 AM, Oliver Keyes <[email protected]>
>>> wrote:
>>>
>>>> Aye, as Joseph says, the time-on-page or time-leaving is not collected,
>>>> except as an extension of session reconstruction work. If you want a
>>>> concrete time, you're not gonna get it.
>>>>
>>>> While PC-based data is more reliable than mobile, that does not
>>>> necessarily mean "reliable". I'm sort of confused, I guess, as to why the
>>>> datasets I linked (unless I'm misremembering them?) don't help: you would
>>>> have to do the calculation yourself but they should contain all the data
>>>> necessary to make that calculation (unless you want to have the pageID or
>>>> title associated with the time-on-page, in which case...yeah, that's an
>>>> issue).
>>>>
>>>> On Wed, Jun 29, 2016 at 3:16 AM, Marc Miquel <[email protected]>
>>>> wrote:
>>>>
>>>>> Thanks for the answer, Oliver. But I am not sure it answers my
>>>>> questions. I'd like to study aspects like how much time is spent on
>>>>> certain pages, as a proxy for how content is approached/read/understood.
>>>>> I'd be happy with the time of entering the page and the time of leaving.
>>>>> This is not entirely centered on 'user activity', but I said that because
>>>>> I imagined the data would be stored in a similar way to editor sessions,
>>>>> or in a database, and I would need to do the time calculations.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Marc
>>>>>
>>>>>
>>>>> On Wed, 29 June 2016 at 03:11, Oliver Keyes <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> If historic data is okay, there's already a dataset released (
>>>>>> https://figshare.com/articles/Activity_Sessions_datasets/1291033)
>>>>>> that was designed specifically to answer questions around how to best
>>>>>> calculate session length with regards to Wikipedia (
>>>>>> http://arxiv.org/abs/1411.2878)
>>>>>>
>>>>>> On Tue, Jun 28, 2016 at 3:42 PM, Marc Miquel <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello!
>>>>>>>
>>>>>>> I was thinking about user sessions, yes, so this would mean
>>>>>>> aggregating the pageviews visited by a user during a short amount of
>>>>>>> time (I should check the cutoff, but it could be around an hour or
>>>>>>> less).
>>>>>>>
>>>>>>> I am particularly interested in understanding the order in which
>>>>>>> pages are seen (start, end), duration, etc.
>>>>>>> I wouldn't need data from a long period either, but I think data
>>>>>>> from multiple languages would be helpful.
>>>>>>>
>>>>>>> I imagined reader data could be sensitive to privacy, but would an
>>>>>>> NDA with my university and some sort of data encoding help with this?
>>>>>>> As I said, it is for a scientific purpose.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Marc
>>>>>>>
>>>>>>> On Tue, 28 June 2016 at 21:09, Nuria Ruiz (<[email protected]>)
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> > I am considering studying reader engagement for different article
>>>>>>>> > topics in different languages. Because of this, I would like to know
>>>>>>>> > if there is any plan to make available pageviews dumps detailing
>>>>>>>> > activity log at session level per user - in a similar way to editor
>>>>>>>> > sessions.
>>>>>>>>
>>>>>>>> Are you thinking of "all-pageviews-visited-by-a-certain-user"? If
>>>>>>>> so, no, we do not have any projects to provide that data: due to
>>>>>>>> privacy concerns we neither have nor keep that information.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Nuria
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 28, 2016 at 6:55 PM, Leila Zia <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> + Analytics
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jun 28, 2016 at 6:36 AM, Marc Miquel <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I have a question for you regarding pageviews datadumps.
>>>>>>>>>>
>>>>>>>>>> I am considering studying reader engagement for different article
>>>>>>>>>> topics in different languages. Because of this, I would like to know
>>>>>>>>>> if there is any plan to make available pageviews dumps detailing
>>>>>>>>>> activity log at session level per user - in a similar way to editor
>>>>>>>>>> sessions.
>>>>>>>>>>
>>>>>>>>>> Since this would be for a research project for which I might ask
>>>>>>>>>> for funding, I would like to know if I could count on that, what the
>>>>>>>>>> nature of the available data is, what the procedure to obtain this
>>>>>>>>>> data would be, and whether there would be any implications because
>>>>>>>>>> of privacy concerns.
>>>>>>>>>>
>>>>>>>>>> Thank you very much!
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Marc Miquel
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Wiki-research-l mailing list
>>>>>>>>>> [email protected]
>>>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Analytics mailing list
>>>>>>>>> [email protected]
>>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>>>>>
>>
>> --
>> *Marcel Ruiz Forns*
>> Analytics Developer
>> Wikimedia Foundation
>

--
*Marcel Ruiz Forns*
Analytics Developer
Wikimedia Foundation
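
For readers following the thread, here is a minimal sketch of the session-reconstruction idea discussed above: group one approximate user's pageviews into sessions using an inactivity cutoff, then estimate time on page as the gap to the next pageview in the same session. This is not code from any participant or from Wikimedia. The function names, the `(title, timestamp)` tuple layout, and the one-hour cutoff are all assumptions made for illustration (Marc only suggests "around an hour or less").

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Assumed layout: (page title, request timestamp) for ONE approximate user,
# e.g. one IP-hash + user-agent signature (a method with known issues).
Pageview = Tuple[str, datetime]

# Hypothetical inactivity cutoff, chosen only for this example.
SESSION_CUTOFF = timedelta(hours=1)


def split_sessions(views: List[Pageview],
                   cutoff: timedelta = SESSION_CUTOFF) -> List[List[Pageview]]:
    """Group pageviews into sessions; a gap longer than `cutoff` starts a new one."""
    sessions: List[List[Pageview]] = []
    for view in sorted(views, key=lambda v: v[1]):
        if sessions and view[1] - sessions[-1][-1][1] <= cutoff:
            sessions[-1].append(view)
        else:
            sessions.append([view])
    return sessions


def time_on_page(session: List[Pageview]) -> List[Tuple[str, Optional[timedelta]]]:
    """Estimate time on page as the gap to the next pageview in the session.

    The last pageview of a session has no following request, so its
    duration is unknown and reported as None.
    """
    estimates: List[Tuple[str, Optional[timedelta]]] = []
    for (title, ts), (_, next_ts) in zip(session, session[1:]):
        estimates.append((title, next_ts - ts))
    estimates.append((session[-1][0], None))
    return estimates
```

The sketch also shows the limits raised in the thread. A session with a single pageview yields only `None`, so most sessions give no estimate at all. And a reader who views a page briefly, leaves, and returns within the cutoff produces the same gap as one who read that page the whole time.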
