Thanks Christian -- nice to see you back on the lists :)

-Toby

On Thu, Aug 20, 2015 at 1:16 PM, Christian Aistleitner <
[email protected]> wrote:

> Hi,
>
> I've been asked in a private email why WMF forked ua-parser [1]
> (a library used to extract information from User-Agent headers).
> There is no need to discuss this in private, hence I am replying to
> the mailing list.
>
> TL;DR: It was not a real fork. We just worked around issues with
> upstream's release management.
>
> -----------------------------------------------------
>
> What follows is fairly detailed, but given the context I decided to
> err on the side of being over-verbose.
>
> Back in October 2014, WMF pushed towards analyzing User-Agent headers
> in the logs, for example to allow more accurate estimates of how many
> requests WMF sees from Android vs. iPhone devices, which browsers get
> used in which versions, etc.
>
> Extracting information from User-Agents is a bit tricky, as there are
> quite a few corner cases. So it was decided to use a third-party
> library for it, and ua-parser [1] was chosen for this purpose.
>
> ua-parser comes with a Java build, so it naturally matched the log
> processing's Java ecosystem. However, (at least) back then ua-parser
> did not offer compelling prebuilt jars, and the versioning and
> release cycle of its Java part were broken.
> The latest release was about a year old, and no proper release was in
> sight. So all upstream gave us was a jar versioned as
>
>   ua-parser-1.3.0-SNAPSHOT.jar
>
> Deploying such a jar to the cluster is a bad idea, as its name gives
> no clue as to which commit it is based on.  In this concrete setting,
> there would be about 250 commits in ua-parser that would produce the
> same version number.  That would make debugging hard and rule out
> reproducibility.
>
> Since WMF cannot do a proper release for ua-parser, the typical
> workaround for WMF in such cases is to create a “wmf” branch in
> Gerrit and do “wmf” releases at known commits. And that's what the
> ua-parser “fork” in Gerrit does.
>
> Comparing upstream with the “fork” in Gerrit, the only difference is:
>
>   https://gerrit.wikimedia.org/r/#/c/169204/
>
> That commit allows for a wmf release, is tagged 1.3.0-wmf1 and results
> in an artifact name of
>
>   ua-parser-1.3.0-wmf1.jar
>
> which (due to the 1.3.0-wmf1 tag) is good for releasing [2].
>
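The difference between the two jar names above can be illustrated with a minimal Python sketch. The helper names `version_from_jar` and `is_snapshot_version` are hypothetical, and the rule they encode (a version ending in `-SNAPSHOT` names a moving development state, per Maven's naming convention) is an assumption for illustration; this is not code from ua-parser or refinery.

```python
def version_from_jar(jar_name: str, artifact: str) -> str:
    """Extract the version part from a name like 'ua-parser-1.3.0-wmf1.jar'."""
    prefix, suffix = artifact + "-", ".jar"
    if not (jar_name.startswith(prefix) and jar_name.endswith(suffix)):
        raise ValueError(f"unexpected jar name: {jar_name!r}")
    return jar_name[len(prefix):-len(suffix)]


def is_snapshot_version(version: str) -> bool:
    """Assumed rule: a '-SNAPSHOT' suffix marks a development build,
    so many different commits can produce a jar with this same version."""
    return version.endswith("-SNAPSHOT")


for jar in ("ua-parser-1.3.0-SNAPSHOT.jar", "ua-parser-1.3.0-wmf1.jar"):
    version = version_from_jar(jar, "ua-parser")
    kind = "snapshot" if is_snapshot_version(version) else "release-style"
    print(f"{jar}: version {version} ({kind})")
```

A check like this only looks at the name. Whether a release-style version really maps to one commit still depends on a tag such as v1.3.0-wmf1 existing in the repository.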
> As one of the questions in the private email was whether WMF could
> switch back to upstream ... I hope you see that WMF never switched
> away from upstream and WMF never “forked” upstream. WMF only rolled
> their own release.
> If upstream now provides proper releases, sure, just use them :-)
>
> Have fun,
> Christian
>
>
>
> P.S.:
> * How can I find out who actually created a repository?
>
> Look at the first commit to the meta/config branch. Like here:
>
>
> https://git.wikimedia.org/commit/analytics%2Fua-parser/2fd5dc00ac9e087b307f42669029f9b05cdcb090
>
>
> * How can I see the difference between branches?
>
> Use `git cherry` (Yes, really. Just “cherry”, no trailing “-pick”)
>
> An example session is at [3].
>
>
> * How could one have found out about the wmf1 thing?
>
> For example, from the IRC logs from the day of the commit [4]:
> [20:23:08] <ottomata>    we can just make wmf1 be our release of the current master?
> [20:23:13] <qchris>      k
>
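As a rough mental model of what the `git cherry` comparison mentioned above reports, here is a simplified Python sketch. It assumes each commit can be reduced to a (sha, patch_id) pair, where patch_id stands in for a hash of the change the commit introduces. The `Commit` type, the `cherry` function and the sample data are all hypothetical, and real `git cherry` works on the commit graph rather than on plain lists.

```python
from typing import List, NamedTuple


class Commit(NamedTuple):
    sha: str       # commit id
    patch_id: str  # stand-in for a hash of the diff the commit introduces


def cherry(upstream: List[Commit], head: List[Commit]) -> List[str]:
    """Simplified model of `git cherry <upstream> <head>`.

    Commits that are only on head are listed with "+" if no upstream-only
    commit introduces the same change, and with "-" if one does.
    """
    upstream_shas = {c.sha for c in upstream}
    head_shas = {c.sha for c in head}
    # Only upstream commits that head does not already contain are compared.
    upstream_only_patches = {
        c.patch_id for c in upstream if c.sha not in head_shas
    }
    lines = []
    for commit in head:
        if commit.sha in upstream_shas:
            continue  # shared history is not reported
        sign = "-" if commit.patch_id in upstream_only_patches else "+"
        lines.append(f"{sign} {commit.sha}")
    return lines


# Hypothetical data: the second branch carries one extra commit.
upstream = [Commit("a1", "p1"), Commit("b2", "p2")]
wmf = upstream + [Commit("c3", "p3")]

print(cherry(upstream, upstream))  # []
print(cherry(upstream, wmf))       # ['+ c3']
```

In this model, an empty result means the second branch has nothing that upstream lacks, and a single "+" line means exactly one commit exists only on that branch.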
>
>
> ----------------------------------------------------------------------
>
> [1] Back then, ua-parser lived at
>
>   https://github.com/tobie/ua-parser
>
> Now the relevant repos for WMF seem to be at
>
>   https://github.com/ua-parser/uap-core
>   https://github.com/ua-parser/uap-java
>
>
>
> [2] The wmf1 release made it into Archiva:
>
>   https://archiva.wikimedia.org/#artifact/ua_parser/ua-parser/1.3.0-wmf1
>
> into the refinery-hive jars:
>
>   https://gerrit.wikimedia.org/r/#/c/166142/11..14/refinery-hive/pom.xml
>
> and also to the cluster:
>
>   https://gerrit.wikimedia.org/r/#/c/170373/1/refinery-hive/pom.xml
>   https://gerrit.wikimedia.org/r/#/c/170375/
>
>
>
> [3]
> _________________________________________________________________
> christian@spencer // jobs: 0 // time: 21:40:28 // exit code: 0
> cwd: ~/tmp
> git clone https://github.com/tobie/ua-parser
> Cloning into 'ua-parser'...
> remote: Counting objects: 4507, done.
> remote: Total 4507 (delta 0), reused 0 (delta 0), pack-reused 4507
> Receiving objects: 100% (4507/4507), 4.31 MiB | 923 KiB/s, done.
> Resolving deltas: 100% (2301/2301), done.
>
> _________________________________________________________________
> christian@spencer // jobs: 0 // time: 21:41:10 // exit code: 0
> cwd: ~/tmp
> cd ua-parser
>
> _________________________________________________________________
> christian@spencer // jobs: 0 // time: 21:41:14 // exit code: 0
> cwd: ~/tmp/ua-parser
> git remote add gerrit https://gerrit.wikimedia.org/r/analytics/ua-parser
>
> _________________________________________________________________
> christian@spencer // jobs: 0 // time: 21:41:33 // exit code: 0
> cwd: ~/tmp/ua-parser
> git fetch gerrit
> remote: Finding sources: 100% (4/4)
> remote: Total 4 (delta 3), reused 4 (delta 3)
> Unpacking objects: 100% (4/4), done.
> From https://gerrit.wikimedia.org/r/analytics/ua-parser
>  * [new branch]      master     -> gerrit/master
>  * [new branch]      wmf        -> gerrit/wmf
>  * [new tag]         v1.3.0-wmf1 -> v1.3.0-wmf1
>
> _________________________________________________________________
> christian@spencer // jobs: 0 // time: 21:41:38 // exit code: 0
> cwd: ~/tmp/ua-parser
> git cherry origin/master gerrit/master
>
> _________________________________________________________________
> christian@spencer // jobs: 0 // time: 21:42:10 // exit code: 0
> cwd: ~/tmp/ua-parser
> git cherry origin/master gerrit/wmf
> + 2a44875355b558d9f880a63c86630af229044a63
>
> _________________________________________________________________
> christian@spencer // jobs: 0 // time: 21:42:17 // exit code: 0
> cwd: ~/tmp/ua-parser
> git cherry origin/master v1.3.0-wmf1
> + 2a44875355b558d9f880a63c86630af229044a63
>
>
>
> [4]
> http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-analytics/20141027.txt
>
>
>
> --
> ---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
>                            Companies' registry: 360296y in Linz
> Christian Aistleitner
> Kefermarkterstrasze 6a/3     Email:  [email protected]
> 4293 Gutau, Austria          Phone:          +43 7946 / 20 5 81
>                              Fax:            +43 7946 / 20 5 81
>                              Homepage: http://quelltextlich.at/
> ---------------------------------------------------------------
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>