I have filtered out the bogus responses and re-generated all the charts and tables. You can see the updated results here:
https://github.com/tfausak/tfausak.github.io/blob/ee29da5bd8389c19763ac2b4dbe27ff5204161f5/_posts/2018-11-16-2018-state-of-haskell-survey-results.markdown

Note that until I post the results on my blog, they are not published. Please don't share the preliminary results on social media!

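For readers who want a concrete picture of this kind of filtering, here is a minimal Python sketch. It is not the script actually used for the survey. It combines two heuristics raised in the quoted thread below (no demographic information, and no answer to any free-response question). The column names and the sample rows are hypothetical, made up purely for illustration.

```python
import csv
import io

# Hypothetical column names; the real survey CSV may label these differently.
DEMOGRAPHIC_COLUMNS = ["country", "education"]
FREE_RESPONSE_COLUMNS = ["comments"]


def is_blank(value):
    """True when a CSV cell is missing or contains only whitespace."""
    return value is None or value.strip() == ""


def looks_scripted(row):
    """Heuristic: no demographics and no free-response answers at all."""
    columns = DEMOGRAPHIC_COLUMNS + FREE_RESPONSE_COLUMNS
    return all(is_blank(row.get(column)) for column in columns)


def filter_responses(rows):
    """Keep only the rows that do not match the heuristic."""
    return [row for row in rows if not looks_scripted(row)]


# Tiny made-up sample, not real survey data.
SAMPLE = """country,education,comments,build_tool
Canada,Bachelor,Love the tooling,Cabal
,,,Stack
,,More docs please,Stack
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = filter_responses(rows)
print(len(rows), "responses read,", len(kept), "kept")
```

A heuristic like this will also drop some genuine responses from people who simply skipped those questions, which is the trade-off Gershom mentions in the thread.
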
On Sun, Nov 18, 2018, at 8:11 AM, Taylor Fausak wrote:
> Thanks for finding those anomalies, Gershom! I'm disappointed that
> someone submitted bogus responses, apparently to tip the scales of
> Cabal versus Stack. I intend to identify those responses and exclude
> them from the results. The work you've done so far will help a great
> deal in finding them.
>
> You said that there are about 1,200 responses with demographic
> information. That makes sense considering the number of submissions I
> got last year. Also, there are 1,185 responses that included an answer
> to at least one of the free-response questions. So perhaps whoever
> wrote the script didn't bother to put an answer for those types of
> questions.
>
> Unfortunately I do not have precise submission times or IP address
> information about submissions. Beyond what's in the CSV, the only
> other thing I have is (some) email addresses.
>
> Fortunately I wrote a script to output all the charts and tables from
> the survey responses. Once I've identified the problematic responses,
> I should be able to update the script to ignore them and regenerate
> all the output.
>
>
> On Sun, Nov 18, 2018, at 3:40 AM, Chris Smith wrote:
>> Sadly, it looks like a Cabal/Stack thing. Of the responses with a
>> country provided, 618 of 1226 claim to use Cabal, and 948 of 1226
>> claim to use Stack. Of the responses with no country, only 35 of 3868
>> claim to use Cabal, while 3781 of the 3868 claim to use Stack.
>> Assuming independence, you'd expect that last number to be about 50,
>> meaning there are probably around 3700 fake responses generated just
>> to answer "Stack".
>>
>> To partially answer Simon's question, the flood of no-demographics
>> responses started on November 2, around the 750-response point, and
>> continued unabated through the close of the survey. And, indeed,
>> looking at just the first 750 responses gives similar distributions
>> to what we get by ignoring the no-demographic responses. For
>> example, of the first 750 responses, 359 claim to use Cabal, and 568
>> claim to use Stack.
>>
>> On Sun, Nov 18, 2018 at 2:31 AM Simon Marlow
>> <marlo...@gmail.com> wrote:
>>> Good spot, Gershom. Maybe it would be revealing to look at the times
>>> that responses were received for the no-demographics group?
>>>
>>> On Sun, 18 Nov 2018, 07:17 Gershom B <gersh...@gmail.com> wrote:
>>>> I also noticed a number of other bizarre statistical anomalies when
>>>> looking at the full results. I know this is a bit much to ask — but
>>>> if you could rerun the statistics filtering out people that did not
>>>> give demographic information (i.e. country of origin or education,
>>>> etc) I think the results will change drastically. By all
>>>> statistical logic, this should _not_ be the case, and points to a
>>>> serious problem.
>>>>
>>>> In particular, this drops the results by a huge amount — only 1,200
>>>> or so remain. However, the remaining results tend to make a lot
>>>> more sense. For example — of the “no demographics” group, there are
>>>> 713 users who claim to develop with Notepad++ but all of these say
>>>> they develop on Mac and Linux, and none on Windows — which is
>>>> impossible, as Notepad++ is a Windows program. Further, if you drop
>>>> the “no demographics” group, then you find that almost everyone
>>>> uses at least GHC 8.0.2, while in the “no demographics” group, a
>>>> stunning number of people claim to be on 7.8.3. Even more
>>>> bizarrely, people claim to be using the 7.8 series while only
>>>> having used Haskell for less than one year. And people claim to
>>>> have used Haskell for “one week to one month” and also to be
>>>> advanced and expert users!
>>>>
>>>> The differences continue and defy all probability. Of the “no
>>>> demographics” group, almost everyone dislikes the new release
>>>> schedule. Of the “demographics” group there are answers that like
>>>> it, were not aware of it, or are indifferent, but almost nobody
>>>> dislikes it. There is naturally a difference in proportions of
>>>> Cabal/Stack and Hackage/Stackage responses as well.
>>>>
>>>> There are a lot of other things I could point to as well. But,
>>>> bluntly put, I think that some disaffected party or parties wrote a
>>>> crude script and submitted over 3,000 fake responses. Luckily for
>>>> us, they were not very smart, and made some obvious errors, so in
>>>> this case we can weed out the bad responses (although, sadly,
>>>> losing at least a few real ones as well).
>>>>
>>>> However, assuming this party isn’t entirely stupid, it doesn’t
>>>> bode well for future surveys as they may get at least slightly less
>>>> dumb in the future if they decide to keep it up :-/
>>>>
>>>> —Gershom
>>>>
>>>>
>>>> On November 18, 2018 at 1:10:31 AM, Gershom B (gersh...@gmail.com)
>>>> wrote:
>>>>>
>>>>> This is interesting, but I’m thoroughly confused. Over 2500 people
>>>>> said they took last year’s survey, but it only had roughly 1,300
>>>>> respondents?
>>>>>
>>>>>
>>>>> On Sat, Nov 17, 2018 at 9:56 PM Taylor Fausak <tay...@fausak.me>
>>>>> wrote:
>>>>>> Hello! It took a little longer than I expected, but I am nearly
>>>>>> ready to announce the 2018 state of Haskell survey results. Some
>>>>>> community members have expressed interest in seeing the
>>>>>> announcement post before it's published. If you are one of those
>>>>>> people, you can see the results here:
>>>>>> https://github.com/tfausak/tfausak.github.io/blob/7e4937e284a3068add9e9af6b585c8d0215ff360/_posts/2018-11-16-2018-state-of-haskell-survey-results.markdown
>>>>>>
>>>>>> If you would like to suggest changes to the announcement post,
>>>>>> please respond to this email, send me an email directly, or
>>>>>> reply to this pull request on GitHub:
>>>>>> https://github.com/tfausak/tfausak.github.io/pull/148
>>>>>>
>>>>>> I plan on publishing the results tomorrow. Once the results are
>>>>>> published, the post is by no means set in stone. I will happily
>>>>>> accept suggestions from anyone at any time.
>>>>>>
>>>>>> Thank you!
>>>>>> _______________________________________________
>>>>>> Haskell-community mailing list
>>>>>> Haskell-community@haskell.org
>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-community
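Chris Smith's estimate in the thread above ("about 50" expected Stack answers, "around 3700" fake responses) can be reproduced with a few lines of arithmetic. The sketch below is one plausible reading of his reasoning, not necessarily the calculation he performed. It assumes that the responses with a country are a fair baseline, and that every Cabal answer in the no-country group is genuine. The variable names are mine.

```python
# Figures quoted by Chris Smith in the thread.
with_country_total = 1226
with_country_cabal = 618
with_country_stack = 948
no_country_cabal = 35
no_country_stack = 3781

# Baseline rates among responses that gave a country.
cabal_rate = with_country_cabal / with_country_total  # roughly 0.50
stack_rate = with_country_stack / with_country_total  # roughly 0.77

# Assumption: all 35 no-country Cabal answers are genuine, so the number of
# genuine no-country responses is about 35 / cabal_rate.
genuine_no_country = no_country_cabal / cabal_rate

# Under independence, genuine no-country responses use Stack at the same rate.
expected_stack = genuine_no_country * stack_rate
suspected_fake = no_country_stack - expected_stack

print(round(expected_stack))  # 54, i.e. "about 50"
print(round(suspected_fake))  # 3727, i.e. "around 3700"
```

Under these assumptions the two numbers line up with the figures given in the thread.
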