Makes sense to me. I would love to know which components/APIs are used more than others. Obviously, we should make sure everything is anonymous and we don't collect any private user data, but I believe this is already guaranteed by Google Analytics.

-Val

On Tue, Nov 3, 2020 at 3:59 AM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:

> Folks,
>
> I want to bump up this discussion and slightly change the format suggested
> by Nikita. I don't think it is correct to gather any information related to
> the user environment. However, can we collect just the fact of some of the
> Ignite APIs/subsystems being used with no user information whatsoever?
> Having started thinking about Ignite 3.0 I realized that we lack even some
> very basic knowledge on the impact of changing one or another feature or
> API.
>
> To my knowledge, the Ignite website already uses Google Analytics, which is
> available to the community. The Google Analytics platform already has
> tooling to track app screen hits in a completely anonymous way, so we can
> use this tool to track Ignite components usage (once per node startup)
> sending solely component name and a unique environment hash - no IP
> addresses, no jdk/os/other information. The information will be available
> in the same toolkit we are already using to analyze the website and
> optimize our docs.
>
> WDYT?
>
> On Wed, Jul 19, 2017 at 01:15, <dsetrak...@apache.org> wrote:
>
> > I would try to ping legal again and see if they respond. If not, I think
> > we will need to come up with a simpler approach that does not require
> > legal approval.
> >
> > D.
> >
> > On Jul 18, 2017, at 2:23 PM, Nikita Ivanov <nivano...@gmail.com> wrote:
> >
> > > Igniters,
> > > Just a quick update. I haven't gotten a response from ASF Legal on
> > > this thread and I frankly don't know how to proceed here. What's the
> > > process to arrive at a decision point here?
> > >
> > > Thanks!
> > > --
> > > Nikita Ivanov
> > >
> > > On Mon, Jul 10, 2017 at 3:11 PM, Konstantin Boudnik <c...@apache.org>
> > > wrote:
> > >
> > > > On Sat, Jul 08, 2017 at 11:04AM, Nikita Ivanov wrote:
> > > > > Cos,
> > > > > Based on my experience having it off by default negates the entire
> > > > > purpose...
> > > > > We need a statistically meaningful data set to make any
> > > > > inferences from it. Moreover, if we are going to ask folks to turn
> > > > > it on, it will significantly skew the resulting data set anyway
> > > > > and not show the full picture. I think "on" by default is the
> > > > > better option if we are to collect usage stats to begin with.
> > > >
> > > > yes, sure. But having this "on" by default is likely to expose us to
> > > > another shit-storm down the road. An interesting dilemma to have
> > > > indeed. In my experience, whenever I install something like a browser
> > > > or an operating system, it would ask if I want to make the particular
> > > > piece of software better by sending back some anonymized stats.
> > > > Basically, I am given a way to explicitly opt-out if I wish.
> > > >
> > > > Turning the feature "on" by default is like saying: "we'll be
> > > > collecting some stats, but if you don't want to you can go here and
> > > > there and disable the collection. Oh, and by the way - you need to go
> > > > and figure out the exact steps to disable it."
> > > >
> > > > > Also, I want to re-iterate it again to avoid misunderstanding:
> > > > > there is no proposal nor will there be a technical way to attribute
> > > > > collected data back to a certain company. That's not what this is
> > > > > all about. We should only be interested in aggregated stats
> > > > > (community size, geo information, language information, components
> > > > > usage).
> > > >
> > > > Yes, I think it is clear, but never hurts to re-iterate.
> > > >
> > > > Cos
> > > >
> > > > > Thoughts?
> > > > >
> > > > > --
> > > > > Nikita Ivanov
> > > > > Founder & CTO
> > > > > GridGain Systems
> > > > >
> > > > > On Fri, Jul 7, 2017 at 8:17 PM, Konstantin Boudnik <c...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Actually, that should be OFF by default.
> > > > > > It sounds like this would reduce the amount of data collected,
> > > > > > but this would address the concerns of companies like Roman's.
> > > > > > I know for sure that a few of my clients would sue my ass out of
> > > > > > existence if I gave them a platform collecting their data-center
> > > > > > info.
> > > > > >
> > > > > > Let's have it, set it off by default and document an easy way to
> > > > > > turn it off. Then start making rounds asking our user base to
> > > > > > share _some_ of the stats with the community, so we can track the
> > > > > > growth of the install base, etc.
> > > > > >
> > > > > > Cos
> > > > > >
> > > > > > On Thu, Jul 06, 2017 at 08:20AM, Nikita Ivanov wrote:
> > > > > > > The idea so far is to have a single system property in
> > > > > > > configuration that turns this off completely. I envision that
> > > > > > > this will be prominently featured on the Ignite website so that
> > > > > > > everyone who would like to disable it can do it in seconds.
> > > > > > >
> > > > > > > Thoughts?
> > > > > > >
> > > > > > > --
> > > > > > > Nikita Ivanov
> > > > > > > Founder & CTO
> > > > > > > GridGain Systems
> > > > > > >
> > > > > > > On Wed, Jul 5, 2017 at 9:27 PM, Roman Shtykh <rsht...@yahoo.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Nikita,
> > > > > > > >
> > > > > > > > Sending and storing (somewhere the company cannot securely
> > > > > > > > handle) any information (OS version, IP addresses, etc.) that
> > > > > > > > can be used to compromise the services would be unacceptable.
> > > > > > > > Turning it off might be ok (possibly through the cluster
> > > > > > > > settings, not via a globally-accessible site), but the fact
> > > > > > > > that there's a risk some information can leak outside (for
> > > > > > > > any reason, starting from a human mistake) is scary.
> > > > > > > >
> > > > > > > > -- Roman
> > > > > > > >
> > > > > > > > On Thursday, July 6, 2017 12:38 PM, Nikita Ivanov
> > > > > > > > <niva...@gridgain.com> wrote:
> > > > > > > >
> > > > > > > > Roman,
> > > > > > > > Thanks for the feedback. What are those questions
> > > > > > > > specifically? Are IP addresses and OS what is causing it?
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > --
> > > > > > > > Nikita Ivanov
> > > > > > > > Founder & CTO
> > > > > > > > GridGain Systems
> > > > > > > >
> > > > > > > > On Wed, Jul 5, 2017 at 6:15 PM, Roman Shtykh
> > > > > > > > <rsht...@yahoo.com.invalid> wrote:
> > > > > > > >
> > > > > > > > Nikita,
> > > > > > > >
> > > > > > > > While this will help improve Ignite, it will prevent its
> > > > > > > > adoption by many projects -- sending and retaining IP
> > > > > > > > addresses, OS versions, etc. raises tons of questions when
> > > > > > > > considering using Ignite. Even if it can be opted out.
> > > > > > > >
> > > > > > > > -- Roman
> > > > > > > >
> > > > > > > > On Thursday, July 6, 2017 5:38 AM, Nikita Ivanov
> > > > > > > > <nivano...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Igniters,
> > > > > > > > I would like to kick off the discussion on the idea of
> > > > > > > > collecting Ignite usage statistics. The basic idea behind
> > > > > > > > this is to better understand general and anonymous Ignite
> > > > > > > > usage information to better calibrate community efforts in
> > > > > > > > developing new features, improving existing ones, delivering
> > > > > > > > better documentation - and in every other way to make our
> > > > > > > > project a better software solution.
> > > > > > > >
> > > > > > > > Although such instrumentation is standard practice in
> > > > > > > > commercially developed software, for an ASF project this
> > > > > > > > could be a sensitive issue. Therefore I would like to
> > > > > > > > initiate a full community discussion on how best to implement
> > > > > > > > such practice for the benefit of the project while ensuring
> > > > > > > > the privacy protection of Ignite users.
> > > > > > > >
> > > > > > > > To ignite (pun intended) the discussion I'll outline below
> > > > > > > > some of the basic thoughts that I have on this subject. They
> > > > > > > > are here only to give an idea of what such instrumentation
> > > > > > > > may potentially look like so that we can discuss the merits
> > > > > > > > of this idea in a tangible context.
> > > > > > > >
> > > > > > > > Overview
> > > > > > > > --------
> > > > > > > > Upon start and every hour thereafter each Ignite node will
> > > > > > > > collect, encrypt and send usage statistics over HTTPS to the
> > > > > > > > ASF-hosted server. That server will accept such HTTPS
> > > > > > > > packets, decrypt them and store them in a time-series DB. A
> > > > > > > > web interface will be provided to view the usage information.
> > > > > > > >
> > > > > > > > Opt-In or Opt-out
> > > > > > > > -----------------
> > > > > > > > Opt-out. Ignite website will offer simple instructions
> > > > > > > > (system property) on how to disable this instrumentation.
> > > > > > > >
> > > > > > > > Code, Infra, Access
> > > > > > > > -------------------
> > > > > > > > Ignite instrumentation will be part of the Ignite code base.
> > > > > > > > The collection server will be a separate module in the Ignite
> > > > > > > > code base (released separately from Ignite).
> > > > > > > > The collection server will be hosted by ASF Infra.
> > > > > > > >
> > > > > > > > Usage statistics will be publicly accessible by anyone in the
> > > > > > > > community.
> > > > > > > >
> > > > > > > > Private, Personal Data
> > > > > > > > ----------------------
> > > > > > > > No private or personal data will ever be transferred. No
> > > > > > > > emails, usernames, company names, grid names, etc.
> > > > > > > >
> > > > > > > > Data Retention
> > > > > > > > --------------
> > > > > > > > All data will be retained for 1 year and deleted permanently
> > > > > > > > thereafter.
> > > > > > > >
> > > > > > > > Usage Data
> > > > > > > > ----------
> > > > > > > > The following data will be collected in each packet sent to
> > > > > > > > the collection server:
> > > > > > > > - GRID_SIZE (to align our testing environment with the more
> > > > > > > >   frequent cluster sizes)
> > > > > > > > - IP_ADDR (for general geo-tracking as well as to know what
> > > > > > > >   documentation language should be a priority)
> > > > > > > > - SES_ID (to track continuous uptime vs. re-starts)
> > > > > > > > - USERNAME_TYPE (privileged username vs. standard, to track
> > > > > > > >   production vs. dev/testing usage; note - this is not an
> > > > > > > >   actual username)
> > > > > > > > - OS_NAME
> > > > > > > > - OS_VER
> > > > > > > > - OS_ARCH
> > > > > > > > - JAVA_VER
> > > > > > > > - JAVA_VENDOR
> > > > > > > > - COMP_SQL (whether or not this feature was used)
> > > > > > > > - COMP_COMPUTE (whether or not this feature was used)
> > > > > > > > - COMP_DATAGRID (whether or not this feature was used)
> > > > > > > > - COMP_STREAMING (whether or not this feature was used)
> > > > > > > > - COMP_IGFS (whether or not this feature was used)
> > > > > > > > - COMP_SERVICE (whether or not this feature was used)
> > > > > > > > - COMP_PERSISTENCE (whether or not this feature was used)
> > > > > > > >
> > > > > > > > Please let's discuss this idea. Everyone's comments and
> > > > > > > > suggestions are *extremely* welcome.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Nikita Ivanov.
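
For reference, the reduced format suggested at the top of this thread (one report per node startup carrying only a component name and an environment hash, with a system property to turn it off) could be sketched roughly as below. This is only an illustration under stated assumptions: the class `UsagePing`, the property name `IGNITE_USAGE_STATS_DISABLED`, the choice of SHA-256, the seed being an installation-local random value, and the `component=...&env=...` layout are all hypothetical and are not part of Ignite or of any agreed design. Actually sending the report is left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; none of these names exist in the Ignite code base. */
class UsagePing {
    /** Assumed opt-out property name (the thread only says "a single system property"). */
    static final String OPT_OUT_PROP = "IGNITE_USAGE_STATS_DISABLED";

    /** Reporting is on unless the opt-out system property is set to "true". */
    static boolean enabled() {
        return !Boolean.getBoolean(OPT_OUT_PROP);
    }

    /** SHA-256 hex digest of an installation-local seed, e.g. a random UUID stored on first start. */
    static String envHash(String seed) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(seed.getBytes(StandardCharsets.UTF_8));

            StringBuilder hex = new StringBuilder();

            for (byte b : digest)
                hex.append(String.format("%02x", b));

            return hex.toString();
        }
        catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** The whole report: component name plus environment hash, nothing else. */
    static String payload(String component, String envHash) {
        return "component=" + component + "&env=" + envHash;
    }
}
```

Under these assumptions a node would check `UsagePing.enabled()` once at startup and, if it returns true, build one `payload(...)` per component in use; starting the JVM with `-DIGNITE_USAGE_STATS_DISABLED=true` would suppress reporting entirely.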