Re: usage analytics
Makes sense to me. I would love to know which components/APIs are used more than others. Obviously, we should make sure everything is anonymous and we don't collect any private user data, but I believe this is already guaranteed by Google Analytics.

-Val

On Tue, Nov 3, 2020 at 3:59 AM Alexey Goncharuk wrote:

> Folks,
>
> I want to bump up this discussion and slightly change the format suggested
> by Nikita. I don't think it is correct to gather any information related to
> the user environment. However, can we collect just the fact of some of the
> Ignite APIs/subsystems being used with no user information whatsoever?
> Having started thinking about Ignite 3.0 I realized that we lack even some
> very basic knowledge on the impact of changing one or another feature or
> API.
>
> To my knowledge, the Ignite website already uses Google Analytics, which is
> available to the community. The Google Analytics platform already has
> tooling to track app screen hits in a completely anonymous way, so we can
> use this tool to track Ignite component usage (once per node startup),
> sending solely the component name and a unique environment hash - no IP
> addresses, no JDK/OS/other information. The information will be available
> in the same toolkit we are already using to analyze the website and
> optimize our docs.
>
> WDYT?
Re: usage analytics
Folks,

I want to bump up this discussion and slightly change the format suggested by Nikita. I don't think it is correct to gather any information related to the user environment. However, can we collect just the fact of some of the Ignite APIs/subsystems being used with no user information whatsoever? Having started thinking about Ignite 3.0 I realized that we lack even some very basic knowledge on the impact of changing one or another feature or API.

To my knowledge, the Ignite website already uses Google Analytics, which is available to the community. The Google Analytics platform already has tooling to track app screen hits in a completely anonymous way, so we can use this tool to track Ignite component usage (once per node startup), sending solely the component name and a unique environment hash - no IP addresses, no JDK/OS/other information. The information will be available in the same toolkit we are already using to analyze the website and optimize our docs.

WDYT?

Wed, Jul 19, 2017 at 01:15:

> I would try to ping legal again and see if they respond. If not, I think
> we will need to come up with a simpler approach that does not require
> legal approval.
>
> D.
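For reference, the "component name plus unique environment hash" event Alexey describes could be sketched roughly as follows. This is only an illustration under assumptions: the class and method names, the field keys `component` and `envHash`, and the idea of hashing a random, locally generated installation ID with SHA-256 are all hypothetical and not part of any Ignite or Google Analytics API. The thread does not say what the hash would be derived from. The sketch only shows that nothing about the host (IP, JDK, OS) has to enter the payload.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of an anonymous "component used" event. Not an Ignite API. */
final class ComponentUsageEvent {
    private ComponentUsageEvent() {}

    /** Random installation ID (assumption: generated once and stored locally). */
    static String newInstallationId() {
        return UUID.randomUUID().toString();
    }

    /** SHA-256 of the installation ID as lowercase hex; no host data is involved. */
    static String environmentHash(String installationId) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(installationId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Builds the payload: exactly two fields, component name and environment hash. */
    static Map<String, String> build(String componentName, String envHash) {
        Map<String, String> event = new LinkedHashMap<>();
        event.put("component", componentName);
        event.put("envHash", envHash);
        return event;
    }

    public static void main(String[] args) {
        String hash = environmentHash(newInstallationId());
        System.out.println(build("SQL", hash));
    }
}
```

Such an event would be built once per node startup for each component in use; how it is actually delivered to the analytics backend is left open here, as it is in the thread.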
Re: usage analytics
I would try to ping legal again and see if they respond. If not, I think we will need to come up with a simpler approach that does not require legal approval.

D.

On Jul 18, 2017, at 2:23 PM, Nikita Ivanov wrote:

> Igniters,
> Just a quick update. I haven't gotten a response from ASF Legal on this
> thread and I frankly don't know how to proceed here. What's the process
> to arrive at a decision point here?
>
> Thanks!
> --
> Nikita Ivanov
Re: usage analytics
Igniters,

Just a quick update. I haven't gotten a response from ASF Legal on this thread and I frankly don't know how to proceed here. What's the process to arrive at a decision point here?

Thanks!
--
Nikita Ivanov

On Mon, Jul 10, 2017 at 3:11 PM, Konstantin Boudnik wrote:

> On Sat, Jul 08, 2017 at 11:04AM, Nikita Ivanov wrote:
> > Cos,
> > Based on my experience, having it off by default negates the entire
> > purpose... We need a statistically meaningful data set to make any
> > inferences from it. Moreover, if we are going to ask folks to turn it on,
> > it will significantly skew the resulting data set anyway and won't show
> > the full picture. I think "on" by default is the better option if we are
> > to collect usage stats to begin with.
>
> yes, sure. But having this "on" by default is likely to expose us to another
> shit-storm down the road. An interesting dilemma to have indeed. In my
> experience, whenever I install something like a browser or an operating
> system, it would ask if I want to make the particular piece of software
> better by sending back some anonymized stats. Basically, I am given a way to
> explicitly opt out if I wish.
>
> Turning the feature "on" by default is like saying: "we'll be collecting
> some stats, but if you don't want to you can go here and there and disable
> the collection. Oh, and by the way - you need to go and figure out the exact
> steps to disable it."
>
> > Also, I want to re-iterate it again to avoid misunderstanding: there is
> > no proposal nor will there be a technical way to attribute collected data
> > back to a certain company. That's not what this is all about. We should
> > only be interested in aggregated stats (community size, geo information,
> > language information, components usage).
>
> Yes, I think it is clear, but never hurts to re-iterate.
>
> Cos
>
> > Thoughts?
> >
> > --
> > Nikita Ivanov
> > Founder & CTO
> > GridGain Systems
Re: usage analytics
Cos,

Based on my experience, having it off by default negates the entire purpose... We need a statistically meaningful data set to make any inferences from it. Moreover, if we are going to ask folks to turn it on, it will significantly skew the resulting data set anyway and won't show the full picture. I think "on" by default is the better option if we are to collect usage stats to begin with.

Also, I want to re-iterate it again to avoid misunderstanding: there is no proposal nor will there be a technical way to attribute collected data back to a certain company. That's not what this is all about. We should only be interested in aggregated stats (community size, geo information, language information, components usage).

Thoughts?

--
Nikita Ivanov
Founder & CTO
GridGain Systems

On Fri, Jul 7, 2017 at 8:17 PM, Konstantin Boudnik wrote:

> Actually, that should be OFF by default. It sounds like this would reduce
> the amount of the data collected, but this would address the concerns of
> companies like Roman's. I know for sure that a few of my clients would sue
> my ass out of existence if I gave them the platform collecting their
> data-centers info.
>
> Let's have it, set it off by default and document an easy way to turn it
> on. Then start making rounds asking our user base to share _some_ of the
> stats with the community, so we can track the growth of the install base,
> etc.
>
> Cos
>
> On Thu, Jul 06, 2017 at 08:20AM, Nikita Ivanov wrote:
> > The idea so far is to have a single system property in configuration that
> > turns this off completely. I envision that this will be prominently
> > featured on Ignite website so that everyone who would like to disable it
> > can do it in seconds.
> >
> > Thoughts?
> >
> > --
> > Nikita Ivanov
> > Founder & CTO
> > GridGain Systems
Re: usage analytics
The idea so far is to have a single system property in configuration that turns this off completely. I envision that this will be prominently featured on Ignite website so that everyone who would like to disable it can do it in seconds.

Thoughts?

--
Nikita Ivanov
Founder & CTO
GridGain Systems

On Wed, Jul 5, 2017 at 9:27 PM, Roman Shtykh wrote:

> Nikita,
>
> Sending and storing (somewhere the company cannot securely handle) any
> information (OS version, IP addresses, etc.) that can be used to compromise
> the services would be unacceptable.
> Turning it off might be ok (possibly through the cluster settings, not via
> a globally-accessible site), but the fact that there is a risk some
> information can leak outside (for any reason, starting from a human
> mistake) is scary.
>
> -- Roman
>
>
> On Thursday, July 6, 2017 12:38 PM, Nikita Ivanov wrote:
>
> Roman,
> Thanks for the feedback. What are those questions specifically? Are IP
> addresses and OS what is causing it?
>
> Thanks!
>
> --
> Nikita Ivanov
> Founder & CTO
> GridGain Systems
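To make the "on by default" versus "off by default" disagreement concrete, here is a minimal sketch of a single system property acting as the switch. The property name `ignite.usage.stats.enabled` and the class are hypothetical; the thread never names the actual property, and this is not an existing Ignite API. The only difference between the opt-out model Nikita proposes and the opt-in model Cos prefers is the default applied when the property is not set.

```java
/** Hypothetical sketch of the on/off switch discussed in the thread. Not an Ignite API. */
final class UsageStatsSwitch {
    /** Hypothetical property name; the thread does not name the real one. */
    static final String PROP = "ignite.usage.stats.enabled";

    private UsageStatsSwitch() {}

    /**
     * Returns whether usage collection is enabled.
     *
     * @param enabledByDefault true models opt-out ("on" unless disabled),
     *                         false models opt-in ("off" unless enabled)
     */
    static boolean isEnabled(boolean enabledByDefault) {
        String value = System.getProperty(PROP);
        if (value == null) {
            // Property not set: fall back to the chosen default.
            return enabledByDefault;
        }
        return Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        System.out.println("opt-out model enabled: " + isEnabled(true));
        System.out.println("opt-in model enabled:  " + isEnabled(false));
    }
}
```

Under the opt-out model a user would start the node with `-Dignite.usage.stats.enabled=false` to disable collection; under the opt-in model nothing is sent unless the user passes `-Dignite.usage.stats.enabled=true`.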
Re: usage analytics
With such statistics collected by Ignite, we won't ever accept Ignite in our environment. However, the ability to turn stats collection on and off would be helpful if the feature is accepted for implementation.

Take Care,
Rishi

> On Jul 5, 2017, at 8:15 PM, Roman Shtykh wrote:
>
> Nikita,
>
> While this will help improve Ignite, it will prevent its adoption by many
> projects -- sending and retaining IP addresses, OS versions, etc. raises tons
> of questions when considering Ignite, even if it can be opted out of.
>
> -- Roman
Re: usage analytics
Roman,

Thanks for the feedback. What are those questions specifically? Is it the IP addresses and OS details that are causing them?

Thanks!
--
Nikita Ivanov
Founder & CTO
GridGain Systems

On Wed, Jul 5, 2017 at 6:15 PM, Roman Shtykh wrote:

> Nikita,
>
> While this will help improve Ignite, it will prevent its adoption by many
> projects -- sending and retaining IP addresses, OS versions, etc. raises
> tons of questions when considering using Ignite. Even if it can be opted
> out.
> -- Roman
Re: usage analytics
Nikita,

While this will help improve Ignite, it will prevent its adoption by many projects -- sending and retaining IP addresses, OS versions, etc. raises tons of questions when considering using Ignite. Even if it can be opted out.

-- Roman

On Thursday, July 6, 2017 5:38 AM, Nikita Ivanov wrote:

> Igniters,
> I would like to kick off the discussion on the idea of collecting Ignite
> usage statistics. The basic idea behind this is to better understand
> general and anonymous Ignite usage information to better calibrate
> community efforts in developing new features, improving existing ones,
> delivering better documentation - and in every other way to make our
> project a better software solution.
usage analytics
Igniters,

I would like to kick off the discussion on the idea of collecting Ignite usage statistics. The basic idea behind this is to better understand general and anonymous Ignite usage information to better calibrate community efforts in developing new features, improving existing ones, delivering better documentation - and in every other way to make our project a better software solution.

Although such instrumentation is standard practice in commercially developed software, for an ASF project this could be a sensitive issue. Therefore I would like to initiate a full community discussion on how best to implement such a practice for the benefit of the project while ensuring the privacy protection of Ignite users.

To ignite (pun intended) the discussion I'll outline below some of the basic thoughts that I have on this subject. They are here only to give an idea of what such instrumentation may potentially look like so that we can discuss the merits of this idea in a tangible context.

Overview
--------
Upon start and every hour thereafter each Ignite node will collect, encrypt and send usage statistics over HTTPS to the ASF-hosted server. That server will accept such HTTPS packets, decrypt them and store them in a time-series DB. A web interface will be provided to view the usage information.

Opt-In or Opt-out
-----------------
Opt-out. The Ignite website will offer simple instructions (system property) on how to disable this instrumentation.

Code, Infra, Access
-------------------
Ignite instrumentation will be part of the Ignite code base. The collection server will be a separate module in the Ignite code base (released separately from Ignite). The collection server will be hosted by ASF Infra.

Usage statistics will be publicly accessible by anyone in the community.

Private, Personal Data
----------------------
No private or personal data will ever be transferred. No emails, usernames, company names, grid names, etc.

Data Retention
--------------
All data will be retained for 1 year and deleted permanently thereafter.

Usage Data
----------
The following data will be collected in each packet sent to the collection server:
- GRID_SIZE (to align our testing environments with the most common cluster sizes)
- IP_ADDR (for general geo-tracking as well as to know what documentation language should be a priority)
- SES_ID (to track continuous uptime vs. restarts)
- USERNAME_TYPE (privileged username vs. standard, to track production vs. dev/testing usage; note - this is not an actual username)
- OS_NAME
- OS_VER
- OS_ARCH
- JAVA_VER
- JAVA_VENDOR
- COMP_SQL (whether or not this feature was used)
- COMP_COMPUTE (whether or not this feature was used)
- COMP_DATAGRID (whether or not this feature was used)
- COMP_STREAMING (whether or not this feature was used)
- COMP_IGFS (whether or not this feature was used)
- COMP_SERVICE (whether or not this feature was used)
- COMP_PERSISTENCE (whether or not this feature was used)

Please let's discuss this idea. Everyone's comments and suggestions are *extremely* welcome.

Thanks,
Nikita Ivanov.
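
To make the proposal above more tangible, here is a minimal Java sketch of how a node might honor the opt-out system property and assemble one usage packet with the listed fields. It is only an illustration under stated assumptions: the class name `UsageStatsSketch`, the property name `IGNITE_USAGE_STATS_DISABLED`, and the method signatures are hypothetical and do not exist in Ignite. The sketch also assumes that IP_ADDR would not be put into the packet by the node, since a server receiving the HTTPS request would see the sender address on the connection. Encryption, the hourly schedule and the HTTPS call itself are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Illustrative sketch only; none of these names exist in Ignite. */
final class UsageStatsSketch {
    /** Hypothetical opt-out system property (the proposal does not name one). */
    static final String DISABLE_PROP = "IGNITE_USAGE_STATS_DISABLED";

    /** Component flags listed in the proposal (sent as COMP_<name>). */
    static final String[] COMPONENTS = {
        "SQL", "COMPUTE", "DATAGRID", "STREAMING", "IGFS", "SERVICE", "PERSISTENCE"
    };

    /** Opt-out semantics: collection is on unless the property is set to "true". */
    static boolean enabled() {
        return !Boolean.getBoolean(DISABLE_PROP);
    }

    /**
     * Builds one packet with the proposed fields.
     * IP_ADDR is intentionally absent (assumption: the server derives it
     * from the incoming connection). No usernames or grid names are read.
     */
    static Map<String, String> buildPacket(int gridSize, String sesId,
                                           boolean privilegedUser,
                                           Set<String> usedComponents) {
        Map<String, String> packet = new LinkedHashMap<>();
        packet.put("GRID_SIZE", Integer.toString(gridSize));
        packet.put("SES_ID", sesId);
        // Only the kind of account is recorded, never the actual username.
        packet.put("USERNAME_TYPE", privilegedUser ? "privileged" : "standard");
        packet.put("OS_NAME", System.getProperty("os.name"));
        packet.put("OS_VER", System.getProperty("os.version"));
        packet.put("OS_ARCH", System.getProperty("os.arch"));
        packet.put("JAVA_VER", System.getProperty("java.version"));
        packet.put("JAVA_VENDOR", System.getProperty("java.vendor"));
        for (String c : COMPONENTS)
            packet.put("COMP_" + c, Boolean.toString(usedComponents.contains(c)));
        return packet;
    }

    public static void main(String[] args) {
        if (!enabled()) {
            System.out.println("Usage statistics disabled via -D" + DISABLE_PROP + "=true");
            return;
        }
        // A random session id lets the server tell restarts from continuous uptime.
        Map<String, String> packet = buildPacket(
            4, UUID.randomUUID().toString(), false, Set.of("SQL", "DATAGRID"));
        packet.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With this shape, a user who objects to collection would start the node with `-DIGNITE_USAGE_STATS_DISABLED=true` (again, a made-up property name), and nothing would be assembled or sent.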