I would try to ping legal again and see if they respond. If not, I think we will need to come up with a simpler approach that does not require legal approval.
D.

On Jul 18, 2017, at 2:23 PM, Nikita Ivanov <nivano...@gmail.com> wrote:
>Igniters,
>Just a quick update. I haven't gotten a response from ASF Legal on this thread, and I frankly don't know how to proceed here. What's the process to arrive at a decision point here?
>
>Thanks!
>--
>Nikita Ivanov
>
>
>On Mon, Jul 10, 2017 at 3:11 PM, Konstantin Boudnik <c...@apache.org> wrote:
>
>> On Sat, Jul 08, 2017 at 11:04AM, Nikita Ivanov wrote:
>> > Cos,
>> > Based on my experience, having it off by default negates the entire purpose... We need a statistically meaningful data set to make any inferences from it. Moreover, if we are going to ask folks to turn it on, it will significantly skew the resulting data set anyway and not show the full picture. I think "on" by default is the better option if we are to collect usage stats to begin with.
>>
>> Yes, sure. But having this "on" by default is likely to expose us to another shit-storm down the road. An interesting dilemma to have indeed. In my experience, whenever I install something like a browser or an operating system, it asks if I want to make that particular piece of software better by sending back some anonymized stats. Basically, I am given a way to explicitly opt out if I wish.
>>
>> Turning the feature "on" by default is like saying: "we'll be collecting some stats, but if you don't want that you can go here and there and disable the collection. Oh, and by the way - you need to go and figure out the exact steps to disable it."
>>
>> > Also, I want to reiterate to avoid misunderstanding: there is no proposal, nor will there be a technical way, to attribute collected data back to a certain company. That's not what this is all about. We should only be interested in aggregated stats (community size, geo information, language information, components usage).
>>
>> Yes, I think it is clear, but it never hurts to reiterate.
>>
>> Cos
>>
>> > Thoughts?
>> >
>> > --
>> > Nikita Ivanov
>> > Founder & CTO
>> > GridGain Systems
>> >
>> > On Fri, Jul 7, 2017 at 8:17 PM, Konstantin Boudnik <c...@apache.org> wrote:
>> >
>> > > Actually, that should be OFF by default. It sounds like this would reduce the amount of data collected, but it would address the concerns of companies like Roman's. I know for sure that a few of my clients would sue my ass out of existence if I gave them a platform collecting their data-center info.
>> > >
>> > > Let's have it, set it off by default, and document an easy way to turn it off.
>> > > Then start making rounds asking our user base to share _some_ of the stats with the community, so we can track the growth of the install base, etc.
>> > >
>> > > Cos
>> > >
>> > > On Thu, Jul 06, 2017 at 08:20AM, Nikita Ivanov wrote:
>> > > > The idea so far is to have a single system property in the configuration that turns this off completely. I envision that this will be prominently featured on the Ignite website so that everyone who would like to disable it can do so in seconds.
>> > > >
>> > > > Thoughts?
>> > > >
>> > > > --
>> > > > Nikita Ivanov
>> > > > Founder & CTO
>> > > > GridGain Systems
>> > > >
>> > > > On Wed, Jul 5, 2017 at 9:27 PM, Roman Shtykh <rsht...@yahoo.com> wrote:
>> > > >
>> > > > > Nikita,
>> > > > >
>> > > > > Sending and storing (somewhere the company cannot securely handle) any information (OS version, IP addresses, etc.) that can be used to compromise the services would be unacceptable.
>> > > > > Turning it off might be OK (possibly through the cluster settings, not via a globally accessible site), but the fact that there's a risk that some information can leak outside (for any reason, starting from a human mistake) is scary.
>> > > > >
>> > > > > -- Roman
>> > > > >
>> > > > >
>> > > > > On Thursday, July 6, 2017 12:38 PM, Nikita Ivanov <niva...@gridgain.com> wrote:
>> > > > >
>> > > > >
>> > > > > Roman,
>> > > > > Thanks for the feedback. What are those questions specifically? Are IP addresses and OS what is causing it?
>> > > > >
>> > > > > Thanks!
>> > > > >
>> > > > > --
>> > > > > Nikita Ivanov
>> > > > > Founder & CTO
>> > > > > GridGain Systems
>> > > > >
>> > > > > On Wed, Jul 5, 2017 at 6:15 PM, Roman Shtykh <rsht...@yahoo.com.invalid> wrote:
>> > > > >
>> > > > > Nikita,
>> > > > >
>> > > > > While this will help improve Ignite, it will prevent its adoption by many projects -- sending and retaining IP addresses, OS versions, etc. raises tons of questions when considering whether to use Ignite, even if it can be opted out of.
>> > > > > -- Roman
>> > > > >
>> > > > >
>> > > > > On Thursday, July 6, 2017 5:38 AM, Nikita Ivanov <nivano...@gmail.com> wrote:
>> > > > >
>> > > > >
>> > > > > Igniters,
>> > > > > I would like to kick off the discussion on the idea of collecting Ignite usage statistics. The basic idea behind this is to better understand general and anonymous Ignite usage information to better calibrate community efforts in developing new features, improving existing ones, delivering better documentation - and in every other way to make our project a better software solution.
>> > > > >
>> > > > > Although such instrumentation is standard practice in commercially developed software, for an ASF project this could be a sensitive issue. Therefore, I would like to initiate a full community discussion on how best to implement such a practice for the benefit of the project while ensuring the privacy protection of Ignite users.
>> > > > >
>> > > > > To ignite (pun intended) the discussion, I'll outline below some of the basic thoughts that I have on this subject. They are here only to give an idea of what such instrumentation may potentially look like so that we can discuss the merits of this idea in a tangible context.
>> > > > >
>> > > > > Overview
>> > > > > -------------
>> > > > > Upon start and every hour thereafter, each Ignite node will collect, encrypt and send usage statistics over HTTPS to the ASF-hosted server. That server will accept such HTTPS packets, decrypt them and store them in a time-series DB. A web interface will be provided to view the usage information.
>> > > > >
>> > > > > Opt-In or Opt-out
>> > > > > -------------------------
>> > > > > Opt-out. The Ignite website will offer simple instructions (system property) on how to disable this instrumentation.
>> > > > >
>> > > > > Code, Infra, Access
>> > > > > ---------------------------
>> > > > > Ignite instrumentation will be part of the Ignite code base. The collection server will be a separate module in the Ignite code base (released separately from Ignite). The collection server will be hosted by ASF Infra.
>> > > > >
>> > > > > Usage statistics will be publicly accessible by anyone in the community.
>> > > > >
>> > > > > Private, Personal Data
>> > > > > ------------------------------
>> > > > > No private or personal data will ever be transferred. No emails, usernames, company names, grid names, etc.
>> > > > >
>> > > > > Data Retention
>> > > > > --------------------
>> > > > > All data will be retained for 1 year and deleted permanently thereafter.
>> > > > >
>> > > > > Usage Data
>> > > > > ----------------
>> > > > > The following data will be collected in each packet sent to the collection server:
>> > > > > - GRID_SIZE (to align our testing environment with the most common cluster sizes)
>> > > > > - IP_ADDR (for general geo-tracking as well as to know what documentation language should be a priority)
>> > > > > - SES_ID (to track continuous uptime vs. restarts)
>> > > > > - USERNAME_TYPE (privileged username vs. standard, to track production vs. dev/testing usage; note - this is not an actual username)
>> > > > > - OS_NAME
>> > > > > - OS_VER
>> > > > > - OS_ARCH
>> > > > > - JAVA_VER
>> > > > > - JAVA_VENDOR
>> > > > > - COMP_SQL (whether or not this feature was used)
>> > > > > - COMP_COMPUTE (whether or not this feature was used)
>> > > > > - COMP_DATAGRID (whether or not this feature was used)
>> > > > > - COMP_STREAMING (whether or not this feature was used)
>> > > > > - COMP_IGFS (whether or not this feature was used)
>> > > > > - COMP_SERVICE (whether or not this feature was used)
>> > > > > - COMP_PERSISTENCE (whether or not this feature was used)
>> > > > >
>> > > > > Please let's discuss this idea. Everyone's comments and suggestions are *extremely* welcome.
>> > > > >
>> > > > > Thanks,
>> > > > > Nikita Ivanov.
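
To make the proposal quoted above a bit more tangible, here is a minimal sketch of how a node could assemble one usage packet from the listed fields while honoring the opt-out property. It is only an illustration under several assumptions that are not part of the thread: the property name `IGNITE_USAGE_STATS_DISABLED` is hypothetical (the proposal only says "a single system property"), a plain Python mapping stands in for JVM system properties, the function names are made up, and the encryption, HTTPS transport and hourly scheduling are left out entirely.

```python
import json
from typing import Iterable, Mapping, Optional

# Hypothetical property name: the thread only says "a single system property".
OPT_OUT_PROPERTY = "IGNITE_USAGE_STATS_DISABLED"

# Component flags listed in the proposal (sent as COMP_<NAME>).
COMPONENTS = ("SQL", "COMPUTE", "DATAGRID", "STREAMING",
              "IGFS", "SERVICE", "PERSISTENCE")


def stats_enabled(props: Mapping[str, str]) -> bool:
    """Opt-out model: collection stays on unless the property is 'true'."""
    return props.get(OPT_OUT_PROPERTY, "false").strip().lower() != "true"


def build_packet(props: Mapping[str, str], *, grid_size: int, ip_addr: str,
                 session_id: str, privileged_user: bool, os_name: str,
                 os_ver: str, os_arch: str, java_ver: str, java_vendor: str,
                 used_components: Iterable[str]) -> Optional[dict]:
    """Return the usage packet as a dict, or None when the node opted out."""
    if not stats_enabled(props):
        return None
    used = set(used_components)
    unknown = used - set(COMPONENTS)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    packet = {
        "GRID_SIZE": grid_size,
        "IP_ADDR": ip_addr,
        "SES_ID": session_id,
        # Only the account type is reported, never the actual username.
        "USERNAME_TYPE": "privileged" if privileged_user else "standard",
        "OS_NAME": os_name,
        "OS_VER": os_ver,
        "OS_ARCH": os_arch,
        "JAVA_VER": java_ver,
        "JAVA_VENDOR": java_vendor,
    }
    # One boolean per component: whether the feature was used on this node.
    for name in COMPONENTS:
        packet["COMP_" + name] = name in used
    return packet


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = dict(grid_size=4, ip_addr="203.0.113.7", session_id="a1b2c3",
                  privileged_user=False, os_name="Linux", os_ver="4.4",
                  os_arch="amd64", java_ver="1.8", java_vendor="Oracle",
                  used_components=["SQL", "DATAGRID"])
    print(json.dumps(build_packet({}, **sample), indent=2))
```

Switching between the opt-out model proposed here and the off-by-default model Cos suggested would, in this sketch, only change the default returned by `stats_enabled`; the packet contents stay the same either way.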