Hi,

On Tuesday, 18.04.2023 at 07:48 +0200, Jan Lahoda wrote:
> I apologize for being contrarian, but since the index download
> started for me (again) while on a bus with a very poor internet
> connection, I guess I should tell you my view.

no reason to apologize.

> Unless I am mistaken, the index gz currently has roughly 1.9GB, and
> it takes several minutes to actually create the Lucene index from
> it, consuming some more space and CPU.
>
> To be honest, it never seemed very polite to me to download and
> process so much without asking.
>
> I guess alternatives that I would see would include (combination of
> options possible):
> - explicitly ask before downloading (possibly allowing the user to
>   select auto-download)

Yes, if people are notified that they'll get the full index locally,
then I'm OK with that. I see a problem if features silently give
outdated answers or don't work at all. Otherwise we'll get "NetBeans
suggested version X, but Y is already on Central, why is this not
current?".

> - have the features that use the index do some query on a server, if
>   there isn't a downloaded index (or if it is stale/obsolete)

IMHO this highly depends on the speed of the API. If the latency is
high, the next bug report will be "It takes ages until my POM tells me
that it is outdated".
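For illustration, such a server-side query could look roughly like the
sketch below. This is only a sketch against the public
https://search.maven.org/solrsearch/select endpoint (Central's search
REST API); the class and method names are made up and this is not an
actual NetBeans implementation:

    import java.net.URI;
    import java.net.URLEncoder;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;
    import java.time.Duration;

    // Sketch only: ask the Central search service for the newest version of
    // one artifact instead of consulting a locally downloaded index.
    // Error handling and JSON parsing are reduced to the bare minimum.
    public class RemoteVersionCheck {

        public static String queryLatestVersionJson(String groupId, String artifactId)
                throws Exception {
            String query = URLEncoder.encode(
                    "g:\"" + groupId + "\" AND a:\"" + artifactId + "\"",
                    StandardCharsets.UTF_8);
            URI uri = URI.create(
                    "https://search.maven.org/solrsearch/select?q=" + query
                    + "&rows=1&wt=json");
            HttpClient client = HttpClient.newBuilder()
                    .connectTimeout(Duration.ofSeconds(5)) // keep the latency bounded
                    .build();
            HttpRequest request = HttpRequest.newBuilder(uri)
                    .timeout(Duration.ofSeconds(5))
                    .GET()
                    .build();
            // The returned JSON carries a "latestVersion" field per document;
            // a real implementation would parse that instead of returning raw JSON.
            return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        }

        public static void main(String[] args) throws Exception {
            System.out.println(queryLatestVersionJson("org.apache.commons", "commons-lang3"));
        }
    }

Whether that round trip is fast enough for editor hints is exactly the
latency question above. As far as I know the same endpoint also answers
SHA1 checksum queries (q=1:"<sha1>"), which is relevant for the
search-by-SHA1 feature mentioned further down.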
> - given that https://github.com/apache/netbeans/pull/4999 produces a
>   smaller index, we could have a download location (server) at least
>   for maven central that would serve this optimized index. If I
>   understand it properly, the smallest index under that PR is 0.8GB,
>   and if it would compress reasonably well, it might be (say) 0.5GB
>   compressed - much better than 1.9GB, and no significant CPU usage
>   after the index is downloaded. (Even if it was 0.8GB, it is still
>   much better than 1.9GB+CPU churn.)

Truncating the index needs to be done carefully. NetBeans has a search
by SHA1 (or MD5?) feature. That will break if you remove that data
from the index. A similar situation will arise if arbitrary cut-offs
are done based on time. Consider a library that implements some
interesting algorithm and just works the same even after years. If we
cut the index at 6 months, for example, that artifact won't be found
anymore.

> There was also an argument on conserving the ASF resources in another
> discussion recently. If I consider there would be (only) 10 000
> installations of NetBeans, with the default setting to download the
> index once a week, it is almost 20TB of data every week if I count
> correctly.

+ the CPU cycles to convert the index on users' machines.

> It seems there may be a way to conserve the ASF resources and provide
> better experience to the users at the same time.

The download is from Sonatype's CDN. Given that they actively
discourage Central mirrors, I don't have too much concern here. It is
also not the resources of the ASF that are used.

Greetings

Matthias

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@netbeans.apache.org
For additional commands, e-mail: dev-h...@netbeans.apache.org
For further information about the NetBeans mailing lists, visit:
https://cwiki.apache.org/confluence/display/NETBEANS/Mailing+lists