Re: [DISCUSS] maven remote repo indexer improvements

2023-04-18 Thread Michael Bien

Hi Jakub,

not sure, since tmp is generally the best place for temporary files.
Other places would require periodic cleanups in case something goes wrong.
The temp folder is only used during extraction; the final index is in
your cache folder.



(this PR would give us more control over some of the locations too:

https://github.com/apache/maven-indexer/pull/302)



if you can't or don't want to increase your temp folder size, you can 
tell the JVM to use something else.



put this into your netbeans_default_options of netbeans.conf:


-J-Djava.io.tmpdir=/tmp/myothertmp


please make sure this folder is empty!
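
A quick way to confirm which temp folder the JVM actually picked up, whether it is empty, and how much space it has. This is only a minimal sketch using standard java.nio.file calls; the class name TmpDirCheck is made up for illustration and is not part of NetBeans.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of NetBeans or maven-indexer.
final class TmpDirCheck {

    /** Returns true if the path is an existing directory with no entries. */
    static boolean isEmptyDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }

    /** Usable bytes on the file store that holds the given directory. */
    static long usableBytes(Path dir) throws IOException {
        return Files.getFileStore(dir).getUsableSpace();
    }

    public static void main(String[] args) throws IOException {
        // java.io.tmpdir is the property set via -J-Djava.io.tmpdir=...
        Path tmp = Path.of(System.getProperty("java.io.tmpdir"));
        System.out.println("java.io.tmpdir = " + tmp);
        System.out.println("empty          = " + isEmptyDirectory(tmp));
        System.out.println("usable bytes   = " + usableBytes(tmp));
    }
}
```

Run with the same -Djava.io.tmpdir value (without the -J prefix, which is specific to the NetBeans launcher) to see what the IDE's JVM would see.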


-mbien


On 18.04.23 21:46, Jakub Herkel wrote:

I would like to ask whether there is (or will be) a configuration
option (via a property, for example) to choose the directory in which
remote index processing takes place.
I have a problem on my notebook: all processing happens in the /tmp
directory (tmpfs), which is 8GB in size, and it always stops with an
out-of-space exception. It is a little bit awkward if every opening of a
pom.xml results in downloading the remote index and processing it
without success.

best regards

jakub

On Tue, Apr 18, 2023 at 9:11 PM Michael Bien  wrote:

my thoughts so far on how I wanted to implement it:

   - time cutoff filter would be optional, configurable in the usual
maven/indexer options. Quick tests showed that full/2years/1year might
be reasonable values

   - sha1 filter would be applied by default, this cuts index size in
half and currently no NB feature is running sha1 queries, this would
also be a candidate for substitution via an online service - if we really
don't need it we deprecate the queries. Hashes compress badly.

   - multi threaded extraction would be optional, potentially enabled by
default. This has a slight index size penalty due to merge overhead, so
I don't want to enable this without filters. A second concern is molten
notebooks, some are not made for sustained load and might prefer to run
the task in the background for 15mins instead of 6mins with loud fans.


the first query which is planned to be augmented by an online service is
class name search, since this data hasn't been in the index for years
(nobody noticed though?).

and yeah index updates should run faster. But this is on the bottom of
the todo list - low hanging fruits first :)


best regards,
michael


On 18.04.23 20:36, Matthias Bläsing wrote:

Hi,

Am Dienstag, dem 18.04.2023 um 07:48 +0200 schrieb Jan Lahoda:

I apologize for being contrarian, but since the index download
started for me (again) while on a bus with very poor internet
connection, I guess I should tell you my view.

no reason to apologize.


Unless I am mistaken, the index gz is currently roughly 1.9GB, and
it takes several minutes to actually create the Lucene index from it,
consuming some more space and CPU.

To be honest, it never seemed very polite to me to download and
process so much without asking.

I guess alternatives that I would see would include (combination of
options possible):
- explicitly ask before downloading (possibly allowing the user to
select auto-download)

Yes, if people get notified that they'll get the full index locally,
then I'm OK with that. I see a problem if features silently give
outdated answers or don't work at all. Otherwise we'll get "NetBeans
suggested version X, but Y is already on central, why is this not
current?".


- have the features that use the index do some query on a server, if
there isn't a downloaded index (or if it is stale/obsolete)

IMHO this highly depends on the speed of the API. If the latency is
high, the next bug will be "It takes ages until my POM tells me, that
it is outdated".


- given that https://github.com/apache/netbeans/pull/4999 produces a
smaller index, we could have a download location (server) at least
for maven central that would serve this optimized index. If I
understand it properly, the smallest index under that PR is 0.8GB,
and if it would compress reasonably well, it might be (say) 0.5GB
compressed - much better than 1.9GB, and no significant CPU usage
after the index is downloaded. (Even if it was 0.8GB, it is still
much better than 1.9GB+CPU churn.)

Truncating the index needs to be done carefully. NetBeans has a search
by SHA1 (or MD5?) feature. That will break if you remove that data
from the index. A similar situation will arise if arbitrary cutoffs
are done based on time. Consider a library that implements some interesting
algorithm and just works the same even after years. If we cut the
index at 6 months, for example, that artifact won't be found anymore.


There was also an argument on conserving the ASF resources in another
discussion recently. If I consider there would be (only) 10 000
installations of NetBeans, with the default setting to download the
index once a week, it is almost 20TB of data every week if I count
correctly. +the CPU cycles to convert the index on user's machines.
It seems there may be a way to conserve the ASF resources and provide
better experienc

Re: [DISCUSS] maven remote repo indexer improvements

2023-04-18 Thread Jakub Herkel
I would like to ask whether there is (or will be) a configuration
option (via a property, for example) to choose the directory in which
remote index processing takes place.
I have a problem on my notebook: all processing happens in the /tmp
directory (tmpfs), which is 8GB in size, and it always stops with an
out-of-space exception. It is a little bit awkward if every opening of a
pom.xml results in downloading the remote index and processing it
without success.

best regards

jakub

On Tue, Apr 18, 2023 at 9:11 PM Michael Bien  wrote:
>
> my thoughts so far on how I wanted to implement it:
>
>   - time cutoff filter would be optional, configurable in the usual
> maven/indexer options. Quick tests showed that full/2years/1year might
> be reasonable values
>
>   - sha1 filter would be applied by default, this cuts index size in
> half and currently no NB feature is running sha1 queries, this would
> also be a candidate for substitution via an online service - if we really
> don't need it we deprecate the queries. Hashes compress badly.
>
>   - multi threaded extraction would be optional, potentially enabled by
> default. This has a slight index size penalty due to merge overhead, so
> I don't want to enable this without filters. A second concern is molten
> notebooks, some are not made for sustained load and might prefer to run
> the task in the background for 15mins instead of 6mins with loud fans.
>
>
> the first query which is planned to be augmented by an online service is
> class name search, since this data hasn't been in the index for years
> (nobody noticed though?).
>
> and yeah index updates should run faster. But this is on the bottom of
> the todo list - low hanging fruits first :)
>
>
> best regards,
> michael
>
>
> On 18.04.23 20:36, Matthias Bläsing wrote:
> > Hi,
> >
> > Am Dienstag, dem 18.04.2023 um 07:48 +0200 schrieb Jan Lahoda:
> >> I apologize for being contrarian, but since the index download
> >> started for me (again) while on a bus with very poor internet
> >> connection, I guess I should tell you my view.
> > no reason to apologize.
> >
> >> Unless I am mistaken, the index gz is currently roughly 1.9GB, and
> >> it takes several minutes to actually create the Lucene index from it,
> >> consuming some more space and CPU.
> >>
> >> To be honest, it never seemed very polite to me to download and
> >> process so much without asking.
> >>
> >> I guess alternatives that I would see would include (combination of
> >> options possible):
> >> - explicitly ask before downloading (possibly allowing the user to
> >> select auto-download)
> > Yes, if people get notified that they'll get the full index locally,
> > then I'm OK with that. I see a problem if features silently give
> > outdated answers or don't work at all. Otherwise we'll get "NetBeans
> > suggested version X, but Y is already on central, why is this not
> > current?".
> >
> >> - have the features that use the index do some query on a server, if
> >> there isn't a downloaded index (or if it is stale/obsolete)
> > IMHO this highly depends on the speed of the API. If the latency is
> > high, the next bug will be "It takes ages until my POM tells me, that
> > it is outdated".
> >
> >> - given that https://github.com/apache/netbeans/pull/4999 produces a
> >> smaller index, we could have a download location (server) at least
> >> for maven central that would serve this optimized index. If I
> >> understand it properly, the smallest index under that PR is 0.8GB,
> >> and if it would compress reasonably well, it might be (say) 0.5GB
> >> compressed - much better than 1.9GB, and no significant CPU usage
> >> after the index is downloaded. (Even if it was 0.8GB, it is still
> >> much better than 1.9GB+CPU churn.)
> > Truncating the index needs to be done carefully. NetBeans has a search
> > by SHA1 (or MD5?) feature. That will break if you remove that data
> > from the index. A similar situation will arise if arbitrary cutoffs
> > are done based on time. Consider a library that implements some interesting
> > algorithm and just works the same even after years. If we cut the
> > index at 6 months, for example, that artifact won't be found anymore.
> >
> >> There was also an argument on conserving the ASF resources in another
> >> discussion recently. If I consider there would be (only) 10 000
> >> installations of NetBeans, with the default setting to download the
> >> index once a week, it is almost 20TB of data every week if I count
> >> correctly. +the CPU cycles to convert the index on user's machines.
> >> It seems there may be a way to conserve the ASF resources and provide
> >> better experience to the users at the same time.
> > The download is from Sonatype's CDN. Given that they actively discourage
> > central mirrors, I don't have much concern here. It is also not the
> > ASF's resources.
> >
> > Greetings
> >
> > Matthias
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@netbeans.apache.org
> > Fo

Re: [DISCUSS] maven remote repo indexer improvements

2023-04-18 Thread Michael Bien

my thoughts so far on how I wanted to implement it:

 - time cutoff filter would be optional, configurable in the usual 
maven/indexer options. Quick tests showed that full/2years/1year might 
be reasonable values


 - sha1 filter would be applied by default, this cuts index size in
half and currently no NB feature is running sha1 queries, this would
also be a candidate for substitution via an online service - if we really
don't need it we deprecate the queries. Hashes compress badly.


 - multi threaded extraction would be optional, potentially enabled by
default. This has a slight index size penalty due to merge overhead, so
I don't want to enable this without filters. A second concern is molten
notebooks, some are not made for sustained load and might prefer to run
the task in the background for 15mins instead of 6mins with loud fans.
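
To make the first two filters concrete, here is a rough sketch of a time cutoff plus sha1 stripping applied to a list of records. Everything in it is an assumption for illustration: ArtifactRecord and IndexFilterSketch are hypothetical names, not maven-indexer or NetBeans types, and the real filters would operate on the index stream rather than an in-memory list.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; not the maven-indexer API.
final class IndexFilterSketch {

    /** Simplified stand-in for one index entry. */
    record ArtifactRecord(String gav, Instant lastModified, String sha1) {}

    /**
     * Keeps records modified at or after (now - maxAge) and optionally
     * drops the sha1 value. A null maxAge means "full", i.e. no cutoff.
     */
    static List<ArtifactRecord> filter(List<ArtifactRecord> records,
                                       Duration maxAge,
                                       boolean dropSha1,
                                       Instant now) {
        return records.stream()
                .filter(r -> maxAge == null
                        || !r.lastModified().isBefore(now.minus(maxAge)))
                .map(r -> dropSha1
                        ? new ArtifactRecord(r.gav(), r.lastModified(), null)
                        : r)
                .collect(Collectors.toList());
    }
}
```

With this shape, the "full/2years/1year" options would map to a null cutoff, Duration.ofDays(730) and Duration.ofDays(365) respectively (approximating a year as 365 days).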



the first query which is planned to be augmented by an online service is
class name search, since this data hasn't been in the index for years
(nobody noticed though?).


and yeah index updates should run faster. But this is on the bottom of 
the todo list - low hanging fruits first :)



best regards,
michael


On 18.04.23 20:36, Matthias Bläsing wrote:

Hi,

Am Dienstag, dem 18.04.2023 um 07:48 +0200 schrieb Jan Lahoda:

I apologize for being contrarian, but since the index download
started for me (again) while on a bus with very poor internet
connection, I guess I should tell you my view.

no reason to apologize.


Unless I am mistaken, the index gz is currently roughly 1.9GB, and
it takes several minutes to actually create the Lucene index from it,
consuming some more space and CPU.

To be honest, it never seemed very polite to me to download and
process so much without asking.

I guess alternatives that I would see would include (combination of
options possible):
- explicitly ask before downloading (possibly allowing the user to
select auto-download)

Yes, if people get notified that they'll get the full index locally,
then I'm OK with that. I see a problem if features silently give
outdated answers or don't work at all. Otherwise we'll get "NetBeans
suggested version X, but Y is already on central, why is this not
current?".


- have the features that use the index do some query on a server, if
there isn't a downloaded index (or if it is stale/obsolete)

IMHO this highly depends on the speed of the API. If the latency is
high, the next bug will be "It takes ages until my POM tells me, that
it is outdated".


- given that https://github.com/apache/netbeans/pull/4999 produces a
smaller index, we could have a download location (server) at least
for maven central that would serve this optimized index. If I
understand it properly, the smallest index under that PR is 0.8GB,
and if it would compress reasonably well, it might be (say) 0.5GB
compressed - much better than 1.9GB, and no significant CPU usage
after the index is downloaded. (Even if it was 0.8GB, it is still
much better than 1.9GB+CPU churn.)

Truncating the index needs to be done carefully. NetBeans has a search
by SHA1 (or MD5?) feature. That will break if you remove that data
from the index. A similar situation will arise if arbitrary cutoffs
are done based on time. Consider a library that implements some interesting
algorithm and just works the same even after years. If we cut the
index at 6 months, for example, that artifact won't be found anymore.


There was also an argument on conserving the ASF resources in another
discussion recently. If I consider there would be (only) 10 000
installations of NetBeans, with the default setting to download the
index once a week, it is almost 20TB of data every week if I count
correctly. +the CPU cycles to convert the index on user's machines.
It seems there may be a way to conserve the ASF resources and provide
better experience to the users at the same time.

The download is from Sonatype's CDN. Given that they actively discourage
central mirrors, I don't have much concern here. It is also not the
ASF's resources.

Greetings

Matthias


-
To unsubscribe, e-mail: dev-unsubscr...@netbeans.apache.org
For additional commands, e-mail: dev-h...@netbeans.apache.org

For further information about the NetBeans mailing lists, visit:
https://cwiki.apache.org/confluence/display/NETBEANS/Mailing+lists











Re: [DISCUSS] disable remote index extraction by default [NB18]

2023-04-18 Thread Matthias Bläsing
Hi,

Am Dienstag, dem 18.04.2023 um 07:48 +0200 schrieb Jan Lahoda:
> I apologize for being contrarian, but since the index download
> started for me (again) while on a bus with very poor internet
> connection, I guess I should tell you my view.

no reason to apologize.

> Unless I am mistaken, the index gz is currently roughly 1.9GB, and
> it takes several minutes to actually create the Lucene index from it,
> consuming some more space and CPU.
> 
> To be honest, it never seemed very polite to me to download and
> process so much without asking.
> 
> I guess alternatives that I would see would include (combination of
> options possible):
> - explicitly ask before downloading (possibly allowing the user to
> select auto-download)

Yes, if people get notified that they'll get the full index locally,
then I'm OK with that. I see a problem if features silently give
outdated answers or don't work at all. Otherwise we'll get "NetBeans
suggested version X, but Y is already on central, why is this not
current?".

> - have the features that use the index do some query on a server, if
> there isn't a downloaded index (or if it is stale/obsolete)

IMHO this highly depends on the speed of the API. If the latency is
high, the next bug will be "It takes ages until my POM tells me that
it is outdated".

> - given that https://github.com/apache/netbeans/pull/4999 produces a
> smaller index, we could have a download location (server) at least
> for maven central that would serve this optimized index. If I
> understand it properly, the smallest index under that PR is 0.8GB,
> and if it would compress reasonably well, it might be (say) 0.5GB
> compressed - much better than 1.9GB, and no significant CPU usage
> after the index is downloaded. (Even if it was 0.8GB, it is still
> much better than 1.9GB+CPU churn.)

Truncating the index needs to be done carefully. NetBeans has a search
by SHA1 (or MD5?) feature. That will break if you remove that data
from the index. A similar situation will arise if arbitrary cutoffs
are done based on time. Consider a library that implements some interesting
algorithm and just works the same even after years. If we cut the
index at 6 months, for example, that artifact won't be found anymore.

> There was also an argument on conserving the ASF resources in another
> discussion recently. If I consider there would be (only) 10 000
> installations of NetBeans, with the default setting to download the
> index once a week, it is almost 20TB of data every week if I count
> correctly. +the CPU cycles to convert the index on user's machines.
> It seems there may be a way to conserve the ASF resources and provide
> better experience to the users at the same time.

The download is from Sonatype's CDN. Given that they actively discourage
central mirrors, I don't have much concern here. It is also not the
ASF's resources.

Greetings

Matthias







Re: [NOTICE] Apache NetBeans 18 feature freeze / release branches

2023-04-18 Thread Neil C Smith
On Tue, 18 Apr 2023 at 15:13, Neil C Smith  wrote:
> **Master is closed for a few hours - please don't merge anything until
> the spec version increment is in and a follow up email sent.**

This is now done and master is open again for NB19 development.

Thanks,

Neil






Re: [DISCUSS] disable remote index extraction by default [NB18]

2023-04-18 Thread Michael Bien

Hi Jan,


On 18.04.23 07:48, Jan Lahoda wrote:

I apologize for being contrarian, but since the index download started for
me (again) while on a bus with very poor internet connection, I guess I
should tell you my view.

Unless I am mistaken, the index gz is currently roughly 1.9GB, and it
takes several minutes to actually create the Lucene index from it,
consuming some more space and CPU.


correct. However, this is the initial download and initial indexing. 
Weekly index updates are usually ~8MB.
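
To put the two sizes side by side, the arithmetic can be sketched as follows. The 10 000 installations figure is the hypothetical one used elsewhere in this thread, sizes are taken in decimal units, and the class name IndexTrafficEstimate is invented for this example.

```java
// Back-of-the-envelope estimate; all inputs are rough figures from the thread.
final class IndexTrafficEstimate {

    /** Total bytes transferred if every installation downloads bytesPerInstall once. */
    static long totalBytes(long installations, long bytesPerInstall) {
        return Math.multiplyExact(installations, bytesPerInstall);
    }

    public static void main(String[] args) {
        long installations = 10_000L;        // hypothetical install count from the thread
        long fullIndex = 1_900_000_000L;     // ~1.9 GB initial download
        long weeklyDelta = 8_000_000L;       // ~8 MB typical weekly update

        double fullTb = totalBytes(installations, fullIndex) / 1e12;
        double deltaGb = totalBytes(installations, weeklyDelta) / 1e9;

        System.out.println("full download for all installs, TB: " + fullTb);
        System.out.println("weekly delta for all installs, GB:  " + deltaGb);
    }
}
```

Under these assumptions a full download for every installation is about 19 TB, while a week of incremental updates is about 80 GB, so the weekly cost depends heavily on whether the full index or only the delta is fetched.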





To be honest, it never seemed very polite to me to download and process so
much without asking.

I guess alternatives that I would see would include (combination of options
possible):
- explicitly ask before downloading (possibly allowing the user to select
auto-download)


+1, something we should do. First we had to add the actual setting
(#5646); NB 18 will be able to turn off remote indexing separately from
local .m2 scanning.




- have the features that use the index do some query on a server, if there
isn't a downloaded index (or if it is stale/obsolete)


see other mail. But yes I experimented with that (#4971).



- given that https://github.com/apache/netbeans/pull/4999 produces a
smaller index, we could have a download location (server) at least for
maven central that would serve this optimized index. If I understand it
properly, the smallest index under that PR is 0.8GB, and if it would
compress reasonably well, it might be (say) 0.5GB compressed - much better
than 1.9GB, and no significant CPU usage after the index is downloaded.
(Even if it was 0.8GB, it is still much better than 1.9GB+CPU churn.)


I personally don't care about the weekly index update at all - I barely 
notice it when it updates once per week. However I am always open to 
optimize it!


But keep in mind that remote index downloads work for more repos than
just central, e.g. Apache has one too.


If you suggest that we should index repos server-side and provide a 
compressed index as a service for NB I wouldn't have anything against it 
in principle, but I also don't want to run this service in my spare time.


We could in theory build it in CI and bundle the index with NB. However,
remember that this might annoy users who don't use Java/Maven. NB will
soon support Rust and is popular for PHP etc.



regarding optimizations:

one step which runs unreasonably slow ATM is the index update. The delta 
download is always below 10MB for central, while the index merge still 
takes minutes. This is something I want to check once #4999 is merged.



best regards,

michael





There was also an argument on conserving the ASF resources in another
discussion recently. If I consider there would be (only) 10 000
installations of NetBeans, with the default setting to download the index
once a week, it is almost 20TB of data every week if I count correctly.
+the CPU cycles to convert the index on user's machines. It seems there may
be a way to conserve the ASF resources and provide better experience to the
users at the same time.

Jan


On Thu, Apr 13, 2023 at 7:38 PM Michael Bien  wrote:


thanks Matthias,

ok lets keep it on :)

I never really had any issues with it either.

the thought was to switch it off in NB 18 and switch it on again in NB
19 once we are able to bump the dependencies - not as permanent new
default.

best regards,

michael

On 13.04.23 19:18, Matthias Bläsing wrote:

Hi,

I don't think this is a good idea. If your disk is too small to hold the
maven index there is a pretty good chance that your setup is often
broken.

Quite frankly at work I never noticed NetBeans downloading the index,
it is lost in the noise of normal operation.

We will basically lose all advantages of maven indexing. My local
repository does not hold the latest artifact (rendering the
corresponding hint useless) and completion is interesting because I did
not yet use the artifact. Else I most probably already have the right
setup at hand.

The option to disable is now there.

Given that part of the discussion around JDK 11 baseline was to make a
faster maven indexer available gives this another bad taste.

So from my POV this should not be done.

Greetings

Matthias

Am Donnerstag, dem 13.04.2023 um 18:47 +0200 schrieb Michael Bien:

Hello devs,

NB 18 will split the indexer setting into two check boxes, one for local
indexing, one for remote index extraction,

see screenshot: https://github.com/apache/netbeans/pull/5646

I propose to disable remote index extraction by default, at least until
we have solved some more of the issues.

main complaints are:

- failing extraction when temp folder is too small e.g #5815 (NB 18
will deactivate indexing on extraction failure)

- extraction takes too long e.g #5809

Local index scanning scans your .m2 repo, which is fast and might be
sufficient for many use cases and therefore a better default for now.

thoughts?

-mbien


-

Re: [DISCUSS] disable remote index extraction by default [NB18]

2023-04-18 Thread Michael Bien

Hi Tamas!

great to see you here.


On 18.04.23 10:57, Tamás Cservenák wrote:

Howdy,

just to chime in as a developer of maven-indexer:

1. Regarding server-side processing we are shot in the foot, as the index is
provided (produced and hosted) by Sonatype, and we don't have much influence,
nor is there any server side available. They just produce the GZIP files, which
we can download and munge as much as we like.
2. I am unsure about the exact indexer use cases within NB, but the indexer
offers a "remote client" for SMO as well (via HTTP to the search web service),
https://github.com/apache/maven-indexer/tree/master/search-backend-smo
Still, it is less powerful than the full indexing context (the one that is GBs
in size), see tests


I didn't know that this existed. Awesome!



3. maven-indexer (the classic) could be combined with SMO search: use the
classic to index the local repository, while for remote use the SMO client,
etc. Or some sort of "overlay" could be possible as well...


I have a draft PR here which uses the web api directly for selected 
search queries:


https://github.com/apache/netbeans/pull/4971

will take a look at search-backend-smo - since this sounds like this 
would replace all the json parsing of that PR.



Regarding using web services: I do think they have potential for 
selective queries. But there will be features which likely won't work. 
For example editor hints which scan the pom for version upgrades. I got 
latencies between 200ms and 4s (!) per query (see PR for log). Right now 
hints can run those checks instantly, with a web service this could take 
minutes (assuming we don't get throttled).
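
As a rough illustration of why this adds up, the estimate below multiplies the measured per-query latencies by a number of dependencies checked one after another. The count of 50 dependencies is purely an assumed value for the example, and HintLatencyEstimate is a made-up class name.

```java
import java.time.Duration;

// Back-of-the-envelope estimate; the dependency count is an assumption.
final class HintLatencyEstimate {

    /** Total time if queries are issued one after another with a fixed latency each. */
    static Duration sequentialTotal(int queries, Duration perQuery) {
        return perQuery.multipliedBy(queries);
    }

    public static void main(String[] args) {
        int dependencies = 50; // hypothetical pom size
        Duration best = sequentialTotal(dependencies, Duration.ofMillis(200));
        Duration worst = sequentialTotal(dependencies, Duration.ofSeconds(4));
        System.out.println("best case, seconds:  " + best.getSeconds());
        System.out.println("worst case, seconds: " + worst.getSeconds());
    }
}
```

For 50 sequential queries this gives 10 seconds at 200ms each and 200 seconds at 4s each, which is where the "minutes" range comes from.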


NB does have a mechanism for returning partial results for queries 
(instantly) and optionally blocking for full results. The web API would 
never return anything instantly unless we would start to cache a search 
engine locally which probably wouldn't work that well.
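
The general "partial now, full later" pattern can be sketched like this. It is not the NetBeans query API: PartialResultSketch, QueryResult and the slowLookup parameter are hypothetical names, and the sketch only shows the shape of the idea with a plain CompletableFuture.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch of the pattern; not the NetBeans implementation.
final class PartialResultSketch {

    /** What is known immediately, plus a handle on the complete answer. */
    record QueryResult(List<String> partial, CompletableFuture<List<String>> full) {}

    /**
     * Returns local hits right away and computes the merged result
     * (local hits followed by the slow lookup's hits) in the background.
     */
    static QueryResult query(List<String> localHits, Supplier<List<String>> slowLookup) {
        List<String> partial = List.copyOf(localHits);
        CompletableFuture<List<String>> full = CompletableFuture.supplyAsync(() -> {
            List<String> merged = new ArrayList<>(partial);
            merged.addAll(slowLookup.get()); // e.g. a remote web service call
            return merged;
        });
        return new QueryResult(partial, full);
    }
}
```

A caller that needs an instant answer reads partial(); one that can afford to wait calls full().join(). With a web service as the only source, partial() would simply be empty, which is the limitation described above.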


best regards,

michael




In the end, I want to thank Michael Bien of NB, he made a great deal of
contributions to indexers so far. Finally, there is one related Michael PR
still waiting for me...
https://github.com/apache/maven-indexer/pull/302

Thanks
T

On Tue, Apr 18, 2023 at 10:44 AM Neil C Smith  wrote:


On Tue, 18 Apr 2023 at 06:49, Jan Lahoda  wrote:

I apologize for being contrarian, but since the index download started for
me (again) while on a bus with very poor internet connection, I guess I
should tell you my view.

...

To be honest, it never seemed very polite to me to download and process so
much without asking.

+1 - I do think that more control and explicit information before
first run would be good.  I do a lot of testing during releases with
userdirs in /tmp - I hate to think how many GBs I've downloaded there.


If I consider there would be (only) 10 000
installations of NetBeans,

Let's not be quite that pessimistic! ;-)   There have been 1.5
million* downloads of various NetBeans 17 binaries so far from ASF
infrastructure, never mind other sources, community installers, etc.
Of course, that increases your other figures too!

* or 0.5 million unique IPs, so probably somewhere in between - access
to stats for people with Apache account at
https://logging1-he-de.apache.org/stats/

Best wishes,

Neil













[NOTICE] Apache NetBeans 18 feature freeze / release branches

2023-04-18 Thread Neil C Smith
Hi All,

As per earlier email, today is feature freeze for NetBeans 18, and the
delivery and release branches are being set up.

**Master is closed for a few hours - please don't merge anything until
the spec version increment is in and a follow up email sent.**

All open pull requests for NB18 will be pushed to NB19. If there's a
need to get a particular PR into 18, please rebase on delivery and add
the NB18 milestone. As we now have multiple people doing the release
process, there's no need to add us as reviewers - we'll work off the
milestone.

A first release candidate will be available shortly.

**The following rules are applied to pull requests from now until release:**

Please read the full info on how we manage pull requests to delivery
at least once! :-) -
https://lists.apache.org/thread/hyjbsz55zb9xfcnccghkrsvqsnt83nwf

PRs intended to be included in the 18 release:
 - Limited to fixes (link an issue if there is one, or provide
justification in description)
 - Label with priority:high or priority:critical as appropriate.
 - Base on the delivery branch.
 - Mark with NB18 milestone.
 - Will be merged by the release team.

PRs with features for a future release:
 - Base on the master branch.
 - Will be reviewed and merged in the usual way.
 - If possible stay away from big refactoring.
 - If possible do not overlap with fixes for 18 (delivery will be
merged to master weekly).

Thanks and best wishes,

Neil






Re: [DISCUSS] disable remote index extraction by default [NB18]

2023-04-18 Thread Svata Dedic

Interesting stats, but

Apache-NetBeans-17-bin-windows-x64.exe  643,458 downloads,
netbeans-17-bin.zip 130,456 Downloads

the files being +- the same size.

but the traffic says:

netbeans-17-bin.zip 24,873,400,441,109
Apache-NetBeans-17-bin-windows-x64.exe 539,363,119,735

...so the much less demanded file (comparable size) provides traffic 
larger by 2 orders of magnitude. Something is fishy here.
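
The mismatch is easiest to see as average bytes per download. The sketch below just divides the quoted traffic totals by the quoted download counts; the class name DownloadStatsCheck is invented for this example.

```java
// Sanity check on the quoted stats: average bytes served per download.
final class DownloadStatsCheck {

    /** Integer average of bytes per download. */
    static long bytesPerDownload(long totalBytes, long downloads) {
        return totalBytes / downloads;
    }

    public static void main(String[] args) {
        long zipPerDownload = bytesPerDownload(24_873_400_441_109L, 130_456L);
        long exePerDownload = bytesPerDownload(539_363_119_735L, 643_458L);

        System.out.println("netbeans-17-bin.zip, bytes per download: " + zipPerDownload);
        System.out.println("windows-x64.exe, bytes per download:     " + exePerDownload);
    }
}
```

With the figures above this works out to roughly 190 MB per zip download but under 1 MB per exe download. The raw traffic totals differ by a factor of about 46, and the per-download averages by a factor of more than 200, which two files of similar size should not show.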


On 18. 04. 23 10:43, Neil C Smith wrote:


* or 0.5 million unique IPs, so probably somewhere in between - access
to stats for people with Apache account at
https://logging1-he-de.apache.org/stats/










Re: [DISCUSS] disable remote index extraction by default [NB18]

2023-04-18 Thread Tamás Cservenák
Howdy,

just to chime in as a developer of maven-indexer:

1. Regarding server-side processing we are shot in the foot, as the index is
provided (produced and hosted) by Sonatype, and we don't have much influence,
nor is there any server side available. They just produce the GZIP files, which
we can download and munge as much as we like.
2. I am unsure about the exact indexer use cases within NB, but the indexer
offers a "remote client" for SMO as well (via HTTP to the search web service),
https://github.com/apache/maven-indexer/tree/master/search-backend-smo
Still, it is less powerful than the full indexing context (the one that is GBs
in size), see tests
3. maven-indexer (the classic) could be combined with SMO search: use the
classic to index the local repository, while for remote use the SMO client,
etc. Or some sort of "overlay" could be possible as well...

In the end, I want to thank Michael Bien of NB, he made a great deal of
contributions to indexers so far. Finally, there is one related Michael PR
still waiting for me...
https://github.com/apache/maven-indexer/pull/302

Thanks
T

On Tue, Apr 18, 2023 at 10:44 AM Neil C Smith  wrote:

> On Tue, 18 Apr 2023 at 06:49, Jan Lahoda  wrote:
> > I apologize for being contrarian, but since the index download started for
> > me (again) while on a bus with very poor internet connection, I guess I
> > should tell you my view.
> ...
> > To be honest, it never seemed very polite to me to download and process so
> > much without asking.
>
> +1 - I do think that more control and explicit information before
> first run would be good.  I do a lot of testing during releases with
> userdirs in /tmp - I hate to think how many GBs I've downloaded there.
>
> > If I consider there would be (only) 10 000
> > installations of NetBeans,
>
> Let's not be quite that pessimistic! ;-)   There have been 1.5
> million* downloads of various NetBeans 17 binaries so far from ASF
> infrastructure, never mind other sources, community installers, etc.
> Of course, that increases your other figures too!
>
> * or 0.5 million unique IPs, so probably somewhere in between - access
> to stats for people with Apache account at
> https://logging1-he-de.apache.org/stats/
>
> Best wishes,
>
> Neil


RE: [DISCUSS] disable remote index extraction by default [NB18]

2023-04-18 Thread Eric Barboni
Hi,
 I think Maven Central is owned by Sonatype; the ASF is not in charge of it,
so we would not save many ASF resources. But saving resources is not a bad
thing at all.

I'm not sure we can use ASF infra to cache the index somewhere. We should ask
carefully. Our VMs serve the apidoc, but serving a huge file for each install
may be an issue.

I have more questions 😃 (and a very egocentric view, by the way).
Do we need to generate a Maven index per NetBeans version? Per platform
using the java clusters (OK, this is my use case)? Per user?
At the university, every NetBeans or Java-based platform installation
downloads the index for each student. As the index is too big for the hard
drive (it costs 20% of a student's capacity), it is removed each week to
avoid soft-locking the account.
If those indexes could be cached at university level, that would be awesome,
saving lots of time.
That is, provided nothing in the index involves user privacy.


To me, as the task can be long, the user should be asked the first time, or
shown a reminder until it is done, saying that you will not get the full
potential until the index has run.

Best Regards
Eric

-Original Message-
From: Neil C Smith
Sent: Tuesday, 18 April 2023 10:44
To: dev@netbeans.apache.org
Subject: Re: [DISCUSS] disable remote index extraction by default [NB18]

On Tue, 18 Apr 2023 at 06:49, Jan Lahoda  wrote:
> I apologize for being contrarian, but since the index download started 
> for me (again) while on a bus with very poor internet connection, I 
> guess I should tell you my view.
...
> To be honest, it never seemed very polite to me to download and 
> process so much without asking.

+1 - I do think that more control and explicit information before
first run would be good.  I do a lot of testing during releases with userdirs 
in /tmp - I hate to think how many GBs I've downloaded there.

> If I consider there would be (only) 10 000 installations of NetBeans,

Let's not be quite that pessimistic! ;-)   There have been 1.5
million* downloads of various NetBeans 17 binaries so far from ASF 
infrastructure, never mind other sources, community installers, etc.
Of course, that increases your other figures too!

* or 0.5 million unique IPs, so probably somewhere in between - access to stats 
for people with Apache account at https://logging1-he-de.apache.org/stats/

Best wishes,

Neil






Re: [DISCUSS] disable remote index extraction by default [NB18]

2023-04-18 Thread Neil C Smith
On Tue, 18 Apr 2023 at 06:49, Jan Lahoda  wrote:
> I apologize for being contrarian, but since the index download started for
> me (again) while on a bus with very poor internet connection, I guess I
> should tell you my view.
...
> To be honest, it never seemed very polite to me to download and process so
> much without asking.

+1 - I do think that more control and explicit information before
first run would be good.  I do a lot of testing during releases with
userdirs in /tmp - I hate to think how many GBs I've downloaded there.

> If I consider there would be (only) 10 000
> installations of NetBeans,

Let's not be quite that pessimistic! ;-)   There have been 1.5
million* downloads of various NetBeans 17 binaries so far from ASF
infrastructure, never mind other sources, community installers, etc.
Of course, that increases your other figures too!

* or 0.5 million unique IPs, so probably somewhere in between - access
to stats for people with Apache account at
https://logging1-he-de.apache.org/stats/

Best wishes,

Neil
