[ 
https://issues.apache.org/jira/browse/LUCENE-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-10342:
-----------------------------------
    Description: 
At the moment we have some parts in Lucene where we check for features of the
JDK and use them only if available:
- We only use MMapDirectory if we can unmap, so sun.misc.Unsafe must be
available.
- To correctly calculate heap space requirements of data structures for caching
and other types of buffers, we try to calculate the heap size of objects. To do
this correctly we need some information, like the size of ordinary object
pointers. If they are 32 bits on 64-bit platforms, the memory usage of Object[]
arrays (and HashSet/HashMap) shrinks dramatically, by a factor of 2.

As those checks require "optional" modules in the Java module system, which may
not even be available by default as they are unsupported, we do the check
dynamically.
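
To illustrate what such a dynamic check can look like, here is a minimal
sketch. It is not Lucene's actual code: the class name JdkFeatureProbe and its
methods are made up for this example. It probes for a module and a class
without linking against them, and asks the HotSpot diagnostic bean
(reflectively, since it may be missing on other JVMs) whether compressed
object pointers are in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.PlatformManagedObject;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Lucene.
final class JdkFeatureProbe {
  private JdkFeatureProbe() {}

  /** Returns true if the named module is part of the boot layer. */
  public static boolean isModulePresent(String moduleName) {
    return ModuleLayer.boot().findModule(moduleName).isPresent();
  }

  /** Returns true if the class can be found, without initializing it. */
  public static boolean isClassLoadable(String className) {
    try {
      Class.forName(className, false, JdkFeatureProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  /**
   * Asks the HotSpot diagnostic bean for the UseCompressedOops VM option.
   * Returns an empty Optional if the bean or the option is unavailable,
   * for example on a non-HotSpot JVM or when the module is missing.
   */
  public static Optional<Boolean> compressedOops() {
    try {
      Class<? extends PlatformManagedObject> beanClass =
          Class.forName("com.sun.management.HotSpotDiagnosticMXBean")
              .asSubclass(PlatformManagedObject.class);
      Object bean = ManagementFactory.getPlatformMXBean(beanClass);
      Object option =
          beanClass.getMethod("getVMOption", String.class)
              .invoke(bean, "UseCompressedOops");
      String value = (String) option.getClass().getMethod("getValue").invoke(option);
      return Optional.of(Boolean.parseBoolean(value));
    } catch (ReflectiveOperationException | RuntimeException | LinkageError e) {
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    System.out.println("jdk.unsupported present: " + isModulePresent("jdk.unsupported"));
    System.out.println("sun.misc.Unsafe loadable: " + isClassLoadable("sun.misc.Unsafe"));
    System.out.println("compressed oops: " + compressedOops());
  }
}
```

An empty result from compressedOops() is exactly the case where a memory
estimate would have to fall back to a guess.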

In classpath mode, these are all no-brainers, because the modules are available
by default. But as soon as downstream code switches to module mode, the whole
thing may suddenly stop working for the application (because a module was
forgotten) and nobody notices. A user of Lucene may then need to add those
modules to the module descriptor of the application or pass them on the command
line (to make them optional if the JVM supports it; e.g. what happens if
"jdk.unsupported" is not available for custom JDK xy by provider Z?). If the
user forgets to add the module, or for some other reason it does not work (e.g.
the feature is not available in J9 as opposed to HotSpot), we silently disable
MMapDirectory, or the memory usage estimate may be off by a factor of 2.

My suggestion is now to enable java.util.logging in Lucene's core (maybe in
other modules, too; Luke already uses it) and report such warnings using
java.util.logging. Any downstream code will then see the log output, depending
on the logging system in use (log4j, slf4j with the correct wrapper module).
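
As a rough sketch of the idea, the following shows a warning being reported
through java.util.logging and picked up by a handler installed by downstream
code. The class FeatureWarnings, its logger name and the CapturingHandler are
assumptions made for this example and are not taken from Lucene; a real
application would more likely install a bridge to its own logging framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical example class; names are illustrative only.
final class FeatureWarnings {
  /** Logger name chosen for this example. */
  public static final String LOGGER_NAME = "example.lucene.FeatureWarnings";

  // Keep a strong reference so handlers added by the application are not lost.
  private static final Logger LOG = Logger.getLogger(LOGGER_NAME);

  private FeatureWarnings() {}

  /** Logs one WARNING record if the feature is unavailable; otherwise does nothing. */
  public static void warnIfUnavailable(String feature, boolean available, String consequence) {
    if (!available) {
      LOG.log(Level.WARNING, "{0} is not available; {1}", new Object[] {feature, consequence});
    }
  }

  /** Stand-in for whatever handler or bridge the downstream application installs. */
  public static final class CapturingHandler extends Handler {
    public final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      if (isLoggable(record)) {
        records.add(record);
      }
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  public static void main(String[] args) {
    CapturingHandler handler = new CapturingHandler();
    Logger.getLogger(LOGGER_NAME).addHandler(handler);
    warnIfUnavailable("sun.misc.Unsafe", false, "MMapDirectory will be disabled");
    System.out.println("captured records: " + handler.records.size());
  }
}
```

Because only WARNING is used, the message passes the default
java.util.logging configuration without any setup by the application.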

I know this suggestion may cause a bit of a flame war because of disagreement
about the pros and cons of whether logging is needed, so I'd like to limit this
to warnings and above. We won't ever emit info/debug/trace. To make sure we only
log "warn" and "error"/"severe" events, I will add a forbiddenapis rule to deny
all other log levels.

As a side effect, I will also add an IndexWriter InfoStream implementation for
java.util.logging to make it easy to log InfoStream events during indexing.
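
A sketch of what such an adapter might look like follows. To keep it
self-contained it does not depend on Lucene: InfoStreamLike is a local stand-in
that mirrors the two methods I believe org.apache.lucene.util.InfoStream
declares (message and isEnabled), and JulInfoStream, the logger-name prefix and
the configurable level are assumptions for illustration, not the planned
implementation.

```java
import java.io.Closeable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Local stand-in for Lucene's InfoStream so that this sketch compiles alone.
abstract class InfoStreamLike implements Closeable {
  public abstract void message(String component, String message);

  public abstract boolean isEnabled(String component);
}

// Hypothetical adapter: forwards each component's messages to a logger
// named "<prefix>.<component>" at a level chosen by the application.
final class JulInfoStream extends InfoStreamLike {
  private final String loggerPrefix;
  private final Level level;

  public JulInfoStream(String loggerPrefix, Level level) {
    this.loggerPrefix = loggerPrefix;
    this.level = level;
  }

  private Logger loggerFor(String component) {
    return Logger.getLogger(loggerPrefix + "." + component);
  }

  @Override
  public boolean isEnabled(String component) {
    // Lets the caller skip building expensive messages nobody will see.
    return loggerFor(component).isLoggable(level);
  }

  @Override
  public void message(String component, String message) {
    loggerFor(component).log(level, message);
  }

  @Override
  public void close() {
    // Nothing to release; loggers are managed by java.util.logging.
  }
}
```

With such an adapter, the application decides per component whether
indexing messages are recorded, simply by configuring the matching logger.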

I will provide a PR adding 2 warnings and the InfoStream implementation.

  was:
At the moment we have some parts in Lucene where we check for features of the
JDK and use them only if available:
- We only use MMapDirectory if we can unmap, so sun.misc.Unsafe must be
available.
- To correctly calculate heap space requirements of data structures for caching
and other types of buffers, we try to calculate the heap size of objects. To do
this correctly we need some information, like the size of ordinary object
pointers. If they are 32 bits on 64-bit platforms, the memory usage of Object[]
arrays (and HashSet/HashMap) shrinks dramatically, by a factor of 2.

As those checks require "optional" modules in the Java module system, which may
not even be available by default as they are unsupported, we do the check
dynamically.

The problems start with the Java module system: a user of Lucene may need to
add those modules to the module descriptor of the application. If the user
misses that, or for some other reason it does not work (e.g. the feature is not
available), we silently disable MMapDirectory, or the memory usage estimate may
be off by a factor of 2.

My suggestion is now to enable java.util.logging in Lucene's core (maybe in
other modules, too; Luke already uses it) and report such warnings using
java.util.logging. Any downstream code will then see the log output, depending
on the logging system in use (log4j, slf4j with the correct wrapper module).

I know this suggestion may cause a bit of a flame war because of disagreement
about the pros and cons of whether logging is needed, so I'd like to limit this
to warnings and above. We won't ever emit info/debug/trace. To make sure we only
log "warn" and "error"/"severe" events, I will add a forbiddenapis rule to deny
all other log levels.

As a side effect, I will also add an IndexWriter InfoStream implementation for
java.util.logging to make it easy to log InfoStream events during indexing.

I will provide a PR adding 2 warnings and the InfoStream implementation.


> Add (very limited) java.util.logging to Lucene Core
> ---------------------------------------------------
>
>                 Key: LUCENE-10342
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10342
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/other
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>            Priority: Critical
>              Labels: flamewar, logging



