Or, have a separate method, such as in java.util.Spliterator#estimateSize()
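
For illustration, a rough sketch of what a separate estimate method could look like, modelled on Spliterator#estimateSize(). The Countable interface and its method names are hypothetical and not part of GeoTools; the only real API used is the JDK's Spliterator.

```java
import java.util.List;

final class EstimateSizeSketch {

    /** Hypothetical counting interface; not an existing GeoTools type. */
    interface Countable {
        /** Exact count; may be expensive to compute. */
        long exactCount();

        /**
         * Cheap estimate. Returns Long.MAX_VALUE when the size is unknown or too
         * expensive to compute, the same convention Spliterator#estimateSize() uses.
         */
        default long estimateCount() {
            return Long.MAX_VALUE;
        }
    }

    /** The JDK precedent: ask a list's spliterator for its size estimate. */
    static long estimateOf(List<?> list) {
        return list.spliterator().estimateSize();
    }
}
```

A caller that only needs a rough number calls estimateCount() and never pays for the exact evaluation unless it asks for it.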

Thanks,

Emilio

On 9/2/20 3:44 PM, Emilio Lahr-Vivaz wrote:
With regard to returning -1, I believe the relevant count methods are the ones detailed below. If the API is changing anyway, it may make sense to allow an 'exact' (or equivalent) parameter to force the count evaluation, and to remove the somewhat unintuitive (to me at least) distinction between count and size.

Thanks,

Emilio


DataFeatureCollection: (note: no javadoc, but the implication is that it can return -1 to align with FeatureSource)

  public abstract int getCount() throws IOException;

FeatureCollection: (may *not* return -1)

    /**
     * Please note this operation may be expensive when working with remote content.
     *
     * @see java.util.Collection#size()
     */
    int size();

FeatureSource: (may return -1)

    /**
     * Gets the number of the features that would be returned by the given
     * {@code Query}, taking into account any settings for max features and
     * start index set on the {@code Query}.
     * <p>
     * It is possible that this method will return {@code -1} if the calculation
     * of number of features is judged to be too costly by the implementing class.
     * In this case, you might call <code>getFeatures(query).size()</code>
     * instead.
     * <p>
     * Example use:<pre><code> int count = featureSource.getCount();
     * if( count == -1 ){
     *    count = featureSource.getFeatures( "typeName", count ).size();
     * }
     * </code></pre>
     *
     * @param query the query to select features
     *
     * @return the number of features that would be returned by the {@code Query};
     *         or {@code -1} if this cannot be calculated.
     *
     * @throws IOException if there are errors getting the count
     */
    int getCount(Query query) throws IOException;
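
A minimal sketch of the fallback that the FeatureSource Javadoc describes. It assumes hypothetical stand-in interfaces (SimpleSource, SimpleCollection, and a plain String query) rather than the real GeoTools types, and leaves out the checked IOException for brevity.

```java
final class CountFallbackSketch {

    /** Hypothetical stand-in for a feature collection; only size() is modelled. */
    interface SimpleCollection {
        int size();
    }

    /** Hypothetical stand-in for a feature source, queried by a plain String. */
    interface SimpleSource {
        /** Returns the count, or -1 if computing it is judged too costly. */
        int getCount(String query);

        SimpleCollection getFeatures(String query);
    }

    /** Uses the cheap count when it is available, otherwise falls back to size(). */
    static int countOrSize(SimpleSource source, String query) {
        int count = source.getCount(query);
        if (count == -1) {
            // The source declined to count; size() forces the evaluation.
            count = source.getFeatures(query).size();
        }
        return count;
    }
}
```

Every caller that needs a real number has to repeat this check, which is part of why the count/size split feels unintuitive.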



On 9/2/20 2:10 PM, Jim Hughes wrote:

Hi all,

The Javadoc on this method reminded me of one of the points I wanted to raise.  If there are multiple methods for counting records, it may be good to discuss their semantics.  If I recall correctly, some of the methods in GeoTools treat returning -1 as a suitable way to communicate that getting the exact count would be too expensive.

As an anchor point, I'm kinda happy when I see a distributed system scanning over a million-ish records per second.  With that back-of-the-envelope figure, if a filter matched a billion or so features, it may take 10-15 minutes to get an exact count.
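
The arithmetic behind that estimate, as a small sketch (the class and method names are made up for illustration). At exactly one million records per second, a billion records works out to about 16.7 minutes; a somewhat faster scan of around 1.5 million per second lands near 11 minutes.

```java
final class ScanTimeSketch {

    /** Minutes needed to scan the given number of records at a steady rate. */
    static double minutesToScan(long records, long recordsPerSecond) {
        return records / (double) recordsPerSecond / 60.0;
    }

    public static void main(String[] args) {
        System.out.printf("%.1f minutes%n", minutesToScan(1_000_000_000L, 1_000_000L));
        System.out.printf("%.1f minutes%n", minutesToScan(1_000_000_000L, 1_500_000L));
    }
}
```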

In GeoMesa, we implemented a system property that controls whether exact counts are returned.  I think we noticed this the most when GeoServer returned GeoJSON, since that request pathway calls getCount and then getFeatures (which in GeoMesa's case basically repeats the query!).

Anyhow, having an 'int' method and a 'long' method to call may help limit the amount of time spent counting;).

Cheers,

Jim

On 9/2/2020 1:54 PM, Andrea Aime wrote:
Hi Jody,
I like this road, works fine for FeatureSource.
What about FeatureCollection though? It's already using "int size()", from the interface:

/**
 * Please note this operation may be expensive when working with remote content.
 *
 * @see java.util.Collection#size()
 */
int size();

Cheers
Andrea


On Wed, Sep 2, 2020 at 7:45 PM Jody Garnett <jody.garn...@gmail.com <mailto:jody.garn...@gmail.com>> wrote:

    Here is another softer approach:
    /**
     * @return Returns the number of features in this collection; if
     *     this collection contains more than Integer.MAX_VALUE elements,
     *     returns Integer.MAX_VALUE.
     * @deprecated Please use count()
     */
    public int getCount();

    public long size();

    This gives us a clear API migration and does not immediately
    break projects when they upgrade. The Integer.MAX_VALUE behaviour
    matches how Collection.size() handles a size greater than
    Integer.MAX_VALUE.
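
A minimal sketch of how the deprecated int method could be derived from the long one, assuming a hypothetical LargeCollection interface (not an existing GeoTools type) and a Java 8 default method:

```java
final class ClampedCountSketch {

    /** Hypothetical collection interface used only for this illustration. */
    interface LargeCollection {
        /** The new method: the full count as a long. */
        long size();

        /**
         * The old method, kept for compatibility. Clamps to Integer.MAX_VALUE
         * when the collection is larger than an int can express.
         *
         * @deprecated use size()
         */
        @Deprecated
        default int getCount() {
            return (int) Math.min(size(), Integer.MAX_VALUE);
        }
    }
}
```

Existing callers of getCount() keep compiling and get a deprecation warning, while implementations only have to provide the long method.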

    --
    Jody Garnett


    On Wed, 2 Sep 2020 at 06:43, Andrea Aime
    <andrea.a...@geo-solutions.it
    <mailto:andrea.a...@geo-solutions.it>> wrote:

        Hi, any other opinion on this?

        Personally I would not like breaking all existing store
        implementations and clients, for a "clean break" it seems
        quite bloody :-D
        But Jody suggests to go that way.

        Some tie breaker, or even multiple votes leading to another
        tie, would be appreciated, ha!

        Cheers
        Andrea

        On Thu, Aug 27, 2020 at 11:16 PM Andrea Aime
        <andrea.a...@geo-solutions.it
        <mailto:andrea.a...@geo-solutions.it>> wrote:

            Hi Jody,
            comment inline.

            On Thu, Aug 27, 2020 at 11:06 PM Jody Garnett
            <jody.garn...@gmail.com <mailto:jody.garn...@gmail.com>>
            wrote:

                We are just about to start a new release cycle, API
                changes are a short-term pain, but the most
                maintainable approach long term.

                As for the change, how about returning a long?
                Existing client code that used an integer would
                be easy to update.

                    long count = featureSource.getCount(query);

                Or:

                    int count = (int) featureSource.getCount(query);


            We can gauge how easy it is by doing the switch in GT/GS...
            Maybe it could be done as a refactor... but I cannot
            imagine exactly how yet. Closest thing to something
            working may be:

            1) Rename existing getCount to getCountOld via refactor
            2) Add a new method called getCount, returning long,
            make it abstract, have a default implementation of
            getCountOld delegating to getCount
            3) Fix all implementations, switching them from
            getCountOld to getCount
            4) Inline getCountOld, that should fix all calling points
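
A rough sketch of the intermediate state after step 2, using a hypothetical Source interface with a String query in place of the real FeatureSource. How the long is narrowed back to an int is not specified above; Math.toIntExact is just one assumption.

```java
final class CountMigrationSketch {

    /** Hypothetical stand-in for FeatureSource during the migration. */
    interface Source {
        /** Step 2: the new abstract method, returning long. */
        long getCount(String query);

        /**
         * Step 1 renamed the old int method to getCountOld; step 2 gives it a
         * default implementation that delegates to the new method. Narrowing
         * with Math.toIntExact is an assumption made for this sketch.
         */
        @Deprecated
        default int getCountOld(String query) {
            return Math.toIntExact(getCount(query));
        }
    }
}
```

Once every implementation provides the long getCount (step 3), inlining getCountOld (step 4) rewrites each call site to the new method plus the narrowing.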

            Hopefully that should not be too much work; I expect there
            are few implementations of FeatureSource/Collection but many
            client calls using them.


                If you really want, Java 8 has the Math.toIntExact
                method, which throws an exception if the long is
                out of range:

                    int count = Math.toIntExact(featureSource.getCount(query));
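
A small sketch of the difference between the plain cast and Math.toIntExact (the helper names are made up for illustration): the cast silently wraps around, while toIntExact throws an ArithmeticException.

```java
final class ToIntExactSketch {

    /** Plain narrowing cast: silently wraps when the value does not fit. */
    static int castCount(long count) {
        return (int) count;
    }

    /** Returns true if the count fits in an int, using Math.toIntExact to check. */
    static boolean fitsInInt(long count) {
        try {
            Math.toIntExact(count);
            return true;
        } catch (ArithmeticException e) {
            // Thrown by Math.toIntExact when the long overflows an int.
            return false;
        }
    }
}
```

For example, a count of 3,000,000,000 becomes -1294967296 with the cast, which a caller could easily mistake for a valid or sentinel value.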


            That would also work, yes.

            Cheers
            Andrea

            == GeoServer Professional Services from the experts!
            Visit http://goo.gl/it488V for more information. == Ing.
            Andrea Aime @geowolf Technical Lead GeoSolutions S.A.S.
            Via di Montramito 3/A 55054 Massarosa (LU) phone: +39
            0584 962313 fax: +39 0584 1660272 mob: +39 339 8844549
            http://www.geo-solutions.it
            http://twitter.com/geosolutions_it
            -------------------------------------------------------
            /Con riferimento alla normativa sul trattamento dei dati
            personali (Reg. UE 2016/679 - Regolamento generale sulla
            protezione dei dati “GDPR”), si precisa che ogni
            circostanza inerente alla presente email (il suo
            contenuto, gli eventuali allegati, etc.) è un dato la
            cui conoscenza è riservata al/i solo/i destinatario/i
            indicati dallo scrivente. Se il messaggio Le è giunto
            per errore, è tenuta/o a cancellarlo, ogni altra
            operazione è illecita. Le sarei comunque grato se
            potesse darmene notizia. This email is intended only for
            the person or entity to which it is addressed and may
            contain information that is privileged, confidential or
            otherwise protected from disclosure. We remind that - as
            provided by European Regulation 2016/679 “GDPR” -
            copying, dissemination or use of this e-mail or the
            information herein by anyone other than the intended
            recipient is prohibited. If you have received this email
            by mistake, please notify us immediately by telephone or
            e-mail./



--
        Regards, Andrea Aime




--

Regards, Andrea Aime




_______________________________________________
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel


