You are, indeed :-).

What appears to be the problem (I'm not sure yet, but it seems like a good 
culprit) is that Postgres search, for reasons that mystify me, was implemented 
with term frequency (TF) but no notion of inverse document frequency (IDF). 
There are various extensions that add IDF-like properties to Postgres search, 
but I don't know how stable any of them actually are.
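
To make the TF-versus-IDF point concrete, here is a toy sketch in Python. It is 
not how Postgres or Solr actually compute scores; the three-document corpus, the 
function names, and the plain log(N/df) weighting are all assumptions chosen for 
illustration. It only shows how a term that occurs in every document can 
dominate a TF-only ranking and drop out once IDF is applied.

```python
import math
from collections import Counter

# Hypothetical toy corpus; "the" appears in every document.
CORPUS = {
    "d1": "the the the the database manual",
    "d2": "the solr guide",
    "d3": "the postgres guide",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stemming, no stop words)."""
    return text.lower().split()


def tf_score(query_tokens, doc_tokens, corpus_tokens):
    """TF only: sum of raw counts of each query term in the document."""
    counts = Counter(doc_tokens)
    return sum(counts[t] for t in query_tokens)


def idf(term, corpus_tokens):
    """Plain log(N / df); a term found in every document gets weight 0."""
    n = len(corpus_tokens)
    df = sum(1 for toks in corpus_tokens.values() if term in toks)
    return math.log(n / df) if df else 0.0


def tfidf_score(query_tokens, doc_tokens, corpus_tokens):
    """TF weighted by IDF, so rare query terms count for more."""
    counts = Counter(doc_tokens)
    return sum(counts[t] * idf(t, corpus_tokens) for t in query_tokens)


def rank(query, scorer, corpus=CORPUS):
    """Return document ids ordered from highest to lowest score."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    q = tokenize(query)
    scores = {d: scorer(q, toks, tokenized) for d, toks in tokenized.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

With the query "the solr", `rank("the solr", tf_score)` puts d1 first even though 
d1 never mentions Solr, while `rank("the solr", tfidf_score)` puts d2 first.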

At the moment, that's my diagnosis of the discrepancy. I'll probably follow up 
with the Postgres folks to see if they have any more insight into those 
extensions.

Thanks to all who responded.

Cordially,
Sam Bayer
The MITRE Corporation

On 3/17/22 12:42 PM, Eric Pugh wrote:
What I’ve done to compare other search engines with RRE and Quepid is to put a 
proxy in the middle that makes your search look like a Solr request/response 
;-).  This works great for custom search APIs, and I *guess* you could do it 
with database-backed search?

Now we are probably getting beyond what Sam was hoping to do!
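
A minimal sketch of the response half of such a proxy, in Python, assuming the 
database rows have already been fetched and ranked. The function name and the 
sample fields (`id`, `title`) are hypothetical, and a real proxy would also need 
an HTTP endpoint and a way to translate Solr request parameters into SQL, which 
this leaves out. The envelope follows the general shape of Solr's JSON select 
response (`responseHeader` plus `response` with `numFound`, `start`, `docs`).

```python
import json


def to_solr_response(rows, query, start=0, total=None, qtime_ms=0):
    """Wrap already-ranked database rows in a Solr-style JSON envelope.

    rows:     list of dicts, one per hit, in ranked order (hypothetical fields)
    query:    the original query string, echoed back under params
    total:    total number of matches, if known; defaults to len(rows)
    """
    docs = [dict(row) for row in rows]
    return {
        "responseHeader": {
            "status": 0,
            "QTime": qtime_ms,
            "params": {"q": query, "start": str(start), "rows": str(len(docs))},
        },
        "response": {
            "numFound": len(docs) if total is None else total,
            "start": start,
            "docs": docs,
        },
    }


if __name__ == "__main__":
    # Stand-in for rows returned by a database full-text query.
    sample_rows = [{"id": "42", "title": "Comparing search engines"}]
    print(json.dumps(to_solr_response(sample_rows, "search"), indent=2))
```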




On Mar 17, 2022, at 11:56 AM, Alessandro Benedetti <a.benede...@sease.io> wrote:

This is an interesting question.
I second both comments so far (from Eric and David), but I am afraid at the
moment the open-source tools for search quality evaluation can't really
compare Postgres to Solr.
As far as I know, both Quepid (Eric, correct me if I am wrong) and RRE
(https://github.com/SeaseLtd/rated-ranking-evaluator, and also the Enterprise
version) can compare only Apache Solr- and Elasticsearch-backed systems
(against each other, or against different configurations).

In general, I would recommend following David's suggestions:
- collect your requirements (both functional and performance-wise)
- compare

Many times in the past I have seen databases used as terrible search engines
and search engines used as terrible databases.
I have also often seen queries perform poorly on a search engine because
they were designed as if they were database queries.

Cheers

--------------------------
Alessandro Benedetti
Apache Lucene/Solr PMC member and Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io


On Sat, 5 Mar 2022 at 05:04, David Smiley <dsmi...@apache.org> wrote:

Hello Sam,

You are a familiar name from my MITRE days :-)

Check out Solr's feature list and see how it compares to that of Postgres.
If you are only doing the most basic default relevancy ranked top-N search
with default text analysis, then the tech/maintenance overhead might not be
worth it.  I'm looking at this as such an example:
https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=solr

On the other hand, if you want to ensure that you're able to make search
the best it can be for your users, then keeping Solr and using it more will
get you there; a database won't.  To a database, full-text-search is just
one checkbox of many concerns.  The capabilities there are usually very
simple.  It's fine for a demo/POC -- getting started.

One feature in particular I want to call out is faceting.  To some apps,
it's a game changer that can pivot the UX from merely having a basic search
box to having navigation filters and everything else, at which point Solr
is the foundation of what's driving the UX.  I've seen people/apps miss
this -- the user experience is so clumsy without it for rich/structured
data in particular.  If you've ever used a Maven repository manager like
Nexus or its competitors (last I checked), they are still stuck in the
Stone Age -- it's painful when you've been exposed to so much better.  On
the backend, if all you know is a database, you may not see how to make a
faceting UI work because it's rather unnatural for SQL.
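
As a rough illustration of what faceting hands to the UI, the sketch below
counts field values over a result set and then narrows by one of them. Solr
does this itself as part of the search request; the Python here, the field
names (`group`, `packaging`), and the sample documents are assumptions made up
for the example. In SQL, each facet field would typically need its own
GROUP BY over the same filtered result set.

```python
from collections import Counter

# Hypothetical search results with structured fields.
DOCS = [
    {"id": "a1", "group": "org.apache.solr", "packaging": "jar"},
    {"id": "a2", "group": "org.apache.solr", "packaging": "pom"},
    {"id": "a3", "group": "org.apache.lucene", "packaging": "jar"},
]


def facet_counts(docs, fields):
    """For each field, count how many documents carry each value."""
    return {f: Counter(d[f] for d in docs if f in d) for f in fields}


def apply_filter(docs, field, value):
    """Keep only documents whose field equals the chosen facet value."""
    return [d for d in docs if d.get(field) == value]


if __name__ == "__main__":
    # Counts shown next to the search box as navigation filters.
    print(facet_counts(DOCS, ["group", "packaging"]))
    # After the user clicks packaging=jar, the remaining counts update.
    narrowed = apply_filter(DOCS, "packaging", "jar")
    print(facet_counts(narrowed, ["group"]))
```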

Eric's response was great too.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Mar 4, 2022 at 9:33 AM Bayer, Samuel <s...@mitre.org> wrote:

Hi all -

In the interest of reducing my technology stack, I'm exploring whether
using Postgres full-text search instead of Solr might be an option when I
need both complex querying and full-text search. In my experience, so far,
Postgres can't compare to Solr, but I'm trying to understand why, in order
to have more of an ability to evaluate the functionality/complexity
tradeoffs. I know something about search technologies, but I'm not an
expert by any stretch of the imagination, and I've been looking for sources
that talk about the comparison in an informed way - people, blogs,
articles. So far, everything I've found is extremely basic. Does anyone
have any pointers for me?

Thanks in advance -
Sam Bayer
The MITRE Corporation
s...@mitre.org



_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | My 
Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
      

