Re: Large scale reliable software system

2023-06-27 Thread Avin Kavish
Well, seeing as Postgres isn't designed to serve HTTP requests or to run
the general-purpose, non-database code that you can express elegantly in
Python, to answer the OP's question my vote goes to the original answer:
Django. It has everything out of the box: authentication, file storage,
and so on.

Once you get the application running you can enhance the database as
necessary.

On Tue, Jun 27, 2023 at 4:19 PM Tony Shelver  wrote:

>
> On Tue, 27 Jun 2023 at 07:08, Guyren Howe  wrote:
>
>> Correct. It’s a tragically wrong piece of folk wisdom that’s pretty
>> general across web development communities.
>>
>> On Jun 26, 2023, at 21:32, Michael Nolan  wrote:
>>
>> It's not just Ruby, dumb databases are preferred in projects like
>> WordPress, Drupal and Joomla, too.
>>
>> Now, if it's because they're used to using MySQL, well maybe that's
>> not so hard to understand.  :-)
>>
>> On Mon, Jun 26, 2023 at 8:05 PM Guyren Howe  wrote:
>>
>>
>> This is a reasonable answer, but I want to offer a caveat.
>>
>> Likely because of the influence of the originator of Ruby on Rails, it is
>> close to holy writ in the web development community that the database must
>> be treated as a dumb data bucket and all business logic must be implemented
>> in the Ruby or Python or whatever back end code.
>>
>> This heuristic is nearly always mostly wrong.
>>
>> Guyren G Howe
>> On Jun 26, 2023 at 17:48 -0700, Adrian Klaver ,
>> wrote:
>>
>> On 6/26/23 16:48, B M wrote:
> The accepted front-end developer wisdom of treating the DB as a dumb
> data store works only under certain conditions, for example when the DB
> will never be accessed from a different ORM / framework, and when the
> performance characteristics of using an ORM with 'standard' data
> structures are acceptable.
>
> The moment you need to plug in something like reporting tools, or access
> from a different system / API / framework / language / ORM or whatever, the
> approach of not having rules / views / procedures / whatever built into the
> database falls apart.
>
> Other things to consider are performance / load / overhead: we have one
> system that involves processing large amounts of data for reports and
> queries. Shipping all of that data back through the ORM / DB interface
> (ODBC / JDBC / psycopg2 / whatever) for resolution / filtering in the
> front-end application, when SQL / procedures / views could do the work in
> the DB and ship back only the required data, is counterproductive.
>
> Tony Shelver
>
>>
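[Editor's note] Tony's point about pushing filtering into the database can
be sketched with a view; all table and column names here are hypothetical:

```sql
-- Instead of pulling every order row into the application and
-- aggregating there, define the aggregation as a view in the DB.
CREATE VIEW monthly_sales AS
SELECT date_trunc('month', ordered_at) AS month,
       product_id,
       sum(quantity * unit_price)      AS revenue
FROM   orders
GROUP  BY 1, 2;

-- Any client (ORM, reporting tool, psql) now ships back only the
-- summary rows instead of the raw order data:
SELECT * FROM monthly_sales WHERE month >= DATE '2023-01-01';
```

The same view is then usable unchanged from every framework, API, or
reporting tool that connects to the database.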


Re: bug or lacking doc hint

2023-06-25 Thread Avin Kavish
Sounds like the problem you are having is that the server is running out
of temporary resources for the operation the users are trying to run. So,
according to Tom, the operation cannot be optimized further on the
Postgres side.

I think you have a few choices here:
  - See if increasing the server's resources will allow the operation to run
  - Ask users not to run that operation
  - Use an extension like Citus to scale horizontally

But I'm wondering why such a massively inefficient join is needed in the
first place. Shouldn't joins follow keyed relationships? Ideally a
uniquely indexed column, but at the very least an indexed column. Why is a
join required on a dynamically calculated substring? Can it be made into a
static computed value and indexed? Substring sounds like an operation that
belongs in the filter stage.
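[Editor's note] One way to make the substring static and indexable, as
suggested above, is a stored generated column (PostgreSQL 12+) or an
expression index; the table and column names below are hypothetical:

```sql
-- Option 1: materialize the substring once and index it.
ALTER TABLE t
  ADD COLUMN join_key text
  GENERATED ALWAYS AS (substr(raw_code, 1, 8)) STORED;
CREATE INDEX ON t (join_key);

-- Option 2: an expression index; queries must then use exactly the
-- same expression for the planner to match it.
CREATE INDEX ON t (substr(raw_code, 1, 8));
```

Either way, the join condition becomes a plain indexed equality instead
of a per-row computation.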

Can you describe your data model? Maybe we can give some specific advice.

Regards,
Avin

On Mon, Jun 26, 2023 at 3:57 AM Marc Millas  wrote:

>
> On Sun, Jun 25, 2023 at 11:48 PM Tom Lane  wrote:
>
>> David Rowley  writes:
>> > The problem is that out of the 3 methods PostgreSQL uses to join
>> > tables, only 1 of them supports join conditions with an OR clause.
>> > Merge Join cannot do this because results can only be ordered one way
>> > at a time.  Hash Join technically could do this, but it would require
>> > that it built multiple hash tables. Currently, it only builds one
>> > table.  That leaves Nested Loop as the join method to implement joins
>> > with OR clauses. Unfortunately, nested loops are quadratic and the
>> > join condition must be evaluated once per each cartesian product row.
>>
>> We can do better than that if the OR'd conditions are each amenable
>> to an index scan on one of the tables: then it can be a nestloop with
>> a bitmap-OR'd inner index scan.  I thought the upthread advice to
>> convert the substr() condition into something that could be indexed
>> was on-point.
>>
> OK, but one of the tables within the join(s) is 10 billion rows, split
> into 120 partitions. Creating something like 20 more indexes to satisfy
> that condition has its own problems.
>
>>
>> > Tom Lane did start some work [1] to allow the planner to convert some
>> > queries to use UNION instead of evaluating OR clauses, but, if I
>> > remember correctly, it didn't handle ORs in join conditions, though
>> > perhaps having it do that would be a natural phase 2. I don't recall
>> > why the work stopped.
>>
>> As I recall, I was having difficulty convincing myself that
>> de-duplication of results (for cases where the same row satisfies
>> more than one of the OR'd conditions) would work correctly.
>> You can't just blindly make it a UNION because that might remove
>> identical rows that *should* appear more than once in the result.
>>
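[Editor's note] Tom's de-duplication concern can be seen in a hand-written
rewrite; tables `a` and `b` here are hypothetical:

```sql
-- Original OR join: each qualifying (a, b) row pair appears exactly
-- once, even if it satisfies both conditions; genuine duplicates in
-- the underlying data are preserved.
SELECT a.id, b.id
FROM   a JOIN b ON a.x = b.x OR a.y = b.y;

-- Naive rewrite: UNION removes ALL duplicate result rows, including
-- ones that should legitimately appear more than once, while UNION ALL
-- double-counts pairs that satisfy both arms.  Neither blindly
-- reproduces the original semantics, which is why the planner cannot
-- simply substitute one for the other.
SELECT a.id, b.id FROM a JOIN b ON a.x = b.x
UNION
SELECT a.id, b.id FROM a JOIN b ON a.y = b.y;
```

A manual rewrite is safe only when you know, as Marc does for his query,
that no de-duplication issue can arise.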
>
> I did rewrite the query using a CTE and union(s). For that query, no
> dedup issue.
> But my problem is that the DB will be used by a bunch of people writing
> raw SQL queries, and I cannot let them write queries that run for ages
> and, every now and then, crash against temp_file_limit after hours.
> So my understanding of the above is that I must tell the users NOT to
> use OR clauses in joins, which may be a problem in itself.
> regards
> Marc
>
>
>> regards, tom lane
>>
>
> Marc MILLAS
>


Accessing referential constraint information with minimal permissions

2023-06-24 Thread Avin Kavish
Hello,

I want to programmatically read all the references in a database. Ideally,
I want to do it with read-only permissions to the table. Is it possible?

I know the information is in `information_schema.referential_constraints`,
but apparently reading it requires write permissions on the tables that
have references. I don't know why it's designed like that. I think knowing
the relationships between tables is in the same class of privileges as
knowing the columns of the tables (which you can read from the `columns`
view with just SELECT permission).
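[Editor's note] One possible workaround, not from the thread itself, is to
read the foreign keys from the system catalog `pg_catalog.pg_constraint`,
which is readable without any privileges on the referencing tables:

```sql
-- All foreign-key constraints visible in the current database.
-- contype = 'f' selects foreign keys; conrelid / confrelid are the
-- referencing and referenced tables.
SELECT conname                    AS constraint_name,
       conrelid::regclass        AS referencing_table,
       confrelid::regclass       AS referenced_table,
       pg_get_constraintdef(oid) AS definition
FROM   pg_catalog.pg_constraint
WHERE  contype = 'f';
```

Unlike the information_schema views, the catalogs do not filter rows by
table privileges, so a plain read-only role can run this.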

Anyway, is there a workaround for this? If not, what is the least
destructive write permission I can give a user who wants this access?

Cheers,
Avin.