Thanks for the quick revert. I saw sqlalchemy-migration and have used
alembic a lot. Thanks for maintaining them, they are very useful!

Let me try describing my usecase better.

I have a bunch of legacy scripts in pure SQL for hive for ETL tasks.
Recently, this code has become very chaotic and large and I'm trying to
figure out a better way to implement it. And I've had a good experience
with sqlalchemy in MySQL and sqlite, so thought I should try using
sqlalchemy.

Primary benefits:
 - Having the ability to change the back-end of the DB was a bonus - I. E.
If the data is small, use sqlite. If the data is large use Hive
 - The ability to have and introspect the schema at any point - by plotting
the ERD
 - Ability to add foreign key relations that hive doesn't have native
support for, and creating relationships to simplify joins and so on (later)

Now, as this is ETL we have a core set of tables initially and then we make
a bunch of temp tables, joins, group bys, etc. And there are newer tables
created on the fly - as most ETL operations go.

Currently I'm trying to mimic the Hive code in sqlalchemy tables as a first
step.

As part of the Hive workloads, we do a bunch of map jobs to create new
columns and drop columns.
I'm not trying to get SQLAlchemy to create the SQL (yet) but I do want the
overall schema to be in sqlalchemy tables do I can move towards that.

So, if I drop a column in hive, what's the best way I can make the
sqlalchemy tables reflect that change?


On Mon, Aug 5, 2019, 07:44 Mike Bayer <[email protected]> wrote:

>
>
> On Sun, Aug 4, 2019, at 1:14 PM, Abdeali Kothari wrote:
>
> I am using the sqlalchemy Table class to load up a data schema I am
> provided.
>
> Once loaded into sqlalchemy, I want to use some of the existing extensions
> like:
>  - ERD plots - eralchemy
>  - Auto finding relationships using automap
> etc.
>
> I am not using any actual DB connection under the table though - i.e. not
> really going to be generating SQL statements (as of now)
>
> My schema comes from different operations that are fed into the system -
> like create table, add column, rename column, delete column.
> While I was trying to mimic these operations in sqlalchemy, I realized
> that I need to:
> 1. Create a MetaData
> 2. Create the table with a name and add columns to the arguments
> 3. To add a new column - I can use the extend_existing kwarg in
> sqlalchemy.Table
>
> But I could not find a way to "*drop*" or "remove" a column I have added.
>
> *To add some code examples*:
> >>> import sqlalchemy as sqla
> >>> mytable = sqla.Table("mytable", sqla.MetaData(), sqla.Column('col1',
> sqla.String()))
> >>> mytable
> Table('mytable',
>       MetaData(bind=None),
>       Column('col1', String(), table=<mytable>), schema=None)
>
> >>> # Mutating the table to add a column:
> >>> sqla.Table("mytable", mytable.metadata, sqla.Column('col2',
> sqla.String()), extend_existing=True)
> Table('mytable',
>       MetaData(bind=None),
>       Column('col1', String(), table=<mytable>),
>       Column('col2', String(), table=<mytable>), schema=None)
>
> >>> # Mutating the table to drop a column:
> >>> # What do I do here ?
> Table('mytable',
>       MetaData(bind=None),
>       Column('col1', String(), table=<mytable>), schema=None)
>
>
>
> So "mutating" a Table like that would correspond to emitting ALTER
> statements on the database.   There is a tool called sqlalchemy-migrate
> which works this way; you have a Table object, you say,
> "table.remove_column(col)", and it emits an "ALTER TABLE" which drops that
> column.   To do this, they altered SQLAlchemy itself by monkeypatching the
> Table class.   This design ultimately proved too rigid for its purpose, as
> it required the construction of Table models in order to emit simple
> commands and therefore led to a lot of overly complex and inefficient
> migration scripts.
>
> To alleviate this, I wrote a new migrations tool called Alembic that
> instead allows you to emit commands like "ALTER TABLE" without actually
> mutating Table objects.   Today, Alembic has superseded sqlalchemy-migrate
> and as luck would have it, I was put in charge of maintaining
> sqlalchemy-migrate anyway even though it is pretty much a legacy project.
>
> The point is, I don't think you should be trying to change the structure
> of a Table object once you create it.   The extend_existing() method is
> only there to suit the purpose of being able to reflect database metadata
> into an existing set of Table objects that include some existing state
> already.  But otherwise, the use case for Table / MetaData / Column etc. is
> that they represent a fixed database schema structure that is expressed at
> once.  They allow for additive creation, that is, adding new tables, adding
> new columns, only as a means to help construct a particular schema from
> different parts, such as certain ORM mixins that add new columns to a
> table, or being able to reflect additional columns / tables to complement a
> set of columns / tables that are explicitly declared in the application.
> But in all cases this additive construction is intended to represent a
> schema that is unchanging.   Schema changes are considered to be out of
> scope for the runtime of the application itself.
>
> If your program is provided with "a data schema" to start with, there
> should be no need to rename or delete columns.    Your actual use case is
> likely not quite like this so feel free to share more details of why you
> are seeking these operations.
>
>
>
>
>
> *I can think of 2 approaches*:
> # Recreate the entire table
> mytable.metadata.remove(mytable)
> mytable = sqla.Table(
>     mytable.name,
>     mytable.metadata,
>     *[i for i in mytable.columns if i.name != 'col2])
>
> # Modify the private _columns collection
> mytable._columns.remove(mytable.c['col2'])
>
> I was wondering if there is a correct/compliant way to do this.
>
> PS: I cannot avoid doing a drop column - have to do it as that is an
> operation that we want to support.
>
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sqlalchemy/CA%2BBKPDXffNsB9LaBqz4yiUa_%2BmFAcKbYEnXRbrL6pdohr5DZxg%40mail.gmail.com
> <https://groups.google.com/d/msgid/sqlalchemy/CA%2BBKPDXffNsB9LaBqz4yiUa_%2BmFAcKbYEnXRbrL6pdohr5DZxg%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sqlalchemy/e8f6621d-b4af-4c53-9415-39b9a65282f0%40www.fastmail.com
> <https://groups.google.com/d/msgid/sqlalchemy/e8f6621d-b4af-4c53-9415-39b9a65282f0%40www.fastmail.com?utm_medium=email&utm_source=footer>
> .
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/CA%2BBKPDUXrUT92WM6hRWaiRwzOwhB1Rrki_xnUx3KWr8PAUEWdg%40mail.gmail.com.

Reply via email to