Thanks Mike for your suggestion . It looks bulk_insert_mappings / 
bulk_update_mapping seems missing Relationship not handled and even these 
objects not even going to session. I missed few more points to add to my 
questions early

For the above example
I am going to add lot more DB validations (Sql validation) , Eg: Comparing 
data with existing table / other parent relationship table . 
I have a requirement to cache the SQLAlchemy session itself, because of 
after doing validation based on user input I will act accordingly .

Considering the above I suggest ,
1. taking small batch of data (1000) add them to session , 
2. do the validations for  the batch
3. Cache the session, return the " session id , errors "

Finally 
4. Flush the data to different tables based on the user action , (Get all 
the cached session Ids )


On Tuesday, 24 May 2016 22:23:40 UTC+5:30, Mike Bayer wrote:
>
> I'm assuming this is data from a flat file or XML or something. 
>
> 1. Read the rows in chunks.  say 1000 at a time. 
>
> 2. organize the system by which you will regenerate the primary key from 
> a given row.   For each chunk, run this method on each row and associate 
> the primary key with each row in the chunk.  This can be done all in 
> memory because we are limiting the size of the dataset. 
>
> 3. Organize the chunk of data into a hashtable (e.g. Python dictionary), 
> keyed the primary key value that we've generated for each data row. 
>
> 4. Construct a list of those primary key values.  Organize a SQL 
> statement along the lines of:  "SELECT primary_key FROM table WHERE 
> primary_key IN (pk1, pk2, pk3, pk4, ...)", for the full 1000 elements of 
> the chunk. SQLAlchemy ORM or SQLAlchemy Core can be used for this. 
>
> 5. Read the result from the query, and as it is read, split the chunk 
> into two sets - the INSERT set, and the UPDATE set.  As each element is 
> read from the SQL result set, look up that record in the dictionary you 
> made in step 3.   That record goes into the UPDATE set, because its 
> primary key already exists. 
>
> 6. All records that have not been located go into the INSERT set; these 
> primary keys do not exist. 
>
> 7. If you are using ORM mappings, use ORM bulk_insert_mappings and 
> bulk_update_mappings separately on these sets to establish the data in 
> the database. 
>
>
> http://docs.sqlalchemy.org/en/latest/orm/session_api.html?highlight=bulk_insert#sqlalchemy.orm.session.Session.bulk_insert_mappings
>  
>
>
> http://docs.sqlalchemy.org/en/latest/orm/session_api.html?highlight=bulk_insert#sqlalchemy.orm.session.Session.bulk_update_mappings
>  
>
>
> On 05/24/2016 12:01 PM, Shankar Ganesh wrote: 
> > Hi, 
> > 
> > I need to know about what is the best way for doing update session data 
> > (session data is incomplete - not having the entire data) . Say data 
> > doesn't have primary key itself,  How we can populate the primary key in 
> > the SQL Alchemy session ? 
> > 
> > Use Case: 
> > 
> > * Have the arbitrary data currently, need to produce the primary key 
> > column based on the unique key . Data set size may be around 10k rows . 
> >  We can use get_or_create method ,  but it looks only cover single 
> > instance . 
> > (
> http://stackoverflow.com/questions/2546207/does-sqlalchemy-have-an-equivalent-of-djangos-get-or-create)
>  
>
> > 
> > Suggested ways: 
> > 
> > We can put the entire data into temp table and update primary key and 
> > other columns based on the unique data we have. 
> > 
> > Is there are any ways using SQLAlchemy ORM ? 
> > 
> > Thanks, 
> > Shankar. 
> > 
> > -- 
> > You received this message because you are subscribed to the Google 
> > Groups "sqlalchemy" group. 
> > To unsubscribe from this group and stop receiving emails from it, send 
> > an email to [email protected] <javascript:> 
> > <mailto:[email protected] <javascript:>>. 
> > To post to this group, send email to [email protected] 
> <javascript:> 
> > <mailto:[email protected] <javascript:>>. 
> > Visit this group at https://groups.google.com/group/sqlalchemy. 
> > For more options, visit https://groups.google.com/d/optout. 
>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to